Identifying suspicious products using AI

Recently, a Malaysian KOL faced criticism for selling a skincare product believed to contain mercury. The product is believed to have not obtained permission to be sold. Let's try to use various LLMs to inspect the packaging of this product and see if we can find anything suspicious.


The input

The input is an image we obtain from a media portal.

A simple prompt is used:

Based on the product description, did you spot anything suspicious?

The LLMs

Since the input contains image and text, I selected the following models:

  • Open AI GPT 4.5
  • Claude 3.7 Sonnet
  • X Grok 2 Vision
  • Gemini 2.0 Flash

The Results

1) Open AI GPT 4.5 replied:

2) Claude 3.7 Sonnet replied:

3) Grok 2 Vision replied:

4) Google Gemini 2.0 Flash / 1.5 Pro

The result given by Gemini 2.0 Flash is rather disappointing. I retried the same request using Gemini 1.5 Pro, the result was similar.

What has happened, Google?


Summary

  Grok 2 Vision Claude 3.7 Sonnet Open AI GPT 4.5 Google Gemini 2.0 Flash
Based on description Detect spelling errors, lack of ingredients, no regulatory information Lack of ingredients, no manufacturer, safety certificate Detect spelling errors and bad English. Lack of ingredients. Lack of manufacturer and safety info. Detect 1 spelling mistake
Based on the packaging Mentioning that it is not a well-known brand and the packaging is basic The red packaging evokes certain associations    
Others   The small size (15g) looks more like a trial/sample    
Possible AI halucination   Suggesting that it could be a stimulant    
  • Grok 2 Vision and Claude 3.7 Sonnet analyze from both the extracted text and the picture itself. While GPT 4.5 and Gemini 2.0 Flash seem to just analyze the extracted text. (This could be due to the original prompt asking to check description specifically)
  • Grok 2 Vision seems to be surprisingly good, although it is kind of a "previous" generation model compared to the rest.
  • Google Gemini Flash is disappointing.

 


AI Summary AI Summary
gpt-4o-2024-08-06 2025-03-09 22:33:55
A Malaysian influencer faced backlash for selling a potential mercury-laden skincare product without sales authorization. Various AI models analyzed an image of the product's packaging for suspicious elements. Grok 2 Vision and Claude 3.7 Sonnet identified errors and omissions in both text and packaging. Open AI GPT 4.5 focused on text errors. Google Gemini 2.0 Flash performed poorly, identifying only minor issues.
Chrome On-device AI 2025-03-21 15:37:14

Share Share this Post