Google’s newly released Gemini 3 models have quickly risen to the top of major AI performance leaderboards, drawing strong industry reactions after an impressive early demo. Some of the most notable praise came from Salesforce chief executive Marc Benioff, who described Gemini 3 as an “extraordinary step forward” and said that after spending just two hours with the model, he did not expect to return to ChatGPT.
To understand how the model performs in practical, everyday scenarios, Gemini 3 and ChatGPT were tested on three creative and visual tasks: infographic creation, image transformation and geographical labelling. The results show how the two AI tools handle simple prompts as well as more challenging factual graphics.
Test 1: Infographic creation
Prompt:
Create a clean infographic with five sections on the topic “Climate-friendly habits for daily life”. Provide short titles, one-line descriptions and a simple visual layout with suggested icon styles.
How they performed
Both Gemini 3 and OpenAI’s ChatGPT successfully generated the required infographic. However, their approaches differed significantly in speed and visual polish.
Test 2: Image-to-portrait transformation
Prompt:
Take the user’s uploaded image and convert it into a classic black-and-white studio portrait in a tuxedo with a bow tie, Rembrandt lighting, a dark velvet backdrop, a vintage camera aesthetic and accurate facial features.
How they performed
This test highlighted a major difference in image-handling ability.
Test 3: India map with labels and descriptions
Prompt:
Create a graphic showing a simple map of India with all states and union territories labelled, along with a one-line description highlighting a unique cultural, geographical or economic fact for each region.
How they performed
Both models struggled considerably with this prompt.
This test revealed that both tools currently face limitations when required to generate complex, fact-heavy graphics.
Verdict
Across simple creative tasks, both AI tools can perform effectively, but when visuals become more demanding, differences emerge. Gemini 3 delivered faster, more refined results in the first two tests, while both systems struggled with accuracy in the final, more complex geography challenge.
Based on these evaluations, Gemini 3 outperformed ChatGPT in two of the three tests, though both need to improve at handling detailed factual graphics.