What is Gemini? Everything you should know about Google's new AI model

Google just released its most powerful AI model yet, but what can it do?
Written by Maria Diaz, Staff Writer
[Image: Google Gemini website on a laptop, reading "Welcome to the Gemini era." Credit: Maria Diaz/ZDNET]

What is Google Gemini?

Gemini is a new and powerful artificial intelligence model from Google that can understand not just text but also images, videos, and audio. As a multimodal model, Gemini is described as capable of completing complex tasks in math, physics, and other areas, as well as understanding and generating high-quality code in various programming languages. 

It is currently available through integrations with Google Bard and the Google Pixel 8 and will gradually be folded into other Google services. 

"Gemini is the result of large-scale collaborative efforts by teams across Google, including our colleagues at Google Research," according to Demis Hassabis, CEO and co-founder of Google DeepMind. "It was built from the ground up to be multimodal, which means it can generalize and seamlessly understand, operate across, and combine different types of information including text, code, audio, image, and video."

Who made Gemini?

Gemini was created by Google and Alphabet, Google's parent company, and released as the company's most advanced AI model to date. Google DeepMind also made significant contributions to the development of Gemini. 

Are there different versions of Gemini?

Google describes Gemini as a flexible model that is capable of running on everything from Google's data centers to mobile devices. To achieve this scalability, Gemini is being released in three sizes: Gemini Nano, Gemini Pro, and Gemini Ultra.

  • Gemini Nano: The smallest size, Gemini Nano is designed to run on smartphones, specifically the Google Pixel 8. It's built to perform on-device tasks that require efficient AI processing without connecting to external servers, such as suggesting replies within chat applications or summarizing text. 
  • Gemini Pro: Running on Google's data centers, Gemini Pro is designed to power the latest version of the company's AI chatbot, Bard. It's capable of delivering fast response times and understanding complex queries. 
  • Gemini Ultra: Though still unavailable for widespread use, Google describes Gemini Ultra as its most capable model, exceeding "current state-of-the-art results on 30 of the 32 widely-used academic benchmarks used in large language model (LLM) research and development." It's designed for highly complex tasks and is set to be released after finishing its current phase of testing. 

How can you access Gemini?

Gemini is now available in Google products in two of its sizes: Nano on the Pixel 8 phone and Pro in the Bard chatbot. Google plans to integrate Gemini over time into its Search, Ads, Chrome, and other services. 

Developers and enterprise customers will be able to access Gemini Pro via the Gemini API in Google's AI Studio and Google Cloud Vertex AI starting on December 13. Android developers will have access to Gemini Nano via AICore, which will be available on an early preview basis.
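
For developers curious what a call to Gemini Pro might look like, here is a minimal sketch in Python that only builds the request and does not send it. The endpoint URL, the `gemini-pro` model name, and the request body shape are assumptions based on Google's public REST documentation at launch and may have changed since; the helper name `build_generate_request` is hypothetical, and `YOUR_API_KEY` is a placeholder for a key created in Google's AI Studio.

```python
import json
from urllib.parse import urlencode

# Assumed values for illustration; check Google's current API reference.
BASE_URL = "https://generativelanguage.googleapis.com/v1beta"
MODEL = "gemini-pro"


def build_generate_request(prompt: str, api_key: str) -> tuple[str, str]:
    """Return the (url, json_body) pair for a text-only generateContent call.

    Hypothetical helper: it only assembles the request and sends nothing.
    """
    query = urlencode({"key": api_key})
    url = f"{BASE_URL}/models/{MODEL}:generateContent?{query}"
    # A text-only prompt is wrapped as a single "part" of a single "content".
    body = json.dumps({"contents": [{"parts": [{"text": prompt}]}]})
    return url, body


if __name__ == "__main__":
    url, body = build_generate_request(
        "Explain multimodal AI in one sentence.", "YOUR_API_KEY"
    )
    print(url)
    print(body)
```

Sending the body as an HTTP POST with a `Content-Type: application/json` header, using any HTTP client, should return the model's reply as JSON, provided the assumptions above still hold.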

How does Gemini differ from other AI models, like GPT-4?

Google's new Gemini model appears to be one of the largest, most advanced AI models to date, though only the release of the Ultra model will determine that for certain. Compared to other popular models that power AI chatbots right now, Gemini stands out because it is natively multimodal, whereas other models, like GPT-4, rely on plugins and integrations to be truly multimodal. 

Gemini Ultra and Pro vs GPT-4

A comparison chart from Google shows how Gemini Ultra and Pro compare to OpenAI's GPT-4 and Whisper, respectively. 


Unlike GPT-4, a primarily text-based model, Gemini performs multimodal tasks natively. GPT-4 excels at language-related tasks like content creation and complex text analysis, but it relies on OpenAI's plugins to analyze images and access the web, and on DALL-E 3 and Whisper to generate images and process audio. 

Google's Gemini also appears to be more product-focused than other models available now. It is either already integrated into the company's ecosystem or slated to be, as it powers both Bard and Pixel 8 devices. Other models, like GPT-4 and Meta's Llama, are more service-oriented and available to third-party developers for building applications, tools, and services.
