What is Gemini Live? How Google's real-time chatbot competes with GPT-4o

Google made several big announcements at I/O 2024, including a new way to have natural voice conversations with its Gemini model.
Written by Maria Diaz, Staff Writer

A day after OpenAI's Spring Update event, Google demoed Gemini Live, its artificial intelligence voice assistant built to rival GPT-4o. Gemini Live uses an improved multimodal AI model to offer mobile users a more natural conversational experience in real time.

Gemini Live lets you have voice conversations with Gemini that feel natural and intuitive. For example, you can ask questions at your own pace and interrupt the AI bot mid-sentence to have it clarify or adjust its reply, similar to what OpenAI showed off during its GPT-4o demo. Google will offer a variety of voices to choose from in Gemini Live, as OpenAI has done with ChatGPT since it launched voice conversations in September 2023.

Google plans to add the full multimodal experience to Gemini Live later this year, allowing Gemini to see the world around you when you open the camera during a conversation. ChatGPT is getting a similar capability in the coming weeks, in an update that will roll out first to ChatGPT Plus users. In the Gemini app, this functionality will be powered by Google's Project Astra.

Alongside these updates, Google also upgraded Gemini Nano to process text, images, and sounds, meaning the model is no longer limited to text input. Gemini Nano with Multimodality will be available first on Pixel smartphones.
