Google updates Gemini and Gemma on Vertex AI, and gives Imagen a text-to-live-image generator

If you thought text-to-image generators were cool, you'll love this text-to-live-image generator.
Written by Sabrina Ortiz, Editor

Google Cloud began offering Vertex AI in 2021 to help developers, data scientists, and companies build and use generative AI models on one platform. Now, at Google Cloud Next, Google announced updates to some of its biggest Vertex AI offerings, including Gemini, Gemma, and Imagen. 

For starters, Gemini 1.5 Pro, which Google announced as its most advanced AI model in February 2024, is now available in public preview for developers using Vertex AI. Gemini 1.5 Pro has a 1-million-token context window, which is larger than that of any other AI model currently available, according to Google. A larger context window lets the model take in more material in a single request, which improves long-context understanding and is useful when building more complex AI applications.
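To get a feel for what a 1-million-token window means in practice, here is a minimal Python sketch that estimates whether a block of text would fit. It assumes roughly four characters per token, a common rule of thumb for English text and not a figure published for Gemini; the function names are hypothetical and do not come from any Google SDK. An actual tokenizer would give different counts.

```python
import math

CONTEXT_WINDOW_TOKENS = 1_000_000  # window size quoted in the article
ASSUMED_CHARS_PER_TOKEN = 4.0      # assumption: rough heuristic, not a Gemini specification


def estimate_tokens(text: str, chars_per_token: float = ASSUMED_CHARS_PER_TOKEN) -> int:
    """Roughly estimate the token count of `text` from its character length."""
    return math.ceil(len(text) / chars_per_token)


def fits_in_context(
    text: str,
    window: int = CONTEXT_WINDOW_TOKENS,
    chars_per_token: float = ASSUMED_CHARS_PER_TOKEN,
) -> bool:
    """Return True if the estimated token count is within the context window."""
    return estimate_tokens(text, chars_per_token) <= window


# A stand-in document of 2.5 million characters.
sample = "word " * 500_000
print(estimate_tokens(sample))   # 625000 under the 4-characters-per-token assumption
print(fits_in_context(sample))   # True
```

Under that assumption, a document of about 2.5 million characters would use well under the full window.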

Furthermore, Google announced that Gemini 1.5 Pro on Vertex AI can now process audio streams, including speech and audio portions of videos. This addition allows for high-quality transcription and seamless cross-modal analysis across text, images, audio, and videos.

Vertex AI now also includes CodeGemma, a new model in Google's lightweight Gemma family that is fine-tuned for code generation and assistance, as well as Claude 3, Anthropic's latest family of models. These additions join Vertex AI's existing collection of more than 130 models.

Google has also expanded Vertex AI's grounding capabilities by letting organizations ground model responses directly in Google Search. The feature, now in public preview, gives organizations' AI tools access to the latest information, which should improve the accuracy of their responses.

Imagen 2.0 in Vertex AI is also getting serious updates, letting users create four-second live images from text prompts. The feature, available in public preview, could let marketing and creative teams generate GIFs at 360x640 resolution and 24 frames per second from a simple text prompt. Additionally, Imagen 2.0 in Vertex AI has gained advanced photo editing capabilities, including outpainting and inpainting, which let you easily add or remove elements in an AI-generated image.
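For a sense of scale, the quoted figures work out as follows. This is a back-of-the-envelope Python sketch that uses only the numbers in the article (four seconds, 24 frames per second, 360x640); the function name is made up for illustration and is not part of any Imagen or Vertex AI API.

```python
def live_image_stats(seconds: float, fps: int, width: int, height: int) -> dict:
    """Compute frame and pixel counts for a short animated clip."""
    frames = int(seconds * fps)        # total frames in the clip
    pixels_per_frame = width * height  # pixels in a single frame
    return {
        "frames": frames,
        "pixels_per_frame": pixels_per_frame,
        "total_pixels": frames * pixels_per_frame,
    }


stats = live_image_stats(seconds=4, fps=24, width=360, height=640)
print(stats["frames"])            # 96
print(stats["pixels_per_frame"])  # 230400
print(stats["total_pixels"])      # 22118400
```

In other words, each four-second live image is a sequence of 96 frames.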

Lastly, and perhaps most significantly, Google DeepMind's invisible watermarking system SynthID is now generally available on Vertex AI, letting customers watermark all images generated by Imagen.

Even though many of these updates, such as the ability to generate live images, are in public preview rather than generally available, this is still a significant development. Google typically collects feedback from testers before making new features available to the general public.