Google has announced the launch of two generative AI models, Veo and Imagen 3, available for businesses using Vertex AI, its cloud platform for AI tools. Veo is designed to generate high-definition videos from images and text prompts, while Imagen 3 focuses on producing realistic images from simple text inputs.
Google launches generative AI models Veo and Imagen 3 for businesses
Veo, developed by Google DeepMind, generates videos featuring realistic-looking people and animals. Users can create content by uploading an image paired with a text prompt or by entering text alone. For now, Veo is accessible only to select businesses through a private preview. It produces 1080p video clips lasting up to six seconds at 24 or 30 frames per second. According to Warren Barkley, senior director of product management at Google Cloud, the enterprise response to generative AI has been overwhelmingly positive, with 86% of organizations already using generative AI in production reporting an increase in revenue.
Prompt: Timelapse of the northern lights dancing across the Arctic sky, stars twinkling, snow-covered landscape
Video: Google
Imagen 3, also newly launched, is touted as Google’s highest-quality image generation model. It can create photorealistic images and offers advanced editing capabilities, such as adding, removing, or extending elements within an image. Starting next week, all Vertex AI customers will have access to Imagen 3. Brands like Cadbury, Oreo, and Milka are among the first to use these models in their marketing.
Both models embed digital watermarks, using Google DeepMind’s SynthID technology, to help curb misinformation and misattribution. They also include built-in safeguards against misuse and the generation of harmful content. Importantly, neither model is trained on customer data.
Veo’s capabilities and limitations
The availability of Veo in a private preview will allow businesses like Quora and Mondelez International to explore creative applications, such as generating video content for their platforms. Veo’s ability to create scenes with specific visual styles is one of its standout features. It can produce dynamic content, including landscape shots and time-lapse videos. However, the model is not without flaws. Issues like disappearing objects and unrealistic physics, such as reversing vehicles, highlight its current limitations.
Prompt: A fast-tracking shot down a suburban residential street lined with trees. Daytime with a clear blue sky. Saturated colors, high contrast
Video: Google
Veo has been trained on a diverse range of footage to enhance its capabilities. When asked about its training sources, Barkley mentioned that it “may” include content from YouTube, in line with agreements with content creators. He emphasized that Google focuses on using high-quality, curated data, adhering to safety and security standards. As with other AI models, concerns about copyright and proprietary content arise, especially with the potential for models to output nearly identical copies of existing work.
Google asserts that it has implemented prompt-level filters to manage potentially harmful outputs. Additionally, the company plans to indemnify output from Veo on Vertex AI once it becomes generally available, offering some protection for businesses utilizing the tool.
Google has been gradually integrating Veo into its suite of products. Following its initial announcement, the model was introduced in Google Labs earlier this year. In September, it was incorporated into YouTube Shorts, allowing creators to easily produce background scenes and brief video clips.
Featured image credit: Google DeepMind