Google has unveiled two updated production-ready Gemini models: Gemini-1.5-Pro-002 and Gemini-1.5-Flash-002. The release includes a price reduction of more than 50% for the Gemini 1.5 Pro model, faster output, and increased rate limits, providing more value for developers working with the Gemini API and Google AI Studio.
Key updates include a 52% reduction in output token pricing for Gemini 1.5 Pro, 2x higher rate limits for Gemini 1.5 Flash, and 3x higher rate limits for Gemini 1.5 Pro, coupled with improvements in model latency. These updates are designed to lower the overall cost of building with Gemini while improving performance in text, code, and multimodal tasks.
"Two new production Gemini models, >2x higher rate limits, >50% price drop on Gemini 1.5 Pro, filters switched to opt-in, updated Flash 8B experimental model, and more. It's a good day to be a developer," posted Logan Kilpatrick, lead product manager at Google.
The new models show improvements in benchmarks, including a 20% gain in math-related tasks and gains of up to 7% in Python code generation and visual understanding. Developers will also get a more concise response style, aimed at reducing costs in summarisation and question-answering tasks. Additionally, Google is changing its default filter settings: as Kilpatrick's post notes, filters are now opt-in, giving developers greater flexibility in configuring safety features.
Gemini 1.5 Pro retains its long context window of 2 million tokens, while its input and output token prices have been reduced by 64% and 52%, respectively. The model is optimised for various tasks, including video understanding and large-scale document processing.
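To see how the two percentage cuts combine for a single request, here is a minimal Python sketch. The baseline prices and token counts are hypothetical placeholders chosen for easy arithmetic, not Google's published rates; only the 64% and 52% reductions come from the announcement.

```python
def reduced_price(old_price: float, reduction_pct: float) -> float:
    """Return the price after cutting old_price by reduction_pct percent."""
    return old_price * (1 - reduction_pct / 100)


def request_cost(input_tokens: int, output_tokens: int,
                 input_price_per_m: float, output_price_per_m: float) -> float:
    """Cost of one request, given prices per 1 million tokens."""
    return (input_tokens / 1_000_000) * input_price_per_m \
        + (output_tokens / 1_000_000) * output_price_per_m


# Hypothetical baseline prices in USD per 1M tokens (placeholders only).
OLD_INPUT_PRICE = 10.0
OLD_OUTPUT_PRICE = 20.0

# Reductions reported for Gemini 1.5 Pro: 64% on input, 52% on output.
NEW_INPUT_PRICE = reduced_price(OLD_INPUT_PRICE, 64)
NEW_OUTPUT_PRICE = reduced_price(OLD_OUTPUT_PRICE, 52)

# Hypothetical long-document request: 100,000 tokens in, 2,000 tokens out.
old_cost = request_cost(100_000, 2_000, OLD_INPUT_PRICE, OLD_OUTPUT_PRICE)
new_cost = request_cost(100_000, 2_000, NEW_INPUT_PRICE, NEW_OUTPUT_PRICE)
saving_pct = (1 - new_cost / old_cost) * 100

print(f"old: ${old_cost:.4f}  new: ${new_cost:.4f}  saving: {saving_pct:.1f}%")
```

Because the overall saving is a weighted average of the two cuts, an input-heavy workload such as large-document processing lands closer to 64%, while an output-heavy one lands closer to 52%.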
Google has also launched Gemini-1.5-Flash-8B-Exp-0924, an improved version of its experimental Flash 8B model, for developers looking to test experimental updates via Google AI Studio. These changes aim to make the Gemini platform more accessible for both production use and experimental development.
Developers can access the updated models for free on Google AI Studio or through the Gemini API. Larger organisations can also access them via Vertex AI. These updates will take effect on October 1, 2024.
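For readers who want to try the Gemini API route, the following is a rough sketch using only the Python standard library against the API's REST `generateContent` endpoint. The lowercase model ID `gemini-1.5-flash-002`, the `GEMINI_API_KEY` environment variable name, the prompt, and the helper names `build_request` and `generate` are assumptions made for illustration; check Google's current documentation for the model names available to your account. Since filters are now opt-in, the sketch also shows how a safety setting could be passed explicitly.

```python
import json
import os
import urllib.request

API_BASE = "https://generativelanguage.googleapis.com/v1beta"


def build_request(model, prompt, api_key, safety_settings=None):
    """Build (but do not send) a generateContent request for the given model."""
    body = {"contents": [{"parts": [{"text": prompt}]}]}
    if safety_settings:
        # Optional: explicitly opt in to specific safety filters.
        body["safetySettings"] = safety_settings
    return urllib.request.Request(
        f"{API_BASE}/models/{model}:generateContent",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json", "x-goog-api-key": api_key},
        method="POST",
    )


def generate(model, prompt, api_key, safety_settings=None):
    """Send the request and return the text of the first candidate."""
    request = build_request(model, prompt, api_key, safety_settings)
    with urllib.request.urlopen(request, timeout=60) as response:
        data = json.load(response)
    return data["candidates"][0]["content"]["parts"][0]["text"]


if __name__ == "__main__":
    key = os.environ.get("GEMINI_API_KEY")  # assumed variable name
    if key:
        filters = [{"category": "HARM_CATEGORY_HARASSMENT",
                    "threshold": "BLOCK_MEDIUM_AND_ABOVE"}]
        print(generate("gemini-1.5-flash-002",
                       "Summarise the benefits of lower token prices.",
                       key, filters))
```

Separating `build_request` from `generate` keeps the payload inspectable without a network call, which is handy when comparing the Pro and Flash models side by side.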
Google recently introduced DataGemma, a new open model that integrates LLMs with real-world data from its Data Commons repository, using retrieval-augmented methods like RIG and RAG to reduce AI hallucinations and improve the accuracy of generative AI outputs in research and decision-making contexts.
Moreover, Google recently announced that YouTube is set to roll out advanced generative AI tools for creators in the coming months, enabling them to generate video content using the AI models Veo and Imagen 3 through a feature called Dream Screen.
When is Google releasing Gemini 2? "That's the magic question," quipped Kilpatrick in an exclusive interview with AIM. However, he said the company plans to release Veo, Google Search Grounding, Gemini 2.0, and Agents next, with all four tentatively expected to debut in the coming months. "I think it'll be fun to see what the next six months to a year look like as the trend of unlocking new developments continues," said Kilpatrick.