With its latest update, Google Gemini 1.5 Pro gains “ears”: the model can now take uploaded audio files as input. This lets it summarize recordings such as press conferences directly from the audio, with no written transcript required. The Imagen 2 image generator also gains new image-editing features.
Expanded Capabilities of Gemini 1.5 Pro
At the recent Google Cloud Next event, Google announced that Gemini 1.5 Pro would be available to the public through Vertex AI, its cloud platform for building artificial intelligence applications. Most users interact with Gemini models through the Gemini chatbot. The most capable variant, Gemini Ultra, powers the paid Gemini Advanced chatbot. While Gemini Advanced supports long queries, it is slower than Gemini 1.5 Pro.
Enhancements in Imagen 2
The Imagen 2 image generator gains editing features that let users add or remove elements in generated images. In addition, every image produced by Imagen 2 now carries a SynthID digital watermark, which makes the image traceable while remaining imperceptible to human observers.
Improving Relevance in AI Responses
Google says it is working to make AI responses more relevant by grounding them in search engine results. This is a known challenge for large language models, and Google is exploring ways to keep the information they provide accurate and up to date. NIXSolutions notes that some limitations are intentional; for instance, Google restricts Gemini from answering queries related to US elections.
As Google continues to develop its artificial intelligence products, we’ll keep you updated on further developments and enhancements.