Google has upgraded Gemini Deep Research, with a focus on making research more interactive and accessible. Alongside the update, the company has introduced the Interactions API, a new interface aimed at developers and researchers.
The release enhances the Gemini Deep Research agent, a tool for automating complex research. The agent runs on the Gemini 3 Pro model and can interpret handwriting, graphs, and mathematical symbols, folding that data into its reports and searches. This matters because it lets the system access and analyze information that was previously difficult to automate.
The agent works iteratively and plans its research process independently: it crafts search queries, assesses the results, and identifies missing information. Google also claims improved web navigation, allowing users to delve deeper into websites. Users can upload documents, which the agent scans for key passages and then summarizes, interprets, or merges with external data.
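To make the plan, search, assess, and fill-the-gaps cycle concrete, here is a minimal Python sketch. It is not Google's implementation and does not call any Gemini API. Every name in it (`ResearchState`, `research_loop`, `keyword_search`) is hypothetical, and the "search engine" is a plain keyword match over an in-memory list of strings.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class ResearchState:
    """Tracks what has been found and what is still missing (hypothetical type)."""
    question: str
    open_topics: list[str]
    findings: dict[str, list[str]] = field(default_factory=dict)


def keyword_search(corpus: list[str]) -> Callable[[str], list[str]]:
    """Stand-in for a real search backend: match documents containing every query word."""
    def search(query: str) -> list[str]:
        words = query.lower().split()
        return [doc for doc in corpus if all(w in doc.lower() for w in words)]
    return search


def research_loop(question: str, topics: list[str],
                  search: Callable[[str], list[str]],
                  max_rounds: int = 3) -> ResearchState:
    state = ResearchState(question, list(topics))
    for round_no in range(max_rounds):
        if not state.open_topics:
            break  # nothing left to look up
        still_open = []
        for topic in state.open_topics:
            # Craft a query: narrow in the first round, broader afterwards.
            query = f"{question} {topic}" if round_no == 0 else topic
            results = search(query)
            # Assess the results: keep only hits that mention the topic.
            relevant = [r for r in results if topic.lower() in r.lower()]
            if relevant:
                state.findings[topic] = relevant
            else:
                still_open.append(topic)  # identified as missing information
        state.open_topics = still_open
    return state
```

Called with a small corpus, the loop resolves some topics in the first round, picks up others after broadening the query, and leaves anything it never found in `open_topics`. A real agent would replace the keyword match with web search and the substring check with a model-based relevance judgment.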
The Interactions API serves as a central hub for both Gemini models and pre-built agents. Google plans to expand the API with more agents and with support for custom agents. Because the API handles data management, developers spend less time on it, and models can also connect to external systems.
In benchmark tests, the agent scored 46.4% on Humanity's Last Exam, a challenging test covering math, physics, and programming. It also performed well on Google's DeepSearchQA, a dataset that measures precision and completeness in multi-step tasks. Google additionally uses this dataset to explore how extended reasoning and processing time affect model performance.
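For readers unfamiliar with the two measures, the sketch below shows one common way to score a multi-part answer for precision and completeness. It assumes completeness means recall against a reference answer set. That is an assumption made for illustration, not DeepSearchQA's published scoring method, and the function name is hypothetical.

```python
def precision_and_completeness(predicted, reference):
    """Return (precision, completeness) for a set-valued answer.

    precision    = share of predicted items that are correct
    completeness = share of reference items that were found (i.e. recall)
    """
    predicted, reference = set(predicted), set(reference)
    correct = predicted & reference
    precision = len(correct) / len(predicted) if predicted else 0.0
    completeness = len(correct) / len(reference) if reference else 0.0
    return precision, completeness
```

An answer that lists four items, two of which appear in a three-item reference set, would score 0.5 on precision and roughly 0.67 on completeness. It is penalized both for the two wrong items and for the one it missed.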
SiliconANGLE reports that Google Deep Research is tailored for industries that rely heavily on document analysis and source collection. The agent aims to automate tedious tasks, while the Interactions API handles integration into applications. Google's aim is to demonstrate how language models can tackle research beyond conventional search and summary tasks, which feeds into the wider debate on the future of AI-assisted research.