Categories: Technology

“Gemini 2.5 Use of the computer” has solid web performance, Android

Google now allows developers to preview the model for using the Gemini 2.5 computer behind Project Mariner and agent features in AI mode.

This “specialized model” can interact with graphic user interfaces, in particular browsers and websites. There are several steps that occur in a loop “until the task is over”.

  • Send a request to the model: entries include “user demand, environmental screenshot and recent actions history”.
  • “The model then analyzes these entries and generates a response, generally a function call representing one of the actions of the user interface such as the clicking or entry.”
  • Receive the model’s response: “… The Customer Code then performs the action received.”
  • “Once the action is carried out, a new screenshot of the graphical interface and the current URL are returned to the model for using the computer as a function of function restarting the loop.”

Other user interface actions supported by the model include backward / search, web search, navigation to a specific URL, overview of the cursor, keyboard combinations, scrolling and draining and drop.

Google shared two examples (at a 3X speed) with the following prompts:

Advertising – Scroll for more content

“From https://tinyurl.com/pet-care-signup, get all the details for any pet with a residence in California and add them as a guest in my CRM SPA at https://pet-luxe-spa.web.app/. Then organizes the reason for the visit of the animal specialist.

“”My art club has thought about the tasks before our fair. The board is chaotic and I need your help to organize the tasks in certain categories I have created. Go to Sticky-note-jam.web.app and make sure that the notes are clearly in the right sections. Slide them there if not.

The use of the Gemini 2.5 computer is “mainly optimized for web browsers”. However, Google has a “Androidworld” reference which “demonstrates a strong promise for control tasks of the mobile user interface”, while it is not yet optimized for office OS level control “.

Google has demonstrated solid performance on web and mobile control references compared to the offer from Claude and Openai, as well as “the top quality for browser control to the lowest latency”.

This model is built on the capacities of visual understanding and reasoning of Gemini 2.5 Pro. Google indicates that “the versions of this model” Power Project Mariner and the agent capabilities of the AI ​​mode. It was used internally for user interface tests to accelerate software development, while Google has an early access program for third -party developer developers and workflow automation tools.

Gemini 2.5 The use of the computer is available in the public preview today via Gemini API in Google AI Studio and Vertex AI.

Try it now: In an environment of demonstration hosted by Browserbase.


FTC: We use automatic income affiliation links. More.

Source link

James Walker

James Walker – Technology Correspondent Writes about AI, Apple, Google, and emerging innovations.

Recent Posts

New York Giants hire John Harbaugh as coach

John Harbaugh agreed Saturday to become coach of the New York Giants, finalizing the longtime big-market franchise's all-out search for…

4 days ago

After U-Va. resignations, Spanberger appoints 27 to Virginia college boards

Virginia Gov. Abigail Spanberger (D) moved quickly to change direction at the state's universities in her first hours in office…

4 days ago

Lamar Odom arrested and booked for drunk driving

Lamar Odom faces new legal problems. The two-time NBA champion was arrested and convicted of driving under the influence on…

4 days ago

BMC elections 2026: Here’s how to check your name in the Mumbai electoral roll

Polling for the Maharashtra municipal corporation elections, including that of the crucial and cash-rich Brihanmumbai Municipal Corporation (BMC), will be…

4 days ago

Trump: I might want to keep Hassett where he is

Trump appears to rule out Hassett as Fed chairman in his comments.Trump said Hassett was good on television today and…

4 days ago

Broncos take 20-10 halftime lead as Josh Allen’s fumble sets up last-second field goal

An incredibly costly fumble by Josh Allen changed the game just before halftime today in Denver.After the Broncos scored a…

4 days ago