On Monday, Google launched a new rewards program dedicated specifically to finding bugs in AI products. Google's list of qualifying bugs includes examples of the rogue actions it's looking for, like an indirect prompt injection that causes Google Home to unlock a door, or a data-exfiltration prompt injection that summarizes all of someone's emails and sends the summary to the attacker's own account.
The new program clarifies what counts as an AI bug, defining them as issues that use a large language model or generative AI system to cause harm or exploit a security vulnerability, with rogue actions at the top of the list. That includes modifying someone's account or data in a way that weakens their security or triggers an unwanted action, like a previously disclosed flaw that could open smart shutters and turn off the lights using a poisoned Google Calendar event.
Just getting Gemini to hallucinate won't cut it. The company says that issues with content produced by AI products, such as the generation of hate speech or copyright-infringing material, should be reported through the product's own feedback channel instead. According to Google, this way its AI safety teams can "diagnose model behavior and implement the necessary long-term safety training."
Along with the new AI rewards program, Google also announced on Monday an AI agent called CodeMender that fixes vulnerable code. The company says the agent has already contributed 72 security patches to open source projects, each verified by a human researcher before submission.
The top $20,000 reward goes to rogue actions found in Google's "flagship" products: Search, the Gemini apps, and core Workspace apps like Gmail and Drive. Multipliers for report quality and a novelty bonus are also available, which could bring the total payout to $30,000. The payout drops for bugs found in other Google products, like Jules or NotebookLM, and for lower-severity abuses, such as stealing a model's secret settings.