Last week, Google unveiled its biggest search shift in years, showing off new artificial intelligence features that answer people's questions in the company's bid to catch up to rivals Microsoft and OpenAI.
The new technology has since spawned a litany of falsehoods and errors – including recommending glue as a pizza ingredient and eating rocks for their nutrients – giving Google a black eye and causing an uproar online.
Incorrect answers in the feature, called AI Overview, have undermined trust in a search engine that more than two billion people turn to for authoritative information. And while other AI-powered chatbots tell lies and behave strangely, the backlash has shown that Google is under more pressure to safely incorporate AI into its search engine.
The launch also extends a pattern of Google's latest AI features running into problems soon after rolling out. In February 2023, when Google announced Bard, a chatbot built to compete with ChatGPT, it shared inaccurate information about a space telescope in its demonstration. The company's market value subsequently fell by $100 billion.
This February, the company released Bard's successor, Gemini, a chatbot that can generate images and act as a voice-controlled digital assistant. Users quickly realized that the system refused to generate images of white people in most cases and drew inaccurate depictions of historical figures.
With each incident, tech industry insiders have criticized the company for dropping the ball. But in interviews, financial analysts said Google needed to move quickly to keep up with its rivals, even if that meant growing pains.
Google “has no choice right now,” Thomas Monteiro, a Google analyst at Investing.com, said in an interview. “Companies need to move very quickly, even if it means skipping a few steps along the way. User experience will just have to catch up.”
Lara Levin, a Google spokeswoman, said in a statement that the vast majority of AI Overview queries produced “high-quality information, with links to dig deeper into the web.” The AI-generated result from the tool is usually displayed at the top of the results page.
“Many of the examples we saw were unusual queries, and we also saw examples that were spoofed or that we couldn't reproduce,” she added. The company will use “isolated examples” of problematic responses to refine its system.
Ever since OpenAI released its chatbot ChatGPT in late 2022 and it became an overnight phenomenon, Google has been under pressure to integrate AI into its popular apps. But there are challenges in taming large language models, which learn from huge amounts of data taken from the open web – including falsehoods and satirical posts – rather than being programmed like traditional software.
(The New York Times sued OpenAI and its partner, Microsoft, in December, alleging copyright infringement of news content related to artificial intelligence systems.)
Google announced AI Overview with much fanfare at its annual developer conference, I/O, last week. For the first time, the company had connected Gemini, its latest large language model, to its flagship product, its search engine.
AI Overview combines statements generated by its language models with snippets from live links across the web. It can cite its sources, but it doesn't know when a source is incorrect.
The system was designed to answer more complex and specific questions than normal searches. The result, the company said, was that the public could benefit from everything Gemini could do, taking some of the legwork out of searching for information.
But things quickly went awry, and users posted screenshots of problematic examples on social media platforms like X.
AI Overview instructed some users to mix non-toxic glue into pizza sauce to keep the cheese from slipping off, a phony recipe that appeared to be borrowed from an 11-year-old Reddit post that was meant as a joke. The AI told other users to ingest at least one rock a day for vitamins and minerals, advice that originated in a satirical post by The Onion.
As the company's cash cow, Google search is “the only property Google needs to keep relevant/reliable/useful,” Gergely Orosz, a software engineer who writes the technology newsletter The Pragmatic Engineer, wrote on X. “And yet, examples of how AI Overviews are turning Google search into garbage are all over my timeline.”
People also shared examples of Google telling users to clean their washing machines with “bleach and white vinegar,” a combination that can produce harmful chlorine gas when mixed. In smaller font, the answer did tell users to clean with one, then the other.
Social media users tried to compete with each other over who could share Google's most bizarre answers. In some cases, they falsified the results. A doctored screenshot appeared to show Google saying that a good remedy for depression would be to jump off the Golden Gate Bridge, quoting a Reddit user. Ms. Levin, the Google spokeswoman, said the company's systems never returned that result.
AI Overview did, however, struggle with presidential history, saying that 17 presidents were white and that Barack Obama was the first Muslim president, according to screenshots posted on X.
It also said that Andrew Jackson graduated from college in 2005.
Kevin Roose contributed to the reporting.