Under pressure from new competitors, Google has added AI summaries to its search results. However, it seems to be having teething problems. Much of that has to do with the internet information sources Google and other AI services are using. For instance, like OpenAI, Google is using Reddit as a source of data to train its models. It is well known that such conversational platforms have comments that vary from good to garbage.
Peter Yang shared one such example on Twitter. When asked how best to stick cheese to pizza, Google’s AI summary suggested adding glue. That ludicrous suggestion came from an 11-year-old Reddit comment. As I had pointed out in an earlier post, we should not be surprised that we are getting “garbage” outputs. It is going to get worse before it gets better.
This is often the case with technology. I’m fairly certain few remember that nearly a decade ago, Google had to confront an “anus” and “bum” problem.
Google had developed new OCR software for scanning books into Google Books. Just like the “AI Summaries,” it had bugs — it read “arms” as “anus” and “burn” as “bum” in certain typefaces. There were other such bugs.
That bug took some of the books in hilarious directions. For instance, here is a quote from John Mackay Wilson’s “Tales of the Borders”:
“…poor Janet shuddered at the words which she heard him utter for with strange and wicked oaths he vowed vengeance on the individual who’d persecuted him and she flung her anus around his neck….”
Here is another one from “Matisse on the Loose” by Georgina Bragg:
… “When she spotted me, she flung her anus high in the air and kept them up until she reached me. ‘Matisse. Oh boy!’ she said. She grabbed my anus and positioned my body in the direction of the east gallery and we started walking.”…
Google Books’ OCR has always provided great fodder for the literary minded — as so well articulated in this New Yorker article, The Artful Accident of Google Books. It also inspired its own Tumblr, The Art of Google Books.
Back to the present. Like those funny quotes, some of these AI summaries might give us an opportunity to chuckle, but this is no laughing matter. The garbage data will only reinforce more garbage information.
Given the scale of technology and Google’s influence, the implications of such “mistakes” can be life-threatening. It is not just Google. OpenAI is a snake pit of misinformation, as a group of Purdue University researchers has found.
Our analysis shows that 52% of ChatGPT answers contain incorrect information and 77% are verbose. Nonetheless, our user study participants still preferred ChatGPT answers 35% of the time due to their comprehensiveness and well-articulated language style. However, they also overlooked the misinformation in the ChatGPT answers 39% of the time. This implies the need to counter misinformation in ChatGPT answers to programming questions and raise awareness of the risks associated with seemingly correct answers.
We have entered into a new vortex of information callousness — whose impact we can only understand when looking back at the present.
May 24, 2024. San Francisco