Google to use publisher content for AI training regardless of opt-out

Opting out of search indexing would ensure these web pages do not show up when a user uses Google's search engine

Google's Search products can reportedly use content from publishers even if they have opted out of artificial intelligence (AI) training.

According to a Bloomberg report, Google DeepMind Vice President Eli Collins revealed that the rules for honouring a publisher's decision to opt out of AI training differ between DeepMind's AI models and the company's Search products.

The US-based tech giant reportedly explained that content for Search is managed by a separate mechanism that uses the robots.txt web standard.

Diana Aguilar, an attorney representing the Department of Justice in the antitrust case, reportedly produced a document highlighting that 80 billion of the 160 billion tokens used to train Google's AI models came from content that publishers had opted out of AI training.

However, when Aguilar reportedly asked whether the Gemini AI model could use the same content if it was put inside the Search product, Collins confirmed that this was “correct,” as long as the use case was within Search.

Speaking to Bloomberg, a Google spokesperson stated that the rules for Search-based AI tools are different, as publishers can “only decline to have their data used in Search AI if they opt out of being indexed for search.”

Publishers can do this by using the robots.txt web standard to block Google's crawler bots from accessing the content to index it in search results.

Notably, this would also ensure that these web pages do not show up when a user uses Google's search engine to search for a topic.
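
The difference between the two opt-outs can be sketched with a pair of robots.txt files. This is a minimal illustration under stated assumptions, not Google's actual processing: the file contents and the example.com URL are hypothetical, and the `Google-Extended` token (which Google documents as its AI-training control) is not named in the report. The check uses Python's standard `urllib.robotparser` module.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt that opts out of AI training only.
# "Google-Extended" is an assumption here; the report does not name the token.
TRAINING_OPT_OUT = """\
User-agent: Google-Extended
Disallow: /
"""

# Hypothetical robots.txt that also blocks Google's search crawler,
# which removes the pages from search results as well.
SEARCH_OPT_OUT = """\
User-agent: Google-Extended
Disallow: /

User-agent: Googlebot
Disallow: /
"""

# Hypothetical article URL used only for the check below.
ARTICLE_URL = "https://example.com/news/article"


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for name, rules in [("training opt-out", TRAINING_OPT_OUT),
                        ("search opt-out", SEARCH_OPT_OUT)]:
        crawlable = is_allowed(rules, "Googlebot", ARTICLE_URL)
        print(f"{name}: Googlebot may crawl = {crawlable}")
```

Under the first file the search crawler can still reach the page, so it stays eligible for indexing; only the second file blocks it, at the cost of the page dropping out of search results.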