Reality Check Commentary: Generative AI Models Are Finally Paying to be Trained on Trustworthy Journalism
NewsGuard Co-CEO Gordon Crovitz writes that the deal OpenAI made to give ChatGPT access to Financial Times journalism is a sign that the AI models know they need to stop hallucinating
Welcome to NewsGuard's Reality Check, a report on how misinformation online is undermining trust — and who’s behind it.
Generative AI Models Are Finally Paying to be Trained on Trustworthy Journalism
By Gordon Crovitz, NewsGuard Co-CEO
This week brought the latest hopeful sign that the generative AI companies are getting serious about the trustworthiness of how their models treat topics in the news. Reports from NewsGuard have shown that the AI models have a strong propensity to create and spread false claims — “hallucinations” — on topics in the news, underscoring the urgent need for the models to be trained on reliable news sources and to learn to distinguish truth from fiction.
Trustworthy training data, please: The AI models are “trained” on whatever they can find on the internet, so when people ask the chatbots about topics in the news, their responses are based on the news sources the models are able to access. OpenAI just announced a licensing agreement with the Financial Times, the latest news publisher to strike such a deal, which means that ChatGPT will be able to use the highly regarded London-based source of financial and business news in its training data. OpenAI previously made deals with The Associated Press, Berlin-based Axel Springer, and France’s Le Monde.
The press release announcing the FT deal is notable for the frank recognition of the need for the AI models to get access to quality journalism for their training data. Brad Lightcap, the chief operating officer of OpenAI, said that the deal would “enrich the ChatGPT experience with real-time, world-class journalism for millions of people around the world.”
FT CEO John Ridding said, “It’s right, of course, that AI platforms pay publishers for the use of their material.” He credited OpenAI with understanding “the importance of transparency, attribution, and compensation — all essential for us. At the same time, it’s clearly in the interests of users that these products contain reliable sources.”
The generative AI models need to license quality news because, as PCMag reported in January, more than one-third of the top 100 most popular news websites have blocked ChatGPT from being trained on their journalism until licensing deals are reached. Meanwhile, plenty of low-quality publishers allow AI models to train on their content without a license. These include sources that NewsGuard analysts determined repeatedly publish false content, such as Russian disinformation websites, conspiracy sites, and healthcare hoax sites. No wonder the AI models spread fabrications about the news instead of providing factual reporting.
Meta and Google vs. OpenAI: Contrast OpenAI’s willingness to pay for access to journalism with the continued resistance by the Silicon Valley digital platforms to license quality news. California is considering a bill, similar to laws passed in Australia and Canada, that would require social media and search companies to pay publishers to display or link to copyrighted news. Google and Meta’s Facebook and Instagram object. This month, Google announced it is blocking links to California-based news publishers from appearing in search results for some users in California. Google wouldn’t say how many of its users are being denied news links in their search results or which local news sites it’s boycotting, describing its strongarm political tactic only as a “short-term test.”
Social media and search platforms have long been superspreaders of misinformation, failing to warn their users about who’s feeding them the news on their platforms. Their business model is to maximize engagement — even engagement driven by misinformation — because engagement maximizes their advertising revenue. The generative AI models, in contrast, know they need to stop spreading false claims and start earning trust from their customers among companies and governments.
By licensing quality news, AI models can replace garbage in, garbage out with trustworthy news in, trustworthy news out. Whatever OpenAI agreed to pay the FT (terms were not disclosed) and whatever it agrees to pay the many other publishers whose journalism it needs to access, the investment in quality news will likely be a bargain.
As an editorial in the FT back in 2021 warned, “Building trust in AI systems is essential.”
Gordon Crovitz is the Co-CEO and Co-Editor-In-Chief of NewsGuard. Previously, he was publisher of The Wall Street Journal.
We launched Reality Check after seeing how much interest there is in our work beyond the business and tech communities that we serve. Subscribe to this newsletter to support our apolitical mission to counter misinformation for readers, brands, and democracies. Have feedback? Send us an email: realitycheck@newsguardtech.com.