The New York Times Sues OpenAI and Microsoft for Using Its Stories to Train Chatbots

The New York Times filed a federal lawsuit Wednesday against OpenAI and Microsoft, seeking to stop the companies from using its stories to train chatbots.

The Times claims the companies are threatening its livelihood by effectively stealing billions of dollars’ worth of journalistic work, in some cases reproducing Times content for users who seek answers from generative AI tools such as OpenAI’s ChatGPT. The newspaper filed its complaint in Manhattan federal court after talks with the two companies, which began in April, broke down.

The shift of readers to online platforms has already hit the media hard. The Times and some other publishers have successfully navigated the digital environment, but AI’s rapid progress threatens to upend the publishing sector. Web traffic drives the paper’s online subscriptions and advertising revenue. The Times says AI chatbots divert that traffic from the paper and other copyright holders, making users less inclined to visit the original source for information.

“These bots compete with the content they are trained on,” said Ian B. Crosby, a partner at Susman Godfrey and lead counsel representing The Times.

In a prepared statement, an OpenAI representative said the company respects content producers’ rights and is “committed” to helping them benefit from the technology and new income models. “Our conversations with the New York Times have been productive and moving forward constructively, so we are surprised and disappointed with this development,” the representative added. “We’re hopeful we’ll find a mutually beneficial way to work together, as we do with many publishers.” Microsoft declined to comment.

To train generative AI chatbots, AI companies collect online content, including news stories. Large language models are trained on massive amounts of human-written material so that they can handle language and grammar and answer questions appropriately. However, the technology is still developing and has several flaws. For instance, the Times’ lawsuit says OpenAI’s GPT-4 misrepresented Wirecutter, the paper’s product reviews site, damaging its reputation.

As public and business interest in AI has skyrocketed this year, OpenAI and other AI companies, including rival Anthropic, have raised billions of dollars. Microsoft has a partnership with OpenAI that lets it use OpenAI’s AI technology. According to the lawsuit, Microsoft is OpenAI’s main backer and has invested at least $13 billion in the company since the two reached their agreement in 2019. Under the arrangement, Microsoft’s supercomputers power OpenAI’s AI research, and Microsoft integrates OpenAI’s technology into its own products.

The paper’s action follows a rise in copyright claims against OpenAI. Several writers, including comedian Sarah Silverman, have sued OpenAI for using their books to train its AI models without permission. More than 4,000 writers wrote to the CEOs of OpenAI and other tech companies in June, accusing them of exploitative practices in building chatbots. Fears about AI have led to labor unrest and litigation in various industries, including Hollywood.

Sarah Kreps, director of Cornell University’s Tech Policy Institute, said stakeholders are aware the technology could disrupt their economic model. The challenge is how to respond. Kreps agreed that chatbots threaten The New York Times. She added that fixing the issue completely will be difficult. “There’s so many other language models that are doing the same thing,” she said.

Wednesday’s lawsuit cites examples of OpenAI’s GPT-4 reproducing large portions of Times articles, including a Pulitzer Prize-winning, 18-month investigation into the taxi industry. It also cites outputs from Bing Chat (now Copilot) that included extracts from Times stories. Without specifying an amount, the Times said it wants to hold OpenAI and Microsoft accountable for the billions of dollars in statutory and actual damages it believes they owe for copying and using its work. It also wants the court to order the tech companies to destroy AI models and data sets that use its work.

The News/Media Alliance, a trade group representing over 2,200 news organizations, praised the Times’ action on Wednesday.

“If approached collaboratively, quality journalism and GenAI can complement each other,” said alliance president and CEO Danielle Coffey. “But using journalism without permission or payment is illegal and unfair.”

In July, OpenAI struck a deal with The Associated Press to license the AP’s news archive. This month, OpenAI also partnered with Berlin-based Axel Springer, which owns Politico and Business Insider. Under the partnership, Axel Springer’s media companies will provide “selected global news content” to users of OpenAI’s ChatGPT. Both companies said the chatbot’s responses will include attribution and links to the original publications.

The Times has compared its move to the copyright case brought against Napster 22 years ago over its illegal use of record labels’ property. The record labels won, and Napster disappeared, but it changed the industry: industry-backed streaming now dominates the music business.

