NYT v. GPT: Microsoft finds itself on the other side of an industry-defining copyright dispute

The New York Times headquarters in Manhattan. The newspaper’s lawsuit this week against Microsoft and OpenAI asks a federal judge to order the destruction of all AI large language models trained on its copyrighted material. (GeekWire Photo / Kurt Schlosser)

“Who can afford to do professional work for nothing?”

Bill Gates wrote those words in a landmark open letter in February 1976, calling on personal computer hobbyists to stop stealing software produced by the scrappy young “Micro-Soft” team without paying for it.

Nearly five decades later, the same sentiment sums up The New York Times Co.’s lawsuit against Microsoft and OpenAI this week. The suit alleges that the companies wrongly used vast amounts of copyrighted material from the newspaper to train the large language models that power ChatGPT and other artificial intelligence products.

Microsoft and OpenAI “seek to free-ride on The Times’s massive investment in its journalism by using it to build substitutive products without permission or payment,” the Times Co. alleges in the introduction to the suit.

In other words, a company that built a technology empire on intellectual property rights now faces allegations that it’s flouting them in pursuit of its next phase of innovation and growth.

That irony is one of a few thoughts that came to mind while reading through the 69-page complaint filed by The Times Co. against Microsoft and OpenAI in federal court in New York City on Wednesday.

With the lawsuit, the Times Co. is drawing a legal line in the sand for the new era of AI. In addition to pursuing unspecified financial damages, the suit seeks an injunction against Microsoft and OpenAI to halt the alleged practice of using the Times’ copyrighted material, and a court order for the destruction “of all GPT or other LLM models and training sets” that incorporate its copyrighted work.

The lawsuit also delves into the implications of AI for journalism and democracy.

“Producing Times journalism is a creative and deeply human endeavor,” the suit says, adding later, “If The Times and other news organizations cannot produce and protect their independent journalism, there will be a vacuum that no computer or artificial intelligence can fill.”

The claims cover not just the alleged use of New York Times articles to train large language models, but also answers provided by ChatGPT and Microsoft Bing Chat to users. Several examples in the suit show how different prompts cause GPT-4, ChatGPT, and Bing Chat to reproduce large chunks of text, verbatim, from the newspaper’s articles.

An exhibit in the New York Times Co. lawsuit, with verbatim text in red. (Via U.S. District Court for the Southern District of New York.)

In other cases, the suit cites instances in which the AI attributes information to the Times incorrectly.

In one example, the suit says, Bing Chat “confidently purported to reproduce the sixth paragraph” from a widely cited 2015 New York Times article, “Inside Amazon: Wrestling Big Ideas in a Bruising Workplace.”

“Had Bing Chat actually done so, it would have committed copyright infringement,” the suit says. “But in this instance, Bing Chat completely fabricated a paragraph, including specific quotes attributed to Steve Forbes’s daughter Moira Forbes, that appear nowhere in The Times article in question or anywhere else on the internet.”

“In AI parlance, this is called a ‘hallucination.’ In plain English, it’s misinformation,” the suit says, generally referencing these types of instances.

This is one of numerous suits that promise to lay the groundwork for U.S. courts to establish precedents that could define the economics of artificial intelligence. Media consultant Dick Tofel told the New York Times in its own coverage of the lawsuit that a Supreme Court ruling on the issue is “essentially inevitable.”

Here is OpenAI’s statement:

“We respect the rights of content creators and owners and are committed to working with them to ensure they benefit from AI technology and new revenue models. Our ongoing conversations with the New York Times have been productive and moving forward constructively, so we are surprised and disappointed with this development. We’re hopeful that we will find a mutually beneficial way to work together, as we are doing with many other publishers.”

On the off chance that Microsoft tries to distance itself from OpenAI, the suit makes extensive use of comments from Microsoft CEO Satya Nadella to illustrate the company’s instrumental role in developing and distributing the technologies in question, including statements he made to reassure investors and customers when OpenAI CEO Sam Altman was briefly ousted from the role in November.

Microsoft had not commented on the suit as of Thursday morning Pacific time.
