ChatGPT Creator Prepares to Fight With New York Times and Authors Over ‘Fair Use’ of Copyrighted Works

An avalanche of high-profile lawsuits in federal court in New York will test the future of ChatGPT and other artificial intelligence products that wouldn’t be so eloquent if they hadn’t ingested huge troves of copyrighted human works.

But do AI chatbots (in this case, widely marketed products made by OpenAI and its business partner Microsoft) violate copyright and fair competition laws? Professional writers and media outlets will face a difficult fight to win that argument in court.

“I would like to be optimistic on behalf of the authors, but I am not. I just think they have an uphill battle here,” said copyright attorney Ashima Aggarwal, who used to work for academic publishing giant John Wiley & Sons.

One lawsuit comes from the New York Times. Another from a group of well-known novelists including John Grisham, Jodi Picoult and George R.R. Martin. A third from best-selling nonfiction writers, including the author of the Pulitzer Prize-winning biography on which the hit film “Oppenheimer” was based.

THE DEMANDS

Each of the lawsuits makes different allegations, but they all center on the San Francisco-based company OpenAI “building this product on top of other people’s intellectual property,” said attorney Justin Nelson, who represents nonfiction writers and whose law firm also represents the Times.

“What OpenAI is saying is that they have the right to take anyone else’s intellectual property since the beginning of time, as long as it’s on the Internet,” Nelson said.

The Times filed a lawsuit in December, arguing that ChatGPT and Microsoft’s Copilot compete with the very outlets they were trained on, diverting web traffic from the newspaper and other copyright holders who rely on advertising revenue generated by their content to continue producing their journalism. It also provided evidence of the chatbots spitting out Times articles word for word. At other times, the chatbots falsely attributed misinformation to the newspaper in a way the Times said damaged its reputation.

So far, a single federal judge is presiding over all three cases, as well as a fourth brought by two more nonfiction authors who filed another lawsuit last week. U.S. District Judge Sidney H. Stein has been on the Manhattan-based court since 1995, when he was nominated by then-President Bill Clinton.

THE ANSWER

OpenAI and Microsoft have not yet filed formal counterarguments to the New York cases, but OpenAI made a public statement this week describing the Times lawsuit as “without merit” and saying the chatbot’s ability to regurgitate some articles verbatim was a “rare bug.”

“Training AI models using publicly available Internet materials is fair use, as supported by long-standing and widely accepted precedents,” a company blog post said on Monday. It went on to suggest that the Times “either instructed the model to regurgitate or cherry-picked its examples from many attempts.”

OpenAI cited licensing agreements made last year with The Associated Press, German media company Axel Springer and other organizations to offer insight into how the company is trying to support a healthy news ecosystem. OpenAI is paying an undisclosed fee to license the AP news archive. The New York Times was in similar conversations before deciding to sue.

OpenAI said earlier this year that access to AP’s “high-quality factual text archive” would enhance the capabilities of its artificial intelligence systems. But its blog post this week downplayed the importance of news content for AI training, arguing that large language models learn from a “huge aggregate of human knowledge” and that any single data source, including The New York Times, “is not significant for the model’s intended learning.”

WHO WILL WIN?

Much of the AI industry’s argument rests on the “fair use” doctrine of U.S. copyright law, which allows limited uses of copyrighted materials for purposes such as teaching and research, or for transforming the protected work into something different.

So far, courts have largely sided with tech companies in interpreting how copyright laws should treat artificial intelligence systems. In a defeat for visual artists, a federal judge in San Francisco last year dismissed much of the first major lawsuit against AI image generators, though he allowed part of the case to continue. Another California judge rejected comedian Sarah Silverman’s arguments that Meta, Facebook’s parent company, infringed on the text of her memoir to build its AI model.

Subsequent cases filed over the past year have provided more detailed evidence, but Aggarwal said that when it comes to using copyrighted content to train artificial intelligence systems that deliver only a “small portion of that to users, the courts just don’t seem willing to find that to be copyright infringement.”

Most technology companies cite Google’s success in overcoming legal challenges to its online book library as a precedent. In 2016, the U.S. Supreme Court let stand lower court rulings that rejected authors’ claims that Google’s digitizing of millions of books and displaying snippets of them to the public amounted to copyright infringement.

But judges interpret fair use arguments on a case-by-case basis and “it really depends a lot on the facts,” including the economic impact and other factors, said Cathy Wolfe, an executive at the Dutch firm Wolters Kluwer, who also sits on the board of the Copyright Clearance Center, which helps negotiate print and digital media licenses in the U.S.

“Just because something is free on the Internet, on a website, doesn’t mean you can copy it and email it, much less use it to conduct commercial business,” Wolfe said. “I don’t know who’s going to win, but I’m certainly an advocate for protecting copyright for all of us. It drives innovation.”

BEYOND THE COURTS

Some media outlets and other content creators are looking beyond the courts and calling on lawmakers or the U.S. Copyright Office to strengthen copyright protection for the age of AI. A U.S. Senate Judiciary Committee panel will hear testimony from media executives and advocates on Wednesday in a hearing dedicated to the effect of AI on journalism.

Roger Lynch, CEO of the magazine publisher Condé Nast, plans to tell senators that generative AI companies “are using our stolen intellectual property to build replacement tools.”

“We believe a legislative solution may be simple: clarify that use of copyrighted content in conjunction with commercial Gen AI is not fair use and requires a license,” a copy of Lynch’s prepared comments reads.
