OpenAI Asserts Copyright’s Crucial Role in AI Development: Deems Creation of Tools Like ChatGPT ‘Impossible’ Without Protected Material

Amelia Smith
January 9, 2024

Pressure on AI Firms Rises as OpenAI Defends Use of Copyrighted Data in Creating Advanced Models

OpenAI, the developer of the chatbot ChatGPT, has said that creating cutting-edge AI tools would be “impossible” without access to copyrighted material.

The statement comes amid mounting scrutiny and legal challenges faced by AI firms over the content they use to train their products.

ChatGPT and similar AI tools, including image generators like Stable Diffusion, are trained on vast datasets sourced from the internet, much of which is protected by copyright. This legal safeguard prevents the unauthorized use of someone’s work, a contentious issue now at the forefront of the AI industry.

The debate intensified recently when the New York Times sued OpenAI and Microsoft, a leading investor in the company. The lawsuit accuses them of “unlawful use” of the New York Times’ work to develop their products.

OpenAI set out its position in a submission to the House of Lords communications and digital select committee, arguing that copyrighted material is indispensable for training large language models such as GPT-4, the model behind ChatGPT.

OpenAI argued, “Because copyright today covers virtually every sort of human expression… it would be impossible to train today’s leading AI models without using copyrighted materials.” The organization dismissed the idea of limiting training data to public domain works from over a century ago, asserting that such an approach would yield interesting experiments but fall short of meeting the needs of contemporary citizens.

The defense put forth by AI companies often rests on the legal doctrine of “fair use,” which permits certain uses of copyrighted content without explicit permission. OpenAI, in its submission, maintained that “legally, copyright law does not forbid training.”

The New York Times lawsuit is one of several legal challenges facing OpenAI and the wider AI industry. Authors including John Grisham, Jodi Picoult, and George RR Martin have sued OpenAI, alleging “systematic theft on a mass scale,” while Getty Images is suing Stability AI, the maker of Stable Diffusion, over alleged copyright breaches.

OpenAI’s House of Lords submission also addressed AI safety. The company said it supports independent analysis of its security measures and endorsed “red-teaming,” in which third-party researchers test the safety of a product by simulating the behavior of rogue actors.

OpenAI is also among the companies that have agreed to work with governments on safety testing of their most powerful models before and after deployment, as part of an agreement struck at a global safety summit in the UK the previous year. As the debate over copyright, fair use, and AI development continues, OpenAI’s position underscores the tension between innovation and existing legal frameworks.
