Copyright Office head fired after reporting AI training isn’t always fair use


Cops scuffle with Trump picks at Copyright Office after AI report stuns tech industry.

A man holds a flag that reads "Shame" outside the Library of Congress on May 12, 2025 in Washington, DC. President Donald Trump fired Carla Hayden, the head of the Library of Congress, on May 8 and fired Shira Perlmutter, the head of the US Copyright Office, just days later. Credit: Kayla Bartkowski / Staff | Getty Images News

A day after the US Copyright Office dropped a bombshell pre-publication report challenging artificial intelligence firms' argument that all AI training should be considered fair use, the Trump administration fired the head of the Copyright Office, Shira Perlmutter—sparking speculation that the controversial report hastened her removal.

Tensions have apparently only escalated since. Now, as industry advocates decry the report as overstepping the office's authority, social media posts on Monday described an apparent standoff at the Copyright Office between Capitol Police and men rumored to be with Elon Musk's Department of Government Efficiency (DOGE).

A source familiar with the matter told Wired that the men were actually "Brian Nieves, who claimed he was the new deputy librarian, and Paul Perkins, who said he was the new acting director of the Copyright Office, as well as acting Registrar," but it remains "unclear whether the men accurately identified themselves." A spokesperson for the Capitol Police told Wired that no one was escorted off the premises or denied entry to the office.

Perlmutter's firing followed Donald Trump's removal of Librarian of Congress Carla Hayden, who, NPR noted, was the first African American to hold the post. Responding to public backlash, White House Press Secretary Karoline Leavitt claimed that the firing was due to "quite concerning things that she had done at the Library of Congress in the pursuit of DEI and putting inappropriate books in the library for children."

The Library of Congress houses the Copyright Office, and critics suggested Trump's firings were unacceptable intrusions into cultural institutions that are supposed to operate independently of the executive branch. In a statement, Rep. Joe Morelle (D.-N.Y.) condemned Perlmutter's removal as "a brazen, unprecedented power grab with no legal basis."

Accusing Trump of trampling Congress' authority, he suggested that Musk and other tech leaders racing to dominate the AI industry stood to directly benefit from Trump's meddling at the Copyright Office. Likely most threatening to tech firms, the guidance from Perlmutter's office not only suggested that AI training on copyrighted works may not be fair use when outputs threaten to disrupt creative markets (as publishers and authors have argued in several lawsuits aimed at the biggest AI firms) but also encouraged more licensing to compensate creators.

"It is surely no coincidence [Trump] acted less than a day after she refused to rubber-stamp Elon Musk’s efforts to mine troves of copyrighted works to train AI models," Morelle said, seemingly referencing Musk's xAI chatbot, Grok.

Agreeing with Morelle, Courtney Radsch—the director of the Center for Journalism & Liberty at the left-leaning think tank the Open Markets Institute—said in a statement provided to Ars that Perlmutter's firing "appears directly linked to her office's new AI report questioning unlimited harvesting of copyrighted materials."

"This unprecedented executive intrusion into the Library of Congress comes directly after Perlmutter released a copyright report challenging the tech elite's fundamental claim: unlimited access to creators' work without permission or compensation," Radsch said. And it comes "after months of lobbying by the corporate billionaires" who "donated" millions to Trump's inauguration and "have lapped up the largess of government subsidies as they pursue AI dominance."

What the Copyright Office says about fair use

The report that the Copyright Office released on Friday is not finalized but is not expected to change radically, unless Trump's new acting head intervenes to overhaul the guidance.

It comes after the Copyright Office parsed more than 10,000 comments debating whether creators should and could feasibly be compensated for the use of their works in AI training.

"The stakes are high," the office acknowledged, but ultimately, there must be an effective balance struck between the public interests in "maintaining a thriving creative community" and "allowing technological innovation to flourish." Notably, the office concluded that the first and fourth factors of fair use—which assess the character of the use (and whether it is transformative) and how that use affects the market—are likely to hold the most weight in court.

According to Radsch, the report "raised crucial points that the tech elite don’t want acknowledged." First, the Copyright Office acknowledged that it's an open question how much data an AI developer needs to build an effective model. Then, it noted that there's a need for a consent framework beyond putting the onus on creators to opt their works out of AI training. Perhaps most alarmingly, it concluded that "AI trained on copyrighted works could replace original creators in the marketplace."

"Commenters painted a dire picture of what unlicensed training would mean for artists’ livelihoods," the Copyright Office said, while industry advocates argued that giving artists the power to hamper or "kill" AI development could result in "far less competition, far less innovation, and very likely the loss of the United States’ position as the leader in global AI development."

To prevent both harms, the Copyright Office expects that some AI training will be deemed fair use, such as training viewed as transformative because the resulting models don't compete with creative works. Those uses threaten no market harm but rather solve a societal need, such as language models translating texts, moderating content, or correcting grammar. In the case of audio models, technology that helps producers clean up unwanted distortion might be fair use, whereas models that generate songs in the style of popular artists might not, the office opined.

But while "training a generative AI foundation model on a large and diverse dataset will often be transformative," the office said that "not every transformative use is a fair one," especially if the AI model serves the same purpose as the copyrighted works it was trained on. Consider an example like chatbots regurgitating news articles, as is alleged in The New York Times' dispute with OpenAI over ChatGPT.

"In such cases, unless the original work itself is being targeted for comment or parody, it is hard to see the use as transformative," the Copyright Office said. One possible solution for AI firms hoping to preserve the utility of their chatbots, however, could be effective filters that "prevent the generation of infringing content."

Tech industry accuses Copyright Office of overreach

Only courts can effectively weigh the balance of fair use, the Copyright Office said. Perhaps importantly, however, the thinking of one of the first judges to weigh the question—in a case challenging Meta's torrenting of a pirated books dataset to train its AI models—seemed to align with the Copyright Office guidance at a recent hearing. Mulling whether Meta infringed on book authors' rights, US District Judge Vince Chhabria explained why he doesn't immediately "understand how that can be fair use."

"You have companies using copyright-protected material to create a product that is capable of producing an infinite number of competing products," Chhabria said. "You are dramatically changing, you might even say obliterating, the market for that person's work, and you're saying that you don't even have to pay a license to that person."

Some AI critics think the courts have already indicated which way they are leaning. In a statement to Ars, a New York Times spokesperson suggested that "both the Copyright Office and courts have recognized what should be obvious: when generative AI products give users outputs that compete with the original works on which they were trained, that unprecedented theft of millions of copyrighted works by developers for their own commercial benefit is not fair use."

The NYT spokesperson further praised the Copyright Office for agreeing that using Retrieval-Augmented Generation (RAG) AI to surface copyrighted content "is less likely to be transformative where the purpose is to generate outputs that summarize or provide abridged versions of retrieved copyrighted works, such as news articles, as opposed to hyperlinks." If courts agree with the RAG finding, that could disrupt AI search models from every major tech company.

The backlash from industry stakeholders was immediate.

Matt Schruers, the president and CEO of the Computer & Communications Industry Association, a trade association, said the report raised several concerns, particularly by endorsing "an expansive theory of market harm for fair use purposes that would allow rightsholders to block any use that might have a general effect on the market for copyrighted works, even if it doesn’t impact the rightsholder themself."

Similarly, the tech industry policy coalition Chamber of Progress warned that "the report does not go far enough to support innovation and unnecessarily muddies the waters on what should be clear cases of transformative use with copyrighted works." Both groups celebrated the fact that the final decision on fair use would rest with courts.

The Copyright Office agreed that "it is not possible to prejudge the result in any particular case" but said that precedent supports some "general observations." Those included the suggestion that, where uses are not considered fair, licensing deals may be appropriate and need not disrupt "American leadership" in AI, as some AI firms have claimed they would.

"These groundbreaking technologies should benefit both the innovators who design them and the creators whose content fuels them, as well as the general public," the report said, ending with the office promising to continue working with Congress to inform AI laws.

Copyright Office seemingly opposes Meta’s torrenting

Also among those "general observations," the Copyright Office wrote that "making commercial use of vast troves of copyrighted works to produce expressive content that competes with them in existing markets, especially where this is accomplished through illegal access, goes beyond established fair use boundaries."

The report seemed to suggest that courts and the Copyright Office may also be aligned on AI firms' use of pirated or illegally accessed paywalled content for AI training.

Judge Chhabria only considered Meta's torrenting in the book authors' case to be "kind of messed up," prioritizing the fair use question, and the Copyright Office similarly only recommended that "the knowing use of a dataset that consists of pirated or illegally accessed works should weigh against fair use without being determinative."

However, torrenting should be a black mark, the Copyright Office suggested. "Gaining unlawful access" does bear "on the character of the use," the office noted, arguing that "training on pirated or illegally accessed material goes a step further" than simply using copyrighted works "despite the owners' denial of permission." The office suggested that a fair use defense might not fly if authors can prove that AI models trained on pirated works led to lost sales.

"The use of pirated collections of copyrighted works to build a training library, or the distribution of such a library to the public, would harm the market for access to those works," the office wrote. "And where training enables a model to output verbatim or substantially similar copies of the works trained on, and those copies are readily accessible by end users, they can substitute for sales of those works."

Likely frustrating Meta—which is currently fighting to keep leeching evidence out of the book authors' case—the Copyright Office suggested that "the copying of expressive works from pirate sources in order to generate unrestricted content that competes in the marketplace, when licensing is reasonably available, is unlikely to qualify as fair use."

Ashley is a senior policy reporter for Ars Technica, dedicated to tracking social impacts of emerging policies and new technologies. She is a Chicago-based journalist with 20 years of experience.
