AI's scary new trick: Conducting cyberattacks instead of just helping out




ZDNET's key takeaways

  • Anthropic documented a large-scale cyberattack using AI.
  • Anthropic says that a Chinese state-sponsored group is to blame.
  • The attack may be the first case of its kind. 

The first large-scale cyberattack campaign leveraging artificial intelligence (AI) as more than just a helping digital hand has now been recorded.

Also: Google spots malware in the wild that morphs mid-attack, thanks to AI

As first reported by the Wall Street Journal, Anthropic, the company behind the Claude AI assistant, published a report (.PDF) documenting the abuse of its AI models in a wide-scale attack campaign that simultaneously targeted multiple organizations.

What happened?

In the middle of September, Anthropic detected a "highly sophisticated cyber espionage operation" that used AI throughout the full attack cycle. 

Claude Code, Anthropic's agentic AI coding tool, was abused to build an automated attack framework capable of "reconnaissance, vulnerability discovery, exploitation, lateral movement, credential harvesting, data analysis, and exfiltration operations." Furthermore, these stages were performed "largely autonomously," with human operators providing only basic oversight after tasking Claude Code to operate as "penetration testing orchestrators and agents" -- in other words, to pose as a defender.

Not only did the AI find vulnerabilities in target organizations, but it also enabled their exploitation, data theft, and other malicious post-exploitation activity. 

According to Anthropic, not only were high-profile organizations targeted, but 80% to 90% of "tactical operations" were carried out independently by the AI.

"By presenting these tasks to Claude as routine technical requests through carefully crafted prompts and established personas, the threat actor was able to induce Claude to execute individual components of attack chains without access to the broader malicious context," Anthropic said.

Who was responsible, and how did Anthropic respond?

According to Anthropic, a Chinese state-sponsored group was at the heart of the operation. Now tracked as GTG-1002 and believed to be well-resourced thanks to state backing, the group leveraged Claude in its campaign -- but little else is known about it.

Once Anthropic discovered the abuse of its technologies, it quickly moved to ban accounts associated with GTG-1002 and to expand its malicious-activity detection systems, which it hopes will uncover what the company calls "novel threat patterns" -- such as the roleplay GTG-1002 used to make the system act like a legitimate, defense-focused penetration tester.

Also: This new cyberattack tricks you into hacking yourself. Here's how to spot it

Anthropic is also prototyping early-detection measures to stop autonomous cyberattacks, and it has made both authorities and industry partners aware of the incident. 

However, the company also issued a warning to the cybersecurity community at large, urging it to remain vigilant:

"The cybersecurity community needs to assume a fundamental change has occurred: Security teams should experiment with applying AI for defense in areas like SOC automation, threat detection, vulnerability assessment, and incident response and build experience with what works in their specific environments," Anthropic said. "And we need continued investment in safeguards across AI platforms to prevent adversarial misuse. The techniques we're describing today will proliferate across the threat landscape, which makes industry threat sharing, improved detection methods, and stronger safety controls all the more critical."

Is this attack important?

We've recently seen the first indicators that threat actors worldwide are exploring how AI can be leveraged in malicious tools, techniques, and attacks. However, these have previously been relatively limited -- at least, in the public arena -- to minor automation and assistance, improved phishing, some dynamic code generation, email scams, and some code obfuscation. 

Around the same time, OpenAI, the maker of ChatGPT, published its own report, which acknowledged abuse of its models but found little or no evidence of them granting attackers "novel offensive capability." Meanwhile, GTG-1002 was busy using AI to automatically and simultaneously target organizations. 

Also: Enterprises are not prepared for a world of malicious AI agents

(Disclosure: Ziff Davis, ZDNET's parent company, filed an April 2025 lawsuit against OpenAI, alleging it infringed Ziff Davis copyrights in training and operating its AI systems.)

Approximately 30 organizations were targeted, but only a "handful" of these attacks were successful, in part due to AI hallucinations and other issues, including data fabrication and outright lies about obtaining valid credentials. So, while still notable, it could be argued that this case is a step up in technique but isn't yet the AI apocalypse.  

Or, as Anthropic said, this discovery "represents a fundamental shift in how advanced threat actors use AI." 
