ServiceNow researchers build adaptive AI worm in lab test
Key insights
- Across 15 isolated experiments on 33 systems, the AI worm averaged 23.1 compromised hosts and 31.3 identified vulnerabilities per run.
- The worm runs open-weight LLMs locally on compromised machines, removing any cloud API dependency that could be monitored or blocked.
- By ingesting live CVE advisories at runtime, it exploits vulnerabilities published after the model's training cutoff, breaking a key defensive assumption.
Why this matters
Traditional signature-based defenses are tuned to fixed exploits, and a worm that generates unique attack strategies per target invalidates that entire detection class without requiring a new class of attacker. Running open-weight LLMs locally on compromised machines removes the centralized API chokepoint that security teams could monitor or cut off, leaving only network-level behavioral signals as detection opportunities. The published benchmark of 23.1 hosts compromised, 31.3 vulnerabilities identified, and seven self-replication generations gives defenders and vendors a concrete performance bar that current security tooling must now match or account for.
Summary
A proof-of-concept AI worm from researchers at the University of Toronto, Vector Institute, University of Cambridge, and ServiceNow runs open-weight LLMs on compromised machines to craft unique exploits per target without any cloud dependency.
In 15 tests on 33 Linux, Windows, and IoT systems, it averaged 23.1 hosts compromised and 31.3 vulnerabilities identified per run, reaching seven self-replication generations over seven days. Live CVE ingestion lets it attack vulnerabilities disclosed after its training cutoff.
Essentially: the team (University of Toronto, Vector Institute, University of Cambridge, ServiceNow) built a worm with no fixed exploit script, unlike fixed-payload malware such as WannaCry.
- Averages of 23.1 hosts compromised and 31.3 vulnerabilities identified per run on a 33-node mixed-OS network, across 15 experiments
- Seven self-replication generations; seven days of autonomous operation
- Researchers withheld implementation specifics and called for behavioral detection frameworks and regulatory responses to decentralized open-weight inference
Potential risks and opportunities
Risks
- Security operations teams relying on signature-based detection have no current countermeasure for a worm generating unique per-target payloads, leaving enterprises exposed before vendors can ship behavioral detection updates tuned to autonomous agent activity.
- Organizations running open-weight LLMs internally for legitimate workloads face the risk that those same models could serve as on-device attack engines if network segmentation between AI workloads and operational systems is insufficient.
- Regulatory bodies that have not yet addressed decentralized open-weight inference in AI governance frameworks face the exact policy vacuum the researchers explicitly flagged, with no confirmed timeline for a coordinated response.
Opportunities
- Behavioral and anomaly-based network detection vendors gain a clear differentiation pitch against signature-matching incumbents now that a published benchmark demonstrates fixed-exploit detection is structurally insufficient against adaptive AI worms.
- Network segmentation and lateral movement prevention vendors can use the worm's demonstrated 33-node, multi-OS traversal capability as a concrete reference scenario in enterprise architecture reviews and procurement conversations.
- AI-specific red teams and offensive security researchers now have a published performance baseline of 23.1 hosts compromised, 31.3 vulnerabilities identified, and seven replication generations that creates immediate demand for AI-adapted penetration testing and evaluation frameworks.
What we don't know yet
- Who the individual researchers are, since public reporting names none, and whether the paper had been peer-reviewed or accepted at a named conference venue as of the article's publication.
- Whether existing network detection tools can distinguish autonomous AI agent lateral movement from legitimate LLM-assisted security tooling operating on the same network segment.
- Which specific open-weight models were used across the 15 experiments, and whether model size or capability thresholds materially affect replication depth and average host count.
Originally reported by decrypt.co
Original headline: Research from Toronto, Cambridge, and ServiceNow Demonstrates AI Malware Worm That Autonomously Adapts Exploits to New Targets in Real Time by Ingesting Live CVE Advisories