reddit.com via Reddit

r/cybersecurity: Claroty Team82 Tests Claude Opus 4.6 on Real Hardware Vuln Research — Documents Where LLM-Driven Discovery Succeeds and Fails Against Zenitel TCIV-3+ Intercom CVEs

anthropic cybersecurity agents llm-security vulnerability-research autonomous-ai

Summary

Claroty's Team82 researchers shared a controlled experiment in which Claude Opus 4.6 was tasked with autonomously replicating the five CVEs the team had already discovered manually and disclosed in Zenitel's TCIV-3+ video intercom. The post maps where the model succeeded without human prompting, where it required guidance to stay on target, and where fully autonomous AI-driven vulnerability research hits hard limits on real physical hardware. It is one of the first published comparisons from a named firm between human researchers and an LLM working the identical CVE target set.