thehackernews.com web signal June 25th 2026

Gaslight macOS Implant Plants Fake Errors to Fool AI Triage Tools

4 sources tracking this story

cybersecurity agents cybersecurity prompt-injection supply-chain-attack

TL;DR

Gaslight embeds a 3.5 KB Markdown-fenced blob of 38 fabricated 'system' messages to steer LLM triage agents into aborting or truncating their analysis.
C2 runs over Telegram Bot API with certificate-pinned TLS; the implant self-redacts its own bot token in runtime output, denying it to log capture.
SentinelOne attributes the implant to DPRK's BONZAI cluster with high confidence, making this the first state-sponsored malware documented targeting AI analyst pipelines.

Editor's note

SentinelOne Labs' reverse engineering establishes macOS.Gaslight as the first documented malware embedding a cascade of 38 fabricated prompt-injection messages to defeat LLM-assisted analyst pipelines, attributed with high confidence to DPRK's BONZAI cluster. BleepingComputer sharpens the threat model: Gaslight targets the analyst's perception rather than the sandbox, making traditional sandboxing defenses orthogonal to the attack. Infosecurity Magazine surfaces the operational directive that SentinelOne researchers now require any team running LLM-assisted triage to treat malware sample contents as adversarial input by default. Gaslight's deployment scripts show markers of AI generation, confirming the adversary used AI tooling to build malware specifically designed to defeat AI defenses, closing a direct loop for every SOC deploying AI-assisted analysis pipelines.

Security researchers have mostly assumed AI tools would make malware analysis faster and more reliable. A newly discovered macOS implant called Gaslight, attributed with high confidence to North Korea-aligned threat actors, challenges that assumption directly: it does not try to evade the sandbox, it tries to confuse the analyst's AI.

According to The Hacker News, the Rust-based implant embeds a Markdown-fenced block containing 38 fabricated "system" messages, including fake warnings about token expiry, memory exhaustion, and disk depletion, designed to trick LLM-assisted triage pipelines into aborting analysis. SentinelOne researcher Phil Stokes put it plainly: "It attacks the agent's perception, rather than the sandbox it runs in."

Beyond the AI evasion component, Gaslight is a capable information stealer. A 6.6 KB Base64-encoded Python script harvests Terminal command histories, the macOS Keychain database, and browser credentials from Chrome, Brave, Firefox, and Safari. All collected data is compressed into a ZIP archive and exfiltrated via a Telegram bot API channel. The implant also self-redacts its own Telegram bot token from runtime output, frustrating log-based attribution.

What the reporting does not give you is a clear picture of how effective the 38 fake messages are in practice against current AI triage tools, or whether the technique successfully deceived analysts before the sample was identified. The malware also includes a seventh command named "focus" whose function remains undetermined.

Security teams building LLM-assisted analysis pipelines now have a concrete reason to treat malware inputs as potentially adversarial to the AI, not just to the sandbox. If the technique spreads to other threat actors, hardening AI triage workflows against prompt injection moves from a research concern to an operational priority.

What others are reporting

Coverage cluster as of 24h after publish

SentinelOne Labs Read →

Original reverse engineering: 38 fabricated system messages in a 3.5 KB blob, Telegram C2 with certificate-pinned TLS, runtime self-redaction of bot token, and BONZAI cluster attribution.

The implant carries a 3.5 KB Markdown-fenced blob of hostile data containing 38 fabricated 'system' messages delimited with {{DATA}} tokens.
BleepingComputer Read →

Frames the threat model with high-confidence DPRK attribution: adversaries now actively target AI-assisted security platforms rather than sandboxes or static analysis tools.

It attacks the agent's perception, rather than the sandbox it runs in.
Infosecurity Magazine Read →

Surfaces the operational policy consequence for SOC teams: LLM-assisted triage pipelines must now treat malware sample contents as adversarial input by default.

Anyone building such tooling should treat the contents of the samples they triage as adversarial input.

Originally reported by thehackernews.com

Read the original article →

Original headline: Gaslight: DPRK-Attributed macOS Backdoor Uses Prompt Injection to Fool LLM-Assisted Malware Analysis — First Documented Use of the Technique Inside Malware Itself