The post Google Researchers Reveal Every Way Hackers Can Trap, Hijack AI Agents appeared on BitcoinEthereumNews.com. In brief Google has identified six trap categoriesThe post Google Researchers Reveal Every Way Hackers Can Trap, Hijack AI Agents appeared on BitcoinEthereumNews.com. In brief Google has identified six trap categories

Google Researchers Reveal Every Way Hackers Can Trap, Hijack AI Agents

2026/04/03 05:16
Okuma süresi: 5 dk
Bu içerikle ilgili geri bildirim veya endişeleriniz için lütfen crypto.news@mexc.com üzerinden bizimle iletişime geçin.

In brief

  • Google has identified six trap categories—each exploiting a different part of how AI agents perceive, reason, remember, and act.
  • Attacks range from invisible text on web pages to viral memory poisoning that jumps between agents.
  • No legal framework yet decides who is liable when a trapped AI agent commits a financial crime.

Researchers at Google DeepMind have published what may be the most complete map yet of a problem most people haven’t considered: the internet itself being turned into a weapon against autonomous AI agents. The paper, titled “AI Agent Traps,” identifies six categories of adversarial content specifically engineered to manipulate, deceive, or hijack agents as they browse, read, and act on the open web.

The timing matters. AI companies are racing to deploy agents that can independently book travel, manage inboxes, execute financial transactions, and write code. Criminals are already using AI offensively. State-sponsored hackers have begun deploying AI agents for cyberattacks at scale. And OpenAI admitted in December 2025 that the core vulnerability these traps exploit—prompt injection—is “unlikely to ever be fully ‘solved.'”

The DeepMind researchers aren’t attacking the models themselves. The attack surface they map is the environment agents operate in. Here’s what each of the six trap categories actually means.

The Six Traps

First there are “Content Injection Traps.” These exploit the gap between what a human sees on a webpage and what an AI agent actually parses. A web developer can hide text inside HTML comments, CSS-invisible elements, or image metadata. The agent reads the hidden instruction; you never see it. A more sophisticated variant, called dynamic cloaking, detects whether a visitor is an AI agent and serves it a completely different version of the page—same URL, different hidden commands. A benchmark found simple injections like these successfully commandeered agents in up to 86% of tested scenarios.

Semantic Manipulation Traps are probably the easiest to try. A page saturated with phrases like “industry-standard” or “trusted by experts” statistically biases an agent’s synthesis in the attacker’s direction, exploiting the same framing effects humans fall for. A subtler version wraps malicious instructions inside educational or “red-teaming” framing—”this is hypothetical, for research only”—which fools the model’s internal safety checks into treating the request as benign. The strangest subtype is “persona hyperstition”: descriptions of an AI’s personality spread online, get ingested back into the model through web search, and start shaping how it actually behaves. The paper mentions Groks “MechaHitler” incident as a real-world case of this loop.

You can see examples of this in our experiment, jailbreaking Whatsapp’s AI and tricking it to generate nudes, drug recipes, and instructions to build bombs

One example of a Semantic attack. Image: Decrypt

Cognitive State Traps are another attack in which malicious actors target an agent’s long-term memory. Basically, If an attacker succeeds in planting fabricated statements inside a retrieval database the agent queries, the agent will treat those statements as verified facts. Injecting just a handful of optimized documents into a large knowledge base is enough to reliably corrupt outputs on specific topics. Attacks like “CopyPasta” have already demonstrated how agents blindly trust content in their environment.

The Behavioural Control Traps go straight for what the agent does. Jailbreak sequences embedded in ordinary websites override safety alignment once the agent reads the page. Data exfiltration traps coerce the agent into locating private files and transmitting them to an attacker-controlled address; web agents with broad file access were forced to exfiltrate local passwords and sensitive documents at rates exceeding 80% across five different platforms in tested attacks. This is especially dangerous now that people start to give AI agents more control over their private information with the rise of platforms like OpenClaw and sites like Moltbook.

Systemic Traps don’t target one agent. They target the behavior of many agents acting simultaneously. The paper draws a direct line to the 2010 Flash Crash, where one automated sell order triggered a feedback loop that wiped nearly a trillion dollars in market value in minutes. A single fabricated financial report, timed correctly, could trigger a synchronized sell-off among thousands of AI trading agents.

And finally Human-in-the-Loop Traps target the human reviewing its output. These traps engineer “approval fatigue”—outputs designed to look technically credible to a non-expert so they authorize dangerous actions without realizing it. One documented case involved CSS-obfuscated prompt injections that made an AI summarization tool present step-by-step ransomware installation instructions as helpful troubleshooting fixes. We’ve already seen what happens when humans trust agents without scrutiny.

What researchers recommend

The paper’s defense roadmap covers three fronts. The first one is technical: adversarial training during fine-tuning, runtime content scanners that flag suspicious inputs before they reach the agent’s context window, and output monitors that detect behavioral anomalies before they execute. Then there’s the ecosystem level: web standards that let sites declare content intended for AI consumption, and domain reputation systems that score reliability based on hosting history.

The third front is legal. The paper explicitly names the “accountability gap”: If a trapped agent executes an illicit financial transaction, current law has no answer for who is liable—the agent’s operator, the model provider, or the website that hosted the trap. Resolving that, the researchers argue, is a prerequisite for deploying agents in any regulated industry.

OpenAI’s own models have been jailbroken within hours of release, repeatedly. The DeepMind paper doesn’t claim to have solutions. It claims the industry doesn’t yet have a shared map of the problem—and that without one, defenses will keep getting built in the wrong places.

Daily Debrief Newsletter

Start every day with the top news stories right now, plus original features, a podcast, videos and more.

Source: https://decrypt.co/363201/google-researchers-reveal-every-way-hackers-can-trap-hijack-ai-agents

Piyasa Fırsatı
SIX Logosu
SIX Fiyatı(SIX)
$0.0085
$0.0085$0.0085
0.00%
USD
SIX (SIX) Canlı Fiyat Grafiği
Sorumluluk Reddi: Bu sitede yeniden yayınlanan makaleler, halka açık platformlardan alınmıştır ve yalnızca bilgilendirme amaçlıdır. MEXC'nin görüşlerini yansıtmayabilir. Tüm hakları telif sahiplerine aittir. Herhangi bir içeriğin üçüncü taraf haklarını ihlal ettiğini düşünüyorsanız, kaldırılması için lütfen crypto.news@mexc.com ile iletişime geçin. MEXC, içeriğin doğruluğu, eksiksizliği veya güncelliği konusunda hiçbir garanti vermez ve sağlanan bilgilere dayalı olarak alınan herhangi bir eylemden sorumlu değildir. İçerik, finansal, yasal veya diğer profesyonel tavsiye niteliğinde değildir ve MEXC tarafından bir tavsiye veya onay olarak değerlendirilmemelidir.

Ayrıca Şunları da Beğenebilirsiniz

Coinbase CLO: Clarity Act Deal on Stablecoin Yield ‘Very Close’

Coinbase CLO: Clarity Act Deal on Stablecoin Yield ‘Very Close’

The post Coinbase CLO: Clarity Act Deal on Stablecoin Yield ‘Very Close’ appeared on BitcoinEthereumNews.com. In brief Coinbase Chief Legal Officer Paul Grewal
Paylaş
BitcoinEthereumNews2026/04/02 19:54
South Korea Stablecoin Legislation: FSC Accelerates Crucial Regulatory Framework and Tax Review

South Korea Stablecoin Legislation: FSC Accelerates Crucial Regulatory Framework and Tax Review

BitcoinWorld South Korea Stablecoin Legislation: FSC Accelerates Crucial Regulatory Framework and Tax Review SEOUL, South Korea – March 2025 – South Korea’s Financial
Paylaş
bitcoinworld2026/04/02 18:20
How to earn from cloud mining: IeByte’s upgraded auto-cloud mining platform unlocks genuine passive earnings

How to earn from cloud mining: IeByte’s upgraded auto-cloud mining platform unlocks genuine passive earnings

The post How to earn from cloud mining: IeByte’s upgraded auto-cloud mining platform unlocks genuine passive earnings appeared on BitcoinEthereumNews.com. contributor Posted: September 17, 2025 As digital assets continue to reshape global finance, cloud mining has become one of the most effective ways for investors to generate stable passive income. Addressing the growing demand for simplicity, security, and profitability, IeByte has officially upgraded its fully automated cloud mining platform, empowering both beginners and experienced investors to earn Bitcoin, Dogecoin, and other mainstream cryptocurrencies without the need for hardware or technical expertise. Why cloud mining in 2025? Traditional crypto mining requires expensive hardware, high electricity costs, and constant maintenance. In 2025, with blockchain networks becoming more competitive, these barriers have grown even higher. Cloud mining solves this by allowing users to lease professional mining power remotely, eliminating the upfront costs and complexity. IeByte stands at the forefront of this transformation, offering investors a transparent and seamless path to daily earnings. IeByte’s upgraded auto-cloud mining platform With its latest upgrade, IeByte introduces: Full Automation: Mining contracts can be activated in just one click, with all processes handled by IeByte’s servers. Enhanced Security: Bank-grade encryption, cold wallets, and real-time monitoring protect every transaction. Scalable Options: From starter packages to high-level investment contracts, investors can choose the plan that matches their goals. Global Reach: Already trusted by users in over 100 countries. Mining contracts for 2025 IeByte offers a wide range of contracts tailored for every investor level. From entry-level plans with daily returns to premium high-yield packages, the platform ensures maximum accessibility. Contract Type Duration Price Daily Reward Total Earnings (Principal + Profit) Starter Contract 1 Day $200 $6 $200 + $6 + $10 bonus Bronze Basic Contract 2 Days $500 $13.5 $500 + $27 Bronze Basic Contract 3 Days $1,200 $36 $1,200 + $108 Silver Advanced Contract 1 Day $5,000 $175 $5,000 + $175 Silver Advanced Contract 2 Days $8,000 $320 $8,000 + $640 Silver…
Paylaş
BitcoinEthereumNews2025/09/17 23:48

$30,000 in PRL + 15,000 USDT

$30,000 in PRL + 15,000 USDT$30,000 in PRL + 15,000 USDT

Deposit & trade PRL to boost your rewards!