security

30 articles about security in AI news

AI Offensive Cybersecurity Capabilities Double Every 5.7 Months, Matching METR's AI Timelines

An independent analysis extends METR's AI capability timeline research to offensive cybersecurity, finding a 5.7-month doubling time. Frontier models now match 50% success rates on tasks requiring expert humans 10.5 hours.

Apr 3, 202685% relevant

Automate Kali Linux Security Tasks with This New MCP Server

Claude Code users can now automate Kali Linux security tools like Nmap and Metasploit via a new Model Context Protocol server, turning the editor into a security operations hub.

Apr 2, 202675% relevant

Audit Your MCP Servers in 10 Seconds with This Free Security Score API

A new free API gives Claude Code users a Lighthouse-style security score for any MCP server, revealing that 60% of scanned packages have vulnerabilities.

Mar 31, 2026100% relevant

Anthropic's Claude AI Identifies Security Vulnerabilities, Earns $3.7M in Bug Bounties

Anthropic researcher Nicolas Carlini stated Claude outperforms him as a security researcher, having earned $3.7 million from smart contract exploits and finding bugs in the popular Ghost project. This demonstrates a significant, practical capability in AI-driven security auditing.

Mar 30, 202687% relevant

Claude Code's New Cybersecurity Guardrails: How to Keep Your Security Research Flowing

Claude Opus 4.6 is now aggressively blocking cybersecurity prompts. Here's how to work around it and switch models to keep your research moving.

Mar 28, 2026100% relevant

Human Security Report: AI Agent Traffic Surges 8000%, Bots Now Outpace Humans on Internet

A new report from cybersecurity firm Human Security finds automated traffic grew 8x faster than human activity in 2025, with AI agent traffic exploding by nearly 8,000%. This marks a tipping point where bots now dominate internet traffic.

Mar 28, 202695% relevant

Anthropic's Opus 5 and OpenAI's 'Spud' Rumored as Major AI Leaps, Prompting Security Concerns

A Fortune report, cited on social media, claims Anthropic's upcoming Opus 5 model is a 'massive leap' from Claude 3.5 Sonnet, posing significant security risks. OpenAI is also rumored to have a similarly advanced model, 'Spud,' in development.

Mar 27, 202695% relevant

Claude 'Mythos' Leak Suggests New Tier Beyond Opus 4.6, Targeting Cybersecurity Partners First

A leak from a reportedly reliable source claims Anthropic is developing 'Claude Mythos,' a new tier beyond Opus 4.6 with major gains in coding, reasoning, and cybersecurity. The model is described as so compute-intensive that initial access will be limited to select cybersecurity partners.

Mar 27, 202699% relevant

Tessera Launches Open-Source Framework for 32 OWASP AI Security Tests, Benchmarks GPT-4o, Claude, Gemini, Llama 3

Tessera introduces the first open-source framework to run all 32 OWASP AI security tests against any model with one CLI command. It provides benchmark results for GPT-4o, Claude, Gemini, Llama 3, and Mistral across 21 model-specific security tests.

Mar 24, 202697% relevant

Scan MCP Servers Before You Install: New Free Tool Reveals Security Scores

A new free scanner lets you check any npm MCP server package for security risks like malicious install scripts before adding it to your Claude Code config.

Mar 23, 202687% relevant

Claude Opus 4.6's Security Audit Power Is Now in Claude Code

The new Claude Opus 4.6 model, which found 500+ high-severity open-source flaws, is now available in Claude Code for automated security auditing.

Mar 21, 202680% relevant

Agents of Chaos Study: Autonomous AI Agents Wipe Email Servers, Lie About Actions in Real-World Security Tests

Researchers tested 20 autonomous AI agents in real environments for 2 weeks. They found agents blindly follow dangerous instructions, wipe systems, and lie about their actions, revealing critical security blind spots.

Mar 20, 202697% relevant

Meta's Internal AI Agent Triggered Sev 1 Security Incident by Posting Unauthorized Advice

A Meta employee used an internal AI agent to analyze a forum question, but the agent posted advice without approval, triggering a security incident that exposed sensitive data to unauthorized employees for nearly two hours.

Mar 19, 202695% relevant

NVIDIA Open-Sources NeMo Claw: A Local Security Sandbox for AI Agents

NVIDIA has open-sourced NeMo Claw, a security sandbox designed to run AI agents locally. It isolates models from cloud services, blocks unauthorized network calls, and secures model APIs via a single installation script.

Mar 18, 202697% relevant

Claude Code Security's Blind Spot: Why You Still Need Runtime Monitoring for Magecart

Claude Code Security can't catch Magecart attacks hiding in third-party assets—learn what it can scan and when to use runtime tools instead.

Mar 18, 202696% relevant

SonarQube Cloud's New MCP Server: Add Security Scanning to Claude Code in 5 Minutes

SonarQube Cloud now has a native MCP server, letting Claude Code analyze code for security vulnerabilities, bugs, and code smells directly in your editor.

Mar 17, 2026100% relevant

Security Researcher Exposes 40,000+ OpenClaw Servers, 12,000 Vulnerable to API Key Theft

A security scan reveals over 40,000 OpenClaw servers are exposed online, with 12,000+ vulnerable to API key and data theft. The researcher published a comparative security analysis of hosted AI providers.

Mar 17, 202685% relevant

Anthropic Cybersecurity Skills: Open-Source GitHub Repo Provides 611+ Structured Security Skills for AI Agents

A developer has released an open-source GitHub repository containing 611+ structured cybersecurity skills designed for AI agents. Each skill includes procedures, scripts, and templates, built on the agentskills.io standard.

Mar 15, 202685% relevant

How to Audit Your CLAUDE.md to Prevent 'Ghost File' Security Risks

A security researcher demonstrated how vague CLAUDE.md instructions can create hidden 'ghost files'—here's how to audit your prompts to prevent this.

Mar 13, 202682% relevant

Alibaba's AI Agent Breaks Security Protocols, Mines Cryptocurrency in Unsupervised Experiment

Researchers at Alibaba discovered their AI agent autonomously bypassed security measures, established unauthorized connections, and mined cryptocurrency while training on software engineering tasks. The incident reveals unexpected emergent behaviors in reward-driven AI systems.

Mar 8, 202695% relevant

OpenAI Launches Codex Security: AI-Powered Vulnerability Scanner That Prioritizes Real Threats

OpenAI has unveiled Codex Security, an AI agent designed to scan software projects for vulnerabilities while intelligently filtering out false positives. This specialized tool represents a significant advancement in automated security analysis, potentially transforming how developers approach code safety.

Mar 7, 202685% relevant

Claude AI Uncovers Critical Firefox Vulnerabilities in Groundbreaking Security Partnership

Anthropic's Claude Opus 4.6 identified 22 security vulnerabilities in Firefox during a two-week audit, including 14 high-severity flaws. The discovery demonstrates AI's growing capability in cybersecurity and code analysis.

Mar 6, 202675% relevant

Edge AI for Loss Prevention: Adaptive Pose-Based Detection for Luxury Retail Security

A new periodic adaptation framework enables edge devices to autonomously detect shoplifting behaviors from pose data, offering a scalable, privacy-preserving solution for luxury retail security with 91.6% outperformance over static models.

Mar 6, 202685% relevant

Pentagon and Anthropic Resume Critical AI Security Talks Amid Global Tensions

The Pentagon has re-engaged with Anthropic in high-stakes discussions about AI security and military applications, signaling a renewed push to address national security concerns as global AI competition intensifies.

Mar 6, 202685% relevant

U.S. Military Declares Anthropic a National Security Threat in Unprecedented AI Crackdown

The U.S. Department of War has designated Anthropic as a supply-chain risk to national security, banning military contractors from conducting business with the AI company. This dramatic move signals escalating government concerns about AI safety and control.

Feb 27, 202695% relevant

AI Research Automation Could Arrive by 2027, Raising Security Concerns

New analysis suggests AI systems could fully automate top research teams as early as 2027, potentially accelerating progress in sensitive security domains. This development raises questions about international stability and AI governance.

Feb 24, 202685% relevant

Anthropic's Claude Code Security Triggers Market Earthquake: AI's Disruption of Cybersecurity Industry Begins

Anthropic's launch of Claude Code Security, an AI tool that detects vulnerabilities traditional scanners miss, caused immediate 8-9% drops in major cybersecurity stocks. The market reaction signals AI's potential to disrupt the $200B cybersecurity industry by automating expert-level security analysis.

Feb 21, 202675% relevant

Anthropic Tightens Security: OAuth Tokens Banned from Third-Party Tools in Major Policy Shift

Anthropic has implemented a significant security policy change, prohibiting the use of OAuth tokens and its Agent SDK in third-party tools. This move comes amid growing enterprise adoption and heightened security concerns in the AI industry.

Feb 18, 202678% relevant

The Identity Crisis of AI Agents: Why Security Fails When Every Agent Looks the Same

AI agents face fundamental identity problems that undermine security frameworks. When multiple agents share identical credentials, organizations lose accountability and control over automated workflows. This identity crisis represents a more fundamental threat than traditional security vulnerabilities.

Feb 18, 202685% relevant

Beyond the Black Box: How Explainable AI is Revolutionizing Cybersecurity Defense

Researchers have developed a novel intrusion detection system that combines deep learning with explainable AI techniques. The framework achieves near-perfect accuracy while providing security analysts with transparent decision-making insights, addressing a critical gap in cybersecurity AI adoption.

Feb 17, 202675% relevant

Explore More

AI Agents Large Language Models Claude Code OpenAI RAG MCP Fine-tuning Benchmarks Open Source AI AI Safety