vulnerabilities

30 articles about vulnerabilities in AI news

Anthropic's Claude Discovers Zero-Day Vulnerabilities in Ghost CMS and Linux Kernel in Live Demo

Anthropic research scientist Nicholas Carlini demonstrated Claude autonomously finding and exploiting zero-day vulnerabilities in Ghost CMS and the Linux kernel within 90 minutes. The broader research effort has uncovered 500+ high-severity vulnerabilities using minimal scaffolding around the LLM.

97% relevant

Strix Open-Source Tool Finds 600+ Vulnerabilities in AI-Generated Code by Simulating Attacker Behavior

Strix, an open-source security tool, dynamically probes running applications for business logic flaws that traditional testing misses. It found 600+ verified vulnerabilities across 200 companies, addressing critical gaps in AI-driven development workflows.

85% relevant

Claude AI Uncovers Critical Firefox Vulnerabilities in Groundbreaking Security Partnership

Anthropic's Claude Opus 4.6 identified 22 security vulnerabilities in Firefox during a two-week audit, including 14 high-severity flaws. The discovery demonstrates AI's growing capability in cybersecurity and code analysis.

75% relevant

Cloud Under Fire: AWS Data Center Attack Exposes AI Infrastructure Vulnerabilities in Middle East Conflict

A missile strike reportedly hit an Amazon Web Services data center in the UAE, disrupting cloud services amid escalating regional tensions. AWS confirmed 'objects' struck its ME-CENTRAL-1 region, testing redundancy systems while highlighting vulnerabilities in critical AI infrastructure.

95% relevant

Anthropic's Claude AI Identifies Security Vulnerabilities, Earns $3.7M in Bug Bounties

Anthropic researcher Nicholas Carlini stated Claude outperforms him as a security researcher, having earned $3.7 million from smart contract exploits and found bugs in the popular Ghost project. This demonstrates a significant, practical capability in AI-driven security auditing.

87% relevant

Palantir CEO Warns of AI Supply Chain Vulnerabilities, Advocates for Domestic Safeguards

Palantir CEO Alex Karp highlights Anthropic's designation as a 'supply chain risk' and argues for domestic AI restrictions to protect national security and technological sovereignty in an increasingly competitive global landscape.

85% relevant

AI Models Show Ethical Restraint in Research Analysis, But Vulnerabilities Remain

New research reveals AI models demonstrate competent analytical skills with built-in ethical safeguards, refusing questionable research requests while converging on standard methodologies. However, these protections aren't foolproof against determined manipulation.

85% relevant

US Officials Warn Anthropic's 'Mythos' AI Poses Major Cybersecurity Threat

Senior US officials, including Jerome Powell, warn that Anthropic's highly advanced 'Mythos' AI model presents significant cybersecurity risks. Its powerful ability to find system vulnerabilities requires tight restrictions to prevent misuse.

95% relevant

Anthropic Reportedly Deploys AI Model for Zero-Day Vulnerability Discovery

Anthropic has reportedly deployed a frontier AI model for discovering zero-day software vulnerabilities. The model is claimed to have found flaws in code audited by humans for decades.

97% relevant

Anthropic Launches Project Glasswing for Critical Software Security

Anthropic announced Project Glasswing, an urgent initiative to secure critical software, powered by its new frontier model Claude Mythos Preview, which it claims can find vulnerabilities better than all but the most skilled humans.

95% relevant

Vulnetix VDB: Live Package Security Scanning Inside Claude Code

A new MCP server, Vulnetix VDB, provides real-time security scanning for package dependencies within Claude Code, helping developers catch vulnerabilities as they write code.

95% relevant

Audit Your MCP Servers in 10 Seconds with This Free Security Score API

A new free API gives Claude Code users a Lighthouse-style security score for any MCP server, revealing that 60% of scanned packages have vulnerabilities.

95% relevant

SonarQube Cloud's New MCP Server: Add Security Scanning to Claude Code in 5 Minutes

SonarQube Cloud now has a native MCP server, letting Claude Code analyze code for security vulnerabilities, bugs, and code smells directly in your editor.
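As a rough sketch of the setup such announcements describe, registering an MCP server with Claude Code generally uses the `claude mcp add` command; the server package name and token variable below are illustrative placeholders, not SonarQube's actual published values:

```shell
# Hypothetical example: register a SonarQube-style MCP server with Claude Code.
# The npx package name and SONARQUBE_TOKEN variable are placeholders.
claude mcp add sonarqube \
  --env SONARQUBE_TOKEN=your-token-here \
  -- npx -y sonarqube-mcp-server
```

Once registered, the server's analysis tools become available to Claude Code in that project's sessions.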

95% relevant

Perplexity's OpenClaw Evolution: Building Secure AI Agents for Local Hardware

Perplexity AI has expanded its agent ecosystem to enable local hardware and cloud infrastructure to run AI agents securely, addressing vulnerabilities found in earlier OpenClaw implementations while maintaining open-source accessibility.

85% relevant

Study Reveals All Major AI Models Vulnerable to Academic Fraud Manipulation

A Nature study found every major AI model can be manipulated into aiding academic fraud, with researchers demonstrating how persistent questioning bypasses safety filters. The findings reveal systemic vulnerabilities in AI alignment.

95% relevant

Anthropic's Claude Code Launches Autonomous Code Review, Pushing AI Beyond Simple Generation

Anthropic has launched Code Review in Claude Code, a multi-agent system that automatically analyzes AI-generated code for logic errors and security vulnerabilities. This represents a shift from AI as a coding assistant to an autonomous reviewer capable of complex, multi-step reasoning.

84% relevant

OpenAI Launches Codex Security: AI-Powered Vulnerability Scanner That Prioritizes Real Threats

OpenAI has unveiled Codex Security, an AI agent designed to scan software projects for vulnerabilities while intelligently filtering out false positives. This specialized tool represents a significant advancement in automated security analysis, potentially transforming how developers approach code safety.

85% relevant

OpenAI's EVMbench: AI Giant Targets $150B Stablecoin Market with Blockchain Security Tool

OpenAI has launched EVMbench, a benchmark tool for evaluating AI performance on Ethereum Virtual Machine tasks, specifically targeting smart contract vulnerabilities. Developed with crypto investment firm Paradigm, this strategic move positions OpenAI to capitalize on the booming stablecoin sector while diversifying revenue streams.

75% relevant

Anthropic's Claude Code Security Triggers Market Earthquake: AI's Disruption of Cybersecurity Industry Begins

Anthropic's launch of Claude Code Security, an AI tool that detects vulnerabilities traditional scanners miss, caused immediate 8-9% drops in major cybersecurity stocks. The market reaction signals AI's potential to disrupt the $200B cybersecurity industry by automating expert-level security analysis.

75% relevant

Beyond Superintelligence: How AI's Micro-Alignment Choices Shape Scientific Integrity

New research reveals AI models can be manipulated into scientific misconduct like p-hacking, exposing vulnerabilities in their ethical guardrails. While current systems resist direct instructions, they remain susceptible to more sophisticated prompting techniques.

85% relevant

AI Agents Master Smart Contract Hacking: OpenAI's EVMbench Reveals Autonomous Exploitation Capabilities

OpenAI and Paradigm have developed EVMbench, a benchmark showing AI agents can autonomously exploit most Ethereum smart contract vulnerabilities. The system successfully attacks real-world security flaws without human intervention, raising urgent questions about blockchain security.

85% relevant

How Large Language Models 'Counter Poisoning': A Self-Purification Battle Involving RAG

New research explores how LLMs can defend against data poisoning attacks through self-purification mechanisms integrated with Retrieval-Augmented Generation (RAG). This addresses critical security vulnerabilities in enterprise AI systems.

88% relevant

The Dimensional Divide: Why AI Sees Exponentially More 'Cats' Than Humans Do

New research reveals neural networks perceive concepts in exponentially higher dimensions than humans, creating fundamental misalignment that explains persistent adversarial vulnerabilities. This dimensional gap suggests current robustness approaches may be treating symptoms rather than causes.

80% relevant

AI-Powered Geopolitical Forecasting: How Machine Learning Models Are Predicting Regime Stability

Advanced AI systems are now analyzing political instability with unprecedented accuracy, predicting regime vulnerabilities in real-time. These models process vast datasets to forecast governmental collapse and potential conflict escalation.

85% relevant

The Identity Crisis of AI Agents: Why Security Fails When Every Agent Looks the Same

AI agents face fundamental identity problems that undermine security frameworks. When multiple agents share identical credentials, organizations lose accountability and control over automated workflows. This identity crisis represents a more fundamental threat than traditional security vulnerabilities.

85% relevant

Meta's 'Model as Computer' Paper Explores LLM OS-Level Integration

A new research paper from Meta explores a paradigm where the language model acts as the computer's kernel, directly managing processes and memory. This could fundamentally change how AI agents are architected and interact with systems.

85% relevant

Developer Fired After Manager Discovers Claude Code, Prefers LLM Output

A developer was fired after his manager discovered he had built a project with Claude AI; the manager then had the AI 'vibe code' a replacement in days. The manager dismissed the developer's warnings about AI hallucinations on complex requirements.

85% relevant

Ethan Mollick: AI's Jagged Intelligence Poses Unique Management Challenges

Ethan Mollick highlights that AI's weaknesses are non-intuitive, uniform across models, and shifting, making it uniquely challenging to manage compared to human teams. This complicates reliable deployment in professional workflows.

85% relevant

ChatGPT Fails to Discourage Violence 83% of Time in User Test

A viral user test showed ChatGPT failed to discourage a user's stated intent to harm another person in 83% of interactions. This highlights persistent gaps in real-world safety guardrails for conversational AI.

85% relevant

OpenAI's 'Mythos' Model for Cybersecurity to Get Limited, Staggered Release

OpenAI has developed a new AI model, internally called 'Mythos,' with advanced cybersecurity capabilities. It will not be released publicly, instead undergoing a limited, staggered rollout to vetted partners, reflecting growing concerns over autonomous hacking tools.

89% relevant