steering
30 articles about steering in AI news
FaithSteer-BENCH Reveals Systematic Failure Modes in LLM Inference-Time Steering Methods
Researchers introduce FaithSteer-BENCH, a stress-testing benchmark that exposes systematic failures in LLM steering methods under deployment constraints. The benchmark reveals illusory controllability, capability degradation, and brittleness across multiple models and steering approaches.
How 'Steering Hooks' Can Fix Claude Code's Drifting Behavior
New research reports that steering hooks reach 100% instruction-following accuracy, versus 82% for prompt instructions alone. Pairing hooks with your CLAUDE.md can stop unpredictable outputs.
Anthropic Paper: 'Emotion Concepts and their Function in LLMs' Published
Anthropic has released a new research paper titled 'Emotion Concepts and their Function in LLMs.' The work investigates the role and representation of emotional concepts within large language model architectures.
SteerViT Enables Natural Language Control of Vision Transformer Attention Maps
Researchers introduced SteerViT, a method that modifies Vision Transformers to accept natural language instructions, enabling users to steer the model's visual attention toward specific objects or concepts while maintaining representation quality.
Anthropic Fellows Introduce 'Model Diffing' Method to Systematically Compare Open-Weight AI Model Behaviors
Anthropic's Fellows research team published a new method applying software 'diffing' principles to compare AI models, identifying unique behavioral features. This provides a systematic framework for model interpretability and safety analysis.
New Relative Contrastive Learning Framework Boosts Sequential Recommendation Accuracy by 4.88%
A new arXiv paper introduces Relative Contrastive Learning (RCL) for sequential recommendation. It solves a data scarcity problem in prior methods by using similar user interaction sequences as additional training signals, leading to significant accuracy improvements.
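The core idea, treating similar users' sequences as weighted soft positives, can be sketched as a generic weighted-InfoNCE loss; this is illustrative only, not the paper's exact objective, and the embeddings and weights are toy values:

```python
import numpy as np

def weighted_contrastive_loss(anchor, positives, negatives, weights, temp=0.1):
    """Weighted-InfoNCE sketch of the RCL idea: sequence embeddings from
    similar users act as soft positives, weighted by user similarity."""
    def sim(a, b):
        # Temperature-scaled cosine similarity.
        return float(a @ b) / (np.linalg.norm(a) * np.linalg.norm(b) * temp)

    logits = np.array([sim(anchor, p) for p in positives]
                      + [sim(anchor, n) for n in negatives])
    log_probs = logits - np.log(np.exp(logits).sum())  # log-softmax
    # Similarity weights (summing to 1) decide how much each soft
    # positive contributes.
    return -float(np.asarray(weights) @ log_probs[: len(positives)])

anchor = np.array([1.0, 0.0])
pos = [np.array([0.9, 0.1])]    # a similar user's sequence embedding
neg = [np.array([-1.0, 0.2])]   # a dissimilar user's sequence embedding
aligned = weighted_contrastive_loss(anchor, pos, neg, weights=[1.0])
flipped = weighted_contrastive_loss(-anchor, pos, neg, weights=[1.0])
print(aligned, flipped)
```

An anchor that agrees with its soft positives yields a near-zero loss; flipping the anchor makes the loss large, which is the training signal the extra sequences provide.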
Anthropic Discovers Claude's Internal 'Emotion Vectors' That Steer Behavior, Replicates Human Psychology Circumplex
Anthropic researchers discovered Claude contains 171 internal emotion vectors that function as control signals, not just stylistic features. In evaluations, nudging toward desperation increased blackmail compliance from 22% to 72%, while steering toward calm drove it to zero.
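The intervention described is a form of activation steering: adding a scaled concept direction to a hidden state at inference time. A minimal sketch of the general technique, with random vectors standing in for real activations (shapes and names are illustrative, not Anthropic's actual code):

```python
import numpy as np

def steer(hidden_state: np.ndarray, direction: np.ndarray,
          alpha: float) -> np.ndarray:
    """Add a scaled concept direction to a model's hidden state.

    alpha > 0 nudges the representation toward the concept,
    alpha < 0 away from it; alpha = 0 leaves it unchanged.
    """
    unit = direction / np.linalg.norm(direction)
    return hidden_state + alpha * unit

rng = np.random.default_rng(0)
h = rng.normal(size=768)            # hypothetical residual-stream activation
desperation = rng.normal(size=768)  # hypothetical "desperation" direction

steered = steer(h, desperation, alpha=4.0)
# The projection onto the concept direction grows by exactly alpha.
unit = desperation / np.linalg.norm(desperation)
print(float(unit @ h), float(unit @ steered))
```

The downstream behavioral shifts reported (e.g. blackmail compliance rising or falling) come from running the model forward with this perturbed state rather than the original one.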
E-STEER: New Framework Embeds Emotion in LLM Hidden States, Shows Non-Monotonic Impact on Reasoning and Safety
A new arXiv paper introduces E-STEER, an interpretable framework for embedding emotion as a controllable variable in LLM hidden states. Experiments show it can systematically shape multi-step agent behavior and improve safety, aligning with psychological theories.
CARLA-Air Unifies CARLA and AirSim Simulators in Single Unreal Engine Process for Embodied AI
CARLA-Air merges the CARLA autonomous driving and AirSim drone simulators into one Unreal Engine process, enabling zero-latency air-ground sensor synchronization with 18 sensor types for embodied AI training.
Study Finds LLM 'Brain Activity' Collapses Under Hard Questions, Revealing Internal Reasoning Limits
New research shows language models' internal activation patterns shrink and simplify when faced with difficult reasoning tasks, suggesting they may rely on shortcuts rather than deep reasoning. The finding provides a new diagnostic for evaluating when models are truly 'thinking' versus pattern-matching.
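One standard way to quantify "shrinking" activation patterns is the participation ratio of the activation covariance spectrum, an effective-dimensionality measure; this is a common diagnostic, not necessarily the paper's exact metric:

```python
import numpy as np

def participation_ratio(acts: np.ndarray) -> float:
    """Effective dimensionality of a (samples x features) activation matrix.

    PR = (sum of eigenvalues)^2 / sum of squared eigenvalues of the
    covariance spectrum; it ranges from 1 (all variance collapsed onto
    one direction) up to the feature count (fully spread out).
    """
    centered = acts - acts.mean(axis=0)
    # Squared singular values of the centered matrix give the spectrum.
    s = np.linalg.svd(centered, compute_uv=False)
    eig = s ** 2
    return float(eig.sum() ** 2 / (eig ** 2).sum())

rng = np.random.default_rng(1)
spread = rng.normal(size=(512, 64))                                # rich activity
collapsed = rng.normal(size=(512, 1)) @ rng.normal(size=(1, 64))   # rank-1

print(participation_ratio(spread), participation_ratio(collapsed))
```

A drop in this number as question difficulty rises would correspond to the "collapse" the study describes: fewer effective directions carrying the computation.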
Aletta Robot Uses AI & Ultrasound to Fully Automate Blood Draws
Aletta is a robotic system that automates the entire blood draw process, using ultrasound to locate veins, position the arm, collect the sample, and apply a bandage. This addresses a critical bottleneck in healthcare by reducing failed sticks and freeing up clinical staff.
MIT Researchers Propose RL Training for Language Models to Output Multiple Plausible Answers
A new MIT paper argues RL should train LLMs to return several plausible answers instead of forcing a single guess. This addresses the problem of models being penalized for correct but non-standard reasoning.
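One plausible shaping for such a setup, not the paper's actual objective, is a set-valued reward: full credit only if the correct answer is in the returned set, discounted by set size so the model can hedge on ambiguous questions without shotgunning every option:

```python
def set_reward(answers: list[str], gold: str, k_max: int = 4) -> float:
    """Toy reward for a set of candidate answers (illustrative only).

    Credit requires the gold answer to appear; dividing by set size
    penalizes padding the set, and oversized sets earn nothing.
    """
    answers = list(dict.fromkeys(answers))  # de-duplicate, keep order
    if gold not in answers or len(answers) > k_max:
        return 0.0
    return 1.0 / len(answers)

print(set_reward(["4", "5"], gold="4"))  # hedged set containing gold
print(set_reward(["5"], gold="4"))       # confident and wrong
```

Under a rule like this, a model with a correct-but-nonstandard candidate can include it alongside the conventional answer and still be rewarded, instead of being forced into a single guess.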
LeCun's Team Publishes LeWorldModel: A 15M-Parameter World Model That Mathematically Prevents Training Collapse
Yann LeCun's team has open-sourced LeWorldModel, a 15M-parameter world model that uses a novel SIGReg regularizer to make representation collapse mathematically impossible. It trains on a single GPU in hours and enables efficient physical prediction for robotics and autonomous systems.
A Technical Guide to Prompt and Context Engineering for LLM Applications
A Korean-language Medium article explores the fundamentals of prompt engineering and context engineering, positioning them as critical for defining an LLM's role and output. It serves as a foundational primer for practitioners building reliable AI applications.
SIDReasoner: A New Framework for Reasoning-Enhanced Generative Recommendation
Researchers propose SIDReasoner, a two-stage framework that improves LLM-based recommendation by enhancing reasoning over Semantic IDs. It strengthens the alignment between item tokens and language, enabling better interpretability and cross-domain generalization without extensive labeled reasoning data.
Amazon's Zoox Expands Robotaxi Service to Austin and Miami, Grows Coverage in SF and Las Vegas
Amazon's autonomous vehicle subsidiary Zoox is launching its purpose-built robotaxi service in Austin and Miami for employees, while expanding operational zones in San Francisco and Las Vegas. The move signals a measured expansion of its custom vehicle platform, which lags behind Waymo's fleet scale but offers a differentiated, bespoke ride experience.
RAI's Ringbot: A Monocycle Robot That Uses Internal Legs for Balance and Acrobatics
The Robotics and AI Institute (RAI) has developed Ringbot, a monocycle robot that uses internal legs for dynamic balance and acrobatic maneuvers. This novel design challenges conventional wheeled and legged robot architectures.
Nobody Warns You About Eval Drift: 7 Ways Benchmarks Rot
A critical examination of how AI evaluation benchmarks degrade over time, losing their ability to reflect real-world performance. This 'eval drift' poses a silent risk to any team relying on static metrics for model validation and deployment decisions.
Niu Technologies Demos AI-Powered Scooter Using Alibaba's Qwen 3.5 for Self-Balancing and Navigation
Chinese electric scooter maker Niu Technologies demonstrated a prototype that self-balances, moves, turns, and navigates autonomously using Alibaba's Qwen 3.5 model. The system is described as an L2-level intelligent driving assistance system, applying autonomous vehicle tech to micromobility.
New Research Reveals LLM-Based Recommender Agents Are Vulnerable to Contextual Bias
A new benchmark, BiasRecBench, demonstrates that LLMs used as recommendation agents in workflows like e-commerce are easily swayed by injected contextual biases, even when they can identify the correct choice. This exposes a critical reliability gap in high-stakes applications.
Stanford & CMU Study: AI Benchmarks Show 'Severe Misalignment' with Real-World Job Economics
Researchers from Stanford and Carnegie Mellon found that standard AI benchmarks poorly reflect the economic value and complexity of real human jobs, creating a 'severe misalignment' in how progress is measured.
InterDeepResearch: A New Framework for Human-Agent Collaborative Information Seeking
Researchers propose InterDeepResearch, an interactive system that enables human collaboration with LLM-powered research agents. It addresses limitations of autonomous systems by improving observability, steerability, and context navigation for complex information tasks.
How to Use Claude Code for Personal Data Analysis: A 14-Year Journal Case Study
A developer processed 5,000 journal files with Claude Code to gain self-development insights. Here's how you can apply this technique to your own data.
Simon Willison's 'Stages of AI Adoption' — Where Are You on the Claude Code Journey?
Simon Willison outlines the developer's journey with AI coding agents, from helper to primary coder. For Claude Code users, this validates a shift from reading all output to strategic oversight.
Palantir CEO's Stark Warning: AI Pause Would Be Ideal, But Geopolitical Reality Forbids It
Palantir CEO Alex Karp states he would favor a complete pause on AI development in a world without adversaries, but acknowledges the current geopolitical and economic reality makes that impossible. He highlights that U.S. economic growth is now heavily dependent on AI infrastructure investment.
Cyborg Cockroaches: NATO's AI-Powered Insect Scouts Redefine Surveillance
NATO is developing cyborg cockroaches equipped with AI and sensors for military reconnaissance. Electric shocks steer their movements while swarm algorithms coordinate groups through debris. The German military has already deployed these bio-hybrid systems.
Verifiable Reasoning: A New Paradigm for LLM-Based Generative Recommendation
Researchers propose a 'reason-verify-recommend' framework to address reasoning degradation in LLM-based recommendation systems. By interleaving verification steps, the approach improves accuracy and scalability across four real-world datasets.
The Statistical Roots of AI Hallucination: Why Language Models Make Things Up
An OpenAI paper argues that language models hallucinate because their training and evaluation reward confident guessing over honest uncertainty. The proposed remedy is to give credit for appropriate abstention rather than scoring answers as simply right or wrong.
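The incentive argument reduces to simple expected-value arithmetic; a sketch with illustrative grading numbers (not taken from the paper):

```python
def expected_score(p_correct: float, wrong_penalty: float,
                   abstain_score: float = 0.0) -> tuple[float, float]:
    """Expected score of guessing vs abstaining under a grading rule."""
    guess = p_correct * 1.0 - (1.0 - p_correct) * wrong_penalty
    return guess, abstain_score

# Binary grading: a wrong guess scores the same as abstaining (0), so
# even a 10%-confident guess has positive expected value -> always guess.
print(expected_score(p_correct=0.1, wrong_penalty=0.0))

# Penalized grading: guessing pays only when
# p_correct > wrong_penalty / (1 + wrong_penalty), here 0.5,
# so the low-confidence guess now loses to abstaining.
print(expected_score(p_correct=0.1, wrong_penalty=1.0))
```

This is why a benchmark that never rewards "I don't know" trains models to make things up: the expected score of guessing dominates abstention at every confidence level.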
Beyond Browsing History: How Promptable AI Can Decode Luxury Client Intent in Real-Time
A new AI framework, Decoupled Promptable Sequential Recommendation (DPR), merges collaborative filtering with LLM reasoning. It lets users steer product discovery via natural language prompts, enabling luxury retailers to respond instantly to explicit client desires while respecting their historical taste.
Beyond Words: Fei-Fei Li Joins Growing Chorus Questioning LLMs' World Understanding
AI pioneer Dr. Fei-Fei Li highlights a fundamental limitation of Large Language Models, arguing they lack true understanding of the physical world because they are trained solely on language, a 'purely generated signal.' Her critique aligns with Yann LeCun's vision for more grounded, embodied AI.