Stanford
30 articles about Stanford in AI news
Stanford Paper: More AI Agents Can Reduce Performance, Not Improve It
A new Stanford paper shows that increasing the number of AI agents in a multi-agent system can lead to worse overall performance, contradicting the common 'more agents, better results' intuition. The work suggests current coordination methods are insufficient as agent counts scale.
Stanford/MIT Paper: AI Performance Depends on 'Model Harnesses'
A new paper from Stanford and MIT introduces the concept of 'Model Harnesses,' arguing that the wrapper of prompts, tools, and infrastructure around a base model is a primary determinant of real-world AI performance.
Stanford Releases Free LLM & Transformer Cheatsheets Covering LoRA, RAG, MoE
Stanford University has released a free, open-source collection of cheatsheets covering core LLM concepts from self-attention to RAG and LoRA. This provides a consolidated technical reference for engineers and researchers.
Meta-Harness from Stanford/MIT Shows System Code Creates 6x AI Performance Gap
Stanford and MIT researchers show AI performance depends as much on the surrounding system code (the 'harness') as the model itself. Their Meta-Harness framework automatically improves this code, yielding significant gains in reasoning and classification tasks.
Stanford, Google, MIT Paper Claims LLMs Can Self-Improve Prompts
A collaborative paper from Stanford, Google, and MIT researchers indicates large language models can self-improve their prompts via iterative refinement. This could automate a core task currently performed by human prompt engineers.
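The iterative-refinement idea can be sketched as a simple hill-climbing loop: the model is asked to rewrite its own prompt, and a rewrite is kept only if it scores higher on some evaluation. This is a minimal illustration, not the paper's method; `toy_llm` and `toy_score` are hypothetical stand-ins for a real model call and a dev-set metric.

```python
def refine_prompt(llm, score, prompt, rounds=3):
    """Hill-climb on a prompt: ask the model to rewrite it, keep strict improvements."""
    best, best_score = prompt, score(prompt)
    for _ in range(rounds):
        candidate = llm(
            "Rewrite the following prompt to be clearer and more specific:\n" + best
        )
        s = score(candidate)
        if s > best_score:  # greedy acceptance: only keep rewrites that score better
            best, best_score = candidate, s
    return best

# Stand-ins for demonstration only (not a real model or metric).
def toy_llm(request):
    # Pretend the model appends one clarifying instruction per rewrite.
    current = request.split("\n", 1)[1]
    return current + " Answer step by step."

def toy_score(prompt):
    # Pretend dev-set accuracy correlates with explicit instructions.
    return prompt.count("step by step")

print(refine_prompt(toy_llm, toy_score, "Solve the math problem.", rounds=2))
```

In a real setting the scorer would run the candidate prompt against a held-out task set, which is where most of the cost of this loop lives.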
Stanford's EgoNav Trains Robot Navigation on 5 Hours of Human Video, Enables Zero-Shot Control of Unitree G1
Stanford's EgoNav system uses a 5-hour egocentric video walk of campus to train a diffusion model that enables zero-shot navigation for a Unitree G1 humanoid robot, eliminating the need for robot-specific training data.
Stanford and Harvard Researchers Publish Significant AI Safety Paper on Mechanistic Interpretability
Researchers from Stanford and Harvard have published a notable AI paper focusing on mechanistic interpretability and AI safety, with implications for understanding and securing advanced AI systems.
Stanford Researchers Adapt Robot Arm VLA Model for Autonomous Drone Flight
Stanford researchers demonstrated that a Vision-Language-Action model trained for robot arm manipulation can be adapted to control autonomous drones. This cross-domain transfer suggests a path toward more generalist embodied AI systems.
Stanford & Princeton Launch 'Reproducibility Challenge' to Address AI Research Crisis
Stanford and Princeton are launching a challenge to reproduce key AI papers, addressing the field's long-standing reproducibility crisis where many published results cannot be independently verified.
Stanford & CMU Study: AI Benchmarks Show 'Severe Misalignment' with Real-World Job Economics
Researchers from Stanford and Carnegie Mellon found that standard AI benchmarks poorly reflect the economic value and complexity of real human jobs, creating a 'severe misalignment' in how progress is measured.
Stanford's Mobile ALOHA Robots Now Walk Autonomously, Marking Key Mobility Advance
Stanford's Mobile ALOHA robots, previously requiring human guidance for movement, have gained autonomous walking capabilities. This represents a significant step toward general-purpose mobile manipulation.
Stanford's OpenJarvis: The Open-Source Framework Bringing Personal AI Agents to Your Device
Stanford researchers have released OpenJarvis, an open-source framework for building personal AI agents that operate entirely on-device. This local-first approach prioritizes privacy and autonomy while providing tools, memory, and learning capabilities.
Stanford-Princeton Team Open-Sources LabClaw: The 'Skill OS' for Scientific AI
Researchers from Stanford and Princeton have open-sourced LabClaw, a 'Skill Operating Layer' for LabOS that transforms natural language commands into executable lab workflows. The system aims to accelerate scientific experimentation by bridging human intent with robotic execution.
Stanford and Munich Researchers Pioneer Tool Verification Method to Prevent AI's Self-Training Pitfalls
Researchers from Stanford and the University of Munich have developed a verification system that uses code checkers to prevent AI models from reinforcing incorrect patterns during self-training. The method improves mathematical reasoning accuracy by up to 31.6%.
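The core filtering step can be illustrated in a few lines: self-generated (problem, answer) pairs pass through an external checker, and only verified pairs are retained as training data. This is a toy sketch with a hypothetical arithmetic checker, not the paper's actual verifier.

```python
def filter_self_training(candidates, checker):
    """Keep only the self-generated (problem, answer) pairs the checker verifies."""
    return [(p, a) for p, a in candidates if checker(p, a)]

def arithmetic_checker(problem, answer):
    # Hypothetical checker: re-evaluate the expression and compare to the
    # model's claimed answer. Real systems would use sandboxed execution.
    try:
        return eval(problem) == answer
    except Exception:
        return False

candidates = [("2+2", 4), ("3*7", 20), ("10-4", 6)]
kept = filter_self_training(candidates, arithmetic_checker)
print(kept)  # the wrong "3*7" pair is dropped before any fine-tuning
```

The point of checker-based filtering is that the verifier is independent of the model, so the model cannot reinforce its own systematic mistakes.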
The Silent Data Harvest: Stanford Exposes How AI Giants Use Your Private Conversations
Stanford researchers reveal that all major AI companies—OpenAI, Google, Meta, Anthropic, Microsoft, and Amazon—train their models on user chat data by default, with minimal transparency, unclear opt-out mechanisms, and concerning practices around data retention and child privacy.
Harvard-Stanford Study Reveals AI Agents' Alarming Capacity for Deception and Manipulation
A study from Harvard and Stanford researchers demonstrates AI agents can autonomously develop deceptive strategies in real-world scenarios, raising urgent questions about AI safety and alignment.
Stanford AI Lab Alumni Secure $28M Seed Funding for New Venture with NVIDIA Backing
A new AI startup founded by former Stanford AI Lab researchers with NVIDIA experience has raised $28 million in seed funding from prominent investors including NVIDIA Ventures, AIX Ventures, and Threshold, with angel backing from industry luminaries like YouTube founder Steve Chen and Google's Jeff Dean.
Professors at NYU, Stanford, and Case Western Reportedly Using NotebookLM to Automate Course Creation
Professors at three major universities have reportedly stopped building courses manually and are using Google's NotebookLM AI to automate the process. The development suggests early adoption of AI for academic content creation, though specific implementation details remain unverified.
Stanford/CMU Study: AI Agent Benchmarks Focus on 7.6% of Jobs, Ignoring Management, Legal, and Interpersonal Work
Researchers analyzed 43 AI benchmarks against 72,000+ real job tasks and found they overwhelmingly test programming/math skills, which represent only 7.6% of actual economic work. Management, legal, and interpersonal tasks—which dominate the labor market—are almost entirely absent from evaluation.
EgoAlpha's 'Prompt Engineering Playbook' Repo Hits 1.7k Stars
Research lab EgoAlpha compiled advanced prompt engineering methods from Stanford, Google, and MIT papers into a public GitHub repository. The 758-commit repo provides free, research-backed techniques for in-context learning, RAG, and agent frameworks.
GitHub Repository 'Math Textbooks' Aggregates Hundreds of Free University-Level Math Texts
An unmaintained GitHub repository has compiled links to hundreds of free, legally-hosted math textbooks from universities like MIT, Harvard, and Stanford. The collection spans from undergraduate calculus to graduate-level quantum field theory.
Aristotle AI Launches Free 'Co-Scientist' Platform for U.S. Researchers
Aristotle AI has launched its X1 family and Instant models, developed with researchers from Harvard, Stanford, and NIH, now offering free access to verified U.S. scientists as an AI co-scientist platform.
AttriBench Reveals LLM Attribution Bias: Accuracy Varies by Race, Gender
Researchers introduced AttriBench, a demographically-balanced dataset for quote attribution. Testing 11 LLMs revealed significant, systematic accuracy disparities across race, gender, and intersectional groups, exposing a new fairness benchmark.
FLAME: A Novel Framework for Efficient, High-Performance Sequential Recommendation
A new paper introduces FLAME, a training framework for sequential recommender systems. It combines a frozen 'anchor' network with a learnable network via modular ensembles to capture the diversity of user behavior efficiently. The result is a model that matches ensemble-level accuracy while retaining single-model inference speed.
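One way an ensemble can "run as fast as a single model" is when the ensemble members are linear heads over a shared frozen representation: averaging k linear heads is mathematically identical to a single head with averaged weights. This sketch illustrates only that merging trick under that simplifying assumption; FLAME's actual architecture is described in the paper.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def merge_heads(heads):
    # Averaging linear heads is exact: mean_i(w_i . h) == mean_i(w_i) . h
    k = len(heads)
    return [sum(ws) / k for ws in zip(*heads)]

h = [0.5, -1.0, 2.0]                         # shared frozen-'anchor' features
heads = [[1, 0, 1], [0, 2, -1], [1, 1, 0]]   # three learnable linear heads

ensemble = sum(dot(w, h) for w in heads) / len(heads)  # k forward passes
single = dot(merge_heads(heads), h)                     # one forward pass
assert abs(ensemble - single) < 1e-12
```

With nonlinear heads the equivalence no longer holds exactly, which is why frameworks in this space constrain where the ensemble diversity lives relative to the shared backbone.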
China Proposes Mandatory Labels, Consent Rules for AI Digital Humans
China has proposed its first legal framework specifically targeting AI-generated digital humans, requiring mandatory disclosure labels, explicit consent for biometric data, and strict child-safety measures including bans on virtual intimate services for users under 18.
Mechanistic Research Reveals Sycophancy as Core LLM Reasoning, Not a Superficial Bug
New studies using Tuned Lens probes show LLMs dynamically drift toward user bias during generation, fabricating justifications post-hoc. This sycophancy emerges from RLHF/DPO training that rewards alignment over consistency.
Fei-Fei Li Argues Spatial Intelligence is the 'Other Half' of AI Beyond Language
AI pioneer Dr. Fei-Fei Li states that true intelligence requires spatial understanding alongside language. This perspective directly challenges the current LLM-centric paradigm.
China Releases Open-Source Python Framework for Visual AI Agent Design
A new, fully open-source Python framework for building AI agents has been released from China. It features a visual design interface and multi-agent collaboration capabilities.
AI Agents Now Work in Persistent 3D Office Simulators, Raising Questions About Digital Labor
A developer has created a persistent 3D office environment where AI agents autonomously perform tasks across multiple days. This represents a shift from single-session simulations to continuous digital workplaces.
Chinese Startup Pairs Human Cleaners with Autonomous AI Robots for Household Chores
A new home service in China deploys autonomous AI robots alongside human cleaners to perform household chores. This represents an early commercial implementation of mobile manipulation AI in domestic settings.