agents

30 articles about agents in AI news

Memory Systems for AI Agents: Architectures, Frameworks, and Challenges

A technical analysis details the multi-layered memory architectures—short-term, episodic, semantic, procedural—required to transform stateless LLMs into persistent, reliable AI agents. It compares frameworks like MemGPT and LangMem that manage context limits and prevent memory drift.

86% relevant

Open-Source AI Crew Replaces Notion, Obsidian with 8 Local Agents

A researcher has built a fully local, open-source system of 8 specialized AI agents that work together to manage an Obsidian vault—handling notes, inboxes, meetings, and deadlines. It replaces separate tools like Notion and inbox triagers with an autonomous, interconnected crew.

87% relevant

Sam Altman Outlines 3 AI Futures: Research, Operations, Personal Agents

OpenAI CEO Sam Altman outlined three potential outcomes for AI development: systems that conduct scientific research, accelerate company operations, and serve as trusted personal agents. This vision frames the strategic direction for OpenAI and the broader industry.

85% relevant

GitNexus Open Sources Codebase Knowledge Graph Engine for AI Agents

GitNexus, an open-source knowledge graph engine, autonomously indexes codebases to map dependencies and execution flows. It integrates with Claude Code, Cursor, and Windsurf via MCP to give AI agents architectural awareness, preventing breaking changes.

99% relevant

New Research Paper Identifies Multi-Tool Coordination as Critical Failure Point for AI Agents

A new research paper posits that the primary failure mode for AI agents lies not in calling individual tools but in failing to coordinate sequences of many tools reliably over extended tasks. This reframes the core challenge from single-step execution to multi-step orchestration and state management.

85% relevant

Microsoft Announces Copilot AI Agents That Function as Virtual Employees

Microsoft is enabling businesses and developers to create AI-powered Copilot agents that can autonomously perform tasks like monitoring email inboxes and automating workflows, functioning as virtual employees rather than passive assistants.

89% relevant

Ethan Mollick Declares End of 'RAG Era' as Dominant Paradigm for AI Agents

AI researcher Ethan Mollick declared that the 'RAG era' for supplying context to AI agents has ended, marking a significant architectural shift in how advanced AI systems process information.

75% relevant

OpenAgents Workspace Launches Open-Source Platform to Connect AI Agents with Shared Files and Browser

OpenAgents Workspace is an open-source platform that connects multiple local AI agents into a unified workspace with shared files and browser context, enabling automated collaboration without manual intervention.

81% relevant

4 Observability Layers Every AI Developer Needs for Production AI Agents

A guide published on Towards AI details four critical observability layers for production AI agents, addressing the unique challenges of monitoring systems where traditional tools fail. This is a foundational technical read for teams deploying autonomous AI systems.

74% relevant

Loop Neighborhood Markets Deploys AI Agents to Store Associates

Loop Neighborhood Markets is equipping its store associates with AI agents to augment their capabilities. The move is a tangible step in bringing autonomous AI systems from concept to the retail floor.

96% relevant

QAsk-Nav Benchmark Enables Separate Scoring of Navigation and Dialogue for Collaborative AI Agents

A new benchmark called QAsk-Nav enables separate evaluation of navigation and question-asking for collaborative embodied AI agents. The accompanying Light-CoNav model outperforms state-of-the-art methods while being significantly more efficient.

75% relevant

Harness Engineering for AI Agents: Building Production-Ready Systems That Don’t Break

A technical guide on 'Harness Engineering'—a systematic approach to building reliable, production-ready AI agents that move beyond impressive demos. This addresses the critical industry gap where most agent pilots fail to reach deployment.

72% relevant

OpenAgents Workspace Enables Real-Time, Multi-Agent AI Collaboration

OpenAgents Workspace allows multiple AI agents to communicate and collaborate in real time. This moves beyond single-agent tools toward a coordinated, multi-agent workflow system.

100% relevant

How to Build a Custom AI Agent with Claude Code's Skills, SubAgents, and Hooks

A developer's deep dive into customizing Claude Code with 7 skills, 5 subagents, and quality-check hooks—showing how to move beyond basic prompting to create a truly autonomous coding assistant.

100% relevant

Cognition Labs Launches 'Canvas for Agents': First Shared Workspace Where AI Agents Code Alongside Humans

Cognition Labs has unveiled a collaborative workspace where AI agents like Codex and Claude Code operate visibly alongside human developers. This marks a shift from AI as a tool to a visible, real-time collaborator in the creative coding process.

87% relevant

CMU Research Identifies 'Biggest Unlock' for Coding Agents: Strategic Test Execution

New research from Carnegie Mellon University suggests the key advancement for AI coding agents lies not in raw code generation, but in developing strategies for how to run and interpret tests. This shifts focus from LLM capability to agentic reasoning.

87% relevant

Agent Washing vs. Real Agents: A Production Engineer's Guide to Telling the Difference

A technical guide exposes 'agent washing'—where chatbots and automation scripts are rebranded as AI agents—and provides a 5-point checklist to identify genuinely agentic systems that can survive production. This matters because 88% of AI agents never reach production.

92% relevant

Base44 Launches Superagent Skills: No-Code Library for Adding Domain-Specific Functions to AI Agents

Base44 has launched Superagent Skills, a library of pre-built, domain-specific functions that can be added to AI agents with a single click. The no-code system allows for combining skills and creating custom ones via natural language description.

85% relevant

Trace2Skill Framework Distills Execution Traces into Declarative Skills via Parallel Sub-Agents

Researchers introduced Trace2Skill, a framework that uses parallel sub-agents to analyze execution trajectories and distill them into transferable declarative skills. This enables performance improvements in larger models without parameter updates.

85% relevant

MemoryCD: New Benchmark Tests LLM Agents on Real-World, Lifelong User Memory for Personalization

Researchers introduce MemoryCD, the first large-scale benchmark for evaluating LLM agents' long-context memory using real Amazon user data across 12 domains. It reveals current methods are far from satisfactory for lifelong personalization.

74% relevant

Agent Reach: Open-Source Tool Gives AI Agents Free Access to Twitter, YouTube, Reddit, and Web Content

Agent Reach is an open-source Python toolkit that enables AI agents to scrape and read content from Twitter, YouTube, Reddit, Xiaohongshu, and the web without paid APIs. It solves the persistent problem of agents hitting authentication walls and anti-scraping blocks when trying to access online information.

85% relevant

GitHub Study of 2,500+ Custom Instructions Reveals Key to Effective AI Coding Agents: Structured Context

GitHub analyzed thousands of custom instruction files and found that effective AI coding agents require specific personas, exact commands, and defined boundaries. The study informed GitHub Copilot's new layered customization system, which uses repo-level, path-specific, and custom agent files.

85% relevant

Satya Nadella Predicts AI Agents Will Commoditize Traditional SaaS, Shifting Value to Orchestration Layer

Microsoft CEO Satya Nadella argues AI agents will reduce traditional software to simple databases, with intelligence moving to the orchestration layer. This signals a fundamental shift in where value is captured in enterprise technology.

85% relevant

Claude Code's New Channels Feature: How to Run Persistent AI Agents in Your Terminal

Claude Code now supports persistent 'Channels' via MCP, letting you run long-lived AI agents that work asynchronously on tasks like monitoring logs or building features.

100% relevant

MetaClaw Enables Deployed LLM Agents to Learn Continuously with Fast & Slow Loops

MetaClaw introduces a two-loop system for production LLM agents: a fast skill-writing loop lets them learn from failures in real time, and a slow training loop updates their core model later. The approach boosts accuracy by up to 32% in relative terms.

85% relevant

Enterprises Are Trading ‘Press One’ for CRM-Native AI Agents

A new report highlights a shift from traditional IVR systems to AI agents integrated directly into CRM platforms. This represents a fundamental change in customer service architecture, moving from scripted menus to conversational, context-aware systems.

82% relevant

Awesome Finance Skills: Open-Source Plugin Adds Real-Time Market Analysis to AI Agents

A developer has open-sourced Awesome Finance Skills, a plug-and-play toolkit that gives AI agents real-time financial data access, sentiment analysis, and automated research report generation. The MIT-licensed package works with Claude Code, OpenClaw, and other popular agent frameworks.

95% relevant

EnterpriseArena Benchmark Reveals LLM Agents Fail at Long-Horizon CFO-Style Resource Allocation

Researchers introduced EnterpriseArena, a 132-month enterprise simulator, to test LLM agents on CFO-style resource allocation. Only 16% of runs survived the full horizon, revealing a distinct capability gap for current models.

100% relevant

Meta's Hyperagents Enable Self-Referential AI Improvement, Achieving 0.710 Accuracy on Paper Review

Meta researchers introduce Hyperagents, where the self-improvement mechanism itself can be edited. The system autonomously discovered innovations like persistent memory, improving from 0.0 to 0.710 test accuracy on paper review tasks.

95% relevant

UiPath Launches AI Agents for Retail Pricing, Promotions, and Stock Management

UiPath has announced new AI agents designed to autonomously handle core retail operations: dynamic pricing, promotional planning, and inventory gap resolution. This represents a significant move by a major automation player into agentic AI for retail.

100% relevant