In a concise social media post that resonated across AI engineering circles, Wharton professor and AI researcher Ethan Mollick declared: "The RAG era was short-lived, but intense." He clarified that while Retrieval-Augmented Generation (RAG) remains useful, it is "no longer the dominant paradigm for supplying context to agents."
This statement from a prominent voice in applied AI research signals a significant architectural shift occurring beneath the surface of mainstream AI discourse. For the past two years, RAG has been the go-to solution for connecting large language models to external knowledge bases, enabling everything from enterprise chatbots to research assistants.
What Happened
Mollick's post marks a watershed moment, publicly acknowledging what many AI engineers have been observing in practice: RAG systems, while revolutionary in their time, are being superseded by more integrated approaches to knowledge retrieval and context management.
RAG emerged as a critical innovation when it became clear that LLMs alone couldn't access current or proprietary information. The architecture—which retrieves relevant documents from a database and injects them into the LLM's context window—became standard practice for building production AI systems. Companies built entire product lines around RAG pipelines, and it became a staple of enterprise AI implementations.
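The retrieve-then-inject pattern described above can be sketched in a few lines. This is a toy illustration, not any specific vendor's pipeline: the bag-of-words "embedding" and the document list are stand-ins for the dense vector models and document stores a production RAG system would use.

```python
from collections import Counter
import math

def embed(text: str) -> Counter:
    # Toy bag-of-words "embedding"; real systems use dense vector models.
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * \
           math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def retrieve(query: str, docs: list[str], k: int = 2) -> list[str]:
    # Rank all documents by similarity to the query, keep the top k.
    q = embed(query)
    ranked = sorted(docs, key=lambda d: cosine(q, embed(d)), reverse=True)
    return ranked[:k]

def build_prompt(query: str, docs: list[str]) -> str:
    # Injected context precedes the question -- the standard RAG prompt shape.
    context = "\n".join(f"- {d}" for d in retrieve(query, docs))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"

docs = [
    "Q3 revenue grew 12% year over year.",
    "The company was founded in 2011.",
    "Headcount reached 450 employees.",
]
print(build_prompt("What was Q3 revenue growth?", docs))
```

The separation is visible in the code itself: retrieval and prompt construction are distinct stages that run before the model is ever called, which is exactly the round trip the newer integrated approaches collapse.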
Context: The Rise and Evolution of RAG
The RAG paradigm gained prominence around 2023-2024 as organizations sought to ground LLM outputs in factual information. Key advantages included:
- Reducing hallucinations by providing source material
- Enabling access to private or recent data not in training sets
- Creating auditable chains of evidence for generated responses
However, RAG systems came with well-documented limitations: latency from multiple retrieval steps, context window constraints that limited how much information could be injected, and the "needle in a haystack" problem where relevant information might be missed in retrieval.
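The retrieval-miss failure mode is easy to reproduce. In this contrived sketch, the only truly relevant document describes revenue as "turnover," so a lexical scorer ranks it dead last; dense embeddings mitigate this vocabulary-mismatch problem but do not eliminate it.

```python
def keyword_score(query: str, doc: str) -> int:
    # Count shared words between query and document (lexical overlap only).
    q, d = set(query.lower().split()), set(doc.lower().split())
    return len(q & d)

docs = [
    "Our turnover rose sharply in Q3",               # relevant, zero word overlap
    "The quarterly report covers revenue and costs", # tangential
    "Revenue recognition policy for subscriptions",  # irrelevant
]
query = "how much revenue last quarter"

ranked = sorted(docs, key=lambda d: keyword_score(query, d), reverse=True)
# The document that actually answers the question lands at the bottom,
# so any reasonable top-k cutoff silently drops it.
print(ranked)
```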
What's Replacing RAG?
While Mollick didn't specify alternatives in his brief post, several architectural shifts are emerging:
1. End-to-End Learned Retrieval: Models that learn to retrieve and process information in a single forward pass, rather than separating retrieval and generation into distinct phases.
2. Mixture of Experts (MoE) Architectures: Systems where different expert models handle different knowledge domains, with routing mechanisms that determine which expert to consult for specific queries.
3. Long-Context Models: LLMs with context windows extending to 1M+ tokens that can hold entire knowledge bases in memory, reducing the need for external retrieval.
4. Agentic Systems with Tool Use: AI agents that can actively search, browse, and query databases using tools rather than passively receiving retrieved context.
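The fourth pattern, agentic tool use, inverts RAG's control flow: instead of a pipeline pushing context at the model, the model pulls information by deciding when and what to query. A minimal sketch of that loop, with a hard-coded stand-in for the LLM and hypothetical tool names (`search_docs`, `query_db`) invented for illustration:

```python
import json

# Hypothetical tool registry; in a real framework these would hit
# live systems (a search index, a SQL database, a browser).
TOOLS = {
    "search_docs": lambda q: ["Policy v2 requires SSO for all accounts."],
    "query_db": lambda sql: [{"active_users": 1832}],
}

def fake_model(messages: list[dict]) -> dict:
    """Stand-in for an LLM call. A real agent would send `messages` to a
    model that returns either a tool call or a final answer."""
    last = messages[-1]["content"]
    if "tool_result" not in last:
        return {"tool": "search_docs", "args": {"q": "SSO policy"}}
    return {"answer": "SSO is required for all accounts under Policy v2."}

def run_agent(question: str, max_steps: int = 5) -> str:
    messages = [{"role": "user", "content": question}]
    for _ in range(max_steps):
        decision = fake_model(messages)
        if "answer" in decision:            # model chose to answer directly
            return decision["answer"]
        tool = TOOLS[decision["tool"]]      # model chose to call a tool
        result = tool(**decision["args"])
        messages.append({"role": "tool",
                         "content": json.dumps({"tool_result": result})})
    return "step budget exhausted"

print(run_agent("Is SSO required?"))
```

Note where the retrieval decision lives: in the model's output, not in a pipeline stage upstream of it. That is the architectural difference Mollick's post gestures at.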
Technical Implications
The shift away from RAG-as-dominant-paradigm has practical implications for AI engineering:
- Simplified Architectures: Systems may move from multi-stage RAG pipelines to single-model approaches
- Reduced Latency: End-to-end approaches eliminate the round-trip time between retrieval and generation components
- Different Skill Sets: Engineers who specialized in building and optimizing RAG systems may need to adapt to new architectural patterns
- Evaluation Changes: Benchmarks that focused on RAG-specific metrics (retrieval accuracy, citation quality) may become less relevant
The Business Impact
Companies that built their competitive advantage on RAG technology face strategic decisions. Startups that raised funding based on "RAG-as-a-service" or specialized RAG tooling may need to pivot or expand their offerings. Meanwhile, enterprises that invested heavily in RAG infrastructure must evaluate whether to continue those investments or transition to newer approaches.
Agentic.news Analysis
Mollick's declaration aligns with several trends we've been tracking in the AI architecture space. In our December 2025 coverage of Google's Gemini 2.0 launch, we noted their emphasis on "native retrieval" capabilities that bypass traditional RAG pipelines. Similarly, Anthropic's Claude 3.5 Sonnet demonstrated significantly improved performance on knowledge-intensive tasks without explicit RAG augmentation, suggesting internal architectural improvements to knowledge access.
This shift represents the natural evolution of AI systems from assembled components toward more integrated architectures. The pattern echoes the pendulum swings of software engineering, where the industry decomposed monoliths into service-oriented architectures and microservices, and in some cases has since consolidated back into more capable unified systems. What's particularly notable is the speed of the transition: RAG rose to prominence in late 2023, peaked in 2024-2025, and by early 2026 is already being superseded.
For practitioners, the key insight isn't that RAG is obsolete—Mollick explicitly states it remains useful—but that it's no longer the default starting point for context-aware AI systems. Engineers should evaluate whether their use cases still benefit from RAG's explicit retrieval-generation separation or whether newer approaches offer better performance with simpler architectures.
The companies best positioned for this shift are those developing foundation models with built-in retrieval capabilities and those creating agent frameworks that treat knowledge access as just another tool in an agent's toolkit, rather than a separate architectural layer.
Frequently Asked Questions
Is RAG completely obsolete now?
No, and Mollick explicitly clarifies this point. RAG remains a useful technique for specific applications, particularly where explicit citation of sources is required or where the retrieval process needs to be auditable. The shift is that RAG is no longer the default or dominant approach for most context-supply problems.
What should AI engineers learn instead of RAG?
Engineers should focus on understanding end-to-end architectures, agent frameworks with tool use capabilities, and how to work with long-context models. Rather than specializing in RAG pipeline optimization, the emerging skill set involves designing systems where knowledge access is integrated rather than bolted on.
How will this affect companies that sell RAG solutions?
Companies offering RAG-as-a-service or specialized RAG tooling will need to expand their offerings to include newer architectural approaches. Some may pivot to become more general "knowledge access" platforms, while others may focus on niche applications where RAG's specific advantages remain relevant.
What benchmarks should we use to evaluate new approaches?
Look beyond traditional RAG benchmarks like retrieval accuracy and citation quality. Evaluate systems on end-to-end task completion, latency, cost per query, and ability to handle complex, multi-step reasoning that requires accessing diverse knowledge sources. The true test is whether the system can accomplish real-world tasks efficiently, not whether it uses a specific architectural pattern.
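An architecture-agnostic harness for those end-to-end metrics can be quite small. This is a sketch under stated assumptions: the per-token price is a made-up placeholder, and `toy_system` stands in for whatever agent or model you are actually benchmarking.

```python
import time

# Hypothetical flat rate for illustration; substitute your provider's pricing.
COST_PER_1K_TOKENS = 0.002

def evaluate(system, cases: list[dict]) -> dict:
    """Score a context-supplying system on task completion, latency, and
    cost, rather than on retrieval-specific metrics like recall@k."""
    results = []
    for case in cases:
        start = time.perf_counter()
        answer, tokens_used = system(case["question"])
        results.append({
            "correct": case["expected"].lower() in answer.lower(),
            "latency_s": time.perf_counter() - start,
            "cost_usd": tokens_used / 1000 * COST_PER_1K_TOKENS,
        })
    n = len(results)
    return {
        "task_success_rate": sum(r["correct"] for r in results) / n,
        "mean_latency_s": sum(r["latency_s"] for r in results) / n,
        "mean_cost_usd": sum(r["cost_usd"] for r in results) / n,
    }

def toy_system(question: str):
    # Stand-in returning (answer, tokens consumed); wrap a real agent here.
    return ("Paris is the capital of France.", 120)

report = evaluate(toy_system, [
    {"question": "Capital of France?", "expected": "Paris"},
])
print(report)
```

Because the harness only sees questions in and answers out, the same scorecard applies equally to a RAG pipeline, a long-context model, or a tool-using agent, which is precisely what makes architecture comparisons fair.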