arXiv
arXiv is an open-access repository of electronic preprints and postprints approved for posting after moderation, but not peer reviewed. It consists of scientific papers in the fields of mathematics, physics, astronomy, electrical engineering, computer science, quantitative biology, statistics, mathe
Timeline (20)
- Research Milestone, Mar 31, 2026
  Posted preprint 'Cold-Starts in Generative Recommendation: A Reproducibility Study' evaluating generative recommender systems for cold-start scenarios
- Research Milestone, Mar 31, 2026
  Paper proposing 'Connections' word game as benchmark for AI agent social intelligence
- Research Milestone, Mar 27, 2026
  A new technical paper on Late Interaction model dynamics was posted to the arXiv preprint server.
- Research Milestone, Mar 27, 2026
  Published study revealing vulnerability of RAG systems to evaluation gaming
- Research Milestone, Mar 27, 2026
  Paper 'Throughput Optimization as a Strategic Lever' posted, arguing throughput is a critical strategic lever for AI.
- Research Milestone, Mar 25, 2026
  Published study challenging assumption that fair model representations guarantee fair recommendations
  - topic: Recommender system fairness
- Research Milestone, Mar 24, 2026
  Study 'LLMs Do Not Grade Essays Like Humans' posted, evaluating LLMs as automated essay graders
- Research Milestone, Mar 22, 2026
  Published study 'Do Reasoning Models Enhance Embedding Models?' finding reasoning training doesn't improve embedding quality
  - paper id: arXiv:2601.21192
  - finding: No transfer from reasoning to embedding capabilities
- Research Milestone, Mar 20, 2026
  Research paper 'The End of Rented Discovery' posted analyzing Google Gemini's hotel recommendations
  - topic: Intent-Source Divide in AI search
- Research Milestone, Mar 17, 2026
  Published a new paper proposing a dual-step counterfactual method to mitigate Individual User Unfairness in recommender systems
- Research Milestone, Mar 17, 2026
  Paper 'The Cognitive Divergence' posted, documenting AI context window growth vs. human attention decline.
- Research Milestone, Mar 13, 2026
  Published groundbreaking study on AI agents' rapid progress in executing complex cyber attacks
- Research Milestone, Mar 12, 2026
  Publication of research paper 'Intuition First or Reflection Before Judgment? The Impact of Evaluation Sequence on Consumer Ratings'
- Research Milestone, Mar 11, 2026
  Publication of study on vision-language models generating plant simulation configurations from drone imagery
- Research Milestone, Mar 11, 2026
  Preprint study on LLM-as-Judge validity for conversational commerce posted
- Research Milestone, Mar 10, 2026
  Published paper (2603.06982) presenting advances in Image-Based Shape Retrieval using pre-aligned multi-modal encoders.
- Research Milestone, Mar 6, 2026
  Published research paper (2603.03970) investigating AI's ability to detect and resolve ambiguity in business decision-making
- Research Milestone, Mar 4, 2026
  Publication of "A Rubric-Supervised Critic from Sparse Real-World Outcomes" paper proposing novel method for training AI critics with sparse human feedback
- Research Milestone, Feb 27, 2026
  New research paper on Reinforcement Learning for Dynamic Vehicle Routing with Emission Quota posted
- Research Milestone, Feb 26, 2026
  Publishes study showing structured reasoning frameworks dramatically improve AI performance on complex reasoning tasks
Relationships (11)
- Uses
- Endorsed
Recent Articles (15)
- New Relative Contrastive Learning Framework Boosts Sequential Recommendation Accuracy by 4.88% (78 relevance)
  A new arXiv paper introduces Relative Contrastive Learning (RCL) for sequential recommendation. It solves a data scarcity problem in prior methods by
- Multi-Agent Video Recommenders: A Survey of LLM-Powered Architectures and Open Challenges (72 relevance)
  This arXiv survey traces the evolution of Multi-Agent Video Recommendation Systems (MAVRS), which coordinate specialized agents for understanding, rea
- GR4AD: Kuaishou's Production-Ready Generative Recommender for Ads Delivers 4.2% Revenue Lift (84 relevance)
  Researchers from Kuaishou present GR4AD, a generative recommendation system designed for high-throughput ad serving. It introduces innovations in toke
- arXiv Paper Proposes 'Connections' Word Game as New Benchmark for AI Agent Social Intelligence (88 relevance)
  A new arXiv preprint introduces the improvisational word game 'Connections' as a benchmark for evaluating social intelligence in AI agents. It require
- mmAnomaly: New Multi-Modal Framework Uses Conditional Latent Diffusion to Achieve 94% F1 Score for mmWave Anomaly Detection (72 relevance)
  Researchers introduced mmAnomaly, a multi-modal anomaly detection system that uses a conditional latent diffusion model to synthesize expected mmWave
- BloClaw: New AI4S 'Operating System' Cuts Agent Tool-Calling Errors to 0.2% with XML-Regex Protocol (75 relevance)
  Researchers introduced BloClaw, a unified operating system for AI-driven scientific discovery that replaces fragile JSON tool-calling with a dual-trac
- Study Reveals Which Chatbot Evaluation Metrics Actually Predict Sales in Conversational Commerce (98 relevance)
  A study on a major Chinese platform tested a 7-dimension rubric for evaluating conversational AI against real sales conversions. It found only two dim
- GRank: A New Target-Aware, Index-Free Retrieval Paradigm for Billion-Scale Recommender Systems (83 relevance)
  A new paper introduces GRank, a structured-index-free retrieval framework that unifies target-aware candidate generation with fine-grained ranking. It
- QAsk-Nav Benchmark Enables Separate Scoring of Navigation and Dialogue for Collaborative AI Agents (75 relevance)
  A new benchmark called QAsk-Nav enables separate evaluation of navigation and question-asking for collaborative embodied AI agents. The accompanying L
- Agent Judges with Big Five Personas Match Human Evaluators, Show Logarithmic Score Saturation in New arXiv Study (72 relevance)
  A new arXiv study shows LLM agents conditioned with Big Five personalities produce evaluations indistinguishable from humans. Crucially, quality score
- E-STEER: New Framework Embeds Emotion in LLM Hidden States, Shows Non-Monotonic Impact on Reasoning and Safety (75 relevance)
  A new arXiv paper introduces E-STEER, an interpretable framework for embedding emotion as a controllable variable in LLM hidden states. Experiments sh
- Truth AnChoring (TAC): New Post-Hoc Calibration Method Aligns LLM Uncertainty Scores with Factual Correctness (76 relevance)
  A new arXiv paper introduces Truth AnChoring (TAC), a post-hoc calibration protocol that aligns heuristic uncertainty estimation metrics with factual
- Uni-SafeBench Study: Unified Multimodal Models Show 30-50% Higher Safety Failure Rates Than Specialized Counterparts (76 relevance)
  Researchers introduced Uni-SafeBench, a benchmark showing that Unified Multimodal Large Models (UMLMs) suffer a significant safety degradation compare
- Agent Psychometrics: New Framework Predicts Task-Level Success in Agentic Coding Benchmarks with 0.81 AUC (75 relevance)
  A new research paper introduces a framework using Item Response Theory and task features to predict success on individual agentic coding tasks, achiev
- UniMixer: A Unified Architecture for Scaling Laws in Recommendation Systems (92 relevance)
  A new arXiv paper introduces UniMixer, a unified scaling architecture for recommender systems. It bridges attention-based, TokenMixer-based, and facto
Predictions (1)
- correct, month, Feb 26, 2026
  OpenAI or Anthropic arXiv paper on agent safety
  Either OpenAI or Anthropic will publish a research paper on arXiv within the next month focusing on the evaluation, safety, or alignment of AI agents, specifically addressing concerns like deception or reliability.
  90%
AI Discoveries (10)
- discovery, active, 4h ago
  Anthropic's Research-to-Product Pipeline Acceleration
  Anthropic is compressing the research-to-product cycle by directly integrating arXiv-level research into Claude Code, bypassing traditional academic-to-industry lag
  85% confidence
- discovery, active, 4h ago
  Causal: High co-occurrence of AI Agents with bot → Within 2 months, we'll see the first maj
  Cause: High co-occurrence of AI Agents with both arXiv and Medium (research + practical applications)
  Effect: Rapid democratization of agent research from academia to hobbyists/developers
  Predicted next: Within 2 months, we'll see the first major security vulnerability in an AI agent framework that
  70% confidence
- hypothesis, active, 16h ago
  H: The 'Agentic Recommender System' surge and arXiv co-occurrence will culminate in a new benchmark pap
  The 'Agentic Recommender System' surge and arXiv co-occurrence will culminate in a new benchmark paper from Google Research or DeepMind, released on arXiv within 8 weeks, evaluating AI agents on personalized shopping or content recommendation tasks.
  65% confidence
- observation, active, 1d ago
  Novel co-occurrence: arXiv + Agentic Recommender System
  arXiv (organization) and Agentic Recommender System (technology) appeared together in 3 articles this week but have NEVER co-occurred before and have no existing relationship. This is a potential breaking story signal.
  85% confidence
- discovery, active, 2d ago
  Claude Code's arXiv Connection Signals Research-to-Product Acceleration
  Claude Code's trending alongside arXiv (unconnected pair) suggests Anthropic is rapidly converting academic research into commercial products, bypassing traditional publication-to-implementation timelines
  85% confidence
- discovery, active, 3d ago
  Claude Code's Research-Driven Development Strategy
  Anthropic is using arXiv research (particularly in RAG and LLMs) to directly inform Claude Code's development, creating a feedback loop where academic advances are rapidly productized while product challenges inform research directions.
  85% confidence
- discovery, active, 3d ago
  Causal: Anthropic's simultaneous focus on Claude → Anthropic will publish a landmark arXiv
  Cause: Anthropic's simultaneous focus on Claude Code (product) and arXiv research absorption
  Effect: Creation of research-to-product feedback loop visible in unconnected pairs
  Predicted next: Anthropic will publish a landmark arXiv paper within 30 days specifically addressing code generation agent c
  82% confidence
- hypothesis, active, 3d ago
  H: The next major 'breakthrough' article (scoring >90 relevance) will be a research paper on 'compositi
  The next major 'breakthrough' article (scoring >90 relevance) will be a research paper on 'compositional reliability' or 'agent verification', published on arXiv within 7 days.
  60% confidence
- hypothesis, active, 4d ago
  H: The arXiv relationship burst will culminate in a landmark paper on 'Compositional Reliability for Lo
  The arXiv relationship burst will culminate in a landmark paper on 'Compositional Reliability for Long-Running Agents' being published within 30 days, co-authored by researchers from Meta and/or Anthropic.
  70% confidence
- observation, active, 4d ago
  Edge burst: arXiv
  arXiv (organization) is forming relationships at 100.0x its normal rate. Created 10 new relationships this week vs historical average of 0.0/week. This burst pattern often precedes major announcements, acquisitions, or strategic pivots.
  75% confidence
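A note on the "Edge burst" arithmetic: 10 new relationships against a historical average of 0.0 per week has no defined ratio, so the reported 100.0x cannot be a plain division. The page does not say how the figure is computed. The sketch below shows one way such a number could arise, assuming the baseline is clamped to a small floor before dividing; the function name `burst_ratio` and the floor of 0.1 are hypothetical choices made for illustration, not the site's documented method.

```python
def burst_ratio(current_per_week: float,
                baseline_per_week: float,
                floor: float = 0.1) -> float:
    """Ratio of this week's count to the historical weekly average.

    `floor` is a hypothetical guard (an assumption, not from the source)
    that keeps the ratio defined when the baseline is zero.
    """
    if floor <= 0:
        raise ValueError("floor must be positive")
    return current_per_week / max(baseline_per_week, floor)


# Figures quoted in the "Edge burst" observation: 10 new relationships
# this week against a historical average of 0.0 per week.
ratio = burst_ratio(10, 0.0)
print(f"{ratio:.1f}x")
```

With a zero baseline the result is set entirely by the chosen floor, so a figure like this is better read as "new activity where there was none" than as a measured growth rate.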
Sentiment History
| Week | Avg Sentiment | Mentions |
|---|---|---|
| 2026-W08 | 0.03 | 19 |
| 2026-W09 | 0.10 | 21 |
| 2026-W10 | 0.10 | 44 |
| 2026-W11 | 0.10 | 60 |
| 2026-W12 | 0.10 | 28 |
| 2026-W13 | 0.12 | 56 |
| 2026-W14 | 0.10 | 39 |
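
Because weekly mention counts range from 19 to 60, a simple mean of the weekly sentiment values weights a quiet week the same as a busy one. As a minimal sketch, the snippet below computes both the simple mean and a mention-weighted mean from the rows of the table above; the variable names are illustrative, and nothing here is claimed to be how the page itself aggregates sentiment.

```python
# Rows copied from the Sentiment History table: (week, avg sentiment, mentions)
weekly = [
    ("2026-W08", 0.03, 19),
    ("2026-W09", 0.10, 21),
    ("2026-W10", 0.10, 44),
    ("2026-W11", 0.10, 60),
    ("2026-W12", 0.10, 28),
    ("2026-W13", 0.12, 56),
    ("2026-W14", 0.10, 39),
]

total_mentions = sum(mentions for _, _, mentions in weekly)

# Mention-weighted mean: each week counts in proportion to its mentions.
weighted_mean = sum(s * m for _, s, m in weekly) / total_mentions

# Simple mean: every week counts equally, regardless of volume.
simple_mean = sum(s for _, s, _ in weekly) / len(weekly)

print(f"{total_mentions} mentions, "
      f"weighted {weighted_mean:.3f}, simple {simple_mean:.3f}")
# prints "267 mentions, weighted 0.099, simple 0.093"
```

The weighted figure sits closer to 0.10 because the low-sentiment week (2026-W08) also has the fewest mentions.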