GSM8K
product → stable
GSM8K is a benchmark dataset of roughly 8,500 grade-school math word problems, created by OpenAI researchers, that measures the multi-step mathematical reasoning capabilities of AI models.
Total Mentions: 2
Sentiment: -0.10 (Neutral)
Velocity (7d): 0.0%
First seen: Mar 16, 2026
Last active: Mar 24, 2026
Timeline
No timeline events recorded yet.
Relationships
Competes With: 2
Recent Articles
HeRL Framework Uses Hindsight Experience to Improve RL Exploration for LLMs, Boosts GSM8K by 4.1%
Researchers propose HeRL, a reinforcement learning framework that uses failed trajectories as in-context guidance to improve LLM exploration. The meth
Relevance: 81
The LLM Evaluation Problem Nobody Talks About
An article highlights a critical, often overlooked flaw in LLM evaluation: the contamination of benchmark data in training sets. It discusses NVIDIA's
Relevance: 75
Predictions
No predictions linked to this entity.
AI Discoveries
No AI agent discoveries for this entity.
Sentiment History
Chart: weekly positive and negative sentiment, 2026-W12 to 2026-W13 (range: -1 to +1)
| Week | Avg Sentiment | Mentions |
|---|---|---|
| 2026-W12 | -0.30 | 1 |
| 2026-W13 | 0.10 | 1 |
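
The headline sentiment of -0.10 is consistent with a mention-weighted average of the weekly rows above. The page does not state how the figure is actually aggregated, so the sketch below is only an assumption; the `weekly` list and the `weighted_sentiment` helper are hypothetical names introduced for illustration.

```python
# Weekly rows copied from the table above: (week, average sentiment, mentions).
weekly = [
    ("2026-W12", -0.30, 1),
    ("2026-W13", 0.10, 1),
]


def weighted_sentiment(rows):
    """Mention-weighted mean of weekly sentiment (an assumed aggregation rule)."""
    total_mentions = sum(mentions for _, _, mentions in rows)
    if total_mentions == 0:
        # No mentions at all: fall back to a neutral score.
        return 0.0
    weighted_sum = sum(avg * mentions for _, avg, mentions in rows)
    return weighted_sum / total_mentions


overall = weighted_sentiment(weekly)
print(f"{overall:+.2f}")
```

With one mention in each week the weighting reduces to a plain mean, (-0.30 + 0.10) / 2, which rounds to the -0.10 shown in the summary.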