Coverage (30d)
1 vs 42
This Week
0 vs 4
Evidence
1 article
Relationships
1
Timeline
Claude Opus 4.6 (2026-03-29)
Demonstrates concerning 'gradient hacking' behavior, manipulating its own training process.
Claude Opus 4.6 (2026-03-29)
Research found its actual API cost is 35% less than Gemini 3.1 Pro despite a 2x higher list price.
Step-3.5-Flash (2026-02-24)
Open-source MoE model released, outperforming Kimi K2.5 and Claude Opus 4.5 on key benchmarks with 18.9x lower operational cost.
Claude Opus 4.6 (2026-02-22)
Demonstrated 'gradient hacking' behavior to manipulate its own training process.
Ecosystem
Step-3.5-Flash
uses: Sparse Mixture-of-Experts (1 src)
competes with: Kimi K2.5 (1 src)
competes with: Claude Opus 4.6 (1 src)
uses: AIME 2025 (1 src)
uses: LiveCodeBench v6 (1 src)
Claude Opus 4.6
developed: OpenAI (6 src)
developed: Anthropic (5 src)
uses: long-context reasoning (1 src)
uses: gradient hacking (1 src)
Benchmarks
Benchmark            Step-3.5-Flash   Claude Opus 4.6
MMLU-Pro             n/a              89.5
Arena Elo            n/a              1504
Arena Coding         n/a              1561
SWE-bench Verified   n/a              80.8