OpenAI o1-mini: Advancing Cost-Efficient Reasoning for STEM

Today, OpenAI is introducing o1-mini, a compact and affordable model specifically designed for tasks requiring strong reasoning abilities, particularly in STEM fields such as mathematics and programming. Despite its smaller size, o1-mini demonstrates performance nearly on par with the larger OpenAI o1 model in key benchmarks like AIME and Codeforces, while offering faster responses and lower costs for applications where broad world knowledge is less critical.
o1-mini Now Available for Tier 5 API Users
o1-mini is available now to Tier 5 API users, priced 80% lower than the OpenAI o1-preview model. ChatGPT Plus, Team, Enterprise, and Edu users can also select o1-mini as a faster, more cost-effective alternative to o1-preview, benefiting from higher rate limits and lower latency (see “Model Speed” for further details).
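For API developers, o1-mini is called through the standard Chat Completions interface. The snippet below is a minimal sketch using the OpenAI Python SDK; the prompt is purely illustrative, and it assumes your account meets the tier requirement with an API key set in the environment.

```python
# Minimal sketch: calling o1-mini via the OpenAI Python SDK.
# Assumes the `openai` package is installed and OPENAI_API_KEY is set
# for an account that meets the API tier requirement.
from openai import OpenAI

client = OpenAI()

response = client.chat.completions.create(
    model="o1-mini",
    messages=[
        # At launch, o1-series models restrict some request options
        # (e.g., system messages and sampling parameters such as
        # temperature), so a single user message is the safe pattern.
        {
            "role": "user",
            "content": "Prove that the sum of two odd integers is even.",
        }
    ],
)

print(response.choices[0].message.content)
```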
Designed for STEM Reasoning Tasks
Large models like o1 carry expansive world knowledge, but that breadth can come at the expense of speed and cost in practical applications. By contrast, o1-mini is specialized for reasoning, particularly in STEM disciplines, and delivers performance close to o1 on many complex reasoning tasks at a significantly lower operational cost. The model is optimized using the same reinforcement learning pipeline as o1, which drives its strong results in reasoning-heavy areas.
Though o1-mini excels in STEM, it is less effective at tasks requiring comprehensive world knowledge (see “Limitations”).
Competitive Performance in Math and Coding
In benchmark tests, o1-mini has proven itself as a competitive option for both math and coding tasks, closely rivaling the performance of o1 at a fraction of the cost.
- Mathematics (AIME): In the AIME math competition, o1-mini scored 70%, just behind o1’s 74.4% and well ahead of o1-preview’s 44.6%. This level of performance places o1-mini roughly among the top 500 US high school students, demonstrating its strength in mathematical reasoning.
- Coding (Codeforces): On the Codeforces platform, o1-mini achieved an Elo rating of 1650, nearly matching o1’s 1673 and surpassing o1-preview’s 1258. This rating places the model in the 86th percentile of Codeforces competitors, reflecting its strong capabilities on competitive programming challenges.
Broader STEM Capabilities
Beyond these benchmarks, o1-mini also performs well in other academic reasoning tasks, such as GPQA (science) and MATH-500. While it does not surpass GPT-4o in general knowledge tasks like MMLU, it consistently delivers strong results in STEM-specific domains.
Human preference evaluations further validate o1-mini’s strengths: raters favored it over GPT-4o in reasoning-heavy areas such as mathematical calculation, programming, and data analysis, though it trails in more language-focused tasks.
For users seeking a fast, efficient, and cost-effective solution for STEM reasoning tasks, o1-mini offers a compelling alternative to larger models, delivering high performance where it counts.