OpenAI Launches GPT-4.1, an API-Only Model Family

San Francisco, April 14, 2025 — OpenAI has introduced GPT-4.1, a new family of AI models comprising GPT-4.1, GPT-4.1 Mini, and GPT-4.1 Nano, designed to excel in coding and instruction-following tasks. Available exclusively through OpenAI’s API, these multimodal models boast a 1-million-token context window and are optimized for real-world software engineering applications.
Superior Coding and Context Handling
The GPT-4.1 series is tailored for developers, improving on GPT-4o by roughly 21 percentage points on the SWE-bench Verified benchmark, where it scores between 52% and 54.6%. It also outperforms GPT-4o by 10.5 percentage points on Scale's MultiChallenge benchmark, reaching 38.3%. The models can process up to 1 million tokens of input (roughly 750,000 words), enough to cover extensive codebases, legal documents, or financial reports in a single request. They support a maximum output of 32,768 tokens and accept image inputs for tasks such as analyzing diagrams or screenshots.
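The announcement does not include sample code, but a minimal Python sketch of how a developer might send a long-context, image-aware request through the standard Chat Completions API is shown below. The model identifier gpt-4.1, the repo_dump.txt file, and the diagram URL are illustrative assumptions rather than details taken from OpenAI's materials.

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Hypothetical large input: a dumped codebase excerpt. The 1M-token context
# window means a very large string can be sent in a single request
# (subject to rate limits and the model's actual limits).
with open("repo_dump.txt", "r", encoding="utf-8") as f:
    codebase = f.read()

response = client.chat.completions.create(
    model="gpt-4.1",        # assumed model identifier
    max_tokens=32768,       # the reported output ceiling
    messages=[
        {"role": "system",
         "content": "You are a senior engineer reviewing a codebase."},
        {"role": "user",
         "content": [
             {"type": "text",
              "text": "Summarize the architecture and flag risky edits:\n" + codebase},
             {"type": "image_url",
              "image_url": {"url": "https://example.com/architecture-diagram.png"}},
         ]},
    ],
)

print(response.choices[0].message.content)
```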
OpenAI optimized GPT-4.1 for practical use, improving frontend coding, minimizing unnecessary edits, and ensuring consistent tool usage. “We’ve optimized GPT-4.1 for real-world use based on direct feedback to improve in areas that developers care most about: frontend coding, making fewer extraneous edits, following formats reliably, adhering to response structure and ordering, consistent tool usage, and more,” an OpenAI spokesperson stated.
A Tiered Model Family
The GPT-4.1 family includes three models to suit varied needs:
GPT-4.1: The flagship model, priced at $2 per million input tokens and $8 per million output tokens, targets complex enterprise tasks like software development and data analysis.
GPT-4.1 Mini: A faster, cost-effective option at $0.40 per million input tokens and $1.60 per million output tokens, ideal for applications prioritizing speed.
GPT-4.1 Nano: OpenAI’s fastest and cheapest model at $0.10 per million input tokens and $0.40 per million output tokens, suited for tasks like autocomplete or data extraction.
Compared to GPT-4o, GPT-4.1 is 26% cheaper, and its smaller variants offer even greater cost efficiency. OpenAI also supports fine-tuning for GPT-4.1 and GPT-4.1 Mini, allowing developers to customize models for specific workflows.
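To put the tiered pricing in perspective, here is a back-of-the-envelope Python sketch that costs out a single hypothetical request using the per-million-token list prices quoted above. The model keys and the 200,000-token example workload are illustrative only; real bills depend on actual usage and any caching or batch discounts.

```python
# Cost comparison across the three GPT-4.1 tiers, using the
# per-million-token list prices quoted in this article (USD).
PRICES = {
    "gpt-4.1":      {"input": 2.00, "output": 8.00},
    "gpt-4.1-mini": {"input": 0.40, "output": 1.60},
    "gpt-4.1-nano": {"input": 0.10, "output": 0.40},
}

def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Dollar cost of one request at the listed prices (no discounts)."""
    p = PRICES[model]
    return (input_tokens * p["input"] + output_tokens * p["output"]) / 1_000_000

# Hypothetical workload: a 200,000-token codebase review that produces
# a 10,000-token report.
for name in PRICES:
    print(f"{name:<13} ${request_cost(name, 200_000, 10_000):.3f}")
# gpt-4.1       $0.480
# gpt-4.1-mini  $0.096
# gpt-4.1-nano  $0.024
```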
Enterprise Adoption and Performance
Early enterprise users report significant gains. Thomson Reuters noted a 17% improvement in multi-document review accuracy for its legal AI assistant, CoCounsel, when using GPT-4.1. Financial firm Carlyle achieved 50% better performance in extracting granular data from dense financial documents, critical for investment analysis. These improvements stem from GPT-4.1’s enhanced instruction-following and ability to process large contexts without losing accuracy.
However, OpenAI acknowledges limitations. GPT-4.1’s accuracy drops from 84% at 8,000 tokens to 50% at 1 million tokens on the OpenAI-MRCR test, and it can be overly literal, requiring precise prompts. The model is also being updated to address issues with non-Latin character rendering and inconsistent image editing.
Strategic Shift and Deprecation of GPT-4.5
OpenAI announced plans to deprecate GPT-4.5 Preview, its largest model released in February 2025, by July 14, 2025. Priced at $75 per million input tokens and $150 per million output tokens, GPT-4.5 is far less cost-effective than GPT-4.1, which delivers similar or better performance at lower latency. The move allows OpenAI to reallocate computing resources while offering developers a more efficient alternative.
Competitive Context
The GPT-4.1 launch responds to growing competition from Google’s Gemini 2.5 Pro and Anthropic’s Claude 3.7 Sonnet, which scored 63.8% and 62.3% on SWE-bench Verified, respectively. Both rivals offer large context windows and strong coding capabilities, but OpenAI is counting on its lower pricing and enterprise integrations, including the Microsoft Azure OpenAI Service, to keep it competitive. “GPT-4.1 offers exceptional performance at a lower cost,” said Kevin Weil, OpenAI’s chief product officer.
Future Vision
OpenAI aims to develop “agentic software engineers” capable of programming entire applications, including quality assurance and documentation, as stated by CFO Sarah Friar at a recent London tech summit. GPT-4.1 is a step toward this goal, with ongoing improvements planned for image generation and multilingual support.