OpenAI has introduced its new o1 reasoning model to its API, rolling it out to selected developers starting December 17, 2024. The launch is part of a broader update that also includes new features enhancing functionality and customization for developers. To qualify, developers must have spent at least $1,000 on the API and hold accounts more than 30 days old.
“Today we’re introducing more capable models, new tools for customization, and upgrades that improve performance, flexibility, and cost-efficiency for developers building with AI.”
-OpenAI
OpenAI launches o1 API for selected developers
The o1 model supersedes the earlier o1-preview, boasting capabilities that allow it to fact-check its own responses, an advantage not commonly found in AI models. As a trade-off, the reasoning model tends to take longer to generate answers. Processing with o1 is costly: it charges developers $15 for roughly every 750,000 words analyzed and $60 for every 750,000 words generated, about six times the price of the latest non-reasoning model, GPT-4o.
The new o1 is designed to improve on earlier limitations, with OpenAI asserting that it offers “more comprehensive and accurate responses,” particularly for technical queries related to programming and business. It includes enhancements such as a reasoning effort parameter that allows developers to control the processing time for queries. Additionally, the model is more adaptable than its predecessor, supporting functions like developer messages to customize chatbot behavior and enabling structured outputs using a JSON schema.
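Taken together, those controls live in the request body. The sketch below shows roughly how a request might combine a developer message, the reasoning-effort setting, and a JSON schema for structured output; the field names follow OpenAI's published Chat Completions conventions, but the schema content is made up for illustration:

```python
# Illustrative request body combining the features described above.
# The "summary" schema is an invented example, not an OpenAI-defined one.
request_body = {
    "model": "o1",
    "reasoning_effort": "low",  # trade less "thinking" time for faster answers
    "messages": [
        # Developer message: steers the chatbot's tone and behavior.
        {"role": "developer", "content": "You are a terse coding assistant."},
        {"role": "user", "content": "Summarize this stack trace."},
    ],
    "response_format": {
        "type": "json_schema",
        "json_schema": {
            "name": "summary",
            "schema": {
                "type": "object",
                "properties": {"summary": {"type": "string"}},
                "required": ["summary"],
            },
        },
    },
}

print(request_body["reasoning_effort"])  # low
```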
To facilitate more dynamic interactions, OpenAI has improved its function calling capabilities, allowing the model to invoke pre-written external functions when generating answers. This API iteration reportedly uses 60% fewer tokens for processing than o1-preview, while also scoring 25 to 35 percentage points higher on benchmarks such as LiveBench and AIME.
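In function-calling terms, a pre-written external function is advertised to the model as a tool definition, and the application executes whichever call the model requests. A minimal sketch, with a made-up weather lookup as the example function:

```python
# Stand-in for a real external function the model could request.
def get_weather(city: str) -> str:
    return f"Sunny in {city}"

# Tool definition advertised to the model, following the JSON-schema
# shape OpenAI uses for function parameters.
tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Look up the current weather for a city.",
            "parameters": {
                "type": "object",
                "properties": {"city": {"type": "string"}},
                "required": ["city"],
            },
        },
    }
]

# When the model responds with a tool call, dispatch it by name
# and feed the result back in a follow-up message.
registry = {"get_weather": get_weather}
result = registry["get_weather"](city="Paris")
print(result)  # Sunny in Paris
```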
OpenAI also expanded its capabilities for real-time interactions through its Realtime API, which now supports WebRTC for smoother audio communication. This addition aims to simplify integration for developers, reducing the required code from approximately 250 lines to about a dozen. Furthermore, OpenAI has cut the price of GPT-4o audio tokens by 60% and GPT-4o mini audio tokens by 90% to encourage usage among developers.
“Our WebRTC integration is designed to enable smooth and responsive interactions in real-world conditions, even with variable network quality,” OpenAI wrote in the blog. “It handles audio encoding, streaming, noise suppression, and congestion control.”
Another significant update includes a new method for fine-tuning AI models called direct preference optimization. This allows model trainers to provide two outputs and specify a preference without needing to supply exact input/output examples for every scenario. OpenAI claims this method enhances the model’s ability to adapt to various quirks in response style, formatting, and helpfulness.
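Concretely, a preference-tuning example pairs one prompt with a preferred and a non-preferred completion, rather than a single gold answer. A training record might be shaped roughly like this; the field names follow OpenAI's published preference fine-tuning format, but the content is invented for illustration:

```python
import json

# One illustrative direct-preference-optimization training record:
# the trainer supplies two candidate outputs and marks which is preferred,
# instead of authoring an exact input/output pair for every scenario.
record = {
    "input": {
        "messages": [{"role": "user", "content": "Explain DNS in one line."}]
    },
    "preferred_output": [
        {"role": "assistant",
         "content": "DNS maps human-readable names to IP addresses."}
    ],
    "non_preferred_output": [
        {"role": "assistant", "content": "DNS is a thing computers use."}
    ],
}

# Records are typically uploaded as one JSON object per line (JSONL).
line = json.dumps(record)
```

Because the trainer only ranks outputs, this captures soft preferences about style, formatting, and helpfulness that are hard to express as exact target completions.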
Developers working in languages like Go and Java can now access new software development kits (SDKs) designed for easier API integration. As the rollout progresses, OpenAI plans to expand access and raise rate limits for developers beyond the initial tier 5 category.
Featured image credit: OpenAI