
OpenAI Plans to Launch a New AI Model Focused on Audio

OpenAI is reportedly working on a new AI model designed specifically for audio-related tasks, with plans to release it in the first quarter of the year.

According to reports, the model is expected to arrive by the end of March and aims to deliver more realistic, natural-sounding speech than OpenAI’s existing audio systems. It is also said to improve real-time conversation, allowing smoother and more responsive interactions with users.

The upcoming model is believed to be built on a new architecture. OpenAI’s current real-time audio system, GPT-realtime, relies on the widely used transformer architecture. It remains unclear, however, whether the company will switch to an entirely new design or introduce an upgraded version of the same architecture.

Some audio AI models process spoken input directly, while others, such as OpenAI’s Whisper model launched in 2022, first convert sound into spectrograms, which are visual representations of the audio. Whisper and OpenAI’s newer audio tools are available in multiple versions with different performance levels, and the new model may follow a similar multi-tier approach.

To strengthen its audio AI efforts, OpenAI has reportedly brought together teams from engineering, research, and product development. The project is said to be led by Kundan Kumar, a former researcher at Character.AI. Several other employees from that company later joined Google following a major talent acquisition deal in 2024.

The new model may go beyond basic speech applications. AI-generated music is quickly gaining popularity, with companies like Suno reportedly earning hundreds of millions of dollars in annual revenue. Entering this space could help OpenAI expand its reach among everyday users.

This audio initiative is also linked to OpenAI’s broader plans in consumer technology. According to reports, the company is preparing to introduce an audio-centric personal device within the next year. Over time, OpenAI may release multiple smart devices, including speakers and wearable technology like smart glasses.

Last year, OpenAI acquired design startup io Products, founded by Jony Ive, as part of its hardware strategy. The deal reportedly valued the company at $6.5 billion. Later reports suggested that Ive is working on a compact, desk-friendly device similar in size to a smartphone.

To support future hardware products, OpenAI may also develop a lightweight audio model that runs directly on devices. Handling requests locally can reduce costs and improve efficiency, a strategy Google already uses with its on-device Gemini Nano model in Pixel smartphones.
