OpenAI Unveils GPT-4o: A Faster, More Versatile Language Model

OpenAI has introduced GPT-4o, a new large language model that is significantly faster and freely available to everyone.

The new GPT-4o large language model from OpenAI was unveiled at a special event hosted by the company. GPT-4o is an improved version of the GPT-4 model, which serves as the core of the popular ChatGPT chatbot. The “o” in GPT-4o stands for “omni,” a prefix meaning “all,” reflecting the model’s all-in-one handling of text, audio, and images.

According to Mira Murati, OpenAI’s CTO, the company’s new model is much faster and performs significantly better in understanding text, images, and audio content.

OpenAI says that the GPT-4o large language model will be freely available to everyone, but that paid users will have message limits five times higher than those of free users.

During one part of the OpenAI event, GPT-4o was shown solving math problems and even recognizing the user’s tone of voice. The model’s new voice capability responds to the user’s emotions, holds friendly conversations, and can even tell stories.

OpenAI CEO Sam Altman said that the GPT-4o model is “inherently multimodal.” This means that the new model can generate and understand not only text, but also audio and visual content.

GPT-4o AI Makes Human-Computer Conversations More Natural

Developers interested in GPT-4o can access the model through the API at half the price of GPT-4 Turbo, and GPT-4o is twice as fast as the Turbo version.

According to OpenAI, most of the new model’s capabilities will be rolled out gradually; however, its text and image capabilities are available in the ChatGPT chatbot today.

The GPT-4o language model is a step toward making human interaction with computers more natural: it can respond to voice input in as little as 232 milliseconds, with an average of 320 milliseconds. OpenAI claims this response time is similar to human response time in everyday conversation.

GPT-4o is on par with the Turbo version in terms of understanding English text and code, but is cheaper and much faster. This new language model has been specifically enhanced in its ability to understand visual and audio content.

Before the release of GPT-4o, you could talk to ChatGPT through Voice Mode with an average delay of 2.8 seconds (with GPT-3.5) or 5.4 seconds (with GPT-4). Voice Mode chained three separate models to provide this capability: a simple model to convert speech to text, GPT-3.5 or GPT-4 to generate a text reply, and a third model to convert that text back to speech.
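The three-stage pipeline described above can be sketched as follows. This is a minimal illustration only: the function names and their stub behavior are hypothetical stand-ins, where a real system would call a speech-recognition service, a language model, and a speech-synthesis service.

```python
# Sketch of the pre-GPT-4o Voice Mode pipeline (hypothetical stubs).

def speech_to_text(audio: bytes) -> str:
    """Stub for the speech-recognition model that transcribes audio."""
    return "what is 2 + 2?"

def text_to_text(prompt: str) -> str:
    """Stub standing in for GPT-3.5 / GPT-4 generating a text reply."""
    return "2 + 2 = 4."

def text_to_speech(text: str) -> bytes:
    """Stub for the model that synthesizes speech from text."""
    return text.encode("utf-8")

def old_voice_mode(audio: bytes) -> bytes:
    # Each stage must finish before the next one starts, so the
    # per-stage latencies add up -- which is why the old pipeline
    # took seconds to respond instead of milliseconds.
    transcript = speech_to_text(audio)
    reply = text_to_text(transcript)
    return text_to_speech(reply)
```

Because the stages run strictly in sequence, information such as tone of voice is also lost at the speech-to-text step; a single model with direct access to audio, as GPT-4o is described, avoids both problems.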

With the release of GPT-4o, the mechanism behind ChatGPT’s voice capability changes: OpenAI says it has developed a single new model with full access to text, voice, and images that offers much higher speeds.

According to OpenAI, the GPT-4o AI model, in addition to its fast and accurate performance, places special emphasis on safety and is designed not to generate sensitive content.
