Epic Beast

Unlocking the Potential: Exploring the Future of Work with GPT-4o

What to expect from OpenAI’s latest iteration of their GPT LLM.

Image: Self-created Dylan Labrie/Midjourney


In this Edition:

What’s New

The “O” = Omni

New Functionality

Impressive Language Translation

Addressing Safety Concerns

The Impact to You

I watched OpenAI’s GPT-4o keynote so you don't have to.  In case you missed it, OpenAI introduced GPT-4o on Tuesday, May 14, 2024.

So what’s NEW?

-ChatGPT has 100 million users creating experiences in the GPT Store

-The new GPT-4o can “reason across voice, text, and vision”

-GPT-4o streamlines the previous GPT “Voice Mode,” removing its latency and making it respond in real time

-GPT-4o is available to all free and paid users (paid users will have up to 5x the capacity of free users)

-GPT-4o has 5x higher rate limits vs. GPT-4 Turbo and is 50% cheaper, according to OpenAI

Let’s get into it.  

The “O” = Omni

The overall focus of the keynote was introducing OpenAI’s new GPT-4o.  The “o” stands for “omni,” as in omnipresent: this iteration of OpenAI’s Large Language Model (LLM) can “see,” “speak,” and react in real time to what it sees and hears.

New Functionality

You can now upload screenshots, photos, and documents containing both text and images.  GPT-4o also brings memory, enabling ChatGPT to maintain continuity across all of your conversations.
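As a rough sketch of what that text-plus-image upload looks like on the developer side, here is how a request payload might be assembled following OpenAI’s chat-completions message format (the helper function name is my own, and the exact structure is an assumption based on the published API style, not something shown at the keynote):

```python
import base64

def build_multimodal_payload(prompt: str, image_bytes: bytes) -> dict:
    """Bundle a text prompt and an image into one GPT-4o chat message.

    Hypothetical sketch: the "gpt-4o" model name and the content-part
    structure follow OpenAI's documented chat-completions format, but this
    payload is illustrative and is not sent anywhere.
    """
    # Images are passed inline as a base64-encoded data URL.
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return {
        "model": "gpt-4o",
        "messages": [
            {
                "role": "user",
                # A single user message can mix text and image parts.
                "content": [
                    {"type": "text", "text": prompt},
                    {
                        "type": "image_url",
                        "image_url": {"url": f"data:image/png;base64,{encoded}"},
                    },
                ],
            }
        ],
    }

payload = build_multimodal_payload("What does this screenshot show?", b"\x89PNG...")
print(payload["model"])  # gpt-4o
```

The point is that “vision” here is just another content part in the same conversation, which is what lets GPT-4o reason over a screenshot and your question about it in a single turn.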

At the keynote, OpenAI provided several striking GPT-4o demonstrations.  GPT-4o can detect human breathing: during the keynote, it gave one of the presenters feedback on breathing and calming techniques.  If that was not enough, GPT-4o verbalized those breathing techniques with a sense of humor.  GPT-4o also solved a linear equation in real time, reacting as it walked through the solution step by step.

In a separate demo, GPT-4o was shown a sample of Python code on a screen and asked to describe what the code does.  The code produced a line chart of temperature data, and GPT-4o then provided analysis and reactions to the chart’s multiple plot lines.

Impressive Language Translation

One impressive demo showed GPT-4o translating between one presenter speaking Italian and another speaking English.  This is not your typical Google Translate experience: it is fast and happens in real time.  Finally, one of the presenters took a video selfie, asked GPT-4o to describe his facial expressions in real time, and it responded with accurate descriptions.  Throughout the demo, GPT-4o was also asked to change its voice several times, and its output was a close clone of a human voice.

Addressing Safety Concerns

Overall, it was an impressive showing by OpenAI, and the implications are broad.  This latest update brings a new layer to OpenAI’s suite of offerings.  GPT-4o also presents new safety concerns: we now have an LLM that works with real-time audio and real-time vision, which in the wrong hands could create safety and security challenges.

OpenAI addressed safety concerns at the keynote by indicating they would be working with “different stakeholders” from “government, media, entertainment, all industries, civil societies to figure out how to best bring this” technology “into the world”.  This could be deemed a miss, given that OpenAI released GPT-4o first and is addressing safety concerns concurrently rather than resolving them before launch.

The Impact to You

The introduction of GPT-4o marks a massive milestone in the evolution of AI LLMs.  If you are a paid language translator, it is still too early to quit your day job, but the new GPT-4o is impressive enough to force us all to begin thinking about its impact on real-world work.  This new layer moves LLMs from one-dimensional to multi-dimensional, hence the “o” for “Omni” in GPT-4o.  GPT-4o’s suite of imaging tools will enable it to create, analyze, and react to 3D models.  This alone will make it invaluable to a large population of people, from graphic designers to creatives.

We all need to keep learning the capabilities and limitations of this latest iteration of OpenAI’s ChatGPT.  With GPT-4o, we unlock new potential by upgrading what was previously an “intern” assistant into what could become our new junior assistant.  By embracing the change, you can boost your productivity and master your work faster than you imagined.

You can watch the OpenAI keynote here: