GPT-4o

OpenAI's new flagship model

Artificial Intelligence

GPT-4o (“o” for “omni”) is a step towards much more natural human-computer interaction—it accepts as input any combination of text, audio, and image and generates any combination of text, audio, and image outputs.

Top comment

Man, how many generative AI products just got sherlocked? Also, what does this mean for Rabbit r1 and Humane's Ai Pin?

Comment highlights

Really impressed with the demo of GPT-4o. Seeing how well it handled live voice interactions was great, and it also seems to have more humour than previous models! Looking forward to testing it out.

This is great, congrats to Sam and team. Though it sounds freaky when an AI gets to talk to another AI; I guess in the near future we might get to see one AI going on a date or to lunch with another AI... lol!

Speed is underrated. Even small reductions in friction can have a massive impact on how people interact with technology. This is the bet many AI hardware products like the Humane Ai Pin and Rabbit r1 are placing.

I just shared my review on GPT-4o: https://blog.altern.ai/gpt-4o-re...

The future is full of possibility. Are we a step closer to artificial general intelligence, or a step further away? I believe that for 80% of use cases, a tool that scores 70 out of 100 is already competent enough.

Now that an AI is able to talk to another AI, it's only a step away from talking to itself and, hence, being able to reason and think for itself. That would be the start of AGI. GPT-4o is very impressive and frightening.

So impressed with the live demo, especially seeing its potential to tutor so well. Feeling hopeful for the next generation of kids!

Your product is incredibly impressive, team! The concept is intriguing. I'm curious, what exciting milestones or features are on the agenda for the next phase of development? Keep up the excellent work!

History will remember the day GPT-4o was announced, demonstrating natural and versatile voices, visual understanding, and words so carefully crafted they could mimic any style. Here's to an exciting future that awaits us all, not just those atop the tower (hopefully).

Amazing for consumers, but for developers it needs Twilio integration or some kind of web client to be truly useful.

As a little test: it still doesn't recognise that pandas' DataFrame.append was deprecated (and removed in pandas 2.0 in 2023), and throws it into ~20% of my ('my') Python scripts. I made it memorise the latest pandas docs and put them in custom instructions... it still hasn't clicked. No big drama; there is just clearly some remaining problem in reasoning, or in differentiating obsolete information. All that said, very happy to see an upgrade. Looking forward to the macOS app too, as I'm using it at least 2 hours per day 👍

P.S. My biggest frustration is that it stops processing when you multitask. For free users I get it; for paid users it's a total pain and takes the flow out of my work.
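For reference, a minimal sketch of the pandas change the commenter above is describing; the example data here is illustrative, but the API change itself is real: DataFrame.append was deprecated in pandas 1.4 and removed in 2.0, with pd.concat as the replacement.

    import pandas as pd

    df = pd.DataFrame({"a": [1, 2]})
    new_row = pd.DataFrame({"a": [3]})

    # Removed in pandas 2.0; raises AttributeError on current versions:
    # df = df.append(new_row, ignore_index=True)

    # Current idiom: concatenate frames instead of appending rows.
    df = pd.concat([df, new_row], ignore_index=True)
    print(df)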

A great spring launch! Besides the launch event video, the blog post is incredibly informative. The 'capability exploration' section at the end is noteworthy, arguably even more impactful than the event itself. It covers visual storytelling, creating posters from real-life photos, character design (with the potential to replace motion capture), simulated cursive handwriting, physical design abilities (like badge and commemorative coin design), image-to-comic conversion, text-to-font transformation, 3D compositing, variable binding, and more. Enhanced multimodal capabilities could significantly impact many AI applications that have so far only been explored superficially. https://openai.com/index/hello-g...