OpenAI – the tech lab behind DALL·E and ChatGPT, among other popular generative AI models – has recently launched GPT-4, a multimodal AI it considers the latest and most advanced step in applying deep learning to everyday life.
This new model's novelty is that it can interpret both text and images, expanding its potential applications into human-assistance roles.
Let’s look into what exactly that means.
In a nutshell, OpenAI’s latest model is a multimodal AI. This new technology accepts prompts – user input or instructions – in either written or visual form (photos, screenshots, diagrams, etc.), and it generates text-only results.
In layman's terms, GPT-4 can identify and – even more importantly – analyze elements in an image to perform a requested task. And apparently, it can do so with a great deal of accuracy.
While the company says this technology is still less capable than humans in many real-world scenarios, it assures that the model performs at a human level on many professional and academic benchmarks, where it has been extensively tested. It is also software that can grasp concepts like strangeness and humor.
And although there are still some notable flaws (“hallucinating” facts that aren’t real and misinterpreting prompts, among other issues also present in ChatGPT, for example), the lab states this model delivers “our best-ever results (though far from perfect) on factuality, steerability, and refusing to go outside of guardrails.”
In other words, it’s one of the most stable and reliable AI models yet.
As it turns out, GPT-4 is already used in some of OpenAI’s partners’ products. Microsoft, which has an ongoing collaboration with OpenAI using Azure to build an AI-training supercomputer, confirmed yesterday that its new Bing chatbot runs on GPT-4.
OpenAI itself shares how Stripe – a payment processing platform – used GPT-4 to assist its customer support staff by scanning and summarizing business websites.
And the application we find most interesting – OpenAI’s only partnership built specifically around GPT-4 so far – is Be My Eyes, an app that helps visually impaired people. It has now integrated GPT-4 to create an intelligent Virtual Volunteer that can answer questions about user-provided images with a never-before-seen level of understanding and conversational ability.
As the company explains, the Virtual Volunteer could analyze a photo of the inside of the user’s refrigerator and not only correctly identify every item in it but also determine what dishes could be prepared with those ingredients and even offer step-by-step recipes for them.
Overall, all of these disclosed examples and uses clearly emphasize that this functionality is intended to assist humans rather than replace them – a concern voiced all too often since the rapid boom of AI tools.
At this time, GPT-4 is available only to ChatGPT Plus users – OpenAI’s paid subscription for ChatGPT – with prompts and generations each limited to around 1,000 words. It is important to note that only the text prompt function is enabled (image prompts are not yet available to regular users).
There is also a waitlist for the GPT-4 API for developers interested in building on top of it.
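For developers who do get access, a GPT-4 request would presumably go through OpenAI’s existing chat-style API. The sketch below is illustrative only: the model name, prompt, and key handling are assumptions based on OpenAI’s published Python library, not on GPT-4-specific documentation.

```python
# Illustrative only: a minimal GPT-4 request via OpenAI's chat completions
# endpoint, using the openai Python package (pre-1.0 interface). The prompt
# and key handling are placeholders, not an official GPT-4 guide.
import os
import openai

openai.api_key = os.environ["OPENAI_API_KEY"]  # your own API key

response = openai.ChatCompletion.create(
    model="gpt-4",  # assumed model name, available once waitlist access is granted
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Explain in two sentences what a multimodal model is."},
    ],
    max_tokens=200,
)

print(response["choices"][0]["message"]["content"])
```

Image inputs are deliberately left out of this sketch since, as noted above, they are not yet generally available.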
What do you think of GPT-4 so far? Are you going to try it?
THE AUTHOR
Ivanna Attie
I am an experienced author with expertise in digital communication, stock media, design, and creative tools. I have closely followed and reported on AI developments in this field since its early days. I have gained valuable industry insight through my work with leading digital media professionals since 2014.