With the arrival of PDF support in ChatGPT, OpenAI is rolling out new beta features for ChatGPT Plus subscribers. Users report that the update lets them upload and work with files and also adds multimodal support.
🔥 ChatGPT just released its 2 BIGGEST updates
Here's what's new:
1. True Multimodal: no more switching between tools! ChatGPT gives you true superpowers by letting you use all modes in combination.
2. New file upload formats: you can now upload PDFs and other types of files to…
Big Bang AI (@TheBigBangAI), October 29, 2023
In essence, users no longer need to pick an option such as Browse with Bing from the GPT-4 menu; the system infers their intent from the context.
New ChatGPT (All Tools) is INCREDIBLE! 🌟
With the new update, you can upload an image and ask to modify it!
All in 1 prompt, 30 seconds. Goodbye Photoshop?
Thoughts?
Alex Northstar (@NorthstarBrain), October 29, 2023
Can you work on a PDF on ChatGPT now?
The recent additions bring some of the workplace features of the ChatGPT Enterprise package to the standalone chatbot subscription.
While we haven’t spotted the multimodal enhancement in our Plus package yet, we took a more hands-on route.
Using the Advanced Data Analysis tool, we uploaded the PDF file of a recent article of ours titled “Carving out the hallucinations from AI’s brain” and posed several questions to ChatGPT regarding its content.
In the same way, anyone who uploads a PDF to ChatGPT can ask for summaries and feedback on its content, which makes long documents quicker to digest.
We asked:
- Can you summarize the article in 100 words?
- Can you explain the main idea behind this article?
The response of ChatGPT:
“Summary (in approximately 100 words):
As the utilization of Multimodal Large Language Models (MLLMs) grows, a challenge that has surfaced is AI hallucination. MLLMs are AI models capable of learning from various data types like text and images. Recognizing this issue, researchers from the University of Science and Technology of China (USTC) and Tencent YouTu Lab introduced a solution named “Woodpecker.” This solution is crafted to tackle the hallucination problem in these AI models.
Main Idea:
The central theme of the article revolves around the challenge of AI hallucination in MLLMs and the introduction of the “Woodpecker” solution by researchers to address this issue.”
A key note on this update is the current limitation in combining functionalities: one cannot craft an image using DALL-E 3 and simultaneously analyze a file within the same ChatGPT dialogue.
For instance, if you are analyzing a PDF in one conversation, you may need a separate session to generate or rework images.
However, it’s evident that strides are being made in that direction, signaling a future where these capabilities might seamlessly coexist in a single conversation.
The chatbot's versatility extends beyond text documents: PDF handling is now one of several expanded capabilities that broaden how users can interact with it.
On Threads, a user shared screenshots of an intriguing interaction with ChatGPT. They uploaded a capybara image and prompted ChatGPT to reinterpret it in a Pixar-esque style using DALL-E 3.
They then built on the initial concept by introducing a second image, a uniquely shaped skateboard. After they instructed ChatGPT to incorporate it, an unexpected twist surfaced: in the resulting image, the skateboard was wearing a hat. AI responses can be both fascinating and unexpected!
As ChatGPT continues to expand its range of features, its growing influence within the AI ecosystem is hard to ignore. With capabilities that reach beyond text generation to handling files, images, and documents such as PDFs, the platform is increasingly positioning itself as a centralized hub for AI applications.
While the benefits of such consolidation are undeniable in terms of user experience and technological synergy, it does spark a conversation about the broader implications for the AI industry. Are we witnessing the rise of a singular dominant force in AI? Only time will tell.