ChatGPT, Bard and the Problem With Large Language Models
AI has finally reached the common man, with large language models (LLMs) leading the charge. With the introduction of applications like OpenAI's ChatGPT and Google's Bard, conversational AI has become the talk of the town.
These chatbots accept your questions and queries and reply like any human would. People have been using them for all sorts of cool stuff, like asking about the weather, having them write a program in their favorite language, or even composing complete essays. However, what most people don't realize is that, like every AI, LLMs have their limitations.
In this article, I'll explain some of the shortcomings of modern-day conversational AI and why you should use it with caution. But first, let's take an overview of the two popular models.
What Are ChatGPT and Bard?
ChatGPT was first announced on Nov 30, 2022 and soon became a global sensation. It reached 100 million users in a span of only two months, setting the record for the fastest-growing application. ChatGPT has been toyed with by millions of users every day since its launch, and the results it provides are nothing short of shocking. Some of its most popular applications are:
- All-in-one writer: ChatGPT can construct full essays and reports, summarize text, write business proposals, and the list just goes on.
- All-in-one code writer: You can ask ChatGPT to write a piece of code in your favorite programming language. It takes only a few seconds to generate working code.
- Code debugger: ChatGPT can review your faulty code and provide a list (and fixes) of everything wrong with it. A study found that its debugging accuracy was on par with some famous deep-learning models like CoCoNut and Codex, but with the added benefit of an easier interface.
- Acting as an SQL engine: Yes, you can ask ChatGPT to mimic an SQL engine. In simpler terms, you can provide it with a schema and SQL queries and ask it to compute and return the expected results.
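As a sketch of that last use-case, here is the kind of hypothetical schema and query you might paste into ChatGPT when asking it to act as an SQL engine. The table and data below are made up for illustration; running the same query through a real engine (SQLite here) gives you a ground truth to double-check whatever result the chatbot claims:

```python
import sqlite3

# A toy schema and data of the kind you might paste into ChatGPT,
# asking it to "act as a SQL engine" and return the result table.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employees (name TEXT, department TEXT, salary INTEGER);
    INSERT INTO employees VALUES
        ('Alice', 'Engineering', 90000),
        ('Bob',   'Engineering', 85000),
        ('Cara',  'Sales',       70000);
""")

query = """
    SELECT department, AVG(salary) AS avg_salary
    FROM employees
    GROUP BY department
    ORDER BY department;
"""

# Ground truth from a real engine, handy for verifying whatever
# result the chatbot produces for the same prompt.
result = conn.execute(query).fetchall()
print(result)  # [('Engineering', 87500.0), ('Sales', 70000.0)]
```

Comparing the chatbot's answer against a real engine like this is a small example of the "second opinion, not single source of truth" principle discussed later in this article.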
Some really impressive use-cases, no doubt.
Bard, on the other hand, is Google's response to ChatGPT. It was first announced by Google's CEO, Sundar Pichai, in early February 2023. It is built on top of Google's own language model, LaMDA, and is positioned as a direct competitor to ChatGPT.
Bard is currently available only to a closed group of testers, and the model used is a lightweight version of LaMDA. Bard one-ups ChatGPT by harnessing the internet to provide up-to-date information and relevant context.
Where Does the Problem Lie?
With all the hype that has been going around, there is little room left for any skepticism. However, I am here to tell you that not all that glitters is gold.
A large language model such as ChatGPT develops its knowledge base from the internet. At first glance, this seems fine, until you realize that the internet is filled with both facts and opinions, many of which are incorrect or even controversial.
When a language model is trained on all this information, it is difficult to predict which parts of it the model will retain. Development teams can only do so much fact-checking, and sooner or later an anomaly is bound to leak into the training dataset.
The Bard Blunder
A prime example of such misinformation was seen in Bard's initial demo, which caused Google to lose roughly $100 billion in market value.
Google released a short advert showcasing its conversational AI model and its capabilities. However, the hype and excitement were short-lived, as the demo was not as accurate as it seemed. It showed a conversational prompt where a person asked the AI about the new discoveries made by the James Webb Space Telescope (screenshot below).
At first glance the result seems impressive, as the model returns several points with high confidence. However, astronomers soon started pointing out the mistake in the third bullet point. They noted that the JWST was, in fact, not the first telescope to capture an image of an exoplanet.
To understand why the model gave such an answer, I decided to run the same question through Google search (screenshots below).
As you can see, the most prominent answer here is the James Webb Space Telescope. Upon closer inspection, we see that all the articles are actually referring to the telescope capturing its own very first exoplanet image, not the first of all time. This is a neat mix-up of words caused by the structure of the question.
Since the knowledge-graph-based search engine could not return what I was looking for, it is no shock that the AI bot made the same blunder. Google Bard is trained on this same data scraped from the internet, and it very likely learned this misleading information since it ranks at the top of Google search.
For further clarification I asked ChatGPT the same question.
OpenAI’s product seems to realize that an exoplanet was pictured long before the JWST.
*A mildly important note here: it is possible that OpenAI tweaked their model to give a different answer after Google Bard's disaster.*
So is this information correct? Let’s find out.
Grant Tremblay, an astrophysicist who pointed out Bard's mistake, seemed to have a different opinion.
He pointed out that the first image was captured by Chauvin et al. in 2004.
So does ChatGPT also not know any better? Let's try a different search query. This time, we'll omit the requirement that the picture be taken by a telescope.
Well well well…
Now the OpenAI variant agrees that the first picture was captured in 2004, and it explains that it was taken by the Very Large Telescope in Chile.
There might be technical explanations behind these differing answers, but I am not an expert in astrophysics, nor do I have the knowledge to confirm or deny any of these claims.
To a layperson, whatever they read becomes the truth. The sole purpose of mentioning all of this is to demonstrate the information ambiguity that can plague large language models.
There are also examples where people have fooled ChatGPT into confidently outputting misinformation using a series of carefully crafted prompts.
The point is that while most people have deemed conversational AI a game changer, it should be used with caution.
How Dangerous Can Misinformation Get?
Up till now, ChatGPT has mostly been used as a toy by the majority of its users. People ask it all sorts of weird, silly, and challenging questions and enjoy the light-hearted conversation that ensues.
However, many others have started using it in their day-to-day tasks and this is where the danger lies.
Imagine students tutoring themselves by asking AI bots subject-related questions. If the bot keeps outputting factually incorrect statements, the students' education could suffer, resulting in poor grades.
Similarly, if used to automate medical tasks, any misinformation could be the difference between life and death.
So We Say No to Artificial Intelligence?
Absolutely not. The only way forward is to embrace technology, not resist it. Like all innovations, conversational AI tools are not yet perfect, but this does not mean that we ditch them entirely.
All good things require time before they are adequately polished. All we, as users, need to understand is the proper usage. Some useful tips are:
- Do not blindly implement AI into mission-critical workflows
- Double-check any information it gives you with actual experts
- For now, use artificial intelligence as a second opinion, not as the single source of truth
As an AI engineer myself, I am proud to say mankind is witnessing another revolution. The pace at which AI applications are hitting the market is unprecedented. This is no longer a matter of decades, but years and soon it might just be months.
Some of you might be afraid for your jobs, but being resistant will not get you anywhere. It is important to recognize the significance of these breakthroughs and start leveraging them to your advantage. As the famous saying goes:
"AI might not replace you, but a person using AI will."