Confidently wrong: How to avoid LLM hallucinations and other pitfalls
LLMs can sound smart. Even smarter than humans. And that's the problem.
We’ve all had that moment:
You type a question into ChatGPT…
And it answers with confidence.
Only to realize it’s confidently wrong.
That’s the paradox of LLMs:
They’re brilliant pattern recognizers… not truth machines.
In this post, we’ll explore how to spot the most common LLM pitfalls and, more importantly, how to avoid or fix them when they show up.
This isn’t about paranoia.
It’s about prompting like a pro.
In my previous posts, I explained the basics of prompt engineering in detail, along with its use cases across different functions. Now let’s double down on proactively avoiding the typical pitfalls of working with LLMs.
🤔 Why These Pitfalls Matter Now (More Than Ever)
LLMs are everywhere: in our writing tools, coding assistants, email apps, and CRMs.
But they’re also increasingly shaping:
What decisions we make
What facts we trust
And how fast we move
And if we don’t stay alert to their flaws, we risk:
Running with hallucinated data
Accepting biased patterns as “neutral”
Getting different answers to the same prompt (and not knowing why)
So let’s break down the 3 biggest traps LLMs fall into… and how to sidestep them.
🚩 Pitfall 1: Hallucinations
LLMs don’t lie. But they do improvise.
And when they don’t know something, they’re surprisingly good at making it sound real.
❌ What it looks like:
“List 3 academic studies that show remote work increases productivity.”
Here’s what an LLM might confidently reply:
1. “The Stanford Remote Work Study” (2019) – Found that remote workers were 13% more productive than office counterparts.
2. “The MIT Future of Work Report” (2020) – Reported a 22% productivity boost among hybrid teams.
3. “Deloitte Global Workplace Insights” (2021) – Showed 80% of employees preferred remote setups due to better focus and time management.
Seems legitimate.
Except none of these studies exist (at least not with these titles, stats, or conclusions).
The model isn't trying to fool you. It’s just guessing what a good-sounding answer would look like… and it nails the style while skipping the substance.
🔍 How to spot it:
Hallucinated answers often feel:
Overconfident
Citation-like (without actual citations)
Too polished or convenient
They mimic the tone of credibility, but can’t back it up.
🛠️ How to fix it:
Prompt in a way that discourages hallucinations up front:
“Cite only real studies with URLs. If unsure, say so.”
Then double-check anything it gives you. Follow up with:
“Can you show me the source link?” or
“Where was this study published?”
And when accuracy is critical, use tools that connect to real data (like ChatGPT’s “Search” option or Google-integrated models).
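If you’re calling a model from code rather than a chat window, you can bake these same guardrails into the system prompt. Here’s a minimal sketch, assuming the OpenAI Python SDK; the model name and exact wording are purely illustrative, so adapt them to whatever you actually use:

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in your environment

# Guardrail: ask for verifiable sources and an explicit admission of uncertainty.
system_prompt = (
    "Cite only real studies you are confident exist, with author, year, and URL. "
    "If you cannot verify that a source exists, say so instead of inventing one."
)

response = client.chat.completions.create(
    model="gpt-4o-mini",  # illustrative model name
    temperature=0,        # lower temperature = less creative improvisation
    messages=[
        {"role": "system", "content": system_prompt},
        {
            "role": "user",
            "content": "List 3 academic studies that show remote work increases productivity.",
        },
    ],
)

print(response.choices[0].message.content)
```

Even with a guardrail prompt like this, treat every citation as unverified until you’ve opened the link yourself.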
🚩 Pitfall 2: Subtle Bias in Output
LLMs are trained on internet text… which means they absorb its biases, opinions, and stereotypes.
These don’t always show up as loud errors. Often, it’s the things they leave out that reveal the skew.
❌ What it looks like:
“What makes a great startup founder?”
Here’s a typical LLM-style response:
“Great startup founders are often bold, decisive, highly competitive, risk-taking, and charismatic leaders who can inspire teams, pivot quickly, and handle extreme pressure.”
Nothing wrong with that.
But also, nothing about empathy, active listening, or inclusive leadership?
That’s where the bias sneaks in.
The model simply defaults to the archetypes it has seen most often: aggressive, male-coded leadership.
🔍 How to spot it:
Answers feel one-sided or cliché
The list feels oddly uniform or “old school”
Missing dimensions: collaboration, inclusion, ethics, empathy
The bias isn’t just in what it says; it’s also in what it leaves out.
🛠️ How to fix it:
Get specific with framing:
“Include both hard and soft skills. Reflect diverse leadership styles.”
And use prompts like:
“Now rewrite this through the lens of a neurodiverse founder”
or
“Add people-first leadership traits to the list.”
You’re not just prompting for information; you’re shaping perspective.
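If you find yourself adding the same framing over and over, a tiny prompt template can make it the default. This is a hypothetical helper, purely to illustrate the idea:

```python
def framed_prompt(question: str) -> str:
    """Wrap a question with framing that pushes back on one-sided defaults."""
    framing = (
        "Include both hard and soft skills. "
        "Reflect diverse leadership styles, including collaborative, inclusive, "
        "and people-first approaches."
    )
    return f"{question}\n\n{framing}"


print(framed_prompt("What makes a great startup founder?"))
```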
🚩 Pitfall 3: Instruction Drift & Inconsistency
LLMs don’t really remember things the way humans do.
They follow patterns, not persistent logic. Which means they can drift off-task over time.
Sometimes they can’t even recall their own earlier messages in a conversation verbatim!
❌ What it looks like:
Let’s say you begin with:
“Help me write a professional, 100-word summary of our latest feature launch.”
The first response is fine, maybe even sharp. But by the time you’ve asked for a tone tweak, a bullet-point version, and a social media caption, things unravel.
“Here’s a quick, informal overview of our cool new update! It’s gonna change the game for users across the board. Check it out now!”
Wait, where did the professionalism go? What happened to the word limit?
The model latched onto recent follow-up tone cues and forgot the original constraints.
🔍 How to spot it:
Tone or formatting starts drifting mid-thread
Word counts balloon (or get too terse)
Content you explicitly excluded (say, after a “don’t mention pricing” instruction) suddenly creeps back in
🛠️ How to fix it:
Reset scope every few prompts. Literally say:
“Reminder: keep the tone professional and the response under 100 words.”
Or ask:
“Before you answer, summarize the instructions you’re following.”
That forces the model to re-anchor itself.
Also, when things feel off… it’s often faster to reset the chat than fix the drift.
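If you’re scripting a multi-turn conversation, you can make the re-anchoring automatic by re-sending your original constraints with every request. A rough sketch, again assuming the OpenAI Python SDK and an illustrative model name:

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in your environment

# The constraints you never want the model to drift away from.
CONSTRAINTS = "Keep the tone professional and every response under 100 words."

history = []  # running list of user and assistant turns


def ask(user_message: str) -> str:
    history.append({"role": "user", "content": user_message})
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # illustrative model name
        messages=[
            # Re-send the constraints as the system message on every call,
            # so later follow-ups can't quietly override them.
            {"role": "system", "content": CONSTRAINTS},
            *history,
        ],
    )
    answer = response.choices[0].message.content
    history.append({"role": "assistant", "content": answer})
    return answer


print(ask("Help me write a professional, 100-word summary of our latest feature launch."))
print(ask("Now turn that into a social media caption."))
```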
🧠 Your LLM Sanity Checklist
Here’s a quick filter you can apply every time you read an AI output:
Is it too confident without evidence?
Does it reflect both sides of an issue or sound one-note?
Is it following my core instructions (tone, format, length, etc.)?
Would I say this out loud or send it to a real client?
Did I ask for validation, not just generation?
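If you’re generating content in code, you can even run this checklist as an automated second pass: ask the model (or a different one) to review a draft against these questions before you use it. A hedged sketch under the same assumptions as the earlier snippets:

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in your environment

CHECKLIST = (
    "Review the draft below against these questions and answer each briefly:\n"
    "1. Is it confident without citing evidence?\n"
    "2. Does it sound one-note, or does it reflect more than one perspective?\n"
    "3. Does it follow the stated instructions on tone, format, and length?\n"
    "4. Is there anything a human should verify before sending it to a client?"
)


def review(draft: str) -> str:
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # illustrative model name
        messages=[
            {"role": "system", "content": CHECKLIST},
            {"role": "user", "content": draft},
        ],
    )
    return response.choices[0].message.content


print(review("Our new feature boosts productivity by 40% for every customer, guaranteed."))
```

A second pass like this doesn’t replace your own judgment; it just makes it harder for an overconfident draft to slip through unread.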
✨ Final Insight
LLMs aren’t experts.
They’re improv artists with a good memory and no judgment.
If you give vague cues, they’ll fill in the blanks… sometimes well, sometimes wildly wrong.
But if you guide them with intent, inspect their output, and collaborate with curiosity…
You’ll work smarter, faster (and more safely) with AI.
Next on LLMentary:
We’ve seen how LLMs ‘predict’ text when talking to us and how you can tweak that intelligence to give more relevant answers. In the next blog, we’ll explore more nuanced ways to talk to LLMs using ‘chain-of-thought’ prompting by tweaking how LLMs ‘think’.
Further down the line, we’ll explore more ways to extract the most intelligence and knowledge from your LLMs and also choose which models work best for what scenarios and why!
Until then, stay curious.
Share this with anyone you think will benefit from what you’re reading here. The mission of LLMentary is to help individuals reach their full potential. So help us achieve that mission! :)
I’m experimenting with different writing styles: quick and easy, dense and informative, and technical and applied. Feel free to leave a comment on how you prefer your information, and I’ll incorporate your feedback into my writing style as well!