How does a chatbot like ChatGPT actually work?

A chatbot is built on a large language model. The model repeatedly predicts a likely next word (strictly, the next token) given everything so far, adds it to the text, and predicts again. ChatGPT, Claude, Gemini and others all use this same next-word approach.

What is a token in AI?

A token is a small chunk of text that an AI model reads and writes, often part of a word rather than a whole word. For many models, one token is roughly four characters of English. Working in tokens rather than letters is why AI can miscount the letters in a word.

Why does AI hallucinate or make things up?

Because a language model predicts plausible-sounding words, not verified facts. When it does not know something, it still produces fluent text that may be wrong. This is called hallucination, and it is why you should always verify important claims.

Step 04 — How AI Chatbots Like ChatGPT Work

▰ THE GIST · 30-SECOND VERSION

Chatbots work by predicting the next word, over and over — the same trick in ChatGPT, Claude and Gemini.
They read in tokens (word-chunks) and remember only what fits their context window.
They predict plausible words, not verified facts — so they sound confident but can be wrong.

01 It's guessing the next word

THE WHOLE TRICK

It predicts the next word, adds it, then predicts the next word again — over and over, until the answer is complete.

Be the model: pick the next word and watch the sentence grow.

▦ NEXT-WORD PREDICTOR

The best way to learn is to ▌

PICK THE NEXT WORD: in the interactive version, bars show how likely the model rates each candidate.

Tip: the AI almost always leans toward the highest bar — but a dash of randomness keeps it from sounding robotic.

Wait — that's really all it does?

A large language model, the engine inside ChatGPT, Claude, Gemini and the rest, isn't looking up answers or thinking in sentences. Every reply is built one word at a time, each picked from the handful the model rated most likely to come next.

And you were never choosing from every word — just the few the model rated likely. Stack thousands of these micro-decisions and you get an essay, a poem, or a working answer. No understanding required — just very, very good guessing.
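The predict, add, repeat loop can be sketched in a few lines of Python. This is a toy under loud assumptions: the NEXT_WORD table and its probabilities are made up for illustration, and the toy looks only at the last word, whereas a real model considers everything so far and scores tens of thousands of tokens.

```python
import random

# Hypothetical toy "model": for each word, a few candidate next words with
# made-up probabilities. Not taken from any real chatbot.
NEXT_WORD = {
    "to": {"practice": 0.6, "read": 0.3, "sleep": 0.1},
    "practice": {"daily": 0.7, "often": 0.3},
    "read": {"widely": 0.8, "daily": 0.2},
    "sleep": {"well": 1.0},
}

def greedy(options):
    """Always take the highest bar."""
    return max(options, key=options.get)

def sample(options, rng=random):
    """Usually take a high bar, but leave room for a dash of randomness."""
    return rng.choices(list(options), weights=list(options.values()), k=1)[0]

def generate(prompt, pick, max_words=5):
    """Predict the next word, add it, then predict again."""
    words = prompt.split()
    for _ in range(max_words):
        options = NEXT_WORD.get(words[-1])  # toy: only the last word matters
        if not options:                     # no known continuation, so stop
            break
        words.append(pick(options))
    return " ".join(words)

print(generate("The best way to learn is to", greedy))
# The best way to learn is to practice daily
print(generate("The best way to learn is to", sample))  # varies from run to run
```

The greedy picker gives the same sentence every time. The sampling picker mostly follows the highest bar but sometimes takes a less likely word, which is the randomness the tip above describes.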

02 Tokens and memory

→ AI reads in tokens (word-chunks), and remembers only what fits its context window.

"I love AI!" →I␣love␣AI!

"unbelievable" →unbelievable


Tokens and the memory window, explained

An AI doesn't see whole words. It chops text into tokens: small chunks, often pieces of words. For many models, one token is roughly four characters of English text.

Why this matters

Because the AI sees chunks, not letters, it can stumble on letter-level tasks — like counting the r's in "strawberry." It's not stupid; it just never saw the individual letters the way you do.
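To make the "chunks, not letters" point concrete, here is a minimal sketch of a tokenizer. The vocabulary, the ID numbers, and the longest-match rule are all assumptions chosen for illustration; real tokenizers learn their vocabulary from data and may split "strawberry" differently.

```python
# Hypothetical vocabulary of word-chunks with made-up ID numbers.
VOCAB = {"straw": 101, "berry": 102}

def tokenize(text):
    """Split text into the longest chunks found in VOCAB.
    Characters with no matching chunk become single-character tokens."""
    tokens = []
    i = 0
    while i < len(text):
        for j in range(len(text), i, -1):  # try the longest piece first
            if text[i:j] in VOCAB:
                tokens.append(text[i:j])
                i = j
                break
        else:                              # nothing matched at position i
            tokens.append(text[i])
            i += 1
    return tokens

def to_ids(tokens):
    """What the model actually receives: numbers, not letters."""
    return [VOCAB.get(t, 0) for t in tokens]  # 0 is a placeholder for unknown

def estimate_tokens(text):
    """Rule of thumb from the text: about four characters per token."""
    return max(1, len(text) // 4)

print(tokenize("strawberry"))          # ['straw', 'berry']
print(to_ids(tokenize("strawberry")))  # [101, 102]
print("strawberry".count("r"))         # 3
```

Ordinary code can count the three r's because it sees every letter. In this sketch the model sees only 101 and 102, and neither number says anything about which letters are inside.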

And everything it can "remember" in one conversation must fit inside its context window — a limit on how many tokens it holds at once. Go past it, and the earliest parts of the chat quietly fall out of view.

Everyday analogy

Think of a fixed-size whiteboard. New notes go on the right; when it fills up, the oldest notes on the left get wiped to make room. That's a context window.
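The whiteboard can be sketched as a fixed-size queue. The ContextWindow class and the five-token limit are hypothetical, and splitting on spaces stands in for a real tokenizer. Real chat apps may handle overflow in other ways, so treat this as one simple possibility.

```python
from collections import deque

class ContextWindow:
    """A fixed-size whiteboard: when it is full, the oldest tokens are wiped."""

    def __init__(self, max_tokens):
        # A deque with maxlen drops items from the left as new ones arrive.
        self.tokens = deque(maxlen=max_tokens)

    def add(self, text):
        # Assumption: splitting on spaces stands in for real tokenization.
        self.tokens.extend(text.split())

    def visible(self):
        return " ".join(self.tokens)

window = ContextWindow(max_tokens=5)
window.add("my name is Sam")
window.add("what is the weather today")
print(window.visible())  # what is the weather today
```

After the second message the name "Sam" has fallen off the left edge, which is why a very long chat can seem to forget what you said at the start.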

03 Two different moments: learning vs. using

→ Training (learning, once) and using it (every time) are two separate moments.

Why this matters for what AI knows

Training — the one-time, months-long phase where the model learns from huge amounts of text. Expensive, slow, done before release.
Inference — what happens every time you use it: it applies what it already learned. Fast, and it doesn't change the model.

The key takeaway

When you chat with an AI, it is not learning from you in that moment — it's using training that finished long ago. That's also why a model has a "knowledge cutoff" and won't know last week's news.
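The split between the two moments can be sketched with a tiny word-pair counter. Everything here is illustrative: train and predict are hypothetical helpers, the two-sentence corpus stands in for "huge amounts of text", and real training is far more involved than counting.

```python
from collections import Counter, defaultdict

def train(corpus):
    """Training: done once, up front. Count which word follows which."""
    counts = defaultdict(Counter)
    for sentence in corpus:
        words = sentence.lower().split()
        for current, following in zip(words, words[1:]):
            counts[current][following] += 1
    # Return plain dicts so the finished model is just a fixed lookup table.
    return {word: dict(c) for word, c in counts.items()}

def predict(model, word):
    """Inference: every time you use it. Reads the model, never changes it."""
    followers = model.get(word.lower())
    if not followers:
        return None  # the word never appeared in the training text
    return max(followers, key=followers.get)

model = train(["the cat sat on the mat", "the cat chased the mouse"])
snapshot = {word: dict(c) for word, c in model.items()}

print(predict(model, "the"))     # cat
print(predict(model, "rocket"))  # None
print(model == snapshot)         # True
```

Asking about "rocket" teaches the model nothing, and the table is identical before and after use. A word that was missing at training time stays missing, which is the toy version of a knowledge cutoff.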

04 Why AI makes things up

REMEMBER THIS

An AI is built to sound right, not to be right. Fluent and confident is not the same as correct.

What a hallucination is — and how to spot it

Because the model predicts plausible words rather than checking facts, it can state something completely wrong with total confidence — a hallucination. It isn't lying; it has no idea it's wrong. It's just filling the gap with words that sound right.
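A deliberately tiny sketch shows how "plausible, not verified" can go wrong. The training sentences and both helper functions are made up for illustration, and real hallucinations come from far more complex models. The shared idea is that the model completes a pattern without checking whether it knows the subject.

```python
from collections import Counter

# Hypothetical training text: all the toy model has ever seen.
TRAINING_TEXT = [
    "the capital of france is paris",
    "the capital of france is paris",
    "the capital of italy is rome",
]

def learn_endings(sentences):
    """Count which final word tends to complete the pattern."""
    return Counter(sentence.split()[-1] for sentence in sentences)

def complete(prompt, endings):
    """Append the most common ending. Nothing here checks any facts."""
    best, _count = endings.most_common(1)[0]
    return f"{prompt} {best}"

endings = learn_endings(TRAINING_TEXT)
print(complete("the capital of atlantis is", endings))
# the capital of atlantis is paris
```

The output is fluent and confident, and it is wrong. The toy has no step where it could notice that it has never heard of Atlantis.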

How to spot a hallucination

Be extra skeptical of specific facts, names, dates, quotes, statistics, and links. If it matters, verify it elsewhere — never treat a confident answer as proof. Knowing this one thing puts you ahead of most people using AI today.

RECAP · WHAT YOU NOW KNOW

Chatbots work by predicting the next word, again and again — the same trick across ChatGPT, Claude, Gemini and others.
They read in tokens (word-chunks) and remember only what fits in their context window.
Training (learning, once) and inference (using it, every time) are two separate moments — it isn't learning from your chat.
Because it predicts plausible words, it can hallucinate — sound confident but be wrong. Always verify what matters.

QUICK CHECK

Two questions. Then you're done.

1. At its core, how does a chatbot generate a reply?

2. Why can an AI confidently say something false?

