- Chatbots work by predicting the next word, over and over — the same trick in ChatGPT, Claude and Gemini.
- They read in tokens (word-chunks) and remember only what fits their context window.
- They predict plausible words, not verified facts — so they sound confident but can be wrong.
01 It's guessing the next word
It predicts the next word, adds it, then predicts the next word again — over and over, until the answer is complete.
Wait, that's really all it does?
A large language model, the engine inside ChatGPT, Claude, Gemini and the rest, isn't looking up answers or thinking in sentences. Every reply is built one word at a time, each chosen because the model rated it a likely next word.
And the choice is never made evenly across every possible word; only the few the model rates likely have a real chance. Stack thousands of these micro-decisions and you get an essay, a poem, or a working answer. No understanding required, just very, very good guessing.
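The predict, append, repeat loop described above can be sketched in a few lines of Python. This is only a toy: the `NEXT_WORD_PROBS` table, its words, and its probabilities are invented for illustration, whereas a real model learns its probabilities from huge amounts of text and works on tokens rather than whole words.

```python
import random

# Hypothetical toy "model": for each word, the few next words it rates
# likely, with made-up probabilities. A real model learns these from text.
NEXT_WORD_PROBS = {
    "the": {"cat": 0.6, "dog": 0.4},
    "cat": {"sat": 0.7, "slept": 0.3},
    "dog": {"sat": 0.5, "barked": 0.5},
    "sat": {"down": 0.8, "quietly": 0.2},
}

def generate(start, max_words=5, seed=0):
    """Predict a next word, append it, and repeat until nothing is left to predict."""
    rng = random.Random(seed)
    words = [start]
    while len(words) < max_words:
        options = NEXT_WORD_PROBS.get(words[-1])
        if options is None:  # the toy model knows nothing that follows this word
            break
        candidates = list(options)
        weights = list(options.values())
        # Sample in proportion to the probabilities instead of always
        # taking the single top-rated word.
        words.append(rng.choices(candidates, weights=weights, k=1)[0])
    return " ".join(words)

print(generate("the"))
```

Changing `seed` can produce a different sentence from the same starting word, which is one reason the same question can get differently worded answers.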
02 Tokens and memory

Tokens and the memory window, explained
An AI doesn't see whole words. It chops text into tokens: small chunks, often pieces of words. For many models, one token averages roughly four characters of English text.
Because the AI sees chunks, not letters, it can stumble on letter-level tasks — like counting the r's in "strawberry." It's not stupid; it just never saw the individual letters the way you do.
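To make the "chunks, not letters" point concrete, here is a minimal sketch of a tokenizer. It assumes a tiny hand-written vocabulary (`VOCAB`) and a simple greedy longest-match rule; real tokenizers learn far larger vocabularies and use more involved algorithms, so treat this as an illustration of the idea only.

```python
# Hypothetical vocabulary of chunks, invented for this example.
VOCAB = ["straw", "berry", "st", "raw", "b", "e", "r", "y", "s", "t", "a", "w"]

def tokenize(text):
    """Greedy longest-match: at each position, take the longest known chunk."""
    tokens = []
    i = 0
    while i < len(text):
        match = max(
            (chunk for chunk in VOCAB if text.startswith(chunk, i)),
            key=len,
            default=text[i],  # unknown character: keep it as its own token
        )
        tokens.append(match)
        i += len(match)
    return tokens

tokens = tokenize("strawberry")
token_ids = [VOCAB.index(t) for t in tokens]

print(tokens)     # ['straw', 'berry']
print(token_ids)  # [0, 1]
```

In this sketch the model would receive two ID numbers rather than ten letters, so the three r's are never presented to it individually.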
And everything it can "remember" in one conversation must fit inside its context window — a limit on how many tokens it holds at once. Go past it, and the earliest parts of the chat quietly fall out of view.
Think of a fixed-size whiteboard. New notes go on the right; when it fills up, the oldest notes on the left get wiped to make room. That's a context window.
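The whiteboard picture maps neatly onto a fixed-size queue. The sketch below uses Python's `collections.deque` with a `maxlen`; the five-token limit and the helper name `make_context_window` are assumptions chosen to keep the example small, and real chat products may handle overflow differently (for example by summarizing or refusing longer input).

```python
from collections import deque

def make_context_window(max_tokens):
    """A fixed-size buffer: appending past the limit silently drops the oldest item."""
    return deque(maxlen=max_tokens)

window = make_context_window(max_tokens=5)
for token in ["My", "name", "is", "Ada", ".", "What", "is", "my", "name", "?"]:
    window.append(token)

print(list(window))     # ['What', 'is', 'my', 'name', '?']
print("Ada" in window)  # False
```

By the time the question arrives, the name has already been wiped off the left edge, so nothing in view could answer it.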
03 Two different moments: learning vs. using
Why this matters for what AI knows
- Training: the up-front phase, often lasting weeks or months, where the model learns from huge amounts of text. Expensive, slow, done before release.
- Inference: what happens every time you use it. The model applies what it already learned. Fast, and it doesn't change the model.
When you chat with an AI, it is not learning from you in that moment; it is using training that finished before release. That's also why a model has a "knowledge cutoff" and, unless it is connected to a search tool, won't know last week's news.
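The split between the two moments can be shown with a toy word-counting model. The function names `train` and `predict_next` and the three-sentence corpus are made up for this sketch; real training adjusts billions of numeric weights instead of counting word pairs, but the separation is the same: learning happens once, and using the model only reads what was learned.

```python
from collections import Counter, defaultdict

def train(corpus):
    """Training: read the text once and count which word follows which."""
    counts = defaultdict(Counter)
    for sentence in corpus:
        words = sentence.split()
        for current, following in zip(words, words[1:]):
            counts[current][following] += 1
    # Return plain dicts so that later lookups cannot add entries by accident.
    return {word: dict(followers) for word, followers in counts.items()}

def predict_next(model, word):
    """Inference: look up what was learned; the model itself is not modified."""
    followers = model.get(word)
    if not followers:
        return None  # this word never appeared in the training text
    return max(followers, key=followers.get)

model = train(["the cat sat", "the cat slept", "the dog sat"])
snapshot = {word: dict(followers) for word, followers in model.items()}

print(predict_next(model, "the"))      # cat
print(predict_next(model, "unicorn"))  # None
print(model == snapshot)               # True
```

The `None` for "unicorn" is a small-scale version of a knowledge cutoff: anything absent from the training text is simply not in the model, no matter how many times it is asked.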
04 Why AI makes things up
An AI is built to sound right, not to be right. Fluent and confident is not the same as correct.
What a hallucination is — and how to spot it
Because the model predicts plausible words rather than checking facts, it can state something completely wrong with total confidence — a hallucination. It isn't lying; it has no idea it's wrong. It's just filling the gap with words that sound right.
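A short sketch shows how picking the most plausible words can produce a confident wrong answer. The probabilities in `COMPLETIONS` are invented for illustration and are not measured from any real model; the `FACTS` table exists only so the example can check the reply, and the `answer` function deliberately never consults it.

```python
# Made-up probabilities for illustration only.
COMPLETIONS = {
    "The capital of Australia is": {
        "Sydney": 0.55,     # appears often near "Australia", so it sounds right
        "Canberra": 0.40,   # the correct answer
        "Melbourne": 0.05,
    },
}

# Ground truth used only to check the reply; the "model" never sees it.
FACTS = {"The capital of Australia is": "Canberra"}

def answer(prompt):
    """Return the most plausible completion. There is no fact-checking step."""
    options = COMPLETIONS[prompt]
    return max(options, key=options.get)

prompt = "The capital of Australia is"
reply = answer(prompt)

print(reply)                   # Sydney
print(reply == FACTS[prompt])  # False
```

Nothing in `answer` signals that the reply is wrong, which is why the checking has to happen outside the model.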
Be extra skeptical of specific facts, names, dates, quotes, statistics, and links. If it matters, verify it elsewhere — never treat a confident answer as proof. Knowing this one thing puts you ahead of most people using AI today.
- Chatbots work by predicting the next word, again and again — the same trick across ChatGPT, Claude, Gemini and others.
- They read in tokens (word-chunks) and remember only what fits in their context window.
- Training (learning, once) and inference (using it, every time) are two separate moments — it isn't learning from your chat.
- Because it predicts plausible words, it can hallucinate — sound confident but be wrong. Always verify what matters.
