An LLM (Large Language Model) is a prediction engine. Trained on a huge amount of text, it does one thing: given the words so far, it predicts the next chunk of text. Think of it as the world's most well-read autocomplete. Here is the whole idea in one picture:
A chunk here means a token: a word or a piece of a word, the small unit the model reads and writes in. The model picks the next token, adds it to the text, then predicts again. That simple loop, run billions of times during training, is why an LLM is both amazing and fallible. You will watch it predict in a minute.
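To make the predict, append, repeat loop concrete, here is a minimal sketch in Python. It is a toy, not how a real LLM is built: whole words stand in for tokens, the "training text" is one made-up sentence pair, and the names `train_bigrams` and `generate` are invented for this illustration.

```python
from collections import Counter, defaultdict

# Tiny made-up "training text"; a real model sees vastly more.
TRAINING_TEXT = "the cat sat on the mat . the cat ate the fish ."

def train_bigrams(text):
    """Count which token follows which (whole words stand in for tokens)."""
    tokens = text.split()
    follows = defaultdict(Counter)
    for current, nxt in zip(tokens, tokens[1:]):
        follows[current][nxt] += 1
    return follows

def generate(follows, start, max_tokens=5):
    """Run the loop: pick the likeliest next token, append it, predict again."""
    output = [start]
    for _ in range(max_tokens):
        candidates = follows.get(output[-1])
        if not candidates:
            break  # nothing ever followed this token in the training text
        next_token, _count = candidates.most_common(1)[0]
        output.append(next_token)
    return " ".join(output)

model = train_bigrams(TRAINING_TEXT)
print(generate(model, "the"))
```

The output reads like the training text without being copied from any table of answers. That is the same loop, at toy scale, that makes a real model both fluent and fallible.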
The single most common misunderstanding is picturing an LLM as a search engine with a giant database of facts inside it. That is a myth. An LLM does not look facts up in a stored table. It predicts text.
The myth: "It has a database of facts and looks up the answer." If that were true, it would either know a fact or politely say it does not. It would not invent things.
The reality: It generates the most likely-sounding next text based on patterns it learned. Often that lands on the truth, because true statements are common in its training text. Sometimes it produces a fluent, confident answer that is simply wrong. This is called a hallucination: text that sounds right but is not.
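The difference between the myth and the reality can be sketched in a few lines of Python. Everything here is hypothetical and hand-made for teaching: the `FACTS` table, the counts in `SEEN_BY_PLACE`, and the function names. A real model is far more sophisticated, but the failure has the same shape: a lookup can say "I don't know", while a predictor always produces something plausible.

```python
from collections import Counter

# The myth: a table of facts (hypothetical data).
FACTS = {"France": "Paris", "Japan": "Tokyo"}

def lookup_capital(place):
    """A database either has the answer or admits it does not."""
    return FACTS.get(place, "I don't know")

# The reality, as a toy: counts of which word followed
# "The capital of <place> is" in made-up training text.
SEEN_BY_PLACE = {
    "France": Counter({"Paris": 3}),
    "Japan": Counter({"Tokyo": 2}),
}
# Overall pattern across all places, used when the place is unfamiliar.
SEEN_OVERALL = Counter({"Paris": 3, "Tokyo": 2})

def predict_capital(place):
    """A predictor always returns the likeliest-sounding word."""
    counts = SEEN_BY_PLACE.get(place, SEEN_OVERALL)
    return counts.most_common(1)[0][0]

for place in ["France", "Atlantis"]:
    print(place, "| lookup:", lookup_capital(place),
          "| predict:", predict_capital(place))
```

For a place it has seen, the predictor lands on the truth. For "Atlantis" it still answers, fluently and wrongly, which is a hallucination in miniature.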
What people imagine is happening.
What is actually happening.
Pick a starting phrase. The page reveals the next words an LLM might choose, with rough probabilities. Notice: it is weighing likely text, not fetching a fact from a table.
This runs entirely in your browser. Nothing is sent anywhere.
These numbers are illustrative and approximate, hand-picked for teaching. A real model weighs tens of thousands of possible tokens at every step. The point is the shape of it: several candidates, each with a likelihood, one winner, then it repeats.
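For the curious, here is a rough sketch of where such likelihoods come from. A model gives every candidate token a raw score, and a standard formula called softmax turns those scores into probabilities that add up to 1. The scores below are hand-picked assumptions, not output from a real model, and the phrase "The sky is" is only an example.

```python
import math

def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    highest = max(scores.values())  # subtract the max for numerical stability
    exps = {tok: math.exp(s - highest) for tok, s in scores.items()}
    total = sum(exps.values())
    return {tok: e / total for tok, e in exps.items()}

# Hypothetical scores for what might follow "The sky is".
scores = {"blue": 3.0, "clear": 1.5, "falling": 0.5, "green": -1.0}

probs = softmax(scores)
winner = max(probs, key=probs.get)

# Show the candidates from most to least likely.
for tok, p in sorted(probs.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{tok:8s} {p:.0%}")
print("winner:", winner)
```

This sketch always takes the top candidate. Real systems often pick with some randomness among the likely candidates, which is one reason the same question can get slightly different answers.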
See this idea doing real work: feeding the model the right text up front (called context) is what makes its predictions land on your facts. That is the whole game behind RAG (retrieval-augmented generation) and prompting.
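Here is a minimal sketch of what "feeding the model the right text up front" can look like in practice. It only builds the prompt and does not call any model. The business notes, the `build_prompt` function, and the crude word-overlap ranking are all assumptions for illustration. Real RAG setups typically use a proper search index or embeddings to find the relevant text.

```python
def build_prompt(question, notes, max_notes=2):
    """Put the most relevant notes in front of the question.

    Relevance here is simple word overlap, an assumption for illustration.
    """
    q_words = set(question.lower().split())

    def overlap(note):
        return len(q_words & set(note.lower().split()))

    ranked = sorted(notes, key=overlap, reverse=True)
    context = "\n".join(f"- {note}" for note in ranked[:max_notes])
    return (
        "Answer using only the context below. "
        "If the answer is not there, say you do not know.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )

# Hypothetical business notes.
NOTES = [
    "Our store opens at 9am and closes at 6pm on weekdays.",
    "Returns are accepted within 30 days with a receipt.",
    "We ship to the US and Canada only.",
]

prompt = build_prompt("What time does the store open on weekdays?", NOTES, max_notes=1)
print(prompt)
```

With the opening hours sitting right in front of the question, the most likely next text is now your actual hours rather than a plausible guess.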
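Two moments get mixed up constantly: training, when the model learns its patterns from text, and use, when it reads your prompt and predicts a reply. Keep them separate and most "AI is creepy / AI is magic" confusion clears up.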
Specifics like model names, prices, and context-window sizes change quickly. As of writing, mainstream models read roughly tens to hundreds of thousands of tokens of context at once, with a few going higher. Treat any exact figure as a snapshot, and rely on the durable idea: more relevant context in, more reliable text out.
You know an LLM is a prediction engine, not a fact database. You know why it can be confidently wrong, that it does not learn from your chats by default, and that the right context is what makes it reliable.
Understanding the engine is step one. We put it to work safely in your business, wired to your real data with the guardrails that keep its predictions honest.
Keep going: how to prompt it well → · give it your facts (RAG) → · spot bad output →
Day 21 of 30 plain-English AI lessons for small business. See the whole track →