What the model actually sees.
Not your whole matter. Not last week’s chat. A fixed window of text — and nothing else. Seven stages.
A window, not a memory.
The model has no memory and no filing system. Each time you press send, it is handed one continuous run of text — the context window — and predicts what comes next. Anything outside that window does not exist for the model. Windows are measured in tokens — the word-fragments the model reads instead of whole words.
Everything shares the same space.
Your instructions, every document you paste, the whole conversation so far, and every answer the model has already given all sit in the same window, competing for room. The window is shared — a long document squeezes everything else.
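To make the squeeze concrete, here is a minimal sketch of how the pieces divide one window. The block names and token counts are invented for illustration, and the 32k window size is simply the figure used in the overflow demonstration; real windows and real documents will differ.

```python
# Illustrative only: every size below is a made-up token count.
WINDOW_TOKENS = 32_000  # assumed window size, matching the 32k overflow example


def window_shares(blocks: dict[str, int], window: int = WINDOW_TOKENS) -> dict[str, float]:
    """Return each block's share of the window as a fraction, plus the free space."""
    shares = {name: tokens / window for name, tokens in blocks.items()}
    used = sum(blocks.values())
    shares["free"] = max(0.0, 1.0 - used / window)
    return shares


blocks = {
    "instructions": 500,
    "pasted_document": 24_000,
    "conversation_so_far": 4_000,
    "model_answers": 2_500,
}

shares = window_shares(blocks)
for name, share in shares.items():
    print(f"{name:>20}: {share:6.1%}")
```

In this made-up case the single pasted document takes three quarters of the window, leaving about 3% free for everything that follows.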
Watch it overflow.
Each block is sized by its share of the window. Keep adding: once the total passes 32k tokens, the oldest material falls out of the top, starting with your instructions.
Why long chats drift.
When a conversation outgrows the window, the earliest turns are dropped or summarised. The careful instruction you gave in message one may literally no longer be in front of the model by message forty. It hasn’t “forgotten” like a person — the text is simply absent. One matter per chat; restate what matters.
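The dropping behaviour can be sketched in a few lines. This is a simplified illustration, not how any particular product works: it assumes a 32k-token window, invented token counts, and a plain oldest-first rule. Real systems may summarise instead of dropping, or may protect the instructions.

```python
from collections import deque


def fit_to_window(turns, window=32_000):
    """Drop the oldest turns until the total fits in the window.

    turns: list of (label, tokens) pairs, oldest first.
    Returns (kept, dropped_labels).
    """
    kept = deque(turns)
    dropped = []
    total = sum(tokens for _, tokens in kept)
    while kept and total > window:
        label, tokens = kept.popleft()  # the oldest material goes first
        dropped.append(label)
        total -= tokens
    return list(kept), dropped


# Hypothetical conversation, oldest first; sizes are made up.
turns = [
    ("instructions", 500),
    ("document A", 20_000),
    ("turn 1", 3_000),
    ("document B", 12_000),
    ("turn 2", 2_000),
]

kept, dropped = fit_to_window(turns)
print("dropped:", dropped)
print("kept:", [label for label, _ in kept])
```

Here the conversation totals 37,500 tokens, so the instructions and document A are removed before the rest fits. The model answering "turn 2" never sees either of them.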
Attention thins out.
Chart: recall by position. The middle sags.
Even inside the window, attention isn’t uniform. Material at the start and end of a long context tends to get more attention than material buried in the middle. Put the thing that matters most at the start or the end — and say it twice if it’s critical.
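One way to apply this when assembling a long prompt is to open with the critical instruction and repeat it at the very end. The sketch below is only an illustration: the function name, the placeholder documents, and the sample instruction are all hypothetical.

```python
def build_prompt(critical: str, documents: list[str], question: str) -> str:
    """Place the critical instruction first and repeat it last."""
    parts = [
        critical,      # start of the window: tends to be well attended
        *documents,    # bulk material sits in the middle
        question,
        critical,      # repeated at the end, where attention recovers
    ]
    return "\n\n".join(parts)


# Hypothetical instruction and placeholder documents.
critical = "Cite only the documents provided; say so if the answer is not in them."
prompt = build_prompt(
    critical,
    ["<document 1 text>", "<document 2 text>"],
    "What notice period applies?",
)
print(prompt)
```

The repetition costs a few tokens, which is usually a small price for an instruction that must not be missed.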
Four practical rules.
A million tokens is a bundle, not a library.
The largest current windows reach around a million tokens, roughly a full hearing bundle or several long novels. But a bigger window is not a guarantee of attention: the middle still sags (stage 05), and a model may still overlook or underuse parts of what it has been given.
Now you know what it can’t see.
The window explains many disappointing AI answers. The rest come down to what the model does with the space it has, and why it sometimes fills gaps with invention.
Next: Why AI Makes Things Up →