Context Window
The context window is the maximum amount of text a large language model (LLM) can take into account at once, expressed in tokens.
It includes the system prompt, the conversation history, any provided documents and the generated response. A larger window lets the model analyse longer documents or maintain a longer conversation without losing information.
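Since the system prompt, the history, the documents and the response all share one budget, applications typically trim the oldest messages when the window fills up. Here is a minimal sketch of that idea; the 4-characters-per-token estimate, the function names, and the budget values are illustrative assumptions, not any specific model's tokenizer or API.

```python
def estimate_tokens(text: str) -> int:
    # Rough heuristic: ~4 characters per token. This is an assumption
    # for illustration; real tokenizers vary by model and language.
    return max(1, len(text) // 4)

def fit_to_window(system_prompt: str, history: list[str],
                  budget: int, reserve_for_response: int) -> list[str]:
    """Keep the system prompt plus the most recent messages that fit
    in the context window, reserving room for the generated response."""
    available = budget - estimate_tokens(system_prompt) - reserve_for_response
    kept: list[str] = []
    # Walk the history from newest to oldest, dropping the oldest first.
    for message in reversed(history):
        cost = estimate_tokens(message)
        if cost > available:
            break
        kept.append(message)
        available -= cost
    return list(reversed(kept))

history = ["first question " * 50, "first answer " * 50, "latest question"]
kept = fit_to_window("You are a helpful assistant.", history,
                     budget=300, reserve_for_response=100)
print(len(kept))  # the oldest message no longer fits and is dropped
```

In this sketch the oldest turn is discarded first, which mirrors the common sliding-window strategy: information that falls outside the window is simply no longer visible to the model.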
In 2026, frontier models offer context windows of several million tokens (Gemini, Claude). A large window does not guarantee quality, however: the "lost in the middle" phenomenon, where models attend poorly to information buried in the middle of a long context, and the quadratic cost of attention in sequence length remain open challenges.
