One-Sentence Definition
A large language model (LLM) is a deep-learning system trained on massive text datasets to understand, generate, and reason about human language.
How It Works
An LLM is built on the transformer architecture. During training, the model reads vast quantities of text -- books, websites, code repositories, scientific papers -- and learns to predict what token (word or word fragment) comes next in a sequence. This simple objective, repeated trillions of times across terabytes of data, produces a model that internalizes grammar, facts, reasoning patterns, and even coding conventions.
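The next-token objective can be illustrated with a deliberately tiny sketch. Real LLMs use neural networks and subword tokenizers trained on terabytes of text; here a simple bigram count table stands in for the learned model, just to show what "predict the next token" means. The corpus and every variable name are invented for illustration.

```python
import numpy as np

# Toy corpus and vocabulary (illustrative only -- real LLMs train on
# terabytes of text with subword tokenizers, not whole words).
corpus = "the cat sat on the mat the cat ran".split()
vocab = sorted(set(corpus))
idx = {w: i for i, w in enumerate(vocab)}

# "Training": count how often each token follows each other token.
counts = np.zeros((len(vocab), len(vocab)))
for prev, nxt in zip(corpus, corpus[1:]):
    counts[idx[prev], idx[nxt]] += 1

# Normalize each row of counts into a next-token probability
# distribution (guarding against rows with no observations).
probs = counts / np.maximum(counts.sum(axis=1, keepdims=True), 1)

# Predict the most likely token after "the".
prediction = vocab[int(np.argmax(probs[idx["the"]]))]
print(prediction)  # "cat" -- it follows "the" twice, "mat" only once
```

A transformer replaces the count table with hundreds of billions of learned weights and conditions on the entire preceding context rather than one token, but the objective -- output a probability distribution over the next token -- is the same.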
Modern LLMs have hundreds of billions of parameters -- the numerical weights that encode everything the model has learned. GPT-4, Claude, Gemini, and Llama 3 are all LLMs, though they differ in training data, architecture tweaks, and safety tuning. After the base model is trained (pre-training), it goes through fine-tuning stages -- often including reinforcement learning from human feedback (RLHF) -- to make it more helpful, accurate, and safe in conversation.
At inference time (when you type a prompt), the model processes your input through dozens of transformer layers, each applying self-attention to weigh how every token relates to every other token. The output is a probability distribution over the vocabulary, and the model samples from it to produce each new token one at a time.
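The two steps above -- self-attention weighing token-to-token relevance, then sampling from a distribution over the vocabulary -- can be sketched in a few lines of NumPy. This is a minimal single-head, single-layer toy with random weights, not a trained model; all dimensions and matrices are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(x, axis=-1):
    # Subtract the max for numerical stability before exponentiating.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

# Toy sizes (real models use thousands of dimensions, many heads and layers).
seq_len, d_model, vocab_size = 4, 8, 10

x = rng.standard_normal((seq_len, d_model))   # token embeddings for the prompt
Wq, Wk, Wv = (rng.standard_normal((d_model, d_model)) for _ in range(3))

# Scaled dot-product self-attention: every token scores its relevance
# to every other token, and the scores weight a mix of value vectors.
Q, K, V = x @ Wq, x @ Wk, x @ Wv
scores = Q @ K.T / np.sqrt(d_model)
attn = softmax(scores, axis=-1)               # each row sums to 1
context = attn @ V

# Project the final position to vocabulary logits, convert to a
# probability distribution, and sample the next token from it.
W_out = rng.standard_normal((d_model, vocab_size))
logits = context[-1] @ W_out
probs = softmax(logits)
next_token = int(rng.choice(vocab_size, p=probs))
```

Generation simply repeats the last step: the sampled token is appended to the input and the whole sequence is processed again to produce the next distribution, one token at a time.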
Why It Matters
LLMs are the technology behind ChatGPT, Claude, Gemini, and nearly every AI chatbot and coding assistant on the market. They have moved AI from a specialist tool to something hundreds of millions of people use daily for writing, research, programming, analysis, and creative work.
The competitive landscape is intense. OpenAI, Anthropic, Google DeepMind, Meta, and Mistral are all releasing new LLMs on a rapid cadence. Open-weight models like Llama and Mistral have democratized access, letting startups and researchers run capable models on their own hardware. Meanwhile, context windows have grown from 4,000 tokens in early GPT-3.5 to over a million tokens in some 2026 models, dramatically expanding what LLMs can process in a single session.
Key Takeaway
A large language model is a transformer-based neural network trained on enormous text corpora to understand and generate language, and it is the core technology powering the current wave of AI products.
Part of the AI Weekly Glossary.