One-Sentence Definition
A foundation model is a large AI model pre-trained on broad data at scale that can be adapted to a wide range of downstream tasks through fine-tuning, prompting, or integration into larger systems.
How It Works
The term "foundation model" was coined by Stanford's Center for Research on Foundation Models (CRFM) in 2021 to describe a shift in AI development. Instead of training a separate model for each task -- one for translation, one for sentiment analysis, one for summarization -- the field moved toward training one massive model on diverse data and then adapting it to many tasks.
Foundation models are trained in two phases. Pre-training exposes the model to enormous datasets: text from the internet, code repositories, books, images, or a combination. The model learns general patterns -- how language works, how objects appear in images, how code is structured. This phase is extremely expensive, often requiring thousands of GPUs running for months and costing tens to hundreds of millions of dollars.
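The core pre-training objective for language models is next-token prediction: given the tokens so far, predict what comes next. Real foundation models learn this with neural networks over trillions of tokens; the toy sketch below illustrates the same idea with simple bigram counts over a made-up corpus (the corpus and function names are invented for illustration).

```python
from collections import Counter, defaultdict

# Toy illustration of the next-token-prediction objective used in
# pre-training. A real model learns a neural network; here we just
# count which token follows each token in a tiny corpus.
corpus = "the model learns patterns the model learns structure".split()

# "Training": tally every observed (previous token -> next token) pair.
follows = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    follows[prev][nxt] += 1

def predict_next(token: str) -> str:
    """Return the continuation seen most often during training."""
    return follows[token].most_common(1)[0][0]

print(predict_next("model"))  # "learns" always follows "model" in the corpus
```

Scaled up by many orders of magnitude, this is how a model picks up the general patterns described above: frequent structures in the data become likely continuations.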
Adaptation is the second phase. A foundation model can be adapted through fine-tuning (additional training on task-specific data), prompt engineering (carefully crafted instructions), or retrieval-augmented generation (RAG), which connects the model to external knowledge at query time. This flexibility is what makes them "foundational" -- one model serves as the base for countless applications.
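The cheapest adaptation path, prompting, can be sketched as a thin template layer over a single base model. The template strings and function names below are hypothetical; in practice the prompt would be sent to a provider's completion API rather than printed.

```python
# Adaptation via prompting: one base model serves many tasks when user
# input is wrapped in a task-specific instruction template.
TEMPLATES = {
    "translate": "Translate the following text to French:\n{text}",
    "sentiment": "Classify the sentiment (positive/negative):\n{text}",
    "summarize": "Summarize in one sentence:\n{text}",
}

def build_prompt(task: str, text: str) -> str:
    """Specialize the shared base model for one task via its template."""
    return TEMPLATES[task].format(text=text)

print(build_prompt("sentiment", "I loved this product"))
```

Fine-tuning and RAG follow the same pattern at different layers: fine-tuning bakes the task into the model's weights, while RAG injects retrieved documents into the prompt instead of a fixed instruction.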
Examples of foundation models in 2026 include GPT-4 and GPT-4o (OpenAI), Claude (Anthropic), Gemini (Google DeepMind), Llama 3 (Meta), Mistral Large (Mistral AI), and Stable Diffusion (Stability AI). Each can be adapted for specific industries, tasks, and products.
Why It Matters
Foundation models have reorganized the AI industry. A small number of labs with the capital to pre-train frontier models (OpenAI, Anthropic, Google, Meta) produce the base models. A much larger ecosystem of companies builds products on top of them. This creates economic leverage -- every improvement to a foundation model ripples across thousands of downstream applications -- but also concentration risk, since the entire ecosystem depends on a handful of model providers.
Open-weight foundation models like Llama and Mistral partially address this by letting anyone download, modify, and deploy the model, reducing dependence on API providers.
Key Takeaway
A foundation model is a large, general-purpose AI model pre-trained at scale and adapted to many tasks, forming the base layer of the modern AI technology stack.
Part of the AI Weekly Glossary.