One-Sentence Definition
Transfer learning is a machine learning technique where a model trained on one task is reused as the starting point for a different but related task, dramatically reducing the data and compute needed to achieve strong performance.
How It Works
Training a neural network from scratch requires massive datasets and significant compute. Transfer learning sidesteps this by starting with a model that has already learned useful representations from a large dataset and adapting it to a new problem.
The classic example comes from computer vision. A model like ResNet, pre-trained on the roughly 1.2 million labeled images of the standard ImageNet (ILSVRC) benchmark, learns general visual features: edges, textures, shapes, and object parts. If you need a model that classifies skin lesions in medical images, you do not train from scratch. Instead, you take the pre-trained ResNet, replace its final classification layer with one suited to your categories, and fine-tune on your medical dataset. The model already knows what edges and textures look like -- it just needs to learn which patterns indicate melanoma.
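The replace-the-head recipe can be sketched in a few lines of NumPy. This is a toy illustration, not a real vision pipeline: the "pre-trained backbone" is simulated by a fixed random projection, and the dataset, dimensions, and learning rate are arbitrary choices for the sketch.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for a pre-trained backbone: a FROZEN feature extractor.
# In a real setup this would be e.g. a ResNet with its final layer removed.
W_backbone = rng.normal(size=(64, 8)) / 8.0    # frozen, never updated below

def features(x):
    # Frozen forward pass (ReLU activation); gradients never flow into it.
    return np.maximum(x @ W_backbone, 0.0)

# New task head: a freshly initialized binary classifier. Only this trains.
W_head = np.zeros(8)

# Toy dataset for the new task.
X = rng.normal(size=(200, 64))
y = (X[:, 0] > 0).astype(float)

# "Fine-tune" the head only: logistic regression on the frozen features.
lr = 0.1
for _ in range(500):
    z = features(X) @ W_head
    p = 1.0 / (1.0 + np.exp(-z))               # sigmoid
    grad = features(X).T @ (p - y) / len(y)    # cross-entropy gradient
    W_head -= lr * grad                        # W_backbone stays untouched
```

Because the backbone is frozen, training touches 8 parameters instead of 520 -- the same economy, at a vastly larger scale, that makes fine-tuning a pre-trained ResNet cheap.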
The same principle applies to language. Models like BERT, GPT, and Llama are pre-trained on vast text corpora, learning grammar, facts, and reasoning patterns. When you fine-tune one of these models on legal documents, customer emails, or scientific papers, you are performing transfer learning. The model transfers its general language understanding to your specific domain.
Transfer learning can be partial (freeze most layers, train only the last few) or full (update all weights with a low learning rate). Techniques like LoRA and adapters make transfer learning even more efficient by training only a small number of additional parameters.
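The LoRA idea -- leave the pre-trained weights frozen and learn only a low-rank correction -- can be sketched in NumPy. This is a conceptual illustration, not the library implementation; the layer dimensions and rank are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(1)

d_out, d_in, r = 512, 512, 8                   # r is the LoRA rank

# Frozen pre-trained weight matrix: never updated during adaptation.
W = rng.normal(size=(d_out, d_in))

# Trainable low-rank factors. B starts at zero, so the adapted layer
# initially behaves exactly like the original pre-trained layer.
A = rng.normal(size=(r, d_in)) * 0.01          # trainable
B = np.zeros((d_out, r))                       # trainable, zero-initialized

def adapted_forward(x):
    # Original path plus the low-rank correction (W + B @ A), with W frozen.
    return x @ W.T + x @ A.T @ B.T

full_params = W.size                           # 512 * 512 = 262,144
lora_params = A.size + B.size                  # 2 * 8 * 512 = 8,192 (~3%)
```

Training updates only `A` and `B`, so the optimizer state and checkpoints shrink in proportion -- the reason LoRA makes adapting large models so cheap.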
Why It Matters
Transfer learning is the reason AI is accessible beyond Big Tech. Without it, every company would need to train models from scratch at enormous cost. With it, a startup can take an open-weight model like Llama 3, fine-tune it on a few thousand domain-specific examples, and deploy a capable system at a fraction of the cost.
The entire paradigm of foundation models -- large models pre-trained once and adapted many times -- depends on transfer learning working reliably. It is the economic engine that makes modern AI development viable for organizations of all sizes.
Key Takeaway
Transfer learning reuses knowledge from pre-trained models to solve new tasks efficiently, and it is the reason modern AI does not require training from scratch for every application.
Part of the AI Weekly Glossary.