GPTCrunch
Training

Fine-Tuning

The process of further training a pre-trained model on a smaller, task-specific dataset to improve its performance on particular tasks or to adapt it to a specific domain.

Fine-tuning takes a foundation model that has already been pre-trained on a massive general corpus and continues training it on a narrower, curated dataset. This process adapts the model's weights to perform better at specific tasks — like generating medical summaries, classifying legal documents, or matching a particular writing style — without the enormous cost of training from scratch.
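At its core, fine-tuning is just continued gradient descent from an already-trained starting point. A minimal toy sketch of that idea (the model here is a hypothetical linear one, not a real LLM):

```python
import numpy as np

# Toy sketch: start from "pre-trained" weights and continue gradient descent
# on a small, task-specific dataset so the weights adapt to the new domain.
rng = np.random.default_rng(0)

# Pretend these weights came from large-scale pre-training.
pretrained_w = rng.normal(size=(4,))

# Small task-specific dataset: the narrow task needs slightly shifted weights.
x = rng.normal(size=(32, 4))
true_task_w = pretrained_w + 0.5
y = x @ true_task_w

def mse(w):
    return float(np.mean((x @ w - y) ** 2))

w = pretrained_w.copy()
loss_before = mse(w)
for _ in range(200):                       # continued training = fine-tuning
    grad = 2 * x.T @ (x @ w - y) / len(x)  # gradient of the MSE loss
    w -= 0.05 * grad                       # plain SGD step
loss_after = mse(w)

print(loss_before > loss_after)  # task loss drops as the weights adapt
```

Real fine-tuning works the same way, only with a transformer, a cross-entropy loss over tokens, and far more parameters; the "start from pre-trained weights, keep training on narrow data" structure is identical.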

The most common fine-tuning approaches differ in how much of the model they modify. Full fine-tuning updates all model parameters; it requires significant compute and memory but generally yields the highest quality. Parameter-efficient methods like LoRA (Low-Rank Adaptation) and QLoRA freeze the pre-trained weights and train only small low-rank adapter matrices, dramatically reducing memory and compute requirements (QLoRA additionally quantizes the frozen weights to 4-bit). Instruction fine-tuning focuses on teaching the model to follow particular instruction formats. RLHF (reinforcement learning from human feedback) is a specialized form of fine-tuning that aligns model behavior with human preferences.
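The LoRA trick can be sketched in a few lines. This is an illustrative NumPy version of the idea, not a library API: the pre-trained weight W stays frozen, and only a low-rank update B @ A is trained, so the effective weight becomes W + (alpha / r) * B @ A.

```python
import numpy as np

# Illustrative LoRA sketch (names are ours, not a real library's API).
d_out, d_in, r, alpha = 512, 512, 8, 16
rng = np.random.default_rng(0)

W = rng.normal(size=(d_out, d_in))     # frozen pre-trained weight
A = rng.normal(size=(r, d_in)) * 0.01  # trainable rank-r down-projection
B = np.zeros((d_out, r))               # trainable; starts at zero, so the
                                       # adapter is a no-op before training

def lora_forward(x):
    # Frozen path plus scaled low-rank adapter path.
    return x @ W.T + (alpha / r) * (x @ A.T) @ B.T

x = rng.normal(size=(2, d_in))
y = lora_forward(x)

full_params = W.size          # what full fine-tuning would update
lora_params = A.size + B.size # what LoRA actually trains
print(y.shape, lora_params / full_params)
```

For this 512x512 layer at rank 8, the adapter holds about 3% of the layer's parameters, which is where LoRA's memory savings come from: only the adapter needs gradients and optimizer state.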

Fine-tuning is not always the right choice. For many tasks, well-crafted prompts with few-shot examples can achieve comparable results without the engineering overhead. Fine-tuning makes sense when you need consistent formatting, domain-specific knowledge, reduced latency (shorter prompts), or behavior that is difficult to elicit through prompting alone. It is especially valuable when you have hundreds or thousands of high-quality input-output examples that represent your desired behavior.
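Those input-output examples are typically packaged as JSONL, one JSON object per line, in a chat-style shape used by several hosted fine-tuning APIs; treat the exact field names below as an assumption and check your provider's documentation.

```python
import json

# Sketch of a chat-style fine-tuning dataset in JSONL format.
examples = [
    {"messages": [
        {"role": "system", "content": "You summarize radiology reports."},
        {"role": "user", "content": "CT chest: no acute findings."},
        {"role": "assistant", "content": "Normal chest CT."},
    ]},
    {"messages": [
        {"role": "system", "content": "You summarize radiology reports."},
        {"role": "user", "content": "XR wrist: distal radius fracture."},
        {"role": "assistant", "content": "Fractured distal radius."},
    ]},
]

# Serialize: each training example becomes one line.
jsonl = "\n".join(json.dumps(ex) for ex in examples)

# Basic validation pass: every line parses, and each example ends with the
# assistant turn the model is supposed to learn to produce.
for line in jsonl.splitlines():
    record = json.loads(line)
    assert record["messages"][-1]["role"] == "assistant"

print(len(jsonl.splitlines()))  # → 2
```

A quick validation pass like this catches malformed lines before an expensive training job does.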

Major providers offer fine-tuning APIs: OpenAI supports fine-tuning GPT-4o and GPT-4o mini, while open-source models like Llama 3 and Mistral can be fine-tuned locally or on cloud platforms. The cost depends on dataset size, model size, and training duration. A typical parameter-efficient fine-tuning run (e.g. LoRA) on a 7B-parameter open-source model can be done on a single high-end GPU in a few hours, while fine-tuning a 70B model requires a multi-GPU setup or cloud resources.
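The hardware claims above follow from back-of-envelope memory arithmetic. The bytes-per-parameter figures here are common rules of thumb, not exact numbers: full fine-tuning with an Adam-style optimizer in mixed precision needs roughly 16 bytes per parameter (weights, gradients, and two optimizer moments), while just holding fp16 weights needs about 2 bytes per parameter.

```python
# Rule-of-thumb GPU memory estimates for fine-tuning (assumed figures).
GIB = 1024 ** 3

def full_finetune_gib(params, bytes_per_param=16):
    # weights + gradients + Adam moments, mixed precision (rule of thumb)
    return params * bytes_per_param / GIB

def fp16_weights_gib(params):
    # just the frozen fp16 weights, as in LoRA-style training or inference
    return params * 2 / GIB

print(round(fp16_weights_gib(7e9)))   # ~13 GiB: 7B fp16 weights fit on one GPU
print(round(full_finetune_gib(7e9)))  # ~104 GiB: full fine-tuning does not
print(round(full_finetune_gib(70e9))) # ~1043 GiB: 70B forces multi-GPU setups
```

This is why parameter-efficient methods dominate single-GPU workflows: freezing the base weights means only the small adapter needs the expensive 16-bytes-per-parameter training state.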
