LLM (Large Language Model)
A model trained on huge amounts of text to predict the next token.
Transformer
The neural network architecture behind every major LLM — attention over sequences.
Attention Mechanism
How transformers decide which tokens to focus on when generating each output token.
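The core computation can be sketched in a few lines. This is a minimal, dependency-free illustration of scaled dot-product attention (the `softmax` and `attention` helpers are written here for the example, not taken from any library):

```python
import math

def softmax(xs):
    # Numerically stable softmax over a list of scores.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def attention(queries, keys, values):
    """Scaled dot-product attention: each query scores every key,
    and the output is the softmax-weighted average of the values."""
    d = len(keys[0])
    out = []
    for q in queries:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d)
                  for k in keys]
        weights = softmax(scores)  # how much to "focus" on each token
        out.append([sum(w * v[j] for w, v in zip(weights, values))
                    for j in range(len(values[0]))])
    return out
```

A query that points in the same direction as a key gets a higher weight, so that key's value dominates the output.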
Multimodal Model
A model that handles text plus images, audio, or video in one request.
Mixture of Experts (MoE)
An architecture where only part of the model activates per token.
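The routing idea fits in one function. A minimal sketch, assuming a per-token gate score for each expert and top-k selection (the function name and shapes are illustrative, not any particular library's API):

```python
def moe_forward(x, experts, gate_scores, k=2):
    """Sparse mixture-of-experts routing: run only the top-k experts
    by gate score and mix their outputs by normalized gate weight."""
    # Pick the k highest-scoring experts for this token.
    top = sorted(range(len(experts)),
                 key=lambda i: gate_scores[i], reverse=True)[:k]
    total = sum(gate_scores[i] for i in top)
    # Only the chosen experts run; the rest stay idle — that's the saving.
    return sum(gate_scores[i] / total * experts[i](x) for i in top)
```

With, say, 8 experts and k=2, only a quarter of the expert parameters are active per token, even though the full model is much larger.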
Distillation
Training a smaller, cheaper model to mimic a larger one's outputs.
Fine-Tuning
Continuing to train a base model on your own examples to specialize its behavior.
LoRA (Low-Rank Adaptation)
A lightweight way to fine-tune by training small adapter weights instead of the whole model.
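The trick is that the weight update is factored into two small matrices. A minimal sketch of the effective weight W + (alpha/r) · B·A, using plain lists (the helper names are made up for this example):

```python
def matmul(A, B):
    # Tiny dense matrix multiply for the sketch.
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

def lora_weight(W, A, B, alpha=1.0):
    """Effective weight with a LoRA adapter: W + (alpha/r) * B @ A.
    Only A (r x d_in) and B (d_out x r) are trained; W stays frozen."""
    r = len(A)  # adapter rank, much smaller than the full dimensions
    delta = matmul(B, A)
    scale = alpha / r
    return [[w + scale * d for w, d in zip(w_row, d_row)]
            for w_row, d_row in zip(W, delta)]
```

For a 4096×4096 layer, a rank-8 adapter trains about 65K parameters instead of 16.8M, which is why LoRA fine-tuning fits on modest hardware.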
Quantization
Storing model weights at lower precision (e.g., 4-bit) to save memory and run faster.
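A minimal sketch of the simplest scheme, symmetric round-to-nearest quantization with one shared scale (real quantizers use per-group scales and cleverer rounding; these function names are illustrative):

```python
def quantize(weights, bits=4):
    """Map floats to signed integers in [-(2^(b-1)-1), 2^(b-1)-1]
    using one shared scale factor."""
    qmax = 2 ** (bits - 1) - 1          # 7 for 4-bit
    scale = max(abs(w) for w in weights) / qmax
    q = [round(w / scale) for w in weights]
    return q, scale

def dequantize(q, scale):
    # Recover approximate floats at inference time.
    return [qi * scale for qi in q]
```

Each weight now needs 4 bits instead of 16 or 32, at the cost of a small rounding error bounded by half the scale.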
Temperature
A dial that controls how random or focused a model's output is.
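Mechanically, temperature divides the logits before the softmax. A minimal sketch (the function name is made up for this example):

```python
import math

def softmax_with_temperature(logits, temperature=1.0):
    """Divide logits by temperature, then softmax.
    T < 1 sharpens the distribution (more focused);
    T > 1 flattens it (more random)."""
    scaled = [l / temperature for l in logits]
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]
```

At T close to 0 the top token takes almost all the probability mass (near-greedy decoding); at high T the distribution approaches uniform.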
Top-p (Nucleus Sampling)
A sampling dial that picks from the smallest set of tokens summing to probability p.
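The selection step can be sketched directly from the definition. This is a minimal illustration (function name invented for the example); a real sampler would then draw a token from the returned distribution:

```python
def top_p_filter(probs, p=0.9):
    """Keep the smallest set of tokens whose probabilities sum to >= p,
    then renormalize. Returns {token_index: probability}."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, total = [], 0.0
    for i in order:
        kept.append(i)
        total += probs[i]
        if total >= p:
            break  # the "nucleus" is complete
    return {i: probs[i] / total for i in kept}
```

Unlike a fixed top-k cutoff, the nucleus grows or shrinks with the model's confidence: a peaked distribution may keep one token, a flat one may keep dozens.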
Chain-of-Thought (CoT)
Asking the model to reason step by step before answering.
RLHF (Reinforcement Learning from Human Feedback)
A post-training technique that uses human preference data to turn a raw LLM into a helpful, safer assistant.
Instruct Model
A base model fine-tuned to follow instructions — the "chat" version you actually use.
Few-Shot Learning
Showing the model 2-5 examples of the task in the prompt so it learns the pattern.
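In practice this is just prompt assembly. A minimal sketch of a few-shot prompt builder (the helper name and "Input:/Output:" labels are arbitrary conventions, not a standard):

```python
def few_shot_prompt(examples, query):
    """Assemble a few-shot prompt: worked examples first, then the
    new input. The model infers the task pattern from the examples."""
    blocks = [f"Input: {inp}\nOutput: {out}" for inp, out in examples]
    blocks.append(f"Input: {query}\nOutput:")  # model completes from here
    return "\n\n".join(blocks)
```

For example, two uppercase-the-word demonstrations followed by `Input: fish` typically leads the model to complete with `FISH`, with no instruction ever stating the rule.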
Zero-Shot Prompting
Asking the model to do a task with no examples — just instructions.
In-Context Learning (ICL)
How models adapt to new tasks from examples in the prompt, with no weight updates.
Diffusion Model
The architecture behind image generators like DALL-E, Midjourney, and Stable Diffusion.
GPT-4o
OpenAI's flagship multimodal model — fast, cheap relative to predecessors, and supports vision and voice.
Claude Sonnet (Anthropic)
Anthropic's primary workhorse model — strong writing, long context, and reliable tool use.
Llama (Meta)
Meta's open-weight LLM family — the leading choice for self-hosted and fine-tuned deployments.
Gemini (Google)
Google's frontier LLM family — notable for its 2M-token context window and Google ecosystem integration.
Vision-Language Model (VLM)
A model that understands both images and text — reads documents, screenshots, and photos.