Parameters

The fundamental unit of measurement for the size — and, very approximately, the capability — of a modern AI model.


In one sentence

A parameter is a single number inside a neural network whose value was determined during training, and a model’s “parameter count” — the figure usually quoted alongside its name (Gemma 4 26B, Llama 3.3 70B) — is the total number of those numbers, typically running from a few hundred million on the small end to a trillion or more on the frontier.

Why this term exists

Modern language models are, mechanically, very large mathematical functions. The function takes input tokens and produces output tokens, and the recipe for getting from input to output is a long sequence of weighted-sum operations. Each input to a weighted sum has its own multiplier — a parameter — that says how much that particular input matters to that particular output.
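
Here is that weighted sum in miniature, as Python. Every value in weights and bias is one parameter; all the numbers are invented for illustration.

```python
# One weighted sum, written out by hand. Every value in `weights` and the
# `bias` is a parameter; the numbers here are invented for illustration.
inputs = [0.2, -1.3, 0.8]    # activations arriving from the previous layer
weights = [0.5, -0.1, 1.2]   # three parameters
bias = 0.3                   # one more parameter

output = sum(w * x for w, x in zip(weights, inputs)) + bias
print(round(output, 2))  # 0.2*0.5 + (-1.3)*(-0.1) + 0.8*1.2 + 0.3 = 1.49
```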

Training is the process of finding good values for those multipliers. Trillions of training tokens flow through the model, each batch nudging the parameter values slightly so that the model’s outputs better match the desired outputs. When training finishes, the parameter values are frozen. That frozen blob of numbers is the model.
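
One nudge can be shown in a few lines. The sketch below performs a single gradient-descent step on a one-parameter model; the whole setup is invented for illustration, but the mechanism is the one training applies at scale.

```python
# One training nudge in miniature: a single gradient-descent step on a
# one-parameter model. The setup (y = w * x, squared-error loss,
# learning rate 0.01) is invented for illustration.
w = 3.0                  # current parameter value
x, target = 2.0, 10.0    # one training example

prediction = w * x                          # 6.0, too low
gradient = 2 * (prediction - target) * x    # d(loss)/dw = -16.0
w = w - 0.01 * gradient                     # nudge w toward a better value
print(w)                                    # 3.16, a small step closer
```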

The number of parameters, then, is a structural property of the model — it tells you how big the function is, how much memory it takes to load, and (very roughly) how much capacity it has to represent useful patterns.

What it actually does — concretely

A working language model is a stack of mathematical operations called layers. Inside each layer are matrices of numbers — the parameters — that get multiplied with the input to produce the layer’s output. The output of one layer becomes the input of the next.
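
A toy version of the counting: each layer contributes the size of its weight matrix plus its bias vector, and the parameter count is simply the sum. The layer sizes below are invented for illustration.

```python
import numpy as np

# A toy three-layer stack, sizes invented for illustration. Each layer is a
# weight matrix plus a bias vector; the model's parameter count is the
# total number of stored values.
layer_shapes = [(512, 2048), (2048, 2048), (2048, 512)]

total = 0
for n_in, n_out in layer_shapes:
    weights = np.zeros((n_in, n_out))  # n_in * n_out parameters
    bias = np.zeros(n_out)             # n_out more parameters
    total += weights.size + bias.size

print(f"{total:,} parameters")  # 6,296,064 parameters
```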

If a model has 26 billion parameters, that means there are 26 billion individual numbers stored on disk and loaded into memory when the model runs. Each one is typically a 16-bit or 32-bit floating-point number; with the common 4-bit quantization, each parameter is squeezed down to 4 bits of storage. The math: 26 billion parameters at 16 bits (2 bytes) each is roughly 52 GB; at 4 bits each, roughly 13 GB. The quantized file on disk runs a little larger (about 17 GB in the benchmark below), because some tensors are typically kept at higher precision.
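
The same arithmetic as a sketch (the helper function is ours, not from any library):

```python
# Back-of-envelope weight memory from parameter count and precision. The
# function name is ours; real files run somewhat larger because some
# tensors are typically kept at higher precision.
def weights_gb(params_billions: float, bits_per_param: int) -> float:
    return params_billions * 1e9 * bits_per_param / 8 / 1e9

print(weights_gb(26, 16))  # 52.0 GB at 16-bit
print(weights_gb(26, 4))   # 13.0 GB at 4-bit (vs. ~17 GB actual on disk)
```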

That is why parameter count is the first specification anyone looks at when picking a local model. It tells you whether the model will fit.

A small but important warning about parameters as a quality measure

It is tempting to assume more parameters = better model. This is broadly true at the frontier (a 405B model knows more than a 7B model from the same family), but the relationship is loose and full of exceptions: a newer, smaller model trained on better data routinely beats an older, larger one; a sparse mixture-of-experts model carries a large headline count while using only a fraction of it per token; and fine-tuning and quantization can change a model’s quality without changing its parameter count at all.

The headline parameter count is the rough size of the model. It is not the rough quality of the model. Confusing the two is a common rookie mistake — and a common vendor-marketing trick.

Working example from this machine

The four local models benchmarked on this MacBook on May 2, 2026, ordered by parameter count:

| Model | Parameters | Architecture | Size on disk (Q4) | Speed (tok/sec) |
|---|---|---|---|---|
| Gemma 4 26B | 26 billion | MoE (sparse) | 17 GB | 96.5 |
| Gemma 3 27B | 27 billion | Dense | 17 GB | 29.0 |
| Gemma 4 31B | 31 billion | Dense | 19 GB | 24.6 |
| Qwen 2.5 32B | 32 billion | Dense | 19 GB | 25.7 |

Notice that the smallest-parameter model — Gemma 4 26B — outruns every larger model here, by 3.3× to 3.9×. Architecture (sparse vs. dense) matters more than raw parameter count for inference speed. This is the same lesson stated above, made concrete.
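
A rough model of why, as a sketch: at generation time, throughput is typically bound by how many parameter values must be read from memory per token, and a sparse model reads only its active subset. The active-parameter figure below is hypothetical, since the table above does not report it.

```python
# Why a sparse (MoE) model can outrun larger dense ones: when generating
# text, the bottleneck is usually reading weights from memory, so tokens/sec
# scales roughly with 1 / (parameters touched per token). The active-parameter
# figure below is hypothetical; the benchmark table does not state it.
dense_params = 31e9         # Gemma 4 31B: every parameter is used per token
moe_active_params = 7e9     # hypothetical active subset of the 26B MoE

speedup = dense_params / moe_active_params
print(f"~{speedup:.1f}x faster")  # ~4.4x, the same order as the measured ~3.9x
```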

Why this matters in a teaching context

For a BBA or MBA classroom, parameter count is a useful entry point into a deeper management lesson: headline metrics are usually less informative than the supporting structure that produced them.

The same pattern shows up in revenue quoted without margins, headcount quoted without productivity, and spec-sheet numbers like megapixels or clock speed quoted without the system around them.

A management student who internalizes the pattern “the headline number is a starting point, not an answer” will be a more rigorous strategist. AI parameter counts are a particularly clean teaching example because the gap between headline and reality can be demonstrated in twenty minutes of benchmarking.

Trade-offs

More parameters buy capacity: broader knowledge, better reasoning, more nuance. They also cost more of everything else: more disk, more memory, slower generation. Fewer parameters run fast and fit on modest hardware, but they cap what the model can represent. And, as the benchmark above shows, architecture can shift the balance: a sparse model keeps a large parameter budget for capacity while touching only a fraction of it per token.
Related entries: Ollama, Token burn, Fine-tuning, forthcoming Quantization.

Return to Dictionary · All Entries (A–Z)