The Gaussian distribution, also called the normal distribution or bell curve, is the probability distribution with density function:
$$f(x) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left(-\frac{(x-\mu)^2}{2\sigma^2}\right)$$
with two parameters: the mean $\mu$ (the center of the bell) and the standard deviation $\sigma$ (how wide it spreads). It is the single most important probability distribution in statistics.
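As a minimal sketch, the density can be evaluated directly with Python's standard library. The function name `gaussian_pdf` and its argument names are illustrative choices here, not part of any particular library.

```python
import math

def gaussian_pdf(x, mu=0.0, sigma=1.0):
    """Density of a normal distribution with mean mu and standard deviation sigma."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    z = (x - mu) / sigma  # distance from the mean in units of sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

# Height of the standard normal (mu = 0, sigma = 1) at its mean.
peak = gaussian_pdf(0.0)
```

The peak height is 1 / (sigma * sqrt(2 * pi)), so a narrower bell is also a taller one.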
What it looks like
Plotted on a graph, the Gaussian is a symmetric bell-shaped curve, highest at $x = \mu$ and decreasing smoothly in both directions. About 68% of the total probability lies within one standard deviation of the mean, 95% within two, and 99.7% within three: the famous “68-95-99.7 rule.”
The $\sigma\sqrt{2\pi}$ in the denominator is a normalization constant that ensures the total area under the curve equals 1, as every probability density must.
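Both claims can be checked numerically. The sketch below uses only the standard library. The helper names `prob_within` and `area_under_pdf` are made up for this illustration, and the midpoint rule is just one simple way to approximate the integral.

```python
import math

def prob_within(k):
    """P(|X - mu| < k * sigma) for any normal distribution, via the error function."""
    return math.erf(k / math.sqrt(2.0))

def area_under_pdf(mu=0.0, sigma=1.0, width=10.0, steps=20_000):
    """Midpoint-rule estimate of the area over [mu - width*sigma, mu + width*sigma]."""
    lo = mu - width * sigma
    dx = 2.0 * width * sigma / steps
    norm = sigma * math.sqrt(2.0 * math.pi)  # the normalization constant
    total = 0.0
    for i in range(steps):
        x = lo + (i + 0.5) * dx
        total += math.exp(-0.5 * ((x - mu) / sigma) ** 2) / norm
    return total * dx

coverage = {k: prob_within(k) for k in (1, 2, 3)}  # roughly 0.68, 0.95, 0.997
area = area_under_pdf(mu=2.0, sigma=0.5)           # should be very close to 1
```

The area does not depend on the values chosen for the mean and standard deviation, which is exactly what the normalization constant is for.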
Why it’s everywhere — the Central Limit Theorem
The Gaussian’s dominance comes from a remarkable result called the Central Limit Theorem (CLT). It says: if you add up many independent random variables, each with finite mean and variance (and, in the classical version, identically distributed), the sum, once centered and rescaled, tends toward a Gaussian distribution, regardless of how the individual variables are distributed.
This is why so many measured quantities are (approximately) normal:
- Heights of adults in a population — the combined effect of many genetic and environmental factors
- Measurement errors in physics — the sum of many small independent disturbances
- Sample means of any distribution — central to all of statistics
The Gaussian isn’t a model of any particular phenomenon; it is the distribution that emerges whenever you average many independent contributions of finite variance.
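A small simulation makes the theorem concrete. This is a sketch under assumed settings: the summands are Uniform(0, 1) draws, which look nothing like a bell curve, and the choices of 30 terms, 20,000 samples, and a fixed seed are arbitrary. The function names are illustrative.

```python
import math
import random

def standardized_sums(n_terms, n_samples, seed=0):
    """Sum n_terms Uniform(0, 1) draws, then center and rescale each sum."""
    rng = random.Random(seed)
    mean = n_terms * 0.5              # mean of the sum
    std = math.sqrt(n_terms / 12.0)   # Uniform(0, 1) has variance 1/12
    return [
        (sum(rng.random() for _ in range(n_terms)) - mean) / std
        for _ in range(n_samples)
    ]

def fraction_within(values, k):
    """Share of values whose absolute value is below k."""
    return sum(abs(v) < k for v in values) / len(values)

z = standardized_sums(n_terms=30, n_samples=20_000)
frac_1 = fraction_within(z, 1.0)  # a Gaussian would give about 0.68
frac_2 = fraction_within(z, 2.0)  # a Gaussian would give about 0.95
```

If the standardized sums are close to Gaussian, the two fractions should land near the 68% and 95% figures of the 68-95-99.7 rule, up to sampling noise.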
History
The earliest derivation came from Abraham de Moivre in 1733, who used it as an approximation to the binomial distribution. Pierre-Simon Laplace extended it. But the name comes from Carl Friedrich Gauss, who in 1809 showed that the distribution of measurement errors in astronomy — the “errors of observation” — was well described by this curve. Gauss used it in his work on planetary orbits and the method of least squares.
The distribution is also sometimes called the error curve for this reason.
In statistics
Much of classical statistics — confidence intervals, hypothesis tests, regression analysis — assumes that data or sample means are normally distributed, either as an exact model or as an approximation via the CLT. The t-test, the z-test, the F-test, and the chi-squared test all trace back to the Gaussian.
In machine learning, the Gaussian is the default assumption for noise, the starting point for many regression models, and the key ingredient in Gaussian processes.
The pitfall: heavy tails
The Gaussian predicts that extreme events are vanishingly rare: a 10-sigma event has probability around $10^{-23}$. But real-world data often have heavier tails: extreme events happen more often than a Gaussian predicts.
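The Gaussian tail probability can be computed with the complementary error function from Python's standard library. The sketch below counts deviations in both directions; the one-sided probability is half of it. The function name `two_sided_tail` is an illustrative choice.

```python
import math

def two_sided_tail(k):
    """P(|X - mu| > k * sigma) for a normal distribution."""
    # erfc stays accurate far into the tail, where 1 - erf would round to zero.
    return math.erfc(k / math.sqrt(2.0))

for k in (3, 5, 10):
    print(f"{k}-sigma: {two_sided_tail(k):.3e}")
```

Using `math.erfc` rather than `1 - math.erf(...)` matters here: at 10 sigma the subtraction would lose all precision in floating point.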
Financial returns are famously heavier-tailed than Gaussian, a fact made painfully clear in the 1987 crash and again in 2008. Earthquake energies, internet traffic spikes, and power outage sizes are often modeled with power-law distributions, which have fat tails. Assuming normality in these domains is one of the most common and expensive mistakes in applied statistics.
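To get a feel for how quickly the gap opens up, the sketch below compares the Gaussian tail with that of a Laplace distribution scaled to the same variance. The Laplace distribution is only an assumed stand-in for "heavier-tailed than Gaussian": its tails decay exponentially, so genuinely power-law data would show an even larger gap. The function names are illustrative.

```python
import math

def gaussian_tail(k):
    """P(|X| > k) for a normal distribution with mean 0 and variance 1."""
    return math.erfc(k / math.sqrt(2.0))

def laplace_tail(k):
    """P(|X| > k) for a Laplace distribution with mean 0 and variance 1.

    A Laplace distribution with scale b has variance 2 * b**2, so unit
    variance means b = 1 / sqrt(2), and the two-sided tail is exp(-k / b).
    """
    return math.exp(-math.sqrt(2.0) * k)

# How many times more likely a k-sigma event is under the Laplace model.
ratios = {k: laplace_tail(k) / gaussian_tail(k) for k in (2, 4, 6, 10)}
```

Near the center the two models roughly agree, but the ratio grows rapidly with k, which is why a model that fits everyday data well can still badly underestimate extremes.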
Nassim Taleb’s book The Black Swan is essentially a book-length argument against uncritical use of the Gaussian outside the domains where the CLT actually applies.
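The total area under the curve is always 1: shifting $\mu$ slides the bell horizontally, shrinking $\sigma$ sharpens the peak, and increasing it flattens the peak and spreads the tails. The 68-95-99.7 rule follows from integrating the density: the probability of falling within $k$ standard deviations of the mean is $\operatorname{erf}(k/\sqrt{2})$, where $\operatorname{erf}$ is the error function.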
Frequently asked
Who discovered the normal distribution?
Abraham de Moivre derived it in 1733 for approximating the binomial distribution. Carl Friedrich Gauss popularized it in 1809 in his work on measurement errors — which is why it's named after him.
Why does it show up everywhere?
The Central Limit Theorem: the sum of many independent random variables with finite variance, once centered and rescaled, tends to a normal distribution, regardless of the original distributions. Whenever a quantity is the combined effect of many small random factors, the result is approximately normal.
Is the normal distribution always the right model?
No. Many real-world quantities — financial returns, web traffic, earthquake magnitudes — have heavier tails than the normal predicts. Assuming normality where it doesn't hold is a major source of real-world mistakes.