A medical test for a rare disease is 99% accurate. You take the test. It comes back positive. What’s the probability you actually have the disease?

If you said “99%,” you’ve made a mistake that medical professionals make routinely. The correct answer depends on the base rate — how rare the disease is in the general population. For most rare diseases, the answer turns out to be much lower than 99%. Often dramatically lower.

Working out the right answer requires Bayes’ theorem and the broader framework of Bayesian inference — a way of reasoning about probability that has, in the last fifty years, become one of the most important tools in modern statistics, machine learning, and decision theory.

This article is about what Bayesian inference is, why it gives the right answer to questions like the medical-test one, and where this kind of reasoning shows up across science and engineering.

The medical-test calculation

Let’s work through the example carefully. Suppose:

  • A disease affects 0.1% of the population (1 in 1000 people).
  • A test is 99% accurate: if you have the disease, it correctly says yes 99% of the time. If you don’t, it correctly says no 99% of the time.

Now imagine testing 100,000 random people:

  • About 100 have the disease (0.1%). The test catches 99 of them.
  • About 99,900 don’t have the disease. The test wrongly flags 1% of them, or 999 false positives.

So out of all positive tests (99 + 999 = 1,098), only 99 are true positives. The probability that a positive test means you actually have the disease is:

$$P(\text{disease} \mid \text{positive}) = \frac{99}{1{,}098} \approx 9\%.$$

A 99% accurate test, applied to a 0.1% rare disease, gives only 9% confidence in a positive result.

This isn’t a trick. It’s correct, careful application of probability theory. The reason is that the base rate is so low that even a small false-positive rate produces many more false positives than true positives.
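The head count above can be reproduced in a few lines. This is a minimal sketch, not a clinical tool: the function name `positive_test_breakdown` and the parameter names `sensitivity` and `specificity` (the test's hit rate on sick people and its correct-rejection rate on healthy people, both 99% in this example) are labels chosen here for illustration.

```python
def positive_test_breakdown(population, prevalence, sensitivity, specificity):
    """Expected counts of true and false positives, plus P(disease | positive)."""
    sick = population * prevalence
    healthy = population - sick
    true_positives = sick * sensitivity             # sick people correctly flagged
    false_positives = healthy * (1 - specificity)   # healthy people wrongly flagged
    posterior = true_positives / (true_positives + false_positives)
    return true_positives, false_positives, posterior


tp, fp, posterior = positive_test_breakdown(
    population=100_000, prevalence=0.001, sensitivity=0.99, specificity=0.99
)
print(f"true positives:  {tp:.0f}")
print(f"false positives: {fp:.0f}")
print(f"P(disease | positive) = {posterior:.3f}")
```

Changing `prevalence` while leaving the test untouched is the quickest way to see how strongly the answer depends on the base rate.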

Bayes’ theorem precisely

The math is captured by Bayes’ theorem, named after the Reverend Thomas Bayes, whose essay on the subject was published posthumously in 1763:

$$P(H \mid D) = \frac{P(D \mid H) \cdot P(H)}{P(D)}.$$

Reading the symbols:

  • $P(H \mid D)$, the posterior probability: what we want. The probability of hypothesis $H$ being true given the data $D$.
  • $P(D \mid H)$, the likelihood: the probability of observing data $D$ if hypothesis $H$ is true. (For our medical test: 99%.)
  • $P(H)$, the prior probability: our belief about $H$ before seeing data. (For the disease: 0.1%.)
  • $P(D)$, the marginal probability of the data, computed by summing over all hypotheses.

The medical-test calculation in this notation:

$$P(\text{disease} \mid +) = \frac{P(+ \mid \text{disease}) \cdot P(\text{disease})}{P(+)} = \frac{0.99 \times 0.001}{0.99 \times 0.001 + 0.01 \times 0.999} = \frac{0.00099}{0.01098} \approx 9\%.$$

The Bayesian framework gives a recipe: start with your prior belief, update it by the evidence, get the posterior. Repeat as new evidence arrives.
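The recipe works for any finite set of competing hypotheses, not just two. The sketch below is one way to write it, assuming the hypotheses are mutually exclusive and exhaustive; the function name `bayes_update` and the dictionary layout are illustrative choices, not part of any library.

```python
def bayes_update(prior, likelihood):
    """Return the posterior over hypotheses.

    prior:      dict mapping hypothesis -> P(H)
    likelihood: dict mapping hypothesis -> P(D | H) for the observed data D
    """
    unnormalized = {h: likelihood[h] * p for h, p in prior.items()}
    evidence = sum(unnormalized.values())   # P(D), summed over all hypotheses
    return {h: value / evidence for h, value in unnormalized.items()}


prior = {"disease": 0.001, "healthy": 0.999}
likelihood_of_positive = {"disease": 0.99, "healthy": 0.01}

posterior = bayes_update(prior, likelihood_of_positive)
print(posterior)
```

The denominator is never supplied by hand: it falls out of normalizing the products of prior and likelihood, which is why the marginal probability of the data is often described as a normalizing constant.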

Bayesian vs frequentist statistics

The Bayesian approach is one of two major schools of statistical thought. The other is frequentist statistics. The distinction is technical but the core difference is:

Frequentist: probability is the long-run frequency of an event in repeated trials. A 95% confidence interval means: if I repeated my procedure many times, 95% of the resulting intervals would contain the true value.

Bayesian: probability is a degree of belief. A 95% credible interval means: given my prior plus the evidence, there’s a 95% probability the true value is in this interval.
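To make the credible-interval idea concrete, here is a rough sketch for an unknown success rate. It assumes a uniform Beta(1, 1) prior (an illustrative choice, not one the article prescribes), for which the posterior after observing some successes and failures is again a Beta distribution. The interval is estimated from random draws using only the standard library; the function name and the example counts are hypothetical.

```python
import random


def credible_interval(successes, trials, level=0.95, draws=100_000, seed=0):
    """Equal-tailed credible interval for a success rate.

    Assumes a uniform Beta(1, 1) prior, so the posterior is
    Beta(1 + successes, 1 + failures). The interval is estimated
    from random draws rather than computed exactly.
    """
    rng = random.Random(seed)
    a = 1 + successes
    b = 1 + (trials - successes)
    samples = sorted(rng.betavariate(a, b) for _ in range(draws))
    tail = (1 - level) / 2
    lower = samples[int(tail * draws)]
    upper = samples[int((1 - tail) * draws) - 1]
    return lower, upper


low, high = credible_interval(successes=14, trials=20)
print(f"95% credible interval for the success rate: ({low:.2f}, {high:.2f})")
```

The result reads exactly as the Bayesian definition says: given this prior and these data, the rate lies between `low` and `high` with 95% probability. A different prior would give a different interval.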

Both frameworks have valid applications. Frequentist methods dominate in some scientific fields (especially physics, classical experimental design), while Bayesian methods are more natural in others (machine learning, decision theory, AI).

The historical pendulum has swung back and forth. Bayesian methods were developed first (Bayes’ essay was published in 1763) and dominated through much of the 19th century. The frequentist approach (Fisher, Neyman, Pearson) became dominant in the 1920s and held that position for half a century. Since the 1970s, Bayesian methods have made a strong comeback, partly due to advances in computational sampling (MCMC) that made previously intractable Bayesian calculations feasible.

Today, most working statisticians use both, choosing whichever framework fits the problem.

Where Bayesian inference is used

The applications are extensive. A partial list:

Medical diagnosis. Modern diagnostic systems combine test results with patient demographics, symptoms, and prior probabilities to produce calibrated diagnostic probabilities. The base-rate calculation we did is the simplest case.

Spam filtering. Naive Bayes classifiers were the first effective spam filters and are still used. They compute $P(\text{spam} \mid \text{words in email})$ using estimates from a training corpus.

A/B testing in tech. Companies running website experiments use Bayesian methods to decide when an A/B test has gathered enough evidence to declare a winner. The Bayesian approach handles “early stopping” more cleanly than frequentist methods.

Search and rescue. When a ship or plane goes missing, search teams use Bayesian methods to update their probability map. They start with a prior (built around where the missing object was last known to be), update with observations (which areas have been searched without success), and concentrate further searches where the posterior probability is highest. This approach was used to locate the Air France 447 wreckage in 2011, nearly two years after the 2009 crash.

Robotics and self-driving cars. Bayesian filtering (Kalman filters, particle filters) tracks the position and state of vehicles given noisy sensor data. The car’s belief about its own location is constantly being updated as new sensor readings arrive.

Machine learning. Bayesian neural networks treat model weights as probability distributions, capturing uncertainty in predictions. This is increasingly important for “AI alignment” applications where models need to know when they’re uncertain.

Physics and astronomy. When LIGO detected gravitational waves in 2015 and 2017, the properties of the sources were estimated with Bayesian inference: observed waveforms were compared against theoretical templates, with priors on the physical parameters.

Forensic science. Comparing DNA evidence to suspect populations involves Bayesian reasoning. Confusing the probability of a match given innocence with the probability of innocence given a match (the “prosecutor’s fallacy”) has contributed to serious miscarriages of justice.

Genetics. Inferring evolutionary relationships, gene linkage, and population structure increasingly uses Bayesian methods (BEAST, MrBayes, etc.) — they handle uncertainty cleanly and incorporate prior biological knowledge.
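The search-and-rescue item in the list above is a compact illustration of the whole cycle. The sketch below is a toy version, assuming the search area is split into a handful of cells and that a search of the right cell spots the object with a fixed probability. The cell probabilities, the detection probability, and the function name are all hypothetical, and real search planning involves far more detail than this.

```python
def update_after_failed_search(prior, searched, detection_prob):
    """Posterior over cells after searching one cell and finding nothing.

    prior:          list of probabilities that the object is in each cell
    searched:       index of the cell that was searched
    detection_prob: chance of spotting the object if it is in the searched cell
    """
    # Likelihood of "nothing found" under each hypothesis about the location.
    likelihood = [
        1 - detection_prob if cell == searched else 1.0
        for cell in range(len(prior))
    ]
    unnormalized = [lik * p for lik, p in zip(likelihood, prior)]
    total = sum(unnormalized)
    return [u / total for u in unnormalized]


belief = [0.5, 0.3, 0.2]          # hypothetical prior over three search cells
belief = update_after_failed_search(belief, searched=0, detection_prob=0.8)
print([round(p, 3) for p in belief])
```

A failed search does not rule the cell out; it only lowers its probability and raises the others in proportion, so the next search goes wherever the updated map is highest.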

The base rate fallacy

One of the most consequential errors in human reasoning is ignoring the base rate. The medical-test example is the classic case. Studies have shown that doctors and medical students routinely overestimate the probability of disease given a positive test, often by a factor of 5–10.

The reason is that human intuition naturally focuses on case-specific information — “this test is 99% accurate” — and underweights the prior probability — “but the disease is rare.” Daniel Kahneman and Amos Tversky’s work on cognitive biases (which won Kahneman the 2002 Nobel Prize in Economics) documented this tendency extensively.

The base-rate fallacy has consequences everywhere:

  • Mass screening for rare conditions can produce many more false positives than true positives, even with very accurate tests. This is why the U.S. Preventive Services Task Force often recommends against screening for certain rare conditions in low-risk populations — the math doesn’t favor it.

  • Predictive policing and risk-assessment algorithms can be profoundly miscalibrated when they don’t properly account for base rates. If an algorithm that is 99% accurate (on both true cases and non-cases) is applied to a population with a 1% base rate, half of all “high-risk” flags are false positives.

  • Identification of unusual events (terrorist threats, fraud, anomalies) suffers the same problem at scale. Even highly accurate classifiers, applied to billions of data points, produce overwhelming numbers of false positives because the true positives are so rare.
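The examples above share one formula, and tabulating it across base rates shows how quickly a positive flag loses its meaning. This is a small sketch assuming a classifier that is right 99% of the time on both real cases and non-cases; the function name and the particular base rates are illustrative.

```python
def positive_predictive_value(base_rate, sensitivity=0.99, specificity=0.99):
    """P(condition | positive flag) for a given base rate."""
    true_pos = sensitivity * base_rate
    false_pos = (1 - specificity) * (1 - base_rate)
    return true_pos / (true_pos + false_pos)


for base_rate in (0.1, 0.01, 0.001, 0.0001):
    ppv = positive_predictive_value(base_rate)
    print(f"base rate {base_rate:>7.2%}  ->  P(condition | flagged) = {ppv:.1%}")
```

At a 1% base rate the flag is a coin toss, and at 0.1% it matches the 9% from the medical-test example. The classifier is identical in every row; only the prior changes.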

The Bayesian formalism is the antidote. Always compute the posterior properly. Always check what the prior actually is.

Updating beliefs sequentially

A powerful feature of Bayesian inference is that the posterior of one observation can serve as the prior for the next. This makes Bayesian reasoning naturally suited to sequential learning.

Start with prior $P_0(H)$. See evidence $D_1$, get posterior $P_1(H) = P(H \mid D_1)$. Use $P_1$ as the prior for the next round. See evidence $D_2$, get $P_2(H) = P(H \mid D_1, D_2)$. And so on.
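As a quick check of this chaining, the sketch below feeds the medical-test posterior back in as the prior for a second positive result. It assumes the two tests err independently given the patient's true status, which is an assumption made here for illustration and often fails in practice (a repeat of the same assay may repeat the same error). The helper name `update` is hypothetical.

```python
def update(prior, p_data_if_true, p_data_if_false):
    """One Bayesian update for a yes/no hypothesis."""
    numerator = p_data_if_true * prior
    return numerator / (numerator + p_data_if_false * (1 - prior))


p0 = 0.001                      # prior: disease prevalence
p1 = update(p0, 0.99, 0.01)     # after the first positive test
p2 = update(p1, 0.99, 0.01)     # after a second, independent positive test

# Batch update: both positives at once, with likelihoods multiplied.
p_batch = update(p0, 0.99 * 0.99, 0.01 * 0.01)

print(f"after one positive:  {p1:.3f}")
print(f"after two positives: {p2:.3f}")
print(f"batch update:        {p_batch:.3f}")
```

Updating one observation at a time and updating on everything at once give the same posterior, which is what makes it safe to discard old data and carry only the current belief forward.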

This sequential updating is the mathematical core of:

  • Online learning in machine learning
  • Adaptive algorithms that improve with new data
  • Forecasting models that update as new observations arrive
  • Sensor fusion in robotics, where multiple noisy measurements combine into a coherent estimate

The Kalman filter — used in everything from GPS navigation to spacecraft attitude control — is essentially a sequential Bayesian update for normally distributed beliefs.
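A stripped-down illustration of that idea is the one-dimensional measurement update below. It is only a sketch: it estimates a single fixed quantity from noisy readings and leaves out the prediction step (the motion model) that a full Kalman filter also has. The prior, the noise variance, and the readings are made-up numbers.

```python
def kalman_update(mean, variance, measurement, noise_variance):
    """Combine a Gaussian belief with one noisy Gaussian measurement."""
    gain = variance / (variance + noise_variance)   # how much to trust the reading
    new_mean = mean + gain * (measurement - mean)
    new_variance = (1 - gain) * variance
    return new_mean, new_variance


mean, variance = 0.0, 100.0          # vague prior belief about a fixed position
readings = [10.3, 9.6, 10.1, 9.9]    # hypothetical sensor readings
for z in readings:
    mean, variance = kalman_update(mean, variance, z, noise_variance=1.0)
    print(f"reading {z:5.1f} -> mean {mean:6.3f}, variance {variance:.3f}")
```

Each reading pulls the mean toward the measurement by an amount set by the gain, and the variance shrinks with every update, so later readings move the estimate less than earlier ones.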

Markov Chain Monte Carlo

For complicated models, the Bayesian calculation is rarely tractable in closed form. Computing $P(H \mid D)$ when $H$ is a high-dimensional model and $D$ is real-world data usually involves intractable integrals.

The breakthrough that made modern Bayesian inference practical was Markov Chain Monte Carlo (MCMC), particularly the Metropolis algorithm (1953), its generalization by Hastings (1970), and Gibbs sampling (1984). These techniques generate samples from the posterior distribution using random walks designed to converge to the right distribution. (See our Markov chains piece for the broader framework.)

MCMC techniques transformed Bayesian statistics from a niche area to a workhorse method by the late 1990s. Software like BUGS, JAGS, Stan, and PyMC lets working scientists write down their model and let the computer figure out the posterior. This democratized Bayesian inference enormously.
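To give a feel for what such software does internally, here is a bare-bones random-walk Metropolis sampler written with the standard library only. It is a teaching sketch, not how Stan or PyMC work in detail. The problem (a coin's bias after 14 heads in 20 flips, with a uniform prior) is chosen because the exact posterior mean is known and the sampler can be checked against it; the step size, chain length, and burn-in fraction are arbitrary choices.

```python
import math
import random


def log_posterior(theta, heads, flips):
    """Unnormalized log posterior for a coin's bias under a uniform prior."""
    if not 0.0 < theta < 1.0:
        return -math.inf            # zero prior probability outside (0, 1)
    return heads * math.log(theta) + (flips - heads) * math.log(1 - theta)


def metropolis(heads, flips, steps=50_000, step_size=0.1, seed=0):
    rng = random.Random(seed)
    theta = 0.5
    current = log_posterior(theta, heads, flips)
    samples = []
    for _ in range(steps):
        proposal = theta + rng.gauss(0.0, step_size)    # symmetric random walk
        candidate = log_posterior(proposal, heads, flips)
        log_ratio = candidate - current
        # Accept with probability min(1, posterior ratio).
        if log_ratio >= 0 or rng.random() < math.exp(log_ratio):
            theta, current = proposal, candidate
        samples.append(theta)
    return samples[steps // 10:]    # drop the first 10% as burn-in


samples = metropolis(heads=14, flips=20)
estimate = sum(samples) / len(samples)
print(f"MCMC posterior mean:  {estimate:.3f}")
print(f"exact posterior mean: {(14 + 1) / (20 + 2):.3f}")
```

The sampler only ever evaluates the unnormalized posterior, so the intractable normalizing integral never has to be computed. That is the property that lets the same approach scale to models with no closed-form answer.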

What Bayesian thinking teaches

The deepest lesson is that rational belief updating is mathematically constrained. Given your prior beliefs and new evidence, there’s exactly one correct way to update, and Bayes’ theorem specifies it.

This has philosophical implications:

Conditional probability is the right framework for reasoning under uncertainty. Whatever your prior beliefs, evidence shifts them in a precise quantitative way. Subjective priors don’t make Bayesian reasoning subjective in any harmful sense: as evidence accumulates, different priors converge to similar posteriors, provided none of them assigns zero probability to the truth.

Strong claims require strong evidence. Extraordinary hypotheses have low priors. To overcome a low prior, you need evidence with a very high likelihood ratio. This is the mathematical basis for “extraordinary claims require extraordinary evidence.”

Calibration is a virtue. When the model and prior are reasonable, a Bayesian calculation produces probability estimates that are well-calibrated in the long run: things you assigned 80% probability to should happen about 80% of the time. This calibration is testable and yields objective performance metrics.

Most reasoning under uncertainty is Bayesian whether you realize it or not. When you weigh evidence, update your opinion based on new information, or assess how surprising a finding is, you’re (informally) doing Bayesian reasoning. The formal apparatus just makes the calculations precise.
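The point about strong claims and likelihood ratios can be made exact by writing Bayes’ theorem in odds form. The notation $\neg H$ for "the hypothesis is false" is introduced here and is not used elsewhere in the article:

$$\frac{P(H \mid D)}{P(\neg H \mid D)} = \frac{P(D \mid H)}{P(D \mid \neg H)} \times \frac{P(H)}{P(\neg H)}.$$

In words: posterior odds equal the likelihood ratio times the prior odds. For the medical test, the prior odds are $0.001 / 0.999 = 1/999$ and the likelihood ratio of a positive result is $0.99 / 0.01 = 99$, so

$$\frac{P(\text{disease} \mid +)}{P(\text{healthy} \mid +)} = 99 \times \frac{1}{999} = \frac{99}{999}, \qquad P(\text{disease} \mid +) = \frac{99}{99 + 999} \approx 9\%.$$

To bring a hypothesis with prior odds of 1 to 999 up to even odds, the evidence would need a likelihood ratio of 999. A ratio of 99 is strong evidence, but not strong enough.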

The next time you read a news story about a medical test, a security system, or an algorithm’s accuracy, your first reaction should be: what’s the base rate? Without that, the accuracy number means very little. With it, you can do the calculation yourself and get the right answer.

That’s, in many ways, the practical gift of Bayesian inference: a precise method for thinking clearly about uncertain situations, and a way to spot the everyday reasoning errors that come from ignoring base rates. Two and a half centuries after the Reverend Bayes, it remains one of the most underrated tools in clear thinking.

Frequently asked

Are Bayesian methods always better than frequentist methods?

Not always — they're complementary. Bayesian methods require specifying a prior distribution, which is sometimes well-justified and sometimes essentially arbitrary. When priors are clearly known, Bayesian inference is more powerful. When priors are uncertain, frequentist methods can be more conservative and reproducible. Most modern statisticians use both, choosing the framework that fits the problem.

What's the difference between a prior and a likelihood?

The prior P(H) is your belief about how likely the hypothesis is BEFORE seeing the data. The likelihood P(D|H) is the probability of observing the data IF the hypothesis is true. Bayes' theorem combines them into the posterior P(H|D) — your updated belief after seeing the data. Different priors can lead to different posteriors even with the same data, which is why prior choice matters.

Why does the medical-test example produce such a low probability?

Because the disease is rare. Even with a 99% accurate test, if only 0.1% of the population has the disease, the false positives outnumber the true positives by about 10 to 1. The post-test probability of having the disease, given a positive result, is therefore only about 9% — not 99%. This base-rate effect is why mass screening for rare conditions is statistically tricky.