Relearning Statistics in my 30s: What I learned from Statistical Rethinking

19 Sep, 2025

Statistical Rethinking by Richard McElreath might be the best academic book I’ve ever read. That may be partly because I had never read an academic book purely out of interest before, but this one is genuinely different. It has good storytelling, compelling arguments for adopting a Bayesian perspective, and it teaches you how to actually do statistics. I’d like to share here why this book is such a good read.

It taught me that statistics does make sense

The most common use of statistics is to gain knowledge about a population from a limited sample. We use a sample because it’s usually too expensive to collect data from every member of a population. We then use the sample data and the mathematical tools of inferential statistics to make inferences about the population. In other words, we try to generalize what we learn from the sample to the overall population (summarizing the sample data itself, without generalizing, is usually called descriptive statistics).

There are two schools of thought (or rather, belief systems) in the inference process: Frequentist and Bayesian. They differ in how they view the unknown. Suppose we want to know the average commuting time of Jakarta residents and sample 1,000 people. A Bayesian would say: “After seeing the data, I am 95% sure the average commuting time lies within a certain range around 140 minutes a day.” A Frequentist would say: “My estimate is 140 minutes a day. The true average is a fixed number that I don’t know, but if I repeated this sampling many, many times and calculated a 95% confidence interval each time, 95% of those intervals would contain it.” See the difference?

Bayesians are comfortable assigning uncertainty to a particular variable. For example: “I am 80% sure the average Indonesian height is between 150–160 cm.” Here, you are uncertain about a specific variable: Indonesian average height. This is intuitive because we do it all the time. Examples:

I am sure that it will rain tomorrow
I am sure I will be 10 minutes late to the office
I am not sure whether I will pass this exam

However, the Bayesian method requires a prior belief to work. This prior belief is then updated with collected data. To expand on the previous example: “Previously, I believed the average Indonesian height was between 170–180 cm. After seeing the data, my updated belief is that it’s between 150–160 cm.” Here, you have a prior belief, collect data, and now have an updated belief.
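
To make the prior-to-updated-belief step concrete, here is a minimal sketch in Python (not the R used in the book). It assumes a normal prior centered on 175 cm, simulated height data, and a known spread of 7 cm in individual heights; all of these numbers and the function name update_normal_prior are my own illustrative assumptions, not something from the book.

```python
import math
import random


def update_normal_prior(prior_mean, prior_sd, data, data_sd):
    """Normal-normal conjugate update for an unknown mean.

    Assumes the spread of individual observations (data_sd) is known.
    Returns the mean and standard deviation of the updated belief.
    """
    n = len(data)
    sample_mean = sum(data) / n
    prior_precision = 1.0 / prior_sd ** 2   # how confident the prior is
    data_precision = n / data_sd ** 2       # how informative the data is
    post_precision = prior_precision + data_precision
    post_mean = (prior_precision * prior_mean
                 + data_precision * sample_mean) / post_precision
    post_sd = math.sqrt(1.0 / post_precision)
    return post_mean, post_sd


# Simulated (made-up) sample of 200 heights centered on 155 cm.
rng = random.Random(42)
heights = [rng.gauss(155, 7) for _ in range(200)]

# Prior belief: roughly 170-180 cm, encoded as mean 175 and sd 2.5 (assumption).
post_mean, post_sd = update_normal_prior(175.0, 2.5, heights, 7.0)

# Approximate 95% credible interval for the average height.
low, high = post_mean - 1.96 * post_sd, post_mean + 1.96 * post_sd
print(f"Updated belief: {post_mean:.1f} cm (95% interval {low:.1f} to {high:.1f})")
```

The updated mean lands between the prior and the sample average, and with 200 observations the data pulls it most of the way. A vaguer prior (a larger prior_sd) would be overruled even faster.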

On the other hand, Frequentists depend on long-run frequency: “If we repeated the sampling many times, 80% of the confidence intervals we calculate would contain the true average. For this particular sample, the confidence interval is 150–160 cm. We don’t know whether this particular interval contains the true mean; we only know that the procedure captures it 80% of the time.”
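
The long-run claim can be checked by simulation, because in a simulation we get to pick the true average ourselves. The sketch below is in Python and assumes a true mean of 155 cm, a known spread of 7 cm, and samples of 50 people; these values and the function name coverage are hypothetical choices for illustration.

```python
import math
import random
from statistics import NormalDist, mean


def coverage(true_mean, sd, n, level, repeats, seed=0):
    """Fraction of repeated samples whose confidence interval contains true_mean.

    Uses the known-sd (z) interval: sample mean +/- z * sd / sqrt(n).
    """
    rng = random.Random(seed)
    z = NormalDist().inv_cdf(0.5 + level / 2)  # about 1.28 for an 80% interval
    half_width = z * sd / math.sqrt(n)
    hits = 0
    for _ in range(repeats):
        sample = [rng.gauss(true_mean, sd) for _ in range(n)]
        m = mean(sample)
        if m - half_width <= true_mean <= m + half_width:
            hits += 1
    return hits / repeats


# Repeat the "sample 50 people, build an 80% interval" procedure 2,000 times.
rate = coverage(155.0, 7.0, 50, 0.80, 2000)
print(f"Share of intervals containing the true mean: {rate:.3f}")
```

The printed share should sit close to 0.80. Note what this does and does not say: it describes the procedure over many repetitions, not the probability that any single interval contains the true mean.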

I am a simple man with a simple mind, and 95% of the time—no pun intended—I prefer the Bayesian method because it’s more intuitive and straightforward for decision-making.

Bayesian: "I am uncertain about it, but I can update my belief"

Frequentist: "If I imagine sampling again and again, my procedure will capture the truth a known share of the time"

It taught me the importance of causality

Causality is very important in inferential statistics because data alone does not explain reality. You have to determine causality first (fortunately, we usually do this unconsciously) before drawing conclusions. For example, the correlation between ice cream sales and sunburn is very high. If you’re not careful, you might say ice cream sales cause sunburn, which is obviously wrong. The underlying cause is that hot weather increases both ice cream sales and sunburn cases.
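
The ice cream example can be reproduced with simulated data. In this Python sketch, temperature drives both ice cream sales and sunburn cases, and neither affects the other. All coefficients, noise levels, and helper names are made up for illustration. Adjusting for temperature here means correlating what is left of each variable after subtracting a straight-line fit on temperature.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5


def residuals(ys, xs):
    """What remains of ys after removing a least-squares line on xs."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return [y - (my + slope * (x - mx)) for x, y in zip(xs, ys)]


rng = random.Random(1)
temperature = [rng.uniform(20, 38) for _ in range(1000)]      # the common cause
ice_cream = [10 * t + rng.gauss(0, 20) for t in temperature]  # depends on heat only
sunburn = [2 * t + rng.gauss(0, 5) for t in temperature]      # depends on heat only

raw = pearson(ice_cream, sunburn)
adjusted = pearson(residuals(ice_cream, temperature),
                   residuals(sunburn, temperature))
print(f"Raw correlation: {raw:.2f}")
print(f"After adjusting for temperature: {adjusted:.2f}")
```

The raw correlation is strong even though ice cream never touches sunburn in the simulation, and it mostly disappears once temperature is accounted for. The data alone cannot tell you to adjust for temperature; that decision comes from a causal assumption about how the variables relate.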

There are many subtler examples where the distinction between correlation and causation is not obvious, and it’s easy to fall into the trap of analyzing data incorrectly.

It taught me how to do real statistics in a real programming language

The book contains many examples and practice problems that you solve using a real programming language. It uses R together with the author’s rethinking package, a Bayesian modeling library built on Stan. I feel fortunate to have learned from this book before AI-assisted coding became dominant, because I learned the “old-school” way, which is slower but effective.

In conclusion, I’m glad I decided to read and learn from this book. Although it caused months of struggle and agony, I emerged with a slightly better understanding of statistics.