Relearning Statistics in my 30s: What I learned from Statistical Rethinking

Statistical Rethinking by Richard McElreath might be the best academic book I've ever read. That is partly because I've never read an academic book purely out of interest before, but this one really is different. It has good storytelling, persuasive arguments for adopting a Bayesian perspective, and it actually teaches you how to do statistics. I'd like to share here why this book is such a good read.
It taught me that statistics does make sense
The most common use of statistics is to gain knowledge about a population from a limited sample. We use a sample because it's usually too expensive to collect data from every member of a population. We then use the sample data and the mathematical tools of inferential statistics to make inferences about the population. In other words, we try to generalize what we learn from the sample to the overall population (summarizing the sample data itself is usually called descriptive statistics).
There are two schools of thought, or rather belief systems, in the inference process: Frequentist and Bayesian. They differ in how they view the unknown. Suppose we want to know the average commuting time of Jakarta residents and sample 1,000 people. A Bayesian would say: "After seeing the data, I am 95% sure the average commuting time is between 135 and 145 minutes a day." A Frequentist would say: "After seeing the data, I know that if I repeated this sampling many, many times and calculated a 95% confidence interval each time, 95% of those intervals would contain the true average commuting time. The interval from this particular sample is 135 to 145 minutes a day." See the difference? The Bayesian statement is about the unknown average itself, while the Frequentist statement is about how often the procedure captures it.
Bayesians are comfortable assigning uncertainty to a particular variable. For example: "I am 80% sure the average Indonesian height is between 150 and 160 cm." Here, you are uncertain about a specific variable: Indonesian average height. This is intuitive because we do it all the time. Examples:
- I am fairly sure it will rain tomorrow
- I am pretty sure I will be 10 minutes late to the office
- I am not sure whether I will pass this exam
However, the Bayesian method requires a prior belief to work. This prior belief is then updated with collected data. To expand on the previous example: "Previously, I believed the average Indonesian height was between 170 and 180 cm. After seeing the data, my updated belief is that it's between 150 and 160 cm." Here, you have a prior belief, collect data, and now have an updated belief.
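To make the prior-to-posterior step concrete, here is a minimal sketch in Python (not the book's R code). It assumes a normal prior on the average height, a known spread of individual heights, and simulated data; the function name, the 7 cm spread, and the sample size are all illustrative assumptions, not values from the book.

```python
import math
import random
import statistics

def update_normal_mean(prior_mean, prior_sd, data, data_sd):
    """Conjugate normal update for an unknown mean, assuming data_sd is known."""
    n = len(data)
    prior_precision = 1.0 / prior_sd ** 2
    data_precision = n / data_sd ** 2
    post_precision = prior_precision + data_precision
    # The posterior mean is a precision-weighted average of prior and sample mean.
    post_mean = (prior_precision * prior_mean
                 + data_precision * statistics.fmean(data)) / post_precision
    post_sd = math.sqrt(1.0 / post_precision)
    return post_mean, post_sd

# Prior belief: "80% sure the average is between 170 and 180 cm".
# For a normal prior, that is mean 175 with sd = 5 / z, z being the 90th percentile.
prior_mean = 175.0
prior_sd = 5.0 / statistics.NormalDist().inv_cdf(0.9)

# Simulated heights (hypothetical data, for illustration only).
random.seed(1)
heights = [random.gauss(155, 7) for _ in range(200)]

post_mean, post_sd = update_normal_mean(prior_mean, prior_sd, heights, data_sd=7)

# Central 80% credible interval of the updated belief.
posterior = statistics.NormalDist(post_mean, post_sd)
low, high = posterior.inv_cdf(0.1), posterior.inv_cdf(0.9)
print(f"Updated belief: 80% sure the average is between {low:.1f} and {high:.1f} cm")
```

With 200 observations the data dominate the prior, so the updated belief ends up close to the sample mean even though the prior started 20 cm away.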
On the other hand, Frequentists depend on long-run frequency: "If we repeated the sample many times, 80% of the confidence intervals we calculate would contain the true average. For this particular sample, the confidence interval is 150 to 160 cm. We don't know if this interval contains the true mean, but if we repeated the sampling many times, 80% of such intervals would contain it."
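The long-run claim can be checked by simulation. The sketch below, in Python, assumes we know the true mean (something we never know in practice), draws many samples, builds an 80% interval from each using a normal approximation, and counts how often the interval contains the truth. The function name and all parameter values are illustrative assumptions.

```python
import random
import statistics

def coverage(true_mean, true_sd, n, level, repeats, seed=0):
    """Fraction of repeated-sample confidence intervals that contain true_mean."""
    rng = random.Random(seed)
    # Two-sided critical value under a normal approximation.
    z = statistics.NormalDist().inv_cdf(0.5 + level / 2)
    hits = 0
    for _ in range(repeats):
        sample = [rng.gauss(true_mean, true_sd) for _ in range(n)]
        mean = statistics.fmean(sample)
        half_width = z * statistics.stdev(sample) / n ** 0.5
        if mean - half_width <= true_mean <= mean + half_width:
            hits += 1
    return hits / repeats

rate = coverage(true_mean=155, true_sd=7, n=100, level=0.80, repeats=2000)
print(f"Share of intervals containing the true mean: {rate:.3f}")
```

The share should land near 0.80. Each individual interval either contains the true mean or it does not; the 80% describes the procedure, not any single interval.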
I am a simple man with a simple mind, and 95% of the time (no pun intended) I prefer the Bayesian method because it's more intuitive and straightforward for decision-making.
Bayesian: "I am uncertain about it, but I can update my belief"
Frequentist: "If I sampled again and again, my procedure would capture the truth at a known rate"
It taught me the importance of causality
Causality is very important in inferential statistics because data alone does not explain reality. You need a causal story first (fortunately, we usually build one unconsciously) before drawing conclusions. For example, the correlation between ice cream sales and sunburn is very high. If you're not careful, you might say ice cream sales cause sunburn, which is obviously wrong. The underlying cause is that hot weather increases both ice cream sales and sunburn cases.
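A small simulation shows how a common cause produces this kind of correlation. The sketch below is in Python with made-up numbers: temperature drives both ice cream sales and sunburn cases, and neither affects the other. The helper functions and every coefficient are illustrative assumptions.

```python
import random
import statistics

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5

def residuals(y, x):
    """What is left of y after removing its straight-line dependence on x."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    slope = (sum((a - mx) * (b - my) for a, b in zip(x, y))
             / sum((a - mx) ** 2 for a in x))
    return [(b - my) - slope * (a - mx) for a, b in zip(x, y)]

rng = random.Random(42)
# Hot weather is the common cause; the two outcomes never influence each other.
temperature = [rng.uniform(20, 38) for _ in range(1000)]
ice_cream = [10 * t + rng.gauss(0, 20) for t in temperature]
sunburn = [2 * t + rng.gauss(0, 4) for t in temperature]

raw = pearson(ice_cream, sunburn)
adjusted = pearson(residuals(ice_cream, temperature),
                   residuals(sunburn, temperature))
print(f"Raw correlation: {raw:.2f}")
print(f"Correlation after accounting for temperature: {adjusted:.2f}")
```

The raw correlation is strong, but once temperature is accounted for it should drop to roughly zero. Knowing which variable to adjust for comes from the causal story, not from the data.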
There are many subtler examples where the distinction between correlation and causation is not obvious, and it's easy to fall into the trap of analyzing data incorrectly.
It taught me how to do real statistics in a real programming language
The book contains many examples and practice problems that you solve using a real programming language. It uses R and a Bayesian modeling library (the rethinking package that accompanies the book). I feel fortunate to have learned from this book before AI-assisted coding became dominant, because I learned the "old-school" way, which is slower but effective.
In conclusion, I'm glad I decided to read and learn from this book. Although it caused months of struggle and agony, I emerged with a slightly better understanding of statistics.