(notes on) ‘Naked Statistics’ by Charles Wheelan

Chris Stoneman
8 min read · Mar 19, 2021

NB. These notes contribute towards my diyMBA, my attempt to learn lots without a formal MBA programme. Check out my full reading list, and why I’m doing it DIY.

In summary, this is a great intro to the basics of statistics. Many of us work with statistics every day and, despite never formally studying statistics, can think we know more than we do… “We can interpret the data, what more do we need?” This book offers a foundational understanding of how different types of analysis are built, why they’re useful, and how they can be misused.

It’s full of fascinating stories and real-life applications that keep the reader intrigued, giving us energy to dive deeper into the maths for true understanding. Thoroughly recommended… Buy the book!

TLDR

Statistics won’t offer full answers, but they will help us give more informed answers. Key areas to understand include:

  • Descriptive statistics (averages, standard deviation, correlation coefficients) and the underlying principles of normal distribution curves, indexes, and probability.
  • The importance of robust data and sampling (central limit theorem, common biases, inference) and techniques used to ensure this robustness.
  • Defining relationships between multiple variables (regression analysis) and recognising / avoiding common mistakes.
  • How to use experiments to evaluate the real-life impact of specific variables.

My Notes

What’s the Point?

“The point is to learn things that inform our lives.”

Statistics won’t offer full answers, but they will help us give more informed answers. They allow us to:

  • Summarise data and recognise patterns
  • Make better decisions
  • Answer important questions
  • Evaluate the effectiveness of our decisions, policies and programmes
  • …and importantly, identify when statistics are being used to mislead or misinform.

Descriptive Statistics

… organise complex info into a single number, eg a batting average.
For example, we can measure:

  • the average of a set of numbers using the mean, median or mode. Each statistic tells us something different, so be wary.
  • how spread out data is from its mean, using standard deviation (roughly, how far a typical data point sits from the mean; the square root of the variance) or variance (the average of the squared distances of all points from the mean).
  • how a data point is changing, using the percentage difference:
    (new figure − original figure) / original figure
    The numerator (on top) is the size of the change in absolute terms; the denominator (on bottom) puts this change in context by comparing it with our starting point.
    (I’ve done this a gazillion times and love now properly understanding the mathematical reasoning. See the sketch after this list.)
  • how two data sets are related to one another using the correlation coefficient, ranging from -1 (perfect negative correlation) to +1 (perfect positive correlation). 0 means no correlation.
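To make the definitions above concrete, here’s a minimal Python sketch of each statistic using the standard library (the numbers are invented; statistics.correlation needs Python 3.10+):

```python
import statistics

data = [2, 4, 4, 4, 5, 5, 7, 9]  # invented data

print(statistics.mean(data))       # 5.0, the arithmetic average
print(statistics.median(data))     # 4.5, the middle value
print(statistics.mode(data))       # 4, the most common value
print(statistics.pvariance(data))  # 4.0, average squared distance from the mean
print(statistics.pstdev(data))     # 2.0, square root of the variance

# Percentage difference: (new figure - original figure) / original figure
original, new = 80, 100
print((new - original) / original)  # 0.25, i.e. a 25% increase

# Correlation coefficient, from -1 to +1 (Python 3.10+)
heights = [1, 2, 3, 4, 5]
weights = [2, 4, 6, 8, 10]
print(statistics.correlation(heights, weights))  # 1.0, perfect positive correlation
```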

A normal distribution curve tells us exactly what proportion of the observations in a normal distribution lie within one standard deviation of the mean (roughly 68%), within two SDs (roughly 95%), within three SDs (roughly 99.7%), and so on.
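A quick simulation (my own sketch, not from the book) confirms those proportions:

```python
import random

draws = [random.gauss(0, 1) for _ in range(100_000)]  # standard normal draws

for k in (1, 2, 3):
    share = sum(abs(x) <= k for x in draws) / len(draws)
    print(f"within {k} SD: {share:.3f}")  # ~0.683, ~0.954, ~0.997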

An index is a combination of several statistics, giving us one number to add insight to a situation. But remember, any index is sensitive to the statistics used and their individual weighting.
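A toy illustration of that sensitivity (the component statistics and weights are entirely made up): change the weights and the index changes.

```python
scores = {"health": 0.7, "wealth": 0.9, "education": 0.5}  # hypothetical components

def index(weights):
    # Weighted combination of several statistics into a single number
    return sum(scores[k] * w for k, w in weights.items())

print(index({"health": 0.5, "wealth": 0.3, "education": 0.2}))  # 0.72
print(index({"health": 0.2, "wealth": 0.3, "education": 0.5}))  # 0.66, same data, different story
```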

Beware!! Statistics can be misleading, for example:

  • Accuracy is a measure of whether a figure is broadly consistent with the truth. Precision is a measure of how exact a figure is.
    If an answer is accurate, then more precision is usually better. But no amount of precision can make up for inaccuracy.
  • We could say that Globalization has increased income inequality (rich countries are richer while poor countries remain poor). But really we don’t care about countries, we care about people. Measuring people shows us global inequality is falling rapidly.
  • Correlation does not imply causation!

Basic Probability

How do insurance companies work?

Larger sample sizes = greater accuracy.
The law of large numbers states that as the number of trials increases, the average of the outcomes will get closer and closer to the expected value. (Closely related: regression towards the mean.)

So insurers know that… “Yes, there will be payouts, but over the long run what comes in will be more than what goes out.”
Casinos and Lotteries work in the same way.
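A coin-flip sketch of the law of large numbers (my illustration, not the book’s): the running share of heads drifts toward the expected 0.5 as the number of flips grows.

```python
import random

flips = [random.random() < 0.5 for _ in range(1_000_000)]  # True = heads

for n in (100, 10_000, 1_000_000):
    print(n, sum(flips[:n]) / n)  # approaches 0.5 as n grows
```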

A decision tree maps out each course of action, showing the statistical probability at each step. This leads to Expected Value: the sum over all the different outcomes of each payoff weighted by its probability.
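For example, a sketch of the expected value of a $2 lottery ticket (the odds and prize are invented):

```python
ticket_price = 2
outcomes = [
    (1 / 1_000_000, 500_000 - ticket_price),  # win the jackpot, net of the ticket
    (999_999 / 1_000_000, -ticket_price),     # lose: out the ticket price
]

# Expected value: each payoff weighted by its probability, then summed
ev = sum(p * payoff for p, payoff in outcomes)
print(ev)  # -1.5: on average a ticket loses $1.50, which is how lotteries profit
```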

The Monty Hall problem. Famous and awesome and properly helped me understand the details of probability. Just watch the video.
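If you prefer code to video, a Monte Carlo simulation (my own sketch) shows that switching doors wins about two thirds of the time:

```python
import random

def play(switch, trials=100_000):
    wins = 0
    for _ in range(trials):
        car = random.randrange(3)     # the car is behind one of three doors
        choice = random.randrange(3)  # contestant picks a door at random
        # Monty opens a door that is neither the contestant's pick nor the car
        opened = next(d for d in range(3) if d != choice and d != car)
        if switch:
            # Switch to the one remaining closed door
            choice = next(d for d in range(3) if d != choice and d != opened)
        wins += (choice == car)
    return wins / trials

print(play(switch=False))  # ~0.333
print(play(switch=True))   # ~0.667
```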

Common mistakes when using probability

  • Assuming events are independent when they’re not
  • Not understanding when events are independent (eg Gambler’s fallacy, Hot Hand fallacy)
  • Clusters Happen. Unlikely events are likely to happen eventually.
  • The Prosecutor’s Fallacy. “The chances of finding a coincidental one in a million match are high if you run the sample through a database with samples from a million people”
  • Regression to the mean (related to the law of large numbers). eg no mutual fund manager outperforms the whole stock market for long
  • Statistical Discrimination. Using statistics to discriminate against someone on the basis of group-level patterns rather than individually relevant factors is both mistaken and wrong.

The Importance of Data

Sampling involves analysing a representative subset to draw conclusions on a wider data set. It opens the door to powerful insights, but getting a good sample is hard, eg Polling.

Central Limit Theorem: a large, properly drawn sample will resemble the population from which it is drawn. The sample means will be roughly distributed as a normal distribution around the population mean.
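A quick simulation of the theorem (my sketch; the population and sample sizes are invented): even for a heavily skewed population, the means of repeated samples centre on the population mean.

```python
import random
import statistics

# A skewed (exponential) population with mean ~1; nothing normal about it
population = [random.expovariate(1.0) for _ in range(100_000)]

# Take 1,000 properly drawn samples of 100 and record each sample's mean
sample_means = [statistics.mean(random.sample(population, 100)) for _ in range(1_000)]

print(statistics.mean(population))     # ~1.0, the population mean
print(statistics.mean(sample_means))   # ~1.0, the sample means cluster around it
print(statistics.stdev(sample_means))  # ~0.1, i.e. roughly sigma / sqrt(n)
```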

Garbage in / Garbage out. Some common data mistakes include:

  • Selection bias. When a sample group is improperly selected for analysis, causing it not to be representative of the broader population.
  • Publication bias. When the outcome of an experiment influences the decision whether to publish it. eg “x does not cause cancer” is not interesting and is not published, but “x does cause cancer” is interesting and is published.
  • Recall bias. Caused by study participants not recalling past experiences accurately.
  • Survivorship bias. Error of focusing on the people / things that passed some selection process whilst overlooking those that did not, typically because of their lack of visibility.
  • Healthy user bias. A bias that damages the validity of health studies which test the effectiveness of particular therapies or interventions. eg “People who take vitamins regularly are likely to be healthy, because they are the kind of people who take vitamins regularly”

Inference

Statistics cannot prove anything with certainty. Instead, “the power of statistical inference derives from observing some pattern or outcome and then using probability to determine the most likely explanation for that outcome”.

So instead of confirming a null hypothesis (proving it with certainty), we either reject the null hypothesis or fail to reject it.

For example, the null hypothesis explains the legal principle of being “innocent until proven guilty”. A suspect is assumed to be innocent (null is not rejected) until proven guilty (null is rejected) beyond a reasonable doubt (to a statistically significant degree).

“We find the defendant guilty” or “we find the defendant not guilty”, but never “we find the defendant innocent”.

The idea behind this statement is that the court never proves anyone completely and entirely innocent. Similarly, in statistics, we conclude that there is evidence supporting a hypothesis, or that there is a lack of evidence supporting it, but we never claim categorically that the evidence is definitive.
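A minimal reject / fail-to-reject sketch, assuming scipy is available (the two groups of measurements are invented):

```python
from scipy import stats

group_a = [5.1, 4.9, 5.3, 5.0, 5.2, 4.8, 5.1]  # hypothetical measurements
group_b = [5.6, 5.8, 5.5, 5.9, 5.7, 5.6, 5.8]

# Null hypothesis: the two groups share the same mean
t_stat, p_value = stats.ttest_ind(group_a, group_b)

if p_value < 0.05:  # the conventional significance threshold
    print(f"Reject the null hypothesis (p = {p_value:.3g})")
else:
    print(f"Fail to reject the null hypothesis (p = {p_value:.3g})")
# Note we never print "accept the null", mirroring "not guilty" vs "innocent"
```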

A type I error involves wrongly rejecting a true null hypothesis (false positive)

A type II error involves failing to reject a false null hypothesis (false negative)

What kind of error is worse? It depends… For spam filters, it is better to permit some type II errors (letting a few spam emails into one’s inbox) than to risk type I errors (discarding important emails as spam).

For cancer screenings, type I errors are preferable (it is better to receive a false positive and later find out that you are healthy than to miss a potentially fatal diagnosis).

For capturing terrorists, there is no consensus about whether type I or type II errors are preferable, hence the broad public disagreement.

Regression Analysis

…estimates the relationships between a dependent variable (the outcome variable) and one or more independent variables (predictors).

Allows us to unravel complex relationships in which multiple factors affect an outcome that we care about, such as income, test scores, or heart disease.

For example, individuals with higher levels of education tend to earn more. But an additional variable, eg inherent ability, will influence education and earnings. Therefore, an individual’s higher earnings would not only be attributable to their education levels but also to their inherent ability.

For any regression coefficient, you will generally be interested in three things (illustrated in the sketch after this list):

  • Sign. A positive or negative coefficient for an independent variable tells us the direction of its association with the dependent variable (the outcome we are trying to explain).
  • Size. How big is the observed effect between the independent variable and the dependent variable? Is it of a magnitude that matters?
  • Significance. Is the observed result an aberration based on a quirky sample of data, or does it reflect a meaningful association that is likely to be observed for the population as a whole?
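A sketch of reading off sign, size and significance, assuming statsmodels is available (the data are synthetic: education in years, earnings in $1,000s, with a true slope of 3):

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
education = rng.uniform(10, 20, size=200)                   # years of education
earnings = 10 + 3 * education + rng.normal(0, 5, size=200)  # true effect = 3

X = sm.add_constant(education)  # add the intercept term
model = sm.OLS(earnings, X).fit()

coef = model.params[1]      # sign: positive; size: ~3, i.e. ~$3k per extra year
p_value = model.pvalues[1]  # significance: tiny here, so unlikely to be a quirk of the sample
print(f"coefficient = {coef:.2f}, p-value = {p_value:.3g}")
```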

Common regression analysis mistakes

  • Using regression to analyse non-linear relationships
  • Correlation does not equal causation
  • Reverse causality (causality can go both ways)
  • Omitted variable bias, eg predicting earnings using age and education but not gender (see the sketch after this list)
  • Highly correlated explanatory variables (multicollinearity), eg IQ and education, making it difficult to isolate the effect of any single variable.
  • Extrapolating beyond the data (results only apply to the population from which sample was drawn).
  • Data mining (using too many variables… eventually one of them is bound to be significant by chance)
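As an illustration of the omitted variable point above, a sketch on synthetic data (the variable names and numbers are mine, and statsmodels is assumed): leave gender out of the model and the education coefficient absorbs part of its effect.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(1)
n = 1_000
gender = rng.integers(0, 2, n)                     # 0/1 indicator
education = 12 + 2 * gender + rng.normal(0, 1, n)  # correlated with gender
earnings = 20 + 1 * education + 10 * gender + rng.normal(0, 2, n)  # true slopes: 1 and 10

short = sm.OLS(earnings, sm.add_constant(education)).fit()
full = sm.OLS(earnings, sm.add_constant(np.column_stack([education, gender]))).fit()

print(short.params[1])  # ~3.5: biased well above the true 1
print(full.params[1])   # ~1.0: close to the true effect once gender is included
```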

Program Evaluation

Researchers use experiments to isolate a treatment effect. Common approaches include:

  • Randomised, controlled experiments. Set up two separate groups, a control and a treatment group, keep all factors constant apart from the treatment, and compare results.
  • Natural experiment. Finding natural situations that are consistent except for the treatment.
  • Non-equivalent control. Also involves a treatment and a control group, but the groups are not randomly assigned, so the comparison has the potential to be biased.
  • Difference in differences. Compare the change in the treatment group before and after an intervention with the change in a similar comparison group over the same period. eg juvenile kids’ criminal behaviour declined after visiting adult prisons (scare tactics to deter them from crime), yet similar kids who did not go through the programme showed even greater declines, so relative to the comparison group the programme actually hurt (see the sketch after this list).
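The arithmetic behind a difference in differences, with invented numbers echoing the scare-tactics example:

```python
# Offences per 100 kids, before and after the programme (numbers invented)
treatment_before, treatment_after = 40, 30  # kids who attended
control_before, control_after = 40, 20      # similar kids who did not

treatment_change = treatment_after - treatment_before  # -10
control_change = control_after - control_before        # -20

# Difference in differences: treatment change relative to the control change
effect = treatment_change - control_change
print(effect)  # +10: compared with similar kids, the programme increased offending
```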

Conclusion: Questions that statistics help answer

Here’s one piece of research I found truly creative and inspiring…

Who is more responsible when it comes to handling the family’s finances, men or women? This is a very important question for developing countries.

Anecdotal evidence suggests women place a high priority on the health and welfare of their children, while men are more inclined to drink up their wages at the local pub. At worst, this anecdotal evidence reinforces stereotypes. At best, it’s a hard thing to prove. How can we separate out how husbands and wives choose to spend communal resources?

French economist Esther Duflo found statistical insights using a natural experiment, with rainfall providing the random variation…

In Côte d’Ivoire, women and men in a family typically share responsibility for some crops and individually manage other crops (Men grow cocoa, coffee; women grow plantains, coconuts). The beauty of this arrangement from a research standpoint is that the men’s crops and the women’s crops respond to rainfall patterns in different ways. In years in which cocoa and coffee do well, men have more disposable income to spend. In years in which plantains and coconuts do well, the women have more extra cash.

Now we need merely broach a delicate question: Are the children in these families better off in years in which the men’s crops do well, or in the years when the women’s crops do well?

The answer: When the women do well, they spend some of their extra cash on more food for the family. The men don’t.
(like this story? share the tweet thread)

The End! Buy the book and follow Charles Wheelan on Twitter…

…and see my full reading list with notes on other awesome books.
