Seven Statistical Sins

Inspired by an article, I decided to compile a list of the seven statistical sins. Statistics is a vital tool for understanding the patterns in the world around us; however, our intuition often lets us down when it comes to interpreting these patterns.

1. Assuming small differences are meaningful

Examples include small fluctuations in the stock market, or polls where one party is ahead by a point or two. These often represent chance rather than anything meaningful.

To avoid drawing false conclusions from this statistical noise, we must consider the margin of error attached to the numbers. If the difference is smaller than the margin of error, there is likely no meaningful difference, and the variation is probably due to random fluctuations.
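As a rough illustration, here is a minimal sketch of that check for a poll. The poll figures are made up, and the formula is the usual normal approximation for a single polled share at roughly 95% confidence (the margin on the lead itself would be larger still).

```python
import math

def margin_of_error(share, sample_size, z=1.96):
    """Approximate 95% margin of error for one polled share (normal approximation)."""
    return z * math.sqrt(share * (1 - share) / sample_size)

# Hypothetical poll: 1,000 respondents, two parties on 46% and 45%.
n = 1000
party_a, party_b = 0.46, 0.45

moe = margin_of_error(party_a, n)
lead = party_a - party_b

print(f"margin of error: {moe:.1%}, lead: {lead:.1%}")
if lead < moe:
    print("The lead is smaller than the margin of error, so it may just be noise.")
else:
    print("The lead is larger than the margin of error.")
```

With these made-up numbers the one-point lead sits well inside a margin of about three points.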

2. Equating statistical significance to real-world significance

Statistical generalisations about groups may not hold for individuals. For example, stereotypically women are more nurturing while men are physically stronger. However, given a pile of data, if you were to pick two men at random there is likely to be quite a lot of difference in their physical strength; and if you pick one man and one woman, they may end up being very similar in terms of nurturing, or the man may be more nurturing than the woman.

This error can be avoided by analysing the effect size of the differences between groups, which is a measure of how the average of one group differs from the average of another. If the effect size is small, the two groups are very similar. Even if the effect size is large, each group will still have a lot of variation, so not all members of one group will be different from all members of the other (hence giving rise to the error described above).
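One common way to put a number on effect size is Cohen's d (the difference in averages divided by the pooled standard deviation). The sketch below uses it as an assumption, since no particular measure is named here, and the two groups of scores are invented for illustration. It also estimates how often a random member of one group beats a random member of the other, assuming both groups are roughly normal with the same spread.

```python
import math
from statistics import NormalDist, mean, stdev

def cohens_d(group_a, group_b):
    """Difference in means divided by the pooled standard deviation."""
    na, nb = len(group_a), len(group_b)
    pooled_var = ((na - 1) * stdev(group_a) ** 2 + (nb - 1) * stdev(group_b) ** 2) / (na + nb - 2)
    return (mean(group_a) - mean(group_b)) / math.sqrt(pooled_var)

def probability_a_exceeds_b(d):
    """Chance a random member of A scores above a random member of B,
    assuming both groups are normal with equal spread."""
    return NormalDist().cdf(d / math.sqrt(2))

# Invented scores for two groups.
group_a = [52, 55, 61, 48, 58, 50, 63, 57]
group_b = [50, 54, 59, 47, 56, 49, 62, 55]

d = cohens_d(group_a, group_b)
overlap = probability_a_exceeds_b(d)
print(f"effect size d = {d:.2f}")
print(f"chance a random A beats a random B = {overlap:.0%}")
```

Here the averages differ, but a random member of the "higher" group only comes out ahead a little more often than a coin flip.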

3. Neglecting to look at the extremes

This is relevant when looking at normal distributions.



In these cases, a small change in the group’s performance has almost no effect on the average person, but it changes the character of the extremes much more drastically. To avoid this error, we have to ask whether we’re dealing with the extreme cases or not. If we are, these small differences can radically affect the data.
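To see how much the extremes can move, here is a small sketch with invented numbers: a normal distribution whose average shifts by just a tenth of a standard deviation, and a cut-off three standard deviations above the original average.

```python
from statistics import NormalDist

threshold = 3.0                           # cut-off for an "extreme" score (assumed)
before = NormalDist(mu=0.0, sigma=1.0)    # original group
after = NormalDist(mu=0.1, sigma=1.0)     # same group after a small shift in the average

tail_before = 1 - before.cdf(threshold)
tail_after = 1 - after.cdf(threshold)
ratio = tail_after / tail_before

print(f"share above the cut-off before: {tail_before:.4%}")
print(f"share above the cut-off after:  {tail_after:.4%}")
print(f"the extreme group grew by a factor of {ratio:.2f}")
```

The average barely changes, yet the share of people past the cut-off grows by more than a third.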

4. Trusting coincidence

If we look hard enough, we can find patterns and correlations between the strangest things, which may be merely due to coincidence. So, when analysing data we have to ask ourselves how reliable the observed association is. Is it a one-off? Can future associations be predicted? If it has only been seen once, then it is probably only due to chance.
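A quick back-of-the-envelope sketch shows why looking hard enough almost guarantees a find. It assumes each comparison is independent and has a 5% chance of looking "significant" purely by chance; both of those are simplifying assumptions.

```python
def chance_of_false_pattern(comparisons, alpha=0.05):
    """Probability that at least one of several independent comparisons
    looks significant purely by chance."""
    return 1 - (1 - alpha) ** comparisons

for k in (1, 5, 20, 100):
    print(f"{k:>3} comparisons: {chance_of_false_pattern(k):.1%} chance of a spurious pattern")
```

With twenty comparisons the chance of at least one coincidental "pattern" is already close to two in three.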

5. Getting causation backwards

When we find a correlation between two things, for example unemployment and mental health, it may be tempting to see a causal path in one direction: mental health problems lead to unemployment. However, sometimes the causal path goes in the other direction: unemployment leads to mental health problems.

To get the direction of the causal path correct, think about reverse causality when you see an association. Could it go in the other direction? Could it even go both ways (called a feedback loop)?

6. Forgetting outside causes

Failing to consider a third factor that creates an association between two things may lead to an incorrect conclusion. For example, there may be an association between eating at restaurants and high cardiovascular strength. However, this may be because those who can afford to eat at restaurants regularly are in a high socioeconomic bracket, which in turn means they can also afford better health care.

Therefore, it is crucial to think about possible third factors when you observe a correlation.
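Here is a small simulation of that idea, with entirely made-up numbers: wealth drives both restaurant visits and heart health, and the two have no direct link to each other. The variable names and the strength of the effects are assumptions chosen just for illustration.

```python
import random

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

rng = random.Random(0)  # fixed seed so the run is repeatable
people = []
for _ in range(20000):
    wealthy = 1 if rng.random() < 0.5 else 0
    restaurant_visits = wealthy + rng.gauss(0, 0.5)  # depends on wealth only
    heart_health = wealthy + rng.gauss(0, 0.5)       # depends on wealth only
    people.append((wealthy, restaurant_visits, heart_health))

overall = pearson([p[1] for p in people], [p[2] for p in people])
wealthy_only = [p for p in people if p[0] == 1]
within = pearson([p[1] for p in wealthy_only], [p[2] for p in wealthy_only])

print(f"correlation across everyone:        {overall:.2f}")
print(f"correlation among the wealthy only: {within:.2f}")
```

Across everyone the two look clearly related, but once we only compare people with the same wealth the association all but disappears.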

7. Deceptive graphs

A lot of deception can arise from the way the axes (specifically the vertical axis) are labelled on graphs. The labels should show a meaningful range for the data given. For example, choosing a narrower range makes a small difference look more impactful (and vice versa).
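A tiny sketch of the effect for a bar chart, using two invented values: the drawn height of each bar is its value minus wherever the vertical axis starts, so moving the start of the axis changes how different the bars look.

```python
def apparent_ratio(smaller, larger, axis_start):
    """How many times taller the larger bar is drawn, given where the axis starts."""
    return (larger - axis_start) / (smaller - axis_start)

# Two invented values that differ by 4%.
a, b = 50, 52

honest = apparent_ratio(a, b, axis_start=0)
truncated = apparent_ratio(a, b, axis_start=49)

print(f"axis from 0:  second bar looks {honest:.2f} times as tall")
print(f"axis from 49: second bar looks {truncated:.2f} times as tall")
```

The same 4% difference is drawn as a bar three times as tall once the axis starts at 49.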


In fact, check out this blog filled with bad graphs.

M x


e is irrational

Proving a number is irrational is mostly done by contradiction. So first suppose e is rational: e = p/q where p, q are coprime integers.

We know that q≥2 as e is not an integer (in fact, it’s in between 2 and 3). Then

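$$q!\,e = \sum_{k=0}^{q} \frac{q!}{k!} + \sum_{k=q+1}^{\infty} \frac{q!}{k!} = n + x$$

where n is the first (finite) sum and x is the second, using the series e = 1/0! + 1/1! + 1/2! + ⋯

Note that, as q!e and n are natural numbers, we must have that x is a natural number.

Writing out the terms of x,

$$x = \frac{1}{q+1} + \frac{1}{(q+1)(q+2)} + \frac{1}{(q+1)(q+2)(q+3)} + \cdots$$

And so we can bound x in the following way

$$0 < x < \frac{1}{q+1} + \frac{1}{(q+1)^2} + \frac{1}{(q+1)^3} + \cdots = \frac{1}{q} \le \frac{1}{2}$$

so x is not an integer. This is a contradiction since q!e must be a natural number, but it is a sum of an integer n plus a non-integer x. Hence, e is irrational.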
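As a sanity check on the bound (not part of the proof), here is a minimal sketch using exact fractions. It assumes x is the tail of the series for q!e, i.e. the sum of q!/k! over k > q, and only adds up the first 30 terms of that tail, so it slightly underestimates x.

```python
from fractions import Fraction
from math import factorial

def partial_tail(q, terms=30):
    """First `terms` terms of x = sum of q!/k! for k > q, as an exact fraction."""
    return sum(Fraction(factorial(q), factorial(k)) for k in range(q + 1, q + 1 + terms))

for q in range(2, 7):
    x = partial_tail(q)
    print(f"q = {q}: x is about {float(x):.6f}, bound 1/q = {float(Fraction(1, q)):.6f}")
```

For each q the value sits strictly between 0 and 1/q, so it cannot be a whole number.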

M x

NEWS: 13532385396179

Recently, James Davis found a counterexample to John H. Conway’s ‘Climb to a Prime’ conjecture, for which Conway was offering $1,000 for a solution.

The conjecture states the following:

“Let n be a positive integer. Write the prime factorisation in the usual way, where the primes are written in ascending order and exponents of 1 are omitted. Then bring the exponents down to the line, omit the multiplication signs, giving a number f(n). Now repeat.”

For example, f(60) = f(2^2 x 3 x 5) = 2235. As 2235 = 3 x 5 x 149, f(2235) = 35149. Since 35149 is prime, we stop there.

Davis had a feeling that the counterexample would be of the form

Screen Shot 2017-06-10 at 2.37.23 PM.png

where p is the largest prime factor of n. This motivated him to look for x of the form

Screen Shot 2017-06-10 at 2.38.05 PM.png

The number Davis found was 13532385396179 = 13 x 53^2 x 3853 x 96179, which maps to itself under f (i.e. it’s a fixed point). So, f will never map this composite number to a prime, hence disproving the conjecture.
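The map f is easy to try out. Below is a minimal sketch using plain trial division (my own choice, fine for numbers of this size but not how anyone would search for counterexamples), assuming n is at least 2. The function names are just for illustration.

```python
def prime_factorisation(n):
    """Return [(prime, exponent), ...] in ascending order, by trial division."""
    factors = []
    d = 2
    while d * d <= n:
        exponent = 0
        while n % d == 0:
            n //= d
            exponent += 1
        if exponent:
            factors.append((d, exponent))
        d += 1
    if n > 1:
        factors.append((n, 1))  # whatever is left over is prime
    return factors

def f(n):
    """Write out the factorisation, drop exponents of 1, and read the digits as one number."""
    digits = ""
    for prime, exponent in prime_factorisation(n):
        digits += str(prime)
        if exponent > 1:
            digits += str(exponent)
    return int(digits)

print(f(60))              # 2^2 x 3 x 5
print(f(2235))            # 3 x 5 x 149
print(f(13532385396179))  # Davis's number
```

Running it on 60 and 2235 reproduces the example above, and Davis's number comes back unchanged.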

M x

MATHS BITE: Apéry’s Constant

Apéry’s constant is defined as the number

$$\zeta(3) = \sum_{n=1}^{\infty} \frac{1}{n^3} = \lim_{n \to \infty} \left( \frac{1}{1^3} + \frac{1}{2^3} + \cdots + \frac{1}{n^3} \right)$$

where ζ is the Riemann Zeta Function.
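To get a feel for the number, here is a short sketch that simply adds up the first 10,000 terms of the sum. The cut-off is arbitrary, and the leftover tail is roughly 1/(2 x 10,000^2), so only the first eight or so decimal places can be trusted.

```python
import math

def zeta3_partial_sum(terms):
    """Sum of 1/n^3 for n = 1 .. terms."""
    return math.fsum(1 / n**3 for n in range(1, terms + 1))

approx = zeta3_partial_sum(10_000)
print(f"Apéry's constant is approximately {approx:.8f}")
```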

This constant is named after the French mathematician Roger Apéry, who proved that it was irrational in 1978. However, it is still unknown whether or not it is transcendental.


The Basel Problem asked for the exact value of the following sum:

$$\sum_{n=1}^{\infty} \frac{1}{n^2}$$

In the 18th century, Leonhard Euler proved that it converges to π^2/6. However, the limit of the following sum remained unknown:

$$\sum_{n=1}^{\infty} \frac{1}{n^3}$$

Although mathematicians made some progress, including Euler who calculated the first 16 decimal digits of the sum, it was not known whether the number was rational or irrational, until Apéry.

Furthermore, it is currently not known specifically whether any other particular ζ(n), for n odd, is irrational. “The best we’ve got is from Wadim Zudilin, in 2001, who showed that at least one of ζ(5), ζ(7), ζ(9), ζ(11) must be irrational, and Tanguy Rivoal, in 2000, who showed that infinitely many of the ζ(2k+1) must be irrational.”

M x

Where have I been?

So I have a confession to make… I have completely neglected my blog for the past month. Although it really upset me that I couldn’t upload regular (or in fact any) content, I have just been so busy with work that it was impossible. In these last few months I have been in complete exam mode, and although it’s been absolutely exhausting, it has also been extremely rewarding.

But now, I am finally finished and can concentrate on uploading more regularly. Thank you all so much for your patience, new blog posts coming soon!

M x

Proof Without Words #2

My last post seemed to go down well so I thought I’d compile a few more images of proofs without words!


Determinant is the area of a parallelogram, by Solomon Golomb, Mathematics Magazine, March 1985.



A visual proof of Jensen’s inequality, found on Wikipedia. Jensen’s inequality states that

$$\varphi(\mathbb{E}[X]) \le \mathbb{E}[\varphi(X)]$$

for any convex function φ (the mapping Y(X) in the diagram) and random variable X.

In the diagram, the dashed curve along the X axis is the hypothetical distribution of X, while the dashed curve along the Y axis is the corresponding distribution of Y values. Note that the convex mapping Y(X) increasingly “stretches” the distribution for increasing values of X.
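A quick numerical check of the inequality, using squaring as the convex mapping and four equally likely values of X. Both are my own choices, just for illustration.

```python
def mean(values):
    return sum(values) / len(values)

def square(x):
    """A convex mapping, standing in for Y(X)."""
    return x * x

xs = [1.0, 2.0, 3.0, 4.0]  # equally likely values of X (invented)

mapped_mean = square(mean(xs))              # Y applied to the average of X
mean_mapped = mean([square(x) for x in xs]) # average of Y(X)

print(f"Y(E[X]) = {mapped_mean}, E[Y(X)] = {mean_mapped}")
```

Squaring the average gives the smaller of the two numbers, as Jensen’s inequality says it should.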



Hope you enjoyed! M x

VIDEO: Spiral Sculptures

John Edmark is an artist and professor at Stanford University who has used the Golden Angle to sculpt spirals. The Golden Angle is derived from the Golden Ratio: it is the smaller of the two angles created by dividing the circumference of a circle according to the golden ratio and comes out to be around 137.5°.
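The 137.5° figure can be checked in a couple of lines. This sketch splits the full 360° into two arcs whose lengths are in the golden ratio and takes the smaller one.

```python
import math

golden_ratio = (1 + math.sqrt(5)) / 2

# Split the full circle so that larger arc / smaller arc equals the golden ratio.
larger_arc = 360 / golden_ratio
golden_angle = 360 - larger_arc

print(f"golden angle is approximately {golden_angle:.2f} degrees")
```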


M x