MATHS BITE: Wythoff’s Game

Wythoff’s game is a two-player mathematical strategy game. It is played with two piles of counters and goes as follows:

Players take turns to remove counters from one, or both piles (if they are removing stones from both piles, they must remove the same number of stones from each). The game ends when one person removes the last counter, or counters, thus winning.

It is name Wythoff’s game, as it was Dutch mathematician W. A. Wythoff that published a mathematical analysis of the game in 1907, although it was previously know in China as tsyan-shidzi (“choosing stones”).

Optimal Strategy?

Any position of the game can be described by a pair of integers (x,y) where x≤y. This describes the size of both piles in the position.

The strategy of the game involves cold and hot positions:

  • A cold position is one where the player whose turn it is to move will lose, even with his best move;
  • A hot position is one where the player whose turn it is to move will win with his best move.

Thus, the optimal strategy for a person in a hot position is to move to a cold position.

We can classify the positions as hot or cold recursively, using 3 simple rules:

  1. (0,0) is a cold position;
  2. Any position from which a cold position can be reached in a single move is a hot position;
  3. If every move leads to a hot position then the position is cold.

For example (taken from Wikipedia):

“all positions of the form (0, y) and (y, y) with y > 0 are hot, by rule 2. However, the position (1,2) is cold, because the only positions that can be reached from it, (0,1), (0,2), and (1,1), are all hot. The cold positions (x, y) with the smallest values of x and y are (0, 0), (1, 2), (3, 5), (4, 7),(6,10) and (8, 13).”

Connection with the Golden Ratio

Wythoff discovered that the cold positions can be determined using the Golden Ratio. Namely, if n is any natural number, then

Screen Shot 2017-11-05 at 5.54.53 PM.png

is a cold position (where we use the floor function and φ is the Golden Ratio).

To read more, click here.

M x


Modern Mathematicians: Hirotugu Akaike

On November 5th, Google Doodle celebrated the 90th birth anniversary of Hirotugu Akaike. Having never heard of his work before I was interested in finding out more about him.


Google Doodle

Hirotugu Akaike was an award-winning Japanese statistician who worked in information theory (“studies the quantification, storage, and communication of information”).

He is known for the ‘Akaike Information Criterion’, which he formulated in the 1970s. In post-war Japan, Akaike worked on problems that were unique to his country and that would help contribute to its reconstruction. Using fundamental concepts of information maths, he analysed the processing of “sericultural products, cement kiln controls, and thermal electric power plant controls”. In doing so, he was able to give a  solution to the model selection problem: the Akaike Information Criterion.

AIC is widely used today as a guideline for the selection of statistical models in a variety of areas, such as medicine, engineering, economics as well as the fields of mathematics and statistics.

In 2006, Akaike was awarded the Kyoto Prize for “his major contribution to statistical science and modelling in the development of the Akaike information criterion“.

He passed away on August 4, 2009 at the age of 82.

M x

What is the Gamma Function?

A lot of important functions in applied sciences are defined using improper integrals, and perhaps the most famous of these is the Gamma Function. Throughout my course I have heard mention of this function, and always wondered “but what exactly is the Gamma Function?” I finally decided to find out.

Mathematicians wanted to find a formula that generalised the factorial expression for the natural numbers. This lead to the discovery of following very well-known formula:


Say we replace the n but the x to create the function:


What is the domain of this function? The only possible ‘bad points‘ are 0 and ∞:

  • t ≈ 0: e^-t ≈ 1 and hence the expression inside the integral is approximately t^x, meaning we have convergence at around 0, only if x>-1.
  • t –> ∞: No matter the value of x, the integral is convergent.

From this we can see that the domain of f(c)  is (-1,∞).

To obtain (0,∞) as the domain, we need to shift the function one unit to the right. This defines the Gamma Function.


The above discussion explains why the Gamma Function has the x-1 power.


Some Properties:

  • Γ(n) = (n-1)! where n is a natural number
  • Γ(x+1) = xΓ(x) for x > 0. This follows easily from the definition of Γ(x) by performing integration by parts.
  • Weierstrass’s definition is also valid for all complex numbers (except non-positive integers):{\displaystyle \Gamma (z)={\frac {e^{-\gamma z}}{z}}\prod _{n=1}^{\infty }\left(1+{\frac {z}{n}}\right)^{-1}e^{z/n}}where{\displaystyle \gamma \approx 0.577216} is the Euler-Mascheroni Constant.
  • Taking the logarithm of the above expression (we can do this as Γ(x) > 0 for x > 0), we get that:


  • Differentiating this, we observe that:displaymath355In fact, Γ(x) can be differentiated infinitely often.
  • Γ(x) is connected with sin(x): For x > 0 displaymath451

Click here for more properties and applications.

M x

Waring’s Problem


Take any whole number k that is greater than or equal to 2.  Show that there is some number s (that is allowed to depend on k) so that every positive whole number is a sum of s kth powers.


This problem was first solved by David Hilbert, and was then later proved by G.H. Hardy and John Littlewood in a different way. In this case, the second proof actually gave a lot more insight to the problem. To solve Waring’s problem as it has been stated you don’t need to give an s, you only need to show that one exists.  Finding the smallest s that works is a much more challenging (although arguably more interesting) problem. Using the, now named, Hardy-Littlewood circle method, Hardy and Littlewood wrote down an expression that approximated the number of ways to write N as a sum of s kth powers:


The proof of Waring’s problem comes from that fact that since this must be positive and also be a whole number, there must be some way to write N as a sum of s kth powers.

Click here for more.

M x

(Sorry for missing a post last week, my university work is getting a bit hectic so posts may be more sporadic!)

MATHS BITE: Even or Odd?

What are even and odd functions? These two terms get used very frequently in order to simplify problems such as integration or when finding Fourier coefficients, for example.

Even Function

An even function is such that for all x in the domain

f(x) = f(-x)

An even function is symmetric with respect to the y-axis.

Screen Shot 2017-10-11 at 5.20.23 PM

Even Function | mathsisfun


  • Any polynomial p, where n is even for all xn. For example f(x) = x2+2x4 – 6. Note that (x+1)2 is not even.
  • f(x) = cos(x)

Screen Shot 2017-10-11 at 5.24.53 PM.png

Odd Function

An odd function is such that for all x in the domain

f(x) = -f(-x)

An odd function is symmetric with respect to the origin.

Screen Shot 2017-10-11 at 5.20.18 PM

Odd Function | mathsisfun


  • x, x3, x5,… etc. Note that unlike even functions, an expression such as x3 + 1 is not odd.
  • f(x) = sin(x)

Screen Shot 2017-10-11 at 5.30.14 PM.png

Odd and/or Even?

A function can be neither odd nor even, for example x− x + 1:

Screen Shot 2017-10-11 at 5.31.18 PM.png

Neither odd nor even | mathsisfun

Additionally, the only function that is even and odd is f(x) = 0.


  • Adding two even (odd) functions will give an even (odd) function.
  • Adding an even and an odd function will give a function that is neither odd nor even.
  • Both the product of two even functions and the product of two odd functions is even.
  • The product of an even function and an odd function is odd.

Maths Behind Bitcoin

Bitcoin is a “worldwide cryptocurrency and digital payment system” that works without a central repository or single administrator. This system is peer-to-peer meaning that each computer can act as a server for the others, which allows for transactions to take place between users directly without an intermediary (or central server).

Bitcoins are not stored centrally, but rather exist as records in the block chain, which is a “distributed ledger”. To own a bitcoin simply means to have the ability to transfer control of it to someone else by creating a record of the transfer in the block chain. This ability is granted by access to an ECDSA (Elliptic Curve Digital Signature Algorithm) private and public key pair. What does this mean?


An algorithm is a process to make calculations. They are like machines: data comes in, and the algorithm does some work, and data comes out. ECDSA is the algorithm that creates the Bitcoin key pairs. To create a new key pair the elliptic curve is plotted across a finite field.

Elliptic Curves

An elliptic curve is represented algebraically as an equation of the form:

 y2 = x3 + ax + b

Elliptic curves have useful properties, such as a non-vertical line intersecting two non-tangent points on the curve will always intersect a third point on the curve. Furthermore, if the non-vertical line intersects a point tangent to the curve it will intersect precisely one other point. Using these properties it is possible to define two operations:

  • Point Addition: P + Q = R is defined as the reflection through the x-axis of the third intersecting point R’ on a line that includes P and Q.

math behind bitcoin

  • Point Doubling: P + P = R is defined by finding the line tangent to the point being doubled and taking a reflection through the x-axis of the intersecting point R’ on the curve.

point doubling

Finite Fields

In relation to ECDSA, a finite field is a predefined range of positive numbers within which every calculation must fall. If a number is outside of this range, it will wrap around so that it does fall within this range.

In order to understand, consider taking the modulus operator. For example, mod 7: here the finite field is modulo 7 and all answers in this field lie in the range 0 – 6.


Bitcoin defines the formula for the curve and the parameters of the field so that every user has the same graph. The parameters include:

  • the equation used;
  • the prime modulo of the field;
  • base point that falls on the curve.

The order of the base point is not independent of the other parameters, and can be thought of graphically as “the number of times the point can be added to itself until its slope is infinite.” The base point is selected so that the order is a large prime number.

In the case of bitcoin:

Elliptic Curve Equation: y2 = x3 + 7


Base Point: 04 79BE667E F9DCBBAC 55A06295 CE870B07 029BFCDB 2DCE28D9 59F2815B 16F81798 483ADA77 26A3C465 5DA4FBFC 0E1108A8 FD17B448 A6855419 9C47D08F FB10D4B8



In ECDSA the private key is chosen as a number between 1 and the order. Then the public key is given by:

public key = private key * base point

Practically, the computation of the public key is broken down into point doubling and point addition operations starting from the base point.

This shows that the maximum number of private keys possible is equal to the order. However, Bitcoin has an enormous field:

There are 10^82 atoms in the universe. If you made the entire Universe our finite field and you drew a giant elliptic curve through it, each “point” in the universe would about 300 square atoms. At that size, each cell in your body takes up the space of 1,150,000,000 Bitcoin key pairs.


Bitcoin Addressess

A Bitcoin address is not a public key; it is an RIPEMD-160 hash of an SHA256 hash of a public key. SHA256 and RIPEMD-160 are also algorithms that, unlike ECDSA, generate a hash. I have taken an example from cryptocoinsnews to illustrate a simple example of a hash (SHA256 and RIPEMD-160 are much more complex):

Every letter has a value of its position in the alphabet: A = 1, B = 2, etc.
Our hash algorithm is this:

for i=0 to len(word):
h = h + (i + letter_value)

So for the word “ABC” using our hash would be:

Loop for ‘A’
h = 0 + (0 + 1)
Loop ‘B’
h = 1 + (1 + 2)
h = 4 + (2 + 3)
h = 9

‘ABC’ hashes to 9.

How Bitcoin Actually Works

Your private key is kept secret. This key allows you to unlock funds that are owed to you in the Bitcoin block chain.

When making a transacting, it contains a script that has your private key’s signature and the matching public key. This is in order to prove that the public key provided matches the private key that was used to make the signature. If the public key hashes with the Bitcoin address in the unclaimed transaction, it can be spent.

Hope you enjoyed this basic explanation of Bitcoin! M x

Markov Chains: An Intro

The Markov Property

A stochastic process (“a mathematical object usually defined as a collection of random variables”) is said to have the Markov Property if, conditional on the present value, the future is independent on the past.

Let’s firstly introduce some notation: let S be a countable set called the state space and let = (Xt: t ≥ 0) be a sequence of random variables taking values in S.

Then, the sequence X is called a Markov Chain if it satisfies the Markov Property:

Screen Shot 2017-10-05 at 5.24.15 PM.png

for all t ≥ 0 and all x0, x1, …, xt ϵ S.

Notation is simplified in the case where the Markov chain is homogeneous. This is when for all i, j ϵ S, the conditional probability P(Xt+1 = j | Xt = i) does not depend on the value of t.


  • Branching Process: The branching process is a simple model of the growth of a population; each member of the nth generation has a number of offspring that is independent of the past; Xn = size of the nth generation.
  • Random Walk: A particle performs a random walk on the line: let Z1, Z2, …, be independent with P(Zi = 1) = p and P(Zi = -1) = 1-p, then Xn =
    Z1 + … + Zn; at each epoch of time, it jumps a random distance that is independent of the previous jumps.
  • Poisson Process: the Poisson process satisfies a Markov property in which time is a continuous variable rather than a discrete variable, and thus the Poisson process is an example of a continuous-time Markov chain; the Markov property still holds as arrivals after time t are independent of arrivals before this time.

Two Quantities

For simplification (and as this is only an intro to  Markov chains) we’ll assume that the Markov chains are homogeneous.

Two quantities that are needed in order to calculate the probabilities in a chain are the:

  1. transition matrix: P = (pi,j: i,j ϵ S) given by pi,j = P(X1 = j | X0 = i);
  2. initial distribution: λ = (λi : i ϵ S) given by λi = P(X0 = i).

As we have assumed homogeneity we have that

pi,j = P(Xn+1 = j | Xn = i) for n ≥ 0.

These quantities are characterised in the following way:


a) The vector λ is a distribution  in that λi ≥ 0 for i ϵ S and the sum of λi over i = 1.

b) The matrix P = (pi,j) is a stochastic matrix in that pi,j ≥ 0 for i, j ϵ S and the sum of pi,j over j = 1 for i ϵ S (i.e. that P row sums to 1)


a) As λi is a probability, it is clearly non-negative. Additionally, the sum of λi over i = the sum of P(X0 = i) over i = 1.

b) Since pi,j is a probability, it is non-negative. Finally, the sum of pi,j over j = the sum of P(Xn+1 = j | Xn = i) over j = P(Xn+1 ϵ S | Xn = i) = 1.

Hope you enjoyed this brief introduction to a Markov Chain!

M x