Maths Behind Bitcoin

Bitcoin is a “worldwide cryptocurrency and digital payment system” that works without a central repository or single administrator. This system is peer-to-peer meaning that each computer can act as a server for the others, which allows for transactions to take place between users directly without an intermediary (or central server).

Bitcoins are not stored centrally, but rather exist as records in the block chain, which is a “distributed ledger”. To own a bitcoin simply means to have the ability to transfer control of it to someone else by creating a record of the transfer in the block chain. This ability is granted by access to an ECDSA (Elliptic Curve Digital Signature Algorithm) private and public key pair. What does this mean?


An algorithm is a process to make calculations. They are like machines: data comes in, and the algorithm does some work, and data comes out. ECDSA is the algorithm that creates the Bitcoin key pairs. To create a new key pair the elliptic curve is plotted across a finite field.

Elliptic Curves

An elliptic curve is represented algebraically as an equation of the form:

 y2 = x3 + ax + b

Elliptic curves have useful properties, such as a non-vertical line intersecting two non-tangent points on the curve will always intersect a third point on the curve. Furthermore, if the non-vertical line intersects a point tangent to the curve it will intersect precisely one other point. Using these properties it is possible to define two operations:

  • Point Addition: P + Q = R is defined as the reflection through the x-axis of the third intersecting point R’ on a line that includes P and Q.

math behind bitcoin

  • Point Doubling: P + P = R is defined by finding the line tangent to the point being doubled and taking a reflection through the x-axis of the intersecting point R’ on the curve.

point doubling

Finite Fields

In relation to ECDSA, a finite field is a predefined range of positive numbers within which every calculation must fall. If a number is outside of this range, it will wrap around so that it does fall within this range.

In order to understand, consider taking the modulus operator. For example, mod 7: here the finite field is modulo 7 and all answers in this field lie in the range 0 – 6.


Bitcoin defines the formula for the curve and the parameters of the field so that every user has the same graph. The parameters include:

  • the equation used;
  • the prime modulo of the field;
  • base point that falls on the curve.

The order of the base point is not independent of the other parameters, and can be thought of graphically as “the number of times the point can be added to itself until its slope is infinite.” The base point is selected so that the order is a large prime number.

In the case of bitcoin:

Elliptic Curve Equation: y2 = x3 + 7


Base Point: 04 79BE667E F9DCBBAC 55A06295 CE870B07 029BFCDB 2DCE28D9 59F2815B 16F81798 483ADA77 26A3C465 5DA4FBFC 0E1108A8 FD17B448 A6855419 9C47D08F FB10D4B8



In ECDSA the private key is chosen as a number between 1 and the order. Then the public key is given by:

public key = private key * base point

Practically, the computation of the public key is broken down into point doubling and point addition operations starting from the base point.

This shows that the maximum number of private keys possible is equal to the order. However, Bitcoin has an enormous field:

There are 10^82 atoms in the universe. If you made the entire Universe our finite field and you drew a giant elliptic curve through it, each “point” in the universe would about 300 square atoms. At that size, each cell in your body takes up the space of 1,150,000,000 Bitcoin key pairs.


Bitcoin Addressess

A Bitcoin address is not a public key; it is an RIPEMD-160 hash of an SHA256 hash of a public key. SHA256 and RIPEMD-160 are also algorithms that, unlike ECDSA, generate a hash. I have taken an example from cryptocoinsnews to illustrate a simple example of a hash (SHA256 and RIPEMD-160 are much more complex):

Every letter has a value of its position in the alphabet: A = 1, B = 2, etc.
Our hash algorithm is this:

for i=0 to len(word):
h = h + (i + letter_value)

So for the word “ABC” using our hash would be:

Loop for ‘A’
h = 0 + (0 + 1)
Loop ‘B’
h = 1 + (1 + 2)
h = 4 + (2 + 3)
h = 9

‘ABC’ hashes to 9.

How Bitcoin Actually Works

Your private key is kept secret. This key allows you to unlock funds that are owed to you in the Bitcoin block chain.

When making a transacting, it contains a script that has your private key’s signature and the matching public key. This is in order to prove that the public key provided matches the private key that was used to make the signature. If the public key hashes with the Bitcoin address in the unclaimed transaction, it can be spent.

Hope you enjoyed this basic explanation of Bitcoin! M x


How to Solve It

In 1945, Hungarian mathematician George Pólya wrote an extremely successful book called How to Solve It, which sold over one million copies and has been translated into 17 languages. In this book, Pólya identifies the four basic principles of problem solving.

1. Understand the Problem

Firstly, you must understand the problem you are tackling.  Although this may seem like common sense, it is often overlooked; the amount of times I’ve been staring at a problem for hours because I haven’t fully understood what it is asking! Consider asking the following questions:

  • What is the unknown? What are the data? What is the condition?
  • What are you asked to find or show?
  • Can you restate the problem in your own words?
  • Do you understand all the words used in stating the problem?

For some areas in maths, such as mechanics, drawing a labelled figure can often be extremely helpful to visualise what is going on.

Resultado de imagem para free body diagram

Example of Diagram

2. Devise a Plan

There are many different suitable ways to solve a problem, as mentioned by Pólya. However, it is important to choose an appropriate strategy, which is a skill learnt by solving many problems. Some strategies include:

  • Guess and check
  • Use symmetry
  • Consider special cases
  • Solve an equation or use a formula
  • Look for a pattern
  • Solve a simpler problem
  • Work backwards
  • Eliminate possibilities

When choosing a strategy, it may help to consider whether you have solved a related problem already or whether you can think of a familiar problem that has the same or similar unknown.

3. Carry out the Plan

Once you have devised the plan, this step is simple if you already have the necessary skills, often only required care and patience -try to avoid silly mistakes!

Persist with the plan that you have chosen, only discarding it after multiple failed attempts. Then, return to step 2.

4. Look back

Pólya highlights how a lot can be gained from looking back and reflecting on the work you have done, asking yourself what worked and what didn’t. This way you can implement strategies that were successful for future problems that are similar.

Read more here.

M x

MATHS BITE: The Kolakoski Sequence

The Kolakoski sequence is an infinite sequence of symbols {1,2} that is its own “run-length encoding“. It is named after mathematician Willian Kolakoski who described it in 1965, but further research shows that it was first discussed by Rufus Oldenburger in 1939.

This self-describing sequence consists of blocks of single and double 1s and 2s. Each block contains digits that are different from the digit in the preceding block.

To construct the sequence, start with 1. This means that the next block is of length 1. So we require that the next block is 2, giving the sequence 1, 2. Continuing this infinitely gives us the Kolakoski sequence: 1, 2, 1, 1, 2, 1, 2, 2, 1, 2, 2, 1, 1, 2, etc.

M x

Diophantine Approximation: Liouville’s Theorem

Diophantine approximation deals with the approximation of real numbers by rational numbers.

Liouville’s Theorem

In the 1840’s Liouville obtained the first lower bound for the approximation of algebraic numbers:

Let α ∈ R be an irrational algebraic number satisfying f(α) = 0 with non-zero irreducible (cannot be reduced) f ∈ Z[x] of degree d. Then there is a non-zero constant C such that for every fraction p/q

Screen Shot 2016-12-11 at 10.35.05 AM.png


The proof utilises the mean value theorem. By this theorem, given p/q, there is a real ξ between α and p/q such that

Screen Shot 2016-12-11 at 10.35.08 AM.png

Since f has integer coefficients and is of degree d, the value of f(p/q) is a rational number with denominator at worst q^d. Since f is irreducible, f(p/q) cannot be equal to 0. Thus

Screen Shot 2016-12-11 at 10.40.22 AM.png

and so

Screen Shot 2016-12-11 at 10.40.58 AM.png

A corollary of this result is that numbers that are well approximable by rational numbers, i.e. in for every d ≥ 1 and positive constant C, there is a rational p/q such that

Screen Shot 2016-12-11 at 10.43.32 AM.png

are transcendental.



β is a real, transcendental number.

This is because there is a rational approximation

Screen Shot 2016-12-11 at 10.46.43 AM.png



Analysing this inequality, the ratio


is unbounded as N → +∞, and so β is well approximable by rationals.

M x


Maths on Trial

Last week I had the pleasure to attend a talk by Leila Schneps on the mathematics of crime. A self-declared pure mathematician, Schneps recently became involved in studying the mathematics in criminal trials. Rather than focusing on the mathematics of forensic data, such as DNA, the talk was on the use of Bayesian networks in crime evidence analysis.

One of the most challenging tasks for juries in criminal trials is to weigh the impact of different pieces of evidence which may be dependent on each other. In fact, studies have shown that there are several fallacies that jury members fall into on a regular basis, such as overestimating the weight of a single piece of DNA evidence. This is because most people assume that the probability of an event A happening given that B happens [P(A|B)] is equal to an event B happening, given that A happens [P(B|A)]. However, this is NOT true: the two are connected by Bayes’ Rule.

For example, a forensic specialist may say that a piece of DNA is found in 1 in every 1000 people. The jury will take this to mean the the suspect must be guilty as he is that person out of 1000 (the chances of it being anyone else is so low). However, this is critically not true, as explained above.

Thus, Bayesian networks are a powerful tool to assess the weight of different kinds of evidence, taking into account their dependencies on one another, and what effect they have on the guilt of the suspect.

What are Bayesian Networks?

Bayesian Networks are a type of statistical model,  which organise a body of knowledge in any given area, in this case evidence of a criminal trial, by mapping out cause-and-effect relationships and encoding them with numbers that represent the extent to which one variable is likely to affect the other. These networks are named after British mathematician Reverend Thomas Bayes, due to their reliance on Bayes’ formula. This rule can be extended to multiple variables and multiple states, allowing complicated probabilities to be calculated.


To illustrate an easy example, consider this Bayesian Network:

Let us say that the weather can only have three states: sunny, cloudy or rainy, that the grass can only be wet or dry, and that the sprinkler can be on or off. The arrows, which represent dependence, have been drawn in that way because if it rainy, then the lawn will be wet, but if it is sunny for a long time then this will cause us to turn on the sprinklers, and hence the lawn will also be wet.

By imputing probabilities into this network that reflect the reality of real weather, lawn and sprinkler-use behaviour, it can be made to answer questions like:

“If the lawn is wet, what are the chances it was caused by rain or by the sprinkler?”

“If the chance of rain increases, how does that affect my having to budget time for watering the lawn?”

I51G8WI2nsNL._SX334_BO1,204,203,200_.jpgn her presentation, Leila Schneps talked briefly about a book she had released entitled ‘Math on Trial’, which describes 10 trials spanning the 19th century to today, in which mathematical arguments were used (and greatly misused) as evidence. The cases discussed include “Sally Clark, who was accused of murdering her children by a doctor with a faulty sense of calculation; of nineteenth-century tycoon Hetty Green, whose dispute over her aunt’s will became a signal case in the forensic use of mathematics; and of the case of Amanda Knox, in which a judge’s misunderstanding of probability led him to discount critical evidence – which might have kept her in jail.”

After hearing Schneps transmit her passion and excitement, I am fascinated with this subject and can’t wait to get my hands on this book to learn more! M x

Benford’s Law

Benford’s Law is names after the American physicist Frank Benford who described it in 1938, although it had been previously mentioned by Simon Newcomb in 1881.

Benford’s Law states that in “naturally occurring collections of numbers” the leading significant digit is more likely to be small. For example, in sets of numbers which obey this law, the number 1 appears as the first significant digit about 30% of the time, which is much greater than if the digits were distributed uniformly: 11.1%.

In mathematics, a set of numbers satisfies this law if the leading digit, d, occurs with a probability:

P(d)=\log _{10}(d+1)-\log _{10}(d)=\log _{10}\left({\frac {d+1}{d}}\right)=\log _{10}\left(1+{\frac {1}{d}}\right).

Hence, if d = 1, then P(1) = log 2 = 30.1..%

The leading digits have the following distribution in Benford’s law:

Screen Shot 2016-09-27 at 2.57.40 PM.png

Source: Wikipedia


Source: Wolfram MathWorld

As P(d) is proportional to the space between d and d+1 on a logarithmic scale, the mantissae of the logarithms of the numbers are uniformly and randomly distributed.


Benford’s law has found applications in a big variety of data sets, such as stock prices, house prices, death rates and mathematical constants.


Frequency of first significant digit of physical constants plotted against Benford’s Law | Source: Wikipedia

Due to this, fraud can be found applying Benford’s law to data sets. This is because if a person is trying to fabricate ‘random’ values to try to not appear suspicious, they will probably select numbers such that the initial digits are uniformly distributed. However, as explained above, this is completely wrong! In fact, this application of Benford’s law is so powerful that there is an “industry specialising in forensic accounting and auditing which uses these phenomena to look for inconsistencies in data.” describes how:

“In 1993, in State of Arizona v. Wayne James Nelson (CV92-18841), the accused was found guilty of trying to defraud the state of nearly $2 million, by diverting funds to a bogus vendor.

The perpetrator selected payments with the intention of making them appear random: None of the check amounts was duplicated, there were no round-numbers, and all the values included dollars and cents amounts. However, as Benford’s Law is esoterically counterintuitive, he did not realize that his seemingly random looking selections were far from random.”

Sources: 1 | 2 | 3 | 4


After writing this I’ve realised that I touched on this law (in less detail) in a previous post during my Christmas series! M x