## Algorithms: PART 2

### QR Algorithm

From 1959-1961, John G.F Francis worked on finding a stable method to compute eigenvalues and thus created the QR algorithm. Eigenvalues are one of the most essential numbers associated with matrices, however they can be quite difficult to compute.

This algorithm is based on the fact that is relatively easy to transform a square matrix into a matrix that is almost upper triangular (one extra set of no-zero entries just below the main diagonal). Based on QR decomposition, which writes a matrix A as a product of an orthogonal matrix Q and an upper triangular matrix R, the QR algorithm iteratively changes Ai = QiRi to Ai+1RiQi

“The algorithm is numerically stable because it proceeds by orthogonal similarity transforms.”

-Wikipedia

Under certain conditions, the matrices Ai converge to a triangular matrix and the eigenvalues of a triangular matrix are listed on the diagonal so the eigenvalue problem is solved.

By the mid-1960s, the QR algorithm had turned once-formidable eigenvalue problems into routine calculations.

### Quicksort

The Quicksort algorithm was developed by Tony Hoare in 1962. It puts N things in numerical or alphabetical order by using a recursive strategy:

1. Pick one element as a pivot
2. Separate the rest into piles of big and small elements, as compared with the pivot
3. Repeat this procedure on each pile

Although you can get stuck doing all N(N-1)/2 comparisons, on average Quicksort runs on average with O(N logN) efficiency, which is very fast. Its simplicity has made Quicksort a “poster child of computational complexity“.

### Fast Fourier Transform

The fast Fourier Transform was developed by James Cooley and John Tukey in 1965. The FFT revolutionised signal processing. Although the idea can be traced back to Gauss, it was a paper by Cooley and Tukey that made it clear how easily Fourier transforms can be computed. It relies on a ‘divide-and-conquer‘ strategy and hence reduces it from an O(N^2) to an O(N logN) computation.

### Integer Relation Detection Algorithm

Given real numbers x1, …, xn, are there integers a1, …, an (not all 0) for which a1x1 + … + anxn = 0? Helaman Ferguson and Rodney Forcade found an algorithm – the Integer Relation Detection Algorithm – to answer this question. This is easily solved in the case where n = 2 by Euclid’s algorithm, computing terms in the continued-fraction expansion of x1/x2 – if x1/x2 is rational, then the expansion terminates.

The detection algorithm has, for example, been used to find the precise coefficients of the polynomials satisfied by the third and fourth bifurcation points of the logistic map.

It has also proved useful in simplifying calculations with Feynman diagrams in quantum field theory.

### Fast Multipole Algorithm

This algorithm overcomes one of the biggest problems of N-body simulations, which is the fact that accurate calculations of the motions of N particles interaction via gravitational or electrostatic forces seemingly requires O(N^2) calculations, i.e. one for each pair of particles. This algorithm, developed in 1987 by Leslie Greengard and Vladimir Rokhlin, does it with O(N) computations.

How does it do this? It uses multipole expansions to approximate the effects of a distant group of particles on a local group. Then we define ever-larger groups as distances increase. One of the big advantages of the fast multipole algorithm is that it comes with rigorous error estimates, a feature that a lot of other methods lack.

M x

## 10 Algorithms: PART 1

### Monte Carlo Method

At the Los Alamos Scientific Laboratory,  John von Neumann, Stan Ulam and Nick Metropolis created the Metropolis algorithm, also known as the Monte Carlo method. This algorithm obtains approximate solutions to numerical problems that has an unmanageable number of degrees of freedom and to combinatorial problems that have factorial size. It does this by mimicking a random process.

### Simplex Method

In 1947, George Dantzig created the simplex method for linear programming. Linear programming dominated the world of industry, where “economic survival depends on the ability to optimise within budgetary and other constraints“. It’s widespread application makes Dantzig’s algorithm one of the most successful of all time.

The simplex method is a elegant way of arriving at optimal answers, and in practice it is highly efficient.

### Krylov Subspace Iteration Methods

The development of the Krylov Subspace iteration methods was initiated by Magnus Hestenes, Eduard Stiefel and Cornelius Lanczos from the Institute for Numerical Analysis at the National Bureau of Standards in 1950. They address the simple task of solving the equations Ax = b. Although these seem simple, when A is a massive nxn matrix, the algebraic answer x = b/A is not easy to compute!

So, iterative methods, such as solving equations of the form Kxi + 1 = Kxi + b – Axi, were introduced.  This lead to the study of Krylov subspaces, which are named after Russian mathematician Nikolai Krylov. These subspaces are spanned by powers of a matrix that is applied to a initial remainder vector r0 = b – Ax0.

Lanczos found a way to generate an orthogonal basis for such a subspace when the matrix is symmetric, and then Hestenes and Stiefel proposed an even better method – conjugate gradient method – used when the system is both symmetric and positive definite.

### Fortran Optimising Compiler

Developed in 1957 by a team at IBM lead by John Backus, the Fortran optimising compiler is said to be one of the most important innovations in the history of computer programming. After its development, scientists could tell a computer what they wanted it to do without having to “descend into the netherworld of machine code“.

Fortran I consisted of 23,500 assembly-language instructions, and although this is not a lot compared to modern compilers, it was capable of a great number of sophisticated computations.

The compiler “produced code of such efficiency that its output would startle the programmers who studied it.”

– Backus

Part 2 coming soon! M x

## Maths Behind Bitcoin

Bitcoin is a “worldwide cryptocurrency and digital payment system” that works without a central repository or single administrator. This system is peer-to-peer meaning that each computer can act as a server for the others, which allows for transactions to take place between users directly without an intermediary (or central server).

Bitcoins are not stored centrally, but rather exist as records in the block chain, which is a “distributed ledger”. To own a bitcoin simply means to have the ability to transfer control of it to someone else by creating a record of the transfer in the block chain. This ability is granted by access to an ECDSA (Elliptic Curve Digital Signature Algorithm) private and public key pair. What does this mean?

### Algorithms

An algorithm is a process to make calculations. They are like machines: data comes in, and the algorithm does some work, and data comes out. ECDSA is the algorithm that creates the Bitcoin key pairs. To create a new key pair the elliptic curve is plotted across a finite field.

### Elliptic Curves

An elliptic curve is represented algebraically as an equation of the form:

y2 = x3 + ax + b

Elliptic curves have useful properties, such as a non-vertical line intersecting two non-tangent points on the curve will always intersect a third point on the curve. Furthermore, if the non-vertical line intersects a point tangent to the curve it will intersect precisely one other point. Using these properties it is possible to define two operations:

• Point Addition: P + Q = R is defined as the reflection through the x-axis of the third intersecting point R’ on a line that includes P and Q.

• Point Doubling: P + P = R is defined by finding the line tangent to the point being doubled and taking a reflection through the x-axis of the intersecting point R’ on the curve.

### Finite Fields

In relation to ECDSA, a finite field is a predefined range of positive numbers within which every calculation must fall. If a number is outside of this range, it will wrap around so that it does fall within this range.

In order to understand, consider taking the modulus operator. For example, mod 7: here the finite field is modulo 7 and all answers in this field lie in the range 0 – 6.

### ECDSA

Bitcoin defines the formula for the curve and the parameters of the field so that every user has the same graph. The parameters include:

• the equation used;
• the prime modulo of the field;
• base point that falls on the curve.

The order of the base point is not independent of the other parameters, and can be thought of graphically as “the number of times the point can be added to itself until its slope is infinite.” The base point is selected so that the order is a large prime number.

In the case of bitcoin:

Elliptic Curve Equation: y2 = x3 + 7

Prime Modulo: 2256 – 232 – 29 – 28 – 27 – 26 – 24 – 1 = FFFFFFFF FFFFFFFF FFFFFFFF FFFFFFFF FFFFFFFF FFFFFFFF FFFFFFFE FFFFFC2F

Base Point: 04 79BE667E F9DCBBAC 55A06295 CE870B07 029BFCDB 2DCE28D9 59F2815B 16F81798 483ADA77 26A3C465 5DA4FBFC 0E1108A8 FD17B448 A6855419 9C47D08F FB10D4B8

Order: FFFFFFFF FFFFFFFF FFFFFFFF FFFFFFFE BAAEDCE6 AF48A03B BFD25E8C D0364141

Source: coindesk.com

In ECDSA the private key is chosen as a number between 1 and the order. Then the public key is given by:

public key = private key * base point

Practically, the computation of the public key is broken down into point doubling and point addition operations starting from the base point.

This shows that the maximum number of private keys possible is equal to the order. However, Bitcoin has an enormous field:

There are 10^82 atoms in the universe. If you made the entire Universe our finite field and you drew a giant elliptic curve through it, each “point” in the universe would about 300 square atoms. At that size, each cell in your body takes up the space of 1,150,000,000 Bitcoin key pairs.

– www.cryptocoinsnews.com

A Bitcoin address is not a public key; it is an RIPEMD-160 hash of an SHA256 hash of a public key. SHA256 and RIPEMD-160 are also algorithms that, unlike ECDSA, generate a hash. I have taken an example from cryptocoinsnews to illustrate a simple example of a hash (SHA256 and RIPEMD-160 are much more complex):

Every letter has a value of its position in the alphabet: A = 1, B = 2, etc.
Our hash algorithm is this:

for i=0 to len(word):
h = h + (i + letter_value)

So for the word “ABC” using our hash would be:

Loop for ‘A’
h = 0 + (0 + 1)
Loop ‘B’
h = 1 + (1 + 2)
‘C’
h = 4 + (2 + 3)
h = 9

‘ABC’ hashes to 9.

### How Bitcoin Actually Works

Your private key is kept secret. This key allows you to unlock funds that are owed to you in the Bitcoin block chain.

When making a transacting, it contains a script that has your private key’s signature and the matching public key. This is in order to prove that the public key provided matches the private key that was used to make the signature. If the public key hashes with the Bitcoin address in the unclaimed transaction, it can be spent.

Hope you enjoyed this basic explanation of Bitcoin! M x

## Lychrel numbers

Lychrel number is a natural number that cannot form a palindrome by the 196-algorithm: an iterative process of repeatedly reversing a numbers’ digits and adding the resulting numbers.

Whilst in other bases (powers of two) certain numbers can be proven to never form a palindrome, in base 10 (the base system we use in everyday life) no Lychrel numbers have been proven to exist. However many numbers, such as 196, are suspected to be a Lychrel number on “heuristic and statistical grounds“.

The name Lychrel was coined by Wade Van Landingham in 2002 as an anagram of Cheryl, his girlfriend’s name.

### 196-Algorithm

The reverse-and-add process is when you add a number to the number formed by reversing the order of its digits.

Examples of non-Lychrel numbers are (taken from Wikipedia):

• 56 becomes palindromic after one iteration: 56+65 = 121.
• 57 becomes palindromic after two iterations: 57+75 = 132, 132+231 = 363.
• 59 becomes a palindrome after 3 iterations: 59+95 = 154, 154+451 = 605, 605+506 = 1111
• 89 takes an unusually large 24 iterations to reach the palindrome 8,813,200,023,188.
• 1,186,060,307,891,929,990 takes 261 iterations to reach the 119-digit palindrome 44562665878976437622437848976653870388884783662598425855963436955852489526638748888307835667984873422673467987856626544, which is the current world record for the Most Delayed Palindromic Number. It was solved by Jason Doucette‘s algorithm and program in 2005.

### 196

196 is the smallest number suspected to never reach a palindrome in base 10 and has thus received the most attention:

• In 1985 a program by James Killman ran unsuccessfully for over 28 days, cycling through 12,954 passes and reaching a 5366-digit number.
• John Walker began his 196 Palindrome Quest in 1987. His program ran for almost three years, then terminated (as instructed) in 1990 with the message:

Stop point reached on pass 2,415,836.
Number contains 1,000,000 digits.
• In 1995, Tim Irvin and Larry Simkins reached the two million digit mark in only three months without finding a palindrome.
• Jason Doucette then reached 12.5 million digits in May 2000.
• Wade Van Landingham used Jason Doucette’s program to reach a 13 million digit. By May 2006, Van Landingham had reached the 300 million digit mark.
• In 2011 Romain Dolbeau completed a billion iterations to produce a number with 413,930,770 digits, and in February 2015 his calculations reached a number with billion digits.

A palindrome has yet to be found.

M x

## Knapsack Problem

The knapsack problem is a problem in combinatorial optimisation.

Imagine you are going to visit your friends for whom you have bought lots of presents. So many, in fact, that you can’t fit them all in your luggage meaning you must leave some behind. If we decide to pack a combination of presents with the highest value but that doesn’t exceed the weight limit imposed by airline, how can we find such a combinations? This is the Knapsack Problem.

We could solve this by trying all the possible combinations, calculating their value and weight and then picking one that is below the weight limit but maximises the value. Whilst this is okay when we are dealing with a small number of items, it is simply unfeasible when the number of items is large as the number of combinations is too big. So, is there an algorithm which can work for any number of items that doesn’t take too long?

Computer scientists have a way of measuring the complexity of a problem by how fast the time taken grows as the size of the input increases. Polynomial growth, i.e. if the size of the input is n then the time taken grows by a factor of n^k, describes and ‘easy problem’; the growth is nowhere near as “explosive” as  exponential growth, e.g. 2^n.

Whether or not a polynomial time algorithm exists to solve the knapsack problem is unknown, meaning it is in a class of problems called NP-Hard Problems.

Complexity Classes:

• Class P: problems that can be solved in polynomial time
• Class NP: a potential solution can be checked in polynomial time
• NP-Complete: within the NP class but particularly hard
• NP-Hard: as least as hard as the NP class.

It is possible that the NP class = P class, though this is still unknown (and is one of the Millennium Prize Problems), hence the two diagrams above.

M x

P.S. My posts may be more sporadic in the next month because I’m in exam term!

## New Books in Maths: March 2017

Today I decided to do another instalment of my series ‘New Books in Math’, where I talk about books which have been recently released or will be released soon.

Release Date: April 2017

Ali Almossawi, the author of the popular book Bad Arguments, has returned with a funny and smart introduction to algorithms.

This book aims to answer questions such as “why is Facebook so good at predicting music?” and “how do you discover new music?

To demystify a topic of ever increasing importance to our lives, Almossawi presents us with alternative methods for tackling twelve different scenarios, guiding us to better and more efficient choices “that borrow from same systems that underline a computer word processor, a Google search engine, or a Facebook ad”.

“Bad Choices will open the world of algorithms to all readers making this a perennial go-to for fans of quirky, accessible science books.”

### Are Numbers Real?: The Uncanny Relationship of Mathematics and the Physical World – Brian Clegg

Release Date: March 2017

What was life like before numbers existed? Numbers began as simple representations of everyday things, but mathematics rapidly took a life of its own and grew into what it is today.

In Are Numbers Real?, Brian Clegg explores the way that maths has become more and more detached from reality, but yet remains the driving force of the development of modern physics.

“From devising a new counting system based on goats, through the weird and wonderful mathematics of imaginary numbers and infinity to the debate over whether mathematics has too much influence on the direction of science, this fascinating and accessible book opens the reader’s eyes to the hidden reality of the strange yet familiar world of numbers.”

### The Mathematics Lover’s Companion: Masterpieces for Everyone – Edward R. Scheinerman

Release Date: March 2017

How can a shape have more than one dimension but fewer than two? What is the best way to elect public officials when more than two candidates are vying for the office? Is it possible for a highly accurate medical test to give mostly incorrect results? Can you tile your floor with regular pentagons? How can you use only the first digit of sales numbers to determine if your accountant is lying? Can mathematics give insights into free will?

Edward Scheinerman answers these questions and more in The Mathematics Lover’s Companion, in bite-size chapters that require only secondary school mathematics. He invites readers to get involved in solving mathematical puzzles, and provides an engaging tour of numbers, shapes and uncertainty.

The result is an unforgettable introduction to the fundamentals and pleasures of thinking mathematically.

M x

## Mazes

#### What’s the difference between a maze and a labyrinth?

It is generally accepted that a labyrinth contains only one path, which often spirals around folding back in on itself in ever-decreasing loops. On the other hand, a maze contains branching paths, which lead to nowhere, and therefore present the person with choices and the potential for them to get lost.

Mathematicians have created maze-generating algorithms, which tend to fall into two main types:

• Ones that start with a single bounded space and then sub-divide it with walls to produce smaller sub-spaces;
• Ones that start with a bunch of disconnected rooms and then demolish walls to create paths between them.

## Escaping Mazes

Most methods work for ‘simple mazes’, i.e. mazes with no short-cuts via bridges or passage loops (circular paths that lead back to where they started).

So, if we assume the maze is simple, the most common method to escape the maze is wall-following. Here, you place a hand on the wall of the maze, keep walking maintaining contact with the wall and eventually you will get out.

Why does this work? If you imagine picking the wall of the maze and stretching its perimeter to remove any corners. You will eventually create a circle-like shape, which must form part of the maze’s outer border.

## Trémaux’s Algorithm

The problem is that most mazes aren’t simple, for example the Escot Gardens’ beech hedge maze in Devon.

So, another method of maze escape is known as Trémaux’s algorithm, which works is all cases.

Here’s how it works. Basically, with this algorithm, you leave a trail as you navigate your way through the maze. Then follow these rules:

• If you arrive at a junction you have not previously encountered, randomly select a way to go;
• If that leads you to a junction where one path is new but the other is not, select the path you have not been on;
• If you are choosing between a once or twice-used path, choose the path that has only be used once, then leave a second trail behind you;
• Never select a path that already has two trails.

This method works for all mazes and is guaranteed to get you out!

M x

Everyday millions of users will spend a quarter of their time on Facebook scrolling mindlessly through their Newsfeed. The Newsfeed, arguably the most important part of the site, is controlled by a unique algorithm called the ‘Edgerank’ algorithm. In this post, I’ll talk you through what this is and how it works.

What is it?

Edgerank is the algorithm that dictates what stories appear on a users Newsfeed. Say you are a Facebook user. Every action that your Facebook friends take is a possible Newsfeed story. Facebook calls these actions ‘edges’. Whenever someone visits their Newsfeed there are around 1,500 stories waiting to be seen, and since most people don’t have the time or simply don’t want to read through 1,500 posts, this algorithm prioritises the stories to show users what they’re most likely to be interested in. It is called ‘EdgeRank’ because it ranks the edges.

How does it work?

There are three things that this algorithm takes into consideration:

1. Affinity Score
2. Edge Weight
3. Time Decay

Affinity Score

This means how ‘connected’ you are to the edge. It is calculated by considering how close you are with the person who’s posting. For example, if the story is by someone you frequently interact with, have several mutual friends with, or are related to, the algorithm is more likely to give the edge a higher affinity score.

Edge Weight

Not all actions, or edges, are considered to be of equal importance; a friend creating a status update would carry more weight than someone liking a photo.

Time Decay

As time passes and the posts gets older, it is likely that it has already been seen or that it is no longer as relevant. To solve this, the algorithm multiplies the story by 1/x, where x is the time since the action happened, to decrease its importance as time passes.

The formula below is then used to give each edge an overall score of importance:

The higher the score, the more the story will be prioritised on your Newsfeed. Simple right!