## Introduction

Liu Feng posted a number of Lisp functions to compute a Fibonacci number with O(log(n)) time complexity where n is the n-th number to be computed. The last such function involved tail recursion. While the function is correct there was no convincing proof of correctness. Essentially computing the n-th Fibonacci number was reduced to the computing . Hence here we will consider the simpler numerical exponentiation problem and extension to computing the Fibonacci number follows automatically. We present a semi-formal proof of correctness and show how the program could be implemented in C++.

## Numerical Exponentiation

Consider the problem of computing x^{N} where x is a number and N a positive integer. A naive solution will require N-1 multiplications. However the solution described below, in C++ syntax, will compute the solution in O(log(N)) time:

template<typename number> number power_iter(number x, int N, number p) { number a = x; unsigned int n = N; auto even = [](unsigned int n) -> bool { return n % 2 == 0; }; //loop invariant x^N = a^n * p; while (n > 0) { if (even(n)) a = a*a, n = n / 2; else p = a*p, n = n - 1; } return p; } long power(long x, unsigned int N) { return power_iter(x, N, 1L); }

## Proof of Correctness

Notice that the loop-invariant x^{N} = a^{n} * p is trivially established at the start of the first iteration. Using induction it can be shown that the invariant holds at the end of every iteration with the new values of a, n and p. Hence when `n=0`

we have x^{N} = p, which is the desired value.

### Performance

If N can be represented in binary as 10…1, then the algorithm proceeds by changing the least significant bit to zero if it is one or right shifting the number if the least significant bit is zero. Hence in the worst case there are at most 2*log_{2}(N)-1 multiplications and in the best case log_{2}(N) multiplications.

### An Aside

If the ‘*’ operator is implemented as addition, then power operation becomes multiplication provided, the initial value of `p`

is zero. This is also known as the Russian Peasant multiplication algorithm.

## Fibonacci Number

Define a Fibonacci matrix as of the form

.

Fibonacci matrices are closed under multiplication. In other words

.

where

x = a*p + b*q

and

y = a*q + b*(p+q) = p*b + q*(a+b).

Note that a Fibonacci matrix is commutative with regard to multiplication and that we require only two values to represent a Fibonacci matrix. We are now ready to implement the Fibonacci number computation:

template<class T> struct Fibonacci_Matrix { T a, b; }; template<class T> Fibonacci_Matrix<T> operator*( Fibonacci_Matrix<T> A, Fibonacci_Matrix<T> B) { Fibonacci_Matrix<T> X{ A.a*B.a + A.b*B.b, A.a*B.b + A.b*(B.a + B.b) }; return X; } template<class T> T NthFibonacci(unsigned int N) { Fibonacci_Matrix<T> one{ 0, 1 }; if (N == 0) return {0}; auto result = power_iter(one, N - 1, one); return result.b; }

Notice that `power_iter`

function remains the same; we just had to redefine multiplication for Fibonacci matrices to use `power_iter`

. Note that `power_iter`

relies on the multiplication operator being associative but not necessarily commutative, even though in this case it does not matter.

### Performance Analysis

Using gmplib the 1000000-th Fibonacci number was computed using the above template library, as listed below:

int main () { std::cout<<NthFibonacci<mpz_class>(1000000)<<std::endl; return 0; }

It was close to 100 times faster than the linear code:

mpz_class fib(int n) { mpz_class p1 {0}; mpz_class p2 {1}; mpz_class result; if ( n == 0 ) return 0; if ( n == 1 ) return 1; for(int i = 1; i <= n/2 ; i ++ ) { p1 += p2; p2 += p1; } return (n % 2) ? p2 : p1; } int main () { std::cout<<fib(1000000)<<std::endl; return 0; }

It is great Joe that you implemented this solution using a templatized power_iter function. I have seen this implemented in a brute force manner of explicit matrix multiplication where my eyes started glazing over trying to keep track of the multiplication steps. At first, I couldn’t see how the ‘p’ variable got updated for all cases of n. Then I ran through the steps by hand and I saw that it works. Another thing I did not get is the X^N = a^n * p invariant for each iteration of the loop. To me it looks like N stays the same while n gets updated each time making that relation not true after the first iteration. Also, the while loop block in the power_iter function needs to be aligned with the return statement. It looks like it is part of the ‘even’ lambda definition when glanced casually. Another suggestion for the ‘even’ function is to replace it with a bit test for oddness instead of the modulus operator and the n= n /2 replaced by right bit shift if you want to squeeze even more performance from it. It was a good post and I am happy to have learnt something new.

“X^N = a^n * p” is the invariant. (That is only new contribution to this discussion)

N and X are constant. p,a and n change but the invariant holds at the beginning and end of every iteration.

Proof by induction: the invariant is true at the start when X=a, N=n and p=1. If n is even we have a<-a^2 and n<-n/2 and the invariant holds. If n is odd, a^n*p becomes a^(n-1)*(a*p) which is equal to a^n*p and the invariant holds. Thus the invariant holds in both cases.

I must confess I spent almost a whole night thinking about it. If you interested in knowing more about invariants, checkout "Discipline of Programming" Edsger Dijkstra or "Science of Programming" David Gries.

As regards your second comment, I was more interested in clarity not performance.

I figured that max performance was not the end goal but I thought I’d still put it out there for folks who may be interested.