Thursday, July 29, 2004

Bad algorithms and caching

The Fibonacci function is traditionally defined recursively:

fib(n) = fib(n-1) + fib(n-2); fib(1) = fib(0) = 1

This would typically be implemented in the venerable C language as:

/* naive recursive implementation, straight from the definition */
unsigned long long fib(int n)
{
        return n == 0 || n == 1 ? 1 : fib(n-1) + fib(n-2);
}
The problem with this implementation is that its performance is atrocious.
$ time ./fib -r 40
165580141
 
real    0m18.263s
user    0m18.262s
sys     0m0.002s
The reason is that each successive number requires a geometrically increasing number of recursive invocations:
 
 fib(0)  = 1 call
 fib(1)  = 1 call
 fib(2)  = 3 calls
 fib(3)  = 5 calls
 fib(4)  = 9 calls
 fib(5)  = 15 calls
 fib(6)  = 25 calls
 fib(7)  = 41 calls
 fib(8)  = 67 calls
 fib(9)  = 109 calls
 fib(10) = 177 calls
 ....
 fib(40) = 331160281 calls
Curiously, the number of calls needed for fib(n) can itself be defined recursively:
 fib_calls(n) = 1 + fib_calls(n-1) + fib_calls(n-2); fib_calls(1) = fib_calls(0) = 1
In fact this works out to fib_calls(n) = 2*fib(n) - 1, so the call count grows exactly as fast as the Fibonacci numbers themselves.
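To check these numbers empirically, the recursive version can be instrumented with a call counter. Here is a minimal sketch; the counted_fib name and the global counter are additions for illustration, not part of the original program:

static unsigned long long calls;

unsigned long long counted_fib(int n)
{
        calls++;        /* tally one per invocation */
        return n == 0 || n == 1 ? 1 : counted_fib(n-1) + counted_fib(n-2);
}

Resetting calls to 0 before each counted_fib(n) reproduces the table above.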
 
The performance problem is addressed by rewriting the function to work iteratively:
unsigned long long iterative_fib(int n)
{
        register unsigned long long n1 = 0, n2 = 1, tmp;
        int i;

        /* walk the sequence forward, keeping only the last two values */
        for(i = 0; i < n; i++)
        {
                tmp = n2;
                n2 = n2 + n1;
                n1 = tmp;
        }
        return n2;
}
This gives a remarkable boost in performance:
$ time ./fib -i 40
165580141
 
real    0m0.003s
user    0m0.001s
sys     0m0.002s
Another way around the performance problem is to add caching (memoization) to the recursive algorithm:
/* memoized results; static storage starts zeroed, so a zero entry
   means "not yet computed" */
static unsigned long long cache[100];

unsigned long long cached_fib(int n)
{
        if(cache[n])
                return cache[n];

        return n == 0 || n == 1 ? 1 :
                (cache[n] = cached_fib(n-1) + cached_fib(n-2));
}
Performance is similar to that of the iterative version:
$ time ./fib -c 40
165580141
 
real    0m0.003s
user    0m0.000s
sys     0m0.003s
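A driver along these lines would reproduce the invocations shown above. The flag handling here is a guess at how the fib program was wired up, not its actual source; it assumes the three functions above live in the same file:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(int argc, char **argv)
{
        int n;

        if(argc != 3)
        {
                fprintf(stderr, "usage: fib -r|-i|-c <n>\n");
                return 1;
        }
        n = atoi(argv[2]);

        if(!strcmp(argv[1], "-r"))
                printf("%llu\n", fib(n));
        else if(!strcmp(argv[1], "-i"))
                printf("%llu\n", iterative_fib(n));
        else
                printf("%llu\n", cached_fib(n));

        return 0;
}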
However, in a program that calls the Fibonacci function numerous times, the cached recursive version would, on average, run faster than the iterative version. The flip side is that caching uses up memory. Past a certain threshold, the system starts swapping to disk and performance degrades rapidly. So you really want to select good algorithms wherever possible and save caching for operations that simply have to be cached, such as those involving disk or network I/O.
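To make the repeated-call scenario concrete, here is a toy workload; the workload name and the iteration counts are arbitrary choices for illustration:

/* simulate a program that asks for Fibonacci numbers over and over */
unsigned long long workload(void)
{
        unsigned long long sum = 0;
        int i, j;

        for(i = 0; i < 1000000; i++)
                for(j = 0; j < 90; j++)
                        sum += cached_fib(j);   /* after the first pass, every call is a cache hit */
        return sum;
}

Once the first pass fills the cache, every subsequent cached_fib(j) is a single array lookup, while iterative_fib(j) would redo j additions on every call.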

Click here to download the complete source
