Example: Concurrent Haskell Fibonacci

This example involves calculating Fibonacci numbers recursively. I'm doing it the stupid way that leads to an explosion of recursive calls, but that's nice specifically because it takes some time to compute. (Also known as the worst algorithm in the world.)

Since we're going to be writing parallel code, we'll need Haskell's parallel module, and also the time module for the code timing:

cabal install --lib parallel time

I did the following tests with GHC 9.4.8 on a 16-core (24-thread) processor on Linux.

For each test, the code was compiled and executed with these commands:

ghc -O2 -with-rtsopts="-N24" -threaded concurrent.hs
time ./concurrent

You can have a look at the complete code used; I have excerpted key chunks below.

Non-Parallel Version

The first version of the code, fibSer, is naïve and non-parallel. The hope is that we can do the work in a fraction of the time using all of the cores.

Calculating fibSer 47 took 21 seconds without the -with-rtsopts compilation flag.

Adding the -with-rtsopts switch gives GHC permission to use many cores: it parallelized somehow and used some multithreading (showed ~130% processor usage) but still took 21 seconds to do the same calculation. Not a good start.

Parallel Version 1: very parallel

The first parallel version of the function, fibPar1, always calculates both recursive calls in parallel:

fibPar1 n = (term1 `par` term2) `pseq` (term1 + term2)
    where
      term1 = fibPar1 (n-1)
      term2 = fibPar1 (n-2)

This version apparently used many threads (~2000% processor usage) and took 12 seconds to do the same calculations. That's a disappointing speedup: it's actually much longer than the non-parallel version when you count the amount of processor being used. The problem here is too much parallelism: too many operations are sent off to separate threads, making the system spend a lot of time synchronizing everything.

Parallel Version 2: minimally parallel

So, for the next version of the function, fibPar2, we'll try only parallelizing the top level call:

fibPar2 n = (term1 `par` term2) `pseq` (term1 + term2)
    where
      term1 = fibSer (n-1)
      term2 = fibSer (n-2)

Notice that the “recursive” calls there are to fibSer. For fibPar2 47, this code calculates fibSer 46 and fibSer 45 in parallel, but each of those calculations is done single-threaded.

This version took 14 seconds: faster but still close to the original version. This code isn't parallel enough. It isn't creating enough threads to really keep the cores busy.

Parallel Version 3: appropriately parallel

So, for the next version of the function, fibPar3, we will try to be parallel when appropriate:

cutoff = 30
fibPar3 n
  | n<cutoff     = fibSer (n-1) + fibSer (n-2)
  | n>=cutoff    = (term1 `par` term2) `pseq` (term1 + term2)
    where
      term1 = fibPar3 (n-1)
      term2 = fibPar3 (n-2)

This code does the same thing as fibPar1 for values of n larger than cutoff, but falls back to the non-parallel version of the function for smaller n. That is, it is parallel when it makes sense, but not when it doesn't

The value for cutoff was determined by experiment, but values 10–40 all seem to be just about as good.

This version took 1.7 seconds. We finally have a respectable speedup.

Chunk Size Exploration

After the success of fibPar3, it occurred to me that it actually generalized to mimic the other versions of the code. With cutoff==0, the function is equivalent to fibPar1; with cutoff==n, it is fibPar2; and with cutoff>=n+1, it is fibSer.

So, I decided to test out the more generalized version and time how long the calculation takes with each cutoff. I have put the code for these experiments in a separate file to avoid clutter. This was run in the same conditions described above.

Here are the times taken for each cutoff value:

The Lesson

Writing parallel code and using all of your cores can make things faster, but you can't just blindly throw concurrency at some code and expect it to do magic.

Updated Wed June 04 2025, 09:27 by ggbaker.

Simon Fraser University
Engaging the World

CourSys