Sunday, March 16, 2008

A short reminder

Folds and maps are Haskell's default iteration combinators. Mapping is easy enough, but folds can often be rather messy, especially if nested. For example, given a map of sets of some values, we want to write a function to swap keys and values. The function's type will be:

type SetMap k a = Map k (Set a)

invertSetMap :: (Ord a, Ord b) => SetMap a b -> SetMap b a
The resulting map should contain a key for each value of type b, occurring in any set in the original map. The new values (of type Set a) are all those original keys for which the new key occurred in the value set. Intuitively, if the original set was representing arrows from values of type a to values of type b, this function should reverse all arrows. We can easily implement this function using two nested folds.

invertSetMap sm = 
  M.foldWithKey 
      (\k as r -> 
          S.fold (\a r' -> M.insertWith S.union a (S.singleton k) r')
                 r
                 as)
      M.empty
      sm    
That's not pretty at all! I had written quite a bit of this kind of code (and hated it each time), until I finally remembered a fundamental Haskell lesson. Haskell uses lists to simulate iteration and specify other kinds of control flow. In particular list comprehensions are often extremely cheap, since the compiler can automatically remove many or all intermediate lists and generate very efficient code. So let's try again.

invertSetMap sm = M.fromListWith S.union
    [ (a, S.singleton k) | (k, as) <- M.assocs sm
                         , a       <- S.toList as ]
So much more readable! A quick benchmark also shows that it's slightly faster (a few percent for a very big map). Lesson to take home: If your folds get incomprehensible consider list comprehensions.