Sunday, March 16, 2008

A short reminder

Folds and maps are Haskell's default iteration combinators. Mapping is easy enough, but folds can often be rather messy, especially if nested. For example, given a map of sets of some values, we want to write a function to swap keys and values. The function's type will be:

type SetMap k a = Map k (Set a)

invertSetMap :: (Ord a, Ord b) => SetMap a b -> SetMap b a
The resulting map should contain a key for each value of type b, occurring in any set in the original map. The new values (of type Set a) are all those original keys for which the new key occurred in the value set. Intuitively, if the original set was representing arrows from values of type a to values of type b, this function should reverse all arrows. We can easily implement this function using two nested folds.

invertSetMap sm = 
  M.foldWithKey 
      (\k as r -> 
          S.fold (\a r' -> M.insertWith S.union a (S.singleton k) r')
                 r
                 as)
      M.empty
      sm    
That's not pretty at all! I had written quite a bit of this kind of code (and hated it each time), until I finally remembered a fundamental Haskell lesson. Haskell uses lists to simulate iteration and specify other kinds of control flow. In particular list comprehensions are often extremely cheap, since the compiler can automatically remove many or all intermediate lists and generate very efficient code. So let's try again.

invertSetMap sm = M.fromListWith S.union
    [ (a, S.singleton k) | (k, as) <- M.assocs sm
                         , a       <- S.toList as ]
So much more readable! A quick benchmark also shows that it's slightly faster (a few percent for a very big map). Lesson to take home: If your folds get incomprehensible consider list comprehensions.

6 comments:

  1. Here's a Python version, just for kicks:

    >>> def invert_dict( input_dictionary ):
    return dict( [ ( value, key ) for ( key, value ) in input_dictionary.iteritems() ] )
    ...
    >>> invert_dict( { 'a':1, 'b':2 } )
    {1: 'a', 2: 'b'}
    >>> invert_dict( { 'a':1, 'b':1 } )
    {1: 'b'}

    ReplyDelete
  2. And in Perl:

    > perl -MData::Dumper -e '%x=(a=>1,b=>1);%y=reverse %x;print Dumper \%y'

    ReplyDelete
  3. Yes, Python has list comprehensions, too, and I guess they're also efficient. Note, though that my code requires nested "loops", since it stores sets of values rather than just a single value. I don't know what Python programmers use as sets, but if you decide to use lists, the outputs should be:
    {1: ['a'], 2: ['b']}
    and
    {1: ['a','b']}

    ReplyDelete
  4. If you represent sets as maps with unit values, I think the most straightforward Perl solution is:
    my %result;
    for my $outer (keys %x) {
        for my $inner (keys %{$x{$outer}}) {
            $result{$inner}{$outer} = undef;
        }
    }

    Not very elegant, but it gets the job done.

    ReplyDelete
  5. in python:

    def invertSetMap(sm):
    r = {}
    for k, vs in a.iteritems():
    for v in vs:
    r.setdefault(v, set()).add(k)
    return r

    ReplyDelete
  6. # in python:
    def invertSetMap(sm):
        r = {}
        for k, vs in a.iteritems():
            for v in vs:
                r.setdefault(v, set()).add(k)
        return r

    ReplyDelete