The Case of the Missing Maybe

September 3, 2012 at 4:14 pm
filed under Coding
Tagged , ,

I spent a big chunk of yesterday and the day before hacking away at 4Clojure. I’ve got some initial impressions, which are of course experiential and thus likely to evolve with time.

Mostly this centers around how to handle collections and recursion.

Empty collections and nil

This is what’s vexing me now: detecting a sequence versus an empty list versus nil versus a scalar when dealing with a collection which may contain either nested sequences or scalars.

There are two functions that give you the rough equivalent of (drop 1 coll): rest and next. rest never returns nil, even on an empty sequence. next, conversely, will return nil on an empty sequence. first is similar. To wit:

user=> (map rest [[1] [] [1 2] nil])
(() () (2) ())
user=> (map next [[1] [] [1 2] nil])
(nil nil (2) nil)
user=> (map first [[1] [] [1 2] nil])
(1 nil 1 nil)

How do you reason about each of these? Well, it doesn’t lend itself to a nice if/else.

(if (nil? x) 
  (handlenil x)
  (if (coll? x) ; not nil. collection, empty collection, or scalar?
    (if (empty? x) ; must be sub-clause of coll? since (coll? 1) is an error.
      (handleempty x)
      (handlecoll x)
    (handlescalar x))))

The trouble for me was that this wasn’t just theoretical. I got a couple of infinite recursions because I was infinitely appending nil or an empty list! After staring at example solutions to the problem, though, and many revisions to this blog post, I think I’ve figured something out.

The answer?

The best I’ve got follows.

Look at how first behaves. It will give you nil, but only if you call it on an empty list. (Or unless nil is in the list. Heh.) We can eliminate that prospect by treating the empty list as the degenerate case, using empty? to bail out.

That means we ought to use rest to recurse instead of next. rest is in fact blessedly simple to work with on account of the fact that you always get a list. Having established that, at least, we only give ourselves lists (never nil), we can treat (first coll) safely.

One difference between this and the exhaustive checking above is that we’re happy to barf on anything for which empty? is invalid input. In On Lisp, Paul Graham says something about how the line between API and the language itself are so blurred that the source of errors is obvious, and maybe he’s right. For now, frankly, I cringe.

Closing thoughts: the inevitable Haskell comparison

In this very narrow case, Haskell is simpler because it narrows the problem space:

flatn :: [[a]] -> [a]
flatn []     = []
flatn (x:xs) = x ++ (flatn xs)

This example covers only one level of nesting, but that’s immaterial in practice on account of pattern matching. Given a nested type like this:

data MyData = Foo a (MyData a) | Bar a (MyData a) | Empty

You can pattern match on any of these easily enough.

This is a trivial comparison, obviously; you must look at the whole language in order to weigh the trade-offs. I look forward to learning enough Clojure to appreciate macros, and I expect its approach to IO and randomness are far less restrictive than Haskell’s.

Nevertheless, this is one area where I experienced a fair amount of frustration and substantial cognitive dissonance.

%d bloggers like this: