Macros in Racket — what

December 28, 2013 at 5:06 pm
filed under Coding

Man, just when I thought I’d started to understand macros, I stumble across Racket.

Don’t get me wrong. I’m still enjoying my foray into Racket. But if, for instance, you started out trying to understand Lisp macros via On Lisp, you may be in for some trouble. Yes, as far as I can tell, the usual syntax like

`(foo ,bar ,@baz)

will work. But if you begin to read the section of the Racket Guide about macros, it becomes clear that there’s much, much more to the picture.

This is really just me thinking out loud as I work this out. I’m more sure of some things than others, and I’ll try to make clear which is which, but consider this a blanket caveat.

I think I agree with the folks who’ve suggested that while it’s easy to find trivial or complex Racket macro examples, it’s hard to find examples of moderate complexity. The nice thing is that Fear of Macros exists. I began reading my way through it today.

Macros in Racket seem to be built around a relatively simple idea: make syntax first class.

I am not certain, but I am pretty sure that backtick and friends are not functions. At a minimum they aren’t values you can pass around, right? You can’t map a backtick over a list of atoms, or funcall or apply it. Comma, comma-at (,@), and backtick are all reader shorthand (the R in REPL): the reader expands them into quasiquote, unquote, and unquote-splicing forms before the evaluator (the E in REPL) ever sees them. They exist at a more fundamental level to enable Lisp’s clever syntax.
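A quick way to check the "reader shorthand" idea is to quote the punctuation and compare it with the spelled-out form. This is only a sketch; the names shorthand, longhand, bar, baz, and filled-in are made up for illustration.

```racket
#lang racket

;; Quoting the backtick form shows what the reader actually produced:
;; an ordinary list headed by quasiquote.
(define shorthand '`(foo ,bar ,@baz))
(define longhand  '(quasiquote (foo (unquote bar) (unquote-splicing baz))))

;; Both spellings read as the same datum.
(define same? (equal? shorthand longhand))

;; With bindings in scope, the quasiquoted form fills in its holes.
(define bar 1)
(define baz '(2 3))
(define filled-in `(foo ,bar ,@baz))
```

Here same? should come out true and filled-in should be the list (foo 1 2 3), which suggests the punctuation never reaches the evaluator as anything other than plain lists.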

I don’t think there’s anything wrong with that, necessarily, but it’s interesting to contemplate Racket’s assertion that syntax should be a first-class datatype, beyond just a list. So that’s why you end up with concepts like transformers.

A macro written in this way isn’t mucking around with the reader directly. (Maybe that’s how it’s implemented, though.) Rather, you’re working in a different namespace, or the moral equivalent, at compile time. A macro then becomes a function which receives a syntax object, which the programmer can manipulate via a number of other primitives.
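In the spirit of Fear of Macros, here is a minimal sketch of a macro as "a function which receives a syntax object". The macro names say-hi and count-args are hypothetical, not anything from the Racket libraries.

```racket
#lang racket

;; A transformer is a function from a syntax object to a syntax object.
;; This one ignores its input and always expands to a string literal.
(define-syntax (say-hi stx)
  #'"hi")

;; This one inspects the syntax object it receives, at compile time.
(define-syntax (count-args stx)
  (define n (length (cdr (syntax->list stx)))) ; drop the macro name itself
  (datum->syntax stx n))                       ; expand to a literal number

(define greeting (say-hi))
(define arity (count-args a b c)) ; a, b, c are never evaluated
```

Note that a, b, and c don't need to be defined anywhere: count-args only counts them during expansion and replaces the whole form with the number.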

It gets a little confusing here, though, because a syntax object isn’t just a glorified AST or what have you. Functions like syntax->datum will recurse through a syntax object representing (+ 1 (* 2 3)) and yield just that list, whereas others, like syntax-e, unwrap only one level, yielding a list of syntax objects for +, 1, and (* 2 3).
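The difference between unwrapping all the way and unwrapping one level can be sketched like this (the variable names are just for illustration):

```racket
#lang racket

(define stx #'(+ 1 (* 2 3)))

;; syntax->datum strips the syntax wrapper all the way down.
(define plain (syntax->datum stx))

;; syntax-e removes only the outermost wrapper, leaving a list whose
;; elements are still syntax objects.
(define one-level (syntax-e stx))

;; The nested (* 2 3) is still wrapped after one level of unwrapping.
(define inner (third one-level))
(define inner-is-syntax? (syntax? inner))
(define inner-plain (syntax->datum inner))
```

So plain is the ordinary list (+ 1 (* 2 3)), while one-level is a three-element list of syntax objects that each still carry their own source and scope information.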

AIUI, these syntax objects know something about the scope in which they were introduced. And this is important because if you’re just operating at a textual substitution level, you can run into problems with scope. Is it a bit like closures? I think so, with the caveat that this would be during the compilation phase, before “real” code is executed.
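A small sketch of the scope problem, using a hypothetical my-or macro. It is written with define-syntax-rule rather than an explicit transformer, just to keep it short.

```racket
#lang racket

;; The macro introduces its own temporary called tmp...
(define-syntax-rule (my-or a b)
  (let ([tmp a])
    (if tmp tmp b)))

;; ...and the caller happens to have a tmp as well.
(define answer
  (let ([tmp 5])
    (my-or #f tmp)))
```

With purely textual substitution the expansion would be (let ([tmp #f]) (if tmp tmp tmp)), and the caller's tmp would be shadowed, giving #f. Because each tmp remembers where it was introduced, answer is 5.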

What do I mean by “real” code? Well, I put that in quotes because there’s not much of a difference in reality. I think it has to do with phases. So during the compile phase, you can perform computation. You could write a macro my-+ which performed addition at compile time, right? Or a real example of computation might be to take a declaration like (struct person (id name phone)) and define the constructor person, the predicate person?, accessors like person-id, and so on. This is how any Lisp macro works, so I’m not singling out Racket as special.
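Here is one way the my-+ idea might look. This is a sketch under the assumption that my-+ only accepts literal numbers; anything else is reported as a syntax error.

```racket
#lang racket

;; Hypothetical my-+: adds literal numbers during expansion, so the
;; expanded program contains only the sum.
(define-syntax (my-+ stx)
  (define args (map syntax->datum (cdr (syntax->list stx))))
  (unless (andmap number? args)
    (raise-syntax-error 'my-+ "expected literal numbers" stx))
  (datum->syntax stx (apply + args)))

(define total (my-+ 1 2 3))
```

By the time the program runs, (my-+ 1 2 3) has already been replaced by 6; the addition happened in the transformer.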

So if you consider that compile time is itself just another phase in program evaluation, issues of lexical scope and such take on a new meaning.

Also, yeah, far as I can tell, in Racket there are interesting phases or times in addition to compile time. In fact I think you can arbitrarily nest phases. I wanted to say that this is the moral equivalent of nesting backticks, but I don’t think that’s accurate enough to help. To put it grossly, backticks are a way to require another layer of evaluation. Put another way, you put backticks around something to delay its evaluation by one application of eval, and a comma to remove a “layer” of delay.
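The "layers of delay" picture can be sketched with plain quoting and eval. The namespace ns is needed because eval inside a module does not see the module's own bindings by default; the variable names are made up.

```racket
#lang racket

;; An explicit namespace for eval to work in.
(define ns (make-base-namespace))

;; One layer of quoting: a list, not a number.
(define delayed '(+ 1 2))
(define forced (eval delayed ns))

;; A comma removes one layer of delay inside a backtick.
(define mixed `(1 plus 2 is ,(+ 1 2)))

;; Two layers of quoting need two evals to get back to a value.
(define twice ''(+ 1 2))
(define once-evald (eval twice ns))
(define twice-evald (eval once-evald ns))
```

forced and twice-evald both end up as 3, while once-evald is still the list (+ 1 2), one layer away from being a value.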

Digression: sometimes I try to think about this like when you have nested quotes in prose:

“Alice said ‘drink this,’ so I did,” Bob said.

You have three layers of nesting here: story-level (“Bob said”), Bob-level (“Alice said”), and Alice-level (“drink this”). English isn’t quite as regimented as code, but it does have rules for quotes.

End digression. (You know, as if this whole thing isn’t a digression.)

I believe backticks, et al, are quite distinct from “phases” in Racket. Backticks are (again) a syntactic construct for consumption by the reader. The environment doesn’t come into play because eval doesn’t care where your lists came from; they’re “just” lists. This is advantageous in terms of simplicity, I’d expect.

Conversely, a syntax object might actually know what symbols, et al, it’s referring to. There’re a pile of Racket functions oriented around this concept. You can play with this at the REPL:

racket@> (define foo 1)
racket@> (eval (syntax->datum (syntax (+ foo 2))))
3
racket@> (syntax-e #`(+ foo 2))
'(#<syntax::5978 +> #<syntax::5980 foo> #<syntax::5984 2>)
racket@> (define foo-syntax (second (syntax-e (syntax (+ foo 2)))))
racket@> (identifier? foo-syntax)
#t
racket@> (free-identifier=? (syntax foo) foo-syntax)
#t

The third expression at the prompt is interesting only because syntax-e unwraps a syntax object into its constituent parts, each of which is itself a syntax object. The subsequent expressions extract foo as a syntax object and then compare it to another syntax object representing the same foo.

why why why

All right, I went through all that to flesh out my own mental model for why you might want a richer datatype to represent syntax, rather than “just” lists with reader directives.

The way this ends up working in Racket is that a macro receives a syntax object rather than a list of its arguments. The macro is free to take that object apart (e.g. using syntax-e or syntax->datum). The transformer returns a new syntax object, which is expanded in turn and eventually evaluated.
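A sketch of that round trip: take the incoming syntax object apart into plain data, rearrange it, and wrap the result back up. The backwards macro is hypothetical, loosely modeled on the kind of example Fear of Macros uses.

```racket
#lang racket

;; Hypothetical macro: (backwards "a" "b" string-append) expands to
;; (string-append "b" "a").
(define-syntax (backwards stx)
  (define args (cdr (syntax->datum stx))) ; plain data, macro name dropped
  (datum->syntax stx (reverse args)))     ; re-wrap using stx's context

(define joined (backwards "a" "b" string-append))
```

joined comes out as "ba". One caveat: going through syntax->datum throws away the scope information on the arguments and datum->syntax borrows it back from stx, which is fine for a toy but is exactly the sort of shortcut the richer syntax object is meant to let you avoid.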

With that in mind, this surprised me a little bit:

racket@> (eval #'(+ foo 1))
2
racket@> (eval (syntax->datum #'(+ foo 1)))
2

eval-ing a syntax object is equivalent to eval-ing a datum. Okay.

I think it’s only functionally equivalent, because of how I set up the example. Specifically, the first example is “portable” to other scopes, and it’ll be able to resolve foo whether or not foo exists in the current context. The second example will evaluate foo in the current context, possibly failing.

To expand on that: say foo didn’t exist at runtime. Or it evaluates to something different at runtime vs compile-time. In the former case, the sharp-quoted version would still work (assuming that it was a proper macro) whereas the datum version would error out. In the latter case, the computations would each evaluate to something different.

Interestingly, in the latter case, that’s actually probably what you want! That’s because they’re distinct times, and you wouldn’t want a run-time binding to trample a compile-time binding, right? Hmm.
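To poke at the idea that compile time and run time keep separate bindings, here is a small sketch. The names x, which-x, and both are made up, and begin-for-syntax is used to define a compile-time x alongside the run-time one.

```racket
#lang racket

;; A run-time binding for x...
(define x 'run-time)

;; ...and a separate compile-time binding with the same name.
(begin-for-syntax
  (define x 'compile-time))

;; The transformer body runs at compile time, so it sees that x.
(define-syntax (which-x stx)
  (datum->syntax stx `(quote ,x)))

(define both (list x (which-x)))
```

both should be the list (run-time compile-time): neither binding tramples the other, because they live in different phases.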


In the next post I’ll think a little more about the interesting pieces this enables, how it changes how I look at macros in Racket, and maybe I’ll even try to understand them.

I don’t think, in aggregate, that it’s really all that different in terms of manipulating lists. The extra pieces come from the syntax object, and what happens when your functions and constructs know something about what they’re manipulating. You can have richer affordances and such. So that’ll be interesting to contemplate.
