Monday, June 20, 2011

A Brief Intro to Writing Macros in Racket

I plan to learn, among other things, more about lisp-like macros this Summer. I had a brief introduction to the Common Lisp macro system years ago, but was frightened away when I couldn't understand my code after taking a few days' break. I'm revisiting macros using Racket. While there are a handful of great references on Common Lisp macros (including Paul Graham's On Lisp, which I've yet to read!), I had trouble finding a basic introduction to Racket's more complicated macro system. The team at Racket/PLT Scheme have been researching macro systems for a few years now, and they've implemented a pretty sophisticated system: syntax objects are first class citizens, et cetera. Quite honestly, I don't understand any of it yet. But I thought it'd help to write down my thoughts and work thus far. Maybe it'll help someone else learn something, too!

Step 1: Getting to terms with Syntax Objects and Transformer Bindings


Lisp macros allow you to extend the syntax of the lisp language itself. Racket implements these macros by creating a syntax type to model the syntax you wish to implement. Essentially, a syntax object contains an unevaluated s-expression (lisp expression) along with some data about its scope. A syntax object is a first-class citizen, meaning it can be passed around to and fro functions like any other lisp object. You can extract data (e.g. particular symbols) from these syntax objects using pattern matching techniques, and then use that data to build a new syntax object. Id est, you can write a function that takes a syntax object and returns a new, transformed syntax object. To implement a macro, you describe the syntax and write a function to transform it. When Racket sees your new syntax, it transforms it as you described and then evaluates the resulting syntax object as an s-expression within its own scope. (At least, I think that's what happens!)

Pretend for a second that we wish to implement a while clause in Racket. We want to be able to write something like:
(while (< x 5)
    (begin (display x)
        (set! x (+ x 1))))

And have it display
1234

What we need is something to take the above s-expression and transform it into the following s-expression:
(define (loop)
    (if (< x 5)
        (begin
            (display x)
            (set! x (+ 1 x))
            (loop)
        )
        (display "")))
(loop)

Let's first think about how to turn the s-expression (while test body) into a syntax object. This is done with the syntax function, e.g.:
(syntax (while text body))

Racket provides a shortcut to "quote" syntax, id est, the above is equivalent to:
#`(while text body)

Let's go ahead and define a syntax object to use later in our examples:
(define stx #`(while (< x 5)
    (begin (display x)
        (set! x (+ x 1))
    )
))

But how do we transform this syntax object into another? As stated above, we'll use a sort of pattern matching to extract data from the syntax object, and plug this data into a new syntax template. Racket provides a function to implement this called syntax-case. syntax-case is a function that takes a syntax object, a list of "keywords", and a list of pairs of patterns and new syntax objects. It tries to match the syntax object you provided to one of the patterns listed, and returns the new corresponding syntax object. Those patterns can be tricky, and are best explained with examples, so let's try one: First note that our keyword is "while", let's try the code:
(syntax-case stx
    (while)
    [(while test body) #`body])

The (while test body) says that we're looking for an s-expression that starts with the keyword "while", and follows with two other s-expressions. It binds the syntax object corresponding to those s-expressions to the symbols "test" and "body", respectively. The new syntax object #`body is simply the body sample. Thus the above expression returns a syntax object for only the last s-expression in our while clause.

If we didn't list "while" in our keyword list, the expression would have still returned the same result. But instead of only matching while-clauses, the syntax-case expression would match anything, and simply store the first expression in a symbol named while.

Let's implement a proper transformation now. As we stated above, we want our while expression to transform into a recursive function that calls itself so long as test is true. Our syntax-case is thus:
(syntax-case stx (while)
    [(while test body)
    #`(begin (define (loop)
            (if test
                (begin body
                    (loop))
                (display "")))
        (loop))])

As you can [hopefully] see if you run this code through a REPL, we get a syntax object containing something similar to our desired s-expression. Now let's tell Racket to perform this syntax transformation. To do this, we'll use define-syntax. This statement works a lot like our normal define, except that it ties an identifier to a syntax transformation. Here's the code:
(define-syntax (while stx-obj)
    (syntax-case stx-obj (while)
        [(while test body)
        #`(begin (define (loop)
                (if test
                    (begin body
                        (loop))
                    (display "")))
            (loop))]))

Now the code
(let ((x 1))
    (while (< x 5)
        (begin (display x)
            (set! x (+ 1 x)))))

Works as expected. stx-obj captures the total expression, including the while clause that we defined. Since Racket detects this by itself, we actually don't need to define the while keyword. We can replace it with an empty keyword list and a "_" (underscore) pattern (this matches any syntax symbol, but doesn't bind the symbol to a new expression) like this:
(define-syntax (while stx-obj)
    (syntax-case stx-obj ()
        [(_ test body)
        #`(begin (define (loop)
                (if test
                    (begin body
                        (loop))
                    (display "")))
            (loop))]))


Rolling Our Own Object System


Now that we've got the basics of macros down, let's try a slightly more ambitious project: we're going to use macros to implement a basic message-passing object system. First, consider the following code:
(define (animal location health speed)
    (define (move new-loc)
        (set! location new-loc)
        (set! health (- health speed)))
    (define (dispatch m)
        (cond [(eq? m 'move) move]
            [(eq? m 'location) location]
            [(eq? m 'health) health]
            [(eq? m 'speed) speed]
            (else (error "unknown message"))))
    dispatch)

(define my-animal (animal 1 100 5))
(my-animal 'health)

     >> 100

((my-animal `move) 2)
(my-animal 'health)

     >> 95

This code implements a message-passing design pattern for my awkward model of an "animal" (perhaps for a game?). We can create new animal objects with the animal function that we defined, and then pass messages to it, emulating object-oriented programming.

Instead of writing this design pattern over and over again, we're going to create a macro to implement it for us. We want to write
(new-object Animal
    (location health speed)
    [(move (λ (new-loc)
        (set! location new-loc)
        (set! health (- health speed))))])

The pattern we wish to match is:
(_ name (value ...) [(func-name lambda-body) ...])

The ellipses (...) after "value" and "(func-name lambda-body)" denote that we expect a list of symbols, rather than just a single symbol. When we write the syntax template, we'll have to use the same ellipses to repeat the corresponding expression across the list. To transform the above code into our implementation, the code is:
#'(define (name value ...)
    (define func-name lambda-body) ...
    (define (dispatch m)
        (cond [(eq? m 'value) value] ...
            [(eq? m 'func-name) func-name] ...
            [else (error "unknown message")]))
    dispatch)

Combining all this together, we can write the following syntax transformation:
(define-syntax (new-object stx)
    (syntax-case stx ()
        [(_ name (value ...) [(func-name lambda-body) ...])
            #'(define (name value ...)
                (define func-name lambda-body) ...
                (define (dispatch m)
                    (cond [(eq? m 'value) value] ...
                        [(eq? m 'func-name) func-name] ...
                        [else (error "unknown message")]))
                dispatch)]))

This macro works fine for the above. But hold on! What if we want to implement setters for new objects? E.g.
((my-animal `set-health!) 500)

Implementing this bit is tricky...

Manipulating Syntax


...to do this, we'll need to take the syntax object for the symbol "health" and transform it into a new syntax object that contains the symbol "set-health!". We'll need to do this for the full list of members. But how can we? Let's take a break from the macros and play around with syntax objects for a bit. First, define the following:
(define stx #'(health location speed))
(syntax->list stx)

syntax->list does what it sounds like, takes a syntax object of a list and returns a list of syntax objects. Our idea is to continue transformations on that list until it becomes the following syntax object list:
#'(set-health! set-location! set-speed!)

Then we'll incorporate it into our macro definition. Racket provides another useful function, syntax->datum which will transform each individual syntax object of a symbol into the symbol itself. So our next step is
(map syntax->datum (syntax->list stx))

Now we have a list of symbols. We can transform this normally, e.g.
(map (λ (item)
        (format "set-~a!" (syntax->datum item)))
    (syntax->list stx))

We're getting closer, now we need to transform this list of strings back into a list of syntax objects of symbols. This is pretty straightforward, we just nest datum->syntax and string->symbol. Note the following about datum->syntax: A syntax object includes information about the scope of the s-expression it contains. We want our new syntax object to be within the same scope as the original syntax object. Luckily, datum->syntax will do exactly that. It takes both a syntax object and a symbol and returns a new syntax object containing that symbol within the scope of the first syntax object. Hence our code is
(map (λ (item)
        (datum->syntax stx
                (string->symbol (format "set-~a!" (syntax->datum item)))))
    (syntax->list stx))

That does exactly what we want it to. Now, let's re-visit the syntax template from our macro and see how to implement this idea for the object setters. Recall the macro:
(define-syntax (new-object stx)
    (syntax-case stx ()
        [(_ name (value ...) [(func-name lambda-body) ...])
            #'(define (name value ...)
                (define func-name lambda-body) ...
                (define (dispatch m)
                    (cond [(eq? m 'value) value] ...
                        [(eq? m 'func-name) func-name] ...
                        [else (error "unknown message")]))
                dispatch)]))

We need to do an intermediate syntax transformation, to convert the "(values ...)" syntax object to the one we created earlier. We can do this with with-syntax. with-syntax takes a list of pairs of patterns and syntax objects and a template syntax object. It binds the bits of the syntax object to the symbols in the pattern, likesyntax-case, and transforms the template into a new syntax object. We're going to match our transformed list to a simple syntax pattern like thus
[(set-value ...)
    (map (λ (item)
        (datum->syntax stx
            (string->symbol (format "set-~a!" (syntax->datum item)))))
        (syntax->list #`(value ...)))]

Note that syntax->list is now acting on #`(value ...), id est, the syntax object of lists of symbols for object member names. Also note that the syntax object used in datum->syntax is the stx object from our macro definition. Our new template, with the new setters, looks like:
#'(define (name value ...)
    (define (set-value new-val)
        (set! value new-val)) ...
    (define func-name lambda-body) ...
    (define (dispatch m)
        (cond [(eq? m 'value) value] ...
            [(eq? m 'func-name) func-name] ...
            [(eq? m 'set-value) set-value] ...
            [else (error "unknown message")]))
    dispatch)

Now let's wrap it all up in our macro definition:
(define-syntax (new-object stx)
    (syntax-case stx ()
        [(_ name (value ...) [(func-name lambda-body) ...])
        (with-syntax ([(set-value ...)
                (map (λ (item)
                    (datum->syntax stx
                        (string->symbol (format "set-~a!" (syntax->datum item)))))
                    (syntax->list #`(value ...)))])
        #'(define (name value ...)
            (define (set-value new-val)
                (set! value new-val)) ...
            (define func-name lambda-body) ...
            (define (dispatch m)
                (cond [(eq? m 'value) value] ...
                    [(eq? m 'func-name) func-name] ...
                    [(eq? m 'set-value) set-value] ...
                    [else (error "unknown message")]))
                dispatch))]))

There you have it, folks. We've implemented a basic object system in less than 20 lines of code.

Comments, questions, corrections all welcome.

5 comments:

  1. Are you familiar with this?

    http://hipster.home.xs4all.nl/lib/scheme/gauche/define-syntax-primer.txt

    ReplyDelete
  2. no! I'm not. I'm just getting into this whole macro thing. That's for the reference!

    ReplyDelete
  3. Cool!

    It would be great to post all of your code together, perhaps as a github gist.

    Also, you might check out the `format-id' function from the `racket/syntax' library, which replaces a bunch of the symbol & string manipulation you're doing.

    ReplyDelete
  4. @Sam thanks for the tip about "format-id". Good stuff, and it would certainly simplify the code and the above discussion!

    I'll try to get around to putting things on github. no promises. Perhaps for my next post.

    ReplyDelete
  5. how can i get books,materials or video tutorials on racket macros

    ReplyDelete