Guile macros - avoiding code duplication when interfacing to notmuch

By — Dr. Óscar Nájera
Dec 4, 2020 | 9 min read | Guile Notmuch

When working on the interface to notmuch I stumbled upon a case where iterating over messages and iterating over filenames produced the same code structure, with only the functions being called changing. I hate to see code duplication, so I went exploring for solutions available in Guile to solve this issue.

In this series of posts I record how to use Guile as a scripting language and solve various tasks related to email work.

I always hear that macros are the killer feature of Lisp languages: that in Lisp, macros are code-generating functions which do all kinds of magic tricks. This case of code duplication sounded like the right moment to test that claim.

I must say that this is the first time I have dealt with a language feature of this kind. It was quite a complicated topic to get my head around and make work. I noticed that Scheme’s much-acclaimed hygienic macros made my simple goals harder to achieve. I’m in no position to judge them. My software development experience has shown me that a language’s quality is not measured by how easy it is to do big things, but rather by how hard it makes it to do the wrong things. Maybe the solution I propose next is just a bad way of doing things, and that is why it was hard to get done. I’ll revisit this when I have more experience; after all, macros are the killer feature.

Looking for the common code

In the last post I showed how to iterate over messages; here is my solution for iterating over filenames. Notice the similarities: the structure is the same and the functions called are named similarly, because they do the same thing and differ only in the type they act on, so messages becomes filenames. A fundamental difference is that here there is no function to destroy the filename string pointer. I really hope that one gets garbage collected, because I found no reference on freeing that pointer. It also means that, when comparing the iterators, the messages version has an extra step.

(define (filenames-iter message proc)
  (let ((obj (notmuch_message_get_filenames message)))
    (let loop ((item (notmuch_filenames_get obj))
               (acc (quote ())))
      (if (= 0 (notmuch_filenames_valid obj))
          (begin (notmuch_filenames_destroy obj)
                 acc)
          (let ((result (proc item)))
            ;; notice that here there is no call to clear
            ;; the item memory with a function notmuch_filename_destroy
            (notmuch_filenames_move_to_next obj)
            (loop (notmuch_filenames_get obj)
                  (cons result acc)))))))

The first let, where I define the main object I’m iterating over, is actually an afterthought. I used to pass that object as a function argument. Yet when I wrote the macro I realized that it would paste the expression that instantiates the object at every place the object is used, so the object was created again and again; I ran out of memory and my computer crashed. Thus I keep that structure in these early examples, so that you can later recognize it in the macro. Macros are a new programming discipline for me, and I might need this reminder later in life.
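The repeated-instantiation problem described above can be reproduced in a few lines. This is a minimal sketch, not the notmuch code: make-resource, use-twice-naive and use-twice are hypothetical names, and a counter stands in for the expensive allocation.

```scheme
;; Hypothetical stand-in for an expensive constructor; counts its calls.
(define calls 0)
(define (make-resource)
  (set! calls (+ calls 1))
  (list 'resource calls))

;; Naive macro: the argument expression is pasted at every use,
;; so it is evaluated twice.
(define-syntax use-twice-naive
  (syntax-rules ()
    ((_ obj) (list obj obj))))

;; Safer macro: evaluate the expression once, bind it, reuse the binding.
(define-syntax use-twice
  (syntax-rules ()
    ((_ obj) (let ((o obj)) (list o o)))))

(use-twice-naive (make-resource))
(define calls-naive calls)   ; make-resource ran twice

(set! calls 0)
(use-twice (make-resource))
(define calls-bound calls)   ; make-resource ran once
```

Binding the argument with let inside the template is what keeps a macro from re-running a costly expression on every reference.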

From string to symbol

I know how to edit strings, that is a common task in software. What I learned for the first time with Guile was how to transform a string into a symbol:

(string->symbol "notmuch_filenames_get")

It couldn’t be simpler: a dedicated function does the job. It is important to realize that the symbol can’t do anything on its own. You can’t put it as the first element of a list and expect it to be called, that is, expect Guile to find the function you named and do the job you want. In my first iteration of working with these code blocks I didn’t use macros, only functions, and used eval to get the procedure I wanted, like the following.

((eval
  (string->symbol "notmuch_filenames_get")
  (interaction-environment))
 filenames-iterator)

I had to evaluate the symbol I had just created in the current interaction environment for it to become the procedure I wanted to call, and only then could I apply it to filenames-iterator to get the filename it currently points to.
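The same lookup can be tried without libnotmuch. The sketch below assumes nothing beyond core Guile: procedure-by-name is a hypothetical helper, and string-length stands in for the notmuch function.

```scheme
;; Hypothetical helper: assemble a name from string parts and resolve it
;; to whatever is bound under that name in the interaction environment.
(define (procedure-by-name . parts)
  (eval (string->symbol (apply string-append parts))
        (interaction-environment)))

;; "string-" + "length" resolves to the built-in string-length.
(define len (procedure-by-name "string-" "length"))
(define n (len "notmuch"))   ; same as (string-length "notmuch")
```

The lookup happens at runtime on every call to procedure-by-name, which is one reason to move this work into a macro, where the name is resolved once at expansion time.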

Macros

Thread-first macro

The most common macros I read about are when, unless, or, and and. That is fine, but they show too little of the power of a macro to rewrite code. They are too small a change to your writing conventions, and once you have read them they don’t make much difference.

The threading macros were more striking to me. When writing Lisp code I realized that it is nice to have a function, test that it works, then wrap it with another one and so on, composing them as the way to process the data. However, when I came back and read that code, I had to read back and forth many times, from the root to its leaves and back. That does have some benefits, because the extra time you take forces you to engage a bit more with the context of the code you are reading. However, many times I was just looking around for where the functions start and end, and I was more confused than enlightened.

Threading macros let you convert nested function calls into a flat list of function calls and thereby improve readability. The next macro is the thread-first macro ->. It places each element as the first argument of the next element (which is a function call). Try to follow that transformation in the next code block: value gets inserted as the first argument of the function call defined by the next element, here named fun. If fun takes more than one argument, I write a list, and the macro injects value as its first argument. If fun takes only one argument, I write the symbol directly, and the macro takes care of wrapping fun and value in parentheses. The macro then calls itself recursively; that is how the nesting is recreated.

(define-syntax ->
  (syntax-rules ()
    ((_ value) value)
    ((_ value (fun . other-args) next ...) (-> (fun value . other-args) next ...))
    ((_ value fun next ...) (-> (fun value) next ...))))
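To see the rewriting at work, here is a small sketch that compares a nested expression with its threaded form. The macro is repeated so the block runs on its own; the arithmetic is only an illustration.

```scheme
(define-syntax ->
  (syntax-rules ()
    ((_ value) value)
    ((_ value (fun . other-args) next ...) (-> (fun value . other-args) next ...))
    ((_ value fun next ...) (-> (fun value) next ...))))

;; Nested form, read inside-out.
(define nested
  (string-append (number->string (- 10 3)) "!"))

;; Threaded form, read top to bottom. It expands step by step into
;; exactly the nested expression above.
(define threaded
  (-> 10
      (- 3)                  ; (- 10 3)
      number->string         ; (number->string (- 10 3))
      (string-append "!")))  ; (string-append (number->string (- 10 3)) "!")
```

Both definitions evaluate to the string "7!".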

Here is an example of how I would use this macro. The singular function takes a syntax element (kind of a symbol with information about its environment of evaluation) and turns it into a datum, which is the symbol. That symbol becomes a string, from which I drop the last character, which should be an s, as that is the regular plural of nouns. At least that works well enough for filenames and messages.

(define (singular stx)
  (-> stx
      syntax->datum
      symbol->string
      (string-drop-right 1)))
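A quick check of what singular returns, as a sketch. The function is written here in nested form so the block stands alone without the -> macro; it is meant to be equivalent to the threaded version.

```scheme
;; Nested equivalent of the threaded definition: syntax -> symbol -> string,
;; then drop the final character.
(define (singular stx)
  (string-drop-right (symbol->string (syntax->datum stx)) 1))

(define one-message  (singular #'messages))
(define one-filename (singular #'filenames))

;; The rule is deliberately naive: it drops one character, whatever it is.
(define one-query (singular #'queries))
```

Here one-message is "message" and one-filename is "filename", while one-query comes out as "querie", which shows the limit of the drop-the-s rule.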

My notmuch iterator macro

I went over the Guile documentation on macros many times and I wasn’t able to achieve what I wanted. I was missing more examples and explanations. Because the macros are hygienic, I couldn’t just have my macro write symbols directly. I always needed to include the lexical context.

I started exploring NYACC’s code (after all, it is a tool that generates Scheme code) and found enough inspiration for what I wanted. The next function, nm-symbols, takes the lexical context from the identifier tmpl-id, plus a variable number of arguments that are concatenated as strings. The return value is a syntax object naming the function I want, with the lexical context of tmpl-id. This is how I compose symbols with the names I want so that they resolve to the functions provided by libnotmuch. I needed to wrap it in eval-when for things to work inside the macro presented later on.

(eval-when (expand load eval)
  (define (nm-symbols tmpl-id . args)
    (define (stx->str stx)
      (symbol->string (syntax->datum stx)))
    (datum->syntax
     tmpl-id
     (string->symbol
      (apply string-append
             (map (lambda (ss) (if (string? ss) ss (stx->str ss))) args)))))

  ;; the function singular defined before needs to be here
  )
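The helper can be inspected at the top level, outside any macro. In this sketch it is defined with a plain define, on the assumption that eval-when only matters when the helper is called during macro expansion; getter and getter-name are hypothetical names.

```scheme
(define (nm-symbols tmpl-id . args)
  (define (stx->str stx)
    (symbol->string (syntax->datum stx)))
  (datum->syntax
   tmpl-id
   (string->symbol
    (apply string-append
           (map (lambda (ss) (if (string? ss) ss (stx->str ss))) args)))))

;; Strings pass through as-is; syntax objects contribute their symbol name.
(define getter (nm-symbols #'filenames "notmuch_" #'filenames "_get"))

;; Strip the lexical context again to look at the bare symbol.
(define getter-name (syntax->datum getter))
```

getter-name is the symbol notmuch_filenames_get, while getter itself is a syntax object that carries the lexical context of the identifier filenames.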

That is all the setup I need. I can now write my iterator macro. Notice how I construct the symbols I want with nm-symbols: it uses #'type to get the lexical context, and type is the symbol I use to refer to the notmuch type. The #' is reader shorthand for the syntax form; it turns the symbol into a syntax object, and only from there can I extract the lexical context.

After defining the syntax elements, I write my iterator directly using them. I would have loved to find a way to leave out the destructor call (which frees memory) when its symbol is not defined, but for now I check at runtime whether it is defined and call it only if it is.

(define-syntax nm-iter
  (lambda (x)
    (syntax-case x ()
      ((_ type query proc)
       (with-syntax ((valid? (nm-symbols #'type "notmuch_" #'type "_valid"))
                     (destroy (nm-symbols #'type "notmuch_" #'type "_destroy"))
                     (get (nm-symbols #'type "notmuch_" #'type "_get"))
                     (next (nm-symbols #'type "notmuch_" #'type "_move_to_next"))
                     (item-destroy (nm-symbols #'type "notmuch_" (singular #'type) "_destroy")))
         #'(let ((obj query))
             (let loop ((item (get obj))
                        (acc '()))
               (if (= 0 (valid? obj))
                   (begin
                     (destroy obj)
                     acc)
                   (let ((result (proc item)))
                     (when (defined? (quote item-destroy))
                       (item-destroy item))
                     (next obj)
                     (loop (get obj) (cons result acc)))))))))))

The way I use this macro for messages is:

(nm-iter messages
         (result-messages query)
         (get-headers "from" "subject"))

and for filenames is:

(nm-iter filenames
         (notmuch_message_get_filenames message*)
         pointer->string)
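The macro above needs libnotmuch to run. The following is a simplified sketch of the same technique that runs in plain Guile, under loud assumptions: the fake_numbers_* procedures are hypothetical in-memory stand-ins for a C iterator API, fake-iter is a hypothetical cut-down variant of nm-iter without the per-item destructor, and the name-building helper lives inside the transformer, so no eval-when is needed.

```scheme
;; Hypothetical stand-ins for a C iterator API: the "iterator" is a
;; one-element list whose car holds the items still to be visited.
(define (fake_numbers_valid it) (if (null? (car it)) 0 1))
(define (fake_numbers_get it) (caar it))
(define (fake_numbers_move_to_next it) (set-car! it (cdar it)))
(define (fake_numbers_destroy it) (set-car! it '()))

(define-syntax fake-iter
  (lambda (x)
    ;; Build the identifier fake_<type><suffix> in the lexical context of type.
    (define (name type suffix)
      (datum->syntax
       type
       (string->symbol
        (string-append "fake_" (symbol->string (syntax->datum type)) suffix))))
    (syntax-case x ()
      ((_ type init proc)
       (with-syntax ((valid? (name #'type "_valid"))
                     (get (name #'type "_get"))
                     (next (name #'type "_move_to_next"))
                     (destroy (name #'type "_destroy")))
         #'(let ((obj init))              ; evaluate init exactly once
             (let loop ((acc '()))
               (if (= 0 (valid? obj))
                   (begin (destroy obj) acc)
                   (let ((result (proc (get obj))))
                     (next obj)
                     (loop (cons result acc)))))))))))

;; The type name numbers selects the fake_numbers_* procedures.
(define squares
  (fake-iter numbers (list (list 1 2 3)) (lambda (n) (* n n))))
```

Because each result is consed onto the accumulator, squares comes back in reverse order of iteration, as (9 4 1); the same holds for the accumulator in nm-iter.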

Summary

This solution took me a lot of time to research. It was hard and painful to get it to work. At the time of writing, I’m still not sure I understand all the elements that I ended up using, but it was a fun experience to get it to work. During the process of learning and trying, I ran out of file handles and also completely filled up my memory and crashed my system. All the good experiences that come from breaking things while learning.

The end result, from a software perspective, is worse than where I started. The solution takes more lines of code. The level of nesting of the functions and macros used is deeper. The overall readability, and thus maintainability, dropped. I’m also including implementation details in the macro (the part where I check for the symbol definition), which doesn’t feel right.

The good thing is that I managed to get something new working. Despite the extra new code, there is less code duplication, and should I need an iterator for notmuch’s threads I would get it for free instead of triplicating code.
