While working on the interface to notmuch, I ran into a case where
iterating over messages and iterating over filenames produced the same code
structure, with only the functions being called changing. I hate to see code
duplication, so I went exploring for what Guile offers to solve this
issue.
In this series of posts I record how to use Guile as a scripting language
and solve various tasks related to email work.
I always hear that macros are the killer feature of Lisp languages: that,
in Lisp, macros are code-generating functions which do all kinds of magic
tricks. This case of code duplication sounded like the right moment to test
that claim.
I must say that this is the first time I have dealt with a language feature
of this kind. It was quite a complicated topic to get my head around and
make work. I noticed that Scheme’s much-acclaimed hygienic macros made my
simple goals harder to achieve. I’m in no position to judge them. My
software development experience has shown me that a language’s quality is
measured not by how easy it makes big things, but rather by how hard it
makes it for you to do the wrong things. Maybe the solution I propose next
is just a bad way of doing things, and that is why it was hard to get done.
I’ll revisit this when I have more experience; after all, macros are the
killer feature.
Looking for the common code
In the last post I showed how to iterate over messages; here is my solution
for iterating over filenames. Notice the similarities: the structure is the
same and the functions called are named similarly, because they do the same
thing and differ only in the type they act on, so messages becomes
filenames. A fundamental difference is that here there is no function to
destroy the filename string pointer. I really hope that one gets garbage
collected, because I found no reference on freeing that pointer. It also
means that, when comparing the iterators, the messages version has an extra
step.

(define (filenames-iter message proc)
  (let ((obj (notmuch_message_get_filenames message)))
    (let loop ((item (notmuch_filenames_get obj))
               (acc (quote ())))
      (if (= 0 (notmuch_filenames_valid obj))
          (begin (notmuch_filenames_destroy obj)
                 acc)
          (let ((result (proc item)))
            ;; notice that here there is no call to clear
            ;; the item memory with a function notmuch_filename_destroy
            (notmuch_filenames_move_to_next obj)
            (loop (notmuch_filenames_get obj)
                  (cons result acc)))))))

The first let, where I define the main object I’m iterating over, is
actually an afterthought. I used to pass that object as a function
argument. Yet when I wrote the macro I realized that the macro would paste
the expression that creates the object into every place the argument was
used, instantiating it again each time; I ran out of memory and my computer
crashed. Thus I keep that structure in these early examples, so that you can
later recognize it in the macro. Macros are a new programming discipline for
me; I might need this reminder later in life.
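
The repeated instantiation is easy to reproduce in isolation. The following is a minimal sketch, not the notmuch code: make-object, twice-naive and twice-bound are hypothetical names invented for the illustration. It counts how often the argument expression runs when a macro uses its argument twice, with and without a let around it.

```scheme
;; Hypothetical stand-in for an expensive constructor; counts its calls.
(define calls 0)
(define (make-object)
  (set! calls (+ calls 1))
  (list 'object calls))

;; Naive macro: the argument expression is copied to every place
;; where obj appears in the template.
(define-syntax twice-naive
  (syntax-rules ()
    ((_ obj) (list obj obj))))

;; Safer macro: evaluate the argument once and bind it to a local name.
(define-syntax twice-bound
  (syntax-rules ()
    ((_ obj) (let ((o obj)) (list o o)))))

(twice-naive (make-object))
(define naive-calls calls)   ; make-object ran once per copy

(set! calls 0)
(twice-bound (make-object))
(define bound-calls calls)   ; make-object ran a single time
```

With a query object that allocates memory or opens files, each extra copy of the expression is another allocation, which would match the crash described above.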
From string to symbol
I know how to edit strings; that is a common task in software. What I
learned for the first time with Guile was how to transform a string into a
symbol:

(string->symbol "notmuch_filenames_get")

It couldn’t be simpler. A dedicated function does the job. It is
important to realize that the symbol can’t do anything on its own. You
can’t put it as the first element of a list and expect it to be called,
find the function you specified and do the job you want. In my first
iteration of working with these code blocks, I didn’t use macros, only
functions, and used eval to get the procedure I wanted, like the
following.

((eval
  (string->symbol "notmuch_filenames_get")
  (interaction-environment))
 filenames-iterator)

I had to evaluate the symbol I had just created in the current
interaction environment for it to become the procedure I wanted to call, and
only then could I apply it to filenames-iterator to get the filename it
currently points to.
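
Here is a small sketch of the same lookup that runs without libnotmuch. It uses the built-in string-upcase as a stand-in for a notmuch function, an assumption made only so the snippet is self-contained; the names name and proc are likewise just for illustration.

```scheme
;; Build the name from pieces, as one would for the notmuch functions.
(define name (string->symbol (string-append "string-" "upcase")))

;; The result is a plain symbol, not something that can be called.
(define name-is-symbol (symbol? name))
(define name-is-procedure (procedure? name))

;; Evaluating the symbol looks up its binding in the given environment
;; and yields the procedure bound to that name.
(define proc (eval name (interaction-environment)))
(define shouted (proc "guile"))
```

After the eval, proc is an ordinary procedure value, so it can be passed around or stored like any other.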
Macros
Thread-first macro
The most common macros I read about are and, or, when and unless. That is
fine, but they seem to show too little of the power of a macro to
rewrite code. They are too small a change to your writing conventions,
and once you read them they don’t make much difference.
The threading macros were more striking to me. I realized when writing Lisp
code that it is nice to have a function, test that it works, then wrap it
with another one and so on, composing them as the way to process the
data. However, when I came back and read that code, I had to read back and
forth many times, from the root to its leaves and back. It does have some
benefits, because the extra time you take forces you to engage a bit more
with the context and information about the code you are reading. However,
many times I was just looking around for where the functions start and end,
and I was more confused than enlightened.
Threading macros let you convert nested function calls into a flat list of
function calls and thereby improve readability. The next macro is the
thread-first macro ->. It places each element as the first argument of the
next element (which would be a function call). Try to follow that
modification in the next code block. value gets inserted as the first
argument in a function call defined by the next element, that is, the
function named fun. If fun takes more than one argument, I write a list, and
the macro injects value as its first argument. If fun takes only one
argument, I write the symbol directly, and the macro takes care of placing
fun and value in parentheses. Then the macro calls itself recursively; that
is how the nesting is recreated.

(define-syntax ->
  (syntax-rules ()
    ((_ value) value)
    ((_ value (fun . other-args) next ...) (-> (fun value . other-args) next ...))
    ((_ value fun next ...) (-> (fun value) next ...))))

Here is an example of how I would use this macro. The singular function
takes a syntax element (kind of a symbol with information about its
environment of evaluation) and turns it into a datum, which is the symbol.
That symbol becomes a string, from which I drop the last character, which
should be an s, as that is the regular plural of nouns. At least that
works well enough for filenames and messages.

(define (singular stx)
  (-> stx
      syntax->datum
      symbol->string
      (string-drop-right 1)))

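
To make the rewriting steps concrete, here is a sketch that traces one expansion by hand on a plain string. The macro is repeated so the snippet stands alone; threaded and nested are illustrative names, and string-drop-right and string-upcase are the core Guile procedures.

```scheme
(define-syntax ->
  (syntax-rules ()
    ((_ value) value)
    ((_ value (fun . other-args) next ...) (-> (fun value . other-args) next ...))
    ((_ value fun next ...) (-> (fun value) next ...))))

;; Expansion, one rule application per line:
;;   (-> "messages" (string-drop-right 1) string-upcase)
;;   (-> (string-drop-right "messages" 1) string-upcase)    ; second rule
;;   (-> (string-upcase (string-drop-right "messages" 1)))  ; third rule
;;   (string-upcase (string-drop-right "messages" 1))       ; first rule
(define threaded
  (-> "messages"
      (string-drop-right 1)
      string-upcase))

;; The hand-written nested form that the macro recreates.
(define nested
  (string-upcase (string-drop-right "messages" 1)))
```

Both definitions evaluate to the same string; the threaded form simply lists the steps in the order they happen.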
My notmuch iterator macro
I went over the Guile documentation on macros many times and I wasn’t able
to achieve what I wanted. I missed having more examples and explanations. It
was because of hygiene that I couldn’t just have my macro write
symbols directly. I always needed to include the lexical context.
I started exploring NYACC’s code (after all, it is a tool that generates
Scheme code) and found enough inspiration for what I wanted. The next
function, nm-symbols, takes the lexical context from the identifier tmpl-id
and a variable number of arguments that are concatenated as
strings. The return value is a syntax object naming the function I
want, with the lexical context of tmpl-id. This is how I compose
symbols with the names I want so that they resolve to the functions
provided by libnotmuch. I needed to wrap it in eval-when for things to
work inside the macro presented later on.

(eval-when (expand load eval)
  (define (nm-symbols tmpl-id . args)
    (define (stx->str stx)
      (symbol->string (syntax->datum stx)))
    (datum->syntax
     tmpl-id
     (string->symbol
      (apply string-append
             (map (lambda (ss) (if (string? ss) ss (stx->str ss))) args)))))

  ;; the function singular defined before needs to be here
  )

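
To see nm-symbols at work without libnotmuch, here is a minimal sketch that uses it to build names of built-in string procedures. The string-op macro is hypothetical and exists only for this illustration; nm-symbols is repeated so the snippet stands alone.

```scheme
(eval-when (expand load eval)
  (define (nm-symbols tmpl-id . args)
    (define (stx->str stx)
      (symbol->string (syntax->datum stx)))
    (datum->syntax
     tmpl-id
     (string->symbol
      (apply string-append
             (map (lambda (ss) (if (string? ss) ss (stx->str ss))) args))))))

;; Hypothetical macro: (string-op upcase x) expands to (string-upcase x).
;; The generated identifier borrows the lexical context of op, so it is
;; resolved where the macro is used.
(define-syntax string-op
  (lambda (x)
    (syntax-case x ()
      ((_ op arg)
       (with-syntax ((fn (nm-symbols #'op "string-" #'op)))
         #'(fn arg))))))

(define shouted (string-op upcase "guile"))
(define whispered (string-op downcase "GUILE"))
```

The point of passing #'op as the first argument is that datum->syntax needs an existing identifier to copy the lexical context from; a bare symbol would not resolve to anything in the macro output.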
That is all the setup I need. I can now write my iterator macro. Notice how
I construct the symbols I want with nm-symbols: it uses #'type to get
the lexical context, and type is the symbol I use to refer to the notmuch
type. The #' is reader shorthand that turns the symbol into a syntax object;
only from there can I extract the lexical context.
After defining the syntax elements, I write my iterator directly using
them. I would have loved to find a way not to emit the destructor call
(freeing memory) when the symbol is not defined, but for now I check at
runtime whether it is defined and, if so, call it.

(define-syntax nm-iter
  (lambda (x)
    (syntax-case x ()
      ((_ type query proc)
       (with-syntax ((valid? (nm-symbols #'type "notmuch_" #'type "_valid"))
                     (destroy (nm-symbols #'type "notmuch_" #'type "_destroy"))
                     (get (nm-symbols #'type "notmuch_" #'type "_get"))
                     (next (nm-symbols #'type "notmuch_" #'type "_move_to_next"))
                     (item-destroy (nm-symbols #'type "notmuch_" (singular #'type) "_destroy")))
         #'(let ((obj query))
             (let loop ((item (get obj))
                        (acc '()))
               (if (= 0 (valid? obj))
                   (begin
                     (destroy obj)
                     acc)
                   (let ((result (proc item)))
                     (when (defined? (quote item-destroy))
                       (item-destroy item))
                     (next obj)
                     (loop (get obj) (cons result acc)))))))))))

The way I use this macro for messages is:

(nm-iter messages
         (result-messages query)
         (get-headers "from" "subject"))

and for filenames is:

(nm-iter filenames
         (notmuch_message_get_filenames message*)
         pointer->string)

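
For experimenting with the pattern without libnotmuch, here is a sketch of the same macro shape driven by an in-memory cursor. Everything with a demo prefix is hypothetical and made up for this illustration (demo-iter, demo-symbols, the demo_numbers_* procedures and make-cursor); none of it is part of notmuch. The structure of the macro follows nm-iter, with "demo_" as the prefix.

```scheme
;; Expand-time helpers, same shape as nm-symbols and singular.
(eval-when (expand load eval)
  (define (demo-symbols tmpl-id . args)
    (define (stx->str stx)
      (symbol->string (syntax->datum stx)))
    (datum->syntax
     tmpl-id
     (string->symbol
      (apply string-append
             (map (lambda (ss) (if (string? ss) ss (stx->str ss))) args)))))
  (define (singular stx)
    (string-drop-right (symbol->string (syntax->datum stx)) 1)))

;; Hypothetical cursor: a one-slot mutable cell holding the remaining items.
(define (make-cursor items) (list items))
(define (demo_numbers_valid cur) (if (null? (car cur)) 0 1))
(define (demo_numbers_get cur) (if (null? (car cur)) #f (caar cur)))
(define (demo_numbers_move_to_next cur) (set-car! cur (cdar cur)))

;; Record what gets released, to observe the cleanup order.
(define released '())
(define (demo_numbers_destroy cur) (set! released (cons 'cursor released)))
(define (demo_number_destroy item) (set! released (cons item released)))

(define-syntax demo-iter
  (lambda (x)
    (syntax-case x ()
      ((_ type query proc)
       (with-syntax ((valid? (demo-symbols #'type "demo_" #'type "_valid"))
                     (destroy (demo-symbols #'type "demo_" #'type "_destroy"))
                     (get (demo-symbols #'type "demo_" #'type "_get"))
                     (next (demo-symbols #'type "demo_" #'type "_move_to_next"))
                     (item-destroy (demo-symbols #'type "demo_" (singular #'type) "_destroy")))
         #'(let ((obj query))          ; evaluate the query only once
             (let loop ((item (get obj))
                        (acc '()))
               (if (= 0 (valid? obj))
                   (begin (destroy obj) acc)
                   (let ((result (proc item)))
                     ;; runtime check, as in nm-iter
                     (when (defined? 'item-destroy)
                       (item-destroy item))
                     (next obj)
                     (loop (get obj) (cons result acc)))))))))))

(define squares
  (demo-iter numbers (make-cursor '(1 2 3)) (lambda (n) (* n n))))
```

Since results are consed onto the accumulator, squares comes back in reverse order of iteration, and released shows each item being released before the cursor itself.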
Summary
This solution took me a lot of time to research. It was hard and painful to
get it to work. At the time of writing, I’m still not sure I understand all
the elements I ended up using, but it was a fun experience to get it to
work. During the process of learning and trying, I ran out of file handles,
and I also completely filled up my memory and crashed my system: all the
good experiences that come from breaking things while learning.
The end result, from a software perspective, is worse than where I
started. The solution takes more lines of code. The level of nesting of the
functions and macros used is deeper. The overall readability and thus
maintainability dropped. I’m also including implementation details in the
macro (the part where I check for the symbol definition), which doesn’t feel
right.
The good thing is that I managed to get something new working. Despite
more new code, there is less code duplication, and should I need an iterator
for notmuch’s threads, I would get it for free instead of incurring
code triplication.