Manually creating the bindings for a sizable C program is a lot of work,
and it is not very rewarding. I’m lazy and don’t want to do that. I looked
around for other options and was surprised by the solution I found.
In this series of posts I record how to use Guile as a scripting language
and solve various tasks related to email work.
If my experience with Python has taught me anything, it is that interfacing
programs between languages can be quite painful. I remember trying to use
boost.python, then cython, and later even hearing about pybind and
python-cffi. With all those projects, why is there no simple solution? Each
attempt started well and was painful the rest of the way. With Guile I
didn’t search for long and I was quickly blown away.
NYACC is a project that did what I wanted very quickly. It can read the C
code and automagically generate all the bindings you need. I still
experienced some difficulties, and I still spent a lot of time looking at
the C code in notmuch and its Python bindings for guidance, yet the overall
experience was a lot nicer than what I remember from the Python world. My
work was focused on designing a usable interface and thinking about how I
want my implementation to work. The typing and generation of the bindings
was done entirely by NYACC. What I like about it is that you don’t repeat
yourself: you take the C header file and NYACC builds the bindings, because
it understands the C code directly. In the many Python projects I have
used, you must write a new copy of the interface, which you then need to
maintain.
Creating the module
(define-ffi-module (ffi notmuch)
  #:library '("libnotmuch")
  #:include '("notmuch.h"))
That is all you need to start with. Write it to a file called
ffi/notmuch.ffi inside your Guile load path. Then, as the NYACC
documentation says, just execute:
guild compile-ffi ffi/notmuch.ffi
and you get the file ffi/notmuch.scm with ALL the bindings defined in the
notmuch.h header. It even provides wrappers/unwrappers between Guile and C
types and their enums. I was really amazed at how well it works. For
reasons I’m unaware of, you still need to call string->pointer and
pointer->string when dealing with C strings. Since this is written in the
documentation, it might be a limitation or a design choice.
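The string conversion can be tried on its own, without notmuch, since both procedures come from Guile's (system foreign) module. The following is only a small sketch: c-string and scheme-string are hypothetical helper names I made up for illustration, and the NULL check is an assumption about how one might want to handle C functions that return no string.

```scheme
(use-modules (system foreign))

;; c-string and scheme-string are hypothetical helpers for illustration,
;; they are not part of NYACC or notmuch.
(define (c-string str)
  ;; Encode STR as a NUL-terminated C string and return a pointer to it.
  (string->pointer str))

(define (scheme-string ptr)
  ;; Decode a C string back into a Scheme string. A NULL pointer is
  ;; turned into #f rather than letting pointer->string dereference it.
  (if (null-pointer? ptr)
      #f
      (pointer->string ptr)))

;; Round trip: Scheme string -> C pointer -> Scheme string
(define round-trip (scheme-string (c-string "inbox")))
```

Wrapping the two calls like this keeps the pointer handling in one place, so the adapters further down could call the helpers instead of repeating the conversion.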
With all the bindings already implemented for you, the only thing left is
to implement some adapters to interact with the library the way you like
and not the way it was written to be used in the C world.
Building the interface
Make sure that the generated file ffi/notmuch.scm is in your load path and
import it. The workflow is now much easier, since all the bindings are
already at your disposal. I can directly use the module to create my
adapters to use notmuch in Guile.
Wrappers around the wrappers - the adapters
NYACC creates a binding for notmuch_database_open, which looks more
complicated than what I presented in the previous post, but that is because
it provides additional wrappers/unwrappers for the types. The same goes for
all the other exposed functions.
NYACC also defines constructors for types. For example,
make-notmuch_database_t* creates a pointer to that type, and I get a nice
representation of it in the REPL, which is much nicer than what I had in
the previous post with make-bytevector. My adapter to open the database is
now much cleaner.
(use-modules
 (system foreign)
 (system ffi-help-rt) ;; functions from nyacc
 (ffi notmuch))       ;; the module just created

(define (open-database path mode)
  ;; nyacc provides the pointer "constructor"
  (let ((ffi-db (make-notmuch_database_t*)))
    (notmuch_database_open (string->pointer path) mode (pointer-to ffi-db))
    ffi-db))
Next I set up a query and, by default, exclude messages tagged deleted and
spam. I should read those options from the notmuch config, yet I don’t want
to create that interface at the moment, so I just hard-code them here.
(define (query-db db str)
  (let ((query (notmuch_query_create db (string->pointer str))))
    (for-each (lambda (tag)
                (notmuch_query_add_tag_exclude query (string->pointer tag)))
              (list "deleted" "spam"))
    query))
To process the query I need to see the matching messages. For that I
implement result-messages and the extra utility function count-messages.
(define (result-messages query)
  (let ((messages (make-notmuch_messages_t*)))
    (notmuch_query_search_messages query (pointer-to messages))
    messages))

(define (count-messages query)
  (let ((counter (make-int32)))
    (notmuch_query_count_messages query (pointer-to counter))
    (fh-object-ref counter)))
Iterating over the messages
The previous functions allowed me to get the messages matching the query,
yet I also need to process them, which means iterating over each message.
Looping in Guile is usually expressed via recursion. I use a named let to
express recursion for an iterative process. Here I make heavy use of the C
functions to iterate over the messages, very similar to how it is done in
the C code. It gets annoying to differentiate between plural and singular
in the function names, because there are messageS and message. I’m guilty
of this crime in my own software, yet with so many more prefixes and
suffixes in the function names here it was tougher on my eyes this time.
The message iterator, with redundant inline explanations, is as follows:
(define (messages-iter query proc)
  ;; get all messages that match the query
  (let ((obj (result-messages query)))
    ;; This is the named let. LOOP is a procedure that takes one
    ;; argument per binding.
    ;; ITEM is the current message; here I get the first one to initialize it.
    ;; ACC is a list accumulating the results of the iteration.
    (let loop ((item (notmuch_messages_get obj))
               (acc '()))
      ;; Terminate the iteration if OBJ, which is a pointer to the messageS,
      ;; is not pointing to a valid message anymore.
      (if (= 0 (notmuch_messages_valid obj))
          (begin
            ;; Extremely important to clear the memory of the messageS
            (notmuch_messages_destroy obj)
            ;; This is the return value, the list of results. Each result
            ;; was consed onto the head, so the last message comes first.
            acc)
          (let ((result (proc item))) ;; I apply PROC to a message
            ;; Extremely important to clear the memory of the message
            (notmuch_message_destroy item)
            ;; This moves the pointer of messageS to the next message
            (notmuch_messages_move_to_next obj)
            ;; Recursion in play: LOOP is called and gets the next message,
            ;; because the pointer was just moved, and RESULT is placed at
            ;; the head of ACC for the next iteration
            (loop (notmuch_messages_get obj)
                  (cons result acc)))))))
The power of Scheme is that I can abstract that iteration and pass a
function to process the messages. In the C code I found all those
pointer-manipulating functions being called all over the place, each time
an iteration was needed.
A simple function to extract selected headers is again just an iteration,
this time of notmuch_message_get_header with different arguments. I get the
benefit of abstracting the behavior into a function of variable arity that
takes the headers I want. It returns a new function that only takes a
message as its argument.
(define (get-headers . labels)
  (lambda (message)
    (map (lambda (label)
           (pointer->string
            (notmuch_message_get_header message (string->pointer label))))
         labels)))

;; Use it like this, where msg is a notmuch message pointer
;; ((get-headers "date" "to" "from") msg)

;; Use it with the iterator like this:
(let* ((db (open-database "/home/titan/.mail/" 0))
       (query (query-db db "discussions on some mailing list"))
       (result (messages-iter query (get-headers "date" "to" "from"))))
  ;; always clear memory
  (notmuch_query_destroy query)
  (notmuch_database_destroy db)
  ;; return the result
  result)
Tagging emails
Tagging is again a procedure I apply to a message, so I only need to
implement that function as I did with get-headers. In this case
apply-tags-to-message returns a function that consumes the message and
applies the desired tags, which are all given in the same string. This lets
me reuse my configured tags from previous setups
(tags : "+sent +project -inbox"). The next function is quite dense, as it
needs to iterate over each tag that is going to be added or removed.
(define (apply-tags-to-message tags)
  (lambda (message)
    ;; TAGS is one string such as "+sent -inbox"; split it on whitespace
    (let loop ((rest (string-tokenize tags)))
      ;; null? is in Guile core; null-list? needs (srfi srfi-1)
      (unless (null? rest)
        ;; drop the leading "+" or "-" to get the bare tag name
        (let ((tag (string->pointer (substring (car rest) 1))))
          (if (string-prefix? "-" (car rest))
              (notmuch_message_remove_tag message tag)
              (notmuch_message_add_tag message tag)))
        (loop (cdr rest))))))
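The parsing of the tag string can be looked at separately from the notmuch calls, which makes it easy to try in the REPL without a mail database. This is only a sketch under the assumption that every token starts with + or -; parse-tag-changes is a hypothetical name, not something provided by notmuch or NYACC.

```scheme
;; parse-tag-changes is a hypothetical helper, not part of notmuch or NYACC.
;; It turns a string like "+sent +project -inbox" into a list of
;; (ACTION . TAG) pairs, where ACTION is the symbol add or remove.
;; Assumption: every token starts with "+" or "-".
(define (parse-tag-changes tags)
  (map (lambda (token)
         (cons (if (string-prefix? "-" token) 'remove 'add)
               ;; drop the leading "+" or "-"
               (substring token 1)))
       ;; string-tokenize splits on whitespace by default
       (string-tokenize tags)))

(define example-changes (parse-tag-changes "+sent +project -inbox"))
;; example-changes is ((add . "sent") (add . "project") (remove . "inbox"))
```

With the parsing isolated like this, the function applied to each message would only have to walk the pairs and call the matching notmuch procedure.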
I use this function just as I showed in the previous section. However, I
need to open the database in READ_WRITE mode, that is, I must pass a 1 to
the open-database function. Keep in mind that the tagging function does not
return anything useful, so the result of the iterator will be a list of
unspecified values.
Deleting message files
This task has some new challenges. Notmuch has two functions to get the
message filename, notmuch_message_get_filename and
notmuch_message_get_filenames, singular and plural cases again. The first
one gets one filename of the message; most of the time a message has
exactly one corresponding file. However, when you interact with mailing
lists you might end up with several copies of the same message, and the
same happens if you manage multiple email accounts and receive the same
message on several of them. For that reason there is the function with the
plural name. It returns an iterator over the filenames, which is to be
processed just like I did with the messages iterator.
The code is so similar that I will not write it here; you can take it as an
exercise. In an upcoming post I’ll show my attempt at using macros to write
an iterator that deals with both messages and filenames.
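The shape shared by the two iterators can also be captured with a plain higher-order procedure. The sketch below is not the macro-based approach deferred to the later post, and it does not call notmuch at all: cursor-map and make-list-cursor are hypothetical names, and the cursor is a stand-in backed by a Scheme list so the example runs anywhere.

```scheme
;; cursor-map is a hypothetical helper. It walks any C-style cursor that
;; is described by four procedures and collects (proc item) for each item.
(define (cursor-map cursor valid? get next! destroy! proc)
  (let loop ((acc '()))
    (if (valid? cursor)
        (let ((result (proc (get cursor))))
          (next! cursor)
          (loop (cons result acc)))
        (begin
          ;; release the cursor once it is exhausted
          (destroy! cursor)
          ;; results were consed in reverse, so restore the original order
          (reverse acc)))))

;; A stand-in cursor backed by a Scheme list, only to exercise cursor-map.
;; A one-slot vector holds the remaining items so next! can mutate it.
(define (make-list-cursor items) (vector items))

(define filenames
  (cursor-map (make-list-cursor '("a/cur/1" "b/cur/1"))
              (lambda (c) (pair? (vector-ref c 0)))                  ; valid?
              (lambda (c) (car (vector-ref c 0)))                    ; get
              (lambda (c) (vector-set! c 0 (cdr (vector-ref c 0))))  ; next!
              (lambda (c) (vector-set! c 0 '()))                     ; destroy!
              string-upcase))
;; filenames is ("A/CUR/1" "B/CUR/1")
```

For the real filenames iterator, the four procedures would presumably be thin wrappers around notmuch_filenames_valid, notmuch_filenames_get, notmuch_filenames_move_to_next and notmuch_filenames_destroy, assuming NYACC generates bindings under those names. The valid check returns an integer in C, so it would need a comparison against 0 as in messages-iter.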
Summary
As I already said at the beginning of the post, NYACC is an amazing
solution for creating bindings from Guile to C. It felt much more
comfortable to use than anything I used in the Python world when I had to
use those tools some years ago.
After creating some basic adapters, extending my script to process my
emails with notmuch was an absolute pleasure. I did run into problems like
running out of memory and running out of file handles. I had forgotten
about those old problems since I moved to the land of Python, yet when
interfacing with C, you need to take that responsibility again, whether you
are in Python or Guile.
I implemented enough features for this email processing script that I can
already do everything afew was able to do. As such, I have already replaced
it with my script. It is not necessarily better, yet it makes me proud to
use my own stuff, designed the way I want to use it.
I still need to practice abstracting the behavior I want my code to
express, and I still need to learn about macros and how the object system
works.