• Monads as a Design Pattern

    Lately I’ve found monads to be more and more useful in several programming projects. For example, Harlan’s type inferencer uses a monad to keep track of what variables have been unified with each other, among other things. It took me a while to really grok monads. One reason is that many of the tutorials I’ve seen start out with category theory and the monad laws. These things don’t strike me as all that useful when I’m trying to make my code better in some way.

    What I have found useful is to think of monads as a style of programming (like continuation passing style), or even a design pattern. I’ve found monads are really handy when you need to thread some object through a sequence of function calls. To see how this works, we’re going to start with a store-passing interpreter for a small language and show how to use monads to hide the store in the cases where we don’t need it.

  • On Being an Artifact Evaluator

    I had the pleasure of serving as a committee member for this year’s PLDI Artifact Evaluation Process. After reading Lindsey Kuper’s post from the author’s point of view, I thought I’d say a little about my perspective from the other side. I had a lot of fun doing this, and it’s exciting to think about the implications for our field as artifact evaluation becomes a more common thing at conferences.

  • Rust Project Updates

    Happy New Year!

    With the recent release of Rust 0.9, I’ve decided to start a tradition of tagging releases for my Rust projects to coincide with releases of the Rust language. Although things like Rust CI have helped keep Rust code running as the language matures, there’s still some frustration in using Rust projects if you’d rather not always run the latest master build. Sometimes, it may even be impossible to make two projects work together if their maintainers do not update at the same pace. By tagging releases with official Rust releases, it will become much easier to always find a version of my code that works with the latest release of Rust.

    Without further ado, here are the projects:

    • rust-opencl - OpenCL bindings for Rust. Thanks to Colin Sheratt and Ian Daniher for their contributions and help keeping this code running.
    • rust-papi - Bindings to the PAPI performance counters library for Linux.
    • SciRust - Some linear algebra routines.
    • Boot2Rust - A small UEFI program that lets you boot and run nothing but Rust (and all the UEFI firmware stuff). See more in my previous post.

    Besides shamelessly plugging my own software, I hope that this post will encourage others who maintain Rust projects to do the same. As the Rust community grows, more people will want to stick with official releases, and these are much more valuable when most of the Rust projects have easy-to-find versions that work with these releases.

  • Continuation Passing Style Interpreters

    In my post on Scheme debuggers, I introduced a simple Scheme interpreter. Unfortunately, the interpreter was structured such that it was hard to make a terribly sophisticated debugger. While we won’t improve upon our debugger much today, I do want to look at a different style of interpreter that should enable more advanced debugging features.

    This new style is called continuation passing style. You can think of a continuation as what to do next in a program, and so continuation passing style means we make the continuation an explicit argument of a function. This means we can do a lot more with the control flow of the program, which is important in a debugger, for example, since we want to be able to pause and inspect the execution at arbitrary points.

    We’ll continue with a quick introduction to continuations and continuation passing style, then look at how to apply this to our interpreter. Once that is done, we will see how to implement call/cc in our new interpreter.

  • Booting to Rust

    A couple nights ago I was looking over the UEFI spec, and I realized it shouldn’t be too hard to write UEFI applications in Rust. It turns out, you can, and here I will tell you how.

    The thing that surprises me most about UEFI is that it now appears possible to boot your machine without ever writing a single line of assembly language. Booting used to require this tedious process of starting out in 16-bit real mode, then transitioning into 32-bit protected mode and then doing it all over again to get into 64-bit mode. UEFI firmwares, on the other hand, will happily load an executable file that you give it and run your code in 64-bit mode from the start. Your startup function receives a pointer to some functions that give you basic console support, as well as an API to access richer features. From a productivity standpoint, this seems like a win, but I also miss the sorcery you used to have to do when you were programming at this level.

    Booting to Rust is a lot like writing bindings to any C library, except that the linking process is a bit more involved.

  • Improving the Performance of Harlan's FFI

    My last post showed that it’s now possible to call code written in Harlan from C++ programs. Sadly, the performance numbers I posted were pretty embarrassing. On the bright side, when you have a 20-30x slowdown like we saw before, it’s usually pretty easy to get most of that back. In this post, we’ll see how. The performance still isn’t where I’d like to be, but when we’re done today, we’ll only be seeing about a 4x slowdown relative to CUBLAS.

  • Using Harlan in C++ Programs

    So far, Harlan programs have primarily existed in a vacuum. You’d compile them, run them, and they might produce some output. Certainly none of them received any input from the outside world. Most of the test cases use small sets of data, and the larger benchmarks generated incredibly synthetic data, like a vector of 16 million ones. My focus has been on building the compiler itself, so this has been a tolerable situation up to this point. However, Harlan is at the point where it needs more realistic applications and it’s clear the foreign function interface (FFI) story just won’t cut it anymore.

    I’m happy to report that it’s now possible to pass non-trivial amounts of data from C++ to Harlan. Two new features made this possible. First, there are library functions like unsafe-deref-float and unsafe-set!-float which allow reading and writing from raw pointers. Second, there’s a new primitive form called unsafe-vec-ptr which gives a raw pointer to the contents of a vector. These are very low level, but they give us the tools we need to build a reasonably usable FFI. Let’s see how to use these to implement a dot product in Harlan and use it from a C++ program.

    First, we need to write the dot product function. This is pretty short in Harlan.

      (import ffi)
      (define (harlan_dot N pa pb)
        (let ((a (import-float-vec pa N))
              (b (import-float-vec pb N)))
          (reduce + (kernel ((a a) (b b)) (* a b))))))

    For the most part, this is a straightforward dot product written in Harlan. The main new thing is the call to import-float-vec, which copies a C-style array into a Harlan vector. If you’re curious, it’s implementation is in ffi.kfc.

    Unlike most Harlan programs, this does not define a main function. Instead, we compile it to a shared library by running the following from your Harlan directory.

    ./harlanc --shared dotprod.kfc

    When this is done, you’ll have a dotprod.so which you can link to your C++ programs. The harlan_dot function is exposed under the signature float harlan_dot(int N, float *pa, float *pb).

    Now, let’s plug this function into the dot product benchmark I wrote about previously. Basically, we add a prototype for harlan_dot, then add a call to TIME(harlan_dot) along with the rest of the benchmarks. You can see the full set of changes here. I commented out the CUBLAS version because I ran into runtime errors that I didn’t feel like debugging. Below is a graph of how Harlan compares with the other implementations.

    Execution time for dot product on 33,554,432 element vectors (shorter bars are better).


    Clearly I’ve got some performance issues to deal with. On the bright side, Harlan runs faster on the GPU than it does on the CPU. I’ll be investigating these performance problems soon.

    As far as the FFI goes, there are some usability issues that remain too. For example, the Harlan compiler and the code it produces have some relative paths hard coded, which means they must run from the Harlan source directory. These shouldn’t be hard to fix. In the meantime, it’s now possible to integrate Harlan code into projects written in other languages.

  • How to write a simple Scheme debugger

    A while back, Rich Loveland asked how one might write a Scheme debugger. I realized that I’ve written many a Scheme interpreter, but I’ve never really thought about how to write a debugger. This post is a first step down that path. We’ll write what I’d consider a minimally functional debugger. It will allow you to break a running program into the debugger (albeit by invoking a primitive function in the program you’re running) and then inspect the variables that are available. As an added bonus, you’ll be able to evaluate arbitrary Scheme expressions from the debugger and even change the values of variables. For the moment, however, we will not support stepping program execution, which is admittedly an important feature of debuggers.

    We’ll start by writing an interpreter for a small but interesting subset of Scheme. Then we’ll show see how to add debugging features to the interpreter. As usual, you’ll be able to see all the code on Github.

    The Scheme Interpreter

    We’ll start with the so-called “Three Line Interpreter” and then add a few features to make it a bit more interesting. If you’re familiar with Scheme interpreters, feel free to skim this section. If you want more detail, check out the Essentials of Programming Languages. Our interpreter, value-of, takes two arguments. The first is the expression to evaluate, and the second is the environment that maps variable names onto values. Here’s the interpreter:

    (define (value-of e env)
      (match e
    	 (guard (symbol? x))
         (lookup x env))
        ((lambda (,x) ,e)
         (lambda (v) (value-of e (cons (cons x v) env))))
        ((,[e1] ,[e2])
         (e1 e2))))

    It’s a few more than three lines, but we’re mainly interested in the three match clauses. The first line is the variable reference line. If you try to evaluate something like x, we use lookup to go find the current value of x in the environment. I’ll omit the definition of lookup, but it’s basically the same as (cdr (assq x env)).

    The next line is the lambda line. This creates a procedure from a lambda expression. We cheat and use perfectly good Scheme’s built-in lambda. Notice this procedure we create takes a value, v, and then calls value-of on the body of the lambda. The important bit is the (cons (cons x v) env), which adds a binding of x to the value that was passed in to the environment, so that the interpreter can find the correct value when evaluating the body.

    Finally, we have the application line. Basically, we use match’s catamorphism feature to recursively compute the value of both the thing to be applied and the value to apply it to. Since our lambda line evaluates to Scheme procedures, we just apply the value e1 to its argument, e2.

    Although this doesn’t seem like a very rich language, you could use it to compute any computable function if you wanted to. You probably don’t want to though.

    More Features

    Now that we have the core of our interpreter, we can add features to the language. In most cases, this is as simple as adding a few more clauses to our match expression. Let’s start with numbers.

        (,n (guard (number? n)) n)
        ((,op ,[e1] ,[e2]) (guard (memq op '(+ * -)))
         ((eval op) e1 e2))

    The first line leaves numbers as they are. There’s not really much more to do with them.

    The second line lets you do things to numbers. We make sure the operation is one of +, * or -. We could add more, but these are enough for now. I cheated once again and use eval to convert the symbol representing the operator into the Scheme procedure that performs that operation. As before, we use catamorphism to evaluate the two arguments to the operator.

    To make debugging a little more interesting, let’s add some side effects. We add set! with the following match clause.

        ((set! ,x ,[e])
         (update-env! x e env))

    Of course, this isn’t very interesting without knowing what update-env! does. Here you go!

    (define (update-env! x v env)
      (if (eq? x (caar env))
          (set-cdr! (car env) v)
          (update-env! x v (cdr env))))

    Basically, we just search through the environment until we find the variable to change and then use set-cdr! to change its value.

    Finally, let’s add begin. We could simulate this with lambdas and function applications, but it’s much cleaner just to add it directly. We get begin with this clause:

        ((begin ,e* ... ,e)
           (let loop ((e* e*))
             (if (pair? e*)
                   (value-of (car e*) env)
                   (loop (cdr e*)))))
           (value-of e env)))

    We start by evaluating each of the first expressions for their effect, and then we return the value of the last expression.

    At this point, we have enough that we can start to write some reasonably interesting programs. In particular, we can write programs that have bugs that we might want to debug. Let’s add a debugger!

    The Debugger

    The first thing we’re going to do is add a way to get into the debugger. Most of the time the debugger does this by loading the source files and letting you click on the point in the code where you want a break point. This takes more UI work than I want to do right now, so instead we’ll just add a special command to our language, (debug), which breaks into the debugger. As usual, this is as simple as adding another match clause:

    	 (debugger env))

    This calls out to a function we have yet to define, called debugger. We pass in the environment debugger needs to see the environment so we can inspect it.

    The debugger itself is just a read-eval-print loop (REPL). It prompts the programmer for a command, then evaluates the command, prints out any results and then continues. Let’s start with a debugger that we can get out of:

    (define (debugger env)
      (printf "debug> ")
      (let ((cmd (read))
            (continue #t))
        (match cmd
          ((continue) (set! continue #f))
          (,else (printf "unknown command\n")))
        (if continue
            (debugger env)
            (printf "running...\n"))))

    We start out by printing debug> and then using read to read in an S-Expression. Much like our interpreter (in fact, this is just an interpreter for another language), we use match to determine which command we’re given and then evaluate it. For now we support one command, (continue), which continues the execution of the program. We do this by setting a flag that tells the debugger not to continue it’s loop. Since we were called by value-of, we just return to that function and the interpreter picks up where it left off.

    Here’s an example debugging session:

    > (value-of '(debug) '())
    debug> (continue)

    At the Scheme REPL, we call value-of with a simple expression that immediately breaks into the debugger, and we start in the empty environment ('()).

    Most debuggers give you some kind of call stack. The closest analog to that in our language is a list of all the variables in scope, so let’s add a way to see these. Once again, we add one more match clause:

          ((show-env) (show-env env))

    This just forwards to a procedure, show-env, which prints out the values in the environment. Here’s an example of how to use it:

    > (value-of '(((lambda (x)
                     (lambda (y)
                           (+ x y))))
    debug> (show-env)
    0. x: 4
    1. y: 5
    debug> (continue)

    So now we can stop the execution, continue the execution, and see what’s in the environment. What if we want to change things? We could add commands to set values, but a more powerful way is to use the target language itself to do this. Thus, we’ll add an eval command, which evaluates an expression in the debugger:

          ((eval ,e)
           (printf "~s => ~s\n" e (value-of e env)))

    Now, if we want to change values, we can just evaluate a set! expression, like this:

    > (value-of '(((lambda (x)
                     (lambda (y)
                           (+ x y))))
    debug> (show-env)
    0. x: 4
    1. y: 5
    debug> (eval (set! x 115))
    (set! x 115) => #<void>
    debug> (show-env)
    0. x: 115
    1. y: 5
    debug> (continue)

    As expected, the resulting value changes to reflect the fact that we modified the environment during program execution.

    So there’s a first steps towards a Scheme debugger. We were able to do this with relatively few changes to our interpreter. It seems to me that adding more advanced features would require more changes. For example, there’s no way to inspect or change the code that’s running now. To do this, we would have to keep track of this data in a form that is accessible to the debugger. Furthermore, we’re still missing a way to do finer grained execution control, such as stepping over a single statement or out of the current lambda. I suspect most if not all of these problems can be solved by writing our interpreter in continuation passing style. I hope to explore this in a later post.

  • Visualizing the Turing Tarpit

    Jason Hemann and I recently had a paper accepted at FARM called “Visualizing the Turing Tarpit.” The idea grew out of a talk that Jason did at our weekly PL Wonks seminar on the minimalist programming languages, Iota and Jot. At the end of the talk, Ken Shan asked whether this could be used to do some kind of cool fractal visualization of programs. That night, several of us pulled out our computers and started hacking on Iota and Jot interpreters.

    Iota and Jot are examples of Turing Tarpits, that is, languages “in which everything is possible but nothing of interest is easy.” The term comes from Alan Perlis. Turing Tarpits have some utility. The Lambda Calculus is arguably a Turing Tarpit, and yet it is quite useful in the study of programming languages and computability. Iota is notable as a Turing-complete language which consists of only two symbols. For example, i, *i*ii, and **iii are all legal programs in Iota. Sadly, some strings, such as i*i are not legal. This makes Iota less than ideal for enumerating many programs, as we can’t choose arbitrary strings but must instead be sure we follow the grammar. Jot was designed to fix this. Jot has the property that any binary string is a legal program.

    Using Jot, it’s now incredibly easy to enumerate large numbers of programs. The next step is to plot the behavior. We chose to see how many steps a program executes for before terminating (or until we give up, as not all programs will terminate), and then use this to generate a color for a certain point. Below is an animated example of what we ended up with.

  • Why Write Compilers in Scheme?

    One of the questions Klint Finley asked me for the Wired article about Harlan was “Why Scheme?” I wasn’t really satisfied with my answer, so I thought I’d answer it more completely here. Besides the fact that we were already experienced writing compilers in Scheme, Scheme has a lot of features that are very useful to compiler writers.