Strangely Consistent

Theory, practice, and languages, braided together

Code generation and stone soup

I don't know what kept me away from generating code for so long. Fear and prejudice, perhaps.

I've been trying it the last few days, and I have two things to say. First, it's like learning to program all over again. Remember that sense of power from the early days, when just picking up coding? "Hey, I can program this piece of code to do whatever I want." Well, guess what? That feeling comes back when one starts down on the path to madn... erm, to code generation. Only it's stronger. "Hey, I can program this piece of code to program that piece of code to do whatever it wants!" I think I've just discovered meta-hubris. Most likely, I'm not the first to do so.

Second, there's a flip-side to the feeling of power. That other feeling is how you feel when you knit your brows and wish that your neurons would line up a bit better so you could think more clearly about the problem at hand. Who would have thought, that feeling is also stronger when you're suddenly writing two different, entwined and related programs at the same time, in the same file. In my case, the knitted brows turn into an empty stare and a jaw left slackly agape, as I sit there wishing that I was better at context-switching between runloops.

Honestly, I think I expected eval to be the source of much programmer confusion, but I have to confess that it seems I underestimated the vistas it opens up when you buy into the idea of generating exactly the piece of code you need for the task (from an AST, say), and then eval it into a closure. That's what the back end of a compiler ends up doing, so maybe I shouldn't be so surprised that it's a versatile technique.

Lately, I've been in the business of squeezing every drop of juice out of the already implemented control flow constructs already implemented in Rakudo. I'm writing a p6regex→p6 compiler, you see. (Yes, that's a rather crazy notion; thanks for asking.) Along the way, I've often felt the need for not-yet-implemented control flow. This has led me to this hope-inducing maxim:

Every type of control flow in programming languages is just convenient sugar for if statements and while loops.

ifs and whiles are the stone soup to which all the rest of our control flow can be added as seasoning. ifs let you conditionally skip ahead in code, and whiles allow you to conditionally skip back. That's all you need.

Here are some examples.

Aside from the switch statements and unlabeled next etc, which already work very well in Rakudo, I've been doing the whole list of desugarings in GGE (the regex compiler). The part with the continuations was especially fun. I needed them for backtracking, at least as long as the compiler was only an interpreter.

But then, during a fruitful discussion with diakopter++, I was told how to emulate (delimited) gotos with a switch and a loop. The idea is quite obvious in retrospect: just keep the current 'label' in a variable, and switch on it in each iteration. Presto! I should have thought of that. I don't even need to flee to PIR any more.

I took the idea and generalized it to delimited gosubs: instead of keeping the current label in a scalar, keep it at the top of a stack. Define macro-like constructs to push to (local-branch) and pop from (local-return) the stack. Suddenly I don't need continuations as much.

Result: this. We send in the regex /<[a..b]> | <[b..e]>/ on the top line, along with the target string c to match on. The program generates an AST, an anonymous subroutine which executes the regex in atomic Perl 6 operations, and finally a match object which indeed finds c to be a match.

Here's a similar but slightly more involved example. And here's one doing captures and backreferences inside a quantified non-capturing group. Isn't that exquisite? (Ok, bad choice of word. Sorry.)

As I said, I wrote most of with a feeling of being not just in over my head, but of being in over my head twice. I'm still a bit surprised it works. The runtime compilation seems to introduce a bit of a speed penalty, but (1) it's a one-time cost, since you can re-use the regex object, and (2) I told you it would be slow.

The code-generating work still resides only in a local branch on my computer. I'll push it to master as soon as I'm done bringing GGE back to its former capabilities. (Update 2010-01-24: Done, and done.)

Code writing code. What a concept!