Strangely Consistent

Theory, practice, and languages, braided together

Speed up by a factor of 6 million

By the end of March, I received an email saying this:

$ time perl t4.pl
total: 4783154184978

real    0m0.185s
user    0m0.176s
sys     0m0.004s

(requires a perl with 64bit integers)

There was a t4.pl file attached.

You may recognize the total that the program prints out is the total number of t4 configurations, the same number that it took my C program two weeks to calculate on a decent box. So somehow, Salvador Fandino, perl.org blogger and occasional reader of my blog, managed to find way to arrive at the answer 6 million times as fast.

Well, that's interesting. To say the least.

Maybe I should be super-embarrassed. Maybe my cheeks should cycle through previously un-attained shades of crimson as I ponder the fact that my program was 6 million times as slow as someone else's. Ouch! But, I dunno. I don't really see it that way. I got to write about something I care about. Salvador++ cared enough to improve on my methods. The world is a better place. Blogging is cool — I learn stuff. Prestige doesn't much enter into it — the next time I'll have a better tool in my toolbox.

So, let's investigate this new tool, and how it's better.

First off, to get a factor-6e6 speedup, you don't apply some simple optimization somewhere; you use a different method. Salvador's code doesn't try to enumerate all the configurations, it just gets at the number. Which makes a lot of sense in retrospect, since we're not using the individual configurations for anything. My program arrives at each individual configuration, but then just throws it away immediately. Wasteful.

Salvador's blog post is as brief as his email. But let's copy the code over here and talk about it a bit:

#!/usr/bin/perl

use strict;
use warnings;

my $tab = <<EOT;
-----xxx
------xx
x-----xx
x------x
xx-----x
xx------
xxx-----
EOT

my $vertical = index $tab, "\n";
my $diagonal = $vertical + 1;

my $acu = { $tab => 1 };

for my $ix (0 .. length($tab) - 1) {
    my %next;
    while (my ($k, $c) = each %$acu) {
        my $s = substr($k, 0, 1, '');
        $next{$k} += $c;
        if ($s eq '-') {
            my $k1 = $k;
            if ($k1 =~ s/^-/x/) { # horizontal xx
                $next{$k1} += $c;
                if ($k1 =~ s/^x-/xx/) { # horizontal xxx
                    $next{$k1} += $c;
                }
            }
            $k1 = $k;
            if ($k1 =~ s/^(.{$vertical})-/${1}x/os) { # vertical xx
                $next{$k1} += $c;
                if ($k1 =~ s/^(.{$vertical}x.{$vertical})-/${1}x/os) {  # vertical xxx
                    $next{$k1} += $c;
                }
            }
            $k1 = $k;
            if ($k1 =~  s/^(.{$diagonal})-/${1}x/os) { # diagonal xx
                $next{$k1} += $c;
                if ($k1 =~ s/^(.{$diagonal}x.{$diagonal})-/${1}x/os) {  # diagonal xxx
                    $next{$k1} += $c;
                }
            }
        }
    }
    $acu = \%next;
}

my ($k, $c) = each %$acu;
print "total: $c\n";

The code is wonderfully idiomatic and to-the-point. Here are a few highlights, as I see them:

The program does far too much destructive updating for my tastes. I realize when I look at it that I no longer "think" in terms of these destructive updates. But it does it so successfully and idiomatically, that I find it difficult to list it as a disadvantage. Maybe it's a Perl 5 thing. Constructs like s/// are terribly convenient, and their default is to mutate things. (Even though Perl 5.14 adds /r for non-destructive substitution).

I was curious how this script would look (and perform) in Perl 6, so I wrote a straight port of it, trying to stick to the original as closely as possible:

my $tab = join "\n", <
    -----xxx
    ------xx
    x-----xx
    x------x
    xx-----x
    xx------
    xxx-----
>;

my $vertical = index $tab, "\n";
my $diagonal = $vertical + 1;

my %acu = $tab => 1;

my $vertical_xx = eval("/^ (. ** $vertical) '-'/");
my $vertical_xxx = eval("/^ (. ** $vertical 'x' . ** $vertical) '-'/");
my $diagonal_xx = eval("/^ (. ** $diagonal) '-'/");
my $diagonal_xxx = eval("/^ (. ** $diagonal 'x' . ** $diagonal) '-'/");

for ^$tab.chars {
    my %next;
    for %acu.kv -> $k, $c {
        my $s = $k.substr(0, 1);
        my $k0 = $k.substr(1);
        %next{$k0} += $c;
        next unless $s eq '-';
        my $k1 = $k0;
        if $k1.=subst(/^ '-'/, 'x') ne $k0 { # horizontal xx
            %next{$k1} += $c;
            my $k2 = $k1;
            if $k2.=subst(/^ 'x-'/, 'xx') ne $k1 { # horizontal xxx
                %next{$k2} += $c;
            }
        }
        $k1 = $k0;
        if $k1.=subst($vertical_xx,
                      -> $/ { $0 ~ 'x' }) ne $k0 { # vertical xx
            %next{$k1} += $c;
            my $k2 = $k1;
            if $k2.=subst($vertical_xxx,
                          -> $/ { $0 ~ 'x' }) ne $k1 { # vertical xxx
                %next{$k2} += $c;
            }
        }
        $k1 = $k0;
        if $k1.=subst($diagonal_xx,
                      -> $/ { $0 ~ 'x' }) ne $k0 { # diagonal xx
            %next{$k1} += $c;
            my $k2 = $k1;
            if $k2.=subst($diagonal_xxx,
                          -> $/ { $0 ~ 'x' }) ne $k1 { # diagonal xxx
                %next{$k2} += $c;
            }
        }
    }
    %acu := %next;
}

say "total: %acu.values()";

Ugh! This script is longer than the Perl 5 version, and it looks messier, too. A few factors contribute to that. First, you can't just do s/// in Rakudo in an if statement. (You can in Niecza, though.) Second, there are problems with <atom> ** $repeats, and I got to submit two tickets about that, and then do a workaround with the evals you see above. (Aah. Feels like the old days.)

Furthermore, jnthn++ could put this program into the profiler, and get two optimizations out of it. It went from 40s on my machine, to 37s.

But in the end, I felt that my straight-port version suffers from not playing off Perl 6's strengths. So I wrote a version that leans more towards immutability and closures.

my $tab = join "\n", <
    -----xxx
    ------xx
    x-----xx
    x------x
    xx-----x
    xx------
    xxx-----
>;

my $vertical = index $tab, "\n";
my $diagonal = $vertical + 1;

my %acu = $tab => 1;

sub make_substituter($rx) {
    return sub ($tab) {
        my $newtab = $tab;
        return $newtab
            if $newtab.=subst($rx, -> $/ { $0 ~ 'x' }) ne $tab;
    };
}

sub make_2x_substituter($rx) {
    return sub ($tab) {
        my $newtab = $tab;
        return $newtab
            if $newtab.=subst($rx, -> $/ { [~] $0, 'x', $1, 'x' }) ne $tab;
    };
}

my @pieces = 
    make_substituter(rx/^ ('') '-'/),
    make_substituter(eval("/^ ({'.' x $vertical}) '-'/")),
    make_substituter(eval("/^ ({'.' x $diagonal}) '-'/")),
    make_2x_substituter(rx/^ ('') '-' ('') '-'/),
    make_2x_substituter(eval("/^ ({'.' x $vertical}) '-' ({'.' x $vertical}) '-'/")),
    make_2x_substituter(eval("/^ ({'.' x $diagonal}) '-' ({'.' x $diagonal}) '-'/"));

for ^$tab.chars {
    my %next;
    for %acu.kv -> $k, $c {
        my $s = $k.substr(0, 1);
        my $k0 = $k.substr(1);
        %next{$k0} += $c;
        next unless $s eq '-';
        for @pieces -> &piece {
            if &piece($k0) -> $newtab {
                %next{$newtab} += $c;
            }
        }
    }
    %acu := %next;
}

say "total: %acu.values()";

Hmm. The loop is shorter now, but at the cost of some abstractions in other places. It's an improvement on my first version, but I don't really feel I got close to the succinctness of Salvador's Perl 5 version here either. (And this version runs slower, predictably. Something like 52s on my machine.)

I'm pretty sure it's possible to make even more idiomatic versions. This is a large enough problem to make things interesting. I encourage others to try.