p5-moritz

Download the raw code.

use v6;

# code adapted from
# http://en.wikibooks.org/wiki/Algorithm_implementation/Strings/Longest_common_substring#Perl
# instead of the full matrix of suffixes we only use two rows, one for the
# previous suffix length, one for current.

sub LCS($s1, $s2) {
    my $lcs_len = 0;
    my ($l1, $l2) = ($s1, $s2)>>.chars;
    my @char1 = Str, $s1.comb;
    my @char2 = Str, $s2.comb;
    my @previous_suffix = 0 xx ($l2 + 1);
    my $lcs = '';
    for 1..$l2 -> $n1 {
        my @current_suffix = 0 xx ($l2 + 1);
        for 1..$l2 -> $n2 {
            next unless @char1[$n1] eq @char2[$n2];

            # look, current common substring just got one longer!
            # RAKUDO #80614: I would have liked to write that as
            # my $c = @current_suffix[$n] = ...
            # but a bug in rakudo prevents that :(
            @current_suffix[$n2] = my $c = @previous_suffix[$n2 - 1] + 1;


            if $c > $lcs_len {
                # ... and it's the best we've got!
                $lcs_len = $c;
                $lcs = $s1.substr($n1 - $c, $c);
            }
        }
        # avoid copying by using binding
        @previous_suffix := @current_suffix;
    }
    return $lcs;
}

multi MAIN {
    say LCS get, get;
}

multi MAIN('test') {
    use Test;
    plan *;
    is LCS('otherworldlinesses', 'physiotherapy'), 'other', 'masaks example 1';
    is LCS('physiotherapy', 'otherworldlinesses'), 'other', 'masaks example 2';

    done_testing;
}

Readability

Very readable. The algorithm clearly stands out from the code.

Consistency

There's a bug on line 15. The $l2 should have been $l1. This manifests when the first string is long and the second one short:

$ perl6 solutions/p5-moritz/code 
here's a very long string with the longest substring at the very end
very end
er

A small oversight, but with quite disastrous consequences for correctness. The tests don't catch it, because they're testing same-length strings.

Clarity of intent

The in-code comments are nice in that they explain why the code looks the way it does, rather than what it does. Two of the comments tell the story about the intent of the actions, which is even nicer:

# look, current common substring just got one longer!

# ... and it's the best we've got!

Algorithmic efficiency

This algorithm is O(n₁n₂), where n₁ and n₂ are the lengths of $s1 and $s2, respectively. That's not extremely fast compared to what could be acheived in this task.

Idiomatic use of Perl 6

A small but nice convenience is the >>.chars on line 10. In my view, it helps highlight the symmetry of the assignment, too.

I'm liking the use of binding at line 34. It combines an eye for efficiency with the features of Perl 6.

Brevity

Not the shortest solution, but not unreasonably long either.