Download the raw code.
use v6;
my @strA = $*IN.get.comb;
my @strB = $*IN.get.comb;
my %offsetB;
%offsetB.push( @strB Z 0..* );
my $max-substr = '';
for 0..* Z @strA -> $i, $chr {
next unless %offsetB.exists: $chr;
for @( %offsetB{$chr} ) -> $j {
my $substr = [~] gather for @strA[$i..*-1] Z @strB[$j..*-1] -> $a, $b {
last if $a ne $b;
take $a;
};
$max-substr = $substr if $substr.chars > $max-substr.chars;
}
last if $max-substr.chars >= +@strA | +@strB - $i;
}
say $max-substr;
I find this script pleasant and readable.
Nothing to remark on here.
%offsetB
is a fine name, though I'd have gone with %indexB
or something.
The concept of an offset makes more sense when the array is accessed through
a pointer, like in C or assembly.
So here's the odd thing: I would have expected a suffix string solution to beat any solution which just iterates the strings. But this solution is blazingly fast. At least under the current limitations of Rakudo, this is clearly the way to find longest common substrings.
It appears to me that the time complexity is O(n1cm), where
n1 is the length of @strA
, c is the size of the biggest
array in %offsetB
(that is, number of repetitions of the most frequent
character in @strB
), and m is the length of the longest common substring.
Now that's an upper limit, but in most cases we'll loop fewer times
than c and m in the inner loops.
Here I must mention line 7, with its %offsetB.push( @strB Z 0..* );
A
very nice idiom, and this is its very first use, as far as I know. When
you push pairs where the keys are the same, all the values of those
keys will end up collected into an array under that key. So we're basically
using .push
for its collecting capability here.
Writing line 11 as for 0..* Z @strA -> $i, $chr {
is fine, though the
more idiomatic way rather than Z
is to do @strA.kv
. To each his own,
though. This way of doing it is possibly more explicit, and it's tantalizing
with its lazy save from the clutches of the infinite range.
The gather
block could have been a map
here, but while the last
would have worked equally well in a map
, its presence means we're doing
slightly non-trivial things, so a gather
is fine, I guess.
Nice use of a junction on line 23 as well. That's the kind of junction that is used just to group things for ease of reading/writing, and that could easily be desugared into not using a junction by a savvy enough compiler.
Very short.