toy list processing problem: collect similar terms

ccc31807 cartercc at gmail.com
Mon Sep 27 22:43:37 CEST 2010


On Sep 26, 12:05 am, Xah Lee <xah... at gmail.com> wrote:
> here's a interesting toy list processing problem.
>
> I have a list of lists, where each sublist is labelled by
> a number. I need to collect together the contents of all sublists
> sharing
> the same label. So if I have the list
>
> ((0 a b) (1 c d) (2 e f) (3 g h) (1 i j) (2 k l) (4 m n) (2 o p) (4 q
> r) (5 s t))
>
> where the first element of each sublist is the label, I need to
> produce:
>
> output:
> ((a b) (c d i j) (e f k l o p) (g h) (m n q r) (s t))

Here is a solution in Perl -- the verbose version. Please see my note
below.

SCRIPT:
use strict;
use warnings;
my %lists;

while (<DATA>)
{
    chomp;
    my ($k, @v) = split(/ /, $_);
    push(@{$lists{$k}}, @v);
}

foreach my $k (sort keys %lists)
{
    print "$k - @{$lists{$k}}\n";
}

exit(0);

__DATA__
0 a b
1 c d
2 e f
3 g h
1 i j
2 k l
4 m n
2 o p
4 q r
5 s t

OUTPUT:
>perl lists.plx
0 - a b
1 - c d i j
2 - e f k l o p
3 - g h
4 - m n q r
5 - s t

NOTE:
I assume that you want an idiomatic solution for the language. I have
therefore converted your data into a typical record oriented
structure. Perlistas don't use parenthesis. If you want a Lispy
solution, use Lisp. Further, Perl was made for exactly this kind of
problem, which is simple data munging, taking some input, transforming
it, and printing it out -- Practical Extraction and Reporting
Language. I know that some Lispers (R.G., are you reading?) will
object to a formulation like this: @{$lists{$k}}, but all this says
(in Perl) is to spit out the value contained in the hash element
$lists{$k} as an array, and is good idiomatic Perl, even if some
Lispers aren't quite up to the task of understanding it.

CC.



More information about the Python-list mailing list