toy list processing problem: collect similar terms
sln at netherlands.com
Thu Oct 7 20:44:31 EDT 2010
On Wed, 06 Oct 2010 10:52:19 -0700, sln at netherlands.com wrote:
>On Sat, 25 Sep 2010 21:05:13 -0700 (PDT), Xah Lee <xahlee at gmail.com> wrote:
>
>>here's an interesting toy list processing problem.
>>
>>I have a list of lists, where each sublist is labelled by
>>a number. I need to collect together the contents of all sublists
>>sharing
>>the same label. So if I have the list
>>
>>((0 a b) (1 c d) (2 e f) (3 g h) (1 i j) (2 k l) (4 m n) (2 o p) (4 q
>>r) (5 s t))
>>
>>where the first element of each sublist is the label, I need to
>>produce:
>>
>>output:
>>((a b) (c d i j) (e f k l o p) (g h) (m n q r) (s t))
>>
>[snip]
>
>>anyone care to give a solution in Python, Perl, javascript, or other
>>lang? am guessing the scheme solution can be much improved... perhaps
>>using some lib but that seems to show scheme is pretty weak if the lib
>>is non-standard.
>>
>
>Crossposting to Lisp, Python and Perl because the weird list of lists looks
>like Lisp or something else, and you mention other languages so I'm throwing
>this out for Perl.
>
>It appears this string you have there is actually list syntax in another language.
>If it is, it's the job of the language to parse the data out. Why then do you
>want to put it into another language form? At runtime, once the data is in variables,
>dictated by the syntax, you can do whatever data manipulation you want
>(combining arrays, etc..).
>
>So, in the spirit of a preprocessor, given that the text is balanced, with proper closure,
>ie: ( (data) (data) ) is ok.
> ( data (data) ) is not ok.
>
>the below does simple text manipulation, joining like labeled sublists, without going into
>the runtime guts of internalizing the data itself. Internally, this is too simple.
>
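If not a preprocessor, then ...
the too-simple, order-independent, id-independent Perl approach: the labels
need not be sorted or numeric, and each group lands in the output at the
position where its label first appears.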
-sln
-----------------
use strict;
use warnings;
use Data::Dump 'dump';
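my @inp = ([0,'a','b'],[1,'c','d'],[2,'e','f'],[3,'g','h'],
           [1,'i','j'],[2,'k','l'],[4,'m','n'],[2,'o','p'],
           [4,'q','r'],[5,'s','t']);
my ($cnt, @outp, %hs) = (0);
for my $ref (@inp) {
    # %hs maps each label to its slot in @outp, in order of first appearance.
    # Test with exists: slot 0 is false, so a plain truth test would hand the
    # first label a new slot if that label showed up again.
    exists $hs{ $$ref[0] } or $hs{ $$ref[0] } = $cnt++;
    # Append everything after the label to that slot's array.
    push @{$outp[ $hs{ $$ref[0] } ] }, @{$ref}[ 1 .. $#{$ref} ];
}
dump @outp;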
__END__
(
  ["a", "b"],
  ["c", "d", "i", "j"],
  ["e", "f", "k", "l", "o", "p"],
  ["g", "h"],
  ["m", "n", "q", "r"],
  ["s", "t"],
)
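
To show what "order independent, id independent" buys you, here is a rough sketch of the same idea wrapped in a sub. The name collect_by_label and the sample data are made up for illustration and are not from the post above. Labels can be any string, and a label that repeats after other labels still lands in its original group.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the original post): group sublists by their
# first element, keeping groups in order of each label's first appearance.
sub collect_by_label {
    my @lists = @_;
    my (@out, %slot);
    for my $ref (@lists) {
        my ($label, @rest) = @$ref;
        # exists() rather than a truth test, so slot 0 is never reassigned.
        $slot{$label} = scalar @out unless exists $slot{$label};
        push @{ $out[ $slot{$label} ] }, @rest;
    }
    return @out;
}

# Made-up sample: mixed string and numeric labels, both repeated.
my @grouped = collect_by_label(
    ['x', 'a', 'b'], [7, 'c'], ['x', 'd'], [7, 'e', 'f'],
);
print join(' ', map { '(' . join(' ', @$_) . ')' } @grouped), "\n";
# prints: (a b d) (c e f)
```

The hash only remembers which output slot a label owns, so the cost is one pass over the input and no sorting is needed.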