toy list processing problem: collect similar terms

WJ w_a_x_man at yahoo.com
Thu Feb 17 08:19:43 CET 2011


Xah Lee wrote:

> here's a interesting toy list processing problem.
> 
> I have a list of lists, where each sublist is labelled by
> a number. I need to collect together the contents of all sublists
> sharing
> the same label. So if I have the list
> 
> ((0 a b) (1 c d) (2 e f) (3 g h) (1 i j) (2 k l) (4 m n) (2 o p) (4 q
> r) (5 s t))
> 
> where the first element of each sublist is the label, I need to
> produce:
> 
> output:
> ((a b) (c d i j) (e f k l o p) (g h) (m n q r) (s t))
> 

Solving without hash-tables or "group-by".

Using Guile:

First, sort the groups by the numbers.

(stable-sort groups (lambda(a b)(< (car a) (car b))))

((0 a b) (1 c d) (1 i j) (2 e f) (2 k l) (2 o p) (3 g h)
 (4 m n) (4 q r) (5 s t))

Next, flatten the list.

(append-map identity step1)

(0 a b 1 c d 1 i j 2 e f 2 k l 2 o p 3 g h 4 m n 4 q r 5 s t)

Remove duplicate numbers.

(delete-duplicates step2)

(0 a b 1 c d i j 2 e f k l o p 3 g h 4 m n q r 5 s t)

We now need a very useful function called "scan".

;; Yields sublists of contiguous items that satisfy func.
(define (scan func lst)
  (let ((tail (find-tail func lst)))
    (if tail
      (cons (take-while func tail)
            (scan func (drop-while func tail)))
      '())))

(scan symbol? step3)

((a b) (c d i j) (e f k l o p) (g h) (m n q r) (s t))



More information about the Python-list mailing list