Dict to "flat" list of (key,value)

Raymond Hettinger vze4rx4y at verizon.net
Mon Aug 4 02:27:02 EDT 2003


[Raymond]
> >index = {}
> >for pagenum in range(len(pages)):
> >    page = pages[pagenum]
> >    for word in page:
> >        index.setdefault(word, []).append(pagenum)
> >
> >it-all-starts-with-a-good-data-structure-ly yours,

[John Machin]
> Amen, brother.
>
> Even worse: I recall seeing code somewhere that had a data structure
> like this:
>
> {k1: [v1,v2], k2: v3, k3: None, ...}
> instead of
> {k1: [v1,v2], k2: [v3], k3: [], ...}
>
> I liked the elegant code example for building a book index. However in
> practice the user requirement would be not to have duplicated page
> numbers when a word occurs more than once on the same page. If you can
> achieve that elegantly, please post it!

How about a simple Py2.3 clean-up pass:

    # Uniquify and sort each list of page numbers
    for word, pagelist in index.iteritems():
        newlist = dict.fromkeys(pagelist).keys()
        newlist.sort()
        index[word] = newlist
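
For readers on Python 3, where iteritems() no longer exists and keys() returns a view rather than a list, a rough equivalent is sketched below. It assumes the index is built as in the quoted loop; the sample `pages` data is made up for illustration, and sorted(set(...)) is swapped in for the dict.fromkeys() trick.

```python
# Sketch for Python 3. The contents of `pages` are hypothetical sample data:
# each page is a list of the words that appear on it.
pages = [
    ["spam", "eggs", "spam"],   # page 0: "spam" occurs twice
    ["eggs", "ham"],            # page 1
    ["spam"],                   # page 2
]

# Build the index as in the quoted loop (enumerate replaces range(len(...))).
index = {}
for pagenum, page in enumerate(pages):
    for word in page:
        index.setdefault(word, []).append(pagenum)

# Uniquify and sort each list of page numbers.
# Rebinding the value of an existing key does not resize the dict,
# so it is safe to do while iterating over items().
for word, pagelist in index.items():
    index[word] = sorted(set(pagelist))

print(index)
```

Using set() here drops the duplicate page numbers and sorted() returns them as a new ordered list, so the separate sort() step is not needed.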


Raymond Hettinger





