Dict to "flat" list of (key,value)
Raymond Hettinger
vze4rx4y at verizon.net
Mon Aug 4 02:27:02 EDT 2003
[Raymond]
> > index = {}
> > for pagenum in range(len(pages)):
> >     page = pages[pagenum]
> >     for word in page:
> >         index.setdefault(word, []).append(pagenum)
> >
> >it-all-starts-with-a-good-data-structure-ly yours,
[John Machin]
> Amen, brother.
>
> Even worse: I recall seeing code somewhere that had a data structure
> like this:
>
> {k1: [v1,v2], k2: v3, k3: None, ...}
> instead of
> {k1: [v1,v2], k2: [v3], k3: [], ...}
>
> I liked the elegant code example for building a book index. However in
> practice the user requirement would be not to have duplicated page
> numbers when a word occurs more than once on the same page. If you can
> achieve that elegantly, please post it!
How about a simple Py2.3 clean-up pass:
# Uniquify and sort each list of page numbers
for word, pagelist in index.iteritems():
    newlist = dict.fromkeys(pagelist).keys()
    newlist.sort()
    index[word] = newlist
Raymond Hettinger
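
For readers on a later Python, here is a minimal sketch of the same idea that avoids the separate clean-up pass by collecting page numbers in a set while the index is built. It is not part of the original post: it assumes the built-in set type and collections.defaultdict, neither of which Py2.3 had, and the function name build_index and the sample pages are made up for illustration.

```python
from collections import defaultdict

def build_index(pages):
    """Map each word to a sorted list of the page numbers it appears on."""
    seen = defaultdict(set)              # word -> set of page numbers
    for pagenum, page in enumerate(pages):
        for word in page:
            seen[word].add(pagenum)      # a set ignores repeats on the same page
    # Convert each set to a sorted list, so every value has the same type.
    return {word: sorted(nums) for word, nums in seen.items()}

# Hypothetical sample data: each page is a list of words.
pages = [["spam", "eggs", "spam"], ["eggs"], ["spam", "ham"]]
index = build_index(pages)
```

Every value in the result is a list, possibly with one element, which keeps to the uniform {k1: [v1,v2], k2: [v3], ...} shape argued for above.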