# Learning Python via a little word frequency program

Fredrik Lundh fredrik at pythonware.com
Wed Jan 9 12:33:56 CET 2008

```
Andrew Savige wrote:

> Here's my first attempt:
>
> names = "freddy fred bill jock kevin andrew kevin kevin jock"
> freq = {}
> for name in names.split():
>     freq[name] = 1 + freq.get(name, 0)
> deco = zip([-x for x in freq.values()], freq.keys())
> deco.sort()
> for v, k in deco:
>     print "%-10s: %d" % (k, -v)
>
> I'm interested to learn how more experienced Python folks would solve
> this little problem. Though I've read about the DSU Python sorting idiom,
> I'm not sure I've strictly applied it above ... and the -x hack above to
> achieve a descending sort feels a bit odd to me, though I couldn't think
> of a better way to do it.

sort takes a reverse flag in recent versions, so you can do a reverse
sort as:

deco.sort(reverse=True)

in older versions, just do:

deco.sort()
deco.reverse() # this is fast!

also, recent versions have a "sorted" builtin that returns the sorted
list, and both "sort" and "sorted" now allow you to pass in a "key"
function that's used to generate a sort key for each item.  taking that
into account, you can simply write:

# sort items on descending count
deco = sorted(freq.items(), key=lambda x: -x[1])

simplifying the print statement is left as an exercise.

> I also have a few specific questions. Instead of:
>
> for name in names.split():
>     freq[name] = 1 + freq.get(name, 0)
>
> I might try:
>
> for name in names.split():
>     try:
>         freq[name] += 1
>     except KeyError:
>         freq[name] = 1
>
> Which is preferred?

for simple scripts and small datasets, always the former.

for performance-critical production code, it depends on how often you
expect "name" to be present in the dictionary (setting up a try/except
is cheap, but raising and catching an exception is relatively costly).

> Ditto for:
>
> deco = zip([-x for x in freq.values()], freq.keys())
>
> versus:
>
> deco = zip(map(operator.neg, freq.values()), freq.keys())

using zip/keys/values to emulate items is a bit questionable.  if you
need to restructure the contents of a dictionary, I usually prefer items
(or iteritems, where suitable) and tuple indexing/unpacking in a list
comprehension (or generator expression, where suitable).

> Finally, I might replace:
>
> for v, k in deco:
>     print "%-10s: %d" % (k, -v)
>
> with:
>
> print "\n".join("%-10s: %d" % (k, -v) for v, k in deco)

why?

</F>

```
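The suggestions in the reply above (items plus tuple unpacking, a "key" function, the reverse flag) can be pulled together roughly as follows. This is only a sketch, written for Python 3 rather than the Python 2 used in the thread, and the function names (count_words, count_words_try, by_descending_count, format_lines) are made up for illustration; they do not appear in the original messages.

```python
# Python 3 sketch of the approach discussed in the thread.
# All function names here are hypothetical, chosen for illustration.

def count_words(text):
    """Count whitespace-separated words using dict.get (the first variant)."""
    freq = {}
    for name in text.split():
        freq[name] = 1 + freq.get(name, 0)
    return freq


def count_words_try(text):
    """Same counts, using try/except KeyError (the second variant)."""
    freq = {}
    for name in text.split():
        try:
            freq[name] += 1
        except KeyError:
            # Only reached the first time a name is seen.
            freq[name] = 1
    return freq


def by_descending_count(freq):
    """Return (name, count) pairs sorted on count, highest first."""
    # A key function plus reverse=True replaces the negated-count trick.
    return sorted(freq.items(), key=lambda item: item[1], reverse=True)


def format_lines(pairs):
    """Format each pair the way the original print statement did."""
    return ["%-10s: %d" % (name, count) for name, count in pairs]


names = "freddy fred bill jock kevin andrew kevin kevin jock"
ranked = by_descending_count(count_words(names))
for line in format_lines(ranked):
    print(line)
```

One difference from the decorated-tuple version: sorting on the count alone leaves names with equal counts in the order the dictionary yields them (first-seen order on Python 3.7 and later), not in alphabetical order. If alphabetical tie-breaking matters, a key such as `lambda item: (-item[1], item[0])` without the reverse flag would be one way to get it.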