[Tutor] Dictionary - count values where values are stored as a list

GTXY20 gtxy20 at gmail.com
Mon Oct 1 18:21:22 CEST 2007


This works perfectly.

However, I will be importing a very large dictionary. If I run these
commands interactively at the command line, it seems to be very taxing on
the CPU and memory and will take a long time.

I was thinking of wrapping each step in a function so that Python would just
write the results to a file instead of running everything within a Python
shell. Do you think this would speed up the process?

In total I will probably be looking at about 2 million dictionary keys,
each with a varying number of values.
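
Something like the following is roughly what I have in mind. It is only a
sketch built on the defaultdict loop from your reply; the function names
count_values and write_counts and the output path are placeholders I made
up, and I have not tried it on the full data set:

```python
from collections import defaultdict

def count_values(d):
    """Count how often each item appears across all the value lists."""
    counts = defaultdict(int)
    for lst in d.values():
        for item in lst:
            counts[item] += 1
    return counts

def write_counts(counts, path):
    """Write one 'value count' line per item, sorted by value."""
    with open(path, 'w') as f:
        for value, qty in sorted(counts.items()):
            f.write('%s %d\n' % (value, qty))

# Hypothetical usage: big_dict stands in for the imported dictionary.
# write_counts(count_values(big_dict), 'counts.txt')
```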

M.


On 10/1/07, Kent Johnson <kent37 at tds.net> wrote:
>
> GTXY20 wrote:
> > Hello,
> >
> > Any way to display the count of the values in a dictionary where the
> > values are stored as a list? here is my dictionary:
> >
> > {'1': ['a', 'b', 'c'], '3': ['a', 'b', 'c'], '2': ['a', 'b', 'c'], '4':
> > ['a', 'c']}
> >
> > I would like to display count as follows and I would not know all the
> > value types in the values list:
> >
> > Value QTY
> > a       4
> > b       3
> > c       4
>
> You need two nested loops - one to loop over the dictionary values and
> one to loop over the individual lists. collections.defaultdict is handy
> for accumulating the counts but you could use a regular dict also:
>
> In [4]: d={'1': ['a', 'b', 'c'], '3': ['a', 'b', 'c'], '2': ['a', 'b',
> 'c'], '4': ['a', 'c']}
> In [5]: from collections import defaultdict
> In [6]: counts=defaultdict(int)
> In [7]: for lst in d.values():
>    ...:     for item in lst:
>    ...:         counts[item] += 1
>    ...:
> In [8]: counts
> Out[8]: defaultdict(<type 'int'>, {'a': 4, 'c': 4, 'b': 3})
>
> In [10]: for k, v in sorted(counts.items()):
>    ....:     print k,v
>    ....:
>    ....:
> a 4
> b 3
> c 4
>
>
> > Also is there anyway to display the count of the values list
> > combinations so here again is my dictionary:
> >
> > {'1': ['a', 'b', 'c'], '3': ['a', 'b', 'c'], '2': ['a', 'b', 'c'], '4':
> > ['a', 'c']}
> >
> >
> > And I would like to display as follows
> >
> > QTY Value List Combination
> > 3      a,b,c
> > 1      a,c
>
> Again you can use a defaultdict to accumulate counts. You can't use a
> mutable object (such as a list) as a dict key so you have to convert it
> to a tuple:
>
> In [11]: c2=defaultdict(int)
> In [13]: for v in d.values():
>    ....:     c2[tuple(v)] += 1
>    ....:
> In [14]: c2
> Out[14]: defaultdict(<type 'int'>, {('a', 'b', 'c'): 3, ('a', 'c'): 1})
>
> Printing in order of count requires switching the order of the (key,
> value) pairs:
>
> In [15]: for count, items in sorted( ((v, k) for k, v in c2.items()),
> reverse=True):
>    ....:     print count, ', '.join(items)
>    ....:
> 3 a, b, c
> 1 a, c
>
> or using a sort key:
> In [16]: from operator import itemgetter
> In [17]: for items, count in sorted(c2.items(), key=itemgetter(1),
> reverse=True):
>    ....:     print count, ', '.join(items)
>    ....:
> 3 a, b, c
> 1 a, c
>
> Kent
>