Duplicate keys in dict?
Tim Chase
python.list at tim.thechases.com
Sun Mar 7 14:11:18 EST 2010
vsoler wrote:
> On 7 mar, 17:53, Steven D'Aprano <st... at REMOVE-THIS-
> cybersource.com.au> wrote:
>> On Sun, 07 Mar 2010 08:23:13 -0800, vsoler wrote:
>>> Hello,
>>> My code snippet reads data from excel ranges. First row and first column
>>> are column headers and row headers respectively. After reading the range
>>> I build a dict.
>>>          'A'    'B'
>>> 'ab'      3      5
>>> 'cd'      7      2
>>> 'cd'      9      1
>>> 'ac'      7      2
>>> d={('ab','A'): 3, ('ab','B'): 5, ('cd','A'): 7, ...
>>> However, as you can see there are two rows that start with 'cd', and
>>> dicts, AFAIK do not accept duplicates.
>>> One of the difficulties I find here is that I want to be able to easily
>>> sum all the values for each row key: 'ab', 'cd' and 'ac'. However,
>>> using lists inside dicts makes it a difficult issue for me.
>
> What I need is that sum(('cd','A')) gives me 16, sum(('cd','B')) gives
> me 3.
But you really *do* want lists inside the dict if you want to be
able to call sum() on them. You want to map the tuple ('cd','A')
to the list [7,9] so you can sum the results. And if you plan to
sum the results, it's far easier to have one-element lists and
just sum them, instead of having to special case "if it's a list,
sum it, otherwise, return the value". So I'd use something like:

    import csv
    from collections import defaultdict

    f = open(INFILE, 'rb')
    r = csv.reader(f, ...)
    headers = r.next() # discard the headers
    d = defaultdict(list)
    for (label, a, b) in r:
        # key on (row label, column header), e.g. ('cd', 'A')
        d[(label, 'A')].append(int(a))
        d[(label, 'B')].append(int(b))
    # ...
    for (label, col), value in d.iteritems():
        print label, col, 'sum =', sum(value)

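For anyone on Python 3, here is a minimal self-contained sketch of the same list-per-key idea. It assumes the range has been exported as CSV with a header row; the SAMPLE text and the group_values name are made up for illustration, and a real script would pass an open file instead of io.StringIO.

```python
import csv
import io
from collections import defaultdict

# Hypothetical stand-in for the exported spreadsheet range; a real script
# would use open(INFILE, newline="") instead of io.StringIO.
SAMPLE = """label,A,B
ab,3,5
cd,7,2
cd,9,1
ac,7,2
"""

def group_values(lines):
    """Map each (label, column) key to the list of values seen for it."""
    reader = csv.reader(lines)
    next(reader)  # discard the header row
    d = defaultdict(list)
    for label, a, b in reader:
        d[(label, "A")].append(int(a))
        d[(label, "B")].append(int(b))
    return d

d = group_values(io.StringIO(SAMPLE))
for (label, col), values in sorted(d.items()):
    print(label, col, "sum =", sum(values))
```

With the sample rows from the question, d[("cd", "A")] holds [7, 9], so sum(d[("cd", "A")]) is 16 and sum(d[("cd", "B")]) is 3, which is what was asked for.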
Alternatively, if you don't need to store the intermediate
values, and just want to store the sums, you can accrue them as
you go along:

    from collections import defaultdict

    d = defaultdict(int)
    for (label, a, b) in r:
        # missing keys start at 0, so += works on first sight of a key
        d[(label, 'A')] += int(a)
        d[(label, 'B')] += int(b)
    # ...
    for (label, col), value in d.iteritems():
        print label, col, 'sum =', value

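A Python 3 sketch of the running-total variant, again only under assumptions: the rows from the question are inlined as tuples rather than read from a file, and ROWS and running_totals are hypothetical names. It also keeps a second defaultdict keyed on the row label alone, since the original question mentions summing all the values for each row key.

```python
from collections import defaultdict

# Sample rows from the question as (label, A, B); ROWS is a made-up name.
ROWS = [("ab", 3, 5), ("cd", 7, 2), ("cd", 9, 1), ("ac", 7, 2)]

def running_totals(rows):
    """Accumulate sums per (label, column) and per label in one pass."""
    totals = defaultdict(int)      # (label, column) -> sum
    row_totals = defaultdict(int)  # label -> sum across both columns
    for label, a, b in rows:
        totals[(label, "A")] += a
        totals[(label, "B")] += b
        row_totals[label] += a + b
    return totals, row_totals

totals, row_totals = running_totals(ROWS)
for (label, col), value in sorted(totals.items()):
    print(label, col, "sum =", value)
for label, value in sorted(row_totals.items()):
    print(label, "row sum =", value)
```

The trade-off is the one already described: this uses less memory, but the individual values (7 and 9 for 'cd' in column 'A') are gone once they have been added in.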
Both are untested, but I'm pretty sure they're both viable,
modulo my sleep-deprived eyes.
-tkc