[Tutor] [Fwd: Re: Making a dictionary of dictionaries from csv file]

spir denis.spir at free.fr
Wed Dec 3 22:35:03 CET 2008


Judith Flores wrote:
> Dear Python community,
>
>    I have been trying to create a dictionary of dictionaries (and more
> dictionaries) from a csv file. The csv file contains longitudinal data
> corresponding to names. The following is just a very (very) simple example of
> how the data looks:
>
> Name   Day  weight  temp
> name1  1    45      37
> name1  3    55      36
> name2  1    59      36
> name2  3    34      36.5
> name3  1    66      37
> name3  3    87      36.8

Apart from the lack of ',', I'm not really sure I understand the logical
structure of your data: for instance, can the same day occur for several
names? If not, then the problem is simpler, as the day uniquely identifies your
data. Anyway, it seems clear that a (name+day) key is unique for a
(weight,temp) data pair. Right?
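
To make the (name+day) idea concrete, here is a minimal sketch of a flat dict
keyed by (name, day) tuples. The values are taken from your example rows; the
variable names are only illustrative, and the conversion to float is my
assumption about how you want to use the numbers.

```python
# Flat alternative: one dict keyed by (name, day) tuples,
# each value being a (weight, temp) pair.
measures = {
    ("name1", 1): (45.0, 37.0),
    ("name1", 3): (55.0, 36.0),
    ("name2", 1): (59.0, 36.0),
}

# Look up one reading directly by its compound key.
weight, temp = measures[("name1", 3)]
print(weight, temp)

# Collect all days recorded for a given name.
days_for_name1 = sorted(day for (name, day) in measures if name == "name1")
print(days_for_name1)
```

This only works if a given (name, day) pair never appears twice in the file.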

> So far I have written this:
>
> from csv import *
>
> f=open('myfile.csv',"rt")
>
> row={}
> maindict={}
>
> reader=DictReader(f)
>
> for row in reader:
>     maindict[row['Name']] = row
>
>
> then I can access the weight of a given name like this:
>
> wg=int(maindict['name1']['weight'])
>
>
>
> My question is the following:
>
> How can I convert the csv to a dictionary that would have the following structure?
>
> maindict = {
> 'name1' : {
> 'Day' : {
> 1 : { 'weight' : '45', 'temp' : '37' } ,
> 3 : { 'weight' : '55', 'temp' : '36' }
>   }
> },
> 'name2' : { ..... # and we repeat the same structure for the rest of the names.
> }
> From my code above you can of course guess that it doesn't make it beyond the
> level of name.

If I understand correctly, you can clarify your problem with a data type such
as this one (all pseudo code, untested):
class Measure(object):
	def __init__(self, weight, temp):
		self.weight = weight
		self.temp = temp
This removes one dict level. You can create a data object
x = Measure(weight, temp)
whose weight and temp attributes can be set, changed, and read individually as
'x.weight' and 'x.temp'. Much nicer, I think.

At a higher level, you can then put such Measure objects in a simple dict with
day keys and Measure values, so that an individual Measure object is accessed
as a dict value, e.g.:
day_measures = {day: measure, day: measure, ...}
measure = day_measures[day]
temp = day_measures[day].temp	# same as measure.temp

At top level, you then need a construct to hold the set of named data objects.
As they are named, it could be an enclosing dictionary of (name: day_measures)
pairs. This will be especially efficient if you frequently need to access the
data through the 'name' key.
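
Put together, the three levels might look like the sketch below. It assumes
the file really is comma-separated with a 'Name,Day,weight,temp' header; the
io.StringIO sample stands in for open('myfile.csv') and its rows are only
illustrative. Converting Day to int and the readings to float is my choice,
not something your data dictates.

```python
import csv
import io


class Measure(object):
    """One (weight, temp) reading."""

    def __init__(self, weight, temp):
        self.weight = weight
        self.temp = temp


# Stand-in for open('myfile.csv'); assumes comma-separated columns.
sample = io.StringIO(
    "Name,Day,weight,temp\n"
    "name1,1,45,37\n"
    "name1,3,55,36\n"
    "name2,1,59,36\n"
    "name2,3,34,36.5\n"
)

maindict = {}
for row in csv.DictReader(sample):
    # setdefault creates the inner day dict the first time a name is seen.
    day_measures = maindict.setdefault(row["Name"], {})
    day_measures[int(row["Day"])] = Measure(float(row["weight"]),
                                            float(row["temp"]))

print(maindict["name1"][3].weight)
print(sorted(maindict["name2"]))
```

With this layout a single reading is spelled maindict[name][day].weight, and
the extra 'Day' level from your target structure is no longer needed.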

An alternative (which I would choose, to avoid nested dicts) is to use Python's
ability to create attributes whose names are known only at runtime, using the
setattr() built-in function:

setattr(obj, name, val) gives the object 'obj' an attribute whose name is the
string held in 'name' and whose value is val. Very nice (*).

That way, you can have an overall Data object (choose an appropriate name) with
a whole bunch of attributes called by their 'name', each of whose values is a
day_measures dict:
data
	name1 : day_measures dict
	name2 : day_measures dict
	...

You need to create an attribute for each set of (named) day_measures dict you
read from the CSV file.

class Data(object): pass
data = Data()	# new empty object

while not end_of_file:
	<identify name>
	<read the set of (day:measure) pairs from CSV into day_measures dict>
	# create an attribute on the data object to store the measures
	setattr(data, name, day_measures)
# done

Now, from the outermost scope, an individual data item is spelled e.g.:
data.name1[day].temp	(for the name 'name1')
(which I find much more legible)
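
A small runnable sketch of the setattr() approach follows. The 'parsed' dict
is a hypothetical stand-in for whatever you read from the CSV file, and I use
plain (weight, temp) tuples here only to keep the example short.

```python
class Data(object):
    """Empty container; attributes are added at runtime."""


data = Data()

# Hypothetical, already-parsed readings: name -> {day: (weight, temp)}.
parsed = {
    "name1": {1: (45.0, 37.0), 3: (55.0, 36.0)},
    "name2": {1: (59.0, 36.0)},
}

# The attribute name comes from a string that is known only at runtime.
for name, day_measures in parsed.items():
    setattr(data, name, day_measures)

print(data.name1[3])
# getattr() is the runtime counterpart when the name is held in a variable.
print(getattr(data, "name2")[1])
print(hasattr(data, "name3"))
```

One caveat: a name that is not a valid Python identifier (one containing a
space, say) can still be stored with setattr(), but then it can only be read
back with getattr(), not with the dotted syntax.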

> Thank you very much,
>
> Judith

Hope I haven't totally misunderstood your problem.
Salutation,
denis

(*) The ordinary way to create an attribute being to write
obj.name = value
you need to know the name at coding time. The setattr() syntax instead lets
you set the name at runtime, like a dict key, so that any ordinary object
becomes an alternative to a dict.
[A major redundancy, indeed -- I personally no longer really know what the
proper use of a dict should be. All the more so since attributes are
implemented as a dict, so there shouldn't even be a performance loss. What do
you think, list members?]
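
The remark about attributes being implemented as a dict can be checked
directly. This is a minimal sketch for an ordinary class instance (classes
that define __slots__ behave differently); the stored value is a placeholder.

```python
class Data(object):
    pass


data = Data()
setattr(data, "name1", {1: "first reading"})

# Instance attributes live in the instance's __dict__, an ordinary dict.
print(vars(data))
print(vars(data) is data.__dict__)
print(data.__dict__["name1"] is data.name1)
```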


