[Tutor] table to dictionary and then analysis
Russel Winder
russel at winder.org.uk
Tue May 15 11:36:45 CEST 2012
On Mon, 2012-05-14 at 23:38 -0400, bob gailer wrote:
[...]
> I would set up a SQLite database with a table of 4 numeric columns:
> year, month, rainfall, firearea
> Use SQL to select the desired date range and do the max and avg
> calculations:
> select year, avg(firearea), max(rainfall) from table where year = 1973
> and month between 6 and 8
>
> You can use dictionaries but that will be harder. Here is a start
> (untested). Assumes data are correct.
Clearly if the data is to be stored for a long time and have various
(currently unknown) queries run over it, then yes, a database is the
right thing -- though I would probably choose a non-SQL database.
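For comparison, the SQLite route quoted above could be sketched roughly as follows with Python's standard sqlite3 module. This is only an illustration under stated assumptions: the table name climate, the helper summer_summary and the sample rows are all made up here (the quoted query uses "table", which is a reserved word in SQL), and an in-memory database stands in for a real file.

```python
import sqlite3

# Hypothetical sample rows (year, month, rainfall, firearea); invented for illustration only.
SAMPLE_ROWS = [
    (1973, 5, 10.0, 1.0),
    (1973, 6, 20.0, 2.0),
    (1973, 7, 30.0, 4.0),
    (1973, 8, 40.0, 6.0),
    (1974, 7, 50.0, 8.0),
]


def summer_summary(connection, year):
    """Return (year, average firearea, maximum rainfall) for June to August of one year."""
    cursor = connection.execute(
        "SELECT year, AVG(firearea), MAX(rainfall) FROM climate "
        "WHERE year = ? AND month BETWEEN 6 AND 8",
        (year,),
    )
    return cursor.fetchone()


# An in-memory database keeps the sketch self-contained; a real one would live in a file.
connection = sqlite3.connect(":memory:")
connection.execute(
    "CREATE TABLE climate (year INTEGER, month INTEGER, rainfall REAL, firearea REAL)"
)
connection.executemany("INSERT INTO climate VALUES (?, ?, ?, ?)", SAMPLE_ROWS)

result = summer_summary(connection, 1973)
print(result)
```

The year is passed as a query parameter rather than pasted into the SQL string, so the same helper can be reused for other years.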
If the issue is just to do quick calculations over the data in the file
format, then there is nothing wrong with using dictionaries or parallel
arrays à la:
# Assumes whitespace-separated columns: year, month name ( 'Jan' .. 'Dec' ), rainfall, fire area.
with open ( 'yearmonthrainfire.txt' ) as infile :
    climateindexname = infile.readline ( ).split ( )  # header line with the column names
    data = [ line.split ( ) for line in infile if line.strip ( ) ]  # skip blank lines
years = sorted ( { item[0] for item in data } )
months = [ 'Jan' , 'Feb' , 'Mar' , 'Apr' , 'May' , 'Jun' , 'Jul' , 'Aug' , 'Sep' , 'Oct' , 'Nov' , 'Dec' ]
# Group the ( rainfall , firearea ) pairs by year and by month.
dataByYear = { year : [ ( float ( item[2] ) , float ( item[3] ) ) for item in data if item[0] == year ] for year in years }
dataByMonth = { month : [ ( float ( item[2] ) , float ( item[3] ) ) for item in data if item[1] == month ] for month in months }
def averages ( rows ) :
    # Column-wise means of a list of ( rainfall , firearea ) pairs; ( None , None ) if there are no rows.
    if not rows :
        return ( None , None )
    return tuple ( sum ( column ) / len ( rows ) for column in zip ( *rows ) )
averagesByYear = { year : averages ( dataByYear[year] ) for year in years }
averagesByMonth = { month : averages ( dataByMonth[month] ) for month in months }
for year in years :
    print ( year , averagesByYear[year][0] , averagesByYear[year][1] )
for month in months :
    print ( month , averagesByMonth[month][0] , averagesByMonth[month][1] )
The cost of the repetition in the code here is probably minimal compared
to the disc access costs. On the other hand this is a small data set so
time is probably not a big issue.
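Should the repeated scans over the data ever start to matter, the grouping can be done in a single pass instead. The following is a minimal sketch, not a tuned implementation: group_averages is a hypothetical helper, and the sample rows are invented, laid out in the same split-string form as data above.

```python
from collections import defaultdict


def group_averages(rows, key_index):
    """Average rainfall and fire area per key, visiting each row exactly once."""
    # Each entry holds: rainfall sum, fire area sum, row count.
    totals = defaultdict(lambda: [0.0, 0.0, 0])
    for row in rows:
        entry = totals[row[key_index]]
        entry[0] += float(row[2])
        entry[1] += float(row[3])
        entry[2] += 1
    return {
        key: (rain / count, fire / count)
        for key, (rain, fire, count) in totals.items()
    }


# Hypothetical rows: year, month name, rainfall, fire area (values invented).
sample = [
    ['1973', 'Jun', '10.0', '1.0'],
    ['1973', 'Jul', '30.0', '3.0'],
    ['1974', 'Jun', '20.0', '5.0'],
]

by_year = group_averages(sample, 0)    # keyed on the year column
by_month = group_averages(sample, 1)   # keyed on the month column
print(by_year)
print(by_month)
```

Keys that never occur in the data simply do not appear in the result, so there is no division by zero to guard against.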
--
Russel.
=============================================================================
Dr Russel Winder t: +44 20 7585 2200 voip: sip:russel.winder at ekiga.net
41 Buckmaster Road m: +44 7770 465 077 xmpp: russel at winder.org.uk
London SW11 1EN, UK w: www.russel.org.uk skype: russel_winder