[Tutor] table to dictionary and then analysis
Joel Goldstick
joel.goldstick at gmail.com
Wed May 16 18:57:36 CEST 2012
On Wed, May 16, 2012 at 11:03 AM, Alan Gauld <alan.gauld at btinternet.com> wrote:
> On 16/05/12 12:27, Russel Winder wrote:
>
>>> As a matter of interest why?
>>
>>
>> Because there are alternatives that need to be investigated on a per
>> problem basis for the best database.
>
>
> I agree, but in this case SQL seemed like the most likely fit of the ones I
> knew. however:
>
>> SQL
>> MongoDB
>
>
> I know about these
>
>> CouchDB
>> Cassandra
>> Neo
>
>
> These are new to me.
>
>
>> etc. Python only has SQLite3 as standard but there are alternatives. I
>> have been using PyMongo quite successfully.
>
>
> Python comes with several storage/access options including shelve, gdbm,
> ldap, cobfig files, XML, in addition to SQL.
>
>
>> It is not clear that the original table works better with the relational
>> model compared to one of the key-value stores or document stores.
>
>
> Most key-value stores are optimised for fast queries of a single type
> and generally not great at grouping or ordering. They also tend to major on
> flexiblity of data format. The OPs requirements suggested intelligent
> filtering of a fixed record format which is one of the areas where SQL works
> well. The other side of the coin is that the data is essentially single
> table so the relationship management aspects of SQL would not be needed. So
> I agree we don't have enough detail
> to be 100% sure that another option would not work as well or better.
>
> But most other options require learning new (often bespoke) query languages
> and have limited user tools. All of these factors need to be included too.
> Mongo et al tend to be better suited, in my experience, to machine access
> applications rather than end user access.
>
>
>> There are various articles around the Web comparing and contrasting
>> these various models. Some of the articles are even reasonable :-)
>
>
> Wikipedia is my friend :-)
>
>
> --
> Alan G
> Author of the Learn to Program web site
> http://www.alan-g.me.uk/
>
> _______________________________________________
> Tutor maillist - Tutor at python.org
> To unsubscribe or change subscription options:
> http://mail.python.org/mailman/listinfo/tutor
I think the OP is just learning and this thread may have gotten of track.
Here is some code to get started. I decided to use sqlite3 since its
easy to use with python -- no finding and learning to load packages.
#!/usr/bin/env python
import sqlite3 as db
# Ideally this shouldn't be global, but in this short code snippet it
gets the job done
# here we create a database and get a cursor
conn = db.connect('climate.db')
cursor = conn.cursor()
print cursor
# this will create a table for our data
sql_create = """CREATE TABLE if not exists rain (
id INTEGER PRIMARY KEY,
year INTEGER,
month TEXT(3),
rainfall FLOAT,
fire_area FLOAT
)"""
# this will read the data file and put it in our database
def populate_climate_table(file_name):
"""
reads the file_name and insert data into sqlite table
"""
sql_insert_string = "insert into rain (year, month, rainfall,
fire_area) values (%d, '%s', %f, %f)"
f = open(file_name)
f.readline() # get rid of column headers
for l in f.readlines():
data_list = l.split()
print data_list
sql_insert = sql_insert_string % (int(data_list[0]),
data_list[1], float(data_list[2]), float(data_list[3]))
print sql_insert
cursor.execute(sql_insert)
conn.commit()
if __name__ == '__main__':
print sql_create
cursor.execute(sql_create)
populate_climate_table('data.txt')
So, I haven't solved all of the questions with this code. The next
thing to do is to read a little about sqlite select statements.
for example: sqlite> select sum(rainfall)/count(*) from rain;
3.97352768125
This statement will give the average rainfall over the complete dataset.
To get the ave rainfall for a given year do this:
sqlite> select sum(rainfall)/count(*) from rain where year = 1983;
Come back with more questions
--
Joel Goldstick
More information about the Tutor
mailing list