[Tutor] Setting up a database

Danny Yoo dyoo at hkn.eecs.berkeley.edu
Wed Mar 2 01:35:24 CET 2005



On Tue, 1 Mar 2005, James O. Sweeney wrote:

> I have an assignment that entails entering cash transactions as records.


Hi James,

Just to make it clear: we are prohibited from doing homework questions.
Since it's an assignment, we'll try to stay clear of direct solutions.


> My question is about setting up the database file. In Python there is a
> dictionary function but it looks as if only one value can be assigned to
> the key (I intend to make the key the timestamp).

A dictionary can be used to associate a single key with multiple values.
If we let each value be a list, then we can do a 'key' to 'records'
mapping:

###
>>> d = {}
>>> def addTally(name):
...     d.setdefault(name[0], []).append(name)
...
>>> addTally('john')
>>> addTally('brian')
>>> addTally('jane')
>>> addTally('alice')
>>> addTally('bob')
>>> d
{'a': ['alice'], 'j': ['john', 'jane'], 'b': ['brian', 'bob']}
###


Here, we can see that the key 'john' can be associated with multiple
values.  The body above:

######
d.setdefault(name[0], []).append(name)
######

is shorthand for:

######
if name not in d:
    d[name[0]] = []
d[name[0]].append(name)
######


> The file has to be searchable so that reports can be pulled regarding
> amount, date range, etc. - the usual things one would do with a
> database.

The direct solution would be to use a database here.



> I can't figure if I can make a query on a multi-field record in Python

You can.  Conceptually, if you can make a query into a predicate, then you
can always do a linear scan:

### Pseudocode
def doQuerySearch():
    resultSet = []
    for record in allRecordsInDatabase:
        if satisfiesQuery(record):
            resultSet.append(record)
    return resultSet
###

And satisfiesQuery() can do whatever it needs to do to see if the record
is a good one or not.

The problem here is one of efficiency: this is a linear scan.  For small
databases, this is fine.  For larger ones, it might not be so good.


Databases use indices and other optimization strategies to avoid a linear
scan across the entire database.  Dictionaries are one tool at our
disposal that allow us to avoid linear scans.  If we use several
dictionaries, then we might even be able to do quick lookup on different
fields.

But a lot of the work behind databases involves figuring out which field
constraint will cut down on the number of candidates most effectively, and
that's hard work.  That's why, if you can, you probably should use a real
database.



Best of wishes to you.



More information about the Tutor mailing list