attribute access and indirection

Ethan Furman ethan at
Thu Oct 22 15:21:13 CEST 2009

Greetings, List!

Say I have an old-fashioned dbf style table, with a single name field of
40 characters:

   names = dbf.Table(':memory:', 'name C(40)')

Then I add a bunch of names from who-knows-where:

   for name in some_iterable():

Now I want to know how many start with a 'J'...
I have two options, 1) brute force:

   matches = [rec for rec in names if[0] == 'J']

or, 2) binary search after ordering:

   names.order('name')
   matches ='J', startswith=True)
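(For plain sorted lists, the stdlib's bisect module does the same
binary-search-on-a-prefix trick; a minimal sketch, assuming the names
are ordinary strings outside any dbf table:)

```python
import bisect

def prefix_matches(sorted_names, prefix):
    # find the first entry >= prefix, then walk forward while
    # the prefix still matches
    lo = bisect.bisect_left(sorted_names, prefix)
    hi = lo
    while hi < len(sorted_names) and sorted_names[hi].startswith(prefix):
        hi += 1
    return sorted_names[lo:hi]

names = sorted(['Alice', 'James', 'Joan', 'John', 'Kim'])
print(prefix_matches(names, 'J'))   # ['James', 'Joan', 'John']
```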

So far so good.  Now it gets a little more complicated.  In my use case
I am trying to match records from one database to records from another
database;  therefore, I do _not_ know what text I will be searching for,
only the fields I will be using.

If I had only one criterion, I'd still be okay:

   different_table.order('zipcode[:5], last_name')
   for record in original_table:
     matches =[record.zipcode[:5], record.last_name])

However, I have three different sets of matches:
   'first_name[:1], last_name, city, dlvryaddrs[:4]'
   'first_name[:1], last_name[:5], dlvryaddrs[:8]'
   'first_name, last_name, city, state'

This is not a problem for the ordering, as I can just do
   for criteria in choices:
       different_table.order(criteria)

The problem comes at the matching stage:  the .search method is
expecting a list of the pieces it is supposed to find, so what I need is
a way to apply, for example, 'first_name[:1], last_name[:5],
dlvryaddrs[:8]', to the current record to yield the text to search for.
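(One way to sketch that step: split the criteria string on commas, pull
an optional '[:n]' slice out with a regex, and apply each piece to the
record with getattr.  The FIELD_SPEC pattern and FakeRecord below are
illustrations of the idea, not part of the dbf module:)

```python
import re

# matches e.g. 'last_name' or 'dlvryaddrs[:8]' (illustration only)
FIELD_SPEC = re.compile(r'\s*(\w+)\s*(?:\[\s*:\s*(\d+)\s*\])?\s*$')

def apply_criteria(record, criteria):
    """Turn 'first_name[:1], last_name[:5]' into
    [record.first_name[:1], record.last_name[:5]]."""
    pieces = []
    for spec in criteria.split(','):
        field, stop = FIELD_SPEC.match(spec).groups()
        value = getattr(record, field)
        if stop is not None:
            value = value[:int(stop)]
        pieces.append(value)
    return pieces

class FakeRecord(object):
    # stand-in for a real dbf record
    first_name = 'John'
    last_name = 'Smithson'
    dlvryaddrs = '123 Main St'

print(apply_criteria(FakeRecord, 'first_name[:1], last_name[:5], dlvryaddrs[:8]'))
# ['J', 'Smith', '123 Main']
```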

Current code follows, more comments towards the end.

import dbf
import shutil
from collections import defaultdict
from cookbook.utils import index

source_tables = [ '/temp/kaz15514',
                   '/temp/msq15994' ]

counts = defaultdict(int)

for i in index(source_tables):
     source_tables[i] = dbf.Table(source_tables[i])

shutil.copy('z:/orders/25105/mbk16508_02', '.')
match_back = dbf.Table('mbk16508_02')
match_back.add_fields('f1ltcta4 C(100), f1l5a8 C(100), ftltctst C(100)')

for field, criteria in \
        (('f1ltcta4', 'first_name[:1], last_name, city, dlvryaddrs[:4]'),
         ('f1l5a8', 'first_name[:1], last_name[:5], dlvryaddrs[:8]'),
         ('ftltctst', 'first_name, last_name, city, state')):
     for table in source_tables:
         counts = defaultdict(int)
         for record in match_back:
             matches =

The only idea I have at the moment is to parse the string (much like I
do in the order method), and after the string is parsed pluck out
the needed pieces.  If that is the best and/or most practical way to do
it, I was thinking of adding __call__ to the record class.  Then,
besides being able to do:

   matches =[record.zip4[:5], record.dlvryaddrs])

I could also do:

   matches ='zip4[:5], dlvryaddrs'))

or, equivalently,
   criteria = 'this, that[:7], the_other'
   matches =
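(A minimal sketch of what that __call__ could look like -- the Record
class here is a stand-in, not the actual dbf record class:)

```python
class Record(object):
    # stand-in record class -- the real one lives in the dbf module
    def __init__(self, **fields):
        self.__dict__.update(fields)

    def __call__(self, criteria):
        # 'zip4[:5], dlvryaddrs' -> [self.zip4[:5], self.dlvryaddrs]
        pieces = []
        for spec in criteria.split(','):
            spec = spec.strip()
            if spec.endswith(']'):
                # split 'zip4[:5' into field name and slice stop
                field, _, stop = spec[:-1].partition('[:')
                pieces.append(getattr(self, field)[:int(stop)])
            else:
                pieces.append(getattr(self, spec))
        return pieces

rec = Record(zip4='902101234', dlvryaddrs='123 Main St')
print(rec('zip4[:5], dlvryaddrs'))   # ['90210', '123 Main St']
```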

Any better ideas?  Am I missing anything already in the stdlib?
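(One stdlib piece that covers the no-slice case is operator.attrgetter,
which accepts several field names and returns a tuple; it doesn't
understand the '[:n]' slices, though, so those would still need the
parsing step.  FakeRecord is a stand-in for illustration:)

```python
from operator import attrgetter

class FakeRecord(object):
    # stand-in for a real dbf record -- illustration only
    first_name = 'John'
    last_name = 'Smith'
    city = 'Toledo'
    state = 'OH'

# attrgetter with several names returns a tuple of attribute values
get_key = attrgetter('first_name', 'last_name', 'city', 'state')
print(get_key(FakeRecord))   # ('John', 'Smith', 'Toledo', 'OH')
```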

Any and all tips appreciated!


<shameless plug>  Python d-Base currently lives at, and its main purpose in
life is to ease the transition between old dbf files and newer sql
tables.  It can, however, be used for read/write access to dBase III and
VFP 6 tables, including memo fields.

Success stories and bug reports both equally welcome!  :D
</shameless plug>
