ANN: dbf.py 0.94.003

Ethan Furman ethan at stoneleaf.us
Fri Jul 27 12:50:39 CEST 2012


Simon Cropper wrote:
> On 27/07/12 05:31, Ethan Furman wrote:
>> A few more bug fixes, and I actually included the documentation this
>> time.  :)  It can be found at http://python.org/pypi/dbf, and has been
>> tested on CPythons 2.4 - 2.7, and PyPy 1.8.
> 
> [snip]
> 
> Ethan,
> 
> That's great.
> 
> Can you comment on the ultimate aim of the project?

To provide read/write access to the dbf format, from
dBase III to dBase 7, including memos and index files.
It currently supports dBase III, FoxPro, Clipper,
and Visual FoxPro (though not its autoincrement or varchar fields).
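
For readers unfamiliar with the format, here is a minimal sketch of
decoding the fixed part of a dBase III-style header with only the
standard library.  It is not how dbf.py does it internally; the function
name parse_dbf_header and the sample values are made up for
illustration, and the snippet assumes Python 3.

```python
import struct
from datetime import date

def parse_dbf_header(raw):
    """Decode the first 12 bytes of a dBase III-style file header."""
    if len(raw) < 12:
        raise ValueError("need at least 12 header bytes")
    # Layout (little-endian): version byte, last-update date as
    # (year - 1900, month, day), record count (uint32),
    # header length (uint16), record length (uint16).
    version, yy, mm, dd, n_records, header_len, record_len = struct.unpack(
        "<4BIHH", raw[:12])
    return {
        "version": version,
        "last_update": date(1900 + yy, mm, dd),
        "records": n_records,
        "header_length": header_len,
        "record_length": record_len,
    }

# Hypothetical header bytes: a 50-field table holding 300,000 records.
# Header length = 32 (fixed part) + 50 * 32 (field descriptors) + 1.
sample = struct.pack("<4BIHH", 0x03, 112, 7, 27, 300000, 1633, 1500)
info = parse_dbf_header(sample)
```

Memo files, index files and the newer field types all build on top of
this basic layout, which is where most of the per-dialect work lies.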


> Is this package primarily a "universal dbf translator" that allows the 
> data stored in DBFs (which I might add I have many in legacy VFP 
> applications and GIS Shapefiles) to be accessed and extracted or is the 
> module being designed to be used interactively to extract data from and 
> update tables?

Some folk use it as a dbf translator, some folk use it for interactive 
work.  I use it for both of those purposes, and also to create new dbf 
files which are processed every day by our in-house software and by 
third-party software.


> I remember on the last thread that someone mentioned that indexes are 
> not supported. I presume then that moving around a table with a couple 
> of million records might be a tad slow. Have you tested the package on 
> large datasets, both DBFs with a large number of records as well as a 
> large number of fields?

The largest tables I've had at my disposal so far were about 300,000 
records with roughly 50 fields and a total record length of about 
1,500 bytes.  Processing (for me) involves going through every single 
record, and yes, it was a tad slow.  This is my most common scenario, 
and index files would not help at all.
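
To show why an index cannot help with a full pass, here is a rough
sketch of a sequential scan over fixed-length records, assuming Python 3
and the usual dbf convention that each record starts with a one-byte
deletion flag ("*" means deleted).  The helper iter_records and the tiny
fake table are hypothetical, not part of dbf.py.

```python
import io

def iter_records(stream, header_length, record_length, count):
    """Yield the raw field bytes of every non-deleted record, in order."""
    stream.seek(header_length)           # records start right after the header
    for _ in range(count):
        raw = stream.read(record_length)
        if len(raw) < record_length:     # truncated file: stop early
            break
        if raw[:1] == b"*":              # deletion flag set: skip the record
            continue
        yield raw[1:]                    # drop the flag byte, keep the fields

# Tiny in-memory stand-in: a 4-byte fake header and three 6-byte records.
fake = io.BytesIO(b"HDR\r" + b" alpha" + b"*gone!" + b" gamma")
live = list(iter_records(fake, header_length=4, record_length=6, count=3))
```

Every record is read and looked at once, so the cost grows with the
size of the table no matter which records turn out to be interesting.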

For more typical work (for others) of selecting and using a subset of 
the dbf, an in-memory index can be created -- initial creation can take 
a few moments, but searches afterwards are quite quick.
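
The idea behind an in-memory index can be sketched in a few lines of
plain Python.  This is only an illustration of the technique, not the
dbf module's own index API; build_index and the sample records are
invented for the example.

```python
from collections import defaultdict

def build_index(records, key):
    """Map each key value to the positions of the records that have it."""
    index = defaultdict(list)
    for position, record in enumerate(records):   # one full pass up front
        index[key(record)].append(position)
    return index

# Made-up records standing in for rows read from a table.
records = [
    {"id": 1, "region": "north"},
    {"id": 2, "region": "south"},
    {"id": 3, "region": "north"},
]

by_region = build_index(records, key=lambda rec: rec["region"])

# Later lookups touch only the matching positions, not the whole table.
north = [records[i] for i in by_region.get("north", [])]
missing = by_region.get("west", [])
```

Building the index costs one pass over the data, which is the "few
moments" mentioned above; each later search is a dictionary lookup.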

This is a pure-Python implementation, so speed is not the first goal. 
At some point in the future I would like to create a C accelerator, but 
that's pretty far down the to-do list.

~Ethan~


