[Tutor] Which non SQL Database ?

Terry Carroll carroll at tjc.com
Wed Dec 8 23:08:32 CET 2010


On Sat, 4 Dec 2010, Jorge Biquez wrote:

> What would you suggest I take a look at? If possible, available on all 3 
> platforms.

I would second the use of SQLite.  The sqlite3 module has been part of 
Python's standard library since Python 2.5, on all platforms.
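
As a minimal sketch of what using the built-in module looks like (the 
table name, columns and sample row here are made up for illustration, and 
an in-memory database stands in for a real file):

```python
import sqlite3

# ":memory:" keeps everything in RAM; pass a filename instead to persist.
conn = sqlite3.connect(":memory:")

# Hypothetical table, purely for illustration.
conn.execute("CREATE TABLE contacts (name TEXT, email TEXT)")
conn.execute("INSERT INTO contacts VALUES (?, ?)",
             ("Jorge", "jorge@example.com"))
conn.commit()

# fetchall() returns a list of tuples, one per row.
rows = conn.execute("SELECT name, email FROM contacts").fetchall()
conn.close()
```

No server or separate install is needed; the whole database lives in one 
file (or in memory, as here).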

But you specified "non SQL", so one other thing I'd suggest is to just 
create the data structure you need in Python and use pickle to save it.

I recently had to recover files from a damaged hard drive.  The problem 
was that the recovery brought back a lot of legitimately deleted files 
along with the "live" files.  All the files had generic names like 
"028561846.avi" instead of descriptive names, with only the filetype to 
hint at the content.

I wrote a program to read every one of these files and compute its MD5 
checksum, and stored the results in a dictionary.  The key was the 
checksum, and the value was a list of the files that had that checksum; 
the list usually, but not always, had only one element.

Then I pickled that dictionary.
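
A minimal sketch of that first step, not the original program: the 
function names, the chunk size and the pickle filename are assumptions 
chosen for illustration.

```python
import hashlib
import os
import pickle
from collections import defaultdict


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 of a file, read in chunks to limit memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_index(root):
    """Map each checksum to the list of files under root that have it."""
    index = defaultdict(list)
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            index[md5_of_file(path)].append(path)
    return dict(index)


def save_index(index, pickle_path):
    """Pickle the dictionary to disk."""
    with open(pickle_path, "wb") as f:
        pickle.dump(index, f)


def load_index(pickle_path):
    """Read the pickled dictionary back."""
    with open(pickle_path, "rb") as f:
        return pickle.load(f)
```

Usage would be along the lines of 
save_index(build_index("rescued"), "index.pkl"), where both names are 
placeholders.  Only unpickle files you created yourself; pickle is not 
safe against untrusted data.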

In another program, I ran os.walk against my archive CD-ROMs/DVD-ROMs, or 
some other directories on my hard drive, computing the MD5 of each file; 
if it matched a "rescued" file, the program deleted the rescued file.
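
The second step could be sketched roughly as follows.  This is an 
illustration under assumptions, not the original program: the function 
names and the dry_run flag are hypothetical, and the index argument is the 
checksum-to-file-list dictionary, loaded beforehand with pickle.load.

```python
import hashlib
import os


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 of a file, read in chunks to limit memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def remove_archived_duplicates(index, archive_root, dry_run=True):
    """Find rescued files whose checksum matches a file under archive_root.

    Returns the matching rescued paths; deletes them unless dry_run is set.
    """
    matched = []
    seen = set()
    for dirpath, _dirnames, filenames in os.walk(archive_root):
        for name in filenames:
            checksum = md5_of_file(os.path.join(dirpath, name))
            for rescued in index.get(checksum, []):
                # Skip files already handled, or already deleted elsewhere.
                if rescued in seen or not os.path.exists(rescued):
                    continue
                seen.add(rescued)
                matched.append(rescued)
                if not dry_run:
                    os.remove(rescued)
    return matched
```

Defaulting to dry_run=True is a deliberate choice in this sketch: it lists 
what would be deleted, so the matches can be checked before anything is 
removed.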

Ideally, I would also have updated the dictionary to drop the files I'd 
cleaned up and, at the end of processing, re-pickled the edited 
dictionary; but that wasn't an option, as I usually had 2 or 3 instances of 
the program running simultaneously, each processing a different directory 
or CD/DVD.


