[Tutor] Python program design

Doug Glenn doug at foruminfosystems.com
Sat Jul 21 03:28:51 CEST 2007


I have a question on program design.  The program will be a catalog of
portable media (DVD, CD, flash drive, floppy, etc.).  I am in the
preliminary stages of design and would like input on the best method for
initially gathering the data and on what format it should take.

I plan on using the os.walk function to scan through the media.  On each
iteration it yields a tuple of three values (the current directory, its
subdirectories, and its files), which change on every pass until the scan
completes.
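
For reference, a minimal sketch of what each pass of os.walk produces.  The
function name list_tree and its argument are only illustrative and not part
of the planned program; the names are sorted purely to make the result
predictable.

```python
import os

def list_tree(top):
    """Collect one (root, dirs, files) entry per directory under top."""
    entries = []
    for root, dirs, files in os.walk(top):
        # root is a path string; dirs and files are lists of bare names
        entries.append((root, sorted(dirs), sorted(files)))
    return entries
```

Each entry corresponds to one directory on the media, so the length of the
returned list equals the number of directories visited.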

I don't know anything at all about database design or the most efficient
method of gathering the data and then inserting it into the database.  In
order to optimize this part of the design I wanted to get some input from
the masters.  It is better to get it right the first time than to go back
and fix it later.

Since os.walk will generate a different set of data on each iteration through
the directory structure, I must append the data from each pass to a 'master'
array or list.  In order to identify the media later I will have to get the
disk label and the disk ID as the primary identifiers for the compiled
data.

On each iteration over the directories on the media I will get values like
this:
root = current directory being scanned (a single directory path)
dirs = subdirectories under the current root (names of all directories)
files = filenames (all the files in the current root directory)

I need to generate the following additional information on each file during
each iteration:
size = size of file
type = file extension
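
One way these two values might be obtained, as a sketch only.  The helper
name file_info is hypothetical, and lower-casing the extension is an
assumption made so that '.TXT' and '.txt' count as the same type.

```python
import os

def file_info(root, name):
    """Return (size_in_bytes, extension) for one file found by os.walk."""
    path = os.path.join(root, name)
    size = os.path.getsize(path)
    # splitext returns (base, ext); ext is '' when the name has no extension
    ext = os.path.splitext(name)[1].lower()
    return size, ext
```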

My initial test example is something like this:
import os
from os.path import join, getsize
for root, dirs, files in os.walk(device):  # device name from config file

Then I would need to get the file size (but this is giving me an error at
the moment)
    for name in files:
        s = sum(getsize(join(root, name)
        print s
(The syntax error is reported here and I have not figured it out yet.  There
are spaces in the path/filename combination.)
    (code to parse the file extension goes here)
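
The syntax error is probably not caused by the spaces in the path.  As
written above, the sum( line never closes its parentheses, so the
interpreter only notices the problem when it reaches the following print
line.  Also, sum() expects a sequence of numbers, so handing it a single
size would fail even with the parentheses balanced.  A sketch of one
possible fix, which totals the file sizes per directory (the function name
dir_sizes is hypothetical):

```python
import os
from os.path import join, getsize

def dir_sizes(device):
    """Map each directory under device to the total size of its files."""
    totals = {}
    for root, dirs, files in os.walk(device):
        # sum() takes an iterable, so give it one size per file
        totals[root] = sum([getsize(join(root, name)) for name in files])
    return totals
```

If only the size of each individual file is wanted, getsize(join(root, name))
can be used on its own without sum().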

Back to the data, though: should I build something like one of these while
reading the media, before inserting it into the database?  Nested, in a
dictionary, tuple, or list:
[disk label, disk id, [root, [dir, [file, size, type, permissions]]]]

or flat, in a dictionary, tuple, or list:
[disk label, disk id, root, dir, filename, size, type, permissions]
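
To make the flat option concrete, here is a sketch that builds one tuple per
file.  It is only one possible layout, not a recommendation from the list.
The function name scan_records is hypothetical, and the disk label and disk
ID are passed in as arguments because obtaining them depends on the platform.
The separate 'dir' field is left out on the assumption that root already
holds the directory of each file.

```python
import os
import stat

def scan_records(device, disk_label, disk_id):
    """Build one flat tuple per file, one tuple per future database row."""
    records = []
    for root, dirs, files in os.walk(device):
        for name in files:
            info = os.stat(os.path.join(root, name))
            records.append((
                disk_label,
                disk_id,
                root,
                name,
                info.st_size,
                os.path.splitext(name)[1].lower(),
                stat.S_IMODE(info.st_mode),  # permission bits only
            ))
    return records
```

A flat record of this kind repeats the disk label and ID on every row, but
each tuple lines up with one row of a single table.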

When the scan has completed I want to insert the data into the database with
the following fields:
disk label, disk id, root directory, path, filename, file size, file type,
original file permissions, and comment field.  (Does anyone think I should
have any other fields?  Suggestions welcome.)
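
As an illustration of how those fields could map onto a table, here is a
sketch using the sqlite3 module from the standard library.  The choice of
SQLite is an assumption, since no database has been picked yet, and the
table name, column names, and column types are hypothetical.

```python
import sqlite3

SCHEMA = """
CREATE TABLE IF NOT EXISTS files (
    disk_label  TEXT,
    disk_id     TEXT,
    root_dir    TEXT,
    path        TEXT,
    filename    TEXT,
    file_size   INTEGER,
    file_type   TEXT,
    permissions INTEGER,
    comment     TEXT
)
"""

def store_records(conn, rows):
    """Insert rows of nine values each, in the column order of SCHEMA."""
    conn.execute(SCHEMA)
    # '?' placeholders let sqlite3 handle quoting, including spaces in names
    conn.executemany(
        "INSERT INTO files VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?)", rows)
    conn.commit()
```

Passing sqlite3.connect(":memory:") as conn gives a throwaway database that
is convenient for trying the schema before writing anything to disk.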

Thank you in advance.  If this is off topic, please reply off the list and
let me know.
-- 
Doug Glenn
FORUM Information Systems, LLC
http://foruminfosystems.com