newbie evaluation

Robert Brewer fumanchu at amor.org
Thu Sep 23 02:23:07 EDT 2004


Ismael Herrera wrote:
> Hi , i am new to python , and this is my first wanna-be 
> program, it would
> help me a lot if you point me my mistakes or bad design, bad 
> practice or an
> easier , faster ,cleaner or better way to program since dont 
> want to stick
> with bad practices.

Off the cuff:

1. Read PEP 8 (http://www.python.org/peps/pep-0008.html) and follow its
recommendations regarding whitespace, if you can.

2. In the block:

>   def load(self):
>     
>     try :
>       
>       #open the database file   
>       f = open('/var/lib/dpkg/available','r')
>       
>     except : raise IOError

...why not just let any exception which open() raises propagate outward?
There is *very* rarely a need in Python to trap all exceptions with a
bare "except:" clause. If you just re-raise IOError, there's no point in
trapping at all.

3. (The biggie) Why use a class for dpackagedb instead of a function?
Below is a complete, minimally-tested version which uses a simple
state-machine function instead. It makes one pass over the data and is
probably quite a bit faster than the class-based version. I also dropped
the regex since we now iterate over each line one by one.

4. Even if you keep the class-based approach, you don't need the class
attributes like "PKG = 'Package'" -- they just mess up the namespace and
don't add any value. Use the actual field names instead. 

5. Do you really need all of those attributes (like "arquitecture") to
be "top-level" for the dpackage objects? I chose to put them in a
subclass of dict instead. The difference is that you would refer to
"dpackage().arquitecture" in your scheme and
"dpackage()['Architecture']" in mine. If you still want the non-English
terms, re-introduce a dict lookup when the field names are parsed.

6. When you want to remove the last item in a list, use list.pop().

7. My version is about half the length of yours (so stop putting blank
lines between every statement ;). Although a state machine might seem
more complex, it's actually easier to grasp at once since it all fits on
a single page of code (on my screen, anyway).

-----------

#!/usr/bin/python

fieldnames = ['Package', 'Priority', 'Section', 'Installed-Size',
              'Maintainer', 'Architecture', 'Source', 'Version',
              'Replaces', 'Depends', 'Conflicts', 'Size',
              'Description']


class dpackage(dict):
    """A package in the Debian 'available packages' database."""
    
    def __str__(self):
        return "\n".join(["%s = %s" % (k, self.get(k))
                          for k in fieldnames])


def all_packages():
    f = open('/var/lib/dpkg/available', 'r')
    
    db = []
    pkg = dpackage()
    desc = []
    
    for line in f:
        if line != "\n":
            if line.startswith('Description'):
                # Add the line to the description.
                # A non-empty 'desc' signals we are in the
                # description-gathering state.
                desc.append(line)
            else:
                if desc:
                    desc.append(line)
                else:
                    # Parse the line and add to the current package
                    k, v = line.split(":", 1)
                    v = v.strip()
                    if k in ('Depends', 'Recommends'):
                        pkg[k] = v.split(",")
                    else:
                        pkg[k] = v
        else:
            # Finalize the current package and store it
            if desc:
                pkg['Description'] = "".join(desc)
                desc = []
            if pkg:
                db.append(pkg)
                # Make a new, empty package object
                pkg = dpackage()
    
    f.close()
    
    # Not sure why you drop the last item,
    # but you must have a reason.
    db.pop()
    
    return db


if __name__ == '__main__':
    for x in all_packages():
        if x:
            print x

-----------

...and here's the test output:

>>> d = debpkg.all_packages()
>>> len(d)
10823
>>> print d[0]
Package = cl-infix
Priority = optional
Section = non-free/devel
Installed-Size = 64
Maintainer = Kevin M. Rosenberg <kmr at debian.org>
Architecture = all
Source = None
Version = 19960628.1-2
Replaces = None
Depends = ['common-lisp-controller (>= 3.37)']
Conflicts = None
Size = 15242
Description = Description: an infix reader macro for Common Lisp
 This package provides an infix reader macro for Common Lisp,
 allowing use of more
 traditional mathematical syntaxes in Common Lisp programs.


Hope that helps,


Robert Brewer
MIS
Amor Ministries
fumanchu at amor.org



More information about the Python-list mailing list