[Tutor] Re: Critique and a big thanks!

Andrei project5 at redrival.net
Fri Apr 2 15:57:45 EST 2004

Shawhan, Doug (EM, ITS) wrote on Fri, 2 Apr 2004 14:16:40 -0500:

> Tutors, 
> The script grew from a simple copy thang to this threaded beast 
> that works way better than I ever thought it would! Thank you 

Perhaps 'beast' is too big a word for what amounts (in Python at least) to
60 lines of code ;). 

> My final question: What would you folks suggest to make this 
> script more simple and easier to read? 

For one thing, I'd get rid of tabs and use 4 spaces per indentation level.
It's proper Python style, though there are of course people who prefer
tabs. If you have a half-decent editor, you won't notice the difference
actually, as you can keep using tabs and the editor converts your tab to 4
spaces. Real tabs look ugly in some mail clients (or they don't appear at
all). E.g. mine indents tabs at 8 spaces, making your code excessively

Also it's good style to use spaces before and after assignment (=), e.g.
"gathers = []" instead of "gathers=[]".

A general note: you can use forward slashes in paths even on Windows. I
haven't tried on network drives, but it works just fine on local ones. This
should help you get rid of the '\\\\\\\' things.

> #create list for thread pool.
> gathers=[]
> #open the .csv file and strip out the linefeeds
> hostguffer=open(host_list,"r")

Using file() instead of open() is the modern way - more intuitive too. The
result works the same.

> hostholder=hostguffer.readlines()

It's good practice to close files after you're finished reading from them.

> hosts=[]
> for host in hostholder:
> 	hosts.append(host[:-1])

There's an alternative (shorter) way of doing this too, by looping directly
over the file:

  hosts = []
  for line in hostguffer: # don't need hostholder

or with a list comprehension (even shorter):

  hosts = [ line[:-1] for line in hostguffer ]

> class gather(Thread):
> 	def __init__(self, users, host, destination_directory):
> 		Thread.__init__(self)
> 		self.host=host
> 		self.users=users
> 		self.destination_directory=destination_directory

I'm not a big fan of very long variable names: they're hard to type and
they make lines very long. 'dest_dir' for example would be basically every
bit as clear as 'destination_directory', at one third its size.

> 			#strip extraneous pathname info from username. 
> 			#Assumes all hostnames are the same length
> 			username=user[21:] 

Generally speaking it's not good to have magic numbers in the code. 21 here
is a magic number and should either be a global constant, or be determined
at runtime based on the location of some char in user, whichever is best
for you.
In you particular case it doesn't really matter, but imagine you have a
similar loop in 10 places in your code and suddenly for whatever reason
user names start at position 23 instead of 21. Using a magic number means
you'd have to go and change all 10 places manually. If you'd used a global
constant, you would only have to change a single value (that of the
constant). If you'd written code to determine the position of the username
at runtime, you wouldn't have to change anything.

> 			collection_path="%s\\MYDOCU~1\\Exchange\\"%user
> 			mailbox_list=[]

You don't seem to use mailbox_list anywhere, so you should remove it.

> 			mailboxes=glob.glob("%s*.pst"%collection_path)
> 			if mailboxes !=[]:

An empty list is automatically interpreted as False in a boolean context,
while any non-empty list is interpreted as True. So you could just as well

    if mailboxes: # means "if mailboxes isn't an empty list"
        # do stuff

The same neat shortucut is also available for other data types, e.g.
    username = raw_input('What is your name?\n > ').strip()
    # username now is a string, but it might have length 0
    if username: # means the same as 'if username != "" '
        # code which gets executed if a real user name was specified

> 				#create pathname on storage host!
> 				storage_path="%s\\%s\\%s\\"%(self.destination_directory,self.host, username)

The os.path module has a nice function called join():

>>> import os.path
>>> os.path.join('a', 'b', 'c')

It looks better than the string formatting option and it's less cumbersome
to modify if at some point it's necessary to add/remove a subdirectory in
that path.

> 				os.system("mkdir %s"%storage_path)

Python has batteries for this task: os.mkdir() (or os.makedirs() if you
want recursive directory creation).

> 				os.system("copy /Y %s*.pst %s"%(collection_path, storage_path))
> 				#we'll just copy the addressbooks too, for smarts
> 				os.system("copy /Y %s*.pab %s"%(collection_path, storage_path))

The shutil module has functions for copying files. In this case it's
probably easier to use the DOS commands though, since shutil doesn't
support wildcards AFAIK. But e.g. when you intend to write cross-plaform
code, os.system really should be avoided :).

> def count_active():
>     #returns the number of living threads
>     num_active = 0
>     for g in gathers:
>         if g.isAlive():
>             num_active += 1
>     print "%d alive" % num_active
>     return num_active

This one-liner should work as well (though it's doubtful it's clearer than
your solution):

  num_active = len([ g for g in gathers if g.isAlive() ])



Real contact info (decode with rot13):
cebwrpg5 at jnanqbb.ay. Fcnz-serr! Cyrnfr qb abg hfr va choyvp cbfgf. V ernq
gur yvfg, fb gurer'f ab arrq gb PP.

More information about the Tutor mailing list