[Tutor] Multi-Dimensional Dictionary that contains a 12 element list.

Sat Dec 31 15:33:52 CET 2005

Paul Kraus wrote:
> So now for all my reports all I have to write are little 5 or 6 line scripts 
> that take a text file split the fields and format them before basing them off 
> into my custom object. Very slick and this is my first python project. Its 
> cluttered and messy but for about 1 hours worth of work on a brand new 
> language I am impressed with the usability of this language.

I'm impressed with how fast you have learned it!
> 
> Current Script - Critique away! :)

A number of comments below...

> =-=-=-=-=--=-=
> #!/usr/bin/python
> import string
> import re
> 
> class Tbred_24Months:
>     def __init__(self,currentyear,currentmonth): ### Takes Ending Year and 
> Ending Month Inits List
>         guide = []
>         self.results = {}
>         self.guide = {}
>         self.end_month = currentmonth
>         self.end_year  = currentyear
You never use these two attributes ^^

>         self.start_month = currentmonth
>         self.start_year = currentyear

start_month and start_year can be local variables, you don't use them 
outside this method.

>         for count in range(24):
>             guide.append((self.start_month,self.start_year))
>             self.start_month -= 1
>             if self.start_month < 1:
>                 self.start_month = 12
>                 self.start_year -= 1
>         guide.reverse()
>         count = 0
>         for key in guide:
>             self.guide[key[1],key[0]]=count
>             count += 1

You can unpack key in the for statement, giving names to the elements:
   for start_month, start_year in guide:
     self.guide[start_year, start_month] = count

You can generate count automatically with enumerate():
   for count, (start_month, start_year) in enumerate(guide):
     self.guide[start_year, start_month] = count

If you built guide with tuples in the order you actually use them (why 
not?) you could build self.guide as easily as
   self.guide = dict( (key, count) for count, key in enumerate(guide) )

For that matter, why not just build self.guide instead of guide, in the 
first loop?
         for index in range(23, -1, -1):
             self.guide[start_month,start_year] = index
             start_month -= 1
             if start_month < 1:
                 start_month = 12
                 start_year -= 1

I found the use of both guide and self.guide confusing but it seems you 
don't really need guide at all.

>         self.sortedkeys = self.guide.keys()
>         self.sortedkeys.sort()
You don't use sortedkeys

> 
>     def insert(self,key,year,month,number):
>         if self.guide.has_key((year,month)):
>             if self.results.has_key(key):
>                 seq = self.guide[(year,month)]
>                 self.results[key][seq] += number
>             else:
>                 self.results[key] = []
>                 for x in range(24):self.results[key].append(0)

Could be
   self.results[key] = [0*24]

Is there a bug here? You don't add in number when the key is not there.

I like to use dict.setdefault() in cases like this:
   if self.guide.has_key((year,month)):
     seq = self.guide[(year,month)]
     self.results.setdefault(key, [0*24])[seq] += number

though if performance is a critical issue your way might be better - it 
doesn't have to build the default list every time. (Don't try to build 
the default list just once, you will end up with every entry aliased to 
the same list!) (I hesitate to include this note at all - you will only 
notice a difference in performance if you are doing this operation many 
many times!)
> 
> def splitfields(record):
>     fields = []
>     datestring=''
>     ### Regular Expr.
>     re_negnum = re.compile('(\d?)\.(\d+)(-)')
>     re_date   = re.compile('(\d\d)/(\d\d)/(\d\d)')
>     for element in record.split('|'):
>         element=element.strip() # remove leading/trailing whitespace
>         ### Move Neg Sign from right to left of number
>         negnum_match = re_negnum.search( element )
>         if negnum_match: 
>             if negnum_match.group(1):element = "%s%d.%02d" 
> %(negnum_match.group(3),int(negnum_match.group(1)),int(negnum_match.group(2)))
>             else:element = "%s0.%02d" 
> %(negnum_match.group(3),int(negnum_match.group(2)))

This is a lot of work just to move a - sign. How about
   if element.endswith('-'):
     element = '-' + element[:-1]

>         ### Format Date
>         date_match = re_date.search(element)
>         if date_match:
>             (month,day,year) = 
> (date_match.group(1),date_match.group(2),date_match.group(3))
>             ### Convert 2 year date to 4 year
>             if int(year) > 80:year = "19%02d" %int(year)
>             else:year = "20%02d" %int(year)
>             element = (year,month,day)

You go to a lot of trouble to turn your year into a string, then you 
convert it back to an int later. Why not just keep it as an int?

Do you know that Python data structures (lists, dicts, tuples) are 
heterogeneous? You can have a list that contains strings, ints, tuples, etc.

>         if element == '.000': element = 0.00
>         fields.append( element )
>     return fields

This whole splitfields() function is kind of strange. It seems you know 
what order the fields are in since the caller assumes an order. So why 
not something like

   elements = record.split('|')
   if len(elements) != 7:
     return None

   vendor,otype,oreturn,discountable,discperc,amount,date
   discperc = processNumber(discperc)
   amount = processNumber(amount)
   date = processDate(date)
   return vendor,otype,oreturn,discountable,discperc,amount,date

then all the ugly regex stuff goes into processNumber() and 
processDate() and the intent of the code is much clearer. 
processNumber() can return a float and processDate() can return the 
(year, month, day) tuple you need.
> 
>             
> ### Build Vendor Sales 
> sales = Tbred_24Months(2005,11)
> vendorsales = open('vendorsales.txt','r')
> for line in vendorsales:
>     fields = splitfields( line )
>     if len(fields) == 7:
>         (vendor,otype,oreturn,discountable,discperc,amount,date) = fields
>         amount = float(amount);discperc = float(discperc)
>         #if discperc and discountable == 'Y': amount = amount - ( amount * 
> (discperc/100) )
>         if otype == 'C' or oreturn == 'Y':amount = amount * -1
>         sales.insert(vendor,int(date[0]),int(date[1]),amount)
> result = ''
> 
> for key in sales.results:
>     sum = float(0)
could be sum = 0.0

>     result = str(key)
>     for amount in sales.results[key]:
>         sum += amount
>         result += "|" + str(amount)

You can iterate keys and values together with
   for key, amounts in sales.results.iteritems():

then it would be
     for amount in amounts:

>     print str(key),sum
>     #print result,sum

HTH
Kent