[Tutor] Expanding a Python script to include a zcat and awk pre-process

galaxywatcher at gmail.com
Sat Jan 9 13:44:54 CET 2010


After many more hours of reading and testing, I am still struggling to
finish this simple script. Bear in mind that I already got my
desired results by preprocessing with an awk one-liner.

I can now open the zipped file properly, so I have made some progress, but
simply assigning num1 and num2 to the first 2 columns of the file
remains elusive. num3 here gets assigned not the 3rd column, but
the entire rest of the file. I feel like I am missing a simple strip()
or some other incantation that prevents the rest of the file from getting
blobbed into num3. Thanks in advance for any help.

#!/usr/bin/env python

import string
import re
import zipfile
highflag = flagcount = sum = sumtotal = 0
f = file("test.zip")
z = zipfile.ZipFile(f)
for f in z.namelist():
     ranges = z.read(f)
     ranges = ranges.strip()
     num1, num2, num3 = re.split('\W+', ranges, 2)  ## This line is the root of the problem.
     sum = int(num2) - int(num1)
     if sum > 10000000:
         flag1 = " !!!!"
         flagcount += 1
     else:
         flag1 = ""
     if sum > highflag:
         highflag = sum
     print str(num2) + " - " + str(num1) + " = " + str(sum) + flag1
     sumtotal = sumtotal + sum

print "Total ranges = ", sumtotal
print "Total ranges over 10 million: ", flagcount
print "Largest range: ", highflag
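
The symptom described above can be reproduced in isolation. This is a minimal sketch, assuming the archive member is read as one multi-line string, which is what z.read() returns. With a maxsplit of 2, re.split() stops after the first two separators, so the third item holds everything that is left, including later lines. The variable name blob is only illustrative, and the sketch is written for Python 3 rather than the Python 2 of the script.

```python
import re

# Two lines of the sample data joined into one string, as z.read() would
# return them for a single archive member.
blob = (
    '134873600, 134873855, "32787 Protex Technologies, Inc."\n'
    '135338240, 135338495, 40597'
)

# Only the first two separators are consumed; the remainder of the whole
# string, newline included, ends up in the third item.
parts = re.split(r'\W+', blob, maxsplit=2)

print(len(parts))
print(repr(parts[2]))
```

The third item still contains a newline and the whole second record, which matches the "rest of the file in num3" behaviour.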

======
$ zcat test.zip
134873600, 134873855, "32787 Protex Technologies, Inc."
135338240, 135338495, 40597
135338496, 135338751, 40993
201720832, 201721087, "12838 HFF Infrastructure & Operations"
202739456, 202739711, "1623 Beseau Regional de la Region Languedoc Roussillon"
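
One possible way to handle data shaped like the sample above is to split the member's text into lines first and then split each line on its first two commas. The following is a sketch under stated assumptions, not a definitive fix. It is written for Python 3, where z.read() returns bytes that need decoding. It assumes UTF-8 text and exactly three comma-separated columns per non-empty line. The names parse_ranges, summarize and ranges.txt are hypothetical, and a small in-memory archive stands in for test.zip.

```python
import io
import zipfile


def parse_ranges(text):
    """Yield (start, end, label) for each non-empty line of text."""
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        # Splitting at most twice keeps commas inside the third column intact.
        num1, num2, label = [part.strip() for part in line.split(',', 2)]
        yield int(num1), int(num2), label


def summarize(zip_bytes, threshold=10000000):
    """Return (total, count over threshold, largest) for all range sizes."""
    total = over = largest = 0
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as z:
        for name in z.namelist():
            # Assumption: the member is UTF-8 encoded text.
            text = z.read(name).decode('utf-8')
            for start, end, _label in parse_ranges(text):
                size = end - start
                total += size
                if size > threshold:
                    over += 1
                largest = max(largest, size)
    return total, over, largest


# Build a tiny in-memory archive from two lines of the sample data.
sample = (
    '134873600, 134873855, "32787 Protex Technologies, Inc."\n'
    '135338240, 135338495, 40597\n'
)
buf = io.BytesIO()
with zipfile.ZipFile(buf, 'w') as z:
    z.writestr('ranges.txt', sample)

rows = list(parse_ranges(sample))
total, over, largest = summarize(buf.getvalue())
print("Total ranges = %d" % total)
print("Total ranges over 10 million: %d" % over)
print("Largest range: %d" % largest)
```

Using str.split(',', 2) instead of a regular expression leaves the quotes around the third column in place, which may or may not be wanted. A real CSV parser such as the csv module would be a more robust choice if quoted fields matter.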


