[Tutor] how to sort the data inside the file.
Chris Fuller
cfuller084 at thinkingplanet.net
Mon Dec 31 20:45:21 CET 2007
On Monday 31 December 2007 10:36, Chris Fuller wrote:
> lin = re.findall('\s*([^\s]+)\s+([^\s]+)\s+(\d+)( [kM])?bytes', s)
This is incorrect. The first version of the script I wrote split the file
into records by calling split('bytes'). I erroneously assumed I would obtain
the desired results by simply adding "bytes" to the RE. The original RE
could have been written so that this would have worked (and would have
been a little "cleaner"), but it wasn't. The space should be obligatory, and
not included with the [kM] group.
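A minimal sketch of what the corrected RE might look like, with the space made mandatory and moved outside the [kM] group. The sample string is made up for illustration; it assumes records of the form "name name 123 kbytes", which may not match the real file exactly.

```python
import re

# Hypothetical sample input (an assumption, not the poster's real data).
s = "alpha x1 512 bytes beta x2 20 kbytes gamma x3 3 Mbytes"

# The space after the digits is obligatory; only the k/M prefix is optional.
pattern = re.compile(r'\s*(\S+)\s+(\S+)\s+(\d+) ([kM])?bytes')

# findall() yields one 4-tuple per record; a missing prefix comes back as ''.
records = pattern.findall(s)
for record in records:
    print(record)
```

With the prefix captured on its own, a plain "bytes" record gives an empty string in the last field instead of failing to match.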
I tried some of Kent's suggestions, and compared the run times. Nested
split()'s are faster than REs! Python isn't as slow as you'd think :)
# separate into records (drop some trailing whitespace)
lin = [i.split() for i in s.split('bytes')[:-1]]

lout = []
for fields in lin:
    # an optional fourth field holds the unit prefix, 'M' or 'k'
    try:
        if fields[3] == 'M':
            mul = 1000000
        elif fields[3] == 'k':
            mul = 1000
    except IndexError:
        # no prefix: the size is already in plain bytes
        mul = 1
    lout.append( (fields[0], fields[1], int(fields[2])*mul) )
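For anyone who wants to repeat the timing comparison, here is a rough sketch using timeit. The sample data, the function names and the MULTIPLIERS table are all made up for illustration, and the RE is the corrected form described above rather than the one quoted. Timings will vary by machine and Python version, so no numbers are given.

```python
import re
import timeit

# Hypothetical sample data: three records repeated many times.
s = "alpha x1 512 bytes beta x2 20 kbytes gamma x3 3 Mbytes " * 1000

MULTIPLIERS = {'': 1, 'k': 1000, 'M': 1000000}
PATTERN = re.compile(r'\s*(\S+)\s+(\S+)\s+(\d+) ([kM])?bytes')

def parse_with_re(text):
    # One findall() pass; a missing prefix is returned as ''.
    return [(a, b, int(n) * MULTIPLIERS[p])
            for a, b, n, p in PATTERN.findall(text)]

def parse_with_split(text):
    # Nested split(): first on 'bytes', then on whitespace.
    out = []
    for fields in (rec.split() for rec in text.split('bytes')[:-1]):
        prefix = fields[3] if len(fields) > 3 else ''
        out.append((fields[0], fields[1], int(fields[2]) * MULTIPLIERS[prefix]))
    return out

# Both approaches should agree before their speed is compared.
assert parse_with_re(s) == parse_with_split(s)

if __name__ == '__main__':
    for fn in (parse_with_re, parse_with_split):
        print(fn.__name__, timeit.timeit(lambda: fn(s), number=20))
```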
Cheers