[Tutor] Log analyzer

Danny Yoo dyoo@hkn.eecs.berkeley.edu
Sat, 15 Dec 2001 00:48:15 -0800 (PST)

On Sat, 15 Dec 2001, Mike Yuen wrote:

> I'm trying to make a little log analyzer for myself and the problem is, I
> want to split each line which initially looks like:
> FWIN,2001/09/14,01:44:53 -6:00GMT,,,UDP
> I used the split function and got:
> 'FWIN,2001/09/14,01:44:53', '-6:00', 'GMT,<bunch of numbers here>'
> I want each section's boundaries to be BETWEEN the commas.  So, for example:
> FWIN is one section
> 2001/09/14 is another
> 01:44:53 -6:00GMT is yet another.
> * Note: each section will vary in size.
> I know I can take another pass over the line but I've got literally
> 1000's of lines to process and taking 2 passes over each line really slows
> things down.  So, is there an efficient way to split the lines?

Sounds like you want to split along the commas.  By default, split()
breaks on whitespace, because that's the "common case" that people run
into all the time.  However, split() also takes an optional "delimiter"
parameter, so you can split on commas in a single pass.  Take a look:

>>> 'supercalifragilisticexpialidocious'.split('i')
['supercal', 'frag', 'l', 'st', 'cexp', 'al', 'doc', 'ous']
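
The same idea should carry over to your log lines.  Here's a minimal
sketch using the sample line from your message; the function name
split_log_line is just something I made up for illustration, and I'm
assuming every line has the same comma-separated layout as your example:

```python
def split_log_line(line):
    """Split one log line into its comma-separated sections."""
    # Strip the trailing newline (if any), then split on commas only,
    # so the space inside '01:44:53 -6:00GMT' is left alone.
    return line.rstrip('\n').split(',')


sample = 'FWIN,2001/09/14,01:44:53 -6:00GMT,,,UDP'
fields = split_log_line(sample)
print(fields)
# ['FWIN', '2001/09/14', '01:44:53 -6:00GMT', '', '', 'UDP']
```

Note that two commas in a row give you an empty string for that section,
so every section keeps its position in the list.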

Good luck!