[Tutor] Log analyzer
Sat, 15 Dec 2001 00:48:15 -0800 (PST)
On Sat, 15 Dec 2001, Mike Yuen wrote:
> I'm trying to make a little log analyzer for myself and the problem is, I
> want to split each line which initially looks like:
> FWIN,2001/09/14,01:44:53 -6:00GMT,126.96.36.199:137,188.8.131.52:137,UDP
> I used the split function and got:
> 'FWIN,2001/09/14,01:44:53', '-6:00', 'GMT,<bunch of numbers here>'
> I want each section's boundaries to be BETWEEN the commas. So, for example:
> FWIN is one section
> 2001/09/14 is another
> 01:44:53 -6:00GMT is yet another.
> * Note: each section will vary in size.
> I know I can take another pass over the line but I've got literally
> 1000's of lines to process and taking 2 passes over each line really slows
> things down. So, is there an efficient way to split the lines?
Sounds like you want to split along the commas. By default, split()
splits on whitespace, because that's the "common case" that people run
into all the time. However, split() also accepts an optional "delimiter"
argument. Take a look:
>>> import string
>>> string.split('supercalifragilisticexpialidocious', 'i')
['supercal', 'frag', 'l', 'st', 'cexp', 'al', 'doc', 'ous']
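Applied to a line like yours, a minimal sketch might look like the following. It uses the split() string method rather than string.split(), on the assumption that your Python has string methods (2.0 or later); the names `line` and `fields` are just for illustration.

```python
# Sample line taken from the original question.
line = 'FWIN,2001/09/14,01:44:53 -6:00GMT,126.96.36.199:137,188.8.131.52:137,UDP'

# Split on commas only, in a single pass over the line.
# Whitespace is no longer a delimiter, so the time field stays in one piece.
fields = line.split(',')

print(fields)
# ['FWIN', '2001/09/14', '01:44:53 -6:00GMT', '126.96.36.199:137', '188.8.131.52:137', 'UDP']
```

Since the comma is the only delimiter, the space inside '01:44:53 -6:00GMT' is left alone, and no second pass over the line is needed.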