Stripping whitespace

John Machin sjmachin at lexicon.net
Wed Jan 23 23:47:35 CET 2008


On Jan 24, 7:57 am, "Reedick, Andrew" <jr9... at ATT.COM> wrote:
>
> Why is it that so many Python people are regex averse?  Use the dashed
> line as a regex.  Convert the dashes to dots.  Wrap the dots in
> parentheses.  Convert the whitespace chars to '\s'.  Presto!  Simpler,
> cleaner code.

Woo-hoo! Yesterday was HTML day, today is code review day. Yee-haa!

>
> import re
>
> state = 0
> header_line = ''
> pattern = ''
> f = open('a.txt', 'r')
> for line in f:
>         if line[-1:] == '\n':
>                 line = line[:-1]
>
>         if state == 0:
>                 header_line = line
>                 state += 1

state = 1  # clearer than state += 1

>         elif state == 1:
>                 pattern = re.sub(r'-', r'.', line)
>                 pattern = re.sub(r'\s', r'\\s', pattern)
>                 pattern = re.sub(r'([.]+)', r'(\1)', pattern)

Consider this:
    pattern = ' '.join('(.{%d})' % len(x) for x in line.split())
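
To see what that one-liner produces, here is a minimal sketch. The header and dashed line are made-up sample data, not the contents of the original a.txt, and it assumes the columns are separated by exactly one space; split() collapses runs of whitespace, so wider gaps would need the original substitution approach or explicit slicing.

```python
import re

# Hypothetical sample data: three columns of width 10, 3 and 8,
# separated by single spaces.
widths = (10, 3, 8)
header_line = ' '.join(name.ljust(w) for name, w in zip(('Name', 'Age', 'City'), widths))
dash_line = ' '.join('-' * w for w in widths)

# One capturing group of fixed width per run of dashes.
pattern = ' '.join('(.{%d})' % len(x) for x in dash_line.split())

m = re.match(pattern, header_line)
# Each group still carries its column padding, so strip it off.
fields = [g.strip() for g in m.groups()]
```

Here pattern comes out as '(.{10}) (.{3}) (.{8})', which is arguably easier to read than the escaped-whitespace version built by the three re.sub calls.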

>                 print pattern
>                 state += 1

state = 2  # clearer than state += 1

>
>                 headers = re.match(pattern, header_line)
>                 if headers:
>                         print headers.groups()
>         else:
>                 state = 2

assert state == 2  # state is already 2 here, so the assignment above is redundant

>                 m = re.match(pattern, line)
>                 if m:
>                         print m.groups()
>
> f.close()
>
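
Putting the comments together, the state variable can probably go away altogether. What follows is only a sketch, written for Python 3, under the same assumptions as the quoted code: the first line is the header, the second is the dashed line, and rows that do not match the pattern are silently skipped. The name parse_fixed_width and the sample rows are hypothetical.

```python
import re
from itertools import chain

def parse_fixed_width(lines):
    """Yield a tuple of stripped fields for the header and each data row.

    Assumes at least two lines: a header, then a dashed line whose runs
    of dashes mark the column widths.
    """
    rows = (line.rstrip('\n') for line in lines)
    header_line = next(rows)
    dash_line = next(rows)
    pattern = re.compile(' '.join('(.{%d})' % len(x) for x in dash_line.split()))
    # The header is matched with the same pattern as the data rows.
    for line in chain([header_line], rows):
        m = pattern.match(line)
        if m:
            yield tuple(field.strip() for field in m.groups())

# Hypothetical sample input; with a real file, pass the open file object.
records = [('Name', 'Age', 'City'), ('Alice', '30', 'Paris'), ('Bob', '41', 'Oslo')]
lines = ['%-10s %-3s %-8s\n' % r for r in records]
lines.insert(1, ' '.join('-' * w for w in (10, 3, 8)) + '\n')

parsed = list(parse_fixed_width(lines))
```

Note that a data row whose last column has had its trailing padding trimmed will be shorter than the pattern expects and will not match; the quoted code behaves the same way.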


