Split text file into words

qwweeeit qwweeeit at yahoo.it
Wed Mar 9 10:11:36 EST 2005


Thank you for your help.
I have already used re.split successfully, but in this case...
I didn't go into more detail because I don't want someone else to do my
homework.

I want to implement a cross-reference tool for variables and commands.
To do this I must first strip all comments and literal strings from the
Python source.
Then, in the cleaned source file, I must isolate all the words (keeping
together the words joined by '.').

My faulty code (ignore the line number in the traceback; this is only an
extract):

import re

# input text file w/o strings & comments

f=open('file.txt')
lInput=f.readlines() 
f.close()

fOut=open('words.txt','w')

for i in lInput:
    ll=re.split(r"[\s,{}[]()+=-/*]",i)
    fOut.write(' '.join(ll)+'\n')

fOut.close()

Traceback (most recent call last):
  File "./GetWords.py", line 70, in ?
    ll=re.split(r"[\s,{}[]()+=-/*]",i)
  File "/usr/lib/python2.3/sre.py", line 156, in split
    return _compile(pattern, 0).split(string, maxsplit)
RuntimeError: maximum recursion limit exceeded


... and if I use:
ll=re.split(r"\s,{}[]()+=-/*",i)

Traceback (most recent call last):
  File "./GetWords.py", line 70, in ?
    ll=re.split(r"\s,{}[]()+=-/*",i)
  File "/usr/lib/python2.3/sre.py", line 156, in split
    return _compile(pattern, 0).split(string, maxsplit)
  File "/usr/lib/python2.3/sre.py", line 230, in _compile
    raise error, v # invalid expression
sre_constants.error: bad character range

I thought it was my mistake in the use of re.split...
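
A possible reading of the two errors, offered as a sketch rather than a
confirmed diagnosis: in the first pattern the character class ends at the
first ']', so the class is only [\s,{}[] and the '()+' that follows is a
repeated empty group, which is presumably what sends Python 2.3's sre into
the recursion limit. In the second pattern '[]()+=-/*' opens a class in which
'=-/' is a range running backwards ('=' sorts after '/'), hence "bad character
range". Escaping the brackets and the hyphen inside the class avoids both.
The names SEPARATORS and split_line below are illustrative and not part of
the original script.

```python
import re

# Inside a character class, escape '[' , ']' and '-' so that none of them
# closes the class early or forms a range. The trailing '+' treats a run of
# separators as a single split point.
SEPARATORS = re.compile(r"[\s,{}\[\]()+=\-/*]+")

def split_line(line):
    """Split one line on the separators, dropping empty pieces."""
    return [piece for piece in SEPARATORS.split(line) if piece]

print(split_line("a = foo(b[1], c) + os.path.join(d, e)\n"))
```

Because '.' is not among the separators, a dotted name such as os.path.join
stays in one piece.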

I am using:   
Python 2.3.4 (#2, Aug 19 2004, 15:49:40)
[GCC 3.4.1 (Mandrakelinux (Alpha 3.4.1-3mdk)] on linux2
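
The goal stated above, isolating the words while keeping those joined by '.',
can also be approached by matching the words themselves instead of splitting
on separators. This is only a minimal sketch: it assumes the input is already
free of comments and strings, and the names DOTTED_NAME and words are
hypothetical.

```python
import re

# One identifier, optionally followed by more identifiers joined by '.'.
# Bare numbers are not matched because a word must start with a letter or '_'.
DOTTED_NAME = re.compile(r"[A-Za-z_]\w*(?:\.[A-Za-z_]\w*)*")

def words(line):
    """Return every (possibly dotted) name found in the line, in order."""
    return DOTTED_NAME.findall(line)

print(words("self.count = os.path.getsize(name) * 2"))
```

With this approach the list of separator characters does not have to be
spelled out at all, so there is nothing to escape.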


