Regex for strings utility
Skip Montanaro
skip at pobox.com
Tue Jul 17 15:25:52 EDT 2001
rhys> I'm trying to write a script which operates like the Unix
rhys> 'strings' utility but I'm having difficulties with the regex.
...
rhys> I'm getting a Syntax Error: Invalid Token at the closing brace to
rhys> the pattern.
You have a few problems. First, the pattern needs to be a string, so it
has to be enclosed in quotes. Second, the terminating character for the for
loop needs to be a colon. Third, based upon the way you imported re, you
need to refer to the findall function as re.findall.
Here's a slightly revised version of your script:
#!/usr/bin/env python
# strings program
import sys, re
f = open(sys.argv[1])
line = f.readline()
# regular expression to match strings >=4 chars: printable ASCII from
# space (octal 040) through '~' (octal 176, i.e. decimal 126), plus whitespace
pattern = re.compile("[\040-\176\s]{4,}")
while line:
    matches = re.findall(pattern, line)
    for match in matches:
        print match
    line = f.readline()
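Since 'strings' is normally pointed at binary files, here is a rough sketch of the same idea that reads the file as bytes instead of text lines. It is not part of the original script: it assumes Python 3, the helper name extract_strings is made up for illustration, and including tab in the character class is my guess at what you'd want rather than anything rhys specified. The range is written in hex (\x20 through \x7e, space through '~').

```python
import re
import sys

# Runs of 4 or more printable ASCII bytes (space through '~'), plus tab.
# Including tab is an assumption; drop \t if you only want 0x20-0x7e.
PRINTABLE_RUN = re.compile(rb"[\x20-\x7e\t]{4,}")

def extract_strings(data):
    """Return every printable run found in the bytes object `data` as str."""
    return [run.decode("ascii") for run in PRINTABLE_RUN.findall(data)]

if __name__ == "__main__" and len(sys.argv) > 1:
    # Read the whole file in binary mode so non-text bytes don't break decoding.
    with open(sys.argv[1], "rb") as f:
        for s in extract_strings(f.read()):
            print(s)
```

Reading the whole file at once keeps the sketch short; for very large files you would probably want to read in chunks and handle runs that straddle a chunk boundary.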
--
Skip Montanaro (skip at pobox.com)
http://www.mojam.com/
http://www.musi-cal.com/