Using regular expressions to extract substrings from files
t.hume at bom.gov.au
Fri Sep 10 04:38:10 CEST 2004
I am new to Python, and was wondering if it is possible to operate on
files using regular expressions.
What I mean is this:
- It is easy to search for a substring of a string using regular
expressions.
- Can I also search for a substring inside a file using regular
expressions? The substring may span several lines (i.e. there may be
embedded newline and carriage return characters).
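
To make the string case concrete, here is a minimal sketch of a regular expression matching a substring that spans lines. The sample text and the BEGIN/END markers are hypothetical, chosen only for illustration.

```python
import re

# Hypothetical sample text; the wanted substring spans two lines that
# end in "\r\n" (carriage return + newline).
text = "header\r\nBEGIN first part\r\nsecond part END\r\nfooter\r\n"

# re.DOTALL lets "." match newline characters as well, so the non-greedy
# ".*?" can cross line boundaries between the two markers.
pattern = re.compile(r"BEGIN(.*?)END", re.DOTALL)

match = pattern.search(text)
if match:
    substring = match.group(1)
    print(repr(substring))
```

Without re.DOTALL, "." stops at a newline, so this pattern would not match across the line break.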
So far, the only way I know how to do this is to read the entire file into
a string, and then parse the resulting string with regular expressions.
This is OK for small files (in fact it is probably quite efficient,
because the disc I/O is done all at once). However, once the files get
large, there is the risk I will run out of memory. The closest UNIX tool I
can think of to do this sort of job is grep, but that doesn't have the
power and flexibility of Python.
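
The whole-file approach described above can be sketched as follows, together with one possible alternative for larger files: memory-mapping the file with the standard mmap module, so the operating system pages data in on demand instead of the program holding a full copy. The mmap variant is an assumption about what might help, not something from the original question, and the function names are hypothetical. Both functions take a bytes pattern because the file is opened in binary mode.

```python
import mmap
import re


def search_whole_file(path, pattern):
    """Read the entire file into memory, then search it (fine for small files)."""
    with open(path, "rb") as f:
        data = f.read()
    return re.search(pattern, data, re.DOTALL)


def search_mapped_file(path, pattern):
    """Search a memory-mapped file without reading it all into a bytes object.

    Returns the matched bytes (or None) rather than the match object,
    because the mapping is closed when this function returns.
    Note: mmap raises ValueError for an empty file.
    """
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
            m = re.search(pattern, mm, re.DOTALL)
            return m.group(0) if m else None
```

For example, search_mapped_file("data.txt", rb"BEGIN.*?END") would return the first span between the two markers, embedded line breaks included, where "data.txt" is a placeholder file name.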
Any ideas would be appreciated.
Bureau of Meteorology Research Centre