[BangPypers] Question on Pattern Matching and Regular Expression

Gora Mohanty gora at mimirtech.com
Mon Jan 7 11:08:05 CET 2013


On 7 January 2013 15:06, davidsnt <davidsnt at gmail.com> wrote:
> Bangpypers,
>
> Having a little trouble parsing a file of approx. 702 lines,
>
> the file is in the format
>
> #
> # <Title>
> #
> [Space]
> [
>
> Few lines of information about the title
>
> ]
>
> [Space]

If the above format is strictly followed, this should do it,
assuming you can read the entire file into a string (s in
the example below).

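import re
# Group 1 is the title text after '#'; group 2 is everything
# between '[' and ']'.
TITLE_RE = re.compile( r'#\n#([^\n]*)\n#\n \n\[([^\]]*)\]\n \n',
re.MULTILINE|re.DOTALL )
# Pass s unstripped: s.strip() would remove the ' \n' that the
# pattern requires after the final ']'.
for m in TITLE_RE.finditer( s ):
    title, info = m.groups()
    print(title, info)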

Error handling, and reading chunks from a large file
are left as an exercise for the reader.
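
For what it is worth, one way that exercise might go is sketched
below. It assumes Python 3, assumes a record always ends at a line
containing only "]", and relaxes the separator lines to any
whitespace. The names RECORD_RE and iter_records are made up for
this illustration, and this is only one of several reasonable
approaches.

```python
import re

# Same shape as the pattern above, except that the whitespace-only
# separator line before '[' is matched loosely with \s*.
RECORD_RE = re.compile(r'#\n#([^\n]*)\n#\n\s*\[([^\]]*)\]')

def iter_records(lines):
    """Yield (title, info) pairs from an iterable of lines.

    Only one record is buffered at a time, so the whole file never
    has to be held in memory. Each line is expected to keep its
    trailing newline, as lines from an open file do.
    """
    buf = []
    for line in lines:
        buf.append(line)
        if line.strip() == ']':      # assumed end-of-record marker
            chunk = ''.join(buf)
            m = RECORD_RE.search(chunk)
            if m is None:
                raise ValueError('malformed record: %r' % chunk[:60])
            yield m.group(1).strip(), m.group(2).strip()
            buf = []
    # Anything other than whitespace left over means a record was
    # opened but never closed.
    leftover = ''.join(buf).strip()
    if leftover:
        raise ValueError('unterminated record: %r' % leftover[:60])
```

Since an open file iterates over its lines, something like
"for title, info in iter_records(open(path)):" should work, where
path is whatever your file is called.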

Also, if the file format is at all more complex, and
maybe even in this case, I would write a parser
rather than use regular expressions.
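
As a rough idea of what such a parser could look like, here is a
minimal line-based sketch, assuming Python 3 and the format quoted
above. The function name parse_titles and the error messages are
invented for illustration. Note that it strips each info line, so
any indentation inside the brackets is lost.

```python
def parse_titles(lines):
    """Return a list of (title, info) pairs from an iterable of lines."""
    records = []
    title = None   # most recent non-empty '# <Title>' line
    info = None    # list of info lines while inside [ ... ], else None
    for lineno, raw in enumerate(lines, 1):
        line = raw.strip()
        if info is not None:             # currently inside [ ... ]
            if line == ']':
                records.append((title, '\n'.join(info).strip()))
                title, info = None, None
            else:
                info.append(line)
        elif line == '[':
            if title is None:
                raise ValueError('line %d: "[" without a title' % lineno)
            info = []
        elif line.startswith('#'):
            text = line[1:].strip()
            if text:                     # bare '#' lines are decoration
                title = text
        elif line:                       # blank separator lines are skipped
            raise ValueError('line %d: unexpected text %r' % (lineno, line))
    if info is not None:
        raise ValueError('unterminated "[" block for title %r' % title)
    return records
```

Unlike the regular expression, this reports the line number where
the input stops following the format, which helps when a file of
several hundred lines has one stray entry.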

Regards,
Gora

