Regular expression for file name
Christopher T King
squirrel at WPI.EDU
Mon Jul 19 09:48:04 EDT 2004
On Sun, 18 Jul 2004, Miki Tebeka wrote:
> In a configuration file there can be IDs and filename tokens.
> The file names have a known suffix (.o or .mls) and I need a regular
> expression that will catch a filename but not an ID.
>
> Currently:
> ID = r"[a-zA-Z\.]\w+(?![/\\])"
> FILENAME = r"([a-zA-Z]:)?[\w./\\]+\.((mls)|(o))"
>
> However, if I have the filename "Sources/kernel/rom_kernel.mls" then
> "Source" is interpreted as an ID and "s/kernel/rom_kernel.mls" is
> interpreted as a file name.
I'm not familiar with PLY, but my guess is that you get those results
because it tries to match ID first and FILENAME second. The best way to
solve this is to add another constraint to your REs: the delimiter that
ends each token (presumably whitespace):
ID = r"[a-zA-Z\.]\w+(?=\s)"
FILENAME = r"([a-zA-Z]:)?[\w./\\]+\.((mls)|(o))(?=\s)"
I'm not sure whether PLY supports (?=...), but I assume it does, since
you used its complement, (?!...), in your original REs.
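
As a quick check outside PLY, here is a minimal sketch using Python's re
module directly. The first_token helper is hypothetical: it only imitates
a lexer that tries ID before FILENAME at the start of the input, which is
my assumption about the ordering and not a description of PLY's
internals.

```python
import re

# Patterns from the original question.
ID_OLD = r"[a-zA-Z\.]\w+(?![/\\])"
FILENAME_OLD = r"([a-zA-Z]:)?[\w./\\]+\.((mls)|(o))"

# Patterns with the trailing-delimiter lookahead suggested above.
ID_NEW = r"[a-zA-Z\.]\w+(?=\s)"
FILENAME_NEW = r"([a-zA-Z]:)?[\w./\\]+\.((mls)|(o))(?=\s)"


def first_token(text, rules):
    """Return (name, lexeme) for the first rule that matches at position 0.

    Hypothetical helper: rules are tried in the order given, which is an
    assumption about the lexer, not PLY's actual algorithm.
    """
    for name, pattern in rules:
        m = re.match(pattern, text)
        if m:
            return name, m.group(0)
    return None


line = "Sources/kernel/rom_kernel.mls\n"

old = first_token(line, [("ID", ID_OLD), ("FILENAME", FILENAME_OLD)])
new = first_token(line, [("ID", ID_NEW), ("FILENAME", FILENAME_NEW)])
ident = first_token("rom_kernel \n", [("ID", ID_NEW), ("FILENAME", FILENAME_NEW)])

print(old)    # ('ID', 'Source')
print(new)    # ('FILENAME', 'Sources/kernel/rom_kernel.mls')
print(ident)  # ('ID', 'rom_kernel')
```

With the old ID pattern, \w+ backtracks by one character until the
negative lookahead no longer sees a slash, which is where "Source" comes
from. One caveat: (?=\s) needs a whitespace character after the token, so
the last token of a file with no trailing newline would not match. If
that can happen, (?=\s|$) is a possible variant.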