re question
Daniel Schüle
uval at rz.uni-karlsruhe.de
Fri Jun 23 09:23:57 EDT 2006
Hello re gurus,
I wrote this pattern to extract the "name" and the "content" of a VHDL
package.
I know that the file is valid VHDL code, so there is actually no need to
validate anything after the 'end' token is found, but since that part works
fine I don't want to touch it.
This is the pattern:
    pattern = re.compile(
        r'^\s*package\s+(?P<name>\w+)\s+is\s+(?P<content>.*?)\s+end(\s+package)?(\s+(?P=name))?\s*;',
        re.DOTALL | re.MULTILINE | re.IGNORECASE)
and the problem is that
package TEST is xyz end;
works but
package TEST123 is xyz end;
fails
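A minimal sketch for reproducing the two cases in isolation, assuming the pattern is compiled exactly as quoted above. The names `samples` and `results` are only for illustration, and the outcome for the second input may depend on the Python version in use, so no expected output is shown.

```python
import re

# Pattern copied from the post; only the line breaks differ.
pattern = re.compile(
    r'^\s*package\s+(?P<name>\w+)\s+is\s+(?P<content>.*?)'
    r'\s+end(\s+package)?(\s+(?P=name))?\s*;',
    re.DOTALL | re.MULTILINE | re.IGNORECASE)

samples = ["package TEST is xyz end;", "package TEST123 is xyz end;"]

# Map each sample to its named groups, or None if there is no match.
results = {}
for text in samples:
    m = pattern.search(text)
    results[text] = m.groupdict() if m else None
    print(repr(text), "->", results[text])
```

Running this on the literal strings, rather than on the whole VHDL file, helps separate a problem in the pattern from a problem in the input.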
\w is supposed to match [a-zA-Z0-9_], so I don't understand why digits and
underscores make the pattern fail.
(I have a slight suspicion that it may be a re bug.)
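One way to narrow this down is to test \w+ on its own, outside the larger pattern. This is only a sketch; the candidate names and the `word_ok` dictionary are made up for the check.

```python
import re

# Check \w+ against letters, digits and underscore by themselves,
# without the surrounding package/is/end pattern.
candidates = ["TEST", "TEST123", "TEST_1"]
word_ok = {c: bool(re.fullmatch(r'\w+', c)) for c in candidates}

for name, ok in word_ok.items():
    print(name, ok)
```

If every candidate is accepted here, the character class itself is not the culprit and the cause has to be somewhere else in the pattern or in the input.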
I also tried this pattern, with the same results:
    pattern = re.compile(
        r'^\s*package\s+(?P<name>.+?)\s+is\s+(?P<content>.*?)\s+end(\s+package)?(\s+(?P=name))?\s*;',
        re.DOTALL | re.MULTILINE | re.IGNORECASE)
something must be wrong with (?P<name>\w+) inside the main pattern
thanks in advance
--
Daniel