[Tutor] Another regular expression question
kent37 at tds.net
Wed Sep 14 00:18:51 CEST 2005
Bernard Lebel wrote:
> Hello, yet another regular expression question :-)
> So I have this xml file that I'm trying to find a specific tag in. For
> this I'm using a regular expression. Right now, the tag I'm trying to
> find looks like this:
> <sceneobject name="Camera_Root_bernard" type="CameraRoot">
> So I'm using a regular expression to find:
> My code looks like this:
> import os, re
> def searchTag( sPattern, sFile ):
> Scans a xml file to try to find a line that matches search criterias.
> sPattern (string): regular expression pattern string
> sFile (string): full file path to scan
> RETURN VALUE: text line (string) or None
> oRe = re.compile( sPattern )
> if os.path.exists( sFile ) == False: return None
No need to compare to False, you can just say
if not os.path.exists( sFile ): return None
> oFile = file( sFile, 'r' )
> for sLine in oFile.xreadlines(): # read text
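for sLine in oFile:
is more idiomatic. xreadlines() is deprecated (since Python 2.3); iterating over the file object directly does the same job, reading one line at a time rather than the whole file at once.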
> oMatch = oRe.search( sLine ) # attempt a search
> if oMatch != None: # check if search returned success
> return sLine
> # Scan has yield no result, return None
> return None
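Putting the two suggestions above together, the function might look something like the sketch below. It assumes a current Python 3 (so open() rather than file()), and the snake_case names are my own, not from the original post.

```python
import os
import re

def search_tag(pattern, path):
    """Return the first line of the file at `path` that matches `pattern`, or None."""
    if not os.path.exists(path):
        return None
    regex = re.compile(pattern)
    with open(path, 'r') as f:
        # Iterating over the file object reads one line at a time.
        for line in f:
            if regex.search(line) is not None:
                return line
    # No line matched.
    return None
```

The with block closes the file even when the function returns early from inside the loop.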
> sLine = searchTag( r'(sceneobject)(type="CameraRoot")', sFile )
> The thing is that I suspect my regular expression pattern to be
> incorrect because I always get None, but am at a loss here. Any advice
> would be welcomed.
You need something in the regex to match the part between 'sceneobject' and 'type="CameraRoot"'. The regex you are using expects them to be adjacent. Try
sLine = searchTag( r'(sceneobject).*?(type="CameraRoot")', sFile )
Here .*? matches any run of characters between the two strings, but as few as possible (non-greedy). Note that . does not match a newline by default, so both strings have to be on the same line.
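To see the difference, here is a small sketch that runs both patterns against the sample tag from the question. The variable names are mine, and it is written for a current Python 3.

```python
import re

# Sample line taken from the question.
line = '<sceneobject name="Camera_Root_bernard" type="CameraRoot">'

# Original pattern: 'type=...' must come immediately after 'sceneobject'.
adjacent = re.compile(r'(sceneobject)(type="CameraRoot")')
# Revised pattern: .*? allows any characters (as few as possible) in between.
with_gap = re.compile(r'(sceneobject).*?(type="CameraRoot")')

print(adjacent.search(line))  # None

match = with_gap.search(line)
print(match.group(0))  # sceneobject name="Camera_Root_bernard" type="CameraRoot"
print(match.groups())  # ('sceneobject', 'type="CameraRoot"')
```

The first search fails because the name attribute sits between the two pieces; the second succeeds because .*? absorbs it.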
It's also possible that the tag you are looking for spans multiple lines, in which case a line-by-line search can never match it. In that case you should look at an XML parsing library.
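As one possible illustration of the parser route, here is a minimal sketch using ElementTree, assuming a Python that ships xml.etree.ElementTree in the standard library (2.5 and later). The sample document is made up for the example, with the tag deliberately split across lines; only the sceneobject tag and its name and type attributes come from the question.

```python
import xml.etree.ElementTree as ET

# Hypothetical document; the second tag is split across several lines.
xml_text = '''<scene>
  <sceneobject name="Light_Root" type="LightRoot"/>
  <sceneobject
      name="Camera_Root_bernard"
      type="CameraRoot"/>
</scene>'''

root = ET.fromstring(xml_text)

# iter() walks every descendant element with the given tag name;
# get() returns the attribute value, or None if it is absent.
cameras = [el for el in root.iter('sceneobject')
           if el.get('type') == 'CameraRoot']

for el in cameras:
    print(el.get('name'))  # Camera_Root_bernard
```

Because the parser works on elements and attributes rather than text lines, line breaks and attribute order inside the tag no longer matter. For a file on disk, ET.parse(path).getroot() would take the place of ET.fromstring().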