[Tutor] Another regular expression question
Kent Johnson
kent37 at tds.net
Wed Sep 14 00:18:51 CEST 2005
Bernard Lebel wrote:
> Hello, yet another regular expression question :-)
>
> So I have this xml file that I'm trying to find a specific tag in. For
> this I'm using a regular expression. Right now, the tag I'm trying to
> find looks like this:
>
> <sceneobject name="Camera_Root_bernard" type="CameraRoot">
>
> So I'm using a regular expression to find:
> sceneobject
> type="CameraRoot"
>
>
> My code looks like this:
>
>
> import os, re
>
>
> def searchTag( sPattern, sFile ):
>
>     """
>     Scans a xml file to try to find a line that matches search criterias.
>
>     ARGUMENTS:
>     sPattern (string): regular expression pattern string
>     sFile (string): full file path to scan
>
>     RETURN VALUE: text line (string) or None
>     """
>
>     oRe = re.compile( sPattern )
>
>     if os.path.exists( sFile ) == False: return None
No need to compare to False; you can just say
  if not os.path.exists( sFile ): return None
>     else:
>         oFile = file( sFile, 'r' )
>
>         for sLine in oFile.xreadlines(): # read text
  for sLine in oFile:
is more idiomatic. Like xreadlines(), it reads one line at a time rather than loading the whole file, and xreadlines() has been deprecated since Python 2.3.
>             oMatch = oRe.search( sLine ) # attempt a search
>             if oMatch != None: # check if search returned success
>                 oFile.close()
>                 return sLine
>
>         # Scan has yield no result, return None
>         oFile.close()
>         return None
>
>
> sLine = searchTag( r'(sceneobject)(type="CameraRoot")', sFile )
>
>
> The thing is that I suspect my regular expression pattern to be
> incorrect because I always get None, but am at a loss here. Any advice
> would be welcomed.
You need something in the regex to match the text between 'sceneobject' and 'type="CameraRoot"'. The regex you are using expects them to be adjacent. Try
  sLine = searchTag( r'(sceneobject).*?(type="CameraRoot")', sFile )
Here .*? matches any characters between the two strings, but as few as possible (non-greedy).
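As a minimal sketch of the difference, the two patterns can be tried directly against the sample tag from the question. The variable names here are only illustrative and are not part of the original code.

```python
import re

# Sample tag copied from the question.
tag = '<sceneobject name="Camera_Root_bernard" type="CameraRoot">'

# Original pattern: requires the two strings to be adjacent, so it finds nothing.
adjacent = re.search(r'(sceneobject)(type="CameraRoot")', tag)

# Revised pattern: .*? lazily matches whatever lies between the two strings.
separated = re.search(r'(sceneobject).*?(type="CameraRoot")', tag)

print(adjacent)            # None
print(separated.groups())  # ('sceneobject', 'type="CameraRoot"')
```

Note that . does not match a newline by default, so this still only works when the whole tag sits on one line.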
It's also possible that the tag you are looking for spans multiple lines, in which case a line-by-line search will never match it. In that case you should look at an XML parsing library.
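For illustration only, here is a rough sketch of the parser approach. It assumes a Python version that ships xml.etree.ElementTree in the standard library, and the <scene> wrapper element, the sample document and the find_scene_objects helper are all made up for this example; the real file layout may differ.

```python
import xml.etree.ElementTree as ET


def find_scene_objects(xml_text, wanted_type):
    """Return every <sceneobject> element whose type attribute equals wanted_type."""
    root = ET.fromstring(xml_text)
    # iter() walks the whole tree, including the root element itself.
    return [el for el in root.iter('sceneobject') if el.get('type') == wanted_type]


# Hypothetical document: the first tag is deliberately split across lines.
sample = """<scene>
  <sceneobject
      name="Camera_Root_bernard"
      type="CameraRoot">
  </sceneobject>
  <sceneobject name="Light" type="Light"/>
</scene>"""

for el in find_scene_objects(sample, 'CameraRoot'):
    print(el.get('name'))  # Camera_Root_bernard
```

Because the parser works on elements and attributes rather than text lines, line breaks inside the tag and the order of the attributes no longer matter.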
Kent