"Zeroing out" the Nth group in a RE

Doru-Catalin Togea doru-cat at ifi.uio.no
Sat Aug 3 11:41:18 EDT 2002


Hi all!

I want to match all occurrences of Bible references in a string like:
	
	refs = 'gen  5:17 - 23  , lev 14:20, rev 19:10 - 25'
	
There are two kinds of Bible references:
	- simple, like 'lev 14:20'
	- a range, like 'gen 5:17-23'   # an extension of the simple reference
	
I want a very general RE which matches all references, both simple ones
and ranges, at once. Running the following code

import re

# refs is the string defined above
bibleRef = re.compile(r'(?:(?:(\w+)(?:\s+)(\d+):(\d+))(?:(?:\s*)(?:-)(?:\s*)(\d+))?)')
m = bibleRef.findall(refs)
print(m)

outputs:
	
[('gen', '5', '17', '23'), ('lev', '14', '20', '23'), ('rev', '19', '10', '25')]

which is "mistaken" in that the second tuple should have been 

	('lev', '14', '20') 		or 
	('lev', '14', '20', '')

I tried to achieve this by wrapping the last part of my RE (the part
denoting the range extension) in a set of () and placing a '?' after
it, to say that this part is optional, that is, the RE should match
whether the range extension occurs or not.
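
For comparison, the optional-group idea above can be written more compactly. This is only a sketch: the name simpleRef and the use of re.VERBOSE are illustrative choices, not part of the original code. The redundant (?:...) wrappers are dropped, which should not change what the pattern matches.

```python
import re

refs = 'gen  5:17 - 23  , lev 14:20, rev 19:10 - 25'

# Same four capturing groups as the original pattern, without the
# redundant (?:...) wrappers. re.VERBOSE allows comments in the pattern.
simpleRef = re.compile(r'''
    (\w+) \s+ (\d+) : (\d+)     # book, chapter, verse
    (?: \s* - \s* (\d+) )?      # optional range end
''', re.VERBOSE)

matches = simpleRef.findall(refs)
print(matches)
```

On a recent Python interpreter, findall() should report a group that did not take part in a match as an empty string, so the 'lev' tuple would be expected to end in ''.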

So, how do I zero-out this "fourth group" when I encounter simple
references?
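
One possible way to do that, sketched under the assumption that the stale '23' is carried over by findall() between matches and is not produced by the pattern itself: walk the matches with finditer(), so that each reference gets its own match object, and let groups() substitute '' for any group that did not participate. The helper name parse_refs is hypothetical.

```python
import re

bibleRef = re.compile(
    r'(?:(?:(\w+)(?:\s+)(\d+):(\d+))(?:(?:\s*)(?:-)(?:\s*)(\d+))?)')

def parse_refs(text):
    """Return one (book, chapter, verse, range_end) tuple per reference."""
    # groups('') replaces an unmatched optional group with ''
    # instead of the default None.
    return [m.groups('') for m in bibleRef.finditer(text)]

refs = 'gen  5:17 - 23  , lev 14:20, rev 19:10 - 25'
for ref in parse_refs(refs):
    print(ref)
```

If the three-element form is preferred for simple references, the trailing '' could be dropped from each tuple afterwards.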

Thank you if you can help.
Catalin



	<<<< ================================== >>>>
	<<     We are what we repeatedly do.      >>
	<<  Excellence, therefore, is not an act  >>
	<<             but a habit.               >>
	<<<< ================================== >>>>




