trailing space in RE

Denis S. Otkidach ods at fep.ru
Fri Aug 2 11:50:11 EDT 2002


On Fri, 2 Aug 2002, Doru-Catalin Togea wrote:

DT> Hi all!
DT>
DT> I have written a little script to parse some Bible text, and to this
DT> purpose I defined the following re:
DT>
DT> 	bibleRef = r'(\w+) (\d+):(\d+) (.+)'
[snip]
DT> Everything works fine, but I have a problem in that "the rest of the
DT> text" always has a trailing space like this:
DT>
DT> "Gen 1:1 In the beginning God created the heavens and the earth. "
DT> "1Co 10:12 Therefore let him who thinks he stands take heed lest he
DT> fall. "
DT>
DT> So my question is, how do I match "the rest of the text" but not the
DT> last character (which is a space)?

State explicitly that the last matched character must be non-whitespace:

>>> import re
>>> r = re.compile(r'(\w+) (\d+):(\d+) (.+\S)')
>>> m = r.match("Gen 1:1 In the beginning God created the heavens and the earth.")
>>> m.groups()
('Gen', '1', '1', 'In the beginning God created the heavens and the earth.')
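
The session above uses an input without the trailing space, so here is a
small sketch that runs the same pattern over the two lines from the
question, trailing spaces included. The names BIBLE_REF and parse_ref are
made up for this illustration and are not from the original script:

```python
import re

# Same pattern as in the session above: the final \S forces the last
# captured character to be non-whitespace, so a trailing space is left
# outside the group.
BIBLE_REF = re.compile(r'(\w+) (\d+):(\d+) (.+\S)')


def parse_ref(line):
    """Return (book, chapter, verse, text), or None if the line does not match."""
    m = BIBLE_REF.match(line)
    return m.groups() if m else None


# The two sample lines from the question, each ending in a space.
lines = [
    "Gen 1:1 In the beginning God created the heavens and the earth. ",
    "1Co 10:12 Therefore let him who thinks he stands take heed lest he fall. ",
]

results = [parse_ref(line) for line in lines]
for ref in results:
    print(ref)
```

Note that (.+\S) needs at least two characters of text, so a line such as
"Gen 1:1 A" does not match at all; (.*\S) would accept it. Calling
rstrip() on the captured text is another option if changing the pattern is
not desirable.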

-- 
Denis S. Otkidach
http://www.python.ru/      [ru]
http://diveinto.python.ru/ [ru]




