[docs] possible doc bug: 6.2.1. Regular Expression Syntax, *?, +?, ??
Georg Brandl
georg at python.org
Tue Apr 12 01:53:08 EDT 2016
On 04/11/2016 09:34 PM, John Gordon wrote:
> I'm a Python newbie and am very reluctant to submit a doc bug since I don't know
> what I'm doing yet.
>
> But I'm trying to learn to use re.search and am using your doc.
>
> I can't figure out how the text I colored red below can be accurate. Besides its
> logic not making sense to me, I tried running it and it doesn't work: .*?
> doesn't match <H1> .
>
> If I'm wrong I need to understand why.
>
> ============
>
> DOC BUG (maybe):
> https://docs.python.org/3/library/re.html
>
> 6.2.1. Regular Expression Syntax
>
> *?, +?, ??
> The '*', '+', and '?' qualifiers are all greedy; they match as much text as
> possible. Sometimes this behaviour isn’t desired; if the RE <.*> is matched
> against '<H1>title</H1>', it will match the entire string, and not just
> '<H1>'. Adding '?' after the qualifier makes it perform the match in
> non-greedy or minimal fashion; as few characters as possible will be
> matched. *Using .*? in the previous expression will match only '<H1>'.*
>
>
> I tried the following and none returned <H1>:
>
> >>> ttt = re.search('.*?','<H1>title</H1>')
> >>> ttt = re.search('(.*?)','<H1>title</H1>')
> >>> ttt = re.search(r'(.*?)','<H1>title</H1>')

Hi John,

this is indeed confusingly worded; it should give the whole expression,
i.e. ``<.*?>``.
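
To illustrate, here is a minimal sketch of the difference, using the string from
the documentation. The variable names (text, greedy, lazy, bare) are made up for
this example. It also shows why a bare ``.*?`` does not return '<H1>': with nothing
after it in the pattern, matching zero characters already succeeds.

```python
import re

# The sample string from the documentation example.
text = '<H1>title</H1>'

# Greedy: '.*' takes as much as it can, so the match runs to the last '>'.
greedy = re.search(r'<.*>', text)

# Non-greedy: '.*?' takes as little as it can, so the match stops at the first '>'.
lazy = re.search(r'<.*?>', text)

# On its own, '.*?' is satisfied by zero characters: an empty match at position 0.
bare = re.search(r'.*?', text)

print(repr(greedy.group()))  # '<H1>title</H1>'
print(repr(lazy.group()))    # '<H1>'
print(repr(bare.group()))    # ''
```

The surrounding ``<`` and ``>`` are what force the non-greedy ``.*?`` to consume
anything at all; the same holds when the pattern is wrapped in a group, as in
``(.*?)``.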

Also, it should probably refrain from using HTML in a regex example altogether,
as parsing HTML/XML with regexes is one of the classic "I thought I'd use regex,
now I have two problems" cases.

I've changed the example now to be a bit less specific, and clarified the
replacement regex.

Thanks for the report!

Georg