[Python-Dev] Misc re.match() complaint

Terry Reedy tjreedy at udel.edu
Wed Jul 17 11:05:50 CEST 2013


On 7/17/2013 12:15 AM, Stephen J. Turnbull wrote:
> Terry Reedy writes:
>   > On 7/15/2013 10:20 PM, Guido van Rossum wrote:
>   >
>   > >> Or is this something deeper, that a group *is* a new object in
>   > >> principle?
>   > >
>   > > No, I just think of it as returning "a string"
>   >
>   > That is exactly what the doc says it does. See my other post.
>
> The problem is that IIUC '"a string"' is intentionally *not* referring
> to the usual "str or bytes objects" (at least that's one of the
> standard uses for scare quotes, to indicate an unusual usage).

There are no 'scare quotes' in the doc. I put quote marks on things to
indicate that I was quoting. I do not know how Guido regarded his marks.

> Either
> the docstring is using "string" in a similarly ambiguous way, or else
> it's incorrect under the interpretation that buffer objects are *not*
> "strings", so they should be inadmissible as targets.

Saying that input arguments can be "Unicode strings as well as 8-bit
strings" (the wording is from 2.x, carried over to 3.x) does not
necessarily exclude other inputs. CPython is sometimes more
permissive than the doc requires. If the doc said str, bytes, bytearray,
or memoryview, then other implementations would have to accept the same
types to be conforming. I do not know if that is intended or not.

The question is whether CPython should be just as permissive as to the
output types of .group(). (And what requirement, if any, should be
imposed on other implementations.)
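
A minimal sketch of the input/output question, assuming a CPython 3.4 or
later interpreter (where, as far as I know, the output was tightened so
that .group() returns bytes for any bytes-like target). The pattern and
sample data are made up for illustration only.

```python
import re

# A bytes pattern; the targets below are all "bytes-like" inputs.
pattern = re.compile(rb"a+")

targets = [b"aab", bytearray(b"aab"), memoryview(b"aab")]

results = []
for target in targets:
    m = pattern.match(target)
    # Record the input type, the type .group() hands back, and the value.
    results.append((type(target).__name__, type(m.group()).__name__, m.group()))

for row in results:
    print(row)
```

The point of the sketch is only that the accepted input types are wider
than the returned type; it does not say what other implementations must
accept.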

> Something
> should be fixed, and I suppose it should be the return type of group().
>
> BTW, I suggest that Terry's usage of "string" (to mean "str or bytes"
> in 3.x, "unicode or str" in 2.x) be adopted, and Guido's "stringish"

This word is an adjective, not a noun.

> be given expanded meaning, including buffer objects.  Then we can say
> informally that in searching and matching a target is a stringish, the
> pattern is a stringish (?) or compiled re, but the group method
> returns a string.

Guido's idea to fix (tighten up) the output in 3.4 is fine with me.

-- 
Terry Jan Reedy


