[Python-Dev] Misc re.match() complaint
Guido van Rossum
guido at python.org
Tue Jul 16 19:21:36 CEST 2013
On Tue, Jul 16, 2013 at 12:55 AM, Antoine Pitrou <solipsis at pitrou.net> wrote:
> Is there a strong enough use case to change it? I can't say the current
> behaviour seems very useful either, but some people may depend on it.
This is the crucial question. I personally see the current behavior as
an artifact of the (lack of) design process, not as a conscious
decision. Given that we also have m.string, m.start(grp) and
m.end(grp), those who need something matching the original type (or
even something that is known to be a reference into the original
object) can use that API; for most use cases, all you care about is
the selected group as a string, and it is more useful if that is
always an immutable string (bytes or str).
The situation is most egregious if the target string is a bytearray,
where there is currently no way to get the result as an immutable
bytes object without an extra copy. (There's no API that lets you
create a bytes object directly from a slice of a bytearray.)
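
A minimal sketch of the extra copy described above. The buffer contents and the pattern are made up for illustration and do not come from the thread. Slicing a bytearray produces a new bytearray, and converting that slice to bytes copies the data a second time.

```python
import re

# Hypothetical buffer contents, standing in for data read into a bytearray.
buf = bytearray(b"HTTP/1.1 200 OK")
m = re.match(rb"HTTP/(\d\.\d)", buf)

# Slicing a bytearray yields a new bytearray (first copy of the data) ...
piece = buf[m.start(1):m.end(1)]

# ... and building an immutable bytes object from it copies the data again.
version = bytes(piece)
```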
In terms of backwards compatibility, I wouldn't want to do this in a
bugfix release, but for a feature release I think it's fine -- the
number of applications that could be bitten by this must be extremely
small (and the work-around is backward-compatible: just use
m.string[m.start() : m.end()]).
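
The work-around can be sketched as a small helper. The name group_as_original_type is hypothetical and is not part of the re module; it only wraps the m.string, m.start(grp) and m.end(grp) calls mentioned above, so the result has whatever type slicing the original object produces.

```python
import re

def group_as_original_type(m, group=0):
    """Return a group as a slice of the object that was searched.

    Hypothetical helper: it relies only on m.string, m.start() and m.end().
    """
    return m.string[m.start(group):m.end(group)]

# With a bytearray target, the slice is a bytearray.
buf = bytearray(b"key=value")
m = re.match(rb"(\w+)=(\w+)", buf)
value = group_as_original_type(m, 2)

# With a str target, the slice is a str, as usual.
m2 = re.match(r"(\w+)=(\w+)", "key=value")
key = group_as_original_type(m2, 1)
```

Because the helper never calls m.group(), its result does not depend on which type group() returns in a given Python version.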
> I already find it a bit weird that you're passing a bytearray or
> memoryview to re.match(), to be honest :-)
Yes, this is somewhat of an odd corner, but actually most built-in
APIs taking bytes also take anything else that can be coerced to bytes
(io.open() seems to be the exception, and it feels like an accident --
os.open() *does* accept bytearray and friends). This is quite useful
for code that interacts with C code or system calls -- often you have
a large buffer shared between C and Python code for efficiency, and
being able to do pretty much anything to the buffer that you can do to
a bytes object (apart from using it as a dict key) helps a lot.
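
As a rough illustration of the point about shared buffers, the sketch below uses a plain bytearray as a stand-in for a buffer shared with C code. The "LEN=n;" framing is invented for the example. Most bytes operations work on the buffer directly; using it as a dict key does not, because a bytearray is mutable and therefore unhashable.

```python
import re

# Stand-in for a large buffer shared with C code (the layout is made up).
buf = bytearray(64)
buf[:11] = b"LEN=5;hello"

# bytes-style methods and regular expressions work directly on the buffer.
has_header = buf.startswith(b"LEN=")
m = re.match(rb"LEN=(\d+);", buf)
length = int(buf[m.start(1):m.end(1)])
payload = buf[m.end():m.end() + length]

# The one thing that does not work: using the buffer as a dict key.
try:
    {buf: "cached"}
    usable_as_key = True
except TypeError:
    usable_as_key = False
```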
--
--Guido van Rossum (python.org/~guido)