
James Addison via Python-ideas writes:
> On Sun, 8 Jan 2023 at 08:32, Stephen J. Turnbull <stephenjturnbull@gmail.com> wrote:
>
> Trying to avoid the usual discussions about permissive parsing /
> supporting various implementations in the wild: long-term, the least
> ambiguous and most computationally efficient environment would
> probably want to reduce special cases like that?  (Both in-data and
> in-code.)
That's not very human-friendly, though. Push that to extremes and you get XML. "Nobody expects the XML Validators!"

> Structural pattern matching _seems_ like it could apply here, in
> terms of selecting appropriate arguments -- but it is, as I
> understand it, limited to at most one wildcard (starred) subpattern
> per sequence pattern (by sensible design).
If I understand what you mean by "structural pattern matching", that seems more appropriate to parsing already tokenized input.
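
For instance (Python 3.10+), dispatching on a token list -- a minimal,
hypothetical sketch, which also shows the limit you mention, that a
sequence pattern allows only one starred wildcard:

    def handle(tokens):
        match tokens:
            # Fixed-length sequence pattern: no wildcard needed.
            case ["GET", path]:
                return f"get {path}"
            # At most one starred subpattern is allowed per sequence
            # pattern; a second *name here would be a SyntaxError.
            case ["SET", key, *values]:
                return f"set {key} = {values}"
            case _:
                return "unrecognized"

    print(handle("SET color red green".split()))
    # -> set color = ['red', 'green']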

> I suppose an analysis (which I don't have the ability to perform
> easily) could determine how many regular-expression call sites could
> be migrated, compatibly and beneficially, to a str.partition that
> accepts multiple separator arguments.

My guess is that for re.match (or re.search) it would be relatively
few.  People tend to reach for regular expressions when they have
repetition or alternatives that they want to capture in a single
expression, and that is generally not going to be easy to express with
str.partition.

But I bet that *many* calls to re.split take regular expressions of
the form f'[{separators}]', which would be easy enough to search for.
That's where you could reduce the number of regexps.
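
For example (an untested sketch; multi_partition below is
hypothetical, not an existing str method):

    import re

    separators = ";,|"
    s = "a;b,c|d"

    # The form above: one character class built from the separators.
    # re.escape guards against characters that are special inside a
    # character class, such as ']' or '-'.
    print(re.split(f"[{re.escape(separators)}]", s))
    # -> ['a', 'b', 'c', 'd']

    # A hypothetical multiple-separator partition, for comparison:
    # split at the first occurrence of any separator character.
    def multi_partition(s, seps):
        hits = [s.find(c) for c in seps if c in s]
        if not hits:
            return (s, "", "")
        i = min(hits)
        return (s[:i], s[i], s[i + 1:])

    print(multi_partition(s, separators))
    # -> ('a', ';', 'b,c|d')

Steve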