
James Addison via Python-ideas writes:
> On Sun, 8 Jan 2023 at 08:32, Stephen J. Turnbull <stephenjturnbull@gmail.com> wrote:
>
> Trying to avoid the usual discussions about permissive parsing /
> supporting various implementations in the wild: long-term, the least
> ambiguous and most computationally efficient environment would
> probably want to reduce special cases like that?  (Both in-data and
> in-code.)
That's not very human-friendly, though. Push that to extremes and you get XML. "Nobody expects the XML Validators!"

> Structural pattern matching _seems_ like it could apply here, in
> terms of selecting appropriate arguments -- but it is, as I
> understand it, limited to at most one wildcard (starred) subpattern
> per sequence pattern (by sensible design).
If I understand what you mean by "structural pattern matching", that seems more appropriate to parsing already tokenized input.
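
For instance (Python 3.10+), dispatching on a token list -- a minimal,
hypothetical sketch, which also shows the limit you mention, that a
sequence pattern allows only one starred wildcard:

    def handle(tokens):
        match tokens:
            # Fixed-length sequence pattern: no wildcard needed.
            case ["GET", path]:
                return f"get {path}"
            # At most one starred subpattern is allowed per sequence
            # pattern; a second *name here would be a SyntaxError.
            case ["SET", key, *values]:
                return f"set {key} = {values}"
            case _:
                return "unrecognized"

    print(handle("SET color red green".split()))
    # -> set color = ['red', 'green']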

> I suppose an analysis (which I don't have the ability to perform
> easily) could determine how many regular-expression call sites could
> be migrated, compatibly and beneficially, to a str.partition that
> accepts multiple separator arguments.

My guess is that for re.match (or re.search) it would be relatively
few.  People tend to reach for regular expressions when they have
repetition or alternatives that they want to capture in a single
expression, and that is generally not going to be easy to express with
str.partition.

But I bet that *many* calls to re.split take regular expressions of
the form f'[{separators}]', which would be easy enough to search for.
That's where you could reduce the number of regexps.
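
For example (an untested sketch; multi_partition below is
hypothetical, not an existing str method):

    import re

    separators = ";,|"
    s = "a;b,c|d"

    # The form above: one character class built from the separators.
    # re.escape guards against characters that are special inside a
    # character class, such as ']' or '-'.
    print(re.split(f"[{re.escape(separators)}]", s))
    # -> ['a', 'b', 'c', 'd']

    # A hypothetical multiple-separator partition, for comparison:
    # split at the first occurrence of any separator character.
    def multi_partition(s, seps):
        hits = [s.find(c) for c in seps if c in s]
        if not hits:
            return (s, "", "")
        i = min(hits)
        return (s[:i], s[i], s[i + 1:])

    print(multi_partition(s, separators))
    # -> ('a', ';', 'b,c|d')

Steve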