In regular expressions, a ? placed after a quantifier makes it "non-greedy", so it matches the shortest possible text instead of the longest:

import re
str_ = "a:b:c"
assert re.match(r'(.*):(.*)', str_).groups() == ("a:b", "c")
assert re.match(r'(.*?):(.*)', str_).groups() == ("a", "b:c")

Typically, parsing issues are debugged by testing a function's return value, not by inspecting changes to locals().
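
As a minimal sketch of that style, assuming a hypothetical helper named split_pair (it is not part of re or parse), the parsed fields come back as an ordinary return value that a test can compare directly:

```python
import re

def split_pair(text, greedy=False):
    """Split text on a colon into fields x and y, returned as a dict.

    Hypothetical helper for illustration only; returns None on no match.
    """
    # Only the first field's quantifier changes between the two readings.
    first = r'(?P<x>.*)' if greedy else r'(?P<x>.*?)'
    match = re.fullmatch(first + r':(?P<y>.*)', text)
    return match.groupdict() if match else None

# Both readings of the ambiguous input are visible in the return value.
assert split_pair("a:b:c") == {"x": "a", "y": "b:c"}
assert split_pair("a:b:c", greedy=True) == {"x": "a:b", "y": "c"}
assert split_pair("no colon here") is None
```

Nothing in the caller's namespace is touched, so a failing parse shows up as a wrong or missing return value rather than as a name that silently was not rebound.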

The parse library defaults to case-insensitive, non-greedy (shortest) matching:

> parse() will always match the shortest text necessary (from left to right) to fulfil the parse pattern, so for example:

> >>> pattern = '{dir1}/{dir2}'
> >>> data = 'root/parent/subdir'
> >>> sorted(parse(pattern, data).named.items())
> [('dir1', 'root'), ('dir2', 'parent/subdir')]

> So, even though {'dir1': 'root/parent', 'dir2': 'subdir'} would also fit the pattern, the actual match represents the shortest successful match for dir1.

https://github.com/r1chardj0n3s/parse#potential-gotchas
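
The shortest-match behaviour quoted above can be approximated with the standard library alone. This is only a rough sketch, not how parse is actually implemented: template_to_regex and simple_parse are hypothetical names, only named fields that are valid identifiers are handled, and format specs such as :d are ignored.

```python
import re
from string import Formatter

def template_to_regex(template, greedy=False):
    """Compile a '{name}'-style template into a regex with named groups.

    Hypothetical helper: literal text is escaped, each named field
    becomes a (non-greedy by default) named group.
    """
    quantifier = '.*' if greedy else '.*?'
    parts = []
    for literal, field, _spec, _conversion in Formatter().parse(template):
        parts.append(re.escape(literal))
        if field is not None:
            parts.append('(?P<%s>%s)' % (field, quantifier))
    return re.compile(''.join(parts))

def simple_parse(template, text, greedy=False):
    """Return a dict of field values, or None if text does not fit."""
    match = template_to_regex(template, greedy).fullmatch(text)
    return match.groupdict() if match else None

# Shortest match for the first field, as in the parse documentation.
assert simple_parse('{dir1}/{dir2}', 'root/parent/subdir') == {
    'dir1': 'root', 'dir2': 'parent/subdir'}

# The greedy reading gives the other valid interpretation.
assert simple_parse('{dir1}/{dir2}', 'root/parent/subdir', greedy=True) == {
    'dir1': 'root/parent', 'dir2': 'subdir'}
```

Making the choice an explicit argument keeps the ambiguity visible at the call site instead of burying it in the semantics of an assignment statement.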

https://github.com/r1chardj0n3s/parse#format-specification :

> Note: attempting to match too many datetime fields in a single parse() will currently result in a resource allocation issue. A TooManyFields exception will be raised in this instance. The current limit is about 15. It is hoped that this limit will be removed one day.


On Sat, Sep 19, 2020, 1:00 PM Rob Cliffe via Python-ideas <python-ideas@python.org> wrote:
Parsing can be ambiguous:
     f"{x}:{y}" = "a:b:c"
Does this set
     x = "a"
     y = "b:c"
or
     x = "a:b"
     y = "c"
Rob Cliffe

On 17/09/2020 05:52, Dennis Sweeney wrote:
> TL;DR: I propose the following behavior:
>
>      >>> s = "She turned me into a newt."
>      >>> f"She turned me into a {animal}." = s
>      >>> animal
>      'newt'
>
>      >>> f"A {animal}?" = s
>      Traceback (most recent call last):
>      File "<pyshell#2>", line 1, in <module>
>              f"A {animal}?" = s
>      ValueError: f-string assignment target does not match 'She turned me into a newt.'
>
>      >>> f"{hh:d}:{mm:d}:{ss:d}" = "11:59:59"
>      >>> hh, mm, ss
>      (11, 59, 59)
>
> === Rationale ===
>
> Part of the reason I like f-strings so much is that they reduce the
> cognitive overhead of reading code: they allow you to see *what* is
> being inserted into a string in a way that also effortlessly shows
> *where* in the string the value is being inserted. There is no need to
> "paint-by-numbers" and remember which variable is {0} and which is {1}
> in an unnecessary extra layer of indirection. F-strings allow string
> formatting that is not only intelligible, but *locally* intelligible.
>
> What I propose is the inverse feature, where you can assign a string
> to an f-string, and the interpreter will maintain an invariant kept
> in many other cases:
>
>      >>> a[n] = 17
>      >>> a[n] == 17
>      True
>
>      >>> obj.x = "foo"
>      >>> obj.x == "foo"
>      True
>
>      # Proposed:
>      >>> f"It is {hh}:{mm} {am_or_pm}" = "It is 11:45 PM"
>      >>> f"It is {hh}:{mm} {am_or_pm}" == "It is 11:45 PM"
>      True
>      >>> hh
>      '11'
>
> This could be thought of as analogous to the C language's scanf
> function, something I've always felt was just slightly lacking in
> Python. I think such a feature would more clearly allow readers of
> Python code to answer the question "What kinds of strings are allowed
> here?". It would add certainty to programs that accept strings,
> confirming early that the data you have is the data you want.
> The code reads like a specification that beginners can understand in
> a blink.
>
>
> === Existing way of achieving this ===
>
> As of now, you could achieve the behavior with regular expressions:
>
>      >>> import re
>      >>> pattern = re.compile(r'It is (.+):(.+) (.+)')
>      >>> match = pattern.fullmatch("It is 11:45 PM")
>      >>> hh, mm, am_or_pm = match.groups()
>      >>> hh
>      '11'
>
> But this suffers from the same paint-by-numbers, extra-indirection
> issue that old-style string formatting runs into, an issue that
> f-strings improve upon.
>
> You could also do a strange mishmash of built-in str operations, like
>
>      >>> s = "It is 11:45 PM"
>      >>> empty, rest = s.split("It is ")
>      >>> assert empty == ""
>      >>> hh, rest = rest.split(":")
>      >>> mm, am_or_pm = rest.split(" ")
>      >>> hh
>      '11'
>
> But this is 5 different lines to express one simple idea.
> How many different times have you written a micro-parser like this?
>
>
> === Specification (open to bikeshedding) ===
>
> In general, the goal would be to pursue the assignment-becomes-equal
> invariant above. By default, assignment targets within f-strings would
> be matched as strings. However, adding in a format specifier would
> allow the matches to be evaluated as different data types, e.g.
> f'{foo:d}' = "1" would make foo become the integer 1. If a more complex
> format specifier was added that did not match anything that the
> f-string could produce as an expression, then we'd still raise a
> ValueError:
>
>      >>> f"{x:.02f}" = "0.12345"
>      Traceback (most recent call last):
>      File "<pyshell#2>", line 1, in <module>
>              f"{x:.02f}" = "0.12345"
>      ValueError: f-string assignment target does not match '0.12345'
>
> If we're feeling adventurous, one could turn the !r repr flag in a
> match into an eval() of the matched string.
>
> The f-string would match with the same eager semantics as regular
> expressions, backtracking when a match is not made on the first
> attempt.
>
> Let me know what you think!
> _______________________________________________
> Python-ideas mailing list -- python-ideas@python.org
> To unsubscribe send an email to python-ideas-leave@python.org
> https://mail.python.org/mailman3/lists/python-ideas.python.org/
> Message archived at https://mail.python.org/archives/list/python-ideas@python.org/message/JEGSKODAK5MCO2HHUF4555JZPZ6SKNEC/
> Code of Conduct: http://python.org/psf/codeofconduct/
Message archived at https://mail.python.org/archives/list/python-ideas@python.org/message/CVPRH5MEEUV2HPP4QOSZQDGQ6CWAXCY7/