On Fri, Sep 18, 2020 at 10:53:57AM +1000, Chris Angelico wrote:
> On Fri, Sep 18, 2020 at 10:51 AM Steven D'Aprano wrote:
> > On Thu, Sep 17, 2020 at 11:09:35PM +1000, Chris Angelico wrote:
> > > I've frequently yearned for an sscanf-like feature in Python. Usually I end up longhanding it with string methods, or else reaching for a regex, but neither of those is quite what I want. I'd prefer scanf notation to format strings, but either is acceptable.
> >
> > Why make this a syntactic feature when a scanf function would do?
>
> Because a scanf function can't assign directly. In fact, it's the exact same issue that led to f-strings in the first place: there's no reliable way to embed the names into the format string without a lot of redundancy.

But that's a *separate problem*. Regexes can't assign directly either. And we wouldn't want them to! (It's okay for a regex to have its own internal namespace, like named groups, but it shouldn't leak out into the locals or globals.)

Extracting data from a string, as scanf, regexes, sed, awk, SNOBOL etc. do, sounds like a big win. Assignment should be a separate problem.

And at last I think I have thought of a use of dict unpacking I like. If our scanf(pattern, target) function returns a dict of {name: value} pairs, how do we apply it to locals?

    target [, names] = **scanf(pattern, target)

where dict assignment matches assignment targets on the left with keys in the dict. Acceptable target names are simple identifiers, dotted names and subscripts:

    spam, eggs.fried, cheese[0] = **{'cheese[0]': 3, 'spam': 1, 'eggs.fried': 2}

would do the obvious assignments. (I could live without the dotted names and subscripts, if people don't like the additional complexity.) A target with no matching key, or a key with no matching target, would raise an exception.

The bottom line here is that separation of concerns is a principle we should follow. Text scanning and assignment are two distinct problems and we should keep them distinct. This will allow us to pre-process the pattern we want to match, and post-process the results of the scan, e.g.

    spam, eggs, cheese = **(defaults | scanf(pattern, string))

We could have multiple scanners too: anything that returned a dict of target names and values would work. We wouldn't need to build the scanner into the interpreter, only the assignment syntax. The scanner itself is just a function.

-- 
Steve
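
To make the split between scanning and assignment concrete, here is a rough sketch in today's Python. Both helpers are hypothetical: `scanf` here is only a thin wrapper over `re` named groups rather than real scanf notation, and `unpack` stands in for the proposed `= **mapping` syntax, so it has to repeat the target names as strings. The `|` on dicts assumes Python 3.9 or later.

```python
import re


def scanf(pattern, string):
    """Hypothetical scanner: return a {name: value} dict of the named groups.

    Uses regex syntax, not scanf notation; any function that returns
    a dict of names and values could take its place.
    """
    match = re.fullmatch(pattern, string)
    if match is None:
        raise ValueError(f"{string!r} does not match {pattern!r}")
    return match.groupdict()


def unpack(mapping, *names):
    """Stand-in for the proposed `a, b = **mapping` assignment.

    Strict, as described above: a name with no key, or a key with no
    name, raises an exception.
    """
    missing = [name for name in names if name not in mapping]
    unused = [key for key in mapping if key not in names]
    if missing or unused:
        raise KeyError(f"missing keys: {missing}; unused keys: {unused}")
    return tuple(mapping[name] for name in names)


# Post-process the scan result before assigning: defaults first,
# scanned values override them (dict union, Python 3.9+).
defaults = {"cheese": "cheddar"}
pattern = r"(?P<spam>\d+) (?P<eggs>\w+)"

spam, eggs, cheese = unpack(
    defaults | scanf(pattern, "42 fried"), "spam", "eggs", "cheese"
)
```

The scanner knows nothing about the caller's namespace, and `unpack` knows nothing about scanning. The repeated names in the `unpack` call are exactly the redundancy that dedicated assignment syntax would remove.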