
On 24Apr2021 22:35, Stephen J. Turnbull <turnbull.stephen.fw@u.tsukuba.ac.jp> wrote:
> Cameron Simpson writes:
> > On 23Apr2021 18:25, Stephen J. Turnbull <turnbull.stephen.fw@u.tsukuba.ac.jp> wrote:
> > > I don't understand how this is supposed to work. It looks to me like !code is a preprocessor: [...] If so,
> > >
> > >     '{x} is {x!lc:foo} in lowercase'
> > >
> > > will fail because str doesn't implement the 'foo' format code.
> >
> > Maybe we're talking about different things. In the example above, I'm talking about "lc", not "foo".
>
> I don't think so. I know you want to talk only about "lc", but I want to understand how this interacts with "foo", and why you can't use "foo" in your application. [...]
>
> > My use case is presupplied strings, eg a command line supplied format string.
>
> In that case the format string is user input, and x is a variable in the program that the user can have substituted into their string?
>
> Assuming that *exact* use case, wouldn't
>
>     >>> class LowerableStr(str):
>     ...     def __format__(self, fmt):
>     ...         if fmt == 'lc':
>     ...             return self.lower()
>     ...         else:
>     ...             return str.__format__(self, fmt)
>     ...
>     >>> "{x} is {x:lc} in lowercase".format_map({'x': LowerableStr("This")})
>     'This is this in lowercase'
>
> do?

You're perfectly correct. ":lc" can be shoehorned into doing what I ask. But __format__ is in the wrong place for how I'd like to do this.

First up, I have somehow missed this (":format_name") in the semirecursive mess which is the Python format-and-friends descriptions. (object.__format__? str.format? str.format_map? f''? the format mini-language? All in separate places, for reasonable reasons, but my head has exploded multiple times trying to stitch them together.)

On reflection, __format__ is something I may already have considered and rejected. Let me explain.

The "!r" conversion is applied _by_ the formatting, _to_ an arbitrary value. So I could write a single function for my hypothetical "!lc" which did what I want for an arbitrary object, because it would be called with the value being converted.

By contrast, the ":lc" format specifier form requires the value being formatted _itself_ to have a special __format__ method. This scales poorly when I might put almost anything into the format string.

I'm not speaking here of allowing an end user to inject arbitrary access code into my programme via a format string. Rather, the format strings I'm using via .format_map(mapping) are given a mapping which is a pretty rich view of an almost arbitrary data structure I've made available for display via the format string.

In case you care, my primary use case is a tag library _with_ an ontology, where tag values can be arbitrary Python values. (In reality, those values need to be JSON renderable just now, since they land in text files or database JSON blobs when persisted.)

Anyway, the format string lets me write formats like:

    track {track_id} has artist {artist._meta.fullname}

which hops off through the ontology, or:

    # from a config or command line or default
    # I like lowercased filenames a lot
    filename_format = '{artist_lc}--{album_lc}--{track_id}--{title_lc}.mp3'
    with open(filename_format.format_map(tagset.ns()), 'wb') as f:
        ... write file data ...

The filename_format is the example where I want some kind of "lc" conversion/formatting to apply to an arbitrary value.

In the code above, tagset.ns() returns a magic subclass of SimpleNamespace which has the following properties:
- it allows mapping-like attribute access so that I can pass it to format_map()
- it has attributes/keys computed from the tags in tagset so that I can use them in the format string
- it has an elaborate __getattr__ method recognising a number of suffixes like "_lc": should there be no actual tag of that name it will find the prefix and lowercase that; for example, a tag title="My Name" gets "My Name" from {title} and "my_name" from {title_lc}
- the same __getattr__ recognises some things like _meta and returns another namespace containing metadata from the ontology, letting me say: {artist._meta.fullname}

All this is to support giving the user/config a fairly rich suite of stuff which can go in a format string using the Python format string syntax.

Regarding "!lc" vs ":lc":

The "!lc" approach: If the format syntax let one supply a mapping of conversions to functions for something like "!lc" then I could rip out a big chunk of complexity from __getattr__, because the "_lc" suffixes above are essentially a syntax hack to work around that shortcoming.

The ":lc" approach: The problem with __format__ is that it must be applied to a class. The values inside {foo.bar.zot} in the format string might be almost any type. The only way to get __format__ to do what I'd like is to wrap every such value in a proxy of some kind with a .__format__ method. In principle I can do that in my magic namespace class (from .ns() above). But that's yet another layer of complexity in something I'm already unhappy with. Hence the word "shoehorn" earlier.

So to my mind, being able to plug in a mapping of additional (or overriding) conversion specifiers would be the most natural way to improve my situation.

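For illustration only, here is a minimal sketch of what a pluggable mapping of conversions can look like with today's standard library, by subclassing string.Formatter and overriding its convert_field() hook. The class name ConversionFormatter, its conversions argument and the sample tags are all hypothetical, not anything from my tag library. The sketch uses a one-character conversion, "!l", because as far as I can tell the stock format string parser accepts only a single character after "!", so a two-character "!lc" would need parser changes. It also only helps code which goes through a Formatter instance; str.format_map() and f-strings do not consult it.

```python
from string import Formatter


class ConversionFormatter(Formatter):
    """Hypothetical Formatter taking extra "!x" conversions from a mapping."""

    def __init__(self, conversions):
        super().__init__()
        # Maps a single conversion character to a one-argument function.
        self.conversions = dict(conversions)

    def convert_field(self, value, conversion):
        # Called by the formatting machinery with whatever value the field
        # resolved to, so the value's class needs no special __format__.
        func = self.conversions.get(conversion)
        if func is not None:
            return func(value)
        # Fall back to the built-in "!s", "!r" and "!a" handling.
        return super().convert_field(value, conversion)


# Assumed sample data standing in for a tag namespace.
tags = {
    "artist": "The Band",
    "album": "Stage Fright",
    "track_id": 3,
    "title": "My Name",
}

fmt = ConversionFormatter({"l": lambda value: str(value).lower()})
filename = fmt.vformat(
    "{artist!l}--{album!l}--{track_id}--{title!l}.mp3", (), tags
)
print(filename)
```

The point of the sketch is that the lowercasing function is written once and is handed each value by the formatter, whatever its type, which is the property the ":lc" route lacks.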
Architecturally, that is where I would want my magic "lc" et al to land. I could move the magic "._meta" attribute in there as well, producing fewer "magic" attributes, etc.

That is why I'm for being able to augment the "!conversion" stuff.

Cheers,
Cameron Simpson <cs@cskk.id.au>