[Python-ideas] new format spec for iterable types

Wed Sep 9 16:02:27 CEST 2015

At some point, instead of complicating how format works internally, you
should just write a function that does what you want. I realize there's
a continuum between '{}'.format(iterable) and
'{<really-really-complex-stuff>}'.format(iterable). It's not clear where
to draw the line. But when the solution is to bake knowledge of
iterables into .format(), I think we've passed the point where we should
switch to a function: '{}'.format(some_function(iterable)).

In any event, if you want to play with this, I suggest you write
some_function(iterable) that does what you want, first.
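
As a starting point, a minimal sketch of such a helper might look like
this. The name join_formatted and its sep/fmt parameters are illustrative
assumptions only, not something proposed in this thread:

```python
def join_formatted(iterable, sep=", ", fmt=""):
    """Format each element with fmt, then join the results with sep.

    Hypothetical helper: the name and signature are assumptions made
    for illustration.
    """
    return sep.join(format(element, fmt) for element in iterable)


# The helper returns an ordinary str, so an unmodified '{}' field can use it.
line = "List: {}".format(join_formatted([1.2, 3.4, 5.6], sep=", ", fmt=".2f"))
print(line)  # List: 1.20, 3.40, 5.60
```

Since the joining happens before .format() sees the value, nothing in
str.format itself needs to know about iterables.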

Eric.

On 9/9/2015 9:41 AM, Wolfgang Maier wrote:
> Thanks for all the feedback!
> 
> Just to summarize ideas and to clarify what I had in mind when proposing
> this:
> 
> 1)
> Yes, I would like to have this work with any (or at least most)
> iterables, not just with my own custom type that I used for illustration.
> So having this handled by the format method rather than each object's
> __format__ method could make sense. It was just simple to implement it
> in Python through the __format__ method.
> 
> Why did I propose * as the first character of the new format spec string?
> Because I think you really need some token to state unambiguously[1]
> that what follows is a format specification that involves going through
> the elements of the iterable instead of working on the container object
> itself. I thought that * is most intuitive to understand because of its
> use in unpacking.
> 
> [1] unfortunately, in my original proposal the leading * can still be
> ambiguous because *<, *>, *= and *^ could mean element joining with <, >,
> = or ^ as separators or aligning of the container's formatted string
> representation using * as the fill character.
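
The ambiguity in [1] can be seen with the existing format spec
mini-language, where a character followed by an alignment symbol is
already read as fill-plus-align. A small illustration (the !s conversion
is used here only so that a list can take an alignment spec):

```python
# Today, "*<" already means: pad with '*' and left-align.
padded = "{:*<8}".format("abc")
print(padded)  # abc*****

# Likewise "*^" means: pad with '*' and center.
centered = "{:*^7}".format("abc")
print(centered)  # **abc**

# Applied to a container's string representation (via !s):
container = "{!s:*<10}".format([1, 2])
print(container)  # [1, 2]****
```

A spec-level * followed by <, >, = or ^ therefore cannot be told apart
from an existing fill/align request.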
> 
> 
> Ideally, the * should be the very first thing inside a replacement field
> - pretty much as suggested by Oscar - and should not be part of the
> format spec. This is not feasible through a format spec handled by the
> __format__ method, but through a modified str.format method, i.e.,
> that's another argument for this approach. Examples:
> 
> 'foo {*name:<sep>} bar'.format(name=<expr>)
> 'foo {*0:<sep>} bar {1}'.format(x, y)
> 'foo {*:<sep>} bar'.format(x)
> 
> 
> 2)
> As for including an additional format spec to apply to the elements of
> the iterable:
> I decided against including this in the original proposal to keep it
> simple and to get feedback on the general idea first.
> The problem here is that any solution requires an additional token to
> indicate the boundary between the <separator> part and the element
> format spec. Since you would not want to have anyone's custom format
> spec broken by this, this boils down to disallowing one reserved
> character in the <separator> part, like in Oscar's example:
> 
> 'foo {*name:<sep>:<fmt>} bar'.format(name=<expr>)
> 
> where <sep> cannot contain a colon.
> 
> So that character would have to be chosen carefully (both : and | are
> quite readable, but also relatively common element separators I guess).
> In addition, the <separator> part should be non-optional (though the
> empty string should be allowed) to guarantee the presence of the
> delimiter token, which avoids accidental splitting of lonely element
> format specs into a <sep> and a <fmt> part:
> 
> # format the elements of name using <fmt>, join them using <sep>
> 'foo {*name:<sep>:<fmt>} bar'.format(name=<expr>)
> # format the elements of name using <fmt>, join them using ''
> 'foo {*name::<fmt>} bar'.format(name=<expr>)
> # a syntax error
> 'foo {*name:<fmt>} bar'.format(name=<expr>)
> 
> On the other hand, these restrictions do not look too dramatic given the
> flexibility gain in most situations.
> 
> So to sum up how this could work:
> If str.format encounters a leading * in a replacement field, it splits
> the format spec (i.e. everything after the first colon) on the first
> occurrence of the <sep>|<fmt> separator (possibly ':' or '|') and does,
> essentially:
> 
> <sep>.join(format(e, <fmt>) for e in iterable)
> 
> Without the *, it just works the current way.
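
The behaviour summarized above can be prototyped without touching
str.format, by subclassing string.Formatter. What follows is only a
sketch under stated assumptions: the names StarFormatter and _Starred are
hypothetical, ':' is picked as the <sep>/<fmt> delimiter, a missing
delimiter raises ValueError (standing in for the "syntax error" case),
and only explicitly named or numbered fields are handled, so the
auto-numbered '{*:...}' form is not covered.

```python
import string


class _Starred:
    """Marker wrapping a value whose field name started with '*'."""

    def __init__(self, value):
        self.value = value


class StarFormatter(string.Formatter):
    """Hypothetical prototype of the proposed '{*name:<sep>:<fmt>}' field."""

    delimiter = ":"  # assumed <sep>/<fmt> boundary token

    def get_field(self, field_name, args, kwargs):
        # Strip the leading '*' and remember that it was there.
        if field_name.startswith("*"):
            obj, used_key = super().get_field(field_name[1:], args, kwargs)
            return _Starred(obj), used_key
        return super().get_field(field_name, args, kwargs)

    def format_field(self, value, format_spec):
        # Without the '*', formatting works the current way.
        if not isinstance(value, _Starred):
            return super().format_field(value, format_spec)
        # Split the spec on the first occurrence of the delimiter.
        sep, found, fmt = format_spec.partition(self.delimiter)
        if not found:
            raise ValueError("starred field requires '<sep>:<fmt>'")
        return sep.join(format(element, fmt) for element in value.value)


formatter = StarFormatter()
print(formatter.format("foo {*0:, :.2f} bar {1}", [1.2, 3.4], "y"))
# foo 1.20, 3.40 bar y
print(formatter.format("foo {*name::>3} bar", name=[1, 2]))
# foo   1  2 bar
```

Because the delimiter is mandatory, a lone element spec such as
'{*name:.2f}' is rejected rather than silently treated as a separator.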
> 
> 
> 3)
> Finally, the alternative idea of having the new functionality handled by
> a new !converter, like:
> 
> "List: {0!j:,}".format([1.2, 3.4, 5.6])
> 
> I considered this idea before posting the original proposal, but, in
> addition to requiring a change to str.format (which would need to
> recognize the new token), this approach would need either:
> 
> - a new special method (e.g., __join__) to be implemented for every type
> that should support it, which is worse than for my original proposal or
> 
> - the str.format method to react directly to the converter flag, which
> is then no different from the above solution, except that it uses !j
> instead of *. Personally, I find the * syntax more readable, plus, the !j syntax
> would then suggest that this is a regular converter (calling a special
> method of the object) when, in fact, it is not.
> Please correct me if I misunderstood something about this alternative
> proposal.
> 
> Best,
> Wolfgang