[Python-3000] PEP - string.format

Ian Bicking ianb at colorstudy.com
Sat Apr 22 23:18:23 CEST 2006


Talin wrote:
>> Thus you can't nest formatters, e.g., {0:pad(23):xmlquote}, unless the 
>> underlying object understands that.  Which is probably unlikely.
> 
> At this point, I'm thinking not, although I could be convinced otherwise.
> Remember that you can accomplish all of the same things by processing the input
> arguments; the conversion specifiers are a convenience.
> 
> Also, in your model, there would be a distinction between the first specifier
> (which converts the object to a string), and subsequent ones (which modify the
> string). My complexity senses are tingling...

I would assume that any formatting can produce any object, and only at 
the end will the object (if necessary) be converted to a string with str().
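
A minimal sketch of that idea, assuming each specifier maps to a plain
callable. The names pad, xmlquote and apply_chain are hypothetical and
are not part of the PEP:

```python
from xml.sax.saxutils import escape


def pad(width):
    """Return a step that left-justifies its input to the given width."""
    def apply(value):
        return str(value).ljust(width)
    return apply


def xmlquote(value):
    """Escape &, < and > in the string form of value."""
    return escape(str(value))


def apply_chain(value, steps):
    """Run value through each step in order.

    A step may return any object; str() is applied only at the very end,
    and only if the final result is not already a string.
    """
    for step in steps:
        value = step(value)
    return value if isinstance(value, str) else str(value)


# The first step returns a float, not a string; str() happens at the end.
print(apply_chain(3.14159, [lambda v: round(v, 2)]))   # 3.14
print(repr(apply_chain("a<b", [xmlquote, pad(8)])))    # 'a&lt;b  '
```

With this shape there is no distinction between the first specifier and
the later ones; every step just takes an object and returns an object.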


>>       3) Otherwise, check the internal formatter within
>>          string.format that contains knowledge of certain builtin
>>          types.
>>
>> If it is a language change, could all those types have __format__ 
>> methods added?  Is there any way for the object to accept or decline to 
>> do formatting?
> 
> Good question. I suspect that it may be impractical to add __format__ to all
> built-in types, so we should plan to allow a fallback to an internal formatter.

Yeah, on further thought, while this would be possible for py3k, some 
form of this could usefully be done as a module before that.

>>       4) Otherwise, call str() or unicode() as appropriate.
>>
>> Is there a global repr() formatter, like %r?  Potentially {0:repr} could 
>> be implemented the same way by convention, including in object.__format__?
> 
> Good idea. (Should there be a *global* custom formatter? Plugins? Subject of a
> separate PEP I think.)

I don't think there should be any modifiable global formatter.  As an 
implementation detail there may be one, but one module shouldn't be able 
to change the way formatting works for everyone.

But repr() is something of a special case, because it's so widely 
applicable.
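
As a rough sketch of the "by convention" approach, assuming the
__format__ hook receives the specifier string as its only argument.
ReprAware and Name are hypothetical classes used only for illustration:

```python
class ReprAware:
    """Hypothetical base class: the specifier 'repr' means "use repr()"."""

    def __format__(self, spec):
        if spec == "repr":
            return repr(self)
        # Anything else falls through to the default behaviour.
        return super().__format__(spec)


class Name(ReprAware):
    def __init__(self, value):
        self.value = value

    def __repr__(self):
        return "Name(%r)" % self.value

    def __str__(self):
        return self.value


print("{0}".format(Name("x")))        # x
print("{0:repr}".format(Name("x")))   # Name('x')
```

Nothing global is modified here; the convention lives entirely in the
object's own __format__ method.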

>>      The formatter should examine the type of the object and the
>>      specifier string, and decide whether or not it wants to handle
>>      this field. If it decides not to, then it should return False
>>      to indicate that the default formatting for that field should be
>>      used; otherwise, it should call builder.append() (or whatever
>>      is the appropriate method) to concatenate the converted value
>>      to the end of the string, and return True.
>>
>> Well, I guess this is the use case, but it feels a bit funny to me.  A 
>> concrete use case would be appreciated.
> 
> The main use case was that the formatter might need to examine the part of the
> string that's already been built. For example, it can't handle expansion of tabs
> unless it knows the current column index. I had originally planned to pass only
> the column index, but that seemed too special-case to me.

Hmm... so the tab-aligning formatter would look for tab alignment 
formatting specifications?  An actual implementation of a tab aligning 
custom formatter would probably make this easier to think about.
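
Something like the following might be a starting point. It is only a
sketch under several assumptions: the builder is taken to be a plain
list of string fragments, the specifier "tab" is made up, and tab stops
are assumed to fall every 8 columns:

```python
TAB_WIDTH = 8  # assumed distance between tab stops


def tab_formatter(value, spec, builder):
    """Hypothetical custom formatter following the True/False protocol.

    Handles only the made-up specifier 'tab'; returns False for anything
    else so that default formatting is used for that field.
    """
    if spec != "tab":
        return False
    # Work out the current column from the text built so far.
    built = "".join(builder)
    column = len(built) - (built.rfind("\n") + 1)
    # Pad with spaces up to the next tab stop, then append the value.
    builder.append(" " * (TAB_WIDTH - column % TAB_WIDTH))
    builder.append(str(value))
    return True


builder = ["ab"]
tab_formatter("x", "tab", builder)
print(repr("".join(builder)))   # 'ab      x'
```

The only thing the formatter reads from the builder is the column, which
is the part that cannot be computed from the input arguments alone.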

>>      A fairly high degree of convenience for relatively small risk can
>>      be obtained by supporting the getattr (.) and getitem ([])
>>      operators.  While it is certainly possible that these operators
>>      can be overloaded in a way that a maliciously written string could
>>      exploit their behavior in nasty ways, it is fairly rare that those
>>      operators do anything more than retargeting to another container.
>>      On the other hand, the ability of a string to execute function
>>      calls would be quite dangerous by comparison.
>>
>> It could be a keyword option to enable this.  Though all the keywords 
>> are kind of taken.  This itself wouldn't be an issue if ** wasn't going 
>> to be used so often.
> 
> The keywords are all taken - but there are still plenty of method names
> available :) That's why "fformat" has a different method name, so that we can
> distinguish the custom formatter parameter from the rest of the params.
> 
> Unfortunately, this can't be used too much, or you get a combinatorial explosion
> of method names:
> 
>    string.format
>    string.fformat
>    string.format_dict
>    string.fformat_dict
>    ...

Yeah... that's not so pretty ;).  Things like .update() and even % 
handle both cases without too much ambiguity.
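
For illustration, a single entry point in the style of dict.update()
could accept a mapping, keyword arguments, or both. This is a sketch,
not a proposed signature; format_fields is a hypothetical name, and the
positional-only marker (/) needs a Python version that supports it:

```python
def format_fields(template, mapping=None, /, **kwargs):
    """Format template from an optional mapping plus keyword arguments.

    Like dict.update(), keyword arguments override entries that also
    appear in the mapping.
    """
    fields = dict(mapping or {})
    fields.update(kwargs)
    return template.format_map(fields)


print(format_fields("{a}-{b}", {"a": 1}, b=2))   # 1-2
print(format_fields("{name}", name="spam"))      # spam
```

Because the mapping is positional-only, a field that happens to be
called "mapping" can still be passed as a keyword.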

>>      One other thing that could be done to make the debugging case
>>      more convenient would be to allow the locals() dict to be omitted
>>      entirely.  Thus, a format function with no arguments would instead
>>      use the current scope as a dictionary argument:
>>
>>          print "Error in file {p.file}, line {p.line}".format()
>>
>>      An alternative would be to dedicate a special method name, other
>>      than 'format' - say, 'interpolate' or 'lformat' - for this
>>      behavior.
>>
>> It breaks some conventions to have a method that looks into the parent 
>> frame, but the use cases for this are very strong.  Also, if attribute 
>> access were a keyword argument, it could potentially be turned on by 
>> default when using the form that pulls from locals().
> 
> To be honest, I'd be willing to drop this whole part of the proposal if that's
> what the folks here would like. I like to present all options, but that doesn't
> mean that I myself am in favor of all of them.
> 
> I realize that there are some use cases for it, but I don't know if the use
> cases are significantly better.

It's not just *some* use cases -- a substantial fraction of all string 
interpolation falls into this use case.

Really, it means that it makes sense to write a sample implementation 
and try rewriting some stdlib modules with it, with and without special 
treatment for locals().
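
A quick sketch of what the locals() variant might look like as a plain
function, for experimenting. lformat is the hypothetical name floated in
the PEP text; sys._getframe() is a CPython implementation detail, so
this is an assumption about the platform rather than a portable design:

```python
import sys
from types import SimpleNamespace


def lformat(template):
    """Format template using the local variables of the calling frame."""
    caller = sys._getframe(1)
    # Copy the caller's locals so the template sees a plain dict.
    return template.format_map(dict(caller.f_locals))


def report():
    p = SimpleNamespace(file="spam.py", line=12)
    return lformat("Error in file {p.file}, line {p.line}")


print(report())   # Error in file spam.py, line 12
```

Note that this version always allows attribute access in the template,
which is the combination that seems most useful for the debugging case.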


-- 
Ian Bicking  |  ianb at colorstudy.com  |  http://blog.ianbicking.org

