[Python-3000] More PEP 3101 changes incoming

Thu Aug 9 09:31:37 CEST 2007

Ron Adam wrote:
> Talin wrote:
>> Ron Adam wrote:
>>> Now here's the problem with all of this.  As we add the widths back 
>>> into the format specifications, we are basically saying the idea of a 
>>> separate field width specifier is wrong.
>>>
>>> So maybe it's not really a separate independent thing after all, and 
>>> it's just a convenient grouping for readability purposes only.
>>
>> I'm beginning to suspect that this is indeed the case.
> 
> Yes, I believe so even more after experimenting last night with 
> specifier objects.
> 
> For now I'm using ','s for separating *all* the terms.  I don't intend 
> that to be used for a final version, but for now it makes parsing 
> the terms and getting the behavior right much easier.
> 
>      f,.3,>7     right justify in field width 7, with 3 decimal places.
> 
>      s,^10,w20   Center in field 10, expands up to width 20.
> 
>      f,.3,%
> 
> This allows me to just split on ',' and experiment with ordering and see 
> how some terms might need to interact with other terms and how to do 
> that without having to fight the syntax problem for now.
> 
> Later the syntax can be compressed and tested with a fairly complete 
> doctest as a separate problem.
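
For reference, the split-on-',' experiment quoted above could be
sketched roughly as follows. This is only my reading of the three
examples, not Ron's actual code: the function name split_terms, the
dictionary keys, and the interpretation of 'w20' as a maximum width
are all assumptions.

```python
def split_terms(spec):
    """Split an experimental comma-separated spec into named terms.

    Hypothetical helper: the key names and the reading of each term
    are assumptions based on the three quoted examples.
    """
    terms = {}
    for term in spec.split(","):
        term = term.strip()
        if not term:
            continue
        if term[0] in "<>^":
            # Alignment character followed by the field width, e.g. '>7'.
            terms["align"] = term[0]
            terms["width"] = int(term[1:])
        elif term[0] == ".":
            # Number of decimal places, e.g. '.3'.
            terms["precision"] = int(term[1:])
        elif term[0] == "w" and term[1:].isdigit():
            # Assumed to be the width the field may expand up to, e.g. 'w20'.
            terms["max_width"] = int(term[1:])
        elif term == "%":
            terms["percent"] = True
        else:
            # Anything else is treated as the type code, e.g. 'f' or 's'.
            terms["type"] = term
    return terms
```

Because each term is recognized by its first character, the terms can
be reordered freely, which is the property that makes this form handy
for experimenting.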

When you get a chance, can you write down your current thinking in a 
single document? Right now, there are lots of suggestions scattered in a 
bunch of different messages, some of which have been superseded, and 
it's hard to sew them together.

At this point, I think that as far as the mini-language goes, after 
wandering far afield from the original PEP we have arrived at a design 
that's not very far - at least semantically - from what we started with. 
In other words, other than the special case of 'repr', we find that 
pretty much everything can fit into a single specifier string. Attempts 
to break it up into two independent specifiers that are handled by two 
different entities run into the problem that the specifiers aren't 
independent and there are interactions between the two. Because the 
dividing line between "format specifier" and "alignment specifier" 
changes based on the type of data being formatted, trying to keep them 
separate results in redundancy and duplication, where we end up with 
more than one way to specify padding, alignment, or minimum width.

So I'm tempted to just use what's in the PEP now as a starting point - 
perhaps re-arranging the order of attributes, as has been discussed, or 
perhaps not - and then handling 'repr' via a different prefix character 
other than ':'. The 'repr' flag does nothing more than call __repr__ on 
the object, and then call __format__ on the result using whatever 
conversion spec was specified. (There might be a similar flag that does 
a call to __str__, which has the effect of calling str.__format__ 
instead of the object's native __format__ function.)
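
A minimal sketch of that flag behavior, assuming str, int and float
each provide a __format__ method that accepts a conversion spec. The
function name format_with_flag and its 'flag' argument are
hypothetical, and the spec strings in the comments are placeholders,
since the mini-language syntax is still open.

```python
def format_with_flag(value, spec, flag=None):
    """Apply an optional 'repr' or 'str' flag, then format the result.

    Hypothetical helper: the name and the 'flag' argument are
    illustrative, not part of the PEP.
    """
    if flag == "repr":
        # Convert with repr() first, then format the resulting string.
        return repr(value).__format__(spec)
    if flag == "str":
        # Convert with str() first, so str.__format__ handles the spec
        # instead of the object's native __format__.
        return str(value).__format__(spec)
    # No flag: the object's own __format__ interprets the spec.
    return value.__format__(spec)
```

With a width-only spec such as "5", format_with_flag(42, "5") lets
int.__format__ decide the layout, while format_with_flag(42, "5",
"str") hands the same spec to str.__format__, so the two calls can
align the result differently.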

As for requiring the different built-in versions of __format__ to 
parse the standard conversion specifier, that is not a problem 
in practice, as we'll have a little mini-parser that parses the 
conversion spec and fills in a C struct. There will also be a 
Python-accessible version of the same thing for people extending 
formatters in Python.
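
As a rough illustration of what the Python-accessible version might
look like, here is a sketch of a parser that fills in a small record.
The grammar it accepts, [[fill]align][width][.precision][type], is an
assumption made for the example, not the agreed syntax, and the names
ParsedSpec and parse_spec are hypothetical.

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass
class ParsedSpec:
    """Hypothetical Python analogue of the C struct the parser fills in."""
    fill: str = " "
    align: Optional[str] = None
    width: Optional[int] = None
    precision: Optional[int] = None
    type: Optional[str] = None


# Assumed grammar, for illustration only:
#   [[fill]align][width][.precision][type]
_SPEC_RE = re.compile(
    r"(?:(?P<fill>.)?(?P<align>[<>^]))?"
    r"(?P<width>\d+)?"
    r"(?:\.(?P<precision>\d+))?"
    r"(?P<type>[a-zA-Z%])?"
)


def parse_spec(spec):
    """Parse a conversion spec string into a ParsedSpec record."""
    m = _SPEC_RE.fullmatch(spec)
    if m is None:
        raise ValueError("invalid conversion spec: %r" % (spec,))
    width = m.group("width")
    precision = m.group("precision")
    return ParsedSpec(
        fill=m.group("fill") or " ",
        align=m.group("align"),
        width=int(width) if width else None,
        precision=int(precision) if precision else None,
        type=m.group("type"),
    )
```

A __format__ method written in Python could then call parse_spec once
and read the fields it cares about, rather than re-implementing the
parsing in every type.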

So, the current action items are:

1) Get consensus on the syntax of the formatting mini-language.

2) Create a pure-python implementation of the global 'format' function, 
which will be a new standard library function that formats a single 
value, given a conversion spec:

    format(value, conversion)

3) Write implementations of str.__format__, int.__format__, 
float.__format__, decimal.__format__ and so on.

4) Create C implementations of the above.

5) Write the code for complex, multi-value formatting as specified in 
the PEP, and hook up to the built-in string class.
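
To make items 2 and 3 a little more concrete, here is a minimal sketch
of how the single-value function could delegate to __format__. It is
named format_value here only to keep the example self-contained; the
Celsius class is a made-up example type, and the type check on the
result is my own assumption rather than something the PEP requires.

```python
def format_value(value, conversion=""):
    """Format a single value by delegating to its type's __format__.

    Sketch of the proposed format(value, conversion) function; the
    name format_value is hypothetical.
    """
    # Look the method up on the type, as is done for other special methods.
    result = type(value).__format__(value, conversion)
    if not isinstance(result, str):
        # Assumed check: a formatter is expected to return a string.
        raise TypeError("__format__ must return a string")
    return result


class Celsius:
    """Made-up example type that reuses float formatting."""

    def __init__(self, degrees):
        self.degrees = degrees

    def __format__(self, conversion):
        # Delegate the numeric part, then append the unit.
        return format_value(self.degrees, conversion) + " C"
```

With this in place, format_value(Celsius(21.456), ".1f") formats the
number through float.__format__ and the unit is added by the
user-defined type, which is the kind of extension the per-type
__format__ methods are meant to allow.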

-- Talin