[Python-3000] More PEP 3101 changes incoming

Sun Aug 5 08:06:43 CEST 2007

Guido van Rossum wrote:
> On 8/4/07, Greg Ewing <greg.ewing at canterbury.ac.nz> wrote:
>> Ron Adam wrote:
>>> Which would result in a first column that right aligns, a second column
>>> that centers unless the value is longer than 100, in which case it right
>>> align, and cuts the end, and a third column that left aligns, but cuts off
>>> the right if it's over 15.
>> All this talk about cutting things off worries me. In the
>> case of numbers at least, if you can't afford to expand the
>> column width, normally the right thing to do is *not* to cut
>> them off, but replace them with **** or some other thing that
>> stands out.
>>
>> This suggests that the formatting and field width options may
>> not be as easily separable as we would like.
> 
> I remember a language that did the *** thing; it was called Fortran.
> It was an absolutely terrible feature. A later language (Pascal)
> solved it by ignoring the field width if the number didn't fit -- it
> would mess up your layout but at least you'd see the value. That
> strategy worked much better, and later languages (e.g. C) followed it.
> So I think a maximum width is quite unnecessary for numbers. For
> strings, of course, it's useful; it can be made part of the
> string-specific conversion specifier.

I looked up Fortran's print descriptors, and it seems they have only a 
single width descriptor which, as you say, automatically includes the *** 
overflow behavior.  So the programmer doesn't have a choice: they can 
either specify a width and get the overflow behavior along with it, or 
not specify a width at all.  I can see how that would be very annoying.

    See section 2...

    http://www-solar.mcs.st-and.ac.uk/~steveb/course/notes/set4.pdf
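
For comparison, the Pascal/C strategy Guido mentions (ignore the field 
width when the number doesn't fit) is what Python's existing % operator 
already does.  A small illustration, using nothing beyond the built-in 
operator:

```python
# The width in a % conversion is a minimum only; it never cuts the value.
narrow = "%3d" % 12345   # value is wider than the field: width is ignored
padded = "%8d" % 12345   # value fits: it is right-aligned in the field

print(repr(narrow))
print(repr(padded))
```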

The field width specification I've described is rich enough so that the 
programmer can choose the behavior they want.  So it doesn't have the same 
problem.

A programmer can choose to implement the Fortran behavior if they really 
want to.  They would need to specify an overflow replacement character to 
turn that on.  Otherwise it never occurs.

    '{0:10+20/*,s}'

In the above case the field width would normally be 10, but could expand 
up to 20, and only if the value goes over 20 is the field filled with '*'s.  
But that behavior was explicitly specified by the programmer by supplying 
an overflow replacement character along with the max_width size.  It's not 
automatically included as in the Fortran case.  Truncating behavior is 
explicitly specified by giving a max_width size without a replacement 
character.  And a minimum width is explicitly specified by supplying a 
min_width size.

So the programmer has full and explicit control of the alignment behaviors 
in all cases.
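
As a rough sketch of those rules (not an implementation of the proposed 
syntax), the three behaviors can be modelled with a small helper.  The name 
fit_field and its parameters are hypothetical, left alignment is assumed 
for the padding, and filling the field out to max_width on overflow is my 
assumption about how wide the replacement should be:

```python
def fit_field(text, min_width=0, max_width=None, overflow_char=None):
    """Hypothetical helper modelling the min/max width rules described above."""
    if max_width is not None and len(text) > max_width:
        if overflow_char is not None:
            # Replacement character given: fill the field instead of
            # showing a cut-off value (the Fortran-style behavior).
            # Assumption: the filled field is max_width characters wide.
            return overflow_char * max_width
        # No replacement character: truncate on the right.
        return text[:max_width]
    # The value fits: pad up to min_width (left-aligned here, an assumption).
    return text.ljust(min_width)


print(repr(fit_field("abc", 10, 20, "*")))       # padded to the minimum
print(repr(fit_field("x" * 15, 10, 20, "*")))    # expands past the minimum
print(repr(fit_field("x" * 25, 10, 20, "*")))    # over the maximum: filled
print(repr(fit_field("x" * 25, 10, 20)))         # over the maximum: truncated
```

Leaving overflow_char out means the fill behavior never occurs, which is 
the point: it only happens when the programmer asks for it.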

Since an alignment specification is always paired with a format 
specification, the programmer can choose the best alignment behavior to go 
along with a formatter in the context of their application.  This is a good 
thing even though some programmers may not always make the best choices at 
first.  I believe they will learn fairly quickly what not to do.

So the choices are:

1 - Remove the replacement character alignment option.  It may not be all 
that useful, and removing it protects programmers from making some 
mistakes, but it also takes the feature away from others who may find it 
useful.

So just how useful/desirable is this?

2 - Only use max_width inside string formatters.  This further protects 
programmers from making silly choices, and further limits others who may 
want to use max_width with other types.  It also breaks up the clean split 
of alignment and format specifiers.  (But this may be a matter of 
perspective.)

I'm +0 on (1), and -1 on (2) moving max_width to the string formatter.  So 
what do others think about these features?

If you do #2, then #1 also goes, unless it too is moved to the string 
formatter.

Note: Moving these to the string type formatter doesn't prevent them from 
being used with numbers in all cases.  A general text class would still be 
able to use them with numeric entries because it would call the __str__ 
method of the number to first convert the number to a string, but then call 
__format__ on that string and forward these string options.  It just 
requires more thought to do, and a better understanding of the internal 
process.
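
A minimal sketch of that forwarding idea follows.  The class name TextField 
is hypothetical, and the specifier used is the width/precision form that 
str.format eventually accepted for strings (precision acts as a maximum 
length), not the syntax proposed in this thread:

```python
class TextField:
    """Hypothetical wrapper that formats any value through its string form."""

    def __init__(self, value):
        self.value = value

    def __format__(self, spec):
        # Convert to a string first, then forward the string options to
        # the string's own __format__ via the built-in format().
        return format(str(self.value), spec)


# A float formatted with string options: right-align in 8, cut at 4 chars.
result = "{0:>8.4}".format(TextField(3.14159))
print(repr(result))
```

The number never sees the width or max-length options itself; they are 
applied only after it has been turned into text.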

This also depends on the choice of the underlying implementation.

Cheers,
    Ron