[Python-Dev] PEP 3101 Update

Fri May 19 16:55:37 CEST 2006

On 5/19/06, Talin <talin at acm.org> wrote:
> Guido van Rossum wrote:
> > [http://www.python.org/dev/peps/pep-3101/]
> > http://msdn.microsoft.com/library/en-us/cpguide/html/cpconcompositeformatting.asp

[on width spec a la .NET]
> We already have that now, don't we? If you look at the docs for "String
> Formatting Operations" in the library reference, it shows that a
> negative sign on a field width indicates left justification.

Yes, but I was proposing to adopt the .NET syntax for the feature.
Making it a responsibility of the generic formatting code rather than
of each individual type-specific formatter makes it more likely that
it will be implemented correctly everywhere.
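
To make the division of labour concrete, here is a minimal sketch. The names apply_width and format_field are hypothetical and not part of the PEP; the only assumption is the .NET convention that a negative width means left justification.

```python
def apply_width(text, width):
    """Pad already-formatted text to abs(width) columns.

    Assumed convention (as in .NET composite formatting): a negative
    width left-justifies, a positive width right-justifies.
    """
    if width < 0:
        return text.ljust(-width)
    return text.rjust(width)


def format_field(value, width=0, type_formatter=str):
    # The type-specific formatter only produces the text. Padding is
    # applied once, here, so every type gets the same behaviour.
    return apply_width(type_formatter(value), width)


print(repr(format_field(42, 6)))
print(repr(format_field(42, -6)))
print(repr(format_field(3.5, -8, type_formatter=lambda v: "%.2f" % v)))
```

Because the padding lives in one place, a type-specific formatter cannot get justification wrong; it never sees the width at all.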

[on escaping]
> There is another solution to this which is equally subtle, although
> fairly straightforward to parse. It involves defining the rules for
> escapes as follows:
>
>     '{{' is an escaped '{'
>     '}}' is an escaped '}', unless we are within a field.
>
> So you can write things like {0:10,{1}}, and the final '}}' will be
> parsed as two separate closing brackets, since we're within a field
> definition.
>
> From a parsing standpoint, this is unambiguous; however, I've held off
> on suggesting it because it might appear to be ambiguous to a casual reader.
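
A rough sketch of how the quoted escape rules could be parsed follows. The function name split_fields and the (kind, text) tuples are made up for illustration; this is not the PEP's parser, only one reading of the two rules above.

```python
def split_fields(fmt):
    """Split fmt into ('text', s) and ('field', s) parts.

    Outside a field, '{{' and '}}' are escapes for '{' and '}'.
    Inside a field, braces simply nest, so a trailing '}}' closes
    two levels instead of being read as an escape.
    """
    parts = []
    literal = []
    i, n = 0, len(fmt)
    while i < n:
        ch = fmt[i]
        if fmt.startswith("{{", i) or fmt.startswith("}}", i):
            literal.append(ch)          # escaped brace outside a field
            i += 2
        elif ch == "{":
            if literal:
                parts.append(("text", "".join(literal)))
                literal = []
            depth, j = 1, i + 1
            while j < n and depth:      # scan to the matching '}'
                if fmt[j] == "{":
                    depth += 1
                elif fmt[j] == "}":
                    depth -= 1
                j += 1
            if depth:
                raise ValueError("unterminated field")
            parts.append(("field", fmt[i + 1:j - 1]))
            i = j
        elif ch == "}":
            raise ValueError("single '}' outside a field")
        else:
            literal.append(ch)
            i += 1
    if literal:
        parts.append(("text", "".join(literal)))
    return parts


print(split_fields("{0:10,{1}}"))
print(split_fields("a{{b}}c"))
```

The first call treats the final '}}' as two closing brackets, the second treats it as an escaped '}', which is exactly the context dependence that might confuse a casual reader.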

Sure. But I still think there isn't enough demand for variable
expansion *within* a field to bother. When's the last time you used a
* in a % format string? And how essential was it?
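
For anyone who has not used it, this is what * does in a % format today: the width is taken from the argument tuple instead of being written into the format string. The variable names are only for illustration.

```python
width = 8

# '*' consumes one argument as the field width.
right = "%*d|" % (width, 42)

# Combined with '-', the same width left-justifies.
left = "%-*d|" % (width, 42)

print(right)
print(left)
```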

[on error handling for unused variables]
> I am undecided on this issue as well, which is the reason that it's not
> mentioned in the PEP (yet).

There's another use case which suggests that perhaps all errors should
pass silently (or at least produce a string result instead of an
exception). It's a fairly common bug to have a broken format in an
except clause for a rare exception. This can cause major issues in
large programs because the one time you need the debug info badly you
don't get anything at all. (The logging module now has a special
work-around for this reason.) That's too bad, but it is a fact of
life.
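
A small reproduction of that bug, with hypothetical function names (describe, run) and a deliberately broken format:

```python
def describe(err, code):
    # Deliberate bug: %d is applied to a string. This line only runs on
    # the rare error path, so the bug can go unnoticed for a long time.
    return "operation failed (code %d): %s" % (code, err)


def run():
    try:
        raise IOError("disk full")       # the rare failure we care about
    except IOError as err:
        return describe(err, "E42")      # raises TypeError instead


try:
    run()
    outcome = "reported"
except TypeError as masking:
    # The debug message was never produced; the formatting error is
    # what surfaces to the caller.
    outcome = "lost: " + type(masking).__name__

print(outcome)
```

The caller sees a TypeError from the format operation rather than the message describing "disk full".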

If broken formats *never* caused exceptions (but instead some kind of
butchered conversion drawing attention to the problem) then at least
one source of frustration would be gone. If people like this I suggest
putting some kind of error message in the place of any format
conversion for which an error occurred. (If we left the {format}
itself in, then this would become a feature that people would rely on
in undesirable places.)
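
A sketch of what "an error message in place of the conversion" might look like. Everything here is an assumption for illustration: the name lenient_format, the simplified {name} field syntax (no format specifiers), and the exact wording of the marker.

```python
import re

# Simplified field syntax: {0}, {1}, {name}. No specifiers, no escapes.
_FIELD = re.compile(r"\{(\w+)\}", re.ASCII)


def lenient_format(fmt, *args, **kwargs):
    """Format fmt, replacing any field that cannot be resolved with a
    visible error marker instead of raising an exception."""
    def replace(match):
        name = match.group(1)
        try:
            value = args[int(name)] if name.isdigit() else kwargs[name]
            return str(value)
        except (IndexError, KeyError) as exc:
            # The original {field} text is deliberately not echoed back.
            return "<format error: %s %r>" % (type(exc).__name__, name)
    return _FIELD.sub(replace, fmt)


print(lenient_format("{0} logged in from {host}", "guido", host="example"))
print(lenient_format("{0} logged in from {host}", "guido"))
```

The second call still produces a usable log line, with the broken field standing out rather than disappearing.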

I wouldn't want to go so far as to catch exceptions from type-specific
formatters; but those formatters themselves should silently ignore bad
specifiers, or at least return an error string, rather than raise
exceptions.

Still, might this be more painful for beginners learning to write format strings?

Until we have decided, the PEP should list the two alternatives in a
fair amount of detail as "undecided".

> I will be supplying a Python implementation of the parser along with the
> PEP. What I would prefer not to supply (although I certainly can if you
> feel it's necessary) is an optimized C implementation of the same
> parser, as well as the implementations of the various type-specific
> formatters.

There's no need for any performance work at this point, and C code is
"right out" (as the Pythons say). The implementation should be usable
as a readable "spec" to resolve gray areas in the PEP.

> [] is the most intuitive syntax by far IMHO. Let's run it up the
> flagpole and see if anybody salutes :)

Fair enough.

> The way I have set up the API for writing custom formatters (not talking
> about the __format__ method here) allows the custom formatter object to
> examine the entire output string, not merely the part that it is
> responsible for; moreover, the custom formatter is free to modify
> the entire string. So for example, a custom formatter could tabify or
> un-tabify all previous text within the string.

Ouch. I don't like that.

> The API could be made slightly simpler by eliminating this feature. The
> reason that I added it was specifically so that custom formatters could
> perform column-specific operations, like the old BASIC function that
> would print spaces up to a given column. Having generated my share of
> reports back in the old days (COBOL programming in the USAF), I thought
> it might be useful to have the ability to do operations based on the
> absolute column number.

Please drop it. Python has never had it and AFAICR it's never been requested.

> Currently the API specifies that a custom formatter is passed an array
> object, and the custom formatter should append its data to the end of
> the array, but it is also free to examine and modify the rest of the array.
>
> If I were to remove this feature, then instead of using an array, we'd
> simply have the custom formatter return a string like __format__ does.

Yes please.
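
Roughly, the simpler shape would look like this. The names hex_formatter and build are hypothetical, and the (value, spec) chunk representation is only an assumption to keep the sketch short.

```python
def hex_formatter(value, spec):
    """Hypothetical custom formatter: returns only its own text and
    never sees the rest of the output."""
    return "0x%x" % value


def build(chunks, formatter):
    """Join literal strings and formatted (value, spec) pairs.

    The generic code owns the output list; the formatter cannot
    inspect or modify earlier text.
    """
    out = []
    for chunk in chunks:
        if isinstance(chunk, str):
            out.append(chunk)
        else:
            value, spec = chunk
            out.append(formatter(value, spec))
    return "".join(out)


print(build(["addr=", (255, ""), "!"], hex_formatter))
```

With this shape the generic code is free to choose any internal representation for the result, since no formatter depends on it.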

> So the question is - is the use case useful enough to keep this feature?
> What do people think of the use of the Python array type in this case?

That rather constrains the formatting implementation, so I'd like to drop it.

BTW I think we should move this back to the py3k list -- the PEP is
3101 after all. That should simplify the PEP a bit because it no
longer has to distinguish between str and unicode. If we later decide
to backport it to 2.6 it should be easy enough to figure out what to
do with str vs. unicode (probably the same as we do for %).

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)