[Python-Dev] str.format implementation

Ben Wolfson wolfson at gmail.com
Tue Dec 13 04:56:16 CET 2011


Hi,

I'm hoping to get some kind of consensus about the divergences between
the implementation and documentation of str.format
(http://mail.python.org/pipermail/python-dev/2011-June/111860.html and
the linked bug report contain examples of the divergences). These
pertain to the arg_name, attribute_name, and element_index fields of
the grammar in the docs:

    replacement_field ::=  "{" [field_name] ["!" conversion] [":" format_spec] "}"
    field_name        ::=  arg_name ("." attribute_name | "[" element_index "]")*
    arg_name          ::=  [identifier | integer]
    attribute_name    ::=  identifier
    element_index     ::=  integer | index_string
    index_string      ::=  <any source character except "]"> +
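
To make the three fields concrete, here is a small sketch. It is not
taken from the linked thread or bug report; the Point class and the
variable names are made up for illustration, and the behavior described
in the comments is what a recent CPython 3 is expected to do, which may
differ on other versions. The last line shows one kind of divergence:
an arg_name that is not an identifier is still accepted.

```python
class Point:
    """Hypothetical class, used only to demonstrate attribute access."""
    def __init__(self, x):
        self.x = x

# field_name = arg_name followed by ".attribute_name" or "[element_index]"
by_attr = "{0.x}".format(Point(3))             # arg_name 0, attribute_name x
by_index = "{0[1]}".format(["a", "b"])         # element_index is an integer
by_key = "{0[name]}".format({"name": "n"})     # element_index is an index_string
chained = "{p.x} {seq[0]}".format(p=Point(4), seq=["first"])

# "a-b" is not an identifier, but it is still accepted as an arg_name.
non_identifier = "{a-b}".format(**{"a-b": 1})

print(by_attr, by_index, by_key, chained, non_identifier)
```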

Nothing definitive emerged from the last round of discussion, and as
far as I can recall there are now three proposals for what kind of
changes might be worth making:

 (1) the implementation should conform to the docs;*
 (2) like (1) with the change that element_index should be changed to
"integer | identifier" (rendering index_string otiose);
 (3) like (1) with the change that index_string should be changed to
'<any source character except "]", "}", or "{">'.

* the docs link "integer" to
http://docs.python.org/reference/lexical_analysis.html#grammar-token-integer
but the current implementation only allows decimal integers, which
seems reasonable and worth retaining.
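
A rough sketch of what "only decimal integers" means in practice,
assuming a recent CPython 3 (the variable names are illustrative): an
element_index made up solely of digits is used as an integer, while
anything else, including hexadecimal notation or a leading minus sign,
is passed along as a string key.

```python
items = list(range(20))

# Digits only: the index is converted to the integer 10.
decimal = "{0[10]}".format(items)

# "0x10" is not a decimal integer, so it is looked up as the string "0x10".
hex_like = "{0[0x10]}".format({"0x10": "string key"})

# "-1" is also treated as a string, so indexing a list with it fails.
try:
    "{0[-1]}".format(items)
    negative_ok = True
except TypeError:
    negative_ok = False

# The same field works against a mapping that has the string key "-1".
negative_key = "{0[-1]}".format({"-1": "minus one"})

print(decimal, hex_like, negative_ok, negative_key)
```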

(2) was suggested by Greg Ewing on python-dev and (3) by Petri
Lehtinen in the bug report. (Petri actually suggested that braces be
disallowed except for the nesting in the format_spec, but it comes to
the same thing.)

None of these should be difficult to implement; patches exist for (1)
and (2). (2) and (3) would lead to format strings that are easier for
the programmer to visually parse; (1) would make the indexing part
of the replacement field conform more closely to the way indexing with
strings behaves in Python generally, where arbitrary strings can be
used. (It wouldn't conform exactly, obviously, since ']' would still
be excluded.)
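
As a sketch of how an index_string compares with ordinary indexing
(assuming a recent CPython 3; the dictionary and its keys are invented
for the example): keys containing spaces or dots already work inside
the brackets, but the text between "[" and "]" is taken literally, so
quote characters become part of the key instead of delimiting it.

```python
data = {"two words": 1, "a.b": 2, "'quoted'": 3, "plain": 4}

spaces = "{0[two words]}".format(data)   # same key as data["two words"]
dotted = "{0[a.b]}".format(data)         # "." inside brackets is part of the key

# Quotes are not string delimiters here: they become part of the key.
quoted = "{0['quoted']}".format(data)    # looks up the key "'quoted'"

try:
    "{0['plain']}".format(data)          # there is no key "'plain'"
    found = True
except KeyError:
    found = False

print(spaces, dotted, quoted, found)
```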

I personally would prefer (1) to (2) or (3), and (3) to (2), had I my
druthers, but it doesn't matter a *whole* lot to me; I'd prefer any of
them to nothing (or to changing the docs to reflect the current batty
behavior).

-- 
Ben Wolfson
"Human kind has used its intelligence to vary the flavour of drinks,
which may be sweet, aromatic, fermented or spirit-based. ... Family
and social life also offer numerous other occasions to consume drinks
for pleasure." [Larousse, "Drink" entry]

