[Python-3000] PEP: str(container) should call str(item), not repr(item)

Guido van Rossum guido at python.org
Thu May 29 21:31:17 CEST 2008


Let me just save everyone a lot of time and say that I'm opposed to
this change, and that I believe that it would cause way too much
disturbance to be accepted this close to beta.

--Guido

On Thu, May 29, 2008 at 12:21 PM, Oleg Broytmann <phd at phd.pp.ru> wrote:
> Hello. A draft for a discussion.
>
> PEP: XXX
> Title: str(container) should call str(item), not repr(item)
> Version: $Revision$
> Last-Modified: $Date$
> Author: Oleg Broytmann <phd at phd.pp.ru>,
>        Jim Jewett <jimjjewett at gmail.com>
> Discussions-To: python-3000 at python.org
> Status: Draft
> Type: Standards Track
> Content-Type: text/plain
> Created: 27-May-2008
> Post-History: 28-May-2008
>
>
> Abstract
>
>    This document discusses the advantages and disadvantages of the
>    current implementation of str(container).  It also discusses the
>    pros and cons of a different approach - to call str(item) instead
>    of repr(item).
>
>
> Motivation
>
>    Currently str(container) calls repr on items.  Arguments for it:
>    -- containers refuse to guess what the user wants to see on
>       str(container) - surroundings, delimiters, and so on;
>    -- repr(item) usually displays type information - apostrophes
>       around strings, class names, etc.
>
>    Arguments against:
>    -- it's illogical; str() is expected to call __str__ if it exists,
>       not __repr__;
>    -- there is no standard way to print a container's content calling
>       items' __str__, that's inconvenient in cases where __str__ and
>       __repr__ return different results;
>    -- repr(item) sometimes does the wrong thing (it hex-escapes
>       non-ASCII strings, for example).
>
>    This PEP proposes to change how str(container) works.  It is
>    proposed to mimic how repr(container) works except for one detail
>    - call str on items instead of repr.  This allows a user to choose
>    what results she wants to get - from item.__repr__ or item.__str__.
>
>
> Current situation
>
>    Most container types (tuples, lists, dicts, sets, etc.) do not
>    implement a __str__ method, so str(container) calls
>    container.__repr__, and container.__repr__, once called, does not
>    know it was called from str and always calls repr on the
>    container's items.
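
The fallback described above can be checked directly. The following is a minimal sketch, assuming a current CPython 3 interpreter; the Item class is a hypothetical stand-in for any type whose __str__ and __repr__ differ.

```python
# Which built-in containers define their own __str__ / __repr__?
# vars(cls) lists only what the class itself defines, not what it inherits.
containers = (list, tuple, dict, set)
defines_str = {c.__name__: "__str__" in vars(c) for c in containers}
defines_repr = {c.__name__: "__repr__" in vars(c) for c in containers}


class Item:
    """Hypothetical item type with distinct str and repr output."""

    def __str__(self):
        return "STR"

    def __repr__(self):
        return "REPR"


# str() on a list falls back to list.__repr__, which calls repr on items.
as_str = str([Item()])
as_repr = repr([Item()])

print(defines_str)
print(defines_repr)
print(as_str, as_repr)
```
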
>
>    This behaviour has advantages and disadvantages.  One advantage is
>    that most items are represented with type information - strings
>    are surrounded by apostrophes, instances may have both class name
>    and instance data:
>
>        >>> print([42, '42'])
>        [42, '42']
>        >>> print([Decimal('42'), datetime.now()])
>        [Decimal("42"), datetime.datetime(2008, 5, 27, 19, 57, 43, 485028)]
>
>    The disadvantage is that __repr__ often returns technical data
>    (like '<object at address>') or an unreadable string (a
>    hex-escaped string if the input is a non-ASCII string):
>
>        >>> print(['тест'])
>        ['\xd4\xc5\xd3\xd4']
>
>    One of the motivations for PEP 3138 is that neither repr nor str
>    will allow the sensible printing of dicts whose keys are non-ASCII
>    text strings.  Now that unicode identifiers are allowed, this
>    includes Python's own attribute dicts.  It also includes JSON
>    serialization (and caused some hoops for the json lib).
>
>    PEP 3138 proposes to fix this by breaking the "repr is safe ASCII"
>    invariant, and changing the way repr (which is used for
>    persistence) outputs some objects, with system-dependent failures.
>
>    Changing how str(container) works would allow easy debugging in
>    the normal case, and retain the safety of ASCII-only for the
>    machine-readable case.  The only downside is that str(x) and
>    repr(x) would more often be different -- but only in those cases
>    where the current almost-the-same version is insufficient.
>
>    It also seems illogical that str(container) calls repr on items
>    instead of str.  It's only logical to expect the following code
>
>        class Test:
>            def __str__(self):
>                return "STR"
>
>            def __repr__(self):
>                return "REPR"
>
>
>        test = Test()
>        print(test)
>        print(repr(test))
>        print([test])
>        print(str([test]))
>
>    to print
>
>        STR
>        REPR
>        [STR]
>        [STR]
>
>    where it actually prints
>
>        STR
>        REPR
>        [REPR]
>        [REPR]
>
>    It is especially illogical that print in Python 2 uses str when
>    it is called on what seems to be a tuple:
>
>        >>> print Decimal('42'), datetime.now()
>        42 2008-05-27 20:16:22.534285
>
>    where on an actual tuple it prints
>
>        >>> print((Decimal('42'), datetime.now()))
>        (Decimal("42"), datetime.datetime(2008, 5, 27, 20, 16, 27, 937911))
>
>
> A different approach - call str(item)
>
>    For example, with numbers it is often only the value that people
>    care about.
>
>        >>> print Decimal('3')
>        3
>
>    But putting the value in a list forces users to read the type
>    information, exactly as if repr had been called for the benefit of
>    a machine:
>
>        >>> print [Decimal('3')]
>        [Decimal("3")]
>
>    After this change, the type information would not clutter the str
>    output:
>
>        >>> print "%s" % [Decimal('3')]
>        [3]
>        >>> print str([Decimal('3')])  # same result
>        [3]
>
>    But it would still be available if desired:
>
>        >>> print "%r" % [Decimal('3')]
>        [Decimal("3")]
>        >>> print repr([Decimal('3')])  # same result
>        [Decimal("3")]
>
>    There are a number of strategies to fix the problem.  The most
>    radical is to change __repr__ so it accepts a new parameter (flag)
>    "called from str, so call str on items, not repr".  The
>    drawback of the proposal is that every __repr__ implementation
>    must be changed.  Introspection could help a bit (inspect __repr__
>    before calling if it accepts 2 or 3 parameters), but introspection
>    doesn't work on classes written in C, like all builtin containers.
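
A rough sketch of the flag idea, not part of the draft: FlaggedList, the from_str parameter name, and accepts_flag are all hypothetical names chosen for illustration. The try/except covers callables for which inspect cannot produce a signature.

```python
import inspect


class Item:
    """Hypothetical item type with distinct str and repr output."""

    def __str__(self):
        return "STR"

    def __repr__(self):
        return "REPR"


class FlaggedList(list):
    """Hypothetical container whose __repr__ takes the proposed flag."""

    def __repr__(self, from_str=False):
        # Pick the per-item conversion based on who is asking.
        convert = str if from_str else repr
        return "[" + ", ".join(convert(x) for x in self) + "]"

    def __str__(self):
        return self.__repr__(from_str=True)


def accepts_flag(obj):
    """Best-effort check whether type(obj).__repr__ takes the extra flag."""
    try:
        params = inspect.signature(type(obj).__repr__).parameters
    except (TypeError, ValueError):
        return False  # no usable signature available
    return "from_str" in params


print(str(FlaggedList([Item()])), repr(FlaggedList([Item()])))
print(accepts_flag(FlaggedList()), accepts_flag([]))
```

A plain list nested inside a FlaggedList still renders its own items with repr, which illustrates the drawback named above: the flag helps only once every __repr__ implementation has been changed.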
>
>    A less radical proposal is to implement __str__ methods for
>    builtin container types.  The obvious drawback is a duplication
>    of effort - all those __str__ and __repr__ implementations differ
>    only in one small detail - whether they call str or repr on items.
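
For illustration only, here is how a separate __str__ might look on a list subclass. StrList and _render are hypothetical names; the builtin types are written in C and would need the equivalent change there. Passing the conversion function into one shared helper is one way to keep the duplication small.

```python
from decimal import Decimal


class StrList(list):
    """Hypothetical list whose str() calls str on items, repr() calls repr."""

    def _render(self, convert):
        # Single traversal shared by __str__ and __repr__; only the
        # per-item conversion function differs.
        return "[" + ", ".join(convert(x) for x in self) + "]"

    def __repr__(self):
        return self._render(repr)

    def __str__(self):
        return self._render(str)


values = StrList([Decimal("3"), "a"])
print(str(values))
print(repr(values))
```
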
>
>    The most conservative proposal is not to change str at all but
>    to allow developers to implement their own application- or
>    library-specific pretty-printers.  The drawback is again
>    a multiplication of effort and proliferation of many small
>    specific container-traversal algorithms.
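
An application-level helper of the kind described above could be as small as the following sketch. container_str is a hypothetical name; it handles only list, tuple, and dict, and does not guard against self-referencing containers.

```python
from decimal import Decimal


def container_str(obj):
    """Render obj like repr(container) would, but call str on leaf items."""
    if isinstance(obj, list):
        return "[" + ", ".join(container_str(x) for x in obj) + "]"
    if isinstance(obj, tuple):
        inner = ", ".join(container_str(x) for x in obj)
        if len(obj) == 1:
            inner += ","  # keep the one-element tuple marker
        return "(" + inner + ")"
    if isinstance(obj, dict):
        pairs = (
            container_str(k) + ": " + container_str(v) for k, v in obj.items()
        )
        return "{" + ", ".join(pairs) + "}"
    return str(obj)  # leaf item: use str, not repr


print(container_str([Decimal("3"), ("a",), {"k": [1]}]))
```

Every project that writes such a helper repeats the same traversal, which is the multiplication of effort the paragraph above refers to.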
>
>
> Backward compatibility
>
>    In those cases where type information is more important than
>    usual, it will still be possible to get the current results by
>    calling repr explicitly.
>
>
> Copyright
>
>    This document has been placed in the public domain.
>
>
>
> Local Variables:
> mode: indented-text
> indent-tabs-mode: nil
> sentence-end-double-space: t
> fill-column: 70
> coding: utf-8
> End:
>
> Oleg.
> --
>     Oleg Broytmann            http://phd.pp.ru/            phd at phd.pp.ru
>           Programmers don't die, they just GOSUB without RETURN.
> _______________________________________________
> Python-3000 mailing list
> Python-3000 at python.org
> http://mail.python.org/mailman/listinfo/python-3000
> Unsubscribe: http://mail.python.org/mailman/options/python-3000/guido%40python.org
>



-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

