Re: [Python-ideas] reprs of recursive datastructures.

On Sat, Sep 8, 2012 at 9:17 AM, Terry Reedy <tjreedy@udel.edu> wrote:
Pairs of different things have the same representation, making the representation ambiguous to both people and the interpreter.
Well yeah, when designing a repr() we usually have to compromise. E.g. if you render a class instance it often shows the class name but not the module name (e.g. decimal.Decimal.)
Moreover, the interpreter's guess is usually wrong.
The requirement that the interpreter can evaluate a repr() and return a similar value is pretty weak, and I'm not sure that in this case the fact that copying the output back into the interpreter returns an object of a different shape matters much to anyone. A subtler but similar bug appears with lists containing multiple references to the same sublist, e.g.
I don't think we should attempt to fix this particular one -- first of all, the analysis would be tricky (there could be a user-defined object involved) and second of all, I can't think of a solution that still produces a valid expression (except perhaps a very ugly one).
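[Editorial sketch, not part of the original message.] The example that followed "e.g." did not survive in the archive. The snippet below is one possible illustration of the shared-sublist case, assuming CPython 3; the names `inner`, `shared`, and `rebuilt` are made up for this sketch.

```python
# Two references to the same sublist: the repr shows two equal-looking lists.
inner = [0]
shared = [inner, inner]

text = repr(shared)       # the sharing is not visible in the text
rebuilt = eval(text)      # evaluating the repr builds two separate sublists

same_before = shared[0] is shared[1]     # identity in the original
same_after = rebuilt[0] is rebuilt[1]    # identity after the round trip
equal_values = rebuilt == shared         # contents still compare equal
```

The round trip keeps the values but drops the aliasing, so appending to `rebuilt[0]` would no longer change `rebuilt[1]`.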
But when does it ever matter?
It would be trivial to tweak the representations of recursive lists so they are not valid list displays.
To what purpose? I still don't understand what the actual use case is where you think that will produce a better experience for the user. -- --Guido van Rossum (python.org/~guido)

On Sat, Sep 8, 2012 at 7:49 PM, Guido van Rossum <guido@python.org> wrote:
To what purpose? I still don't understand what the actual use case is where you think that will produce a better experience for the user.
The thing I don't like is that the current display flat out lies about the sequence contents - it displays a terminal constant ("..."), rather than a clear marker that a recursive loop was detected. The case of multiple references to a different list is not the same, as then the repr() at least still accurately reflects what you would get when iterating over the data structure. So, my perspective is if attempting to naively flatten the list would create an infinite loop, then I want evaluating the representation to throw a syntax error the way it did in Python 2. Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
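[Editorial sketch, not part of the original message.] A minimal illustration of the behaviour described above, assuming CPython 3; the variable names are made up for this sketch.

```python
# A list that contains itself.
lst = []
lst.append(lst)

text = repr(lst)        # the recursion marker is rendered as "..."
rebuilt = eval(text)    # in Python 3 this parses, because ... is the Ellipsis literal

is_recursive = lst[0] is lst              # the original really is recursive
rebuilt_is_recursive = rebuilt[0] is rebuilt
holds_ellipsis = rebuilt[0][0] is Ellipsis
```

The evaluated text yields an ordinary nested list whose innermost element is the `Ellipsis` constant, which is the mismatch being objected to.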

On 9/8/2012 6:02 AM, Nick Coghlan wrote:
This expresses what I was trying to say better than I did. When '...' was chosen for recursive structures, it made the result not-legal-code, as it should be. The 3.0 incorporation of '...' as legal syntax created, in a sense, a reversion. So that suggests revising the recursion marker. That said, there is the issue of doctests, so I would only change in 3.4. -- Terry Jan Reedy

On Sep 8, 2012, at 3:02 PM, Terry Reedy <tjreedy@udel.edu> wrote:
That said, there is the issue of doctests, so I would only change in 3.4.
Note that in doctest displays, ellipsis has yet another meaning. I agree that this is 3.4 material, and the solution should probably be something in <>.
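[Editorial sketch, not part of the original message.] No concrete marker is named in the thread; the string `<...>` below is only an assumed reading of "something in <>". The helper `evaluates` is hypothetical and exists just to compare the two spellings under CPython 3.

```python
def evaluates(text):
    """Return True if text parses and evaluates as a Python expression."""
    try:
        eval(text)
    except SyntaxError:
        return False
    return True

# Current marker: valid Python 3, evaluates to a list holding Ellipsis.
current_ok = evaluates("[[...]]")

# Assumed angle-bracket marker: not a valid expression, so it cannot be
# mistaken for real list contents when pasted back into the interpreter.
candidate_ok = evaluates("[[<...>]]")
```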

On Sat, Sep 8, 2012 at 5:26 PM, MRAB <python@mrabarnett.plus.com> wrote:
I was probably not very clear about the problem of having ellipsis appear as expected output in doctests. The problem is that '...' has a special meaning for doctests: """ When specified, an ellipsis marker (...) in the expected output can match any substring in the actual output. ... """ http://docs.python.org/py3k/library/doctest.html#doctest.ELLIPSIS This means that <...> will match any angle bracketed repr. Note that lists are not the only types affected by this issue. Dicts, for example, have the same problem:
It is possible the other mutable container types are similarly affected. It looks like this problem requires some more thought. If we ever decide to allow non-ASCII characters in repr, my vote for repr of recursive list will be
"[[\N{ANTICLOCKWISE GAPPED CIRCLE ARROW}]]" '[[⟲]]'
"\N{WHITE SMILING FACE}" '☺'
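[Editorial sketch, not part of the original message.] The dict example referred to in this message was lost in the archive. The snippet below is a possible reconstruction of both points, assuming CPython 3: how `doctest.ELLIPSIS` treats `...` in expected output, and what a self-referencing dict prints. The sample strings and variable names are made up.

```python
import doctest

checker = doctest.OutputChecker()

# With ELLIPSIS enabled, "<...>" in expected output matches any
# angle-bracketed text, not only a literal "<...>".
angle_matches_any = checker.check_output(
    "<...>\n", "<function spam at 0x10>\n", doctest.ELLIPSIS)

# Without the flag the same comparison is literal and fails.
angle_literal_only = checker.check_output(
    "<...>\n", "<function spam at 0x10>\n", 0)

# The existing list marker is a wildcard under ELLIPSIS as well.
list_marker_wildcard = checker.check_output(
    "[[...]]\n", "[[1, 2, 3]]\n", doctest.ELLIPSIS)

# Dicts use the same marker for recursion.
d = {}
d["self"] = d
dict_text = repr(d)
dict_rebuilt = eval(dict_text)   # {...} parses as a set containing Ellipsis
```

So a doctest that expects a recursive repr cannot, under `ELLIPSIS`, distinguish it from many other outputs, whichever bracket style is chosen around the dots.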

On 08Sep2012 23:27, MRAB <python@mrabarnett.plus.com> wrote:
| On 08/09/2012 23:06, Alexander Belopolsky wrote:
| > If we ever decide to allow non-ASCII characters in repr, my vote for
| > repr of recursive list will be
| >
| >>>> "[[\N{ANTICLOCKWISE GAPPED CIRCLE ARROW}]]"
| > '[[⟲]]'
| >
| Or:
|
| >>> "[[\N{CLOCKWISE GAPPED CIRCLE ARROW}]]"
| '[[⟳]]'
[...]
These are sublime! Personally I find the former one more intuitively expressive of a recursion, probably because the arrow points "left" (in my current font, anyway; how variable is this?) and therefore towards the stuff already recited. The latter arrow seems to point "right" or "forwards", not so recursive to my intuition.
| >>>> "\N{WHITE SMILING FACE}"
| > '☺'
Cute but a -1 from me; less intuitive meaning.
Cheers,
--
Cameron Simpson <cs@zip.com.au>
To understand recursion, you must first understand recursion.

On Sep 8, 2012, at 6:02 AM, Nick Coghlan <ncoghlan@gmail.com> wrote:
It's more of an equivocation than a flat-out lie ;-) It is an equivocation because "..." is legitimately used for multiple purposes (in English text for "and so on ...", in long-established use in Python to denote recursive reprs, in doctest as a placeholder for elided result text, and in its newest role as the Ellipsis terminal constant).
It seems to me that the first three roles are well established and are reasonably consistent with one another. Further, each of those roles serves an important task. In contrast, the new role as a terminal constant for an Ellipsis singleton is brand-new, not very important, and doesn't even have a clear semantic role (what is it "supposed" to mean?). Changing the first three uses just so they won't conflict with the last seems like the tail wagging the dog. I agree with Steven that this isn't a problem worth solving.
As Alexander pointed out, the ... punctuation can be used in two distinct ways inside doctests (as part of expected output or as a placeholder for elided content). A consequence is that there won't be a reliable automated way to convert existing doctests for a new notation for recursive reprs.
ISTM that changes which break tests are worse than other changes because the process of upgrading from one Python version to the next is so heavily dependent on getting existing tests to pass. The tests are your safety net during upgrades -- breaking them makes upgrading less palatable.
Raymond

participants (7)
- Alexander Belopolsky
- Cameron Simpson
- Guido van Rossum
- MRAB
- Nick Coghlan
- Raymond Hettinger
- Terry Reedy