Mailman 3 reprs of recursive datastructures. - Python-ideas

reprs of recursive datastructures.

Mike Graham

Sept. 7, 2012

7:51 p.m.

With the Python 3 loosening of where ... can occur, this somewhat suboptimal behaviour occurs

...

Is this something that can be improved? Is it something worth improving?

Mike

Terry Reedy

September 2012

9:57 p.m.

On 9/7/2012 3:51 PM, Mike Graham wrote:

...

With the Python 3 loosening of where ... can occur, this somewhat suboptimal behaviour occurs

>>> x = []
>>> x.append(x)
>>> x
[[...]]
>>> eval(repr(x))
[[Ellipsis]]

I believe ... was used for representations before it became the Ellipsis literal. In any case, the representation is now ambiguous. It is not possible to reliably invert a many-to-one function.
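A minimal sketch of the ambiguity described above, assuming a Python 3 interpreter. The variable names `text` and `rebuilt` are illustrative only and do not appear in the thread.

```python
# Build a list whose only element is the list itself.
x = []
x.append(x)

text = repr(x)        # the repr marks the self-reference with three dots
rebuilt = eval(text)  # Python 3 parses those three dots as the Ellipsis literal

print(text)                       # the repr of the recursive list
print(rebuilt)                    # the list that eval actually builds
print(x[0] is x)                  # the original really contains itself
print(rebuilt[0][0] is Ellipsis)  # the copy contains Ellipsis instead
```

The round trip succeeds without an error, but the result is an ordinary nested list holding Ellipsis rather than a self-referencing list.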

...

Is this something that can be improved?

Change the recursive substitution so there is no ambiguity. For instance, use the unicode ellipsis character instead of '...'. Since the output is unicode and may contain non-ascii chars anyway, that might be considered.

>>> '\u2026'
'…'
>>> [[...]]
[[Ellipsis]]
>>> [[…]]
SyntaxError: invalid character in identifier

If not that, pick anything else giving a syntax error.

>>> [[,,,]]
SyntaxError: invalid syntax
>>> [[. . .]]
SyntaxError: invalid syntax

...

Is it something worth improving?

I think so. Ambiguity is bad, and the substituted representation is something of a fib, so it should not mimic something that is valid. eval(representation of recursive structure) should either correctly evaluate by re-creating the recursive structure represented* or it should raise an error.

* That would mean that the same expression should be valid in code also. An interesting idea, and a deep can of worms. I believe it would require that '. . .' or whatever be recognizable syntax but not a named object, as the latter would re-introduce the same ambiguity.

-- Terry Jan Reedy

Terry Reedy

10:07 p.m.

On 9/7/2012 5:57 PM, Terry Reedy wrote:

If not that, pick anything else giving a syntax error.

>>> [[,,,]]
SyntaxError: invalid syntax
>>> [[. . .]]
SyntaxError: invalid syntax

Or probably the simplest, just use 4 periods

>>> [[....]]
SyntaxError: invalid syntax

-- Terry Jan Reedy

Alexander Belopolsky

2:55 a.m.

On Fri, Sep 7, 2012 at 6:07 PM, Terry Reedy <tjreedy@udel.edu> wrote:

Or probably the simplest, just use 4 periods

>>> [[....]]

or two: [[..]]
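The markers proposed so far can be checked mechanically. The sketch below assumes Python 3; the helper name `is_evaluable` is hypothetical and not from the thread. It uses `compile` in "eval" mode so that nothing is actually executed.

```python
def is_evaluable(source):
    """Return True if source compiles as a Python expression."""
    try:
        compile(source, "<repr>", "eval")
    except SyntaxError:
        return False
    return True


# Current marker first, then the alternatives suggested in the thread.
candidates = ["[[...]]", "[[\u2026]]", "[[,,,]]", "[[. . .]]", "[[....]]", "[[..]]"]

results = {source: is_evaluable(source) for source in candidates}
for source, ok in results.items():
    print(f"{source!r}: {'valid expression' if ok else 'SyntaxError'}")
```

Only the current three-dot form should come out as a valid expression; each alternative is rejected by the parser, which is the property the proposals are after.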

Guido van Rossum

6:27 a.m.

Can someone explain what problem we are trying to solve? I fail to understand what's wrong with the current behavior...

-- Sent from Gmail Mobile

Terry Reedy

7:23 a.m.

On 9/8/2012 2:27 AM, Guido van Rossum wrote:

...

Can someone explain what problem we are trying to solve? I fail to understand what's wrong with the current behavior...

Pairs of different things have the same representation, making the representation ambiguous to both people and the interpreter. Moreover, the interpreter's guess is usually wrong.

In particular, the representations of recursive lists use what is now the Ellipsis literal '...', so they are also valid list displays for a non-recursive nested list containing Ellipsis. The interpreter always reads ... as the Ellipsis literal, which is nearly always not what is meant.

It would be trivial to tweak the representations of recursive lists so they are not valid list displays.

-- Terry Jan Reedy

Steven D'Aprano

8:06 a.m.

On 08/09/12 17:23, Terry Reedy wrote:

...

I'm not sure that you are right to assume that recursive lists are more common than lists containing Ellipsis. Neither are exactly common, and at least a few people use Ellipsis as a ready-made sentinel value that isn't None.

...

It would be trivial to tweak the representations of recursive lists so they are not valid list displays.

Ah, I had not realised that you wanted eval(repr(x)) to fail if x was recursive. That's more reasonable than expecting it to generate x.

Changing the repr of recursive lists will break doctests. And frankly, my aesthetic sense would be hurt if the repr of a recursive list used something other than ... for the part not displayed. An ellipsis is the right symbol to use when skipping part of the display, and an ellipsis is three dots, not two or four. A unicode … would be acceptable, except I understand that builtins must be ASCII.

I don't think this is genuinely enough of a problem that it needs fixing.

-- Steven
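To illustrate the doctest concern, here is a small sketch of a doctest whose expected output depends on the current three-dot marker. The function `make_recursive_list` is a hypothetical example, not code from the thread. With doctest's default options the expected output is matched literally, so a different marker in the repr would make this test fail.

```python
import doctest


def make_recursive_list():
    """Return a list that contains itself.

    >>> make_recursive_list()
    [[...]]
    """
    x = []
    x.append(x)
    return x


# Parse the docstring and run its single example directly.
parser = doctest.DocTestParser()
test = parser.get_doctest(
    make_recursive_list.__doc__,
    {"make_recursive_list": make_recursive_list},
    "make_recursive_list",
    None,
    0,
)
runner = doctest.DocTestRunner(verbose=False)
results = runner.run(test)
print(results)
```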

Steven D'Aprano

7:45 a.m.

On 08/09/12 16:27, Guido van Rossum wrote:

...

Can someone explain what problem we are trying to solve? I fail to understand what's wrong with the current behavior...

I believe that some people think that if you eval the repr of a recursive list, the result should be an equally recursive list. But it isn't:

py> x = [1, 2, 3]
py> x.append(x)
py> eval(repr(x)) == x
False

I think they are misguided in their expectation. There is no way to write a single expression using list literals which generates a recursive list, so why would you expect eval to produce one?

Furthermore, list reprs of recursive lists have been ambiguous for years. This code works identically in 2.4 and 3.2:

py> a = []; a.append(a)
py> b = []; b.append(b)
py> x = [[], []]; x[0].append(x); x[1].append(x)
py> y = [a, b]
py> x == y
False
py> repr(x) == repr(y)
True

eval(repr(x)) == x is not a guaranteed invariant, it is a "nice to have".

-1 on trying to fix this.

-- Steven

Nick Coghlan

8:16 a.m.

On Sat, Sep 8, 2012 at 5:45 PM, Steven D'Aprano <steve@pearwood.info> wrote:

...

No, the problem is that you get the *wrong answer* instead of an exception. Python 2:

...

Python 3:

...

As pointed out earlier, this is due to the fact that the previously illegal notation used to indicate the recursive reference is now valid syntax. The simplest fix is to just introduce alternative notation for the self-reference that will reintroduce the desired syntax error, such as "<...>" or "<self>".

Cheers, Nick.

-- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
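As a rough illustration of the "<...>" idea, the sketch below uses the standard library's `reprlib.recursive_repr` decorator on a list subclass. `MarkedList` is a hypothetical class invented for this example. It does not change the built-in list repr, which is what the thread is actually discussing.

```python
from reprlib import recursive_repr


class MarkedList(list):
    """Hypothetical list subclass that marks self-references with '<...>'."""

    @recursive_repr(fillvalue="<...>")
    def __repr__(self):
        # Re-entrant calls on the same object return the fillvalue instead.
        return "[" + ", ".join(repr(item) for item in self) + "]"


x = MarkedList()
x.append(x)
text = repr(x)
print(text)

# The marker is not valid syntax, so eval fails loudly instead of
# quietly building a different structure.
try:
    eval(text)
    raised = False
except SyntaxError:
    raised = True
print("SyntaxError raised:", raised)
```

With this marker a reader can still see where the recursion is, while evaluating the repr raises an exception rather than returning a wrong answer.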
