[Python-ideas] Round-trippable repr for everything

Andrew Barnert abarnert at yahoo.com
Wed Feb 10 16:04:13 EST 2016


On Wednesday, February 10, 2016 12:29 PM, Random832 <random832 at fastmail.com> wrote:

> On Tue, Feb 9, 2016, at 03:56, Cory Benfield wrote:
>>  I think the reality is that there is no constraint on the representation
>>  of arbitrary types to be round-trippable in any way. Again, all custom
>>  types have non-round-trippable representations by default, many more
>>  eclectic built-in types have non-round-trippable representations (in
>>  addition to NaN, the memoryview object leaps to mind).
> 
> One other example is classes and functions though in many cases it's not
> clear why this should be the case. Have the default for top-level
> functions and classes check whether it's reachable through
> [module].[name] and if so return that. The default for methods could use
> [class].[name], or [obj].[name] for bound methods.
> 
> Instead of the current representation, you could have the default repr
> on objects use pickle.

That seems like a bad idea.

First, as a human programmer, the repr "[1, 2, (3, 'spam')]" means something to me--and the same thing to me as to Python. The repr "unrepr('silly.C object at 0x106ec2b38', b'\x80\x03csilly\nC\nq\x00)\x81q\x01}q\x02X\x01\x00\x00\x00xq\x03K\x02sb.')" means less to me than the current "<silly.C object at 0x106ec2b38>", while taking up a lot more space.
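To make the size and readability difference concrete, here is a rough sketch. Since silly.C is only a made-up example, it uses fractions.Fraction from the stdlib as a stand-in (my choice, nothing special about it); the exact pickle bytes depend on the Python version and protocol.

```python
import pickle
from fractions import Fraction

# Stand-in for an instance of the hypothetical silly.C.
value = Fraction(1, 3)

# What a human-oriented repr looks like.
readable = repr(value)

# What you would have to read if the repr embedded a pickle:
# the repr of a bytes object full of pickle opcodes.
pickled = repr(pickle.dumps(value))

print(readable)
print(pickled)
print(len(readable), "vs", len(pickled), "characters")
```

The second line printed carries the same information as the first, but only the first is something you can read at a glance.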

Meanwhile, if I've changed the silly module since printing out the repr, that pickle will fail--or, worse, and more likely, succeed but give me the wrong value. And, since eval'ing reprs is generally something you do when experimenting or when debugging code that you're actively working on, this will be very common.
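Here is a small sketch of the "succeeds but gives the wrong value" case. It builds a throwaway module named silly in memory so it runs standalone; the class layout and the attribute names x and value are made up for illustration.

```python
import pickle
import sys
import types

# Create an in-memory module named "silly" so pickle can find silly.C.
silly = types.ModuleType("silly")
sys.modules["silly"] = silly

OLD_SOURCE = """
class C:
    def __init__(self, x):
        self.x = x
"""

NEW_SOURCE = """
class C:
    def __init__(self, value):
        self.value = value
"""

# Define the original class and pickle an instance of it.
exec(OLD_SOURCE, silly.__dict__)
payload = pickle.dumps(silly.C(2))

# "I've changed the silly module since printing out the repr."
exec(NEW_SOURCE, silly.__dict__)

# Unpickling raises no error...
restored = pickle.loads(payload)

# ...but the result is an instance of the new class carrying the old state.
print(type(restored) is silly.C)
print(vars(restored))
print(hasattr(restored, "value"))
```

Unpickling does not call __init__, so nothing notices that the stored state no longer matches what the new code expects.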

Meanwhile, eval("[1, 2, (3, 'spam')]") == [1, 2, (3, 'spam')], and that's true for most containers and "value-type" objects, which tend to be the kinds of things that have round-trippable reprs today. That probably won't be true for instances of arbitrary types whose author never thought about the repr.
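A quick sketch of that difference. I'm using a bare object() as the simplest possible "arbitrary type", and a plain pickle round trip as a stand-in for what eval'ing a pickle-based repr would give you.

```python
import pickle

# A container of value-type objects: the repr evaluates to an equal value.
items = [1, 2, (3, 'spam')]
list_round_trips = eval(repr(items)) == items

# An object with default (identity-based) equality: even a successful
# pickle round trip produces something that does not compare equal.
plain = object()
restored = pickle.loads(pickle.dumps(plain))
object_round_trips = restored == plain

print(list_round_trips)
print(object_round_trips)
```

So for a type like this you would get an object back, but not one that equals the original, which is not much of a round trip.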

Finally, when I paste a repr into the REPL or into my source and it needs further qualification, as with datetime or Decimal, I can almost always figure this out pretty easily. And I would much rather have to figure it out and then paste "decimal.Decimal('123.456')" into my source than have the computer figure it out and paste "__import__('decimal').Decimal('123.456')" or something with a pickle in the middle of it into my source.
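For example, this is roughly what the "figure it out" step looks like with Decimal. The empty globals dict passed to eval just simulates a fresh session where nothing has been imported yet.

```python
import decimal

# The repr does not include the module name.
text = repr(decimal.Decimal('123.456'))
print(text)

# Pasted as-is into a scope without Decimal, it fails with a NameError.
try:
    eval(text, {})
except NameError as exc:
    needs_qualification = True
    print("NameError:", exc)
else:
    needs_qualification = False

# Qualifying it by hand is a one-word fix.
qualified = "decimal." + text
value = eval(qualified, {"decimal": decimal})
print(value == decimal.Decimal('123.456'))
```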


It's true that pasting reprs is just something that tends to work when you'd expect it to today, not something guaranteed--but since repr is used for exploration and debugging by humans, not for automated processing by scripts, "tends to work" tends to be good enough. So there really isn't a problem here. And your suggestion doesn't really help that use anyway; the only use it helps is using repr as an automated serialization format, which we explicitly _don't_ want to help.

If you want automated serialization, just use pickle. There's no reason to twist repr and eval to support that use case almost but not quite as well (and even less securely and efficiently). If you want something more readable, use something like YAML with a custom type library, or jsonpickle, etc. Or, if you think we need a more human-readable pickle format, that might be an interesting idea, but there's no reason to twist repr into it, or to force it to be readable by eval instead of loads.
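As a sketch of what "just use pickle" means in practice, with a made-up record (the field names are only for illustration):

```python
import datetime
import decimal
import pickle

# A hypothetical record mixing types whose reprs need qualification.
record = {
    "price": decimal.Decimal("123.456"),
    "when": datetime.datetime(2016, 2, 10, 16, 4, 13),
    "tags": ("spam", "eggs"),
}

# pickle handles the whole thing without any help.
payload = pickle.dumps(record)
restored = pickle.loads(payload)
pickle_ok = restored == record

# eval(repr(...)) needs the right names in scope before it works at all.
try:
    eval(repr(record), {})
except NameError:
    repr_ok = False
else:
    repr_ok = True

print(pickle_ok, repr_ok)
```

The repr of the record is still the nicer thing to look at while debugging; it just isn't the tool for moving data between programs.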

