__repr__ helper(s) in reprlib
Currently, the reprlib module [1] offers an "alternate repr() implementation", and focuses mainly on guarding the length of the returned string. I propose to broaden its scope slightly and make it the place to add helper functions for writing __repr__(), and to add one such function.

A dataclass automatically generates a __repr__ method (unless you ask it not to).

    @dataclasses.dataclass
    class Spam:
        quality: int
        recipe: str
        ham: complex

    >>> Spam(75, "<censored>", 1j)
    Spam(quality=75, recipe='<censored>', ham=1j)

For non-dataclass classes, it would be extremely helpful to have an easy helper function available:

    class Spam:
        def __repr__(self):
            return reprlib.kwargs(self, ["quality", "recipe", "ham"])

The implementation for this function would be very similar to what dataclasses already do:

    # in reprlib.py
    def kwargs(obj, attrs):
        attrs = [f"{a}={getattr(obj, a)!r}" for a in attrs]
        return f"{obj.__class__.__qualname__}({', '.join(attrs)})"

A similar function for positional args would be equally easy.

Bikeshedding opportunity: Should it be legal to omit the attrs parameter, and have it use __slots__ or fall back to dir(obj)?

ChrisA

[1] https://docs.python.org/3/library/reprlib.html
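For illustration only, here is a minimal sketch of the keyword helper next to the positional variant mentioned above. Neither function exists in reprlib today; `kwargs` follows the proposal, and `args` is an assumed name for the positional counterpart.

```python
# Hypothetical helpers; neither exists in the stdlib reprlib module.

def kwargs(obj, attrs):
    """Build a keyword-style repr from the named attributes."""
    parts = [f"{name}={getattr(obj, name)!r}" for name in attrs]
    return f"{type(obj).__qualname__}({', '.join(parts)})"


def args(obj, attrs):
    """Build a positional-style repr from the named attributes."""
    parts = [repr(getattr(obj, name)) for name in attrs]
    return f"{type(obj).__qualname__}({', '.join(parts)})"


class Spam:
    def __init__(self, quality, recipe, ham):
        self.quality = quality
        self.recipe = recipe
        self.ham = ham

    def __repr__(self):
        return kwargs(self, ["quality", "recipe", "ham"])


spam = Spam(75, "<censored>", 1j)
print(repr(spam))                                # keyword form
print(args(spam, ["quality", "recipe", "ham"]))  # positional form
```

The positional form only round-trips if the attribute list is given in the same order as the constructor parameters, which is one more thing the caller has to keep in sync by hand.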
On Jan 21, 2020, at 12:29, Chris Angelico <rosuav@gmail.com> wrote:
For non-dataclass classes, it would be extremely helpful to have an easy helper function available:
    class Spam:
        def __repr__(self):
            return reprlib.kwargs(self, ["quality", "recipe", "ham"])

The implementation for this function would be very similar to what dataclasses already do:

    # in reprlib.py
    def kwargs(obj, attrs):
        attrs = [f"{a}={getattr(obj, a)!r}" for a in attrs]
        return f"{obj.__class__.__qualname__}({', '.join(attrs)})"
A similar function for positional args would be equally easy.
I like this, but I think it’s still more complex than it needs to be for 80% of the cases (see below), while for the other 20%, I think it might make it too easy to get things wrong. Usually, the repr is either something you could type into the REPL to get an equal object, or something that’s a SyntaxError (usually because it’s in angle brackets). There are exceptions to that (like overlong or self-recursive containers), but as a general rule it’s true. And there’s no check here that the args in __repr__ are the same ones as in __new__/__init__, so it might be way too easy (especially when modifying code) to produce something that looks like a valid repr but isn’t—and may not even produce an error (e.g., you added a new constructor param with a default value, and didn’t add it to kwargs, so now every repr gives you a repr for something with the default value rather than the actual value). Maybe if there were a way to specify init and repr or new and repr together… but at that point you might as well use dataclasses, right?
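The failure mode described here can be reproduced with a small sketch. The `kwargs` helper below is the hypothetical one from the proposal, and the `__eq__` method is added only so the round trip can be checked.

```python
# Hypothetical helper following the proposal; not part of the stdlib.
def kwargs(obj, attrs):
    parts = [f"{name}={getattr(obj, name)!r}" for name in attrs]
    return f"{type(obj).__qualname__}({', '.join(parts)})"


class Spam:
    # "ham" was added later, with a default value...
    def __init__(self, quality, ham=0j):
        self.quality = quality
        self.ham = ham

    def __repr__(self):
        # ...but the attribute list was never updated to include it.
        return kwargs(self, ["quality"])

    def __eq__(self, other):
        return (self.quality, self.ham) == (other.quality, other.ham)


original = Spam(75, ham=1j)
text = repr(original)        # looks like a valid constructor call
rebuilt = eval(text)         # evaluates without any error
print(text)
print(rebuilt == original)   # False: ham silently fell back to its default
```

Nothing raises at any point, which is what makes this kind of stale repr hard to notice.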
Bikeshedding opportunity: Should it be legal to omit the attrs parameter, and have it use __slots__ or fall back to dir(obj) ?
This makes things even more dangerous. It’s very common for classes to have attributes that aren’t part of the constructor call, or constructor params that aren’t attributes, and this would give you the wrong answer.

However, it might be useful to have a dump-all-attributes function that gave you an angle-bracket repr a la file objects, something like:

    attrstr = ' '.join(
        [f"{obj.__class__.__qualname__} object at {id(obj):x}"]
        + [f"{attr} = {getattr(obj, attr)!r}" for attr in type(obj).__slots__])
    return f"<{attrstr}>"

It would be nice if there were a safe way to get the constructor-call-style repr. And I think there might be for 80% of the types, and the rest can specify it manually and take the risk of getting it wrong, probably.

One option is the pickle/copy protocol. If the type uses one of the newargs methods, you can use that to get the constructor arguments; if it uses one of the other pickling methods (or can’t be pickled), this just doesn’t work.

You could also look at the inspect.signature of __init__ and/or __new__. If every param has an attribute with the same name, use that; otherwise, this doesn’t work.

And if none of the automatic ways worked and you tried to use them anyway, you get an error.

But it would be nice if this error were at class-defining time rather than at repr-calling time, so maybe a decorator is actually a better solution?

    @reprlib.defaultrepr
    class Spam:
        ...
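One way the decorator idea could look, as a rough sketch rather than a proposed implementation: `defaultrepr` is a hypothetical name (reprlib has no such function), and the sketch assumes every parameter of `__init__` is stored on the instance under the same name. It ignores `__new__` and positional-only parameters. Only the variadic-parameter check can happen at class-definition time; a missing attribute still surfaces when repr() is called.

```python
import inspect


def defaultrepr(cls):
    """Hypothetical class decorator; reprlib has no such function."""
    # Skip the first parameter (self) of the unbound __init__.
    params = list(inspect.signature(cls.__init__).parameters.values())[1:]
    for p in params:
        # *args / **kwargs cannot be mapped to a same-named attribute,
        # so refuse at class-definition time instead of at repr() time.
        if p.kind in (p.VAR_POSITIONAL, p.VAR_KEYWORD):
            raise TypeError(
                f"cannot derive a repr for {cls.__qualname__}: "
                f"parameter {p.name!r} is variadic")
    names = [p.name for p in params]

    def __repr__(self):
        # Assumes each constructor parameter is stored under the same name;
        # a missing attribute raises AttributeError here.
        parts = [f"{n}={getattr(self, n)!r}" for n in names]
        return f"{type(self).__qualname__}({', '.join(parts)})"

    cls.__repr__ = __repr__
    return cls


@defaultrepr
class Spam:
    def __init__(self, quality, recipe="<censored>"):
        self.quality = quality
        self.recipe = recipe


print(repr(Spam(75)))

try:
    @defaultrepr
    class Eggs:
        def __init__(self, *args):
            self.args = args
except TypeError as exc:
    print("rejected at class-definition time:", exc)
```

Because the names come from the signature, adding a new constructor parameter updates the repr automatically, as long as the attribute keeps the parameter's name.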
On Wed, Jan 22, 2020 at 9:17 AM Andrew Barnert <abarnert@yahoo.com> wrote:
On Jan 21, 2020, at 12:29, Chris Angelico <rosuav@gmail.com> wrote:
For non-dataclass classes, it would be extremely helpful to have an easy helper function available:
    class Spam:
        def __repr__(self):
            return reprlib.kwargs(self, ["quality", "recipe", "ham"])

The implementation for this function would be very similar to what dataclasses already do:

    # in reprlib.py
    def kwargs(obj, attrs):
        attrs = [f"{a}={getattr(obj, a)!r}" for a in attrs]
        return f"{obj.__class__.__qualname__}({', '.join(attrs)})"
A similar function for positional args would be equally easy.
I like this, but I think it’s still more complex than it needs to be for 80% of the cases (see below)
IMO that's not a problem. The implementation of reprlib.kwargs() is allowed to be complex, since it's buried away as, well, implementation details. As long as it's easy to call, that's all that matters.
while for the other 20%, I think it might make it too easy to get things wrong.
Hmm. I kept it completely explicit - it will generate a repr that shows the exact attributes listed (and personally, I'd often write it as "quality recipe ham".split()), and left the idea of automatic detection as a bikesheddable option.
Bikeshedding opportunity: Should it be legal to omit the attrs parameter, and have it use __slots__ or fall back to dir(obj) ?
This makes things even more dangerous. It’s very common for classes to have attributes that aren’t part of the constructor call, or constructor params that aren’t attributes, and this would give you the wrong answer.
It would be nice if there were a safe way to get the constructor-call-style repr. And I think there might be for 80% of the types—and the rest can specify it manually and take the risk of getting it wrong, probably.
One option is the pickle/copy protocol. If the type uses one of the newargs methods, you can use that to get the constructor arguments; if it uses one of the other pickling methods (or can’t be pickled), this just doesn’t work.
You could also look at the inspect.signature of __init__ and/or __new__. If every param has an attribute with the same name, use that; otherwise, this doesn’t work.
And if none of the automatic ways worked and you tried to use them anyway, you get an error.
This is definitely getting into the realm of magic. The question is, is it worth the convenience? I'd be +0.25 for the magic. Also, that can always be added in the future. If the explicit-list-of-attributes form is useful but too fiddly, it would be possible to add a stated default of "figure it out by magic", and then the exact magic can be redefined at will.
But it would be nice if this error were at class-defining time rather than at repr-calling time, so maybe a decorator is actually a better solution?
    @reprlib.defaultrepr
    class Spam:
        ...
Also a definite possibility. ChrisA
On Jan 21, 2020, at 14:30, Chris Angelico <rosuav@gmail.com> wrote:
On Wed, Jan 22, 2020 at 9:17 AM Andrew Barnert <abarnert@yahoo.com> wrote:
On Jan 21, 2020, at 12:29, Chris Angelico <rosuav@gmail.com> wrote:
For non-dataclass classes, it would be extremely helpful to have an easy helper function available:
    class Spam:
        def __repr__(self):
            return reprlib.kwargs(self, ["quality", "recipe", "ham"])

The implementation for this function would be very similar to what dataclasses already do:

    # in reprlib.py
    def kwargs(obj, attrs):
        attrs = [f"{a}={getattr(obj, a)!r}" for a in attrs]
        return f"{obj.__class__.__qualname__}({', '.join(attrs)})"
A similar function for positional args would be equally easy.
I like this, but I think it’s still more complex than it needs to be for 80% of the cases (see below)
IMO that's not a problem. The implementation of reprlib.kwargs() is allowed to be complex, since it's buried away as, well, implementation details. As long as it's easy to call, that's all that matters.
Sorry, I didn’t mean the implementation, I meant the interface. You shouldn’t have to specify the names in simple cases, and there are no correct names to specify in more complicated cases, and the range in which names are useful but also necessary seems pretty narrow.

Even this example can’t work with your kwargs function:

    class Spam:
        def __init__(self, x):
            self._x = x
        def __repr__(self):
            return kwargs(self, '_x'.split())  # or 'x'.split()

You want the repr to be `Spam(x=3)`, but there’s no way to get that. If you use _x you get `Spam(_x=3)`, which is wrong; if you use x you get an AttributeError, which nobody wants from repr.
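Both outcomes can be checked directly. The sketch below reuses the hypothetical `kwargs` helper from the proposal (it is not part of the stdlib) and calls it from outside the class so each attribute list can be tried in turn.

```python
# Hypothetical helper following the proposal; not part of the stdlib.
def kwargs(obj, attrs):
    parts = [f"{name}={getattr(obj, name)!r}" for name in attrs]
    return f"{type(obj).__qualname__}({', '.join(parts)})"


class Spam:
    def __init__(self, x):
        self._x = x  # stored under a private name, unlike the parameter


spam = Spam(3)

# Using the attribute name yields a string that is not a valid constructor call.
print(kwargs(spam, ["_x"]))

# Using the parameter name fails, because no attribute is called "x".
try:
    kwargs(spam, ["x"])
except AttributeError as exc:
    print("AttributeError:", exc)
```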
while for the other 20%, I think it might make it too easy to get things wrong.
Hmm. I kept it completely explicit - it will generate a repr that shows the exact attributes listed (and personally, I'd often write it as "quality recipe ham".split()),
Right, but that’s exactly what makes it easy to get wrong. If I add a new param and attribute with a default value and don’t remember to also add it to the repr, I’m now silently generating reprs that look right but aren’t. If I have an attribute that’s computed and include it in repr by accident, likewise.
Bikeshedding opportunity: Should it be legal to omit the attrs parameter, and have it use __slots__ or fall back to dir(obj) ?
This makes things even more dangerous. It’s very common for classes to have attributes that aren’t part of the constructor call, or constructor params that aren’t attributes, and this would give you the wrong answer.
It would be nice if there were a safe way to get the constructor-call-style repr. And I think there might be for 80% of the types—and the rest can specify it manually and take the risk of getting it wrong, probably.
One option is the pickle/copy protocol. If the type uses one of the newargs methods, you can use that to get the constructor arguments; if it uses one of the other pickling methods (or can’t be pickled), this just doesn’t work.
You could also look at the inspect.signature of __init__ and/or __new__. If every param has an attribute with the same name, use that; otherwise, this doesn’t work.
And if none of the automatic ways worked and you tried to use them anyway, you get an error.
This is definitely getting into the realm of magic.
I don’t think getnewargs is any more magical than dir is—and it’s correct, and ties into a protocol already used for two other things in Python.
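As a rough sketch of the pickle/copy-protocol approach: the helper name `repr_from_newargs` is made up for illustration, while `__getnewargs__` and `__getnewargs_ex__` are the existing protocol methods. The sketch assumes the type defines one of them and gives up with an error otherwise.

```python
def repr_from_newargs(obj):
    """Hypothetical helper: build a repr from the pickle/copy protocol."""
    cls = type(obj)
    if hasattr(cls, "__getnewargs_ex__"):
        args, kwargs = obj.__getnewargs_ex__()
    elif hasattr(cls, "__getnewargs__"):
        args, kwargs = obj.__getnewargs__(), {}
    else:
        # The type pickles some other way (or not at all): fail loudly.
        raise TypeError(f"{cls.__qualname__} defines no __getnewargs__ method")
    parts = [repr(a) for a in args]
    parts += [f"{k}={v!r}" for k, v in kwargs.items()]
    return f"{cls.__qualname__}({', '.join(parts)})"


class Point:
    def __new__(cls, x, y):
        self = super().__new__(cls)
        self._x, self._y = x, y  # private names, unlike the parameters
        return self

    def __getnewargs__(self):
        return (self._x, self._y)

    def __repr__(self):
        return repr_from_newargs(self)


print(repr(Point(1, 2)))
```

Since the arguments come from the object itself rather than from attribute names, the private-attribute problem does not arise, but only types that already implement the protocol benefit.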
The question is, is it worth the convenience? I'd be +0.25 for the magic.
I think it is. The real problem that makes people not bother to write proper reprs is not wanting to list all the damn parameters that you already listed three times each in init a fourth time. It’s not like it’s hard to write the f-string, it’s just tedious. And easy to get wrong. And without magic, this only makes both problems a little better instead of actually solving them—you still have to list all those names a fourth time and you can still easily get it wrong.
Also, that can always be added in the future. If the explicit-list-of-attributes form is useful but too fiddly, it would be possible to add a stated default of "figure it out by magic", and then the exact magic can be redefined at will.
I don’t think you can document something as “magic” and redefine it from version to version in Python without worrying about breaking lots of existing code. But that’s a good argument for putting this on PyPI first, where we can redefine it at will, and when we finally get it right (assuming people are using it, and there are no popular alternatives), then it can go into the stdlib to die as part of reprlib forever.
I really like this feature suggestion and I would definitely use it if it were available. It might be easiest to have the repr helper do as much as it can, but give the user the flexibility to override things with varying levels of precision. For example:

    from typing import Any, Collection, Mapping, Optional

    def kwargs(obj,
               keys: Optional[Collection[str]] = None,
               keys_values: Optional[Mapping[str, Any]] = None,
               *,
               qualified=True):
        if keys is None and keys_values is None:
            pass  # Fallback?
        keys_strings = (
            [] if keys is None else
            [(k, repr(getattr(obj, k))) for k in keys])
        keys_strings += (
            [] if keys_values is None else
            [(k, repr(v)) for k, v in keys_values.items()])
        class_name = (obj.__class__.__qualname__
                      if qualified else
                      obj.__class__.__name__)
        parameters = ", ".join(
            f"{k}={v}" for k, v in keys_strings)
        return f"{class_name}({parameters})"

On Tuesday, January 21, 2020 at 10:33:43 PM UTC-5, Andrew Barnert via Python-ideas wrote:
On Jan 21, 2020, at 14:30, Chris Angelico <ros...@gmail.com> wrote:
On Wed, Jan 22, 2020 at 9:17 AM Andrew Barnert <abar...@yahoo.com> wrote:
On Jan 21, 2020, at 12:29, Chris Angelico <ros...@gmail.com> wrote:
For non-dataclass classes, it would be extremely helpful to have an easy helper function available:
    class Spam:
        def __repr__(self):
            return reprlib.kwargs(self, ["quality", "recipe", "ham"])

The implementation for this function would be very similar to what dataclasses already do:

    # in reprlib.py
    def kwargs(obj, attrs):
        attrs = [f"{a}={getattr(obj, a)!r}" for a in attrs]
        return f"{obj.__class__.__qualname__}({', '.join(attrs)})"
A similar function for positional args would be equally easy.
I like this, but I think it’s still more complex than it needs to be for 80% of the cases (see below)
IMO that's not a problem. The implementation of reprlib.kwargs() is allowed to be complex, since it's buried away as, well, implementation details. As long as it's easy to call, that's all that matters.
Sorry, I didn’t mean the implementation, I meant the interface. You shouldn’t have to specify the names in simple cases, and there are no correct names to specify in more complicated cases, and the range in which names are useful but also necessary seems pretty narrow.
Even this example can’t work with your kwargs function:
    class Spam:
        def __init__(self, x):
            self._x = x
        def __repr__(self):
            return kwargs(self, '_x'.split())  # or 'x'.split()
You want the repr to be `Spam(x=3)`, but there’s no way to get that. If you use _x you get `Spam(_x=3)`, which is wrong; if you use x you get an AttributeError, which nobody wants from repr.
while for the other 20%, I think it might make it too easy to get things wrong.
Hmm. I kept it completely explicit - it will generate a repr that shows the exact attributes listed (and personally, I'd often write it as "quality recipe ham".split()),
Right, but that’s exactly what makes it easy to get wrong. If I add a new param and attribute with a default value and don’t remember to also add it to the repr, I’m now silently generating reprs that look right but aren’t. If I have an attribute that’s computed and include it in repr by accident, likewise.
Bikeshedding opportunity: Should it be legal to omit the attrs parameter, and have it use __slots__ or fall back to dir(obj) ?
This makes things even more dangerous. It’s very common for classes to have attributes that aren’t part of the constructor call, or constructor params that aren’t attributes, and this would give you the wrong answer.
It would be nice if there were a safe way to get the constructor-call-style repr. And I think there might be for 80% of the types—and the rest can specify it manually and take the risk of getting it wrong, probably.
One option is the pickle/copy protocol. If the type uses one of the newargs methods, you can use that to get the constructor arguments; if it uses one of the other pickling methods (or can’t be pickled), this just doesn’t work.
You could also look at the inspect.signature of __init__ and/or __new__. If every param has an attribute with the same name, use that; otherwise, this doesn’t work.
And if none of the automatic ways worked and you tried to use them anyway, you get an error.
This is definitely getting into the realm of magic.
I don’t think getnewargs is any more magical than dir is—and it’s correct, and ties into a protocol already used for two other things in Python.
The question is, is it worth the convenience? I'd be +0.25 for the magic.
I think it is. The real problem that makes people not bother to write proper reprs is not wanting to list all the damn parameters that you already listed three times each in init a fourth time. It’s not like it’s hard to write the f-string, it’s just tedious. And easy to get wrong. And without magic, this only makes both problems a little better instead of actually solving them—you still have to list all those names a fourth time and you can still easily get it wrong.
Also, that can always be added in the future. If the explicit-list-of-attributes form is useful but too fiddly, it would be possible to add a stated default of "figure it out by magic", and then the exact magic can be redefined at will.
I don’t think you can document something as “magic” and redefine it from version to version in Python without worrying about breaking lots of existing code.
But that’s a good argument for putting this on PyPI first, where we can redefine it at will, and when we finally get it right (assuming people are using it, and there are no popular alternatives), then it can go into the stdlib to die as part of reprlib forever.
participants (3)
- Andrew Barnert
- Chris Angelico
- Neil Girdhar