Proto-PEP part 4: The wonderful third option
Sorry, folks, but I've been busy the last few days--the Language Summit is Wednesday, and I had to pack and get myself to SLC for PyCon, &c. I'll circle back and read the messages on the existing threads tomorrow. But for now I wanted to post "the wonderful third option" for forward class definitions we've been batting around for a couple of days.

The fundamental tension in the proposal: we want to /allocate/ the object at "forward class" time so that everyone can take a reference to it, but we don't want to /initialize/ the class (e.g. run the class body) until "continue class" time. However, the class might have a metaclass with a custom __new__, which would be responsible for allocating the object, and that isn't run until after the "class body". How do we allocate the class object early while still supporting custom metaclass.__new__ calls?

So here's the wonderful third idea. I'm going to change the syntax and semantics a little, again because we were batting them around quite a bit, so I'm going to just show you our current thinking. The general shape of it is the same.

First, we have some sort of forward declaration of the class. I'm going to spell it like this:

    forward class C

just for clarity in the discussion. Note that this spelling is also viable:

    class C

That is, a "class" statement without parentheses or a colon. (This is analogous to how C++ does forward declarations of classes, and it was survivable for them.) Another viable spelling:

    C = ForwardClass()

This spelling is nice because it doesn't add new syntax. But maybe it's less obvious what is going on from a user's perspective.

Whichever spelling we use here, the key idea is that C is bound to a "ForwardClass" object. A "ForwardClass" object is /not/ a class, it's a forward declaration of a class. (I suspect ForwardClass is similar to a typing.ForwardRef, though I've never worked with those so I couldn't say for sure.) Anyway, all it really has is a name, and the promise that it might get turned into a class someday. To be explicit about it, "isinstance(C, type)" is False.

I'm also going to call instances of ForwardClass "immutable". C won't be immutable forever, but for now you're not permitted to set or change attributes of C.

Next we have the "continue" class statement. I'm going to spell it like this:

    continue class C(BaseClass, ..., metaclass=MyMetaclass):
        # class body goes here
        ...

I'll mention other possible spellings later. The first change I'll point out here: we've moved the base classes and the metaclass from the "forward" statement to the "continue" statement. Technically we could put them either place if we really cared to. But moving them here seems better, for reasons you'll see in a minute.

Other than that, this "continue class" statement is similar to what I (we) proposed before. For example, here C is an expression, not a name.

Now comes the one thing that we might call a "trick". The trick: when we allocate the ForwardClass instance C, we make it as big as a class object can ever get. (Mark Shannon assures me this is simply "heap type", and he knows far more about CPython internals than I ever will.) Then, when we get to the "continue class" statement, we convince the metaclass.__new__ call to reuse this memory, and preserve the reference count, but to change the type of the object to "type" (or what-have-you). C has now been changed from a "ForwardClass" object into a real type. (Which almost certainly means C is now mutable.)

These semantics let us preserve the entire existing class creation mechanism. We can call all the same externally-visible steps in the same externally-visible order. We don't add any new dunder methods, we don't remove any dunder methods, we don't expose a new dunder attribute for users to experiment with.

What mechanism do we use to achieve this? metaclass.__new__ always has to do one of these two things to create the class object: either it calls "super().__new__", or it calls what we usually call "three-argument type". In both cases, it passes through the **kwargs that it received into the super().__new__ call or the three-argument type call. So the "continue class C" statement will internally add a new kwarg: "__forward__ = C". If super().__new__ or three-argument type get this kwarg, they won't allocate a new object, they'll reuse C. They'll preserve the current reference count, but otherwise overwrite C with all the juicy vitamins and healthy minerals packed into a Python class object.

So, technically, this means we could spell the "continue class" step like so:

    class C(BaseClass, ..., metaclass=MyMetaclass, __forward__=C):
        ...

Which means that, combined with the "C = ForwardClass()" statement above, we could theoretically implement this idea without changing the syntax of the language. And since we already don't have to change the underlying semantics of Python class creation, the technical debt incurred by adding this to the language becomes much smaller.

What could go wrong? My biggest question so far: is there such a thing as a metaclass written in C, besides type itself? Are there metaclasses with a __new__ that /doesn't/ call super().__new__ or three-argument type? If there are metaclasses that allocate their own class objects out of raw bytes, they'd likely sidestep this entire process. I suspect this is rare, if indeed it has ever been done. Anyway, that'd break this mechanism, so exotic metaclasses like these wouldn't work with "forward-declared classes". But at least they needn't fail silently. We just need to add a guard after the call to metaclass.__new__: if we passed in "__forward__=C" into metaclass.__new__, and metaclass.__new__ didn't return C, we raise an exception.

Cheers,

//arry/

p.s. When I say "we" above, I generally mean Eric V. Smith, Barry Warsaw, Mark Shannon, and myself. But please assume that any dumb ideas in the proposal are mine, and I was too wrong-headed to listen to the sage advice from these three wise men when I wrote this email.
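The guard described at the end of the post can be modelled in today's Python, with heavy caveats. The following is only a sketch of the control flow: pure Python cannot overwrite an object in place, so nothing here reuses memory, and the names ForwardClass, SwallowingMeta, and continue_class are hypothetical stand-ins rather than anything in the proposal's implementation.

```python
class ForwardClass:
    """Hypothetical placeholder: it has a name and rejects attribute writes."""

    __slots__ = ("_name",)

    def __init__(self, name):
        # Bypass our own __setattr__ to store the one permitted field.
        object.__setattr__(self, "_name", name)

    def __setattr__(self, attr, value):
        raise AttributeError(f"forward class {self._name!r} is immutable")

    def __repr__(self):
        return f"<forward class {self._name}>"


class SwallowingMeta(type):
    """An uncooperative metaclass: it drops __forward__ instead of passing it on."""

    def __new__(mcs, name, bases, namespace, **kwargs):
        kwargs.pop("__forward__", None)
        return super().__new__(mcs, name, bases, namespace, **kwargs)


def continue_class(forward, name, bases, namespace, metaclass=type):
    """Model of the proposed guard (assumption: control flow only)."""
    result = metaclass(name, bases, namespace, __forward__=forward)
    if result is not forward:
        raise TypeError(
            f"{metaclass.__name__}.__new__ did not return the forward-declared object"
        )
    return result


C = ForwardClass("C")
try:
    continue_class(C, "C", (), {"x": 1}, metaclass=SwallowingMeta)
except TypeError as exc:
    guard_message = str(exc)  # the guard noticed a freshly allocated class
```

The demo metaclass pops the keyword because an unmodified type.__new__ forwards unknown class keywords to __init_subclass__, which rejects them; under the proposal, type.__new__ itself would consume "__forward__" and return C, so the guard would pass.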
On 26 Apr 2022, at 07:32, Larry Hastings <larry@hastings.org> wrote:
[… snip …]
Next we have the "continue" class statement. I'm going to spell it like this:
continue class C(BaseClass, ..., metaclass=MyMetaclass): # class body goes here ...
I'll mention other possible spellings later. The first change I'll point out here: we've moved the base classes and the metaclass from the "forward" statement to the "continue" statement. Technically we could put them either place if we really cared to. But moving them here seems better, for reasons you'll see in a minute.
Other than that, this "continue class" statement is similar to what I (we) proposed before. For example, here C is an expression, not a name.
Now comes the one thing that we might call a "trick". The trick: when we allocate the ForwardClass instance C, we make it as big as a class object can ever get. (Mark Shannon assures me this is simply "heap type", and he knows far more about CPython internals than I ever will.) Then, when we get to the "continue class" statement, we convince metaclass.__new__ call to reuse this memory, and preserve the reference count, but to change the type of the object to "type" (or what-have-you). C has now been changed from a "ForwardClass" object into a real type. (Which almost certainly means C is now mutable.)
A problem with this trick is that you don’t know how large a class object can get because a subclass of type might add new slots. This is currently not possible to do in Python code (non-empty ``__slots__`` in a type subclass is rejected at runtime), but you can do this in C code. Ronald — Twitter / micro.blog: @ronaldoussoren Blog: https://blog.ronaldoussoren.net/
On 4/25/22 23:56, Ronald Oussoren wrote:
A problem with this trick is that you don’t know how large a class object can get because a subclass of type might add new slots. This is currently not possible to do in Python code (non-empty ``__slots__`` in a type subclass is rejected at runtime), but you can do this in C code.
Dang it! __slots__! Always there to ruin your best-laid plans. *shakes fist at heavens* I admit I don't know how __slots__ is currently implemented, so I wasn't aware of this. However! The first part of my proto-PEP already proposes changing the implementation of __slots__, to allow adding __slots__ after the class is created but before it's instantiated. Since this is so late-binding, it means the slots wouldn't be allocated at the same time as the type, so happily we'd sidestep this problem. On the other hand, this raises the concern that we may need to change the C interface for creating __slots__, which might break C extensions that use it. (Maybe we can find a way to support the old API while permitting the new late-binding behavior, though from your description of the problem I'm kind of doubtful.) Cheers, //arry/
On 26 Apr 2022, at 20:52, Larry Hastings <larry@hastings.org> wrote:
[… snip …]
I used the term slots in a very loose way. In PyObjC I’m basically doing:

    typedef struct {
        PyHeapTypeObject base;
        /* Extra C fields go here */
    } PyObjCClassObject;

Those extra C fields don’t get exposed to Python, but could well be by using getset definitions. This has worked without problems since early in the 2.x release cycle (at least, that’s when I started doing this in PyObjC), and is how one subclasses other types as well.

“Real” __slots__ don’t work when subclassing type() because type is a var object. That’s “just” an implementation limitation, it should be possible to add slots after the variable length bit (he says while wildly waving his hands).

Ronald
--
Twitter / micro.blog: @ronaldoussoren
Blog: https://blog.ronaldoussoren.net/
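The Python-level side of this limitation can be observed directly. A minimal sketch, assuming CPython; the class name SlottedMeta is hypothetical and exists only for the demonstration.

```python
# Non-empty __slots__ in a subclass of type is rejected when the class
# statement executes, so the failure is caught around the definition itself.
try:
    class SlottedMeta(type):
        __slots__ = ("extra",)
except TypeError as exc:
    slots_error = str(exc)
else:
    slots_error = None

# type is a variable-sized ("var") object: its instances carry a
# variable-length tail, which shows up as a non-zero item size.
type_is_variable_sized = type.__itemsize__ > 0
```

This shows only what Python code can and cannot do; a metaclass defined in C, as in the struct above, can still enlarge the type object, which is the case that makes the maximum size unknowable in advance.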
I am traveling and have no keyboard right now, but it looks like this thread is confusing the slots that a type gives to its *instances* and extra slots in the type object itself. Only the latter are a problem.

I also would like to hear more about the problem this is trying to solve, with real-world examples. (E.g. from pydantic?)

On Tue, Apr 26, 2022 at 11:57 Larry Hastings <larry@hastings.org> wrote:
[… snip …]
Cheers,
*/arry* _______________________________________________ Python-Dev mailing list -- python-dev@python.org To unsubscribe send an email to python-dev-leave@python.org https://mail.python.org/mailman3/lists/python-dev.python.org/ Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/YU3PJKPM... Code of Conduct: http://python.org/psf/codeofconduct/
-- --Guido (mobile)
On Tue, Apr 26, 2022 at 1:25 PM Guido van Rossum <guido@python.org> wrote:
I also would like to hear more about the problem this is trying to solve, with real-world examples. (E.g. from pydantic?)
Yes please. I think these threads have jumped far too quickly into esoteric details of implementation and syntax, without critical analysis of whether the semantics of the proposal are in fact a good solution to a real-world problem that someone has.

I've already outlined in a more detailed reply on the first thread why I don't think forward declarations provide a practically useful solution to forward reference problems for users of static typing (every module that imports something that might be a forward reference would have to import its implementation also, turning every one-line import of that class into two or three lines), and why they cause new problems for every user of Python due to their reliance on import side effects causing global changes at a distance. See https://mail.python.org/archives/list/python-dev@python.org/message/NMCS77YF... for details.

Under PEP 649, forward references are a small problem confined to the edge case of early resolution of type annotations. There are simple and practical appropriately-scoped solutions easily available for that small problem: providing a way to resolve type annotations at runtime without raising NameError on not-yet-defined names. Such a facility (whether default or opt-in) is practically useful for many users of annotations (including dataclasses and documentation tools), which have a need to introspect some aspects of annotations without necessarily needing every part of the annotation to resolve. The existence of such a facility is a reasonable special case for annotations specifically, because annotations are fundamentally special: they provide a description of code, rather than being only a part of the code. (This special-ness is precisely also why they cause more forward references in the first place.)

IMO, this forward declaration proposal takes a small problem in a small corner of the language and turns it into a big problem for the whole language, without even providing as nice and usable an option for common use cases as "PEP 649 with option for lax resolution" does. This seems like a case study in theoretical purity ("resolution of names in annotations must not be special") running roughshod over practicality.

Carl
On 26/04/2022 20:48, Carl Meyer via Python-Dev wrote:
[… snip …]
Insofar as I understand the above (knowing almost nothing about typing), +1. Best wishes Rob Cliffe
FWIW, Carl presented a talk about his proposed way forward using PEP 649 with some small enhancements to handle cases like dataclasses (*), and it was well received by those present. I personally hope that this means the end of the "forward class declarations" proposals (no matter how wonderful), but the final word is up to the SC. (*) Mostly fixing the edge cases of the "eval __code__ with tweaked globals" hack that Carl came up with previously, see https://github.com/larryhastings/co_annotations/issues/2#issuecomment-109243... . -- --Guido van Rossum (python.org/~guido) *Pronouns: he/him **(why is my pronoun here?)* <http://feministing.com/2015/02/03/how-using-they-as-a-singular-pronoun-can-c...>
FWIW, I'm in agreement. My "forward class" proposal(s) were me trying to shine a light to find a way forward; I'm in no way adamant that we go that direction. If we can make 649 palatable without introducing forward declarations for classes, that's great! If in the future we discover more edge cases that Carl's approach doesn't easily solve, we could always revisit it later. For now it goes in the freezer of "ideas we aren't moving forward with". //arry/ On 4/29/22 19:08, Guido van Rossum wrote:
[… snip …]
Can someone state what's currently unpalatable about 649? It seemed to address the forward-referencing issues, certainly all of the cases I was expecting to encounter. On Sun, 2022-05-01 at 15:35 -0600, Larry Hastings wrote:
[… snip …]
On 5/1/22 15:44, Paul Bryan wrote:
Can someone state what's currently unpalatable about 649? It seemed to address the forward-referencing issues, certainly all of the cases I was expecting to encounter.
Carl's talk was excellent here; it would be lovely if he would chime in and reply. Here is my almost-certainly-faulty recollection of what he said.

* PEP 649 doesn't work for code bases that deliberately use un-evaluatable expressions but still examine them at runtime. Some code bases would have a major circular import problem if they had to import every module they use in annotations. By only importing those in "if TYPE_CHECKING" blocks, mypy (which inspects each module in isolation) can resolve the references, so it works fine at static analysis time. Occasionally they /also/ need to examine the annotation at runtime, but this is only a rudimentary check, so a string works fine for them. So 563 works, but 649 throws e.g. a NameError. Carl proposes a mitigation strategy here: run the co_annotations code object with a special globals dict set that creates fake objects instead of failing on lookups.
* PEP 649 is a pain point for libraries using annotations for documentation purposes. The annotation as written may be very readable, but evaluating it may turn into a very complicated object, and the repr() or str() of that object may be complicated and obscure the original intent. Carl proposes using much the same strategy here; also it /might/ work to use ast.unparse to pull the original expression out of the source code, though this seems like it would be less reliable.

That's everything I remember... but I was operating on two hours' sleep that day. You might also consult Brett's thread about finding edge cases in PEPs 484, 563, and 649:

https://discuss.python.org/t/finding-edge-cases-for-peps-484-563-and-649-typ...

Cheers,

//arry/
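The mitigation in the first bullet, evaluating annotations against a namespace that invents fake objects for unknown names, can be illustrated roughly as follows. This is a sketch under assumptions, not Carl's implementation: it evaluates a source string instead of running a co_annotations code object, and Placeholder, LaxNamespace, and lax_eval are hypothetical names.

```python
import builtins


class Placeholder:
    """Hypothetical fake object standing in for a name that is not defined."""

    def __init__(self, name):
        self.name = name

    def __repr__(self):
        return f"Placeholder({self.name!r})"


class LaxNamespace(dict):
    """Mapping that invents a Placeholder instead of raising on unknown names."""

    def __missing__(self, key):
        # Let real builtins (int, list, ...) resolve normally.
        if hasattr(builtins, key):
            return getattr(builtins, key)
        return Placeholder(key)


def lax_eval(expression, known_names=None):
    """Evaluate an annotation expression without NameError on unknown names."""
    # The lax mapping is passed as *locals*, where name lookup goes through
    # the mapping's __getitem__ and therefore honours __missing__.
    return eval(expression, {}, LaxNamespace(known_names or {}))


# NotImportedYet is deliberately undefined; it becomes a Placeholder.
resolved = lax_eval("list[NotImportedYet]")
```

A tool that only needs a rudimentary look at the annotation can then inspect the outer structure (here, a list generic) and treat any Placeholder as "unresolved" rather than failing outright.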
Hi Paul, On Sun, May 1, 2022 at 3:47 PM Paul Bryan <pbryan@anode.ca> wrote:
Can someone state what's currently unpalatable about 649? It seemed to address the forward-referencing issues, certainly all of the cases I was expecting to encounter.
Broadly speaking I think there are 3-4 issues to resolve as part of moving forward with PEP 649:

1) Class decorators (the most relevant being @dataclass) that need to inspect something about annotations, and because they run right after class definition, the laziness of PEP 649 is not sufficient to allow forward references to work. Roughly in a similar boat are `if TYPE_CHECKING` use cases where annotations reference names that aren't ever imported.

2) "Documentation" use cases (e.g. built-in "help()") that really prefer access to the original text of the annotation, not the repr() of the fully-evaluated object -- this is especially relevant if the annotation text is a nice short meaningful type alias name, and the actual value is some massive unreadable Union type.

3) Ensuring that we don't regress import performance too much.

4) A solid migration path from the status quo (where many people have already started adopting PEP 563) to the best future end state. Particularly for libraries that want to support the full range of supported Python versions.

Issues (1) and (2) can be resolved under PEP 649 by providing a way to run the __co_annotations__ function without erroring on not-yet-defined names, I think we have agreement on a plan there. Performance of the latest PEP 649 reference implementation does not look too bad relative to PEP 563 in my experiments, so I think this is not an issue -- there are ideas for how we could reduce the overhead even further. The migration path is maybe the most difficult issue -- specifically how to weigh "medium-term migration pain" (which under some proposals might last for years) vs "best long-term end state." Still working on reaching consensus there, but we have options to choose from.

Expect a more thorough proposal (probably in the form of an update to PEP 649?) sometime after PyCon.

Carl
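Issue (1) comes down to timing, which a toy model can show without depending on any particular Python version's annotation semantics. In this sketch a plain function stands in for the lazily evaluated annotations function, and eager_decorator stands in for a class decorator such as @dataclass; both names, and Node, are hypothetical.

```python
def annotate():
    """Stands in for a lazily evaluated annotations function."""
    return {"next_node": Node}


def eager_decorator(annotate_func):
    """Models a class decorator that resolves annotations immediately."""
    return annotate_func()


# The "decorator" runs before Node exists, so laziness alone does not help.
try:
    eager_decorator(annotate)
except NameError as exc:
    early_failure = str(exc)


class Node:
    pass


# The very same call succeeds once the forward-referenced name is defined.
late_result = annotate()
```

Deferring evaluation only helps consumers that wait; anything that inspects annotations right after the class body needs some form of lax resolution instead.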
Thanks, Carl and Larry for the explanations. On Sun, 2022-05-01 at 16:13 -0600, Carl Meyer wrote:
[… snip …]
Larry Hastings wrote:
[...]
Now comes the one thing that we might call a "trick". The trick: when we allocate the ForwardClass instance C, we make it as big as a class object can ever get. (Mark Shannon assures me this is simply "heap type", and he knows far more about CPython internals than I ever will.)
It's possible that I'm misunderstanding the allocation mechanism (and it sounds like you've discussed it with people that know a lot more about the internals than me), but if C is an instance of the metaclass then surely you have to know the metaclass to know this. And it looks like the metaclass can definitely be a C type (and thus have a C struct defining an instance). And presumably the metaclass could be a variable size C type (i.e. like tuple), so possibly not even known until the instance is made.

David
On Tue, Apr 26, 2022 at 4:04 AM <dw-git@d-woods.co.uk> wrote:
[… snip …]
This proposal will indeed overcome almost all of the concerns I raised earlier, if not all of them.

It is perfectly legal to have a custom metaclass that does not pass "**kwargs" to type.__new__, and I think most metaclasses don't: in general they do not expect "**kwargs" at all, so they will simply fail the first time such code is called, and could be updated at that point. Besides, there is no problem in keeping this compatible "pre" and "post" this PEP.

I had thought of that possibility before typing my other answers, and the major problem with in-place modification is that one can't know for sure the final size of the class due to "__slots__". Over-allocating a couple hundred bytes could make room for a reasonable number of slots, but it could simply raise a runtime exception if the final call would require more slots than this maximum size - so, I don't think this is a blocking concern.

As for
What could go wrong? My biggest question so far: is there such a thing as a metaclass written in C, besides type itself? Are there metaclasses with a __new__ that *doesn't* call super().__new__ or three-argument type? If there are metaclasses that allocate their own class objects out of raw bytes, they'd likely sidestep this entire process. I suspect this is rare, if indeed it has ever been done. Anyway, that'd break this mechanism, so exotic metaclasses like these wouldn't work with "forward-declared classes". But at least they needn't fail silently. We just need to add a guard after the call to metaclass.__new__: if we passed in "__forward__=C" into metaclass.__new__, and metaclass.__new__ didn't return C, we raise an exception.
That. There are some metaclasses. I did not check, but I do suspect even some of the builtin types like "list" and "tuple" do bypass "type.__new__" (they fail when used with multiple inheritance along with ABCs, in that they do not check for abstractmethods). This is fixable. Third-party "raw" metaclasses that redo the object structure could simply raise an error at runtime until, _and if ever desired_, they are rewritten to support this feature as you put it.

Even for metaclasses in pure Python, sometimes they won't resolve to an instance of the class itself (think of the behavior seen in pathlib.Path, which acts as a factory for a subclass) - I see no problem in these just not supporting this feature as well.

======================

With this in mind, the major concerns are those put by Carl Meyer on the "Part 1" thread, namely, that this might not be usable for static checking at all - https://mail.python.org/archives/list/python-dev@python.org/message/NMCS77YF...

And, of course, my suggestion that the problem this tries to resolve is already resolved by the use of typing.Protocol, as far as type-checking is concerned. Adding a way for a Protocol to be able to find its registered implementations and instantiate one of them when needed would solve this for "real forward referencing" as well.

All in all, I still think this complicates things too much for little gain - people needing real forward references could always find a way out, since Python 2 times.

And, while in my earlier e-mails I wrote that PEP 563 could also resolve this, I really was thinking about PEP 649 - although, I think, actually, either of the two could solve the problem annotation-wise. If needed for "real code" instead of annotations, extending Protocol so that it can find implementations could work as well.
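The typing.Protocol point can be illustrated for the type-checking half of the argument. This is a minimal sketch with hypothetical names (SupportsArea, total_area, Square); it does not cover the suggested extension for a Protocol to locate and instantiate its implementations, which is not an existing feature.

```python
from typing import Protocol, runtime_checkable


@runtime_checkable
class SupportsArea(Protocol):
    """Structural interface; concrete implementations need not exist yet."""

    def area(self) -> float: ...


def total_area(shapes: list[SupportsArea]) -> float:
    # Annotated against the protocol, so no forward reference to Square is needed.
    return sum(shape.area() for shape in shapes)


class Square:
    """Defined after its first use in an annotation context; no declaration needed."""

    def __init__(self, side: float) -> None:
        self.side = side

    def area(self) -> float:
        return self.side ** 2


total = total_area([Square(2.0), Square(3.0)])
```

Because the function is annotated with the protocol and not the concrete class, the order in which Square and total_area are defined no longer matters to a type checker.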
On 26 Apr 2022, at 07:32, Larry Hastings <larry@hastings.org> wrote:
[…]
What could go wrong? My biggest question so far: is there such a thing as a metaclass written in C, besides type itself? Are there metaclasses with a __new__ that doesn't call super().__new__ or three-argument type? If there are metaclasses that allocate their own class objects out of raw bytes, they'd likely sidestep this entire process. I suspect this is rare, if indeed it has ever been done. Anyway, that'd break this mechanism, so exotic metaclasses like these wouldn't work with "forward-declared classes". But at least they needn't fail silently. We just need to add a guard after the call to metaclass.__new__: if we passed in "__forward__=C" into metaclass.__new__, and metaclass.__new__ didn't return C, we raise an exception.
There are third-party metaclasses written in C. One example is PyObjC, which has metaclasses written in C, and those metaclasses create a type with additional entries in the C struct for the type. I haven’t yet tried to think about the impact of this proposal, other than the size of the type (as mentioned earlier). The PyObjC metaclass constructs both the Python class and a corresponding Objective-C class in lock step. At first glance this forward class proposal should not cause any problems here other than the size of the type object.
Ronald
--
Twitter / micro.blog: @ronaldoussoren
Blog: https://blog.ronaldoussoren.net/
On 2022-04-26 06:32, Larry Hastings wrote:
Sorry, folks, but I've been busy the last few days--the Language Summit is Wednesday, and I had to pack and get myself to SLC for PyCon, &c. I'll circle back and read the messages on the existing threads tomorrow. But for now I wanted to post "the wonderful third option" for forward class definitions we've been batting around for a couple of days.
The fundamental tension in the proposal: we want to /allocate/ the object at "forward class" time so that everyone can take a reference to it, but we don't want to /initialize/ the class (e.g. run the class body) until "continue class" time. However, the class might have a metaclass with a custom __new__, which would be responsible for allocating the object, and that isn't run until after the "class body". How do we allocate the class object early while still supporting custom metaclass.__new__ calls?
So here's the wonderful third idea. I'm going to change the syntax and semantics a little, again because we were batting them around quite a bit, so I'm going to just show you our current thinking.
The general shape of it is the same. First, we have some sort of forward declaration of the class. I'm going to spell it like this:
forward class C
just for clarity in the discussion. Note that this spelling is also viable:
class C
I don't like that because it looks like you've just forgotten the colon.
Perhaps:
class C: ...
[snip]
On 4/26/22 09:31, MRAB wrote:
On 2022-04-26 06:32, Larry Hastings wrote:
Note that this spelling is also viable:
class C
I don't like that because it looks like you've just forgotten the colon.
Perhaps:
class C: ...
That's not a good idea. Every other place in Python where there's a statement that ends in a colon, it's followed by a nested block of code. But the point of this statement is to forward-declare C, and this statement /does not have/ a class body. Putting a colon there is misleading.
Also, your suggestion is already legal Python syntax; it creates a class with no attributes. So changing this existing statement to mean something else would potentially (and I think likely) break existing code.
Consider C++'s forward-declared class statement:
class C;
You could say about that, "I don't like that because it looks like you've just forgotten the curly braces." But we didn't forget anything, it's just new syntax for a different statement.
//arry/
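To make the compatibility point concrete, here is a small check of what "class C: ..." does today. The variable name user_attrs is just for illustration; "no attributes" is read here as "no user-defined attributes", since every class still carries a few automatic dunder entries.

```python
class C: ...   # the body is a single Ellipsis expression, which does nothing

# Collect anything in the class namespace that is not an automatic dunder entry.
user_attrs = [name for name in vars(C) if not name.startswith("__")]

print(isinstance(C, type))  # True
print(user_attrs)           # []
```

So the statement already creates an ordinary, fully formed class, which is why giving it forward-declaration semantics would change the meaning of existing code.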
On Wed, 27 Apr 2022 at 05:05, Larry Hastings <larry@hastings.org> wrote:
On 4/26/22 09:31, MRAB wrote:
Perhaps:
class C: ...
Also, your suggestion is already legal Python syntax; it creates a class with no attributes. So changing this existing statement to mean something else would potentially (and I think likely) break existing code.
Not sure if it quite counts as "existing code", but I do often use this notation during development to indicate that this class will exist, but I haven't coded it yet. (In contrast, "class C: pass" indicates that an empty body is sufficient for this class, e.g. "class SpamException(Exception): pass", which needs no further work.)
If a less subtle distinction is needed, what about "class C = None"? That removes the expectation of a colon and body. Personally, I'm still inclined towards the kwarg method, though ("class C(forward=True): pass"), since it's legal syntax.
ChrisA
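As a quick check of the "legal syntax" point for the kwarg spelling: the statement parses with the current grammar, but executing it as-is fails today, because the default object.__init_subclass__ accepts no keyword arguments. The variable names below are illustrative only, and how "forward=True" would actually be consumed is left open by the thread.

```python
import ast

source = "class C(forward=True): pass"

# The current grammar accepts the statement: it parses to a ClassDef
# carrying one keyword argument.
tree = ast.parse(source)
keyword = tree.body[0].keywords[0]
parsed_kwarg = (keyword.arg, keyword.value.value)

# Executing it is another matter: with no metaclass or base class that
# consumes "forward", class creation raises TypeError.
try:
    exec(source, {"__name__": "demo"})
except TypeError as exc:
    error = exc
else:
    error = None

print(parsed_kwarg)
print(type(error).__name__)
```

In other words, no parser change would be needed for this spelling, only a change in how class creation treats the keyword.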
On Apr 25, 2022, at 22:32, Larry Hastings <larry@hastings.org> wrote:
The general shape of it is the same. First, we have some sort of forward declaration of the class. I'm going to spell it like this:
forward class C
just for clarity in the discussion. Note that this spelling is also viable:
class C
That is, a "class" statement without parentheses or a colon. (This is analogous to how C++ does forward declarations of classes, and it was survivable for them.) Another viable spelling:
C = ForwardClass()
I like this latter one exactly because as you say it doesn’t require any syntax changes. In the fine tradition of Python, we could certainly add syntactic sugar later, but I like that this can be implemented without it and we can see how the idea plays out in practice before committing to new syntax.
This spelling is nice because it doesn't add new syntax. But maybe it's less obvious what is going on from a user's perspective.
Whichever spelling we use here, the key idea is that C is bound to a "ForwardClass" object. A "ForwardClass" object is not a class, it's a forward declaration of a class. (I suspect ForwardClass is similar to a typing.ForwardRef, though I've never worked with those so I couldn't say for sure.)
I haven’t looked deeply at the code for ForwardRef, but I just cracked open typing.py and noticed this:
class ForwardRef(_Final, _root=True):
    """Internal wrapper to hold a forward reference."""
    __slots__ = ('__forward_arg__', '__forward_code__',
                 '__forward_evaluated__', '__forward_value__',
                 '__forward_is_argument__', '__forward_is_class__',
                 '__forward_module__')
So it seems that you’re almost there already!
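For comparison, this is roughly what an existing typing.ForwardRef gives you: an object that remembers a name and is not itself a class. It is only a small illustration of the similarity suggested above, not a claim that ForwardClass would be built on ForwardRef.

```python
from typing import ForwardRef

ref = ForwardRef("C")

print(ref.__forward_arg__)    # C
print(isinstance(ref, type))  # False
```

Like the proposed ForwardClass, the ForwardRef carries little more than the name; unlike it, a ForwardRef never turns into the class it refers to.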
So, technically, this means we could spell the "continue class" step like so:
class C(BaseClass, ..., metaclass=MyMetaClass, __forward__=C): ...
Again, nice! Look Ma, no syntax changes. I don’t know whether the __slots__ issue will break this idea but it is really … wonderful!
-Barry
On Mon, Apr 25, 2022 at 10:32:15PM -0700, Larry Hastings wrote: [...]
Whichever spelling we use here, the key idea is that C is bound to a "ForwardClass" object. A "ForwardClass" object is /not/ a class, it's a forward declaration of a class. (I suspect ForwardClass is similar to a typing.ForwardRef, though I've never worked with those so I couldn't say for sure.)
I know that the three devs working on this plan (Larry, Barry, Mark) are extremely competent, well-respected people, but this proto-PEP really feels to me like you aren't heavy users of typing and are proposing an extremely complex, complicated, problematic "solution" which even the typing people don't want.
Carl Meyer has repeatedly said that this proposal won't solve the issues that you have designed it to solve. I don't think you have responded to him either time, which is worrying.
https://mail.python.org/archives/list/python-dev@python.org/message/NMCS77YF...
https://mail.python.org/archives/list/python-dev@python.org/message/RVQSLD43...
Now Carl is only one person, and not necessarily representative, but Jelle Zijlstra is also heavily involved with mypy and typeshed and has expressed serious doubts about this proposal as well.
So unless somebody with solid experience in typing contradicts Carl and Jelle, we ought to take their comments as definitive and trust them that these "ForwardClass" objects won't solve the problem of forward declarations (or indeed that the problem doesn't really need solving beyond the two PEPs already on the table, plus stringified types).
I realise that you have invested a lot of time and energy into this proto-PEP, but if it fails to solve the forward declaration problem, and introduces problems of its own, then maybe we should not fall into the Sunk Costs fallacy by continuing to debate the best syntax and implementation for something that we don't need.
-- Steve
participants (12)
- Barry Warsaw
- Carl Meyer
- Chris Angelico
- dw-git@d-woods.co.uk
- Guido van Rossum
- Joao S. O. Bueno
- Larry Hastings
- MRAB
- Paul Bryan
- Rob Cliffe
- Ronald Oussoren
- Steven D'Aprano