Re: Supporting dataclass-like semantics

Hello from the PyCharm team!

While reading your suggestion, I realized that there is one technical difficulty for PyCharm. To provide quick code insight (inspections, completion and so on), PyCharm has to build reduced syntax trees. Access to them is faster because some information is omitted (e.g. assigned values, function bodies) and they don't require any parsing, as the original sources do. This is done during the well-known indexing stage, which processes every file independently, with no ability to resolve references into other files.

With the current proposal, it would be difficult to process dataclasses correctly in one case. I'll comment on a couple of examples to shed some light on PyCharm internals:

```
# everything is fine here,
# `id` and `name` are saved as class-level attributes;
# after indexing is finished, PyCharm is able to figure out that `create_model` describes a dataclass
# because resolve becomes available;
# same for the example with ModelBase
@create_model
class CustomerModel:
    id: int
    name: str
```

Next sample:

```
# since resolve is not available during indexing,
# there is no way to determine if `CustomerModel` is a dataclass
# and hence no way to protect `ModelField(default=0)` from being skipped (I've mentioned above that assigned values are omitted)
@create_model(init=False)
class CustomerModel:
    id: int = ModelField(default=0)
    name: str
```

A possible solution for PyCharm would be to save assigned values for all class-level attributes (making the indexes heavier), or to have some predefined names/heuristics to filter out class attributes that cannot describe a dataclass field.

I wouldn't say it is a major issue, but please keep it in mind. I'll be glad to discuss.

--
Semyon Proshev, PyCharm
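To make the indexing trade-off above concrete, here is a minimal sketch of a per-file index built with Python's standard `ast` module. It is not PyCharm's implementation: the function `index_class_attributes` and its `keep_call_values` flag are hypothetical names introduced only for illustration, and "keep the value only when it is a call" is an assumed stand-in for the kind of heuristic mentioned in the message.

```python
import ast

# The second sample from the message, as it would appear in a single file.
SOURCE = '''
@create_model(init=False)
class CustomerModel:
    id: int = ModelField(default=0)
    name: str
'''


def index_class_attributes(source, keep_call_values=False):
    """Hypothetical per-file index: {class: {attr: (annotation, value)}}.

    No names are resolved, so the indexer cannot know whether
    `create_model` marks a dataclass. By default assigned values are
    dropped; with keep_call_values=True, values that are call
    expressions are kept as source text (an assumed heuristic).
    """
    index = {}
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.ClassDef):
            continue
        attrs = {}
        for stmt in node.body:
            # Only annotated class-level names are recorded.
            if isinstance(stmt, ast.AnnAssign) and isinstance(stmt.target, ast.Name):
                value = None
                if keep_call_values and isinstance(stmt.value, ast.Call):
                    value = ast.unparse(stmt.value)
                attrs[stmt.target.id] = (ast.unparse(stmt.annotation), value)
        index[node.name] = attrs
    return index


lean_index = index_class_attributes(SOURCE)
rich_index = index_class_attributes(SOURCE, keep_call_values=True)
```

In the lean index the `ModelField(default=0)` call is gone, so nothing downstream can tell that `id` has a default; the richer index keeps it at the cost of storing more per attribute. `ast.unparse` requires Python 3.9 or later.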

Hi Semyon,

Are you sure you need the field descriptor information in the index? Field descriptors affect many runtime behaviors, but the only static type checking behavior they impact is whether their corresponding fields appear within the synthesized `__init__` method and whether there is a default value provided for that parameter. I'd be surprised if constructor parameter details are included in your index.

Pylance, the language server built on top of pyright, also indexes files so it can provide fast completions, code actions, searches, etc. It works similarly to what you've described in PyCharm. It processes each file independently, parsing and binding it, but not performing any semantic analysis or type evaluation. I mention this because the dataclass_transform specification didn't present any problem for the pylance indexer.

If you do need field descriptor details in the index, how do you handle this for dataclass field descriptors? Do you hard-code `dataclasses.field` and `dataclasses.Field` when you're indexing?

If this does present a problem for you, let me know if you have any alternative approaches that might eliminate the problem.

-Eric

--
Eric Traut
Contributor to Pyright & Pylance
Microsoft Corp.
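The two effects on the synthesized `__init__` described in the reply can be observed with the standard library's own `dataclasses.field`. The sketch below is only an illustration: the `cache_key` attribute is a made-up example field, and the class uses the stdlib `@dataclass` decorator rather than a `dataclass_transform`-based one.

```python
import inspect
from dataclasses import dataclass, field


@dataclass
class CustomerModel:
    name: str
    # A default in the descriptor gives the __init__ parameter a default.
    id: int = field(default=0)
    # init=False removes the field from __init__ entirely
    # (`cache_key` is a hypothetical field used only for this example).
    cache_key: str = field(init=False, default="")


# Inspect the constructor that the decorator synthesized at runtime.
init_params = inspect.signature(CustomerModel.__init__).parameters
param_names = list(init_params)
```

Here `param_names` contains `self`, `name` and `id` but not `cache_key`, and only `id` carries a default. A static checker has to derive the same information from the field descriptor call without running the code.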