Hello,
thank you for making Python and the neat inspect module.
I would love to hear your opinion on the following aspect of inspect that I believe might be worth improving:
Consider the following program saved in a file (say hello.py):
    import inspect

    def hello():
        print("Hello World")

    print(inspect.getsource(hello))

    class Hello:
        def __init__(self):
            print("Hello World")

    print(inspect.getsource(Hello))
Running hello.py will, unsurprisingly, print the source of hello and Hello.
Now, some of us use Jupyter notebooks (with the capabilities provided by IPython), which are a great tool and an awesome match with Python. These notebooks can become large and complex enough that one wants to use introspection on methods defined in the notebook itself (also, I prototype a lot of things in notebooks that I might later want to use as a library, and I think I'm not alone).
IPython enhances the interactive console to enable introspection (by providing "files" for the cells). As a result, the following will work as expected:
    def hello():
        print("Hello World")

    print(inspect.getsource(hello))
However, it does not work for classes:
    class Hello:
        def __init__(self):
            print("Hello World")

    print(inspect.getsource(Hello))
will run into an error in a Jupyter notebook, more precisely
    TypeError: <class '__main__.Hello'> is a built-in class
The latter does not work because inspect cannot find a source file.
The technical background is that for a function hello, inspect.getfile finds the file through hello.__code__.co_filename, which IPython can arrange for, while for the class Hello, it tries Hello.__module__, which is __main__, and then checks whether sys.modules[Hello.__module__] has a __file__ attribute, which it does not (and which, being one attribute per module, could not be disambiguated down to cell level anyway).
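A minimal sketch of the two lookup paths described above, for illustration only. The helper name describe_lookup is made up here and is not part of inspect; it only reads the attributes mentioned in the text and does not reproduce the full logic of inspect.getfile.

```python
import inspect
import sys


def describe_lookup(obj):
    """Report where a file name for obj could come from (illustrative helper)."""
    if inspect.isfunction(obj):
        # Functions carry the file name on their own code object.
        return ("__code__.co_filename", obj.__code__.co_filename)
    if inspect.isclass(obj):
        # Classes only name their module; the file name is the module's __file__.
        module = sys.modules.get(obj.__module__)
        return ("module.__file__", getattr(module, "__file__", None))
    raise TypeError("only functions and classes are handled in this sketch")


def hello():
    print("Hello World")


class Hello:
    def __init__(self):
        print("Hello World")


print(describe_lookup(hello))
print(describe_lookup(Hello))
```

When this runs from a file, both lines report a path. In a notebook cell the second line would typically report None, because the __main__ module has no __file__ there.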
I once opened a PR on GitHub (#13894), and before that https://bugs.python.org/issue33826, but got, let's say, reactions that were not entirely encouraging. I still think that it is a useful feature, and I don't think that there are readily available solutions. After another year has passed, I humbly submit this for your consideration.
Best regards and thank you.
Thomas
On Mon, Jun 15, 2020 at 7:22 AM Thomas Viehmann tv.python-dev.python.org@beamnet.de wrote:
> [...]
It would probably help if you didn't bury the lede: You have a languishing bug+PR that would benefit users of Jupyter Notebooks, and you would like some help getting your proposal accepted. The proposal is to add a __filename__ attribute to classes, and the problem it solves is that currently inspect.getsource() uses the class's __module__ attribute, which in Jupyter's case points to __main__ for all classes defined in cells. There can be only one file per module object so Jupyter cannot trick inspect.getsource() into showing the source of the cell containing the class definition (which it manages to do for functions, because of the __code__.co_filename attribute).
I could think of a trick that inspect.getsource() might use if the class contains at least one method: it could look at a method and try its __code__.co_filename attribute (maybe only if the __file__ attribute for the module found via the class's __module__ doesn't exist -- I'm sure Jupyter can arrange for that to be the case). But I see how that would be a problem (I can think of plenty of reasons why a class might not have any methods).
I do think that your proposal is reasonable, although I wonder what the Jupyter developers think of it. (How closely are you connected to that project?)
Hopefully some other core dev will now take pity on you.
--
--Guido van Rossum (python.org/~guido)
Pronouns: he/him (why is my pronoun here?) http://feministing.com/2015/02/03/how-using-they-as-a-singular-pronoun-can-c...
Hello,
thank you for your feedback!
> I could think of a trick that inspect.getsource() might use if the class contains at least one method: it could look at a method and try its __code__.co_filename attribute (maybe only if the __file__ attribute for the module found via the class's __module__ doesn't exist -- I'm sure Jupyter can arrange for that to be the case). But I see how that would be a problem (I can think of plenty of reasons why a class might not have any methods).
That is a great idea. In addition, getsource would have to filter out inherited methods (which should be doable, but it would indeed leave out classes that define no methods of their own). Would you prefer such a patch to inspect.getsource over adding __filename__?
> I do think that your proposal is reasonable, although I wonder what the Jupyter developers think of it. (How closely are you connected to that project?)
I am not affiliated with Jupyter at all and I imagine that I'd be prone to asking "would it be nice if inspect.getsource worked better?", which likely doesn't yield the most interesting answers.
We can chase the related reports:
https://github.com/jupyter/notebook/issues/3802
https://github.com/ipython/ipython/issues/11249
The topic does seem to pop up now and then:
https://stackoverflow.com/questions/51566497/getting-the-source-of-an-object...
https://stackoverflow.com/questions/35854373/python-get-source-code-of-class...
Best regards
Thomas
On Tue, Jun 16, 2020 at 2:00 AM Thomas Viehmann tv@beamnet.de wrote:
> Hello,
> thank you for your feedback!
> > I could think of a trick that inspect.getsource() might use if the class contains at least one method: it could look at a method and try its __code__.co_filename attribute (maybe only if the __file__ attribute for the module found via the class's __module__ doesn't exist -- I'm sure Jupyter can arrange for that to be the case). But I see how that would be a problem (I can think of plenty of reasons why a class might not have any methods).
> That is a great idea, in addition getsource would have to filter out inherited methods (which should be doable, but indeed exclude classes).
It would just have to iterate over the class __dict__, which doesn't contain inherited objects anyways.
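The fallback discussed here could look roughly like the sketch below. The helper name guess_class_filename is hypothetical; this is not how inspect behaves, only an illustration of reading co_filename from functions found directly in the class __dict__.

```python
import inspect


def guess_class_filename(cls):
    """Return co_filename of the first function in cls.__dict__, or None."""
    for value in vars(cls).values():
        # staticmethod and classmethod objects wrap the function in __func__.
        func = getattr(value, "__func__", value)
        if inspect.isfunction(func):
            return func.__code__.co_filename
    return None  # nothing defined directly on the class


class Base:
    def method(self):
        pass


class Child(Base):  # inherits method, defines nothing itself
    pass


print(guess_class_filename(Base))
print(guess_class_filename(Child))
```

Child yields None because vars() only sees what the class body itself defined, which is the "class without methods" case raised above. A function defined elsewhere and assigned into the class would also mislead this heuristic.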
> Would you prefer such a patch to inspect.getsource over the adding __filename__?
It would certainly be much easier to get through the review process. Adding a __filename__ (why not __file__?) attribute to classes is a major surgery, presumably requiring a PEP, and debating the pros and cons and performance implications and fixing a bunch of tests that don't expect this attribute, and so on. Adding an imperfect solution to inspect.getsource() would only require the cooperation of whoever maintains the inspect module.
> > I do think that your proposal is reasonable, although I wonder what the Jupyter developers think of it. (How closely are you connected to that project?)
> I am not affiliated with Jupyter at all and I imagine that I'd be prone to asking "would it be nice if inspect.getsource worked better?", which likely doesn't yield the most interesting answers.
I had trouble parsing this sentence; I believe you mean to say that the Jupyter maintainers would just tell you this should be fixed in inspect.getsource()?
> We can chase the related reports:
> https://github.com/jupyter/notebook/issues/3802
This was closed because Jupyter is not to blame here at all; they declared it an IPython issue.
> https://github.com/ipython/ipython/issues/11249
And here there also doesn't seem to be much interest, given that it's been open and unanswered since 2018.
> The topic does seem to pop up now and then:
> https://stackoverflow.com/questions/51566497/getting-the-source-of-an-object...
> https://stackoverflow.com/questions/35854373/python-get-source-code-of-class...
Very few stars. This suggests not many people care about this problem, and that in turn might explain the lukewarm response you find everywhere.
Lastly, I have to ask: Why is this so important to you? What does this prevent you from doing? You have illustrated the problem with toy examples -- but what is the real-world problem you're encountering (apparently regularly) that causes you to keep pushing on this? This needs to be explored especially since so few other people appear to need this to work.
--
--Guido van Rossum (python.org/~guido)
16.06.20 21:02, Guido van Rossum wrote:
> It would certainly be much easier to get through the review process. Adding a __filename__ (why not __file__?) attribute to classes is a major surgery, presumably requiring a PEP, and debating the pros and cons and performance implications and fixing a bunch of tests that don't expect this attribute, and so on. Adding an imperfect solution to inspect.getsource() would only require the cooperation of whoever maintains the inspect module.
If we add the file name of the sources as a class attribute, we also need to add the line number (or the range of line numbers) of the class definition. Otherwise inspect.getsource() will still be ambiguous. Also, specifying the file name does not help in the case of the REPL or of compiling a string, so maybe you need to attach the full source text to a class?
On 2020-06-17 09:02, Serhiy Storchaka wrote:
> 16.06.20 21:02, Guido van Rossum wrote:
> > It would certainly be much easier to get through the review process. Adding a __filename__ (why not __file__?) attribute to classes is a major surgery, presumably requiring a PEP, and debating the pros and cons and performance implications and fixing a bunch of tests that don't expect this attribute, and so on. Adding an imperfect solution to inspect.getsource() would only require the cooperation of whoever maintains the inspect module.
> If we add the file name of the sources as a class attribute, we also need to add the line number (or the range of line numbers) of the class definition. Otherwise inspect.getsource() will still be ambiguous.
That, or even the entire __code__ of the internal function that sets up the class. That has all the needed information.
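As a small illustration of what that internal code object carries: in CPython, compiling a class statement produces a nested code object for the class body, and it records the file name and first line. The sketch below only inspects the compiler output; the file name "<cell-1>" is made up, and finding the class body in co_consts is a CPython implementation detail rather than a documented guarantee.

```python
import types

# Two blank lines first, so the class statement sits on line 3.
source = "\n\nclass Hello:\n    greeting = 'Hello World'\n"
module_code = compile(source, "<cell-1>", "exec")

# The class body is compiled to its own code object, stored among the constants.
nested_codes = [
    const for const in module_code.co_consts
    if isinstance(const, types.CodeType)
]
class_code = nested_codes[0]

print(class_code.co_name, class_code.co_filename, class_code.co_firstlineno)
```

The class object created from this body does not keep a reference to that code object, which is why the information is not available to inspect afterwards.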
I did a small experiment with this, and indeed it breaks tests that either don't expect the attribute or expect that anything with __code__ is a function: https://github.com/python/cpython/commit/3fddc0906f2e7b92ea0f7ff040560a10372...
You can actually do this in pure Python, just to see what breaks. See the attachment.
> Also, specifying the file name does not help in case of REPL or compiling a string, so maybe you need to attach full source text to a class?
You get the same problem with functions, already. But Jupyter Notebook apparently works around this issue.
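One way such a workaround can be built for functions is to register the source text in linecache under a made-up file name and compile the code with that same name. The sketch below shows the general technique only; it is not IPython's actual code, and the names "<cell-1>" and "__cell_demo__" are invented for the example.

```python
import inspect
import linecache

source = "def hello():\n    print('Hello World')\n"
filename = "<cell-1>"  # made-up name standing in for a notebook cell

# Register the text under the fake file name. An mtime of None makes
# linecache.checkcache skip the entry instead of discarding it.
linecache.cache[filename] = (len(source), None, source.splitlines(True), filename)

namespace = {"__name__": "__cell_demo__"}  # hypothetical module name
exec(compile(source, filename, "exec"), namespace)

# getsource follows hello.__code__.co_filename to the linecache entry.
print(inspect.getsource(namespace["hello"]))
```

This works for the function because its code object remembers the file name it was compiled with; a class defined the same way has no such pointer.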
On 16/06/2020 20:02, Guido van Rossum wrote:
> Very few stars. This suggests not many people care about this problem, and that in turn might explain the lukewarm response you find everywhere.
This seems to be the core, and combined with the cost of measuring performance impacts of adding a new field, it may well not be worth it from a Python developer's perspective. I see it clearly now.
> Lastly, I have to ask: Why is this so important to you? What does this prevent you from doing? You have illustrated the problem with toy examples -- but what is the real-world problem you're encountering (apparently regularly) that causes you to keep pushing on this? This needs to be explored especially since so few other people appear to need this to work.
I do nearly all my Python development work in Jupyter notebooks. One thing I miss is getting the class source code (via Jupyter Notebook's ??). When things get large and complicated enough, it seems that I end up trying to look at my own code.
The other part is that I might be overly fond of manipulating Python programs themselves. For example, PyTorch (a library sometimes used for machine learning) sports a JIT for a subset of Python and I spent some time trying to see why they can parse functions but not classes (instead they look at the methods one by one on instances of the class, which works, but always feels like a work-around). Before that method came about, I tried for a while to work with classes directly, and this was a large part of the original motivation for looking to fix access to source code.
In hindsight, it would seem that this feature is mostly interesting to tool developers, not the general population, and again, I can see why it's a very niche feature that likely isn't worth going through the process of adding support for from Python's perspective.
Thank you for taking the time to consider my request, I sincerely appreciate it and I learned a great deal from our conversation and it makes me feel much better about the ill fate of my proposed patch.
Best regards
Thomas
I presume Jupyter also lets you import code from a file, which you edit outside Jupyter? Is that not an option for you?
On Wed, Jun 17, 2020 at 04:09 Thomas Viehmann tv@beamnet.de wrote:
> [...]
-- --Guido (mobile)
On 17/06/2020 17:25, Guido van Rossum wrote:
> I presume Jupyter also lets you import code from a file, which you edit outside Jupyter? Is that not an option for you?
It's not the file that is the problem, but the lack of it. If I didn't want to cover classes within the __main__ module, I wouldn't have gotten myself into this in the first place.
If it were just the technical aspects, I'd still think recording the filename would be a good solution; I just never thought enough about the cost of the process.
Maybe the other fundamentally sound solution would be to treat the __main__ module of an interactive session as a single file with ever-incrementing line numbers. Making it more explicit that Jupyter/IPython is an append-only world would solve so many other problems as well and create a world of entirely new ones.
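A rough sketch of what "a single file with ever-incrementing line numbers" could mean in practice, assuming each cell is appended to one growing source buffer. The class AppendOnlySession, the file name "<session>" and the module name "__session__" are all hypothetical; nothing here reflects how Jupyter or IPython is implemented.

```python
import ast
import inspect
import linecache


class AppendOnlySession:
    """Run cells as if they were appended to one ever-growing file."""

    def __init__(self, filename="<session>"):
        self.filename = filename
        self.lines = []  # every line executed so far, in order
        self.namespace = {"__name__": "__session__"}

    def run_cell(self, source):
        if not source.endswith("\n"):
            source += "\n"
        offset = len(self.lines)
        tree = ast.parse(source, self.filename)
        # Shift line numbers so the cell appears to start after earlier cells.
        ast.increment_lineno(tree, offset)
        self.lines.extend(source.splitlines(True))
        size = sum(len(line) for line in self.lines)
        # mtime None keeps linecache.checkcache from dropping the entry.
        linecache.cache[self.filename] = (size, None, list(self.lines), self.filename)
        exec(compile(tree, self.filename, "exec"), self.namespace)


session = AppendOnlySession()
session.run_cell("def first():\n    return 1\n")
session.run_cell("def second():\n    return 2\n")
print(inspect.getsource(session.namespace["second"]))
```

The sketch only appends; re-running or editing an earlier cell has no meaning in it, which is one of the "entirely new" problems such a model would bring. It also only helps functions here, since a class still has no pointer back to the file name.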
Best regards
Thomas