[Python-Dev] Should there be a way or API for retrieving from a code object a loader method and package file where the code comes from?

Wed Dec 24 06:03:37 CET 2008

Paul Moore writes:
 > 2008/12/23 R. Bernstein <rocky at panix.com>:
 > >  A use case I am thinking of here is in a stack trace or a
 > >  debugger, or a tool which wants to show in great detail, information
 > >  from a code object obtained possibly via a frame object.
 > 
 > Thanks for the clarifications. I see what you're after much better now.
 > 
 > > I find it kind of sucky to see in a traceback: "<string>" as opposed
 > > to the text (or prefix of the text) of the actual string that was
 > > passed. Or something that has been referred to as a "pseudo-file" like
 > > /usr/lib/python2.5/site-packages/tracer-0.1.0-py2.5.egg/foo/bar.py
 > > when it is really member foo/bar.py of zipped egg
 > > /usr/lib/python2.5/site-packages/tracer-0.1.0-py2.5.egg.
 > 
 > Fair comment. That points to a "human readable" type of string. It's
 > not available at the moment, but I guess it could be.
 > 
 > But see below.
 > 
 > >  >     -  xxx.get_source_code(token) is a function (I don't know where,
 > >  > xxx is a placeholder for some "suitable" module) which, given such a
 > >  > token, returns the source, or None if there's no viable concept of
 > >  > "the source".
 > >
 > > There always is a viable concept of a source. It's whatever was done
 > > to get the code. For example, if it was via an eval then the source
 > > was the eval function and a string, same for exec. If it's via
 > > database access, well that then and some summary info about what's
 > > known about that.
 > 
 > Hmm, "source" colloquially, yes "bytecode loaded from ....\xxx.pyc",
 > for example. But not "source" in the sense of "source code". Some
 > applications run with only bytecode shipped, no source code available
 > at all.
 > 
 > > There are two problems. One is displaying location information in an
 > > unambiguous way -- the pseudo-file above is ambiguous and so is
 > > <string> since there's no guarantee that an OS will not name a file
 > > that. The second problem is programmatically getting information such
 > > as a debugger or an IDE might do so that the information can be
 > > conveyed back to a user who might want to inspect surrounding source
 > > code or modules.
 > 
 > This is more than you were asking for above.
 > 
 > The first problem is addressed with a "human readable" (narrative)
 > description, as above.
 > 
 > The second, however, requires machine-readable access to source code
 > (if it exists). That's what the loader get_source() call does for you.
 > But you have to be prepared for the fact that it may not be possible
 > to get source code, and decide what you want to happen in that case.

I'm missing your point here. 

When one uses information from a traceback, or is in a debugger or an
IDE, it is assumed that making use of that information requires access
to the source code. And IDEs and debuggers have had to deal with
source code being unavailable from day one, even before there was
zipimporter.

In order to get the strings of source text that linecache.getlines()
gives, it has to prowl around for other information, possibly looking
for a loader along the protocol defined in PEP 302 and/or others. And
it's that information that a debugger, IDE or some tool of that ilk
might need.
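
To make that concrete, here is a rough sketch of the kind of lookup I
mean. It is written against a current Python 3, and the function name
describe_code_origin is made up for illustration; nothing like it
exists in the standard library. It only assumes that a PEP 302 loader,
when there is one, is recorded under __loader__ in the module globals,
and that linecache.getlines() accepts those globals as its second
argument.

```python
import linecache


def describe_code_origin(code, module_globals=None):
    """Return (filename, loader, source_lines) for a code object.

    Hypothetical helper, not a standard-library API.
    """
    module_globals = module_globals or {}
    # A PEP 302 loader, if any, is stored in the module's globals.
    loader = module_globals.get("__loader__")
    # linecache may use the module globals to find the source when
    # co_filename is not a plain file on disk (e.g. a zipped egg).
    lines = linecache.getlines(code.co_filename, module_globals)
    return code.co_filename, loader, lines
```

Given a frame object, one would call it as
describe_code_origin(frame.f_code, frame.f_globals). For code compiled
from a string the loader comes back as None and the list of lines is
typically empty, which is exactly the case where a tool has nothing
better to show than the file name.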

Many IDEs and debuggers nowadays open a socket and pass information
back and forth over that. An obvious advantage is that it means you
can debug remotely. But in order for this to work, some information
about the location of the source text is generally passed back and
forth. In the Java world, and Eclipse for example, the jar may be in a
different location on the IDE's machine than on the machine being
debugged. And probably too often that jar isn't the same one. So it is
helpful in this kind of scenario to break out a location into the name
of a jar and the member inside the jar. Perhaps also some information
about that jar.
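
For the zipped-egg case mentioned earlier, breaking a "pseudo-file"
into container and member could look roughly like this. This is only a
sketch: split_archive_member is a name I made up, and it assumes the
container is an ordinary zip archive that exists on the local disk.

```python
import os
import zipfile


def split_archive_member(pseudo_path):
    """Split 'archive.egg/pkg/mod.py' into (archive, member).

    Returns (pseudo_path, None) when no leading part of the path is
    a zip archive. Hypothetical helper, for illustration only.
    """
    head, parts = pseudo_path, []
    # Strip trailing components until we reach something that exists.
    while head and not os.path.exists(head):
        head, tail = os.path.split(head)
        if not tail:
            break
        parts.append(tail)
    if parts and os.path.isfile(head) and zipfile.is_zipfile(head):
        # Zip members always use forward slashes.
        return head, "/".join(reversed(parts))
    return pseudo_path, None
```

A path that names a real file, or one with no zip archive anywhere
along it, comes back unchanged with None as the member, so a caller
can tell the two situations apart.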

It is possible that instead of passing around locations, debuggers and
such tools use get_source(), because that's what Python has to
offer.  :-)

I jest here, but honestly I've been surprised that there is no IDE
that I know of that in fact works this way. The machine running the
code clearly may have more accurate access to the source than a
front-end IDE. Undeterred by the harsh facts of reality, I have hope
that someday there *might* be an IDE that has provision for this. So
in a Ruby debugger (ruby-debug) one can request checksum information
on the files the debugger thinks are loaded, in order to facilitate
checking that the source an IDE is showing in fact matches the source
for the part of the code currently under investigation.
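
A Python analogue of that checksum idea might be sketched as follows.
This is not how ruby-debug does it; the helper name source_checksum
and the choice of SHA-1 are my own assumptions. It relies only on the
optional get_source() method that PEP 302 loaders may provide.

```python
import hashlib


def source_checksum(module):
    """Return a SHA-1 hex digest of a module's source, or None.

    Hypothetical helper: None means the loader could not supply
    source text (built-in module, bytecode-only install, ...).
    """
    loader = getattr(module, "__loader__", None)
    if loader is None or not hasattr(loader, "get_source"):
        return None
    try:
        source = loader.get_source(module.__name__)
    except (ImportError, OSError):
        return None
    if source is None:
        return None
    # The digest covers the decoded text, not the raw bytes on disk.
    return hashlib.sha1(source.encode("utf-8")).hexdigest()
```

A front end would have to hash its own copy of the text the same way
(same decoding and newline handling) for the comparison to mean
anything.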

 > 
 > >  > I hope this is of some help,
 > >
 > > Yes, thanks. At least I now have a clearer idea of the state of
 > > where things stand.
 > 
 > Good. Sorry it's not better news :-)
 > 
 > Paul
 >