[Python-ideas] Add a cryptographic hash (e.g SHA1) of source to Python Compiled objects?

rocky at gnu.org rocky at gnu.org
Tue Feb 3 10:56:47 CET 2009


I've been re-examining from the ground up the whole state of affairs in
writing a debugger. One of the challenges for a debugger, or any
source-code analysis tool, is verifying that the source code the
tool is reporting on corresponds to the compiled object under
execution.

For debuggers, this problem becomes more likely to occur when you are
debugging on a computer that isn't the same as the computer where the
code is running.

For this, it would be useful to have a cryptographic hash such as a SHA1
of the source stored in the compiled object, ideally accessible via the
module object, where the file path is stored.

I understand that there is an mtime timestamp in the .pyc, but this is
not as reliable as a cryptographic hash such as SHA1.
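
As a rough sketch of the check a tool could do if such a digest were
available: hash the raw bytes of the source file and compare against
the recorded value. The names source_sha1 and source_matches are
made up for illustration, and recorded_digest stands in for wherever
the digest would be stored; nothing like it exists in the compiled
object today.

```python
import hashlib


def source_sha1(path):
    """Return the hex SHA1 digest of the raw bytes of the file at path."""
    h = hashlib.sha1()
    with open(path, "rb") as f:
        # Read in chunks so large source files need not fit in memory.
        for chunk in iter(lambda: f.read(65536), b""):
            h.update(chunk)
    return h.hexdigest()


def source_matches(path, recorded_digest):
    """True if the file on disk hashes to recorded_digest.

    recorded_digest is hypothetical: it is the value the proposal would
    have the compiler store alongside the compiled code.
    """
    return source_sha1(path) == recorded_digest
```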

There seems to be some confusion in thinking the only use case for
this is in remote debugging where source code may be on a different
computer than where the code is running, but I do not believe this is
so.  Here are two other situations which come up. 

First is a code coverage tool like coverage.py which checks coverage
over several runs. Let's say the source code is erased and checked out
again; or edited and temporarily changed several times but in the end
the file stays the same. A SHA1 hash will show that the file hasn't
changed; the mtime won't.
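
To make that concrete, here is a small self-contained sketch (the
function names and the file name mod.py are invented for the example).
It writes a file, rewrites it with identical contents at a later
timestamp, and records both the digest and the mtime each time. The
timestamps are set explicitly with os.utime so the result doesn't
depend on filesystem timer resolution.

```python
import hashlib
import os
import tempfile


def fingerprint(path):
    """Return (sha1 hex digest, mtime) for the file at path."""
    with open(path, "rb") as f:
        digest = hashlib.sha1(f.read()).hexdigest()
    return digest, os.stat(path).st_mtime


def demo():
    """Rewrite a file with the same contents; return both fingerprints."""
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "mod.py")

        with open(path, "w") as f:
            f.write("x = 1\n")
        os.utime(path, (1000000000, 1000000000))
        before = fingerprint(path)

        # Simulate "erased and checked out again": same bytes, newer mtime.
        os.remove(path)
        with open(path, "w") as f:
            f.write("x = 1\n")
        os.utime(path, (1000000100, 1000000100))
        after = fingerprint(path)

    return before, after
```

Comparing the two results from demo(), the digests are equal while the
mtimes differ, so a tool keyed on mtime would treat the file as changed.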

A second, more contrived example is in some sort of secure
environment. Let's say I am using compiled Python code (say, for
an embedded device) and someone offers me what's purported to be the
source code.  How can I easily verify that this is correct?

In theory I suppose if I have enough information about the version of
Python and which platform, I can compile the purported source ignoring
some bits of information (like the mtime ;-) in the compiled
object. But one would have to be careful about getting compilers and
platforms the same, or understand how this changes compilation.
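
A rough sketch of that recompile-and-compare approach follows. It
rests on several assumptions: the .pyc was written by the same CPython
version that runs the check, the header is 16 bytes (the layout used
by CPython 3.7 and later; older versions used a shorter header), and
comparing bytecode, names and constants is a good enough notion of
"same". The helper names are made up. File names are deliberately
ignored, since the purported source will usually live at a different
path.

```python
import marshal
import types

# Assumption: CPython 3.7+ .pyc layout (magic, flags, then mtime/size
# or a source hash). Earlier versions have a shorter header.
PYC_HEADER_SIZE = 16


def load_code_from_pyc(pyc_path):
    """Skip the .pyc header and unmarshal the top-level code object."""
    with open(pyc_path, "rb") as f:
        f.read(PYC_HEADER_SIZE)
        return marshal.load(f)


def same_code(a, b):
    """Compare two code objects, ignoring file names and the header."""
    if a.co_code != b.co_code or a.co_names != b.co_names:
        return False
    if len(a.co_consts) != len(b.co_consts):
        return False
    for x, y in zip(a.co_consts, b.co_consts):
        x_is_code = isinstance(x, types.CodeType)
        y_is_code = isinstance(y, types.CodeType)
        if x_is_code and y_is_code:
            # Nested functions and classes: recurse.
            if not same_code(x, y):
                return False
        elif x_is_code or y_is_code:
            return False
        elif type(x) is not type(y) or x != y:
            return False
    return True


def source_matches_pyc(source_text, pyc_path):
    """Compile the purported source and compare it with the .pyc contents."""
    candidate = compile(source_text, "<purported>", "exec")
    return same_code(candidate, load_code_from_pyc(pyc_path))
```

This only works when the interpreter version and optimization level
match those used to build the .pyc, which is exactly the fragility
described above; a stored hash of the source would sidestep it.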
