How to know that two pyc files contain the same code

Steven D'Aprano steve+comp.lang.python at pearwood.info
Sun Mar 11 03:06:53 EDT 2012


On Sun, 11 Mar 2012 12:15:11 +1100, Chris Angelico wrote:

> On Sun, Mar 11, 2012 at 9:52 AM, Steven D'Aprano
> <steve+comp.lang.python at pearwood.info> wrote:
>> On Sat, 10 Mar 2012 15:48:48 +0100, Gelonida N wrote:
>>
>> Define "identical" and "the same".
>>
>> If I compile these two files:
>>
>>
>> # file ham.py
>> x = 23
>> def func():
>>    a = 23
>>    return a + 19
>>
>>
>>
>> # file spam.py
>> def func():
>>    return 42
>>
>> tmp = 19
>> x = 4 + tmp
>> del tmp
>>
>>
>> do you expect spam.pyc and ham.pyc to count as "the same"?
> 
> They do not contain the same code. They may contain code which has the
> same effect, but it is not the same code.

To me, they do: they contain a function "func" which takes no arguments 
and returns 42, and a global "x" initialised to 23. Everything else is an 
implementation detail.
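
A minimal sketch of that point, assuming the two quoted files are held as
strings rather than read from disk. The helper name public_state is
hypothetical, and calling every callable with no arguments only works
because func takes none. It runs each source in a fresh namespace and
compares what is left behind, then compares the raw module bytecode:

```python
HAM = """\
x = 23
def func():
    a = 23
    return a + 19
"""

SPAM = """\
def func():
    return 42

tmp = 19
x = 4 + tmp
del tmp
"""


def public_state(source, filename):
    """Run source in a fresh namespace and summarise what it exposes.

    Hypothetical helper: it assumes every callable takes no arguments,
    which is true of the example but not of modules in general.
    """
    namespace = {}
    code = compile(source, filename, "exec")
    exec(code, namespace)
    namespace.pop("__builtins__", None)  # added by exec, not by the module
    summary = {}
    for name, value in sorted(namespace.items()):
        if callable(value):
            summary[name] = ("callable", value())
        else:
            summary[name] = ("value", value)
    return code, summary


ham_code, ham_state = public_state(HAM, "ham.py")
spam_code, spam_state = public_state(SPAM, "spam.py")

# Same names, same values, same return value from func().
same_behaviour = ham_state == spam_state

# The module-level bytecode is nevertheless different.
same_bytecode = ham_code.co_code == spam_code.co_code

print(same_behaviour, same_bytecode)
```

Under this definition the two modules agree on everything observable,
while a byte-level comparison reports a difference.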

I'm not being facetious. One should be asking what is the *purpose* of 
this question -- is it to detect when two pyc files contain the same 
*interface*, or to determine if they were generated from identical source 
code files (and if the latter, do comments and whitespace matter)?

What if one merely changed the order of definition? Instead of:

def foo(): pass
def bar(): pass

one had this?

def bar(): pass
def foo(): pass
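
A small sketch of what changes when only the order differs. The helper
defined_names is hypothetical, and marshal is used here only because it
is the serialiser behind pyc files, not because it is the only way to
compare. The set of names is the same, but the serialised code objects
are not:

```python
import marshal

FIRST = "def foo(): pass\ndef bar(): pass\n"
SECOND = "def bar(): pass\ndef foo(): pass\n"


def defined_names(source):
    """Return the names a module defines, in definition order."""
    namespace = {}
    exec(compile(source, "<example>", "exec"), namespace)
    # dicts preserve insertion order; drop the entry exec adds itself
    return [name for name in namespace if name != "__builtins__"]


first_names = defined_names(FIRST)
second_names = defined_names(SECOND)

# Order-sensitive comparison: the two modules differ.
same_order = first_names == second_names

# Order-insensitive comparison: they define exactly the same names.
same_names = set(first_names) == set(second_names)

# The marshalled code objects (the payload of a pyc file) differ too.
same_marshal = (
    marshal.dumps(compile(FIRST, "<example>", "exec"))
    == marshal.dumps(compile(SECOND, "<example>", "exec"))
)

print(same_order, same_names, same_marshal)
```

Whether the reordered module counts as "identical" therefore depends on
which of these comparisons matches the purpose.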

It depends on why the OP cares if they are "identical". I can imagine use
cases where the right solution is to forget ideas about identical code,
and just checksum the files (ignoring any timestamps).
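
A rough sketch of such a checksum, under stated assumptions: it targets
the 16-byte pyc header used by CPython 3.7 and later (magic number,
flags, source mtime, source size). The Python versions current in 2012
used a shorter header, so HEADER_SIZE would need adjusting there. The
helper names pyc_payload_digest and compile_with_mtime are hypothetical.

```python
import hashlib
import os
import py_compile
import tempfile

# Assumption: CPython 3.7+ layout (magic, flags, mtime, source size).
HEADER_SIZE = 16


def pyc_payload_digest(pyc_path, header_size=HEADER_SIZE):
    """Hash a pyc file while skipping its header (and so its timestamp)."""
    with open(pyc_path, "rb") as f:
        data = f.read()
    return hashlib.sha256(data[header_size:]).hexdigest()


def compile_with_mtime(source_path, pyc_path, mtime):
    """Set the source file's mtime, then compile it to pyc_path."""
    os.utime(source_path, (mtime, mtime))
    py_compile.compile(
        source_path,
        cfile=pyc_path,
        doraise=True,
        # Force timestamp-based pycs even if SOURCE_DATE_EPOCH is set.
        invalidation_mode=py_compile.PycInvalidationMode.TIMESTAMP,
    )
    return pyc_path


with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, "ham.py")
    with open(src, "w") as f:
        f.write("x = 23\n")

    first = compile_with_mtime(src, os.path.join(tmp, "first.pyc"), 1_000_000_000)
    second = compile_with_mtime(src, os.path.join(tmp, "second.pyc"), 1_300_000_000)

    with open(first, "rb") as f:
        raw_first = f.read()
    with open(second, "rb") as f:
        raw_second = f.read()

    # The embedded source timestamp makes the raw files differ...
    whole_files_match = raw_first == raw_second
    # ...but the marshalled code after the header is the same.
    payloads_match = pyc_payload_digest(first) == pyc_payload_digest(second)

print(whole_files_match, payloads_match)
```

Note that the payload also records the source file name, so two pyc
files compiled from identical sources at different paths would still
hash differently with this approach.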


-- 
Steven


