[Q] how to protect python program from decompilation

Tim Peters tim.one at home.com
Sun Feb 18 13:42:04 EST 2001


[Leonid Gluhovsky]
> ...
> Right now we are considering the following scheme: build a Python
> with opcodes in Include/opcode.h reshuffled, and without the dis module
> in it; use this interpreter to byte compile our code; use its Tools/freeze
> to package our program into executable and ship it to customers.
>
> What are the holes in this approach?

The different sections of the Python eval loop are easy to recognize in
machine language.  It wouldn't take a Python mini-guru more than an hour to
reverse-engineer your permutation by just watching the eval loop in a
machine-language debugger, noting which opcodes jump to which sections of
the eval loop.  In fact, they only have to do that for *one* opcode.  Then
they know where in memory the eval loop's case-statement jump table
(generated by the C compiler) lives, and the entire permutation can be read
off directly from that.  Then they can alter the standard dis.py accordingly
to do an exact symbolic disassembly of your whole code base.
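
To make the last step concrete, here is a minimal sketch of how little work
remains once the permutation is known.  Everything in it is an assumption for
illustration: the "vendor" shuffle is a made-up formula rather than anything
a real build would use, the helper names are hypothetical, and the byte
layout assumed is the 2-byte (opcode, argument) units of CPython 3.6 and
later, not the variable-length layout of older interpreters.

```python
import dis

# Hypothetical vendor shuffle: a fixed bijection on opcode numbers 0..255.
# 7 is coprime to 256, so op -> (7*op + 3) % 256 is a permutation.
# (A real reshuffle would also have to respect constraints inside the
# interpreter, such as which opcodes take arguments; this toy ignores that.)
VENDOR_SHUFFLE = {op: (7 * op + 3) % 256 for op in range(256)}


def remap(code_bytes, table):
    """Translate every opcode byte through `table`.

    Assumes 2-byte (opcode, arg) instruction units, so opcodes sit at
    even offsets and argument bytes are left untouched.
    """
    out = bytearray(code_bytes)
    for offset in range(0, len(out), 2):
        out[offset] = table[out[offset]]
    return bytes(out)


def sample(a, b):
    return a + b * 2


original = sample.__code__.co_code

# What the customer would receive: bytecode using the shuffled numbering.
scrambled = remap(original, VENDOR_SHUFFLE)

# What the attacker reads off the jump table: the same mapping, inverted.
recovered = {shuffled: standard for standard, shuffled in VENDOR_SHUFFLE.items()}

# One table lookup per instruction restores the standard numbering,
# after which the stock dis tables apply again.
restored = remap(scrambled, recovered)
opcode_names = [dis.opname[op] for op in restored[::2]]
```

The shuffle costs the attacker one dictionary inversion; the rest is the
ordinary disassembler.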

> What is a better approach?

If there were one, I expect Microsoft would do that instead of just
prohibiting reverse-engineering in their licenses.  It's not Python gurus
you have to worry about so much, it's machine-language gurus.  That's why
this has little to do with Python.  You can't hide anything from them,
unless (as Aahz suggested) you ship your own CPU card that doesn't allow
tracing instructions.

You can make it more *unpleasant* by running your code through a name
obfuscator first, e.g. systematically replace all instances of the variable
"i" by a fixed 20-character randomly generated identifier
("AAAA75WRWLKJLKJ3lS9Q"), and so on.  Certain branches of the US government
do that before sending in compiler bug reports, as if nobody can recognize
an FFT from its structure alone <0.9 wink>.
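
A rough sketch of such a name obfuscator, assuming a Python with ast.unparse
(3.9 or later).  The names Renamer, collect_names and obfuscate are
hypothetical, and the 20-character identifiers are derived from a hash
instead of a random generator only so the output is repeatable.  It renames
assigned variables and function parameters and nothing else; code that
passes keyword arguments, or uses global/nonlocal declarations, would need
more care than this.

```python
import ast
import hashlib


def collect_names(tree):
    """Names bound by assignment, loops, or as function parameters."""
    names = set()
    for node in ast.walk(tree):
        if isinstance(node, ast.Name) and isinstance(node.ctx, ast.Store):
            names.add(node.id)
        elif isinstance(node, ast.arg):
            names.add(node.arg)
    return names


class Renamer(ast.NodeTransformer):
    """Rewrite variable and parameter names through a fixed mapping."""

    def __init__(self, mapping):
        self.mapping = mapping

    def visit_Name(self, node):
        node.id = self.mapping.get(node.id, node.id)
        return node

    def visit_arg(self, node):
        node.arg = self.mapping.get(node.arg, node.arg)
        return node


def obfuscate(source):
    tree = ast.parse(source)
    # "v_" plus 18 hex digits gives a valid 20-character identifier.
    mapping = {
        name: "v_" + hashlib.sha256(name.encode()).hexdigest()[:18]
        for name in collect_names(tree)
    }
    return ast.unparse(Renamer(mapping).visit(tree))


SOURCE = '''
def total(values):
    running_sum = 0
    for item in values:
        running_sum = running_sum + item
    return running_sum
'''

obfuscated = obfuscate(SOURCE)
```

The loop and the accumulation are still plainly visible in the output, which
is the point of the FFT remark: the names go away, the structure does not.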

It's much better to assume your code *will* be reverse-engineered (simply
because it will be).  Then the focus shifts to dreaming up ways to prove
infringement of your license.  An effective way is (no kidding) to build in
subtle bugs.  If a competitor develops the same set of subtle bugs later,
they're holding a smoking gun.
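
As a toy illustration of the idea, the sketch below plants one deliberate
quirk in a shipped function and then checks whether another implementation
reproduces it.  The function names, the quirk, and the probe table are all
invented for this example; a real fingerprint would be subtler and would
use more than one probe.

```python
def clamp_percent(value):
    """Hypothetical shipped function: clamps to 0..100, with a planted quirk."""
    if value < 0:
        return 0
    if value == 100:
        # Deliberate, undocumented deviation: exactly 100 comes back as 99.
        return 99
    if value > 100:
        return 100
    return value


def clean_clamp(value):
    """What an independent implementation of the same spec would look like."""
    return max(0, min(100, value))


# Probe inputs mapped to the quirky outputs only the shipped code produces.
FINGERPRINT = {100: 99}


def matches_fingerprint(candidate):
    """True if `candidate` reproduces every planted quirk."""
    return all(candidate(arg) == quirk for arg, quirk in FINGERPRINT.items())


copied = matches_fingerprint(clamp_percent)       # a copy carries the quirk
independent = matches_fingerprint(clean_clamp)    # honest work does not
```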

for-the-same-reason-when-i-was-growing-up-the-regional-map-showed-
    a-river-in-my-neighborhood-that-never-existed-ly y'rs  - tim




