Selling Python Software

Mon Nov 3 08:54:52 EST 2003

John J. Lee wrote:

> Alex Martelli <aleax at aleax.it> writes:
> [...]
>> "Can not be decompiled" is impossible whatever language you're using.
>> 
>> "*extreme* difficulty" is in the eye of the beholder.  You can e.g.
>> add layers of encryption/decryption to the bytecode, etc, but whatever
>> you do somebody else can undo.  Depending on the relative skills of
>> you and the "somebody else" the ratio (your effort to keep things
>> secret, to theirs to uncover them) can be anything.
> [...]
> 
> While this is all true, you seem to put undue emphasis on the fact that
> it's always *possible* to decompile stuff.  Isn't the point you make
> in your last sentence actually crucial here?  The game is to make your

Of course it's crucial.  But so what?

> opponent (customer ;-) incur more expense in decompiling it than it
> would cost to just go ahead and pay you, is it not?  And yeah, you
> also have to take into account how much it costs you to come up with
> the protection scheme, of course.

Of course.  It can be framed as a zero-sum game of incomplete
information on both sides.  You don't really know that anybody
will ever try to steal your code -- any eurocent you invest in
protecting it might be a complete waste, if nobody ever even
dreams of so trying.  At the other extreme, whoever tries to do
the stealing might be technically good and well informed, as well
as dishonest, so that in 5 minutes they destroy 5 days' worth of
work by you on "protection".  In both cases, investing in such
protection is throwing money away from your POV.  The hypothetical
adversary, for his part, may not know and be unable to gauge the
effort needed to crack and steal your code -- if he's neither well
informed nor competent, he might just be flailing around for 10
days and stop just before the 11th day's effort WOULD deliver the
illegal goods he's after.
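
The two extremes above can be put into rough numbers.  What follows
is only a sketch under assumed figures: the function name, the simple
expected-value model and every number in it are hypothetical, not
measurements of anything.

```python
def expected_net_benefit(p_attack, p_crack, loss_if_stolen, protection_cost):
    """Expected gain from protecting code, under a deliberately crude model.

    p_attack        -- assumed probability that anybody tries to steal the code
    p_crack         -- assumed probability that an attacker defeats the protection
    loss_if_stolen  -- assumed loss if the theft succeeds
    protection_cost -- what building the protection costs you
    """
    # Protection only pays off when an attack happens AND it is stopped.
    expected_loss_avoided = p_attack * (1.0 - p_crack) * loss_if_stolen
    return expected_loss_avoided - protection_cost


# Extreme 1: nobody ever tries, so the protection stops nothing.
nobody_tries = expected_net_benefit(0.0, 0.5, 10_000.0, 500.0)

# Extreme 2: a competent attacker always gets through.
expert_attacker = expected_net_benefit(1.0, 1.0, 10_000.0, 500.0)
```

In both calls the result is minus the protection cost, i.e. the money
spent on protection is simply lost.  The case in between depends on
probabilities that neither side really knows.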

Of course, guess what IS the effect on this game's payoff matrix
of discussing technical possibilities in a public forum.  "I give
you three guesses, but the first two don't count"...:-).

> So, is there a good practical solution of that form, for Python code
> of this sort of size (or any other size)?  I suspect the answer for
> standard Python may be no, while the answer for optimising compilers
> may be yes -- but that's just a guess.

The answer is no in either case.  I've spent too high a proportion of
my life (at my previous employer) putting "protection systems" in
place (including optimising compilers, weird machine-code tricks,
even in one case some microcode hacks), and it was the worst waste of
my time anybody could possibly devise.

Part of the problem is that the "warezdoodz culture" is stacked
against you.  If you DO come up with a novel approach, that is a
challenge to guys who SPEND THEIR LIFE doing essentially nothing
but cracking software-protection schemes *for fun*.  Even if it's
taken you 10 hours and it makes them spend 20 hours, they _do not
account this as a cost_, any more than a crossword enthusiast sees
as "a cost" the hours he spends cracking a particularly devious
crossword -- indeed, once said enthusiast is good enough, unless
the puzzle is hard it's no fun.  But don't think that therefore
using an obviously weak scheme is a counter: just below the top
crackerz there are layers and layers of progressively less capable
ones desperate to put notches in their belt.

To me, playing such zero-sum games is a net loss and waste of time
because with the same investment of my time and energy I could
be playing games with sum _greater_ than zero, as is normally the
case for technical development not connected to security issues
(i.e., where the only "net benefit" of the development doesn't
boil down to frustrating somebody else's attempts at cracking and
stealing) and even for much security-related work (e.g., most of
OpenBSD's developments help against plain old BUGS and crashes just
as much as they help against willful attacks against you).

There exist technical solutions that DO make it impossible for
anybody to crack your precious algorithms: just make those precious
algorithms available only from a network server under your total
control, NEVER giving out executable code for them.  (You can then
work on securing the network server, etc, but these ARE problems
that are technically susceptible to good solutions).  If anybody
refuses this solution (surely a costly one on some parameters) it
probably means their algorithms aren't worth all that much after
all (there may be connectivity, latency or bandwidth problems in
some cases, of course, but with the spread of network technologies
these are progressively less likely to apply as time goes by).  If
the algorithms aren't worth all that much, they're not worth me
spending my time in zero-sum games to protect them -- lawyers are
probably more emotionally attuned than engineers to playing zero-
sum games, since so much of legal practice vs so little engineering
practice is like that, so that may be a back-up possibility.
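
A minimal sketch of the server-side approach described above, using
the standard library's XML-RPC modules (Python 3 module names).  The
names precious_algorithm and "evaluate" are hypothetical stand-ins,
and the sketch listens on the loopback interface only, with no
authentication or encryption, which a real deployment would need.

```python
import threading
from xmlrpc.client import ServerProxy
from xmlrpc.server import SimpleXMLRPCServer


def precious_algorithm(x):
    """Hypothetical stand-in for the valuable code; it lives only on the server."""
    return x * x + 1


def start_server(host="127.0.0.1", port=0):
    """Start an XML-RPC server in a background thread and return it.

    Port 0 asks the operating system for any free port.
    """
    server = SimpleXMLRPCServer((host, port), logRequests=False)
    # Clients can call "evaluate" but never receive the function's code.
    server.register_function(precious_algorithm, "evaluate")
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def call_remote(server, x):
    """What a customer's client does: send inputs, get outputs back."""
    host, port = server.server_address
    with ServerProxy(f"http://{host}:{port}/") as proxy:
        return proxy.evaluate(x)
```

The customer's machine only ever sees inputs and results, so there is
no executable form of the algorithm on it to capture.  The price is
the connectivity, latency and bandwidth dependence mentioned above.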

It's not about programming languages at all.  In the end, "clever"
schemes that are presumed to let people run code on machines under
their control yet never be able to "read" the code must rely on
machine-code tricks of some sort, anyway, since obviously, from a
technical viewpoint, if the code must be executable, it must be
*read* on the way to the execution engine -- if it's encrypted it
must exist in decrypted form at some point (and it can then be
captured and examined at that point), etc.  Some of the code that
I was supposed to "protect" for my previous employer was in C,
Fortran, C++, and other high-level languages; some was in
machine code; still other code was intermediate code generated by a
proprietary scripting language; ... in the end it made no real
difference one way or another.
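
The point that encrypted code must exist in decrypted form at some
point can be illustrated in Python itself.  This is only a sketch: the
single-byte XOR "encryption", the KEY value and all the function names
are hypothetical, chosen to keep the example short.  A stronger cipher
changes nothing about where the capture happens.

```python
import dis
import io
import marshal

KEY = 0x5A  # hypothetical single-byte key, purely illustrative


def xor_bytes(data, key=KEY):
    """Toy 'encryption': XOR each byte with the key (it is its own inverse)."""
    return bytes(b ^ key for b in data)


def protect(source):
    """Compile source and return its marshalled bytecode in 'encrypted' form."""
    code = compile(source, "<protected>", "exec")
    return xor_bytes(marshal.dumps(code))


def load_and_run(blob, capture=None):
    """Decrypt and execute a protected blob, as any loader must.

    `capture` stands for whatever hook an attacker places at this point.
    """
    code = marshal.loads(xor_bytes(blob))  # the plain code object exists here
    if capture is not None:
        capture(code)
    namespace = {}
    exec(code, namespace)
    return namespace


def disassemble(code):
    """Return the human-readable bytecode listing of a code object."""
    out = io.StringIO()
    dis.dis(code, file=out)
    return out.getvalue()
```

For example, after
blob = protect("def secret(x):\n    return x * 41 + 1\n"),
calling load_and_run(blob, captured.append) with captured = [] runs
the code normally and also leaves the decrypted code object in
captured[0], ready to be passed to disassemble().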

Alex