hide python code !

danielx danielwong at berkeley.edu
Wed Aug 16 16:39:10 EDT 2006


Steven D'Aprano wrote:
> On Tue, 15 Aug 2006 09:00:16 -0700, Ben Sizer wrote:
>
> > Yes, in much the same way that there is no point ever locking your
> > doors or installing burglar alarms, as a determined thief will
> > eventually steal your belongings.
>
> That's an utterly pointless and foolish analogy.
>
> (1) If a thief breaks into your house and steals your TV, you no longer
> have a TV. If a developer sees your code, you still have your code, *even
> if they subsequently copy it*. You haven't lost your code, it is just no
> longer secret. Since secrecy is rarely valuable in and of itself, you've
> lost nothing.

But haven't you lost your control over the code? If you were trying to
sell a program (regardless of whether this is a good way to make money
from it), hasn't your ability to do so been undercut? This is the loss.

>
> Yes, I've heard all the stories about "valuable algorithms" and the like.
> Some of them might even be true. But for 99+% of code, spending even one
> cent to keep it secret is just wasting money.

That may be true, but for someone who has determined that hiding the
code would be best, it would seem to be quite a good investment.
Besides, these kinds of decisions are made case by case. We would not
roll a die to see whether some code should be released or not. Of
course, these kinds of statistics _should_ moderate any decision, but I
don't think you can expect that "99+%" will make sense to most
(intelligent) people.

But we have only considered the economics of such a decision. Even if
there is no market value to a work, a person has an understandable
desire to exercise the rights of ownership over a work, given the
amount of personal investment one makes in producing it. It's really
just a form of acknowledgement (you may consider an alternative form of
acknowledgement more rewarding, but we are talking about the author,
not you). Considering the "investment" justification, I find it
difficult to deny an author the right to his or her own work (the right
to a work, of course, implies the option to protect it).

I think the above idea is frequently missed in discussions about
copyrights/patents in the open source world. There, the focus seems to
be on the marketability granted by protections (legal or physical). The
post I am responding to illustrates this focus. Do we believe an author
forfeits ownership of a work merely by sharing it? As a matter of
conscience, I don't believe the answer can be imposed on anyone. Every
person must answer this for himself or herself.

>
> (2) Compiling code to machine language isn't like locking your door.
> Compiling code doesn't prevent me from seeing your code or your algorithm,

If a house is locked, it can still be entered (without the key). The
point is not that it is impossible to break in, but that it is more
difficult.

> it just means I see it written in machine language instead of C. If I know
> how to read machine code, or if I have a decompiler, then I can read it,
> no problems at all. Would you argue that Python source code hides your

I know how to read asm, but if you say anyone can read asm just as
easily as one can read Python or even C, then you must be referring to
a machine.

> algorithm because it is inscrutable to people who can't read and
> understand Python? Surely not. So why do you argue that compiled code is
> hidden merely because it is inscrutable to people who don't know how to
> download a decompiler off the Internet?

It's all a matter of degree. The question of plausibility is always
relevant.
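
To make the "degree" concrete, here is a minimal sketch using the
standard-library dis module, which is roughly what someone holding only
a .pyc file would be looking at. The function secret() and its
constants are made up for illustration; they are not from anyone's real
code.

```python
import dis
import io


def secret(x):
    # Hypothetical stand-in for a "valuable algorithm".
    return x * 1000 + 337


def disassemble(func):
    """Return the bytecode listing for func as text."""
    buf = io.StringIO()
    dis.dis(func, file=buf)
    return buf.getvalue()


# The listing is readable with some effort, but it is not the source.
listing = disassemble(secret)

# Literal constants are stored in the code object itself.
constants = secret.__code__.co_consts

print(listing)
print(constants)
```

The constants and the sequence of operations are all there, so nothing
is truly hidden, but reading the listing takes more effort than reading
the two-line source. The exact opcode names differ between Python
versions.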

>
> (3) Compiling code is certainly not like installing a burglar alarm. When
> I decompile your code, no alarms ring and you are not notified.

That's pretty nit-picky...

>
>
> > I find it strange that people (at least on c.l.py) often equate
> > 'imperfect protection' with 'pointless protection'.
>
> Nonsense. Can I remind you that the Original Poster *explicitly* rejected
> using Python's imperfect code-hiding technique (distribute only the
> compiled .pyc files) because they can be disassembled, but failed to
> realise that EXACTLY the same argument holds for compiled C code?
>
> Let me make it clear with a better analogy than your locked door one: the
> O.P. says "I don't want people to look through the windows of my Python
> house. I thought about hanging curtains, but people with thermal imaging
> equipment can see right through the walls. Can I hang vertical blinds in
> Python like my C programmer friends?"
>
> The answers are:
>
> (1) No, Python uses curtains. If you want vertical blinds, use another
> language.
>
> (2) Even if you hang vertical blinds, it isn't going to stop people with
> thermal imaging equipment from seeing into your house and copying your
> algorithm, just like they can with Python.
>
>
>
> > The all-or-nothing
> > attitude makes no sense. If you can halve the number of people who can
> > deduce your algorithm, that helps. If you can double the time it takes
> > for those people to deduce it, that also helps. If it took you months
> > of R&D, the value of even imperfect protection rises.
>
> Fine. But you haven't demonstrated how to do that. You're just plucking
> figures out of the air. Anyone can do that: I claim that going to the
> trouble of hiding code with (say) py2exe reduces the number of people who
> can deduce your algorithm by 0.1%, and increases the time it takes them by
> 0.01%. Who is to say that my figures are not as good or better than yours?
> Do you really think that (say) Microsoft has got neither decompilers nor
> people who can operate them?

I think the point still stands. You seem to acknowledge it at first.
Your m$ example even supports it, because the number of people that
work there is relatively small, not to mention the fact that m$
employees need to be paid (they are paying with their souls aren't they
:P). Your way of getting around the point is just nit-picking at the
figures. Even if we don't take the "twice" figure literally, I imagine
that most of us would agree that the amount by which the bar can be
raised is considerable.

An ancillary point: If the bar can be raised (considerably) at little
cost, then a person who wants to protect his or her code (for economic
reasons or otherwise) profits from going through the trouble.

In the end, if he or she finds that the trouble was not worth the cost,
it is his or her loss. Anyone else's loss due to the (relative)
inaccessibility of the code should not be the author's responsibility;
i.e., the author should be under no obligation to save someone else the
trouble of accessing the code unfettered (imho).

> 
> 
> 
> -- 
> Steven D'Aprano



