[Distutils] A new, experimental packaging tool: distil

Vinay Sajip vinay_sajip at yahoo.co.uk
Tue Mar 26 11:34:33 CET 2013


Philippe Ombredanne <pombredanne <at> nexb.com> writes:

> I see that you are using a pattern similar to the virtualenv.py
> script, embedding other code as a compressed byte array.
>
> I am in general fine with the approach, though I feel a bit uncomfy
> with this approach creeping in as "the" way to bootstrap things with
> one single file for core distribution-related tools.

It's not a particularly new approach - it's done that way because a single
file makes things easier for the user. If I had used a more conventional
approach, I'm not sure as many people would be willing to try it. Like
virtualenv, it's a tool that cannot rely on the presence of existing
installation tools.

> Would anyone know of a better way to package things in a single
> python-executable bootstrapping script file without obfuscating the
> source contents in compressed/encoded/obfuscated byte arrays?

It's only obfuscated as a side-effect of the packing - the alternative would be
to put all your code in a single module, which is not much fun to maintain. But
if someone has a better way, that would certainly be of interest.
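
To make the pattern under discussion concrete, here is a minimal sketch of a
single-file bootstrap. It is not distil's actual code. It assumes the payload
is a base64-encoded zip archive; build_payload, bootstrap and hello_mod are
hypothetical names used only for illustration, and only the name STUFF comes
from this thread.

```python
import base64
import io
import os
import sys
import tempfile
import zipfile


def build_payload(files):
    """Pack {archive_name: source_text} into a base64-encoded zip.

    Hypothetical release-time helper: its output would be pasted into the
    bootstrap script as one long string literal.
    """
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
        for name, source in files.items():
            zf.writestr(name, source)
    return base64.b64encode(buf.getvalue()).decode("ascii")


# Stand-in for the long literal embedded in a real bootstrap script.
STUFF = build_payload({
    "hello_mod.py": "def greet():\n    return 'hello from the payload'\n",
})


def bootstrap(payload):
    """Decode the payload, write it to a temporary .zip, make it importable."""
    fd, path = tempfile.mkstemp(suffix=".zip")
    with os.fdopen(fd, "wb") as f:
        f.write(base64.b64decode(payload))
    # Python can import pure-Python modules directly from a zip on sys.path.
    sys.path.insert(0, path)
    return path


zip_path = bootstrap(STUFF)
import hello_mod  # resolved from the temporary zip file

print(hello_mod.greet())
```

The maintainability point is visible here: the modules stay as separate files
in the source tree and are only merged into one string when the script is
built, at the cost of the payload being opaque to a casual reader.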

> Also, in your code calling this binary payload STUFF feels a tad
> scary:  this is arbitrary code that I cannot see nor inspect before
> running.

Would you find it more trustworthy if it was called TRUST_ME_ITS_SAFE? ;-)

Remember, it's just a Python script running without system privileges.

> I would not want to run unknown STUFFs on my machine ... and even more
> so since the corresponding sources are not available publicly yet in a
> source repo.

The absence of a public source repo is a red herring. If you want to inspect
the code, the time taken to add a pdb breakpoint after the .zip write, and to
unzip the file to a folder of your choice (or to add code to distil.py to do
this), is trivial compared to the time you would spend doing the actual code
inspection. The code is open to inspection, but I'd hope that most users focus
on whether the tool has useful qualities, how it could be used to move
packaging forwards, what it demonstrates about distlib etc.
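
The inspection step described above can be sketched as follows. This is an
illustration under assumptions rather than distil's real layout: it assumes
the payload is a base64-encoded zip, and extract_payload as well as the
stand-in archive contents are hypothetical.

```python
import base64
import io
import os
import tempfile
import zipfile


def extract_payload(payload, dest):
    """Decode a base64-encoded zip and unpack it into dest for reading.

    Returns the sorted list of archive member names.
    """
    data = base64.b64decode(payload)
    with zipfile.ZipFile(io.BytesIO(data)) as zf:
        zf.extractall(dest)
        return sorted(zf.namelist())


# Build a small stand-in payload so the sketch runs on its own; with a real
# script you would pass its embedded string instead.
_buf = io.BytesIO()
with zipfile.ZipFile(_buf, "w") as _zf:
    _zf.writestr("pkg/__init__.py", "")
    _zf.writestr("pkg/cli.py", "print('cli')\n")
STUFF = base64.b64encode(_buf.getvalue()).decode("ascii")

dest = tempfile.mkdtemp(prefix="inspect-")
names = extract_payload(STUFF, dest)
for name in names:
    print(os.path.join(dest, name))
```

Nothing from the payload is imported or executed here, so the extracted
files can be read in an editor before deciding whether to run the tool.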

Have you inspected setuptools or pip code to verify that they are safe? As well
as everything you've ever downloaded from PyPI, which might or might not be
exactly the same as what's shown in a project's public VCS repo?

> At the minimum, getting some comments or explicit variable names the
> virtualenv way on what this payload is would help IMHO:
> 
> "STUFF = """
> eJyEm1OMLlC3Zb+ybbtO1Snbtm3brjpl27Zt27Zt27b6Tye3b9J9k37YK9kv+2FmPMxkrC0vBQKKCgA
> AIAGIUeqCCqFgEh/4AACRBQCAD8AFGFs4OVtbGNLpGRoYWdnbOTrTObk7GdnZmlqY0dq7qyhDAUDqZP
> ......"
> 

While virtualenv has a number of discrete files, I have just one zip file
containing distlib, CLI support code and distil code - that's a lot of files,
so I'm not sure a comment would be all that helpful. What would it really tell
you? What "STUFF" is really saying, to most users, is "stuff you don't need to
care about the details of". For the security-conscious, a mere comment from a
potentially untrusted source is no substitute for unzipping the payload and
doing the time-consuming code inspection.

Regards,

Vinay Sajip



