[Catalog-sig] Deprecate External Links

M.-A. Lemburg mal at egenix.com
Wed Feb 27 18:10:09 CET 2013


On 27.02.2013 17:43, Donald Stufft wrote:
> On Wednesday, February 27, 2013 at 11:34 AM, M.-A. Lemburg wrote:
>> On 27.02.2013 16:42, Donald Stufft wrote:
>>> On Wednesday, February 27, 2013 at 10:39 AM, M.-A. Lemburg wrote:
>>>> -1.
>>>>
>>>> There are many reasons for not hosting packages and distributions
>>>> on PyPI itself.
>>>>
>>>> If you use and trust a package, you also have to know and trust its
>>>> dependencies, no matter where they are hosted, so you're not gaining
>>>> any security by disabling links to other download locations: if
>>>> you don't trust the way a package is hosted, you don't use it; if
>>>> you do, then removing the link breaks the package and all its
>>>> dependencies.
>>>>
>>>
>>>
>>> You also have to know and trust the hosting locations for all of them, and
>>> if they are not available via SSL you have to know and trust that there is
>>> not a MITM available. 
>>>
>>
>>
>> Right.
>>
>> I'm not saying that it's not a good idea to host packages on PyPI,
>> but forcing the community into doing this is not a good idea.
>>
>>>> Instead of suggesting the removal of support for externally hosted packages,
>>>> why not propose a mechanism to provide a more direct/secure way to
>>>> reference them ?
>>>>
>>>
>>>
>>> I did mention a method for doing that in my email. However there are reasons
>>> beyond the security ones to require packages being hosted on PyPI. Namely
>>> uptime, privacy, and performance.
>>>
>>
>>
>> Your proposed uploading of hash values would require listing all
>> distribution files for each release somehow. I don't see how you'd
>> get that to work with older Python versions.
>>
>> """
>> 1. It is difficult to secure the process of spidering external links
>> for download.
>> 1a. The only way I can think offhand is by requiring uploading
>> a hash of the expected files to PyPI along with the download
>> link and removing all urls except for the download_url. This
>> has the effect that only 1 file can be associated with a particular
>> release.
>> """
>>
>> Uptime and performance have in the past been one of the reasons why
>> people chose not to upload files to PyPI. This could be changed,
>> of course.
>>
>>
> 
> I don't see how. If PyPI goes down then the packaging tools cannot
> query /simple/foo/ to see the external links. Adding in additional SPOFs
> only harms uptime; there is no possible way for it to increase it.

Package installers only need access to the static files in
the /simple/ index. Those can be put behind a CDN to increase
uptime.

PyPI itself doesn't have to be up and running if you just want
to download the files (unfortunately, that's not true at the
moment, because the /simple/ index is dynamically generated,
but that can be changed).

See http://wiki.python.org/moin/CloudPyPI for details.
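
To illustrate the static-file point, here is a minimal sketch of
pre-rendering /simple/<project>/ pages to disk so that any static host
or CDN can serve them. This is not how PyPI generates its index; the
function names, the example project and the downloads.example.com URL
are all made up for illustration.

```python
import html
from pathlib import Path


def render_simple_page(project, files):
    """Render one /simple/<project>/ page as static HTML.

    `files` is a list of (filename, url) pairs. The markup is a plain
    list of <a> links, which is what installers scan for.
    """
    links = "\n".join(
        '<a href="{}">{}</a><br/>'.format(
            html.escape(url, quote=True), html.escape(name)
        )
        for name, url in files
    )
    title = "Links for {}".format(html.escape(project))
    return (
        "<html><head><title>{0}</title></head>\n"
        "<body><h1>{0}</h1>\n{1}\n</body></html>\n"
    ).format(title, links)


def write_static_index(root, projects):
    """Write <root>/simple/<project>/index.html for every project.

    `projects` maps a project name to its list of (filename, url) pairs.
    The resulting tree contains only static files.
    """
    for project, files in projects.items():
        page_dir = Path(root) / "simple" / project
        page_dir.mkdir(parents=True, exist_ok=True)
        page = render_simple_page(project, files)
        (page_dir / "index.html").write_text(page, encoding="utf-8")


if __name__ == "__main__":
    # Hypothetical project data; a real setup would read this from the
    # package database whenever a release changes.
    write_static_index(
        "static-index",
        {"foo": [("foo-1.0.tar.gz",
                  "https://downloads.example.com/foo/foo-1.0.tar.gz")]},
    )
```

With pages regenerated only when a release changes, the web application
is not involved in serving installer requests at all.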

>> Another reason for not uploading files to PyPI are the license
>> terms you have to agree to on PyPI and the fact that you can no
>> longer control where your distribution files are made available
>> by agreeing to them. This could be changed as well, but we'd need
>> to add more legalese to the PyPI mirror setup for this to work...
>> not sure whether people providing the mirrors would like this.
>>
>>
> 
> The legalese doesn't particularly give any more rights than any
> free/OSS license does. There's not a requirement currently that
> packages on PyPI be free/OSS but this change would only actually
> affect people who want to upload non free code to PyPI.

It does affect any package author, regardless of the license.
Some examples:

* you may be forced to remove a distribution from the net (think DMCA,
  patents, trademarks, etc)

* the distribution may contain a serious bug that you don't want to
  spread

* you may want to keep more accurate statistics of the reach of
  your project

>> Security can be had by having installers check the GPG signatures
>> of distribution files. You don't need to trust the download
>> site for that.
> 
> GPG signatures are good, we don't have them yet. And when we do
> it's only 1 layer of defense, not the final solution.

Sure, you still have to trust the author :-)
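
For reference, a rough sketch of what such an installer-side check could
look like, assuming the gpg command line tool is installed and the
author's public key is already in the local keyring. The helper names
and file paths are hypothetical; no installer is claimed to work this
way today.

```python
import subprocess


def gpg_verify_command(signature_path, file_path):
    """Build the gpg call that checks a detached signature.

    --batch keeps gpg from prompting, which matters inside an installer.
    """
    return ["gpg", "--batch", "--verify", signature_path, file_path]


def signature_is_valid(signature_path, file_path):
    """Return True only if gpg exits with status 0 for this file.

    Any failure is treated as "not verified": a bad signature, an
    unknown key, a missing file, or gpg not being installed.
    """
    try:
        result = subprocess.run(
            gpg_verify_command(signature_path, file_path),
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
    except FileNotFoundError:
        # The gpg executable itself could not be found.
        return False
    return result.returncode == 0


if __name__ == "__main__":
    # Hypothetical file names for a downloaded distribution.
    ok = signature_is_valid("foo-1.0.tar.gz.asc", "foo-1.0.tar.gz")
    print("signature verified" if ok else "signature NOT verified")
```

A passing check only says the file was signed by a key in the keyring;
deciding which keys belong there is the trust-the-author part.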

>> I'm not sure what you meant with privacy in this context.
>
> If I download something from a server there is a certain amount
> of information that by nature of HTTP and networking gets
> leaked to that host. Additionally if it's done via non TLS connections
> it also gets leaked to anyone who has a MITM on my connection.
>
> This is especially important in countries where the government
> actively surveils or modifies the traffic of their citizens.

I can see an issue with e.g. trying to download code that
is illegal to use in a country (e.g. crypto code, exploits,
hacks, etc.), but the country officials would probably rather
block the complete PyPI site than bother with filtering single
requests.

IMO, that's beyond the scope of what we're discussing
here, though.

>> Something that would work even with older Python versions is
>> letting the download URL point to a meta-file which contains
>> the links to the other distribution files. That way you
>> avoid having the installers trying to parse arbitrary
>> websites and you can add more security to the downloads
>> by providing hash values, etc. in those meta-files.
>>
>> Since installers already know how to parse the /simple/
>> (HTML) index files, we might use that same format
>> for those meta-files.

So what do you think of the above idea ?
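
To make the idea a bit more concrete, here is a minimal sketch of how an
installer might read such a meta-file. It assumes the meta-file is an
HTML page of links in the /simple/ style, with the hash carried in the
URL fragment as "#<algorithm>=<hexdigest>". The exact fragment
convention, the function names and the downloads.example.com URLs are
assumptions for illustration, not a worked-out format.

```python
import hashlib
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin


class LinkCollector(HTMLParser):
    """Collect the href of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.hrefs.append(href)


def parse_meta_file(page, base_url):
    """Map each download URL in the meta-file to (hash_name, hex_digest).

    Relative links are resolved against base_url. Links without a
    usable hash fragment are skipped rather than trusted.
    """
    collector = LinkCollector()
    collector.feed(page)
    files = {}
    for href in collector.hrefs:
        url, fragment = urldefrag(urljoin(base_url, href))
        name, sep, digest = fragment.partition("=")
        if sep and name in hashlib.algorithms_available:
            files[url] = (name, digest.lower())
    return files


def verify(data, hash_name, expected_digest):
    """Check downloaded bytes against the digest from the meta-file."""
    return hashlib.new(hash_name, data).hexdigest() == expected_digest


if __name__ == "__main__":
    # Stand-in for a real distribution file and its meta-file entry.
    payload = b"pretend this is foo-1.0.tar.gz"
    digest = hashlib.sha256(payload).hexdigest()
    page = '<a href="foo-1.0.tar.gz#sha256={}">foo-1.0.tar.gz</a>'.format(digest)

    files = parse_meta_file(page, "https://downloads.example.com/foo/meta.html")
    for url, (hash_name, expected) in files.items():
        print(url, verify(payload, hash_name, expected))
```

The installer then only ever parses one known page format and can
reject any download whose digest does not match.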

-- 
Marc-Andre Lemburg
eGenix.com
