[Distutils] easy_install wrong download site preference

anatoly techtonik techtonik at gmail.com
Fri Jul 2 00:33:19 CEST 2010


On Fri, Jul 2, 2010 at 12:10 AM, P.J. Eby <pje at telecommunity.com> wrote:
>> >
>> > It prefers newer packages, or, if the versions are the same, it prefers
>> > the shortest download URL.  In this case, the Google Code url is shorter.
>>
>> That's illogical. Better prefer PyPI if versions are the same.
>
> The "shortest path" logic is there to avoid certain file recognition
> problems that occur without it.  The special case of PyPI isn't special
> enough to break those rules.

Practicality beats purity, though. Can you list those "certain file
recognition problems"? Explicit is better than implicit.
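
To make the rule under discussion concrete, here is a rough sketch of the
ordering as it is described in the quoted text: the newest version wins, and
among equal versions the shortest URL wins. This is only an illustration and
not the actual setuptools code; the function names, the naive version parsing
and the example URLs are all assumptions made for the sketch.

```python
# Hypothetical sketch of the described ordering; NOT the real setuptools code.

def version_key(version):
    # Naive numeric parse, e.g. "2.3.0" -> (2, 3, 0).
    # Real version comparison is more involved; this is an assumption.
    return tuple(int(part) for part in version.split("."))


def pick_candidate(candidates):
    # candidates: list of (version, url) pairs.
    newest = max(version_key(version) for version, _ in candidates)
    same_version = [c for c in candidates if version_key(c[0]) == newest]
    # Tie-breaker among equal versions: the shortest URL wins.
    return min(same_version, key=lambda c: len(c[1]))


# Illustrative URLs only.
candidates = [
    ("2.3.0", "http://pypi.python.org/packages/source/p/protobuf/protobuf-2.3.0.tar.gz"),
    ("2.3.0", "http://protobuf.googlecode.com/files/protobuf-2.3.0.tar.gz"),
    ("2.2.0", "http://example.org/p-2.2.0.tgz"),
]
chosen = pick_candidate(candidates)
print(chosen[1])
```

With these inputs the shorter Google Code URL is chosen over the PyPI one,
even though both offer the same version, which is the behaviour I am
complaining about.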

>> PyPI is that page. Google Code URL homepage doesn't have any Python
>> related downloads at all.
> There isn't any way to know that from the filename.

That's why it should use the site where all filenames are Python
downloads if filenames are the same.

>> What if we set download_url instead of
>> homepage back to PyPI page - will it satisfy setuptools as a quick
>> fix?
> No.  You'd need to remove the current "home_page" setting, or point it
> elsewhere.

That's very strange. Then what is download_url for?

>>  (I understand that people do not want to touch setuptools code
>> anymore)
>
> That's not really the issue; the issue here is that package precedence is
> based on a stable comparison scheme, where it doesn't make sense to give one
> location priority over another, as it will simply lead to someone else
> complaining about the changed behavior, because they were relying on a
> different URL having precedence under the current scheme.

These rules need to be described first. What if somebody already broke
the proper order and now everybody suffers? If the autodiscovery rules
were well described, it would be possible to analyse them and propose a
more intuitive approach. Then if "someone else" attempts to complain,
you could send them to the PEP or another "how and why" document.

> What's more, even if I made this change and released it immediately into the
> wild, you would *still* have this problem for the entire existing installed
> base, which is considerable.  (Many people have not upgraded from setuptools
> 0.6c2 or 0.6c9, which are many years old.)

Do you know how many? I suspect they probably do not need protocol
buffers anyway. In either case it is absolutely normal to require a
newer version of setuptools for installation and mention it in the docs.

> And, the distribute fork would need to match the behavior as well, or
> there's that group of people who may still end up with the wrong file.

Chances are that it will be the right file (according to "how and why" PEP). =)

>  Already, AFAICT, distribute has changed (at least in the repository) the
> path precedence rules in a way that is not consistent with the way
> setuptools does it, for both URLs *and* local file paths, so I can't really
> give any guidance as to what their version of easy_install will do -- it may
> do the "right thing" in this case, but if so, it's essentially by accident.
>  (i.e., it won't necessarily do the right thing in other cases, since their
> changes weren't motivated by the same issue.)

There definitely should be some FAQ/doc/agreement about how the weight of
the links is calculated during the discovery process. Should we start with
describing the current state in Google Wave?

> (One thing that I *could* do, would be to give precedence to links with #md5
> tags -- this would automatically make PyPI (and PyPI-clone) links score
> higher, and would be less likely to introduce problems than trying to force
> recognition of PyPI specifically.  But this would still have the problem of
> getting out into the field in a timely way.)

Perhaps not specifically PyPI, but any [default] index/mirror site
should be scored higher, although any fix would suffice.

I don't see what md5 is for. If it is meant to protect downloads, then
people who can upload a malicious package will have it regenerated
automatically, and in a MITM attack it is not a problem to replace the
md5 either.
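
For reference, the #md5 precedence idea quoted above could look roughly like
the sketch below: links carrying an "#md5=" fragment sort first, and URL
length remains the tie-breaker after that. This is an assumption about how
such a rule might be written, not code from setuptools; the helper names, the
URLs and the placeholder digest are made up for the example.

```python
# Hypothetical sketch of "prefer links with #md5 tags"; not setuptools code.
from urllib.parse import urlparse


def has_md5_tag(url):
    # True when the URL fragment looks like "md5=<hex digest>".
    return urlparse(url).fragment.startswith("md5=")


def tie_break_key(url):
    # False sorts before True, so tagged links come first;
    # URL length is kept as the secondary tie-breaker.
    return (not has_md5_tag(url), len(url))


# Illustrative URLs; the digest is a placeholder, not a real checksum.
urls = [
    "http://protobuf.googlecode.com/files/protobuf-2.3.0.tar.gz",
    "http://pypi.python.org/packages/source/p/protobuf/protobuf-2.3.0.tar.gz"
    "#md5=0123456789abcdef0123456789abcdef",
]
ranked = sorted(urls, key=tie_break_key)
print(ranked[0])
```

Here the tagged link ranks first despite being longer. Note that the tag only
affects ordering in this sketch; it says nothing about whether the checksum
itself offers any protection.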

>> Where is the relevant code for this PyPI? I wonder why it didn't set
>> rel="download" for PyPI downloads if it should?
>
> That's not the problem either; easy_install is *finding* those links just
> fine; if you use "easy_install -vvn protobuf" you'll see it list both the
> PyPI tgz's and the Google Code URLs as candidates; it's just using the URL
> length as the tie-breaker.

I thought it would raise the weight of those links if they had a
rel="download" attribute.
-- 
anatoly t.

