[Distutils] Handling Case/Normalization Differences

holger krekel holger at merlinux.eu
Tue Sep 2 11:36:47 CEST 2014


On Mon, Sep 01, 2014 at 19:07 -0400, Donald Stufft wrote:
> > On Sep 1, 2014, at 4:53 PM, holger krekel <holger at merlinux.eu> wrote:
> > 
> > On Thu, Aug 28, 2014 at 14:58 -0400, Donald Stufft wrote:
> >> Right now the “canonical” page for a particular project on PyPI is whatever the
> >> author happened to name their package (e.g. Django). This requires PyPI to have
> >> some "smarts" so that it can redirect things like /simple/django/ to
> >> /simple/Django/ otherwise someone doing ``pip install django`` would fall back
> >> to a much worse behavior.
> >> 
> >> If this redirect doesn't happen, then pip will issue a request for just
> >> /simple/ and look for a link that, when both sides are normalized, compares
> >> equal to the name it's looking for. It will then follow the link, get
> >> /simple/Django/ and everything works... Except it doesn't. The problem here
> >> comes from the external link classification that we have now. Pip sees the
> >> link to /simple/Django/ as an external link (because it lacks the required
> >> rels) and the installation finally fails.
> >> 
> >> The /simple/ case rarely happens when installing from PyPI itself because of
> >> the redirect, however it happens quite often when someone is attempting to
> >> install from a mirror instead. Even when everything works correctly, the penalty
> >> for not knowing exactly what name to type in is at least 1 extra HTTP
> >> request, one of which (/simple/) requires pulling down a 2.1MB file.
> >> 
> >> To fix this I'm going to modify PyPI so that it uses the normalized name in
> >> the /simple/ URL and redirects everything else to the non-normalized name.
> > 
> > Of course you mean redirecting everything to the normalized name.
> > 
> >> I'm also going to submit a PR to bandersnatch so that it will use
> >> normalized names ...
> > 
> > devpi-server also broke and I did a hotfix release today.  Older
> > installs will still have a problem, though (not all companies run the
> > newest version all the time).  Apart from the fact i was on vacation and
> > on business travels, the notice for that breaking change was only one
> > day which i think is a bit too quick.  I'd really appreciate if you sent
> > a mail to Christian for bandersnatch and me for devpi before such
> > changes happen, and with a more reasonable lead time.
> > 
> > Besides, i think it's a good change in principle.
> > 
> > best and thanks,
> > holger
> 
> I can only really reply to this with https://xkcd.com/1172/.
> This shouldn’t have been a breaking change; anyone following the HTTP
> spec dealt with this change just fine. As far as I can tell the only reason
> it broke devpi was because of an assertion in the code that was asserting
> against an implementation detail, an implementation detail that I changed.

Right, the assertion was there to ensure pypi's "realname" and devpi's
internal "realname" of a project are the same.  This check is now relaxed.

FWIW I'd prefer if we just said in all pypi APIs (http and xmlrpc/json)
that a project name is always kept in canonical form, i.e. you can maybe
register "HeLlo_World" but it just means "hello-world" next time someone
asks for it.  What is the relevance of the "realname" anyway?
Do you keep "realnames" in warehouse? 
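
To make the "canonical form" idea concrete, here is a minimal sketch in
Python.  The rule it uses (lowercase the name and collapse runs of "-",
"_" and "." into a single "-") is an assumption for illustration; this
thread does not spell out the exact rule, and the helper names
normalize() and same_project() are hypothetical, not taken from pip,
PyPI or devpi.

```python
import re


def normalize(name):
    """Return an assumed canonical form of a project name.

    Assumption: runs of '-', '_' and '.' collapse to one '-',
    and the result is lowercased.
    """
    return re.sub(r"[-_.]+", "-", name).lower()


def same_project(a, b):
    """Compare two names after normalizing both sides."""
    return normalize(a) == normalize(b)


print(normalize("HeLlo_World"))          # hello-world
print(same_project("Django", "django"))  # True
```

With a rule like this, a tool only ever needs to compare or request the
normalized form, and the registered spelling becomes display-only.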

> I’m sorry it broke devpi and that it happened at a time when you were
> on vacation, but honestly I don’t think it’s reasonable to expect every
> little thing to have to be run past a list of people. Due to the undocumented
> nature of these tools people have put a lot of (also undocumented)
> assumptions into their code, many of which are simply depending on
> implementation details. I try to test my changes against what I can,
> in this case pip, setuptools, and bandersnatch, but I can’t test against
> everything.

Thanks for all your work and eagerness to improve things.  I think
it's safe to assume that any change in PyPI's pip/bandersnatch/devpi
facing http API has potential for disruption even if some http
specification says otherwise -- at least until we have some specification
of how tool/pypi interactions work.

best,
holger

