setuptools or PyPI problem...?
I don't know which side this belongs to, but I had a problem when I tried to create a package with a "-" in it ("Paste-Deploy"). setup.py register worked fine, and created a "Paste-Deploy" project; however, when I did an upload it created a "Paste_Deploy-0.1.tar.gz" file, and PyPI wouldn't accept it, I believe because it thought it belonged to the (nonexistent) Paste_Deploy project. In the end I just smushed the words together, but I figure this really should work. I don't know if it's setuptools (or distutils) that is uploading to PyPI incorrectly, or PyPI that is mismatching projects.

-- Ian Bicking / ianb@colorstudy.com / http://blog.ianbicking.org
At 01:26 PM 8/22/2005 -0500, Ian Bicking wrote:
I don't know which side this belongs to, but I had a problem when I tried to create a package with a "-" in it ("Paste-Deploy"). setup.py register worked fine, and created a "Paste-Deploy" project; however, when I did an upload it created a "Paste_Deploy-0.1.tar.gz" file, and PyPI wouldn't accept it, I believe because it thought it belonged to the (nonexistent) Paste_Deploy project. In the end I just smushed the words together, but I figure this really should work. I don't know if it's setuptools (or distutils) that is uploading to PyPI incorrectly, or PyPI that is mismatching projects.
Try the latest setuptools; you shouldn't have this problem as I changed it not to create anything other than eggs with escaped '-' characters. However, it would be very nice to have PyPI support escaped nonalphanumerics in filenames, as they are a bitch to deal with otherwise. Eggs absolutely have to have an unambiguously parseable filename, and the only way to do that is by escaping '-' to '_'. This means that you can't upload eggs for a project named e.g. 'Paste-Deploy', if PyPI rejects a Paste_Deploy-whatever.egg file. On the broader scope of things, I'd like to see PyPI smash all non-alphanumeric runs in project names to a single '-', and use case-insensitive project name comparison. I'd attempt to try my hand at PyPI patches but at the moment don't have any obvious way to test them.
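To illustrate why the '-' has to be escaped, here is a minimal sketch, not the actual setuptools code. It assumes a simplified egg filename of the form name-version-pyX.Y.egg (ignoring any optional platform tag), and the helper names are hypothetical. Once every '-' inside a field is turned into '_', splitting on '-' always yields exactly three fields.

```python
def escape_component(value):
    """Replace '-' with '_' so that '-' is left only as the field separator."""
    return value.replace("-", "_")


def egg_filename(project, version, pyver="2.4"):
    """Hypothetical helper: build '<name>-<version>-py<pyver>.egg'."""
    return "%s-%s-py%s.egg" % (
        escape_component(project),
        escape_component(version),
        pyver,
    )


def parse_egg_filename(filename):
    """Split an escaped egg filename back into (name, version, pyver)."""
    stem = filename[:-len(".egg")]
    name, version, pytag = stem.split("-")  # exactly three fields when escaped
    return name, version, pytag[len("py"):]


escaped = egg_filename("Paste-Deploy", "0.1")
print(escaped)                      # Paste_Deploy-0.1-py2.4.egg
print(parse_egg_filename(escaped))  # ('Paste_Deploy', '0.1', '2.4')

# Without escaping, the same split yields four fields, and a parser cannot
# tell whether "Deploy" belongs to the name or to the version.
print("Paste-Deploy-0.1-py2.4".split("-"))
```

The escaping is lossy: "Paste-Deploy" and "Paste_Deploy" produce the same filename, which is the situation the rest of this thread discusses.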
Ian Bicking wrote:
I don't know which side this belongs to, but I had a problem when I tried to create a package with a "-" in it ("Paste-Deploy"). setup.py register worked fine, and created a "Paste-Deploy" project; however, when I did an upload it created a "Paste_Deploy-0.1.tar.gz" file, and PyPI wouldn't accept it, I believe because it thought it belonged to the (nonexistent) Paste_Deploy project. In the end I just smushed the words together, but I figure this really should work. I don't know if it's setuptools (or distutils) that is uploading to PyPI incorrectly, or PyPI that is mismatching projects.
Sounds like a PyPI problem to me. Regards, Martin
On Tue, 23 Aug 2005 04:26 am, Ian Bicking wrote:
I don't know which side this belongs to, but I had a problem when I tried to create a package with a "-" in it ("Paste-Deploy"). setup.py register worked fine, and created a "Paste-Deploy" project; however, when I did an upload it created a "Paste_Deploy-0.1.tar.gz" file, and PyPI wouldn't accept it, I believe because it thought it belonged to the (nonexistent) Paste_Deploy project.
The only restrictions PyPI places on filenames for uploads are:

    # check for valid filenames
    filename = content.filename
    if not safe_filenames.match(filename):
        raise FormError, 'invalid distribution file'

    # check for dodgy filenames
    if '/' in filename or '\\' in filename:
        raise FormError, 'invalid distribution file'

    # check the file for valid contents based on the type
    if not verify_filetype.is_distutils_file(content, filename, filetype):
        raise FormError, 'invalid distribution file'

Where:

    safe_filenames = re.compile(r'.+?\.(exe|tar\.gz|bz2|rpm|deb|zip|tgz|egg)$', re.I)

and "is_distutils_file" just looks at the extension and pokes into the file based on the extension to make sure that an ".exe" upload looks kinda like an installer, and ".zip" and ".egg" uploads look kinda like ZIP files of distutils origin.

No checks are made that a filename matches a package name. So given the metadata:

    setup(
        name="To-Do List",
        version="1.23 alpha!",
        ...
    )

as long as that *name* (and version) is passed unchanged to PyPI, a file named "frozzleplop-1.2.3.zip" could be attached to the "To-Do List" package.

I can only assume that setuptools is mutating the name/version in order to generate a safe filename, but then passing the mutated name/version to PyPI as the release identifier.

I think it's an unacceptable change to make to PyPI to accept the mutated name/version, as the name/version represents the unique identifier in the database for a package. Unique identifier collisions are possible when you start mangling them, and I'd really prefer to avoid such things.

Richard
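The filename checks described above can be tried out in isolation. The sketch below reuses the safe_filenames pattern from the message and wraps the first two checks in a hypothetical is_acceptable_filename helper. It leaves out the content check, since verify_filetype is not shown in the thread, and it is written for a current Python rather than copied from PyPI.

```python
import re

# Pattern copied from the message above.
safe_filenames = re.compile(
    r'.+?\.(exe|tar\.gz|bz2|rpm|deb|zip|tgz|egg)$', re.I)


def is_acceptable_filename(filename):
    """Apply the extension check and the path-separator check only."""
    if not safe_filenames.match(filename):
        return False
    if '/' in filename or '\\' in filename:
        return False
    return True


candidates = [
    "Paste_Deploy-0.1.tar.gz",   # accepted: known extension
    "frozzleplop-1.2.3.zip",     # accepted: the name is never compared
    "Paste-Deploy-0.1.TAR.GZ",   # accepted: the match is case-insensitive
    "notes.txt",                 # rejected: unknown extension
    "../evil.zip",               # rejected: contains a path separator
]
for candidate in candidates:
    print("%-26s %s" % (candidate, is_acceptable_filename(candidate)))
```

Nothing in these two checks looks at the project name, which is consistent with the statement that any well-formed filename can be attached to any release.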
At 10:25 AM 9/23/2005 +1000, Richard Jones wrote:
I can only assume that setuptools is mutating the name/version in order to generate a safe filename, but then passing the mutated name/version to PyPI as the release identifier.
I haven't attempted to reproduce Ian's problem, but I don't believe I'm doing this, at least not in the upload command. I'll have to check.
I think it's an unacceptable change to make to PyPI to accept the mutated name/version, as the name/version represents the unique identifier in the database for a package. Unique identifier collisions are possible when you start mangling them, and I'd really prefer to avoid such things.
I'd like to encourage moving towards mangling the *keys* themselves, in order to be robust in the face of typos. I think allowing random punctuation and upper/lower case to distinguish projects (e.g. allowing SQLObject, sqlobject, and SQLobject to be different packages) is asking for trouble on the human side of things, entirely aside from allowing them in filenames, dealing with case-insensitive filesystems, and so on. Having a nice human readable name for the web page, PKG-INFO, and description are one thing, but having it used for filenames, URLs, and database keys is quite another.
On Fri, 23 Sep 2005 01:32 pm, Phillip J. Eby wrote:
I'd like to encourage moving towards mangling the *keys* themselves, in order to be robust in the face of typos. I think allowing random punctuation and upper/lower case to distinguish projects (e.g. allowing SQLObject, sqlobject, and SQLobject to be different packages) is asking for trouble on the human side of things, entirely aside from allowing them in filenames, dealing with case-insensitive filesystems, and so on. Having a nice human readable name for the web page, PKG-INFO, and description are one thing, but having it used for filenames, URLs, and database keys is quite another.
I believe what you're proposing would require changing Python itself so it enforces restrictions on package names (ie. all lower-case, very limited punctuation, no whitespace). I think that's a good idea, but I also think at this point that the cat's out of the bag :( Happy to hear contradictory views and be shouted down. Richard
At 01:37 PM 9/23/2005 +1000, Richard Jones wrote:
On Fri, 23 Sep 2005 01:32 pm, Phillip J. Eby wrote:
I'd like to encourage moving towards mangling the *keys* themselves, in order to be robust in the face of typos. I think allowing random punctuation and upper/lower case to distinguish projects (e.g. allowing SQLObject, sqlobject, and SQLobject to be different packages) is asking for trouble on the human side of things, entirely aside from allowing them in filenames, dealing with case-insensitive filesystems, and so on. Having a nice human readable name for the web page, PKG-INFO, and description are one thing, but having it used for filenames, URLs, and database keys is quite another.
I believe what you're proposing would require changing Python itself so it enforces restrictions on package names (ie. all lower-case, very limited punctuation, no whitespace). I think that's a good idea, but I also think at this point that the cat's out of the bag :(
Python doesn't let you use spaces and arbitrary punctuation in package names, so perhaps I've misunderstood you. Also, just in case you've misunderstood me, I'm referring above to *project* names, not package names. For example, PEAK has its project name registered as PEAK on PyPI, but its Python package name (that you actually import) is 'peak'. I'm referring above only to project names, not package names.
On Sat, 24 Sep 2005 01:40 am, Phillip J. Eby wrote:
At 01:37 PM 9/23/2005 +1000, Richard Jones wrote:
I believe what you're proposing would require changing Python itself so it enforces restrictions on package names (ie. all lower-case, very limited punctuation, no whitespace). I think that's a good idea, but I also think at this point that the cat's out of the bag :(
Python doesn't let you use spaces and arbitrary punctuation in package names, so perhaps I've misunderstood you.
Also, just in case you've misunderstood me, I'm referring above to *project* names, not package names. For example, PEAK has its project name registered as PEAK on PyPI, but its Python package name (that you actually import) is 'peak'. I'm referring above only to project names, not package names.
By "package name" I mean the distutils package name. Richard
At 08:45 AM 9/24/2005 +1000, Richard Jones wrote:
On Sat, 24 Sep 2005 01:40 am, Phillip J. Eby wrote:
At 01:37 PM 9/23/2005 +1000, Richard Jones wrote:
I believe what you're proposing would require changing Python itself so it enforces restrictions on package names (ie. all lower-case, very limited punctuation, no whitespace). I think that's a good idea, but I also think at this point that the cat's out of the bag :(
Python doesn't let you use spaces and arbitrary punctuation in package names, so perhaps I've misunderstood you.
Also, just in case you've misunderstood me, I'm referring above to *project* names, not package names. For example, PEAK has its project name registered as PEAK on PyPI, but its Python package name (that you actually import) is 'peak'. I'm referring above only to project names, not package names.
By "package name" I mean the distutils package name.
Okay, so I still don't understand why this requires "changing Python itself". Could you explain?
On Sat, 24 Sep 2005 10:10 am, Phillip J. Eby wrote:
Okay, so I still don't understand why this requires "changing Python itself". Could you explain?
Distutils metadata capture is implemented in the Python core. We would want to implement any name restrictions there, surely? Otherwise people only get an error when attempting to use setuptools or register with PyPI, which would be just annoying. Richard
On 9/23/05, Richard Jones <richardjones@optushome.com.au> wrote:
Distutils metadata capture is implemented in the Python core. We would want to implement any name restrictions there, surely? Otherwise people only get an error when attempting to use setuptools or register with PyPI, which would be just annoying.
The use of distutils should not imply the use of PyPI. Perhaps we'd want distutils to issue a warning when building a distribution if the naming conventions weren't acceptable, but that's the most we'd want. That should be something that could easily be turned off for a site or an individual. -Fred -- Fred L. Drake, Jr. <fdrake at gmail.com> Zope Corporation
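A warning of the kind suggested here could look roughly like the sketch below. This is not distutils code: the function name, the enabled switch, and the naming convention itself (letters, digits, '.', '_' and '-') are assumptions made for illustration. Only the standard warnings module is used, so a site or an individual can also silence the message through the usual warning filters.

```python
import re
import warnings

# Assumed convention, for illustration only.
_CONVENTIONAL_NAME = re.compile(r'^[A-Za-z0-9._-]+$')


def warn_on_unconventional_name(name, enabled=True):
    """Warn, but never fail, when a name departs from the assumed convention.

    Returns True if a warning was issued, False otherwise.
    """
    if enabled and not _CONVENTIONAL_NAME.match(name):
        warnings.warn(
            "project name %r departs from the naming convention" % (name,),
            UserWarning,
        )
        return True
    return False


warn_on_unconventional_name("Paste-Deploy")                # no warning
warn_on_unconventional_name("To-Do List")                  # warns, build continues
warn_on_unconventional_name("To-Do List", enabled=False)   # check switched off
```

Because the check only warns, building a distribution with distutils stays independent of whatever rules an index such as PyPI chooses to enforce.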
At 11:40 PM 9/23/2005 -0400, Fred Drake wrote:
On 9/23/05, Richard Jones <richardjones@optushome.com.au> wrote:
Distutils metadata capture is implemented in the Python core. We would want to implement any name restrictions there, surely? Otherwise people only get an error when attempting to use setuptools or register with PyPI, which would be just annoying.
The use of distutils should not imply the use of PyPI. Perhaps we'd want distutils to issue a warning when building a distribution if the naming conventions weren't acceptable, but that's the most we'd want. That should be something that could easily be turned off for a site or an individual.
Not only that, but I'm not suggesting we ban those characters from names. I'm suggesting merely that we strip them in a uniform way. The error message would be "somebody already has a project with a name that's too similar to yours", not "you have unacceptable characters in your project name". :)

I'm suggesting, in other words, that "Foo*Bar" and "Foo!Bar" simply not be considered unique project names, not that whichever project registers the name first can't use it with funky punctuation in PKG-INFO and display it on their PyPI page that way. (I would also suggest that we clarify the rules for determining project name uniqueness and recommend people follow them for simplicity's sake, of course.)

I'm also suggesting that if somebody goes to the URL "/pypi/foo--bar", it would still pull up the "Foo*Bar" project if that's the one that's registered, because canonicalizing 'foo--bar' should yield the same unique key as canonicalizing 'Foo*Bar'. (This is particularly nice for EasyInstall users, since it wouldn't need to fall back to pulling down the entire index to do a case-insensitive search when they don't match someone's CreativeCAPS in a project name.)

In other words, all user inputs (URL or otherwise) should be normalized for key storage and lookup, distinct from the human-readable name of the package. (Setuptools implements this for eggs by having distinct "project_name" and "key" attributes.)

This approach has a few important features:

1. It can be implemented without renaming existing packages, unless there are actual conflicts in PyPI today.
2. It can be implemented without any need for co-operation from package authors, because it's strictly a PyPI-side change.
3. It allows authors to fully express their creativity in naming.
4. It allows end-users to ignore the authors' creativity. :)

The principal downside, of course, is that it's probably not a minor change to the PyPI code base, with respect to the "two names" issue, or with respect to how lookups are done.
Which is why I haven't been hounding Richard to do it. Well, maybe just a little. ;)
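The proposal in this message can be sketched as follows. This is an illustration only, not PyPI or setuptools code: canonical_key and ProjectIndex are hypothetical names, and the exact rule (collapse each run of non-alphanumeric characters to a single '-', then lowercase) is an assumption based on the wording earlier in the thread.

```python
import re


def canonical_key(project_name):
    """Collapse each run of non-alphanumerics to '-' and lowercase the result."""
    return re.sub(r'[^A-Za-z0-9]+', '-', project_name).lower()


class ProjectIndex:
    """Toy registry: stored by canonical key, displayed by registered name."""

    def __init__(self):
        self._display_names = {}

    def register(self, project_name):
        key = canonical_key(project_name)
        existing = self._display_names.get(key)
        if existing is not None and existing != project_name:
            raise ValueError(
                "%r is too similar to the registered project %r"
                % (project_name, existing))
        self._display_names[key] = project_name
        return key

    def lookup(self, user_input):
        """Normalize any user input (URL segment or otherwise) before lookup."""
        return self._display_names.get(canonical_key(user_input))


index = ProjectIndex()
index.register("Foo*Bar")
print(index.lookup("foo--bar"))   # Foo*Bar
try:
    index.register("Foo!Bar")
except ValueError as exc:
    print(exc)
```

The registered spelling is kept for display, while storage and lookup use only the key, so "SQLObject", "sqlobject" and "SQLobject" would all resolve to whichever spelling was registered first.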
participants (5)
- "Martin v. Löwis"
- Fred Drake
- Ian Bicking
- Phillip J. Eby
- Richard Jones