On 11/10/2012 03:38 AM, Daniel Holth wrote:
Although I think ~ is a very ugly substitute for -, it could be useful to change the separator to something less commonly used than the hyphen.
It would be useful to be able to use the hyphen - in the version of a package (for semver) and elsewhere. Using it as the separator could make parsing the file name a bit trickier than is healthy.
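To make the parsing concern concrete, here is a minimal sketch that lists every way a directory name could be split into name and version. The helper `candidate_splits` and the sample directory names are made up for illustration; they are not part of pip, setuptools, or any PEP.

```python
def candidate_splits(dirname, sep="-", suffix=".dist-info"):
    """Return every (name, version) pair the directory name could encode."""
    # Drop the fixed suffix first; note that it contains a hyphen itself.
    stem = dirname[: -len(suffix)] if dirname.endswith(suffix) else dirname
    parts = stem.split(sep)
    # Any separator occurrence could be the name/version boundary.
    return [
        (sep.join(parts[:i]), sep.join(parts[i:]))
        for i in range(1, len(parts))
    ]


# With '-' as separator and hyphens allowed in both fields, the split is ambiguous.
for pair in candidate_splits("my-pkg-1.0.0-beta.dist-info"):
    print(pair)

# With a rarely used separator such as '~', only one split remains.
print(candidate_splits("my_pkg~1.0.0-beta.dist-info", sep="~"))
```

A parser has to pick one of the hyphen candidates by convention, which is why today's tools fold hyphens away before building the file name.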
This change would affect PEP 376 which reads:
    This distinct directory is named as follows::

        name + '-' + version + '.dist-info'
with today's hyphen/underscore folding: re.sub('[^A-Za-z0-9.]+', '-', version), could become
It would also affect pip, setuptools, and the wheel PEPs. If we do this, I would like to allow Unicode package names at the same time. safe_name(), the pkg_resources function that escapes package names for file names, would become:

    re.sub(u"[^\w.]+", "_", u"package-name", flags=re.U)
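As a rough sketch of what that substitution does, wrapped in a function: the name `safe_name_unicode` is hypothetical and is not the actual pkg_resources implementation. On Python 3, `str` patterns already match Unicode word characters, so the `re.UNICODE` flag is redundant there and is kept only to mirror the expression above.

```python
import re


def safe_name_unicode(name):
    """Fold every run of characters other than Unicode word chars and '.' to '_'."""
    # Hypothetical variant of safe_name(); \w covers Unicode letters, digits and '_'.
    return re.sub(r"[^\w.]+", "_", name, flags=re.UNICODE)


print(safe_name_unicode("package-name"))      # hyphen is folded to '_'
print(safe_name_unicode("caf\u00e9-tools"))   # the non-ASCII letter survives
print(safe_name_unicode("zope.interface"))    # dots are left alone
```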
In other words, the rule for package names would be that they can contain any Unicode alphanumeric or _ or dot. Right now package names cannot practically contain non-ASCII because the setuptools installation will fold it all to _ and installation metadata will collide on the disk.
safe_version(), presently the same as safe_name(), would also need to allow + for semver.
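A comparable sketch for versions, assuming the only change is adding `+` to the set of allowed characters. The name `safe_version_plus` is hypothetical, and whether `-` should also pass through unchanged depends on which separator is finally chosen, so it is still folded here.

```python
import re


def safe_version_plus(version):
    """Like the Unicode safe_name() sketch, but '+' is kept for semver build metadata."""
    # Assumption: '-' is still folded; add it to the class if the separator changes.
    return re.sub(r"[^\w.+]+", "_", version, flags=re.UNICODE)


print(safe_version_plus("1.0.0+build.5"))   # '+' and '.' pass through
print(safe_version_plus("1.0.0-beta+exp"))  # '-' is folded to '_'
```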
How about, rather than trying to create a complicated 1:1 mapping between metadata and filename, just use a hash of the metadata?
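Something along these lines, as a sketch only. The function `hashed_dirname`, the choice of SHA-256, the 16-character truncation, and hashing just name and version are all assumptions made for illustration, not a worked-out proposal.

```python
import hashlib
import re


def hashed_dirname(name, version, suffix=".dist-info"):
    """Build 'readable.digest.dist-info' from a lossy label plus a metadata hash."""
    # One-way, human-friendly part: free to collide, so the folding can be crude.
    readable = re.sub(r"[^A-Za-z0-9.]+", "_", f"{name}-{version}")
    # The digest is taken over the exact metadata, so distinct inputs stay distinct.
    digest = hashlib.sha256(f"{name}\n{version}".encode("utf-8")).hexdigest()[:16]
    return f"{readable}.{digest}{suffix}"


a = hashed_dirname("foo-bar", "1.0")
b = hashed_dirname("foo", "bar-1.0")
print(a)
print(b)
```

Both calls produce the same readable prefix `foo_bar_1.0`, but the digests differ, so the two directories do not collide on disk.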
There should still be a "user-readable" part of the filename, but it now only has to be a one-way function that is allowed to collide. So if you do for some reason have a collision in the user-friendly part the hash still saves you,