2009/7/4 Paul Moore email@example.com:
2009/7/4 Paul Moore firstname.lastname@example.org:
2009/7/3 Tarek Ziadé email@example.com:
You can give me a Bitbucket account so I can give you write access to the repo. There are tests, as long as you install Nose.
How do I get the tests to work? Just running nosetests gives an error (probably because pkgutil is being imported from the stdlib, rather than from this directory).
I just run them from within the directory
If I set PYTHONPATH=. then I get errors. I suspect path normalisation (for backslashes) in the zipfile handling.
Actually, the test
assert_equals(list(dist.get_egginfo_files(local=True)), [os.path.join(SITE_PKG, 'mercurial-1.0.1.egg-info/PKG_INFO'), os.path.join(SITE_PKG, 'mercurial-1.0.1.egg-info/RECORD')])
is broken, because the expected value uses slashes, which are *not* the local separator on win32.
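To illustrate the separator problem (this is a sketch, not necessarily what the attached patch does): joining each path component separately lets the path module choose the separator, whereas a slash embedded in one component is kept verbatim. The stdlib `ntpath` and `posixpath` modules are used so the difference is visible on any platform; the two site-packages locations and the `expected_paths` helper are made up for the example.

```python
import ntpath
import posixpath

# Hypothetical site-packages locations, for illustration only.
WIN_SITE_PKG = r"C:\Python26\Lib\site-packages"
POSIX_SITE_PKG = "/usr/local/lib/python2.6/site-packages"

EGGINFO = "mercurial-1.0.1.egg-info"


def expected_paths(pathmod, site_pkg):
    """Build the expected egg-info paths one component at a time."""
    return [pathmod.join(site_pkg, EGGINFO, name)
            for name in ("PKG_INFO", "RECORD")]


# A slash embedded in a single component survives the join on win32 ...
mixed = ntpath.join(WIN_SITE_PKG, EGGINFO + "/PKG_INFO")

# ... whereas separate components get the local separator throughout.
clean = expected_paths(ntpath, WIN_SITE_PKG)[0]

print(mixed)
print(clean)
```

With `posixpath` both spellings give the same string, which is why the test passes on Unix-like systems and only fails on win32.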
I've attached a patch.
Applied, thanks (I didn't run them under win32 yet)
But there are two comments I'd make (one minor, one major).
Minor one: The tests often seem to be exercising the internal classes, not so much the public API, so many of them will probably not be of much use to me :-(
I'll add some more tests then, or even user stories.
I think you need some real-world use cases, with actual sample (pseudo-)code, to validate the design here. As things stand, it's both confusing and (I suspect) unusable in practice. Sorry, I know that sounds negative, but if this isn't to be a source of subtle bugs for years to come, it needs to be clarified now. PEP 302 is still hitting this type of issue - runpy and importlib have brought out errors and holes in the protocol quite recently - even though Just and I went to great lengths to try to tease out hidden assumptions up front.
Agreed, the zip case was added afterwards, but in practice the APIs are still dealing with files that are *filesystem files* located in a container (e.g. a directory or a zip file) located somewhere on the filesystem.
"local" in that case is a flag that means "translate a file path expressed in the local filesystem", which makes no sense anymore with zip files. But the goal, really, is to be able to point out that two distributions are using the very same file.
Right now PEP 376 and the prototype code handle these two real world use cases:
- browsing regular site-packages-like directories
- browsing site-packages-like directories that are zipped
- I have a "packages.zip" file in /var/, which is also in my sys.path. It contains a distribution "foo-1.0" that has the "roman.py" file in its root. So the RECORD file located in "foo-1.0.egg-info" has a line starting with "roman.py,..."
- Then if I install docutils 0.5 as a regular filesystem distribution, "roman.py" will be added in Python's site-packages, and docutils-0.5.egg-info/RECORD will contain "roman.py,..." with the same hash.
The local flag will return these paths:
- /var/packages.zip/roman.py <--- not a "real" path
- /usr/local/lib/python2.6/site-packages/roman.py
So removing the docutils distribution will be doable, because these paths are different.
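A rough sketch of that comparison, using the values from the example. The `installed` mapping and both helper functions are hypothetical and stand in for whatever the prototype actually does:

```python
import posixpath

# distribution name -> (container location, names listed in its RECORD).
# The values mirror the example above; the structure itself is made up.
installed = {
    "foo-1.0": ("/var/packages.zip", ["roman.py"]),
    "docutils-0.5": ("/usr/local/lib/python2.6/site-packages", ["roman.py"]),
}


def local_paths(name):
    """Return the 'local' paths of a distribution: location + RECORD name."""
    location, names = installed[name]
    return {posixpath.join(location, entry) for entry in names}


def shared_files(first, second):
    """Return the local paths that both distributions claim."""
    return local_paths(first) & local_paths(second)


print(sorted(local_paths("foo-1.0")))
print(sorted(local_paths("docutils-0.5")))
print(shared_files("foo-1.0", "docutils-0.5"))
```

Both RECORD files list the same relative name, yet the joined paths differ, so the intersection is empty and removing one distribution does not touch the other's file.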
get_metadata_files() - returns slash-separated names, relative to the egginfo dir
get_metadata_file(path) - path must be slash-separated, relative to the egginfo dir
get_installed_files - returns the contents of RECORD unaltered
uses(path) - checks if path is in RECORD
The latter 2 are not very useful in practice - you can't say anything about entries in different RECORD files, which is likely the real use case you want. Maybe RECORD could have an extra "Location" entry, which determines where it exists globally (this would be the directory to which the filenames were relative, in the case of filesystem-based distributions) and RECORD entries are comparable if the Location values in the 2 RECORD files match. That's a lot more complex - but depending on what use people expect to make of these 2 APIs, it may be justified.
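A minimal sketch of why these two calls say little across RECORD files, assuming RECORD is comma-separated with the path in the first column. Only the leading "roman.py" comes from the example in this thread; the other columns are placeholders, the second row is invented, and this version returns just the path column rather than the full rows.

```python
import csv
import io

# Sample RECORD contents. HASH and SIZE are placeholders, not real values,
# and "foo/__init__.py" is an invented entry.
FOO_RECORD = "roman.py,HASH,SIZE\nfoo/__init__.py,HASH,SIZE\n"
DOCUTILS_RECORD = "roman.py,HASH,SIZE\n"


def get_installed_files(record_text):
    """Return the path column of every RECORD row, unaltered."""
    rows = csv.reader(io.StringIO(record_text))
    return [row[0] for row in rows if row]


def uses(record_text, path):
    """Check whether path appears, exactly as written, in RECORD."""
    return path in get_installed_files(record_text)


# Both distributions "use" roman.py, but nothing here says whether
# they mean the same file on disk.
print(uses(FOO_RECORD, "roman.py"), uses(DOCUTILS_RECORD, "roman.py"))
```

Without some notion of where each RECORD lives, a match on the relative name alone cannot distinguish a shared file from two unrelated copies.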
Yes. In practice, if you look at my previous example, even if "/var/packages.zip/roman.py" isn't a real path, it's enough to compare RECORD entries globally.
The "Location" entry you are proposing in that case, would be "/var/packages.zip".
But do we really need to store it in the RECORD? Or can't we define an API that returns two elements:
- the path to the location (in the example: /var/packages.zip or /usr/local/lib/python2.6/site-packages)
- the path within the location itself (in the example: roman.py)
A concrete proposal would be to build on your proposal, but return tuples with the location as the first member, e.g. "(location, relative path[s])".
The code that is comparing paths to see if they are the same can join location + relative path[s], while a dedicated function could read the content of the file (that would be get_data, I guess, if I refer to PEP 302).
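One way the tuple idea could look, as a sketch only. The function names and signatures are hypothetical, and `read_file` merely stands in for the `get_data`-style reader mentioned above; it is not the PEP 302 API.

```python
import os
import posixpath
import zipfile


def get_installed_files(location, record_names):
    """Return (location, relative_path) tuples; relative paths use slashes."""
    return [(location, name) for name in record_names]


def full_name(entry):
    """Join location and relative path, for comparison purposes only."""
    location, relative = entry
    return posixpath.join(location, relative)


def read_file(entry):
    """Read the content whether the location is a zip file or a directory."""
    location, relative = entry
    if zipfile.is_zipfile(location):
        with zipfile.ZipFile(location) as archive:
            return archive.read(relative)
    # Directory case: convert the slash-separated name to a local path.
    local = os.path.join(location, *relative.split("/"))
    with open(local, "rb") as handle:
        return handle.read()
```

Comparison code only ever looks at `full_name(entry)`, which never needs to be a real filesystem path, while anything that wants the bytes goes through `read_file` and never has to know which kind of container it is dealing with.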