
Hello, Tarek Ziadé <ziade.tarek <at> gmail.com> writes:
so I am mailing it here again, for a new round of feedback, if needed.
- the pep : http://svn.python.org/projects/peps/trunk/pep-0376.txt
Some comments: - the **MD5** hash of the file, encoded in hex. Notice that `pyc` and `pyo` generated files will not have a hash. Why the exception for pyc and pyo files? - `zlib` and `zlib-2.5.2.egg-info` are located in `site-packages` so the file paths are relative to it. Is it a general rule? That is, the paths in RECORD are always relative to its grandparent directory? The RECORD format ----------------- The `RECORD` file is a CSV file, composed of records, one line per installed file. The ``csv`` module is used to read the file, with the `excel` dialect, which uses these options to read the file: - field delimiter : `,` - quoting char : `"`. - line terminator : `\r\n` Wouldn't it be better to use the native line terminator on the current platform? (someone might want to edit or at least view the file) What is the character encoding for non-ASCII filenames? UTF-8? Are the RECORD file's contents somehow part of the DistributionMetadata? - ``DistributionDirectories``: manages ``EggInfoDirectory`` instances. What is an EggInfoDirectory ? A plural class name looks strange (I think it's the first time I see one in the CPython codebase). How about another name? (DistributionPool, DistributionMap, WorkingSet etc.). - ``get_egginfo_file(path, binary=False)`` -> file object Returns a file located under the `.egg-info` directory. Returns a ``file`` instance for the file pointed by ``path``. Is it always a file instance or just a file-like object? (zipped distributions come to mind). Is it opened read-only? - ``owner(path)`` -> ``Distribution`` instance or None If ``path`` is used by only one ``Distribution`` instance, returns it. Otherwise returns None. This is a bit confusing. If it returns None, it doesn't distinguish between the case where several Distributions refer to the path, and the case where no distributions refer to it, does it? Is there any reason to have this method while file_users(path) already exists? A new class called ``DistributionDirectories`` is created. It's a collection of ``DistributionDirectory`` and ``ZippedDistributionDirectory`` instances. The constructor takes one optional argument ``use_cache`` set to ``True`` by default. You forgot to describe the constructor's signature and what it does exactly. ``EggInfoDirectories`` also provides the following methods besides the ones from ``dict``:: What is EggInfoDirectories? - ``append(path)`` Creates an ``DistributionDirectory`` (or ``ZippedDistributionDirectory``) instance for ``path`` and adds it in the mapping. - ``load(paths)`` Creates and adds ``DistributionDirectory`` (or ``ZippedDistributionDirectory``) instances corresponding to ``paths``. Why are these methods named completely differently although they do almost the same thing? Besides, append() makes it look like ordering matters. Does it? (for a naming suggestion, e.g. load(path) and load_all(paths). Or, even simpler, load(*paths) or load(paths)) - ``get_file_users(path)`` -> Iterator of ``Distribution`` (or ``ZippedDistribution``) instances. This method is named file_users in another class. Perhaps the naming should be consistent? All these functions use the same global instance of ``DistributionDirectories`` to use the cache. Is the global instance available to users? >>> for path, hash, size in dist.get_installed_files():: ... print '%s %s %d %s' % (path, hash, size) There's one too many "%s" here. Thanks for your work! Antoine.