[Distutils] PEP 376 comments

Tarek Ziadé ziade.tarek at gmail.com
Mon Jun 8 11:58:18 CEST 2009

2009/6/7 Ronald Oussoren <ronaldoussoren at mac.com>:
> I have some feedback on PEP376, both the pep itself and the pkgutil code. I'll start with comments on the PEP.

Thanks for this detailed feedback.

> * I don't like the definition of a project: this seems to define what distutils calls a distribution, and that is not necessarily and application.


> * The description of the status quo is not entirely acurate w.r.t. setuptools, you describe only one of the possible ways setuptools can install a project (pip seems to use another one of setuptools' ways of installing a project, and setuptools can also install projects inside of zipfiles in multiple ways).

The goal is to keep only one way to install projects. I've added more
details in the PEP

> * Regarding the RECORD file: it is not a "CSV-like" file, it is a real CSV file. I'd specify the exact options for the 'csv' module that will be used, rather than writing that the default options are used and then explaining what those are.


> * Should the PEP specify the encoding of text-files? PEP314 doesn't seem to specify the encoding of PKG-INFO files, which can cause problems when a field contains data that isn't ASCII.

The encoding used is utf-8 since 2.6.  I think we should rather update
PEP 314, and mention it in the upcoming PEP 345 as well,

> All of these are minor issues. The following are imho more serious.
> * The PEP doesn't describe how this PEP interacts with PEP302. That is, how should the "egg-info" machinery work when a project is not installed on the filesystem but in a zipfile. I'm not primarily interested in the ".egg" zipfiles that setuptools uses, but I am worried about how this will affect tools like py2exe and py2app that bundle the python code used by an application into a zipfile.

I need to work on this.

> * Why is there a "paths" argument for the global functions (such as "get_file_users")? The description claims the functions will use sys.path and hence it is not necessary to have an argument to specify the path.

That's because you can use these apis to work on an arbitrary list of
paths in a packaging tool.  I have changed the
description accordingly.

> * There is no API to list the files in the egg-info directory, you could build one yourself on top of the get_installed_files method but should IMO be part of the public API.


> And now on to the implementation:
> * EggInfo.get_file: the implementation seems to want to load a file from the .egg-info directory, the PEP itself is unclear about what this method is intended to do.

Added more details. I've also renamed some APIs to make things clearer
(get_file -> get_egg_info_file)
(I am also thinking about renaming EggInfo to a better name)

> If get_file should open a file in the egg-info directory I'd raise an exception if the path argument specifies an absolute path.


> I wonder if it wouldn't be better to have a function that returns the contents of the file instead of one that returns a file-like object. Especially when thinking of PEP302 integration.

Why ?

> Wouldn't it be better to have two separate methods instead of "binary" argument.  In 3.x file-like objects behave slightly different w.r.t binary and text stream ("bytes" vs. "str" as the result of read).

Will work on that.

> * EggInfo.get_installed_files: if 'local' is True the yielded paths are made absolute w.r.t. the egg-info directory rather than the directory containing the egg-info directory.

Yes that was a bug. Fixed in the prototype.

> * The global functions seem to maintain and modify global state, wouldn't this cause problems if I specify different values of the path arguments in different pieces of code?

The cache just prevents re-reading a directory content.  Do you have a
scenario in mind of a possible problem ?
The only problem I can see is when a project is uninstalled by a
program while these APIs are used by another program.

By the way, I'm thinking about adding something else for that part: A
singleton-like class for EggInfoDirectory. That
would make sure there's only one instance for each path.


More information about the Distutils-SIG mailing list