[Python-Dev] PEP 376 : Changing the .egg-info structure
P.J. Eby
pje at telecommunity.com
Tue May 19 22:36:40 CEST 2009
At 04:04 PM 5/19/2009 +0200, Tarek Ziadé wrote:
>On Sat, May 16, 2009 at 6:55 PM, P.J. Eby <pje at telecommunity.com> wrote:
> >
> > 1. Why ';' separation, instead of tabs as in PEP 262? Aren't semicolons a
> > valid character in filenames?
>
>I am changing this into a <tab> for now.
>
>What about Antoine's idea about doing a quote() on the names ?
I like the CSV idea better, since the csv module is available in 2.3
and up. We should just pick a dialect with unambiguous quoting rules.
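
A minimal sketch of the CSV option, written for a current Python 3 rather than 2.3. The three-column row layout (path, hash, size) and the helper names `dump_records` and `load_records` are assumptions made up for illustration; nothing in this thread fixes them. The point is only that a fully specified dialect round-trips awkward filenames:

```python
import csv
import io

# Hypothetical rows: (path, md5, size). The column layout is an assumption.
rows = [
    ("pkg/__init__.py", "d41d8cd98f00b204e9800998ecf8427e", "0"),
    ("pkg/odd;name,with\ttab.py", "", "12"),
]


def dump_records(rows):
    """Serialize rows with an explicitly spelled-out dialect."""
    buf = io.StringIO(newline="")
    writer = csv.writer(
        buf,
        delimiter=",",
        quotechar='"',
        doublequote=True,            # an embedded quote is written as ""
        quoting=csv.QUOTE_MINIMAL,   # quote only the fields that need it
        lineterminator="\n",
    )
    writer.writerows(rows)
    return buf.getvalue()


def load_records(text):
    """Parse text produced by dump_records back into tuples."""
    reader = csv.reader(
        io.StringIO(text, newline=""),
        delimiter=",",
        quotechar='"',
        doublequote=True,
    )
    return [tuple(row) for row in reader]


text = dump_records(rows)
# Semicolons, commas and tabs in the filename all survive the round trip.
assert load_records(text) == rows
```

Spelling out every dialect parameter, rather than relying on defaults, is what makes the quoting rules unambiguous for a third-party reader.
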
>From my point of view, <tabs> seem simpler to deal with if 3rd-party
>tools want to work on these files without using pkgutil or Python.
True, but then CSV files are still pretty common.
One other possibility that might work is using a vertical bar as a separator.
My preference ranking at the moment is probably tabs, then CSV, then
vertical bar. But I don't really care all that much, so let the people
who care decide.
Personally, though, I don't see much point to cross-language
manipulation of the file. System packaging tools have their own way
of keeping track of this stuff. So unless somebody's using it to
*build* system packages (e.g. making an RPM builder), they don't need this.
Now, about the APIs...
> > 4. There should probably be a way to iterate over the projects in a
> > directory, since it's otherwise impossible for an installation tool to find
> > out what project(s) "own" a file that conflicts with something being
> > installed. Alternatively, reshaping the file API to allow querying by path
> > as well as by project might work.
>
>I am adding a "get_projects" api:
>
> get_projects() -> iterator
>
> Provides an iterator that will return (name, path) tuples, where `name`
> is the name of a registered project and `path` the path to its `egg-info`
> directory.
>
>But for the use case you are mentioning, what about an explicit API:
>
> get_owners(paths) -> sequence of project names
>
> Returns a sequence of tuples. For each path in the "paths" list, a
> tuple of project names is returned.
>
> >
> > 5. If any cache mechanisms are to be used by the API, the API *must* make
> > it possible to bypass or explicitly manage that cache, as otherwise
> > installation tools and tools that manipulate sys.path at runtime may end up
> > using incorrect data.
>
>work in progress - (I am afraid I have to write an advanced prototype
>to be able to know exactly how the cache might work, and so, what API
>we should have)
I think it would be simpler to have explicit object types
representing things like a directory, a collection of directories,
and individual projects, and these object types should be part of the API.
Any function-oriented API should just be exposed as the methods of a
default singleton. Other Python modules follow this pattern -- and
it's what I copied for the pkg_resources design. It gives a nice
tradeoff between keeping the simple things simple, and complex things
possible, as well as keeping mechanism and policy separate.
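
A minimal sketch of that pattern, assuming a hypothetical `ProjectIndex` class; the class name, `register`, and the in-memory storage are invented for illustration and are not part of the PEP or of pkg_resources. The standard library's `random` module is one example of the same arrangement: its module-level functions are bound methods of a hidden `Random` instance.

```python
class ProjectIndex:
    """Hypothetical registry of installed projects."""

    def __init__(self):
        # project name -> path of its egg-info directory
        self._projects = {}

    def register(self, name, egg_info_path):
        self._projects[name] = egg_info_path

    def get_projects(self):
        """Iterate over (name, path) tuples, sorted by name."""
        return iter(sorted(self._projects.items()))


# Function-oriented API: just the bound methods of one default instance.
_default_index = ProjectIndex()
register = _default_index.register
get_projects = _default_index.get_projects

# Simple callers use the module-level functions...
register("foo", "/site-packages/foo.egg-info")

# ...while tools that need isolation create their own instance.
private_index = ProjectIndex()
```

The simple case stays a plain function call, and anything with special needs (an installer, a test suite) gets an independent object with its own state.
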
Right now, the API design you're trying to do is being burdened by
using strings and tuples to represent things that could just as
easily be objects with their own methods, instead of things you have
to pass back into other APIs. This also makes caching more complex,
because you can't just have one main object with stuff hanging off;
you've got to have a bunch of dictionaries, tuples, lists, sets, etc.
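
One way this could look, as a rough sketch under stated assumptions: `Project`, `Directory`, `DirectorySet`, the `scan` callable, and `refresh()` are all hypothetical names, and the scanner here is a fake that returns canned data instead of reading the filesystem. It shows the cache hanging off the directory object, with an explicit way to invalidate it, and an ownership query by path.

```python
class Project:
    """One installed project and the files it recorded."""

    def __init__(self, name, egg_info, files):
        self.name = name
        self.egg_info = egg_info
        self.files = frozenset(files)

    def owns(self, path):
        return path in self.files


class Directory:
    """One sys.path entry; `scan` is a hypothetical callable yielding Projects."""

    def __init__(self, path, scan):
        self.path = path
        self._scan = scan
        self._cache = None  # the cache lives on the object it describes

    def projects(self):
        if self._cache is None:
            self._cache = list(self._scan(self.path))
        return self._cache

    def refresh(self):
        """Explicitly drop cached data, e.g. after an install or uninstall."""
        self._cache = None


class DirectorySet:
    """A collection of directories, queried as a whole."""

    def __init__(self, directories):
        self.directories = list(directories)

    def projects(self):
        for directory in self.directories:
            yield from directory.projects()

    def owners(self, path):
        """Names of all projects that record `path` as one of their files."""
        return [p.name for p in self.projects() if p.owns(path)]


scans = []  # records each scan so the caching is observable


def fake_scan(path):
    scans.append(path)
    return [
        Project("foo", path + "/foo.egg-info", ["foo/__init__.py", "shared.py"]),
        Project("bar", path + "/bar.egg-info", ["bar.py", "shared.py"]),
    ]


site = Directory("/site-packages", fake_scan)
env = DirectorySet([site])
```

With this shape, `env.owners("shared.py")` answers the conflict question directly, and an installer calls `site.refresh()` when it knows the directory has changed.
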