[Distutils] Design rationale for the egg format ?

Tue Jun 15 03:36:49 CEST 2010

At 08:29 AM 6/15/2010 +0900, David Cournapeau wrote:
>On Mon, Jun 14, 2010 at 11:28 PM, P.J. Eby <pje at telecommunity.com> wrote:
> > At 03:59 PM 6/14/2010 +0900, David Cournapeau wrote:
> >> - why are the metadata split into files instead of one single metadata
> >> file ?
> >
> > Because that's simpler than trying to define a single universal file format
> > that's forward and backward-compatible with every possible feature and use
> > case. Each use case can have an optimized file format.
> >
> > It also scales better for performance when you have multiple things you
> > might (or might not) be reading. For example, since entry points are
> > separate from dependencies, you don't need to read the dependencies from
> > an egg that doesn't have an entry point you're scanning for.
>
>What I am interested in is the exact situations where this happens
>(there is the case where eggs are used as plugins, the case where eggs
>are namespace packages, etc...). For example, I don't quite understand
>why reading dependencies needs to be fast (it does not matter at
>install time, so I guess I am missing some use cases)?

As I said above, "it *also* scales better for performance" -- i.e., 
it's a secondary concern.  The #1 reason for separating metadata 
files is that it makes the addition of new metadata much easier than 
maintaining a single monolithic format.

That is, programs that don't understand new metadata don't have to 
read it.  Plugins that write metadata don't need to co-operate with 
others - they can just write their own files.  And so on.
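
To illustrate the point about independent writers and readers, here is a minimal sketch, not the setuptools implementation. It assumes a metadata directory laid out like an egg's EGG-INFO; the helper names (write_metadata, read_metadata) and the file "myplugin.txt" are hypothetical and chosen only for the example.

```python
import tempfile
from pathlib import Path


def write_metadata(egg_info: Path, name: str, text: str) -> None:
    """Write one metadata file; no other file in the directory is touched."""
    egg_info.mkdir(parents=True, exist_ok=True)
    (egg_info / name).write_text(text, encoding="utf-8")


def read_metadata(egg_info: Path, name: str):
    """Return the contents of one metadata file, or None if it is absent."""
    path = egg_info / name
    return path.read_text(encoding="utf-8") if path.is_file() else None


def demo() -> dict:
    with tempfile.TemporaryDirectory() as tmp:
        egg_info = Path(tmp) / "EGG-INFO"
        write_metadata(egg_info, "requires.txt", "docutils>=0.3\n")
        # A hypothetical plugin adds its own file without coordinating
        # with whoever wrote requires.txt.
        write_metadata(egg_info, "myplugin.txt", "colour = blue\n")
        return {
            # A tool that only knows about requires.txt reads just that file.
            "requires": read_metadata(egg_info, "requires.txt"),
            # Asking for a file nobody wrote is not an error.
            "unknown": read_metadata(egg_info, "not_written.txt"),
            "files": sorted(p.name for p in egg_info.iterdir()),
        }
```

A tool that predates "myplugin.txt" never opens it, so adding the file cannot break that tool, and no shared schema has to be revised.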

That is the original reason for making separate metadata files: i.e. 
simplicity.  It just turned out to also provide a performance benefit 
in the case of cross-egg scanning for distinct types of metadata.
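
The cross-egg scanning case can be sketched roughly as follows. This is an illustration under stated assumptions, not how pkg_resources does it: the eggs are assumed to be unpacked directories that each contain an EGG-INFO subdirectory, entry_points.txt is parsed with the standard configparser module, and the function name scan_entry_points and the group name "myapp.plugins" are hypothetical.

```python
import configparser
import tempfile
from pathlib import Path


def scan_entry_points(eggs_dir: Path, group: str) -> dict:
    """Collect {name: target} for one entry point group across all eggs."""
    found = {}
    for egg in sorted(eggs_dir.iterdir()):
        ep_file = egg / "EGG-INFO" / "entry_points.txt"
        if not ep_file.is_file():
            continue  # egg skipped; its requires.txt is never opened
        parser = configparser.ConfigParser(delimiters=("=",), interpolation=None)
        parser.optionxform = str  # keep entry point names case-sensitive
        parser.read_string(ep_file.read_text(encoding="utf-8"))
        if parser.has_section(group):
            found.update(parser.items(group))
    return found


def demo() -> dict:
    with tempfile.TemporaryDirectory() as tmp:
        eggs = Path(tmp)
        # One egg declares a plugin entry point and has dependencies.
        a = eggs / "a.egg" / "EGG-INFO"
        a.mkdir(parents=True)
        (a / "entry_points.txt").write_text(
            "[myapp.plugins]\nfoo = a.mod:Foo\n", encoding="utf-8"
        )
        (a / "requires.txt").write_text("docutils\n", encoding="utf-8")
        # The other egg has dependencies only, so the scan skips it.
        b = eggs / "b.egg" / "EGG-INFO"
        b.mkdir(parents=True)
        (b / "requires.txt").write_text("setuptools\n", encoding="utf-8")
        return scan_entry_points(eggs, "myapp.plugins")
```

The scan costs one file-existence check per egg plus one small parse for each egg that declares entry points. Dependency data is never read, however many eggs are on the path.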