Re: [Distutils] Design rationale for the egg format ?
At 08:29 AM 6/15/2010 +0900, David Cournapeau wrote:
On Mon, Jun 14, 2010 at 11:28 PM, P.J. Eby wrote:
At 03:59 PM 6/14/2010 +0900, David Cournapeau wrote:
 - why are the metadata split into files instead of one single metadata file ?
Because that's simpler than trying to define a single universal file format that's forward and backward-compatible with every possible feature and use case. Each use case can have an optimized file format.
It also scales better for performance when you have multiple things you might (or might not) be reading. For example, since entry points are separate from dependencies, you don't need to read the dependencies from an egg that doesn't have an entry point you're scanning for.
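To make that concrete, here is a rough sketch of such a scan. It assumes unpacked eggs laid out as directories with an EGG-INFO subdirectory holding entry_points.txt and requires.txt; the helper names (find_entry_point_group, make_egg), the sample eggs and the "myapp.plugins" group are made up for illustration, and this is not how pkg_resources itself does it.

```python
import configparser
import tempfile
from pathlib import Path


def find_entry_point_group(egg_dirs, group):
    """Return {egg name: {entry point name: target}} for eggs declaring `group`.

    Only EGG-INFO/entry_points.txt is opened; requires.txt is never read.
    """
    found = {}
    for egg in egg_dirs:
        ep_file = Path(egg) / "EGG-INFO" / "entry_points.txt"
        if not ep_file.is_file():
            continue  # no entry points: skip without touching other metadata
        parser = configparser.ConfigParser(delimiters=("=",), interpolation=None)
        parser.optionxform = str  # keep entry point names case-sensitive
        parser.read(ep_file, encoding="utf-8")
        if parser.has_section(group):
            found[Path(egg).name] = dict(parser.items(group))
    return found


def make_egg(root, name, files):
    """Create a fake unpacked egg (hypothetical test fixture) under `root`."""
    info = Path(root) / name / "EGG-INFO"
    info.mkdir(parents=True)
    for filename, text in files.items():
        (info / filename).write_text(text, encoding="utf-8")
    return info.parent


with tempfile.TemporaryDirectory() as root:
    eggs = [
        make_egg(root, "plugin-1.0.egg", {
            "entry_points.txt": "[myapp.plugins]\nhello = plugin.mod:hello\n",
            "requires.txt": "somelib>=1.0\n",
        }),
        # This egg has dependencies but no entry points, so the scan
        # never opens any of its metadata files.
        make_egg(root, "library-2.0.egg", {"requires.txt": "otherlib\n"}),
    ]
    print(find_entry_point_group(eggs, "myapp.plugins"))
```

The cost of the scan depends only on the entry point files; how many dependencies each egg declares makes no difference.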
What I am interested in is the exact situations where this happens (there is the case where eggs are used as plugins, the case where eggs are namespace packages, etc...). For example, I don't quite understand why reading dependencies needs to be fast (it does not matter at install time, so I guess I am missing some use cases) ?
As I said above, "it *also* scales better for performance" -- i.e., it's a secondary concern. The #1 reason for separating metadata files is that it makes the addition of new metadata much easier than maintaining a single monolithic format. That is, programs that don't understand new metadata don't have to read it. Plugins that write metadata don't need to co-operate with others - they can just write their own files. And so on. That is the original reason for making separate metadata files: i.e. simplicity. It just turned out to also provide a performance benefit in the case of cross-egg scanning for distinct types of metadata.
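A small sketch of the simplicity argument, under the assumption of an EGG-INFO directory containing requires.txt: one tool adds its own metadata file, and an older reader that only knows requires.txt keeps working unchanged. The function names and the file name myplugin_settings.txt are hypothetical, purely for illustration.

```python
import tempfile
from pathlib import Path


def write_metadata(egg_info, filename, text):
    """A tool adds metadata by creating its own file; nothing else is rewritten."""
    path = Path(egg_info) / filename
    path.write_text(text, encoding="utf-8")
    return path


def read_requirements(egg_info):
    """An older reader that only knows requires.txt; other files are never opened."""
    path = Path(egg_info) / "requires.txt"
    if not path.is_file():
        return []
    requirements = []
    for line in path.read_text(encoding="utf-8").splitlines():
        line = line.strip()
        if line.startswith("["):
            break  # extras sections follow; this sketch reads only the unconditional part
        if line:
            requirements.append(line)
    return requirements


with tempfile.TemporaryDirectory() as egg_info:
    write_metadata(egg_info, "requires.txt", "somelib>=1.0\notherlib\n")
    before = read_requirements(egg_info)

    # A plugin (hypothetical) drops in a file the old reader has never heard of.
    write_metadata(egg_info, "myplugin_settings.txt", "color = blue\n")
    after = read_requirements(egg_info)

    print(before == after)
```

With a single monolithic file, the plugin would instead have to parse and rewrite a format shared with every other tool, and the old reader would have to at least skip over the new section.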