PEP 376 - get_egginfo_files
The PEP says:

"""
get_egginfo_files(local=False) -> iterator of paths

Iterates over the RECORD entries and return paths for each line if the path is pointing a file located in the .egg-info directory or one of its subdirectory.
"""

Should this method really only return filenames noted in the RECORD file? Would it not be better for it to iterate over *all* files in the .egg-info directory? I understand that there shouldn't, in practice, be any files in that directory *not* mentioned in the RECORD file, but given that we already have get_installed_files to read the RECORD file, I would imagine it's better for this method to do something more than filter the return values from get_installed_files.

Actually, on that note, consider the pkgutil functions:

    def get_distribution(name):
        for d in get_distributions():
            if d.name == name:
                return d
        return None

    def get_file_users(path):
        for d in get_distributions():
            if d.uses(path):
                yield d

These don't actually add much to the API. While I can see the advantage of having them as convenience methods, it might be worth pointing out in the PEP that that's all they are.

Similarly, how valuable is Distribution.name, given that it's the same as Distribution.metadata.name? I'm probably just going to make it a property -

    @property
    def name(self):
        return self.metadata.name

but that's actually slower than just using self.metadata.name directly, so it's a bit of an attractive nuisance, and I'd prefer it if it wasn't present. (For the PEP 302 stuff, I'm making metadata a cached property, so name *has* to be a property to ensure that the metadata cache is managed properly...)

Paul.
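The two behaviours compared in the first question can be sketched roughly as follows. This is only an illustration, not the PEP's reference code: the function names `egginfo_files_from_record` and `egginfo_files_from_disk` are hypothetical, and the sketch assumes RECORD is a CSV file whose first column is a "/"-separated path relative to the directory that contains the .egg-info directory.

```python
import csv
import os


def egginfo_files_from_record(egginfo_dir):
    """Yield only the RECORD entries located under the .egg-info directory.

    Assumption: the first CSV column is a '/'-separated path relative to
    the parent of egginfo_dir.
    """
    egginfo_dir = os.path.normpath(egginfo_dir)
    prefix = os.path.basename(egginfo_dir) + "/"
    with open(os.path.join(egginfo_dir, "RECORD"), newline="") as f:
        for row in csv.reader(f):
            if row and row[0].startswith(prefix):
                yield row[0]


def egginfo_files_from_disk(egginfo_dir):
    """Yield every file actually present under the .egg-info directory,
    whether or not RECORD mentions it."""
    egginfo_dir = os.path.normpath(egginfo_dir)
    base = os.path.dirname(egginfo_dir)
    for root, _dirs, files in os.walk(egginfo_dir):
        for name in files:
            full = os.path.join(root, name)
            yield os.path.relpath(full, base).replace(os.sep, "/")
```

The two only differ when the directory holds a file that RECORD does not list, which is exactly the case the question is about.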
2009/7/5 Paul Moore <p.f.moore@gmail.com>:
> The PEP says:
> """ get_egginfo_files(local=False) -> iterator of paths
> Iterates over the RECORD entries and return paths for each line if the path is pointing a file located in the .egg-info directory or one of its subdirectory. """
> Should this method really only return filenames noted in the RECORD file? Would it not be better for it to iterate over *all* files in the .egg-info directory? I understand that there shouldn't, in practice, be any files in that directory *not* mentioned in the RECORD file, but given that we already have get_installed_files to read the RECORD file, I would imagine it's better for this method to do something more than filter the return values from get_installed_files.
I don't see a use case for getting more out of get_egginfo_files. I still find it useful for iterating over metadata files.
Maybe we could remove it and add a filter option to get_installed_files. A callable that gets each visited file and returns True or False to filter them out:
    get_installed_files(path, filter=callable)
And then provide an "egginfo_files" callable to get what we have with get_egginfo_files:
    get_installed_files(path, filter=egginfo_files)
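A minimal sketch of what such a filter option might look like. Several things here are assumptions made for illustration only: the function takes already-parsed RECORD rows instead of a path so that the example is self-contained, a True result is taken to mean "keep this file" (as with the builtin filter()), and the body of `egginfo_files` is a guess at what the predicate would check.

```python
def get_installed_files(record_rows, filter=None):
    """Yield the path of each RECORD-like row accepted by `filter`.

    Hypothetical signature: `record_rows` is an iterable of sequences whose
    first item is a '/'-separated path. The parameter name `filter` follows
    the proposal and shadows the builtin inside this function only.
    """
    for row in record_rows:
        path = row[0]
        if filter is None or filter(path):
            yield path


def egginfo_files(path):
    """Hypothetical predicate: True if the file sits inside a directory
    whose name ends in '.egg-info'."""
    directories = path.split("/")[:-1]
    return any(part.endswith(".egg-info") for part in directories)


rows = [
    ("pkg/__init__.py", "", ""),
    ("pkg-1.0.egg-info/PKG-INFO", "", ""),
    ("pkg-1.0.egg-info/RECORD", "", ""),
]
all_files = list(get_installed_files(rows))
metadata_files = list(get_installed_files(rows, filter=egginfo_files))
```

With this shape, get_egginfo_files would reduce to a one-line call, which is the trade-off under discussion: one generic entry point versus two explicit functions.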
> Actually, on that note, consider the pkgutil functions:
>
>     def get_distribution(name):
>         for d in get_distributions():
>             if d.name == name:
>                 return d
>         return None
>
>     def get_file_users(path):
>         for d in get_distributions():
>             if d.uses(path):
>                 yield d
>
> These don't actually add much to the API. While I can see the advantage of having them as convenience methods, it might be worth pointing out in the PEP that that's all they are.
Sure,
> Similarly, how valuable is Distribution.name, given that it's the same as Distribution.metadata.name? I'm probably just going to make it a property -
It's just for convenience, since this metadata field is also the identifier of the distribution.
>     @property
>     def name(self):
>         return self.metadata.name
I don't think this adds any value, since self.metadata is a read-only instance that gets loaded once when the Distribution object is created.
2009/7/5 Tarek Ziadé <ziade.tarek@gmail.com>:
> 2009/7/5 Paul Moore <p.f.moore@gmail.com>:
>> The PEP says:
>> """ get_egginfo_files(local=False) -> iterator of paths
>> Iterates over the RECORD entries and return paths for each line if the path is pointing a file located in the .egg-info directory or one of its subdirectory. """
>> Should this method really only return filenames noted in the RECORD file? Would it not be better for it to iterate over *all* files in the .egg-info directory? I understand that there shouldn't, in practice, be any files in that directory *not* mentioned in the RECORD file, but given that we already have get_installed_files to read the RECORD file, I would imagine it's better for this method to do something more than filter the return values from get_installed_files.
> I don't see a use case for getting more out of get_egginfo_files. I still find it useful for iterating over metadata files.
> Maybe we could remove it and add a filter option to get_installed_files. A callable that gets each visited file and returns True or False to filter them out:
>     get_installed_files(path, filter=callable)
> And then provide an "egginfo_files" callable to get what we have with get_egginfo_files:
>     get_installed_files(path, filter=egginfo_files)
-1. Unnecessary generalisation. Let's stick with the 2 functions as documented. [...]
>> Similarly, how valuable is Distribution.name, given that it's the same as Distribution.metadata.name? I'm probably just going to make it a property -
> It's just for convenience, since this metadata field is also the identifier of the distribution.
>>     @property
>>     def name(self):
>>         return self.metadata.name
> I don't think this adds any value, since self.metadata is a read-only instance that gets loaded once when the Distribution object is created.
... not any more :-) Your zipfile handling was horribly broken on Windows, thanks to the usual slash/backslash confusion. The sanest way to fix it seemed to me to be to load the metadata lazily, rather than in __init__ (as otherwise, the zipfile and filesystem implementations end up not being able to share any code). Once that's done, the name attribute has to *also* handle lazy-loading of the metadata, and the above property is the easiest way to do this.

Actually, my implementation is looking less and less like yours, and ultimately any implementation questions are irrelevant until you see my code and spot all the errors :-) I'm trying to get it into a postable state as fast as I can. (At last count, I've replaced about 140 lines of code with 70, and it now includes PEP 302 support, and all the (non-internal) tests still pass. So it's looking OK...)

Paul.
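For readers following the lazy-loading argument, here is a rough sketch of the pattern being described. It is not the code from either implementation: the `Metadata` stand-in and the `loader` callable are hypothetical, and the cache is hand-rolled rather than taken from any library.

```python
class Metadata:
    """Hypothetical stand-in for the parsed metadata of a distribution."""

    def __init__(self, name):
        self.name = name


class Distribution:
    def __init__(self, loader):
        # `loader` is a hypothetical zero-argument callable that reads and
        # parses the metadata (from the filesystem, a zipfile, ...).
        self._loader = loader
        self._metadata = None

    @property
    def metadata(self):
        # Load on first access only, then reuse the cached object.
        if self._metadata is None:
            self._metadata = self._loader()
        return self._metadata

    @property
    def name(self):
        # Goes through the lazy property, so asking for the name is what
        # triggers loading if nothing has been loaded yet.
        return self.metadata.name
```

Because __init__ no longer touches the metadata, a plain `self.name = self.metadata.name` assignment there would defeat the laziness, which is why name ends up as a property in this arrangement.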
participants (2)
- Paul Moore
- Tarek Ziadé