On 11/5/06, Andrew Dalke <dalke@dalkescientific.com> wrote:
I agree that supporting non-filesystem directories (zip files, CVS/Subversion sandboxes, URLs) would be nice, but we already have a big enough project without that. What constraints should a Path object keep in mind in order to be forward-compatible with this?
Is the answer therefore that URL and URI behaviour should not place constraints on a Path object because they are sufficiently dissimilar from file-system paths? Do these other non-FS hierarchical structures have similar differences causing a semantic mismatch?
This discussion has reinforced my belief that os.path.join's behavior is correct with non-initial absolute args:

    os.path.join('/usr/bin', '/usr/local/bin/python')

I've used that in applications and haven't found it a burden. Its behavior with '..' seems justifiable too, and Talin's trick of wrapping everything in os.path.normpath is a great one.

I do think join should take more care to avoid multiple slashes together in the middle of a path, although this is really the responsibility of the platform library, not a generic function/method. Join is true to its documentation of only adding separators and never deleting them, but that seems like a bit of sloppiness. On the other hand, the filesystems don't care; I don't think anybody has mentioned a case where it actually creates a path the filesystem can't handle.

urljoin clearly has a different job. When we talked about extending path to URLs, I was thinking more in terms of opening files, fetching resources, deleting, renaming, etc. rather than split-modify-rejoin. A hypothetical urlpath module would clearly have to follow the URL rules. I don't see a contradiction in supporting both URL joining rules and having a non-initial absolute argument, just to avoid cross-"platform" surprises. But urlpath would also need methods to parse the scheme and host on demand, query strings, #fragments, a class method for building a URL from the smallest parts, etc.

As for supporting path fragments and '..' in join arguments (for filesystem paths), it's clearly too widely used to eliminate. Users can voluntarily refrain from passing arguments containing separators. For cases involving a user-supplied -- possibly hostile -- path, either a separate method (safe_join, child) could achieve this, or a subclass implementation that allows only safe arguments.

Regarding pathname-manipulation methods and filesystem-access methods, I'm not sure how workable it is to have separate objects for them.
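As an aside, to make the join behavior above concrete, here is a minimal sketch. It assumes a modern Python 3 (where urljoin lives in urllib.parse; in 2006 it was in the urlparse module) and uses posixpath rather than os.path so the results are the same on any platform. The safe_join function is only a hypothetical illustration of the "separate method" idea; its name comes from this thread, but its rules are my assumption, not an existing API.

```python
import posixpath                      # POSIX rules regardless of host platform
from urllib.parse import urljoin      # was urlparse.urljoin in Python 2

# A non-initial absolute argument discards everything before it.
absolute = posixpath.join('/usr/bin', '/usr/local/bin/python')
# -> '/usr/local/bin/python'

# join only adds separators; it never removes doubled ones.
doubled = posixpath.join('/usr/lib//', 'python')
# -> '/usr/lib//python'

# Wrapping the result in normpath collapses '//' and resolves '..'.
cleaned = posixpath.normpath(posixpath.join('/usr/lib//', '../bin'))
# -> '/usr/bin'

# urljoin follows URL rules: a relative reference replaces the last segment,
# and a rooted reference replaces the whole path but keeps scheme and host.
relative = urljoin('http://example.com/a/b', 'c')
# -> 'http://example.com/a/c'
rooted = urljoin('http://example.com/a/b', '/c')
# -> 'http://example.com/c'


def safe_join(base, *parts):
    """Hypothetical helper: accept single path components only.

    Assumed policy (not from any library): reject empty components,
    '.', '..', and anything containing a separator or a NUL byte.
    """
    for part in parts:
        if part in ('', '.', '..') or '/' in part or '\0' in part:
            raise ValueError('unsafe path component: %r' % (part,))
    return posixpath.join(base, *parts)


data_file = safe_join('/srv/data', 'report.txt')
# -> '/srv/data/report.txt'
```

With this policy, safe_join('/srv/data', '../etc/passwd') and safe_join('/srv/data', '/etc/passwd') both raise ValueError instead of escaping the base directory. A real implementation would also have to decide what to do about symlinks and about platforms with other separators.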
Compare three ways of creating a directory:

    os.mkdir( Path("/usr/local/lib/python/Cheetah/Template.py").parent )
    Path("/usr/local/lib/python/Cheetah/Template.py").parent.mkdir()
    FileAccess( Path("/usr/local/lib/python/Cheetah/Template.py").parent ).mkdir()

The first two are reasonable. The third... who would want to do this for every path? How often would you reuse the FileAccess object? I typically create Path objects from configuration values and keep them around for the entire application; e.g., data_dir. Then I create derived paths as necessary. I suppose if the FileAccess object has a .path attribute, it could do double duty so you wouldn't have to store the path separately. Is this what the advocates of two classes have in mind? With usage like this?

    my_file = FileAccess( file_access_obj.path.joinpath("my_file") )
    my_file = FileAccess( Path(file_access_obj.path, "my_file") )

I'm working on my Path implementation. (Yes it's necessary, Glyph, at least to me.) It's going slowly because I just got a Macintosh laptop and am still rounding up packages to install.

--
Mike Orr <sluggoster@gmail.com>