[Python-Dev] Path object design

glyph at divmod.com
Wed Nov 1 23:29:03 CET 2006


On 08:14 pm, sluggoster at gmail.com wrote:
>Argh, it's difficult to respond to one topic that's now spiraling into
>two conversations on two lists.

>glyph at divmod.com wrote:

>(...) people have had to spend five years putting hard-to-read
>os.path functions in the code, or reinventing the wheel with their own
>libraries that they're not sure they can trust.  I started to use
>path.py last year when it looked like it was emerging as the basis of
>a new standard, but yanked it out again when it was clear the API
>would be different by the time it's accepted.  I've gone back to
>os.path for now until something stable emerges but I really wish I
>didn't have to.

You *don't* have to.  This is a weird attitude I've encountered over and over again in the Python community, although sometimes it masquerades as resistance to Twisted or Zope or whatever.  It's OK to use libraries.  It's OK even to use libraries that Guido doesn't like!  I'm pretty sure the first person to tell you that would be Guido himself.  (Well, second, since I just told you.)  If you like path.py and it solves your problems, use path.py.  You don't have to cram it into the standard library to do that.  It won't be any harder to migrate from an old path object to a new path object than from os.path to a new path object, and in fact it would likely be considerably easier.

>> *It is already used in a large body of real, working code, and
>> therefore its limitations are known.*
>
>This is an important consideration.  However, to me a clean API is more
>important.

It's not that I don't think a "clean" API is important.  It's that I think that "clean" is a subjective assessment that is hard to back up, and it helps to have some data saying "we think this is clean because there are very few bugs in this 100,000 line program written using it".  Any code that is really easy to use right will tend to have *some* aesthetic appeal.

>I took a quick look at filepath.  It looks similar in concept to PEP
>355.  Four concerns:
>    - unfamiliar method names (createDirectory vs mkdir, child vs join)

Fair enough, but "child" really means child, not join.  It is explicitly for joining one additional segment, with no slashes in it.
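A minimal sketch of that distinction, written as a standalone function rather than as Twisted's actual implementation.  The names child and UnsafeSegmentError are hypothetical and chosen only for illustration.  The point is that os.path.join accepts anything, including an absolute path that silently discards the base, while a "child" operation accepts exactly one segment or fails loudly:

```python
import os.path


class UnsafeSegmentError(ValueError):
    """Hypothetical error: the requested child is not a single path segment."""


def child(base, segment):
    """Join exactly one extra segment onto base, refusing anything else."""
    # Treat both slash styles as separators regardless of platform, so that
    # the check behaves the same everywhere (an assumption of this sketch).
    separators = {"/", "\\", os.sep}
    if os.altsep:
        separators.add(os.altsep)
    if segment in ("", ".", ".."):
        raise UnsafeSegmentError(repr(segment))
    if any(sep in segment for sep in separators):
        raise UnsafeSegmentError(repr(segment))
    return os.path.join(base, segment)
```

With this, child("uploads", "report.txt") behaves like a plain join, while child("uploads", "../secret") or child("uploads", "/etc/passwd") raises instead of quietly escaping the base directory.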

>    - basename/dirname/parent are methods rather than properties:
>leads to () overproliferation in user code.

The () is there because every invocation returns a _new_ object.  I think that this is correct behavior but I also would prefer that it remain explicit.
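To illustrate the point about explicit calls, here is a small sketch of a path wrapper whose navigation methods each build a fresh object.  SketchPath is a hypothetical class, not filepath's real one, and it uses posixpath only so that the behavior is the same on every platform:

```python
import posixpath


class SketchPath:
    """Hypothetical path wrapper; every navigation call returns a new object."""

    def __init__(self, path):
        self.path = path

    def parent(self):
        # A new SketchPath is constructed on every call; nothing is cached.
        return SketchPath(posixpath.dirname(self.path))

    def basename(self):
        return posixpath.basename(self.path)


p = SketchPath("/srv/data/report.txt")
first = p.parent()
second = p.parent()
```

Here first and second wrap the same string but are distinct objects, and p itself is unchanged.  Spelling the call with () keeps that allocation visible at the call site, which a property would hide.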

>    - the "secure" features may not be necessary.  If they are, this
>should be a separate discussion, and perhaps implemented as a
>subclass.

The main "secure" feature is "child" and it is, in my opinion, the best part about the whole class.  Some of the other stuff (rummaging around for siblings with extensions, for example) is probably extraneous.  child, however, lets you take a string from arbitrary user input and map it into a path segment, both securely and quietly.  Here's a good example (this actually happened, which is how I know about that crazy Windows 'special files' thing I wrote about in my other recent message): you have a decision-making program that makes two files to store information about a process: "pro" and "con".  It turns out that "con" is shorthand for "fall in a well and die" in win32-ese.  A "secure" path manipulation library would alert you to this problem with a traceback rather than letting the program inexplicably freeze.  Obscure, sure, but a less obscure benefit would be getting deterministic errors when a user enters slashes into a text field that shouldn't accept them.
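A rough sketch of the kind of check that would turn the "con" freeze into a traceback.  This is an illustration under stated assumptions, not what filepath actually does: the function names are hypothetical, and the list covers only the commonly documented reserved DOS device names (CON, PRN, AUX, NUL, COM1 to COM9, LPT1 to LPT9), which Windows also treats as reserved when an extension follows them:

```python
# Commonly documented reserved device names on Windows (an assumption of
# this sketch; a real library may need a longer or version-specific list).
WINDOWS_RESERVED = {"CON", "PRN", "AUX", "NUL"}
WINDOWS_RESERVED |= {"COM%d" % i for i in range(1, 10)}
WINDOWS_RESERVED |= {"LPT%d" % i for i in range(1, 10)}


def is_reserved_on_windows(segment):
    """Return True if segment names a reserved device, e.g. 'con' or 'NUL.txt'."""
    # Only the part before the first dot matters, case-insensitively.
    stem = segment.split(".", 1)[0].rstrip(" ").upper()
    return stem in WINDOWS_RESERVED


def checked_segment(segment):
    """Hypothetical guard: raise instead of letting a device name through."""
    if is_reserved_on_windows(segment):
        raise ValueError("reserved device name on Windows: %r" % (segment,))
    return segment
```

In the decision-making program above, checked_segment("pro") would pass through unchanged, while checked_segment("con") would raise immediately on any platform, long before the file is opened.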

>    - stylistic objection to verbose camelCase names like createDirectory

There is no accounting for taste, I suppose.  Obviously if it violates the stdlib's naming conventions it would have to be adjusted.

>> Path representation is a bike shed.  Nobody would have proposed
>> writing an entirely new embedded database engine for Python: python
>> 2.5 simply included SQLite because its utility was already proven.
>
>There's a quantum level of difference between path/file manipulation
>-- which has long been considered a requirement for any full-featured
>programming language -- and a database engine which is much more
>complex.

"quantum" means "the smallest possible amount", although I don't think you're using it like that, so I think I agree with you.  No, it's not as hard as writing a database engine.  Nevertheless it is a non-trivial problem, one worthy of having its own library and clearly capable of generating a fair amount of its own discussion.

>Fredrik has convinced me that it's more urgent to OOize the pathname
>conversions than the filesystem operations.

I agree on the relative values.  I am still unconvinced that either is "urgent" in the sense that it needs to be in the standard library.

>Where have all the proponents of non-OO or limited-OO strategies been?

This continuum doesn't make any sense to me.  Where would you place Twisted's solution on it?