[Python-Dev] Path object design

Mike Orr sluggoster at gmail.com
Thu Nov 2 03:36:31 CET 2006


On 11/1/06, glyph at divmod.com <glyph at divmod.com> wrote:
> On 08:14 pm, sluggoster at gmail.com wrote:

> >(...) people have had to spend five years putting hard-to-read
> >os.path functions in the code, or reinventing the wheel with their own
> >libraries that they're not sure they can trust.  I started to use
> >path.py last year when it looked like it was emerging as the basis of
> >a new standard, but yanked it out again when it was clear the API
> >would be different by the time it's accepted.  I've gone back to
> >os.path for now until something stable emerges but I really wish I
> >didn't have to.
>
> You *don't* have to.  This is a weird attitude I've encountered over and
> over again in the Python community, although sometimes it masquerades as
> resistance to Twisted or Zope or whatever.  It's OK to use libraries.  It's
> OK even to use libraries that Guido doesn't like!  I'm pretty sure the first
> person to tell you that would be Guido himself.  (Well, second, since I just
> told you.)  If you like path.py and it solves your problems, use path.py.
> You don't have to cram it into the standard library to do that.  It won't be
> any harder to migrate from an old path object to a new path object than from
> os.path to a new path object, and in fact it would likely be considerably
> easier.

Oh, I understand it's OK to use libraries.  It's just that a path
library needs to be widely tested and well supported so you know it
won't scramble your files.  A bug in a date library affects only
datetimes.  A bug in a database library affects only that
database.  A bug in a template library affects only the page being
output.  But a bug in a path library could ruin your whole day.  "Um,
remember those important files in that other project directory you
weren't working in? They were just overwritten."

Also, I train several programmers new to Python at work. I want to
make them learn *one* path library that we'll be sure to stick with
for several years.  Every path library has subtle quirks, and
switching from one to another may not be just a matter of renaming
methods.

> >    - the "secure" features may not be necessary.  If they are, this
> >should be a separate discussion, and perhaps implemented as a
> >subclass.
>
> The main "secure" feature is "child" and it is, in my opinion, the best part
> about the whole class.  Some of the other stuff (rummaging around for
> siblings with extensions, for example) is probably extraneous.  child,
> however, lets you take a string from arbitrary user input and map it into a
> path segment, both securely and quietly.  Here's a good example (and this
> actually happened, this is how I know about that crazy windows 'special
> files' thing I wrote in my other recent message): you have a decision-making
> program that makes two files to store information about a process: "pro" and
> "con".  It turns out that "con" is shorthand for "fall in a well and die" in
> win32-ese.  A "secure" path manipulation library would alert you to this
> problem with a traceback rather than having it inexplicably freeze.
> Obscure, sure, but less obscure would be getting deterministic errors from a
> user entering slashes into a text field that shouldn't accept them.

Perhaps you're right.  I'm not saying it *should not* be a basic
feature, just that unless the Python community as a whole is ready for
this, users should have a choice to use it or not.
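
To make the "child" idea concrete, here is a minimal sketch of that kind
of check.  It is not Twisted's implementation; the names safe_child and
InsecurePathError are hypothetical, and the rules (one segment only, no
separators, no "." or "..") are an assumption about what "securely"
should mean here.

```python
import os


class InsecurePathError(ValueError):
    """Raised when a segment would not stay directly under the parent."""


def safe_child(parent, segment):
    """Join ONE untrusted segment onto parent, or raise InsecurePathError.

    Hypothetical helper: it accepts a single path segment and rejects
    anything that could climb out of, or descend below, the parent.
    """
    # Treat both slash styles as separators on every platform, so the
    # same input is rejected the same way everywhere.
    separators = {"/", "\\", os.sep}
    if os.altsep:
        separators.add(os.altsep)

    # Empty names, the special entries "." and "..", and embedded NULs
    # are never valid child names.
    if not segment or segment in (".", "..") or "\0" in segment:
        raise InsecurePathError(repr(segment))

    # A separator means the input is more than one segment.
    if any(sep in segment for sep in separators):
        raise InsecurePathError(repr(segment))

    return os.path.join(parent, segment)
```

With this, safe_child("reports", "pro") returns the joined path, while
input such as "../secret" or "a/b" fails at once with an
InsecurePathError instead of quietly landing somewhere else.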

I learned about DOS device files from the manuals back in the 80s.
But I had completely forgotten them when I made several "aux"
directories in a Subversion repository on Linux.  People tried to
check it out on Windows and... got some kind of error.  "CON" means
console: its input comes from the keyboard and its output goes to the
screen.  Since this is a device file, I'm not sure a path library has
any responsibility to treat it specially.  We don't treat
"/dev/stdout" specially unless the user specifically calls a device
function. I have no idea why Microsoft thought it was a good idea to
put the seven-odd device files in every directory.  Why not force
people to type the colon ("CON:")?  If they've memorized what CON
means they should have no trouble with the colon, especially since
it's required with "A:" and "C:" anyway.

For trivia, these are the ones I remember:
    CON          Console (keyboard input, screen output)
    KBRD         Keyboard input
    ???          Screen output
    LPT1/2/3     Parallel ports
    COM1/2/3/4   Serial ports
    PRN          Alias for default printer port (normally LPT1)
    NUL          Bit bucket
    AUX          Game port?

COPY CON FILENAME.TXT      # Unix: "cat >filename.txt".
COPY FILENAME.TXT PRN      # Unix: "lp filename.txt" or "cat filename.txt | lp".
TYPE FILENAME.TXT          # Unix: "cat filename.txt".
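
If a path library did decide to flag these names, the check might look
roughly like the sketch below.  The function name is hypothetical, and
the reserved set is an assumption taken from Microsoft's file-naming
guidance rather than from the list above (which is from memory).  That
guidance also says to avoid these names when followed by an extension,
so the sketch compares only the part before the first dot.

```python
# Assumed reserved device names; COM and LPT are taken as 1 through 9.
RESERVED_NAMES = (
    {"CON", "PRN", "AUX", "NUL"}
    | {"COM%d" % i for i in range(1, 10)}
    | {"LPT%d" % i for i in range(1, 10)}
)


def is_dos_device_name(segment):
    """Return True if a single path segment names a reserved DOS device.

    Hypothetical helper.  Matching is case-insensitive and ignores
    trailing dots/spaces and any extension, so "con", "CON.txt" and
    "aux " all count.
    """
    # Drop trailing dots and spaces, then keep the part before the
    # first dot (the "stem"), minus any spaces left at its end.
    stem = segment.rstrip(" .").split(".", 1)[0].rstrip(" ")
    return stem.upper() in RESERVED_NAMES
```

A caller could raise an error when this returns True, which would turn
the "pro"/"con" freeze and the "aux" checkout failure into an immediate,
explainable traceback.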

> >Where have all the proponents of non-OO or limited-OO strategies been?
>
> This continuum doesn't make any sense to me.  Where would you place
> Twisted's solution on it?

In the "let's create a brilliant library and put a dark box around it
so nobody knows it's there" position.  You say you've been trying to
spread the word about it, but for whatever reason I hadn't heard of it
till now.  Not sure what this means.

But what I meant is, we OO proponents have been trying to promote
path.py and/or get a similar module into the stdlib for years, and all
we got was... not even hostility... just indifference and silence.
People like to complain about os.path but not do anything about fixing
it, or even to say which approach they *would* support.  Talin started
a great thread on the python-3000 list, going back to the beginning
and saying "What is wrong with os.path, how much does it need fixing,
and is consensus on an API possible?"  Maybe he did what the rest of
us (including me) should have done long ago.

-- 
Mike Orr <sluggoster at gmail.com>

