File handling: The easy and the hard way

Thorsten Kampe thorsten at thorstenkampe.de
Sun Oct 3 13:25:00 CEST 2004


* Hans-Joachim Widmaier (2004-09-30 15:56 +0200)
> Handling files is an extremely frequent task in programming, so most
> programming languages have an abstraction of the basic files offered by
> the underlying operating system. This is indeed also true for our language
> of choice, Python. Its file type allows some extraordinary convenient
> access like:
> 
>     for line in open("blah"):
>         handle_line(line)
> 
> While this is very handy for interactive usage or throw-away scripts, I'd
> consider it a serious bug in a "production quality" software. Tracebacks
> are fine for programmers, but end users really never should see any.

Correct. Set sys.tracebacklimit to zero (and leave it at the default
1000 for "--debug').

> Especially not when the error is not in the program itself, but rather
> just a mistyped filename. (Most of my helper scripts that we use to
> develop software handle files this way. And even my co-workers don't
> recognize 'file or directory not found' for what it is.) End users are
> entitled to error messages they can easily understand like "I could not
> open 'blaah' because there is no such file". Graceful error handling is
> even more important when a program isn't just run on a command line but
> with a GUI.
> 
> Which means? Which means that all this convenient file handling that
> Python offers really should not be used in programs you give away. When I
> asked for a canonical file access pattern some months ago, this was the
> result:
> http://groups.google.com/groups?hl=de&lr=&ie=UTF-8&threadm=pan.2003.12.30.21.32.37.195763%40web.de&rnum=1&prev=/groups%3Fhl%3Dde%26lr%3D%26ie%3DUTF-8%26q%3Dfile%2Bpattern%2Bcanonical%26btnG%3DSuche%26meta%3D
> 
> Now I have some programs that read and write several files at once. And
> the reading and writing is done all over the place. If I really wanted to
> do it "right", my once clear and readily understandable code turns into a
> nightmare. This doesn't look like the language I love for its clarity and
> expressivness any more.

I think the whole task is rather complicated. Python just passes the
error message of the operating system through without interpreting it.
Of course "no such file" is b***sh*t for an error message (because
it's even lacking a verb) but can Python by itself be more
sophisticated about these things than the OS? I don't think so.

Think about "/foo/bar/baz/text.txt doesn't exist". There are multiple
reasons

1. There is a directory /foo/bar/baz/ but no test.txt in it.
2. The whole path hierarchy is missing: there isn't even a /foo.
3. It was there, I know it because wrote to the file, but now I cannot
access the directory anymore. Is the file deleted? The directory? Did
I lose access to the network where the share is actually located?

Now this is where the intelligence of the programmer comes in and they
try to guess. They say: "Sorry, I'm not sitting in front of the
machinem but this error message could mean this." And oftenly they are
ridiciously wrong.

If anyone ever made some kind of "error matrix" - meaning if error foo
and bar but not baz then it's likely whatever, the whole community
could profit.

Thorsten



More information about the Python-list mailing list