[Python-Dev] IO module precisions and exception hierarchy
Pascal Chambon
chambon.pascal at gmail.com
Mon Sep 28 10:50:51 CEST 2009
>
>> +-InvalidFileNameError (filepath max lengths, or "? / : " characters
>> in a windows file name...)
>
> This might be a bit too precise. Unix just has EINVAL, which
> covers any kind of invalid parameter, not just file names.
>
All right, thanks; an InvalidParameter (or similar) exception would be
a better fit, then.
> Personally I'd love to see a richer set of exceptions for IO errors,
> so long as they can be implemented for all supported platforms and no
> information (err number from the os) is lost.
>
> I've been implementing a fake 'file' type [1] for Silverlight which
> does IO operations using local browser storage. The use case is for an
> online Python tutorial running in the browser [2]. Whilst implementing
> the exception behaviour (writing to a file open in read mode, etc) I
> considered improving the exception messages as they are very poor -
> but decided that being similar to CPython was more important.
>
> Michael
>
> [1]
> http://code.google.com/p/trypython/source/browse/trunk/trypython/app/storage.py
> and
> http://code.google.com/p/trypython/source/browse/trunk/trypython/app/tests/test_storage.py
>
> [2] http://www.trypython.org/
>
Cool stuff :-)
It's indeed quite unclear at the moment which exceptions can really be
implemented (and are relevant) in a cross-platform way... I guess I
should use my own fileio implementation as a playground and a proof of
concept before we specify anything for CPython.
> What happens isn't specified, but in practice (with the current
> implementation) the overwriting will happen at the byte level, without
> any check for correctness at the character level.
>
> Actually, read+write text streams are implemented quite crudely, and
> little testing is done of them. The reason, as you discovered, is that
> the semantics are too weak, and it is not obvious how stronger semantics
> could look like. People wanting to do sophisticated random reads+writes
> over a text file should probably handle the encoding themselves and
> access the file at the binary level.
>
That sounds OK to me, as long as we warn users about this danger (I've
only just realized it myself). Most newcomers may happily open a UTF-8
text file and read/write in it carelessly, without realizing that the
characters they write can actually corrupt the file...
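
A minimal sketch of that danger, assuming a UTF-8 encoded file and using
an in-memory io.BytesIO as a stand-in for the file on disk (the sample
text is made up). Overwriting a single byte in the middle of a
multi-byte character leaves an invalid sequence behind:

```python
import io

# In-memory stand-in for a UTF-8 text file.
# "\u00e9" (e with acute accent) is encoded as two bytes: 0xC3 0xA9.
raw = io.BytesIO("h\u00e9llo".encode("utf-8"))

# Overwrite exactly one byte at offset 1, which is what a careless
# one-character write amounts to at the byte level: only the first
# half of the accented character is replaced.
raw.seek(1)
raw.write(b"e")

corrupted = raw.getvalue()  # the orphaned 0xA9 byte is still there

try:
    corrupted.decode("utf-8")
    decodes_cleanly = True
except UnicodeDecodeError:
    decodes_cleanly = False
```

Here the file no longer decodes as UTF-8 at all; in less lucky cases the
damage could go unnoticed until much later.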
> How about just making IOError = OSError, and introducing your proposed
> subclasses? Does the usage of IOError vs OSError have *any* useful
> semantics?
>
I thought that OSError dealt with a larger set of errors than IOError,
but after checking the errno codes, it seems that they're all more or
less related to IO problems (if we count interprocess communication as
I/O). So theoretically, IOError and OSError could be merged. Note that
in this case, WindowsError would have to become a child of
EnvironmentError, because Windows error codes really seem to go further
than IO errors (they deal with recursion limits, thousands of PC
parameters...).
The legacy is so heavy that OSError would have to remain as is, I think,
but we might simply ignore it in new io modules, and concentrate on an
IOError hierarchy to provide all the info needed by the developer.
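
To make the idea concrete, here is a rough sketch of what such a
hierarchy could look like. The class names, the errno table and the
refine() helper are all hypothetical, invented for illustration; nothing
here is an existing or agreed-upon API. The point is only that a more
specific class can be chosen without losing the errno from the OS:

```python
import errno
import os


class InvalidParameterError(IOError):
    """Hypothetical: the OS rejected an argument (EINVAL)."""


class PermissionDeniedError(IOError):
    """Hypothetical: access was refused (EACCES or EPERM)."""


# Assumed mapping from OS error codes to the hypothetical subclasses.
_ERRNO_TO_CLASS = {
    errno.EINVAL: InvalidParameterError,
    errno.EACCES: PermissionDeniedError,
    errno.EPERM: PermissionDeniedError,
}


def refine(exc):
    """Return a more specific exception, keeping errno and message."""
    cls = _ERRNO_TO_CLASS.get(exc.errno, IOError)
    return cls(exc.errno, exc.strerror)


# Simulate a generic error coming back from the OS layer.
err = refine(IOError(errno.EINVAL, os.strerror(errno.EINVAL)))
```

Code that only knows about IOError keeps working, since every refined
exception is still an IOError and still carries err.errno.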
>
>> Some of the error messages are truly awful though as things stand,
>> especially for someone new to Python. Try to read from a file handle
>> opened in write mode for example: IOError: [Errno 9] Bad file descriptor
> Subdividing the IOError exception won't help with
> that, because all you have to go on when deciding
> which exception to raise is the error code returned
> by the OS. If the same error code results from a
> bunch of different things, there's not much Python
> can do to sort them out.
>
Well, you don't only have the error number, you also have the context
of the exception. IOError subclasses would be particularly useful in a
"high level" IO context, where a single method can issue lots of system
calls (to check the file, lock it, edit it...). If the error is raised
during your locking operation, you can decide to classify it as
"LockingError" even if the error code provided might appear in several
different situations.
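
As a sketch of that idea (LockingError, locked_edit and failing_lock are
hypothetical names, and the two callables merely stand in for the real
system calls), a high-level method could classify the error by the step
that failed while still passing along the OS error code:

```python
import errno


class LockingError(IOError):
    """Hypothetical: an error raised while acquiring a lock."""


def locked_edit(lock, edit):
    """Call lock(), then edit(); a failing lock() raises LockingError.

    lock and edit are placeholder callables standing in for the
    several system calls a high-level IO method might issue.
    """
    try:
        lock()
    except IOError as exc:
        # Same errno as reported by the OS, but classified according
        # to the step in which the failure happened.
        raise LockingError(exc.errno, exc.strerror)
    return edit()


def failing_lock():
    # Simulates a locking call refused with an ambiguous error code.
    raise IOError(errno.EACCES, "Permission denied")


try:
    locked_edit(failing_lock, lambda: "edited")
    caught = None
except LockingError as exc:
    caught = exc
```

The caller can then handle LockingError specifically, or fall back to
inspecting caught.errno as before.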
Regards,
Pascal