[Python-Dev] IO module precisions and exception hierarchy

Mon Sep 28 10:50:51 CEST 2009

>
>>  +-InvalidFileNameError (filepath max lengths, or "? / : " characters 
>> in a windows file name...)
>
> This might be a bit too precise. Unix just has EINVAL, which
> covers any kind of invalid parameter, not just file names.
>
All right, thanks. An InvalidParameter (or similar) exception would be a 
better fit then.
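
A minimal sketch of how such an exception could be derived from the OS 
error code without losing it. The names InvalidParameterError, 
ERRNO_TO_EXCEPTION and refine are hypothetical, chosen only for 
illustration; nothing here is an existing or agreed-upon API.

```python
import errno


class InvalidParameterError(IOError):
    """Hypothetical subclass for EINVAL-style failures."""


# Hypothetical mapping from errno codes to richer subclasses.
ERRNO_TO_EXCEPTION = {errno.EINVAL: InvalidParameterError}


def refine(exc):
    """Return a more specific exception carrying the same errno.

    Exceptions whose errno has no mapping are returned unchanged.
    """
    cls = ERRNO_TO_EXCEPTION.get(exc.errno)
    if cls is None:
        return exc
    # Re-create the error with the same errno, message and filename,
    # so no information from the OS is lost.
    return cls(exc.errno, exc.strerror, exc.filename)


refined = refine(IOError(errno.EINVAL, "Invalid argument", "a?b.txt"))
untouched = refine(IOError(errno.ENOENT, "No such file or directory", "x"))
```

Code that catches plain IOError keeps working, since the refined 
exception is still an IOError with its errno attribute intact.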


> Personally I'd love to see a richer set of exceptions for IO errors, 
> so long as they can be implemented for all supported platforms and no 
> information (err number from the os) is lost.
>
> I've been implementing a fake 'file' type [1] for Silverlight which 
> does IO operations using local browser storage. The use case is for an 
> online Python tutorial running in the browser [2]. Whilst implementing 
> the exception behaviour (writing to a file open in read mode, etc) I 
> considered improving the exception messages as they are very poor - 
> but decided that being similar to CPython was more important.
>
> Michael
>
> [1] 
> http://code.google.com/p/trypython/source/browse/trunk/trypython/app/storage.py 
> and 
> http://code.google.com/p/trypython/source/browse/trunk/trypython/app/tests/test_storage.py 
>
> [2] http://www.trypython.org/
>
Cool stuff  :-)
It's indeed quite unclear at the moment which exceptions will really 
be possible (and relevant) to implement in a cross-platform way... I 
guess I should use my own fileio implementation as a playground and a 
proof of concept, before we specify anything for CPython.


> What happens isn't specified, but in practice (with the current 
> implementation) the overwriting will happen at the byte level, without 
> any check for correctness at the character level.
>
> Actually, read+write text streams are implemented quite crudely, and 
> little testing is done of them. The reason, as you discovered, is that 
> the semantics are too weak, and it is not obvious what stronger semantics 
> could look like. People wanting to do sophisticated random reads+writes 
> over a text file should probably handle the encoding themselves and 
> access the file at the binary level.
>   
That sounds OK to me, as long as we warn users about this danger (I only 
just realized it myself). Most newcomers may happily open a UTF-8 
text file and read/write in it carelessly, without realizing that the 
characters they write can actually corrupt the file...
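
To illustrate the danger, here is a small sketch. It performs the 
overwrite in binary mode on purpose, to mimic what a byte-level write in 
the middle of a UTF-8 file does; the file name and contents are made up 
for the example, and what a text stream does in this situation is not 
specified.

```python
import os
import tempfile

# Write a short UTF-8 text file. The character U+00E9 (e with acute
# accent) occupies two bytes in UTF-8: 0xC3 0xA9.
path = os.path.join(tempfile.mkdtemp(), "demo.txt")
with open(path, "w", encoding="utf-8") as f:
    f.write("h\u00e9llo")

# Overwrite one byte at offset 1, which is the first half of the
# two-byte character, as a careless byte-level write would.
with open(path, "r+b") as f:
    f.seek(1)
    f.write(b"a")

with open(path, "rb") as f:
    raw = f.read()

# The continuation byte 0xA9 is now orphaned, so the file is no
# longer valid UTF-8.
try:
    raw.decode("utf-8")
    corrupted = False
except UnicodeDecodeError:
    corrupted = True
```

After the overwrite, raw holds the bytes of "ha", then a stray 0xA9, 
then "llo", and decoding it as UTF-8 fails.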


> How about just making IOError = OSError, and introducing your proposed 
> subclasses? Does the usage of IOError vs OSError have *any* useful 
> semantics?
>
I thought that OSError dealt with a larger set of errors than IOError, 
but after checking the errno codes, it seems that they're all more or 
less related to IO problems (if we include interprocess communication in 
I/O). So theoretically, IOError and OSError might be merged. Note that 
in this case, WindowsError would have to become a child of 
EnvironmentError, because Windows error codes really seem to go further 
than I/O errors (they deal with recursion limits, thousands of PC 
parameters...).
The legacy is so heavy that OSError would have to remain as is, I think, 
but we might simply leave it aside in new io modules, and concentrate on 
an IOError hierarchy to provide all the info needed by the developer.


>
>> Some of the error messages are truly awful though as things stand, 
>> especially for someone new to Python. Try to write to a file handle 
>> opened in read mode for example: IOError: [Errno 9] Bad file descriptor
> Subdividing the IOError exception won't help with
> that, because all you have to go on when deciding
> which exception to raise is the error code returned
> by the OS. If the same error code results from a
> bunch of different things, there's not much Python
> can do to sort them out.
>
Well, you don't only have the error number, you also have the context of 
the exception. IOError subclasses would be particularly useful in a 
"high level" IO context, where a single method can issue lots of 
system calls (to check the file, lock it, edit it...). If the error is 
raised during your locking operation, you can decide to classify it as a 
"LockingError" even if the error code provided might appear in several 
different situations.
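
A rough sketch of what I mean, under stated assumptions: LockingError, 
fake_lock and locked_edit are hypothetical names, and fake_lock is only 
a stand-in for a real locking system call that happens to fail.

```python
import errno


class LockingError(IOError):
    """Hypothetical subclass for failures during a locking step."""


def fake_lock(path):
    # Stand-in for a real locking call; always fails with EACCES.
    raise IOError(errno.EACCES, "Permission denied", path)


def locked_edit(path, lock=fake_lock):
    """High-level operation made of several low-level steps."""
    try:
        lock(path)
    except IOError as exc:
        # EACCES could just as well come from open() or write(); the
        # context tells us it was the locking step that failed. The
        # original errno, message and filename are carried over.
        raise LockingError(exc.errno, exc.strerror, exc.filename)
    # ... the edit itself would go here ...


try:
    locked_edit("data.txt")
except LockingError as err:
    caught = err
```

The caller can then catch LockingError specifically, while the errno 
from the OS remains available on the exception.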

Regards,
Pascal