file object, details of modes and some issues.

simon place simon_place at lineone.net
Tue Aug 26 11:43:08 EDT 2003


Is the code below meant to produce rubbish? I had expected an exception.

f=file('readme.txt','w')
f.write(' ')
f.read()

( PythonWin 2.3 (#46, Jul 29 2003, 18:54:32) [MSC v.1200 32 bit (Intel)] on
win32. )

I got this while experimenting, trying to figure out the file object's modes,
which even on very careful reading of the documentation left me completely in
the dark. Below is a summary of that experimentation; some is for reference and
some is more of a warning.

     'r' is (r)ead mode, so you can't write to the file; you get
'IOError:(0,'Error')' if you try, which isn't a particularly helpful error
description. read() reads the whole file and read(x) reads x bytes, unless
fewer than x bytes are left, in which case it reads as much as possible. So
getting fewer bytes than requested indicates the end of the file. I think an
exception when a read at the end of the file is attempted might be better,
like iterators. If you try to open a non-existent file in 'r' mode you get
'IOError: [Errno 2] No such file or directory: "filename"', which makes sense.
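
The short-read test for end of file can be sketched as below. This is only an
illustration on a current Python 3 interpreter using open() rather than the 2.3
file() constructor, so details may differ from what is described above; the
scratch path and the read_chunks helper are made up for the example.

```python
import os
import tempfile

# Hypothetical scratch file, created only for this illustration.
path = os.path.join(tempfile.mkdtemp(), "readme.txt")
with open(path, "wb") as f:
    f.write(b"abcde")

def read_chunks(filename, size):
    """Read fixed-size chunks until read() returns an empty result."""
    chunks = []
    with open(filename, "rb") as f:
        while True:
            chunk = f.read(size)
            if not chunk:          # an empty result signals end of file
                break
            chunks.append(chunk)
    return chunks

chunks = read_chunks(path, 2)      # the last chunk is shorter than requested

# Opening a file that does not exist in read mode raises an error.
try:
    open(path + ".missing", "rb")
    missing_errno = None
except IOError as exc:
    missing_errno = exc.errno      # errno 2: no such file or directory
```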

     'w' is (w)rite mode, so you can't read from the file. Any existing file is
erased or a new file is created, and bear in mind that anything you write to
the file can't be read back directly on this object. You get 'IOError: [Errno 9]
Bad file descriptor' if you try reading, which is an awful error description.
BUT this only seems to happen at the beginning of the file. When at the end of
the file, as is the case when you have just written something (without a
backward seek, see below), you don't get an exception but lots of rubbish data
(see the example at the beginning). This mode allows you to seek backward and
rewrite data, but if you try a read somewhere between the first character and
the end, you get a different exception, 'IOError: (0, 'Error')'.
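
A defensive way to exercise write mode is sketched below, assuming a current
Python 3 interpreter and a made-up scratch path. Which exception a read raises
depends on the Python version, so the sketch catches both IOError and
ValueError instead of relying on one message.

```python
import os
import tempfile

# Hypothetical scratch file, created only for this illustration.
path = os.path.join(tempfile.mkdtemp(), "readme.txt")

f = open(path, "w")
f.write("hello world")

# Reading from a write-only file object is an error; never trust its result.
try:
    f.read()
    read_failed = False
except (IOError, ValueError):
    read_failed = True

f.seek(0)              # seek backward and overwrite data in place
f.write("HELLO")
f.close()

# Read the result back through a separate, read-mode file object.
with open(path, "r") as g:
    contents = g.read()
```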

     'a' is (a)ppend mode: you can only add to the file, so it is basically
write mode (with the same problems) plus a seek to the end. Obviously append
doesn't erase an existing file. It also ignores file seeks, so all writes pile
up at the end. tell() gives the correct location in the file after a write (so
it actually always gives the length of the file), but if you seek() you don't
get an exception and tell() returns the new value, even though writes still go
to the end of the file. So if you use tell() to find out where writes are
going, in this mode it might not always be right.
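
The mismatch between tell() and where the data lands can be sketched like
this, again assuming Python 3 and a made-up scratch path:

```python
import os
import tempfile

# Hypothetical scratch file, created only for this illustration.
path = os.path.join(tempfile.mkdtemp(), "readme.txt")
with open(path, "wb") as f:
    f.write(b"12345")

f = open(path, "ab")
f.seek(0)                      # accepted without an exception
pos_after_seek = f.tell()      # reports the position that was asked for
f.write(b"X")                  # but the data is still appended at the end
f.close()

with open(path, "rb") as g:
    contents = g.read()
```

The existing bytes are untouched and the new byte follows them, whatever
tell() said before the write.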

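     'r+' is (r)ead (+) update mode, which means read and write access, but
don't read after a write without seeking backward first, because it will then
read a lot of garbage (the rest of the disk fragment/buffer, I guess?). This
fits the C stdio rule that a write must be followed by a flush or a seek
before the stream is read.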

     'w+' is (w)rite (+) update mode, which means read and write access,
(like 'r+' but on a new or erased file).
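
The two update modes can be compared with a sketch like the one below. It
assumes Python 3 and a made-up scratch path, and always seeks between the
write and the read, as recommended above.

```python
import os
import tempfile

# Hypothetical scratch file, created only for this illustration.
path = os.path.join(tempfile.mkdtemp(), "readme.txt")
with open(path, "wb") as f:
    f.write(b"0123456789")

# 'r+' keeps the existing contents and overwrites from the start.
f = open(path, "r+b")
f.write(b"AB")
f.seek(0)                  # reposition before switching from write to read
after_r_plus = f.read()
f.close()

# 'w+' erases the file first, so only the new data can be read back.
f = open(path, "w+b")
f.write(b"AB")
f.seek(0)
after_w_plus = f.read()
f.close()
```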

     'a+' is (a)ppend (+) update mode, which also means read and write, but
file seeks are ignored, so any read seems a bit pointless since it always
reads past the end of the file, returning garbage! Worse, it extends the
file, so this garbage becomes incorporated in the file!! (Yes, really.)

     'b': all modes can have a 'b' appended to indicate binary mode. I think this
is something of a throw-back to serial comms (serial comms being bundled into
the same handlers as files because when these things were developed, 20+ years
ago, nothing better was around). Binary mode turns off the 'clever' handling
of line ends: on Windows, text mode translates '\n' to '\r\n' on writing and
back again on reading, and treats Ctrl-Z as end of file, whereas on Unix-like
systems text and binary mode behave the same. Since the mode does make a
difference on some o.s.'s (or when actually using the file object for serial
comms), I think you should actually ALWAYS use the binary version of the mode
and handle the line ends etc. yourself (then of course you'll have to deal
with the different line end types!).
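
The effect of line-end translation can be shown on any platform with the
sketch below. It relies on the newline argument of Python 3's open(), which
the 2.3 file() constructor does not have, to imitate Windows-style text mode;
the scratch path is made up.

```python
import os
import tempfile

# Hypothetical scratch file, created only for this illustration.
path = os.path.join(tempfile.mkdtemp(), "readme.txt")

# Text mode with '\r\n' translation, as Windows text mode would do on write.
with open(path, "w", newline="\r\n") as f:
    f.write("one\ntwo\n")

# Binary mode shows the bytes that are really in the file.
with open(path, "rb") as f:
    raw = f.read()

# Text mode with universal newlines translates them back on read.
with open(path, "r", newline=None) as f:
    text = f.read()
```

Here raw holds '\r\n' pairs while text holds plain '\n', so code that counts
bytes or seeks by offset is safer with the binary view.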

     I'm a bit surprised that the file object doesn't do ANY access control:
multiple file objects on the same actual file can ALL write to it!! And other
software can edit files opened for writing by the file object. However, a write
lock on a file made by other software causes an 'IOError: [Errno 13] Permission
denied' when it is opened by python with write access. I guess you need
os.access to test and os.chmod to change the file permissions, but I
haven't gone into this. Shame that there doesn't appear to be a nice simple
file object subclass that does all this! Writes to the file object actually
get done when flush() (or seek(), or close()) is called.
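
Both points, buffered writes and the lack of access control, can be sketched
as follows. This assumes Python 3 and a made-up scratch path; note that
os.access only reports permission bits and says nothing about locks held by
other software.

```python
import os
import tempfile

# Hypothetical scratch file, created only for this illustration.
path = os.path.join(tempfile.mkdtemp(), "readme.txt")

f = open(path, "wb")
f.write(b"pending")
size_before_flush = os.path.getsize(path)   # data may still be buffered
f.flush()
size_after_flush = os.path.getsize(path)    # now handed to the OS
f.close()

# Two independent file objects on the same file; neither is refused.
a = open(path, "r+b")
b = open(path, "r+b")
a.write(b"AAAA")
a.flush()
b.write(b"BB")             # silently overwrites part of what 'a' wrote
b.flush()
a.close()
b.close()

with open(path, "rb") as g:
    contents = g.read()

# Permission check only; this does not detect locks.
writable = os.access(path, os.W_OK)
```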

     Suffice to say, I wasn't entirely impressed with the python file object. Then
I remembered the cross-platform problems it's dealing with and all
the code that works ok with it, and thought I'd knock up this post of my
findings to try to elicit some discussion / get it improved / stop others
making mistakes.
