[Tutor] RE: New file stuff
alan.gauld@bt.com
alan.gauld@bt.com
Wed, 14 Aug 2002 14:31:12 +0100
> Well, yes, but just because it is the way that it has always
> been done by computer scientists doesn't mean that a
> different way might not be easier
Oh I agree, the point I was making is that lots of
language designers have tried and they keep coming
back to variations on a theme...
> > Pascal kind of makes that distinction by having text
> > files as a special category. But binary files are
> > broken into myriad types... FILE OF <FOO>
>
> If anybody wanted to do that (deal with a file as a
> collection of 64-bit words instead of bytes, for example
I didn't explain that properly.
Pascal has the rather nice feature of defining binary
files in terms of the high level data types you want to
store. Thus if we define a class Foo we can declare a
FILE OF FOO
Then we can read/write Foo objects to the file as
binary chunks. This is a very nice feature, although
still not perfect since it can't handle polymorphic
collections etc. But it's a lot better than writing
sizeof(foo) bytes each time...
> > That's actually quite tricky to do. Why not try implementing
> > the interface in Python to see what's involved....
I don't mean define the interface, I mean actually write
a prototype of the basic open/read/write/slice operations
as a class.
See how many special cases you have to deal with etc.
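A toy, in-memory version of that experiment might start like this (the class name and behaviour are my own invention, purely to show where the special cases begin to appear):

```python
class SliceableFile:
    """Prototype of a file offering read/write/slice on its bytes.

    Backed by an in-memory bytearray for simplicity - a real
    version would need paging for files too big for memory.
    """
    def __init__(self, data=b""):
        self._data = bytearray(data)
        self._pos = 0

    def read(self, n=None):
        if n is None:
            n = len(self._data) - self._pos
        chunk = bytes(self._data[self._pos:self._pos + n])
        self._pos += len(chunk)
        return chunk

    def write(self, data):
        end = self._pos + len(data)
        # Special case: writing past the current end must grow the file
        if end > len(self._data):
            self._data.extend(b"\x00" * (end - len(self._data)))
        self._data[self._pos:end] = data
        self._pos = end

    def __getitem__(self, index):
        # Slicing reads without moving the file position
        return bytes(self._data[index])

f = SliceableFile(b"hello world")
print(f[0:5])    # b'hello'
f.write(b"HELLO")
print(f.read())  # b' world'
```

Even this trivial version has to decide what slicing does to the file position, and what happens when you write past the end - the special cases mount up quickly.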
> system basics - why can't you just read the whole file into a
> buffer and manipulate that, with occasional flushes to a
> backup version?
Usually that's what the programmer will do, but for big files
(several hundred megabytes) that's not really an option.
It's coping with these special cases that makes file
handling difficult, because at the end of the day you
come up against the hardware, which is essentially
a long sequence of bytes on a disk or tape!
> understand Linux correctly, this is what the operating system
> does anyway, or at least that is the excuse everybody keeps
Nope, Linux is smarter than that. It reads in a bit of
the file (a page) and then as you read through the data it
'pages in' the next segment ready for you to move onto.
Once you leave the first page it gets deleted and the
space used for the next page, and so on... But this is
still basically a sequential read of the disk/tape.
> "free" shows me that all my nice RAM is being used for
> buffers and caches and stuff like that..
Yes, the pages are stored in buffers and the output
is written to a page buffer before eventually being
flushed to disk - that's why files need a flush
operation...
To compound matters, the underlying limitations tend
to be, as you say, the Posix level read/write calls,
but even they are shaped largely by the hardware
device driver I/O routines, which also operate in
the same way (the BIOS on a PC).
To radically change how we use files we need to change
how we design the hardware!
After all, even our applications (Word etc.) use the
same metaphor - open a file, read it in, modify it,
write it out...
Alan g.
Author of the 'Learning to Program' web site
http://www.freenetpages.co.uk/hp/alan.gauld