[Tutor] pass tuples to user defined function(beginner)

Dave Angel d at davea.name
Wed Nov 30 00:16:30 CET 2011


On 11/29/2011 04:40 PM, Mayo Adams wrote:
> Apologies for my numerous offenses to protocol, and gratitude for  the
> suggestions all around. And yet...  as to the notion of a tuple
> existing in some non-Platonic sense, I suspect I will have to carry my
> confusion away in order to dispel it by further reading.
> "Wrong on both counts?" Hey, you win! I WAS wrong. But if it is not a
> tuple that is in the file, it would be helpful to know what it is.
> Presumably, a string representing a tuple. And as to the matter of
> representation,  I can't immediately see how anything in a script is
> anything other than a representation of some kind, hence the
> distinction between representamen and object does no work for me.
>
> Yes, I should have been clearer as to what I was trying to achieve,
> but I underestimated the good will of this community (in most cases) and
> their willingness to help.  For which, as I said, much thanks.

Nicely worded.  We appreciate your confusion, but please realize that
each of us comes from a different background, computing-wise.

In traditional (compiled) languages, there was a sharp and uncrossable 
division between the stuff in source files and the stuff in data files.  
The former got compiled (by multi-million dollar machines) into machine 
code, which used the latter as data, both input and output.  The data 
the user saw never got near a compiler, so the rules of interpreting it 
were entirely different.  A file was just a file, a bunch of bytes made 
meaningful only by the particular application that dealt with it.  And 
the keyboard, screen and printer (and the card punch/reader) were in 
some sense just more files.

This had some (perhaps unforeseen) advantages in isolating the differing 
parts of the application.  Source code was written by highly trained 
people, who tried to eliminate the bugs and crashes before releasing the 
executable to the public.  Once debugged, this code was extremely 
protected against malicious and careless users (both human and virus, 
both local and over the internet) who might manage to corrupt things.  
The operating systems and hardware were also designed to make this 
distinction complete, so that even buggy programs couldn't harm the 
system itself.  I've even worked on systems where the code and the data
didn't occupy the same kind of memory.  When a program crashed
the system, it was our fault, not the user's, because we didn't give him
(at least not on purpose) the ability to run arbitrary code.

Then came the days of MS-DOS, which is a totally unprotected operating
system, and scripting languages and interpreters, which could actually
run code from data files, and the lines became blurry.  People were
writing malicious macros into word processing data, and the machines
actually would crash from it.  (Actually things were not this linear;
interpreters have been around practically forever, but give me a little
poetic license.)

So gradually, we've added protection back into our personal systems that 
previously was only on mainframes.  But the need for these protections 
is higher today than ever before, mainly because of the internet.


So to Python.  Notice that the rules for finding code are different than 
those for finding data files.  No accident.  We WANT to treat them 
differently.  We want data files to be completely spec'ed out, as to 
what data is permissible and what is not.  Without such a spec, programs 
are entirely unpredictable.  The whole Y2K problem was caused because 
too many programmers made unwarranted assumptions about their data (in 
that case about the range of valid dates).  And they were also saving 
space, in an era where hard disks were available for maybe three 
thousand dollars for 10 megabytes.

Ever filled in a form which didn't leave enough room for your town name,
or that assumed that a city name would not have spaces?  Ever had
software that thought that apostrophes weren't going to happen in a
person's name?  Or that zip codes are all numeric (true in the US, not
elsewhere)?  Or that names only consist of the 26 English letters?  Or
even that only the first letter would be capitalized?

Anyway, my post was pointing out that without constraining the data, it 
was impractical/unsafe/unwise to encode the data in a byte stream (file),
and expect to decode it later.
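
To make that concrete, here is a minimal sketch of an encoding that
only works while an unstated assumption about the data holds.  The
encode() and decode() functions and the comma-separated format are
hypothetical, made up for illustration; they are not from the original
script.

```python
def encode(name, count):
    # Naive format: two fields joined by a comma, with no escaping.
    return name + "," + str(count)


def decode(line):
    # Silently assumes the name never contains a comma.
    name, count = line.split(",")
    return name, int(count)


# Works as long as the data happens to fit the unstated assumption.
print(decode(encode("Adams", 3)))

# Breaks as soon as a perfectly reasonable name contains a comma.
try:
    decode(encode("Adams, Mayo", 3))
except ValueError as exc:
    print("decode failed:", exc)
```

Nothing in the file format says commas are forbidden in names, so the
decoder has no way to tell where the name ends.  Writing down what data
is permissible is what makes the round trip dependable.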

In your example, you encoded a tuple consisting of a string and an integer.
You used repr() to do it, and that can be fine.  But the logic to 
transform that string back into a tuple is quite complex, if you have to 
cover all the strange cases.  And impossible, if you don't know anything 
about the data.
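
As a rough illustration, assuming the data is limited to plain Python
literals (strings, numbers, tuples, lists, dicts and the like), the
standard library's ast.literal_eval() can turn the repr() text back
into an object.  This is one possible approach, not necessarily what
your script should do, and the sample record is made up.

```python
import ast

record = ("Mayo", 42)      # hypothetical data: a string and an integer
text = repr(record)        # this is what would be written to the file

# What sits in the file is a string that looks like a tuple.
print(type(text).__name__, text)

# literal_eval only accepts Python literals; it does not run code.
restored = ast.literal_eval(text)
print(type(restored).__name__, restored)

# Anything that is not a plain literal is rejected.
try:
    ast.literal_eval("__import__('os').getcwd()")
except ValueError as exc:
    print("rejected:", exc)
```

Note that this only works because the contents were constrained to
literals.  For arbitrary objects, repr() gives no guarantee that its
output can be parsed back at all.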

A tuple object in memory doesn't look anything like the text you save to 
that file.  It's probably over a hundred bytes spread out over at least 
three independent memory blocks.  But we try to pretend that the 
in-memory object is some kind of idealized tuple, and don't need to know 
the details.  Much of the space is taken up by pointers, and pointers to 
pointers, that reference the code that knows how to manipulate it.
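
For a rough feel of the difference, the sketch below compares the
length of the text form with what sys.getsizeof() reports for the
tuple and the two objects it refers to.  The numbers are specific to
the Python implementation and version, so treat them as an
illustration only; the sample record is made up.

```python
import sys

record = ("Mayo", 42)      # hypothetical data: a string and an integer
text = repr(record)

# The tuple only holds references; the string and the integer are
# separate objects, so each one is measured on its own.
parts = [record, record[0], record[1]]
in_memory = sum(sys.getsizeof(obj) for obj in parts)

print("characters in the text form:", len(text))
print("bytes reported for the three objects:", in_memory)
```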

You might want to look into shelve or pickle, which are designed to save 
objects to a file and restore them later.  These have to be as general 
as possible, and the complexity of both the code and the resultant data 
file reflects that.
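
A minimal sketch of the pickle route, assuming a made-up record and a
temporary file name chosen only for this example:

```python
import os
import pickle
import tempfile

record = ("Mayo", 42)      # hypothetical data: a string and an integer

# Pickle files are binary, so open them in "wb" / "rb" mode.
path = os.path.join(tempfile.mkdtemp(), "record.pkl")
with open(path, "wb") as f:
    pickle.dump(record, f)

with open(path, "rb") as f:
    restored = pickle.load(f)

# What comes back is a real tuple, not a string that looks like one.
print(type(restored).__name__, restored == record)
```

One caution that fits the theme above: unpickling can execute code
embedded in the file, so only load pickles that come from a source you
trust.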

-- 

DaveA


