Trying to set a cookie within a python script
davea at ieee.org
Tue Aug 3 17:41:36 CEST 2010
>> On 3 Aug, 11:10, Dave Angel <da... at ieee.org> wrote:
>> a) a text editor takes keystrokes and cut/paste info and other data, and
>> produces a stream of (unicode) characters. It then encodes each of
>> those characters into one or more bytes and saves it to a file. You have
>> to tell Notepad++ how to do that encoding. Note that unless it's saving
>> a BOM, there's no clue in the file what encoding it used.
> So actually when I'm selecting an encoding from Notepad++'s options
> I am basically telling the editor the way it's supposed to store
> those streams of characters to the hard disk drive.
> Different encodings equal different ways of storing the data to the
> media, correct?
Exactly. The file is a stream of bytes, and Unicode has more than 256
possible characters. Further, even the subset of characters that *do*
take one byte are different for different encodings. So you need to tell
the editor what encoding you want to use.
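A short sketch of both points, in Python 3 syntax. The codec names are standard library codecs; the sample characters are my own choice for illustration.

```python
# The same character becomes different bytes under different encodings.
alpha = "\u03b1"  # GREEK SMALL LETTER ALPHA

utf8_bytes = alpha.encode("utf-8")        # two bytes
greek_bytes = alpha.encode("iso-8859-7")  # one byte in the Greek code page

# The same single byte means different characters in different encodings.
same_byte = b"\xe1"
as_greek = same_byte.decode("iso-8859-7")   # Greek alpha
as_latin1 = same_byte.decode("iso-8859-1")  # Latin a with acute accent

print(repr(utf8_bytes), repr(greek_bytes))
print(ascii(as_greek), ascii(as_latin1))
```

Nothing in the bytes themselves says which reading is the right one, which is why the editor has to be told.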
>> b) The python compiler has to interpret the bytes it finds (spec. within
>> string literals and comments), and decode them into unicode for its own
>> work. It uses the 'coding:' comment to decide how to do this. But once
>> the file has been compiled, that comment is totally irrelevant, and ignored.
> What is a "String Literal" ?
In python, a string literal is enclosed by single quotes, double quotes,
or triple quotes. For example:
myvar = u"tell me more"
myvar = u'hello world'
The u prefix is used in python 2.x to make the literal a Unicode string; it's
not needed in 3.x and I forget which one you're using.
These are affected by the coding comment, but
myvar = myfile.readline()
is not.
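A minimal sketch of the readline() case, in Python 3 syntax: data read from a file at run time arrives as bytes, and the program, not the coding comment, decides how to decode it. The io.BytesIO object here is only a stand-in for a real file opened in binary mode.

```python
import io

# Hypothetical stand-in for a file opened in binary mode; the bytes are
# UTF-8 for two Greek letters followed by a newline.
myfile = io.BytesIO(b"\xce\xb1\xce\xb2\n")

raw = myfile.readline()     # bytes, untouched by any coding comment
text = raw.decode("utf-8")  # the program chooses the encoding itself

print(len(raw), len(text))
```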
> Basically, if I understood you right, this line of code tells Python
> the opposite thing from (a).
> (a) told the editor how to store data to the media, while (b) tells
> the python compiler how to retrieve these data from the media (how to
> read it, that is!) Right?
>> c1) Your python code has to decide how to encode its information when
>> writing to stdout. There are several ways to accomplish that.
> What other ways except the print '''Content-Type blah blah... ''' ?
You can use the write() method of sys.stdout, or various equivalents,
such as the one produced by io.open(). You can probably also use
But probably the easiest is to do something like:
import codecs, sys
sys.stdout = codecs.getwriter('utf8')(sys.stdout)
and then print to stdout will use the utf8 encoding for its output.
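The same idea can be tried without touching the real stdout. This is a sketch in Python 3 syntax, where io.BytesIO stands in for a byte-oriented output stream (in Python 3 the byte-level stream would be sys.stdout.buffer rather than sys.stdout itself).

```python
import codecs
import io

# Stand-in for a byte-oriented stdout.
byte_stream = io.BytesIO()

# Wrap the byte stream so that text written to it is encoded as UTF-8.
writer = codecs.getwriter("utf8")(byte_stream)
writer.write("Content-Type: text/html; charset=UTF-8\n\n")
writer.write("\u03b1\u03b2\u03b3\n")  # three Greek letters

sent = byte_stream.getvalue()
print(repr(sent))
```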
>> c2) The browser sees only what was sent to stdout, starting with the
>> "Content-Type..." line. It uses that line to decide how to decode the
>> rest of the stream. Let me reemphasize, the browser does not see any of
>> the python code, or comments.
> I was under the impression that the stdout of a cgi python script was
> the web server itself, since this is the one app that waits for the
> data to be displayed.
> When a python script runs it produces html output that time or only
> after the
> python's output to the Web Server the html output is produced?
I don't understand your wording. Certainly the server launches the
python script, and captures stdout. It then sends that stream of bytes
out over tcp/ip to the waiting browser. You ask when it becomes html?
I don't think the question has meaning.
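What arrives at the other end is just bytes. As a rough sketch (Python 3 syntax), a receiver could split the header from the body at the first blank line and decode the body using the charset named in the header. The function name decode_cgi_output and the fallback charset are my own assumptions, and real browsers and servers do a good deal more than this.

```python
def decode_cgi_output(raw: bytes) -> str:
    """Split captured CGI output at the first blank line and decode the body."""
    header_bytes, _, body_bytes = raw.partition(b"\n\n")
    charset = "iso-8859-1"  # assumed fallback for this sketch
    for line in header_bytes.decode("ascii").splitlines():
        name, _, value = line.partition(":")
        if name.strip().lower() == "content-type" and "charset=" in value:
            # Take the text after "charset=", up to any following parameter.
            charset = value.split("charset=", 1)[1].split(";", 1)[0].strip()
    return body_bytes.decode(charset)


captured = b"Content-Type: text/html; charset=UTF-8\n\n<p>\xce\xb1</p>\n"
page = decode_cgi_output(captured)
print(ascii(page))
```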
> And something else please.
> My cgi python scripts contain English and Greek letters, hence this
> is an indication to tell the editor to save the file to disk as
> utf-8, right?
> Well, I told Notepad++ to save it as ASCII and also removed the '# -*-
> coding: utf-8 -*-' line,
> and only used print ''' Content-Type: text/html; charset=UTF-8 /n'''
> So, how did the editor manage to save the file as ASCII although my file
> contains characters that are beyond the usual 7-bit ASCII set?
I don't know Notepad++, so I don't know how it handles a character
outside the legal ASCII range. So I'd only be guessing. But I'm guessing
it ignored the ASCII restriction, and just wrote the bottom 8 bits of
each character. That'll work for some of the non-ASCII characters.
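To show what "the bottom 8 bits" would mean, here is a small sketch in Python 3 syntax. It illustrates only my guess above, not what Notepad++ actually does, and low_byte is a made-up helper name.

```python
def low_byte(ch: str) -> int:
    """Keep only the bottom 8 bits of a character's code point."""
    return ord(ch) & 0xFF


e_acute = "\u00e9"  # fits in 8 bits, so nothing is lost
alpha = "\u03b1"    # Greek alpha, needs more than 8 bits

print(hex(low_byte(e_acute)))  # 0xe9
print(hex(low_byte(alpha)))    # 0xb1

# Read back as Latin-1, the truncated alpha is a different character.
round_trip = bytes([low_byte(alpha)]).decode("iso-8859-1")
print(ascii(round_trip))
```

Characters up to U+00FF survive this treatment if the file is later read as Latin-1; Greek letters do not.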
> And how could the python compiler 'read them and execute them'?
I'd only be speculating, since I've seen only a few lines of your
source. Perhaps you're using Python 2.x, and not specifying u"" for
those literals, which is unreasonable, but does tend to work for *some*
of the second 128 characters.
> I should have saved in utf-8 and had the line inside the script so
> the compiler knew to open it as utf-8. How come it did work as ASCII
> both in storing and retrieving!!
Since you have the setup that shows this effect, why not take a look at
the file, and see whether there are any non-ASCII characters (codes
above hex 7f) in it? And whether there's a BOM. Then you can examine
the unicode characters produced by changing your source code.
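One way to take that look, sketched in Python 3 syntax. The helper name inspect_bytes and the sample bytes are invented for illustration; for a real script you would read the file in binary mode instead.

```python
import codecs


def inspect_bytes(data: bytes):
    """Report whether data starts with a UTF-8 BOM and where non-ASCII bytes sit."""
    has_bom = data.startswith(codecs.BOM_UTF8)
    non_ascii = [(i, hex(b)) for i, b in enumerate(data) if b > 0x7F]
    return has_bom, non_ascii


# For a real script: data = open("script.py", "rb").read()
# ("script.py" is a placeholder name).
sample = b"print 'ab\xce\xb1'\n"
has_bom, non_ascii = inspect_bytes(sample)
print(has_bom, non_ascii)
```

If the list comes back empty, the file really is pure ASCII; otherwise the offsets show where to look.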
The more I think about it, the more I suspect your confusion comes
because maybe you're not using the u-prefix on your literals. That can
lead to some very subtle bugs, and code that works for a while, then
fails in inexplicable ways.
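A sketch of the kind of surprise I mean, written in Python 3 syntax because I don't know which version you run. In Python 2, a literal without the u prefix holds whatever bytes the editor saved, while a u-prefixed literal holds characters. Python 3's bytes and str types show the same difference; the sample assumes the file was saved as UTF-8.

```python
# UTF-8 bytes for three Greek letters, as a Python 2 literal without the
# u prefix would hold them if the source file was saved as UTF-8.
as_bytes = b"\xce\xb1\xce\xb2\xce\xb3"

# The same text as characters, as a u-prefixed literal would hold it.
as_text = as_bytes.decode("utf-8")

print(len(as_bytes))  # 6
print(len(as_text))   # 3

# Slicing the bytes cuts a character in half; slicing the text does not.
first_of_bytes = as_bytes[:1]
first_of_text = as_text[:1].encode("utf-8")
print(first_of_bytes == first_of_text)  # False
```

Code that counts, slices, or compares such strings gives different answers depending on which kind it has.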