Trying to set a cookie within a python script

Dave Angel davea at ieee.org
Tue Aug 3 14:00:17 EDT 2010


¯º¿Â wrote:
>> On 3 Aug, 18:41, Dave Angel <da... at ieee.org> wrote:
>>     
>>> Different encodings equal different ways of storing the data to the
>>> media, correct?
>>>       
>> Exactly. The file is a stream of bytes, and Unicode has more than 256
>> possible characters. Further, even the subset of characters that *do*
>> take one byte are different for different encodings. So you need to tell
>> the editor what encoding you want to use.
>>     
>
> For example an 'a' char in iso-8859-1 is stored different than an 'a'
> char in iso-8859-7 and an 'a' char of utf-8 ?
>
>
>   
Nope, the ASCII subset is identical. It's the ones between 80 and ff 
that differ, and of course not all of those. Further, every character 
above 7f that is one byte in 8859 takes two or more bytes in utf-8.
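
To make that concrete, here is a small sketch. It is written for Python 3, where str is the unicode type and bytes is the 8 bit type (on 2.x the text literals would need a u prefix). The choice of alpha as the sample character is just mine, for illustration.

```python
import unicodedata

# The ASCII letter 'a' is the same single byte in all three encodings.
ascii_results = {enc: "a".encode(enc) for enc in ("iso-8859-1", "iso-8859-7", "utf-8")}
print(ascii_results)

# A Greek alpha is one byte in iso-8859-7 but two bytes in utf-8.
alpha = "\u03b1"  # GREEK SMALL LETTER ALPHA
greek_byte = alpha.encode("iso-8859-7")
utf8_bytes = alpha.encode("utf-8")
print(greek_byte, utf8_bytes)

# The very same byte value names a different character in iso-8859-1.
as_latin1 = greek_byte.decode("iso-8859-1")
print(unicodedata.name(as_latin1))
```

So the byte alone doesn't tell you the character; you also have to know which encoding it was written in.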

You *could* just decide that you're going to hardwire the assumption 
that you'll be dealing with a single character set that does fit in 8 
bits, and most of this complexity goes away. But if you do that, do 
*NOT* use utf-8.

But if you do want to be able to handle more than 256 characters, or 
more than one encoding, read on.

Many people confuse encoding and decoding. A unicode character is an 
abstraction which represents a raw character. For convenience, the first 
128 code points map directly onto the 7 bit encoding called ASCII. But 
before Unicode there were several other extensions of ASCII to 256 
codes, which were incompatible with each other. For example, a byte 
which might be a European character in one such encoding might be a 
katakana character in another one. Each encoding was 8 bits, but it was difficult for a 
single program to handle more than one such encoding.

So along comes unicode, which is typically implemented in 16 or 32 bit 
cells. And it has an 8 bit encoding called utf-8 which uses one byte for 
the first 128 characters (the ASCII range), two bytes for code points up 
to U+07FF, three bytes up to U+FFFF, and four bytes beyond that.
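
If you'd rather check the utf-8 widths than take my word for it, a quick sketch like this will do. It assumes Python 3, and the sample characters are my own picks, nothing special about them.

```python
samples = [
    ("a", "ASCII letter, U+0061"),
    ("\u03b1", "Greek alpha, U+03B1"),
    ("\u20ac", "euro sign, U+20AC"),
    ("\U0001d11e", "musical G clef, U+1D11E"),
]

# Number of bytes utf-8 needs for each sample character.
widths = [len(ch.encode("utf-8")) for ch, _ in samples]
for (_, label), width in zip(samples, widths):
    print("%-26s %d byte(s)" % (label, width))

# Probe the code points on either side of each width change.
boundaries = [(cp, len(chr(cp).encode("utf-8")))
              for cp in (0x7F, 0x80, 0x7FF, 0x800, 0xFFFF, 0x10000)]
print(boundaries)
```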

You encode unicode to utf-8, or to 8859, or to ...
You decode utf-8 or 8859, or cp1252 , or ... to unicode
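
A minimal sketch of the two directions, assuming Python 3 (str is unicode, bytes is the encoded form). The Greek word is the one used elsewhere in this thread, written with escapes so the example doesn't depend on how the file itself was saved.

```python
# The Greek word from this thread, spelled with escapes so that the
# encoding of this source file doesn't matter.
text = "\u03ba\u03b1\u03bb\u03b7\u03bc\u03ad\u03c1\u03b1"

# Encoding: unicode -> bytes. The result depends on the encoding chosen.
utf8_bytes = text.encode("utf-8")
greek_bytes = text.encode("iso-8859-7")
print(len(text), len(utf8_bytes), len(greek_bytes))

# Decoding: bytes -> unicode. Use the same encoding and the text comes back.
assert utf8_bytes.decode("utf-8") == text
assert greek_bytes.decode("iso-8859-7") == text

# Decode with the wrong encoding and you get either garbage or an error.
garbage = utf8_bytes.decode("iso-8859-1")
try:
    greek_bytes.decode("utf-8")
    decode_failed = False
except UnicodeDecodeError:
    decode_failed = True
print(garbage != text, decode_failed)
```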

>>> What is a "String Literal" ?
>>>       
>> In python, a string literal is enclosed by single quotes, double quotes,
>> or triples.
>> myvar ="tell me more"
>> myvar ='hello world'
>> The u prefix is used in python 2.x to convert to Unicode; it's not
>> needed in 3.x and I forget which one you're using.
>>     
>
> I use Python 2.4 and never used the u prefix.
>
>   
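Then you'd better hope you never manipulate those literals. For example, 
a Greek letter saved as utf-8 occupies two bytes, so len() and slicing 
count bytes rather than characters, and a slice can cut a character in 
half. In some other multi-byte encodings (Shift-JIS, for one) the second 
byte of a character can even be an ASCII symbol such as a backslash, 
which would mess up string handling.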
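
Here is one way such a literal can bite, as a rough sketch. I'm assuming Python 3 here, so the bytes object stands in for what a plain (non-u) literal holds in 2.x when the file was saved as utf-8.

```python
word = "\u03ba\u03b1\u03bb\u03b7\u03bc\u03ad\u03c1\u03b1"  # unicode text
raw = word.encode("utf-8")  # the bytes a non-u literal would hold

# Lengths disagree: one counts characters, the other counts bytes.
print(len(word), len(raw))

# Slicing the unicode object keeps whole characters.
first_three_chars = word[:3]

# Slicing the bytes can stop in the middle of a character.
first_three_bytes = raw[:3]
try:
    first_three_bytes.decode("utf-8")
    slice_is_valid = True
except UnicodeDecodeError:
    slice_is_valid = False
print(len(first_three_chars), slice_is_valid)
```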
> I still don't understand the difference between a 'string' and a
> 'string literal'
>
>   
A string is an object containing characters. A string literal is one of 
the ways you create such an object. When you create it that way, you 
need to make sure the compiler knows the correct encoding, by putting a 
coding: comment line on the first or second line of the file.
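
For instance, a file saved as utf-8 might start like this. This is only a sketch: the variable name is made up, and it assumes your editor really does save the file as utf-8.

```python
# -*- coding: utf-8 -*-
# The declaration above has to sit on the first or second line of the file,
# and the editor must really save the file in that encoding.

greeting = u"καλημέρα"  # the u prefix is required on 2.x, harmless on 3.3+

# With the declaration matching the file, the literal holds Greek letters,
# not whatever bytes happened to be on disk.
code_points = [hex(ord(ch)) for ch in greeting]
print(code_points)
```

If the declaration says one encoding and the editor saves another, the compiler will either complain or quietly build the wrong characters.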
> If I save a file as iso-8859-1 but in some of my variables I use Greek
> characters, instead of telling the browser to change encoding and saving
> the file as utf-8, can I just use the u prefix like your examples to
> save the variables as iso-8859-1?
>
>   
>> I don't understand your wording. Certainly the server launches the
>> python script, and captures stdout. It then sends that stream of bytes
>> out over tcp/ip to the waiting browser. You ask when does it become html
>> ? I don't think the question has meaning.
>>     
>
> The http client sends a request to the http server (apache), apache calls
> the python interpreter, and python calls mysql to handle SQL queries, right?
>
> My question is what is the difference of the python's script output
> and the web server's output to the http client?
>
>   
The web server wraps a little HTTP framing around your html stream (a 
status line and some headers of its own in front), but it shouldn't 
touch the stream itself.
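
As a rough illustration, assuming plain CGI: the script writes its own header lines, then a blank line, then the html, and everything after the blank line is the stream the server should pass along untouched. The helper name build_cgi_output is hypothetical, and the sketch is written for Python 3.

```python
def build_cgi_output(html_text, charset="utf-8"):
    """Return the bytes a CGI script would write to stdout (hypothetical helper)."""
    # Header lines are plain ASCII; a blank line ends the header section.
    header = "Content-Type: text/html; charset=%s\r\n\r\n" % charset
    # The body is encoded with the same charset the header announces.
    return header.encode("ascii") + html_text.encode(charset)


html = "<html><body><p>\u03ba\u03b1\u03bb\u03b7\u03bc\u03ad\u03c1\u03b1</p></body></html>"
page = build_cgi_output(html)

# Split the output back into the script's header part and the html body.
headers, _, body = page.partition(b"\r\n\r\n")
print(headers)
print(len(body))
```

The point is that the charset named in the header and the encoding actually used for the body have to agree, since the browser only sees bytes.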

> Who is producing the html code? the python output or the apache web
> server after it receive the python's output?
>
>
>   
see above.
>> The more I think about it, the more I suspect your confusion comes
>> because maybe you're not using the u-prefix on your literals. That can
>> lead to some very subtle bugs, and code that works for a while, then
>> fails in inexplicable ways.
>>     
>
> I'm not sure what exactly they do just yet.
>
> For example if I say mymessage = "καλημέρα" and then I say
> mymessage = u"καλημέρα" then the 1st one is a Greek-encoded variable
> while the 2nd is a utf-8 one?
>
>   
No, the first is an 8 bit copy of whatever bytes your editor happened to 
save. The second is unicode, which may be either 16 or 32 bits per 
character, depending on how your Python was built (which tends to follow 
the OS platform). Neither is utf-8.
> So one script can be in some encoding and some parts of the script
> like the 2nd variable can be in another?
>
>   
mymessage = u"καλημέρα"

creates an object that is *not* encoded. Encoding is taking the unicode 
stream and representing it as a stream of bytes, which may or may not 
have more bytes than the original has characters.

> ============================
> Also can you please help me in my cookie problem as to why only the
> else block executed each time and never the if?
>
> here is the code:
>
> [code]
> if os.environ.get('HTTP_COOKIE') and cookie.has_key('visitor') = 'nikos':		#if visitor cookie exist
> 	print "ΑΠΟ ΤΗΝ ΕΠΟΜΕΝΗ ΕΠΙΣΚΕΨΗ ΣΟΥ ΘΑ ΣΕ ΥΠΟΛΟΓΙΖΩ ΩΣ ΕΠΙΣΚΕΠΤΗ ΑΥΞΑΝΟΝΤΑΣ ΤΟΝ ΜΕΤΡΗΤΗ!"
> 	cookie['visitor'] = 'nikos', time() - 1 )		#this cookie will expire now
> else:
> 	print "ΑΠΟ ΔΩ ΚΑΙ ΣΤΟ ΕΞΗΣ ΔΕΝ ΣΕ ΕΙΔΑ, ΔΕΝ ΣΕ ΞΕΡΩ, ΔΕΝ ΣΕ ΑΚΟΥΣΑ! ΘΑ ΕΙΣΑΙ ΠΛΕΟΝ Ο ΑΟΡΑΤΟΣ ΕΠΙΣΚΕΠΤΗΣ!!"
> 	cookie['visitor'] = 'nikos', time() + 60*60*24*365 )		#this cookie will expire in an year
> [/code]
>
> How do i check if the cookie is set and why if set never gets unset?!
>
>   
I personally haven't done any cookie code. If I were debugging this, I'd 
factor out the multiple parts of that if statement, and find out which 
one isn't true. From here I can't guess.
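
A sketch of what I mean by factoring it out, with the caveat that I haven't run anything like it against a real server. It uses SimpleCookie from http.cookies (Python 3; on 2.x the same class lives in the Cookie module). The function names, and the choice of max-age to expire the cookie, are my own assumptions, not something from your code.

```python
from http.cookies import SimpleCookie  # on Python 2.x: from Cookie import SimpleCookie


def check_visitor(environ):
    """Evaluate each part of the condition separately so each can be inspected."""
    raw = environ.get("HTTP_COOKIE", "")
    cookie = SimpleCookie(raw)
    has_header = bool(raw)
    # A membership test only answers yes/no; the value is read separately.
    has_visitor = "visitor" in cookie
    is_nikos = has_visitor and cookie["visitor"].value == "nikos"
    return has_header, has_visitor, is_nikos


def set_cookie_header(expire_now):
    """Build the Set-Cookie header line; it must be sent before the blank line."""
    cookie = SimpleCookie()
    cookie["visitor"] = "nikos"
    cookie["visitor"]["max-age"] = 0 if expire_now else 60 * 60 * 24 * 365
    return cookie.output()


# Stand-in for os.environ, so the pieces can be tried without a web server.
print(check_visitor({"HTTP_COOKIE": "visitor=nikos"}))
print(check_visitor({}))
print(set_cookie_header(expire_now=True))
```

Printing the three values separately on a real request should show which part of the condition is the one that never comes out true.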

DaveA


