<div class="gmail_quote">2010/3/5 Dave Angel <span dir="ltr">&lt;<a href="mailto:davea@ieee.org">davea@ieee.org</a>&gt; </span><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex;">


In other words, you don&#39;t understand my paragraph above. </blockquote><div><br></div><div>Maybe. But please don&#39;t be angry. I&#39;m here to learn, and as I&#39;ve run into a very difficult concept, I want to fully understand it.</div>

<div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex;">Once the string is stored in t as an 8 bit string, it&#39;s irrelevant what the source file encoding was.</blockquote>

<div><br></div><div>OK, you&#39;ve said this twice, but can you please tell me why? I think that&#39;s the key to understanding how string encoding works. The source file encoding affects every line of the file, including string literals. If my encoding is UTF-8, Python will read the string &quot;ciao è ciao&quot; as &#39;ciao \xc3\xa8 ciao&#39;, but if it&#39;s Latin-1 it will read &#39;ciao \xe8 ciao&#39;. So how can it be irrelevant?</div>
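<div><br></div><div>A minimal sketch of the two byte sequences described above. It assumes Python 3, where <code>bytes</code> plays the role of the Python 2 8-bit string; the variable names are only illustrative. It also hints at why the original encoding stops mattering once the bytes are stored: the bytes object keeps no record of the codec that produced it.</div>

```python
# Python 3 sketch: bytes stands in for the Python 2 8-bit str.
text = "ciao \u00e8 ciao"  # "ciao è ciao", written with an escape to stay ASCII-only

# The same text yields different bytes depending on the encoding used.
utf8_bytes = text.encode("utf-8")
latin1_bytes = text.encode("latin-1")

print(utf8_bytes)    # b'ciao \xc3\xa8 ciao'
print(latin1_bytes)  # b'ciao \xe8 ciao'

# A bytes object carries no label saying which encoding produced it.
# Decoding with the right codec recovers the text; decoding with the
# wrong one silently produces different characters.
right = utf8_bytes.decode("utf-8")
wrong = utf8_bytes.decode("latin-1")

print(right == text)  # True
print(wrong == text)  # False
```

<div>Whoever decodes the bytes later has to supply the encoding; nothing stored in the bytes themselves can supply it.</div>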

<div><br></div><div>I think the problem is that I can&#39;t find any difference between the two cases quoted above:</div><div><br></div><div>a = u&quot;ciao è ciao&quot;</div><div><br></div><div>and</div><div><br></div><div>a = &quot;ciao è ciao&quot;</div>

<div>a = unicode(a)</div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex;">If you then (whether it&#39;s in the next line, or ten thousand calls later) try to convert to unicode without specifying a decoder, it uses the default encoder, which is an application-wide thing, and not a source file thing.  To see what it is on your system, use sys.getdefaultencoding().<br>

</blockquote><div><br></div><div>And this is OK. Spir said that it uses ASCII; you now say that it uses the default encoder. So I think ASCII is the default encoder on Spir&#39;s system.</div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex;">

The point is that there isn&#39;t just one global value, and it&#39;s a good thing.  You should figure out everywhere characters come into your program (e.g. source files, raw_input, file I/O...) and everywhere characters go out of your program, and deal with each of them individually.</blockquote>
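<div><br></div><div>The advice quoted above can be sketched as follows. This is only an illustration under stated assumptions: it is written for Python 3, the helper names <code>read_text</code> and <code>write_text</code> are hypothetical, and the Latin-1 input is made up for the example. It decodes bytes where they enter the program, works with text internally, and encodes again on the way out. It also shows that an ASCII decode, which is what Python 2 typically falls back to when <code>unicode(a)</code> is called without an encoding, fails on the non-ASCII byte.</div>

```python
def read_text(raw: bytes, encoding: str) -> str:
    """Hypothetical input boundary: decode bytes with an explicit encoding."""
    return raw.decode(encoding)


def write_text(text: str, encoding: str) -> bytes:
    """Hypothetical output boundary: encode text with an explicit encoding."""
    return text.encode(encoding)


# Assumed input: bytes that came from a Latin-1 encoded source.
incoming = b"ciao \xe8 ciao"

# Decode once at the boundary, then work only with text internally.
text = read_text(incoming, "latin-1")

# Encode at the output boundary, choosing the encoding for that destination.
outgoing = write_text(text, "utf-8")
print(outgoing)  # b'ciao \xc3\xa8 ciao'

# Decoding as ASCII cannot handle the byte 0xe8 and raises an error.
try:
    incoming.decode("ascii")
    ascii_ok = True
except UnicodeDecodeError:
    ascii_ok = False
print(ascii_ok)  # False
```

<div>Each input and output names its own encoding, so no single global default has to be right for all of them.</div>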

<div><br></div><div>OK. But it always happens this way. I hardly ever have to work with strings defined in the file.</div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex;">

Don&#39;t store anything internally as strings, and you won&#39;t create the ambiguity you have with your &#39;t&#39; variable above.<br>

<br>

DaveA<br></blockquote><div><br></div><div>Thank you, Dave</div><div><br></div><div>Giorgio </div></div><br><br clear="all"><br>-- <br>--<br>AnotherNetFellow<br>Email: <a href="mailto:anothernetfellow@gmail.com">anothernetfellow@gmail.com</a><br>