<html>
<head>
<meta content="text/html; charset=UTF-8" http-equiv="Content-Type">
</head>
<body bgcolor="#FFFFFF" text="#000000">
<div class="moz-cite-prefix">On 7/6/2013 4:01 AM, Cameron Simpson
wrote:<br>
</div>
<blockquote cite="mid:20130607010122.GA91151@cskk.homeip.net"
type="cite">
<pre wrap="">On 06Jun2013 11:46, Νίκος Γκρ33κ <a class="moz-txt-link-rfc2396E" href="mailto:nikos.gr33k@gmail.com"><nikos.gr33k@gmail.com></a> wrote:
| On Thursday, 6 June 2013 3:44:52 PM UTC+3, Steven D'Aprano wrote:
| > py> s = '999-Eυχή-του-Ιησού'
| > py> bytes_as_utf8 = s.encode('utf-8')
| > py> t = bytes_as_utf8.decode('iso-8859-7', errors='replace')
| > py> print(t)
| > 999-EΟΟΞ�-ΟΞΏΟ-ΞΞ·ΟΞΏΟ
|
| Does errors='replace' mean: don't break in case of an error?
Yes. The result will be correct for input that really is iso-8859-7, and
slightly mangled (undecodable bytes become U+FFFD, the replacement
character) for anything that would not decode smoothly.</pre>
</blockquote>
How can it be correct? We have encoded our string in utf-8 and then
we tried to decode it as greek-iso. How can this possibly be
correct?<br>
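<br>
A minimal sketch of why this decode does not raise, assuming Python 3
and a one-letter string chosen here only for illustration: every byte is
looked up in the iso-8859-7 table on its own, so the decode goes through
even though the bytes were produced by utf-8, and only a byte with no
entry in that table needs errors='replace'.<br>
<pre wrap="">
```python
# Sketch only: decode UTF-8 bytes with the wrong (single-byte) codec.
s = 'ή'                      # one Greek letter, U+03AE (illustrative choice)
raw = s.encode('utf-8')      # two bytes: 0xCE 0xAE

# Strict decoding fails, because byte 0xAE has no meaning in ISO-8859-7.
try:
    raw.decode('iso-8859-7')
    strict_failed = False
except UnicodeDecodeError:
    strict_failed = True

# errors='replace' swaps each undecodable byte for U+FFFD and carries on.
mangled = raw.decode('iso-8859-7', errors='replace')

# Only the codec that was used for encoding recovers the original text.
restored = raw.decode('utf-8')

print(strict_failed, [hex(ord(c)) for c in mangled], restored == s)
```
</pre>
The decode "succeeds" only in the sense that it returns some string; the
original text comes back only when the bytes are decoded with the charset
that encoded them.<br>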
<blockquote cite="mid:20130607010122.GA91151@cskk.homeip.net"
type="cite">
<pre wrap="">
| You took the unicode string 's' and encoded it to a utf-8 bytestring.
| Then how is it possible to ask for the utf-8 bytestring to be decoded
| back to a unicode string using a different charset than the
| one used for encoding, and have this actually print the filename in
| greek-iso?
It is easily possible, as shown above. Does it make sense? Normally
not, but Steven is demonstrating how your "mv" exercises have
behaved: a rename using utf-8, then a _display_ using iso-8859-7.</pre>
</blockquote>
Same as above, I don't understand it at all, since different
charsets (encodings) are used in the encode/decode process.<br>
<blockquote cite="mid:20130607010122.GA91151@cskk.homeip.net"
type="cite">
<pre wrap="">
|
| a) WHAT does it mean when a linux system is set to use utf-8?
It means the locale settings _for the current process_ are set for
UTF-8. The "locale" command will show you the current state.</pre>
</blockquote>
Does that mean that, when a Linux application needs to save a filename
to the Linux filesystem, the app checks the filesystem's 'locale' so
as to encode the filename using the utf-8 charset?<br>
And likewise, when a Linux application wants to decode a filename, does
it also check the filesystem's 'locale' setting to know which
charset it must use to decode the filename correctly back to the
original string?<br>
<br>
So the locale is used by the filesystem itself and by Linux apps to know
how to read (decode) and write (encode) filenames from/into the system's
hdd?<br>
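<br>
A small sketch, assuming Python 3, of where a process could look for
these settings. As the quoted reply says, the locale belongs to the
current process (normally inherited through environment variables), and
the names below are only the usual ones, not a complete list. The sample
filename is an illustrative assumption.<br>
<pre wrap="">
```python
import locale
import os
import sys

# Environment variables a process usually inherits; unset ones are skipped.
locale_env = {name: os.environ[name]
              for name in ('LC_ALL', 'LC_CTYPE', 'LANG')
              if name in os.environ}

# Encoding Python uses to turn str filenames into bytes for the OS.
fs_encoding = sys.getfilesystemencoding()

# Encoding Python assumes for text I/O when none is given explicitly.
preferred = locale.getpreferredencoding(False)

# The same str becomes different bytes under different encodings.
name = 'του'                           # illustrative filename
as_utf8 = name.encode('utf-8')         # six bytes
as_greek = name.encode('iso-8859-7')   # three bytes

print(locale_env, fs_encoding, preferred)
print(as_utf8, as_greek)
```
</pre>
The bytes are what end up stored as the filename; which characters they
stand for depends on the charset the reading process assumes.<br>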
<blockquote cite="mid:20130607010122.GA91151@cskk.homeip.net"
type="cite">
<pre wrap="">
| c) WHAT happens when the two of them try to work together?
If everything matches, it is all good. If the locales do not match,
the mismatch will result in an undesired bytes<->characters
encode/decode step somewhere, and something will display incorrectly
or be entered as input incorrectly.
</pre>
</blockquote>
<br>
I can't quite grasp the idea:<br>
<br>
local end: Win8, locale = greek-iso<br>
remote end: CentOS 6.4, locale = utf-8<br>
<br>
FileZilla by default uses "do not know what charset" to upload
filenames<br>
PuTTY by default uses greek-iso to display filenames<br>
<br>
<br>
WHAT can someone expect to happen when all of the above work
together?<br>
A mess of course, but I want to hear in detail each step of the mess
as it emerges.<br>
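<br>
One way to walk through the steps is to model each hop as an encode or a
decode, sketched below in Python 3. The function name, the sample
filename and the charsets tried are assumptions for illustration only;
in particular, the charset FileZilla really uses for the upload is left
as a parameter because it is not known here.<br>
<pre wrap="">
```python
# Sketch only: model each hop as an encode or decode with a named charset.
def displayed_name(name, upload_charset, display_charset):
    """Return what a terminal would show for a file uploaded as `name`."""
    # Hop 1: the upload client turns the name into bytes.
    stored = name.encode(upload_charset)
    # Hop 2: the server keeps those bytes unchanged as the filename.
    # Hop 3: the terminal turns the bytes back into characters.
    return stored.decode(display_charset, errors='replace')

name = 'του'  # illustrative filename
for upload in ('utf-8', 'iso-8859-7'):
    for display in ('utf-8', 'iso-8859-7'):
        shown = displayed_name(name, upload, display)
        status = 'ok' if shown == name else 'mangled'
        print(upload, display, status)
```
</pre>
In this model the name displays correctly only when the upload and
display charsets agree; any mismatch shows up at the decode step, while
the bytes stored on the server never change.<br>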
<br>
<div class="moz-signature">-- <br>
<a href="http://superhost.gr"><font color="blue">Webhost</font></a><font
color="blue">
<font color="lime"> &&
<a href="http://psariastonafro.wordpress.com"><font
color="red">Weblog</font></a></font></font></div>
</body>
</html>