Changing filenames from Greeklish => Greek (subprocess complain)
Steven D'Aprano
steve+comp.lang.python at pearwood.info
Sun Jun 9 02:45:50 EDT 2013
On Sat, 08 Jun 2013 22:09:57 -0700, nagia.retsina wrote:
> chr('A') would give me the mapping of this char, the number 65 while
> ord(65) would output the char 'A' likewise.
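Almost, but you have the two functions swapped: ord('A') returns the
number 65, and chr(65) returns the character 'A'. Python uses Unicode,
where code point 65 ("ordinal value 65") means the letter "A".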
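A quick check in a Python 3 interpreter shows which function goes in
which direction. This is only a small sketch; the variable names are
made up for illustration.

```python
# ord() takes a one-character string and returns its Unicode code point.
code_point = ord('A')

# chr() goes the other way: code point in, one-character string out.
letter = chr(65)

print(code_point, letter)  # 65 A

# The same pair of functions works for non-ASCII text, e.g. Greek.
name = "νίκος"
points = [ord(c) for c in name]
rebuilt = "".join(chr(p) for p in points)
print(points)
print(rebuilt == name)  # True
```

Calling chr('A') or ord(65) raises TypeError, since each function
accepts only the type the other one returns.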
There are older encodings. For example, a very old one, used on IBM
mainframes, is EBCDIC, where ordinal value 65 is not "A" at all (in the
common code page 037 it is a no-break space, with "â" next door at 66),
and the letter "A" has ordinal value 193.
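Python ships codecs for several EBCDIC code pages, so the "A is 193"
claim can be checked directly. The sketch below assumes code page 037
(codec name 'cp037'), which is only one member of the EBCDIC family;
other code pages assign some byte values differently.

```python
# The same character, encoded under two different encodings.
ascii_value = 'A'.encode('ascii')[0]    # ordinal value in ASCII
ebcdic_value = 'A'.encode('cp037')[0]   # ordinal value in EBCDIC (cp037)

print(ascii_value, ebcdic_value)  # 65 193

# The same byte, interpreted under two different encodings.
raw = bytes([193])
print(raw.decode('cp037'))    # A
print(raw.decode('latin-1'))  # Á
```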
> What would happen if we try to re-encode bytes on the disk? Like
> trying:
>
> s = "νίκος"
> utf8_bytes = s.encode('utf-8')
> greek_bytes = utf8_bytes.decode('utf-8').encode('iso-8859-7')
>
> Can we re-encode twice, or as many times as we want, and then decode
> back in the same way?
Of course. Bytes have no memory of where they came from, or what they are
used for. All you are doing is flipping bits on a memory chip, or on a
hard drive. So long as *you* remember which encoding is the right one,
there is no problem. If you forget, and start using the wrong one, you
will get garbage characters, mojibake, or errors.
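Here is a rough sketch of both outcomes in Python 3. Note that bytes
objects have no .encode() method there, so "re-encoding" really means
decoding to a string with the encoding you remember, then encoding
again with the new one. The variable names are just for illustration.

```python
s = "νίκος"
utf8_bytes = s.encode('utf-8')

# Re-encode: bytes -> str (using the encoding we remember) -> new bytes.
greek_bytes = utf8_bytes.decode('utf-8').encode('iso-8859-7')
print(len(utf8_bytes), len(greek_bytes))  # 10 5

# Remembering the right encoding gets the text back every time.
print(greek_bytes.decode('iso-8859-7') == s)  # True

# Using the wrong one gives mojibake: latin-1 accepts any byte, so no
# error is raised, but the result is not the original text.
mojibake = utf8_bytes.decode('latin-1')
print(mojibake == s)  # False

# Or it gives an error: these ISO-8859-7 bytes are not valid UTF-8.
try:
    greek_bytes.decode('utf-8')
    failed = False
except UnicodeDecodeError:
    failed = True
print(failed)  # True
```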
[...]
> And also, is there a difference between "encoding" and "compressing"?
Of course. They are totally unrelated.
> Isn't the latter using some form of encoding to make a string or bytes
> take less space on disk?
Correct, except forget about "encoding". It's not relevant (except,
maybe, in a mathematical sense) and will just confuse you.
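To make the distinction concrete, here is a small sketch using the
standard library's zlib module. Encoding turns text into bytes;
compression turns bytes into (hopefully) fewer bytes, and knows nothing
about text. The repeated sample string is only there so that there is
something to compress.

```python
import zlib

text = "νίκος " * 1000

# Encoding: str -> bytes. Needed before the text can be stored at all.
encoded = text.encode('utf-8')

# Compression: bytes -> fewer bytes. It never sees characters.
compressed = zlib.compress(encoded)
print(len(compressed) < len(encoded))  # True

# Undo the steps in reverse order: decompress first, then decode.
restored = zlib.decompress(compressed).decode('utf-8')
print(restored == text)  # True
```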
--
Steven