Changing filenames from Greeklish => Greek (subprocess complain)
Νικόλαος Κούρας
nikos.gr33k at gmail.com
Mon Jun 10 05:59:31 EDT 2013
> > >>> s = 'α'
> > >>> s.encode('utf-8')
> > b'\xce\xb1'
'b' stands for binary, right?
b'\xce\xb1' = are we looking at bytes in hexadecimal format?
If yes, how could we see them in binary and decimal representation?
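
For what it's worth, here is a minimal sketch of one way to view those bytes in decimal and binary, assuming Python 3. The b prefix marks a bytes literal, and each \xNN escape is one byte written in hex, so b'\xce\xb1' holds two bytes. The variable names are made up for illustration only.

```python
s = 'α'
encoded = s.encode('utf-8')   # b'\xce\xb1', i.e. two bytes

# Iterating over a bytes object yields plain integers in range 0..255.
as_decimal = list(encoded)                         # [206, 177]
as_hex = [format(b, '02x') for b in encoded]       # ['ce', 'b1']
as_binary = [format(b, '08b') for b in encoded]    # ['11001110', '10110001']

print(as_decimal)
print(as_hex)
print(as_binary)
```

The three lists describe the same two bytes; only the notation differs.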
> > I see that the encoding of this char takes 2 bytes. But why two exactly?
> > How do i calculate how many bits are needed to store this char into bytes?
> Because utf-8 takes 1 to 4 bytes to encode characters
Since 2^8 = 256, utf-8 should store the first 256 chars of the Unicode charset using 1 byte.
Also, since 2^16 = 65536, utf-8 should store the first 65536 chars of the Unicode charset using 2 bytes, and so on.
But I know that this is not the case, and I don't understand why.
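
A sketch that may help with the "why", assuming Python 3: utf-8 spends some bits of every byte on markers that say "this is a lead byte" or "this is a continuation byte", so fewer bits are left for the character itself. One byte (0xxxxxxx) carries 7 bits, two bytes (110xxxxx 10xxxxxx) carry 11, three carry 16, and four carry 21. The helper name utf8_length and the list of boundary code points are illustrative choices, not anything from the thread.

```python
def utf8_length(ch):
    """Number of bytes utf-8 uses for a single character."""
    return len(ch.encode('utf-8'))

# Code points on either side of each size boundary.
boundaries = [0x7F, 0x80, 0x7FF, 0x800, 0xFFFF, 0x10000, 0x10FFFF]
lengths = [utf8_length(chr(cp)) for cp in boundaries]
print(lengths)

# Building the two-byte form of 'α' (U+03B1) by hand:
cp = ord('α')                            # 945
lead = 0b11000000 | (cp >> 6)            # marker 110 + top 5 payload bits
cont = 0b10000000 | (cp & 0b00111111)    # marker 10 + low 6 payload bits
by_hand = bytes([lead, cont])
print(by_hand == 'α'.encode('utf-8'))
```

So only the first 128 code points (the ASCII range) fit in one byte, and two bytes reach only up to U+07FF, which is why 'α' needs two.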
> > >>> s = 'a'
> > >>> s.encode('utf-8')
> > b'a'
> utf-8 takes ASCII as it is, as 1 byte. They are the same
EBCDIC, ASCII and Unicode are character sets, correct?
iso-8859-1, iso-8859-7, utf-8, utf-16, utf-32 and so on are encoding methods, right?
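
A small sketch of that distinction, assuming Python 3: the character has one Unicode code point (a number), and each encoding turns that number into a different byte sequence. Note that the iso-8859-x standards define both a small character set and its one-byte mapping, so the split is not as clean there as it is for Unicode versus utf-8/16/32. The names below are illustrative only.

```python
ch = 'α'
code_point = ord(ch)   # 945, i.e. U+03B1: the character's number in Unicode

# Same character, different byte sequences depending on the encoding.
names = ['utf-8', 'utf-16-be', 'utf-32-be', 'iso-8859-7']
encoded = {name: ch.encode(name) for name in names}
for name, data in encoded.items():
    print(name, data, len(data))

# iso-8859-1 has no Greek letters at all, so encoding fails there.
try:
    ch.encode('iso-8859-1')
    latin1_ok = True
except UnicodeEncodeError:
    latin1_ok = False
print(latin1_ok)
```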