RE Module Performance
steve+comp.lang.python at pearwood.info
Sat Jul 27 08:28:56 CEST 2013
On Fri, 26 Jul 2013 08:46:58 -0700, wxjmfauth wrote:
> BTW, I'm pleased to read "sequence of bits" and not bytes. Again, utf
> transformers are producing sequence of bits, call Unicode Transformation
> Units, with lengths of 8/16/32 *bits*, from there the names utf8/16/32.
> UCS transformers are (were) producing bytes, from there the names
Not only does your distinction between bits and bytes make no practical
difference on nearly all hardware in common use today, but the Unicode
Consortium disagrees with you, and defines UTF in terms of bytes:
"A Unicode transformation format (UTF) is an algorithmic mapping from
every Unicode code point (except surrogate code points) to a unique byte
There may still be some old supercomputers in use where a byte is more
than 8 bits, but they're unlikely to support Unicode.
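
A minimal sketch of the point in Python 3, assuming only the standard
codecs "utf-8", "utf-16-le" and "utf-32-le" (the sample character is an
arbitrary choice for illustration): whatever the size of the code unit,
each encoder hands back a bytes object, and its length is counted in
8-bit bytes.

```python
# Illustrative only: encode one code point with each UTF and inspect the result.
text = "\u20ac"  # EURO SIGN, U+20AC (arbitrary sample character)

encoded = {
    name: text.encode(name)
    for name in ("utf-8", "utf-16-le", "utf-32-le")
}

for name, data in encoded.items():
    # Every result is a bytes object; len() counts 8-bit bytes, not bits.
    print(name, type(data).__name__, len(data), data.hex())
```

The code units are 8, 16 and 32 bits wide respectively, but in each case
what you get out is a sequence of bytes.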