(Simple?) Unicode Question

Steven D'Aprano steve at REMOVE-THIS-cybersource.com.au
Sat Aug 29 04:26:54 EDT 2009


On Sat, 29 Aug 2009 09:34:43 +0200, Thorsten Kampe wrote:

> * Rami Chowdhury (Thu, 27 Aug 2009 09:44:41 -0700)
>> > Further, does anything, except a printing device need to know the
>> > encoding of a piece of "text"?
> 
> Python needs to know if you are processing the text.

Python only needs to know when you convert the text to or from bytes. I 
can do this:

>>> s = "hello"
>>> t = "world"
>>> print(' '.join([s, t]))
hello world

and not need to care about encodings at all.

So long as your terminal has a sensible encoding, and you have a good 
quality font, you should be able to print any string you can create.



>> I may be wrong, but I believe that's part of the idea between
>> separation of string and bytes types in Python 3.x. I believe, if you
>> are using Python 3.x, you don't need the character encoding mumbo jumbo
>> at all ;-)
> 
> Nothing has changed in that regard. You still need to decode and encode
> text and for that you have to know the encoding.

You only need to worry about encodings when you convert from bytes to 
text, and vice versa. Admittedly, the most common time you need to do 
that is when reading input from files, but if all your text strings are 
generated by Python, and not output anywhere, you shouldn't need to care 
about encodings.
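
A minimal sketch of where the encoding actually matters, assuming 
Python 3 (the sample string and variable names are only illustrative): 
the text itself has no encoding, and one only has to be named at the 
point where it becomes bytes or is read back from bytes.

```python
# The string is a sequence of characters; no encoding is involved yet.
text = "caf\u00e9"  # 'café'

# Converting to bytes is where an encoding has to be chosen.
utf8_bytes = text.encode("utf-8")      # b'caf\xc3\xa9'
latin1_bytes = text.encode("latin-1")  # b'caf\xe9'

# Decoding with the same encoding that produced the bytes round-trips.
roundtrip = utf8_bytes.decode("utf-8")

# Decoding with a different encoding does not fail here, but gives
# the wrong characters: 'cafÃ©' instead of 'café'.
mojibake = utf8_bytes.decode("latin-1")

print(roundtrip == text, mojibake == text)
```

The same text gives different bytes under different encodings, which is 
why the encoding has to be known at the boundary and nowhere else.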

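If all your text contains nothing but ASCII characters, and every 
encoding you deal with is ASCII-compatible (as UTF-8, Latin-1 and most 
other common ones are), you should rarely need to worry about encodings 
at all.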
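
To illustrate the ASCII point, here is a rough sketch, again assuming 
Python 3. The list of codecs is my own choice for the example, not an 
exhaustive one. Pure-ASCII text encodes to the same bytes under the 
ASCII-compatible codecs, while UTF-16 is a counter-example worth 
keeping in mind.

```python
ascii_text = "hello world"

# Encode the same ASCII-only text with several ASCII-compatible codecs.
encoded = {
    name: ascii_text.encode(name)
    for name in ("ascii", "utf-8", "latin-1", "cp1252")
}

# All four produce identical bytes, so mixing them up does no harm here.
all_same = len(set(encoded.values())) == 1

# UTF-16 is not ASCII-compatible: each character takes two bytes.
utf16 = ascii_text.encode("utf-16-le")

print(all_same, len(encoded["utf-8"]), len(utf16))
```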


-- 
Steven


