[Tutor] Unicode encoding and raw_input() in Python 2.7 ?
Dave Angel
davea at davea.name
Sat Apr 18 02:37:10 CEST 2015
On 04/17/2015 04:39 AM, Samuel VISCAPI wrote:
> Hi,
>
> This is my first post to that mailing list if I remember correctly, so
> hello everyone !
>
Welcome to the list.
> I've been stuck on a simple problem for the past few hours. I'd just
> like raw_input to work with accentuated characters.
That should mean you want to use unicode.
If you're using raw_input, then you must be using Python 2.x. The easiest
first step to doing things right in Unicode would be to switch to
version 3.x. But I'll assume, for the duration of this message, that you
cannot do this.
>
> For example:
>
> firstname = str.capitalize(raw_input('First name: '))
raw_input gives you an encoded byte string, so if you're serious about
Unicode you'll need to decode it, using whatever encoding your console
device is using. If you don't know that encoding, you're in big trouble.
But if you're on Linux, chances are good that it's utf-8.
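A minimal sketch of that decoding step follows. The helper name
decode_console_bytes is made up for illustration, and falling back to
utf-8 when sys.stdin.encoding is unavailable is my assumption, not a
rule. The sample bytes stand in for what raw_input would return on a
utf-8 console.

```python
# -*- coding: utf-8 -*-
import sys

def decode_console_bytes(raw, encoding=None):
    """Decode a byte string typed at the console into unicode text.

    Hypothetical helper: if no encoding is passed, try the one the
    console reports, then fall back to utf-8 (an assumption).
    """
    encoding = encoding or getattr(sys.stdin, 'encoding', None) or 'utf-8'
    return raw.decode(encoding)

# In Python 2.7 the real call would look like:
#     firstname = decode_console_bytes(raw_input('First name: '))
# Here we simulate the utf-8 bytes a console would hand back for "Valérie".
sample = b'Val\xc3\xa9rie'
firstname = decode_console_bytes(sample, 'utf-8')
```

After this, firstname is a unicode object of 7 characters, not a str of
8 bytes.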
>
> where firstname could be "Valérie", "Gisèle", "Honoré", etc...
>
> I tried -*- coding: utf-8 -*-, u'', unicode(), but to no avail...
>
As Alan says, you're not telling us anything useful. "To no avail" is too
imprecise to act on. I'll comment on each of them anyway.
The coding statement applies only to literals you use in your source
code. It has nothing at all to do with the value returned by raw_input.
u'' likewise is used in your source code. It has nothing to do with
what the user may type into your program.
unicode() is a "function" that can decode a string received from
raw_input, provided you know what the encoding was. You can also
accomplish the same thing with the method str.decode().
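To make the two spellings concrete, here is a small sketch. It assumes
the console encoding is utf-8, and it uses literal bytes in place of an
actual raw_input call. The try/except shim is only there so the snippet
also runs under Python 3; on 2.7 it does nothing.

```python
# -*- coding: utf-8 -*-
try:
    unicode            # built in on Python 2
except NameError:
    unicode = str      # shim so the sketch also runs on Python 3

# utf-8 bytes for "Gisèle", standing in for the value of raw_input()
raw = b'Gis\xc3\xa8le'

via_function = unicode(raw, 'utf-8')   # decode with the unicode() "function"
via_method = raw.decode('utf-8')       # decode with the str.decode() method
```

Both names end up bound to the same 6-character unicode object.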
> I'm using str.capitalize and str.lower throughout my code, so I guess
> some encoding / decoding will also be necessary at some point.
Those apply to byte strings (str). But if you're doing it right, you
should have unicode objects long before you apply such methods. So you'd
want the unicode methods unicode.capitalize and unicode.lower instead.
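A quick sketch of why the order matters, again assuming utf-8 input and
the default "C" locale. Capitalizing the raw bytes leaves an accented
first letter alone, while capitalizing the decoded unicode object does
what you'd expect.

```python
# -*- coding: utf-8 -*-
# utf-8 bytes for "élodie", standing in for the value of raw_input()
raw = b'\xc3\xa9lodie'
text = raw.decode('utf-8')

# Byte-string method: only ASCII letters are affected, so the
# two-byte "é" at the front is passed through unchanged.
bytes_result = raw.capitalize()

# Unicode method: works on characters, so "é" becomes "É".
text_result = text.capitalize()
```

If the result has to go back to the console or a file, encode it again
at that point, e.g. text_result.encode('utf-8').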
--
DaveA