raw_input() and utf-8 formatted chars
Marc 'BlackJack' Rintsch
bj_666 at gmx.net
Fri Oct 12 16:43:43 EDT 2007
On Fri, 12 Oct 2007 13:18:35 -0700, 7stud wrote:
> On Oct 12, 1:18 pm, kyoso... at gmail.com wrote:
>> On Oct 12, 1:53 pm, 7stud <bbxx789_0... at yahoo.com> wrote:
>>
>> > s = 'A\xcc\x88' #capital A with umlaut
>> > print s #displays capital A with umlaut
>>
>> > s = raw_input('Enter: ') #A\xcc\x88
>> > print s #displays A\xcc\x88
>>
>> > print len(s) #9
>>
>> > It looks like every character of the string I enter in utf-8 is being
>> > interpreted literally as 9 separate characters rather than one
>> > character. How do I enter a capital A with an umlaut so that python
>> > treats it as one character?
>>
>> I don't know. This works for me:
>>
>>
>>
>> >>> x = raw_input('Enter: ')
>> Enter:
>> >>> len(x)
>> 1
>>
>> I'm using Python 2.4 with Default Source Encoding set to None on
>> Windows XP SP2.
>>
>> Mike
>
> Yeah, but what happens when you enter A\xcc\x88?

You mean literally!? Then of course I get A\xcc\x88 because that's what I
entered. In string literals in source code the backslash has a special
meaning but `raw_input()` does not "interpret" the input in any way.
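
To make the difference concrete, here is a minimal sketch, assuming Python 3 (where `raw_input()` is spelled `input()`). The variable names are made up for illustration, and the raw string only stands in for what you would type at the prompt:

```python
import unicodedata

# In source code, the compiler turns each \xNN escape into a single byte.
literal = b'A\xcc\x88'
print(len(literal))   # 3

# Typed at a prompt, the same text arrives as the characters themselves.
# The raw string stands in for the return value of raw_input()/input().
typed = r'A\xcc\x88'
print(len(typed))     # 9

# Interpreting the escapes has to be requested explicitly:
# escapes -> code points 0-255 -> the same values as bytes -> UTF-8 text.
as_bytes = typed.encode('ascii').decode('unicode_escape').encode('latin-1')
text = as_bytes.decode('utf-8')

# 'A' plus COMBINING DIAERESIS is still two code points; NFC composes them.
composed = unicodedata.normalize('NFC', text)
print(len(text), len(composed))   # 2 1
```

Note that even the decoded form is two code points (the letter plus a combining mark), so it only counts as "one character" after normalization.
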
> And what is it that your keyboard enters to produce an 'a' with an umlaut?

*I* just hit the ä key. The one right next to the ö key. ;-)
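
What the program receives for that key depends on the terminal's encoding. A small sketch, assuming Python 3; the three encodings are just an illustrative choice and not taken from anybody's actual setup in this thread:

```python
key = '\xe4'  # the character produced by the a-umlaut key

# Bytes a terminal might deliver for that key under different encodings.
encodings = ('utf-8', 'latin-1', 'cp850')
as_bytes = {enc: key.encode(enc) for enc in encodings}

# The number of bytes depends on the encoding ...
byte_lengths = {enc: len(raw) for enc, raw in as_bytes.items()}

# ... but decoding with the matching encoding always yields one character.
char_lengths = {enc: len(raw.decode(enc)) for enc, raw in as_bytes.items()}

print(byte_lengths)
print(char_lengths)
```

So a byte string of length 1 and one of length 2 can both be the same single character, depending on where the input came from.
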
Ciao,
Marc 'BlackJack' Rintsch