Unicode blues in Python3

Rami Chowdhury rami.chowdhury at gmail.com
Tue Mar 23 14:00:09 EDT 2010


On Tuesday 23 March 2010 10:33:33 nn wrote:
> I know that unicode is the way to go in Python 3.1, but it is getting
> in my way right now in my Unix scripts. How do I write a chr(253) to a
> file?
>
> #nntst2.py
> import sys,codecs
> mychar=chr(253)
> print(sys.stdout.encoding)
> print(mychar)

The following code works for me:

$ cat nnout5.py 
#!/usr/bin/python3.1

import sys
mychar = chr(253)
sys.stdout.write(mychar)
$ echo $(cat nnout)
ý

Can I ask why you're using print() in the first place, rather than writing 
directly to a file? As far as I know, Python 3.x distinguishes between text 
and binary files, and lets you specify the encoding used for the strings you 
write to a text file.
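
To make the text/binary distinction concrete, here is a minimal sketch. The
helper names (write_text, write_bytes, demo) and the file names are made up
for illustration, and it assumes a reasonably current Python 3 (it uses
tempfile.TemporaryDirectory). It writes chr(253) once through a text-mode
file opened with an explicit encoding and once as a raw byte through a
binary-mode file, then reads both back as bytes.

```python
import os
import tempfile


def write_text(path, text, encoding):
    # Text mode: str is encoded to bytes with the codec named here,
    # independent of the locale or of sys.stdout.encoding.
    with open(path, "w", encoding=encoding) as f:
        f.write(text)


def write_bytes(path, data):
    # Binary mode: bytes are written unchanged, no codec is involved.
    with open(path, "wb") as f:
        f.write(data)


def demo():
    # Hypothetical file names inside a throwaway directory.
    with tempfile.TemporaryDirectory() as tmp:
        text_path = os.path.join(tmp, "text.out")
        raw_path = os.path.join(tmp, "raw.out")
        write_text(text_path, chr(253), "latin-1")
        write_bytes(raw_path, bytes([253]))
        with open(text_path, "rb") as f:
            from_text = f.read()
        with open(raw_path, "rb") as f:
            from_raw = f.read()
    return from_text, from_raw


if __name__ == "__main__":
    print(demo())
```

Both files should end up holding the single byte 0xFD. Passing "utf-8" as
the encoding instead would store the same character as two bytes.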

Hope that helps,
Rami
> 
>  > ./nntst2.py
> 
> ISO8859-1
> ý
> 
>  > ./nntst2.py >nnout2
> 
> Traceback (most recent call last):
>   File "./nntst2.py", line 5, in <module>
>     print(mychar)
> UnicodeEncodeError: 'ascii' codec can't encode character '\xfd' in
> position 0: ordinal not in range(128)
> 
> > cat nnout2
> 
> ascii
> 
> ..Oh great!
> 
> OK, let's try this:
> #nntst3.py
> import sys,codecs
> mychar=chr(253)
> print(sys.stdout.encoding)
> print(mychar.encode('latin1'))
> 
> > ./nntst3.py
> 
> ISO8859-1
> b'\xfd'
> 
> > ./nntst3.py >nnout3
> > 
> > cat nnout3
> 
> ascii
> b'\xfd'
> 
> ..Eh... not what I want really.
> 
> #nntst4.py
> import sys,codecs
> mychar=chr(253)
> print(sys.stdout.encoding)
> sys.stdout=codecs.getwriter("latin1")(sys.stdout)
> print(mychar)
> 
>  > ./nntst4.py
> 
> ISO8859-1
> Traceback (most recent call last):
>   File "./nntst4.py", line 6, in <module>
>     print(mychar)
>   File "Python-3.1.2/Lib/codecs.py", line 356, in write
>     self.stream.write(data)
> TypeError: must be str, not bytes
> 
> ..OK, this is not working either.
> 
> Is there any way to write a value 253 to standard output?
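
On the nntst4.py attempt: judging by the traceback, codecs.getwriter("latin1")
encodes the string to bytes and then hands those bytes to sys.stdout, which
is a text stream and only accepts str. One possible way around that is to
wrap the underlying binary stream instead. The sketch below is only that, a
sketch: latin1_writer and emit are names invented for this example, and an
io.BytesIO stands in for the real binary stream so it can be run anywhere.

```python
import io


def latin1_writer(binary_stream):
    # Wrap a *binary* stream so that any str written to the wrapper
    # is encoded as Latin-1 before it reaches the stream.
    return io.TextIOWrapper(binary_stream, encoding="latin-1")


def emit(binary_stream, text):
    out = latin1_writer(binary_stream)
    out.write(text)
    out.flush()
    # Detach so the underlying stream is not closed when the
    # wrapper is garbage-collected.
    out.detach()


if __name__ == "__main__":
    buf = io.BytesIO()  # stand-in for a real binary stream
    emit(buf, chr(253))
    print(buf.getvalue())
```

In a real script the binary stream would be sys.stdout.buffer, for example
emit(sys.stdout.buffer, chr(253)). For a single raw byte with no text layer
at all, sys.stdout.buffer.write(bytes([253])) should also do, regardless of
what sys.stdout.encoding reports when output is redirected.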

----
Rami Chowdhury


