Unicode problem
Erik Max Francis
max at alcyone.com
Sat Jul 7 18:21:03 EDT 2007
pabloski at giochinternet.com wrote:
> Hi to all, I have a little problem with unicode handling under Python.
>
> I have this code
>
> import codecs
>
> s = u'A unicode string with this damn apostrophe \u2019'
>
> outf = codecs.open('filename.txt', 'w', 'iso-8859-15')
> outf.write(s)
>
> What I get is a UnicodeEncodeError that tells me that character \u2019
> maps to undefined.
>
> But the character \u2019 is the apostrophe, and in the Unicode table it has
> \u0027 as an equivalent, so the codec should convert \u2019 to \x27 (as
> defined in iso-8859-15)....
U+2019 is RIGHT SINGLE QUOTATION MARK. The Unicode charts list APOSTROPHE
(U+0027) as a cross-reference, meaning a similar code point, but the two
are not the same character and no codec will substitute one for the other.
Your problem is that ISO-8859-15 doesn't have the RIGHT SINGLE QUOTATION
MARK, so you'll have to do the translation yourself if you want to turn
it into a true APOSTROPHE.
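
One way to do that translation is sketched here, with some assumptions: it
uses Python 3 syntax (plain string literals rather than u'...'), the names
TYPOGRAPHIC_TO_ASCII and to_latin9 are made up for illustration, and mapping
U+2018 as well is my own addition. Treat it as a starting point rather than
a complete solution.

```python
# Hypothetical mapping from code point to replacement text.
# Extend it with whatever other characters your data contains.
TYPOGRAPHIC_TO_ASCII = {
    0x2018: "'",  # LEFT SINGLE QUOTATION MARK  -> APOSTROPHE
    0x2019: "'",  # RIGHT SINGLE QUOTATION MARK -> APOSTROPHE
}


def to_latin9(text):
    """Replace typographic quotes, then encode as ISO-8859-15."""
    return text.translate(TYPOGRAPHIC_TO_ASCII).encode("iso-8859-15")


s = "A unicode string with this damn apostrophe \u2019"
data = to_latin9(s)

# Write the already-encoded bytes, so no further encoding can fail here.
with open("filename.txt", "wb") as outf:
    outf.write(data)
```

Any character that is neither in the mapping nor in ISO-8859-15 will still
raise UnicodeEncodeError, which is probably what you want: it tells you
which characters you still have to decide about.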
--
Erik Max Francis && max at alcyone.com && http://www.alcyone.com/max/
San Jose, CA, USA && 37 20 N 121 53 W && AIM, Y!M erikmaxfrancis
She glanced at her watch ... It was 9:23.
-- James Clavell