[Expat-discuss] Windows-1252 and Latin-1

Fred L. Drake, Jr. fdrake at acm.org
Tue Jan 14 10:42:53 EST 2003


Baldur van Lew writes:
 > Unfortunately 1252 isn't the same as Latin-1 (ISO-8859-1) - this is a
 > constant source of confusion.

Given that I was relying on Microsoft's own documentation, I can see
where that would spring up.  ;-(

 > Specifically in the range 80-9F Windows 1252 has a number of characters
 > defined which do not appear in 8859-1 
 > 
 > "€‚ƒ„…†‡ˆ‰Š‹ŒŽ‘’“”•–—˜™š›œžŸ"  - if you're running windows check
 > the Character Map application with subset Windows Characters.

I've got it on my home box (unfortunately), which is about 7 hours
away.  ;-)
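
For readers without Character Map at hand, the 80-9F difference described
above can be checked with a short script.  This is only a sketch: it relies
on Python's standard "iso-8859-1" and "cp1252" codecs rather than on Expat,
and the helper names (describe, differing_bytes) are made up for
illustration.

```python
# Compare how single bytes decode under ISO-8859-1 and Windows-1252.
import unicodedata


def describe(byte_value, encoding):
    """Return the Unicode name of one byte decoded with `encoding`."""
    try:
        ch = bytes([byte_value]).decode(encoding)
    except UnicodeDecodeError:
        # The codec assigns no character to this byte.
        return "(undefined)"
    # Control characters have no Unicode name, so supply a fallback label.
    return unicodedata.name(ch, "(control character U+%04X)" % ord(ch))


def differing_bytes():
    """Return the byte values that the two codecs decode differently."""
    diffs = []
    for b in range(256):
        raw = bytes([b])
        latin1 = raw.decode("iso-8859-1")
        # errors="replace" turns unassigned cp1252 bytes into U+FFFD.
        cp1252 = raw.decode("cp1252", errors="replace")
        if latin1 != cp1252:
            diffs.append(b)
    return diffs


if __name__ == "__main__":
    for b in differing_bytes():
        print("0x%02X  latin-1: %-32s cp1252: %s"
              % (b, describe(b, "iso-8859-1"), describe(b, "cp1252")))
```

With Python's codecs, the bytes that differ should be exactly 0x80 through
0x9F: ISO-8859-1 decodes them as C1 control characters, while cp1252 assigns
printable characters to most of them and leaves a few unassigned.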

 > I assume you have to define your own encoding table to handle this (extended
 > latin1 table?).

Yes.  There is a defined API to allow this to be done.  Our long-term
roadmap includes providing an additional library that supports many
additional encodings (such as these weird Windows code pages), but we
really haven't had time to spend on that at this point.  Day jobs and
all that, you know.
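
As a rough illustration of what "defining your own encoding table" involves:
the API in question is presumably Expat's unknown-encoding hook
(XML_SetUnknownEncodingHandler), whose handler fills in the map[256] member
of an XML_Encoding struct with one code point per byte value, using -1 for
bytes that are not valid.  The sketch below only generates such a table for
Windows-1252.  It assumes Python's "cp1252" codec as the source of the
mapping, and the function names are hypothetical; wiring the table into a C
handler is left out.

```python
# Build a 256-entry byte-to-code-point table for Windows-1252, laid out
# like the map[256] array of Expat's XML_Encoding struct.


def build_cp1252_map():
    """Return a list where entry i is the code point for byte i, or -1."""
    table = []
    for b in range(256):
        try:
            table.append(ord(bytes([b]).decode("cp1252")))
        except UnicodeDecodeError:
            # Python's codec leaves a handful of bytes unassigned;
            # mark them as invalid.
            table.append(-1)
    return table


def as_c_initializer(table, per_line=8):
    """Format the table as a C array initializer for pasting into C code."""
    cells = ["-1" if v < 0 else "0x%04X" % v for v in table]
    rows = [", ".join(cells[i:i + per_line])
            for i in range(0, len(cells), per_line)]
    return "{\n  " + ",\n  ".join(rows) + "\n}"


if __name__ == "__main__":
    print("static const int cp1252_map[256] = "
          + as_c_initializer(build_cp1252_map()) + ";")
```

A handler written in C could copy an array like this into info->map when the
requested encoding name matches "windows-1252", and leave the convert
callback unset, since every character is a single byte.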


  -Fred

-- 
Fred L. Drake, Jr.  <fdrake at acm.org>
PythonLabs at Zope Corporation


