Getting Properly Encoded Strings from Word into Python

Skip Montanaro skip at pobox.com
Fri Jan 18 14:49:44 EST 2002


    Fred> A fellow on microsoft.public.word.vba.general, Klaus Linke, posted
    Fred> a very long VB routine there that changes characters in the Symbol
    Fred> font to Unicode. It loops through a document's characters
    Fred> collection looking for characters in the Symbol font and replaces
    Fred> them with their Unicode equivalents using a large (317 line)
    Fred> translation table.

Isn't that what the encodings directory in the Python distribution is for?
It contains a number of Windows code page modules (cp1252.py and so forth).
If none of them is appropriate, you can probably write your own, using the
VB translation table you mentioned as the mapping.  I haven't used them
myself, so I can't say exactly how.  If I were so inclined, I'd start with
the docs for the codecs module.
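
A rough sketch of both routes, not tested against real Word output.  The
first part decodes Windows "smart quotes" with the stock cp1252 codec.  The
second part uses a hand-built table in the spirit of the VB routine.  The
name SYMBOL_TO_UNICODE and the helper symbol_to_unicode are made up for
illustration, and the table holds only four entries from the Symbol font's
Greek letters, where the real one would have a few hundred:

```python
# Route 1: a codec that already ships in the encodings package.
# In cp1252, bytes 0x93 and 0x94 are the curly double quotes.
quoted = b"\x93hi\x94".decode("cp1252")

# Route 2: a hand-built table for text that was set in the Symbol font.
# Hypothetical excerpt only; a full table would cover the whole font.
SYMBOL_TO_UNICODE = {
    ord("a"): "\u03b1",  # GREEK SMALL LETTER ALPHA
    ord("b"): "\u03b2",  # GREEK SMALL LETTER BETA
    ord("g"): "\u03b3",  # GREEK SMALL LETTER GAMMA
    ord("p"): "\u03c0",  # GREEK SMALL LETTER PI
}


def symbol_to_unicode(text):
    """Map characters typed in the Symbol font to their Unicode equivalents.

    Characters missing from the table are passed through unchanged.
    """
    return text.translate(SYMBOL_TO_UNICODE)


greek = symbol_to_unicode("abgp")
```

The translate approach only makes sense for runs of text you already know
were formatted in Symbol; applying it to ordinary text would turn every
"a" into an alpha.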

-- 
Skip Montanaro (skip at pobox.com - http://www.mojam.com/)



