Reading Windows CSV file with LCID entries under Linux.
mail at timgolden.me.uk
Mon Sep 22 17:59:41 CEST 2008
Thomas Troeger wrote:
> I've stumbled over a problem with Windows Locale ID information and
> codepages. I'm writing a Python application that parses a CSV file,
> the format of a line in this file is "LCID;Text1;Text2". Each line can
> contain a different locale id (LCID) and the text fields contain data
> that is encoded in some codepage which is associated with this LCID. My
> current data file contains the codes 1033 for English US and 1031 for
> German (as listed in
> Unfortunately, I cannot find out which Codepage (like cp-1252 or
> whatever) belongs to which LCID.
> My question is: How can I convert this data into something more
> reasonable like unicode? Basically, what I want is something like
> "Text1;Text2", both fields encoded as UTF-8. Can this be done with
> Python? How can I find out which codepage I have to use for 1033 and 1031?
The Windows GetLocaleInfo API call can look up the codepage for an LCID.
You'll need to use ctypes (or write a C extension) to
use it. Be aware that if it doesn't succeed you may need
to fall back on cp 65001 (UTF-8).
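
A rough sketch of how that lookup and the conversion might fit together,
not tested against your data. It asks GetLocaleInfoW for the locale's
default ANSI codepage (LOCALE_IDEFAULTANSICODEPAGE, 0x1004) when running
on Windows. Since that API isn't available under Linux, the sketch falls
back to a small hand-written table there; the table, the function names
and the sample line are my own assumptions, covering only the two LCIDs
you mention (both of which use cp1252).

```python
import ctypes
import sys

# Windows LCTYPE constant: default ANSI codepage for a locale.
LOCALE_IDEFAULTANSICODEPAGE = 0x1004

# Assumption: hand-maintained table for systems without GetLocaleInfo.
# Only the two LCIDs from the question are listed; extend as needed.
KNOWN_CODEPAGES = {
    1031: "cp1252",  # German (Germany)
    1033: "cp1252",  # English (United States)
}


def codepage_for_lcid(lcid):
    """Return a Python codec name for a Windows LCID, or 'utf-8' if unknown."""
    if sys.platform == "win32":
        buf = ctypes.create_unicode_buffer(8)
        written = ctypes.windll.kernel32.GetLocaleInfoW(
            lcid, LOCALE_IDEFAULTANSICODEPAGE, buf, len(buf)
        )
        # A return of 0 means the call failed; "0" means the locale
        # has no ANSI codepage. Either way, fall back to UTF-8.
        if written and buf.value not in ("", "0"):
            return "cp" + buf.value
        return "utf-8"
    return KNOWN_CODEPAGES.get(lcid, "utf-8")


def convert_line(raw):
    """Turn bytes 'LCID;Text1;Text2' into UTF-8 bytes 'Text1;Text2'."""
    lcid_field, _, rest = raw.partition(b";")
    encoding = codepage_for_lcid(int(lcid_field.decode("ascii")))
    return rest.decode(encoding).encode("utf-8")


if __name__ == "__main__":
    # Hypothetical sample line, encoded the way the data file would be.
    sample = "1031;Grüße;Straße".encode("cp1252")
    print(convert_line(sample).decode("utf-8"))
```

Read the file in binary mode so each line reaches convert_line as bytes;
the LCID field itself is plain ASCII, so it can be split off before the
rest of the line is decoded.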