Unicode/utf-8 data in SQL Server

John Machin sjmachin at lexicon.net
Tue Aug 8 20:27:29 EDT 2006


thebjorn wrote:
> I'm working with a MS SQL Server database created by a program from a
> fine US company who seems to have gotten run over by the Unicode truck.
> In their infinite wisdom they've decided to store Unicode data directly
> in regular varchar fields, utf-8 encoded! (on the bright side, it is
> properly utf-8 encoded). One of our customers then wants to use a csv
> file created from a report to import in Excel and is getting an
> attitude when the text shows up "all garbled" (which I can
> understand...)
>
> One method that works is to use Python to pull down the result set from
> the database and accumulate the entire result text as one big unicode
> string (calling decode('utf-8') on every text field in the process),
> separating each field with a tab, then call encode('utf-16') on the
> result string and write it to a file opened in binary mode.  This
> ensures that the file gets a bom, that it's in a format (utf-16) that
> Excel can import, and hopefully tabs are less common than commas in the
> source data :-(  The csv module doesn't support Unicode.

Last time I looked, *Excel* didn't support csv files in utf-N :-(

>
> The customer is of the firm belief that our national characters
> (æøå) are part of ascii, presumably because they're
> single-byte-encoded in iso-8859-1. He has no understanding of the
> issues (either by choice or experience) so there is no point in
> trying to explain the differences... Be that as it may, he might be
> satisfied with a csv file in that (iso-8859-1) encoding since the local
> version of Excel can import it transparently (with significant
> behind-the-scenes magic I believe...?)

No magic AFAICT. The bog-standard Windows kit in north/west/south
Europe and the English-speaking world uses code page 1252 (Python:
'cp1252'), which is an MS-molested iso-8859-1.
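
To make the "molested" part concrete, here is a minimal sketch. It assumes
Python 3 syntax, and the variable names are purely illustrative. The two
encodings agree on æøå and differ only in the byte range 0x80-0x9F, where
cp1252 has printable characters such as the euro sign and iso-8859-1 has
C1 control codes.

```python
# The national characters (ae, o-slash, a-ring) written as escapes.
national = "\u00e6\u00f8\u00e5"

# They get the same single bytes in both encodings.
same_bytes = national.encode("cp1252") == national.encode("iso-8859-1")

# The euro sign exists in cp1252 (byte 0x80) ...
euro_cp1252 = "\u20ac".encode("cp1252")

# ... but that same byte decodes to a C1 control character in iso-8859-1.
as_latin1 = euro_cp1252.decode("iso-8859-1")
```

So for text that is plain Norwegian or Danish the choice between the two
makes no difference to the bytes in the file; it starts to matter only for
characters like the euro sign or curly quotes.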

The customer should be very happy if you do
text.decode('utf-8').encode('cp1252') -- not only should the file
import into Excel OK, he should be able to view it in
Word/Notepad/whatever.
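
A rough sketch of that conversion in Python 3 terms (the one-liner above
is Python 2, where str is a byte string). The helper name utf8_to_cp1252
and the sample field are hypothetical, and the errors parameter is an
addition: any character with no cp1252 equivalent raises
UnicodeEncodeError under the default "strict" setting, so passing
"replace" may be preferable if the data can contain such characters.

```python
def utf8_to_cp1252(raw: bytes, errors: str = "strict") -> bytes:
    """Re-encode UTF-8 bytes (as stored in the varchar column) as cp1252."""
    return raw.decode("utf-8").encode("cp1252", errors)


# Hypothetical field value as it would come back from the database.
field = "bl\u00e5b\u00e6rsyltet\u00f8y".encode("utf-8")
converted = utf8_to_cp1252(field)

# U+0142 (l with stroke) has no cp1252 mapping; "replace" substitutes "?".
lossy = utf8_to_cp1252("\u0142".encode("utf-8"), errors="replace")
```

Writing the converted bytes to a file opened in binary mode should then
give a file that Excel on a cp1252 machine reads without further help.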

HTH,
John



