is there a library/program that converts sqlite database from windows-1252 to utf-8 ?

Stef Mientki stef.mientki at gmail.com
Sun Sep 12 11:16:14 CEST 2010


 On 12-09-2010 00:07, Robert Kern wrote:
> On 9/11/10 4:45 PM, Stef Mientki wrote:
>>   On 11-09-2010 21:11, Robert Kern wrote:
>>> SQLite internally stores its strings as UTF-8 or UTF-16 encoded Unicode. So it's not clear what
>>> you mean when you say the database is "windows-1252". Can you be more specific?
>> I doubt that, but I'm not sure ...
>
> From the documentation, it looks like SQLite does not attempt to validate the input as UTF-8
> encoded, so it is possible that someone pushed in raw bytes. See "Support for UTF-8 and UTF-16" in
> the following page:
>
>   http://www.sqlite.org/version3.html
>
>> For some databases written by other programs and
>> written with Python, with
>>      cursor = self.conn.cursor ()
>>      self.conn.text_factory = str
>>
>> Can only be read back with with text_factory = str
>> then the resulting string columns contains normal strings with windows 1252 coding, like
>> character 0xC3
>
> You can probably use
>
>   self.conn.text_factory = lambda x: x.decode('windows-1252')
>
> to read the data, though I've never tried to use that API myself.
>
> You will need to write a program yourself that opens one connection to your existing database for
> reading and another connection to another database (using the defaults) for writing. Then iterate
> over your tables and copy data from one database to the other.
>
> You may also be able to simply dump the database to a text file using "sqlite3 bad-database.db
> .dump > bad-sql.sql", read the text file into Python as a string, decode it from windows-1252 to
> unicode and then encode it as utf-8 and write it back out. Then use "sqlite3 good-database.db
> .read good-sql.sql" to create the new database. I've never tried such a thing, so it may not work.
>
Yes, I think I've to do somethhing like that,
to conserve the structure and field types, it's even more complex.

cheers,
Stef



More information about the Python-list mailing list