Unicode / cx_Oracle problem
raschulmanxx at verizon.net
Sun Sep 10 22:25:45 CEST 2006
On Sun, 10 Sep 2006 11:42:26 +0200, "Diez B. Roggisch"
<deets at nospam.web.de> wrote:
>What does print repr(mean) give you?
That is a useful suggestion.
For context, I reproduce the source code:
import codecs
import cx_Oracle

in_file = codecs.open("c:\\pythonapps\\mean.my",encoding="utf_16_LE")
connection = cx_Oracle.connect("username", "password")
cursor = connection.cursor()
for row in in_file:
    id = row[0]
    mean = row[1]
    print "Value of row is ", repr(row) #debug line
    print "Value of the variable 'id' is ", repr(id) #debug line
    print "Value of the variable 'mean' is ", repr(mean) #debug line
    cursor.execute("""INSERT INTO mean (mean_id,mean_eng_txt)
Here is the result from the print repr() statements:
Value of row is u"\ufeff(3,'sadness, lament; sympathize with,
Value of the variable 'id' is u'\ufeff'
Value of the variable 'mean' is u'('
Clearly, the values loaded into the 'id' and 'mean' variables are not
satisfactory: 'id' is picking up the BOM, and 'mean' only the opening
parenthesis.
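One possible reading of the output above: each row is a single unicode string, so indexing it returns individual characters, and the "utf_16_LE" codec leaves the BOM in the text as the first character. The following is only a minimal sketch under stated assumptions: the sample line, the temporary file, and the way the fields are split are all hypothetical and are not taken from the real mean.my file. It shows that the generic "utf_16" codec consumes the BOM, and one way a line of the form (id,'text') could be split.

```python
import codecs
import io
import os
import tempfile

# Hypothetical sample line; the real file's exact format is an assumption.
sample = u"(3,'sadness, lament')\n"

# Write a small UTF-16-LE file that starts with a BOM.
path = os.path.join(tempfile.mkdtemp(), "mean.my")
with open(path, "wb") as f:
    f.write(codecs.BOM_UTF16_LE + sample.encode("utf_16_le"))

# "utf_16_le" assumes the byte order and keeps the BOM as a character.
with io.open(path, encoding="utf_16_le") as f:
    raw_first = f.readline()

# "utf_16" reads the BOM to detect the byte order and then drops it.
with io.open(path, encoding="utf_16") as f:
    clean_first = f.readline()

# Indexing a string yields single characters, not fields.
first_char = raw_first[0]    # the BOM, u'\ufeff'
second_char = raw_first[1]   # u'('

# Hypothetical field split: drop the parentheses, cut at the first comma.
fields = clean_first.strip().strip(u"()").split(u",", 1)
mean_id = int(fields[0])
mean_text = fields[1].strip(u"'")
```

If the file cannot be reopened with a different codec, stripping the character by hand with row.lstrip(u"\ufeff") on the first line would be an alternative.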
>The Oracle NLS is a sometimes tricky beast: when it sets the encoding, it
>tries to be clever and assigns an existing connection some encoding
>based on the user's/machine's locale. This can yield unexpected results,
>such as "Dusseldorf" instead of "Düsseldorf" when querying a german city
>list with an english locale.
>So - you have to figure out, what encoding your db-connection expects.
>You can do so by issuing some queries against the session tables I
>believe - I don't have my oracle resources at home, but googling will
>bring you there, the important oracle term is NLS.
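As a rough illustration of the kind of query meant here, the sketch below reads the database character sets from Oracle's nls_database_parameters view through any DB-API cursor. The function name database_charsets is hypothetical, and this covers only the database side; the client-side encoding is, as far as I know, normally governed by the NLS_LANG environment variable, which is not inspected here.

```python
def database_charsets(cursor):
    """Return the database character-set settings as a dict.

    `cursor` is assumed to be an open DB-API cursor on an Oracle
    connection, e.g. one obtained from cx_Oracle's connection.cursor().
    """
    cursor.execute(
        "SELECT parameter, value FROM nls_database_parameters "
        "WHERE parameter IN ('NLS_CHARACTERSET', 'NLS_NCHAR_CHARACTERSET')"
    )
    # Each row is a (parameter, value) pair.
    return dict(cursor.fetchall())
```

With the connection from the original code, this would be called as database_charsets(connection.cursor()).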
It's very hard to figure out what to do on the basis of complexities
on the order of
(tiny equivalent: http://tinyurl.com/fnc54)
But I'm not even sure I got that far. My problems so far seem to arise
earlier, in Python or in Python's cx_Oracle driver. To be candid, I'm very tempted
at this point to abandon the Python effort and revert to an all-ucs2
environment, much as I dislike Java and C#'s complexities and the poor
support available for all-Java databases.
>Then you need to encode the unicode string before passing it - something
>like this:
>mean = mean.encode("latin1")
I don't see how the Chinese characters embedded in the English text
will carry over if I do that.
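The concern can be checked directly. In the small sketch below, the sample string is hypothetical: English text followed by one CJK character (U+54C0). Latin-1 has no representation for that character, so a strict encode raises an error and a lenient one substitutes a question mark, whereas UTF-8 round-trips the text unchanged. Whether the database connection actually accepts UTF-8 is a separate question that this does not answer.

```python
# Hypothetical sample: English text with one embedded CJK character.
text = u"sadness, lament; \u54c0"

# Strict Latin-1 encoding fails for characters outside its range.
try:
    text.encode("latin1")
    latin1_ok = True
except UnicodeEncodeError:
    latin1_ok = False

# With errors="replace" the character is silently turned into "?".
lossy = text.encode("latin1", "replace")

# UTF-8 can represent every Unicode character, so nothing is lost.
utf8_bytes = text.encode("utf-8")
roundtrip = utf8_bytes.decode("utf-8")
```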
In any case, thanks for your patient and generous help.
Delete the antispamming 'xx' characters for email reply