[DB-SIG] PEP 249: En-/Decoding problems occurring when using hdbcli
hueseyin.dagaydin at sap.com
Tue Nov 27 08:43:57 EST 2018
As a Python developer and user of hdbcli-2.2.53,
I am facing a strange phenomenon that I have not been able to solve so far.
The following setup applies on our instances:
1. The import is done like this:
from hdbcli import dbapi
1. Input text:
test_doc = 'The dish was disgusting. I am not going to that restaurant anymore.'
Sometimes an exception is thrown saying
ERROR: 'utf-8' codec can't decode byte 0xe7 in position 2: invalid continuation byte
and on one occasion a different one saying:
ERROR 'utf-8' codec can't decode byte 0x8e in position 2: invalid start byte
1. We make use of HANA’s CALL TA_ANALYZE, such as
* connection = dbapi.connect(hana_sys, port=hana_port, user=hana_user, password=hana_pass)
cursor = connection.cursor()
query = """
CALL TA_ANALYZE ( DOCUMENT_BINARY => ?,
MIME_TYPE => ?,
LANGUAGE_DETECTION => '',
TA_ANNOTATIONS => ?,
PLAINTEXT => ? );
"""
* plain_text = cursor.execute(query, (doc, '', '', ''))
in order to access the result.
We expect to see the plain text converted from the given binary data, but unfortunately one of the errors above occurs instead.
* Finally, we fetch the results by iterating over the rows via:
for row in cursor.fetchall():
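One way to narrow this down might be to catch the decode error around the execute and fetch steps and record which bytes could not be decoded. The following is only a sketch: the helper name `execute_and_fetch` is hypothetical, and it assumes the driver surfaces the failure as a standard Python `UnicodeDecodeError` (the wording of the messages suggests this, but it has not been confirmed). It only needs a DB-API style cursor with `execute` and `fetchall`.

```python
def execute_and_fetch(cursor, query, params):
    """Run the query; return (rows, None) or (None, details of the decode error).

    Hypothetical helper: `cursor` is any DB-API style cursor.
    """
    try:
        cursor.execute(query, params)
        return cursor.fetchall(), None
    except UnicodeDecodeError as exc:
        # exc.object holds the raw bytes the codec was given;
        # exc.start / exc.end mark the part it could not decode.
        window_start = max(exc.start - 8, 0)
        details = {
            "encoding": exc.encoding,
            "reason": exc.reason,
            "position": exc.start,
            "nearby_bytes": bytes(exc.object[window_start:exc.end + 8]).hex(),
        }
        return None, details
```

Logging `details` for each failing run would show whether the offending bytes differ between runs, which might help tell a data problem from a driver problem.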
Now the strange phenomenon is that this issue seems to occur
only on Windows and Linux-based systems, and not on macOS.
Since our apps are deployed and running on Linux-based instances,
we need to get this issue solved.
Please also note that the exception is thrown randomly,
meaning not on every execution but only on some of them.
This also makes it difficult to find a solution,
let alone understand the problem.
The input always remains the same.
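For reference, here is a small standard-library sketch of what the two messages mean. It does not involve hdbcli at all, and the byte strings `b'ab\xe7cd'` and `b'ab\x8ecd'` are made up for illustration; they are not data we captured. It also shows that the input text above is pure ASCII, so the bytes 0xe7 and 0x8e cannot come from encoding the input string itself as UTF-8.

```python
test_doc = 'The dish was disgusting. I am not going to that restaurant anymore.'

# Every byte of the UTF-8 encoded input is below 0x80, i.e. plain ASCII.
input_is_ascii = all(b < 0x80 for b in test_doc.encode('utf-8'))


def decode_error_message(raw):
    """Return the UnicodeDecodeError text for raw bytes, or None if they decode."""
    try:
        raw.decode('utf-8')
    except UnicodeDecodeError as exc:
        return str(exc)
    return None


# 0xe7 announces a three-byte sequence, but the next byte is not a continuation byte.
continuation_msg = decode_error_message(b'ab\xe7cd')

# 0x8e is itself a continuation byte, so it cannot start a character.
start_msg = decode_error_message(b'ab\x8ecd')

print(input_is_ascii)
print(continuation_msg)
print(start_msg)
```

Both messages therefore point at bytes that are not valid UTF-8 at the third position of whatever buffer was being decoded; the sketch does not say where in the stack those bytes originate.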
My team and I hope to hear from you.
Thanks in advance on behalf of the team for any advice.