<div dir="ltr"><br><div class="gmail_extra"><div class="gmail_quote">On Wed, Jan 8, 2014 at 7:34 PM, Antoine Pitrou <span dir="ltr"><<a href="mailto:solipsis@pitrou.net" target="_blank">solipsis@pitrou.net</a>></span> wrote:<br>
<blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">On Wed, 8 Jan 2014 09:50:30 +0900<br>
INADA Naoki <<a href="mailto:songofacandy@gmail.com">songofacandy@gmail.com</a>><br>
wrote:<br>
><br>
> textdata = b"hello"<br>
<br>
textdata shouldn't be a bytes object! If it's text it's a str.<br>
<div class="im"><br></div></blockquote><div><br><div><div>PyMySQL and MySQL-python support both unicode text and encoded text.<br></div>So bytes may be text in MySQL if they are inserted into a TEXT or VARCHAR column.<br><br>
</div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div class="im">
> bindata = b"abc\xff\x00"<br>
> query = "UPDATE table SET textcol=%s, bincol=%s"<br>
><br>
> print build_query(query, textdata, bindata)<br>
><br>
><br>
> I can't port this to Python 3.<br>
<br>
</div>I'm sure you can port it. Just decode your bindata using<br>
surrogateescape:<br>
<br>
bindata = bindata.decode('utf8', 'surrogateescape')<br>
<br>
and then encode the query at the end:<br>
<br>
query = query.encode('utf8', 'surrogateescape')<br>
<br>
It will be a little slower, though.<br></blockquote><div><br></div><div>You're right. I hadn't considered using surrogateescape here.<br></div><div><br>But a MySQL connection may not use utf8. The default is latin1, and many other encodings can be used.<br>
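<br>To make the concern concrete, here is a minimal sketch, not PyMySQL code. The helper names roundtrips and find_non_roundtrip_pairs are hypothetical, and Python's cp932 codec is only an assumed stand-in for a Shift_JIS-family connection encoding. It checks whether bytes survive decode/encode with surrogateescape under a given codec:<br>
<br>

```python
def roundtrips(data, encoding):
    """Return True if data survives decode + encode with surrogateescape."""
    try:
        text = data.decode(encoding, "surrogateescape")
        return text.encode(encoding, "surrogateescape") == data
    except UnicodeError:
        # Failing to decode or encode at all also counts as a broken round trip.
        return False


def find_non_roundtrip_pairs(encoding):
    """Brute-force search for two-byte sequences that come back changed."""
    broken = []
    for lead in range(0x81, 0x100):
        for trail in range(0x40, 0x100):
            pair = bytes([lead, trail])
            if not roundtrips(pair, encoding):
                broken.append(pair)
    return broken


bindata = b"abc\xff\x00"
print("utf-8:", roundtrips(bindata, "utf-8"))
print("ascii:", roundtrips(bindata, "ascii"))

# cp932 is an assumption standing in for the connection encoding.
for name in ("utf-8", "cp932"):
    changed = find_non_roundtrip_pairs(name)
    print(name, "two-byte sequences changed by the round trip:", len(changed))
```

<br>
As far as I understand, utf8 and ascii return every byte string unchanged, while a codec that decodes two different byte sequences to the same character cannot restore both of them.<br>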
</div><div>Some encodings don't guarantee a round trip. With such an encoding, <br><br>bindata = bindata.decode('sjis', 'surrogateescape')<br></div><div>query = query % bindata<br></div><div>query = query.encode('sjis', 'surrogateescape')<br>
</div><div><br></div><div>may break bindata.<br><br></div><div>I may be able to use ascii for decoding when MySQL uses an ASCII-compatible encoding.<br></div><div><br>bindata = bindata.decode('ascii', 'surrogateescape')<br>
<div>query = query % bindata<br></div><div>query = query.encode('sjis', 'surrogateescape')<br></div><div><br></div><div>But I think decode/encode with surrogateescape is not only slow but also dangerous when using<br>
</div><div>an encoding other than ascii or utf8.<br></div><br> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
<br>
Regards<br>
<span class=""><font color="#888888"><br>
Antoine<br>
</font></span><div class=""><div class="h5"><br>
<br>
</div></div></blockquote></div><br><br clear="all"><br>-- <br>INADA Naoki <<a href="mailto:songofacandy@gmail.com">songofacandy@gmail.com</a>>
</div></div>