[issue1170] shlex have problems with parsing unicode

Éric Araujo report at bugs.python.org
Sun Oct 23 07:36:14 CEST 2011


Éric Araujo <merwok at netwok.org> added the comment:

$ ./python 
Python 2.7.2+ (2.7:27ae7d4e1983+, Oct 23 2011, 00:09:06) 
[GCC 4.6.1] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import shlex
>>> shlex.split(u'Hello, World!')
['Hello,', 'World!']

This bug was fixed indirectly by a StringIO fix in 27ae7d4e1983, for #1548891.  BTW, this report was a duplicate of #6988, closed a year ago.

Python 2.7.3 will finally support unicode in shlex, so the doc change requested in this report is outdated.  However, I still want to do something for users of older versions.  I’ve noticed that shlex.split’s argument can be a file-like object, and I wonder whether passing StringIO.StringIO(my_unicode_string) would work.  If such a short recipe works, I’m all for including it in the 2.7 docs.  If a longer recipe is needed, then ActiveState’s Python Cookbook would be more appropriate, and I’ll add a link to it in the docs.  If it’s very long and requires a PyPI project, then I’m willing to link to that.
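
A minimal sketch of the file-like-object idea, for illustration only.  It is written against Python 3's io.StringIO, which is an assumption; on Python 2 the equivalent would be StringIO.StringIO, and whether that behaves correctly on releases before 2.7.3 has not been verified here.  The helper name split_via_stream is hypothetical.

```python
import io
import shlex


def split_via_stream(text):
    """Split text with shlex by passing a file-like object, not a string.

    Hypothetical helper: wraps the text in an in-memory stream so that
    shlex reads from the stream directly instead of wrapping the string
    itself.
    """
    stream = io.StringIO(text)
    return shlex.split(stream)


if __name__ == "__main__":
    print(split_via_stream(u'Hello, World!'))
```

If this recipe turns out not to work on older 2.7 releases, it would not qualify as the short recipe for the docs, and one of the longer options would apply instead.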

----------

_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue1170>
_______________________________________

