[issue22019] ntpath.join() error with Chinese character Path

Zachary Ware report at bugs.python.org
Mon Jul 21 04:08:19 CEST 2014


Zachary Ware added the comment:

What type are your arguments: str, unicode, or a mix?  I can reproduce your issue by joining a unicode with a str that contains a non-ASCII byte, while any other combination "works":

>>> import os
>>> os.path.join('test', 'test\x85')
'test\\test\x85'
>>> os.path.join('test', u'test\x85')
u'test\\test\x85'
>>> os.path.join(u'test', 'test\x85')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Python27\lib\ntpath.py", line 84, in join
    result_path = result_path + p_path
UnicodeDecodeError: 'ascii' codec can't decode byte 0x85 in position 4: ordinal not in range(128)
>>> os.path.join(u'test', u'test\x85')
u'test\\test\x85'

The fact that any mixed-type combination works is sheer accident: when str and unicode are mixed, Python 2 implicitly decodes the str with the ASCII codec, which only succeeds when the str happens to be pure ASCII (as 'test' is above).  This is just a side effect of Python 2's 'bolted-on' approach to Unicode, and the fix is to upgrade to Python 3.  If you have to stay with Python 2, you can fix your code by decoding all input to unicode as soon as you get it and encoding to str only when you have to.  That is basically what you need to do in Python 3 as well, but Python 2 won't give you helpful exceptions at the source of the problem.
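
As a rough sketch of the "decode early" advice, something like the following might do.  The helper names to_text and join_text are made up for illustration, and the UTF-8 default is only an assumption; use whatever encoding your byte strings are really in.  ntpath is imported directly so the example behaves the same on any platform.

```python
import ntpath


def to_text(value, encoding="utf-8"):
    """Hypothetical helper: return value as text (unicode).

    Byte strings are decoded with `encoding` (assumed UTF-8 here);
    anything else is returned unchanged.
    """
    if isinstance(value, bytes):  # `bytes` is an alias for `str` on Python 2
        return value.decode(encoding)
    return value


def join_text(*parts, **kwargs):
    """Hypothetical wrapper: decode every part first, then join.

    Because all parts are text by the time ntpath.join sees them,
    no implicit ASCII decoding can happen inside the join.
    """
    encoding = kwargs.get("encoding", "utf-8")
    return ntpath.join(*[to_text(part, encoding) for part in parts])


# A text prefix joined with a UTF-8 encoded byte string of two Chinese characters.
joined = join_text(u"test", b"\xe4\xb8\xad\xe6\x96\x87")
```

With this, a wrongly encoded byte string fails in to_text, right where the input is decoded, rather than somewhere inside ntpath.join.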

I don't believe there's anything that should be changed in ntpath.join.

----------
nosy: +zach.ware

_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue22019>
_______________________________________

