[ python-Bugs-846133 ] os.chmod/os.utime/shutil do not work with
unicode filenames
SourceForge.net
noreply at sourceforge.net
Fri Nov 28 04:58:15 EST 2003
Bugs item #846133, was opened at 2003-11-21 08:27
Message generated for change (Comment added) made by mhammond
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=846133&group_id=5470
Category: Unicode
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: Eric Meyer (meyeet)
>Assigned to: Mark Hammond (mhammond)
>Summary: os.chmod/os.utime/shutil do not work with unicode filenames
Initial Comment:
I have a filename that contains Kanji characters and I'm
trying to change the permissions on the file.
I am running Python 2.3.1 on Windows 2000. Also I
have the Japanese language pack installed so that I can
view the kanji characters in Windows explorer.
>>> part
u'\u5171\u6709\u3055\u308c\u308b.txt'
>>> os.chmod(part, 0777)
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
OSError: [Errno 22] Invalid argument: '?????.txt'
>>>
I attached the above named file for you to test against.
Thanks.
----------------------------------------------------------------------
>Comment By: Mark Hammond (mhammond)
Date: 2003-11-28 20:58
Message:
Logged In: YES
user_id=14198
I opened http://www.python.org/sf/846133 regarding os.utime,
which I found via the "shutil" module, via SpamBayes, also
on a Japanese system (see that bug for details), but then I
saw this and decided to tackle them both.
I rolled my fix for that in with a fix for chmod. I also
hacked the test suite radically:
* Creation of a test_support.TESTFN_UNICODE_UNENCODEABLE
variable, which is a Unicode string that can *not* be
encoded using the file system encoding. This will cause
functions with 'encoding' support but without Unicode
support (such as utime/chmod) to fail.
* Turned all the test cases into functions, so more combinations
of unicode/encoded can be tested. Many are redundant, but
that is OK.
* Added shutil tests of the filenames
* While I was there, converted to a unittest test.
The new test case blows up with a couple of errors before
the posixmodule patch is applied, and passes after.
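
As a rough illustration of what "can not be encoded using the file system encoding" means, here is a minimal sketch. It is written in present-day Python 3 rather than the Python 2.3 used in this thread, the helper name encodable is hypothetical, and the two code pages are assumptions chosen for the demonstration (the thread does not say which code page the affected system used). It is not the code from the patch.

```python
def encodable(name, encoding):
    """Return True if `name` can be encoded with `encoding` (hypothetical helper)."""
    try:
        name.encode(encoding)  # strict error handling raises on failure
    except UnicodeEncodeError:
        return False
    return True


# File name from the initial comment.
kanji = "\u5171\u6709\u3055\u308c\u308b.txt"

# Assumed code pages: cp932 is Japanese, cp1252 is Western European.
in_cp932 = encodable(kanji, "cp932")
in_cp1252 = encodable(kanji, "cp1252")

# With errors="replace", each unencodable character is turned into "?".
replaced = kanji.encode("cp1252", "replace")
```

Under these assumptions the name is encodable in cp932 but not in cp1252, and the lossy result has the same shape as the '?????.txt' shown in the error message of the initial comment.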
Note that shutil.move/copy etc can not handle being passed
one string and one unicode arg, and therefore this
combination is skipped. I'd like any opinions on whether
this is a bug in shutil or not.
Also note the new comment in test_support.py regarding
a potential bug in the 'mbcs' encoding - it appears as if it
always works as though errors=ignore.
Comments/reviews?
----------------------------------------------------------------------
Comment By: Martin v. Löwis (loewis)
Date: 2003-11-25 09:21
Message:
Logged In: YES
user_id=21627
If you look at the source of os.chmod, it is not at all
surprising that it does not work for characters outside the
file system encoding: it is simply not implemented. Patches
are welcome.
----------------------------------------------------------------------
Comment By: George Yoshida (quiver)
Date: 2003-11-22 11:51
Message:
Logged In: YES
user_id=671362
Hi, Eric.
My previous post may have been wrong.
This is a problem with os.chmod.
I've confirmed two kinds of exceptions are raised when
using os.chmod for unicode filenames.
The first one is [Errno 22] Invalid argument.
You can read/write a file but cannot use os.chmod.
The second one is [Errno 2] No such file or directory.
Although the file exists, Python complains "No such
file or directory".
test.test_codecs has a bunch of international unicode
characters, so I borrowed them for testing.
>>> import os
>>> from test.test_codecs import punycode_testcases
>>> def unicode_test(name):
        try:
            f = file(name, 'w')
            f.close()
        except IOError, e:
            print e
            return
        try:
            os.chmod(name, 0777)
        except OSError, e:
            print e

>>> for i, (uni, puny) in enumerate(punycode_testcases):
        print i
        unicode_test(uni)
I ran this script on Windows 2000 (Japanese edition)
using Python 2.3 and got "[Errno 22]" for
0,1,2,3,4,5,7,10 and "[Errno 2]" for 9.
----------------------------------------------------------------------
Comment By: Eric Meyer (meyeet)
Date: 2003-11-22 03:18
Message:
Logged In: YES
user_id=913976
George,
I tried the following but I had to specify one of the Japanese
codecs during the unicode() call. What is your default
encoding set to? Below are my results.
>>> import os
>>> os.listdir('.')
[]
>>> u1 = unicode('\x82\xa0', 'cp932')
>>> u2 = u'\x82\xa0'
>>> u1, u2
(u'\u3042', u'\x82\xa0')
>>> print >> file(u1, 'w'), "hello world"
>>> os.listdir('.')
['?']
>>> os.chmod(u1, 0777)
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
OSError: [Errno 22] Invalid argument: '?'
----------------------------------------------------------------------
Comment By: George Yoshida (quiver)
Date: 2003-11-21 11:07
Message:
Logged In: YES
user_id=671362
I'm running Python in almost the same environment.
I guess this results from the different behavior of u'' and
unicode('').
If you convert a multi-byte character to a unicode
character, u'' and unicode('') don't return the same string.
unicode('') works as intended but u'' doesn't.
This is probably caused by a bug in the Japanese codecs
package.
Eric, please try the session below and tell me what
happens.
NOTE: The Japanese codecs package needs to be installed to test
the code below.
Otherwise, UnicodeDecodeError will be raised.
---
>>> import os
>>> os.listdir('.')
[]
>>> lst = ['\x82', '\xa0'] # japanese character
>>> u1 = unicode('\x82\xa0')
>>> u2 = u'\x82\xa0'
>>> u1 == u2
False
>>> u1, u2
(u'\u3042', u'\x82\xa0') # u2 is odd
>>> print >> file(u1, 'w'), "hello world"
>>> os.listdir('.')
['あ']
>>> os.chmod(u1, 0777)
>>> os.chmod(u2, 0777)
Traceback (most recent call last):
  File "<pyshell#179>", line 1, in -toplevel-
    os.chmod(u2, 0777)
OSError: [Errno 22] Invalid argument: '??'
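
The difference between u1 and u2 in the session above does not need a codec bug to explain it. In a Unicode literal, \xNN names the single code point U+00NN, so u'\x82\xa0' is a two-character string, whereas unicode() decodes the two bytes with a codec. The following is a minimal sketch of that distinction in present-day Python 3 syntax (not the Python 2.3 used in this thread), assuming cp932 as the codec, as in Eric's session.

```python
# The two bytes from the session above.
raw = b"\x82\xa0"

# Decoding the bytes with cp932 yields one Japanese character (U+3042).
decoded = raw.decode("cp932")

# A text literal with \x escapes holds two separate code points instead.
literal = "\x82\xa0"

same = decoded == literal
code_points = [hex(ord(ch)) for ch in literal]

# The literal matches what decoding the same bytes as Latin-1 would give.
as_latin1 = raw.decode("latin-1")
```

Here decoded has length 1 and literal has length 2, which matches the (u'\u3042', u'\x82\xa0') pair printed in the session.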
----------------------------------------------------------------------