[ python-Bugs-846133 ] os.chmod/os.utime/shutil do not work with unicode filenames

SourceForge.net noreply at sourceforge.net
Wed May 5 08:26:41 EDT 2004


Bugs item #846133, was opened at 2003-11-21 08:27
Message generated for change (Comment added) made by mhammond
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=846133&group_id=5470

Category: Unicode
Group: None
>Status: Closed
Resolution: None
Priority: 5
Submitted By: Eric Meyer (meyeet)
Assigned to: Mark Hammond (mhammond)
Summary: os.chmod/os.utime/shutil do not work with unicode filenames

Initial Comment:
I have a filename that contains Kanji characters and I'm 
trying to change the permissions on the file.

I am running Python 2.3.1 on Windows 2000.  I also 
have the Japanese language pack installed so that I can 
view the Kanji characters in Windows Explorer.


>>> part
u'\u5171\u6709\u3055\u308c\u308b.txt'
>>> os.chmod(part, 0777)
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
OSError: [Errno 22] Invalid argument: '?????.txt'
>>>

I attached the above named file for you to test against.

Thanks.

----------------------------------------------------------------------

>Comment By: Mark Hammond (mhammond)
Date: 2004-05-05 22:26

Message:
Logged In: YES 
user_id=14198

I'm fairly sure this has been nailed (including the test
failure) for some time?

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2003-12-04 18:18

Message:
Logged In: YES 
user_id=21627

2.3 maint should be fine: the problems are more likely in
the new test cases than in the code itself.

----------------------------------------------------------------------

Comment By: Tim Peters (tim_one)
Date: 2003-12-04 06:21

Message:
Logged In: YES 
user_id=31435

meyeet, 2.3.3 should be released this month (December).

Mark, I reopened this, because test_unicode_filename fails on 
Win98SE now (see Python-Dev report; that was on the trunk; 
I don't know about 2.3 maint).

----------------------------------------------------------------------

Comment By: Eric Meyer (meyeet)
Date: 2003-12-04 06:16

Message:
Logged In: YES 
user_id=913976

Is there an approximate date (or month) when 2.3.3 is likely 
to be released?

----------------------------------------------------------------------

Comment By: Mark Hammond (mhammond)
Date: 2003-12-03 12:33

Message:
Logged In: YES 
user_id=14198

release23-maint:
Checking in posixmodule.c;
new revision: 2.300.8.5; previous revision: 2.300.8.4

trunk:
Checking in posixmodule.c;
new revision: 2.309; previous revision: 2.308
Checking in test_support.py;
new revision: 1.59; previous revision: 1.58
Checking in test_unicode_file.py;
new revision: 1.11; previous revision: 1.10
Removing output/test_unicode_file;
new revision: delete; previous revision: 1.1



----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2003-12-02 08:39

Message:
Logged In: YES 
user_id=21627

The patches to posixmodule.c are fine for both 2.3 and 2.4.
Can you apply them before 2.3.3 is frozen?

The patches to the test suite are fine for 2.4 only, and
they probably need to be relaxed. For example, on OSX, there
simply is no file name that fails to work for the normal
file system API: the file system encoding is UTF-8, so it
supports all file names. You should consider changing
test_pep277.py instead.

----------------------------------------------------------------------

Comment By: Mark Hammond (mhammond)
Date: 2003-11-29 12:32

Message:
Logged In: YES 
user_id=14198

I created www.python.org/sf/850997 about the MBCS encoding
issue.

----------------------------------------------------------------------

Comment By: Mark Hammond (mhammond)
Date: 2003-11-28 20:58

Message:
Logged In: YES 
user_id=14198

I opened http://www.python.org/sf/846133 regarding os.utime,
which I found via the "shutil" module, via SpamBayes, also
on a Japanese system (see that bug for details), but then I
saw this and decided to tackle them both.

I rolled my fix for that in with a fix for chmod.  I also
hacked the test suite radically:
* Creation of a test_support.TESTFN_UNICODE_UNENCODEABLE
variable, which is a Unicode string that can *not* be
encoded using the file system encoding.  This will cause
functions with 'encoding' support but without Unicode
support (such as utime/chmod) to fail.
* Turned all the test cases into functions, so more combinations
of unicode/encoded can be tested.  Many are redundant, but
that is OK.
* Added shutil tests of the filenames
* While I was there, converted to a unittest test.
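
As a rough illustration of the first bullet (a modern Python 3 sketch,
not the code in the patch): a name counts as "unencodable" when a
strict encode with the target encoding raises UnicodeEncodeError.  The
helper name is_encodable is hypothetical, and cp932/cp1252 merely stand
in for whatever the file system encoding happens to be.

```python
def is_encodable(name, encoding):
    """Return True if `name` can be encoded losslessly in `encoding`."""
    try:
        name.encode(encoding)  # strict error handling is the default
    except UnicodeEncodeError:
        return False
    return True


# The file name from the initial report.
kanji_name = '\u5171\u6709\u3055\u308c\u308b.txt'

print(is_encodable(kanji_name, 'cp932'))   # Japanese code page
print(is_encodable(kanji_name, 'cp1252'))  # Western European code page
```

A test variable of this kind only makes sense on platforms whose file
system encoding cannot represent every character.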

The new test case blows up with a couple of errors before
the posixmodule patch is applied, and passes after.

Note that shutil.move/copy etc. cannot handle being passed
one str and one unicode arg, so this combination is
skipped.  I'd like opinions on whether this is a bug in
shutil or not.

Also note the new comment in test_support.py regarding
a potential bug in the 'mbcs' encoding - it appears as if it
always works as though errors=ignore.
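
To show what is at stake, here is a sketch only: 'mbcs' exists only on
Windows, so 'ascii' is used as a portable stand-in, and the syntax is
Python 3 rather than 2.3.  A strict encode raises on unencodable
characters, while the 'replace' and 'ignore' handlers quietly produce a
different name.  The helper try_encode is hypothetical.

```python
name = '\u5171\u6709\u3055\u308c\u308b.txt'


def try_encode(text, encoding, errors):
    """Encode `text`; return the bytes, or the exception class name on failure."""
    try:
        return text.encode(encoding, errors)
    except UnicodeEncodeError as exc:
        return type(exc).__name__


strict_result = try_encode(name, 'ascii', 'strict')
replaced = try_encode(name, 'ascii', 'replace')
ignored = try_encode(name, 'ascii', 'ignore')

print(strict_result, replaced, ignored)
```

The '?????.txt' in the traceback of the initial report has the same
shape as the 'replace' result here, which is consistent with the name
being encoded without an error being raised.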

Comments/reviews?

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2003-11-25 09:21

Message:
Logged In: YES 
user_id=21627

If you look at the source of os.chmod, it is not at all
surprising that it does not work for characters outside the
file system encoding: it is simply not implemented. Patches
are welcome.

----------------------------------------------------------------------

Comment By: George Yoshida (quiver)
Date: 2003-11-22 11:51

Message:
Logged In: YES 
user_id=671362

Hi, Eric.

My previous post may have been wrong.
This is a problem with os.chmod.

I've confirmed two kinds of exceptions are raised when 
using os.chmod for unicode filenames.

The first one is [Errno 22] Invalid argument.
You can read/write a file but cannot use os.chmod.

The second one is [Errno 2] No such file or directory.
Although the file exists, Python complains "No such 
file or directory".

test.test_codecs has a bunch of international unicode 
characters, so I borrowed them for testing.

>>> import os
>>> from test.test_codecs import punycode_testcases
>>> def unicode_test(name):
    try:
        f = file(name, 'w')
        f.close()
    except IOError, e:
        print e
        return
    try:
        os.chmod(name, 0777)
    except OSError, e:
        print e

        
>>> for i, (uni, puny) in enumerate(punycode_testcases):
    print i
    unicode_test(uni)


I ran this script on Windows 2000 (Japanese edition) 
using Python 2.3 and got "[Errno 22]" for 
0,1,2,3,4,5,7,10 and "[Errno 2]" for 9.
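
For anyone re-running this today, a rough Python 3 port of the script
above might look like the following.  It is a sketch under
assumptions: the sample names are hard-coded instead of being taken
from test.test_codecs, the files are created in a temporary directory,
and the helper returns the error instead of printing it.

```python
import os
import tempfile


def unicode_test(directory, name):
    """Create `name` in `directory`, chmod it, and return any error raised."""
    path = os.path.join(directory, name)
    try:
        with open(path, 'w'):
            pass
        os.chmod(path, 0o777)
    except (OSError, UnicodeError) as exc:
        # UnicodeError covers names the file system encoding cannot represent.
        return exc
    return None


# Hypothetical sample names; the first is the file from the initial report.
names = ['\u5171\u6709\u3055\u308c\u308b.txt', '\u3042.txt', 'plain.txt']

with tempfile.TemporaryDirectory() as tmp:
    results = [unicode_test(tmp, name) for name in names]

for i, err in enumerate(results):
    print(i, err)
```

An entry of None means that both the file creation and os.chmod
succeeded for that name.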

----------------------------------------------------------------------

Comment By: Eric Meyer (meyeet)
Date: 2003-11-22 03:18

Message:
Logged In: YES 
user_id=913976

George,

I tried the following, but I had to specify one of the Japanese 
codecs during the unicode() call.  What is your default 
encoding set to?  Below are my results. 

>>> import os
>>> os.listdir('.')
[]
>>> u1 = unicode('\x82\xa0', 'cp932')
>>> u2 = u'\x82\xa0'
>>> u1, u2
(u'\u3042', u'\x82\xa0')
>>> print >> file(u1, 'w'), "hello world"
>>> os.listdir('.')
['?']
>>> os.chmod(u1, 0777)
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
OSError: [Errno 22] Invalid argument: '?'
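
The difference between u1 and u2 in this session comes down to
decoding.  Sketched in Python 3 terms (the thread itself uses Python
2.3 syntax): the two bytes 0x82 0xA0 decode to a single hiragana
character under cp932, whereas the escapes \x82 and \xa0 inside a
Unicode literal are two separate code points, U+0082 and U+00A0.

```python
raw = b'\x82\xa0'

u1 = raw.decode('cp932')  # one character: HIRAGANA LETTER A (U+3042)
u2 = '\x82\xa0'           # two code points, not a decoded byte pair

print(ascii(u1), len(u1))
print(ascii(u2), len(u2))

# u2 is what the same bytes would give if decoded as Latin-1.
print(u2 == raw.decode('latin-1'))
```

So a file created under the name u1 cannot be found again under the
name u2; they are different strings.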

----------------------------------------------------------------------

Comment By: George Yoshida (quiver)
Date: 2003-11-21 11:07

Message:
Logged In: YES 
user_id=671362

I'm running Python in almost the same environment.

I guess this results from the different behavior of u'' and 
unicode('').
If you convert a multi-byte character to a unicode 
character,
u'' and unicode('') don't return the same string.
unicode('') works as intended but u'' doesn't.
This is probably caused by a bug in the Japanese codecs 
package.

Eric, please try the session below and tell me what 
happens.

NOTE: The Japanese codecs package needs to be installed to test 
the code below.
Otherwise, UnicodeDecodeError will be raised.
---

>>> import os
>>> os.listdir('.')
[]
>>> lst = ['\x82', '\xa0']   # japanese character
>>> u1 = unicode('\x82\xa0')
>>> u2 = u'\x82\xa0'
>>> u1 == u2
False
>>> u1, u2
(u'\u3042', u'\x82\xa0')  # u2 is odd
>>> print >> file(u1, 'w'), "hello world"
>>> os.listdir('.')
['あ']
>>> os.chmod(u1, 0777)
>>> os.chmod(u2, 0777)

Traceback (most recent call last):
  File "<pyshell#179>", line 1, in -toplevel-
    os.chmod(u2, 0777)
OSError: [Errno 22] Invalid argument: '??'

----------------------------------------------------------------------
