[Python-Dev] New proposition for Python3 bytes filename issue

Victor Stinner victor.stinner at haypocalc.com
Mon Sep 29 15:23:04 CEST 2008


Patches are already available in issue #3187 (os.listdir):

On Monday 29 September 2008 14:07:55 Victor Stinner, you wrote:
>  - listdir(unicode) -> unicode and raise an error on invalid filename

Needs raise_decoding_errors.patch (don't clear the Unicode error)

>  - listdir(bytes) -> bytes

Always works.
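
A minimal sketch of the "same type in, same type out" rule quoted above, run against a current Python 3 interpreter rather than the patched 2008 tree. The helper name list_both_ways and the file name example.txt are made up for the illustration, and the error handling for undecodable names is not shown here.

```python
import os
import tempfile

def list_both_ways(directory):
    """Return (str_names, bytes_names) for the same directory."""
    # str argument: the names come back as str
    str_names = sorted(os.listdir(directory))
    # bytes argument: the names come back as bytes, with no decoding step
    bytes_names = sorted(os.listdir(os.fsencode(directory)))
    return str_names, bytes_names

with tempfile.TemporaryDirectory() as tmp:
    # Create one file so that the listing is not empty.
    open(os.path.join(tmp, "example.txt"), "w").close()
    str_names, bytes_names = list_both_ways(tmp)
```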

>  - getcwd() -> unicode
>  - getcwd(bytes=True) -> bytes

Needs merge_os_getcwd_getcwdu.patch

Note that the current implementation of getcwd() uses PyUnicode_FromString() to 
decode the directory, whereas getcwdu() uses the correct code (PyUnicode_Decode). So 
I merged both functions to keep only the correct version: getcwdu() => 
getcwd().
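
To illustrate the two points above, here is a rough sketch in pure Python. It is not the C patch. getcwd_sketch and its as_bytes parameter are hypothetical stand-ins for the proposed getcwd(bytes=True), built on os.getcwd() and os.getcwdb() from current Python 3. The second half shows why a hard-coded UTF-8 decode (which is what PyUnicode_FromString() does) can fail on bytes that another encoding handles; the Latin-1 sample is an assumption chosen for the demonstration.

```python
import os

def getcwd_sketch(as_bytes=False):
    """Hypothetical single entry point: str by default, bytes on request."""
    return os.getcwdb() if as_bytes else os.getcwd()

def decode_assuming_utf8(data):
    """Mimic a hard-coded UTF-8 decode; return None if the bytes are not UTF-8."""
    try:
        return data.decode("utf-8")
    except UnicodeDecodeError:
        return None

# A directory name as a Latin-1 filesystem would store it (illustrative sample).
raw = "caf\u00e9".encode("latin-1")

utf8_result = decode_assuming_utf8(raw)   # hard-coded UTF-8 cannot decode it
latin1_result = raw.decode("latin-1")     # the matching encoding can
```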

>  - open(): accept bytes or unicode

Needs io_byte_filename.patch (just removes a check)
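
A small sketch of what "accept bytes or unicode" means in practice, checked against a current Python 3 interpreter and not against the patch itself. The file name data.txt and its content are made up for the example.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    str_path = os.path.join(tmp, "data.txt")
    bytes_path = os.fsencode(str_path)   # the same path, as bytes

    # Write through the bytes filename...
    with open(bytes_path, "w") as f:
        f.write("hello")

    # ...and read the same file back through the str filename.
    with open(str_path) as f:
        content = f.read()
```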

> os.path.*() should accept operations on bytes filenames, but maybe not on
> bytes+unicode arguments. os.path.join('directory', b'filename'): raise an
> error (or use *implicit* conversion to bytes)?

os.path.join() already rejects mixing bytes + str.
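
For reference, a short sketch of that rejection as it can be observed on a current Python 3 interpreter (the exact error message may differ between versions). The helper try_join is a made-up name that returns the exception instead of letting it propagate.

```python
import os.path

def try_join(*parts):
    """Return the joined path, or the TypeError raised for mixed argument types."""
    try:
        return os.path.join(*parts)
    except TypeError as exc:
        return exc

# str + bytes, the case from the quoted question
mixed = try_join("directory", b"filename")
```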

But os.path.join(), glob.glob(), fnmatch.*(), etc. don't support bytes. I 
wrote some patches like:
 - glob1_bytes.patch: Fix glob.glob() to accept invalid directory name
 - fnmatch_bytes.patch: Patch fnmatch.filter() to accept bytes filenames

But I dislike both patches since they mix bytes and str. So this part still 
needs some work.
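
As a point of comparison only, and not a description of the patches above: on a current Python 3 interpreter, fnmatch.filter() can be used without mixing types, provided the names and the pattern are both bytes. The file names below are made up, and filter_or_error is a hypothetical helper for the demonstration.

```python
import fnmatch

def filter_or_error(names, pattern):
    """Return the filtered names, or the TypeError raised for mixed types."""
    try:
        return fnmatch.filter(names, pattern)
    except TypeError as exc:
        return exc

names = [b"report.txt", b"script.py"]

all_bytes = filter_or_error(names, b"*.txt")          # bytes names, bytes pattern
mixed = filter_or_error(["report.txt"], b"*.txt")     # str name, bytes pattern
```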

-- 
Victor Stinner aka haypo
http://www.haypocalc.com/blog/

