[issue9630] Reencode filenames when setting the filesystem encoding

STINNER Victor report at bugs.python.org
Fri Sep 24 14:40:40 CEST 2010


STINNER Victor <victor.stinner at haypocalc.com> added the comment:

> Some things about your patch:
> - as Amaury said, functions should be named "redecode*"
> rather than "reencode*" 

Yes, as written before (msg117269), I will do it in my next patch.

> - please use -1 for error return, not 1

Ok.

> - have you tried to measure if it made Python startup slower?

(Spoiler: the overhead is around 3%)

First, my patch doesn't concern Windows or Mac OS X, because the filesystem 
encoding is hardcoded on these platforms. Second, it only concerns systems with 
a filesystem encoding other than utf-8. utf-8 is now the default encoding of 
all Linux distributions. I suppose that BSD systems also use it by default.

Let's try a dummy benchmark with py3k r84990. I ran each command 5 times and 
kept the smallest time.
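
The "5 runs, keep the smallest" procedure can be sketched in Python as below. 
This is only an illustration, not the script used for the numbers in this 
message: the helper name min_startup_time and its parameters are hypothetical, 
and it measures wall-clock time only (the "real" line of time). An interpreter 
that does not know PYTHONFSENCODING will simply ignore the variable.

```python
import os
import subprocess
import sys
import time


def min_startup_time(fs_encoding=None, python_exe=sys.executable, runs=5):
    """Return the smallest wall-clock time (seconds) over `runs` launches
    of `python -c pass`. Hypothetical helper, for illustration only."""
    env = dict(os.environ)
    # Mirror "unset PYTHONFSENCODING" / "export PYTHONFSENCODING=..." from
    # the shell commands.
    env.pop("PYTHONFSENCODING", None)
    if fs_encoding is not None:
        env["PYTHONFSENCODING"] = fs_encoding

    best = float("inf")
    for _ in range(runs):
        start = time.perf_counter()
        subprocess.run([python_exe, "-c", "pass"], env=env, check=True)
        best = min(best, time.perf_counter() - start)
    return best
```

For example, min_startup_time() corresponds to the "unset" case and 
min_startup_time("ascii", python_exe="./python") to the "ascii" case. Taking 
the minimum rather than the mean reduces the influence of runs disturbed by 
other activity on the machine.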

--- pydebug mode (gcc -O0) with the patch ---

$ unset PYTHONFSENCODING; time ./python  -c "pass"
real    0m0.084s
user    0m0.080s
sys     0m0.010s

$ export PYTHONFSENCODING=ascii; time ./python  -c "pass"
real    0m0.100s
user    0m0.100s
sys     0m0.000s

The startup time overhead is around 20%.

--- default mode (gcc -O3) without the patch ---

$ unset PYTHONFSENCODING; time ./python  -c "pass"

real    0m0.033s
user    0m0.030s
sys     0m0.000s

--- default mode (gcc -O3) with the patch ---

$ export PYTHONFSENCODING=utf-8; time ./python  -c "pass"

real    0m0.032s
user    0m0.030s
sys     0m0.000s

$ export PYTHONFSENCODING=ascii; time ./python  -c "pass"

real    0m0.033s
user    0m0.020s
sys     0m0.020s

Here the overhead is around 3%.
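
For reference, the two percentages follow from the "real" times reported in 
this message. A small sketch of the arithmetic, assuming the default-mode 
figure compares the ascii run with the utf-8 run (both with the patch); the 
function name overhead_percent is hypothetical:

```python
def overhead_percent(baseline, measured):
    """Relative slowdown of `measured` over `baseline`, in percent."""
    return (measured - baseline) / baseline * 100.0


# "real" times copied from the runs in this message (seconds).
debug = overhead_percent(0.084, 0.100)    # pydebug: unset vs. ascii
release = overhead_percent(0.032, 0.033)  # default: utf-8 vs. ascii
print(f"pydebug: {debug:.1f}%  default: {release:.1f}%")
```

This gives roughly 19% and 3%. Note that time reports milliseconds, so the 
default-mode difference is a single unit of timer resolution.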

----------

_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue9630>
_______________________________________

