[Patches] [ python-Patches-536661 ] splitext performances improvement
noreply@sourceforge.net
Thu, 12 Dec 2002 12:31:52 -0800
Patches item #536661, was opened at 2002-03-29 09:06
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=536661&group_id=5470
Category: Library (Lib)
Group: Python 2.3
>Status: Closed
>Resolution: Accepted
Priority: 5
Submitted By: Sebastien Keim (s_keim)
Assigned to: Martin v. Löwis (loewis)
Summary: splitext performances improvement
Initial Comment:
After more thought, I must admit that the behavior change in splitext that I proposed with patch 536120 is not acceptable. So I would instead propose this one, which should only improve performance without modifying behavior.
The following benchmark shows that the patched splitext is between 2x (for l1) and 25x (for l2) faster than the original one.
The diff also patches test_posixpath.py to check the pitfall described in Tim's comments on the patch 536120 page.
def splitext(p):
    # Original implementation: scans the path one character at a time.
    root, ext = '', ''
    for c in p:
        if c == '/':
            root, ext = root + ext + c, ''
        elif c == '.':
            if ext:
                root, ext = root + ext, c
            else:
                ext = c
        elif ext:
            ext = ext + c
        else:
            root = root + c
    return root, ext

def splitext2(p):
    # Patched implementation: the extension starts at the last dot,
    # provided that dot comes after the last slash.
    i = p.rfind('.')
    if i <= p.rfind('/'):
        return p, ''
    else:
        return p[:i], p[i:]

# Short paths covering the edge cases.
l1 = ('t', '.t', 'a.b/', 'a.b', '/a.b', 'a.b/.c', 'a.b/c.d')
# Long paths, where the per-character loop is slowest.
l2 = (
    'usr/tmp.doc/list/home/sebastien/foo/bar/hghgt/yttyutyuyuttyuyut.tyyttyt',
    'usr/tmp.doc/list/home/sebastien/foo/bar/hghgt/yttyutyuyuttyuyut.',
    'usr/tmp.doc/list/home/sebastien/foo/bar/hghgt/.tyyttyt',
    'usr/tmp.doc/list/home/sebastien/foo/bar/hghgt/yttyutyuyuttyuyut',
    'reeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeyttyutyuyuttyuyut.tyyttyt',
    '/iuouiiuuoiiuiikhjzekezhjzekejkejkzejkhejkhzejzehjkhjezhjkehzkhjezh.tyyttyt'
)

# Both implementations must return the same result for every sample.
for i in l1 + l2:
    assert splitext2(i) == splitext(i)

import time

def test(f, args):
    # Time 1000 calls of f on each path in args.
    t = time.clock()
    for p in args:
        for i in range(1000):
            f(p)
    return time.clock() - t

def f(p): pass    # no-op, used to measure the loop overhead

# Print the raw timings and the speedup ratio with loop overhead subtracted.
a = test(splitext, l1)
b = test(splitext2, l1)
c = test(f, l1)
print a, b, c, (a - c) / (b - c)

a = test(splitext, l2)
b = test(splitext2, l2)
c = test(f, l2)
print a, b, c, (a - c) / (b - c)
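The script above is Python 2 code (print statement, time.clock, which was removed in Python 3.8). For anyone who wants to repeat the comparison on a current interpreter, a rough Python 3 equivalent might look like the sketch below. It is not part of the patch: the names splitext_loop, splitext_rfind, SAMPLES and bench are made up for this illustration, and time.perf_counter is swapped in for time.clock.

```python
import time

def splitext_loop(p):
    """Character-by-character version, copied from the script above."""
    root, ext = '', ''
    for c in p:
        if c == '/':
            root, ext = root + ext + c, ''
        elif c == '.':
            if ext:
                root, ext = root + ext, c
            else:
                ext = c
        elif ext:
            ext += c
        else:
            root += c
    return root, ext

def splitext_rfind(p):
    """rfind-based version, copied from the script above."""
    i = p.rfind('.')
    if i <= p.rfind('/'):
        return p, ''
    return p[:i], p[i:]

# Same short sample paths as l1 in the original script.
SAMPLES = ('t', '.t', 'a.b/', 'a.b', '/a.b', 'a.b/.c', 'a.b/c.d')

def bench(func, paths, repeat=1000):
    """Return the seconds spent calling func on every path `repeat` times."""
    start = time.perf_counter()
    for p in paths:
        for _ in range(repeat):
            func(p)
    return time.perf_counter() - start

if __name__ == '__main__':
    # The two versions must agree before timing them is meaningful.
    for p in SAMPLES:
        assert splitext_loop(p) == splitext_rfind(p)
    print('loop :', bench(splitext_loop, SAMPLES))
    print('rfind:', bench(splitext_rfind, SAMPLES))
```

The timings it prints depend on the machine and interpreter, so the ratios quoted above should not be expected to carry over unchanged.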
----------------------------------------------------------------------
>Comment By: Martin v. Löwis (loewis)
Date: 2002-12-12 21:31
Message:
Logged In: YES
user_id=21627
Sebastien, thanks for the patch, and Armin, thanks for the
review. Applied as
macpath.py 1.41
ntpath.py 1.52
posixpath.py 1.55
test_macpath.py 1.1
test_ntpath.py 1.17
test_posixpath.py 1.5
(dospath has gone meanwhile)
----------------------------------------------------------------------
Comment By: Armin Rigo (arigo)
Date: 2002-12-07 17:04
Message:
Logged In: YES
user_id=4771
The test_macpath module should probably use
from test import test_support
instead of
import test_support
Apart from this the patch looks fine.
----------------------------------------------------------------------
Comment By: Sebastien Keim (s_keim)
Date: 2002-04-03 09:03
Message:
Logged In: YES
user_id=498191
xxxpath.dif contains the splitext patch for posixpath, ntpath, dospath, and macpath, along with the corresponding test files (I have added a test file for macpath).
I thought it better not to modify riscospath.py, since I don't know this platform. In any case, it already uses an rfind strategy.
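As a rough illustration of how the same rfind strategy can carry over to modules with different separators, here is a sketch in Python. It is not the code from xxxpath.dif: the helper splitext_seps and the three wrappers are hypothetical names, and the only assumptions about the platforms are that posixpath separates with '/', ntpath accepts both '/' and '\', and macpath separates with ':'.

```python
def splitext_seps(p, seps, extsep='.'):
    """Split p into (root, ext), treating each string in seps as a
    directory separator. Hypothetical helper, for illustration only."""
    dot = p.rfind(extsep)
    # Position of the last directory separator of any kind (-1 if none).
    last_sep = max(p.rfind(s) for s in seps)
    if dot <= last_sep:
        # No dot at all, or the last dot belongs to a directory name.
        return p, ''
    return p[:dot], p[dot:]

def posix_splitext(p):
    return splitext_seps(p, ('/',))

def nt_splitext(p):
    return splitext_seps(p, ('/', '\\'))

def mac_splitext(p):
    return splitext_seps(p, (':',))

print(posix_splitext('a.b/c.d'))    # ('a.b/c', '.d')
print(nt_splitext('c:\\a.b\\c'))    # ('c:\\a.b\\c', '')
print(mac_splitext('a.b:c'))        # ('a.b:c', '')
```

Only the separator set changes between the wrappers; the dot-after-last-separator check is identical in each.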
----------------------------------------------------------------------
Comment By: Martin v. Löwis (loewis)
Date: 2002-04-02 11:24
Message:
Logged In: YES
user_id=21627
Sharing code is a good thing. However, exactly how this is
done would be critical, since os is such a central module.
If you start now and don't get agreement immediately, it may
well be that you cannot complete it until Python 2.3.
----------------------------------------------------------------------
Comment By: Sebastien Keim (s_keim)
Date: 2002-04-02 09:28
Message:
Logged In: YES
user_id=498191
I have taken a look at macpath, dospath and ntpath, and I found quite a lot of code duplication. What would your opinion be if I tried to do a little refactoring on this?
----------------------------------------------------------------------
Comment By: Tim Peters (tim_one)
Date: 2002-03-29 19:56
Message:
Logged In: YES
user_id=31435
I like it fine so far as it goes, but I'd like it a lot
more if it also patched the splitext and test
implementations for other platforms. It's not good that,
e.g., posixpath.py and ntpath.py get more and more out of
synch over time, and that their test suites also diverge.
----------------------------------------------------------------------
Comment By: Martin v. Löwis (loewis)
Date: 2002-03-29 10:49
Message:
Logged In: YES
user_id=21627
The patch looks good to me.
----------------------------------------------------------------------