performance problem with time.strptime()
Nils Rüttershoff
nils at ccsg.de
Thu Jul 2 09:00:11 EDT 2009
Hi Casey
Casey Webster wrote:
> On Jul 2, 7:30 am, Nils Rüttershoff <n... at ccsg.de> wrote:
>
>
>> Rec = re.compile(r"^\d{1,3}.\d{1,3}.\d{1,3}.\d{1,3}\s-\s\d+\s\[(\d{2}/\w+/\d{4}:\d{2}:\d{2}:\d{2})\s\+\d{4}\].*")
>> Line = '1.2.3.4 - 4459 [02/Jul/2009:01:50:26 +0200] "GET /foo HTTP/1.0" 200 - "-" "www.example.org" "-" "-" "-"'
>>
>
> I'm not sure how much it will help but if you are only using the regex
> to get the date/time group element, it might be faster to replace the
> regex with:
>
>
>>>> date_string = Line.split()[3][1:-1]
>>>>
Indeed, this would give a small speedup (approx. 3-4 sec over 1,000,000
iterations), but that is only a small piece of the cake. Thanks anyway :)
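For reference, a minimal sketch of the split-based extraction applied to the sample line from above. The names fields and utc_offset are only illustrative. One detail to watch: split() puts the closing bracket on the following field ('+0200]'), so the fourth field needs only its leading '[' removed; slicing it with [1:-1] would also cut off the last digit of the seconds.

```python
# Sample log line from the quoted message.
Line = ('1.2.3.4 - 4459 [02/Jul/2009:01:50:26 +0200] "GET /foo HTTP/1.0" '
        '200 - "-" "www.example.org" "-" "-" "-"')

fields = Line.split()
# fields[3] is '[02/Jul/2009:01:50:26' and fields[4] is '+0200]'.
date_string = fields[3][1:]    # drop only the leading '['
utc_offset = fields[4][:-1]    # drop only the trailing ']'

print("%s %s" % (date_string, utc_offset))
```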
The problem is that time.strptime() consults locale.py on every
iteration (see _getlang, getlocale and normalize in the second trace).
Here is the whole cProfile trace, first with epoch and then with
strptime (condensed):
5000009 function calls in 33.084 CPU seconds
Ordered by: standard name
   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.000    0.000   33.084   33.084 <string>:1(<module>)
        1    2.417    2.417   33.084   33.084 <timeit-src>:2(inner)
  1000000    9.648    0.000   30.667    0.000 time_test.py:30(epoch)
        1    0.000    0.000   33.084   33.084 timeit.py:177(timeit)
  1000000    3.711    0.000    3.711    0.000 {built-in method groupdict}
  1000000    4.318    0.000    4.318    0.000 {built-in method match}
        1    0.000    0.000    0.000    0.000 {gc.disable}
        1    0.000    0.000    0.000    0.000 {gc.enable}
        1    0.000    0.000    0.000    0.000 {gc.isenabled}
  1000000    7.764    0.000    7.764    0.000 {map}
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}
  1000000    5.225    0.000    5.225    0.000 {time.mktime}
        2    0.000    0.000    0.000    0.000 {time.time}
################################################################
29000009 function calls in 124.449 CPU seconds
Ordered by: standard name
   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.000    0.000  124.449  124.449 <string>:1(<module>)
        1    2.244    2.244  124.449  124.449 <timeit-src>:2(inner)
  1000000    3.500    0.000   33.559    0.000 _strptime.py:27(_getlang)
  1000000   41.814    0.000  100.754    0.000 _strptime.py:295(_strptime)
  1000000    4.010    0.000  104.764    0.000 _strptime.py:453(_strptime_time)
  1000000   11.647    0.000   19.529    0.000 locale.py:316(normalize)
  1000000    3.638    0.000   23.167    0.000 locale.py:382(_parse_localename)
  1000000    5.120    0.000   30.059    0.000 locale.py:481(getlocale)
  1000000    7.242    0.000  122.205    0.000 time_test.py:37(strptime)
        1    0.000    0.000  124.449  124.449 timeit.py:177(timeit)
  1000000    1.771    0.000    1.771    0.000 {_locale.setlocale}
  1000000    1.735    0.000    1.735    0.000 {built-in method __enter__}
  1000000    1.626    0.000    1.626    0.000 {built-in method end}
  1000000    3.854    0.000    3.854    0.000 {built-in method groupdict}
  1000000    1.646    0.000    1.646    0.000 {built-in method group}
  2000000    8.409    0.000    8.409    0.000 {built-in method match}
        1    0.000    0.000    0.000    0.000 {gc.disable}
        1    0.000    0.000    0.000    0.000 {gc.enable}
        1    0.000    0.000    0.000    0.000 {gc.isenabled}
  2000000    2.942    0.000    2.942    0.000 {len}
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}
  3000000    4.552    0.000    4.552    0.000 {method 'get' of 'dict' objects}
  1000000    2.072    0.000    2.072    0.000 {method 'index' of 'list' objects}
  1000000    1.517    0.000    1.517    0.000 {method 'iterkeys' of 'dict' objects}
  2000000    3.113    0.000    3.113    0.000 {method 'lower' of 'str' objects}
  2000000    3.233    0.000    3.233    0.000 {method 'replace' of 'str' objects}
  2000000    2.953    0.000    2.953    0.000 {method 'toordinal' of 'datetime.date' objects}
  1000000    1.476    0.000    1.476    0.000 {method 'weekday' of 'datetime.date' objects}
  1000000    4.332    0.000  109.097    0.000 {time.strptime}
        2    0.000    0.000    0.000    0.000 {time.time}
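
Since the second trace spends its time inside _strptime.py and locale.py, one way around it is to parse the timestamp fields by hand and never call time.strptime(). The following is only a sketch under assumptions, not the epoch function from time_test.py: the names TIMESTAMP, MONTHS and parse_epoch are hypothetical, month abbreviations are assumed to be English, and instead of time.mktime (which interprets the fields as local time) it uses calendar.timegm and subtracts the logged UTC offset.

```python
import calendar
import re

# Assumption: the log always uses English three-letter month names.
MONTHS = {"Jan": 1, "Feb": 2, "Mar": 3, "Apr": 4, "May": 5, "Jun": 6,
          "Jul": 7, "Aug": 8, "Sep": 9, "Oct": 10, "Nov": 11, "Dec": 12}

# Hypothetical pattern: captures every timestamp field separately,
# e.g. [02/Jul/2009:01:50:26 +0200]
TIMESTAMP = re.compile(
    r"\[(\d{2})/(\w{3})/(\d{4}):(\d{2}):(\d{2}):(\d{2}) ([+-])(\d{2})(\d{2})\]")


def parse_epoch(line):
    """Return seconds since the epoch (UTC) for the timestamp in line."""
    m = TIMESTAMP.search(line)
    if m is None:
        raise ValueError("no timestamp found in %r" % (line,))
    day, mon, year, hh, mm, ss, sign, off_h, off_m = m.groups()
    # timegm treats the tuple as UTC, so no locale or timezone lookup occurs.
    seconds = calendar.timegm((int(year), MONTHS[mon], int(day),
                               int(hh), int(mm), int(ss), 0, 0, 0))
    offset = int(off_h) * 3600 + int(off_m) * 60
    if sign == "-":
        offset = -offset
    # The logged time is local time at the given offset; convert to UTC.
    return seconds - offset


Line = ('1.2.3.4 - 4459 [02/Jul/2009:01:50:26 +0200] "GET /foo HTTP/1.0" '
        '200 - "-" "www.example.org" "-" "-" "-"')
print(parse_epoch(Line))
```

The dictionary lookup replaces the locale-dependent month-name matching, which is also why this only works for logs written with English month names.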