Fastest way to calculate leading whitespace
dasacc22
dasacc22 at gmail.com
Wed May 12 00:10:46 EDT 2010
On May 10, 2:25 am, Stefan Behnel <stefan... at behnel.de> wrote:
> Stefan Behnel, 10.05.2010 08:54:
>
> > dasacc22, 08.05.2010 19:19:
> >> This is a simple question. I'm looking for the fastest way to
> >> calculate the leading whitespace (as a string, ie ' ').
>
> > Here is an (untested) Cython 0.13 solution:
>
> > from cpython.unicode cimport Py_UNICODE_ISSPACE
>
> > def leading_whitespace(unicode ustring):
> >     cdef Py_ssize_t i
> >     cdef Py_UNICODE uchar
>
> >     for i, uchar in enumerate(ustring):
> >         if not Py_UNICODE_ISSPACE(uchar):
> >             return ustring[:i]
> >     return ustring
>
> > Cython compiles this to the obvious C code, so this should be impossible
> > to beat in plain Python code.
>
> ... and it is. For a simple string like
>
> u = u" abcdefg" + u"fsdf"*20
>
> timeit gives me this for "s=u.lstrip(); u[:-len(s)]":
>
> 1000000 loops, best of 3: 0.404 usec per loop
>
> and this for "leading_whitespace(u)":
>
> 10000000 loops, best of 3: 0.0901 usec per loop
>
> It's closer for the extreme case of an all whitespace string like " "*60,
> where I get this for the lstrip variant:
>
> 1000000 loops, best of 3: 0.277 usec per loop
>
> and this for the Cython code:
>
> 10000000 loops, best of 3: 0.177 usec per loop
>
> But I doubt that this is the main use case of the OP.
>
> Stefan
Indeed. Actually, I've been going back and forth on the idea of using
Cython for some of the more intensive portions. That bit of code looks
really simple, so I think I'll give Cython a shot. The only catch is
that I need to be able to use whatever the latest Cython available via
easy_install is, but this should prove an interesting experience.
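
For comparison, here is a rough pure-Python sketch of the two approaches discussed above. The function names are made up for illustration and are not from the thread. One thing worth noting: the timed expression u[:-len(s)] returns an empty string when the input is all whitespace (s is empty, and u[:-0] is u[:0]), so the sketch slices by the length difference instead. The timings it prints will of course depend on the machine and Python version.

```python
import timeit


def leading_whitespace_lstrip(s):
    """Return the leading whitespace of s, using str.lstrip()."""
    stripped = s.lstrip()
    # Slice by the length difference. s[:-len(stripped)] would give ""
    # for an all-whitespace string, because -0 == 0.
    return s[:len(s) - len(stripped)]


def leading_whitespace_loop(s):
    """Pure-Python counterpart of the quoted Cython loop."""
    for i, ch in enumerate(s):
        if not ch.isspace():
            return s[:i]
    # No non-whitespace character found: the whole string is whitespace.
    return s


if __name__ == "__main__":
    # Hypothetical test string, similar in shape to the one timed above.
    u = "    abcdefg" + "fsdf" * 20
    for func in (leading_whitespace_lstrip, leading_whitespace_loop):
        seconds = timeit.timeit(lambda: func(u), number=100000)
        print(func.__name__, seconds)
```

Both functions return the same result for ordinary, all-whitespace, and empty strings, so either can serve as a baseline before trying the Cython version.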