Fastest way to calculate leading whitespace

dasacc22 dasacc22 at gmail.com
Wed May 12 00:10:46 EDT 2010


On May 10, 2:25 am, Stefan Behnel <stefan... at behnel.de> wrote:
> Stefan Behnel, 10.05.2010 08:54:
>
> > dasacc22, 08.05.2010 19:19:
> >> This is a simple question. I'm looking for the fastest way to
> >> calculate the leading whitespace (as a string, i.e. ' ').
>
> > Here is an (untested) Cython 0.13 solution:
>
> >     from cpython.unicode cimport Py_UNICODE_ISSPACE
>
> >     def leading_whitespace(unicode ustring):
> >         cdef Py_ssize_t i
> >         cdef Py_UNICODE uchar
>
> >         for i, uchar in enumerate(ustring):
> >             if not Py_UNICODE_ISSPACE(uchar):
> >                 return ustring[:i]
> >         return ustring
>
> > Cython compiles this to the obvious C code, so this should be impossible
> > to beat in plain Python code.
>
> ... and it is. For a simple string like
>
>      u = u"   abcdefg" + u"fsdf"*20
>
> timeit gives me this for "s=u.lstrip(); u[:-len(s)]":
>
> 1000000 loops, best of 3: 0.404 usec per loop
>
> and this for "leading_whitespace(u)":
>
> 10000000 loops, best of 3: 0.0901 usec per loop
>
> It's closer for the extreme case of an all whitespace string like " "*60,
> where I get this for the lstrip variant:
>
> 1000000 loops, best of 3: 0.277 usec per loop
>
> and this for the Cython code:
>
> 10000000 loops, best of 3: 0.177 usec per loop
>
> But I doubt that this is the main use case of the OP.
>
> Stefan

Indeed. I've actually been going back and forth on the idea of using
Cython for some of the more intensive portions. That bit of code looks
really simple, so I think I'll give Cython a shot. The only catch is that
I need to be able to use whatever the latest Cython available via
easy_install is, but this should prove an interesting experience.
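
For comparison, here is a minimal pure-Python sketch of the two approaches
discussed above. The function names are made up for illustration and are
not from the thread. The first is the lstrip() idea, written with a
non-negative slice end so that it also holds for an all-whitespace string
(there, u[:-len(s)] turns into u[:-0], which is the empty string). The
second is a plain-Python counterpart of the quoted Cython loop, assuming
str.isspace() is an acceptable stand-in for Py_UNICODE_ISSPACE.

```python
def leading_whitespace_lstrip(s):
    """Return the leading whitespace of s using lstrip() and slicing."""
    # len(s) - len(s.lstrip()) is the number of leading whitespace
    # characters, so the slice end is never negative.
    return s[:len(s) - len(s.lstrip())]


def leading_whitespace_loop(s):
    """Scan s and return everything before the first non-space character."""
    for i, ch in enumerate(s):
        if not ch.isspace():
            return s[:i]
    # No non-space character found: the whole string is whitespace.
    return s
```

Both should return the same prefix for the same input. Which one is faster
will depend on the interpreter and the data, so it is worth checking with
timeit on representative strings.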


