[Patches] [Patch #103250] Optimize a strspn() out of startup

noreply@sourceforge.net
Thu, 18 Jan 2001 16:24:58 -0800


Patch #103250 has been updated. 

Project: python
Category: None
Status: Closed
Submitted by: pj99
Assigned to : gvanrossum
Summary: Optimize a strspn() out of startup

Follow-Ups:

Date: 2001-Jan-18 16:24
By: gvanrossum

Comment:
OK, I've applied this, and am closing the patch.

It's actually disturbing that the parser uses isalnum() -- the language is
not *supposed* to accept non-ASCII letters in the Latin-1 character set,
but under certain circumstances, that may be the effect!
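
One way to keep such a check independent of the C locale is to compare
against the ASCII ranges directly instead of calling isalnum(). What
follows is only a minimal sketch, assuming an ASCII-compatible execution
character set. The function names are hypothetical and are not taken from
the Python sources.

```c
#include <stddef.h>

/* Locale-independent test: only ASCII letters, ASCII digits and '_' pass.
   The range comparisons assume an ASCII-compatible character set. */
int is_ascii_name_char(int c)
{
    return (c >= 'a' && c <= 'z') ||
           (c >= 'A' && c <= 'Z') ||
           (c >= '0' && c <= '9') ||
           c == '_';
}

/* Return 1 if every character of s passes the test above (an empty
   string trivially does), 0 otherwise. */
int all_ascii_name_chars(const char *s)
{
    for (; *s != '\0'; s++) {
        /* Go through unsigned char so bytes >= 0x80 are not negative. */
        if (!is_ascii_name_char((unsigned char)*s))
            return 0;
    }
    return 1;
}
```

Unlike isalnum(), this rejects a Latin-1 byte such as 0xE9 regardless of
what setlocale() has been called with.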

-------------------------------------------------------

Date: 2001-Jan-16 21:33
By: nobody

Comment:
At least the Python tokenizer uses (isalnum(c) || c == '_') ...

-------------------------------------------------------

Date: 2001-Jan-16 14:28
By: pj99

Comment:
The (isalpha(c) || isdigit(c) || c=='_') suggestion
seems to me to be the better idea.  However, how
does this interact with localization and unicode and
such?  Do we want exactly the list of specified
NAME_CHARS allowed here, or do we want a localized
set of alpha/digit/'_' chars allowed here?

As for the other 98.5%, I have a nice table from my
profiler, showing where the time is, but I don't know
how to put that table here, without mangling the format.
It shows about 19% of the time in malloc/free, about 15%
of the time in pthread code, about 14% in dictobject.c code,
and about 6% in acceler.c code (esp. fixstate()).

And of course loading an entire *.pyc file in a single fread()
instead of many getc() calls, as being pursued in patch 103252,
saves perhaps 24% of what's left.  (Note, the 19, 15, 14 and 6
percents above are based on what's left _after_ the 24% getc()
optimization is applied.)
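
The single-fread() idea can be sketched roughly as follows.  This is not
the code from patch 103252; read_whole_file() is a hypothetical helper,
and it assumes that fseek()/ftell() report the size of a binary file,
which holds on common platforms but is not guaranteed by the C standard.

```c
#include <stdio.h>
#include <stdlib.h>

/* Read an entire file into a malloc'd buffer with one fread() call.
   Returns NULL on failure.  On success the length is stored in *len_out
   and the caller must free() the returned buffer. */
unsigned char *read_whole_file(const char *path, size_t *len_out)
{
    FILE *fp = fopen(path, "rb");
    unsigned char *buf = NULL;
    long size;

    if (fp == NULL)
        return NULL;
    /* Find the file size, then rewind to the start. */
    if (fseek(fp, 0L, SEEK_END) == 0 && (size = ftell(fp)) >= 0 &&
        fseek(fp, 0L, SEEK_SET) == 0) {
        buf = malloc((size_t)size + 1);  /* +1 avoids a zero-byte request */
        if (buf != NULL && fread(buf, 1, (size_t)size, fp) != (size_t)size) {
            free(buf);                   /* short read: treat as failure */
            buf = NULL;
        }
        if (buf != NULL)
            *len_out = (size_t)size;
    }
    fclose(fp);
    return buf;
}
```

The consumer then walks the buffer with a pointer rather than paying for
a getc() call per byte.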

*If* much of the malloc/free use is for lots of small
chunks, coming and going, as tends to happen with object
based code, then I could imagine a custom allocator (based
on top of malloc/free, so quite portable).  This might
save much of the pthread cost as well, if the custom
allocator kept separate pools of small chunks of malloc'd
memory, per thread, avoiding locks (if that can be done?).
I have done work in the past in a C++/mmap environment that
reduced the costs of many small dynamic allocations to near
zero, in both space and time.  But that relied on the optional
second argument to C++ operator delete(), which is the size of
the object being deleted, as well as on an AUTOGROW mmap region.
This is not applicable to C, nor is it particularly portable.
Still, something like this, perhaps requiring recoding of
some critical free() calls to use an optional method that
passed in a second sizeof argument, might be the next most
useful optimization.
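
A rough sketch of such an allocator on top of malloc/free, with a free()
that takes the size as a second argument.  Everything here (pool_alloc,
pool_free, the 16-byte size classes, the 256-byte cutoff) is a
hypothetical illustration, not a measured design and not anything in the
Python sources.  It keeps one set of free lists and is not thread-safe.

```c
#include <stddef.h>
#include <stdlib.h>

#define POOL_ALIGN    16                /* width of one size class */
#define POOL_MAX_SIZE 256               /* larger requests go to malloc */
#define POOL_CLASSES  (POOL_MAX_SIZE / POOL_ALIGN)

/* A free block stores the list link in its own memory. */
typedef struct pool_block {
    struct pool_block *next;
} pool_block;

/* One free list per size class. */
static pool_block *free_lists[POOL_CLASSES];

/* Map sizes 1..16 to class 0, 17..32 to class 1, and so on. */
static size_t pool_class(size_t size)
{
    return (size - 1) / POOL_ALIGN;
}

void *pool_alloc(size_t size)
{
    size_t cls;
    pool_block *b;

    if (size == 0)
        size = 1;
    if (size > POOL_MAX_SIZE)
        return malloc(size);
    cls = pool_class(size);
    b = free_lists[cls];
    if (b != NULL) {                    /* reuse a cached block */
        free_lists[cls] = b->next;
        return b;
    }
    /* Nothing cached: get a block of the full class size from malloc. */
    return malloc((cls + 1) * POOL_ALIGN);
}

/* The caller passes the size it originally asked for, so no per-block
   header is needed to find the right free list. */
void pool_free(void *p, size_t size)
{
    pool_block *b = p;
    size_t cls;

    if (p == NULL)
        return;
    if (size == 0)
        size = 1;
    if (size > POOL_MAX_SIZE) {
        free(p);
        return;
    }
    cls = pool_class(size);
    b->next = free_lists[cls];          /* push onto the class free list */
    free_lists[cls] = b;
}
```

Small blocks are never handed back to malloc in this sketch, so it trades
memory for speed.  Giving each thread its own free_lists array (for
example through thread-local storage, where the compiler supports it)
would be one way to try the lock-free, per-thread pools described above.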

Someone else in the news thread that was parent to this
patch suggested an emacs-like dump facility, to capture the
state of the Python interpreter after it has loaded the
usual suspects.  But I can't imagine that this would be
sufficiently portable or maintainable to be relevant.
As someone who has had to maintain O.S./library support
for and compatibility with this emacs dump facility, I
have grown to dislike it strongly.

-------------------------------------------------------

Date: 2001-Jan-15 23:25
By: nobody

Comment:
Why not (isalpha(c) || isdigit(c) || c=='_') ?
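
For comparison, the two styles of check side by side.  This is a sketch
only: the contents of NAME_CHARS below are an assumption (ASCII letters,
digits and '_'), the function names are hypothetical, and neither
function is claimed to be the code in the patch.

```c
#include <ctype.h>
#include <string.h>

/* Assumed contents; the real definition is not shown in this thread. */
#define NAME_CHARS \
    "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ_abcdefghijklmnopqrstuvwxyz"

/* strspn() version: the length of the leading run of NAME_CHARS must
   reach the terminating NUL. */
int all_name_chars_strspn(const char *s)
{
    return s[strspn(s, NAME_CHARS)] == '\0';
}

/* <ctype.h> version: one test per character.  The result of isalpha()
   depends on the current C locale. */
int all_name_chars_ctype(const char *s)
{
    for (; *s != '\0'; s++) {
        unsigned char c = (unsigned char)*s;
        if (!(isalpha(c) || isdigit(c) || c == '_'))
            return 0;
    }
    return 1;
}
```

In the default "C" locale the two agree.  After a setlocale() call to a
Latin-1 locale, the ctype version may accept accented letters that the
strspn() version rejects.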

-------------------------------------------------------

Date: 2001-Jan-15 19:35
By: gvanrossum

Comment:
Looks cool to me.

Do you have an idea what to do about the other 98.5%? :)
-------------------------------------------------------

-------------------------------------------------------
For more info, visit:

http://sourceforge.net/patch/?func=detailpatch&patch_id=103250&group_id=5470