[Patches] [Patch #103250] Optimize a strspn() out of startup
noreply@sourceforge.net
Thu, 18 Jan 2001 16:24:58 -0800
Patch #103250 has been updated.
Project: python
Category: None
Status: Closed
Submitted by: pj99
Assigned to : gvanrossum
Summary: Optimize a strspn() out of startup
Follow-Ups:
Date: 2001-Jan-18 16:24
By: gvanrossum
Comment:
OK, I've applied this, and am closing the patch.
It's actually disturbing that the parser uses isalnum() -- the language is
not *supposed* to accept non-ASCII letters in the Latin-1 character set,
but under certain circumstances, that may be the effect!
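
A minimal sketch of a locale-independent alternative, for illustration only.
The helper name is_ascii_name_char() is hypothetical and is not taken from the
Python sources; it also assumes an ASCII execution character set, where the
letters are contiguous.

```c
/* Hypothetical helper, not from the Python sources.
   Accepts only ASCII letters, digits and '_', whatever the locale is.
   Assumes an ASCII execution character set (contiguous letters). */
int is_ascii_name_char(int c)
{
    return (c >= 'a' && c <= 'z') ||
           (c >= 'A' && c <= 'Z') ||
           (c >= '0' && c <= '9') ||
           c == '_';
}
```

Unlike isalnum(), whose answer for a byte such as 0xE9 depends on the locale
in effect, this check rejects every byte outside the ASCII range.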
-------------------------------------------------------
Date: 2001-Jan-16 21:33
By: nobody
Comment:
At least the Python tokenizer uses (isalnum(c) || c == '_') ...
-------------------------------------------------------
Date: 2001-Jan-16 14:28
By: pj99
Comment:
The (isalpha(c) || isdigit(c) || c=='_') suggestion
seems to me to be the better idea. However, how
does this interact with localization and unicode and
such? Do we want exactly the list of specified
NAME_CHARS allowed here, or do we want a localized
set of alpha/digit/'_' chars allowed here?
As for the other 98.5%, I have a nice table from my
profiler, showing where the time is, but I don't know
how to put that table here, without mangling the format.
It shows about 19% of the time in malloc/free, about 15%
of the time in pthread code, about 14% in dictobject.c code,
and about 6% in acceler.c code (esp. fixstate()).
And of course loading an entire *.pyc file in a single fread(),
as is being pursued in patch 103252, saves perhaps 24% of what's
left. (Note, the 19, 15, 14 and 6 percents above are based on
what's left _after_ the 24% getc() optimization is applied.)
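
As a rough illustration of the single-fread() idea (not the code from patch
103252), the sketch below reads a whole stream into one buffer. The function
name read_whole_file() is hypothetical, and it assumes a seekable stream on
which fseek() to SEEK_END and ftell() report the file size, as on POSIX
systems.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: read all of fp into a malloc'd buffer with a
   single fread() instead of one getc() per byte.  Returns NULL on
   failure; on success stores the byte count in *len_out and the
   caller must free() the result. */
char *read_whole_file(FILE *fp, size_t *len_out)
{
    long size;
    char *buf;

    if (fseek(fp, 0L, SEEK_END) != 0)
        return NULL;
    size = ftell(fp);
    if (size < 0 || fseek(fp, 0L, SEEK_SET) != 0)
        return NULL;

    /* Allocate at least one byte so an empty file still gets a buffer. */
    buf = malloc(size > 0 ? (size_t)size : 1);
    if (buf == NULL)
        return NULL;

    if (fread(buf, 1, (size_t)size, fp) != (size_t)size) {
        free(buf);
        return NULL;
    }
    *len_out = (size_t)size;
    return buf;
}
```

The caller can then walk the buffer with a plain pointer, which avoids the
per-character call (and any per-call locking) that getc() may involve.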
*If* much of the malloc/free use is for lots of small
chunks, coming and going, as tends to happen with object
based code, then I could imagine a custom allocator (based
on top of malloc/free, so quite portable). This might
save much of the pthread cost as well, if the custom
allocator kept separate pools of small chunks of malloc'd
memory, per thread, avoiding locks (if that can be done?).
I have done work in the past in a C++/mmap environment that
reduced the costs of many small dynamic allocations to near
zero, in both space and time. But that relied on the optional second
argument to C++ operator delete(), which is the size of the object
being deleted, as well as on an AUTOGROW mmap region. This
is not applicable to C, nor is it particularly portable.
Still, something like this, perhaps requiring recoding of
some critical free() calls to use an optional method that
passed in a second sizeof argument, might be the next most
useful optimization.
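
To make the idea concrete, here is a small sketch of such an allocator layered
on malloc/free, where the free call takes the size as a second argument. All
names (small_pool, pool_alloc, pool_free) and the constants are assumptions
made for illustration; nothing here comes from the Python sources. It uses no
locks, on the assumption that each thread would own its own small_pool, and
it never hands pooled blocks back to malloc.

```c
#include <stdlib.h>

#define POOL_ALIGN   16    /* assumed size-class step; must be >= sizeof(void *) */
#define POOL_MAX     256   /* larger requests go straight to malloc/free */
#define POOL_CLASSES (POOL_MAX / POOL_ALIGN)

/* A freed block is reused as a link in its size class's free list. */
typedef struct pool_block { struct pool_block *next; } pool_block;

/* One pool per thread (assumption), so no locking is needed. */
typedef struct { pool_block *free_list[POOL_CLASSES]; } small_pool;

/* Map a size in 1..POOL_MAX to a class index in 0..POOL_CLASSES-1. */
static size_t pool_class(size_t size)
{
    return (size + POOL_ALIGN - 1) / POOL_ALIGN - 1;
}

void *pool_alloc(small_pool *p, size_t size)
{
    size_t k;
    pool_block *b;

    if (size == 0 || size > POOL_MAX)
        return malloc(size);
    k = pool_class(size);
    b = p->free_list[k];
    if (b != NULL) {                 /* reuse a previously freed block */
        p->free_list[k] = b->next;
        return b;
    }
    return malloc((k + 1) * POOL_ALIGN);   /* round up to the class size */
}

/* The caller passes the size it originally asked for, so the block's
   class is found without any per-block header. */
void pool_free(small_pool *p, void *ptr, size_t size)
{
    size_t k;
    pool_block *b;

    if (ptr == NULL)
        return;
    if (size == 0 || size > POOL_MAX) {
        free(ptr);
        return;
    }
    k = pool_class(size);
    b = ptr;
    b->next = p->free_list[k];
    p->free_list[k] = b;
}
```

A zero-initialized small_pool is an empty pool. The cost of the scheme is
that every pool_free() call site has to know the size it allocated, which is
the recoding of critical free() calls mentioned above.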
Someone else in the news thread that was parent to this
patch suggested an emacs-like dump facility, to capture the
state of the Python interpreter after it has loaded the
usual suspects. But I can't imagine that this would be
sufficiently portable or maintainable to be relevant.
As someone who has had to maintain O.S./library support
for and compatibility with this emacs dump facility, I
have grown to dislike it strongly.
-------------------------------------------------------
Date: 2001-Jan-15 23:25
By: nobody
Comment:
Why not (isalpha(c) || isdigit(c) || c=='_') ?
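
For comparison, a sketch of the two approaches side by side. The contents of
NAME_CHARS and both function names are assumptions made for illustration; the
thread mentions NAME_CHARS but does not show its definition, and this is not
the patched code itself.

```c
#include <ctype.h>
#include <string.h>

/* Assumed contents; the thread names NAME_CHARS but does not show it. */
#define NAME_CHARS \
    "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ_abcdefghijklmnopqrstuvwxyz"

/* Hypothetical: true if every character of s is in NAME_CHARS.
   strspn() scans the accept set for each character of s. */
int all_name_chars_strspn(const char *s)
{
    return s[strspn(s, NAME_CHARS)] == '\0';
}

/* Hypothetical: the same test using the suggested ctype predicates.
   The cast to unsigned char avoids undefined behaviour for negative
   char values.  The result depends on the current locale. */
int all_name_chars_ctype(const char *s)
{
    for (; *s != '\0'; s++) {
        unsigned char c = (unsigned char)*s;
        if (!(isalpha(c) || isdigit(c) || c == '_'))
            return 0;
    }
    return 1;
}
```

In the default "C" locale the two functions agree. Under another locale the
ctype version may also accept non-ASCII letters, while the strspn() version
accepts exactly the listed characters.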
-------------------------------------------------------
Date: 2001-Jan-15 19:35
By: gvanrossum
Comment:
Looks cool to me.
Do you have an idea what to do about the other 98.5%? :)
-------------------------------------------------------
For more info, visit:
http://sourceforge.net/patch/?func=detailpatch&patch_id=103250&group_id=5470