[Patches] [ python-Patches-1454481 ] Make thread stack size runtime tunable

SourceForge.net noreply at sourceforge.net
Wed Jun 14 04:05:21 CEST 2006


Patches item #1454481, was opened at 2006-03-20 23:37
Message generated for change (Comment added) made by aimacintyre
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=1454481&group_id=5470

Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
Category: Core (C code)
Group: Python 2.5
>Status: Pending
>Resolution: Accepted
Priority: 5
Submitted By: Andrew I MacIntyre (aimacintyre)
>Assigned to: Andrew I MacIntyre (aimacintyre)
Summary: Make thread stack size runtime tunable

Initial Comment:
Platform default thread stack sizes vary considerably.
Some are very generous (Win32: usually 1MB; Linux: 1MB,
sometimes 8MB).  Others are not (FreeBSD: 64k).

Some platforms have restricted virtual address space
OS/2: 512M less overhead) which makes hard coding a
generous default thread stack size problematic.  Some
platforms thread commit stack address space, even
though the memory backing it may not be committed
(Windows, OS/2 at least).

Some applications have a thirst for stack space in
threads (Zope). Some programmers want to be able to use
lots of threads, even in the face of sound advice about
the lack of wisdom in this approach.

The current approach to stack space management in
threads in Python uses a hard coded strategy, relying
on the platform having a useful default or relying on
the system administrator or distribution builder
over-riding the default at compile time.

This patch is intended to allow developers some control
over managing this resource from within Python code by
way of a function in the thread module.  As written, it
is not intended to provide unlimited flexibility; that
would probably require exposing the underlying
mechanism as an option on the creation of each thread.

An alternative approach to providing the functionality
would be to use an environment variable to provide the
information to the thread module.  This has its pros
and cons, in terms of flexibility and ease of use, and
could be complementary to the approach implemented.

The patch has been tested on OS/2 and FreeBSD 4.8.  I
have no means of testing the code on Win32 or Linux,
though Linux is a pthread environment as is FreeBSD. 
Code base is SVN head from a few hours ago. A doc 
update is included.

While I would like to see this functionality in Python
2.5, it is not a critical issue.

Critique of the approach and implementation welcome. 
Something not addressed is the issue of tests,
primarily because I haven't been able to think of a
viable testing strategy - I'm all ears to suggestions
for this.

----------------------------------------------------------------------

>Comment By: Andrew I MacIntyre (aimacintyre)
Date: 2006-06-14 12:05

Message:
Logged In: YES 
user_id=250749

Checked in to trunk after further revision in r46919, with
minor test tweaks in r46925 & r46929.

----------------------------------------------------------------------

Comment By: Andrew I MacIntyre (aimacintyre)
Date: 2006-05-26 00:47

Message:
Logged In: YES 
user_id=250749

Ok, v3 includes the additions to the threading module, tests
in both test_thread and test_threading and docs in both
thread and threading modules (duplicated as I don't know how
to do the LaTex linking).

If there are no other issues needing to be addressed, I
propose to check these changes in sometime on the weekend of
June 3-4 or thereabouts to get in a bit before the beta release.

----------------------------------------------------------------------

Comment By: Tim Peters (tim_one)
Date: 2006-04-24 06:25

Message:
Logged In: YES 
user_id=31435

Right, this one: "a simple shadow of the function as a
module level function".  If it affects all threads (which it
does), then a module function is a natural place for it.  If
I a saw a method on the Thread class, the most natural (to
me ;-)) assumption is that a_thread.stack_size(N) would set
the stack size for the specific thread `a_thread`, but not
affect other threads.  Part of what makes that "the most
natural" assumption is that Thread has no class or static
methods today.  As a module-level function, no such
confusion is sanely possible.

Sticking "stack_size" in threading.__all__, and adding

from thread import stack_size

to threading.py is all I'm looking for here.  Well, plus
docs and a test case ;-)

----------------------------------------------------------------------

Comment By: Andrew I MacIntyre (aimacintyre)
Date: 2006-04-23 15:35

Message:
Logged In: YES 
user_id=250749

Thanks Tim.

My default action is to try and match the prevailing style,
but  cut'n'paste propagated the flaw.  thread_pthread.h was
clean AFAICS, so I'll do a style normalisation (as a
separate checkin) on thread_nt.py and thread_os2.h when
commit time comes.

As an "implementation detail", I hadn't considered that
exposing it via threading was appropriate.

I can see 2 approaches:
- a simple shadow of the function as a module level function;
or
- a classmethod of the Thread class.

Any hints on which would be the more preferable or natural
approach?

----------------------------------------------------------------------

Comment By: Tim Peters (tim_one)
Date: 2006-04-22 15:46

Message:
Logged In: YES 
user_id=31435

The patch applies cleanly on WinXP, "and works" (I checked
this by setting various stack sizes, spawning a thread doing
nothing but a raw_input(), and looking at the VM size under
Task Manager while the thread was paused waiting for input
-- the VM size went up each time roughly by the stack-size
increase; finally set stack_size to 0 again, and all the
"extra" VM went away).

Note that Python C style for defining functions puts the
function name in the first column.  For example,

"""
static int
_pythread_nt_set_stacksize(size_t size)
"""

instead of

"""
static int _pythread_nt_set_stacksize(size_t size)
"""

The patch isn't consistent about this, and perhaps it's
errenously ;-) aping bad style in surrounding function
definitions.

This should really be exposed via threading.py.  `thread` is
increasingly "just an implementation detail" of `threading`,
and it actually felt weird to me to write a test program
that had to import `thread`.

----------------------------------------------------------------------

Comment By: Andrew I MacIntyre (aimacintyre)
Date: 2006-04-14 22:51

Message:
Logged In: YES 
user_id=250749

I have updated the patch along the lines Martin suggested.

I have omitted OS/2 from the list of supported platforms in
the doc patch as I haven't added OS/2 to anywhere else in
the docs.  My thinging has been that OS/2 is a 2nd tier
platform, and I have kept an extensive port README file in
the build directory (PC/os2emx) documenting port specific
behaviour.

The idea with the environment variable version was that it
would be less "intrusive" a change from the user POV.

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2006-04-11 01:09

Message:
Logged In: YES 
user_id=21627

re 1) Currently, the usage of the stacksize attribute is
depending on the definition of a THREAD_STACK_SIZE macro. I
don't know where that comes from, but I guess whoever
defines it knows what he is doing, so that the stacksize
attribute is defined on such a system.

re 2) I can accept that Python enforces a minimum above
PTHREAD_STACK_MIN; it shouldn't be possible to set the stack
size below PTHREAD_STACK_MIN, since that *will* fail when a
thread is created.

-1 for an environment variable version. What problem would
that solve? If this patch gets implemented, applications can
define their own environment variables if they think it
helps, and users/admins can put something in
sitecustomize.py if they think there should be an
environment variable controlling the stack size for all
Python applications on the system.

----------------------------------------------------------------------

Comment By: Andrew I MacIntyre (aimacintyre)
Date: 2006-04-11 00:45

Message:
Logged In: YES 
user_id=250749

1) wrt _POSIX_THREAD_ATTR_STACKSIZE, I'll look at that
(though I note its absence from the existing code...)

2) PTHREAD_STACK_MIN on FreeBSD is 1k, which seemed grossly
inadequate for Python (my impression is that 20-32k is a
fairly safe minimum for Python).  In principle I don't have
a problem 
with relying on PTHREAD_STACK_MIN, except for trying to play
it safe.  Any further thoughts on this?

I'm also putting together an environment variable only
version of the patch, with a view to getting that in first,
and reworking this patch to work on top of that.

Thanks for the comments.

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2006-04-10 23:41

Message:
Logged In: YES 
user_id=21627

Usage of pthread_attr_setstacksize should be conditional on
the definition of _POSIX_THREAD_ATTR_STACKSIZE, according to
POSIX. Errors from pthread_attr_setstacksize should be
reported (POSIX lists EINVAL as a possible error).

I think PTHREAD_STACK_MIN should be considered. 

The documentation should list availibility of the feature,
currently Win32, OS/2, and POSIX threads (with the TSS
option, to be precise). If some platforms have specific
additional requirements on the possible values (eg. must be
a multiple of the page size), these should be documented, as
well.

Apart from that, the patch looks fine.

----------------------------------------------------------------------

Comment By: Andrew I MacIntyre (aimacintyre)
Date: 2006-03-22 19:28

Message:
Logged In: YES 
user_id=250749

Thanks for the comments.

As implemented, the function is both a getter and
(optionally) a setter which makes attempting to use a
"get"/"set" prefix 
awkward.

I chose this approach to make it a little simpler to support
temporary changes.  I did consider using a module
attribute/variable, but it is slightly more unwieldy for
this case:

old_size = thread.stack_size(new_size)
...
thread.stack_size(old_size)

vs

old_size = thread.stack_size
thread.stack_size = new_size
...
thread.stack_size = old_size

or (using get/set accessors)

old_size = thread.get_stacksize()
thread.set_stacksize(new_size)
...
thread.set_stacksize(old_size)

I think an argument can be made for passing on the
"get"/"set" naming consistency based on the guidelines in
PEP 8.  While I have a preference for what I've implemented,
I'm more interested in getting the functionality in than
debating its decor.  If there's a strong view about these 
issues, I'm prepared to revise the patch accordingly.

I don't believe that the functionality belongs anywhere else
than the thread module, except possibly shadowing it in the
threading module, as it is highly specific to thread
support.  The sys module seems more appropriate for general 
knobs, and only for specific knobs when there is no other
choice IMO.  Doing it outside the thread module also
complicates the implementation, which I was trying to keep
as simple as I could.


----------------------------------------------------------------------

Comment By: Hye-Shik Chang (perky)
Date: 2006-03-21 00:58

Message:
Logged In: YES 
user_id=55188

I'm all for this!  The FreeBSD port have maintained a local
patch to bump THREAD_STACK_SIZE.  The patch will lighten
FreeBSD users' burden around thread stack size.

BTW, the naming, "thread.stack_size" seems to miss a verb
while all the other functions on the thread module have it.
 How about set_stack_size() or set_stacksize()?  Or, how
about in sys module?

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=1454481&group_id=5470


More information about the Patches mailing list