The current behavior of the atexit module is that if any of the exit
handlers raises an exception, the remaining handlers are not run. Greg
Chapman posted a bug report about this:
http://www.python.org/sf/1052242
Greg proposed catching any exceptions and continuing so that all exit
handlers at least have a chance to run, and Raymond agrees with him. I
attached a patch to the ticket that adds a flag to determine the behavior, on
the principle that atexit has been around long enough that someone out there
probably relies on the early exit behavior. This is the old Python chestnut
of using a flag to preserve existing behavior as the default while allowing
users to set the flag to get the new behavior.
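For concreteness, here's a rough sketch of the two behaviors, loosely modeled
on atexit's internal _run_exitfuncs(); the "run_all" flag is a made-up name
for illustration, not what the patch actually uses:

    import traceback

    _exithandlers = []          # (func, args, kwargs) tuples, as in atexit.py

    def _run_exitfuncs(run_all=False):
        while _exithandlers:
            func, args, kwargs = _exithandlers.pop()
            try:
                func(*args, **kwargs)
            except SystemExit:
                raise
            except:
                if not run_all:
                    raise               # current behavior: first failure stops the rest
                traceback.print_exc()   # proposed behavior: report it and keep going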
I'm happy to go either way, but thought perhaps a quick poll of the troops
might be in order, hence this note.
Skip
This one came up while working on ZODB:
weakref callback vs. gc vs. threads
http://www.python.org/sf/1055820
Short course: in the presence of weakrefs, cyclic gc is still hosed (it
turns out that neither threads nor weakref callbacks are necessary to get
hosed).
temp2a.py there demonstrates there's a problem, but in an unclear way
(hundreds of objects, hundreds of weakrefs and weakref callbacks (all via
WeakValueDictionary internals), 3 threads). OTOH, there's nothing clever or
tricky about it. Sooner or later, it just fails (an accessible instance of
a user-defined class gets its __dict__ cleared "by magic").
temp2b.py reduces it to 2 objects and 1 thread. This is contrived, but is
deterministic.
temp2c.py boosts it to 3 objects, and is a nightmare: it shows that the
problem can occur during a gc collection that doesn't see *any* objects
having a weakref with a callback. There is a weakref with a callback here,
but it's attached to an object in an older generation, and collection of a
younger generation triggers that callback indirectly. Because this is such
a nasty case (no amount of analysis of the objects in the generation being
collected can deduce that it's possible for a weakref callback to run),
there are extensive comments and an ASCII-art diagram in the file.
Even worse, temp2d.py shows we can get in trouble even if there's never a
weakref with a callback. It's enough to have one weakref (without a
callback), and one object with a __del__ method.
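To make the shape of that last case concrete, here's a rough sketch of the
arrangement: a cycle, a global weakref (no callback) into it, and a
__del__-bearing object whose only reference is hidden inside the cycle. This
is illustrative only, it is not temp2d.py; what the weakref actually sees
mid-collection depends on the interpreter, and the temp2*.py files attached
to the report are the real demonstrations.

    import gc, weakref

    class Node(object):
        pass

    class Finalizer(object):
        # Has __del__, but is not itself part of any cycle.
        def __del__(self):
            # Arbitrary Python code running while gc tears the cycle down;
            # it may still reach the cyclic trash via the global weakref.
            print("weakref sees: %r" % (wr(),))

    # Park the finalizer in an older generation so a collection of the
    # youngest generation never examines it directly.
    keeper = Finalizer()
    gc.collect()

    # Build cyclic trash in the youngest generation: a cycle a <-> b, a
    # global weakref (no callback) to b, and the only reference to `keeper`
    # hidden inside the cycle.
    a, b = Node(), Node()
    a.other, b.other = b, a
    a.keeper = keeper
    wr = weakref.ref(b)
    del a, b, keeper

    # Clearing the cycle drops the last reference to `keeper`, so its
    # __del__ runs in the middle of the collection.  (The generation
    # argument to collect() needs an interpreter newer than 2.3/2.4.)
    gc.collect(0)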
Offhand, I don't have a plausible solution that will work. The elegant
<wink> analysis in gc_weakref.txt missed what should have been obvious even
then: cyclic trash is still potentially reachable via "global" weakrefs, so
if any Python code whatsoever can run while gc is breaking cycles (whether
via __del__ or via wr callback), global weakrefs can resurrect cyclic trash.
That suggests some Draconian approaches.
Anyone have a bright idea? It's remarkable how long we've managed to go
without noticing that everything is disastrously broken here <0.9 wink>.
On Tue, 19 Oct 2004 12:02:14 +0200 (CEST), Evan Jones
<ejones(a)uwaterloo.ca> wrote:
> Subject: [Python-Dev] Changing pymalloc behaviour for long running
> processes
>
[ snip ]
>
> The short version of the problem is that obmalloc.c never frees memory.
> This is a great strategy if the application runs for a short time then
> quits, or if it has fairly constant memory usage. However, applications
> with very dynamic memory needs and that run for a long time do not
> perform well because Python hangs on to the peak amount of memory
> required, even if that memory is only required for a tiny fraction of
> the run time. With my application, I have a Python process which occupies
> 1 GB of RAM for ~20 hours, even though it only uses that 1 GB for about
> 5 minutes. This is a problem that needs to be addressed, as it
> negatively impacts the performance of Python when manipulating very
> large data sets. In fact, I found a mailing list post where the poster
> was looking for a workaround for this issue, but I can't find it now.
>
> Some posts to various lists [1] have stated that this is not a real
> problem because virtual memory takes care of it. This is fair if you
> are talking about a couple megabytes. In my case, I'm talking about
> ~700 MB of wasted RAM, which is a problem. First, this is wasting space
> which could be used for disk cache, which would improve the performance
> of my system. Second, when the system decides to swap out the pages
> that haven't been used for a while, they are dirty and must be written
> to swap. If Python ever wants to use them again, they will be brought
> in from swap. This is much worse than informing the system that the
> pages can be discarded, and allocating them again later. In fact, the
> other native object types (ints, lists) seem to realize that holding on
> to a huge amount of memory indefinitely is a bad strategy, because they
> explicitly limit the size of their free lists. So why is this not a
> good idea for other types?
>
> Does anyone else see this as a problem?
>
This is such a big problem for us that we had to rewrite some of our daemons
to fork request handlers so that the memory would be freed. That's the only
way we've found to deal with it, and it seems that's the preferred Python
way of doing things: processes, IPC, fork, etc. instead of threads.
In order to be able to release memory, the interpreter has to allocate
memory in chunks bigger than the minimum that can be returned to the
OS; e.g., on Linux that'd be 256 bytes (IIRC), so that libc's malloc would
use mmap to allocate that chunk. Otherwise, if the memory was
obtained with brk, then in virtually all OSes and malloc implementations
it won't be returned to the OS even if the interpreter frees the memory.
For example, consider the following code in the interactive interpreter:
for i in range(10000000):
    pass
That run will create a lot of little integer objects and the virtual memory
size of the interpreter will quickly grow to 155MB and then drop to 117MB.
The 117MB left over is all those little integer objects that are no longer
in use but that the interpreter will reuse as needed.
When the system needs memory, it will page out to swap the pages where these
objects were allocated.
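If you want to watch this happen, something like the following quick hack
reads the interpreter's virtual size on Linux (it just parses /proc, the
exact numbers will vary, and range() only builds a real list on the 2.x
interpreters we're talking about):

    import os

    def vm_size_kb():
        # Parse VmSize out of /proc/<pid>/status; Linux-specific.
        for line in open("/proc/%d/status" % os.getpid()):
            if line.startswith("VmSize:"):
                return int(line.split()[1])

    print("before: %s kB" % vm_size_kb())
    big = range(10000000)      # lots of small ints plus one big list
    print("peak:   %s kB" % vm_size_kb())
    del big                    # the list's memory can go back to the OS;
                               # the small-integer blocks generally do not
    print("after:  %s kB" % vm_size_kb())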
In our application, paging to swap is extremely bad because sometimes
we're running the OS booted from the net without swap. The daemon has to
loop over lists of 20 to 40 thousand items at a time, and it quickly grows to
60MB on the first run and then continues to grow from there. When something
else needs memory, it tries to swap and then crashes.
In the example above, the difference between 155MB and 117MB is 38MB, which I
assume is the size of the list object returned by 'range()', which contains the
references to the integers. The list goes away when the interpreter finishes
running the loop, and because it was already known how big it was going to be,
it was allocated as one big chunk using mmap (my speculation). As a result, that
memory was given back to the OS and the virtual memory size of the interpreter
went down from 155MB to 117MB.
Regards,
--
Luis P Caamano
Atlanta, GA USA
PS
I rarely post to python-dev (this is probably the first time), so please let
me take this opportunity to thank all the Python developers for all your
efforts. Such a great language, and a great tool. My respect and admiration
to all of you.
There is a subtlety in CreateProcess in the Win32 API in that if one
specifies an environment (via the lpEnvironment argument), the
environment strings (A) must be sorted alphabetically and (B) that sort
must be case-insensitive. See the Remarks section on:
http://msdn.microsoft.com/library/default.asp?url=/library/en-us/dllproc/ba…
If this is not done, then surprises can happen with the use of
{Get|Set}EnvironmentVariable in the created process:
http://msdn.microsoft.com/library/default.asp?url=/library/en-us/dllproc/ba…
Neither _subprocess.pyd (supporting the new subprocess.py module on
Windows) nor PyWin32's CreateProcess binding does this. I haven't done so
yet, but I should be able to put together a test case for subprocess.py
for this. We came across such a surprise when using my process.py module,
which uses this PyWin32 code (which it looks like _subprocess.c borrowed).
Fixing (A) is easy with a "PyList_Sort(keys)" and some other minor
changes to _subprocess.c::getenvironment() -- and to
win32process.i::CreateEnvironmentString() in PyWin32.
However, I'd like some guidance on the best way to case-insensitively
sort a Python list in C code to fix (B). The best thing I see would be
to expose PyString_Lower/PyUnicode_Lower and/or
PyString_Upper/PyUnicode_Upper so they can be used to .lower()/.upper()
the given environment mapping keys for sorting.
Does that sound reasonable? Is there some problem to this approach that
anyone can see?
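For what it's worth, the ordering requirement itself looks like this at the
Python level (purely illustrative; the real fix belongs in _subprocess.c and
win32process.i, and this uses the 2.4 sorted() builtin):

    # Build a CreateProcess environment block with the names sorted
    # case-insensitively, as the Remarks section requires.
    def make_env_block(env):
        items = sorted(env.items(), key=lambda kv: kv[0].upper())
        return "".join("%s=%s\0" % (k, v) for k, v in items) + "\0"

    env = {"Path": r"C:\Windows", "COMSPEC": r"C:\Windows\system32\cmd.exe",
           "systemroot": r"C:\Windows"}
    print(repr(make_env_block(env)))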
Trent
--
Trent Mick
trentm(a)activestate.com
[Tim Delaney]
#- I think those three platforms are sufficiently representative of Python
#- users, so if it works on them, and the code looks good to a reviewer, it
#- should be committed. It's not exactly a large patch after all ...
Do you want to take a look at it? ;)
#- What's the bug number? I've got a FreeBSD (5.2.1) virtual machine sitting
#- around I could try it on (tomorrow - bed time now ;).
The bug is 1050828.
Thanks!
. Facundo
People:
I have had these doubts for a while now, and while I learned a lot about this
from Raymond Hettinger, I still have some loose ends. I don't know if
there's an official position or if it's just developer common sense (which I
still don't have), but I didn't find an article/PEP about this. Does such a
paper exist?
For now, I'll ask you about a specific issue: there's an open bug about
the reindent.py tool, which has an issue with the reindented code file's
metadata (more specifically, its permissions). So I came up with a solution (a
small patch, three lines) which leaves the reindented file with the same
permissions as the original one.
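The idea is simply to carry the original file's permission bits over to the
rewritten file, something along these lines (a sketch of the approach only,
not the actual patch; the rewrite() name is just a stand-in):

    import os

    def reindent_preserving_mode(path, rewrite):
        # `rewrite` stands in for whatever rewrites the file in place.
        mode = os.stat(path).st_mode   # remember the original permissions
        rewrite(path)                  # reindenting replaces the file...
        os.chmod(path, mode)           # ...so put the old mode back afterwards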
My problem is that I cannot decide whether to commit those changes. We're in
beta, and I don't know if the changes will be tested enough, on enough
platforms, before the final release (for now, it's tested on Linux, Win2k and
MacOS X).
And if I don't commit these now, when?
Thank you all!
. Facundo
Hello
I am a little new to Python....
I want to develop a database-related application, with options like
add, save, modify, delete, previous, last, next, and first, for maintaining
employee records. I want to use an .mdb file or a dBASE database
as the backend.
In VB we can use ADO or DAO, and in Java we use JDBC, so for Python
please guide me on developing a basic application....
Regards
Sandeep
(re-sent and modified after I recognized that my
hardware clock is broken; I need a new notebook)
Dear community,
I would love to publish Stackless 3.1, of course.
Also I know that there is some inherent bug in it.
This has been the state of things for four months now.
I am currently in a very tight project and have no
time to dig into this problem.
BUT IT IS URGENT!
I'm looking for a person who would take on the job of
finding the buglet. (S)He would need to debug and
nail down the problem in a commercial application, which I cannot
make public, and would need to sign an NDA with me.
The success payment would be $500, minimum. If the problem
turns out to be very hard (by some yet-to-be-defined standard
of very hard, to be negotiated), it can be increased
to $1000.
If my app works afterwards, Stackless 3.1 is just fine
and can go out to the public.
If it doesn't work, no payment happens.
The identified problem needs to be documented by a
reproducible test case.
If somebody is interested, please contact me privately.
And be aware: this is really not easy stuff. You need to
be a real hardcore system hacker with many years of
experience.
(Armin, Bob, Jeff, Lutz, Richard, Stefan, Stephan?)
Here is the CVS path to the dev trunk:
CVSROOT=:pserver:anonymous@stackless.com:/home/cvs
cvs co slpdev/src/2.3/dev
The cheapest complete solution wins. Hurry up :-)
Sincerely -- chris
--
Christian Tismer :^) <mailto:tismer@stackless.com>
tismerysoft GmbH : Have a break! Take a ride on Python's
Carmerstr. 2 : *Starship* http://starship.python.net/
10623 Berlin : PGP key -> http://wwwkeys.pgp.net/
work +49 30 31 86 04 18 home +49 30 802 86 56 mobile +49 173 24 18 776
PGP 0x57F3BF04 9064 F4E1 D754 C2FF 1619 305B C09C 5A3B 57F3 BF04
whom do you want to sponsor today? http://www.stackless.com/
Comments? Is it safe or not to assume that by the time Python has
started, only fds 0, 1 and 2 are open?
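For what it's worth, a throwaway check of that assumption looks something
like this (not part of the test, just a way to poke at a given platform):

    import os

    def open_fds(max_fd=256):
        # Return the descriptors below max_fd that are currently open.
        fds = []
        for fd in range(max_fd):
            try:
                os.fstat(fd)
            except OSError:
                continue
            fds.append(fd)
        return fds

    print(open_fds())   # [0, 1, 2] if the assumption holds on this platform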
/Peter
> I think this is the core of the problem. The test_close_fds
> test works like this:
>
> All file descriptors in the forked child (except 0,1,2) are
> closed. Then the Python binary is executed via execvp(). A
> small test program is passed to the Python binary via the -c
> command line option. If the OS and subprocess module works
> correctly, we can be sure of that by the time of the
> execve() system call, only file descriptors (0,1,2) are open
> (well, the errpipe as well, but let's leave that out for
> now). But, by the time the Python binary starts executing
> the small program, all sorts of things may have happened.
> I'm not really sure we can trust Python not to open files
> during startup. For example, if we have a PYTHONSTARTUP
> file, that open file will have a file descriptor, perhaps 3.
>
> On one hand, this bug could indicate a bug in the Python
> interpreter itself: perhaps a file descriptor leak. On the
> other hand, this test might be a bit too unsafe.
>
> So probably, this test should be removed.