
I just submitted a bunch of patches to SF to correct memory leaks, possible memory leaks, and uninitialized memory reads (UMR). The problems were found by using purify on the regression tests.

There are still some problems that I have not been able to correct. Here is a list of the biggest problems; the details are at the end of this mail:

    test_long_future - leaks 43124 bytes (future division on longs)
    test_asynchat (and others) - leaks 2k+ socketmodule.c:633
    test___all__ (and others) - UMR marshal.c:438
    test_parser - leaks 5k - parsermodule.c:721 recurses to line 716

There are other smaller problems, but this mail is already quite long. Once the bigger problems are resolved, I can work more on the small ones.

Let me know if there's more info necessary to fix these problems. Also, if anyone wants to see the complete list of regression tests run and the results, let me know.

What is the best format to post this info? Would there be a better way to deal with these kinds of problems?

Neal
--

test_long_future:

There are memory leak problems with binary operations which use longs and future division. The following code leaks 32 bytes:

    >>> from __future__ import division
    >>> 5L / 3L

Here's the stack trace, using current CVS snapshot (from Sat):

    malloc [rtlib.o]
    muladd1 [longobject.c:51]
    PyLong_FromString [longobject.c:1027]
    parsenumber [compile.c:1096]
    com_atom [compile.c:1461]
    com_power [compile.c:1853]
    com_factor [compile.c:1982]
    com_term [compile.c:1994]
    com_arith_expr [compile.c:2027]
    com_shift_expr [compile.c:2053]
    com_and_expr [compile.c:2079]
    com_xor_expr [compile.c:2101]
    com_expr [compile.c:2123]
    com_comparison [compile.c:2177]
    com_and_test [compile.c:2252]
    com_test [compile.c:2353]

and:

    muladd1 [longobject.c:51]
    PyLong_FromString [longobject.c:1027]
    parsenumber [compile.c:1096]
    com_atom [compile.c:1461]
    com_power [compile.c:1853]
    com_factor [compile.c:1982]
    com_term [compile.c:1992]
    com_arith_expr [compile.c:2027]
    com_shift_expr [compile.c:2053]
    com_and_expr [compile.c:2079]
    com_xor_expr [compile.c:2101]
    com_expr [compile.c:2123]
    com_comparison [compile.c:2177]
    com_and_test [compile.c:2252]
    com_test [compile.c:2353]

Here's a different stack trace, if it helps:

    x_add [longobject.c:51]
    long_add [longobject.c:1408]
    binary_op1 [abstract.c:343]
    PyNumber_Add [abstract.c:578]
    eval_frame [ceval.c:960]
    PyEval_EvalCodeEx [ceval.c:2549]
    fast_function [ceval.c:3116]
    eval_frame [ceval.c:1996]
    PyEval_EvalCodeEx [ceval.c:2549]
    PyEval_EvalCode [ceval.c:483]
    PyImport_ExecCodeModuleEx [import.c:494]
    load_source_module [import.c:764]
    load_module [import.c:1348]
    import_submodule [import.c:1887]
    load_next [import.c:1743]

test_asynchat:

Memory is allocated from socketmodule.c:633 (getaddr). It appears the memory is deallocated, but purify is still complaining. Here's the stack trace:

    malloc [rtlib.o]
    __IPv6_alloc [getipnodeby.c]
    getipnodebyname [getipnodeby.c]
    get_addr [getaddrinfo.c]
    getaddrinfo [libsocket.so.1]
    setipaddr [socketmodule.c:633]
    getsockaddrarg [socketmodule.c:821]
    PySocketSock_bind [socketmodule.c:1186]
    fast_cfunction [ceval.c:3086]
    eval_frame [ceval.c:1979]
    PyEval_EvalCodeEx [ceval.c:2549]
    PyEval_EvalCode [ceval.c:483]
    PyImport_ExecCodeModuleEx [import.c:494]
    load_source_module [import.c:764]
    load_module [import.c:1348]
    import_submodule [import.c:1887]

test___all__:

I'm not sure if this is a problem in Python or not. It's possible that the problem is that atof()/strtod() reads the string in 4-byte increments. This is occurring while in:

    __big_float_times_power [libc.so.1]
    __decimal_to_binary_integer [libc.so.1]
    __decimal_to_unpacked [libc.so.1]
    decimal_to_double [libc.so.1]
    strtod [libc.so.1]
    r_object [marshal.c:438]
    r_object [marshal.c:524]
    r_object [marshal.c:595]
    PyMarshal_ReadObjectFromString [marshal.c:741]
    PyMarshal_ReadLastObjectFromFile [marshal.c:700]
    load_source_module [import.c:582]
    load_module [import.c:1348]
    import_submodule [import.c:1887]
    load_next [import.c:1743]
    import_module_ex [import.c:1594]
    PyImport_ImportModuleEx [import.c:1635]

Reading 2 bytes from 0xffbe9034 on the stack. Address 0xffbe9034 is 452 bytes below frame pointer in function __decimal_to_unpacked.

test_parser:

This memory was allocated from:

    malloc [rtlib.o]
    PyNode_AddChild [node.c:35]
    build_node_children [parsermodule.c:716]
    build_node_children [parsermodule.c:721]
    build_node_children [parsermodule.c:721]
    build_node_children [parsermodule.c:721]
    build_node_children [parsermodule.c:721]
    build_node_children [parsermodule.c:721]
    build_node_children [parsermodule.c:721]
    build_node_children [parsermodule.c:721]
    build_node_children [parsermodule.c:721]
    build_node_children [parsermodule.c:721]
    build_node_children [parsermodule.c:721]
    build_node_children [parsermodule.c:721]
    build_node_children [parsermodule.c:721]
    build_node_children [parsermodule.c:721]

Block of 20 bytes (69 times); last block at 0x4fc238

hi neal,
I just submitted a bunch of patches to SF to correct memory leaks, possible memory leaks, and uninitialized memory reads (UMR).
(I tried to follow up over at sourceforge, but all I get is an error message saying "ERROR!" and nothing else...)

I'm a bit puzzled over your proposed SRE patch. the patch changes

    return PyString_FromString("");

to

    result = PyString_FromString("");
    Py_INCREF(result);
    return result;

but according to both the documentation and the implementation, FromString returns a new reference (usually another reference to the internal nullstring object). in other words, if there's a bug somewhere, I'm not sure it's in SRE.
What is the best format to post this info? Would there be a better way to deal with these kinds of problems?
keep on posting them to this list, until someone comes up with a better idea.

</F>
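
To make the point about new references concrete, here is a rough sketch in modern CPython 3 (not the 2.x C API under discussion). It assumes CPython, uses `ctypes.pythonapi.Py_IncRef` as a stand-in for the C-level `Py_INCREF` macro, and the names `Payload` and `survives_after_release` are made up for illustration. It shows that an object which already carries its own reference stays alive forever once an unmatched extra incref is applied, which is the kind of leak the proposed patch would have introduced rather than fixed.

```python
import ctypes
import gc
import weakref


class Payload:
    """Hypothetical stand-in for an object returned as a new reference."""


# Function forms of Py_INCREF / Py_DECREF exported by CPython.
_incref = ctypes.pythonapi.Py_IncRef
_incref.argtypes = [ctypes.py_object]
_incref.restype = None
_decref = ctypes.pythonapi.Py_DecRef
_decref.argtypes = [ctypes.py_object]
_decref.restype = None


def survives_after_release(extra_incref):
    """Return True if the object is still alive after its owner lets go."""
    obj = Payload()            # the caller owns one reference, as with a new reference
    probe = weakref.ref(obj)   # lets us observe the object without owning it
    if extra_incref:
        _incref(obj)           # the unmatched extra incref from the patch
    del obj                    # the owner releases its one reference
    gc.collect()
    leaked = probe() is not None
    if leaked:
        _decref(probe())       # undo the extra reference so the sketch cleans up
    return leaked
```

Calling `survives_after_release(False)` and `survives_after_release(True)` contrasts the two cases: without the extra incref the object is freed as soon as its owner releases it, while with it the object outlives every owner.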

Fredrik Lundh wrote:
Hmmm, I thought that patch looked a bit suspect. I backed out this patch, then reran all the tests (143 of them) that this patch should have addressed. Nothing new was reported. So this patch must have been something I was playing with, but isn't correct. You can (try to) close this bug in SF. If I find a problem, I'll submit a better patch. Neal

[Neal Norwitz]
However,

    from __future__ import division
    while 1:
        5L / 3L

can run all day without memory size increasing, so this "leak" is probably bogus (note that Python stores pointers to all sorts of malloc'ed memory into file-static vrbls, and tiny "leaks" are often-- at considerable cost --traced simply to that, e.g., a static Python string object constant got dynamically initialized).

OTOH, if I change the tail end of test_long_future.py to

    while 1:
        test_true_division()

it leaks like a sieve, so *something* is wrong there. I'll track it down; the routines that show up at the top of the stack traces appear to be blameless:

[Neal Norwitz]
This one has been fixed -- or, at least, test_long_future.py in an infinite loop no longer grows. Thanks! I don't intend to look at more of these (too much else to do). If they're not resolved quickly, please open distinct bug reports for each, else they'll simply get lost.
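
The "run it in a loop and see whether memory grows" check used above can be approximated as follows. This is only a sketch under stated assumptions: it is written for modern Python 3 (so `5 / 3` replaces `5L / 3L`), it uses the standard `tracemalloc` module, which only sees allocations made through Python's allocators and would miss a leak from a bare `malloc` in an extension, and the names `grows_under_repetition`, `true_division`, `leaky` and the tolerance value are all invented for illustration.

```python
import tracemalloc


def grows_under_repetition(func, warmup=100, rounds=2000, tolerance=4096):
    """Return True if traced memory keeps growing while func is called repeatedly.

    The warmup calls let one-time allocations (caches, statically stored
    objects) settle, so they are not mistaken for a leak.
    """
    tracemalloc.start()
    try:
        for _ in range(warmup):
            func()
        before, _ = tracemalloc.get_traced_memory()
        for _ in range(rounds):
            func()
        after, _ = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return after - before > tolerance


def true_division():
    """Python 3 counterpart of the 5L / 3L expression from the report."""
    return 5 / 3


_hoard = []


def leaky():
    """Deliberately keeps every buffer alive, to show what real growth looks like."""
    _hoard.append(bytearray(1024))
```

Comparing `grows_under_repetition(true_division)` with `grows_under_repetition(leaky)` separates a one-time allocation, which stops growing after warmup, from a genuine per-call leak, which keeps growing in proportion to the number of rounds.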

Tim Peters wrote:
Tim:

You wrote:

    Neal, if you do more of these, could you please limit them to one module per patch? Not all the suggested fixes have made sense, and the report gets to be a mess when only part of a patch can be applied.

By module do you mean directory (Modules, Lib, Python, etc) or do you mean individual files?

Neal

[Neal]
By module do you mean directory (Modules, Lib, Python, etc) or do you mean individual files?
I mean one patch == the fewest number of files that must be changed simultaneously. This may vary from one (likely most common) to dozens (likely very uncommon), depending on the specific changes involved. I expect most "memory leak" fixes don't require coordinated changes across multiple files, and are better handled by one-file patches when that's possible.

participants (3): Fredrik Lundh, Neal Norwitz, Tim Peters