MemoryError... how much memory?

There are a lot of places where Python or its modules do something like:

    self->buf = (char *)malloc(size);
    if (!self->buf) {
        PyErr_SetString(PyExc_MemoryError, "out of memory");
    }

At work, we're getting some MemoryErrors, and one thing we would love to know is how much memory was being requested when that happened.

So, I thought about doing something like:

    char message[50];
    ...
    self->buf = (char *)malloc(size);
    if (!self->buf) {
        snprintf(message, sizeof(message), "out of memory (asked: %lld)", (long long)size);
        PyErr_SetString(PyExc_MemoryError, message);
    }

Is there any inherent problem in doing this?

Would it be a good idea to make it generic, for example by providing a PyErr_MemoryError that accepts a message and a number, and stores that number in the exception object's internals?

Thanks!

--
Facundo
Blog: http://www.taniquetil.com.ar/plog/
PyAr: http://www.python.org/ar/
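A minimal sketch of the formatting step proposed above, kept independent of the Python C API so it can be compiled on its own. The helper name `format_oom_message` and the exact message text are assumptions made for illustration; they do not exist in CPython. The point it shows is that writing the size into a caller-supplied stack buffer needs no heap allocation. Handing the result to `PyErr_SetString` afterwards still has to create a Python string object, which is where an allocation can fail again.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper (not a CPython function): write an out-of-memory
 * message that includes the requested size into a caller-supplied buffer.
 * Returns the number of characters written (excluding the NUL), or -1 if
 * formatting failed or the buffer was too small for the whole message. */
int format_oom_message(char *buf, size_t buflen, size_t requested)
{
    /* %zu is the C99 conversion for size_t. */
    int n = snprintf(buf, buflen, "out of memory (asked: %zu bytes)", requested);
    if (n < 0 || (size_t)n >= buflen) {
        return -1;  /* error or truncation */
    }
    return n;
}
```

A caller would declare something like `char message[64];` on the stack, call the helper with `sizeof(message)`, and fall back to a fixed string if it returns -1.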

2010/10/27 Facundo Batista <facundobatista@gmail.com>:
> There are a lot of places where Python or its modules do something like:
>
>     self->buf = (char *)malloc(size);
>     if (!self->buf) {
>         PyErr_SetString(PyExc_MemoryError, "out of memory");
>
> At work, we're getting some MemoryErrors, and one thing we would love to know is how much memory was being requested when that happened.

Isn't this usually when you do something like [None]*2**300? In that case, wouldn't you know how much memory you're requesting?

Also, why is that useful?

--
Regards,
Benjamin

On Wed, Oct 27, 2010 at 12:05 PM, Benjamin Peterson <benjamin@python.org> wrote:
> Isn't this usually when you do something like [None]*2**300? In that case, wouldn't you know how much memory you're requesting?

It could happen on any malloc. It depends on how much memory you have free. Don't think of getting a MemoryError in a Python you just opened in the console. Think about a server with a month of uptime, where all the memory is fragmented, etc.

> Also, why is that useful?

It helps to determine why we're getting some MemoryErrors on our long-lived server, how it behaves when that happens, etc.

Regards,

--
Facundo

On 07:09 pm, facundobatista@gmail.com wrote:
> On Wed, Oct 27, 2010 at 12:05 PM, Benjamin Peterson <benjamin@python.org> wrote:
>> Isn't this usually when you do something like [None]*2**300? In that case, wouldn't you know how much memory you're requesting?
>
> It could happen on any malloc. It depends on how much memory you have free.
> Don't think of getting a MemoryError in a Python you just opened in the console. Think about a server with a month of uptime, where all the memory is fragmented, etc.
>
>> Also, why is that useful?
>
> It helps to determine why we're getting some MemoryErrors on our long-lived server, how it behaves when that happens, etc.

But... If you allocated all of your memory to some garbage, and then a 5 byte string can't be allocated, you don't really care about the 5 byte string, you care about the garbage that's wasting your memory.

Tools like heapy will give you a lot of information.

Maybe it wouldn't hurt anyone to have more information in a MemoryError. But I don't think it's going to help a lot either. It's not the information that you're really interested in.

Jean-Paul

Facundo Batista <facundobatista@gmail.com> writes:
> On Wed, Oct 27, 2010 at 12:05 PM, Benjamin Peterson <benjamin@python.org> wrote:
>> Isn't this usually when you do something like [None]*2**300? In that case, wouldn't you know how much memory you're requesting?
>
> It could happen on any malloc. It depends on how much memory you have free.

It also depends on how much is being requested. The caller knows that amount, surely?

--
Ben Finney

On Thu, Oct 28, 2010 at 8:00 AM, Ben Finney <ben+python@benfinney.id.au> wrote:
> Facundo Batista <facundobatista@gmail.com> writes:
>> On Wed, Oct 27, 2010 at 12:05 PM, Benjamin Peterson <benjamin@python.org> wrote:
>>> Isn't this usually when you do something like [None]*2**300? In that case, wouldn't you know how much memory you're requesting?
>>
>> It could happen on any malloc. It depends on how much memory you have free.
>
> It also depends on how much is being requested. The caller knows that amount, surely?

For a server process, the MemoryError in the log won't always have the context information showing what the values were in the calling frames.

The idea behind Facundo's request is similar to the reason why we print the type names in a lot of TypeErrors. If you see MemoryError(5 bytes), the things you go looking for are very different from those you look for when you see MemoryError(1 gajillion bytes). (That is, for the former you look for a memory or other resource leak; for the latter you look for the reason your code is trying to get 1 gajillion bytes from the OS.)

If a long-lived server isn't crashing but is still getting MemoryError occasionally, problems with specific oversized requests are much more likely than a general resource leak (as those usually bring the whole process down eventually).

Cheers,
Nick.

--
Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia

On Wed, Oct 27, 2010 at 8:27 PM, Nick Coghlan <ncoghlan@gmail.com> wrote:
> If you see MemoryError(5 bytes), the things you go looking for are very different from those you look for when you see MemoryError(1 gajillion bytes). (That is, for the former you look for a memory or other resource leak; for the latter you look for the reason your code is trying to get 1 gajillion bytes from the OS.) If a long-lived server isn't crashing but is still getting MemoryError occasionally, problems with specific oversized requests are much more likely than a general resource leak (as those usually bring the whole process down eventually).

Very well explained, you're absolutely right.

Furthermore, our server is fairly complex: we're using quite a few libraries to do different jobs, and one of the approaches (not the only one) that we're taking to deal with this beast is to analyze its memory-related behaviour from an external POV (treating it as a black box).

So, beyond its arguable utility, do you think that having that information could harm us in some way?

Regards,

--
Facundo

On Thu, Oct 28, 2010 at 9:14 PM, Facundo Batista <facundobatista@gmail.com> wrote:
> So, beyond its arguable utility, do you think that having that information could harm us in some way?

I think the idea is sound in principle, but may run into some practical implementation problems due to special cases when raising MemoryError. But creating a patch and putting it on the tracker sounds like something worth trying.

Cheers,
Nick.

--
Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia

> Furthermore, our server is fairly complex: we're using quite a few libraries to do different jobs, and one of the approaches (not the only one) that we're taking to deal with this beast is to analyze its memory-related behaviour from an external POV (treating it as a black box).
>
> So, beyond its arguable utility, do you think that having that information could harm us in some way?

I think implementing it might do harm. When a memory error is raised, you are typically out of memory, so allocating more memory might fail (it just did). Therefore, allocating more objects or doing string formatting will likely fail (unless the requested size is much larger than the memory required for these operations).

So the chance increases that you trigger a fatal error.

Regards,
Martin

On Thu, Oct 28, 2010 at 11:14 PM, "Martin v. Löwis" <martin@v.loewis.de> wrote:
>> Furthermore, our server is fairly complex: we're using quite a few libraries to do different jobs, and one of the approaches (not the only one) that we're taking to deal with this beast is to analyze its memory-related behaviour from an external POV (treating it as a black box).
>>
>> So, beyond its arguable utility, do you think that having that information could harm us in some way?
>
> I think implementing it might do harm. When a memory error is raised, you are typically out of memory, so allocating more memory might fail (it just did). Therefore, allocating more objects or doing string formatting will likely fail (unless the requested size is much larger than the memory required for these operations).
>
> So the chance increases that you trigger a fatal error.

What Martin describes here is a more explicit description of what I meant by "practical implementation problems" and "special cases when raising MemoryError".

However, I think thresholding the additional error formatting so that it only kicks in when the requested amount of memory exceeds a certain size would be an adequate safeguard without reducing the utility in Facundo's use case (the pre-allocated instance can have a generic error message saying an allocation of less than the threshold value failed).

Cheers,
Nick.

--
Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
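The thresholding idea above could look roughly like the following. This is only a sketch under stated assumptions: the function name `oom_message`, the constant `OOM_DETAIL_THRESHOLD`, its 1 MiB value, and the message texts are all invented for illustration, and nothing here is taken from CPython. Small failed requests get a fixed, statically allocated message, while large ones get the size formatted into a caller-supplied buffer.

```c
#include <stddef.h>
#include <stdio.h>

/* Assumed threshold: only requests of at least 1 MiB get a detailed
 * message. The value is arbitrary and chosen for illustration. */
#define OOM_DETAIL_THRESHOLD ((size_t)1 << 20)

/* Static storage: returning this message needs no allocation at all. */
static const char OOM_GENERIC[] =
    "out of memory (request below detail threshold)";

/* Hypothetical helper: choose the message for a failed allocation of
 * `requested` bytes. Returns either the static generic message or `buf`
 * filled with a message that includes the size. */
const char *oom_message(size_t requested, char *buf, size_t buflen)
{
    if (requested < OOM_DETAIL_THRESHOLD) {
        return OOM_GENERIC;
    }
    int n = snprintf(buf, buflen, "out of memory (asked: %zu bytes)", requested);
    if (n < 0 || (size_t)n >= buflen) {
        return OOM_GENERIC;  /* formatting failed or was truncated */
    }
    return buf;
}
```

The reasoning behind the threshold is the one given in the thread: if a very large request failed, there is probably still enough memory left to build a short message, whereas after a tiny failed request almost any further allocation is likely to fail too.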

On 28.10.2010 15:14, "Martin v. Löwis" wrote:
>> Furthermore, our server is fairly complex: we're using quite a few libraries to do different jobs, and one of the approaches (not the only one) that we're taking to deal with this beast is to analyze its memory-related behaviour from an external POV (treating it as a black box).
>>
>> So, beyond its arguable utility, do you think that having that information could harm us in some way?
>
> I think implementing it might do harm. When a memory error is raised, you are typically out of memory, so allocating more memory might fail (it just did). Therefore, allocating more objects or doing string formatting will likely fail (unless the requested size is much larger than the memory required for these operations).
>
> So the chance increases that you trigger a fatal error.

Especially since we have a MemoryError instance preallocated to avoid exactly this problem.

Georg
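To illustrate the general preallocation pattern mentioned above, here is a small standalone sketch. It is not CPython's actual implementation; the type `oom_record` and the function `raise_oom` are hypothetical names. The error record lives in static storage, so reporting a failure only fills in fields and never calls the allocator. It also shows how a plain number, as Facundo suggested, could be stored without any string formatting.

```c
#include <stddef.h>

/* Hypothetical error record, reserved in static storage before any
 * allocation can fail. */
typedef struct {
    const char *message;  /* fixed text, never built at failure time */
    size_t requested;     /* size of the allocation that failed */
} oom_record;

static oom_record preallocated_oom = { "out of memory", 0 };

/* Record a failed request in the preallocated record and return it.
 * No memory is allocated on this path. */
oom_record *raise_oom(size_t requested)
{
    preallocated_oom.requested = requested;
    return &preallocated_oom;
}
```

One consequence of this design is that every failure shares the same record, so a later failure overwrites the size stored by an earlier one.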

On Thu, 28 Oct 2010 15:54:50 +0200 Georg Brandl <g.brandl@gmx.net> wrote:
> On 28.10.2010 15:14, "Martin v. Löwis" wrote:
>>> Furthermore, our server is fairly complex: we're using quite a few libraries to do different jobs, and one of the approaches (not the only one) that we're taking to deal with this beast is to analyze its memory-related behaviour from an external POV (treating it as a black box).
>>>
>>> So, beyond its arguable utility, do you think that having that information could harm us in some way?
>>
>> I think implementing it might do harm. When a memory error is raised, you are typically out of memory, so allocating more memory might fail (it just did). Therefore, allocating more objects or doing string formatting will likely fail (unless the requested size is much larger than the memory required for these operations).
>>
>> So the chance increases that you trigger a fatal error.
>
> Especially since we have a MemoryError instance preallocated to avoid exactly this problem.

And which creates other problems of its own, such as keeping many objects alive: http://bugs.python.org/issue5437 ;)

Antoine.

Participants (8): "Martin v. Löwis", Antoine Pitrou, Ben Finney, Benjamin Peterson, exarkun@twistedmatrix.com, Facundo Batista, Georg Brandl, Nick Coghlan