[New-bugs-announce] [issue3531] file read preallocs 'size' bytes which can cause memory problems

Andrew Dalke report at bugs.python.org
Sat Aug 9 04:29:45 CEST 2008

New submission from Andrew Dalke <dalke at dalkescientific.com>:

I wrote a buggy PNG parser which ended up making several 
file.read(large value) calls.  This caused a MemoryError, which was 
strange because the file was only a few KB long.

I tracked it down to the implementation of read().  When given a size 
hint it preallocates the return string with that size.  If the hint is 
for 10MB then the string returned will be preallocated for 10MB, even if 
the actual read is empty.

Here's a reproducible test case:

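BLOCKSIZE = 10*1024*1024

# Create an empty file, then reopen it for reading.
open("empty.txt", "w").close()
f = open("empty.txt")

data = []
for i in range(10000):
    s = f.read(BLOCKSIZE)
    assert len(s) == 0
    # Keep a reference so each preallocated string stays alive.
    data.append(s)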

I wasn't sure if this is properly a bug, but since the MemoryError 
exception I got was quite unexpected and required digging into the 
source code to figure out, I'll say that it is.
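
One possible workaround, as a minimal sketch only: cap the requested 
size at the number of bytes actually left in the file, so read() is 
never handed a hint much larger than what it can return.  The helper 
name bounded_read is hypothetical (it is not part of the standard 
library), and the sketch assumes a regular, seekable file opened in 
binary mode whose size does not change while it is being read.

```python
import os

def bounded_read(f, size):
    """Read at most `size` bytes, never asking for more than remain.

    Hypothetical helper, not a standard-library function. Assumes `f` is
    a regular, seekable file opened in binary mode, so that fstat() and
    tell() both report byte offsets.
    """
    # Bytes between the current position and the end of the file.
    remaining = os.fstat(f.fileno()).st_size - f.tell()
    if remaining <= 0:
        # Nothing left: request zero bytes to get an empty result.
        return f.read(0)
    # Shrink the size hint so any preallocation matches the real data.
    return f.read(min(size, remaining))
```

With this, a call such as bounded_read(f, BLOCKSIZE) on an empty file 
requests zero bytes rather than 10MB.  It does not help for pipes or 
sockets, where the remaining length is not known in advance.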

components: Interpreter Core
messages: 70924
nosy: dalke
severity: normal
status: open
title: file read preallocs 'size' bytes which can cause memory problems
type: resource usage

Python tracker <report at bugs.python.org>
