[Tutor] If you don't close file when writing, do bytes stay in memory?

Oxymoron moron.oxy at gmail.com
Sat Oct 10 08:17:10 CEST 2009


(Did not do a reply-all earlier)

On Sat, Oct 10, 2009 at 3:20 PM, xbmuncher <xboxmuncher at gmail.com> wrote:
> Which piece of code will conserve more memory?
>  I think that code #2 will because I close the file more often, thus freeing
> more memory by closing it.
> Am I right in this thinking... or does it not save me any more bytes in
> memory by closing the file often?

In general, you should only close a file once you're done with writing
all the data you need. Opening and closing after each write within a
loop is unnecessarily costly, as it usually translates to multiple
system calls. Thus the general pattern is:

open file
while have stuff to do:
  do stuff, read/write file
finally close file

(As an aside, you can also check out the 'with' statement, which
handles the closing for you. It is enabled by default from Python 2.6;
on 2.5 you need 'from __future__ import with_statement'.)
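
A minimal sketch of the open-once pattern using 'with'. The get_data()
function here is a hypothetical stand-in for the getData() in your
example, and the temporary file path is only for illustration:

```python
import os
import tempfile


def get_data(i):
    """Hypothetical stand-in for getData() from the original question."""
    return ("chunk %d\n" % i).encode("ascii")


# Illustrative output location; use whatever path your program needs.
path = os.path.join(tempfile.mkdtemp(), "out.bin")

# Open once, write many times. The file is closed when the block exits,
# even if an exception is raised inside the loop.
with open(path, "wb") as f:
    for i in range(5):
        f.write(get_data(i))

closed_after_block = f.closed
```

After the block, closed_after_block is True and all five chunks are on
disk, with a single open and a single close.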

With regard to memory consumption, there are 2 issues to be considered:

1. An open file handle obviously takes up some memory space, but this
is (usually) insignificant in comparison to the actual data being
read/written, in your example the actual bytes. That is not to say you
shouldn't worry about keeping too many file descriptors open: there are
limits per process/user, depending on the OS, on the number of file
descriptors/handles that can be open at any one time.
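
If you are curious what that limit is on your machine, here is a small
sketch, assuming a Unix-like system (the standard-library 'resource'
module is not available on Windows):

```python
import resource  # Unix-only standard-library module

# RLIMIT_NOFILE is the maximum number of open file descriptors
# allowed for the current process (soft limit, hard limit).
soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
print("soft limit: %s, hard limit: %s" % (soft, hard))
```

The values differ between systems and can be changed by an
administrator, so treat the numbers as informational only.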

2. The data itself. If the data is in memory, say in a list, it
obviously takes up space, adding to the memory consumed by your
program. This is where Python's garbage collection (GC) comes in
handy. I am not familiar with the exact type of GC the CPython
interpreter uses, but as a general rule, if you scope your data to
just where it's needed, e.g. within a function, and nothing else
still refers to it (easy to do by accident with reference types such
as lists), then once the function returns, the memory will be
reclaimed by Python. Note that only functions give you this scoping;
a name bound inside a loop or 'if' block is still alive after the
block ends. This is also a good reason to avoid global variables, or
to limit them to things like constants, overall configuration, or
items that are reused many times across the application (e.g.
caching). Data held in a global stays in memory for the lifetime of
the program, even if your code no longer uses it.
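
To see the difference between function-scoped and global data, here is
a rough sketch using weak references, which let you check whether an
object is still alive without keeping it alive yourself. The Payload
class is a made-up stand-in for "some large data". As far as I know,
CPython frees the object as soon as the last reference disappears;
other implementations may take longer, hence the gc.collect() calls:

```python
import gc
import weakref


class Payload(object):
    """Hypothetical stand-in for a large piece of data."""

    def __init__(self, size):
        self.data = bytearray(size)


def process_locally():
    payload = Payload(1024 * 1024)  # only referenced inside this function
    return weakref.ref(payload)     # a weak reference does not keep it alive


local_ref = process_locally()
gc.collect()  # not required on CPython, harmless elsewhere
local_freed = local_ref() is None

kept = Payload(1024 * 1024)         # module-level (global) name
kept_ref = weakref.ref(kept)
gc.collect()
global_alive = kept_ref() is not None

del kept                            # drop the global name explicitly
gc.collect()
global_freed_after_del = kept_ref() is None
```

The function-scoped object is gone once the function returns, while
the global one survives until its name is deleted or rebound.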

So point 2, the actual data items (lists, variables, etc.), is more
significant in terms of memory than file handles. In your example, the
data is produced by getData(), so it is only created as necessary, but
let's say you did:

x = getData()

x is a global, and even if it is only ever used once, the name stays
in scope, so the memory remains. (Strings are immutable, and CPython
does cache some of them, mostly short literals, but data produced at
run time is generally not cached, so the point still applies to string
data.)

Hope that helps.


-- Kamal

-- 
There is more to life than increasing its speed.
 -- Mahatma Gandhi

