[Python-bugs-list] [Bug #131249] cgi.py opens too many (temporary) files

noreply@sourceforge.net noreply@sourceforge.net
Sat, 17 Feb 2001 18:37:37 -0800


Bug #131249, was updated on 2001-Feb-06 04:59
Here is a current snapshot of the bug.

Project: Python
Category: Python Library
Status: Open
Resolution: None
Bug Group: None
Priority: 5
Submitted by: stadt
Assigned to : akuchling
Summary: cgi.py opens too many (temporary) files

Details: cgi.FieldStorage() is used to get the contents of a webform. It
turns out that for each <input> line, a new temporary file is opened.
This causes the script that is using cgi.FieldStorage() to reach the
webserver's limit on the number of open files, as reported by
'ulimit -n'. The standard value for Solaris systems seems to be 64, so
webforms with that many <input> fields cannot be dealt with.
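
To make the arithmetic concrete, here is a minimal sketch (written for a
current Python 3 interpreter, not the Python 2.0 in this report) that reads
the soft limit reported by 'ulimit -n' and checks whether a form would fit.
The helper names open_file_limit and fields_fit, and the figure of 3
reserved descriptors for stdin/stdout/stderr, are assumptions made for
illustration. It also assumes that each field keeps one descriptor for the
life of the request, which is what the failure suggests.

```python
try:
    import resource          # Unix only
except ImportError:
    resource = None


def open_file_limit(default=64):
    """Return the soft RLIMIT_NOFILE value, or `default` where the
    resource module is unavailable (hypothetical fallback)."""
    if resource is None:
        return default
    soft, _hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    return soft


def fields_fit(num_fields, limit, reserved=3):
    """Assume one descriptor per form field plus `reserved` descriptors
    for stdin, stdout and stderr."""
    return num_fields + reserved <= limit


if __name__ == "__main__":
    limit = open_file_limit()
    print("soft limit:", limit)
    print("141 fields fit:", fields_fit(141, limit))
```

Under these assumptions a 141-field form does not fit in a limit of 64 but
does fit in a limit of 160.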

A solution would seem to be to use the same temporary filename, since at
most one file is (temporarily) in use at any one time. I did an
"ls|wc -l" while the script was running, which showed only zeroes and ones.

(I'm using Python for CyberChair, an online paper submission and reviewing
system. The webform under discussion has one input field for each
reviewer, stating the papers he or she is supposed to be reviewing. One
conference that is using CyberChair has almost 140 reviewers. Their
system's open file limit is 64. A system with an open file limit of 260
_is_ able to deal with the same data.)

Follow-Ups:

Date: 2001-Feb-17 18:37
By: stadt

Comment:
I do *not* mean file upload fields. I stumbled upon this with a webform
that contains 141 'simple' input fields like the form you can see here
(which 'only' contains 31 of those input fields):

http://www.cyberchair.org/cgi-cyb/genAssignPageReviewerPapers.py

(use chair/chair to login)

When the sysadmins increased the maximum number of file descriptors per
process to 160, the problem no longer occurred and the webform could be
processed.

This was the error message I got: 

Traceback (most recent call last):
  File "/usr/local/etc/httpd/DocumentRoot/ICML2001/cgi-bin/submitAssignRP.py", line 144, in main
  File "/opt/python/2.0/sparc-sunos5.6/lib/python2.0/cgi.py", line 504, in __init__
  File "/opt/python/2.0/sparc-sunos5.6/lib/python2.0/cgi.py", line 593, in read_multi
  File "/opt/python/2.0/sparc-sunos5.6/lib/python2.0/cgi.py", line 506, in __init__
  File "/opt/python/2.0/sparc-sunos5.6/lib/python2.0/cgi.py", line 603, in read_single
  File "/opt/python/2.0/sparc-sunos5.6/lib/python2.0/cgi.py", line 623, in read_lines
  File "/opt/python/2.0/sparc-sunos5.6/lib/python2.0/cgi.py", line 713, in make_file
  File "/opt/python/2.0/sparc-sunos5.6/lib/python2.0/tempfile.py", line 144, in TemporaryFile
OSError: [Errno 24] Too many open files: '/home/yara/brodley/icml2001/tmp/@26048.61'

I understand why you assumed that this concerned *file* uploads, but that
is not the case. As I reported before, it turns out that for each 'simple'
<input> field a temporary file is used to transfer the contents to the
script that calls cgi.FieldStorage(), even if no files are being uploaded.
The problem is not that too many files are open at the same time (which is
1 at most). It is the *number* of files that is causing the trouble. If
the same temporary file were used, this problem would probably not have
happened.
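
One way to avoid a temporary file per 'simple' field is to keep small
values in memory and only fall back to a real file once a value grows
large. The following is a minimal sketch of that idea in current Python 3,
not the cgi.py implementation. The class name SpillBuffer, its on_disk
attribute and the 1000-byte threshold are assumptions chosen for
illustration.

```python
import io
import tempfile


class SpillBuffer:
    """Hold written data in memory; move it to a temporary file only
    when the total size would exceed `threshold` bytes."""

    def __init__(self, threshold=1000):
        self.threshold = threshold
        self._buf = io.BytesIO()     # no file descriptor used yet
        self.on_disk = False

    def write(self, data):
        if not self.on_disk and self._buf.tell() + len(data) > self.threshold:
            # Too large for memory: copy what we have into a real file.
            spilled = tempfile.TemporaryFile()
            spilled.write(self._buf.getvalue())
            self._buf = spilled
            self.on_disk = True
        self._buf.write(data)

    def getvalue(self):
        """Return everything written so far, then restore the position
        to the end so later writes keep appending."""
        self._buf.seek(0)
        data = self._buf.read()
        self._buf.seek(0, io.SEEK_END)
        return data

    def close(self):
        self._buf.close()
```

With a buffer like this, a form with many short text fields would use no
descriptors at all for those fields. Current Python versions also ship
tempfile.SpooledTemporaryFile, which follows the same approach.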

My colleague Fred Gansevles wrote a possible solution, but mentioned that
it might introduce the need for protection against a 'symlink attack'
(whatever that may be). This solution(?) concentrates on the open file
descriptor problem, while Fred suggests that a redesign of FieldStorage()
would probably be better.

import os, tempfile
AANTAL = 50

class TemporaryFile:
    def __init__(self):
        self.name = tempfile.mktemp("")
        open(self.name, 'w').close()
        self.offset = 0
    
    def seek(self, offset):
        self.offset = offset
    
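    def read(self):
        # 'rb' rather than 'w+b': 'w+b' truncates the file on open,
        # so every read would return an empty string
        fd = open(self.name, 'rb', -1)
        fd.seek(self.offset)
        data = fd.read()
        self.offset = fd.tell()
        fd.close()
        return data
    
    def write(self, data):
        # 'r+b' keeps the existing contents and allows writing at offset
        fd = open(self.name, 'r+b', -1)
        fd.seek(self.offset)
        fd.write(data)
        self.offset = fd.tell()
        fd.close()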

    def __del__(self):
        os.unlink(self.name)

def add_fd(l, n) :
    map(lambda x,l=l: l.append(open('/dev/null')), range(n))

def add_tmp(l, n) :
    map(lambda x,l=l: l.append(TemporaryFile()), range(n))

def main ():
    import getopt, sys
    try:
        import resource
        soft, hard = resource.getrlimit (resource.RLIMIT_NOFILE)
        resource.setrlimit (resource.RLIMIT_NOFILE, (hard, hard))
    except ImportError:
        soft, hard = 64, 1024
    opts, args = getopt.getopt(sys.argv[1:], 'n:t')
    aantal = AANTAL
    tmp = add_fd
    for o, a in opts:
        if o == '-n':
            aantal = int(a)
        elif o == '-t':
            tmp = add_tmp
    print "aantal te gebruiken fd's:", aantal   # Dutch: 'number of fds to be used'
    print 'tmp:', tmp.func_name
    tmp_files = []
    files=[]
    tmp(tmp_files, aantal)
    try:
        add_fd(files,hard)
    except IOError:
        pass
    print "aantal vrije gebruiken fd's:", len(files)  # Dutch: 'number of free fds'

main()

Running the above code:

    python ulimit.py [-n number] [-t]
    The default number is 50, and 'real' fd-s are used for the temporary
    files. With the '-t' flag, 'smart' temporary files are used instead.

    Output:

        $ python ulimit.py
        aantal te gebruiken fd's: 50
        tmp: add_fd
        aantal vrije gebruiken fd's: 970

        $ python ulimit.py -t
        aantal te gebruiken fd's: 50
        tmp: add_tmp
        aantal vrije gebruiken fd's: 1020

        $ python ulimit.py -n 1000
        aantal te gebruiken fd's: 1000
        tmp: add_fd
        aantal vrije gebruiken fd's: 20

        $ python ulimit.py -n 1000 -t
        aantal te gebruiken fd's: 1000
        tmp: add_tmp
        aantal vrije gebruiken fd's: 1020
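
The 'symlink attack' concern mentioned with this workaround probably comes
from tempfile.mktemp(), which only returns a name and leaves a window
between choosing the name and creating the file. A hedged sketch of a
safer way to obtain the name in current Python 3 is shown below. The
helper name create_named_temp is an assumption. tempfile.mkstemp() did not
exist in Python 2.0.

```python
import os
import tempfile


def create_named_temp(suffix=""):
    """Create an empty temporary file atomically and return its name.

    mkstemp() creates and opens the file in one step, so another
    process cannot plant a symlink under the chosen name first."""
    fd, name = tempfile.mkstemp(suffix=suffix)
    os.close(fd)   # release the descriptor at once; reopen by name later
    return name
```

The returned name could replace the mktemp() call plus the empty open() in
the TemporaryFile class above. The caller remains responsible for
unlinking the file.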
-------------------------------------------------------

Date: 2001-Feb-16 21:41
By: akuchling

Comment:
I assume you mean 64 file upload fields, right?  
Can you provide a small test program that triggers the problem?
-------------------------------------------------------

For detailed info, follow this link:
http://sourceforge.net/bugs/?func=detailbug&bug_id=131249&group_id=5470