[ python-Bugs-1610654 ] cgi.py multipart/form-data
SourceForge.net
noreply at sourceforge.net
Thu Dec 7 10:18:19 CET 2006
Bugs item #1610654, was opened at 2006-12-07 09:18
Message generated for change (Tracker Item Submitted) made by Item Submitter
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=1610654&group_id=5470
Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
Category: Performance
Group: None
Status: Open
Resolution: None
Priority: 5
Private: No
Submitted By: Chui Tey (teyc)
Assigned to: Nobody/Anonymous (nobody)
Summary: cgi.py multipart/form-data
Initial Comment:
Uploading large binary files using multipart/form-data can be very inefficient because LF character may occur too frequently, resulting in the read_line_to_outer_boundary looping too many times.
*** cgi.py.Py24 Thu Dec 7 18:46:13 2006
--- cgi.py Thu Dec 7 16:38:04 2006
***************
*** 707,713 ****
last = next + "--"
delim = ""
while 1:
! line = self.fp.readline()
if not line:
self.done = -1
break
--- 703,709 ----
last = next + "--"
delim = ""
while 1:
! line = self.fp_readline()
if not line:
self.done = -1
break
***************
*** 729,734 ****
--- 730,753 ----
delim = ""
self.__write(odelim + line)
+ def fp_readline(self):
+
+ tell = self.fp.tell()
+ buffer = self.fp.read(1 << 17)
+ parts = buffer.split("\n")
+ retlst = []
+ for part in parts:
+ if part.startswith("--"):
+ if retlst:
+ retval = "\n".join(retlst) + "\n"
+ else:
+ retval = part + "\n"
+ self.fp.seek(tell + len(retval))
+ return retval
+ else:
+ retlst.append(part)
+ return buffer
+
def skip_lines(self):
"""Internal: skip lines until outer boundary if defined."""
if not self.outerboundary or self.done:
The patch reads the file in larger increments. For my test file of 138 Mb, it reduced parsing time from 168 seconds to 19 seconds.
#------------ test script --------------------
import cgi
import cgi
import os
import profile
import stat
def run():
filename = 'body.txt'
size = os.stat(filename)[stat.ST_SIZE]
fp = open(filename,'rb')
environ = {}
environ["CONTENT_TYPE"] = open('content_type.txt','rb').read()
environ["REQUEST_METHOD"] = "POST"
environ["CONTENT_LENGTH"] = str(size)
fieldstorage = cgi.FieldStorage(fp, None, environ=environ)
return fieldstorage
import hotshot, hotshot.stats
import time
if 1:
t1 = time.time()
prof = hotshot.Profile("bug1718.prof")
# hotshot profiler will crash with the
# patch applied on windows xp
#prof_results = prof.runcall(run)
prof_results = run()
prof.close()
t2 = time.time()
print t2-t1
if 0:
for key in prof_results.keys():
if len(prof_results[key].value)> 100:
print key, prof_results[key].value[:80] + "..."
else:
print key, prof_results[key]
content_type.txt
----------------------------
multipart/form-data; boundary=----------ThIs_Is_tHe_bouNdaRY_$
----------------------------------------------------------------------
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=1610654&group_id=5470
More information about the Python-bugs-list
mailing list