
Hi, I am little obssessed with my own coded after i found out it is slow. I am trying to construct a html table string from a result that i obtain from database.My code segment look like this: assume result holds 2D list. ******* method******************************** def f1(item): return "<td>"+str(item)+"</td>" ***************************************** ************main program piece*********** tab = "<table align='center' border='1'>\n" for x in result: tab = tab + "<tr>" tab = tab + string.join(map(f1,x),"")+"</tr>\n" tab = tab + "</table>" print "Content-type:text/html\n" print tab ********************************************** Can anyone improve this code segment? Thanks alot
The big problem is that tab=tab+"..." in a loop is expensive since Python must allocate a new string and copy the old one each time. The running time will increase with the square of the number of records, i.e. O(n*n). It may not be intuitive, but putting all the partial strings in a list and then joining them at the end is more efficient as the running time will increase only linearly with the number of records, i.e. O(n). tab = [] tab.append("<table align='center' border='1'>\n") for x in result: tab.append("<tr>") for y in x: tab.extend(("<td>",y,"</td>\n")) tab.append("</tr>") tab.append("</table>") print string.join(tab, "") O(n) will always beat O(n*2) for large enough n, even if the individual operations are slower. Brent

Brent Burley wrote: [snip]
The big problem is that tab=tab+"..." in a loop is expensive since Python must allocate a new string and copy the old one each time. The running time will increase with the square of the number of records, i.e. O(n*n).
It may not be intuitive, but putting all the partial strings in a list and then joining them at the end is more efficient as the running time will increase only linearly with the number of records, i.e. O(n).
An alternative is to use StringIO (or cStringIO if you're not dealing with unicode strings):
tab = [] tab.append("<table align='center' border='1'>\n") for x in result: tab.append("<tr>") for y in x: tab.extend(("<td>",y,"</td>\n")) tab.append("</tr>") tab.append("</table>") print string.join(tab, "")
from StringIO import StringIO # from cStringIO import StringIO result = StringIO() result.write('<table align="center" border="1">\n') for x in result: result.write("<tr>") for y in x: result.write("<td>") result.write(y) result.write("</td>\n") result.write("</tr>") result.write("</table>") print result.getvalue() For an extra speedboost you could do this on top: write = result.write and then you can just use: write("<tr">) and so on. I believe cStringIO can be even faster than the .join() operation, but I haven't timed it. Regards, Martijn

Thanks for the input - I hadn't really thought about StringIO. Now that I've taken a look ... StringIO is no different than what I've shown - it appends each string to a list and then uses string.join to build the final result. There's also additional overhead since it must support random access file operations. cStringIO should be faster, but it might not be. It allocates a single string buffer and reallocates it to be larger whenever it runs out of room (doubling the previous size). You end up reallocating and copying the string log2(n) times. If you were building a very large string, the difference could be significant. Brent Martijn Faassen wrote:
Brent Burley wrote: [snip]
The big problem is that tab=tab+"..." in a loop is expensive since Python must allocate a new string and copy the old one each time. The running time will increase with the square of the number of records, i.e. O(n*n).
It may not be intuitive, but putting all the partial strings in a list and then joining them at the end is more efficient as the running time will increase only linearly with the number of records, i.e. O(n).
An alternative is to use StringIO (or cStringIO if you're not dealing with unicode strings):
I believe cStringIO can be even faster than the .join() operation, but I haven't timed it.
participants (2)
-
Brent Burley
-
Martijn Faassen