Fastest way to retrieve and write html contents to file
nospam at
Mon May 2 01:59:51 EDT 2016
On 5/2/2016 1:15 AM, Stephen Hansen wrote:
> On Sun, May 1, 2016, at 10:00 PM, DFS wrote:
>> I tried the 10-loop test several times with all versions.
> Also how, _exactly_, are you testing this?
> C:\Python27>python -m timeit "filename='C:\\test.txt';
> webpage=''; import urllib2;
> r = urllib2.urlopen(webpage); f = open(filename, 'w');
> f.write(; f.close();"
> 10 loops, best of 3: 175 msec per loop
> That's a whole lot less than the 0.88 secs.
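For reference, the same "10 loops, best of 3" measurement can be taken from inside a script with the timeit module instead of the command line. This is only a rough sketch for Python 3: `write_payload` and `best_per_loop` are made-up names, and the payload is a local stand-in so it runs without a network. A real test would swap in the actual download call.

```python
import os
import tempfile
import timeit

def write_payload(path, payload=b"<html></html>"):
    # Stand-in for "fetch the page and write it to disk"; no network involved.
    with open(path, "wb") as f:
        f.write(payload)

def best_per_loop(func, loops=10, repeats=3):
    # Mirrors timeit's command-line report: run `loops` calls per batch,
    # repeat the batch `repeats` times, keep the fastest batch,
    # then divide by `loops` to get seconds per call.
    totals = timeit.repeat(func, number=loops, repeat=repeats)
    return min(totals) / loops

path = os.path.join(tempfile.mkdtemp(), "test.txt")
seconds = best_per_loop(lambda: write_payload(path))
print("%d loops, best of %d: %.3g msec per loop" % (10, 3, seconds * 1000))
```

Taking the minimum over the repeats is what timeit itself does, on the theory that the fastest batch is the one least disturbed by other activity on the machine.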
import requests, urllib, urllib2, pycurl
import time

webpage = ""
webfile = "D:\\econpy001.html"
loops = 10

startTime = time.clock()
for i in range(loops):
    pass  # loop body not preserved in the archived message
endTime = time.clock()
print "Finished urllib in %.2g seconds" %(endTime-startTime)

startTime = time.clock()
for i in range(loops):
    r = urllib2.urlopen(webpage)
    f = open(webfile,"w")
    f.write(r.read())  # assumed: write/close lines missing from the archive
    f.close()
endTime = time.clock()
print "Finished urllib2 in %.2g seconds" %(endTime-startTime)

startTime = time.clock()
for i in range(loops):
    r = requests.get(webpage)
    f = open(webfile,"w")
    f.write(r.content)  # assumed: write/close lines missing from the archive
    f.close()
endTime = time.clock()
print "Finished requests in %.2g seconds" %(endTime-startTime)

startTime = time.clock()
for i in range(loops):
    with open(webfile + str(i) + ".txt", 'wb') as f:
        c = pycurl.Curl()
        c.setopt(c.URL, webpage)
        c.setopt(c.WRITEDATA, f)
        c.perform()  # assumed: perform/close lines missing from the archive
        c.close()
endTime = time.clock()
print "Finished pycurl in %.2g seconds" %(endTime-startTime)
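For anyone on Python 3, where urllib2 no longer exists, roughly the same download-and-write step might look like the sketch below. `fetch_to_file` is a made-up name, not something from the script above, and the demo reads a local file:// URL purely so it runs without a network. Point it at a real page to get comparable timings.

```python
import os
import shutil
import tempfile
import time
import urllib.request
from pathlib import Path

def fetch_to_file(url, dest):
    # Stream the response body straight to disk in binary mode,
    # so no decoding or newline translation happens on the way.
    with urllib.request.urlopen(url) as response, open(dest, "wb") as out:
        shutil.copyfileobj(response, out)
    return os.path.getsize(dest)

# Local stand-in for the web page (assumption: keeps the demo offline).
workdir = tempfile.mkdtemp()
source = Path(workdir) / "source.html"
source.write_bytes(b"<html><body>test</body></html>")
dest = os.path.join(workdir, "copy.html")

start = time.perf_counter()
for _ in range(10):
    size = fetch_to_file(source.as_uri(), dest)
elapsed = time.perf_counter() - start
print("Finished urllib.request in %.2g seconds" % elapsed)
```

time.perf_counter() is used here because time.clock() was removed in Python 3.8.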
$ python
Finished urllib in 0.88 seconds
Finished urllib2 in 0.83 seconds
Finished requests in 0.89 seconds
Finished pycurl in 1.1 seconds
Those results are consistent. They go up or down a little, but never
below 0.82 seconds (for urllib2) or above 1.2 seconds (for pycurl).
VBScript is consistently 0.44 to 0.48 seconds.
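One thing to keep in mind when comparing against the timeit figure: the Python totals above cover all 10 loops (the clock starts before each for loop and stops after it), while timeit reports time per loop. Dividing by the loop count puts both on the same scale. The snippet below just does that arithmetic on the numbers already printed above. It is written for Python 3, and the dict is only a convenient way to hold them.

```python
loops = 10

# Totals for 10 loops, copied from the script output above (seconds).
totals = {"urllib": 0.88, "urllib2": 0.83, "requests": 0.89, "pycurl": 1.1}

# Convert each total to milliseconds per loop.
per_loop_msec = {name: round(total / loops * 1000) for name, total in totals.items()}

for name, msec in per_loop_msec.items():
    print("%-8s %d msec per loop" % (name, msec))
```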