Fastest way to retrieve and write html contents to file
DFS
nospam at dfs.com
Mon May 2 02:47:01 EDT 2016
On 5/2/2016 2:05 AM, Steven D'Aprano wrote:
> On Monday 02 May 2016 15:00, DFS wrote:
>
>> I tried the 10-loop test several times with all versions.
>>
>> The results were 100% consistent: VBScript xmlHTTP was always 2x faster
>> than any python method.
>
>
> Are you absolutely sure you're comparing the same job in two languages?
As near as I can tell. In VBScript I'm actually releasing various
objects by setting them to Nothing (that adds to the time), but I don't
do that in python. I don't know enough to even know if it's necessary,
or good practice, or what.
> Is VB using a local web cache, and Python not?
I'm not specifying a local web cache with either (wouldn't know how or
where to look). If you have Windows, you can try it.
-------------------------------------------------------------------
Option Explicit
Dim xmlHTTP, fso, fOut, startTime, endTime, webpage, webfile, i
webpage = "http://econpy.pythonanywhere.com/ex/001.html"
webfile = "D:\econpy001.html"
startTime = Timer
For i = 1 To 10
    ' Fetch the page
    Set xmlHTTP = CreateObject("MSXML2.serverXMLHTTP")
    xmlHTTP.Open "GET", webpage
    xmlHTTP.Send
    ' Write the response to disk, overwriting any existing file
    Set fso = CreateObject("Scripting.FileSystemObject")
    Set fOut = fso.CreateTextFile(webfile, True)
    fOut.WriteLine xmlHTTP.ResponseText
    fOut.Close
    ' Release the object references
    Set fOut = Nothing
    Set fso = Nothing
    Set xmlHTTP = Nothing
Next
endTime = Timer
' Keep this statement on one line: VBScript needs "_" to continue a line
WScript.Echo "Finished VBScript in " & FormatNumber(endTime - startTime, 3) & " seconds"
-------------------------------------------------------------------
Save it to a .vbs file and run it like this:
$cscript /nologo filename.vbs
> Are you saving files with both
> tests? To the same local drive? (To ensure you aren't measuring the
> difference between "write this file to a slow IDE hard disk, write that file
> to a fast SSD".)
Identical functionality (retrieve webpage, write html to file). Same
webpage, written to the same folder on the same hard drive (not SSD).
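For comparison, a rough Python 3 sketch of the same job (retrieve the page, write it to a file, repeat). It is not the urllib2 code that was actually timed; the function name fetch_and_save and its parameters are made up for illustration, and urllib.request stands in for Python 2's urllib2.

```python
import time
from urllib.request import urlopen


def fetch_and_save(url, path, repeats=10):
    """Fetch `url` and write the body to `path`, `repeats` times.

    Returns the total elapsed time in seconds. The name and signature
    are hypothetical, chosen to mirror the VBScript loop.
    """
    start = time.perf_counter()
    for _ in range(repeats):
        # Retrieve the raw bytes of the page
        with urlopen(url) as response:
            html = response.read()
        # Overwrite the output file on every pass, as the VBScript does
        with open(path, "wb") as f_out:
            f_out.write(html)
    return time.perf_counter() - start
```

Calling fetch_and_save("http://econpy.pythonanywhere.com/ex/001.html", "econpy001.html") would run the 10-loop test; the output filename here is only a placeholder.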
The 10 file writes (open/write/close) don't make a meaningful difference
at all:
VBScript 0.0156 seconds
urllib2 0.0034 seconds
This file is 3.55K.
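One way to time the writes on their own, with no network involved, is sketched below. This is not the code that produced the figures above; time_file_writes is a made-up name, and the payload is a stand-in of roughly the same size as the page.

```python
import time


def time_file_writes(data, path, repeats=10):
    """Return total seconds spent on `repeats` open/write/close cycles.

    Hypothetical helper: it only measures local file I/O, so any
    difference between runs excludes download time.
    """
    start = time.perf_counter()
    for _ in range(repeats):
        with open(path, "wb") as f_out:
            f_out.write(data)
    return time.perf_counter() - start


# Stand-in payload of about 3.55K (3.55 * 1024 is roughly 3635 bytes)
SAMPLE_PAYLOAD = b"x" * 3635
```

Something like time_file_writes(SAMPLE_PAYLOAD, "econpy001.html") then gives the write-only cost to subtract from the full loop.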
> Once you are sure that you are comparing the same task in two languages,
> then make sure the measurement is meaningful. If you change from a (let's
> say) 1 KB file to a 100 KB file, do you see the same 2 x difference? What if
> you increase it to a 10000 KB file?
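The size comparison suggested above can be tried without hitting anyone's website by serving a payload of a chosen size from a throwaway local server. The sketch below assumes Python 3; make_handler and time_downloads are hypothetical names. Because everything stays on 127.0.0.1, it measures client overhead only, not real network latency.

```python
import threading
import time
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.request import ProxyHandler, build_opener


def make_handler(payload):
    """Build a request handler class that always returns `payload`."""
    class PayloadHandler(BaseHTTPRequestHandler):
        def do_GET(self):
            self.send_response(200)
            self.send_header("Content-Type", "text/html")
            self.send_header("Content-Length", str(len(payload)))
            self.end_headers()
            self.wfile.write(payload)

        def log_message(self, format, *args):
            pass  # silence the per-request log lines

    return PayloadHandler


def time_downloads(size_bytes, repeats=10):
    """Download a `size_bytes` payload `repeats` times from a local server.

    Returns (elapsed_seconds, total_bytes_received).
    """
    payload = b"x" * size_bytes
    # Port 0 asks the OS for any free port
    server = HTTPServer(("127.0.0.1", 0), make_handler(payload))
    threading.Thread(target=server.serve_forever, daemon=True).start()
    url = "http://127.0.0.1:%d/" % server.server_port
    # Empty ProxyHandler: ignore any proxy settings for the local request
    opener = build_opener(ProxyHandler({}))
    total = 0
    try:
        start = time.perf_counter()
        for _ in range(repeats):
            with opener.open(url) as response:
                total += len(response.read())
        elapsed = time.perf_counter() - start
    finally:
        server.shutdown()
        server.server_close()
    return elapsed, total
```

Running time_downloads(1024), time_downloads(100 * 1024) and time_downloads(10000 * 1024) would cover the 1 KB, 100 KB and 10000 KB cases.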
Do you know a webpage I can hit 10x repeatedly to download a good-sized
file? I'm always paranoid they'll block me, thinking I'm a
"professional" web scraper or something.
Thanks