Unpythonic Python
David Abrahams
dave at boost-consulting.com
Wed Aug 25 15:24:35 EDT 2004
Rob Williscroft <rtw at freenet.co.uk> writes:
> David Abrahams wrote in news:uy8k31as1.fsf at boost-consulting.com in
> comp.lang.python:
>
>> Rob Williscroft <rtw at freenet.co.uk> writes:
>>
>>> David Abrahams wrote in news:uzn4j2s38.fsf at boost-consulting.com in
>>> comp.lang.python:
>>>
>>>>> That's not the problem. I can download the file reliably from
>>>>> other machines.
>>>
>>> At the same time, using http ?
>>
>> I can download the file reliably using IE from my WinXP box.
>>
>> I can download the file reliably using urllib from Cygwin Python 2.3.2
>>
>> The 2nd element returned by urlretrieve is
>
> Which version, the one that works or the one that doesn't ?
>
>>
>> 'Date: Wed, 25 Aug 2004 14:50:17 GMT\r\nServer: Apache/2.0.40 (Red
>> Hat Linux)\r\nLast-Modified: Wed, 25 Aug 20 2 GMT\r\nETag:
The one that works.
> Something is missing here:
>
> Last-Modified: Wed, 25 Aug 20 2 GMT
>
> Contrast:
>
> Wed, 25 Aug 2004 14:50:17 GMT
Where did that come from, what do you think is missing, and why?
>> "b63d5b-20ec84b-18057e80"\r\nAccept-Ranges: bytes\r\nContent-Length:
>> 34523211\r\nContent-Type: n/x-bzip2\r\nConnection: close\r\n'
>
> 34 MB's ( I got 6 MB's )
It's 34MB.
>>>> Trying again with Python 2.3 on Cygwin.
>>
>> As you can see from the above, it works. Is there a known urllib bug
>> in earlier Pythons?
>
> Sorry I don't know, but I've seen the same truncation with no python,
> and no unix.
Argh.
>>> Is it possible the file is being (re) uploaded (via cvs) during your
>>> cron job's download, thus truncating your download ?
>>
>> I don't think so.
>
> Can you test whether or not this is happening ? I.e if you don't
> get the full 34523211 bytes re-download and compare the above
> Length, ETag and Last-Modified.
>
I did some tests, but didn't come up with anything conclusive. I set
my cron job to start 3 hours later. We'll see.
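The check suggested above could be scripted roughly as follows. This is only a sketch in modern Python 3, not the Python 2.3 discussed in this thread; `looks_truncated` and `same_remote_file` are hypothetical helper names, and `headers` is assumed to be any mapping-like object such as the second element returned by `urlretrieve`.

```python
import os

def looks_truncated(path, headers):
    """Return True if the file on disk is shorter than Content-Length."""
    expected = int(headers["Content-Length"])
    return os.path.getsize(path) < expected

def same_remote_file(headers_a, headers_b):
    """Return True if two responses appear to describe the same server-side file."""
    keys = ("Content-Length", "ETag", "Last-Modified")
    return all(headers_a.get(k) == headers_b.get(k) for k in keys)
```

If a download looks truncated, fetching again and calling `same_remote_file` on the two header sets would show whether the tarball was replaced on the server in between.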
>>> Perhaps you should change to cvs:
>>>
>>> os.system( 'cvs ... ' )
>>
>> The problem with that is that I want to capture the whole CVS
>> history, not just today's state.
>
> I was suggesting you get the tarball via cvs, though presumably
> sourceforge don't give you the option.
No they don't.
> http has the problem that
> the server will just truncate the download if the source file
> gets replaced.
>
>>
>>> FWIW, I tried downloading with IE using the link above I got a
>>> truncated 6 and bit MB's (16:15 BST (UTC +0100)).
>> ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>>
>> Sorry, what does that mean? Did it show that message in a dialog,
>> or...?
>>
>
> No, I got a download complete, but the file was only 6 MB's, bzip2 -t
> told me the file was truncated, the (16:15 ...) is the time I tried
> downloading, BST = British Summer Time, though you wouldn't know it
> from the weather :).
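The `bzip2 -t` integrity test mentioned above can be approximated from Python. A minimal sketch, assuming Python 3's standard `bz2` module; `bz2_is_intact` is a hypothetical helper name, not something used in this thread.

```python
import bz2

def bz2_is_intact(path, chunk_size=1 << 20):
    """Decompress the whole file, discarding the output; False on any error."""
    try:
        with bz2.open(path, "rb") as f:
            # Read in chunks so a 34 MB archive is never held in memory at once.
            while f.read(chunk_size):
                pass
    except (OSError, EOFError):
        # EOFError: the stream ended early (truncated); OSError: invalid data.
        return False
    return True
```

A cron job could call this right after the download and retry when it returns False.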
>
> Further I just ran:
>
> import urllib
>
> filename, headers = \
> urllib.urlretrieve(
> 'http://cvs.sourceforge.net/cvstarballs/boost-cvsroot.tar.bz2',
> 'boost-cvsroot.tar.bz2')
>
> print filename
>
> print headers
>
> boost-cvsroot.tar.bz2
> Date: Wed, 25 Aug 2004 16:53:20 GMT
> Server: Apache/2.0.40 (Red Hat Linux)
> Last-Modified: Wed, 25 Aug 2004 14:14:02 GMT
> ETag: "b63d5b-20ec84b-18057e80"
> Accept-Ranges: bytes
> Content-Length: 34523211
> Content-Type: application/x-bzip2
> Connection: close
>
> The script ended at 17:59 BST. Note the difference between the two
> times in the headers, suggesting the file was modified 1:45 min's
> ago ~ the same time my attempted download with IE failed.
That's odd! Your (failed) download modified the file being
downloaded??
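For reference, the gap between the two header times quoted above can be computed directly. A minimal sketch in Python 3 using the standard `email.utils` parser; `header_gap` is a hypothetical helper name, and the header values are copied from the output quoted above.

```python
from email.utils import parsedate_to_datetime

def header_gap(headers):
    """Return Date minus Last-Modified as a datetime.timedelta."""
    served = parsedate_to_datetime(headers["Date"])
    modified = parsedate_to_datetime(headers["Last-Modified"])
    return served - modified

headers = {
    "Date": "Wed, 25 Aug 2004 16:53:20 GMT",
    "Last-Modified": "Wed, 25 Aug 2004 14:14:02 GMT",
}
print(header_gap(headers))  # 2:39:18
```

Last-Modified reports when the server's copy of the file last changed, so the gap only says how old that copy was when the response was sent.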
--
Dave Abrahams
Boost Consulting
http://www.boost-consulting.com