How do I 'stat' online files?

Gabriel Genellina gagsl-py2 at yahoo.com.ar
Wed Jul 25 03:23:49 CEST 2007


On Tue, 24 Jul 2007 10:47:16 -0300, Carsten Haese <carsten at uniqsys.com>
wrote:

> On Tue, 2007-07-24 at 09:07 -0400, DB Daniel Brown wrote:
>> I am working on a program that needs to stat files (gif, swf, xml,
>> dirs, etc) from the web. I know how to stat a local file…
>> but I can’t figure out how to stat a file that resides on a web
>> server.
>
> That's because urlopen returns a file-like object, not a file. The best
> you can hope for is to inspect the headers that the web server returns:
>
>>>> import urllib
>>>> f = urllib.urlopen("http://www.python.org")
>>>> f.headers['last-modified']
> 'Mon, 23 Jul 2007 20:35:52 GMT'
>>>> f.headers.items()
> [('content-length', '14053'), ('accept-ranges', 'bytes'), ('server',
> 'Apache/2.2.3 (Debian) DAV/2 SVN/1.4.2 mod_ssl/2.2.3 OpenSSL/0.9.8c'),
> ('last-modified', 'Mon, 23 Jul 2007 20:35:52 GMT'), ('connection',
> 'close'), ('etag', '"60193-36e5-39089a00"'), ('date', 'Tue, 24 Jul 2007
> 13:42:57 GMT'), ('content-type', 'text/html')]
>
> Maybe that's good enough for your needs.

This generates an HTTP GET request, which transfers the contents too,
unnecessarily. An HTTP HEAD request would be better, as only the
headers are transferred. Since urllib can't generate a HEAD request, one
has to use httplib instead (it's just a bit more "low level"):

py> import httplib
py> conn = httplib.HTTPConnection("www.python.org")
py> conn.request("HEAD", "/images/python-logo.gif")
py> resp = conn.getresponse()
py> resp.getheaders()
[('content-length', '2549'), ('accept-ranges', 'bytes'), ('server',
'Apache/2.2.3 (Debian) DAV/2 SVN/1.4.2 mod_ssl/2.2.3 OpenSSL/0.9.8c'),
('last-modified', 'Tue, 24 Jul 2007 23:41:20 GMT'), ('etag',
'"6015b-9f5-ee27c800"'), ('date', 'Wed, 25 Jul 2007 01:12:43 GMT'),
('content-type', 'image/gif')]
py> conn.close()
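
To get something closer to what os.stat returns, the header values can be
converted to numbers. The following is only a rough sketch: the names
headers_to_stat and stat_url are made up for illustration, and it assumes
the server actually sends Content-Length and Last-Modified (many dynamic
pages send neither). The import fallback is there because httplib is
called http.client in Python 3.

```python
try:
    import httplib                    # Python 2, as in the session above
except ImportError:
    import http.client as httplib     # the same module under its Python 3 name

from email.utils import parsedate_tz, mktime_tz


def headers_to_stat(headers):
    """Hypothetical helper: reduce (name, value) header pairs to a stat-like dict."""
    # Header names are case-insensitive, so normalise them before lookup.
    h = dict((name.lower(), value) for name, value in headers)

    size = h.get('content-length')
    modified = h.get('last-modified')
    # parsedate_tz returns None when the date cannot be parsed.
    parsed = parsedate_tz(modified) if modified else None

    return {
        'size': int(size) if size is not None else None,   # bytes
        'mtime': mktime_tz(parsed) if parsed else None,    # seconds since the epoch (UTC)
        'type': h.get('content-type'),
    }


def stat_url(host, path):
    """Hypothetical helper: send a HEAD request and summarise the response headers."""
    conn = httplib.HTTPConnection(host)
    try:
        conn.request("HEAD", path)
        resp = conn.getresponse()
        info = headers_to_stat(resp.getheaders())
        info['status'] = resp.status   # e.g. 200, or 404 if the file does not exist
        return info
    finally:
        conn.close()
```

Calling stat_url("www.python.org", "/images/python-logo.gif") would then
return a dict with the size, modification time, content type and HTTP
status. Checking the status first is advisable, since a 404 response also
carries headers.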

-- 
Gabriel Genellina



