Unable to retrieve compressed HTML URL
rushik
rushik.upadhyay at gmail.com
Sun Feb 15 16:17:53 EST 2009
On Feb 15, 11:56 am, rushik <rushik.upadh... at gmail.com> wrote:
> Hi,
> I am trying to build a Python script which retrieves and analyzes
> various URLs and generates reports.
>
> Some of the URLs are like "http://xyz.com/test.html.gz". I am trying
> to retrieve them using the urllib2 library and then decompress them
> using the gzip library.
>
> For example, say server_url is http://xyz.com/test.html.gz:
>
> logpage = urllib2.urlopen(server_url)
> html_content = logpage.read()
> logpage.close()
>
> gz_tmp = open("gzip.txt.gz", "w")
> gz_tmp.write(html_content)
> gz_tmp.close()
> f = gzip.open("gzip.txt.gz", "rb")
> file_content = f.read()
> f.close()
>
> #return the resulting html content.
> return html_content
>
> On executing the code, it gives:
>
> zlib.error - Error -3 while decompressing: invalid distance too far
> back
>
> I am able to retrieve the same URL as a proper HTML page from a
> browser.
>
> Please let me know if I am doing something wrong here, or if there is
> a better way to do this.
>
> Thanks,
> R
I got the solution! I am now using urllib.urlretrieve.
thx,
R
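
For anyone finding this thread later, here is a minimal sketch of the
same task. It is a guess at the cause, not a confirmed diagnosis: the
quoted code opens the temporary file with "w" (text mode), and on
Windows that translates newline bytes and can corrupt binary gzip data,
which would be consistent with the zlib error. The sketch uses Python 3
names (urllib.request instead of urllib2), and the function names are
made up for illustration.

```python
import gzip
import urllib.request


def decompress_gzip_bytes(data):
    """Decompress gzip-compressed bytes entirely in memory."""
    return gzip.decompress(data)


def decompress_via_file(data, path):
    """Same result through a temporary file, as in the quoted code,
    but with the file opened in binary mode ("wb", not "w")."""
    with open(path, "wb") as gz_tmp:
        gz_tmp.write(data)
    with gzip.open(path, "rb") as f:
        return f.read()


def fetch_gzip_html(url):
    """Download a .gz resource and return the decompressed bytes.

    Assumes the server sends the raw gzip file and that the URL is
    reachable; no error handling is attempted here.
    """
    with urllib.request.urlopen(url) as response:
        return decompress_gzip_bytes(response.read())
```

Something like fetch_gzip_html("http://xyz.com/test.html.gz") would then
return the page as bytes, which can be decoded once the page's encoding
is known. The in-memory version avoids the temporary file altogether.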