[Tutor] screen scraping without the request
Martin Walsh
mwalsh at groktech.org
Sun Apr 22 19:02:48 CEST 2007
Hi Rohan,
You might also try the LiveHTTPHeaders Firefox extension. It is also
very good for this type of reverse engineering.
http://livehttpheaders.mozdev.org/index.html
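
Once the extension shows you the login POST (its URL and form fields),
replaying it from Python might look roughly like the sketch below. This
is only an outline under assumptions: the URLs and the field names
"username" and "password" are placeholders, not taken from the real
site, and the helper names are made up for illustration. Substitute
whatever the captured request actually contains. It uses the Python 3
standard library; on Python 2 the equivalent modules are urllib2 and
cookielib.

```python
import http.cookiejar
import urllib.parse
import urllib.request


def make_session():
    """Return an opener that stores and resends cookies, plus its jar."""
    jar = http.cookiejar.CookieJar()
    opener = urllib.request.build_opener(
        urllib.request.HTTPCookieProcessor(jar))
    return opener, jar


def build_login_request(login_url, fields):
    """Build the login POST. The field names must match the captured request."""
    body = urllib.parse.urlencode(fields).encode("utf-8")
    # Supplying data makes urllib send a POST instead of a GET.
    return urllib.request.Request(login_url, data=body)


def fetch_after_login(login_url, fields, target_url):
    """Log in, then fetch target_url with the same cookies (hypothetical flow)."""
    opener, _jar = make_session()
    # The session cookie set by the login response is kept in the jar.
    opener.open(build_login_request(login_url, fields)).close()
    with opener.open(target_url) as resp:
        return resp.read().decode("utf-8", errors="replace")


# Placeholder usage; none of these values come from the real site:
# html = fetch_after_login("http://example.com/login",
#                          {"username": "me", "password": "secret"},
#                          "http://example.com/members/report")
```

If the site needs you to follow a couple of links before the target
page, the same opener can request each of them in turn, since the
cookie jar carries the session along.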
HTH,
Marty
Luke Paireepinart wrote:
> Kent Johnson wrote:
>> Rohan Deshpande wrote:
>>
>>> Hi All,
>>>
>>> the previous thread on screen scraping got me thinking of starting a
>>> similar project. However, the problem is I have no idea what the POST
>>> request is as there is no escape string after the URL when the resulting
>>> page comes up. I essentially need to pull the HTML from a page that is
>>> generated on a user's machine and pipe it into a Python script. How
>>> should I go about doing this? Is it possible/feasible to decipher the
>>> POST request and get the HTML, or use some screen scraping Python libs a
>>> la the JavaScript DOM hacks? I was thinking of the possibilities of the
>>> former, but the interaction on the site is such that the user enters a
>>> username/password and goes through a couple links before getting to the
>>> page I need. Perhaps Python can use the session cookie and then pull
>>> the right page?
>>>
> Have you tried using Firebug? It's an extension for Firefox.
> You might be able to run it while you're navigating the site, and see
> the communication between you and the server and get the POST that way,
> but I'm not completely certain about that.
> -Luke