Saving a page loaded using the webbrowser library?
Irmen de Jong
irmen at -NOSPAM-xs4all.nl
Thu Mar 25 04:13:27 EDT 2010
On 3/25/10 8:41 AM, Dr. Benjamin David Clarke wrote:
> Does anyone know of a way to save a loaded web page to a file after
> opening it with a webbrowser.open() call?
>
> Specifically, what I want to do is get the raw HTML from a web page.
> This web page uses Javascript. I need the resulting HTML after the
> Javascript has been run. I've seen a lot about trying to get Python to
> run Javascript but there doesn't seem to be any promising solution. I
> can get the raw HTML that I want by saving the page after it has been
> loaded via the webbrowser.open() call. Is there any way to automate
> this? Does anyone have any ideas for better approaches to this
> problem? I don't need it to be pretty or anything.
I think I would use an appropriate GUI automation library to simulate
user interaction with the web browser that you just started, and e.g.
select the File > Save page as > HTML only menu option from the browser...
If the JavaScript heavily modifies the DOM, however, that might not work.
You might then need additional tooling such as the Web Developer toolbar
for Firefox, which lets you use View Source > View Generated Source.
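
A minimal sketch of the GUI automation idea, not a tested recipe. It
assumes the third-party pyautogui library is installed, that the browser
window has keyboard focus, that Ctrl+S opens its "Save page as" dialog, and
that the dialog accepts a typed path followed by Enter. The function names
and the fixed delays are made up for illustration, and the saved file may
still not reflect a DOM that the JavaScript has modified.

```python
import time
import webbrowser
from pathlib import Path


def wait_for_saved_page(path, timeout=30.0, poll=0.5):
    """Return the text of `path` once it exists and its size is stable."""
    path = Path(path)
    deadline = time.monotonic() + timeout
    last_size = -1
    while time.monotonic() < deadline:
        if path.exists():
            size = path.stat().st_size
            # Two consecutive polls with the same non-zero size are taken
            # as a sign that the browser has finished writing the file.
            if size > 0 and size == last_size:
                return path.read_text(encoding="utf-8", errors="replace")
            last_size = size
        time.sleep(poll)
    raise TimeoutError(f"{path} was not saved within {timeout} seconds")


def open_and_save(url, save_path, load_delay=10.0):
    """Open `url` in the default browser and drive its save dialog."""
    # Assumption: pyautogui is a third-party package, imported here so the
    # rest of the module still works without it.
    import pyautogui

    webbrowser.open(url)
    time.sleep(load_delay)           # crude wait for the page and its scripts
    pyautogui.hotkey("ctrl", "s")    # assumed shortcut for "Save page as"
    time.sleep(2.0)                  # give the dialog time to appear
    pyautogui.write(str(save_path))  # type the target file name
    pyautogui.press("enter")
    return wait_for_saved_page(save_path)
```

Something like html = open_and_save("http://example.com/", "/tmp/page.html")
would then return the saved markup, provided nothing else grabs the keyboard
focus while it runs.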
irmen