[Tutor] saving webpage as webarchive
Danny Yoo
dyoo at hashcollision.org
Tue Mar 1 03:42:24 EST 2016
> I want to save a webpage as a webarchive, and not just get the text.
> I hope there’s a way to do it without saving all of the images separately.
> And even if I do have to download them separately, then how would I combine everything into the HTM webarchive?
If I understand your question properly, I think you're asking for
something like the use of the 'wget' utility, which knows how to
download an entire web site:
http://www.linuxjournal.com/content/downloading-entire-web-site-wget
Trying to do this as a Python program is not a simple task; it's
equivalent to writing a web crawler.
http://www-rohan.sdsu.edu/~gawron/python_for_ss/course_core/book_draft/web/web_intro.html
explains some basic ideas.
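
To give a feel for why this amounts to a crawler, here is a minimal sketch of just the first step: finding the images, scripts, and stylesheets that a page refers to. It uses only the standard library. The names ResourceCollector, find_resources, and fetch are my own, made up for illustration, and the sketch does not attempt to write a webarchive file or rewrite links the way wget can.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin
from urllib.request import urlopen


class ResourceCollector(HTMLParser):
    """Collect absolute URLs of images, scripts, and stylesheets (hypothetical helper)."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.resources = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag in ("img", "script") and attrs.get("src"):
            # Resolve relative paths against the URL of the page itself.
            self.resources.append(urljoin(self.base_url, attrs["src"]))
        elif tag == "link" and attrs.get("href"):
            rel = (attrs.get("rel") or "").lower().split()
            if "stylesheet" in rel:
                self.resources.append(urljoin(self.base_url, attrs["href"]))


def find_resources(html_text, base_url):
    """Return the resource URLs referenced by html_text, in document order."""
    collector = ResourceCollector(base_url)
    collector.feed(html_text)
    return collector.resources


def fetch(url):
    """Download one URL and return its raw bytes."""
    with urlopen(url) as response:
        return response.read()
```

A real program would call fetch() on the page, decode it, pass the text to find_resources(), then fetch each resource in turn and decide how to store everything together. Handling CSS that pulls in more files, pages built by JavaScript, and errors along the way is where most of the work lies, which is why wget is usually the easier route.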