Ideas on how to parse a dynamically generated html pages
usernet at ilthio.net
Fri Oct 22 04:49:30 CEST 2010
On 2010-10-22, chad <cdalten at gmail.com> wrote:
> or less what happens is when a person clicks on url, a pop up menu
> appears asking the users for some data. How would I go about
> automating this? Just curious because the web spider doesn't actually
> pick up the urls that generate the menu. I'm assuming the actual url
> link is dynamically generated?
You have two options:
1. If the page's JavaScript is dynamically generating menus, then it is getting the data it uses to generate
those menus from somewhere. Once you have found that resource,
you can access it yourself with a request from your Python code.
This is generally the best approach if possible.
2. You can automate a browser through a COM/XPCOM/etc. interface
which allows you to access the DOM object in real time as it is
There are libraries that will do this as well. I have used
this on heavy AJAX-style interfaces with mountains of spaghetti
try to understand.
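
A rough sketch of option 1, assuming the menu data comes back as JSON
from a plain GET request. The URL, the "id" parameter, and the
"items"/"label"/"url" field names are all made up for illustration;
watch the browser's network traffic to find what the real page uses.

```python
import json
import urllib.parse
import urllib.request

# Hypothetical endpoint: replace with the resource the page's script
# actually requests when the menu pops up.
MENU_URL = "http://example.com/menu_data"


def build_menu_request(base_url, params):
    """Build the GET request the page's script would otherwise send."""
    query = urllib.parse.urlencode(params)
    return urllib.request.Request(base_url + "?" + query)


def parse_menu(body):
    """Pull (label, url) pairs out of a JSON body (assumed layout)."""
    data = json.loads(body)
    return [(item["label"], item["url"]) for item in data["items"]]


def fetch_menu(base_url, params):
    """Fetch the menu data directly, bypassing the browser."""
    request = build_menu_request(base_url, params)
    with urllib.request.urlopen(request) as response:
        return parse_menu(response.read().decode("utf-8"))
```

Calling fetch_menu(MENU_URL, {"id": 7}) would then return the menu
entries as a list of tuples, provided the real resource matches the
layout assumed above. If the site returns HTML fragments or XML
instead, only parse_menu needs to change.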