Web automation

Paul Boddie paul at boddie.org.uk
Wed Nov 9 10:23:31 EST 2005


qwwee... at yahoo.it wrote:
> The contribution of Paul Boddie is valuable. I too examined DCOP
> and even chose as browser Konqueror, being a KDE application.
> But DCOP doesn't go to such a low level. It is not possible
> to send a simulated keystroke from one KDE application to another.

I imagine that you can send keystrokes using the xlib package described
earlier. Nevertheless, a "proper" automation interface doesn't work at
that level. Instead, you work with more high-level concepts than
sending keypresses and scanning around the window list to see what
happened.

One example of automation is the OutlookExplorer program I wrote [1]
which connects to Microsoft Outlook and exports messages, calendar
events, and so on. Instead of pretending to be a user clicking on
different things, reading things off the screen, and then navigating
around - something which would be very easy to get wrong - the program
instead connects to Outlook's automation interface via COM, selects
each folder in turn using the high-level interface provided, and
invokes various methods on the interface to export messages.
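
To illustrate the style of access described above, here is a minimal
sketch. It is not the OutlookExplorer code itself. It assumes the pywin32
package on Windows with Outlook installed, and the helper names
connect_to_outlook and walk_folders are made up for this example.

```python
def connect_to_outlook():
    """Return Outlook's MAPI namespace over COM (assumes pywin32, Windows)."""
    # Imported here so the rest of the sketch loads without pywin32.
    import win32com.client
    return win32com.client.Dispatch("Outlook.Application").GetNamespace("MAPI")


def walk_folders(folders, depth=0):
    """Yield (depth, folder name, item count) for every folder, recursively."""
    for folder in folders:
        # Name, Items.Count and Folders are properties of an Outlook folder.
        yield depth, folder.Name, folder.Items.Count
        yield from walk_folders(folder.Folders, depth + 1)


# Hypothetical usage on a machine with Outlook:
#
#   namespace = connect_to_outlook()
#   for depth, name, count in walk_folders(namespace.Folders):
#       print("  " * depth + "%s: %d items" % (name, count))
```

Nothing here clicks on anything or reads the screen: the program only asks
the application for its folders and their contents.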

With a browser, one may use a similarly high-level interface: instead
of firing keypresses into the location bar and then firing a Return
keypress to tell the browser to load a page, you invoke a method in the
browser's automation interface - openURL in the mainwindow interface
for Konqueror, I believe. After that, things can be more difficult, but
even so, you should still have moderately high-level access to the
document being displayed, for example, even if it is via a DOM.
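
As a rough sketch of what this could look like from Python, the following
drives the dcop command-line tool rather than any Python binding. The
application identifier (something like "konqueror-1234") has to be looked
up first, and the object name "konqueror-mainwindow#1" is an assumption
about how the main window is exposed; both helper functions are
hypothetical names.

```python
import subprocess


def build_open_url_command(app_id, url, window="konqueror-mainwindow#1"):
    """Build the dcop command asking a Konqueror window to load a URL."""
    # The window object name is an assumption; check it with "dcop <app_id>".
    return ["dcop", app_id, window, "openURL", url]


def open_url(app_id, url):
    """Run the command; needs KDE's dcop tool and a running Konqueror."""
    subprocess.run(build_open_url_command(app_id, url), check=True)


# Hypothetical usage:
#
#   open_url("konqueror-1234", "http://www.python.org/")
```

The point is that a single method call replaces the whole sequence of
simulated keypresses.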

> Not being an expert I can't understand nor comment on the more
> technical parts of your reply (out-of-process automation,
> PyXML-style DOM etc.).

All I meant by "out-of-process" was whether you can just start a Python
program outside the browser (e.g. in a normal console) which connects to
the browser in order to do its work. The PyXML-style DOM was a
reference to the way the HTML document is represented - if you're used
to XML processing in Java, JavaScript, Qt or even Python, you'll have
seen such a thing before.
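
For readers who have not met this kind of DOM, here is a small sketch
using xml.dom.minidom from the standard library as a stand-in. A browser
would hand over its document through its automation interface rather than
as a string to parse, and the sample page and the link_targets function
are invented for illustration.

```python
from xml.dom.minidom import parseString

# A tiny well-formed page standing in for a document held by a browser.
PAGE = """<html><body>
<h1>Links</h1>
<p><a href="http://www.python.org/">Python</a> and
<a href="http://www.kde.org/">KDE</a></p>
</body></html>"""


def link_targets(markup):
    """Return the href of every <a> element, in document order."""
    document = parseString(markup)
    return [a.getAttribute("href") for a in document.getElementsByTagName("a")]


print(link_targets(PAGE))
# prints ['http://www.python.org/', 'http://www.kde.org/']
```

The same getElementsByTagName and getAttribute calls appear in the DOM
APIs of the other languages mentioned, which is why the style carries over.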

Paul

[1] http://www.boddie.org.uk/python/COM.html



