Web automation

Paul Boddie paul at boddie.org.uk
Tue Nov 8 02:58:51 CET 2005

Mike Meyer wrote:
> qwwee... at yahoo.it writes:
> > but I supposed the everyone knew that web automation (and in general
> > "automation") is only a problem in Linux.
> I don't know it. I don't believe it, either. I automate web tasks on
> Unix systems (I don't use many Linux systems, but it's the same tool
> set) on a regular basis.

I imagine that "Web automation" is taken here to mean the automation of
Web browsers, with all the advantages and issues such an approach
entails. The problem on non-Windows systems is the lack of a common (or
enforced) technology for exposing application object models. Mozilla
has XPCOM, which apparently doesn't expose enough useful functionality
to other processes for "Web automation" tasks, and whose components
seem bizarre enough to have defeated my casual investigations into
automation with in-browser components. Konqueror/KHTML is somewhat
accessible via DCOP, although the interfaces to much of KDE are rather
limited.

Taking the challenge on board, I decided to build on the existing KPart
plugin work done with PyKDE [1] and produce a component which exposes
active documents using DCOP [2]. Combined with an extended version of
qtxmldom [3], the result is a system which permits out-of-process
automation of KHTML, and thus Konqueror, with the documents accessible
through a PyXML-style DOM. The work is currently at an early stage and
there's a lot of learning about DCOP and PyKDE still to be done, but I
think the concept is more or less worked out.
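
To give an idea of what "a PyXML-style DOM" means in practice, here is a
minimal sketch of the kind of script such a system is meant to support.
It is not qtxmldom's API and involves no DCOP: it uses xml.dom.minidom
from the standard library as a stand-in, on the assumption that the
document object offers the usual W3C DOM methods (getElementsByTagName,
getAttribute and so on). The PAGE string and the link_targets function
are made up for the illustration; in the setup described above the
document would come from the running browser instead.

```python
from xml.dom.minidom import parseString

# Stand-in for a page held by the browser (assumption: in the real setup the
# document would be obtained from KHTML, not parsed from a string).
PAGE = """<html><body>
<a href="http://www.boddie.org.uk/python/kpartplugins.html">plugins</a>
<a href="http://www.boddie.org.uk/python/qtxmldom.html">qtxmldom</a>
<a name="top">no href here</a>
</body></html>"""


def link_targets(document):
    """Return the href of every <a> element that carries one."""
    return [
        anchor.getAttribute("href")
        for anchor in document.getElementsByTagName("a")
        if anchor.hasAttribute("href")
    ]


if __name__ == "__main__":
    for href in link_targets(parseString(PAGE)):
        print(href)
```

The point of sticking to the standard DOM methods is that a script like
this should not need to care whether the document came from a file or
from a browser in another process.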

If only GNOME and KDE had stuck with CORBA, though... :-/


P.S. Of course, the existing KPart plugins permit in-browser embedding
which is easily good enough for many automation tasks, and there are
plenty of examples of moderately useful tools and scripts to prove this
point. In the revised plugins collection [2], there's a plugin which
extracts hCalendar information, for example.
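
As a rough illustration of what hCalendar extraction involves (this is
not the plugin's actual code), the sketch below walks a DOM looking for
elements marked with the hCalendar class names "vevent", "summary" and
"dtstart". It again uses xml.dom.minidom as a stand-in for the browser's
document; the SAMPLE markup and all function names are invented for the
example.

```python
from xml.dom.minidom import parseString

# Invented sample markup using hCalendar class names.
SAMPLE = """<div>
<div class="vevent">
  <span class="summary">Python meeting</span>
  <abbr class="dtstart" title="2005-11-08">8 November</abbr>
</div>
<div class="other">not an event</div>
</div>"""


def has_class(element, name):
    """True if the space-separated class attribute contains name."""
    return name in element.getAttribute("class").split()


def find_by_class(root, name):
    """Return all descendant elements of root carrying the given class."""
    return [e for e in root.getElementsByTagName("*") if has_class(e, name)]


def text_of(node):
    """Concatenate the text beneath node, stripped of outer whitespace."""
    parts = []
    for child in node.childNodes:
        if child.nodeType == child.TEXT_NODE:
            parts.append(child.data)
        elif child.nodeType == child.ELEMENT_NODE:
            parts.append(text_of(child))
    return "".join(parts).strip()


def extract_events(document):
    """Collect summary and start date for each vevent in the document."""
    events = []
    for vevent in find_by_class(document, "vevent"):
        event = {}
        for summary in find_by_class(vevent, "summary")[:1]:
            event["summary"] = text_of(summary)
        for start in find_by_class(vevent, "dtstart")[:1]:
            # Prefer the machine-readable value in the title attribute,
            # falling back to the element's visible text.
            event["dtstart"] = start.getAttribute("title") or text_of(start)
        events.append(event)
    return events


if __name__ == "__main__":
    print(extract_events(parseString(SAMPLE)))
```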

[1] http://www.boddie.org.uk/david/Projects/Python/KDE/index.html
[2] http://www.boddie.org.uk/python/kpartplugins.html
[3] http://www.boddie.org.uk/python/qtxmldom.html
