John J. Lee
jjl at pobox.com
Sun Jun 17 18:57:21 CEST 2007
Johny <python at hope.cz> writes:
> How can I get a website thumbnail?
> I would like to allow visitors to add their URLs to our pages with
> the thumbnail of their website.
> Can anyone suggest a solution for web thumbnails?
There are a number of ways to do it, most of them involving web
browsers, and all of them painful in one way or another. No doubt I've
missed some out, and note that I've not actually tried most of these:
* Find some commercial component to do it. I think I'd want to know
what it was doing under the hood if I used something like this,
since there will be implications for rendering reliability,
flakiness, etc. I imagine most of these will be convenient
wrappers around MSIE.
* Use a library capable of HTML rendering. GUI toolkits like Qt, wx
and GTK often have simple HTML rendering engines, for example.
This is relatively convenient, but is ruled out in your case, since
it sounds like you're dealing with arbitrary HTML. Only real
battle-hardened web browsers can cope with rendering that reliably.
* On Unix, send keystroke/mouse events to X11 to automate a browser
(not sure how this is done, though I was looking at some (not open
source) code that does it not long ago...).
* Talk to Firefox using XPCOM and in-process Python code running
inside Firefox. Example code and/or up-to-date documentation would
make this much easier, but I don't think there's much around still.
* Talk to Firefox using JSSH. I'm assuming this works, but I've never
done it. Presumably it also involves XPCOM interfaces, but
without the risk of Firefox build pain or PyXPCOM quirks.
* Automate Konqueror with DCOP (or DBUS in KDE 4, I guess). Never
tried it, but assume it must be capable of this. Perhaps other
browsers support capable DBUS interfaces now, also... Konqueror
will not generate the output you expect for some pages, though.
PyKDE might work here, too.
* Automate MSIE on Windows with COM (with pywin32 or with ctypes'
"comtypes" COM support -- note you still need a third-party package
for this -- the ctypes that's part of Python 2.5 standard library
isn't enough). This is a mixture of very easy and incredibly
painful, depending on what exactly you want to do -- can't say how
awkward this particular case is. Last time I looked, it seemed
that .NET didn't expose the necessary interfaces to do this, so it
*does* have to be COM, not .NET (maybe Vista has changed that).
Many of the above would likely involve also either a printer driver
that writes images rather than actually printing (perhaps print to
.eps and then via ghostscript to a .png), or some code to take a
"screen"shot (preferably just of the browser window contents!). It's
plausible some browsers might be able to create images themselves
rather than requiring you to talk to an external driver yourself, but
I don't know of a specific one.
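As a rough sketch of the "print to .eps and then via ghostscript to a .png" route: the helper below only builds and runs a Ghostscript command line. It assumes a `gs` executable is on the PATH and that the browser has already printed the page to an .eps file. The function names, file names and the default resolution are made up for illustration, and I have not tuned any of this for thumbnail quality.

```python
import subprocess


def ghostscript_png_command(eps_path, png_path, dpi=72):
    """Build the Ghostscript argument list that rasterises an EPS file to PNG.

    Assumption: the executable is called "gs" and is on the PATH.
    """
    return [
        "gs",
        "-q",                 # suppress the startup banner
        "-dBATCH",            # exit once the input has been processed
        "-dNOPAUSE",          # do not wait for a keypress between pages
        "-dSAFER",            # restrict file access from the PostScript code
        "-dEPSCrop",          # crop the output to the EPS bounding box
        "-sDEVICE=png16m",    # 24-bit colour PNG output device
        "-r%d" % dpi,         # output resolution in dots per inch
        "-sOutputFile=%s" % png_path,
        eps_path,
    ]


def eps_to_png(eps_path, png_path, dpi=72):
    """Run Ghostscript; raises CalledProcessError if it exits non-zero."""
    subprocess.check_call(ghostscript_png_command(eps_path, png_path, dpi))
```

A low resolution makes Ghostscript produce a small image directly, which may be good enough for a thumbnail; otherwise render larger and scale the result down with an imaging library.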
The length of this reply indicates what a PITA this is!
If I were doing something commercial on Windows, I'd look into the
first option. Otherwise, I'd go for Firefox / X11 events or Firefox /
JSSH. Note that both of the latter will involve work in creating an
isolated environment for Firefox to run in. I've seen Xvnc, a
temporary HOME directory, and a canned prefs.js working OK for this.
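Here is a minimal sketch of the temporary HOME plus canned prefs.js part of that setup. It assumes an X server (Xvnc, say) is already running on the display you pass in, and that the browser binary is called `firefox`. The function name, the directory layout and the single preference shown are illustrative choices, not a tested recipe.

```python
import json
import os
import tempfile


def make_isolated_firefox_env(display=":1", prefs=None):
    """Create a throwaway HOME containing a profile with a canned prefs.js.

    Returns (home, env, command). Nothing is launched here; pass env and
    command to subprocess yourself once the X server is up.
    """
    if prefs is None:
        # Illustrative default: stop the "default browser" dialog appearing.
        prefs = {"browser.shell.checkDefaultBrowser": False}

    home = tempfile.mkdtemp(prefix="ffhome-")
    profile = os.path.join(home, "profile")
    os.mkdir(profile)

    # prefs.js is a list of user_pref("name", value); lines. json.dumps
    # gives JavaScript-compatible literals for strings, numbers and booleans.
    with open(os.path.join(profile, "prefs.js"), "w") as f:
        for name, value in sorted(prefs.items()):
            f.write("user_pref(%s, %s);\n" % (json.dumps(name), json.dumps(value)))

    # Copy the current environment, overriding only HOME and DISPLAY.
    env = dict(os.environ, HOME=home, DISPLAY=display)
    # -no-remote avoids attaching to an already running Firefox instance.
    command = ["firefox", "-no-remote", "-profile", profile]
    return home, env, command
```

Remember to remove the temporary directory (shutil.rmtree) once the browser has exited, or these will pile up.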
A few actual bits of code, which may or may not be robust ;-)
This looks better to use than DCOP/DBUS or PyKDE if you're willing to
risk using KHTML (the rendering engine in Konqueror):