[Web-SIG] HTML parsing - get text position and font size
Dirkjan Ochtman
dirkjan at ochtman.nl
Mon Jan 12 13:16:00 CET 2009
2009/1/12 Girish Redekar <girish.redekar at gmail.com>:
> is still tedious as font sizes in html/css can be expressed in multiple
> methods (like <FONT> tags, sizes in pixels, relative sizes, default larger
> size for header etc). One can get down and code each of these cases, but I
> was hoping someone has already (and reliably) worked on the same
So basically you want a full-on headless browser? That's pretty non-trivial.
Your best bet would probably be to hook into a Mozilla instance
somehow (PyXPCOM, anyone?) and try to read the computed styles from the
DOM there.
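For what it's worth, here is a rough sketch of the "code each of these
cases" route, mostly to show how quickly it grows. It uses only the
standard-library html.parser and covers three of the cases you list: <FONT>
tags, inline font-size styles in px/em/%, and larger default sizes for
headers. Everything named here (FontSizeTracker, text_sizes, the 16px base,
the FONT-size-to-pixel table, the header multipliers) is my own assumption
about typical browser defaults, not something taken from a library. It
ignores stylesheets, classes, keyword sizes like "larger", and script
content, which is exactly the part a real browser engine would handle for
you.

```python
import re
from html.parser import HTMLParser

# All of the numbers below are assumed "typical browser" defaults.
BASE_PX = 16.0
FONT_TAG_PX = {1: 10, 2: 13, 3: 16, 4: 18, 5: 24, 6: 32, 7: 48}
HEADING_EM = {"h1": 2.0, "h2": 1.5, "h3": 1.17, "h4": 1.0, "h5": 0.83, "h6": 0.67}
VOID_TAGS = {"br", "hr", "img", "input", "meta", "link"}
STYLE_SIZE = re.compile(r"font-size\s*:\s*(\d+(?:\.\d+)?)\s*(px|em|%)", re.I)


class FontSizeTracker(HTMLParser):
    """Hypothetical helper: tracks an approximate font size per text run."""

    def __init__(self):
        super().__init__()
        self.stack = [("#root", BASE_PX)]  # (tag, size in px) for open elements
        self.runs = []                     # collected (text, size in px) pairs

    @staticmethod
    def _font_tag_size(raw, current):
        raw = raw.strip()
        try:
            number = int(raw)
        except ValueError:
            return current                 # unparseable value: keep inherited size
        if raw[0] in "+-":
            number += 3                    # relative to an assumed base size of 3
        return float(FONT_TAG_PX[max(1, min(7, number))])

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return                         # no end tag, so nothing to push
        attrs = dict(attrs)
        parent = self.stack[-1][1]
        size = parent
        if tag in HEADING_EM:
            size = parent * HEADING_EM[tag]
        if tag == "font" and attrs.get("size"):
            size = self._font_tag_size(attrs["size"], size)
        match = STYLE_SIZE.search(attrs.get("style") or "")
        if match:                          # an inline style overrides the above
            value, unit = float(match.group(1)), match.group(2).lower()
            if unit == "px":
                size = value
            elif unit == "em":
                size = value * parent
            else:
                size = value * parent / 100
        self.stack.append((tag, size))

    def handle_endtag(self, tag):
        # Pop back to the nearest matching open tag; ignore stray end tags.
        for i in range(len(self.stack) - 1, 0, -1):
            if self.stack[i][0] == tag:
                del self.stack[i:]
                break

    def handle_data(self, data):
        text = data.strip()
        if text:
            self.runs.append((text, round(self.stack[-1][1], 2)))


def text_sizes(html):
    parser = FontSizeTracker()
    parser.feed(html)
    parser.close()
    return parser.runs
```

Calling text_sizes() on a string of HTML returns a list of (text, size)
pairs in document order. Note that this only gives you sizes, not layout
positions; for those you really do need something that performs layout.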
Cheers,
Dirkjan