[XML-SIG] xml / html parsing for web

kentsin kentsin@sinaman.com
Tue Dec 12 09:21:31 CST 2000


I have downloaded 4Suite, but I found it difficult to work out from the documentation how to build what I want. I have also read the linkcheck code, which contains a very smart regular expression that parses almost all links. What I found missing is support for JavaScript-driven or form-driven links: some sites have

<option .... value="link1"...

Linkchecker cannot follow links like these.
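
To make the case concrete, here is a minimal sketch of collecting such option values alongside ordinary anchors. It assumes Python's standard `html.parser` module rather than linkchecker's regular expression or 4DOM; the class name `OptionLinkParser` and the sample markup are hypothetical. Note that an option value is only a candidate link, since the page's script decides what is done with it.

```python
from html.parser import HTMLParser


class OptionLinkParser(HTMLParser):
    """Collect candidate links from <a href> and <option value> (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.links = []  # (tag, target) pairs in document order

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "a" and attrs.get("href"):
            self.links.append(("a", attrs["href"]))
        elif tag == "option" and attrs.get("value"):
            # Only a candidate: whether this is a URL depends on the page's script.
            self.links.append(("option", attrs["value"]))


# Made-up sample page for illustration.
SAMPLE = """
<a href="index.html">Home</a>
<select onchange="location = this.value">
  <option value="link1">First</option>
  <option value="link2">Second</option>
  <option>No value</option>
</select>
"""

parser = OptionLinkParser()
parser.feed(SAMPLE)
parser.close()
option_links = parser.links
print(option_links)
```

Options without a `value` attribute are skipped here; a fuller version might fall back to the option's text.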

Moreover, I would like to extract the form data and associate it with the labels found on the page, and likewise associate each link with its hot text or image. Linkchecker cannot do this either.
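
A rough sketch of that kind of association, again assuming Python's standard `html.parser` module and not 4DOM or linkchecker. The class `LabelledLinkParser`, its attribute names, and the sample page are all hypothetical, and form fields are matched to labels only through `<label for="...">`, which many pages do not use.

```python
from html.parser import HTMLParser


class LabelledLinkParser(HTMLParser):
    """Pair links with their hot text or images, and form fields with label text."""

    def __init__(self):
        super().__init__()
        self.links = []         # dicts with "href", "text", "images"
        self.fields = []        # dicts with "name", "id"
        self.labels = {}        # id named in <label for=...> -> label text
        self._link = None       # link currently open, if any
        self._label_for = None  # id targeted by the label currently open

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "a" and attrs.get("href"):
            self._link = {"href": attrs["href"], "text": "", "images": []}
        elif tag == "img" and self._link is not None:
            self._link["images"].append(attrs.get("src") or "")
        elif tag == "label" and attrs.get("for"):
            self._label_for = attrs["for"]
            self.labels[self._label_for] = ""
        elif tag in ("input", "select", "textarea"):
            self.fields.append({"name": attrs.get("name"), "id": attrs.get("id")})

    def handle_data(self, data):
        # Text counts as hot text while a link is open, label text while a label is open.
        if self._link is not None:
            self._link["text"] += data
        if self._label_for is not None:
            self.labels[self._label_for] += data

    def handle_endtag(self, tag):
        if tag == "a" and self._link is not None:
            self._link["text"] = " ".join(self._link["text"].split())
            self.links.append(self._link)
            self._link = None
        elif tag == "label" and self._label_for is not None:
            text = self.labels[self._label_for]
            self.labels[self._label_for] = " ".join(text.split())
            self._label_for = None

    def labelled_fields(self):
        """Return (field name, label text or None) pairs."""
        return [(f["name"], self.labels.get(f["id"])) for f in self.fields]


# Made-up sample page for illustration.
SAMPLE = """
<a href="news.html">Latest <b>news</b></a>
<a href="home.html"><img src="home.gif" alt="Home"></a>
<form action="/search">
  <label for="q">Search terms</label>
  <input type="text" name="query" id="q">
  <input type="submit" name="go">
</form>
"""

parser = LabelledLinkParser()
parser.feed(SAMPLE)
parser.close()
hot_links = [(l["href"], l["text"], l["images"]) for l in parser.links]
form_labels = parser.labelled_fields()
print(hot_links)
print(form_labels)
```

Fields are resolved against labels only after the whole page has been read, so a label that appears after its field is still matched.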

Linkchecker's regular expression approach is much clearer to me, but as a newbie I would like to hear from you how far it can go. Is it worth my while to go the 4DOM way?

Can somebody point me to some 4DOM sample code?

Many thanks to all who reply.

Best Regards,

Kent Sin

