parse html rendered by js

yanghq yanghq at neusoft.com
Fri Feb 11 09:20:32 CET 2011


Hi,
    I want to get attribute values such as href and src from HTML pages.

    For a simple HTML page, libxml2dom can parse it into a DOM and I can
get what I want.

    But some pages are generated by JavaScript, like this:

document.write(
'<frameset border="0" frameborder="no" rows="0,*,0" onLoad="start()" onUnload="end()" onResize="change()">'+
  '<frameset border="0" frameborder="no" cols="*,*,*,*,*,0">'+
'<frame name="cfgFrame" noresize scrolling="no" src="../frame.html?rtfPossible=' + rtfPossibleString + '">'+
'<frame name="mboxFrame" noresize scrolling="no" src="../frame.html?rtfPossible=' + rtfPossibleString + '">'+
'<frame name="cmdFrame" noresize scrolling="no" src="../frame.html?rtfPossible=' + rtfPossibleString + '">'+
'<frame name="msgFrame" noresize scrolling="no" src="../frame.html?rtfPossible=' + rtfPossibleString + '">'+
'<frame name="pabFrame" noresize scrolling="no" src="../frame.html?rtfPossible=' + rtfPossibleString + '">'+
'<frame name="cnFrame" noresize scrolling="no" src="../frame.html?' + main.clientargs + '">'+
  ''+
  '<frame name="mailFrame" marginwidth="0" marginheight="0" noresize src="../frame.html?rtfPossible=' + rtfPossibleString + '">'+
'<frame name="appletFrame" marginwidth="0" marginheight="0" noresize src="../frame.html?rtfPossible=' + rtfPossibleString + '">'+
''
)
How can I get the value of the 'src' attribute? Thank you for any help.
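
One possible approach, as a rough sketch only: since the markup lives inside
single-quoted JavaScript string literals, the literals can be pulled out with a
regular expression, joined back together, and fed to the standard-library
html.parser module. The names SrcCollector and extract_src are hypothetical
helpers made up for this illustration. JavaScript expressions between the
literals (such as rtfPossibleString) are not evaluated; they are kept as
{expression} placeholders, so the real runtime values stay unknown. The sketch
assumes a single document.write(...) call with single-quoted literals and does
not decode JavaScript escape sequences.

```python
import re
from html.parser import HTMLParser

# One single-quoted JavaScript string literal (escape sequences are skipped
# over but not decoded).
JS_STRING = re.compile(r"'((?:[^'\\]|\\.)*)'")


class SrcCollector(HTMLParser):
    """Collects (name attribute, src attribute) pairs from start tags."""

    def __init__(self):
        super().__init__()
        self.sources = []

    def handle_starttag(self, tag, attrs):
        attr_map = dict(attrs)
        if attr_map.get("src") is not None:
            self.sources.append((attr_map.get("name"), attr_map["src"]))


def extract_src(js_source):
    """Return (frame name, src) pairs found in a document.write(...) call.

    Assumption: js_source contains one document.write call whose argument is
    a '+' concatenation of single-quoted literals and plain expressions.
    """
    match = re.search(r"document\.write\s*\((.*)\)", js_source, re.S)
    if match is None:
        return []
    args = match.group(1)

    parts = []
    pos = 0
    for literal in JS_STRING.finditer(args):
        # Text between two literals is either just '+' or a JS expression.
        between = args[pos:literal.start()].strip("+ \t\r\n")
        if between:
            parts.append("{" + between + "}")  # placeholder, not evaluated
        parts.append(literal.group(1))
        pos = literal.end()

    collector = SrcCollector()
    collector.feed("".join(parts))
    collector.close()
    return collector.sources
```

Calling extract_src() on the script text above should give one pair per frame,
with the unevaluated parts visible as placeholders, for example
"../frame.html?rtfPossible={rtfPossibleString}". If the actual values of the
JavaScript variables are needed, the script has to be executed by a real
JavaScript engine or browser, which this sketch does not attempt.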



More information about the Python-list mailing list