Here is my code:

    import urllib
    import lxml.html

    # URL is truncated in the archive
    down = "http://sc.hkex.com.hk/gb/www.hkex.com.hk/chi/market/sec_tradinfo/stockcode/e..."
    page = urllib.urlopen(down).read()
    root = lxml.html.document_fromstring(page)

    # rows with class "tr_normal" that contain an <img> somewhere inside
    data1 = root.xpath('//tr[@class="tr_normal" and .//img]')
    print "the row which contains img :"
    for u in data1:
        print u.text_content()

    # rows with class "tr_normal" that contain no <img>
    data2 = root.xpath('//tr[@class="tr_normal" and not(.//img)]')
    print "the row which do not contain img :"
    for u in data2:
        print u.text_content()

The output is (I have omitted many lines):

    the row which contains img :
    00329
    the row which do not contain img :
    00001长江实业1,000#HOF
    ................many lines omitted
    00327百富环球1,000#H
    00328ALCO HOLDINGS2,000#

I wonder why there are so many rows that I can't get, such as the following (you can see them on the web page http://sc.hkex.com.hk/gb/www.hkex.com.hk/chi/market/sec_tradinfo/stockcode/e... ):

    00330 思捷环球 100 # H O F
    00331 春天百货 2,000 # H
    00332 NGAI LIK IND 4,000 #
    ...................many lines omitted

How can I get these rows?
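One way to narrow this down might be to count the `<tr>` tags in the raw markup without building a tree at all. The cause of the missing rows is not known from the post; the sketch below only assumes two possibilities worth checking: the rows after 00329 may carry a different `class` value, or the markup around the `<img>` row may be malformed so that the parsed tree differs from what the browser shows. It is written for Python 3 and uses the standard-library `html.parser` instead of lxml, and the names `RowClassCounter`, `count_row_classes` and the `sample` string are hypothetical, made up for illustration rather than taken from the real page.

```python
from collections import Counter
from html.parser import HTMLParser


class RowClassCounter(HTMLParser):
    """Count <tr> start tags by class attribute, without building a tree."""

    def __init__(self):
        super().__init__()
        self.counts = Counter()

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            # attrs is a list of (name, value) pairs; a missing class gives None
            self.counts[dict(attrs).get("class")] += 1


def count_row_classes(html_text):
    """Return a Counter mapping each <tr> class value to its number of rows."""
    parser = RowClassCounter()
    parser.feed(html_text)
    parser.close()
    return parser.counts


# Hypothetical sample markup, NOT the real page: one row is left unclosed
# and the later rows use a different class or none at all.
sample = """
<table>
<tr class="tr_normal"><td>00328</td></tr>
<tr class="tr_normal"><td>00329</td><td><img src="x.gif"></td>
<tr class="TR_normal"><td>00330</td></tr>
<tr><td>00331</td></tr>
</table>
"""

counts = count_row_classes(sample)
for row_class, n in counts.items():
    print(repr(row_class), n)
```

If the totals here are much larger than the number of rows the two XPath queries return, the difference is probably in the `class` values or in how the tree was rebuilt from the markup, and a looser query such as `//tr[td]` might be worth trying on the real page.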
participants (1)
-
contro opinion