Urllib vs. FireFox
Gilles Ganault
nospam at nospam.com
Fri Oct 24 14:38:37 EDT 2008
Hello
After scratching my head over why I couldn't find data in a web page
using the "re" module, I discovered that the page as downloaded by
urllib doesn't match what is displayed when viewing the page source in
FireFox.
For instance, when searching Amazon for "Wargames":
URLLIB:
<a
href="http://www.amazon.fr/Wargames-Matthew-Broderick/dp/B00004RJ7H"><span
class="srTitle">Wargames</span></a>
~ Matthew Broderick, Dabney Coleman, John Wood, et Ally Sheedy
<span class="bindingBlock">(<span class="binding">Cassette
vidéo</span> - 2000)</span></td></tr>
FIREFOX:
<div class="productTitle"><a
href="http://www.amazon.fr/Wargames-Matthew-Broderick/dp/B00004RJ7H/ref=sr_1_1?ie=UTF8&s=dvd&qid=1224872998&sr=8-1">
Wargames</a> <span class="binding"> ~ Matthew Broderick, Dabney
Coleman, John Wood, et Ally Sheedy</span><span class="binding">
(<span class="format">Cassette vidéo</span> - 2000)</span></div>
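To make the comparison concrete, here is a minimal sketch that pulls the link and title out of trimmed copies of the two snippets above. It assumes Python 3 and uses only the standard library's html.parser. The names LinkCollector and extract_links are made up for illustration and are not part of urllib.

```python
from html.parser import HTMLParser

# Trimmed copies of the two snippets quoted above.
URLLIB_HTML = '''<a
href="http://www.amazon.fr/Wargames-Matthew-Broderick/dp/B00004RJ7H"><span
class="srTitle">Wargames</span></a>'''

FIREFOX_HTML = '''<div class="productTitle"><a
href="http://www.amazon.fr/Wargames-Matthew-Broderick/dp/B00004RJ7H/ref=sr_1_1?ie=UTF8&s=dvd&qid=1224872998&sr=8-1">
Wargames</a></div>'''


class LinkCollector(HTMLParser):
    """Hypothetical helper: collect (href, text) for every <a> element."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._href = None   # href of the <a> currently open, if any
        self._text = []     # text fragments seen inside that <a>

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        # Only keep text that appears inside an open <a> with an href.
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text).strip()))
            self._href = None


def extract_links(html):
    """Return a list of (href, link text) pairs found in the markup."""
    parser = LinkCollector()
    parser.feed(html)
    parser.close()
    return parser.links


if __name__ == "__main__":
    for label, html in (("URLLIB", URLLIB_HTML), ("FIREFOX", FIREFOX_HTML)):
        for href, title in extract_links(html):
            print(label, title, href)
```

Both versions yield the same title text. The product URL is the same up to the /dp/ identifier, with extra "ref" and query parameters in the FireFox version. The surrounding span/div classes are what differ, so a pattern keyed on those classes will match only one of the two.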
Why do they differ?
Thank you.