Urllib vs. FireFox

Gilles Ganault nospam at nospam.com
Fri Oct 24 20:38:37 CEST 2008


Hello

After scratching my head over why I could not find data in a web page
using the "re" module, I discovered that the page as downloaded by
urllib doesn't match what is displayed when viewing the page source in
Firefox.

For instance, when searching Amazon for "Wargames":

URLLIB:
<a
href="http://www.amazon.fr/Wargames-Matthew-Broderick/dp/B00004RJ7H"><span
class="srTitle">Wargames</span></a>
  
   ~ Matthew Broderick, Dabney Coleman, John Wood,  et Ally Sheedy
<span class="bindingBlock">(<span class="binding">Cassette
vidéo</span> - 2000)</span></td></tr>

FIREFOX:
 <div class="productTitle"><a
href="http://www.amazon.fr/Wargames-Matthew-Broderick/dp/B00004RJ7H/ref=sr_1_1?ie=UTF8&s=dvd&qid=1224872998&sr=8-1">
Wargames</a> <span class="binding"> ~ Matthew Broderick, Dabney
Coleman, John Wood,  et Ally Sheedy</span><span class="binding">
(<span class="format">Cassette vidéo</span> - 2000)</span></div>
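Since the two variants wrap the title differently, a pattern written against one of them will miss the other. Here is a minimal sketch of a pattern that tolerates both, assuming the product link always contains "/dp/" followed by a 10-character product code (an assumption drawn only from the two snippets above). The names PRODUCT_LINK, TAG and extract_products are made up for illustration.

```python
import re

# Shortened copies of the two snippets quoted above.
URLLIB_HTML = '''<a
href="http://www.amazon.fr/Wargames-Matthew-Broderick/dp/B00004RJ7H"><span
class="srTitle">Wargames</span></a>'''

FIREFOX_HTML = '''<div class="productTitle"><a
href="http://www.amazon.fr/Wargames-Matthew-Broderick/dp/B00004RJ7H/ref=sr_1_1?ie=UTF8&s=dvd&qid=1224872998&sr=8-1">
Wargames</a>'''

# Assumption: a product link looks like .../dp/<10 uppercase letters or digits>,
# optionally followed by extra path and query parts.
PRODUCT_LINK = re.compile(
    r'<a\s+href="(?P<url>http://www\.amazon\.fr/[^"]*?/dp/'
    r'(?P<code>[A-Z0-9]{10})[^"]*)"\s*>'
    r'(?P<inner>.*?)</a>',
    re.DOTALL,
)

# Matches any tag, so that nested markup such as <span ...> can be removed.
TAG = re.compile(r'<[^>]+>')


def extract_products(html):
    """Return a list of (product code, title) pairs found in html."""
    results = []
    for match in PRODUCT_LINK.finditer(html):
        # Drop nested tags and surrounding whitespace from the link text.
        title = TAG.sub('', match.group('inner')).strip()
        results.append((match.group('code'), title))
    return results


if __name__ == '__main__':
    print(extract_products(URLLIB_HTML))
    print(extract_products(FIREFOX_HTML))
```

Keying on the link target rather than on the class names keeps the pattern independent of the surrounding markup, which is the part that differs between the two versions. For anything beyond a quick check, an HTML parser would likely be more robust than "re".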

Why do they differ?
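One thing that could be checked is whether the server returns different markup depending on the request headers. This is only a guess, not something I have confirmed. The sketch below uses Python 3's urllib.request and sends a browser-like User-Agent so the result can be compared with the default urllib download. The helper names and the User-Agent string are illustrative.

```python
import urllib.request

# Illustrative browser-like value; the exact string Firefox sends will differ.
BROWSER_USER_AGENT = 'Mozilla/5.0 (X11; Linux x86_64; rv:1.9) Gecko/2008 Firefox/3.0'


def build_request(url, user_agent=None):
    """Build a Request, optionally overriding urllib's default User-Agent."""
    headers = {}
    if user_agent is not None:
        headers['User-Agent'] = user_agent
    return urllib.request.Request(url, headers=headers)


def fetch(url, user_agent=None):
    """Download url and return the body decoded as text."""
    request = build_request(url, user_agent)
    with urllib.request.urlopen(request) as response:
        # Fall back to UTF-8 when the server does not declare a charset.
        charset = response.headers.get_content_charset() or 'utf-8'
        return response.read().decode(charset, errors='replace')


def pages_differ(url):
    """Fetch url with and without the browser-like User-Agent and compare."""
    default_page = fetch(url)
    browser_page = fetch(url, BROWSER_USER_AGENT)
    return default_page != browser_page
```

Calling pages_differ() on the search URL would show whether the User-Agent alone changes the response. If the two downloads are identical, the difference probably comes from something else, such as cookies or other headers that the browser sends.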

Thank you.


