[Tutor] Feedparser and google news/google reader

David Kim davidkim05 at gmail.com
Thu Mar 11 03:06:22 CET 2010


I have been working through some of the examples in the Programming
Collective Intelligence book by Toby Segaran. I highly recommend it, btw.

Anyway, one of the simple exercises required is using feedparser to pull in
RSS/Atom feeds from different sources (before doing more interesting
things). The algorithm stuff I pretty much follow, but one thing is driving
me CRAZY. I can't seem to pull more than 10 items from a Google News feed.
For example, I'd like to pull 1000 Google News items (using some search
term, let's say 'lightsabers'). The associated Atom feed URL, however, only
holds ten items. And it's hard to do some of the clustering exercises with
only ten items!
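
A minimal sketch of one possible workaround, assuming each feed URL really
is capped at ten items: pull several related feeds (for example, different
search terms) and merge their entries, dropping duplicates by link. The
function name merge_entries is hypothetical, not something from the book or
from feedparser; it only relies on entries behaving like dicts with a 'link'
key, which feedparser entries do.

```python
def merge_entries(entry_lists):
    """Merge several lists of feed entries, keeping the first copy of each link.

    Each entry is expected to be dict-like with a 'link' key (as feedparser
    entries are). Entries without a link are skipped, since there is nothing
    to de-duplicate them by.
    """
    seen = set()
    merged = []
    for entries in entry_lists:
        for entry in entries:
            link = entry.get('link')
            if link is None or link in seen:
                continue
            seen.add(link)
            merged.append(entry)
    return merged
```

With feedparser installed, something like
merge_entries(feedparser.parse(u).entries for u in urls) would combine the
results of several feed URLs, where urls is a list you supply.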

Anyway, I imagine this must be a straightforward thing and I'm being a
moron, but I don't know where else to ask this question. I did see some
posts about an n=100 term one can add to the URL (the limit seems to be 100
items), but it only seems to affect the webpage view and not the feed. I've
also tried subscribing to the feed in Google Reader and making the feed
public, but I seem to be running into the same problem. Is this a feedparser
thing or a Google thing?

The URL I'm using is:
http://news.google.com/news?pz=1&cf=all&ned=us&hl=en&as_scoring=r&as_maxm=3&q=health+information+exchange&as_qdr=a&as_drrb=q&as_mind=8&as_minm=2&cf=all&as_maxd=100&output=rss
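
One way to check whether the ten-item limit is a feedparser thing or a Google
thing is to count the items in the raw feed XML and compare that with what
feedparser reports for the same text. The sketch below is only an
illustration: count_raw_items and compare_counts are hypothetical helper
names, and it assumes the feed body has already been downloaded into a
string.

```python
import xml.etree.ElementTree as ET


def count_raw_items(xml_text):
    """Count <item> (RSS) and <entry> (Atom) elements in raw feed XML."""
    root = ET.fromstring(xml_text)
    count = 0
    for elem in root.iter():
        if not isinstance(elem.tag, str):
            continue  # skip comments / processing instructions, if any
        # Strip an XML namespace prefix such as '{http://www.w3.org/2005/Atom}'
        tag = elem.tag.rsplit('}', 1)[-1]
        if tag in ('item', 'entry'):
            count += 1
    return count


def compare_counts(xml_text):
    """Return (raw_count, feedparser_count) for the same feed text."""
    # feedparser is third-party; it is imported here so that
    # count_raw_items can be used without it installed.
    import feedparser
    parsed = feedparser.parse(xml_text)
    return count_raw_items(xml_text), len(parsed.entries)
```

If both numbers come back as 10, the feed itself only contains ten items and
the limit is on the server side; if the raw count is higher than the
feedparser count, the parser is the place to look.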

Can anyone help me? I'm tearing my hair out and want to choke my computer.
It's probably not relevant, but I'm running Snow Leopard and Python 2.6
(actually EPD 6.1).