Podcast catcher in Python
garrickp at gmail.com
Fri Sep 11 17:30:31 CEST 2009
On Sep 11, 8:20 am, Chuck <galois... at gmail.com> wrote:
> Hi all,
> I would like to code a simple podcast catcher in Python merely as an
> exercise in internet programming. I am a CS student and new to
> Python, but understand Java fairly well. I understand how to connect
> to a server with urlopen, but then I don't understand how to download
> the mp3, or whatever, podcast? Do I need to somehow parse the XML
> document? I really don't know. Any ideas?
You will first have to download the RSS XML file, then parse that file
for the URL of the audio file itself. Something like ElementTree will help
immensely with this part. You'll also have to keep track of what you've
already downloaded.
I'd recommend taking a look at the RSS XML yourself, so you know what
it is you have to parse out, and where to find it. From there, it
should be fairly easy to come up with the proper query to pull it
automatically out of the XML.
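
As a rough sketch of that parsing step, assuming the feed is ordinary
RSS 2.0 where each <item> carries an <enclosure> element whose url
attribute points at the audio file (check your own feed first, as
suggested above). The sample feed and the function name
enclosure_urls are made up for illustration:

```python
import xml.etree.ElementTree as ET

# Hypothetical sample feed, invented for illustration only.
SAMPLE_RSS = """<rss version="2.0">
  <channel>
    <title>Example Show</title>
    <item>
      <title>Episode 2</title>
      <enclosure url="http://example.com/audio/ep2.mp3"
                 length="456" type="audio/mpeg"/>
    </item>
    <item>
      <title>Episode 1</title>
      <enclosure url="http://example.com/audio/ep1.mp3"
                 length="123" type="audio/mpeg"/>
    </item>
  </channel>
</rss>"""


def enclosure_urls(rss_text):
    """Return the url attribute of every <enclosure>, in document order."""
    root = ET.fromstring(rss_text)
    # iter() walks the whole tree, so the exact nesting of <item> does not matter.
    return [enc.get("url") for enc in root.iter("enclosure") if enc.get("url")]


if __name__ == "__main__":
    for url in enclosure_urls(SAMPLE_RSS):
        print(url)
```

For a real feed you would pass in the text you read from urlopen
instead of SAMPLE_RSS. Feeds that use namespaces or Atom will need a
different query.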
As a kindness to the provider, I would recommend a fairly lengthy
sleep between GETs, particularly if you want to scrape their back
catalog.
Unfortunately, I no longer have the script I created to do just such a
thing in the past, but the process is rather straightforward, once you
know where to look.
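
This is not the script mentioned above, just a minimal sketch of how
the download loop might look, with a sleep between GETs and a plain
text file to remember what has been fetched. The names catch_up and
fetch_bytes, the seen-file format, and the 30 second default delay are
all assumptions of mine, not anything standard:

```python
import os
import time
import urllib.request


def fetch_bytes(url):
    """Download one URL and return its body as bytes."""
    # Reads the whole file into memory; fine for a sketch, but large
    # episodes would be better copied to disk in chunks.
    with urllib.request.urlopen(url) as response:
        return response.read()


def catch_up(urls, dest_dir, seen_path, delay=30.0,
             fetch=fetch_bytes, sleep=time.sleep):
    """Download each URL not yet listed in seen_path, pausing between GETs.

    Returns the list of URLs downloaded during this run.
    """
    seen = set()
    if os.path.exists(seen_path):
        with open(seen_path) as f:
            seen = {line.strip() for line in f if line.strip()}

    os.makedirs(dest_dir, exist_ok=True)
    downloaded = []
    for url in urls:
        if url in seen:
            continue  # already fetched on this or an earlier run
        if downloaded:
            sleep(delay)  # be kind to the provider between GETs
        data = fetch(url)
        # Use the last path component as the file name (an assumption;
        # some feeds use URLs where this is not a sensible name).
        name = os.path.basename(url.split("?", 1)[0]) or "episode.mp3"
        with open(os.path.join(dest_dir, name), "wb") as out:
            out.write(data)
        # Record the URL right away so a crash does not cause a re-download.
        with open(seen_path, "a") as f:
            f.write(url + "\n")
        seen.add(url)
        downloaded.append(url)
    return downloaded
```

Calling catch_up(list_of_urls, "podcasts", "seen.txt") a second time
with the same list should download nothing, since every URL is already
recorded in seen.txt.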