Podcast catcher in Python

Chuck galois271 at gmail.com
Fri Sep 11 13:56:45 EDT 2009


On Sep 11, 10:30 am, Falcolas <garri... at gmail.com> wrote:
> On Sep 11, 8:20 am, Chuck <galois... at gmail.com> wrote:
>
> > Hi all,
>
> > I would like to code a simple podcast catcher in Python merely as an
> > exercise in internet programming.  I am a CS student and new to
> > Python, but understand Java fairly well.  I understand how to connect
> > to a server with urlopen, but I don't understand how to then download
> > the podcast itself (an mp3 or whatever format it is).  Do I need to
> > somehow parse the XML document?  I really don't know.  Any ideas?
>
> > Thanks!
>
> > Chuck
>
> You will first have to download the RSS XML file, then parse that file
> for the URL of the audio file itself. Something like ElementTree will
> help immensely with this part. You'll also have to keep track of what
> you've already downloaded.
>
> I'd recommend taking a look at the RSS XML yourself, so you know what
> it is you have to parse out, and where to find it. From there, it
> should be fairly easy to come up with the proper query to pull it
> automatically out of the XML.
>
> As a kindness to the provider, I would recommend a fairly lengthy
> sleep between GETs, particularly if you want to scrape their back
> catalog.
>
> Unfortunately, I no longer have the script I created to do just such a
> thing in the past, but the process is rather straightforward, once you
> know where to look.
>
> ~G

Thanks!  I will see what I can do.
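
For anyone following along, the steps described above (fetch the feed,
pull the audio URLs out of the XML, skip what was already fetched, and
pause between GETs) might be sketched roughly as below. This is only a
minimal sketch under some assumptions: it uses Python 3's standard
library, it assumes the feed is RSS 2.0 with the audio URL in the "url"
attribute of each <enclosure> element, and the names SAMPLE_FEED,
enclosure_urls, load_seen, fetch_new, the "seen.txt" file and the
10-second delay are all made up for illustration.

```python
import os
import time
import urllib.request
import xml.etree.ElementTree as ET
from urllib.parse import urlparse

# Hypothetical sample feed, only to show the shape of the XML being parsed.
SAMPLE_FEED = """<rss version="2.0">
  <channel>
    <title>Example Show</title>
    <item>
      <title>Episode 2</title>
      <enclosure url="http://example.com/audio/ep2.mp3" type="audio/mpeg"/>
    </item>
    <item>
      <title>Episode 1</title>
      <enclosure url="http://example.com/audio/ep1.mp3" type="audio/mpeg"/>
    </item>
  </channel>
</rss>
"""


def enclosure_urls(rss_xml):
    """Return the url attribute of every <enclosure> element, in feed order."""
    root = ET.fromstring(rss_xml)
    return [enc.get("url") for enc in root.iter("enclosure") if enc.get("url")]


def load_seen(path):
    """Read previously downloaded URLs (one per line); empty set if no file."""
    if not os.path.exists(path):
        return set()
    with open(path) as f:
        return {line.strip() for line in f if line.strip()}


def fetch_new(feed_url, dest_dir, seen_path="seen.txt", delay=10):
    """Download enclosures not yet recorded in seen_path, sleeping between GETs."""
    with urllib.request.urlopen(feed_url) as resp:
        rss_xml = resp.read()
    seen = load_seen(seen_path)
    os.makedirs(dest_dir, exist_ok=True)
    for url in enclosure_urls(rss_xml):
        if url in seen:
            continue  # already downloaded on an earlier run
        # Fall back to a generic name if the URL path has no file component.
        filename = os.path.basename(urlparse(url).path) or "episode.mp3"
        urllib.request.urlretrieve(url, os.path.join(dest_dir, filename))
        # Record the URL right away so an interrupted run does not refetch it.
        with open(seen_path, "a") as f:
            f.write(url + "\n")
        seen.add(url)
        time.sleep(delay)  # be kind to the provider between downloads
```

Calling enclosure_urls(SAMPLE_FEED) shows the parsing step without
touching the network; fetch_new would be pointed at a real feed URL and
a download directory. Real feeds vary, so it is still worth looking at
the XML by hand first, as suggested above.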
