Podcast catcher in Python

Dave Angel davea at ieee.org
Sat Sep 19 14:40:18 CEST 2009

Chuck wrote:
> On Sep 12, 3:37 pm, Chuck <galois... at gmail.com> wrote:
>> On Sep 11, 9:54 pm, Chris Rebert <c... at rebertia.com> wrote:
>>> On Fri, Sep 11, 2009 at 7:43 PM, Chuck <galois... at gmail.com> wrote:
>>>> Does anyone know how I should read/download the mp3 file, and how I
>>>> should write/save it so that I can play it on a media player such as
>>>> Windoze media player?  Excuse my ignorance, but I am a complete noob
>>>> at this.  I downloaded the mp3, and I got a ton of hex, I think, but
>>>> it could've been unicode.
>>> urllib.urlretrieve(): http://docs.python.org/library/urllib.html#urllib.urlretrieve
>>> Cheers,
>>> Chris
>> Thanks Chris!  I will play around with this.
> I am using Python 3.1, but I can't figure out why I can't use
> xml.dom.minidom.  Here is my code:
> from xml.dom.minidom import parse, parseString
> url = 'http://minnesota.publicradio.org/tools/podcasts/grammar_grater.xml'  #just for test purposes
> doc = parse(url)  #I have also tried parseString(url), not to mention
> a million other methods from xml.Etree, xml.sax etc...  all to no
> avail
> What the heck am I doing wrong?  How can I get this xml file and use
> the toprettyxml() method.  Or something, so I can parse it.  I don't
> have any books and the documentation for Python kind of sucks.  I am a
> complete noob to Python and internet programming.  (I'm sure that is
> obvious :) )
> Thanks!
> Charlie
Wrong?  You didn't specify your OS environment, you didn't show the
error message (and traceback), and you posted an apparently unrelated
question in the same thread (there's no XML inside an mp3 file).

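xml.dom.minidom.parse() takes a filename or a 'file' object as its first
argument.  You gave it a URL string, which it treated as a filename and
couldn't open, so it complained.  You can fix that either by using
urllib.urlopen() (that's urllib.request.urlopen() in Python 3.x, which
you say you're using) or by separately copying the data to a local file
and using its filename here.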
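
As a rough sketch of the first option, assuming Python 3.x (where urlopen lives in urllib.request): parse() only needs an object with a read() method, so the response from urlopen() can be passed straight in.  The helper name parse_from_url and the in-memory sample are made up for illustration; they are not from the original post.

```python
import io
from urllib.request import urlopen  # Python 3.x home of urlopen
from xml.dom.minidom import parse


def parse_from_url(url):
    """Open the URL and hand the file-like response object to minidom."""
    # Hypothetical helper: needs network access when called.
    with urlopen(url) as response:
        return parse(response)


# parse() accepts any file-like object, so an in-memory stream
# (invented sample data) shows the same idea without the network.
sample = io.BytesIO(b"<rss><channel><title>Demo</title></channel></rss>")
doc = parse(sample)
print(doc.documentElement.tagName)
```

Calling parse_from_url() with the feed URL should then return the same kind of Document object, network permitting.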

In general, I'd recommend against testing new code live against the 
internet, since errors can occur from the vagaries of the internet as 
well as from bugs in your code.  Sometimes it's hard to tell the 
difference when the symptoms change each time you run.

So I'd download the xml data that you want to test with to a local file, 
and test out your parsing logic against that copy.  In fact, first 
testing will probably be against a simplified version of that copy.
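
Something like the following could serve as a first parsing test against a simplified local copy.  The sample feed is invented; it assumes the real file is an RSS 2.0 podcast feed whose items carry <enclosure url="..."> elements, which you should check against your own download.  The function name enclosure_urls and the file name are likewise just placeholders.

```python
import os
import tempfile
from xml.dom.minidom import parse

# Invented, cut-down stand-in for the real feed (assumed RSS 2.0 layout).
SAMPLE = """<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Test feed</title>
    <item>
      <title>Episode 1</title>
      <enclosure url="http://example.com/ep1.mp3" type="audio/mpeg" length="123"/>
    </item>
  </channel>
</rss>
"""


def enclosure_urls(filename):
    """Return the url attribute of every <enclosure> element in the file."""
    doc = parse(filename)  # a plain filename is fine here
    return [node.getAttribute("url")
            for node in doc.getElementsByTagName("enclosure")]


# Write the simplified copy to a scratch directory, then parse it.
path = os.path.join(tempfile.mkdtemp(), "simplified_feed.xml")
with open(path, "w", encoding="utf-8") as f:
    f.write(SAMPLE)

print(enclosure_urls(path))
```

Once this behaves, point enclosure_urls() at the full saved copy of the feed and see what breaks.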

How do you download the file?  Well, if you're using Firefox, you can 
browse to that page, and do View->Source.  Then copy/paste that text 
into a text editor, and save it locally.  Something similar probably 
works in other browsers, maybe even IE.

Or you can use urlretrieve (urllib.request.urlretrieve in Python 3.x),
as suggested earlier in this thread.  But I'd make that a separate
script, so that you can separate the bugs in downloading from the bugs
in parsing.  After everything mostly works, you can think about
combining them.
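
A separate download script might look roughly like this, again assuming Python 3.x.  The function name download and the destination file name are placeholders of mine, and the example call is left commented out because it needs a working network connection.

```python
from urllib.request import urlretrieve  # Python 3.x location


def download(url, dest):
    """Copy the resource at url into the local file dest; return dest."""
    filename, _headers = urlretrieve(url, dest)
    return filename


# Example call (requires network access, so it is commented out here):
# download("http://minnesota.publicradio.org/tools/podcasts/grammar_grater.xml",
#          "grammar_grater.xml")
```

The same call should work for the mp3 files too, since urlretrieve just copies bytes to disk without interpreting them.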

