Help parsing a page with Python

mierdatutis mi mmm286 at gmail.com
Wed Jan 27 11:49:39 EST 2010


Hello,

Thanks Javier,

But I think the page embeds a viewer, and only the viewer knows the URL of
the FLV file itself. I can't see any direct correspondence between the
elements of the two URLs, so I can't see a way to construct the FLV's URL
from the contents of that page :-(
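
For anyone who wants to verify this kind of thing before giving up, here is
a minimal sketch using only the standard library. It lists the tags that
usually embed a viewer, collects anything in the raw HTML that looks like a
media or viewer URL, and reports whether an already known video URL occurs
in the page at all. The names inspect_page, ViewerFinder and MEDIA_PATTERN
are made up for this illustration, and the list of file extensions is an
assumption, not something taken from the rtve.es page.

```python
import re
from html.parser import HTMLParser

# Assumption: these extensions are the interesting ones (video files,
# Flash viewers, XML playlists). Adjust the list for the site at hand.
MEDIA_PATTERN = re.compile(
    r"""https?://[^\s"'<>]+\.(?:flv|mp4|swf|xml)""", re.IGNORECASE
)


class ViewerFinder(HTMLParser):
    """Collect the attributes of tags that typically embed a viewer."""

    def __init__(self):
        super().__init__()
        self.embeds = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lower-cases tag names before calling this method.
        if tag in ("object", "embed", "param", "script"):
            self.embeds.append((tag, dict(attrs)))


def inspect_page(html, known_video_url=None):
    """Summarise what the raw HTML reveals about an embedded video."""
    finder = ViewerFinder()
    finder.feed(html)
    return {
        # URLs in the markup that look like media or viewer files.
        "media_urls": sorted(set(MEDIA_PATTERN.findall(html))),
        # (tag, attributes) pairs for the embedding tags found.
        "embeds": finder.embeds,
        # True/False if a known URL was given, otherwise None.
        "video_url_in_html": (
            known_video_url in html if known_video_url else None
        ),
    }
```

If "video_url_in_html" comes back False and "media_urls" only shows the
viewer itself, the FLV address is most likely fetched by the viewer at run
time, so it cannot be recovered from the static page alone.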

I'll have to do it manually; I don't see any other way :-(

Many thanks for your answer.





2010/1/27 Javier Collado <javier.collado at gmail.com>

> Hello,
>
> You can find some advice here:
> http://www.packtpub.com/article/web-scraping-with-python-part-2
>
> Best regards,
>    Javier
>
> 2010/1/27 mierdatutis mi <mmm286 at gmail.com>:
> > Hello again,
> >
> > Which test case for Windmill? Could you send me the link, please?
> >
> > Many thanks
> >
> > 2010/1/27 Javier Collado <javier.collado at gmail.com>
> >>
> >> Hello,
> >>
> >> A test case for Windmill might also be used to extract the information
> >> that you're looking for.
> >>
> >> Best regards,
> >>    Javier
> >>
> >> 2010/1/27 mierdatutis mi <mmm286 at gmail.com>:
> >> > Those videos are loaded by JavaScript.
> >> > Is there a JavaScript parser for Python?
> >> >
> >> > Thanks a lot!
> >> >
> >> >
> >> > 2010/1/27 Simon Brunning <simon at brunningonline.net>
> >> >>
> >> >> 2010/1/27 mierdatutis mi <mmm286 at gmail.com>:
> >> >> > Hi,
> >> >> >
> >> >> > I would like to parse a web page to get the URL of a video
> >> >> > download. I use Python and Firebug, but I can't find the URL.
> >> >> >
> >> >> > Example:
> >> >> >
> >> >> > The page I need to get the video link from is:
> >> >> >
> >> >> > http://www.rtve.es/mediateca/videos/20100125/saber-comer---salsa-verde-judiones-25-01-10/676590.shtml
> >> >> >
> >> >> > The video is
> >> >> > http://www.rtve.es/resources/TE_SSAC011/flv/8/2/1264426362028.flv
> >> >> > Could you help me please?
> >> >>
> >> >> That URL doesn't appear to be in the HTML - it must be being brought
> >> >> in by the JavaScript somehow.
> >> >>
> >> >> --
> >> >> Cheers,
> >> >> Simon B.
> >> >> --
> >> >> http://mail.python.org/mailman/listinfo/python-list