[Tutor] list all links with certain extension in an html file
M Hussain
md.husen at gmail.com
Fri Sep 28 15:17:52 CEST 2012
On Fri, Sep 28, 2012 at 1:10 PM, <tutor-request at python.org> wrote:
> Date: Sun, 16 Sep 2012 12:50:09 +0530
> From: Santosh Kumar <sntshkmr60 at gmail.com>
> To: tutor at python.org
> Subject: [Tutor] list all links with certain extension in an html file
> python
> Message-ID: <CAE7MaQa53X8Pav96q2ka0VajHnJtRZ_rgZcmH_cbsaQDiz5GGg at mail.gmail.com>
> Content-Type: text/plain; charset=UTF-8
>
> I want to extract (no I don't want to download) all links that end in
> a certain extension.
>
> <link rel="stylesheet" type="text/css" href="http://foo.bar/part1.css">
>
> Please note that I don't want to download those CSS, instead I want
> something like this (to stdout):
>
> http://foo.bar/part1.css
>
> Also I don't want to use external libraries. I am asking for: which
> libraries and functions should I use?
>
>
Do you mean that you want to parse the file and get the URLs of those CSS
files? If so, parse the file. There are many parsing options, for example lxml:
http://lxml.de/parsing.html
You don't have to use external libraries either. You can use the standard
library's HTML parser, http://docs.python.org/library/htmlparser.html, or
regular expressions.
Or maybe I didn't understand what you really want to do.
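
For what it's worth, here is a minimal sketch of the standard-library route, assuming Python 3, where the parser lives in html.parser (in Python 2 the module is named HTMLParser). The names LinkCollector and find_links are made up for this illustration, and checking only the href and src attributes is an assumption about which attributes count as links.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect href/src attribute values that end in a given extension."""

    def __init__(self, extension):
        super().__init__()
        self.extension = extension.lower()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; value is None for bare attributes.
        for name, value in attrs:
            if name in ("href", "src") and value:
                url = value.strip()  # tolerate stray whitespace or newlines
                if url.lower().endswith(self.extension):
                    self.links.append(url)


def find_links(html_text, extension):
    """Return every matching link in html_text, in document order."""
    parser = LinkCollector(extension)
    parser.feed(html_text)
    parser.close()
    return parser.links


sample = '<link rel="stylesheet" type="text/css" href="http://foo.bar/part1.css">'
for url in find_links(sample, ".css"):
    print(url)
```

This only prints the URLs and never fetches them. A plain endswith() check will miss URLs that carry a query string after the extension, so that case would need extra handling.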
Br
- Hussain