[Tutor] Re trouble
Øyvind Dale Spørck
oyvind.sporck at eniro.no
Mon Oct 27 14:42:46 EST 2003
Hello,
I am using the re module to filter some web addresses out of HTML
documents, but cannot seem to get it right. What should go in the parentheses
of the re.search?
Here is an example from the HTML:
<a href="../../../../../../get.liste.kvakk.no/fs/http_3A/www.db.no/smurf/default.htm"><b>Dagbladet AS</b></a>
[<a href="../../../../../../get.liste.kvakk.no/is/http_3A/testside.no/smurf/default.htm"><font color="#CC3300"><b>Vis side</b></font></a>
In other words, I would like to get a list of these addresses:
www.db.no/smurf/default.htm
testside.no/smurf/default.htm
These addresses can be anything. I guess the common denominator is that they
start after http_3A/ and end before the first ".
How would I write that so that re picks out the right stuff?
Thanks in advance,
Øyvind
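
The rule described above (everything after http_3A/ up to, but not including, the next double quote) could be sketched roughly as follows. This is only one possible approach, not a definitive answer: the variable names html, pattern and addresses are made up for illustration, and the sample string assumes each href sits on a single line. Note that re.search returns only the first match, so the sketch uses findall to collect all of them.

```python
import re

# Sample input taken from the message, with each href on one line (an assumption).
html = (
    '<a href="../../../../../../get.liste.kvakk.no/fs/http_3A/www.db.no/smurf/default.htm">'
    '<b>Dagbladet AS</b></a> '
    '[<a href="../../../../../../get.liste.kvakk.no/is/http_3A/testside.no/smurf/default.htm">'
    '<font color="#CC3300"><b>Vis side</b></font></a>'
)

# Match the literal marker "http_3A/", then capture one or more characters
# that are not a double quote, so the capture stops before the closing ".
pattern = re.compile(r'http_3A/([^"]+)')

# With one capturing group, findall returns a list of the captured strings.
addresses = pattern.findall(html)

for address in addresses:
    print(address)
```

The negated character class [^"] is used here instead of a non-greedy .*? mainly because it states the stopping condition directly. If the HTML really does contain line breaks inside the href values, the captured strings would include them and would need cleaning up afterwards.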