Regex Help

Lawrence D'Oliveiro ldo at geek-central.gen.new_zealand
Wed Sep 24 01:50:53 CEST 2008


In message <mailman.1369.1222101506.3487.python-list at python.org>, Support
Desk wrote:

> Anybody know of a good regex to parse html links from html code? The one I
> am currently using seems to be cutting off the last letter of some links,
> and returning links like
> 
> http://somesite.co
> 
> or http://somesite.ph
> 
> the code I am using is
> 
> 
> regex = r'<a href=["|\']([^"|\']+)["|\']>'

Can you post some example HTML sequences that this regexp is not handling
correctly?
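
One way to produce such examples is to run the quoted pattern over a few
candidate strings and see which ones come back wrong. A minimal sketch
follows; the sample strings and the extract_links helper are made up for
illustration and are not taken from the original poster's data.

```python
import re

# Pattern quoted from the original question, unchanged.
regex = r'<a href=["|\']([^"|\']+)["|\']>'

# Hypothetical sample inputs (assumptions, not the poster's real HTML).
samples = [
    '<a href="http://somesite.com">one</a>',
    "<a href='http://somesite.php'>two</a>",
    '<a class="x" href="http://somesite.com">three</a>',
    '<a href="http://somesite.com" title="t">four</a>',
]

def extract_links(html):
    """Return every href value the quoted pattern captures in html."""
    return re.findall(regex, html)

for html in samples:
    # Print input and result side by side to spot truncated or missing links.
    print(repr(html), '->', extract_links(html))
```

With these made-up inputs the first two links come back whole and the last
two are not matched at all, so the truncation presumably depends on
something specific in the real HTML.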


