Use Regular Expressions to extract URLs
Jimbo
nilly16 at yahoo.com
Fri Apr 30 02:53:06 EDT 2010
Hello,
I am using regular expressions to grab URLs from a string (of HTML
code). It is going well and I seem to be grabbing the full URL,
[b]but[/b]
I also get a '"' character at the end of it. Do you know how I can get
rid of the '"' character at the end of my URL?
[b]Example of problem:[/b]
[quote]
I get this when I extract a URL from a string:
http://google.com"
I want to get this:
http://google.com
[/quote]
My regular expression:
[code]
import re

def find_urls(string):
    """ Extract all URLs from a string and return them as a list """
    # Lazily match from 'http://' or 'www' up to and including the next '"'
    url_list = re.findall(r'(?:http://|www.).*?["]', string)
    return url_list
[/code]
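One possible way to drop the trailing quote, as a minimal sketch: instead of matching lazily up to and including the '"', match only characters that are not a '"', so the quote never becomes part of the result. This assumes every URL of interest sits inside a double-quoted HTML attribute. The function name find_urls_no_quote and the sample HTML are hypothetical, not from the original post, and the dot in 'www.' is escaped here so that it matches a literal dot.

```python
import re

def find_urls_no_quote(string):
    """Extract URLs from a string, stopping before the closing double quote."""
    # [^"]* consumes everything up to, but not including, the next '"',
    # so the quote is never part of the match.
    return re.findall(r'(?:http://|www\.)[^"]*', string)

# Hypothetical sample input for illustration only.
html = '<a href="http://google.com">Google</a>'
print(find_urls_no_quote(html))
```

Note that if a URL is not followed by a double quote, this pattern would keep matching to the end of the string, so it is only suitable for quoted attribute values. Another option under the same assumption would be to keep the original pattern and wrap the URL part in a capturing group, since re.findall returns the group contents when the pattern contains exactly one group.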