[Mailman-Users] Pipermail URL handling in archives

Stephen J. Turnbull stephen at xemacs.org
Sat Feb 23 02:13:02 CET 2008


Jim Popovitch writes:

 > On Fri, Feb 22, 2008 at 4:03 PM, Mark Sapiro <mark at msapiro.net> wrote:
 > So, the problem seems to appear with commas too which makes me wonder
 > if this can be resolved with this:
 > 
 >    urlpat = re.compile(r'(\w+://[^>)\s]+?)(\.|,)?(\s|$)') # URLs in text
 > 
 > but then I got to thinking about any other punctuation mark that
 > follows a URL... and my mind started spinning :-)
 > 
 > Any ideas, anyone?

Unfortunately, sre (the engine behind Python's re module) doesn't
support POSIX character classes (like [:punct:]) AFAIK, but I would
say it's a good idea to make that

   urlpat = re.compile(r'(\w+://[^>)\s]+?)[#,.::\'"!?()]?(\s|$)') # URLs in text

for starters.  It would be better to replace it with a real
URL-matching regexp, though.
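
As a rough sketch (not Pipermail's actual code), the pattern above can
be tried out on its own like this.  The helper name find_urls and the
sample strings are made up for illustration; only urlpat comes from
the suggestion above.

```python
import re

# Pattern copied verbatim from the suggestion above.  The doubled ':'
# inside the character class is redundant but harmless.
urlpat = re.compile(r'(\w+://[^>)\s]+?)[#,.::\'"!?()]?(\s|$)')  # URLs in text


def find_urls(text):
    """Hypothetical helper: return group 1 (the bare URL) of each match."""
    return [m.group(1) for m in urlpat.finditer(text)]


if __name__ == '__main__':
    sample = ('See http://example.com/page, then http://example.org/a,b. '
              '(Also http://example.net/x) ok')
    for url in find_urls(sample):
        print(url)
```

On that sample, group 1 comes back without the trailing comma, period,
or closing parenthesis, while the comma inside a,b is kept.  Note that
only a single trailing punctuation character is allowed for: a URL
followed by two, as in "http://example.net/x).", is not matched at
all, which is one more argument for a real URL-matching regexp.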


