Regular expression - dot problem!
fzhenglee23 at yahoo.com.cn
Thu Jun 8 00:15:16 CEST 2006
Fredrik Lundh <fredrik at pythonware.com> wrote:
> 李政 wrote:
> I have a problem with regular expressions (the dot problem). I checked the Python
> Library Reference, but I can't find any information that is useful.
like what a dot means in a regular expression? you really need to work
on your google fu ;-)
in the meantime, look under "The special characters are" on this page:
Maybe my poor English confused you. I know what a dot means in a regular expression. My question is about the case where you are forced to use a regular expression in one of these ways:
pattern = 'www.'
if re.compile(pattern).match(string) is not None:
if re.compile(r'www.').match(string) is not None:
if re.compile('www\.').match(string) is not None:
How do you handle special characters such as the dot?
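For what it's worth, one common way to treat a plain string as a literal inside a pattern is re.escape(), which backslash-escapes the regex metacharacters for you. A minimal sketch follows; the names prefix, loose and strict are made up for illustration and are not from the original program:

```python
import re

prefix = 'www.'  # hypothetical literal prefix to look for

# Unescaped: the dot is a wildcard, so any character may follow 'www'.
loose = re.compile(prefix)

# Escaped: re.escape() turns 'www.' into a pattern where the dot is literal.
strict = re.compile(re.escape(prefix))

for candidate in ('www.example.com', 'wwwxexample.com'):
    print(candidate,
          loose.match(candidate) is not None,
          strict.match(candidate) is not None)
```

The unescaped pattern accepts both strings, while the escaped one accepts only the string that really starts with 'www.'.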
> if re.compile(pattern).match(urldomain) is not None:
>     return INTERNAL_LINK # match. url is internal link
if you want to check if the url starts with a given prefix, use
Your suggestion is really helpful. I use both startswith(prefix) and endswith(suffix) in my program, and it works better. Here is the new one:
def getLinkType(url, sitedomain):
    # get the domain which 'url' belongs to
    urldomain = urlparse4esa(url)
    tmpsd = ''
    tmpsd = sitedomain[4:]
    return INTERNAL_LINK  # match. url is internal link
    return EXTERNAL_LINK  # doesn't match. url is external link
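In case it helps other readers, here is a rough, self-contained sketch of the same idea using plain string methods instead of a regular expression. It is not the original program: urlparse4esa is not shown in this post, so the standard urllib.parse.urlparse stands in for it, and the constant values and the name get_link_type are assumptions made for illustration only.

```python
from urllib.parse import urlparse

# Hypothetical constant values; the original ones are not shown in the post.
INTERNAL_LINK = 0
EXTERNAL_LINK = 1

def get_link_type(url, sitedomain):
    # Stand-in for urlparse4esa(): take the lowercased host name of the URL.
    urldomain = urlparse(url).hostname or ''
    # Drop a leading 'www.' so that other subdomains of the site also match.
    bare = sitedomain[4:] if sitedomain.startswith('www.') else sitedomain
    # Internal if the host is the bare domain itself or one of its subdomains.
    if urldomain == bare or urldomain.endswith('.' + bare):
        return INTERNAL_LINK
    return EXTERNAL_LINK
```

Checking endswith('.' + bare) rather than endswith(bare) is a deliberate choice here: it keeps a host such as notexample.com from being counted as part of example.com.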
Thanks for your help!