OT: regex to find email

Josh Close narshe at gmail.com
Tue Sep 21 19:07:29 CEST 2004


On Tue, 21 Sep 2004 18:53:33 +0200, Remy Blank
<remy.blank_asps at pobox.com> wrote:
> Josh Close wrote:
> > Ok, I see how this works.... but now how would I add {3,64} for the id
> > and {3,255} for the domain? I forgot to throw that part in earlier. I
> > believe a valid id is 3-64 chars and domain is 3-255 chars.
> >
> > So basically like this
> >
> > [\w[\w\._-]*]{3,64}@[[\w\._-]{3,255}\.[\w\._-]+]
> >
> > ......I know that won't work, but I'd like to verify that the id is
> > 3-64 chars long, and doesn't start with -._ and the domain is 3-255
> > chars long and doesn't start with -._ but must have a dot and tld's
> > like .com.au need to be accounted for also.
> 
> Let's see. Testing for a 3-64 char id should be easy:
> 
> [a-zA-Z0-9][\w\.-]{2,63}@ ...
> 
> You can't use \w in the first bracket, because you want to exclude
> the underscore.
> 
> About the domain, I can't remember if the total length is limited,
> or if each individual component is. The latter case is easy (say,
> for components with lengths 3-64):
> 
> ... @([\w-]{3,64}\.)+[\w-]{3,64}
> 
> But I suspect this is not yet what you want. If you want to make
> sure the total length of the domain is 3-255 chars, you'll have
> to extract it after a match and check its length. Extraction could
> be done with a named group:
> 
> ... @(?P<domain>([\w-]{3,64}\.)+[\w-]{3,64})
> 
> Although I'm not sure how nested groups are handled. Combining both
> parts and defining a group for the id as well gives:
> 
> (?P<id>[a-zA-Z0-9][\w\.-]{2,63})@(?P<domain>([\w-]{3,64}\.)+[\w-]{3,64})
> 
> (All on one line, obviously)
> 
> HTH,
> 
> 
> -- Remy
> 
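
For reference, here is a minimal sketch of how the combined pattern and
the total-length check described above might look in Python. The pattern
itself is unchanged; the names EMAIL and parse_email are made up for
illustration, and using fullmatch() assumes the string holds nothing but
the address. It also shows that the nested group is not a problem: the
outer named group returns the whole domain.

```python
import re

# The combined pattern from the quoted message, split over two string
# literals for readability. EMAIL and parse_email are hypothetical names.
EMAIL = re.compile(
    r"(?P<id>[a-zA-Z0-9][\w\.-]{2,63})"
    r"@(?P<domain>([\w-]{3,64}\.)+[\w-]{3,64})"
)


def parse_email(text):
    """Return (id, domain) if text is an acceptable address, else None."""
    # Assumption: text contains only the address, so fullmatch() is used.
    m = EMAIL.fullmatch(text)
    if m is None:
        return None
    # The named group spans the whole domain, nested group included.
    domain = m.group("domain")
    # The regex limits each component only, so check the total length here.
    if not 3 <= len(domain) <= 255:
        return None
    return m.group("id"), domain


print(parse_email("name@domain.tld"))
print(parse_email("-name@domain.tld"))
```

Note that every domain component needs at least three characters in
this pattern, so a two-letter final component such as the "au" in
.com.au is not accepted as written.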

The only problem with this is that -name@domain.tld will be matched as
name@domain.tld. Basically, if one of -._ comes right before the name,
then I don't want to capture the email at all. So the name would have
to come right after a \s, and not after a -._, I guess.

Maybe I could put \s(?![\.\-\_]) in front of the pattern you suggested.
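
A rough sketch of that idea in Python, with one substitution: instead of
\s followed by a negative lookahead, it uses the negative lookbehind
(?<!\S), which succeeds only at the start of the text or right after
whitespace and consumes nothing. That swap is an assumption about what
is wanted here, and EMAIL_IN_TEXT and find_emails are made-up names.

```python
import re

# (?<!\S) is a swapped-in alternative to \s(?![\.\-\_]): the id must begin
# at the start of the text or directly after whitespace. Since the id's
# first character is already restricted to [a-zA-Z0-9], an address that is
# written with a leading -, . or _ cannot match at all.
EMAIL_IN_TEXT = re.compile(
    r"(?<!\S)"
    r"(?P<id>[a-zA-Z0-9][\w\.-]{2,63})"
    r"@(?P<domain>([\w-]{3,64}\.)+[\w-]{3,64})"
)


def find_emails(text):
    """Return the addresses found in text (hypothetical helper)."""
    return [m.group(0) for m in EMAIL_IN_TEXT.finditer(text)]


print(find_emails("mail -name@domain.tld or name@domain.tld now"))
```

With this, -name@domain.tld is skipped entirely instead of being caught
as name@domain.tld, while an address preceded by whitespace still
matches.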

-Josh


