[Mailman-Developers] E-mail validation code
Charlie Clark
charlie at begeistert.org
Fri Jan 31 11:52:33 EST 2003
Dear list,
I've recently had a whole load of e-mail addresses to check and have been
looking for a way to help with this. I was pointed in the direction of
Mailman/Utils.py:ValidateEmail by Danny Yoo on the Python tutor list.
It looks like this:
###
_badchars = re.compile('[][()<>|;^,]')
def ValidateEmail(str):
    """Verify that the an email address isn't grossly invalid."""
    # Pretty minimal, cheesy check. We could do better...
    if not str:
        raise Errors.MMBadEmailError
    if _badchars.search(str) or str[0] == '-':
        raise Errors.MMHostileAddress
    if string.find(str, '/') <> -1 and \
       os.path.isdir(os.path.split(str)[0]):
        # then
        raise Errors.MMHostileAddress
    user, domain_parts = ParseEmail(str)
    # this means local, unqualified addresses, are no allowed
    if not domain_parts:
        raise Errors.MMBadEmailError
    if len(domain_parts) < 2:
        raise Errors.MMBadEmailError
###
My unskilled eye agrees with the comment.
Having trawled the web for alternatives, I've come across the following
approaches:
The Python stuff is at
http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/66439/index_txt
It is a module with a single class which comes with some built-in patterns
and the ability to define custom patterns. The built-in e-mail test doesn't
catch things like spaces or a doubled @, among other problems.
The Perl stuff is at
http://aspn.activestate.com/ASPN/Cookbook/Rx/Recipe/68432/index_txt
sub ValidEmailAddr { #check if e-mail address format is valid
    my $mail = shift; #in form name@host
    return 0 if ( $mail !~ /^[0-9a-zA-Z\.\-\_]+\@[0-9a-zA-Z\.\-]+$/ );
        #characters allowed on name: 0-9a-Z-._ on host: 0-9a-Z-. on between: @
    return 0 if ( $mail =~ /^[^0-9a-zA-Z]|[^0-9a-zA-Z]$/);
        #must start or end with alpha or num
    return 0 if ( $mail !~ /([0-9a-zA-Z]{1})\@./ );
        #name must end with alpha or num
    return 0 if ( $mail !~ /.\@([0-9a-zA-Z]{1})/ );
        #host must start with alpha or num
    return 0 if ( $mail =~ /.\.\-.|.\-\..|.\.\..|.\-\-./g );
        #pair .- or -. or -- or .. not allowed
    return 0 if ( $mail =~ /.\.\_.|.\-\_.|.\_\..|.\_\-.|.\_\_./g );
        #pair ._ or -_ or _. or _- or __ not allowed
    return 0 if ( $mail !~ /\.([a-zA-Z]{2,3})$/ );
        #host must end with '.' plus 2 or 3 alpha for TopLevelDomain
        #(MUST be modified in future!)
    return 1;
}
This seems to catch pretty much everything, but it's Perl and I'm not sure
what !~ and =~ do.
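My best reading, which may well be off: =~ is true when the pattern is found
anywhere in the string, and !~ is simply its negation. If that is right, the
nearest Python call would be re.search (re.match only looks at the start of
the string). A minimal sketch, where the helper names perl_match and
perl_not_match are my own invention and not part of any library:

```python
import re

def perl_match(pattern, text):
    """Rough stand-in for Perl's  $text =~ /pattern/  (found anywhere)."""
    return re.search(pattern, text) is not None

def perl_not_match(pattern, text):
    """Rough stand-in for Perl's  $text !~ /pattern/  (not found at all)."""
    return re.search(pattern, text) is None

# "name must end with alpha or num" is a pattern that has to be present
name_ok = perl_match(r"[0-9a-zA-Z]@.", "joe@testmail.com")

# "must start or end with alpha or num" is a pattern that must be absent
edge_bad = perl_match(r"^[^0-9a-zA-Z]|[^0-9a-zA-Z]$", "-joe@testmail.com")
```

So a Perl line of the form "return 0 if ( $mail =~ ... )" rejects the address
when the pattern IS found, and one with !~ rejects it when the pattern is NOT
found.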
I've started work on making custom definitions based on the Perl source, like
this:
sv1 = StringValidator("joe@testmail.com")
sv1.definePattern("test1", r"^[0-9a-zA-Z\.\-\_]+\@[0-9a-zA-Z\.\-]+$")
sv1.definePattern("test2", r"^[^0-9a-zA-Z]|[^0-9a-zA-Z]$")
if not sv1.isValidForPattern("test1"):
    print sv1.validateString, "has invalid characters in the name"
# test2 describes a bad address (the Perl uses =~ here), so a match means invalid
elif sv1.isValidForPattern("test2"):
    print sv1.validateString, "doesn't start or end with alpha or num"
else:
    print sv1.validateString, "is valid"
These tests work pretty well, but I'm having trouble turning the Perl lines
with =~ into usable Python code: they tend to invalidate real addresses.
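For comparison, here is a rough sketch of how the whole Perl sub might be
ported using only the standard re module. It assumes that =~ means "pattern
found somewhere in the string" and !~ means "pattern not found", so the !~
lines become patterns that must be present and the =~ lines become patterns
that must be absent. The name valid_email_addr and the two list names are made
up for illustration, and the patterns are the Perl ones with the redundant
backslashes dropped:

```python
import re

# Perl "return 0 if ( $mail !~ /.../ )": these patterns MUST be found.
_REQUIRED = [
    r"^[0-9a-zA-Z._-]+@[0-9a-zA-Z.-]+$",  # allowed characters, exactly one @
    r"[0-9a-zA-Z]@.",                     # name must end with alpha or num
    r".@[0-9a-zA-Z]",                     # host must start with alpha or num
    r"\.[a-zA-Z]{2,3}$",                  # '.' plus 2 or 3 letters as TLD
]

# Perl "return 0 if ( $mail =~ /.../ )": these patterns must NOT be found.
_FORBIDDEN = [
    r"^[^0-9a-zA-Z]|[^0-9a-zA-Z]$",       # starts or ends with non-alphanumeric
    r".\.-.|.-\..|.\.\..|.--.",           # pairs .- -. .. --
    r".\._.|.-_.|._\..|._-.|.__.",        # pairs ._ -_ _. _- __
]

def valid_email_addr(mail):
    """Return True if mail passes every check of the Perl ValidEmailAddr."""
    for pattern in _REQUIRED:
        if not re.search(pattern, mail):
            return False
    for pattern in _FORBIDDEN:
        if re.search(pattern, mail):
            return False
    return True
```

With this sketch, valid_email_addr("joe@testmail.com") is accepted, while
"joe@@testmail.com", "joe smith@testmail.com" and "joe..smith@testmail.com"
are rejected. Note that the last required pattern, carried over unchanged from
the Perl, also rejects a real address such as "joe@testmail.info", because the
top-level domain has four letters.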
I've also been given this link:
http://www.interclasse.com/scripts/EMailValidatorCLS.php
I'll confess to not being much of a programmer but I'm sure I can come up
with an improvement on the current function with a little help.
Thank you.
Charlie
--
Charlie Clark
Helmholtzstr. 20
Düsseldorf
D- 40215
Tel: +49-211-938-5360