E-mail validation code
Dear list,

I've recently had a whole load of e-mail addresses to check and have been looking for a way to help with this. I was pointed in the direction of Mailman/Utils.py:ValidateEmail by Danny Yoo on the Python tutor list. It looks like this:

###
_badchars = re.compile('[][()<>|;^,]')

def ValidateEmail(str):
    """Verify that an email address isn't grossly invalid."""
    # Pretty minimal, cheesy check.  We could do better...
    if not str:
        raise Errors.MMBadEmailError
    if _badchars.search(str) or str[0] == '-':
        raise Errors.MMHostileAddress
    if string.find(str, '/') <> -1 and \
       os.path.isdir(os.path.split(str)[0]):
        # then
        raise Errors.MMHostileAddress
    user, domain_parts = ParseEmail(str)
    # this means local, unqualified addresses, are not allowed
    if not domain_parts:
        raise Errors.MMBadEmailError
    if len(domain_parts) < 2:
        raise Errors.MMBadEmailError
###

My unskilled eye agrees with the comment. Having trawled the web for alternatives, I've come across the following approaches:

The Python stuff is at
http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/66439/index_txt

It is a module with a single class which comes with some built-in patterns and the ability to define custom patterns. The built-in e-mail test doesn't catch things like spaces, @@ and so on.

The Perl stuff is at
http://aspn.activestate.com/ASPN/Cookbook/Rx/Recipe/68432/index_txt

sub ValidEmailAddr { #check if e-mail address format is valid
  my $mail = shift; #in form name@host
  return 0 if ( $mail !~ /^[0-9a-zA-Z\.\-\_]+\@[0-9a-zA-Z\.\-]+$/ ); #characters allowed on name: 0-9a-Z-._ on host: 0-9a-Z-. on between: @
  return 0 if ( $mail =~ /^[^0-9a-zA-Z]|[^0-9a-zA-Z]$/); #must start or end with alpha or num
  return 0 if ( $mail !~ /([0-9a-zA-Z]{1})\@./ ); #name must end with alpha or num
  return 0 if ( $mail !~ /.\@([0-9a-zA-Z]{1})/ ); #host must start with alpha or num
  return 0 if ( $mail =~ /.\.\-.|.\-\..|.\.\..|.\-\-./g ); #pair .- or -. or -- or .. not allowed
  return 0 if ( $mail =~ /.\.\_.|.\-\_.|.\_\..|.\_\-.|.\_\_./g ); #pair ._ or -_ or _. or _- or __ not allowed
  return 0 if ( $mail !~ /\.([a-zA-Z]{2,3})$/ ); #host must end with '.' plus 2 or 3 alpha for TopLevelDomain (MUST be modified in future!)
  return 1;
}

This seems to catch pretty much everything, but it's Perl and I'm not sure what !~ and =~ do.

I've started work on making custom definitions based on the Perl source, like this:

sv1 = StringValidator("joe@testmail.com")
sv1.definePattern("test1", "^[0-9a-zA-Z\.\-\_]+\@[0-9a-zA-Z\.\-]+$")
sv1.definePattern("test2", "^[^0-9a-zA-Z]|[^0-9a-zA-Z]$")
if not sv1.isValidForPattern("test1"):
    print sv1.validateString, " has invalid characters in the name"
elif not sv1.isValidForPattern("test1"):
    print sv1.validateString, " doesn't start or end with alpha or num"
else:
    print sv1.validateString, "is valid"

These tests work pretty well, but I'm having trouble turning the Perl lines with =~ into usable Python code: they tend to invalidate real addresses.

I've also been given
http://www.interclasse.com/scripts/EMailValidatorCLS.php

I'll confess to not being much of a programmer, but I'm sure I can come up with an improvement on the current function with a little help.

Thank you.

Charlie
--
Charlie Clark
Helmholtzstr. 20
Düsseldorf
D- 40215
Tel: +49-211-938-5360
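
On the two Perl operators: "$mail =~ /re/" is true when the string matches the pattern, and "$mail !~ /re/" is true when it does not. So a "return 0 if ( $mail =~ ... )" line rejects the address when the pattern is found, which is the opposite sense of the "!~" lines. A wrapper that treats "matches the pattern" as "valid" would have to invert the result for those lines. What follows is only a rough sketch of how the Perl checks might be carried over to Python's re module. It assumes a current Python 3 interpreter, and the names check_email and _CHECKS are made up for illustration; they are not part of Mailman or of either cookbook recipe.

```python
import re

# Each entry is (pattern, must_match, message).
# must_match=True  mirrors Perl's  return 0 if ( $mail !~ /.../ )
# must_match=False mirrors Perl's  return 0 if ( $mail =~ /.../ )
_CHECKS = [
    (r"^[0-9a-zA-Z._-]+@[0-9a-zA-Z.-]+$", True,
     "has invalid characters"),
    (r"^[^0-9a-zA-Z]|[^0-9a-zA-Z]$", False,
     "doesn't start or end with alpha or num"),
    (r"[0-9a-zA-Z]@.", True,
     "name must end with alpha or num"),
    (r".@[0-9a-zA-Z]", True,
     "host must start with alpha or num"),
    (r".\.-.|.-\..|.\.\..|.--.", False,
     "pair .- or -. or -- or .. not allowed"),
    (r".\._.|.-_.|._\..|._-.|.__.", False,
     "pair ._ or -_ or _. or _- or __ not allowed"),
    (r"\.[a-zA-Z]{2,3}$", True,
     "host must end with '.' plus 2 or 3 letters"),
]


def check_email(address):
    """Return None if every check passes, else the first failure message."""
    for pattern, must_match, message in _CHECKS:
        # re.search looks for the pattern anywhere in the string,
        # which is how an unanchored Perl match behaves.
        matched = re.search(pattern, address) is not None
        if matched != must_match:
            return message
    return None


if __name__ == "__main__":
    for addr in ("joe@testmail.com", "joe@@testmail.com", "-joe@testmail.com"):
        print(addr, "->", check_email(addr) or "is valid")
```

With these patterns, check_email("joe@testmail.com") returns None, while "joe@@testmail.com" fails the first check and "-joe@testmail.com" fails the second. The last pattern keeps the Perl recipe's limit of two or three letters for the top-level domain, so longer ones such as .info are rejected, as the recipe's own comment warns.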