[Tutor] Amazing power of Regular Expressions...

Michael Sparks ms at cerenity.org
Mon Nov 6 11:32:32 CET 2006


On Monday 06 November 2006 01:08, Alan Gauld wrote:
> While using a dictionary is probably overkill, so is a regex.

No, in this case it's absolutely the right choice.

> A simple string holding all characters and an 'in' test would probably
> be both easier to read and faster. 

I'm stunned you think this. It's precisely this sort of naivete that baffles 
me with regard to regexes. 

> Which kind of illustrates the point of the thread I think! :-)

Actually, no, it doesn't.

A regex compiles to a jump table, and executes as a state machine. (That
or the implementation in the library is crap, and I don't believe for a
second Python would be that dumb.)

Given that you can compile the regex once and use it over and over again
(which was the context the example code, isplain, was executed in), using
a regex is absolutely the best way.
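
To make the "compile once, reuse many times" point concrete, here is a
minimal sketch. The body of isplain is not shown in this thread, so this
version is an assumption: it simply applies the pattern under discussion,
and the name PLAIN_RE is hypothetical.

```python
import re

# Compiled once at import time, then reused by every call to isplain().
# PLAIN_RE is a hypothetical name; the pattern is the one discussed here.
PLAIN_RE = re.compile(r"^[0-9A-Za-z_.-]*$")


def isplain(text):
    """Return True if text contains only digits, ASCII letters, '_', '.' or '-'."""
    return PLAIN_RE.match(text) is not None


if __name__ == "__main__":
    for candidate in ["report_2006-11.txt", "has space", ""]:
        print(repr(candidate), isplain(candidate))
```

Two details worth knowing: because of the `*`, the empty string counts as
plain, and in Python `$` also matches just before a trailing newline, so
"abc\n" is accepted too. Using `\Z` instead of `$` is the stricter choice
if that matters.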

I'd be extremely surprised if you could get your suggested approach faster.
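
Speed claims like this are easy to measure rather than argue about. The
following is only a rough sketch of how one might compare the two
approaches with the standard timeit module. The names isplain_regex,
isplain_in, ALLOWED and compare are all hypothetical, the 'in' version is
one guess at what the quoted suggestion means, and no particular result is
implied: the numbers depend on the input and the Python version.

```python
import re
import string
import timeit

# The same character set expressed two ways (both names are hypothetical).
ALLOWED = string.ascii_letters + string.digits + "_.-"
PLAIN_RE = re.compile(r"^[0-9A-Za-z_.-]*$")


def isplain_regex(text):
    # Precompiled regex, reused on every call.
    return PLAIN_RE.match(text) is not None


def isplain_in(text):
    # A string holding all allowed characters plus an 'in' test per character.
    for ch in text:
        if ch not in ALLOWED:
            return False
    return True


def compare(sample, number=10000):
    """Return (regex_seconds, in_test_seconds) for `number` calls of each."""
    t_regex = timeit.timeit(lambda: isplain_regex(sample), number=number)
    t_in = timeit.timeit(lambda: isplain_in(sample), number=number)
    return t_regex, t_in


if __name__ == "__main__":
    print(compare("some_file-name.txt"))
```

It is worth trying both short and long samples, and samples that fail
early as well as ones that pass, before drawing any conclusion.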

I also doubt it would actually be clearer, and in this case that's MUCH more
important. (Heck, that's the reason regexes are useful: they give a compact,
clear representation of the lexical structure you want a string to match,
rather than one obfuscated by the language doing the checking.)

   * ^[0-9A-Za-z_.-]*$

Is a very simple pattern, and as a result an extremely simple specification.
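
For anyone who wants the pattern spelled out piece by piece, here is the
same expression written with Python's re.VERBOSE flag so that each part
can carry a comment. This is just an illustrative sketch. The name
PLAIN_VERBOSE is hypothetical, and the character class is kept on a single
line because whitespace inside a class is still significant in verbose
mode.

```python
import re

# The same pattern as ^[0-9A-Za-z_.-]*$, annotated part by part.
PLAIN_VERBOSE = re.compile(
    r"""
    ^                 # anchor at the start of the string
    [0-9A-Za-z_.-]    # one character: digit, ASCII letter, '_', '.' or '-'
    *                 # ... repeated zero or more times
    $                 # anchor at the end of the string
    """,
    re.VERBOSE,
)

if __name__ == "__main__":
    for candidate in ["a-b.c_9", "a b", "caf\u00e9"]:
        print(repr(candidate), PLAIN_VERBOSE.match(candidate) is not None)
```

Inside the brackets the '.' is a literal dot, and the '-' is literal
because it comes last in the class.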

If any developer has a problem with that sort of pattern, they really need to
go away and learn regexes, since they're missing an important tool (one
which shouldn't be overused).

I'm serious: if you think ^[0-9A-Za-z_.-]*$ is unclear and complex, go away
and relearn regexes.


Michael.

