[Tutor] Help on RE

tee chwee liong tcl76 at hotmail.com
Sun Jan 23 04:31:40 CET 2011


elegant. :) 
simple yet elegant. 
 

 
> Date: Sun, 23 Jan 2011 14:10:35 +1100
> From: steve at pearwood.info
> To: tutor at python.org
> Subject: Re: [Tutor] Help on RE
> 
> tee chwee liong wrote:
> > thanks for making me understand more on re. re is a confusing topic as i'm starting on python. 
> 
> I quote the great Jamie Zawinski, a world-class programmer and hacker:
> 
> Some people, when confronted with a problem, think "I know, I'll
> use regular expressions." Now they have two problems.
> 
> 
> Zawinski doesn't mean that you should never use regexes. But they should 
> be used only when necessary, for problems that are difficult enough to 
> require a dedicated domain-specific language for solving search problems.
> 
> Because that's what regexes are: they're a programming language for text 
> searching. They're not a full-featured programming language like Python 
> (technically, they are not Turing Complete) but nevertheless they are a 
> programming language. A programming language with a complicated, 
> obscure, hideously ugly syntax (and people complain about Forth!). Even 
> the creator of Perl, Larry Wall, has complained about regex syntax and 
> gives 19 serious faults with regular expressions:
> 
> http://dev.perl.org/perl6/doc/design/apo/A05.html
> 
> Most people turn to regexes much too quickly, using them to solve 
> problems that are either too small to need regexes, or too large. Using 
> regexes for solving your problem is like using a chainsaw for peeling an 
> orange.
> 
> Your data is very simple, and doesn't need regexes. It looks like this:
> 
> 
> Platform: PC
> Tempt : 25
> TAP0 :0
> TAP1 :1
> +++++++++++++++++++++++++++++++++++++++++++++
> Port Chnl Lane EyVt EyHt
> +++++++++++++++++++++++++++++++++++++++++++++
> 0 1 1 75 55
> 0 1 2 10 35
> 0 1 3 25 35
> 0 1 4 35 25
> 0 1 5 10 -1
> +++++++++++++++++++++++++++++++++++++++++++++
> Time: 20s
> 
> 
> The part you care about is the table of numbers, each line looks like this:
> 
> 0 1 5 10 -1
> 
> The easiest way to parse this line is this:
> 
> numbers = [int(word) for word in line.split()]
> 
> All you need then is a way of telling whether you have a line in the 
> table, or a header. That's easy -- just catch the exception and ignore it.
> 
> template = "Port=%d, Channel=%d, Lane=%d, EyVT=%d, EyHT=%d"
> for line in lines:
>     try:
>         numbers = [int(word) for word in line.split()]
>     except ValueError:
>         continue
>     print(template % tuple(numbers))
> 
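For anyone who wants to try the approach above end to end, here is a minimal self-contained sketch. It assumes the report is available as a string (SAMPLE below is the data quoted earlier in the message). The names parse_rows and width are made up for illustration and do not appear in the original message. One small addition, also an assumption on my part: a blank line splits into an empty list, which int() never rejects, so the sketch also skips any line that does not have exactly five numbers.

```python
# Sample report, copied from the data quoted in the message.
SAMPLE = """\
Platform: PC
Tempt : 25
TAP0 :0
TAP1 :1
+++++++++++++++++++++++++++++++++++++++++++++
Port Chnl Lane EyVt EyHt
+++++++++++++++++++++++++++++++++++++++++++++
0 1 1 75 55
0 1 2 10 35
0 1 3 25 35
0 1 4 35 25
0 1 5 10 -1
+++++++++++++++++++++++++++++++++++++++++++++
Time: 20s
"""


def parse_rows(lines, width=5):
    """Return the table rows as tuples of ints, skipping everything else.

    parse_rows and width are hypothetical names used for this sketch.
    """
    rows = []
    for line in lines:
        try:
            numbers = [int(word) for word in line.split()]
        except ValueError:
            # Headers, separators and "key: value" lines land here.
            continue
        if len(numbers) != width:
            # Blank lines give an empty list; skip those and any short rows.
            continue
        rows.append(tuple(numbers))
    return rows


rows = parse_rows(SAMPLE.splitlines())

template = "Port=%d, Channel=%d, Lane=%d, EyVT=%d, EyHT=%d"
for row in rows:
    print(template % row)
```

Reading from a file instead should work the same way, since iterating over an open file yields lines and split() discards the trailing newline.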
> 
> Too easy. Adding regexes just makes it slow, fragile, and difficult.
> 
> 
> My advice is, any time you think you might need regexes, you probably don't.
> 
> 
> -- 
> Steven
> 
 		 	   		  