scanf style parsing

Tim Hammerquist tim at vegeta.ath.cx
Thu Sep 27 18:42:30 EDT 2001


It seems to me that Duncan Booth <duncan at NOSPAMrcp.co.uk> said:
> tim at vegeta.ath.cx (Tim Hammerquist) wrote in 
> news:slrn9r61oo.uim.tim at vegeta.ath.cx:
> 
> > But don't think regexes are disposable just because Python's string type
> > is more convenient.  Consider the following:
> > 
> >     # perl
> >     if ($filename =~ /\.([ps]?html?|cgi|php[\d]?|pl)$/) { ... }
> >     # python
> >     re_web_files = re.compile(r'\.([ps]?html?|cgi|php[\d]?|pl)$')
> >     m = re_web_files.search(filename)
> >     if m:
> >         ...
> > 
> > This is a very complicated (but relatively efficient) way to match files
> > with all the following extensions:
> >     .htm    .html   .shtm   .shtml  .phtm   .phtml
> >     .cgi
> >     .php    .php2   .php3   .php4
> >     .pl
> 
> Wouldn't you be happier with this?:
> 
>    extensions = ['.htm', '.html', '.shtm', '.shtml', '.phtm',
>         '.phtml', '.cgi', '.php', '.php2', '.php3', '.php4', '.pl']
>    ext = os.path.splitext(filename)[1]
>    if ext in extensions:
>       ...
> 
> which has the arguable advantage of matching what your description says 
> instead of what your original code does.

The main point of the example was to demonstrate my own peeve (Python's
clumsy re implementation), not to show an example of good idiomatic
Python.  As I said, this is a good thing; it keeps
those of us with Perl experience in check.  <wink>

Your solution is quite good, and probably one I'd use; in practice, I
would add the .lower() method to the ext var, just as I would add the
re.I flag in the re.compile() statement...unless I wanted to put IIS
servers through Hell.  <wink>
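
A minimal sketch of both case-insensitive variants, for comparison.  The
function and constant names are made up for illustration; the extension
list is the one from the description quoted above, and the pattern is the
quoted one with re.I added.

```python
import os.path
import re

# Hypothetical names, for illustration only.
# Extensions taken from the description quoted earlier in the thread.
WEB_EXTENSIONS = {
    '.htm', '.html', '.shtm', '.shtml', '.phtm', '.phtml',
    '.cgi', '.php', '.php2', '.php3', '.php4', '.pl',
}

# The quoted pattern, compiled with re.I so case is ignored.
RE_WEB_FILES = re.compile(r'\.([ps]?html?|cgi|php[\d]?|pl)$', re.I)


def is_web_file_by_ext(filename):
    """List-based check: lowercase the extension before the lookup."""
    ext = os.path.splitext(filename)[1].lower()
    return ext in WEB_EXTENSIONS


def is_web_file_by_re(filename):
    """Regex-based check: re.I handles the case folding."""
    return RE_WEB_FILES.search(filename) is not None
```

The two checks are not equivalent: the pattern accepts .php followed by
any single digit (.php9, say), while the list accepts only the extensions
actually written down.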

> regexes are wonderful: in moderation.

That bore repeating. Thank you. =)

-- 
Destinations are often a surprise to the destined.
    -- Thessaly, The Sandman


