Regular expression to capture model numbers

Piet van Oostrum piet at cs.uu.nl
Fri Apr 24 09:23:56 CEST 2009


>>>>> John Machin <sjmachin at lexicon.net> (JM) wrote:

>JM> On Apr 24, 1:29 am, Piet van Oostrum <p... at cs.uu.nl> wrote:

>>> obj = re.compile(r'(?:[a-z]+[-0-9]|[0-9]+[-a-z]|-+[0-9a-z])[-0-9a-z]*', re.I)

>JM> Understandable and maintainable, I don't think. Suppose that instead
>JM> the first character is limited to being alphabetic. You have to go
>JM> through the whole process of elaborating the possibilities again, and I
>JM> don't consider that process qualifies as "express[ing] complicated
>JM> conditions like that".

No, I don't think regular expressions are the best tool for this kind
of test. I just wanted to show that it *could* be done. By the way,
your additional hypothetical requirement that the first character should
be alphabetic just makes it easier: only the first alternative remains.
On the other hand, suppose you had the requirement that the pattern
should not end in a hyphen; then it becomes even uglier. And if there
should never be two hyphens in a row, I wouldn't even think of using a
re, although theoretically it would be possible.
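
To make the variations above concrete, here is a minimal sketch. ORIGINAL
is the pattern quoted earlier; ALPHA_FIRST keeps only its first
alternative. NO_BAD_HYPHENS is just one possible way to bolt on the two
hyphen rules, using negative lookaheads; it is an illustration, not
something proposed in this thread. The helper name matches() and the
sample strings are made up, and the sketch assumes each candidate is
tested as a whole string.

```python
import re

# Pattern from the quoted message: a run of one character kind
# (letters, digits or hyphens) followed by a character of another kind.
ORIGINAL = r'(?:[a-z]+[-0-9]|[0-9]+[-a-z]|-+[0-9a-z])[-0-9a-z]*'

# First character must be alphabetic: only the first alternative remains.
ALPHA_FIRST = r'[a-z]+[-0-9][-0-9a-z]*'

# Assumption: extra rules added as negative lookaheads
# (no "--" anywhere, no hyphen at the very end).
NO_BAD_HYPHENS = r'(?!.*--)(?!.*-\Z)' + ORIGINAL


def matches(pattern, text):
    """Return True if the whole of text matches pattern, ignoring case."""
    return re.fullmatch(pattern, text, re.I) is not None


for sample in ["AB-123", "123-AB", "AB--1", "AB1-", "ABC"]:
    print(sample,
          matches(ORIGINAL, sample),
          matches(ALPHA_FIRST, sample),
          matches(NO_BAD_HYPHENS, sample))
```

Even with lookaheads the pattern gets harder to read with every rule, and
without them the alternatives would have to be worked out again by hand.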

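Translating these requirements into re's is not "composable": each new
requirement means reworking the whole pattern instead of adding one more
independent piece.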
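
For contrast, a rough sketch of what a composable version could look like
in plain Python. Every function name here is invented for illustration.
The two base rules are my reading of what the quoted pattern accepts when
applied to a whole string: only letters, digits and hyphens, and at least
two of those three kinds present.

```python
import string

LETTERS = set(string.ascii_letters)
DIGITS = set(string.digits)
ALLOWED = LETTERS | DIGITS | {"-"}


def only_allowed_chars(s):
    # Non-empty and made up of ASCII letters, digits and hyphens only.
    return bool(s) and all(c in ALLOWED for c in s)


def mixes_character_kinds(s):
    # At least two of the three kinds (letter, digit, hyphen) occur.
    kinds = {"hyphen" if c == "-" else "digit" if c in DIGITS else "letter"
             for c in s}
    return len(kinds) >= 2


def starts_with_letter(s):
    return bool(s) and s[0] in LETTERS


def no_trailing_hyphen(s):
    return not s.endswith("-")


def no_double_hyphen(s):
    return "--" not in s


def is_model_number(s, *extra_rules):
    # Base rules plus any number of independent extra rules.
    rules = (only_allowed_chars, mixes_character_kinds) + extra_rules
    return all(rule(s) for rule in rules)


print(is_model_number("AB-123"))
print(is_model_number("123-AB", starts_with_letter))
print(is_model_number("AB--1", no_double_hyphen))
```

Here a new requirement is one more small function passed to
is_model_number(); none of the existing rules has to be touched.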
-- 
Piet van Oostrum <piet at cs.uu.nl>
URL: http://pietvanoostrum.com [PGP 8DAE142BE17999C4]
Private email: piet at vanoostrum.org


