[Tutor] regexp

Sat Nov 5 02:06:45 CET 2011

Dinara, Steven,

On 4 November 2011 23:29, Steven D'Aprano <steve at pearwood.info> wrote:

> Is this homework? You should have said so.
>

Indeed he should've...

> I don't understand questions like this. Do carpenters ask their
> apprentices to cut a piece of wood with a hammer? Do apprentice chefs get
> told to dice carrots using only a spoon? Computer programming is the only
> skill I know of where teachers routinely insist that students use
> inappropriate tools to solve a problem, just to prove they can do it.
>

Calm down dear, it's only a reg-ex... ;)

Additionally, if I may say so, I find this part of your response rather
less than helpful. It's rant-ish IMHO, and as far as I'm concerned the
apprentice analogies are not akin to what programming teachers do and why
they do it.  Fact is, people need to learn the tools of their trade
somewhere.  Sometimes it's necessary to focus on a particular tool and
therefore dream up some artificial problem for the sake of learning to use
the tool.  It's not that apprentices are being taught to cut wood with a
hammer or whatnot.  And yes, regular expressions can be damn useful
sometimes.  Whether or not you like regular expressions, use them, or are
good at them has no bearing on whether Dinara should be able to learn how
to use them, nor is it relevant what you think of the question or whether
the context of the learning is a contrived question/exercise.

> In any case, if this is even possible using regular expressions -- and I
> don't think it is -- I have no idea how to do it. Good luck. Maybe somebody
> else might have a clue.
>

@Dinara:

It is actually.  Let's describe the requirement a bit differently, first in
words, then see if we can write a regular expression that will match that
description:  A word that matches our requirement will start with 0 or more
occurrences of a, followed by 0 or more occurrences of b, and so on,
until z, after which we must have reached the end of the word.  So the
requirements for the regex are:
1.) The regex must start at the beginning of the string.
2.) 0 or more a's may be matched, followed by 0 or more b's, followed by 0
or more c's and so on, up to z.
3.) The regex must end at the end of the string (so 1 and 3 together imply
that all the text in the string must be matched/consumed by the regex for a
match to have been found).

If we write a regex equivalent to the above requirement, and it matches
all the text in a string (i.e. a match is found), then by definition it
will have found a word containing letters in alphabetized order.  The only
special case to handle would be the empty string -- this would be matched
by the above regex but may not be considered correct per the intent of the
problem. (On the other hand, an empty string is not a word either, so one
might consider this invalid input in the first place and should
probably reject the input and refuse to process it.)
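
To make the empty-string case concrete, here is a minimal sketch on a
reduced three-letter alphabet (a to c) rather than the full one.  The name
is_ordered_word is made up for illustration, and rejecting empty input is
only one possible policy, not something the problem statement dictates.

```python
import re

# Reduced alphabet (a, b, c) purely for illustration.
# ^ and $ force the pattern to cover the whole string.
pattern = re.compile(r'^a*b*c*$')

def is_ordered_word(text):
    # Every starred term may match zero times, so '' matches the pattern.
    # Reject empty input explicitly before consulting the regex.
    return bool(text) and pattern.match(text) is not None

print(pattern.match('') is not None)   # the bare regex accepts ''
print(is_ordered_word(''))             # the wrapper rejects it
print(is_ordered_word('abc'), is_ordered_word('cab'))
```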

I'll give you some further hints.

1.) To specify/match the beginning of a string, you use the ^ character in
a regex.
2.) To specify 0 or more of something you append an asterisk, i.e. *
3.) To specify a letter to be matched, you can write it directly.  To
therefore match 0 or more a's for example, you'd write a*
4.) To specify a sequence of things you simply write them out.  So for
example the regular expression a*b* will match, in their entirety, strings
like 'ab', 'aab', 'abb', 'b', 'a', but not 'baa'...
5.) To specify the end of a string, you use the $ character in a regex.
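
The five hints can be tried out together on just two letters.  This is a
small sketch using Python's re module, with the example strings from hint
4; it deliberately stops at b and leaves the full alphabet as the exercise.

```python
import re

# ^ anchors the start, a* and b* each match zero or more of one letter,
# and $ anchors the end, so the whole string has to fit the pattern.
pattern = re.compile(r'^a*b*$')

for word in ['ab', 'aab', 'abb', 'b', 'a', 'baa']:
    # bool() turns the match object (or None) into True/False.
    print(word, bool(pattern.match(word)))
```

Without the ^ and $ anchors, re.match('a*b*', 'baa') would still succeed by
matching only the leading 'b', which is why the anchors matter here.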

By the way, it's possible to write an alternative version of the function
that Steven provided in a single line (of body code) with a regex.
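
For the record, one possible shape of such a one-liner is sketched below.
The function name is_alphabetized is hypothetical (it is not Steven's
name), it assumes lowercase ASCII letters only, and it builds the pattern
from string.ascii_lowercase instead of typing it out by hand.

```python
import re
import string

def is_alphabetized(word):
    # '*'.join(...) yields 'a*b*...y*z'; adding '^' and '*$' gives
    # '^a*b*...z*$', which must cover the whole word.
    return re.match('^' + '*'.join(string.ascii_lowercase) + '*$', word) is not None

print(is_alphabetized('almost'), is_alphabetized('zebra'))
```

Note that this version accepts the empty string, so any check for empty
input would have to be added separately.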

HTH,

Walter