[Tutor] regexp
Dinara Vakhitova
di.marvellous at gmail.com
Sun Nov 6 20:35:57 CET 2011
Dear Terry,
Thank you for your advice, I'll try to implement it.
D.
2011/11/6 Terry Carroll <carroll at tjc.com>
> On Sat, 5 Nov 2011, Dinara Vakhitova wrote:
>
>> I need to find the words in a corpus whose letters are in
>> alphabetical order ("almost", "my", etc.).
>> I started by matching two consecutive letters in a word that are in
>> alphabetical order, and tried to use this expression: ([a-z])[\1-z],
>> but it won't work; it matches any sequence of two letters. I can't
>> figure out why... Evidently I can't refer to a group like this, can I?
>> But how, in this case, can I achieve what I need?
>>
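
A short illustration of why the expression above matches any two letters. As far as Python's re module goes, numeric escapes inside a character class are read as character codes, not as group references, so [\1-z] means "any character from chr(1) through 'z'". The sample strings are just made up for the demonstration:

```python
import re

# The pattern from the question. Inside [...], \1 is NOT a backreference:
# it is the character with code 1, so the class spans chr(1) through 'z'.
broken = re.compile(r"([a-z])[\1-z]")

# "ba" is not in alphabetical order, yet the pattern still matches it,
# because 'a' lies between chr(1) and 'z'.
print(broken.fullmatch("ba") is not None)

# The class on its own even accepts the control character chr(1).
print(re.fullmatch(r"[\1-z]", "\x01") is not None)
```

Both lines print True, which is consistent with the "matches any sequence of two letters" behaviour described in the question.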
>
> First, I agree with the others that this is a lousy task for regular
> expressions. It's not the tool I would use. But, I do think it's doable,
> provided the requirement is not to check with a single regular expression.
> For simplicity's sake, I'll construe the problem as determining whether a
> given string consists entirely of lower-case alphabetic characters,
> arranged in alphabetical order.
>
> What I would do is set a variable to the lowest permissible character,
> i.e., "a", and another to the highest permissible character, i.e., "z"
> (actually, you could just use a constant for the highest, but I like the
> symmetry).
>
> Then construct a regex to see if a character is within the
> lowest-permissible to highest-permissible range.
>
> Now, iterate through the string, processing one character at a time. On
> each iteration:
>
> - test if your character meets the regexp; if not, your answer is
> "false"; on pass one, this means it's not lower-case alphabetic; on
> subsequent passes, it means either that, or that it's not in sorted
> order.
> - If it passes, update your lowest permissible character with the
> character you just processed.
> - regenerate your regexp using the updated lowest permissible character.
> - iterate.
>
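
The steps above could be sketched roughly as follows. This is only one reading of the description, not Terry's own code: the function name is_sorted_lowercase is made up for illustration, equal neighbouring letters (as in "all") are treated as being in order, and an empty string is accepted.

```python
import re


def is_sorted_lowercase(word):
    """Return True if word is all lower-case a-z, in alphabetical order."""
    lowest = "a"    # lowest permissible character so far
    highest = "z"   # constant, kept as a variable for symmetry
    for ch in word:
        # Regenerate the regexp for the current permissible range.
        pattern = re.compile("[%s-%s]" % (lowest, highest))
        if pattern.fullmatch(ch) is None:
            # Either not lower-case alphabetic, or out of sorted order.
            return False
        # The next character may not be lower than this one.
        lowest = ch
    return True


for sample in ("almost", "my", "regexp", "Almost"):
    print(sample, is_sorted_lowercase(sample))
```

Since lowest is always a lower-case letter by the time it is put into the pattern, nothing in the character class needs escaping.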
> I assumed lower-case alphabetic for simplicity, but you could extend this
> basic approach to handle mixed case (e.g., by first converting to an
> all-lower-case copy) or other complications.
>
> I don't think there's a problem with asking for help with homework on this
> list; but you should identify it as homework, so the responders know not to
> just give you a solution to your homework, but instead provide you with
> hints to help you solve it.
>
--
Yours faithfully,
Dinara Vakhitova