Whittle it on down
Random832
random832 at fastmail.com
Thu May 5 09:21:39 EDT 2016
On Thu, May 5, 2016, at 03:36, Steven D'Aprano wrote:
> Putting non-ASCII letters aside for the moment, how would you match these
> specs as a regular expression?
Well, obviously *your* language (not the OP's), given the cases you
reject, is "one or more sequences of letters separated by
space*-ampersand-space*", and that is actually one of the easiest kinds
of regex to write: "[A-Z]+( *& *[A-Z]+)*".
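A quick way to check that pattern against the spec is sketched below. This assumes Python's re module and whole-string matching via re.fullmatch (the regex as written has no anchors, so something has to supply them); the name matches_strict and the sample strings are made up for illustration.

```python
import re

# The pattern quoted above, compiled once.
STRICT = re.compile(r"[A-Z]+( *& *[A-Z]+)*")

def matches_strict(s):
    """Return True if the whole string fits the letters-and-ampersands spec."""
    # fullmatch requires the pattern to cover the entire string, so
    # leading or trailing spaces cause a rejection.
    return STRICT.fullmatch(s) is not None

# Illustrative inputs only, not taken from the OP's data.
for sample in ["AAA", "AAA & BBB", "AAA&BBB", "AAA BBB", " AAA", "AAA &"]:
    print(repr(sample), matches_strict(sample))
```

With re.match or re.search instead of re.fullmatch, a string like "AAA BBB" would be reported as a hit on the "AAA" prefix, so the anchoring matters.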
However, your spec is wrong:
> - Leading or trailing spaces, or spaces not surrounding an ampersand,
> must not match: "AAA BBB" must be rejected.
The *very first* item in OP's list of good outputs is 'PHYSICAL FITNESS
CONSULTANTS & TRAINERS'.
If you want something that's extremely conservative and still accepts
all of OP's input (leaving aside the *very odd in context* choice of
allowing arbitrary numbers of spaces - why would you allow this but
reject leading or trailing space?):
[A-Z]+(( *& *| +)[A-Z]+)*
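As a rough check of this looser pattern, again assuming Python's re module with re.fullmatch for anchoring (matches_loose is a name invented here, and apart from the OP's first good output the sample strings are made up):

```python
import re

# Words separated either by an ampersand with optional spaces,
# or by one or more plain spaces.
LOOSE = re.compile(r"[A-Z]+(( *& *| +)[A-Z]+)*")

def matches_loose(s):
    """Return True if the whole string is uppercase words joined by spaces or '&'."""
    return LOOSE.fullmatch(s) is not None

# The first string is the OP's first good output quoted above;
# the rest are illustrative only.
for sample in ["PHYSICAL FITNESS CONSULTANTS & TRAINERS",
               "AAA BBB", "AAA&BBB", " AAA", "AAA & & BBB"]:
    print(repr(sample), matches_loose(sample))
```

Every separator must be followed by at least one letter, which is why leading space, trailing space, and doubled ampersands are still rejected.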