[Python-Dev] Should we move to replace re with regex?

Steven D'Aprano steve at pearwood.info
Sat Aug 27 05:31:03 CEST 2011


Ben Finney wrote:
> Steven D'Aprano <steve at pearwood.info> writes:
> 
>> Ben Finney wrote:
>>> "M.-A. Lemburg" <mal at egenix.com> writes:
>>>> No, you tell them: "If you want Unicode 6 semantics, use regex, if
>>>> you're fine with Unicode 2.0/3.0 semantics, use re".
>>> What do we say, then, to those who are unaware of the different
>>> semantics between those versions of Unicode, and want regular expressions
>>> to “just work” in Python?
>>>
>>> To which document can we direct them to understand what semantics they
>>> want?
>> Presumably, like all modules, both the re and the regex module will
>> have their own individual pages in the library reference.
> 
> My question is directed more to M-A Lemburg's passage above, and its
> implicit assumption that the user understand the changes between
> “Unicode 2.0/3.0 semantics” and “Unicode 6 semantics”, and how their own
> needs relate to those semantics.
> 
> For programmers who know they want to follow Unicode conventions in
> Python, but don't know the distinction M-A Lemburg is drawing, to which
> document does he recommend we direct them?


I can only repeat my answer: the docs for the new regex module should 
include a discussion of the differences. If that requires summarising 
the differences that M-A Lemburg refers to, then so be it.


> “The Unicode specification document in its various versions” isn't a
> feasible answer.

Presumably the Unicode spec will be the canonical source, but I agree 
that we should not expect people to read that in order to make a 
decision between re and regex.


-- 
Steven

