[Python-ideas] Give regex operations more sugar

Chris Angelico rosuav at gmail.com
Thu Jun 14 04:59:24 EDT 2018


On Thu, Jun 14, 2018 at 6:21 PM, Steven D'Aprano <steve at pearwood.info> wrote:
> On Thu, Jun 14, 2018 at 12:12:34AM -0700, Brendan Barnwell wrote:
>> On 2018-06-13 23:37, Chris Angelico wrote:
> [...]
>> >How is this materially different from:
>> >
>> >"some string".re_match(...)
>> >
>> >? It's not a grouped namespace in any technical sense, but to any
>> >human, a set of methods that start with a clear prefix is functionally
>> >a group.
>>
>>       Do you really mean that? :-)
>>
>>       As far as I can see, by the same argument, there is no need for
>> modules.  Instead of math.sin and math.cos, we can just have math_sin
>> and math_cos.  Instead of os.path.join we can just have os_path_join.
>> And so on.  Just one big namespace for everything.  But as we all know,
>> namespaces are one honking great idea!
>
> I'm not Chris, but I'll try to give an answer...
>
> Visually, there shouldn't be any difference between using . as a
> namespace separator and using _ instead. Whether we type math.sin or
> math_sin makes little difference beyond familiarity.
>
> But it does make a difference in whether we can treat math as a distinct
> object without the .sin part, and whether we can treat namespaces as
> real values or not.
>
> So math.sin is little different from math_sin, but the fact that math
> alone is a module, a first-class object, and not just a prefix of the
> name, makes a big difference.

Yep. That's pretty much what I meant.
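
As a rough sketch of the distinction Steven describes (the describe()
helper is made up purely for illustration), a module can be handed
around as a value, which a mere name prefix cannot:

```python
import math


def describe(namespace):
    """Return the public names held by any namespace *object*."""
    return sorted(name for name in vars(namespace) if not name.startswith("_"))


# math on its own is a value: it can be passed to a function...
math_names = describe(math)

# ...and its members can be looked up dynamically by name.
sin = getattr(math, "sin")

# With a hypothetical flat math_sin / math_cos naming scheme there would be
# no object to pass to describe(); "math_" would only be a string prefix.
```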

There are many different types of namespace in Python. Some are actual
first-class objects (modules, classes, etc). Others are not, but (to a
programmer) are very similar (classes whose names end in "Error", the
various constants in the stat module, etc). Sometimes it's useful to query a
collection - you can say "show me all the methods and attributes of
float" or "give me all the builtins that end with Error" - and as
groups or collections, both types of namespace are reasonably
functional. But there is a very real *thing* that collects up all the
float methods, and that is the type <float>. That's a thing, and it
has an identity. What is the thing that gathers together all Errors
(as opposed to, say, all subclasses of Exception, which can be queried
from the Exception type)?
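
To make those queries concrete, here is a minimal sketch using only
the standard library (the variable names are just for illustration):

```python
import builtins

# First-class namespace: the type itself is the object holding the names.
float_names = [name for name in dir(float) if not name.startswith("_")]

# Naming-convention namespace: nothing holds these together except the
# suffix, so the only way to query it is to filter a larger namespace.
error_names = [name for name in dir(builtins) if name.endswith("Error")]

# The related grouping that *does* have an identity: direct subclasses
# can be queried from the Exception type itself.
direct_subclasses = Exception.__subclasses__()

# The two groupings overlap but are not the same set: StopIteration is
# an Exception subclass whose name does not end in "Error".
stop_is_exception = issubclass(StopIteration, Exception)
stop_named_error = "StopIteration" in error_names
```

Both queries work, but only the first and third ask an actual object
for its contents; the second is pattern-matching on names.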

Sometimes the line is blurry. What's the true identity of the math
module, other than "the collection of all things mathy"? It'd be
plausible to have a "trig" module that has sin/cos/tan etc, and it'd
also be plausible to say "from math import Fraction". But when there
is no strong identity to the actual thing, and there's a technical
reason to avoid giving it an arbitrary identity (what is
"spam".re and just how magical is it?), there's basically no reason to
do it.

Python gives us multiple tools, and there are good reasons to use all
of them. In this case, yes, I most definitely *am* saying that
<"spam".re_> is a valid human-readable namespace, but one which has no
intrinsic identity.

ChrisA

