[Python-ideas] Give regex operations more sugar

Steven D'Aprano steve at pearwood.info
Thu Jun 14 04:21:51 EDT 2018


On Thu, Jun 14, 2018 at 12:12:34AM -0700, Brendan Barnwell wrote:
> On 2018-06-13 23:37, Chris Angelico wrote:
[...]
> >How is this materially different from:
> >
> >"some string".re_match(...)
> >
> >? It's not a grouped namespace in any technical sense, but to any
> >human, a set of methods that start with a clear prefix is functionally
> >a group.
> 
> 	Do you really mean that? :-)
> 
> 	As far as I can see, by the same argument, there is no need for 
> modules.  Instead of math.sin and math.cos, we can just have math_sin 
> and math_cos.  Instead of os.path.join we can just have os_path_join. 
> And so on.  Just one big namespace for everything.  But as we all know, 
> namespaces are one honking great idea!

I'm not Chris, but I'll try to give an answer...

Visually, there shouldn't be any difference between using . as a 
namespace separator and using _ instead. Whether we type math.sin or 
math_sin makes little difference beyond familiarity.

But it does make a difference in whether we can treat math as a distinct 
object without the .sin part, and whether we can treat namespaces as 
real values or not.

So math.sin is little different from math_sin, but the fact that math 
alone is a module, a first-class object, and not just a prefix of the 
name, makes a big difference.

As you say:

> 	Now, of course there are other advantages to modules (such as being 
> able to save the time of loading things you don't need),

Loading on demand is one such advantage. Organising source code is 
another.

A third is being able to pass the math object around as a first-class 
value, and to call getattr(), setattr() or vars() on it, or otherwise 
introspect it. You can't do that if it's just a name prefix.
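
To make that concrete, here is a minimal sketch. The helper name 
call_by_name is made up for illustration; math, getattr() and vars() 
are the real thing.

```python
import math

def call_by_name(namespace, name, x):
    # Hypothetical helper: look up a function on whatever namespace
    # object was passed in, then call it.
    return getattr(namespace, name)(x)

# The module itself is a value we can hand to other code.
result = call_by_name(math, "cos", 0.0)

# Introspection works too: vars() exposes the module's contents.
public_names = [n for n in vars(math) if not n.startswith("_")]
```

With flat names like math_cos there is no object to hand to 
call_by_name; the closest you could get is searching globals() for 
names starting with "math_".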

> and likewise 
> there are other advantages to this descriptor mechanism in some cases. 
> (For instance, sometimes the sub-object may want to hold state if it is 
> going to be passed around and used later, rather than just having a 
> method called and being thrown away immediately.)

We can get that from making the regex method a method directly on the 
string object.

The question I have is, what benefit does the str.re intermediate object 
bring? Does it carry its own weight?

In his refactoring books, Martin Fowler makes it clear that objects 
ought to carry their own weight. When an object grows too big, you ought 
to split out functionality and state into intermediate objects. But if 
those intermediate objects do too little, the extra complexity they 
bring isn't justified by their usefulness.

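class Count:
    def __init__(self, start=0):
        # Begin counting from the requested value.
        self.counter = start
    def __iadd__(self, value):
        self.counter += value
        # In-place operators must return the result, or the name is rebound to None.
        return self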

Would you use that class, or say it simply adds a needless level of 
indirection?

If the re namespace doesn't do something to justify itself beyond simply 
adding a namespace, then Chris is right: we might as well just use re_ 
as a prefix and use a de facto namespace, and save the extra mental 
complexity and the additional indirection by dropping this intermediate 
descriptor object.


-- 
Steve

