[Python-ideas] Give regex operations more sugar
Steven D'Aprano
steve at pearwood.info
Thu Jun 14 04:21:51 EDT 2018
On Thu, Jun 14, 2018 at 12:12:34AM -0700, Brendan Barnwell wrote:
> On 2018-06-13 23:37, Chris Angelico wrote:
[...]
> >How is this materially different from:
> >
> >"some string".re_match(...)
> >
> >? It's not a grouped namespace in any technical sense, but to any
> >human, a set of methods that start with a clear prefix is functionally
> >a group.
>
> Do you really mean that? :-)
>
> As far as I can see, by the same argument, there is no need for
> modules. Instead of math.sin and math.cos, we can just have math_sin
> and math_cos. Instead of os.path.join we can just have os_path_join.
> And so on. Just one big namespace for everything. But as we all know,
> namespaces are one honking great idea!
I'm not Chris, but I'll try to give an answer...

Visually, there shouldn't be any difference between using . as a
namespace separator and using _ instead. Whether we type math.sin or
math_sin makes little difference beyond familiarity.

But it does make a difference in whether we can treat math as a distinct
object without the .sin part, and whether we can treat namespaces as
real values or not.

So math.sin is little different from math_sin, but the fact that math
alone is a module, a first-class object, and not just a prefix of the
name, makes a big difference.

As you say:
> Now, of course there are other advantages to modules (such as being
> able to save the time of loading things you don't need),
Loading on demand is one such advantage. Organising source code is
another.

A third is being able to pass the math object around as a first-class
value, to call getattr(), setattr() or vars() on it, or to use
introspection on it. You can't do that if it's just a name prefix.
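
To make that concrete, here is a minimal sketch of what a first-class
namespace buys you. The helper call_by_name and the function math_cos
are names I've made up for illustration; only the math module and the
builtins are real.

```python
import math


def call_by_name(namespace, name, *args):
    """Look up `name` on any namespace object and call it.

    Hypothetical helper: it works because `namespace` is a real object.
    """
    return getattr(namespace, name)(*args)


# The module itself is a value: pass it around, inspect it, rebind it.
result = call_by_name(math, "cos", 0.0)
public_names = sorted(n for n in vars(math) if not n.startswith("_"))
m = math  # another name for the same object


# With a bare prefix there is no object to hand to call_by_name().
# "math_" is only part of an identifier, not something getattr() or
# vars() can be applied to.
def math_cos(x):
    return math.cos(x)
```

Both spellings compute the same cosine, but only the first gives you
something to pass to getattr() or vars().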
> and likewise
> there are other advantages to this descriptor mechanism in some cases.
> (For instance, sometimes the sub-object may want to hold state if it is
> going to be passed around and used later, rather than just having a
> method called and being thrown away immediately.)
We can get that by making each regex operation a method directly on
the string object.

The question I have is, what benefit does the str.re intermediate object
bring? Does it carry its own weight?

In his refactoring books, Martin Fowler makes it clear that objects
ought to carry their own weight. When an object grows too big, you ought
to split out functionality and state into intermediate objects. But if
those intermediate objects do too little, the extra complexity they
bring isn't justified by their usefulness.
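
    class Count:
        # A thin wrapper around a single integer.
        def __init__(self, start=0):
            self.counter = start
        def __iadd__(self, value):
            self.counter += value
            return self  # augmented assignment rebinds to this result
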
Would you use that class, or say it simply adds a needless level of
indirection?
If the re namespace doesn't do something to justify itself beyond simply
adding a namespace, then Chris is right: we might as well just use re_
as a prefix and use a de facto namespace, and save the extra mental
complexity and the additional indirection by dropping this intermediate
descriptor object.
--
Steve