[Python-ideas] More user-friendly version for string.translate()

Chris Barker chris.barker at noaa.gov
Wed Nov 2 14:13:37 EDT 2016


On Tue, Nov 1, 2016 at 12:15 AM, Stephen J. Turnbull <
turnbull.stephen.fw at u.tsukuba.ac.jp> wrote:

>  > pretty slick -- but any hope of it being as fast as a C implemented
> method?
>
> I would expect not in CPython, but if "fast" matters, why are you
> using CPython rather than PyPy or Cython?


oh come on!


>  If it matters *that* much,
> you can afford to write your own C implementation.


This is about a possible addition to the stdlib -- me writing my own C
implementation has nothing to do with it.


> But I doubt that
> fast matters "that much" often enough to be worth maintaining yet
> another string method in Python.


This could be said about every string method in Python -- I understand that
every addition is more code to maintain. But somehow we are adding all
kinds of stuff like yet another string formatting method, talking about
null coalescing operators and who knows what -- those are all a MUCH larger
burden -- not just for maintaining the interpreter, but for everyone using
python having more to remember and understand.

On the other hand, powerful and performant string methods are a major plus
for Python -- a good reason to use it over Perl :-)

So a new one that provides, as I wrote before:

>  > 1) single method call to do a common thing
>  >
>  > 2) nice fast, pure C performance
>

would fit right into Python, and indeed, would be a similar
implementation to existing methods -- so the maintenance burden would be a
small addition (i.e., if the internal representation for strings changed, all
those methods would need re-visiting and similar changes)

So the only key question is -- is this a common enough use case?

>  > so I think a "keep these" method would help with both of these
>  > goals.
>
> Sure, but the translate method already gives you that, and a lot more.
>

yes, but only with the fairly esoteric use of defaultdict, which brings me
back to the above:

1) single method call to do a common thing

the nice thing about a single method call is discoverability -- no newbie
is going to figure out the .translate + defaultdict approach.
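
For anyone who has not seen it, the .translate + defaultdict approach looks
roughly like the sketch below. The helper name keep_only is hypothetical,
used only for illustration; it is not a proposed API.

```python
from collections import defaultdict


def keep_only(text, keep):
    """Return text with every character not in `keep` removed.

    Hypothetical helper, shown only to illustrate the idiom.
    """
    # Characters to keep map to themselves. Any other ordinal falls through
    # to the default factory, which returns None, and str.translate()
    # deletes characters whose table entry is None.
    table = defaultdict(lambda: None, {ord(c): c for c in keep})
    return text.translate(table)


print(keep_only("Hello, World! 123", "abcdefghijklmnopqrstuvwxyz"))
```

One thing to be aware of: the defaultdict stores an entry for every distinct
character it is asked about, so the table grows as it sees new characters.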



> Note that when you're talking about working with Unicode characters,
> no natural language activity I can imagine (not even translating
> Buddhist texts, which involves a couple of Indian scripts as well as
> Han ideographs) uses more than a fraction of defined characters.
>

which is why you may want to remove all the others :-)

> So really translate with defaultdict is a specialized loop that
> marries an algorithmic body (which could do things like look up the
> original script or other character properties to decide on the
> replacement for the generic case) with a (usually "small") table of
> exceptions.  That seems like inspired design to me.
>

indeed -- .translate() itself is remarkably flexible -- you could even pass
in a custom class that does all sorts of logic, and adding the defaultdict
is an easy way to add a useful feature. But again, it is advanced usage and
not very discoverable.
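
As a rough illustration of the custom-class idea: str.translate() only needs
an object that supports indexing by code point, so a small class with
__getitem__ is enough. The class name KeepLetters and its keep-letters rule
are assumptions made up for this sketch.

```python
class KeepLetters:
    """Hypothetical translation table: keep letters, delete everything else."""

    def __getitem__(self, codepoint):
        # str.translate() looks up table[ordinal] for each character.
        ch = chr(codepoint)
        if ch.isalpha():
            return ch   # returning a string maps the character to it
        return None     # returning None deletes the character


print("café 42!".translate(KeepLetters()))
```

The logic inside __getitem__ could be anything (character properties, script
lookups, a table of exceptions), which is what makes this flexible but also
hard to discover.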

Maybe that means we need some more docs and/or perhaps recipes instead.

Anyway, I joined this thread to clarify what might be on the table -- but
while I think it's a good idea, I don't have the bandwidth to move it
through the process -- so unless someone steps up that does, we're done.

-CHB


-- 

Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R            (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception

Chris.Barker at noaa.gov