On Wed, Oct 23, 2019 at 10:59 AM Steven D'Aprano <steve@pearwood.info> wrote:
For the example you gave, besides saving a few characters I don't see the advantage over the existing way we have to do that:
'one two three'.split()
One of the reasons why Python is "slow" is that lots of things that can be done at compile-time are deferred to run-time. I doubt that splitting short strings will often be a bottleneck, but idioms like this cannot help but contribute (even if only a little bit) to the extra work the Python interpreter does at run-time:
- load a pre-allocated string constant
- look up the "split" attribute in the instance (not found)
- look up the "split" attribute in the class
- call the descriptor protocol which returns a method
- call the method
- build and return a list
- garbage collect the string constant
versus:
build and return a list from pre-allocated strings
(Or something like this, I'm not really an expert on the Python internals, I just pretend to know what I'm talking about.)
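The two sequences of steps above can be roughly compared by disassembling both spellings. This is only a sketch: the helper name `opnames` is made up for illustration, and the exact opcodes differ between CPython versions, so no expected output is shown.

```python
import dis


def opnames(source):
    """Return the opcode names CPython emits for a single expression."""
    code = compile(source, "<example>", "eval")
    return [ins.opname for ins in dis.get_instructions(code)]


# The run-time idiom: load a string constant, look up .split, call it.
split_code = compile("'one two three'.split()", "<example>", "eval")
split_ops = opnames("'one two three'.split()")

# The hand-written equivalent: build the list directly from constants.
literal_ops = opnames("['one', 'two', 'three']")

print("split():  ", split_ops)
print("literal:  ", literal_ops)
print("constants:", split_code.co_consts)
```

On the interpreters I would expect this to run on, the split() version carries the unsplit string as a constant plus a method lookup and a call, while the literal version has no call at all.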
This could be done as an optimization without changing syntax or semantics. As long as the initial string is provided as a literal, it should be possible to call the method at compile time, since (as far as I know) every string method is a pure function. It's made a little more complicated by the problem of mutable return values (str.split() returns a list, and if you call it again, you have to get a new unique list in case one of them gets mutated), but if you immediately iterate over it, that won't be a problem.

Currently, the CPython optimizer can recognize constructs like "if x in [1,2,3,4]" or "for x in [1,2,3,4]" and use a literal tuple instead of building a list. Recognizing the splitting of a string as another equivalent literal could be done the same way. Whether it's worthwhile or not is another question, but if the performance penalty of the run-time splitting is a problem, that CAN be fixed even without new syntax.

ChrisA
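To make the idea concrete, here is a minimal sketch of that kind of folding done as an AST rewrite in pure Python. It is not how CPython's optimizer is implemented; the class name `FoldLiteralSplit` and the helper `fold_source` are hypothetical, and it only handles the safe case described above: a no-argument split() on a string literal that is immediately iterated over by a for loop. It needs Python 3.9+ for ast.unparse.

```python
import ast


class FoldLiteralSplit(ast.NodeTransformer):
    """Rewrite `for x in 'a b c'.split():` as `for x in ('a', 'b', 'c'):`."""

    def visit_For(self, node):
        self.generic_visit(node)  # handle nested loops first
        it = node.iter
        if (
            isinstance(it, ast.Call)
            and not it.args
            and not it.keywords
            and isinstance(it.func, ast.Attribute)
            and it.func.attr == "split"
            and isinstance(it.func.value, ast.Constant)
            and isinstance(it.func.value.value, str)
        ):
            # Do the split now, at "compile time". A tuple is safe here
            # because the loop only iterates and never sees the container.
            parts = it.func.value.value.split()
            folded = ast.Tuple(
                elts=[ast.Constant(value=p) for p in parts],
                ctx=ast.Load(),
            )
            node.iter = ast.copy_location(folded, it)
        return node


def fold_source(source):
    """Parse source, fold literal splits in for loops, return the new tree."""
    tree = FoldLiteralSplit().visit(ast.parse(source))
    ast.fix_missing_locations(tree)
    return tree


source = """
result = []
for word in 'one two three'.split():
    result.append(word.upper())
"""

tree = fold_source(source)
print(ast.unparse(tree))

# The rewritten program behaves the same as the original.
namespace = {}
exec(compile(tree, "<folded>", "exec"), namespace)
print(namespace["result"])
```

A split() whose result is bound to a name is deliberately left alone, since the caller may mutate the list and must get a fresh one each time.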