On Tue, May 02, 2017 at 11:39:48PM +0100, Erik wrote:
> On 02/05/17 12:31, Steven D'Aprano wrote:
>> Rather than duplicate the API and logic everywhere, I suggest we add a new string method. My suggestion is str.chunk(size, delimiter=' ') and str.rchunk() with the same arguments:
For the record, I now think the second argument should be called "sep", for separator, and I'm okay with Greg's suggestion that we call the method "group".
>> "1234ABCDEF".chunk(4) => returns "1234 ABCD EF" [...]
> Why do you want to limit it to strings?
I'm not stopping anyone from proposing a generalisation of this that works with other sequence types. As somebody did :-) I've also been thinking about generalisations such as grouping lines into paragraphs, words into lines, etc. In text processing, chunking can refer to more than just characters. But here we have a specific, concrete use-case that involves strings. Anything else is YAGNI until a need is demonstrated :-)
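To make the concrete string use-case easier to picture, here is a minimal sketch written as plain functions rather than str methods. The names group and rgroup and the "sep" keyword follow the naming discussed in this thread, but they are hypothetical stand-ins, not an existing API. The right-to-left behaviour of rgroup is an assumption, since the thread only shows the left-to-right example.

```python
def group(s, size, sep=' '):
    """Join consecutive runs of `size` characters with `sep`, counting from the left."""
    if size < 1:
        raise ValueError("size must be a positive integer")
    return sep.join(s[i:i + size] for i in range(0, len(s), size))


def rgroup(s, size, sep=' '):
    """Like group(), but counting from the right (assumed semantics),
    so any short group ends up at the start of the result."""
    if size < 1:
        raise ValueError("size must be a positive integer")
    first = len(s) % size            # length of the leading short group, if any
    parts = [s[:first]] if first else []
    parts.extend(s[i:i + size] for i in range(first, len(s), size))
    return sep.join(parts)


print(group("1234ABCDEF", 4))    # 1234 ABCD EF
print(rgroup("1234ABCDEF", 4))   # 12 34AB CDEF
```

Both functions return a string, in keeping with the point that string methods should return strings.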
> Isn't something like this potentially useful for all sequences (where the result is a tuple of objects that are the same as the source sequence - be that strings or lists or lazy ranges or whatever?). Why aren't the chunks returned via an iterator?
String methods should return strings. That's not to argue against a generic iterator solution, but the barrier to use of an iterator solution is higher than just calling a method. You have to learn about importing, you need to know there is an itertools module (or a third party module to install first!), you have to know how to convert the iterator back to a string...
-- 
Steve
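For comparison, the generic iterator route might look roughly like the sketch below. The helper name chunks is hypothetical; only itertools.islice and str.join are real library calls. It shows the extra steps mentioned above: an import, a helper, and a join to get back to a string.

```python
from itertools import islice


def chunks(iterable, size):
    """Yield successive tuples of up to `size` items from any iterable."""
    it = iter(iterable)
    while True:
        chunk = tuple(islice(it, size))
        if not chunk:        # iterator exhausted
            return
        yield chunk


# Converting the iterator of tuples back into a single string:
grouped = ' '.join(''.join(c) for c in chunks("1234ABCDEF", 4))
print(grouped)   # 1234 ABCD EF
```

The same helper works unchanged on lists or ranges, which is the generalisation asked about, at the cost of the extra steps for the plain string case.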