
On 9 Mar 2023, at 00:37, Rob Cliffe via Python-ideas <python-ideas@python.org> wrote:
Having had my last proposal shot down in flames, up I bob with another. 😁
See this discussion that has a nice solution proposed with the concat function.

Barry
It seems to me that it would be useful to be able to make the str.join() method put separators, not only between the items of its operand, but also optionally at the beginning or end. E.g.
    '|'.join(('Spam', 'Ham', 'Eggs'))
returns
    'Spam|Ham|Eggs'
but it might be useful to make it return one of
    '|Spam|Ham|Eggs'
    'Spam|Ham|Eggs|'
    '|Spam|Ham|Eggs|'
Again, I suggest that this apply to byte strings as well as strings.

Going through the 3.8.3 stdlib I have found
    24 examples where the separator needs to be added at the beginning
    52 where the separator needs to be added at the end
    4 where the separator needs to be added at both ends
I list these examples below. Apologies if there are any mistakes.
While guessing is no substitute for measurement, it seems plausible that using this feature where appropriate would increase runtime performance by avoiding 1 (or 2) calls of str.__add__. This is perhaps more relevant when the separator is not a short constant string, as in this example:
    Lib\email\_header_value_parser.py:2854:
        return policy.linesep.join(lines) + policy.linesep
Note also this example:
    Lib\site-packages\setuptools\command\build_ext.py:221:
        pkg = '.'.join(ext._full_name.split('.')[:-1] + [''])
where the author has used the unintuitive device of appending an empty string to a list to force join() to add an extra final dot, thereby avoiding 1 call of str.__add__ at the cost of 1 list concatenation (list.__add__).
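Since measurement beats guessing, here is a minimal sketch of how one might time the two spellings quoted above with timeit. The separator, the 100-item list and the function names are my own assumptions for illustration; results will vary by machine and Python version, so no numbers are claimed here.

```python
import timeit

# Hypothetical inputs: a non-trivial separator and 100 short lines.
sep = "\r\n"
lines = [f"line {i}" for i in range(100)]


def join_then_add():
    # The common spelling: join, then one extra str.__add__ call.
    return sep.join(lines) + sep


def join_with_empty_tail():
    # The build_ext.py device: extend the list with '' so that join()
    # itself emits the trailing separator.
    return sep.join(lines + [""])


# Both spellings must produce the same string before timing means anything.
assert join_then_add() == join_with_empty_tail()

for fn in (join_then_add, join_with_empty_tail):
    seconds = timeit.timeit(fn, number=20_000)
    print(f"{fn.__name__}: {seconds:.3f}s for 20,000 calls")
```

This only compares the two existing workarounds with each other; it cannot say how a built-in option would perform.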
What I cannot decide is what the best API would be. str.join() currently takes only 1 parameter, so it would be possible to add an extra parameter or two. One scheme would be to have an atEnds parameter which could take values such as
    0=default behaviour
    1=add sep at start
    2=add sep at end
    3=add sep at both ends
or
    's'=add sep at start
    'e'=add sep at end
    'b'=add sep at both ends
    (some) other=default behaviour
Another would be to have 2 parameters, atStart and atEnd, which would both default to False or 0. E.g.
    '|'.join(('Spam', 'Ham', 'Eggs'), 1) == '|Spam|Ham|Eggs'
    '|'.join(('Spam', 'Ham', 'Eggs'), 0, 1) == 'Spam|Ham|Eggs|'
Neither scheme results in particularly transparent usage, though no worse than
    s.splitlines(True) # What on earth is this parameter???
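To make the two schemes concrete, here is a rough sketch written as ordinary functions, since str.join() itself cannot be patched from Python. The names join_ends, join_flags, AT_START and AT_END are hypothetical, chosen only for this illustration, and nothing here is an existing or agreed API.

```python
def join_ends(sep, items, at_start=False, at_end=False):
    """Hypothetical stand-in for the two-parameter scheme (atStart, atEnd)."""
    empty = sep[:0]  # '' for str, b'' for bytes, so both types work
    body = sep.join(items)
    return (sep if at_start else empty) + body + (sep if at_end else empty)


# Hypothetical values for the single-parameter scheme: 1=start, 2=end, 3=both.
AT_START, AT_END = 1, 2


def join_flags(sep, items, at_ends=0):
    """Hypothetical stand-in for the single atEnds parameter, values 0 to 3."""
    return join_ends(sep, items,
                     at_start=bool(at_ends & AT_START),
                     at_end=bool(at_ends & AT_END))


print(join_ends('|', ('Spam', 'Ham', 'Eggs'), at_start=True))
print(join_ends('|', ('Spam', 'Ham', 'Eggs'), at_end=True))
print(join_flags('|', ('Spam', 'Ham', 'Eggs'), AT_START | AT_END))
print(join_ends(b'/', [b'usr', b'lib'], at_start=True))
```

Passing the options by keyword, as in the first two calls, reads more transparently than a bare 1 or 0, which may be an argument for making any new parameters keyword-only.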
Corner case: What if join() is passed an empty sequence? This is debatable, but I think it should return the separator if requested to add it at the beginning or end, and double it up if both are requested. This would preserve identities such as
    sep.join(seq, <PleaseAddSeparatorAtBeginning>) == sep + sep.join(seq)
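The empty-sequence case is where two otherwise equivalent implementations part ways, which a small sketch can show. Both functions below are hypothetical helpers written for this illustration (str only): one adds the separator by concatenation, as the identity above requires, and the other pads the sequence with empty strings, as in the build_ext.py device.

```python
def by_concatenation(sep, seq, at_start=False, at_end=False):
    # Matches the identity: sep + sep.join(seq), sep.join(seq) + sep.
    return (sep if at_start else "") + sep.join(seq) + (sep if at_end else "")


def by_padding(sep, seq, at_start=False, at_end=False):
    # Pads the sequence with '' and lets join() supply the separators.
    padded = ([""] if at_start else []) + list(seq) + ([""] if at_end else [])
    return sep.join(padded)


# For a non-empty sequence the two agree.
assert by_concatenation("|", ["a", "b"], True, True) == "|a|b|"
assert by_padding("|", ["a", "b"], True, True) == "|a|b|"

# For an empty sequence they do not.
assert by_concatenation("|", [], at_start=True) == "|"
assert by_padding("|", [], at_start=True) == ""      # '|'.join(['']) is ''
assert by_concatenation("|", [], True, True) == "||"
assert by_padding("|", [], True, True) == "|"        # '|'.join(['', '']) is '|'
```

So the behaviour suggested above (one separator for one end, two for both) is what the concatenation reading gives; the padding reading would return one separator fewer when the sequence is empty.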
Best wishes
Rob Cliffe
EXAMPLES WHERE SEPARATOR ADDED AT START:

Lib\http\server.py:933:
    splitpath = ('/' + '/'.join(head_parts), tail_part)
Lib\site-packages\numpy\ctypeslib.py:333:
    name += "_"+"_".join(flags)
Lib\site-packages\numpy\testing\_private\utils.py:842:
    err_msg += '\n' + '\n'.join(remarks)
Lib\site-packages\pip\_vendor\pyparsing\core.py:2092-2095:
    out = [ "\n" + "\n".join(comments) if comments else "", pyparsing_test.with_line_numbers(t) if with_line_numbers else t, ]
Lib\site-packages\pip\_vendor\requests\status_codes.py:121-125:
    __doc__ = ( __doc__ + "\n" + "\n".join(doc(code) for code in sorted(_codes)) if __doc__ is not None else None )
Lib\site-packages\reportlab\lib\utils.py:1093:
    self._writeln(' '+' '.join(A.__self__))
Lib\site-packages\reportlab\platypus\flowables.py:708:
    L = "\n"+"\n".join(L)
Lib\site-packages\twisted\mail\smtp.py:1647:
    r.append(c + b' ' + b' '.join(v))
Lib\site-packages\twisted\protocols\ftp.py:1203:
    return (PWD_REPLY, '/' + '/'.join(self.workingDirectory))
Lib\site-packages\twisted\runner\procmon.py:424-426:
    return ('<' + self.__class__.__name__ + ' ' + ' '.join(l) + '>')
Lib\site-packages\twisted\web\rewrite.py:34:
    request.path = '/'+'/'.join(request.prepath+request.postpath)
Lib\site-packages\twisted\web\rewrite.py:51:
    request.path = '/'+'/'.join(request.prepath+request.postpath)
Lib\site-packages\twisted\web\twcgi.py:78:
    scriptName = b"/" + b"/".join(request.prepath)
Lib\site-packages\twisted\web\twcgi.py:95:
    env["PATH_INFO"] = "/" + "/".join(pp)
Lib\site-packages\twisted\web\vhost.py:115:
    request.path = b'/' + b'/'.join(request.postpath)
Lib\site-packages\twisted\web\wsgi.py:283:
    scriptName = b'/' + b'/'.join(request.prepath)
Lib\site-packages\twisted\web\wsgi.py:288:
    pathInfo = b'/' + b'/'.join(request.postpath)
Lib\site-packages\twisted\web\test\test_wsgi.py:272:
    uri = '/' + '/'.join([urlquote(seg, safe) for seg in requestSegments])
Lib\site-packages\wx\py\magic.py:55:
    command = 'sx("'+aliasDict[c[0]]+' '+' '.join(c[1:])+'")'
Lib\site-packages\zope\interface\exceptions.py:257-260:
    return '\n ' + '\n '.join( x._str_details.strip() if isinstance(x, _TargetInvalid) else str(x) for x in self.exceptions )
Lib\smtplib.py:537 and 545:
    optionlist = ' ' + ' '.join(options)
Lib\unittest\case.py:1094-1096:
    diffMsg = '\n' + '\n'.join( difflib.ndiff(pprint.pformat(seq1).splitlines(), pprint.pformat(seq2).splitlines()))
Lib\unittest\case.py:1207-1209:
    diff = ('\n' + '\n'.join(difflib.ndiff( pprint.pformat(d1).splitlines(), pprint.pformat(d2).splitlines())))

SEPARATOR ADDED AT END:

Lib\distutils\command\config.py:303:
    body = "\n".join(body) + "\n"
Lib\email\contentmanager.py:145:
    def embedded_body(lines): return linesep.join(lines) + linesep
Lib\email\contentmanager.py:146:
    def normal_body(lines): return b'\n'.join(lines) + b'\n'
Lib\email\policy.py:215:
    return name + ': ' + self.linesep.join(lines) + self.linesep
Lib\email\_header_value_parser.py:2854:
    return policy.linesep.join(lines) + policy.linesep
Lib\site-packages\numpy\distutils\command\config.py:346:
    body = '\n'.join(body) + "\n"
Lib\site-packages\numpy\distutils\command\config.py:407:
    body = '\n'.join(body) + "\n"
Lib\site-packages\oauthlib\oauth2\rfc6749\tokens.py:158:
    base_string = '\n'.join(base) + '\n'
Lib\site-packages\PIL\ImageCms.py:770:
    return "\r\n\r\n".join(arr) + "\r\n\r\n"
Lib\site-packages\pip\_internal\operations\freeze.py:254:
    return "\n".join(list(self.comments) + [str(req)]) + "\n"
Lib\site-packages\pip\_internal\operations\install\legacy.py:54:
    f.write("\n".join(new_lines) + "\n")
Lib\site-packages\pip\_vendor\pyparsing\testing.py:323-331:
    return ( header1 + header2 + "\n".join( "{:{}d}:{}{}".format(i, lineno_width, line, eol_mark) for i, line in enumerate(s_lines, start=start_line) ) + "\n" )
Lib\site-packages\pycparser\c_generator.py:117:
    if n.storage: s += ' '.join(n.storage) + ' '
Lib\site-packages\pycparser\c_generator.py:366:
    if n.funcspec: s = ' '.join(n.funcspec) + ' '
Lib\site-packages\pycparser\c_generator.py:367:
    if n.storage: s += ' '.join(n.storage) + ' '
Lib\site-packages\pycparser\c_generator.py:382:
    if n.quals: s += ' '.join(n.quals) + ' '
Lib\site-packages\pycparser\c_generator.py:397:
    nstr += ' '.join(modifier.dim_quals) + ' '
Lib\site-packages\pycparser\c_generator.py:417:
    return ' '.join(n.names) + ' '
Lib\site-packages\pythonwin\pywin\framework\scriptutils.py:109:
    return ".".join(modBits) + "." + fname, newPathReturn
Lib\site-packages\reportlab\pdfbase\pdfdoc.py:1118:
    code = '\n'.join(code)+'\n'
Lib\site-packages\reportlab\pdfbase\pdfutils.py:102:
    f.write('\r\n'.join(code)+'\r\n')
Lib\site-packages\reportlab\pdfbase\_can_cmap_data.py:54:
    src = '\n'.join(buf) + '\n'
Lib\site-packages\reportlab\pdfgen\pdfimages.py:203:
    content = '\n'.join(self.imageData[3:-1]) + '\n'
Lib\site-packages\setuptools\command\build_ext.py:221:
    pkg = '.'.join(ext._full_name.split('.')[:-1] + [''])
Lib\site-packages\setuptools\command\easy_install.py:1056:
    f.write('\n'.join(locals()[name]) + '\n')
Lib\site-packages\setuptools\command\easy_install.py:1606:
    data = '\n'.join(lines) + '\n'
Lib\site-packages\setuptools\command\egg_info.py:672:
    cmd.write_file("top-level names", filename, '\n'.join(sorted(pkgs)) + '\n')
Lib\site-packages\setuptools\command\egg_info.py:683:
    value = '\n'.join(value) + '\n'
Lib\site-packages\setuptools\_distutils\command\config.py:303:
    body = "\n".join(body) + "\n"
Lib\site-packages\twisted\conch\manhole.py:360-362:
    return (b'\n'.join(self.interpreter.buffer) + b'\n' + b''.join(self.lineBuffer))
Lib\site-packages\twisted\conch\client\knownhosts.py:547-549:
    hostsFileObj.write( b"\n".join([entry.toString() for entry in self._added]) + b"\n")
Lib\site-packages\twisted\conch\ssh\keys.py:1340:
    return b'\n'.join(lines) + b'\n'
Lib\site-packages\twisted\conch\test\test_conch.py:556:
    expectedResult = '\n'.join(['line #%02d' % (i,) for i in range(60)]) + '\n'
Lib\site-packages\twisted\conch\test\test_helper.py:360:
    self.term.write(b'\n'.join((s1, s2, s3)) + b'\n')
Lib\site-packages\twisted\internet\test\test_process.py:769:
    scriptFile.write(os.linesep.join(sourceLines) + os.linesep)
Lib\site-packages\twisted\mail\imap4.py:5713:
    hdrs = '\r\n'.join(hdrs) + '\r\n'
Lib\site-packages\twisted\mail\imap4.py:5952:
    base = b'.'.join([(x + 1).__bytes__() for x in self.part]) + b'.' + base
Lib\site-packages\twisted\mail\test\test_pop3.py:312:
    self.message = b'\n'.join(self.lines) + b'\n'
Lib\site-packages\twisted\mail\test\test_pop3.py:376:
    output = b'\r\n'.join(client.response) + b'\r\n'
Lib\site-packages\twisted\mail\test\test_smtp.py:100:
    message = b'\n'.join(self.buffer) + b'\n'
Lib\site-packages\twisted\mail\test\test_smtp.py:344:
    message = b'\n'.join(self.buffer) + b'\n'
Lib\site-packages\twisted\python\text.py:146:
    return '\n'.join(lines)+'\n'
Lib\site-packages\twisted\test\test_iutils.py:40:
    scriptFile.write(os.linesep.join(sourceLines) + os.linesep)
Lib\site-packages\win32comext\adsi\demos\scp.py:350:
    description = __doc__ + "\ncommands:\n" + "\n".join(arg_descs) + "\n"
Lib\site-packages\wx\py\crust.py:259:
    self.SetValue('\n'.join(hist) + '\n')
Lib\site-packages\wx\py\introspect.py:342:
    command = terminator.join(pieces[:-1]) + terminator
Lib\site-packages\zope\interface\document.py:78:
    return "\n\n".join(r) + "\n\n"
Lib\test\test_nntplib.py:495:
    lit = "\r\n".join(lit.splitlines()) + "\r\n"
Lib\test\test_univnewlines.py:24:
    DATA_LF = "\n".join(DATA_TEMPLATE) + "\n"
Lib\test\test_univnewlines.py:25:
    DATA_CR = "\r".join(DATA_TEMPLATE) + "\r"
Lib\test\test_univnewlines.py:26:
    DATA_CRLF = "\r\n".join(DATA_TEMPLATE) + "\r\n"
Lib\test\test_tools\test_pindent.py:33:
    return '\n'.join(line.lstrip() for line in data.splitlines()) + '\n'

SEPARATOR ADDED AT BOTH ENDS:

Lib\pydoc.py:1582:
    sys.stdout.write('\n' + '\n'.join(lines[r:r+inc]) + '\n')
Lib\site-packages\office365\runtime\odata\odata_batch_request.py:129:
    buffer = eol + eol.join(lines) + eol
Lib\test\test_generators.py:1424:
    print("|" + "|".join(squares) + "|")
Lib\test\test_generators.py:1620:
    print("|" + "|".join(row) + "|")