On 13.02.13 15:44, Nick Coghlan wrote:
On Wed, Feb 13, 2013 at 10:06 PM, Christian Tismer firstname.lastname@example.org wrote:
To avoid such hidden traps in larger code bases, documentation is needed that clearly gives a warning saying "don't do that", like CS students learn for most other languages.
How much more explicit do you want us to be?
"""6. CPython implementation detail: If s and t are both strings, some Python implementations such as CPython can usually perform an in-place optimization for assignments of the form s = s + t or s += t. When applicable, this optimization makes quadratic run-time much less likely. This optimization is both version and implementation dependent. For performance sensitive code, it is preferable to use the str.join() method which assures consistent linear concatenation performance across versions and implementations."""
So please don't blame us for people not reading a warning that is already there.
I don't, really. This was a cross-posting effect. I was using only the PyPy documentation, which mentions a lot of things, but this behavioral difference was missing. Python-dev was not addressed at all.
... Deliberately *relying* on the += hack to avoid quadratic runtime is just plain wrong, and our documentation already says so.
If anyone really thinks it will help, I can add a CPython implementation note back in to the Python 3 docs as well, pointing out that CPython performance measurements may hide broken algorithmic complexity related to string concatenation, but the corresponding note in Python 2 doesn't seem to have done much good :P
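For readers who have not seen the two idioms side by side, here is a minimal sketch of what the quoted documentation note is contrasting. The function names are made up for illustration. Whether the += version actually runs in quadratic time depends on the implementation and version, as the note says.

```python
def concat_with_plus(parts):
    """Build a string with repeated +=.

    CPython can often do this in place, but that is an implementation
    detail; elsewhere each += may copy the whole string built so far.
    """
    result = ""
    for part in parts:
        result += part
    return result


def concat_with_join(parts):
    """Build the same string with str.join(), which the docs recommend
    for consistent linear performance across implementations."""
    return "".join(parts)


if __name__ == "__main__":
    parts = [str(i) for i in range(1000)]
    # Both idioms produce the same string; only the cost model differs.
    assert concat_with_plus(parts) == concat_with_join(parts)
```

Both functions return identical results, which is why a test suite that only checks correctness will not reveal the difference.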
Well, while we are at it: Yes, it says so as a note at the end of http://docs.python.org/2/library/stdtypes.html#typesseq
I doubt that many people read that far, and they do not search the documentation on sequence types when they are adding some strings together. People seem to have a tendency to just try something out instead and see if it works. That even seems to get worse the better and bigger the Python documentation grows. ;-)
Maybe it would be a good idea to remove that concat optimization completely? Then people will wake up and read the docs to find out what's wrong ;-) No, that does not help, because their test cases will not cover reality.
----- Thinking a bit more about it.
If you are thinking about improving the docs, I don't believe it helps to make the very complete reference documentation even more complete. Completeness is great, don't get me wrong! But what people read is what pops right into their face, and I think that is what could be added.
I think that before getting people to work through long and complete documentation, it is probably easier to wake their interest with something like "Hey, are you doing things this way?" And then there is a short, concise list of bad and good things, maybe even dynamic as in WingWare's "Wing Tips" or any better approach.
From that easily reachable tabular collection of short hints and caveats, only a few pages long, there could be links to the existing, real documentation that explains things in more detail. Maybe that could be a way to get people to actually read.
Just an idea.
cheers - Chris
p.s.: Other nice attempts that don't seem to really work:
Some hints like http://docs.python.org/2/howto/doanddont.html are not bad, although that page is hidden in the HOWTO section and only addresses a few things. Also, the subtitle "in-depth documents on specific topics" is not what people look for in the first place while hacking on some code.
Looking things up in a quick ref like http://rgruet.free.fr/PQR27/PQR2.7.html is very concise, but it also does _not_ mention what to avoid. Others exist, like http://infohost.nmt.edu/tcc/help/pubs/python/web/
By the way, the first thing I find via Google is: http://www.python.org/doc/QuickRef.html which is quite funny (v1.3)