[Tutor] s.insert(i, x) explanation in docs for Python 3.4 confusing to me
Cameron Simpson
cs at zip.com.au
Sun Jan 17 18:27:00 EST 2016
On 17Jan2016 14:50, Alex Kleider <akleider at sonic.net> wrote:
>Again, a personal thank you.
No worries.
>More often than not, when answering one thing, you teach me about
>other things. The 'thing' thing is only the latest. Of course
>I knew that using a name bound to a collection would affect the
>contents of the collection but it would never have occurred to me
>to use it to advantage as you describe.
Note that it shouldn't be overdone. Pointless optimisation is also a source of
bugs and obfuscation.
As I remarked, I'll do this for readability if needed or for performance where
"things" (or whatever) is going to be heavily accessed (eg in a busy loop),
such that getting-it-from-the-instance might significantly affect the
throughput. For many things you may as well not waste your time with it - the
direct expression of the problem is worth it for maintainability.
It is largely when I'm doing something low level that I bother, where
interpreted or high level languages like Python lose because their data
abstraction or cost-of-operation may exceed the teensy teensy task being
performed.
As an example, I'm rewriting some code right now that scans bytes objects for
boundaries in the data. It is inherently a sequential "get a byte, update some
numbers and consult some tables, get the next byte ...". It is also heavily
used in the storage phase of this application: all new data is scanned this
way.
In C one might be going:
    for (char *cp=buffer; buflen > 0; cp++, buflen--) {
        b = *cp;    /* the current byte */
        ... do stuff with "b" ...
    }
That will be efficiently compiled to machine code and run flat out at your
CPU's native speed. The Python equivalent, which would look a bit like this:
    for offset in range(len(bs)):
        b = bs[offset]
        ... do stuff with "b" ...
or possibly:
    for offset, b in enumerate(bs):
        ... do stuff with "b" ...
will inherently be slower because in Python "bs", "b" and "offset" may be any
type, so Python must do all this through object methods, and CPython (for
example) compiles to opcodes for a logical machine, which are themselves
interpreted by C code to decide what to do. You can see that the ratio of
"implement the Python operations" to the core "do something with a byte" stuff
can potentially be very large.
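To get a feel for that ratio, the standard "dis" module will list the opcodes CPython generates for a loop. This is only a rough sketch: "sum_bytes" is a made-up stand-in for the "do stuff with b" part, and the exact opcodes vary between CPython versions.

```python
import dis


def sum_bytes(bs):
    """Hypothetical stand-in for "do stuff with b": add up the byte values."""
    total = 0
    for b in bs:        # iterating a bytes object yields ints in Python 3
        total += b
    return total


# Collect the names of the opcodes CPython compiled the function into.
opnames = [ins.opname for ins in dis.get_instructions(sum_bytes)]

print(len(opnames), "opcodes for a loop whose real work is one addition")
print(sorted(set(opnames)))
```

Every one of those opcodes is dispatched by the interpreter's C code, once per byte for the ones inside the loop.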
At some point I may be writing a C extension for these very busy parts of the
code but for now I am putting some effort into making the pure Python
implementation as efficient as I can while keeping it correct. This is an
occasion when it is worth expending significant effort on minimising
indirection (using "things" instead of "self.things", etc), because that
indirection has a cost that _in this case_ will be large compared to the core
operations.
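As a rough illustration (not my actual scanning code), here is the shape of that change. The "Scanner" class, its "table" attribute and the arithmetic are all invented for the example; "table" plays the role of "things".

```python
import timeit


class Scanner:
    """Hypothetical class, invented for this example."""

    def __init__(self):
        # A 256-entry lookup table, one value per possible byte.
        self.table = [(i * 31) & 0xFF for i in range(256)]

    def scan_indirect(self, bs):
        # Fetches "table" from the instance on every iteration.
        h = 0
        for b in bs:
            h = (h + self.table[b]) & 0xFFFF
        return h

    def scan_local(self, bs):
        # Bind the attribute to a local name once, outside the loop.
        table = self.table
        h = 0
        for b in bs:
            h = (h + table[b]) & 0xFFFF
        return h


scanner = Scanner()
data = bytes(range(256)) * 40

# Both versions must compute the same thing; only the lookups differ.
same = scanner.scan_indirect(data) == scanner.scan_local(data)
print("same result:", same)

# Timings depend on the machine and the Python version, so measure yourself.
t_indirect = timeit.timeit(lambda: scanner.scan_indirect(data), number=20)
t_local = timeit.timeit(lambda: scanner.scan_local(data), number=20)
print("self.table: %.4fs  local table: %.4fs" % (t_indirect, t_local))
```

Whether the difference matters depends entirely on how little work the loop body does; measure before committing to the less direct form.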
Cheers,
Cameron Simpson <cs at zip.com.au>