Performance: sets vs dicts.

Wed Sep 1 09:46:26 EDT 2010

On 09/01/10 00:09, Aahz wrote:
> In article <mailman.239.1283257496.29448.python-list at python.org>,
> Jerry Hill  <malaclypse2 at gmail.com> wrote:
>> On Mon, Aug 30, 2010 at 7:42 PM, Aahz <aahz at pythoncraft.com> wrote:
>>>
>>> Possibly; IMO, people should not need to run timeit to determine basic
>>> algorithmic speed for standard Python datatypes.
>>
>> http://wiki.python.org/moin/TimeComplexity takes a stab at it.  IIRC,
>> last time this came up, there was some resistance to making promises
>> about time complexity in the official docs, since that would make
>> those numbers part of the language, and thus binding on other
>> implementations.
> 
> I'm thoroughly aware of that page and updated it yesterday to make it
> easier to find.  ;-)
> 
> However, I think there are some rock-bottom basic guarantees we can make
> regardless of implementation.  Does anyone seriously think that an
> implementation would be accepted that had anything other than O(1) for
> index access into tuples and lists?  Dicts that were not O(1) for access
> with non-pathological hashing?  That we would accept sets having O()
> performance worse than dicts?
> 
> I suggest that we should agree on these guarantees and document them in
> the core.

While I think documenting them would be great for all programmers that
care about practical and theoretical execution speed; I think including
these implementation details in core documentation as a "guarantee"
would be a bad idea for the reasons Terry outlined.

One way of resolving that is by having two documentations (or two
separate sections in the documentation) for:
- Python -- the language -- documenting Python as an abstract language,
this is the documentation which can be shared across all Python
implementations. This will also be the specification for Python Language
which other implementations will be measured to.
- CPython -- the Python interpreter -- documents implementation details
and performance metrics. It should be properly noted that these are not
part of the language per se. This will be the playground for CPython
experts that need to fine tune their applications to the last drop of
blood and don't mind their application going nuts with the next release
of CPython.