[New-bugs-announce] [issue13279] Add memcmp into unicode_compare for optimizing compares

Thu Oct 27 19:52:41 CEST 2011

New submission from Richard Saunders <RichIsMyName at mac.com>:

In discussions of memcmp performance (http://www.picklingtools.com/study.pdf),
it was noted how well Python 2.7 can take advantage of faster memcmp implementations (indeed, the rich comparisons are all memcmp calls).
There has been some discussion on python-dev at python.org as well.

With unicode and Python 3.3 (and any Python 3.x), there are a
few places we could call memcmp to make string comparisons faster, but they are not completely trivial.

Basically, if the unicode strings are "1 byte kind", then memcmp can be used almost as is.  If the unicode strings are the same kind, they can at least use memcmp to compare for equality or inequality.
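
A rough sketch of these two cases, under stated assumptions: the `ustr` struct and the function names `ucs1_compare` and `same_kind_equal` are hypothetical stand-ins for illustration, not CPython's actual object layout or API.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a unicode object; not CPython's real layout. */
typedef struct {
    int kind;          /* bytes per character: 1, 2 or 4 */
    size_t length;     /* number of characters */
    const void *data;  /* length * kind bytes of character data */
} ustr;

/* Ordering for two 1-byte-kind strings: memcmp on the common prefix,
   then the shorter string sorts first. Returns -1, 0 or 1. */
static int ucs1_compare(const ustr *a, const ustr *b)
{
    assert(a->kind == 1 && b->kind == 1);
    size_t n = a->length < b->length ? a->length : b->length;
    int c = memcmp(a->data, b->data, n);
    if (c != 0)
        return c < 0 ? -1 : 1;
    if (a->length == b->length)
        return 0;
    return a->length < b->length ? -1 : 1;
}

/* Equality for two strings of the same kind: the strings are equal
   exactly when their lengths and bytes match, so byte order inside each
   character does not matter. Returns 1 if equal, 0 otherwise. */
static int same_kind_equal(const ustr *a, const ustr *b)
{
    assert(a->kind == b->kind);
    if (a->length != b->length)
        return 0;
    return memcmp(a->data, b->data, a->length * (size_t)a->kind) == 0;
}
```

The ordering shortcut is limited to the 1-byte kind because memcmp compares byte by byte: on a little-endian machine, the byte order of 2-byte or 4-byte characters does not match code point order, so only equality and inequality are safe there.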

There is also a minor optimization lying in unicode_compare: if you
are comparing two strings for equality/inequality, there is no reason to look at the entire string if the lengths are different.
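
As an illustration of the length shortcut only, here is a minimal sketch. The `ustr32` type and both function names are hypothetical, and the character-by-character loop merely stands in for whatever full comparison the real code performs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical string type holding one 32-bit code point per character. */
typedef struct {
    size_t length;
    const uint32_t *chars;
} ustr32;

/* Full three-way comparison, character by character. Returns -1, 0 or 1. */
static int full_compare(const ustr32 *a, const ustr32 *b)
{
    size_t n = a->length < b->length ? a->length : b->length;
    for (size_t i = 0; i < n; i++) {
        if (a->chars[i] != b->chars[i])
            return a->chars[i] < b->chars[i] ? -1 : 1;
    }
    if (a->length == b->length)
        return 0;
    return a->length < b->length ? -1 : 1;
}

/* Equality test: strings of different lengths can never be equal, so
   answer immediately without touching the character data. */
static int equal_with_length_check(const ustr32 *a, const ustr32 *b)
{
    assert(a != NULL && b != NULL);
    if (a->length != b->length)
        return 0;
    return full_compare(a, b) == 0;
}
```

Without the early return, two long strings that share a long common prefix but differ in length would be scanned up to the end of the shorter one before the answer is known.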

These 3 minor optimizations can make unicode_compare faster.

----------
components: Interpreter Core
messages: 146508
nosy: RichIsMyName, asmodai, loewis, pitrou, scoder
priority: normal
severity: normal
status: open
title: Add memcmp into unicode_compare for optimizing compares
type: performance
versions: Python 3.3, Python 3.4

_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue13279>
_______________________________________