Hi,

We have been following the nightly builds of PyPy with our testing workload (first described in the "CFFI speed results" thread). The news is very good: the performance of PyPy + CFFI has gone up considerably (~30% faster) since the last time we wrote about it! Adding to that speed-up our own optimizations of the CFFI-based SQLite3 wrapper (MSPW) that we are developing, the end result is that most of our test queries now run at the same speed as, or faster than, CPython + APSW.

Unfortunately, one of the queries where PyPy is slower [*] than CPython + APSW is very central to all of our workflows, which means that we cannot fully convert to PyPy. The main culprit of PyPy's slowness is the conversion (encoding, decoding) between PyPy's unicodes and UTF-8. It is the only thing remaining at the top of our performance profiles, with a large share (~48%) of the time.

Right now we are using PyPy's "codecs.utf_8_encode" and "codecs.utf_8_decode" to do this conversion (a sketch of what we measure is appended below). Is there a faster way to do these conversions (encoding, decoding) in PyPy? Does CPython do something more clever than PyPy, such as storing unicodes whose content is pure ASCII in an ASCII representation?

Thank you very much,

lefteris.

[*] For 1M rows:
CPython + APSW: 10.5 sec
PyPy + MSPW: 15.5 sec
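
P.S. To make the question concrete, here is a minimal, self-contained micro-benchmark, a sketch rather than the actual MSPW code: it compares the codecs-module functions named above against the plain encode()/decode() methods for a UTF-8 round-trip. The string contents and iteration count are made up for illustration.

```python
import codecs
import timeit

# Illustrative payload: mostly ASCII with a few non-ASCII characters mixed in.
TEXT = u"mostly ascii text with a little unicode: \u03b1\u03b2\u03b3 " * 10
DATA = TEXT.encode("utf-8")

def via_codecs():
    # codecs.utf_8_encode / utf_8_decode each return a (result, length) tuple.
    encoded, _ = codecs.utf_8_encode(TEXT)
    decoded, _ = codecs.utf_8_decode(DATA)
    return encoded, decoded

def via_methods():
    # The same round-trip through the ordinary string methods.
    encoded = TEXT.encode("utf-8")
    decoded = DATA.decode("utf-8")
    return encoded, decoded

if __name__ == "__main__":
    for name, fn in [("codecs.utf_8_*", via_codecs),
                     ("encode/decode ", via_methods)]:
        t = timeit.timeit(fn, number=100000)
        print("%s: %.3f sec for 100000 round-trips" % (name, t))
```

Running something like this under both interpreters is how we see the conversion dominating the profile; if there is a cheaper entry point in PyPy for this, we would gladly switch to it.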