A few notes:

1) Amara 0.9.2 had some crufty code that (unbeknownst to me) was killing
performance.  I removed it for 0.9.3 and saw a nearly 15X speedup.  Fair
benchmarks should really start with 0.9.3.

2) Amara will never compete with cElementTree in raw performance until I
somehow conjure up the time to write it in C.  I wouldn't hold my breath
waiting for that.  Winning the raw benchmark race is not my intention
with Amara.  My goal is rather combining maximum Python idiom with
maximum declarative power.  I'll get another dramatic speedup in Amara
(3X-4X in my sandbox) once the next 4Suite release is out and I switch
to Jeremy Kloth's super-fast low-level C/Domlette/SAX implementation,
but even then I expect cElementTree to be faster.

Frankly, the speed of cElementTree amazes me.

