You are right. I made a thinko.
List construction from an iterator is O(N), just as `sum(1 for _ in it)` is. Both of them need to march through every element. But in terms of the constant factor, just constructing the list should be faster than doing an addition per element (Python's list append is amortized O(1) because of smart dynamic memory pre-allocation).
So the "just read the iterator" approach is about 2-3 times faster than "read-then-accumulate". Of course, if the run-lengths are LARGE, we can start worrying about the extra memory allocation needed as a tradeoff. Your sum uses constant memory.
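If you want to check the ratio on your own machine, here is a rough sketch. The helper names (`count_via_list`, `count_via_sum`, `run_lengths`, `compare`) are made up for illustration, and I am assuming the runs come from `itertools.groupby`; the actual timings will vary with Python version and input size.

```python
import timeit
from itertools import groupby


def count_via_list(it):
    """Count items by materializing them; needs O(N) extra memory."""
    return len(list(it))


def count_via_sum(it):
    """Count items by accumulating; needs only constant extra memory."""
    return sum(1 for _ in it)


def run_lengths(data, counter):
    """Return (value, run length) pairs, counting each run with `counter`."""
    # Each group is consumed by `counter` before groupby advances to the next key.
    return [(key, counter(group)) for key, group in groupby(data)]


def compare(n=100_000, repeat=20):
    """Time both counters on fresh iterators of n items; returns total seconds."""
    t_list = timeit.timeit(lambda: count_via_list(iter(range(n))), number=repeat)
    t_sum = timeit.timeit(lambda: count_via_sum(iter(range(n))), number=repeat)
    return t_list, t_sum


if __name__ == "__main__":
    print(run_lengths("aaabccdddd", count_via_list))
    t_list, t_sum = compare()
    print(f"len(list(it)): {t_list:.4f}s   sum(1 for _ in it): {t_sum:.4f}s")
```

Both counters give the same run lengths; only the time and memory differ, so swapping one for the other in `run_lengths` is a one-argument change.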