The general problem with the ceval switch statement is that it is too big. Adding new opcodes will only make it bigger, so I doubt that much can be gained in general by trying to come up with new do-everything-in-one-opcode cases.
The last point is probably compiler dependent. GCC tends to lay out the assembler code in the same order as the C source, so placing frequently executed cases near the top of the switch results in better locality (at least on my machines).
My experience with gcc (on x86) is that it uses a lookup table for switch statements with contiguous case values rather than a long chain of compares/branches. A quick look at the assembler output from ceval.c suggests it's using a lookup table there too. What architecture did you observe this on?
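For anyone who wants to check this on their own compiler, here is a minimal sketch of a dispatch loop with contiguous case values. The opcode names and the run() function are made up for illustration; this is not the real ceval.c code, just something small enough to read the assembler output for.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical opcodes, numbered contiguously from 0 (not CPython's real set). */
enum { OP_PUSH1, OP_ADD, OP_DUP, OP_POP, OP_HALT };

/* Run a tiny stack program; return the top of the stack at OP_HALT,
   0 if the stack is empty, or -1 on an unknown opcode. */
static int run(const unsigned char *code)
{
    int stack[64];
    size_t sp = 0;

    for (;;) {
        /* Dense case values: the compiler is free to dispatch through a
           jump table here instead of a chain of compares. */
        switch (*code++) {
        case OP_PUSH1:
            assert(sp < 64);
            stack[sp++] = 1;
            break;
        case OP_ADD:
            assert(sp >= 2);
            sp--;
            stack[sp - 1] += stack[sp];
            break;
        case OP_DUP:
            assert(sp >= 1 && sp < 64);
            stack[sp] = stack[sp - 1];
            sp++;
            break;
        case OP_POP:
            assert(sp >= 1);
            sp--;
            break;
        case OP_HALT:
            return sp ? stack[sp - 1] : 0;
        default:
            return -1;
        }
    }
}
```

Compiling this with something like gcc -O2 -S and looking for an indirect jump through a table of addresses (as opposed to a sequence of cmp/branch pairs) should show which strategy was chosen. Whether you get a table will depend on the compiler version, the target and the optimization flags, so treat the result as specific to your setup.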