LGTM. I had the same idea when I added the specialized 8-bit opcodes.
Some optimizations (like constant folding) are worth moving one step earlier, to the AST.
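To illustrate what folding at the AST level might look like, here is a minimal sketch using the stdlib `ast` module. The names `ConstantFolder` and `fold_constants` are hypothetical, only `+`, `-` and `*` on numeric literals are handled, and this is not how the compiler does or would do it, just the general shape of the idea.

```python
import ast
import operator

# Only a few safe arithmetic operators are folded in this sketch.
_FOLDABLE = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
}


class ConstantFolder(ast.NodeTransformer):
    """Hypothetical AST pass: replace `const OP const` with its value."""

    def visit_BinOp(self, node):
        # Fold children first so nested expressions collapse bottom-up.
        self.generic_visit(node)
        op = _FOLDABLE.get(type(node.op))
        if (
            op is not None
            and isinstance(node.left, ast.Constant)
            and isinstance(node.right, ast.Constant)
            and isinstance(node.left.value, (int, float))
            and isinstance(node.right.value, (int, float))
        ):
            try:
                value = op(node.left.value, node.right.value)
            except Exception:
                # Leave the expression alone if evaluation fails.
                return node
            return ast.copy_location(ast.Constant(value=value), node)
        return node


def fold_constants(source):
    """Parse an expression and return its AST with constants folded."""
    tree = ast.parse(source, mode="eval")
    return ast.fix_missing_locations(ConstantFolder().visit(tree))
```

With this sketch, `fold_constants("x + 2 * 3")` leaves the outer addition in place but replaces `2 * 3` with a single constant node, so the later stages never see the multiplication.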
Another idea: instead of EXTENDED_ARG, have two sets of instructions: short 16-bit ones with an 8-bit arg, and long 32-bit ones with a 24-bit arg. For simplicity, initially only long instructions would be emitted (if it makes sense).
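A rough sketch of the short/long encoding, to make the proposal concrete. Everything beyond the 16-bit and 32-bit sizes is an assumption made for illustration: the high bit of the opcode byte (`LONG_FLAG`) marking the long form, the little-endian arg, and the `encode`/`decode` helpers themselves.

```python
# Assumption: the high bit of the opcode byte selects the long form.
LONG_FLAG = 0x80


def encode(op, arg, prefer_short=True):
    """Encode one instruction as 2 bytes (short) or 4 bytes (long)."""
    if not 0 <= op < LONG_FLAG:
        raise ValueError("opcode must fit in 7 bits in this sketch")
    if not 0 <= arg < (1 << 24):
        raise ValueError("arg must fit in 24 bits")
    if prefer_short and arg < 0x100:
        # Short form: 8-bit opcode + 8-bit arg.
        return bytes([op, arg])
    # Long form: 8-bit opcode + 24-bit arg (little-endian, assumed).
    return bytes([op | LONG_FLAG]) + arg.to_bytes(3, "little")


def decode(code):
    """Decode a byte string into a list of (op, arg) pairs."""
    result = []
    i = 0
    while i < len(code):
        op = code[i]
        if op & LONG_FLAG:
            arg = int.from_bytes(code[i + 1:i + 4], "little")
            result.append((op & 0x7F, arg))
            i += 4
        else:
            result.append((op, code[i + 1]))
            i += 2
    return result
```

Passing `prefer_short=False` corresponds to the "emit only long instructions initially" variant: every instruction is then 4 bytes, and the short forms can be introduced later without changing the decoder.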