An idiom for code generation with exec
eliben at gmail.com
Fri Jun 20 21:44:52 CEST 2008
On Jun 20, 3:19 pm, George Sakkis <george.sak... at gmail.com> wrote:
> On Jun 20, 8:03 am, eliben <eli... at gmail.com> wrote:
> > On Jun 20, 9:17 am, Bruno Desthuilliers <bruno.
> > 42.desthuilli... at websiteburo.invalid> wrote:
> > > eliben wrote:
> > > > Hello,
> > > > In a Python program I'm writing I need to dynamically generate
> > > > functions[*]
> > > (snip)
> > > > [*] I know that each time a code generation question comes up people
> > > > suggest that there's a better way to achieve this, without using exec,
> > > > eval, etc.
> > > Just to make things clear: you do know that you can dynamically build
> > > functions without exec, don't you?
> > Yes, but the other options for doing so are significantly less
> > flexible than exec.
> > > > But in my case, for reasons too long to fully lay out, I
> > > > really need to generate non-trivial functions with a lot of hard-coded
> > > > actions for performance.
> > > Just out of curiosity: could you tell a bit more about your use case
> > > and what makes a simple closure not an option ?
> > Okay.
> > I work in the field of embedded programming, and one of the main uses
> > I have for Python (and previously Perl) is writing GUIs for
> > controlling embedded systems. The communication protocols are usually
> > ad-hoc messages (header, footer, data, crc) built on top of serial
> > communication (RS232).
> > The packets that arrive have a known format. For example (YAMLish
> > syntax):
> > packet_length: 10
> > fields:
> >   - name: header
> >     offset: 0
> >     length: 1
> >   - name: time_tag
> >     offset: 1
> >     length: 1
> >     transform: val * 2048
> >     units: ms
> >   - name: counter
> >     offset: 2
> >     length: 4
> >     bytes-msb-first: true
> >   - name: bitmask
> >     offset: 6
> >     length: 1
> >     bit_from: 0
> >     bit_to: 5
> >   ...
> > This shows only some of the capabilities. Fields have defined offsets and
> > lengths, can be only several bits long, can have defined
> > transformations and units for convenient display.
> > I have a program that should receive such packets from the serial port
> > and display their contents in tabular form. I want the user to be able
> > to specify the format of his packets in a file similar to above.
> > Now, in previous versions of this code, written in Perl, I found out
> > that the procedure of extracting field values from packets is very
> > inefficient. I've rewritten it using a dynamically generated procedure
> > for each field, that does hard coded access to its data. For example:
> > def get_counter(packet):
> >     data = packet[2:6]
> >     data.reverse()
> >     return data
> > This gave me a huge speedup, because each field now had its specific
> > function sitting in a dict that quickly extracted the field's data
> > from a given packet.
> It's still not clear why the generic version is so much slower, unless you
> extract only a few selected fields, not all of them. Can you post a
> sample of how you used to write it without exec to clarify where the
> inefficiency comes from ?
The generic version has to make a lot of decisions at runtime, based
on the format specification.
Extract the offset from the spec, extract the length. Is it msb-
first ? Then reverse. Are specific bits required ? If so, do bit
operations. Should bits be reversed ? etc.
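To make that concrete, here is a minimal sketch of what such a generic extractor might look like. It is not my actual code: the dict layout, the `msb_first` key and the name `extract_generic` are made up for illustration, and I'm assuming `bit_to` is inclusive.

```python
# Hypothetical spec layout, loosely mirroring the YAML-like format quoted above.
SPEC = {
    "counter": {"offset": 2, "length": 4, "msb_first": True},
    "bitmask": {"offset": 6, "length": 1, "bit_from": 0, "bit_to": 5},
}


def extract_generic(spec, name, packet):
    """Extract one field, consulting the spec again on every call."""
    field = spec[name]                      # dict lookup per call
    offset = field["offset"]                # another lookup
    length = field["length"]                # and another
    data = list(packet[offset:offset + length])
    if field.get("msb_first"):              # runtime decision
        data.reverse()
    if "bit_from" in field:                 # runtime decision
        lo, hi = field["bit_from"], field["bit_to"]
        mask = (1 << (hi - lo + 1)) - 1     # assumes bit_to is inclusive
        data = [(data[0] >> lo) & mask]
    return data


packet = [0xAA, 0x01, 0x11, 0x22, 0x33, 0x44, 0xFF, 0x00, 0x00, 0x00]
counter = extract_generic(SPEC, "counter", packet)
bitmask = extract_generic(SPEC, "bitmask", packet)
```

Every call repeats the same lookups and branches, even though the answers never change for a given field.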
A dynamically generated function doesn't have to make any decisions -
everything is hard coded in it, because these decisions have been made
at compile time. This can save a lot of dict accesses and conditions,
and results in a speedup.
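Roughly, the generated-code approach could be sketched like this. Again, this is only an illustration under assumptions: `make_extractor`, the `msb_first` key and the dict layout are hypothetical names, and `bit_to` is taken to be inclusive.

```python
def make_extractor(name, field):
    """Build the source of a field-specific function and compile it with exec."""
    start = field["offset"]
    end = start + field["length"]
    lines = [
        f"def get_{name}(packet):",
        f"    data = list(packet[{start}:{end}])",   # offsets are hard coded
    ]
    # All decisions are taken here, once, while generating the source.
    if field.get("msb_first"):
        lines.append("    data.reverse()")
    if "bit_from" in field:
        lo, hi = field["bit_from"], field["bit_to"]
        mask = (1 << (hi - lo + 1)) - 1              # assumes bit_to is inclusive
        lines.append(f"    data = [(data[0] >> {lo}) & {mask}]")
    lines.append("    return data")

    namespace = {}
    exec("\n".join(lines), namespace)                # defines get_<name> in namespace
    return namespace[f"get_{name}"]


# Hypothetical spec layout, loosely mirroring the YAML-like format quoted above.
SPEC = {
    "counter": {"offset": 2, "length": 4, "msb_first": True},
    "bitmask": {"offset": 6, "length": 1, "bit_from": 0, "bit_to": 5},
}

# One specialized function per field, kept in a dict for quick dispatch.
extractors = {name: make_extractor(name, field) for name, field in SPEC.items()}

packet = [0xAA, 0x01, 0x11, 0x22, 0x33, 0x44, 0xFF, 0x00, 0x00, 0x00]
counter = extractors["counter"](packet)
bitmask = extractors["bitmask"](packet)
```

The generated `get_counter` contains no spec lookups and no conditionals; whether that actually pays off is something to measure for the packet formats at hand.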
I guess this is not much different from Lisp macros - making decisions
at compile time instead of run time and saving performance.