I suspect that you'd do better here if you removed a bunch of layers from the conversion functions. Right now it looks like: imap->chain->convert_row->tuple->generator->izip. That's five levels deep, and Python function calls are reasonably expensive. I would try to be a lot less clever and do something like:
    def data_iterator(row_iter, delim):
        # Use the first row to work out one conversion function per column.
        row0 = next(row_iter).split(delim)
        converters = find_formats(row0)  # left as an exercise
        yield tuple(f(x) for f, x in zip(converters, row0))
        for row in row_iter:
            yield tuple(f(x) for f, x in zip(converters, row.split(delim)))
That's just a sketch and I haven't timed it, but it cuts a few levels out of the call chain, so it has a reasonable chance of being faster. If you wanted to be really clever, you could use some exec magic after you figure out the conversion functions to compile a special function that generates the tuples directly, without any calls to tuple or zip. I don't have time to work through the details right now, but the code you would compile would end up looking like this:
    for (x0, x1, x2) in row_iter:
        yield (int(x0), float(x1), float(x2))
Here we've assumed that find_formats determined that there are three fields, an int and two floats. Once you have this info you can build an appropriate function and exec it. This would cut another couple levels out of the call chain. Again, I haven't timed it, or tried it, but it looks like it would be fun to try.
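For what it's worth, here is one way the exec idea might be fleshed out. It is a minimal, untimed sketch, not a definitive implementation. The body of find_formats is a hypothetical stand-in (it just tries int, then float, then falls back to str), and compile_row_converter and the generated names c0, x0, etc. are made up for illustration. It also assumes every row has the same number of fields as the first one, and that there is at least one row.

```python
def find_formats(fields):
    """Hypothetical stand-in: choose int, float, or str for each field."""
    converters = []
    for field in fields:
        for conv in (int, float):
            try:
                conv(field)
            except ValueError:
                continue  # this type does not parse; try the next one
            converters.append(conv)
            break
        else:
            converters.append(str)  # nothing numeric parsed; keep the text
    return converters


def compile_row_converter(converters, delim):
    """Use exec to build a generator function specialised for these converters."""
    names = ["x%d" % i for i in range(len(converters))]
    calls = ["c%d(%s)" % (i, name) for i, name in enumerate(names)]
    # For (int, float, float) the generated source looks like:
    #   def convert_rows(row_iter):
    #       for row in row_iter:
    #           x0, x1, x2, = row.split(delim)
    #           yield (c0(x0), c1(x1), c2(x2),)
    source = (
        "def convert_rows(row_iter):\n"
        "    for row in row_iter:\n"
        "        %s, = row.split(delim)\n"
        "        yield (%s,)\n"
    ) % (", ".join(names), ", ".join(calls))
    # The generated function looks up delim and c0, c1, ... as its globals.
    namespace = {"delim": delim}
    for i, conv in enumerate(converters):
        namespace["c%d" % i] = conv
    exec(source, namespace)
    return namespace["convert_rows"]


def data_iterator(row_iter, delim):
    row_iter = iter(row_iter)
    first = next(row_iter)  # assumes there is at least one row
    converters = find_formats(first.split(delim))
    convert_rows = compile_row_converter(converters, delim)
    # Convert the first row with the same compiled function, then the rest.
    for converted in convert_rows([first]):
        yield converted
    for converted in convert_rows(row_iter):
        yield converted
```

Usage would be something like list(data_iterator(open_file, ",")). The converters are passed in through the exec namespace rather than spelled out by name in the source, so the same trick works for any callable, not only builtins like int and float.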
-tim