[pypy-dev] Converting from Python strings to C strings slow?
Carl Friedrich Bolz
cfbolz at gmx.de
Sat Jan 26 11:27:27 CET 2008
Martin C. Martin wrote:
>
> Maciek Fijalkowski wrote:
>> Martin C. Martin wrote:
>>> Hi,
>>>
>>> There seems to be a lot of overhead when passing a large string (23
>>> Meg) to C compiled RPython code. For example, this code:
>>>
>>> def small(text):
>>>     return 3
>>>
>>> t = Translation(small)
>>> t.annotate()
>>> t.rtype()
>>> f3 = t.compile_c()
>>>
>>> st = time.time()
>>> z = f3(xml)
>>> print time.time() - st
>>>
>>>
>> This is wrong; you should even get a warning. The proper command is
>> t.annotate([str]).
>
> Oops, yes, I've been working with variations of this all day, and I
> hadn't actually compiled & run the example in the email, although I'd
> done something equivalent.
>
>> Besides, this is not the official way of writing rpython standalone
>> programs.
>
> Thanks, but I'm not trying to write a standalone program, I need to call
> some third-party libraries. For example, the string comes from one of a
> couple dozen socket connections managed by Twisted. So I just want
> my inner loop in RPython. The inner loop turns XML into a MySQL
> statement, which the main python program can then send to a database.
>
> So I need to get a big string into RPython, and a smaller (but still
> pretty big) string out of it.
Couldn't you just use a subprocess, read the string from stdin and write
the result to stdout? It's quite likely that this is not slower than the
way strings are passed in and out now and has many advantages. You would
need to use os.read and os.write, since sys.stdin/stdout is not
supported in RPython, but apart from that it should work fine.
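In that spirit, the child side of the pipe might look like the sketch below. It stays within the os.read/os.write subset mentioned above; the process() function is a placeholder standing in for the real XML-to-SQL inner loop, and the buffer size is an arbitrary choice.

```python
import os

def read_all(fd):
    # Gather input with os.read in a loop until EOF; sys.stdin is
    # not supported in RPython, so raw file descriptors are used.
    chunks = []
    while True:
        data = os.read(fd, 65536)
        if not data:
            break
        chunks.append(data)
    return b"".join(chunks)

def process(xml):
    # Placeholder for the real inner loop (XML -> SQL statement);
    # here it just reports the input size so the sketch is self-contained.
    return b"got %d bytes" % len(xml)

def main():
    xml = read_all(0)           # fd 0 = stdin
    os.write(1, process(xml))   # fd 1 = stdout
```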
One of those advantages is that if you use the Translation class, your
RPython program will use reference counting, which is our slowest GC. As a
standalone subprocess it gets the benefits of our much better generational
GC.
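On the parent side, the round trip could be driven with the standard subprocess module, something like the sketch below. The binary path is hypothetical; it would be whatever the standalone RPython build produces.

```python
import subprocess

def run_inner_loop(binary, big_string):
    # Spawn the compiled RPython program, feed the large string on
    # its stdin, and collect the (smaller) result from its stdout.
    proc = subprocess.Popen([binary],
                            stdin=subprocess.PIPE,
                            stdout=subprocess.PIPE)
    out, _ = proc.communicate(big_string)
    return out
```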
Cheers,
Carl Friedrich