[pypy-dev] Converting from Python strings to C strings slow?
Martin C. Martin
martin at martincmartin.com
Sun Jan 27 09:51:46 CET 2008
Carl Friedrich Bolz wrote:
> Martin C. Martin wrote:
>>
>> Thanks, but I'm not trying to write a standalone program, I need to
>> call some 3rd party libraries. For example, the string comes from one
>> of a couple dozen of socket connections, managed by Twisted. So I
>> just want my inner loop in RPython. The inner loop turns XML into a
>> MySQL statement, which the main python program can then send to a
>> database.
>>
>> So I need to get a big string into RPython, and a smaller (but still
>> pretty big) string out of it.
>
> Couldn't you just use a subprocess, read the string from stdin and write
> the result to stdout? It's quite likely that this is not slower than the
> way strings are passed in and out now and has many advantages. You would
> need to use os.read and os.write, since sys.stdin/stdout is not
> supported in RPython, but apart from that it should work fine.
>
> One of them is that if you use the Translation class, your RPython
> program will use reference counting, which is our slowest GC. If you use
> a subprocess you get the benefits of our much better generational GC.
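For reference, the subprocess approach suggested above could be sketched roughly as follows. This is only an illustration, not code from PyPy: the helper names read_all, write_all and run_worker are made up here, and the worker command line is whatever the translated RPython executable turns out to be. The worker side sticks to os.read and os.write on file descriptors 0 and 1, since sys.stdin/stdout are not available in RPython. It is written as ordinary Python and has not been checked against the RPython translator.

```python
import os
import subprocess


def read_all(fd, chunk_size=65536):
    """Worker side: read from a file descriptor until EOF using os.read."""
    chunks = []
    while True:
        chunk = os.read(fd, chunk_size)
        if not chunk:  # an empty result means the other end closed the pipe
            break
        chunks.append(chunk)
    return b"".join(chunks)


def write_all(fd, data):
    """Worker side: os.write may write only part of the data, so loop."""
    while data:
        written = os.write(fd, data)
        data = data[written:]


def run_worker(argv, payload):
    """Parent side: send payload to the worker's stdin, return its stdout.

    argv is the (hypothetical) command line of the worker executable.
    """
    proc = subprocess.Popen(argv, stdin=subprocess.PIPE, stdout=subprocess.PIPE)
    output, _ = proc.communicate(payload)
    return output
```

The worker's main function would then be something like write_all(1, convert(read_all(0))), where convert stands in for the XML-to-SQL inner loop.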
What I'm really looking for is a way to write most of my applications in
a dynamic language (because it's more productive to write & maintain),
then if and when performance is a problem, have some way to speed it up.
PyPy promises to do this even before performance is a problem, which
will be great!
Until that comes, I was hoping for a language where I could give some
hints to the compiler or runtime to speed it up. Things like "although
this binding could change each time through the loop, it doesn't
actually change, so there's no need to do a hash lookup for every
access." Or "this variable is always an int."
The only language I know of that can do that is Lisp, which is a strong
possibility. But Lisp's syntax is more verbose and low-level than that
of modern dynamic languages, it has fewer libraries, and it lacks both
an IDE with autocompletion and a good source-level debugger. I had
hoped Groovy would be like that, with its optional typing and
Java-inspired syntax and semantics, but sadly, the developers valued
dynamism, however rarely used, over performance.
So the next best thing is to rewrite the performance critical parts in
some other language. I had hoped RPython would be that language for
Python, but it turns out not to be. I could rewrite in C++, but the
semantics of C++ are very different from Python's, so interfacing the two
becomes verbose and awkward. The ctypes module looks good for calling C
libraries that weren't originally designed to work with Python. But it
doesn't have a good way (or any way?) to manipulate Python objects from
C. Even Java's JNI makes for a lot of boilerplate code to translate
back and forth.
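To make the ctypes point concrete, here is a minimal sketch of the direction it does handle well: calling an existing C function from Python. It assumes a POSIX system where the C library can be loaded this way, and c_strlen is just an illustrative wrapper name. The Python bytes object is passed to C as a char* with no glue code, but nothing here lets the C side create or manipulate Python objects.

```python
import ctypes
import ctypes.util

# Load the C library (assumption: POSIX; on such systems CDLL also
# accepts None and then resolves symbols from the running process).
libc = ctypes.CDLL(ctypes.util.find_library("c"))

# Declare the C signature: size_t strlen(const char *s);
libc.strlen.argtypes = [ctypes.c_char_p]
libc.strlen.restype = ctypes.c_size_t


def c_strlen(data):
    """Return the length of a bytes string as computed by C's strlen.

    Note that strlen stops at the first NUL byte, unlike Python's len().
    """
    return libc.strlen(data)
```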
So it looks like my best bet may be Groovy, which interacts with Java
seamlessly. A year ago, when I last checked, the IDEs weren't up to the
job, but maybe that's changed.
And once PyPy is done, that may be an even better solution.
Best,
Martin