Embedded Python in C application
Hi everyone,
First, my apologies if I'm in the wrong forum for my "embedding Python in a C application" questions. Please redirect me if I've wandered into the wrong place.
I have two needs for using Python in my application that I hope have easy answers without rewriting Python's internals.
I need to use Python* in a multi-threaded application, where separate threads may be running very long-lived Python scripts while other threads run short ones. None of the concurrently running Python scripts shares any state with the others. The number of threads is in the 100-1000 range.
I need to manage Python's use of the heap by providing a memory pool for Python to use, rather than allowing Python to use malloc/free. This is to prevent memory fragmentation, and to allow easy disposal of a memory pool used for a closed Python interpreter instance.
A quick look at Py_Initialize() indicates that Python does not return some sort of "Py_State" pointer representing the entire state of a Python interpreter (nor is there some sort of Py_Alloc()), and it does not accept custom malloc/free function pointers. Hmmm.
Does anyone have experience with using Python in this fashion?
(If relevant, it will be Python 3.x not Python 2.x.)
Thanks, --Eljay
* Doesn't HAVE to be Python. Could be JavaScript or Lua or whatnot. My preference these days is a Python solution.
2008/9/26 Eljay Love-Jensen <eljay@adobe.com>
I need to manage Python's use of the heap by providing a memory pool for Python to use, rather than allowing Python to use malloc/free. This is to prevent memory fragmentation, and to allow easy disposal of a memory pool used for a closed Python interpreter instance.
A quick view of Py_Initialize() indicates that Python does not return some sort of "Py_State" pointer which represents the entire state of a Python interpreter. (Nor some sort of Py_Alloc().) Nor accepts a custom malloc/free function pointers. Hmmm.
Python already has its own highly optimized memory allocator; it does not use malloc/free directly. That's why the configure option --without-pymalloc exists.
So I think your basic premise is wrong. But in any case maybe you are looking for PyInterpreterState_New(). But beware that going down that path is going to be painful: multiple interpreter states and threading can lead to many hours of debugging. I would think thrice before deciding I really need it.
Does anyone have experience with using Python in this fashion?
I remember trying to debug xchat python plugin interface which used multiple interpreter states and multiple threads. I wish I could forget those horrors...
capi-sig mailing list capi-sig@python.org http://mail.python.org/mailman/listinfo/capi-sig
-- Gustavo J. A. M. Carneiro INESC Porto, Telecommunications and Multimedia Unit "The universe is always one step beyond logic." -- Frank Herbert
On Fri, Sep 26, 2008 at 11:18 PM, Gustavo Carneiro <gjcarneiro@gmail.com>wrote:
So I think your basic premise is wrong. But in any case maybe you are looking for PyInterpreterState_New(). But beware that going down that path is going to be painful: multiple interpreter states and threading can lead to many hours of debugging. I would think thrice before deciding I really need it.
But if Eljay is trying to actually create multiple threads to run the scripts simultaneously, then I guess he has much more to worry about than just PyInterpreterState. The API certainly does not take care of all of Python's global variables.
--Swapnil Talekar
On Fri, Sep 26, 2008 at 9:16 AM, Eljay Love-Jensen <eljay@adobe.com> wrote:
Don't use multiple interpreters. They're not really separate, they're buggy, they offer *NO* advantage to you over just using multiple threads.
Likewise, you can't force memory to be freed, as it'd still be in use by Python.
The only way to force cleanup is to spawn a subprocess. This'd also let you use multiple cores. You can probably mitigate the startup cost by having a given subprocess run several short scripts or one long script.
-- Adam Olsen, aka Rhamphoryncus
On Sat, Sep 27, 2008 at 3:02 AM, Adam Olsen <rhamph@gmail.com> wrote:
Don't use multiple interpreters. They're not really separate, they're buggy, they offer *NO* advantage to you over just using multiple threads.
They're buggy? Sure. They're not really separate? Well, if you want to have multiple threads running scripts, I don't see how you can get away without having multiple interpreters (in the same process), and they REALLY have to be separate. That's not an easy task, though. As I said, the separation has to be more than just separate PyInterpreterStates.
Likewise, you can't force memory to be freed, as it'd still be used by python.
The only way to force cleanup is to spawn a subprocess. This'd also let you use multiple cores. You can probably mitigate the startup cost by having a given subprocess run several short scripts or one long script.
Well, if you have your own memory manager, i.e. other than Python's, and you are embedding the interpreter in your application, I don't see any reason why you should not be able to clean up at any point you think appropriate. Is Python still using the memory? Sure it is. But not after it's done with the script.
-- Swapnil Talekar
On Fri, Sep 26, 2008 at 11:24 PM, Swapnil Talekar <swapnil.st@gmail.com> wrote:
They're buggy? Sure. They're not really separate? Well, if you want to have multiple threads running scripts, I don't see how you can get away without having multiple interpreters (in the same process), and they REALLY have to be separate. That's not an easy task, though. As I said, the separation has to be more than just separate PyInterpreterStates.
You must not be very familiar with threading. All you need to do is give each script its own *local* state and not modify any globals. No need for multiple interpreters.
All the subinterpreter API does is give each interpreter a separate copy of the modules, so poorly designed APIs that use global state can pretend they've got separate processes, without actually having separate processes. Rather obscure, and not useful for the OP.
Well, if you have your own memory manager, i.e. other than Python's, and you are embedding the interpreter in your application, I don't see any reason why you should not be able to clean up at any point you think appropriate. Is Python still using the memory? Sure it is. But not after it's done with the script.
If you free any memory like that, you'll have hosed the entire Python interpreter. After that there's nothing useful to do but exit the process, so you might as well exit in the first place.
-- Adam Olsen, aka Rhamphoryncus
And, to possibly make this a bit clearer for the OP: the basic problem is that a lot of state is kept in global variables. So the "interpreter state" is spread all over memory, and there's no way you can duplicate or free this without freeing the whole process.
-- Jack Jansen, <Jack.Jansen@cwi.nl>, http://www.cwi.nl/~jack
"If I can't dance I don't want to be part of your revolution" -- Emma Goldman
participants (5)
- Adam Olsen
- Eljay Love-Jensen
- Gustavo Carneiro
- Jack Jansen
- Swapnil Talekar