Mailman 3 [Twisted-Python] sharing a dict between child processes - Twisted


[Twisted-Python] sharing a dict between child processes


Waqar Khan

Nov. 5, 2019

10:19 p.m.

Hi,

I am writing a Twisted server. The server spawns multiple child processes using reactor.spawnProcess, each initialized with a process protocol. Each child process receives some REST requests and has a dict that acts as a cache. I want to share that dict across the processes.

In general, Python has SharedMemoryManager in the multiprocessing module, which would have helped: https://docs.python.org/3/library/multiprocessing.shared_memory.html#multipr...

But since I am using Twisted's internal process implementation, how do I share this dict across the processes so that they all use one common cache?

Thanks,
Waqar


Maarten ter Huurne

November 2019

6:21 a.m.

On Wednesday, 6 November 2019 07:19:56 CET Waqar Khan wrote:

> Hi, So, I am writing a twisted server. This server spawn multiple child processes using reactor spawnProcess that initializes a process protocol.
>
> Now, each of the childprocess receives some REST requests. Each process has a dict that acts as cache. Now, I want to share dict across processes. In general, python has SharedMemoryManager in multiprocessing module which would have helped. https://docs.python.org/3/library/multiprocessing.shared_memory.html#multiprocessing.managers.SharedMemoryManager.SharedMemory But since I am using twisted internal process implementation, how do I share this dict across the processes so that all the processes use this common cache?

Keeping a dictionary in SharedMemoryManager seems far from trivial. I don't think you can allocate arbitrary Python objects in the shared memory and even if you could, you would run into problems when one process mutates the dictionary while another is looking up something or also mutating it.

It could in theory work if you implement a custom lock-less dictionary, but that would be a lot of work and hard to get right. Also having shared memory mutations be synced between multiple CPU cores could degrade performance, since keeping core-local CPU caches in sync is expensive.

Would it be an option to have only one process accept the REST requests, check whether the result is in the cache and only distribute work to the other processes if you get a cache miss? Typically the case where an answer is cached is pretty fast, so perhaps you don't need multiple processes to handle incoming requests.

Bye,
Maarten
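To illustrate the first point, here is a minimal sketch (not from the thread) of what the standard library's shared memory actually stores: a flat block of bytes. It assumes Python 3.8 or later, and the helper names `publish_snapshot` and `read_snapshot` are hypothetical. A dict has to be serialized into the block, and every reader rebuilds its own private copy, so later mutations are not shared.

```python
import pickle
from multiprocessing import shared_memory


def publish_snapshot(cache: dict) -> shared_memory.SharedMemory:
    """Copy a pickled snapshot of `cache` into a new shared memory block."""
    payload = pickle.dumps(cache)
    # A 4-byte length prefix is stored first, because the block the OS
    # hands back may be larger than the size that was requested.
    shm = shared_memory.SharedMemory(create=True, size=4 + len(payload))
    shm.buf[:4] = len(payload).to_bytes(4, "big")
    shm.buf[4:4 + len(payload)] = payload
    return shm


def read_snapshot(name: str) -> dict:
    """Attach to the block by name and rebuild a private copy of the dict."""
    shm = shared_memory.SharedMemory(name=name)
    try:
        length = int.from_bytes(shm.buf[:4], "big")
        # pickle.loads creates brand-new objects in this process; nothing
        # in the returned dict lives in the shared block.
        return pickle.loads(bytes(shm.buf[4:4 + length]))
    finally:
        shm.close()
```

Changing the dict returned by `read_snapshot` does not touch the block, which is why a shared, mutable cache would need its own update protocol and locking on top of this.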

Waqar Khan

8:35 a.m.

Hi Maarten,

Thanks for the response. When you say "when one process mutates the dictionary while another is looking up something or also mutating it", do you mean that an individual key/value pair is being modified, or that the dict as a whole is being modified?

The first is not really a concern, as the key-to-value mapping is unique (all the processes will need the same value for the same key). So reads and writes to the dict don't really have to be "threadsafe" or anything like that. But the dict being modified, with the change made available to the rest of the processes, will be common.

The thing is, the major cost of our task is I/O. When a request comes in, we fetch some data and then cache it. Right now each process has its own cache, and that is very inefficient. One idea is to share the cache across processes. Does that make sense?

Thanks for the help.

_______________________________________________ Twisted-Python mailing list Twisted-Python@twistedmatrix.com https://twistedmatrix.com/cgi-bin/mailman/listinfo/twisted-python

Scott, Barry

8:38 a.m.

On Wednesday, 6 November 2019 14:21:22 GMT Maarten ter Huurne wrote:

> Keeping a dictionary in SharedMemoryManager seems far from trivial. I don't think you can allocate arbitrary Python objects in the shared memory and even if you could, you would run into problems when one process mutates the dictionary while another is looking up something or also mutating it.
>
> It could in theory work if you implement a custom lock-less dictionary, but that would be a lot of work and hard to get right. Also having shared memory mutations be synced between multiple CPU cores could degrade performance, since keeping core-local CPU caches in sync is expensive.
>
> Would it be an option to have only one process accept the REST requests, check whether the result is in the cache and only distribute work to the other processes if you get a cache miss? Typically the case where an answer is cached is pretty fast, so perhaps you don't need multiple processes to handle incoming requests.

We have used a couple of ways to cache:

1. Use a singleton process to hold the cache and ask it, via IPC, for answers from the other processes.
2. Have a cache in each process.

Barry


Waqar Khan

8:43 a.m.

Hi Barry,

Thanks for the response. Where can I read more about (1)? It seems like that is something I need to explore, as we already have (2) (a cache for each process).

Thanks again for your help.

Scott, Barry

9:07 a.m.

On Wednesday, 6 November 2019 16:43:52 GMT Waqar Khan wrote:

> Hi Barry, Thanks for the response. Where can I read more about (1). It seems like that is something I need to explore. As we already have (2) (cache for each process). Thanks again for your help.

We use UDS (Unix domain sockets) to talk to a master process. Twisted has support for this, but you need a small patch to avoid data loss.

UDS does not lose data and is message based, not byte based. We use pickle to encode requests and responses.

Barry

The patch is:

--- Twisted-18.4.0.orig/src/twisted/internet/unix.py.orig 2018-08-01 12:45:38.711115425 +0100
+++ Twisted-18.4.0/src/twisted/internet/unix.py 2018-08-01 12:45:47.946115123 +0100
@@ -509,11 +509,6 @@
                     return self.write(datagram, address)
                 elif no == EMSGSIZE:
                     raise error.MessageLengthError("message too long")
-                elif no == EAGAIN:
-                    # oh, well, drop the data. The only difference from UDP
-                    # is that UDP won't ever notice.
-                    # TODO: add TCP-like buffering
-                    pass
                 else:
                     raise

You then have to handle the EAGAIN error and do the retries yourself. As it stands, the patch is not good enough to put into Twisted, as a full fix would need to put the handling of the retries into Twisted.

I guess (2) does not work for you because the cache hit rate is low and you need to share the cache to get a benefit. Do cache entries only get used a few times?

In our case the hit rate is high (99%+) and we just pay the cost of populating the caches on process start-up, which ends up being noise.

Barry
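As a rough illustration of approach (1), the sketch below shows a pickled request/response exchange over a UNIX datagram socket using only the standard library `socket` module, not Twisted and not Barry's code. The message layout (`("get", key)` and `("put", key, value)` tuples), the function names, and `MAX_DATAGRAM` are all assumptions made for the example. It uses `socket.socketpair` so that it runs in one process; a real deployment would have the cache owner listening on a socket path, for which Twisted provides `reactor.listenUNIXDatagram` and `reactor.connectUNIXDatagram`.

```python
import pickle
import socket

MAX_DATAGRAM = 65536  # assumed upper bound on one pickled message


def handle_request(cache, request):
    """Apply one request tuple to the cache and return the reply tuple."""
    op, key, *rest = request
    if op == "get":
        return ("hit", cache[key]) if key in cache else ("miss", None)
    if op == "put":
        cache[key] = rest[0]
        return ("ok", None)
    return ("error", "unknown operation: %r" % (op,))


def serve_one(sock, cache):
    """Cache owner side: read one datagram, apply it, send one reply."""
    # One recv() returns exactly one datagram, so no framing is needed.
    request = pickle.loads(sock.recv(MAX_DATAGRAM))
    sock.send(pickle.dumps(handle_request(cache, request)))


def send_request(sock, request):
    """Worker side: send one pickled request as a single datagram."""
    sock.send(pickle.dumps(request))


def read_reply(sock):
    """Worker side: read and decode the owner's reply datagram."""
    return pickle.loads(sock.recv(MAX_DATAGRAM))
```

Because pickle can execute arbitrary code while loading, this is only appropriate between processes that trust each other, which is the case for a parent and the children it spawned.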


Waqar Khan

10:23 a.m.

Thanks for the info. Yeah, it seems that UDS is the way to go (I need to read more about them). Is there a simple example you can give that would help me understand this a bit better?

Thanks

Scott, Barry

2:23 a.m.

On Wednesday, 6 November 2019 18:23:41 GMT Waqar Khan wrote:

> Thanks for the info. Yeah, it seems that UDS is the way to go. (I need to read more about them).
>
> Actually, is there a simple example you can give that can help me understand this a bit better? Thanks

I cannot share the code I have, sorry. The Twisted docs should get you going.

But as I said, without that patch you will lose messages under load. You have to handle the EAGAIN and retry the send. We have a send queue that we drain on a timer if the send fails. The failure is caused by the receiving end not processing messages fast enough.

Barry
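The send-queue idea can be sketched as follows. This is a guess at the general shape, written against the standard library `socket` module rather than Twisted, and is not Barry's implementation. The class name `QueuedDatagramSender` is hypothetical, and treating `ENOBUFS` as retryable alongside `EAGAIN` is an assumption for platforms that report a full queue that way.

```python
import collections
import errno
import socket

# errno values treated as "the receiver is not keeping up, retry later".
# ENOBUFS is included as an assumption; the thread only mentions EAGAIN.
_RETRYABLE = {errno.EAGAIN, errno.EWOULDBLOCK, errno.ENOBUFS}


class QueuedDatagramSender:
    """Queue datagrams and send them in order, keeping any that cannot go yet."""

    def __init__(self, sock):
        sock.setblocking(False)  # a full queue raises instead of blocking
        self._sock = sock
        self._pending = collections.deque()

    @property
    def pending(self):
        """Number of datagrams still waiting to be sent."""
        return len(self._pending)

    def send(self, datagram):
        # Always go through the queue so that ordering is preserved even
        # when earlier datagrams are still waiting.
        self._pending.append(datagram)
        self.drain()

    def drain(self):
        """Send as many queued datagrams as possible; return how many went out."""
        sent = 0
        while self._pending:
            try:
                self._sock.send(self._pending[0])
            except OSError as exc:
                if exc.errno in _RETRYABLE:
                    break  # leave it queued for the next drain() call
                raise
            self._pending.popleft()
            sent += 1
        return sent
```

In a Twisted program, `drain` could be called periodically, for example from a `twisted.internet.task.LoopingCall`. This only helps if the EAGAIN actually reaches the caller, which is what the patch discussed in this thread is for.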


Sean DiZazzo

9:29 p.m.

If you need guaranteed delivery of the data, why not just use a TCP connection to the unix socket, instead of a UDP connection, which inherently can lose data? In that case I don't think your patch would be needed.

I didn't look at the source, so perhaps I missed something.

Scott, Barry

2:07 a.m.

On Thursday, 7 November 2019 05:29:34 GMT Sean DiZazzo wrote:

> If you need guaranteed delivery of the data, why not just use a TCP connection to the unix socket, instead of a UDP connection which inherently can lose data? In that case I don't think your patch would be needed.
>
> I didn't look at the source, so perhaps I missed something.

UDS is not UDP.


Barry

Glyph

3:39 p.m.

On Nov 7, 2019, at 2:07 AM, Scott, Barry <barry.scott@forcepoint.com> wrote:

> On Thursday, 7 November 2019 05:29:34 GMT Sean DiZazzo wrote:
>
>> If you need guaranteed delivery of the data, why not just use a TCP connection to the unix socket, instead of a UDP connection which inherently can lose data? In that case I don't think your patch would be needed.
>>
>> I didn't look at the source, so perhaps I missed something.
>
> UDS is not UDP.

Specifically, a UNIX datagram socket is a datagram-based IPC mechanism, but unlike UDP it is not unreliable. You can still get EAGAIN or EMSGSIZE, because buffers fill up or datagrams are too big, but if you put a datagram in, it comes out again.

Barry's (embarrassingly long-standing) bug in Twisted's handling of UDS failures is here: https://twistedmatrix.com/trac/ticket/9504

You can't make "a TCP connection" to a UNIX-domain socket; TCP is a thing you do over networks, and the UNIX address family is for local inter-process communication on a single host.
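To make the distinction concrete, here is a small sketch using the standard library `socket` module (not Twisted). The UNIX address family offers both a datagram type and a stream type, and neither of them is UDP or TCP. The function names are hypothetical, and the payloads are assumed to be small enough to fit in the socket buffers.

```python
import socket


def datagram_messages(payloads):
    """Send payloads over a UNIX datagram pair; each recv() yields one datagram."""
    left, right = socket.socketpair(socket.AF_UNIX, socket.SOCK_DGRAM)
    try:
        for payload in payloads:
            left.send(payload)
        # Message boundaries are preserved: one send() maps to one recv().
        return [right.recv(4096) for _ in payloads]
    finally:
        left.close()
        right.close()


def stream_bytes(payloads):
    """Send payloads over a UNIX stream pair; the reader sees one byte stream."""
    left, right = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)
    try:
        for payload in payloads:
            left.sendall(payload)
        left.shutdown(socket.SHUT_WR)  # signal end of stream to the reader
        chunks = []
        while True:
            chunk = right.recv(4096)
            if not chunk:
                break
            chunks.append(chunk)
        # A stream has no message boundaries; the application must add framing.
        return b"".join(chunks)
    finally:
        left.close()
        right.close()
```

With the datagram type, each pickled request arrives whole or not at all, which is the property Barry relies on. With the stream type, the receiver would have to add its own length prefix or delimiter to find where one message ends.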

