Eventfd with epoll BlockingIOError
jenkris at tutanota.com
Thu Nov 25 17:29:07 EST 2021
Thanks very much for your reply.
I am now getting a single event returned in Python, but it's not the right event, as I'll explain below.
I rearranged the Python code based on your comments:
#!/usr/bin/python3
import sys
import os
import select
print("Inside Python")
event_fd = int(sys.argv[3])
print("Eventfd received by Python")
print(event_fd)
event_write_value = 100
ep = select.epoll(-1)
ep.register(event_fd, select.EPOLLIN | select.EPOLLOUT )
os.set_blocking(event_fd, False)
#__________
print("Starting poll loop")
for fd_event in ep.poll():
    print("Python fd_event")
    print(fd_event)
    fd_received = fd_event[0]
    event_received = fd_event[1]
You advised leaving select.EPOLLOUT out of the call ep.register(event_fd, select.EPOLLIN | select.EPOLLOUT), which makes sense because I'm not waiting for that event. But without it both processes freeze in the for loop (just after print("Starting poll loop")), so Python never receives an EPOLLIN event. So I kept it, and here is the screen output from gdb:
Inside Python
Eventfd received by Python
5
Everything OK in Python
Starting poll loop
Python fd_event
(5, 4)
Writing to Python
5 Received from Python
8 Writing to Python
Failed epoll_wait Bad file descriptor
5 Received from Python
8 Writing to Python
Failed epoll_wait Bad file descriptor
5 Received from Python
-1time taken 0.000629
Failed to close epoll file descriptor
Unlink_shm status: Bad file descriptor
fn() took 0.000717 seconds to execute
[Inferior 1 (process 26718) exited normally]
(gdb) q
The Python fd_event tuple is (5, 4): 5 is the correct file descriptor and 4 is the EPOLLOUT event, which is not the event I want.
The eventfd is created in C as nonblocking:
int eventfd_initialize() {
    int efd = eventfd(0, EFD_NONBLOCK);
    return efd;
}
When C writes it calls epoll_wait:
ssize_t epoll_write(int event_fd, int epoll_fd, struct epoll_event * event_struc, int action_code)
{
    int64_t ewbuf[1];
    ewbuf[0] = (int64_t)action_code;
    int maxevents = 1;
    int timeout = -1;
    fprintf(stdout, " Writing to Python \n%d", event_fd);
    write(event_fd, &ewbuf, 8);
    if (epoll_wait(epoll_fd, event_struc, maxevents, timeout) == -1)
    {
        fprintf(stderr, "Failed epoll_wait %s\n", strerror(errno));
    }
    ssize_t rdval = read(event_fd, &ewbuf, 8);
    fprintf(stdout, " Received from Python \n%ld", rdval);
    return 0;
}
The C side initializes its epoll this way:
int epoll_initialize(int efd, int64_t * output_array)
{
    struct epoll_event ev = {};
    int epoll_fd = epoll_create1(0);
    struct epoll_event * ptr_ev = &ev;
    if(epoll_fd == -1)
    {
        fprintf(stderr, "Failed to create epoll file descriptor\n");
        return 1;
    }
    ev.events = EPOLLIN | EPOLLOUT;
    ev.data.fd = efd; //was 0
    if(epoll_ctl(epoll_fd, EPOLL_CTL_ADD, efd, &ev) == -1)
    {
        fprintf(stderr, "Failed to add file descriptor to epoll\n");
        close(epoll_fd);
        return 1;
    }
    output_array[0] = epoll_fd;
    output_array[1] = (int64_t)ptr_ev; //&ev;
    return 0;
}
Technically C is not waiting for an EPOLLIN event, but again without it both processes freeze unless either C or Python includes both events. So that appears to be where the problem is.
The Linux epoll man page says, "epoll_wait waits for I/O events, blocking the calling thread if no events are currently available." https://man7.org/linux/man-pages/man7/epoll.7.html. That may be the clue to why both processes freeze when I poll on only one event in each one.
Thanks for any ideas based on this update, and thanks again for your earlier reply.
Jen
--
Sent with Tutanota, the secure & ad-free mailbox.
Nov 25, 2021, 06:34 by barry at barrys-emacs.org:
>
>
>
>> On 24 Nov 2021, at 22:42, Jen via Python-list <python-list at python.org> wrote:
>>
>> I have a C program that uses fork-execv to run Python 3.10 in a child process, and I am using eventfd with epoll for IPC between them. The eventfd file descriptor is created in C and passed to Python through execv. Once the Python child process starts I print the file descriptor to verify that it is correct (it is).
>>
>> In this scenario C will write to the eventfd at intervals and Python will read the eventfd and take action based on the value in the eventfd. But in the Python while True loop I get "BlockingIOError: [Errno 11] Resource temporarily unavailable" then with each new read it prints "Failed epoll_wait Bad file descriptor."
>>
>> This is the Python code:
>>
>> #!/usr/bin/python3
>> import sys
>> import os
>> import select
>>
>> print("Inside Python")
>>
>> event_fd = int(sys.argv[3])
>>
>>
>> print("Eventfd received by Python")
>> print(event_fd)
>>
>> ep = select.epoll(-1)
>> ep.register(event_fd, select.EPOLLIN | select.EPOLLOUT)
>>
>
> This says: tell me when I can read from or write to event_fd.
> Writes will be allowed until the kernel buffers are full.
>
> Usually you only add EPOLLOUT if you have data to write.
> In this case do not set EPOLLOUT.
>
> And if you know that you will never fill the kernel buffers then you
> do not need to bother polling for write.
>
>
>>
>> event_write_value = 100
>>
>> while True:
>>
>>     print("Waiting in Python for event")
>>     ep.poll(timeout=None, maxevents=-1)
>>
>
> You have to get the result of the poll() and process the list of entries that are returned.
>
> You must check that POLLIN is set before attempting the read.
>
>
>
>>     v = os.eventfd_read(event_fd)
>>
>
> Will raise EWOULDBLOCK because there is no data available to read.
>
> Here are the docs from Python:
>
> poll.poll([timeout])
>
> Polls the set of registered file descriptors, and returns a possibly-empty list containing (fd, event) 2-tuples for the descriptors that have events or errors to report. fd is the file descriptor, and event is a bitmask with bits set for the reported events for that descriptor: POLLIN for waiting input, POLLOUT to indicate that the descriptor can be written to, and so forth. An empty list indicates that the call timed out and no file descriptors had any events to report. If timeout is given, it specifies the length of time in milliseconds which the system will wait for events before returning. If timeout is omitted, negative, or None, the call will block until there is an event for this poll object.
>
> You end up with code like this:
>
> for fd_event in ep.poll():
>     fd, event = fd_event
>     if (event & select.POLLIN) != 0 and fd == event_fd:
>         v = os.eventfd_read(event_fd)
>
>
>>
>>     if v != 99:
>>         print("found")
>>         print(v)
>>
>>     os.eventfd_write(event_fd, event_write_value)
>>
>>     if v == 99:
>>         os.close(event_fd)
>>
>> This is the C code that writes to Python, then waits for Python to write back:
>>
>> ssize_t epoll_write(int event_fd, int epoll_fd, struct epoll_event * event_struc, int action_code)
>> {
>>     int64_t ewbuf[1];
>>     ewbuf[0] = (int64_t)action_code;
>>     int maxevents = 1;
>>     int timeout = -1;
>>
>>     fprintf(stdout, " Writing to Python \n%d", event_fd);
>>
>>     write(event_fd, &ewbuf, 8);
>>
>>     if (epoll_wait(epoll_fd, event_struc, maxevents, timeout) == -1)
>>     {
>>         fprintf(stderr, "Failed epoll_wait %s\n", strerror(errno));
>>     }
>>
>>     ssize_t rdval = read(event_fd, &ewbuf, 8);
>>
>>     fprintf(stdout, " Received from Python \n%ld", rdval);
>>
>>     return 0;
>> }
>>
>> This is the screen output when I run with gdb:
>>
>> Inside Python
>> Eventfd received by Python
>> 5
>> Waiting in Python for event
>> Traceback (most recent call last):
>>   File "/usr/local/lib/python3.10/runpy.py", line 196, in _run_module_as_main
>>     return _run_code(code, main_globals, None,
>>   File "/usr/local/lib/python3.10/runpy.py", line 86, in _run_code
>>     exec(code, run_globals)
>>   File "/opt/P01_SH/NPC_CPython.py", line 36, in <module>
>>     v = os.eventfd_read(event_fd)
>> BlockingIOError: [Errno 11] Resource temporarily unavailable
>>
>
> Expected, as you have not checked that there is data to read.
> Check for POLLIN being set.
>
>
>> Writing to Python
>> 5 Received from Python
>> 8 Writing to Python
>> Failed epoll_wait Bad file descriptor
>> 5 Received from Python
>> 8 Writing to Python
>> Failed epoll_wait Bad file descriptor
>> 5 Received from Python
>> -1time taken 0.000548
>> Failed to close epoll file descriptor
>> Unlink_shm status: Bad file descriptor
>> fn() took 0.000648 seconds to execute
>> [Inferior 1 (process 12618) exited normally]
>> (gdb)
>>
>> So my question is why do I get "BlockingIOError: [Errno 11] Resource temporarily unavailable" and "Failed epoll_wait Bad file descriptor" from Python?
>>
>
> If your protocol is not trivial, you should implement a state machine to know what to do at each event.
>
> Barry
>
>
>>
>> --
>> Sent with Tutanota, the secure & ad-free mailbox.
>> --
>> https://mail.python.org/mailman/listinfo/python-list
>>