
Re: [Libevent-users] epoll erros



On Fri, Oct 22, 2010 at 1:54 PM, Nick Mathewson <nickm@xxxxxxxxxxxxx> wrote:
  [...]
> Actually, straceing the application up to the point where it gets its
> first message like
>
> [warn] Epoll ADD(1) on fd 13 failed.  Old events were 0; read change
> was 1 (add); write change was 0 (none): File exists
>
> would probably help if option 4 or option  is the case.  If you do
> this, please send the strace and the debug log for the same run
> together.  If it's more than 10 or 20 KiB, though, please upload it
> somewhere and post a URL or send it to me personally?  I've got a
> feeling lots of folks on this list don't necessarily want multiple
> 100KiB attachments.

Thanks to Gilad for a speedy response!  Here is the sequence of events
that causes the bug to appear:

1: socket(PF_INET, SOCK_STREAM, IPPROTO_IP) = 12
2: dup(12)                 = 13
3: epoll_ctl(4, EPOLL_CTL_ADD, 13, {EPOLLIN, {u32=13, u64=13}}) = 0
4: epoll_wait(4, {{EPOLLIN, {u32=13, u64=13}}}, 32, 1633) = 1
5: epoll_wait(4, {{EPOLLIN, {u32=13, u64=13}}}, 32, 1563) = 1
6: epoll_ctl(4, EPOLL_CTL_ADD, 13, {EPOLLIN, {u32=13, u64=13}}) = -1 EEXIST (File exists)
7: epoll_wait(4, {{EPOLLIN, {u32=13, u64=13}}}, 32, 1) = 1
8: close(13)               = 0
9: epoll_ctl(4, EPOLL_CTL_DEL, 13, {EPOLLIN, {u32=13, u64=13}}) = -1 EBADF (Bad file descriptor)
10: epoll_wait(4, {}, 32, 1469) = 0
11: dup(12)                 = 13
12: epoll_ctl(4, EPOLL_CTL_ADD, 13, {EPOLLIN, {u32=13, u64=13}}) = -1 EEXIST (File exists)

Apparently, the Linux kernel associates epoll state with files in such
a way that the epoll state is shared across dup()'d fds.  I'll read
the kernel source a little more to be sure of what's happening.  I've
attached a variant of my test code to reproduce this.  Thanks, Gilad,
for all your patience on this!
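
To make the consequence of that sharing concrete, here is a minimal sketch, separate from the attached test. It assumes the behavior described in epoll(7): a registration is only dropped once every fd referring to the underlying open file has been closed. The helper name events_after_close() is made up for this illustration.

```c
#include <string.h>
#include <sys/epoll.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Hypothetical helper, not part of the attached test: register a dup()'d
 * fd for EPOLLIN, close it, and return the number of events epoll_wait()
 * still reports.  Returns -1 if any setup step fails.
 */
int
events_after_close(void)
{
	int sv[2] = {-1, -1}, ep = -1, dupfd = -1, n = -1;
	struct epoll_event ev, out;

	ep = epoll_create(1);
	if (ep < 0)
		goto done;
	if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0)
		goto done;
	dupfd = dup(sv[0]);
	if (dupfd < 0)
		goto done;

	memset(&ev, 0, sizeof(ev));
	ev.events = EPOLLIN;
	ev.data.fd = dupfd;
	if (epoll_ctl(ep, EPOLL_CTL_ADD, dupfd, &ev) < 0)
		goto done;

	/* Make the shared socket readable, then close only the dup. */
	if (write(sv[1], "x", 1) != 1)
		goto done;
	close(dupfd);
	dupfd = -1;

	/* sv[0] still refers to the same open file, so the registration
	 * made through dupfd has not been removed. */
	n = epoll_wait(ep, &out, 1, 0);

done:
	if (dupfd >= 0)
		close(dupfd);
	if (sv[0] >= 0)
		close(sv[0]);
	if (sv[1] >= 0)
		close(sv[1]);
	if (ep >= 0)
		close(ep);
	return n;
}
```

If the sharing works the way I think it does, calling events_after_close() should report one event even though the registered fd has already been closed.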

Now the last step is to figure out: what is the right fix here?  I'm
probably going to need to sleep on that one.  My current sense is that
we will not be able to support every possible usage of dup()'d fds in
a single epoll-based event base, and that we'll need to amend the docs
to say so, but that it should be possible to support the usage that
Gilad's current application is doing.  But more thought is needed
here, and I probably ought to peruse the kernel source a little more
to make sure that dup+epoll works the way I'm guessing it works.
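
For reference, epoll(7) suggests explicitly removing an fd with EPOLL_CTL_DEL before closing it. The sketch below compares the two orderings at the application level. It is not a proposed Libevent fix, and the helper name readd_after_close() and its del_first flag are invented for this illustration.

```c
#include <errno.h>
#include <string.h>
#include <sys/epoll.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Hypothetical helper: close a registered dup()'d fd, dup() again, and
 * try to ADD the new fd.  If del_first is nonzero, EPOLL_CTL_DEL is
 * issued before close().  Returns 0 if the final ADD succeeds, the errno
 * from the final ADD if it fails, or -1 on a setup failure.
 */
int
readd_after_close(int del_first)
{
	int sv[2] = {-1, -1}, ep = -1, fd = -1, rv = -1;
	struct epoll_event ev;

	ep = epoll_create(1);
	if (ep < 0)
		goto done;
	if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0)
		goto done;
	fd = dup(sv[0]);
	if (fd < 0)
		goto done;

	memset(&ev, 0, sizeof(ev));
	ev.events = EPOLLIN;
	if (epoll_ctl(ep, EPOLL_CTL_ADD, fd, &ev) < 0)
		goto done;

	/* Optionally drop the registration while the fd is still valid. */
	if (del_first && epoll_ctl(ep, EPOLL_CTL_DEL, fd, &ev) < 0)
		goto done;
	close(fd);

	/* dup() hands back the lowest free number, normally the one
	 * that was just closed. */
	fd = dup(sv[0]);
	if (fd < 0)
		goto done;
	if (epoll_ctl(ep, EPOLL_CTL_ADD, fd, &ev) < 0)
		rv = errno;
	else
		rv = 0;

done:
	if (fd >= 0)
		close(fd);
	if (sv[0] >= 0)
		close(sv[0]);
	if (sv[1] >= 0)
		close(sv[1]);
	if (ep >= 0)
		close(ep);
	return rv;
}
```

Given the behavior in the trace above, I would expect readd_after_close(0) to return EEXIST and readd_after_close(1) to return 0. Whether an event base can arrange for the DEL to happen before the application's close() is a separate question.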

Thanks again,
-- 
Nick
#include <sys/epoll.h>
#include <unistd.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <errno.h>

#include <sys/socket.h>

#define DIE(s) do { perror(s); exit(1); } while(0)

int use_fd = -1;
int fds[2] = {-1,-1};
int epoll_fd;

int opnum=0;

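/* Create the epoll instance and a socketpair, then dup() one end so that
 * use_fd and fds[0] refer to the same underlying open file. */
void
init(void)
{
	epoll_fd = epoll_create(32000);
	if (epoll_fd < 0)
		DIE("epoll_create");

	if (socketpair(AF_UNIX, SOCK_STREAM, 0, fds) < 0)
		DIE("socketpair (1)");

	use_fd = dup(fds[0]);
	if (use_fd < 0)
		DIE("dup");
}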

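/* Close only the dup()'d fd; fds[0] keeps the underlying file open. */
void
stop(void)
{
	printf("%d: close\n", ++opnum);
	close(use_fd);
}

/* dup() fds[0] again and insist on getting the same fd number back. */
void
redup(void)
{
	int new_fd;
	printf("%d: redup\n", ++opnum);
	new_fd = dup(fds[0]);
	if (new_fd < 0)
		DIE("dup");

	if (new_fd != use_fd) {
		printf("dup did not return the same fd\n");
		exit(1);
	}
}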

void
op(int ctl, int arg)
{
	struct epoll_event ev;
	const char *opname, *argname;
	++opnum;
	switch (ctl) {
	case EPOLL_CTL_ADD: opname = "add"; break;
	case EPOLL_CTL_DEL: opname = "del"; break;
	case EPOLL_CTL_MOD: opname = "mod"; break;
	default:
		abort();
	}
	switch (arg) {
	case EPOLLIN: argname = "IN"; break;
	case EPOLLOUT: argname = "OUT"; break;
	case EPOLLIN|EPOLLOUT: argname = "IN|OUT"; break;
	default:
		abort();
	}

	printf("%d: %s(%s)\t", opnum, opname, argname);
	fflush(stdout);

	memset(&ev, 0, sizeof(ev));
	ev.events = arg;
	if (epoll_ctl(epoll_fd, ctl, use_fd, &ev)<0) {
		printf("%d: %s\n", errno, strerror(errno));
	} else {
		printf("OK\n");
	}
}

static void add(int what) { op(EPOLL_CTL_ADD, what); }
static void del(int what) { op(EPOLL_CTL_DEL, what); }
static void mod(int what) { op(EPOLL_CTL_MOD, what); }

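int
main(int argc, char **argv)
{
	init();

	/* Mirrors the failing sequence in the strace above. */
	add(EPOLLOUT);
	mod(EPOLLIN);
	add(EPOLLIN);	/* duplicate ADD: EEXIST, as in step 6 */
	stop();		/* closes use_fd only */
	del(EPOLLIN);	/* fd already closed: EBADF, as in step 9 */
	redup();	/* same fd number, same underlying file */
	add(EPOLLIN);	/* the case in question: EEXIST in step 12 */

	return 0;
}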