[Author Prev][Author Next][Thread Prev][Thread Next][Author Index][Thread Index]
Re: [Libevent-users] epoll erros
I think something needs to walk the file->f_ep_links list on close() and
remove any epitems where epi->ffd->fd is the fd being closed from the
tree in epi->ep.
I don't have a Linux box to try this on though :-).
On Fri, Oct 22, 2010 at 09:15:26PM +0100, Nicholas Marriott wrote:
> It also looks to me like eventpoll_release is not called until the
> struct file is released (ie all refs are gone), NOT when the fd is
> closed.
>
> So this means that because it is dup()d it remains on the tree even
> after close().
>
> Then because you dup() it again _to the same target fd_, when you try to
> add it it has the same struct file pointer and the same fd, and is
> already on the tree.
>
>
>
> On Fri, Oct 22, 2010 at 09:05:33PM +0100, Nicholas Marriott wrote:
> > On a quick look it seems to me like epoll keys the tree on the epoll fd
> > by both underlying struct file and the file descriptor itself, so it
> > should work with dup()d fds.
> >
> >
> > On Fri, Oct 22, 2010 at 08:42:13PM +0100, Nicholas Marriott wrote:
> > > If this is right it seems really stupid and inconvenient. Sounds more
> > > like a bug to me.
> > >
> > > I add dup()d fds to libevent and haven't had any complaints from Linux
> > > users. Although I don't remove and add them (or read and write from both
> > > at the same time, I think).
> > >
> > > dup()d fds definitely seems to work with kqueue.
> > >
> > >
> > > On Fri, Oct 22, 2010 at 03:08:00PM -0400, Nick Mathewson wrote:
> > > > On Fri, Oct 22, 2010 at 1:54 PM, Nick Mathewson <nickm@xxxxxxxxxxxxx> wrote:
> > > > [...]
> > > > > Actually, straceing the application up to the point where it gets its
> > > > > first message like
> > > > >
> > > > > [warn] Epoll ADD(1) on fd 13 failed. Old events were 0; read change
> > > > > was 1 (add); write change was 0 (none): File exists
> > > > >
> > > > > would probably help if option 4 or option is the case. If you do
> > > > > this, please send the strace and the debug log for the same run
> > > > > together. If it's more than 10 or 20 KiB, though, please upload it
> > > > > somewhere and post a URL or send it to me personally? I've got a
> > > > > feeling lots of folks on this list don't necessarily want multiple
> > > > > 100KiB attachments.
> > > >
> > > > Thanks to Gilad for a speedy response! Here is the sequence of events
> > > > that causes the bug to appear:
> > > >
> > > > 1: socket(PF_INET, SOCK_STREAM, IPPROTO_IP) = 12
> > > > 2: dup(12) = 13
> > > > 3: epoll_ctl(4, EPOLL_CTL_ADD, 13, {EPOLLIN, {u32=13, u64=13}}) = 0
> > > > 4: epoll_wait(4, {{EPOLLIN, {u32=13, u64=13}}}, 32, 1633) = 1
> > > > 5: epoll_wait(4, {{EPOLLIN, {u32=13, u64=13}}}, 32, 1563) = 1
> > > > 6: epoll_ctl(4, EPOLL_CTL_ADD, 13, {EPOLLIN, {u32=13, u64=13}}) = -1
> > > > EEXIST (File exists)
> > > > 7: epoll_wait(4, {{EPOLLIN, {u32=13, u64=13}}}, 32, 1) = 1
> > > > 8: close(13) = 0
> > > > 9: epoll_ctl(4, EPOLL_CTL_DEL, 13, {EPOLLIN, {u32=13, u64=13}}) = -1
> > > > EBADF (Bad file descriptor)
> > > > 10: epoll_wait(4, {}, 32, 1469) = 0
> > > > 11: dup(12) = 13
> > > > 12: epoll_ctl(4, EPOLL_CTL_ADD, 13, {EPOLLIN, {u32=13, u64=13}}) = -1
> > > > EEXIST (File exists)
> > > >
> > > > Apparently, the Linux kernel associates epoll state with files in such
> > > > a way that the epoll state is shared across dup()'d fds. I'll read
> > > > the kernel source a little more to be sure of what's happening. I've
> > > > attached a variant of my test code to reproduce this. Thanks, Gilad,
> > > > for all your patience on this!
> > > >
> > > > Now the last step is to figure out: what is the right fix here? I'm
> > > > probably going to need to sleep on that one. My current sense is that
> > > > we will not be able to support every possible usage of dup()'d fds in
> > > > a single epoll-based event base, and that we'll need to amend the docs
> > > > to say so, but that it should be possible to support the usage that
> > > > Gilad's current application is doing. But more thought is needed
> > > > here, and I probably ought to peruse the kernel source a little more
> > > > to make sure that dup+epoll works the way I'm guessing it works.
> > > >
> > > > Thanks again,
> > > > --
> > > > Nick
> > >
> > >
> > > ***********************************************************************
> > > To unsubscribe, send an e-mail to majordomo@xxxxxxxxxxxxx with
> > > unsubscribe libevent-users in the body.
> > ***********************************************************************
> > To unsubscribe, send an e-mail to majordomo@xxxxxxxxxxxxx with
> > unsubscribe libevent-users in the body.
> ***********************************************************************
> To unsubscribe, send an e-mail to majordomo@xxxxxxxxxxxxx with
> unsubscribe libevent-users in the body.
***********************************************************************
To unsubscribe, send an e-mail to majordomo@xxxxxxxxxxxxx with
unsubscribe libevent-users in the body.