[Author Prev][Author Next][Thread Prev][Thread Next][Author Index][Thread Index]

Re: [Libevent-users] epoll erros



I think something needs to walk the file->f_ep_links list on close() and
remove any epitems where epi->ffd->fd is the fd being closed from the
tree in epi->ep.

I don't have a Linux box to try this on though :-).



On Fri, Oct 22, 2010 at 09:15:26PM +0100, Nicholas Marriott wrote:
> It also looks to me like eventpoll_release is not called until the
> struct file is released (ie all refs are gone), NOT when the fd is
> closed.
> 
> So this means that because it is dup()d it remains on the tree even
> after close().
> 
> Then because you dup() it again _to the same target fd_, when you try to
> add it it has the same struct file pointer and the same fd, and is
> already on the tree.
> 
> 
> 
> On Fri, Oct 22, 2010 at 09:05:33PM +0100, Nicholas Marriott wrote:
> > On a quick look it seems to me like epoll keys the tree on the epoll fd
> > by both underlying struct file and the file descriptor itself, so it
> > should work with dup()d fds.
> > 
> > 
> > On Fri, Oct 22, 2010 at 08:42:13PM +0100, Nicholas Marriott wrote:
> > > If this is right it seems really stupid and inconvenient. Sounds more
> > > like a bug to me.
> > > 
> > > I add dup()d fds to libevent and haven't had any complaints from Linux
> > > users. Although I don't remove and add them (or read and write from both
> > > at the same time, I think).
> > > 
> > > dup()d fds definitely seems to work with kqueue.
> > > 
> > > 
> > > On Fri, Oct 22, 2010 at 03:08:00PM -0400, Nick Mathewson wrote:
> > > > On Fri, Oct 22, 2010 at 1:54 PM, Nick Mathewson <nickm@xxxxxxxxxxxxx> wrote:
> > > >   [...]
> > > > > Actually, straceing the application up to the point where it gets its
> > > > > first message like
> > > > >
> > > > > [warn] Epoll ADD(1) on fd 13 failed.  Old events were 0; read change
> > > > > was 1 (add); write change was 0 (none): File exists
> > > > >
> > > > > would probably help if option 4 or option  is the case.  If you do
> > > > > this, please send the strace and the debug log for the same run
> > > > > together.  If it's more than 10 or 20 KiB, though, please upload it
> > > > > somewhere and post a URL or send it to me personally?  I've got a
> > > > > feeling lots of folks on this list don't necessarily want multiple
> > > > > 100KiB attachments.
> > > > 
> > > > Thanks to Gilad for a speedy response!  Here is the sequence of events
> > > > that causes the bug to appear:
> > > > 
> > > > 1: socket(PF_INET, SOCK_STREAM, IPPROTO_IP) = 12
> > > > 2: dup(12)                 = 13
> > > > 3: epoll_ctl(4, EPOLL_CTL_ADD, 13, {EPOLLIN, {u32=13, u64=13}}) = 0
> > > > 4: epoll_wait(4, {{EPOLLIN, {u32=13, u64=13}}}, 32, 1633) = 1
> > > > 5: epoll_wait(4, {{EPOLLIN, {u32=13, u64=13}}}, 32, 1563) = 1
> > > > 6: epoll_ctl(4, EPOLL_CTL_ADD, 13, {EPOLLIN, {u32=13, u64=13}}) = -1
> > > > EEXIST (File exists)
> > > > 7: epoll_wait(4, {{EPOLLIN, {u32=13, u64=13}}}, 32, 1) = 1
> > > > 8: close(13)               = 0
> > > > 9: epoll_ctl(4, EPOLL_CTL_DEL, 13, {EPOLLIN, {u32=13, u64=13}}) = -1
> > > > EBADF (Bad file descriptor)
> > > > 10: epoll_wait(4, {}, 32, 1469) = 0
> > > > 11: dup(12)                 = 13
> > > > 12: epoll_ctl(4, EPOLL_CTL_ADD, 13, {EPOLLIN, {u32=13, u64=13}}) = -1
> > > > EEXIST (File exists)
> > > > 
> > > > Apparently, the Linux kernel associates epoll state with files in such
> > > > a way that the epoll state is shared across dup()'d fds.  I'll read
> > > > the kernel source a little more to be sure of what's happening.  I've
> > > > attached a variant of my test code to reproduce this.  Thanks, Gilad,
> > > > for all your patience on this!
> > > > 
> > > > Now the last step is to figure out: what is the right fix here?  I'm
> > > > probably going to need to sleep on that one.  My current sense is that
> > > > we will not be able to support every possible usage of dup()'d fds in
> > > > a single epoll-based event base, and that we'll need to amend the docs
> > > > to say so, but that it should be possible to support the usage that
> > > > Gilad's current application is doing.  But more thought is needed
> > > > here, and I probably ought to peruse the kernel source a little more
> > > > to make sure that dup+epoll works the way I'm guessing it works.
> > > > 
> > > > Thanks again,
> > > > -- 
> > > > Nick
> > > 
> > > 
> > > ***********************************************************************
> > > To unsubscribe, send an e-mail to majordomo@xxxxxxxxxxxxx with
> > > unsubscribe libevent-users    in the body.
> > ***********************************************************************
> > To unsubscribe, send an e-mail to majordomo@xxxxxxxxxxxxx with
> > unsubscribe libevent-users    in the body.
> ***********************************************************************
> To unsubscribe, send an e-mail to majordomo@xxxxxxxxxxxxx with
> unsubscribe libevent-users    in the body.
***********************************************************************
To unsubscribe, send an e-mail to majordomo@xxxxxxxxxxxxx with
unsubscribe libevent-users    in the body.