Re: [Libevent-users] Deadlock driving me nuts

To: Nick Mathewson <nickm@xxxxxxxxxxxxx>

Subject: Re: [Libevent-users] Deadlock driving me nuts

From: Sherif Fanous <sherif.fanous@xxxxxxxxx>

Date: Sat, 9 Apr 2011 10:16:42 +0200

Cc: libevent-users@xxxxxxxxxxxxx

As a matter of fact, the thread in question is usually stuck in event_active_nolock.

The funny thing is that gdb shows it stuck at line 2212 of event.c, which is the opening brace of the function.

How would I walk through ctx->events?

Thanks

Sherif

On Fri, Apr 8, 2011 at 9:38 PM, Nick Mathewson <nickm@xxxxxxxxxxxxx> wrote:

On Fri, Apr 8, 2011 at 3:37 AM, Sherif Fanous <sherif.fanous@xxxxxxxxx> wrote:
> Hello
> I'm running into a deadlock using libevent. I've been trying to figure out
> why or where it is happening for the last 3 days but have failed to do so.
> The deadlock occurs with the event base responsible for sending network
> packets. Below is a gdb output showing the deadlocked threads.

Thread 21 (Thread 27350):
#0 evmap_io_active (base=0x872b890, fd=247, events=36) at evmap.c:399
#1 0x080c84ae in epoll_dispatch (base=0x872b890, tv=0xb6e1c334) at epoll.c:436

So, this thread ought to be holding the event_base lock, but I don't
know why it would be freaking out like this. The place that it seems
to be stuck on is:

    TAILQ_FOREACH(ev, &ctx->events, ev_io_next) {
        if (ev->ev_events & events)
            event_active_nolock(ev, ev->ev_events & events, 1);
    }

I'm guessing that if you let it run a while, it stays stuck at the
same place (and not, say, in event_active_nolock)?

So I can only think of two ways that loop could get stuck. The first one
is if the ctx->events list had somehow gotten corrupted (say, if it
became circular somehow). The second is if event_active_nolock had
somehow become stuck... but if that's the case, then I would expect
event_active_nolock to appear on your stack trace.

To see if I guessed right about the first case, could you try walking
through the ctx->events list via ev_io_next and seeing whether it
actually ends, or whether it's corrupted?
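
As a rough illustration of that check, here is a minimal sketch that walks a TAILQ and reports whether it terminates or loops back on itself. It is not libevent code: struct node, the link field, and walk_list() are hypothetical stand-ins for struct event, ev_io_next, and whatever helper you would write yourself. It uses the usual two-pointer (tortoise and hare) walk so that it returns even when the list is circular.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/queue.h>

/* Hypothetical stand-in for struct event; only the list linkage matters. */
struct node {
    int id;
    TAILQ_ENTRY(node) link;   /* plays the role of ev_io_next */
};
TAILQ_HEAD(node_list, node);  /* plays the role of ctx->events */

/*
 * Follow the next pointers from the head of the list.
 * Returns the number of elements if the walk reaches NULL,
 * or -1 if the next pointers form a cycle.
 */
static long walk_list(struct node_list *head)
{
    struct node *slow, *fast;
    long count = 0;

    assert(head != NULL);
    slow = TAILQ_FIRST(head);
    fast = TAILQ_FIRST(head);

    while (fast != NULL) {
        fast = TAILQ_NEXT(fast, link);   /* fast pointer, first step */
        count++;
        if (fast == NULL)
            break;
        fast = TAILQ_NEXT(fast, link);   /* fast pointer, second step */
        count++;
        slow = TAILQ_NEXT(slow, link);   /* slow pointer, one step */
        if (fast != NULL && fast == slow)
            return -1;                   /* pointers met: list is circular */
    }
    return count;
}
```

In a debugger the same walk can be done by hand: start at the list's first element and keep following the next pointer until it is NULL or an address repeats. Note that in your libevent version ev_io_next may be a macro over a nested member, so check event_struct.h for the real member path before typing it into gdb.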

--
Nick