
RE: [Libevent-users] deadlock in libevent-2.0.5-beta



I now have a clearer view of things.
As Zhou said, a thread that is adding an event to the sender's event_base is blocked while writing to base->th_notify_fd[1]:

#0  0x00000030c1c0cadb in __write_nocancel () from /opt/breach-proxy/lib64/libpthread.so.0
#1  0x00002aaaaaccdd4e in evthread_notify_base_default () from /opt/breach/bwd/lib/libevent_core.so.4
#2  0x00002aaaaaccf70f in event_add_internal () from /opt/breach/bwd/lib/libevent_core.so.4
#3  0x00002aaaaaccfafa in event_add () from /opt/breach/bwd/lib/libevent_core.so.4
#4  0x00002aaaaacd4481 in evbuffer_run_callbacks () from /opt/breach/bwd/lib/libevent_core.so.4
#5  0x00002aaaaacd7703 in evbuffer_add_reference () from /opt/breach/bwd/lib/libevent_core.so.4
#6  0x00002aaaaaab84bc in CBTcpProxy::SendByRef (this=0x7ffffa9d6270,
    data=0x2aaaae247820 "GET /index1.html HTTP/1.1\r\nHost: 200.200.200.2\r\nConnection: Keep-Alive\r\nUser-Agent: Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; .NET CLR 1.1.4322)\r\nAccept: */*\r\nAccept-Language: en-us\r\nAcc"..., data_len=283,
    pxcn=<value optimized out>, side=<value optimized out>,
    cleanupfn=0x2aaaaaab60c0 <return_mp_block(void const*, unsigned long, void*)>, extra=0x2aaabc0e8ef0) at tcpproxy.cpp:336
#7  0x00002aaaaaab6185 in default_receiver_cb (fd=320, what=<value optimized out>, p_pxcn=0x2aaab40c96e0, read_from=SIDE_CLIENT)
    at receiver_thread.cpp:217
#8  0x00002aaaaaab662b in receiver_cb (fd=320, what=34, arg=0x2aaab40c96e0) at receiver_thread.cpp:144
#9  0x00002aaaaacd1399 in event_base_loop () from /opt/breach/bwd/lib/libevent_core.so.4

[root]# strace -p 25613
Process 25613 attached - interrupt to quit
write(17, "\0", 1


I guess the write blocks because the buffer is full: the sender thread is too busy to read from the socket at the rate it is written to, which is why this reproduces only under very high load.
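
The "full buffer" guess can be checked in isolation. The following is a minimal sketch, assuming the notification channel behaves like an ordinary pipe (the real th_notify_fd may be a pipe or a socketpair depending on the build). fill_pipe is a hypothetical helper and not part of libevent. It writes one byte at a time, as in the strace output above, and sets the write end non-blocking so that it reports EAGAIN where a blocking descriptor would hang in write().

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper, not libevent code.
 * Writes single "\0" bytes to wfd until the kernel buffer is full.
 * Returns the number of bytes accepted, or -1 on an unexpected error.
 * wfd is switched to non-blocking mode so a full buffer shows up as
 * EAGAIN instead of a write() that never returns. */
static long fill_pipe(int wfd)
{
    long accepted = 0;
    int flags = fcntl(wfd, F_GETFL, 0);

    if (flags < 0 || fcntl(wfd, F_SETFL, flags | O_NONBLOCK) < 0)
        return -1;

    for (;;) {
        ssize_t r = write(wfd, "\0", 1);
        if (r == 1) {
            accepted++;             /* nobody is reading, so this piles up */
            continue;
        }
        if (r < 0 && errno == EINTR)
            continue;               /* interrupted, retry */
        if (r < 0 && (errno == EAGAIN || errno == EWOULDBLOCK))
            return accepted;        /* buffer full: a blocking fd would hang here */
        return -1;
    }
}
```

Calling fill_pipe(fds[1]) on a fresh pipe() with no reader returns after a finite number of writes. The same write on a blocking descriptor would stall at that point until the reading side drains the buffer.
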
Here's the sender's stack:
#0  0x00000030c1c0c758 in __lll_mutex_lock_wait () from /opt/breach-proxy/lib64/libpthread.so.0
#1  0x00000030c1c087fa in _L_mutex_lock_908 () from /opt/breach-proxy/lib64/libpthread.so.0
#2  0x00000030c1c08682 in pthread_mutex_lock () from /opt/breach-proxy/lib64/libpthread.so.0
#3  0x00002aaaaacd1f53 in event_del () from /opt/breach/bwd/lib/libevent_core.so.4
#4  0x00002aaaaacdafe3 in bufferevent_writecb () from /opt/breach/bwd/lib/libevent_core.so.4
#5  0x00002aaaaacd1399 in event_base_loop () from /opt/breach/bwd/lib/libevent_core.so.4
#6  0x00002aaaaaab7220 in CBTcpProxySenderThread::run (this=0xe0314c0) at sender_thread.cpp:35

[root]# strace -p 25612
Process 25612 attached - interrupt to quit
futex(0xe031b50, FUTEX_WAIT, 2, NULL



My previous mail was misleading - sorry.
Thanks all for the help.
Avi




-----Original Message-----
From: owner-libevent-users@xxxxxxxxxxxxx [mailto:owner-libevent-users@xxxxxxxxxxxxx] On Behalf Of Nick Mathewson
Sent: Sunday, July 04, 2010 4:54 PM
To: libevent-users@xxxxxxxxxxxxx
Subject: Re: [Libevent-users] deadlock in libevent-2.0.5-beta

On Sun, Jul 4, 2010 at 4:29 AM, Avi Bab <avib@xxxxxxxxxx> wrote:
>
> Indeed it seems that someone, at some point, failed to release the lock.
> At the time of the deadlock the third thread (the ReceiverThread) is dispatching on a different event_base.
>
> This third thread does do some manipulation on bufferevents that are registered with the Sender's event_base:

 [..]
>
> This is the only interaction with the third thread.
> I do not see a relation to the deadlock.
>

Confusing!  I don't see how this would cause the deadlock either.

Have you tried configuring the lock debugging feature?  Just after you
call evthread_use_pthreads(), call evthread_enable_lock_debugging():
it will put wrappers around all the locks and locking callbacks to
track each lock's current owner and recursion count.  If there are
gross errors, you'll get an assertion failure with a hopefully useful
stack trace.  If not, you can inspect the recursion count and owner
in the debugger by casting the lock to "struct debug_lock" and looking
at the count and held_by variables.  [The owner is just a pthread_t
as an unsigned long, as returned by pthread_self().]

yrs,
-- 
Nick
***********************************************************************
To unsubscribe, send an e-mail to majordomo@xxxxxxxxxxxxx with
unsubscribe libevent-users    in the body.
***********************************************************************