
[Libevent-users] Buffer event race condition

To: libevent-users <libevent-users@xxxxxxxx>
Subject: [Libevent-users] Buffer event race condition
From: Kevin Bowling <kevin.bowling@xxxxxxxxxx>
Date: Mon, 20 Dec 2010 01:32:50 -0500

I'm at my wit's end with a libevent threading bug. As part of a client-disconnect routine, I manually call the errorcb with EVENT_ERROR_EOF. The idea was to keep all the cleanup code in one callback, but I'm beginning to think this was ill-conceived. Somehow a worker thread ends up in a condition wait inside libevent, and it looks like the event base never releases the mutex that goes with that condition variable.

The worker threads receive a *bev and *ctx pointer through a mutex-protected singly linked list. When a client disconnects, a reaper runs in the errorcb and removes any pending work entries that match that *bev pointer. The reaper and the workers are mutually exclusive, so a worker should never consume a null *bev. My guess is that the errorcb is being called twice.
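
To make the description above concrete, the work list and reaper follow roughly this pattern. This is a minimal sketch and not the actual craftd code: the names work_item, work_queue, wq_init, wq_push, wq_pop and wq_reap are hypothetical, the list is LIFO for brevity, and bev/ctx are plain void * so the example builds without libevent.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical work entry. In the real program bev would be a
 * struct bufferevent * and ctx the per-client context. */
struct work_item {
    void *bev;
    void *ctx;
    struct work_item *next;
};

/* Singly linked list guarded by one mutex. */
struct work_queue {
    pthread_mutex_t lock;
    struct work_item *head;
};

static void wq_init(struct work_queue *q)
{
    assert(q != NULL);
    pthread_mutex_init(&q->lock, NULL);
    q->head = NULL;
}

/* Queue one entry at the head. Returns 0 on success, -1 if malloc fails. */
static int wq_push(struct work_queue *q, void *bev, void *ctx)
{
    struct work_item *it = malloc(sizeof(*it));
    if (it == NULL)
        return -1;
    it->bev = bev;
    it->ctx = ctx;

    pthread_mutex_lock(&q->lock);
    it->next = q->head;
    q->head = it;
    pthread_mutex_unlock(&q->lock);
    return 0;
}

/* Worker side: take one entry. Returns 1 and fills *bev and *ctx,
 * or returns 0 if the list is empty. */
static int wq_pop(struct work_queue *q, void **bev, void **ctx)
{
    pthread_mutex_lock(&q->lock);
    struct work_item *it = q->head;
    if (it != NULL)
        q->head = it->next;
    pthread_mutex_unlock(&q->lock);

    if (it == NULL)
        return 0;
    *bev = it->bev;
    *ctx = it->ctx;
    free(it);
    return 1;
}

/* Reaper side: drop every pending entry that refers to bev.
 * Returns the number of entries removed. */
static int wq_reap(struct work_queue *q, const void *bev)
{
    int removed = 0;

    pthread_mutex_lock(&q->lock);
    struct work_item **pp = &q->head;
    while (*pp != NULL) {
        struct work_item *it = *pp;
        if (it->bev == bev) {
            *pp = it->next;   /* unlink, then free */
            free(it);
            removed++;
        } else {
            pp = &it->next;
        }
    }
    pthread_mutex_unlock(&q->lock);
    return removed;
}
```

In this sketch the mutex only covers entries that are still on the list. An entry that a worker has already taken with wq_pop is out of the reaper's reach, so the worker keeps using that bev after the lock is released.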

It may be a long shot asking such a broad question, but any advice is appreciated.

Here are the backtraces of the worker thread (6) and the event base thread (1):

Thread 6 (Thread 0xf5de5b70 (LWP 25688)):
#0 0xffffe430 in __kernel_vsyscall ()
#1 0xf7f6d625 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib/libpthread.so.0
#2 0xf7fd3cf0 in evthread_posix_cond_wait (_cond=0x8061018, _lock=0x8060598,
    tv=0x0) at evthread_pthread.c:156
#3 0xf7fa15f5 in debug_cond_wait (_cond=0x8061018, _lock=0x80606c0, tv=0x0)
    at evthread.c:222
#4 0xf7f9f14f in event_del_internal (ev=0x8061558) at event.c:2142
#5 0xf7f9efdc in event_del (ev=0x8061558) at event.c:2110
#6 0xf7faa75b in be_socket_destruct (bufev=0x8061508)
    at bufferevent_sock.c:583
#7 0xf7fa8edc in _bufferevent_decref_and_unlock (bufev=0x8061508)
    at bufferevent.c:601
#8 0xf7fa9128 in bufferevent_free (bufev=0x8061508) at bufferevent.c:659
#9 0x0805365e in errorcb (bev=0x8061508, error=16, ctx=0x8061410)
    at craftd.c:173
#10 0x0805556f in packetdecoder (pkttype=255 '\377', pktlen=11, bev=0x8061508,
    player=0x8061410) at network/decoder.c:292
#11 0x080549f4 in run_worker (arg=0xffffd238) at network/worker.c:145
#12 0xf7f6893e in start_thread () from /lib/libpthread.so.0
#13 0xf7edcd9e in clone () from /lib/libc.so.6

Thread 1 (Thread 0xf7dea6c0 (LWP 25681)):
#0 0xffffe430 in __kernel_vsyscall ()
#1 0xf7f70319 in __lll_lock_wait () from /lib/libpthread.so.0
#2 0xf7f6b5c2 in _L_lock_543 () from /lib/libpthread.so.0
#3 0xf7f6b455 in pthread_mutex_lock () from /lib/libpthread.so.0
#4 0xffffcfd8 in ?? ()
#5 0xf7dea6c0 in ?? ()
#6 0x08060468 in ?? ()
#7 0xf7fd4ff4 in ?? () from /usr/local/lib/libevent_pthreads-2.0.so.5
#8 0xf7fa99f0 in bufferevent_socket_outbuf_cb (buf=Cannot access memory at address 0x6459
) at bufferevent_sock.c:119
Backtrace stopped: previous frame inner to this frame (corrupt stack?)

If you're willing to swim through some spaghetti code, the errorcb is on line 125:
https://github.com/kev009/craftd/blob/781c2ec4503fe1c96fce23f29239226019f44d81/src/craftd.c#L125

Regards,
Kevin

Follow-Ups:
- Re: [Libevent-users] Buffer event race condition
  - From: Mark Ellzey
