
Re: [Libevent-users] Race Condition resulting in busy loop



Hi Azat, 

1. I apologize for sending html formatted content to the list. My bad!
2. Thank you for the pointer on event_add() and event_assign() without a corresponding event_del(). On further review of my code, I see that I'm calling evtimer_add() in the callback of the timer event itself without first checking whether the event is pending. I'm fixing that now.
However, if that were the root cause of the infinite loop, I would have expected ev_evcallback.evcb_cb_union.evcb_callback to point to my custom callback function rather than to "be_openssl_handshakeeventcb". Is my understanding correct, or is this happening because adding a timer without deleting it or checking whether it was pending led to this final corrupt state, or, as you say, to "strange, hard-to-diagnose bugs"?

On Fri, Mar 3, 2017 at 1:30 AM, Azat Khuzhin <a3at.mail@xxxxxxxxx> wrote:
On Wed, Mar 1, 2017 at 8:51 PM, Sanjiv <sanjiv.raj@xxxxxxxxx> wrote:
>
> Hi,

Hello, please send regular plain-text messages (without any html).

...

> a) libevent was built with the NDEBUG flag. I've already had one unit test fail because of this and have subsequently switched to releasing a newer build without NDEBUG. However, some clients picked up the library built with NDEBUG.

Which one?

> b) An https request callback was invoked with "evhttp_request_get_response_code" returning 0. (This was a call to S3 in US-EAST-1 during the outage on 02/28/2017, so I don't have many details on how to recreate this issue, given that AWS is keeping quiet about it.)
>
> The following is a snippet from the gdb stack trace.

...

> As you can observe, the ev_timeout of the event obtained from the priority queue is always earlier than the current time, so the loop never breaks. I have interrupted the process multiple times, and it is always inside timeout_process in the stack trace and always using the same "ev" event address from the priority queue.

I suspect the following pattern (you can find more information about it at [1]):
- event_add()
- event_assign() # without event_del()

[1] http://libevent-users.monkey.narkive.com/GsU2FrYK/infinite-loop-in-evmap-io-active

If so, bad things will happen. From the event_assign() documentation:
  Note that it is NOT safe to call this function on an event that is
  active or pending.  Doing so WILL corrupt internal data structures in
  Libevent, and lead to strange, hard-to-diagnose bugs.  You _can_ use
  event_assign to change an existing event, but only if it is not active
  or pending!
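
The pattern above can be modeled with a small self-contained sketch. This
is a toy model in plain C, not libevent code: every toy_* name is
hypothetical, and it assumes that reinitializing an event clears the flag
that the removal step checks. Whether that is exactly what happened in this
report is not confirmed by the thread. It shows how a queued entry that no
longer looks pending can keep a timeout loop spinning, and why deleting the
event before reassigning it avoids that.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model only: these types and functions are hypothetical and are
 * not libevent's real structures or API. */
#define TOY_CAP 8
#define TOY_PENDING 1

struct toy_event {
    long timeout;  /* absolute expiry time */
    int flags;     /* TOY_PENDING while the event sits in the queue */
};

struct toy_queue {
    struct toy_event *slot[TOY_CAP];
    size_t len;
};

/* Return the queued event with the earliest timeout, or NULL. */
struct toy_event *toy_top(struct toy_queue *q)
{
    struct toy_event *best = NULL;
    for (size_t i = 0; i < q->len; i++)
        if (best == NULL || q->slot[i]->timeout < best->timeout)
            best = q->slot[i];
    return best;
}

/* Queue the event and mark it pending. */
void toy_add(struct toy_queue *q, struct toy_event *ev, long timeout)
{
    assert(q->len < TOY_CAP);
    ev->timeout = timeout;
    ev->flags |= TOY_PENDING;
    q->slot[q->len++] = ev;
}

/* Unqueue the event, but only if its flags say it is queued. */
void toy_del(struct toy_queue *q, struct toy_event *ev)
{
    if (!(ev->flags & TOY_PENDING))
        return;
    for (size_t i = 0; i < q->len; i++) {
        if (q->slot[i] == ev) {
            q->slot[i] = q->slot[--q->len];
            break;
        }
    }
    ev->flags &= ~TOY_PENDING;
}

/* Reinitialize the event without looking at the queue (the unsafe step). */
void toy_assign(struct toy_event *ev)
{
    ev->flags = 0;
}

/* Drain expired events. Returns how many were removed, or -1 if the loop
 * exceeded max_iters, which stands in for spinning forever. */
int toy_process(struct toy_queue *q, long now, int max_iters)
{
    struct toy_event *ev;
    int iters = 0;
    while ((ev = toy_top(q)) != NULL && ev->timeout <= now) {
        if (iters++ >= max_iters)
            return -1;
        toy_del(q, ev);  /* no-op when the pending flag was wiped */
    }
    return iters;
}

/* add, then process: the expired event is removed once. */
int toy_demo_clean(void)
{
    struct toy_queue q = {{0}, 0};
    struct toy_event ev = {0, 0};
    toy_add(&q, &ev, 5);
    return toy_process(&q, 10, 1000);
}

/* add, assign without del, then process: the stale entry never leaves. */
int toy_demo_corrupt(void)
{
    struct toy_queue q = {{0}, 0};
    struct toy_event ev = {0, 0};
    toy_add(&q, &ev, 5);
    toy_assign(&ev);
    return toy_process(&q, 10, 1000);
}

/* del before assign, then add again: the safe order. */
int toy_demo_safe(void)
{
    struct toy_queue q = {{0}, 0};
    struct toy_event ev = {0, 0};
    toy_add(&q, &ev, 5);
    toy_del(&q, &ev);
    toy_assign(&ev);
    toy_add(&q, &ev, 5);
    return toy_process(&q, 10, 1000);
}
```

In this model toy_demo_corrupt() hits the iteration cap because the same
entry stays at the top of the queue. With libevent itself, the
corresponding safe order would be event_del(), then event_assign(), then
event_add().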

Do you compile without EVENT__DISABLE_DEBUG_MODE?
If so, you can try calling event_enable_debug_mode() (before any events
or event bases are created); it will detect such cases.

If this does not help, please try valgrind (memcheck) or a similar tool.

  Azat.



--
Sanjiv Raj


Marvin: I've been talking to the ship's computer.
Arthur: And?
Marvin: It hates me.