
Re: [Libevent-users] Clarify new behavior?



It probably isn't worth using a lot of your time to pursue this as there are ways for us to get around it, and frankly, we probably should anyway.

Set aside the details of that specific scenario. The issue really centers around progressing the event library while already inside an event handler. In the prior release series, this was allowed, although perhaps not intentionally.

It appears that the new warning plus error return was intended to ensure that people don't do this any more, and I can see some of the issues. I think there are ways to resolve those problems (e.g., locking only active events instead of the entire base), but they also involve (possibly substantial) overhead.

Remember that we didn't use to have different event bases, so our code was written to work within a single base. While we can and will revise the code to use multiple bases, that does pose its own issues - e.g., tracking which events are sitting on which bases so we know what to progress when necessary. In a very complex, dispersed code base such as OMPI, this can become difficult.

What we will likely do instead is a hybrid where we use only a couple of event bases, and then rework the code so that the actual work is done in "progress_callback" functions. In other words, we loop the event lib, let each triggered event handler collect any messages onto the appropriate message lists, leave the event handler, and then cycle through a list of registered functions that process any data on their respective lists. Thus, if those functions need to progress the event lib, they can do so from outside of an event handler.
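
Roughly, the hybrid could look like the sketch below. It is plain C with no libevent calls: loop_once() is only a hypothetical stand-in for event_loop(LOOP_ONCE) with a reentrancy check, and every name in it (msg_list_t, on_message, progress, last_nested_rc) is an assumption made for illustration, not Open MPI or libevent code.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_MSGS 16

/* Hypothetical message list: handlers append to it, progress functions drain it. */
typedef struct {
    int items[MAX_MSGS];
    size_t count;
    int processed_sum;          /* stand-in for the "actual work" */
} msg_list_t;

typedef void (*progress_fn_t)(msg_list_t *list);

static int loop_running = 0;    /* models a reentrancy check inside the loop */
static int last_nested_rc = 1;  /* result of looping again from a progress function */

/* Event handler: only queues the message, does no real work. */
static void on_message(msg_list_t *list, int msg)
{
    assert(list->count < MAX_MSGS);
    list->items[list->count++] = msg;
}

/* Stand-in for one pass of the event loop; refuses reentrant calls. */
static int loop_once(msg_list_t *list, const int *pending, size_t n)
{
    size_t i;
    if (loop_running)
        return -1;              /* called from inside a handler */
    loop_running = 1;
    for (i = 0; i < n; i++)
        on_message(list, pending[i]);
    loop_running = 0;
    return 0;
}

/* Progress function: runs after the loop has returned, so it may loop again. */
static void progress_callback(msg_list_t *list)
{
    size_t i;
    last_nested_rc = loop_once(list, NULL, 0);  /* allowed: no handler is active */
    for (i = 0; i < list->count; i++)
        list->processed_sum += list->items[i];
    list->count = 0;
}

/* Loop once, then cycle through the registered progress functions. */
static int progress(msg_list_t *list, const int *pending, size_t n,
                    const progress_fn_t *fns, size_t nfns)
{
    size_t i;
    if (loop_once(list, pending, n) != 0)
        return -1;
    for (i = 0; i < nfns; i++)
        fns[i](list);
    return 0;
}
```

The point of the split is that loop_running is already cleared by the time any progress function runs, so a progress function that needs to wait on another event can call the loop without tripping the check.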

So I think we have a way of making it work. I was mainly just wanting to confirm the change in behavior before we embark on the rewrite, and not really asking libevent to change something.

HTH
Ralph

On Oct 26, 2010, at 8:16 PM, Nick Mathewson wrote:

> On Sat, Oct 23, 2010 at 5:14 AM, Ralph Castain <rhc@xxxxxxxxxxxx> wrote:
>> Hi folks
>> 
>> I successfully updated our libevent integration in Open MPI, but have encountered a problem with one use-case that used to work and now doesn't. Before proceeding to devise a fix, I just wanted to confirm that I accurately understand the issue.
>> 
>> The problem arises from this scenario:
> 
> Hi, Ralph!  I'm going over this again to try to figure out what to do.
> I think that the short term answer, since you're already shipping a
> patched libevent, and you aren't calling event_base_dispatch on a
> single base from more than one thread at once, is for you to remove the
> entire "if" block that checks for reentrant invocation, warns, and
> returns.  If the behavior you get now works for you, that's probably a
> fine workaround for now.  I am pretty sure that it was never actually
> planned to work as it does, but there's no sense in you rewriting your
> code for future 2.1 semantics until they are nailed down.  (And
> there's not much chance of the semantics of reentrant event_base_loop
> invocation getting settled in a 2.0 timeframe IMO).
> 
> 
> That said, I want to ask you a few questions about your use case to
> see if this is actually the best way for Libevent to do what you need,
> or if there's some other piece of functionality that could let you
> implement what you want more cleanly.
> 
>> 
>> 1. we receive a command via a message that we receive in a file descriptor event. We "push" the command message into a timer event (duration zero time) to help break a threading issue, and then return from the file descriptor event.
> 
> So I'm confused here.  You say "to break a threading issue", but in a
> later message you say "After all, we are running single-threaded".
> 
> Also, I'm assuming you know about event_active() and dummy events
> (fd=-1, events=0), and that you're using this zero-duration timeout
> trick for some other reason.  Why, and should there be a better way to
> do that?
> 
>> 2. the event library is called with LOOP_ONCE, causing the timer event to fire.
>> 
>> 3. from within the timer event, the command causes us to execute a procedure that results in us having to wait for another event to occur. We "block" in that position, running a loop that includes a call to progress the event library (i.e., a call to event_loop(LOOP_ONCE)).
> 
> But, why with the same event_base?  That's the part that confuses me.
> When you block in the callback invoked in 3, you stop executing _all_
> other active event callbacks that might be waiting to execute.  Then
> later the first time you call event_loop once, you run them.  Was
> that what you had in mind?  I am not getting the architecture here.
> Maybe some pseudocode would make me understand. :/
> 
> yrs,
> -- 
> Nick
> ***********************************************************************
> To unsubscribe, send an e-mail to majordomo@xxxxxxxxxxxxx with
> unsubscribe libevent-users    in the body.
