Re: [Libevent-users] Signals and priority queues
I've been digging further into this, and I believe I have much of it resolved now. However, I have encountered a problem that appears to be something in libevent itself.
I configured libevent with debug enabled, and turned it on at execution - and was barraged by:
[warn] select: Invalid argument
Digging further into the reason, I found that the message comes from the following code in select_dispatch (file select.c):
    res = select(nfds, sop->event_readset_out,
        sop->event_writeset_out, NULL, tv);
    EVBASE_ACQUIRE_LOCK(base, th_base_lock);
    check_selectop(sop);
    if (res == -1) {
        if (errno != EINTR) {
            event_warn("select");
            return (-1);
        }
        return (0);
    }
The timeout value supplied to select_dispatch is being corrupted after the first pass through the routine: it arrives as {0, 0} the first time, but holds an illegal value on every later call. Resetting the timeout to its original value before each call resolves the problem.
Obviously, removing debug "quiets" the message barrage - but I wonder if something else is going on here, or if there is a bug in libevent itself?
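A minimal sketch of the workaround described above (resetting the timeout before every call), written against plain POSIX select() rather than libevent internals. The helper name poll_once is hypothetical and is not part of libevent. It rests on two assumptions: that select() may modify the timeval it is given (Linux does this), and that select() fails with EINVAL when the timeval is invalid, for example a negative field or tv_usec outside the valid range. Passing a private copy on each call keeps the caller's value intact either way. It does not explain where the corruption seen here comes from.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/select.h>
#include <sys/time.h>

/* Hypothetical helper (not a libevent function): call select() with a
 * private copy of the caller's timeout, so the caller's timeval is never
 * modified and never reused in a modified state.
 * Returns the number of ready descriptors, 0 on timeout or EINTR,
 * and -1 on any other error. */
static int poll_once(int nfds, fd_set *rd, fd_set *wr,
                     const struct timeval *timeout)
{
    struct timeval tv_copy;
    struct timeval *tvp = NULL;   /* NULL means "block indefinitely" */
    int res;

    if (timeout != NULL) {
        tv_copy = *timeout;       /* fresh copy for this call only */
        tvp = &tv_copy;
    }

    res = select(nfds, rd, wr, NULL, tvp);
    if (res == -1) {
        if (errno != EINTR) {
            fprintf(stderr, "select: %s\n", strerror(errno));
            return -1;
        }
        return 0;                 /* interrupted by a signal: not an error */
    }
    return res;
}
```

Calling poll_once(0, &rd, NULL, &tv) in a loop with an empty fd_set and tv = {0, 0} should return 0 on every iteration and leave tv unchanged, whereas a timeval that has been overwritten with garbage would reproduce the "Invalid argument" warning.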
Thanks
Ralph
On Jan 6, 2012, at 12:47 PM, Ralph Castain wrote:
> Afraid I'm going to have to eat my words here, Nick. It looks like something is going on in the code - not entirely sure just where yet (mine or libevent). I've installed a clean version of 2.0.13 (removing everything but the glue) into OMPI, and the problems persist. I've also tried converting to a true fd-based event using pipes, and get the identical behavior.
>
> I'm going to spend some more time over the weekend looking at this before begging more of your time on it. I'm hoping to pin it down a little more for you, or at least provide an updated reproducer.
>
> Thanks again
> Ralph
>
> On Jan 6, 2012, at 8:15 AM, Ralph Castain wrote:
>
>>
>> On Jan 6, 2012, at 7:02 AM, Nick Mathewson wrote:
>>
>>> On Fri, Jan 6, 2012 at 7:24 AM, Ralph Castain <rhc@xxxxxxxxxxxx> wrote:
>>>> If it helps, I have now confirmed that I *can* activate the t2 event during the t1func callback in my example *provided* I called event_assign on it prior to entering event_base_loop. It is also okay for me to event_add the t2 event during the callback - I am simply not allowed to event_assign *and* activate it there.
>>>>
>>>> However, it is okay to assign the event during the callback so long as I don't activate it until after I return.
>>>>
>>>> Seems a little strange to me - is this the intended behavior?
>>>
>>> Well, no, of course not.
>>>
>>> Looking at your code, the only weird thing I see at first glance is
>>> that you are calling event_add() on t1 and t2 -- you shouldn't be
>>> doing that. event_add() is only for events that you want libevent to
>>> poll or wait for, but waiting for EV_WRITE on fd -1 isn't
>>> well-defined. If you want to activate them yourself with
>>> event_active(), there's no need to event_add() them.
>>>
>>> That shouldn't be causing this problem, though, I think. (Unless it is?)
>>
>> BINGO! Indeed, event_add was the source of the trouble. My bad for not understanding when event_add was required.
>>
>>>
>>> I just tried your test programs, though, and they worked okay for me
>>> on OSX and linux, using Libevent 2.0.13-stable and Libevent
>>> 2.0.14-stable.
>>>
>>> What platform are you running your tests on? Have you tried other
>>> platforms too? Does the outcome depend on which libevent backend is
>>> in use? Have you tried this with an unpatched Libevent, just to
>>> confirm that it's not introduced by any openmpi patches?
>>
>> FWICT, it is a corruption issue, and so it does indeed depend on platform and backend - just a question of what memory location gets trounced.
>>
>> FWIW, I was conducting my tests on OSX and linux as well, using OMPI with 2.0.13 underneath. I think the difference in our results is due to the location issue - I suspect that you might also hit a problem if we continued chaining events long enough, but I haven't confirmed it.
>>
>> Also fwiw: the OMPI changes are confined to configuration/Makefile areas - we actually don't fiddle with the libevent code itself other than a couple of places where we test for stdbool.h before including it.
>>
>> Thanks Nick!
>> Ralph
>>
>>>
>>> yrs,
>>> --
>>> Nick
>>> ***********************************************************************
>>> To unsubscribe, send an e-mail to majordomo@xxxxxxxxxxxxx with
>>> unsubscribe libevent-users in the body.
>>
>