
Re: [Libevent-users] Signals and priority queues



I've been digging further into this, and I believe I have much of it resolved now. However, I have encountered a problem that appears to be something in libevent itself.

I configured libevent with debug enabled, and turned it on at execution - and was barraged by:

[warn] select: Invalid argument

Digging further into the reason, I found that the message comes from the following code in select_dispatch (file select.c):

	res = select(nfds, sop->event_readset_out,
	    sop->event_writeset_out, NULL, tv);

	EVBASE_ACQUIRE_LOCK(base, th_base_lock);

	check_selectop(sop);

	if (res == -1) {
		if (errno != EINTR) {
			event_warn("select");
			return (-1);
		}

		return (0);
	}

The timeout value supplied to select_dispatch is being corrupted after the first pass through the routine: it arrives as {0, 0} the first time, but holds an illegal value on every call thereafter. Resetting the timeout to the original value resolves the problem.
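
For reference, here is a minimal sketch, independent of libevent, of how select() itself reacts to an out-of-range timeout and of the "reset the timeout" workaround. The helper names try_select and try_select_copy are hypothetical and exist only for this illustration. The sketch assumes a POSIX system where a negative tv_sec is rejected with EINVAL (the same errno behind the "Invalid argument" warning), and it does not attempt to show where the corruption in the real program comes from.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/select.h>
#include <sys/time.h>

/* Hypothetical helper: call select() with no descriptors and the given
 * timeout. Returns 0 on success, or the errno value if select() fails. */
static int try_select(struct timeval *tv)
{
	int res = select(0, NULL, NULL, NULL, tv);

	return (res == -1) ? errno : 0;
}

/* Hypothetical workaround: pass select() a private copy of the timeout,
 * so the caller's struct is never touched by the call. Some platforms
 * (Linux, for one) update the timeval that select() is given. */
static int try_select_copy(const struct timeval *orig)
{
	struct timeval tmp = *orig;

	return try_select(&tmp);
}
```

Calling try_select() with {0, 0} should return 0 immediately, while a timeval with tv_sec set to -1 should come back as EINVAL. try_select_copy() leaves the caller's timeval untouched, which amounts to restoring the original value before every call.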

Obviously, removing debug "quiets" the message barrage - but I wonder if something else is going on here, or if there is a bug in libevent itself?

Thanks
Ralph


On Jan 6, 2012, at 12:47 PM, Ralph Castain wrote:

> Afraid I'm going to have to eat my words here, Nick. It looks like something is going on in the code - not entirely sure just where yet (mine or libevent). I've installed a clean version of 2.0.13 (removing everything but the glue) into OMPI, and the problems persist. I've also tried converting to a true fd-based event using pipes, and get the identical behavior.
> 
> I'm going to spend some more time over the weekend looking at this before begging more of your time on it. I'm hoping to pin it down a little more for you, or at least provide an updated reproducer.
> 
> Thanks again
> Ralph
> 
> On Jan 6, 2012, at 8:15 AM, Ralph Castain wrote:
> 
>> 
>> On Jan 6, 2012, at 7:02 AM, Nick Mathewson wrote:
>> 
>>> On Fri, Jan 6, 2012 at 7:24 AM, Ralph Castain <rhc@xxxxxxxxxxxx> wrote:
>>>> If it helps, I have now confirmed that I *can* activate the t2 event during the t1func callback in my example *provided* I called event_assign on it prior to entering event_base_loop. It is also okay for me to event_add the t2 event during the callback - I am simply not allowed to event_assign *and* activate it there.
>>>> 
>>>> However, it is okay to assign the event during the callback so long as I don't activate it until after I return.
>>>> 
>>>> Seems a little strange to me - is this the intended behavior?
>>> 
>>> Well, no, of course not.
>>> 
>>> Looking at your code, the only weird thing I see at first glance is
>>> that you are calling event_add() on t1 and t2 -- you shouldn't be
>>> doing that.  event_add() is only for events that you want libevent to
>>> poll or wait for, but waiting for EV_WRITE on fd -1 isn't
>>> well-defined.  If you want to activate them yourself with
>>> event_active(), there's no need to event_add() them.
>>> 
>>> That shouldn't be causing this problem, though, I think.  (Unless it is?)
>> 
>> BINGO! Indeed, event_add was the source of the trouble. My bad for not understanding when event_add was required.
>> 
>>> 
>>> I just tried your test programs, though, and they worked okay for me
>>> on OSX and linux, using Libevent 2.0.13-stable and Libevent
>>> 2.0.14-stable.
>>> 
>>> What platform are you running your tests on?  Have you tried other
>>> platforms too?  Does the outcome depend on which libevent backend is
>>> in use?  Have you tried this with an unpatched Libevent, just to
>>> confirm that it's not introduced by any openmpi patches?
>> 
>> FWICT, it is a corruption issue, and so it does indeed depend on platform and backend - just a question of what memory location gets trounced.
>> 
>> FWIW, I was conducting my tests on OSX and linux as well, using OMPI with 2.0.13 underneath. I think the difference in our results is due to the location issue - I suspect that you might also hit a problem if we continued chaining events long enough, but I haven't confirmed it.
>> 
>> Also fwiw: the OMPI changes are confined to configuration/Makefile areas - we actually don't fiddle with the libevent code itself other than a couple of places where we test for stdbool.h before including it.
>> 
>> Thanks Nick!
>> Ralph
>> 
>>> 
>>> yrs,
>>> -- 
>>> Nick
>>> ***********************************************************************
>>> To unsubscribe, send an e-mail to majordomo@xxxxxxxxxxxxx with
>>> unsubscribe libevent-users    in the body.
>> 
> 
