
Re: [Libevent-users] Signals and priority queues



On Jan 13, 2012, at 8:29 AM, Nick Mathewson wrote:

> On Fri, Jan 13, 2012 at 10:13 AM, Ralph Castain <rhc@xxxxxxxxxxxx> wrote:
> 
>>> What kind of illegal value are you seeing,
>> 
>> 1326467251, 774650
> 
> Okay, that looks like it's the actual current time!  I wonder why that
> would make select() give an error, though.  Maybe because the current
> time plus that many seconds exceeds a 32-bit TIME_MAX ?

Best I can tell, that is correct - select thinks that is an offset, and the result is too large.
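To make the arithmetic concrete, here is a minimal sketch of the overflow being hypothesized. It assumes select() adds the timeout to the current time and that the result must fit in a signed 32-bit time value; that is only the guess discussed above, not confirmed behavior. The helper name offset_overflows_32bit_time is made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper (not part of libevent or any libc): returns 1 if
 * now_sec + offset_sec would exceed a signed 32-bit time value, else 0.
 * The sum is computed in 64 bits so the check itself cannot overflow.
 */
int offset_overflows_32bit_time(int64_t now_sec, int64_t offset_sec)
{
    assert(now_sec >= 0 && offset_sec >= 0);
    return (now_sec + offset_sec) > (int64_t)INT32_MAX;
}
```

With the value reported above used as both the current time and the "offset", 1326467251 + 1326467251 = 2652934502, which is larger than 2147483647, so the check reports an overflow. A genuine delay of a few seconds does not.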

> 
>>> coming from where?
>> 
>> I'm not sure who calls "select_dispatch" - the value is passed into it.
> 
> The line is
>                 res = evsel->dispatch(base, tv_p);
> in event_base_loop() in event.c
> 
>>> Are you
>>> using the common_timeout code?
>> 
>> This is just flowing thru from a call to event_loop - I'm not sure of the progression that takes us down to select_dispatch.
> 
> I meant, is any part of your code calling
> event_base_init_common_timeout() ? It sounds like "no".

Nope

> 
> So, three possibilities come to mind:
>  1) Something is calling event_add with an absolute time rather than
> a number of seconds/usec to delay.
>  2) Something in Libevent is calling event_add_internal with an
> absolute time rather than a delay, and is not setting the
> tv_is_absolute flag
>  3) timeout_correct has gone crazy, and thinks that the current time
> has been reset to 0 for some reason.
> 
> Adding some assertions in event_add_internal might track this down.
> Trivially, you could do
>   if (tv && !tv_is_absolute) {
>       /* waiting one billion seconds should be enough for anyone */
>       EVUTIL_ASSERT(tv->tv_sec < 1000000000);
>   }
> 
> to try to detect 1 and 2.

Interesting. The above code never tripped, so I dug a little further and found that event_add_internal is never called with a large tv value. The problem does appear to be a race condition: sometimes the code completes and exits before I get the error condition report.

The timeout value clearly isn't a garbage value - I dumped the values out, compared to current time as of the error:

[warn] select: Invalid argument
TV OUT OF SPEC AT CNT 2: value 1326472513:976848 curtime 1326472513:977043
Ralph
[warn] select: Invalid argument
TV OUT OF SPEC AT CNT 3: value 1326472513:977327 curtime 1326472513:977413

So the value is getting updated and appears valid. What's strange is that libevent is passing an absolute time to select, which expects a relative value per the man page:

    If timeout is a non-nil pointer, it specifies a maximum interval to wait for the selection to complete.  If
    timeout is a nil pointer, the select blocks indefinitely.  To effect a poll, the timeout argument should be
    non-nil, pointing to a zero-valued timeval structure.  Timeout is not changed by select(), and may be reused
    on subsequent calls, however it is good style to re-initialize it before each invocation of select().

Any easy way I can output an identifier that would tell us something about which event is involved? I see that I'm not getting output from the event_debug calls in the code, even though I've configured with debug enabled and called:

        event_enable_debug_mode();
        event_set_debug_output(1);

Anything else required to get that output? Would it help?

> 
> -- 
> Nick
> ***********************************************************************
> To unsubscribe, send an e-mail to majordomo@xxxxxxxxxxxxx with
> unsubscribe libevent-users    in the body.
