Re: [Libevent-users] a dead looping bug when changing system time backward



On Mon, Apr 9, 2012 at 11:44 PM, <zhuoyihuang@xxxxxxxxx> wrote:
> Hi all,
>
> I found an issue where the timer callback fires rapidly, like an
> endless loop, after the system time is moved backward by a year
> and then moved forward to the correct time.
> Of course, I had to change the system time for other reasons at that
> point. I made a test case for each of libevent versions 2.0.10,
> 2.0.12, and 2.0.17; unfortunately, all of them showed the same
> issue when the system time was changed in the way I mentioned. These
> were not stress tests of any kind; I just wanted to reproduce and
> confirm the behavior. In addition, I did all the testing on a
> Windows platform.
>
> I appreciate the reply from Nick, to whom I reported this issue. I
> figured that we won't change the system time manually in the
> production phase of a system, for sure. But what I worry about is a
> more complicated situation in which a running machine has its
> system time synchronized with an external time source. That
> could probably cause an issue like this if the time difference is
> about 3 to 5 seconds, especially for an online game server,
> which needs a timer to drive its game logic by elapsed
> time.

So, the undesired behavior here is a consequence of how periodic
timers (that is, events with a timeout and EV_PERSIST set on them)
currently work: when such an event fires and is rescheduled, it gets
rescheduled relative to the time at which it was scheduled before, not
the current time.

For example, if a periodic timer that is supposed to go off every five
seconds is scheduled at time 0, it will fire at time 0, and then be
rescheduled for time 5, time 10, time 15, and so on.  Even if some
CPU-intensive task or timing issue causes the event to fire at time
0.3 instead of time 0, it still gets rescheduled for time 5, not time
5.3.
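
To make the difference concrete, here is a minimal sketch of the two
rescheduling policies. It is not libevent's actual code; the function
names are made up for illustration, and times are plain integers in
milliseconds.

```c
#include <stdint.h>

/* Illustrative only: these helpers are hypothetical, not libevent API.
 * All times are in milliseconds. */

/* Policy described above: the next deadline is computed from the
 * previous *scheduled* deadline, so a late callback adds no skew. */
static int64_t next_from_schedule(int64_t scheduled_ms, int64_t interval_ms)
{
    return scheduled_ms + interval_ms;
}

/* Alternative policy: the next deadline is computed from the time the
 * callback actually ran, so every delay accumulates as skew. */
static int64_t next_from_now(int64_t fired_ms, int64_t interval_ms)
{
    return fired_ms + interval_ms;
}
```

With a 5000 ms interval, an event scheduled for 0 that actually fires
at 300 is rescheduled for 5000 under the first policy and for 5300
under the second.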

This scheduling approach is usually desirable: it means that the timer
doesn't become skewed.  But, as Jeffrey has noted, it seems to act in
a way most people wouldn't want in response to big jumps forward in
time.

One kind of big jump forward is a big shift in the underlying clock.
As William Ahern notes, we can solve that by using monotonic clocks
instead of system clocks whenever possible.  But there's another kind
of big jump that happens: when a computer resumes after having been
suspended.  Some monotonic clocks treat this as a jump forward in
time, and some do not[*].

[*] http://www.python.org/dev/peps/pep-0418/

And there's a third way for libevent to see a big jump forward in
time: if the program calls event_base_loop() sporadically, it is free
to wait as long as it wants between invocations.

So, what's the right behavior for periodic events in these cases?  If
there is an event that's supposed to run every 5 seconds, and time has
jumped forward by 16 seconds, it seems reasonable to run the event 1
time, or maybe even 3 times... but if time has jumped forward by one
day, it seems unlikely that the programmer really wants us to run the
event 17280 times.
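
The arithmetic behind those numbers is just integer division of the
jump by the period. A small sketch, with a hypothetical helper name
and times in whole seconds:

```c
#include <stdint.h>

/* Hypothetical helper, not part of libevent: the number of whole
 * periods contained in a forward jump of jump_s seconds. */
static int64_t missed_firings(int64_t jump_s, int64_t interval_s)
{
    if (jump_s <= 0 || interval_s <= 0)
        return 0; /* no forward jump, or a nonsensical interval */
    return jump_s / interval_s;
}
```

Here missed_firings(16, 5) is 3, and missed_firings(86400, 5) is 17280.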

Perhaps this argues for a cap on how far into the past we should be
willing to reschedule a periodic event, or how many "missed firings"
we'll compensate for before we drop some on the floor?
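
One possible shape for such a cap, purely as a sketch: the function
name and the max_missed parameter are assumptions of mine, not an
existing libevent interface, and times are in milliseconds.

```c
#include <stdint.h>

/* Hypothetical sketch of a capped reschedule; not libevent code.
 * Assumes interval_ms > 0. max_missed is the largest backlog of
 * missed firings we are willing to catch up on. */
static int64_t next_deadline_capped(int64_t scheduled_ms, int64_t now_ms,
                                    int64_t interval_ms, int64_t max_missed)
{
    /* Normal case: stay on the original grid to avoid skew. */
    int64_t next = scheduled_ms + interval_ms;

    if (now_ms > next) {
        /* Whole periods that have already gone by past 'next'. */
        int64_t missed = (now_ms - next) / interval_ms;
        if (missed > max_missed) {
            /* Too far behind: drop the backlog on the floor and
             * restart the period from the current time. */
            return now_ms + interval_ms;
        }
    }
    return next;
}
```

With a 5 second interval and max_missed set to 3, a 16 second jump
keeps the event on its original grid, while a one-day jump reschedules
it 5 seconds from now instead of replaying the whole backlog.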


yrs,
-- 
Nick