
Re: "Your system clock just jumped" on Debian+VMware ESX



Lucky Green wrote:
> I am seeing the following errors in the Tor log:
>
> [...]
> Feb 28 04:54:43.008 [notice] Your system clock just jumped 4398 seconds
> backward; assuming established circuits no longer work.
> Feb 28 04:54:46.020 [notice] Your system clock just jumped 4399 seconds
> forward; assuming established circuits no longer work.
> [...]

How to fix this issue
=====================
After extensive testing, I have been able to confirm that the following
fix puts an end to the clock jump errors when using Debian Etch on
VMware ESX: add "notsc" to the kernel line in /boot/grub/menu.lst and
reboot.

Example Before
--------------
"kernel /vmlinuz-2.6.22-4-amd64 root=/dev/hda1 ro"

Example After
-------------
"kernel /vmlinuz-2.6.22-4-amd64 root=/dev/hda1 ro notsc"

I have not seen any adverse effects from this change.
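
To check after the reboot that the option actually took effect, something
like the following Python sketch may help. It is not part of the fix
itself. It assumes the running kernel exposes its boot options in
/proc/cmdline and, on kernels that have the generic clocksource
framework, the active clock source in the sysfs file named below. The
helper names has_kernel_param and read_text are made up for illustration.

```python
CMDLINE_PATH = "/proc/cmdline"
# Assumption: this sysfs file only exists on kernels with the generic
# clocksource framework; older kernels may not provide it.
CLOCKSOURCE_PATH = (
    "/sys/devices/system/clocksource/clocksource0/current_clocksource"
)


def has_kernel_param(cmdline, param):
    """Return True if `param` appears as a whole word in the command line."""
    return param in cmdline.split()


def read_text(path):
    """Return the stripped contents of `path`, or None if it cannot be read."""
    try:
        with open(path) as handle:
            return handle.read().strip()
    except OSError:
        return None


if __name__ == "__main__":
    cmdline = read_text(CMDLINE_PATH)
    if cmdline is None:
        print("could not read", CMDLINE_PATH)
    elif has_kernel_param(cmdline, "notsc"):
        print("notsc is present on the kernel command line")
    else:
        print("notsc is NOT present on the kernel command line")

    clocksource = read_text(CLOCKSOURCE_PATH)
    if clocksource is not None:
        print("current clock source:", clocksource)
```

If "notsc" is missing here, the edit to menu.lst most likely went into a
different boot entry than the one that was started.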

Likely Root Cause
=================
Current Linux kernels use the CPU's "Time Stamp Counter" (TSC) to
determine the time as returned by gettimeofday(). Tor uses
gettimeofday() to determine the age of a circuit. If multiple virtual
CPUs are allocated to the guest OS, the TSC of the virtual CPUs can get
wildly out of sync. As the OS switches the Tor process from one virtual
CPU to the other, the time returned by gettimeofday() will diverge
massively.
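
The jumps can in principle be observed outside Tor with a small
long-running sampler. The Python sketch below is my own illustration,
not Tor's detection code: it reads the wall clock at a fixed interval
and reports any step that deviates from that interval by more than a
tolerance. Python's time.time() is assumed here to read the same system
wall clock that gettimeofday() reports. The function names and the
default tolerance are arbitrary choices.

```python
import time


def find_clock_jumps(samples, interval, tolerance=1.0):
    """Return (index, jump) pairs for suspicious steps between samples.

    `jump` is how far the step between two consecutive samples deviates
    from the expected `interval`, in seconds. A negative value means the
    clock moved backward.
    """
    jumps = []
    for i in range(1, len(samples)):
        jump = (samples[i] - samples[i - 1]) - interval
        if abs(jump) > tolerance:
            jumps.append((i, jump))
    return jumps


def sample_wall_clock(count, interval):
    """Collect `count` wall-clock readings, sleeping `interval` between them."""
    samples = []
    for _ in range(count):
        samples.append(time.time())
        time.sleep(interval)
    return samples


if __name__ == "__main__":
    # Short run for demonstration; raise the count to watch for longer.
    interval = 0.2
    readings = sample_wall_clock(5, interval)
    for index, jump in find_clock_jumps(readings, interval):
        direction = "forward" if jump > 0 else "backward"
        print("sample %d: clock jumped %.0f seconds %s"
              % (index, abs(jump), direction))
```

Fed the readings 100, 101, -4296 and 104 at a one second interval,
find_clock_jumps reports a 4398 second backward step followed by a 4399
second forward step, the same pattern as in the log quoted above.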

Note that this is not an issue for processes that do not use the TSC to
determine time. You will not see the jumps running "date" in a loop.

For the error to occur the process has to migrate between CPUs. You will
therefore not encounter this error on uniprocessor systems.
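
One way to probe the migration hypothesis without touching the kernel
line is to pin a test process to a single CPU and see whether the jumps
stop. The Python sketch below is a diagnostic idea only, not something I
am presenting as a tested fix. It assumes Linux and Python 3.3 or later,
where os.sched_getaffinity() and os.sched_setaffinity() are available.
The helper names are made up for illustration.

```python
import os


def pick_single_cpu(allowed):
    """Return a one-element set holding the lowest-numbered allowed CPU."""
    return {min(allowed)}


def pin_to_single_cpu(pid=0):
    """Restrict `pid` (0 means the calling process) to a single CPU.

    Returns the CPU set that was applied, or None on platforms that do
    not provide the scheduler affinity calls.
    """
    if not hasattr(os, "sched_setaffinity"):
        return None
    target = pick_single_cpu(os.sched_getaffinity(pid))
    os.sched_setaffinity(pid, target)
    return target


if __name__ == "__main__":
    applied = pin_to_single_cpu()
    if applied is None:
        print("CPU affinity calls are not available on this platform")
    else:
        print("now restricted to CPU set:", sorted(applied))
```

A process pinned this way only ever reads the TSC of one virtual CPU, so
if the explanation above is right it should not see the jumps.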

How to not fix this issue
=========================
Most articles online that cover clock issues in VMware guest operating
systems will tell you to sync the time in the guest OS with the host,
recommend you install ntp in the guest, implore you not to install ntp
in the guest, or advise you to add "clock=pit nosmp noapic nolapic" to
your grub kernel line. Of those recommendations, all but the last
concern themselves with clock drift and have no effect on the clock
jumps discussed here. Adding "clock=pit nosmp noapic nolapic" will in
fact prevent the clock jumps, for the simple reason that it turns your
multiprocessor VM into a uniprocessor VM. Probably not what you had in
mind.

In addition to Tor, the popular IMAP server dovecot experiences the
exact same issue when installed in a multiprocessor guest OS on top of
a VMware ESX host.

Adding "notsc" to the kernel line in menu.lst resolves this issue for
both Tor and dovecot.

In several weeks of operation, I have not seen any adverse effects from
disabling the use of the TSC on either my Tor or IMAP servers.

Hope this helps,
--Lucky Green <shamrock@xxxxxxxxxxxxxx>