# NTP Drifts +ve and -ve - NTP


# Thread: NTP Drifts +ve and -ve

1. ## NTP Drifts +ve and -ve

Hi,

We are using NTP4. When the CPU is very busy, some of the UDP packets are dropped by the kernel, so the local clock drifts 60 milliseconds from the time server. From that point NTP keeps drifting +ve and -ve for 2 to 3 days before it becomes stable. The graph looks like a sine wave, oscillating and reaching zero after 3 days.

My questions are:

1. Why is NTP drifting +ve and -ve?
2. Why should NTP take 3 days to correct 60 milliseconds?
3. Is this a problem, or is it expected?

Regards,
Arul

2. ## Re: NTP Drifts +ve and -ve

Arul Murugan wrote:
> Hi, We are using NTP4, when CPU is very busy some of the UDP packets
> dropped by the kernel, so the local clock drifts 60 milliseconds from

The problem is not dropped packets, but delayed packets.

> the time server. From that point NTP keeps drifts +ve and -ve for 2 to 3

Especially given the long recovery times you describe, it is more likely
that the measured time server time is drifting from its true time and
that the local clock is actually rather more accurately tracking it than
the offsets imply.

However, if the errors always start as negative slips, your problem is
not CPU overload but a device driver, often IDE run on non-DMA mode,
with poor interrupt latency.

> three days to become stable. The graph looks a like a sine wave
> oscillating and reaching zero after 3 days. My question are: 1. Why NTP
> drifting +ve and -ve? 2. Why should NTP taking 3 days for correcting 60
> milliseconds? 3. Is this a problem or it is expected? Regards, Arul

ntpd isn't designed to cope well with systematic changes; it assumes
random perturbations until convinced otherwise. The best way of dealing
with those involves running a low pass filter with a time constant
reflecting typical crystal frequency variation times.

I am surprised that it has not been convinced otherwise in this case, so
could you confirm that you are using the recommended minpoll of 6 (64
seconds) and the recommended maxpoll of 10 (1024 seconds)? Using high
values for these will compromise recovery times.

Normally this problem is caused by network congestion, not CPU overload.
To get round asymmetric link delay problems, you should configure your
routers and the corresponding ISP routers to give priority to NTP
traffic. In default of that, you should use the tinker huffpuff option.

If you really are having effects from CPU loading, you need to find
network hardware with better drivers. If you are losing clock
interrupts, you need to investigate the drivers with poor latency.

Quite a few people believe that ntpd's assumptions about measurement
error statistics are not valid in the world in which NTP is used by most
system admins, and, if you are using Linux, I'm sure Unruh will suggest
one alternative.

3. ## Re: NTP Drifts +ve and -ve

In article ,
umarul_it@hotmail.com (Arul Murugan) wrote:

> Hi, We are using NTP4, when CPU is very busy some of the UDP packets [are] dropped
> by the kernel, so the local clock drifts 60 milliseconds from the time server.

Dropped packets are quite unlikely to be the problem, even if most
packets never arrive.

More likely is that the NTP daemon is being preempted between taking the
send timestamp and the sent packet actually appearing on the wire, and
between a received packet's actual arrival time and when the daemon is able
to obtain the receipt timestamp. These preemptions appear to the daemon
as very large and random asymmetrical transport delays. If sufficiently
common, these bad observations will seep through the various filter
steps in NTP, and corrupt the measurements of clock offset error used to
update the servo.

See .
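To make the preemption effect concrete, here is a minimal sketch of the standard NTP on-wire calculation from the four timestamps of one exchange. The timestamps are hypothetical numbers chosen for illustration, not measurements from the original poster's system, and the function name is made up for this example.

```python
def ntp_offset_delay(t1, t2, t3, t4):
    """Standard NTP on-wire calculation (all times in seconds).

    t1: client send, t2: server receive,
    t3: server send, t4: client receive.
    """
    offset = ((t2 - t1) + (t3 - t4)) / 2.0
    delay = (t4 - t1) - (t3 - t2)
    return offset, delay


# Hypothetical exchange: clocks agree, 5 ms each way, 1 ms server turnaround.
clean = ntp_offset_delay(0.000, 0.005, 0.006, 0.011)

# Same exchange, but the client daemon is preempted for 120 ms before it
# can read the receive timestamp, so t4 is recorded 120 ms late.
late = ntp_offset_delay(0.000, 0.005, 0.006, 0.131)

print("clean: offset %.3f s, delay %.3f s" % clean)
print("late:  offset %.3f s, delay %.3f s" % late)
```

In this sketch the extra latency on one leg shows up as an apparent offset of half its size, even though the two clocks agree: a one-sided delay is indistinguishable from a clock error in a single exchange.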

What computer platform and operating system are you using?

One classic solution is to give the NTP daemon sufficient realtime
priority to outrank whatever else the CPU is doing, thus sharply
reducing the fraction of NTP polls that suffer preemption.

This raised priority will not cause those other activities to be any
slower because the NTP daemon is an insignificant consumer of CPU
resources.

> From that point, NTP keeps drifts +ve and -ve for 2 to 3 three days to
> become stable. The graph looks a like a sine wave oscillating and reaching
> zero after 3 days. My question are:
>
> 1. Why [is] NTP drifting +ve and -ve?

Because the clock servo is being fed contaminated data, as explained
above.

> 2. Why should NTP [be] taking 3 days for correcting 60 milliseconds?

Because it takes NTP days versus a few hours to slog through all that

> 3. Is this a problem or it is expected?

Both. It is a problem for sure, but is to be expected under these
circumstances.

Joe Gwinn

4. ## Re: NTP Drifts +ve and -ve

David Woolley writes:

>Arul Murugan wrote:
>> Hi, We are using NTP4, when CPU is very busy some of the UDP packets
>> dropped by the kernel, so the local clock drifts 60 milliseconds from

>The problem is not dropped packets, but delayed packets.

>> the time server. From that point NTP keeps drifts +ve and -ve for 2 to 3

>Especially given the long recovery times you describe, it is more likely
>that the measured time server time is drifting from its true time and
>that the local clock is actually rather more accurately tracking it than
>the offsets imply.

>However, if the errors always start as negative slips, your problem is
>not CPU overload but a device driver, often IDE run on non-DMA mode,
>with poor interrupt latency.

>> three days to become stable. The graph looks a like a sine wave
>> oscillating and reaching zero after 3 days. My question are: 1. Why NTP
>> drifting +ve and -ve? 2. Why should NTP taking 3 days for correcting 60
>> milliseconds? 3. Is this a problem or it is expected? Regards, Arul

You do not state what you regard as "correcting 60 ms". Is it getting the
error down to 1ms? 10ms?

>ntpd isn't designed to cope with systematic changes well, it assumes
>random perturbations until convinced otherwise, The best way of dealing
>with those involves running a low pass filter with a time constant
>reflecting typical crystal frequency variation times.

>I am surprised that it has not been convinced otherwise in this case, so
>could you confirm that you are using the recommended minpoll of 6 (64
>seconds) and the recommended maxpoll of 10 (1024 seconds). Using high
>values for these will compromise recovery times.

>Normally this problem is caused by network congestion, not CPU overload.
> To get round asymmetric link delay problems, you should configure your
>routers and the corresponding ISP routers to give priority to NTP
>traffic. In default of that, you should use the tinker huffpuff option.

>If you really are having effects from CPU loading, you need to find
>network hardware with better drivers. If you are losing clock
>interrupts, you need to investigate the drivers with poor latency.

>Quite a few people believe that ntpd's assumptions about measurement
>error statistics are not valid in the world in which NTP is used by most
>system admins, and, if you are using Linux, I'm sure Unruh will suggest
>one alternative.

Sure, why not: chrony. It responds to change much faster than ntp does
while still maintaining good long-term stability. ntp is designed as a
simple feedback loop. To keep the loop stable, the time scale of the loop
is set very long (about 8-16 poll intervals, because ntp tends to throw
away about 7/8 of the incoming data in order to try to eliminate network
delay errors as much as possible). Since ntp tends to operate on poll 10,
which is roughly 20 min, this gives a feedback loop time scale of about 5-10 hrs.
I.e., the error is reduced to about 1/e (roughly 37%) of its value every
5-10 hrs. (Actually, since it is a second order critically damped system,
this is not really accurate. The correction goes to zero faster than that,
overshoots by something like 20% and then comes back to zero.) But 3 days
sounds like a very long time unless you are using very long poll intervals.
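As a rough check on that last point, the sketch below treats the loop as a plain first-order exponential decay with the 5-10 hr time scale quoted above. That is a simplifying assumption (the post itself notes the real loop is second order), and the 10 ms and 1 ms targets are just the two thresholds asked about earlier in this reply.

```python
import math


def time_to_settle(initial_ms, target_ms, tau_hours):
    """Hours for a first-order exponential decay with time constant
    tau_hours to shrink an error of initial_ms down to target_ms."""
    return tau_hours * math.log(initial_ms / target_ms)


# 60 ms initial error; targets and time constants taken from the post.
for target in (10.0, 1.0):
    for tau in (5.0, 10.0):
        hours = time_to_settle(60.0, target, tau)
        print(f"60 ms -> {target:g} ms with tau = {tau:g} h: {hours:.1f} h")
```

Under this simplified model, getting from 60 ms to 1 ms takes about four time constants, i.e. somewhere between roughly 20 and 41 hours, which is consistent with 3 days looking long.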

chrony does a linear fit to the past data (corrected for the clock
corrections), testing to see if the errors are random or consistently
changing, and lowering the time scale over which the slope and offset are
determined in the latter case, i.e. it has a constantly adjusted Allan
variance minimum. When the noise is random, long times are used to beat
down the statistical noise. When it is consistently off, it shortens the
scale to allow it to respond rapidly to clock frequency drifts. It also
tries to eliminate the offset, as determined by the fit, much faster than
ntp does. Very different philosophies. ntp tends to have larger offset
variances and maybe slightly smaller frequency variances.

5. ## Re: NTP Drifts +ve and -ve

"Unruh" wrote in message
news:yoWqk.8851\$%b7.4796@edtnps82...

> [...] (actually since it is
> a second order critically damped system, this is not really accurate. The
> correction action goes to zero faster than that, overshots by something
> like 20% and then comes back to zero). ...

damped system? ISTR critical damping being defined as not overshooting.

Groetjes,
Maarten Wiltink

6. ## Re: NTP Drifts +ve and -ve

"Maarten Wiltink" writes:

>"Unruh" wrote in message
>news:yoWqk.8851\$%b7.4796@edtnps82...

>> [...] (actually since it is
>> a second order critically damped system, this is not really accurate. The
>> correction action goes to zero faster than that, overshots by something
>> like 20% and then comes back to zero). ...

>damped system? ISTR critical damping being defined as not overshooting.

A critically damped system is one whose solution is (A+Bt) e^(-gt).
If B is negative (taking A positive) it overshoots. If B is 0 it approaches
0 as rapidly as possible. If B is positive it may actually increase before
it decreases. My vague recollection is that the parameters were chosen for
ntp to be critically damped, but the initial conditions are in general such
that B is negative. An underdamped system will always have oscillations
(infinitely many, but decreasing in amplitude). An overdamped system can
overshoot as well, but has the problem that in general it approaches
equilibrium more slowly than a critically damped one.
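The dependence on initial conditions can be checked numerically. The sketch below uses the solution form given above, x(t) = (A + Bt) e^(-gt), and derives A and B from an initial value x0 and initial rate v0 (so A = x0 and B = v0 + g*x0). It models a generic critically damped system, not ntpd's loop, and the numbers are arbitrary.

```python
import math


def critically_damped_coeffs(x0, v0, g):
    """A and B in x(t) = (A + B*t) * exp(-g*t), given x(0) = x0, x'(0) = v0."""
    return x0, v0 + g * x0


def x_at(a, b, g, t):
    """Evaluate x(t) = (A + B*t) * exp(-g*t)."""
    return (a + b * t) * math.exp(-g * t)


def zero_crossing_time(a, b):
    """Time t > 0 at which x(t) crosses zero, or None if it never does."""
    if b == 0 or -a / b <= 0:
        return None
    return -a / b


g = 1.0
# Released from rest at +60: B is positive, so no overshoot.
a, b = critically_damped_coeffs(60.0, 0.0, g)
print("from rest:      B =", b, "crossing at", zero_crossing_time(a, b))

# Already moving toward zero at rate -90: B is negative, so it overshoots.
a, b = critically_damped_coeffs(60.0, -90.0, g)
t_cross = zero_crossing_time(a, b)
print("moving inwards: B =", b, "crossing at", t_cross)
print("value one time unit after the crossing:", x_at(a, b, g, t_cross + 1.0))
```

So both statements in this subthread can hold: a critically damped system released from rest does not overshoot, while one that starts with enough inward velocity crosses zero exactly once.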

>Groetjes,
>Maarten Wiltink

7. ## Re: NTP Drifts +ve and -ve

Bill,

Not quite. The loop filter is a second-order polynomial with damping
factor as described in rfc-1305, the web documentation and my book. The
coefficients are chosen for a slightly underdamped characteristic
yielding an overshoot of about 7 percent at all poll intervals.

Dave
