peers with high offset on Intel Quad Core running Windows 2003 - NTP

  1. peers with high offset on Intel Quad Core running Windows 2003

    Hi,
    I'm consistently seeing offsets of up to +-6 ms on my Windows servers.
    On the same hardware running Linux, NTP behaves well (+-500 us offset),
    but on Windows it is bad. The same version of Windows on AMD processors
    also behaves well.

    Hardware:
    HP ProLiant DL360 G5
    * 2 Quad-Core Intel(R) Xeon(R) Processor E5450
    * 8 GB of memory

    Software:
    * Windows 2003
    * ntp-4.2.0a

    Here is ntp.conf:
    ========================
    server 192.168.10.5 burst iburst minpoll 4 maxpoll 6
    server 192.168.10.6 burst iburst minpoll 4 maxpoll 6
    server 10.10.10.10 burst iburst minpoll 4 maxpoll 6
    server 10.10.10.11 burst iburst minpoll 4 maxpoll 6
    server 10.10.10.12 burst iburst minpoll 4 maxpoll 6
    server 10.10.10.13 burst iburst minpoll 4 maxpoll 6
    server 10.10.10.14 burst iburst minpoll 4 maxpoll 6
    server 10.10.10.15 burst iburst minpoll 4 maxpoll 6
    ========================

    % ntpq -np
         remote           refid      st t when poll reach   delay   offset  jitter
    ==============================================================================
    *192.168.10.5    .GPS.            1 u   44   64  377    0.905   -4.274   0.656
    +192.168.10.6    .CDMA.           1 u    1   64  377   18.731   -6.324   1.915
    +10.10.10.10     192.168.10.5     2 u   29   64  377    0.399   -5.357   0.535
    -10.10.10.11     192.168.10.5     2 u    4   64  377    0.352   -6.036   0.175
    -10.10.10.12     192.168.10.5     2 u   34   64  377    0.397   -5.200   0.642
    -10.10.10.13     192.168.10.5     2 u    1   64  377    0.498   -5.910   1.410
    -10.10.10.14     192.168.10.5     2 u    2   64  377    0.376   -6.069   1.450
    -10.10.10.15     192.168.10.5     2 u    1   64  377    0.335   -5.132   1.465


    Has anyone seen this issue? Any advice? Thanks in advance.

  2. Re: peers with high offset on Intel Quad Core running Windows 2003

    Jimmy ntp wrote:
    > Hi,
    > I'm consistently seeing offsets between +- 6 ms on my Windows servers.


    6 ms is not a high offset. In particular, it is rather lower than you
    would expect with w32time, and better than the resolution with which
    ordinary Windows applications can obtain the time from Windows. (Also,
    offset is not the same as clock error; in the stable state, the offset
    is considerably larger than the clock error.)

    >
    > Software:
    > * Windows 2003
    > * ntp-4.2.0a


    That's quite old. I'm not sure that it has all the hacks to improve
    performance on Windows.
    >
    > Here is ntp.conf:
    > ========================
    > server 192.168.10.5 burst iburst minpoll 4 maxpoll 6


    These minpolls and maxpolls will compromise the ability to smooth over
    offset variations. Why are you not using the recommended ones?

  3. Re: peers with high offset on Intel Quad Core running Windows 2003

    David Woolley wrote:

    >
    > That's quite old. I'm not sure that it has all the hacks to improve
    > performance on Windows.


    Note that one of those hacks which may or may not be in that version is
    forcing the use of multimedia timers. This is a mixed blessing, as it
    can result in lost clock interrupts if you have high latency drivers,
    e.g. IDE in programmed mode.

  4. Re: peers with high offset on Intel Quad Core running Windows 2003

    David Woolley wrote:
    > Jimmy ntp wrote:

    []
    >> Here is ntp.conf:
    >> ========================
    >> server 192.168.10.5 burst iburst minpoll 4 maxpoll 6

    >
    > These minpolls and maxpolls will compromise the ability to smooth over
    > offset variations. Why are you not using the recommended ones?


    Do you think that the use of "burst" here is wise? "iburst", yes.

    David



  5. Re: peers with high offset on Intel Quad Core running Windows 2003

    Jimmy ntp wrote:
    > Hi,
    > I'm consistently seeing offsets between +- 6 ms on my Windows servers.
    > For the same hardware running Linux, NTP behaves great (+- 500 us offset)
    > but on Windows it is bad. Also the same version of Windows on AMD procs is
    > behaves great.
    >
    > Hardware:
    > HP ProLiant DL360 G5
    > * 2 Quad-Core Intel(R) Xeon(R) Processor E5450
    > * 8 GB of memory
    >
    > Software:
    > * Windows 2003
    > * ntp-4.2.0a
    >
    > Here is ntp.conf:
    > ========================
    > server 192.168.10.5 burst iburst minpoll 4 maxpoll 6
    > server 192.168.10.6 burst iburst minpoll 4 maxpoll 6
    > server 10.10.10.10 burst iburst minpoll 4 maxpoll 6
    > server 10.10.10.11 burst iburst minpoll 4 maxpoll 6
    > server 10.10.10.12 burst iburst minpoll 4 maxpoll 6
    > server 10.10.10.13 burst iburst minpoll 4 maxpoll 6
    > server 10.10.10.14 burst iburst minpoll 4 maxpoll 6
    > server 10.10.10.15 burst iburst minpoll 4 maxpoll 6
    > ========================
    >
    > % ntpq -np
    >      remote           refid      st t when poll reach   delay   offset  jitter
    > ==============================================================================
    > *192.168.10.5    .GPS.            1 u   44   64  377    0.905   -4.274   0.656
    > +192.168.10.6    .CDMA.           1 u    1   64  377   18.731   -6.324   1.915
    > +10.10.10.10     192.168.10.5     2 u   29   64  377    0.399   -5.357   0.535
    > -10.10.10.11     192.168.10.5     2 u    4   64  377    0.352   -6.036   0.175
    > -10.10.10.12     192.168.10.5     2 u   34   64  377    0.397   -5.200   0.642
    > -10.10.10.13     192.168.10.5     2 u    1   64  377    0.498   -5.910   1.410
    > -10.10.10.14     192.168.10.5     2 u    2   64  377    0.376   -6.069   1.450
    > -10.10.10.15     192.168.10.5     2 u    1   64  377    0.335   -5.132   1.465
    >
    >
    > Have any one seen this issue? Any advice? Thanks in advance.


    Remove the "burst" and "minpoll 4 maxpoll 6" from all your server
    statements! Restart ntpd and be sure to use the -g switch.

    Burst is a special purpose keyword intended for situations in which your
    system makes a dial-up telephone connection to a server three or four
    times a day. Using the burst keyword in any other situation is
    considered abusive!

    The default values of MINPOLL and MAXPOLL (six and ten) are correct for
    virtually all situations and should be left intact! My non-mathematical
    explanation is that the shorter poll intervals allow large errors to be
    corrected quickly and the longer intervals allow small errors to be
    corrected very accurately. NTPD adjusts the polling interval within the
    range defined by MINPOLL and MAXPOLL as needed. See RFC-1305 for the math.

    If you are good at advanced math and control systems theory, you may
    find RFC-1305 enlightening!
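
    The minpoll and maxpoll values are exponents of two, in seconds. A
    small Python sketch of the arithmetic (the function name is made up
    for illustration) shows what the posted configuration asks for
    compared with the defaults:

```python
def poll_interval_seconds(exponent: int) -> int:
    """Convert an ntp.conf minpoll/maxpoll exponent to seconds (2**exponent)."""
    return 2 ** exponent


# Exponents from the ntp.conf posted in this thread, and the defaults.
configured = {"minpoll": 4, "maxpoll": 6}
defaults = {"minpoll": 6, "maxpoll": 10}

for name, settings in (("configured", configured), ("default", defaults)):
    low = poll_interval_seconds(settings["minpoll"])
    high = poll_interval_seconds(settings["maxpoll"])
    print(f"{name}: polls every {low} s to {high} s")
```

    The posted settings cap the interval at 64 s, which matches the
    "poll" column of 64 in the ntpq output, whereas the defaults let
    ntpd back off to 1024 s.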

    Also note that Windows is a difficult environment! The clock ticks at
    something like 17 millisecond intervals. If you really need/want time
    to the nearest microsecond, Windows is a poor choice of O/S. There is
    some way to interpolate between ticks which has been mentioned here from
    time to time but I don't recall what it is. I have no need for time to
    the nearest microsecond on my Windows systems and have not tried to
    memorize the details!

  6. Re: peers with high offset on Intel Quad Core running Windows 2003

    David Woolley writes:

    >Jimmy ntp wrote:
    >> Hi,
    >> I'm consistently seeing offsets between +- 6 ms on my Windows servers.


    >6ms is not a high offset. In particular, it is rather lower than you
    >would expect with w32time, and better than the resolution with which
    >ordinary Windows applications can obtain the time from Windows. (Also
    >offset is not the same as clock error, and, in the stable state, is
    >considerably larger than that).


    >>
    >> Software:
    >> * Windows 2003
    >> * ntp-4.2.0a


    >That's quite old. I'm not sure that it has all the hacks to improve
    >performance on Windows.
    >>
    >> Here is ntp.conf:
    >> ========================
    >> server 192.168.10.5 burst iburst minpoll 4 maxpoll 6


    >These minpolls and maxpolls will compromise the ability to smooth over
    >offset variations. Why are you not using the recommended ones?


    They will decrease the offset variations, and increase the frequency
    variations. That may be good or bad depending on the situation your
    computers find themselves in. It is also bad from the point of view of
    network and server congestion, but if that is your own server, that does
    not matter. If it is a public server, it does matter.


  7. Re: peers with high offset on Intel Quad Core running Windows 2003

    Unruh wrote:
    > David Woolley writes:
    >
    >> These minpolls and maxpolls will compromise the ability to smooth over
    >> offset variations. Why are you not using the recommended ones?

    >
    > They will decrease the offset variations, and increase the frequency
    > variations. That may be good or bad depending on the situation your


    What I meant was that, in the presence of short term variation in
    measured time from the servers, it would increase the error relative to
    true time, as it would end up tracking the perturbations in the
    measurement, rather than the underlying time.

    In particular, if you are getting large offsets because the platform
    makes it difficult to measure the time accurately, a low maxpoll will
    result in the system time wandering badly, but a higher value may allow
    the measurement noise to be averaged out.
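
    The averaging argument can be illustrated with a toy model. This is
    only a sketch and not how ntpd's clock discipline actually works: it
    assumes independent Gaussian measurement noise, and the 5 ms noise
    figure, the sample counts, and the function names are all made up.
    It compares the spread of a single noisy offset reading with the
    spread of the mean of 16 readings:

```python
import random
import statistics


def mean_of_readings(true_offset_ms, noise_ms, count, rng):
    """Average `count` offset readings, each perturbed by Gaussian noise."""
    readings = [rng.gauss(true_offset_ms, noise_ms) for _ in range(count)]
    return statistics.fmean(readings)


def spread_of_estimates(count, trials=2000, noise_ms=5.0, seed=1):
    """Standard deviation of the averaged estimate over many trials."""
    rng = random.Random(seed)
    estimates = [mean_of_readings(0.0, noise_ms, count, rng) for _ in range(trials)]
    return statistics.pstdev(estimates)


spread_single = spread_of_estimates(1)
spread_averaged = spread_of_estimates(16)
print(f"one reading per estimate: {spread_single:.2f} ms")
print(f"16 readings per estimate: {spread_averaged:.2f} ms")
```

    Under these assumptions the spread shrinks roughly with the square
    root of the number of readings, which is the sense in which a longer
    averaging time can hide measurement noise that a short one would track.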


  8. Re: peers with high offset on Intel Quad Core running Windows 2003

    Richard B. Gilbert wrote:

    > Also note that Windows is a difficult environment! The clock ticks at
    > something like 17 millisecond intervals. If you really need/want time


    Recent versions of ntpd force multimedia timers on, which reduces the
    interval between ticks, at the cost of an increased risk of lost ticks.
    Supposedly this was done because the measurement errors differed
    depending on whether the timers were on or off, which made timing
    dependent on what application was running, but it also increases the
    accuracy.

    The real problem, though, is that Windows does not interpolate between
    ticks in the kernel. ntpd attempts to do so in user space, but that is
    rather vulnerable to scheduling delays if the system is loaded. Ordinary
    applications will only see the time changing on a clock tick, rather
    than about every microsecond, as on older PC-based Unix and Linux, or on
    every TSC tick, as on recent ones, particularly Linux.

    > to the nearest microsecond, Windows is a poor choice of O/S. There is
    > some way to interpolate between ticks which has been mentioned here from
    > time to time but I don't recall what it is. I have no need for time to
    > the nearest microsecond on my Windows systems and have not tried to
    > memorize the details!


    Yes. Generally, if you care about timing to an accuracy better than a
    few tens of milliseconds, you should not be using Windows. Windows is
    optimised for human interfaces, so the most extreme timing it is
    designed for is probably that associated with playing simple MIDI files.

  9. Re: peers with high offset on Intel Quad Core running Windows 2003

    David,

    David Woolley wrote:
    > Richard B. Gilbert wrote:
    >
    >> Also note that Windows is a difficult environment! The clock ticks at
    >> something like 17 millisecond intervals. If you really need/want time

    >
    > Recent versions of ntpd force multi-media timers on, which reduces the
    > interval between ticks, at the cost of an increased risk in lost ticks.


    This is not quite correct. Even older versions of the NTP port for Windows
    have tried to interpolate the timer ticks using the performance counter.

    Recent versions can optionally set the Windows multimedia timer to highest
    resolution because otherwise the Windows system time would seem to step
    back and forth when another application which makes use of the multimedia
    timer is started or stopped. So this is just another workaround for
    constraints due to Windows.

    This workaround is enabled by default if NTP is installed using the GUI
    setup program available from Meinberg, but can also be disabled during
    installation.

    Martin
    --
    Martin Burnicki

    Meinberg Funkuhren
    Bad Pyrmont
    Germany

  10. Re: peers with high offset on Intel Quad Core running Windows 2003

    Martin Burnicki wrote:

    >
    > This is not quite correct. Even older versions of the NTP port for Windows
    > have tried to interpolate the timer ticks using the performance counter.
    >

    I *was* aware of that, but that doesn't change the fact that ordinary
    applications only see the time updating on a tick; only ntpd benefits
    from the interpolation. On modern Unices, all applications get high
    resolution clock readings.

    Also, I believe that the zero point of the interpolation is rather
    sensitive to scheduling delays on loaded systems.

  11. Re: peers with high offset on Intel Quad Core running Windows 2003

    David Woolley wrote:
    > Martin Burnicki wrote:
    >
    >>
    >> This is not quite correct. Even older versions of the NTP port for
    >> Windows
    >> have tried to interpolate the timer ticks using the performance counter.
    >>

    > I *was* aware of that, but that doesn't change the fact that ordinary
    > applications only see the time updating on a tick; only ntpd benefits
    > from the interpolation. On modern Unices, all applications get high
    > resolution clock readings.
    >
    > Also, I believe that the zero point of the interpolation is rather
    > sensitive to scheduling delays on loaded systems.


    If you think about it, you'll realize that the NTP sw clock, based on
    RDTSC or other high-res timer not locked to the OS clock, is another
    FLL/PLL.

    I.e. to do it right you cannot simply extrapolate from the last timer
    tick, you should instead weigh those TSC samples that have the least
    latency behind the actual timer tick.
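
    A minimal Python sketch of that idea, under simplifying assumptions
    that are not from ntpd: the ticks are exactly evenly spaced, the
    number of counter counts per tick is known, and the single
    least-delayed sample is picked rather than weighting several. All
    names and numbers are hypothetical:

```python
def least_delayed_sample(counters, counts_per_tick):
    """Return (index, counter) of the reading taken soonest after its tick.

    counters[k] is the counter value read after tick k. Latency can only make
    a reading later (larger), so after removing the expected k * counts_per_tick
    advance, the smallest residual marks the least-delayed reading.
    """
    residuals = [c - k * counts_per_tick for k, c in enumerate(counters)]
    best = min(range(len(counters)), key=residuals.__getitem__)
    return best, counters[best]


# Hypothetical data: five ticks, each read with a different latency (in counts).
counts_per_tick = 10_000
latencies = [420, 95, 1300, 12, 250]
counters = [1_000_000 + k * counts_per_tick + lat for k, lat in enumerate(latencies)]

index, counter = least_delayed_sample(counters, counts_per_tick)
print(f"use tick {index} (counter {counter}) as the interpolation reference")
```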

    The real problem is that you can very easily end up with two sw locked
    loops on top of each other, and from what little I remember from my Uni
    control theory classes, this is an easy way to destroy stability, unless
    at least one of those loops is very simple.

    Terje

    --
    -
    "almost all programming can be viewed as an exercise in caching"

  12. Re: peers with high offset on Intel Quad Core running Windows 2003

    David,

    David Woolley wrote:
    > Martin Burnicki wrote:
    >
    >>
    >> This is not quite correct. Even older versions of the NTP port for
    >> Windows have tried to interpolate the timer ticks using the performance
    >> counter.
    >>

    > I *was* aware of that, but that doesn't change the fact that ordinary
    > applications only see the time updating on a tick; only ntpd benefits
    > from the interpolation. On modern Unices, all applications get high
    > resolution clock readings.
    >
    > Also, I believe that the zero point of the interpolation is rather
    > sensitive to scheduling delays on loaded systems.


    I fully agree with all of the above. I just wanted to point out that this
    has nothing to do with the multimedia timer stuff, which caused a
    different problem under Windows.

    Martin
    --
    Martin Burnicki

    Meinberg Funkuhren
    Bad Pyrmont
    Germany

  13. Re: peers with high offset on Intel Quad Core running Windows 2003

    Terje,

    Terje Mathisen wrote:
    > David Woolley wrote:
    >> Also, I believe that the zero point of the interpolation is rather
    >> sensitive to scheduling delays on loaded systems.


    Yes. However, the results are better than without any interpolation.

    > If you think about it, you'll realize that the NTP sw clock, based on
    > RDTSC or other high-res timer not locked to the OS clock, is another
    > FLL/PLL.
    >
    > I.e. to do it right you cannot simply extrapolate from the last timer
    > tick, you should instead weigh those TSC samples that have the least
    > latency behind the actual timer tick.


    Yes, this is also correct.

    > The real problem is that you can very easily end up with two sw locked
    > loops on top of each other, and from what little I remember from my Uni
    > control theory classes, this is an easy way to destroy stability, unless
    > at least one of those loops are very simple.


    I don't think the TSC stuff is really a control loop of its own. What the
    interpolation or extrapolation ;-) routine currently does is pick up the
    current PerformanceCounter value as soon as possible after a timer tick
    (of course delayed by a certain variable latency).

    Interpolated time is then computed based on how many PerfCnt increments have
    occurred after the latest system time/PerfCnt pair has been picked up.
    AFAICS there is no feedback to the TSC clock source here.
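
    As a rough Python sketch of that computation (not the actual ntpd
    code; the class, the function, and the example numbers are
    assumptions for illustration), the interpolated time is the system
    time captured at the tick plus the counter increments since then,
    divided by the counter frequency:

```python
from dataclasses import dataclass


@dataclass
class TickSample:
    system_time_s: float  # system time read just after a timer tick
    counter: int          # performance counter value captured with it


def interpolated_time(sample: TickSample, counter_now: int, counter_hz: float) -> float:
    """Extrapolate from the last tick using elapsed counter increments."""
    elapsed_s = (counter_now - sample.counter) / counter_hz
    return sample.system_time_s + elapsed_s


# Hypothetical values: a counter that the OS reports as running at 3,579,545 Hz.
sample = TickSample(system_time_s=1000.0, counter=5_000_000)
now_s = interpolated_time(sample, counter_now=5_017_900, counter_hz=3_579_545.0)
print(f"interpolated time: {now_s:.6f} s")
```

    Nothing here feeds back into the counter itself: any latency in
    capturing the pair, or any error in counter_hz, goes straight into
    the result.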

    As you've already mentioned, a possible improvement would indeed be to try
    to determine the TSC sample with the lowest latency in order to decrease
    the interpolation jitter between several tick intervals.

    However, there is still the frequency of the PerformanceCounter, which is
    used as reported by the OS. The real frequency differs from the nominal
    frequency, which results in a system-dependent estimation error
    proportional to the length of the estimation interval. This could be
    avoided if the real frequency of the PerformanceCounter were computed
    and the computed frequency used for the estimated system time.

    Martin
    --
    Martin Burnicki

    Meinberg Funkuhren
    Bad Pyrmont
    Germany

  14. Re: peers with high offset on Intel Quad Core running Windows 2003

    Martin Burnicki wrote:
    > Terje,
    > As you've already mentioned a possible improvement would be indeed to try to
    > determine the TSC sample with the lowest latency in order to decrease the
    > interpolation jitter between several tick intervals.


    Right
    >
    > However, there's still the frequency of the PerformanceCounter which is used
    > as returned by the OS. The real frequency is also off the nominal frequency
    > which results in an system dependent estimation error proportional to the
    > length of the estimation interval. This could be avoided if the real
    > frequency of the PerformanceCounter would be computed and the computed
    > frequency be used for the estimated system time.


    My dual loop approach would indeed measure/tweak the highres counter
    frequency by comparing it to the (ntpd-corrected) system clock, instead
    of depending upon whatever the OS thinks it is.
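
    The measurement step could look something like this Python sketch.
    It is a simplification under stated assumptions: two (system time,
    counter) pairs taken far apart, no latency filtering, and made-up
    numbers for a counter running about 50 ppm fast. The function name
    is hypothetical:

```python
def estimate_counter_hz(first, last):
    """Estimate the counter frequency from two (system_time_s, counter) pairs."""
    (t0, c0), (t1, c1) = first, last
    if t1 <= t0:
        raise ValueError("pairs must be in increasing time order")
    return (c1 - c0) / (t1 - t0)


nominal_hz = 3_579_545.0          # what the OS reports (hypothetical value)
first = (0.0, 0)
last = (1000.0, 3_579_724_000)    # hypothetical: counter ran slightly fast

measured_hz = estimate_counter_hz(first, last)
error_ppm = (measured_hz - nominal_hz) / nominal_hz * 1e6
print(f"measured {measured_hz:.1f} Hz, {error_ppm:.1f} ppm off nominal")
```

    The longer the baseline between the two pairs, the less the
    per-sample latency matters, but the slower a real frequency change
    is noticed.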

    The good thing about using the PerformanceCounter is that you can either
    read the frequency as well on every tick, or do so only when the count
    seems off.

    This gives you a way to notice if/when the OS has turned on some kind of
    CPU throttling that affects the counter frequency, but the best way to
    do this would be to hook into the OS/power driver with a callback that
    is notified each time the frequency changes.

    Terje

    --
    -
    "almost all programming can be viewed as an exercise in caching"
