Linux/Windows: Different synchronization behaviour - NTP


  1. Linux/Windows: Different synchronization behaviour

    Hi,

    We are running NTP in an isolated network. We use 2 HOPF stratum 1 NTP
    servers with GPS reference clocks as time source. We are running ntpd
    clients on both Linux and Windows Server 2003 with (nearly) identical
    ntp.conf configuration files.

    ntpd versions:
    Linux: 4.1.2
    Windows: 4.2.2

    Both the Linux and Windows clients were running without problems until
    one HOPF NTP server started reporting its refid as LOCAL(0) and
    stratum=11. We have not had time to check what happened to it yet, but
    suspect it has lost contact with its GPS receiver.

    Below I attach ntpq output and the ntp.conf configuration file. The
    Linux ntpd is happy and uses the working HOPF server as its system
    peer, while the Windows ntpd considers both HOPF servers to be
    falsetickers.

    ntpd on the clients is peering with two Linux ntpd's in addition to the
    two HOPF servers. We have however tested that the problem still remains
    after removing the two peers from ntp.conf.

    Our questions:
    - Which ntpd behaves correctly (Linux or Windows)?
    - Why do they behave differently?

    We will of course be happy if someone has hints for further
    investigation.

    Thanks in advance.

    Geir Guldstein





    ------------------------------------------------------

    ntpq output from Linux:

    ntpq> pe
    remote           refid      st t when poll reach   delay   offset  jitter
    ==============================================================================
    *nccgps02        .hopf.      1 u   30   64    77   7.812  -370.45   7.812
    +nccgps01        LOCAL(0)   11 u   35   64    77   7.812  629.355   7.812
     nccas02         0.0.0.0    16 u    -   64     0   0.000    0.000 4000.00
     nccas03         nccgps02    2 u   47   64    76   7.812  -26.955   7.812
    ntpq> rv
    status=06f4 leap_none, sync_ntp, 15 events, event_peer/strat_chg,
    version="ntpd 4.1.2@1.892 Tue Feb 24 06:31:19 EST 2004 (1)",
    processor="x86_64", system="Linux2.4.21-32.EL", leap=00, stratum=2,
    precision=-7, rootdelay=7.812, rootdispersion=1824.707, peer=46724,
    refid=nccgps02,
    reftime=c8f03774.bd50ca1c Mon, Oct 30 2006 12:33:56.739, poll=6,
    clock=c8f0377a.ffd7e458 Mon, Oct 30 2006 12:34:02.999, state=3,
    offset=0.000, frequency=-20.189, jitter=958.864, stability=243.459
    ntpq> as
    ind assID status  conf reach auth condition  last_event cnt
    ===========================================================
      1 46724  96f4   yes   yes  none  sys.peer   reachable 15
      2 46725  94f4   yes   yes  none  candidat   reachable 15
      3 46726  8000   yes   yes  none    reject
      4 46727  90f4   yes   yes  none    reject   reachable 15
    ntpq>



    ------------------------------------------------------

    ntpq output from Windows Server 2003:

    ntpq> pe
    remote           refid      st t when poll reach   delay   offset  jitter
    ==============================================================================
    x192.168.1.142   .hopf.      1 u   27   64   377   0.805  -1001.6   0.076
    x192.168.1.141   LOCAL(0)   11 u   27   64   377   0.952   -1.582   0.055
     192.168.1.32    .STEP.     16 u    -   64     0   0.000    0.000   0.000
     192.168.1.33    .STEP.     16 u    -   64     0   0.000    0.000   0.000
    ntpq> rv
    assID=0 status=c054 sync_alarm, sync_unspec, 5 events,
    event_peer/strat_chg,
    version="ntpd 4.2.2@1.1532 Sep 20 10:40:21 (UTC+02:00) 2006 (43)",
    processor="unknown", system="WINDOWS/NT", leap=11, stratum=16,
    precision=-20, rootdelay=0.000, rootdispersion=17.445, peer=0,
    refid=¼▬♫─, reftime=00000000.00000000 ------ -- ---- --:--:--,
    poll=6,
    clock=c8f096a4.4a0e659c Mon, Oct 30 2006 19:20:04.289, state=2,
    offset=0.000, frequency=-6.394, jitter=0.028, noise=0.001,
    stability=0.000
    ntpq> as

    ind assID status  conf reach auth condition  last_event cnt
    ===========================================================
      1 25622  9124   yes   yes  none falsetick   reachable  2
      2 25623  9124   yes   yes  none falsetick   reachable  2
      3 25624  8000   yes   yes  none    reject
      4 25625  8000   yes   yes  none    reject
    ntpq>



    ------------------------------------------------------

    ntp.conf:

    # Prohibit general access to this service.
    restrict default ignore

    # Permit all access over the loopback interface. This could
    # be tightened as well, but to do so would affect some of
    # the administrative functions.
    restrict 127.0.0.1


    # -- CLIENT NETWORK -------
    # Permit systems on this network to synchronize with this
    # time service. Do not permit those systems to modify the
    # configuration of this service. Also, do not use those
    # systems as peers for synchronization.
    # restrict 192.168.1.0 mask 255.255.255.0 notrust nomodify notrap


    # --- OUR TIMESERVERS -----
    # or remove the default restrict line
    # Permit time synchronization with our time source, but do not
    # permit the source to query or modify the service on this system.

    # restrict mytrustedtimeserverip mask 255.255.255.255 nomodify notrap noquery
    # server mytrustedtimeserverip
    restrict 192.168.1.142 mask 255.255.255.255 nomodify notrap noquery
    server 192.168.1.142
    restrict 192.168.1.141 mask 255.255.255.255 nomodify notrap noquery
    server 192.168.1.141 prefer
    restrict 192.168.1.32 mask 255.255.255.255 nomodify notrap noquery
    peer 192.168.1.32
    restrict 192.168.1.33 mask 255.255.255.255 nomodify notrap noquery
    peer 192.168.1.33



    # --- NTP MULTICASTCLIENT ---
    #multicastclient # listen on default 224.0.1.1
    # restrict 224.0.1.1 mask 255.255.255.255 notrust nomodify notrap
    # restrict 192.168.1.0 mask 255.255.255.0 notrust nomodify notrap



    # --- GENERAL CONFIGURATION ---
    #
    # Undisciplined Local Clock. This is a fake driver intended for backup
    # and when no outside source of synchronized time is available. The
    # default stratum is usually 3, but in this case we elect to use
    # stratum 0. Since the server line does not have the prefer keyword,
    # this driver is never used for synchronization, unless no other
    # synchronization source is available. In case the local host is
    # controlled by some external source, such as an external oscillator or
    # another protocol, the prefer keyword would cause the local host to
    # disregard all other synchronization sources, unless the kernel
    # modifications are in use and declare an unsynchronized condition.
    #

    #
    # Drift file. Put this in a directory which the daemon can write to.
    # No symbolic links allowed, either, since the daemon updates the file
    # by creating a temporary in the same directory and then rename()'ing
    # it to the file.
    #
    #driftfile /var/lib/ntp/drift
    driftfile C:\WINDOWS\system32\drivers\etc\ntp.drift
    broadcastdelay 0.008

    #
    # Authentication delay. If you use, or plan to use someday, the
    # authentication facility you should make the programs in the
    # auth_stuff directory and figure out what this number should be on
    # your machine.
    #
    authenticate yes

    #
    # Keys file. If you want to diddle your server at run time, make a
    # keys file (mode 600 for sure) and define the key number to be
    # used for making requests.
    #
    # PLEASE DO NOT USE THE DEFAULT VALUES HERE. Pick your own, or remote
    # systems might be able to reset your clock at will. Note also that
    # ntpd is started with a -A flag, disabling authentication, that
    # will have to be removed as well.
    #
    #keys /etc/ntp/keys


  2. Re: Linux/Windows: Different synchronization behaviour

    > "Geir G" wrote in message
    news:1162279048.696498.218770@k70g2000cwa.googlegroups.com...

    > We are running NTP in an isolated network. We use 2 HOPF stratum 1
    > NTP servers with GPS ...


    > Below I attach ntpq output and the ntp.conf configuration file. The
    > Linux ntpd is happy and uses the working HOPF server as its system
    > peer, while the Windows ntpd considers both HOPF servers to be
    > falsetickers.


    Well, that's what you get for using two, I'm afraid.


    > ntpq output from Linux:


    > ntpq> pe
    > remote           refid      st t when poll reach   delay   offset
    > ==================================================================
    > *nccgps02        .hopf.      1 u   30   64    77   7.812  -370.45
    > +nccgps01        LOCAL(0)   11 u   35   64    77   7.812  629.355


    They're a second apart! No wonder the software gets confused.

    IIAMN, the stratum is merely informative and there is no quantifiable
    reason to prefer a stratum-1 server over a stratum-11 by itself. All
    sorts of error margins might be reported as zero on the disconnected
    clock, resulting in it looking perversely attractive rather than the
    bad choice it really probably is. (The problem is in the 'probably'.
    It _might_ be synchronised very well by external means. NTP doesn't know.)

    The best solution would probably be to turn off the local-clock fallback
    in the Hopf units.

    Groetjes,
    Maarten Wiltink



  3. Linux/Windows: Different synchronization behaviour

    In article <1162279048.696498.218770@k70g2000cwa.googlegroups.com>,
    "Geir G" wrote:

    > *nccgps02 .hopf. 1 u 30 64 77 7.812 -370.45

    reach=77: the Linux one hasn't finished starting.
    delay=7.812: the 7.812's are weirdly identical!

    > 7.812


    > precision=-7, rootdelay=7.812, rootdispersion=1824.707, peer=46724,

    rootdispersion=1824.707: because it hasn't finished starting, the
    dispersion hasn't converged, so it will apply very large error tolerances
    to the times and, in this case, allow the two times that are 1 second
    apart to both be accepted. Hopefully, once dispersion has tightened, it
    will reject one of them (this system has a third vote, which will break
    the tie).

    precision=-7: this precision is extremely poor for a modern machine. It probably
    indicates that clock ticks are not being interpolated - maybe that
    is because the kernel assumes it has a TSC and has dropped the code that
    read the balance in the CTC, but maybe you have a chip with no TSC support.
    This, I think, explains the 7.812s, at least partly.
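The link between precision=-7 and the repeated 7.812 can be checked with a little arithmetic. ntpd reports precision as a base-2 logarithm of seconds; the sketch below only converts the two values from the ntpq output above. The function name `precision_seconds` is made up for illustration and is not part of any NTP tool.

```python
def precision_seconds(log2_precision):
    """Convert ntpd's 'precision' value (log2 of seconds) to seconds."""
    return 2.0 ** log2_precision


# Values taken from the two 'rv' outputs in this thread.
linux = precision_seconds(-7)      # Linux client, precision=-7
windows = precision_seconds(-20)   # Windows client, precision=-20

print(f"Linux   precision=-7  -> {linux * 1e3:.4f} ms")
print(f"Windows precision=-20 -> {windows * 1e6:.4f} us")
```

A clock that can only be read in steps of 1/128 s would plausibly show delay and jitter figures that are multiples of that step, which fits the identical 7.812 ms values.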


    > x192.168.1.142 .hopf. 1 u 27 64 377 0.805 -1001.6

    reach=377: Windows has fully loaded the filter pipeline.
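The reach column is an 8-bit shift register printed in octal, with one bit per poll and the most recent poll in the lowest bit. A minimal sketch for decoding the values seen in this thread follows; `decode_reach` is a hypothetical helper, not an ntpq feature.

```python
def decode_reach(reach_octal):
    """Decode ntpq's octal 'reach' value into its 8 poll bits.

    Returns the bit string (oldest poll first, newest last) and the
    number of polls in the last eight that got a valid reply.
    """
    value = int(reach_octal, 8) & 0xFF
    bits = format(value, "08b")
    return bits, bits.count("1")


for reach in ("77", "76", "377"):
    bits, answered = decode_reach(reach)
    print(f"reach={reach:>3} -> {bits} ({answered} of the last 8 polls answered)")
```

A value of 77 means only six polls have been answered since the association started, while 377 means all eight slots are filled.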

    > precision=-20, rootdelay=0.000, rootdispersion=17.445, peer=0,


    It's got a reasonable precision (2^-20 s, about 1 microsecond) as against
    the roughly 8 ms (2^-7 s = 1/128 s) that the Linux system is reporting.
    More importantly, dispersion is tight: as 1 second > 17 ms, the error
    bands for the two times don't overlap, and as the third and fourth
    servers are doing an error recovery step, there is no tie breaker. Both
    servers are therefore rejected, as there is no overlap that contains a
    majority of the servers.
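The overlap argument can be illustrated with a simplified sketch. This is not ntpd's actual selection code: the real algorithm uses a per-source root distance, whereas here each client's reported rootdispersion is used as a stand-in half-width for every source, which is an assumption made purely for illustration. The function names are hypothetical.

```python
def max_overlap(intervals):
    """Return the largest number of intervals that share a common point."""
    events = []
    for lo, hi in intervals:
        events.append((lo, 0))  # interval opens (0 sorts before 1 on ties)
        events.append((hi, 1))  # interval closes
    best = current = 0
    for _, kind in sorted(events):
        if kind == 0:
            current += 1
            best = max(best, current)
        else:
            current -= 1
    return best


def has_majority(offsets_ms, half_width_ms):
    """True if more than half of the error bands share a common point."""
    intervals = [(o - half_width_ms, o + half_width_ms) for o in offsets_ms]
    return max_overlap(intervals) > len(intervals) / 2


# Offsets (ms) of the reachable sources, taken from the ntpq output above.
# Half-widths are the clients' rootdispersion values, used as an ASSUMED
# stand-in for the real per-source root distance.
print("Windows:", has_majority([-1001.6, -1.582], 17.445))
print("Linux:  ", has_majority([-370.45, 629.355, -26.955], 1824.707))
```

With two sources a second apart and bands of about 17 ms, no point is covered by more than one band, so neither source can win a majority. With bands of about 1.8 s, all three Linux sources overlap.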

    I don't know why the HOPF has a local clock fallback configured. In
    my view they are overused.
