xntpd crash on solaris 8 - NTP

  1. xntpd crash on solaris 8

    Hi all,

    There are 3 Sun Solaris 8 boxes in my lab. Recently I tried to enable the
    xntpd service on them (2 as servers, 1 as client), but ran into something
    strange.

    When xntpd starts on the 3 boxes, all clocks are synchronized. If I set
    the client clock 10 minutes forward, xntpd corrects it within about 4 poll
    intervals. But when I set the client clock about 2 hours forward, the
    xntpd daemon crashed after about 4 poll intervals!

    I monitored the daemon with "ntpq -c peers". The normal output looks like this:

    BJWSS_ccm_1_14_1:> while [ 1 ]; do ntpq -c peers; sleep 10; done
     remote          refid           st t when poll reach  delay   offset     disp
    ==============================================================================
    *BJWSS_sam_1_5_1 .LCL.            1 u   25  128   377   0.53   -0.041     0.03
    +BJWSS_sam_1_6_1 BJWSS_sam_1_5_1  2 u   35  128   377   0.49   -0.011     0.03
     remote          refid           st t when poll reach  delay   offset     disp
    ==============================================================================
    *BJWSS_sam_1_5_1 .LCL.            1 u   35  128   377   0.53   -0.041     0.03
    +BJWSS_sam_1_6_1 BJWSS_sam_1_5_1  2 u   45  128   377   0.49   -0.011     0.03

    But after I set the client clock 2 hours forward, xntpd could not select a
    new primary NTP server, and after about 4 poll intervals the xntpd daemon
    crashed by itself:

    ==============================================================================
    *BJWSS_sam_1_5_1 .LCL.            1 u 140m  128   377   0.53   -0.024     0.03
    +BJWSS_sam_1_6_1 BJWSS_sam_1_5_1  2 u 141m  128   377   0.41    0.026     0.05
     remote          refid           st t when poll reach  delay   offset     disp
    ==============================================================================
    *BJWSS_sam_1_5_1 .LCL.            1 u 141m  128   377   0.53   -0.024     0.03
     BJWSS_sam_1_6_1 BJWSS_sam_1_5_1  2 u    8  128   377   0.46  -830538  15875.0
     remote          refid           st t when poll reach  delay   offset     disp
    ==============================================================================
     BJWSS_sam_1_5_1 .LCL.            1 u    8  128   377   0.34  -830538  15875.0
     BJWSS_sam_1_6_1 BJWSS_sam_1_5_1  2 u   18  128   377   0.46  -830538  15875.0
     remote          refid           st t when poll reach  delay   offset     disp
    ==============================================================================
     BJWSS_sam_1_5_1 .LCL.            1 u   18  128   377   0.34  -830538  15875.0
     BJWSS_sam_1_6_1 BJWSS_sam_1_5_1  2 u   28  128   377   0.46  -830538  15875.0

    .......

    Has anyone met a similar problem before? The OS version of the boxes is
    "SunOS BJWSS_ccm_1_14_1 5.8 Generic_117350-12 sun4u sparc SUNW,Netra-240".






    The ntp.conf files are as follows:

    server1:
    BJWSS_sam_1_5_1

    # Either a peer or server. Replace "XType" with a value from the
    # table above.
    server 127.127.1.0 prefer
    fudge 127.127.1.0 stratum 0
    peer BJWSS_sam_1_6_1

    enable auth monitor
    driftfile /var/ntp/ntp.drift
    statsdir /var/ntp/ntpstats/
    filegen peerstats file peerstats type day enable
    filegen loopstats file loopstats type day enable
    filegen clockstats file clockstats type day enable

    ==============================================================================

    server2:

    BJWSS_sam_1_6_1
    # Either a peer or server. Replace "XType" with a value from the
    # table above.
    server BJWSS_sam_1_5_1 prefer
    #server 127.127.1.0
    fudge 127.127.1.0 stratum 0
    peer BJWSS_sam_1_5_1 minpoll 6

    enable auth monitor
    driftfile /var/ntp/ntp.drift
    statsdir /var/ntp/ntpstats/
    filegen peerstats file peerstats type day enable
    filegen loopstats file loopstats type day enable
    filegen clockstats file clockstats type day enable


    ==============================================================================

    Client

    # /etc/inet/ntp.conf generated on Sat Jun 3 23:23:22 CST 2006
    # @(#)ntp.client 1.2 96/11/06 SMI
    #
    # /etc/inet/ntp.client
    #
    # An example file that could be copied over to /etc/inet/ntp.conf; it
    # provides a configuration for a host that passively waits for a server
    # to provide NTP packets on the ntp multicast net.
    #

    server BJWSS_sam_1_5_1 prefer minpoll 6 maxpoll 7
    server BJWSS_sam_1_6_1 minpoll 6 maxpoll 7
    driftfile /var/ntp/ntp.drift

    b/r

    oxy



  2. Re: xntpd crash on solaris 8

    In article <1153790786.409182@slbhw0>, kk@girard.org wrote:

    > When startup the xntpd on 3 boxes, all time are synchronized. If I adjusted


    xntpd is obsolete. One vendor misnames the current ntpd as xntpd,
    but I don't believe that that is Sun. However, I think the following
    is still essentially true.

    > the time of client 10 min forward, xntpd could sync it by about 4 poll


    This is not a reasonable thing to do to an NTP system. Having a system
    without UTC-traceable sources is borderline at best, although rather
    commonly done, but having time sources that don't behave like UTC (i.e.
    don't advance at almost exactly 1 second per second, give or take
    measurement errors) is well outside the intended normal use of NTP.

    > interval, but when I adjusted the client time about 2 hours forward, the
    > xntpd daemon crashed after about 4 poll interval!


    As already noted, this is by design.

    Also, although not directly relevant to the original question:

    > server 127.127.1.0 prefer
    > fudge 127.127.1.0 stratum 0


    This is considered very bad practice. Local clocks as references should be
    set as close to stratum 16 as possible, consistent with the network
    topology, so that there is no chance that they will ever be confused
    with a properly traceable time source.
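
    As an illustration of that advice, the local clock lines on
    BJWSS_sam_1_5_1 might look like the sketch below. The stratum value 10 is
    an assumption chosen for illustration and does not come from this thread;
    pick the highest value that your own topology allows.

```
# Local clock as a last-resort reference on BJWSS_sam_1_5_1.
# Stratum 10 is an illustrative assumption, not a value from this thread.
server 127.127.1.0
fudge 127.127.1.0 stratum 10
```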

    > peer BJWSS_sam_1_6_1


    Peering machines that use local reference clocks isn't a good idea.
    It's just possible that having only two machines involved and your use
    of prefer mitigates the instabilities that this can cause, but the right
    way of providing redundancy is to establish a client/server relationship
    and make the local clock stratum on the combined client/server at
    least two larger than that on the server. Or maybe that's why you
    had to comment out the server on 1_6_1.
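
    A minimal sketch of that client/server arrangement for BJWSS_sam_1_6_1
    follows. It assumes, purely for illustration, that the local clock on
    BJWSS_sam_1_5_1 is fudged to stratum 10; both stratum values are
    hypothetical and only their difference of two reflects the advice above.

```
# Sketch for BJWSS_sam_1_6_1, acting as a client of BJWSS_sam_1_5_1.
# Assumes the local clock on BJWSS_sam_1_5_1 is fudged to stratum 10
# (a hypothetical value), so the local clock here sits two strata higher.
server BJWSS_sam_1_5_1 prefer
server 127.127.1.0
fudge 127.127.1.0 stratum 12
```

    With this layout there is no peer line at all, and the local clock on
    BJWSS_sam_1_6_1 should only be selected when BJWSS_sam_1_5_1 is unreachable.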

    > server BJWSS_sam_1_5_1 prefer
    > peer BJWSS_sam_1_5_1 minpoll 6


    peer and server are mutually exclusive; use one or the other for a given
    host, not both.

    > server BJWSS_sam_1_5_1 prefer minpoll 6 maxpoll 7


    Using the default minpoll and maxpoll is strongly recommended. Note:
    I believe that maxpoll doesn't limit the loop time constant, in which
    case it will not make the client as responsive to phase variations
    in the server as I suspect you are trying to achieve. As already
    noted, deliberately introducing phase variations is not an intended
    use case for NTP.

    It may make it more responsive in terms of error recovery when
    exceeding the 128 ms step out limit, but, as noted above, that is not
    intended to be used in normal operation. Note, though, that there
    is still a long confirmation delay before stepping.
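
    The thresholds discussed above can be sketched as a small classifier.
    This is only an illustration, not xntpd's actual code: the function name
    classify_offset is hypothetical, the 128 ms value is the step limit
    mentioned above, and the 1000 s exit limit is an assumption based on the
    default of the ntpd reference implementation (the xntpd shipped with
    Solaris 8 may differ).

```python
# Hypothetical sketch of how an NTP daemon might react to a measured offset.
# Threshold values are assumptions (ntpd reference defaults), not values
# confirmed for the xntpd shipped with Solaris 8.

STEP_THRESHOLD_S = 0.128    # the "128 ms" limit mentioned in the post above
PANIC_THRESHOLD_S = 1000.0  # assumed sanity limit; beyond it the daemon exits


def classify_offset(offset_s: float) -> str:
    """Return 'slew', 'step', or 'exit' for a clock offset given in seconds.

    'step' is simplified: a real daemon waits for the offset to persist
    (the confirmation delay noted above) before it steps the clock.
    """
    magnitude = abs(offset_s)
    if magnitude > PANIC_THRESHOLD_S:
        return "exit"   # offset too large to trust; daemon gives up
    if magnitude > STEP_THRESHOLD_S:
        return "step"   # too large to slew; clock is set in one jump
    return "slew"       # small offset; clock rate is adjusted gradually


if __name__ == "__main__":
    # Offsets loosely modelled on the experiments described in this thread.
    for label, offset in [("normal", -0.000041), ("10 min", 600.0), ("2 h", 7200.0)]:
        print(f"{label:>7}: {classify_offset(offset)}")
```

    Under these assumptions a 10 minute change falls in the step range, while
    a 2 hour change exceeds the exit limit, which matches the behaviour
    reported in the first post.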


  3. Re: xntpd crash on solaris 8

    Yes, you are right, it's not a normal case. I just wanted to reproduce a
    case that happened at our customer's site. I suspect that an administrator
    set the client's time manually (offset > 1024 s) and xntpd then crashed.

    I will change the stratum and poll interval configuration as you
    recommended and check the audit records of the Sun boxes. Thanks to all
    of you for the hints.

    b.r.

    oxy