xntpd crash on solaris 8 - NTP
-
xntpd crash on solaris 8
Hi all,
There are three Sun Solaris 8 boxes in my lab. Recently I tried to enable the
xntpd service on them (two as servers, one as client), but ran into something
strange. When xntpd starts on the three boxes, all clocks are synchronized. If
I set the client's clock 10 minutes forward, xntpd brings it back into sync
within about 4 poll intervals, but when I set the client's clock about 2 hours
forward, the xntpd daemon crashed after about 4 poll intervals!
I monitored the daemon with "ntpq -c peers"; the normal output looks like this:
BJWSS_ccm_1_14_1:> while [ 1 ]; do ntpq -c peers; sleep 10; done
     remote           refid      st t when poll reach   delay   offset    disp
==============================================================================
*BJWSS_sam_1_5_1 .LCL.            1 u   25  128  377     0.53   -0.041    0.03
+BJWSS_sam_1_6_1 BJWSS_sam_1_5_1  2 u   35  128  377     0.49   -0.011    0.03
     remote           refid      st t when poll reach   delay   offset    disp
==============================================================================
*BJWSS_sam_1_5_1 .LCL.            1 u   35  128  377     0.53   -0.041    0.03
+BJWSS_sam_1_6_1 BJWSS_sam_1_5_1  2 u   45  128  377     0.49   -0.011    0.03
But if I set the client's clock 2 hours forward, xntpd cannot select a new
primary NTP server, and after about 4 poll intervals the daemon crashes on
its own:
==============================================================================
*BJWSS_sam_1_5_1 .LCL.            1 u 140m  128  377     0.53   -0.024    0.03
+BJWSS_sam_1_6_1 BJWSS_sam_1_5_1  2 u 141m  128  377     0.41    0.026    0.05
     remote           refid      st t when poll reach   delay   offset    disp
==============================================================================
*BJWSS_sam_1_5_1 .LCL.            1 u 141m  128  377     0.53   -0.024    0.03
 BJWSS_sam_1_6_1 BJWSS_sam_1_5_1  2 u    8  128  377     0.46  -830538 15875.0
     remote           refid      st t when poll reach   delay   offset    disp
==============================================================================
 BJWSS_sam_1_5_1 .LCL.            1 u    8  128  377     0.34  -830538 15875.0
 BJWSS_sam_1_6_1 BJWSS_sam_1_5_1  2 u   18  128  377     0.46  -830538 15875.0
     remote           refid      st t when poll reach   delay   offset    disp
==============================================================================
 BJWSS_sam_1_5_1 .LCL.            1 u   18  128  377     0.34  -830538 15875.0
 BJWSS_sam_1_6_1 BJWSS_sam_1_5_1  2 u   28  128  377     0.46  -830538 15875.0
.......
Has anyone met a similar problem before? The OS version of the boxes is "SunOS
BJWSS_ccm_1_14_1 5.8 Generic_117350-12 sun4u sparc SUNW,Netra-240"...
The ntp.conf files are as follows:
server1:
BJWSS_sam_1_5_1
# Either a peer or server. Replace "XType" with a value from the
# table above.
server 127.127.1.0 prefer
fudge 127.127.1.0 stratum 0
peer BJWSS_sam_1_6_1
enable auth monitor
driftfile /var/ntp/ntp.drift
statsdir /var/ntp/ntpstats/
filegen peerstats file peerstats type day enable
filegen loopstats file loopstats type day enable
filegen clockstats file clockstats type day enable
==============================================================================
server2:
BJWSS_sam_1_6_1
# Either a peer or server. Replace "XType" with a value from the
# table above.
server BJWSS_sam_1_5_1 prefer
#server 127.127.1.0
fudge 127.127.1.0 stratum 0
peer BJWSS_sam_1_5_1 minpoll 6
enable auth monitor
driftfile /var/ntp/ntp.drift
statsdir /var/ntp/ntpstats/
filegen peerstats file peerstats type day enable
filegen loopstats file loopstats type day enable
filegen clockstats file clockstats type day enable
==============================================================================
Client
# /etc/inet/ntp.conf generated on Sat Jun 3 23:23:22 CST 2006
# @(#)ntp.client 1.2 96/11/06 SMI
#
# /etc/inet/ntp.client
#
# An example file that could be copied over to /etc/inet/ntp.conf; it
# provides a configuration for a host that passively waits for a server
# to provide NTP packets on the ntp multicast net.
#
server BJWSS_sam_1_5_1 prefer minpoll 6 maxpoll 7
server BJWSS_sam_1_6_1 minpoll 6 maxpoll 7
driftfile /var/ntp/ntp.drift
b/r
oxy
-
Re: xntpd crash on solaris 8
In article <1153790786.409182@slbhw0>, kk@girard.org wrote:
> When startup the xntpd on 3 boxes, all time are synchronized. If I adjusted
xntpd is obsolete. One vendor misnames the current ntpd as xntpd,
but I don't believe that that is Sun. However, I think the following
is still essentially true.
> the time of client 10 min forward, xntpd could sync it by about 4 poll
This is not a reasonable thing to do to an NTP system. Having a system
without UTC traceable sources is somewhat borderline, at best, although
rather commonly done, but having time sources that don't behave like
UTC (i.e. don't advance at almost exactly 1 second per second give or
take measurement errors) is well outside the intended normal use of
NTP.
> interval, but when I adjusted the client time about 2 hours forward, the
> xntpd daemon crashed after about 4 poll interval!
Already noted that this is by design.
Also, although not directly relevant to the original question:
> server 127.127.1.0 prefer
> fudge 127.127.1.0 stratum 0
This is considered very bad practice. Local clocks as references should be
set as close to stratum 16 as possible, consistent with the network
topology, so that there is no chance that they will ever be confused
with a properly traceable time source.
> peer BJWSS_sam_1_6_1
Peering machines using local reference clocks isn't a good idea.
It's just possible that having only two machines involved and your use
of prefer mitigates the instabilities that this can cause, but the right
way of providing redundancy is to establish a client server relationship
and make the local clock stratum on the combined client/server be at
least two larger than that on the server. Or maybe that's why you
had to comment out the server on 1_6_1.
> server BJWSS_sam_1_5_1 prefer
> peer BJWSS_sam_1_5_1 minpoll 6
peer and server are mutually exclusive.
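
As a rough illustration of the point above, the following sketch scans an ntp.conf-style text for hosts that are configured with both "server" and "peer". The function name find_conflicts and the simplified parsing (first word is the keyword, second word is the host, "#" starts a comment) are assumptions made for this example, not part of any NTP tool.

```python
def find_conflicts(conf_text):
    """Return hosts listed under both 'server' and 'peer' (hypothetical helper)."""
    modes = {}
    for raw in conf_text.splitlines():
        # Drop comments, then split the remaining text into words.
        words = raw.split("#", 1)[0].split()
        if len(words) >= 2 and words[0] in ("server", "peer"):
            modes.setdefault(words[1], set()).add(words[0])
    # A host is in conflict when it was seen with more than one keyword.
    return sorted(host for host, kinds in modes.items() if len(kinds) > 1)


# The association lines quoted from the second server's configuration.
server2_conf = """
server BJWSS_sam_1_5_1 prefer
#server 127.127.1.0
fudge 127.127.1.0 stratum 0
peer BJWSS_sam_1_5_1 minpoll 6
"""

conflicts = find_conflicts(server2_conf)
```

Run against the quoted lines, the check flags BJWSS_sam_1_5_1, the host that appears under both keywords.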
> server BJWSS_sam_1_5_1 prefer minpoll 6 maxpoll 7
Using the default minpoll and maxpoll is strongly recommended. Note that,
I believe, maxpoll doesn't limit the loop time constant, in which
case it will not make the client as responsive to phase variations
in the server as I suspect you are trying to achieve. As already
noted, deliberately introducing phase variations is not an intended
use case for NTP.
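
For reference, minpoll and maxpoll are exponents of two, so the quoted "minpoll 6 maxpoll 7" confines polling to between 64 and 128 seconds. The small sketch below does that arithmetic; the helper name poll_seconds is made up for this example, and the default exponents of 6 and 10 are the usual ntpd defaults, assumed here to apply to this xntpd as well.

```python
def poll_seconds(exponent):
    """Convert an NTP poll exponent into a poll interval in seconds."""
    return 2 ** exponent


# Settings quoted from the client configuration.
configured = (poll_seconds(6), poll_seconds(7))   # 64 s to 128 s

# Usual defaults (assumed): minpoll 6, maxpoll 10.
defaults = (poll_seconds(6), poll_seconds(10))    # 64 s to 1024 s

# "About 4 poll intervals" at the 128 s poll shown in the ntpq output.
four_polls = 4 * poll_seconds(7)                  # 512 s
```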
It may make it more responsive in terms of the error recovery when
exceeding the 128ms step out limit, but, as noted above, that is not
intended to be used in normal operation. Note, though, that there
is still a long confirmation delay before stepping.
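
To make the thresholds concrete, here is a minimal sketch of how an ntpd-style daemon reacts to a measured offset. The 128 ms step limit is the figure mentioned above; the 1000 s panic threshold is the documented default for current ntpd and is only assumed to be close to what this xntpd build uses. The function name classify_offset is hypothetical, and the sketch ignores the confirmation delay before a step.

```python
def classify_offset(offset_s, step_threshold_s=0.128, panic_threshold_s=1000.0):
    """Return 'slew', 'step' or 'exit' for a clock offset given in seconds.

    Thresholds are parameters because the exact values for a given
    daemon version are an assumption in this sketch.
    """
    magnitude = abs(offset_s)
    if magnitude > panic_threshold_s:
        # Offset is considered implausible: the daemon logs and exits.
        return "exit"
    if magnitude > step_threshold_s:
        # Too large to slew away: the clock is stepped once confirmed.
        return "step"
    # Small offsets are corrected gradually by adjusting the clock rate.
    return "slew"


ten_minutes = classify_offset(10 * 60)    # the 10 minute experiment
two_hours = classify_offset(2 * 60 * 60)  # the 2 hour experiment
```

Under these assumed thresholds the 10 minute change falls in the "step" range and the 2 hour change in the "exit" range, which matches the behaviour reported in the original post.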
-
Re: xntpd crash on solaris 8
Yes, you are right, it's not a normal case. I just wanted to reproduce what
happened at our customer's site. I suspect that an administrator manually
changed the client's time (offset > 1024 s) and that xntpd then crashed.
I will change the stratum and poll interval configuration as you
recommended, and check the audit records of the Sun boxes. Thanks to all of
you for the hints.
b.r.
oxy