Significant clock skew in cluster environment
Hello all.
I'd appreciate some help. I'm the de facto admin for a small research
cluster in an academic institution. All hosts are running GNU/Linux
under a 2.6.22-family kernel. The ntp version in use is 4.2.4 p4. My
campus runs two ntp servers. My cluster's headnode uses the two campus
ntp servers as its sources. Internal cluster nodes then use the cluster
headnode as their (only) ntp time source. The internal cluster nodes
have no route to the internet, only the headnode does.
I'm seeing a problem wherein internal cluster nodes develop significant
clock skew over time. By "significant" I mean up to 700 seconds over
two weeks of uptime. I am checking this using "ntpq -p" and looking at
the offset field. The only thing I can think of is that some of the
machines, including the headnode, are configured to use the Linux
"ondemand" CPU frequency governor. These processors are older AMD
Opteron 246/248 chips capable of dynamic frequency management. However,
I also have nodes with older AMD Athlon processors that do not employ
dynamic frequency management which also exhibit this phenomenon.
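For tracking the offset across many nodes, the "ntpq -p" billboard can be parsed rather than read by eye. The sketch below is only an illustration: the function names parse_ntpq_peers and current_offsets are made up here, and it assumes the usual ten-column peer layout (remote, refid, st, t, when, poll, reach, delay, offset, jitter) with a one-character tally code in the first column. Note that ntpq reports the offset in milliseconds, so a 700 second skew would show up as roughly 700000.

```python
import subprocess


def parse_ntpq_peers(text):
    """Return (tally, remote, offset_ms) tuples from `ntpq -p` style output.

    Assumes the standard ten-column peer billboard; the header row and the
    ===== separator are skipped because they do not parse as peer rows.
    """
    peers = []
    for line in text.splitlines():
        if not line.strip():
            continue
        tally = line[0]            # '*', '+', '-', 'x', ' ' and so on
        fields = line[1:].split()  # drop the tally column before splitting
        if len(fields) < 10:
            continue               # separator line or truncated row
        try:
            offset_ms = float(fields[-2])  # offset is the second-to-last column
        except ValueError:
            continue               # header row ("offset" is not a number)
        peers.append((tally, fields[0], offset_ms))
    return peers


def current_offsets():
    """Run `ntpq -pn` on the local host and parse its output (hypothetical helper)."""
    result = subprocess.run(["ntpq", "-pn"], capture_output=True, text=True)
    return parse_ntpq_peers(result.stdout)
```

Calling current_offsets() on each node from cron and logging the result would show whether the offset grows steadily or in jumps, which helps separate a constant frequency error from intermittent lost ticks.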
Additionally, on the headnode I am seeing in the ntpd syslog output
messages like:
ntpd[5642]: frequency error 509 PPM exceeds tolerance 500 PPM
But there are no such log entries on any of the internal nodes.
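As a rough cross-check, the numbers in the post can be converted into a drift rate. The sketch below assumes the 700 seconds accumulated at a steady rate over the full 14 days (the post does not say how the skew grew), and the function name drift_ppm is just illustrative. Under that assumption the average drift is above the 500 PPM tolerance quoted in the syslog message.

```python
SECONDS_PER_DAY = 86_400


def drift_ppm(offset_seconds, elapsed_days):
    """Average clock drift in parts per million, assuming a constant rate."""
    elapsed_seconds = elapsed_days * SECONDS_PER_DAY
    return offset_seconds / elapsed_seconds * 1_000_000


# 700 s of skew over two weeks of uptime, as reported in the post.
ppm = drift_ppm(700, 14)
print(f"{ppm:.0f} PPM")  # prints "579 PPM"
```

A drift of that size is larger than the frequency correction the log message says ntpd will tolerate, so it points toward the clock hardware or the kernel's timekeeping rather than toward the restrict lines in ntp.conf.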
Is there any issue with dynamic processor frequency control negatively
affecting ntp?
If this is not it, I can give the basic contents of my ntp.conf files.
None of these machines are running onboard firewalls, and ntpd is being
started through the init system.
---
On the head node:
Two sets of server directives in the form:
server a.b.c.d iburst
restrict a.b.c.d nomodify notrap nopeer noquery
where a.b.c.d is one of the campus ntp servers' IP addresses.
Thereafter there is:
restrict default ignore
restrict 127.0.0.1
restrict h.i.j.k mask l.m.n.o nomodify nopeer notrap
where h.i.j.k and l.m.n.o are correctly defined to allow all the
internal cluster hosts to query this machine
followed by:
restrict h.i.j.p mask 255.255.255.255
where p is the head node's internal cluster IP address.
---
On the internal cluster nodes (all use the identical file):
server h.i.j.p iburst
where h.i.j.p is the headnode's IP address
followed by:
restrict default ignore
restrict 127.0.0.1
restrict h.i.j.p mask 255.255.255.255 nopeer
---
Thanks for any help.
--
metallurgist@airpost.net
Hi friend,
Have a look at these:
http://support.ntp.org/bin/view/Supp...HardwareIssues
http://support.ntp.org/bin/view/Support/KnownOsIssues
Venu
metallurgist@airpost.net wrote:
>
> Is there any issue with dynamic processor frequency control negatively
> affecting ntp?
The TSC is used to interpolate, in some circumstances, and its
calibration will be totally confused by variable clock rates.
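To see whether a given node is actually using the TSC and which frequency governor is active, the kernel's sysfs entries can be read directly. This is a minimal sketch, assuming the standard Linux sysfs locations for the current clocksource and the per-CPU cpufreq governor; those files may be absent on kernels built without the relevant support, in which case the helper returns None. The names read_sysfs and timing_report are illustrative.

```python
from pathlib import Path

# Standard sysfs locations (assumed present; missing files yield None).
CLOCKSOURCE = Path("/sys/devices/system/clocksource/clocksource0/current_clocksource")
CPU_ROOT = Path("/sys/devices/system/cpu")
GOVERNOR_GLOB = "cpu[0-9]*/cpufreq/scaling_governor"


def read_sysfs(path):
    """Return the stripped contents of a sysfs file, or None if unreadable."""
    try:
        return Path(path).read_text().strip()
    except OSError:
        return None


def timing_report(cpu_root=CPU_ROOT, clocksource=CLOCKSOURCE):
    """Collect the active clocksource and the governor of each CPU."""
    governors = {
        path.parent.parent.name: read_sysfs(path)  # e.g. "cpu0" -> "ondemand"
        for path in sorted(Path(cpu_root).glob(GOVERNOR_GLOB))
    }
    return {"clocksource": read_sysfs(clocksource), "governors": governors}


if __name__ == "__main__":
    print(timing_report())
```

Comparing the output between the Opteron nodes and the Athlon nodes would show whether the drifting machines share a clocksource, independent of whether frequency scaling is in use.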
>
> If this is not it, I can give the basic contents of my ntp.conf files.
> None of these machines are running onboard firewalls, and ntpd is being
> started through the init system.
This is not going to be an ntpd configuration problem.
>
On Jun 24, 3:34 pm, David Woolley wrote:
>
> The TSC is used to interpolate, in some circumstances, and its
> calibration will be totally confused by variable clock rates.
That's not an issue anymore in the OP's kernel version.
HTH