Re: Architecture / best practice for small/medium company setups
Tom Smith wrote to me offline about internal server peering, and I want to post
some of it here (with his permission) to get further feedback on it. What do you
think of his idea to use server declarations to better handle the case of an
Internet connection outage?
[color=blue]
> You may peer between local servers, but it is important that
> you have fewer peers on each server than real clock sources.
> Otherwise, the peers can start following one another rather than
> the real clocks. If I have 4 low stratum servers configured,
> I generally try to have no more than 2 peers. In fact, I usually
> use server rather than peer declarations, because "peer" establishes
> a bidirectional server relationship, which can unintentionally
> multiply the number of server/peers. For example, you might
> configure server declarations among the 4 local servers as
> follows:
>
> A->B,C
> B->C,D
> C->D,A
> D->A,B[/color]
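For illustration only, here is a minimal Python sketch that writes out the server declarations for such a ring. It assumes that "A->B,C" means host A carries `server` lines for B and C, and that every host has the same number of real clock sources; the function name and the `external_sources` parameter are made up for this example.

```python
def ring_server_lines(hosts, fanout=2, external_sources=4):
    """Return {host: [ntp.conf 'server' lines]} for a directional ring.

    Each host points at the next `fanout` hosts in the list, wrapping around.
    `external_sources` is the number of real clock sources per host (assumed
    equal on every host); the rule of thumb wants fewer internal servers.
    """
    if fanout >= external_sources:
        raise ValueError("need fewer internal servers than real clock sources")
    if fanout >= len(hosts):
        raise ValueError("fanout must be smaller than the number of hosts")
    n = len(hosts)
    return {
        host: [f"server {hosts[(i + k) % n]}" for k in range(1, fanout + 1)]
        for i, host in enumerate(hosts)
    }


layout = ring_server_lines(["A", "B", "C", "D"])
for host, lines in layout.items():
    print(f"# ntp.conf fragment on {host}")
    print("\n".join(lines))
```

With four hosts and a fanout of 2 this reproduces the A->B,C / B->C,D / C->D,A / D->A,B layout quoted above. Real host names would replace the single letters.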
TIA for more insight,
Joachim
--
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
Joachim Schrod Email: [email]jschrod@acm.org[/email]
Roedermark, Germany
Re: Architecture / best practice for small/medium company setups
"Joachim Schrod" <jschrod@acm.org> wrote in message
news:4gkb7jF1n889gU1@individual.net...
[color=blue]
> [...] What do you think of his idea to use server declarations to
> handle better the case of Internet connection outage?[/color]
Peering can indeed cause massive problems, including that 24-hour spiral
of doom.
That particular problem is caused by having a number of peers all at
the same local-clock stratum. If one of them is put at a lower stratum
than the others, they will follow it instead of their own clocks. If it
dies, they will descend into the spiral after all; however, the same
solution applies. The end result is a staggered local-clock layout
with e.g. peer A at stratum 8, B at 10, C at 12, and D at 14.
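As a rough sketch of that staggered layout, the Python below emits the local-clock lines each host would carry. It assumes the usual local clock driver address 127.127.1.0 with a `fudge ... stratum` line; the helper name and the base/step parameters are illustrative only.

```python
def staggered_local_clock(hosts, base=8, step=2):
    """Return {host: [ntp.conf lines]} giving each host its own local-clock stratum."""
    conf = {}
    for i, host in enumerate(hosts):
        stratum = base + i * step
        if stratum > 15:
            # Stratum 16 means "unsynchronised", so 15 is the highest usable value.
            raise ValueError(f"stratum {stratum} for {host} is out of range")
        conf[host] = [
            "server 127.127.1.0",                    # local clock driver
            f"fudge 127.127.1.0 stratum {stratum}",  # staggered stratum
        ]
    return conf


staggered = staggered_local_clock(["A", "B", "C", "D"])
for host, lines in staggered.items():
    print(host, lines)
```

With the defaults this gives A stratum 8, B 10, C 12 and D 14, so no two hosts advertise their local clock at the same stratum.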
Tom proposes a directional circle of servers instead of a fully-connected
mesh. Both serve the same purpose: if one of the servers loses all of its
own external references, it can 'borrow' those of a neighbour. In the mesh,
all neighbours are always eligible, and a peer will only ever fall one
stratum as long as any outside references are left anywhere. In the circle,
_if_ internal hosts are also failing, it may take more indirections or the
network may even become partitioned.
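To make the difference concrete, here is a small Python model (not anything ntpd does itself) that counts how many indirections a host needs to reach a neighbour that still has outside references. It assumes "A->B,C" means A takes time from B and C; the function name and the failure scenarios are invented for the example.

```python
from collections import deque


def hops_to_reference(start, links, has_external, alive):
    """Breadth-first search along server declarations.

    Returns the number of indirections from `start` to the nearest live host
    that still has an outside reference, or None if none is reachable.
    """
    if start not in alive:
        return None
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        host, dist = queue.popleft()
        if host in has_external:
            return dist
        for nxt in links[host]:
            if nxt in alive and nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, dist + 1))
    return None


ring = {"A": ["B", "C"], "B": ["C", "D"], "C": ["D", "A"], "D": ["A", "B"]}
mesh = {h: [o for o in "ABCD" if o != h] for h in "ABCD"}

# Scenario 1: everyone is up, but only D still has outside references.
all_up = {"A", "B", "C", "D"}
print(hops_to_reference("A", ring, {"D"}, all_up))
print(hops_to_reference("A", mesh, {"D"}, all_up))

# Scenario 2: B and C have also failed.
survivors = {"A", "D"}
print(hops_to_reference("A", ring, {"D"}, survivors))
print(hops_to_reference("A", mesh, {"D"}, survivors))
```

In this toy model A reaches D in one step in the mesh in both scenarios, needs two steps in the ring while everyone is up, and is cut off in the ring once B and C are gone.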
Groetjes,
Maarten Wiltink
Re: Architecture / best practice for small/medium company setups
In article <44a52073$1$31653$e4fe514c@news.xs4all.nl>,
Maarten Wiltink <maarten@kittensandcats.net> wrote:
[color=blue]
> That particular problem is caused by having a number of peers all at
> the same local-clock stratum. If one of them is put at a lower stratum[/color]
Isn't the real problem caused by the use of the local clock driver?
Distributors like including it, and users don't understand that ntpd
will maintain frequency even without any servers, so a lot of people
include it without really understanding it.
The other problem is that many people think that ntpd is designed to
negotiate a consensus time under such circumstances, when, except possibly
for the new isolated networks option, it isn't, and one should use timed,
or similar, instead.
Re: Architecture / best practice for small/medium company setups
"David Woolley" <david@djwhome.demon.co.uk> wrote in message
news:T1151748999@djwhome.demon.co.uk...[color=blue]
> In article <44a52073$1$31653$e4fe514c@news.xs4all.nl>,
> Maarten Wiltink <maarten@kittensandcats.net> wrote:[/color]
[color=blue][color=green]
>> [The spiral of doom] is caused by having a number of peers all at
>> the same local-clock stratum. If one of them is put at a lower stratum[/color]
>
> Isn't the real problem caused by the use of the local clock driver?
> Distributors like including this and users don't understand that ntpd
> will maintain frequency even without any servers, so a lot of people
> include them without really understanding them.[/color]
Note that even while a peer cluster is descending into a spiral,
one is in no worse a situation than letting all clients coast by
themselves.
But there are gains to be had by configuring the local clock into
a server. The server will still drift away from UTC, but now all
the clients will stay close to it and therefore to each other.
Unfortunately, if local clocks are configured incorrectly into a
peer cluster, new and exciting problems may be introduced. With
local clocks all at the same stratum, arguably the worst outcome
of all is likely to happen: the servers drifting confidently apart,
with some clients following one and some another, hopping and
stepping until the servers have drifted too far apart to remain
believable. All the while believing that nothing is the matter,
because they're still synchronised to something.
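A back-of-the-envelope calculation shows how quickly this can happen. The Python below is only arithmetic on assumed numbers: the two frequency errors (+20 ppm and -15 ppm) are hypothetical, and 0.128 s is used as the comparison point because it is ntpd's default step threshold.

```python
import math


def divergence_seconds(ppm_a, ppm_b, elapsed_s):
    """Offset between two free-running clocks after `elapsed_s` seconds."""
    return abs(ppm_a - ppm_b) * 1e-6 * elapsed_s


def seconds_until(threshold_s, ppm_a, ppm_b):
    """Time for two free-running clocks to drift `threshold_s` apart."""
    relative = abs(ppm_a - ppm_b) * 1e-6
    return math.inf if relative == 0 else threshold_s / relative


# Hypothetical frequency errors of two servers running on their local clocks.
per_day = divergence_seconds(20, -15, 86400)
to_step = seconds_until(0.128, 20, -15)
print(f"apart after one day: {per_day:.3f} s")
print(f"time to drift 128 ms apart: {to_step / 60:.0f} min")
```

With these assumed numbers the two servers are about three seconds apart after a day and pass 128 ms of disagreement in roughly an hour.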
[color=blue]
> The other problem is that many people think that ntpd is designed to
> negotiate a consensus time under such circumstances, when except
> possibly for the new isolated networks option, it isn't and one
> should use timed, or similar, instead.[/color]
This is a problem. But it is a problem of misunderstanding and can be
cured by education.
Groetjes,
Maarten Wiltink
Re: Architecture / best practice for small/medium company setups
>>> In article <44a66f65$0$31643$e4fe514c@news.xs4all.nl>, "Maarten Wiltink" <maarten@kittensandcats.net> writes:
Maarten> Note that even while a peer cluster is descending into a spiral,
Maarten> one is in no worse a situation than letting all clients coast by
Maarten> themselves.
The problem is that if all clients coast by themselves and if one is using
NFS or Kerberos or wants to match log entries or ... then there can be
Problems.
If this is not an issue at a site, then it's probably not needed.
I frequently run NFS, so I want all my machines to agree on time if, for
example, I lose internet connectivity (which, at one site, used to happen
fairly often).
H
Re: Architecture / best practice for small/medium company setups
Maarten Wiltink wrote:[color=blue][color=green]
>> The other problem is that many people think that ntpd is designed to
>> negotiate a consensus time under such circumstances, when except
>> possibly for the new isolated networks option, it isn't and one
>> should use timed, or similar, instead.[/color]
>
> This is a problem. But it is a problem of misunderstanding and can be
> cured by education.
>[/color]
And one for which Dave has designed a solution that works well. See
manycast.
Danny
[color=blue]
> Groetjes,
> Maarten Wiltink[/color]