Re: Architecture / best practice for small/medium company setups - NTP

  1. Re: Architecture / best practice for small/medium company setups

    Tom Smith wrote to me offline about internal server peering, and I want to post
    some here (with his permission) to get further feedback on it. What do you think
    of his idea to use server declarations to handle better the case of Internet
    connection outage?

    > You may peer between local servers, but it is important that
    > you have fewer peers on each server than real clock sources.
    > Otherwise, the peers can start following one another rather than
    > the real clocks. If I have 4 low stratum servers configured,
    > I generally try to have no more than 2 peers. In fact, I usually
    > use server rather than peer declarations, because "peer" establishes
    > a bidirectional server relationship, which can unintentionally
    > multiply the number of server/peers. For example, you might
    > configure server declarations among the 4 local servers as
    > follows:
    >
    > A->B,C
    > B->C,D
    > C->D,A
    > D->A,B
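
A minimal sketch of how the directed ring quoted above could be turned into `server` lines for each host's ntp.conf. The host names are hypothetical placeholders for A to D, and the helper names `ring_neighbours` and `server_lines` are invented for this illustration. Each host would still need its own external sources, and per the advice above, more of them than the two internal ones.

```python
# Hypothetical host names standing in for servers A, B, C and D.
LOCAL_SERVERS = [
    "ntp-a.example.com",
    "ntp-b.example.com",
    "ntp-c.example.com",
    "ntp-d.example.com",
]


def ring_neighbours(servers, fanout=2):
    """Map each server to the next `fanout` servers around the ring."""
    n = len(servers)
    if not 0 < fanout < n:
        raise ValueError("fanout must be between 1 and len(servers) - 1")
    return {
        host: [servers[(i + k) % n] for k in range(1, fanout + 1)]
        for i, host in enumerate(servers)
    }


def server_lines(host, neighbours):
    """Return one-directional 'server' declarations for one host."""
    return [f"server {name}" for name in neighbours[host]]


if __name__ == "__main__":
    table = ring_neighbours(LOCAL_SERVERS)
    for host in LOCAL_SERVERS:
        print(f"# internal sources for {host}")
        print("\n".join(server_lines(host, table)))
```

With four servers and the default fanout of 2 this reproduces the A->B,C / B->C,D / C->D,A / D->A,B layout.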


    TIA for more insight,

    Joachim

    --
    =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
    Joachim Schrod Email: jschrod@acm.org
    Roedermark, Germany

  2. Re: Architecture / best practice for small/medium company setups

    "Joachim Schrod" wrote in message
    news:4gkb7jF1n889gU1@individual.net...

    > [...] What do you think of his idea to use server declarations to
    > handle better the case of Internet connection outage?


    Peering can indeed cause massive problems, including that 24-hour spiral
    of doom.

    That particular problem is caused by having a number of peers all at
    the same local-clock stratum. If one of them is put at a lower stratum
    than the others, they will follow it instead of their own clocks. If it
    dies, the rest will descend into the spiral after all; however, the
    same solution applies again. The end result is a staggered local-clock
    layout with, e.g., peer A at stratum 8, B at 10, C at 12, and D at 14.
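
The staggered layout can be sketched as a small generator for the local-clock lines of each server. This is an illustration only: it assumes the local clock driver is configured through its usual pseudo-address 127.127.1.0 with a `fudge ... stratum` line, and the function names and the base/step defaults (8 and 2, taken from the example above) are choices made for this sketch.

```python
LOCAL_CLOCK = "127.127.1.0"  # pseudo-address of ntpd's local clock driver


def local_clock_strata(servers, base=8, step=2):
    """Assign each server its own local-clock stratum: base, base+step, ..."""
    strata = {host: base + i * step for i, host in enumerate(servers)}
    for host, stratum in strata.items():
        # Stratum 16 means "unsynchronised" in NTP, so stay below it.
        if stratum >= 16:
            raise ValueError(f"stratum {stratum} for {host} is out of range")
    return strata


def local_clock_lines(stratum):
    """Return the ntp.conf lines that fudge the local clock to `stratum`."""
    return [
        f"server {LOCAL_CLOCK}",
        f"fudge {LOCAL_CLOCK} stratum {stratum}",
    ]


if __name__ == "__main__":
    for host, stratum in local_clock_strata(["A", "B", "C", "D"]).items():
        print(f"# {host}")
        print("\n".join(local_clock_lines(stratum)))
```

Because no two servers share a local-clock stratum, the one with the lowest number wins whenever all external references are gone.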

    Tom proposes a directional circle of servers instead of a fully-connected
    mesh. Both serve the same purpose: if one of the servers loses all of its
    own external references, it can 'borrow' those of a neighbour. In the mesh,
    all neighbours are always eligible, and a peer will only ever fall one
    stratum as long as any outside references are left anywhere. In the circle,
    _if_ internal hosts are also failing, it may take more indirections or the
    network may even become partitioned.
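
To make the mesh-versus-circle comparison concrete, here is a rough model that counts how many internal hops each live server needs to reach a server that still has external references. It is a simplification under stated assumptions: the topology is a plain directed graph, `fallback_hops` is a name invented for this sketch, and ntpd's actual selection and clustering algorithms are not modelled.

```python
from collections import deque


def fallback_hops(topology, has_external, alive):
    """For each live server, count hops to the nearest live server that
    still has external references (0 = has its own, None = cut off)."""
    hops = {}
    for start in topology:
        if start not in alive:
            continue
        seen = {start}
        queue = deque([(start, 0)])
        result = None
        while queue:
            node, dist = queue.popleft()
            if node in has_external:
                result = dist
                break
            for nxt in topology[node]:
                if nxt in alive and nxt not in seen:
                    seen.add(nxt)
                    queue.append((nxt, dist + 1))
        hops[start] = result
    return hops


RING = {"A": ["B", "C"], "B": ["C", "D"], "C": ["D", "A"], "D": ["A", "B"]}
MESH = {h: [o for o in "ABCD" if o != h] for h in "ABCD"}

if __name__ == "__main__":
    everyone = {"A", "B", "C", "D"}
    # Only D keeps its external references; all hosts are up.
    print(fallback_hops(RING, {"D"}, everyone))
    print(fallback_hops(MESH, {"D"}, everyone))
    # B and C are down as well: A is cut off in the ring but not in the mesh.
    print(fallback_hops(RING, {"D"}, {"A", "D"}))
    print(fallback_hops(MESH, {"D"}, {"A", "D"}))
```

In the first scenario A needs two hops in the ring and one in the mesh; in the second, A has no path at all in the ring, which is the partition case.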

    Groetjes,
    Maarten Wiltink



  3. Re: Architecture / best practice for small/medium company setups

    In article <44a52073$1$31653$e4fe514c@news.xs4all.nl>,
    Maarten Wiltink wrote:

    > That particular problem is caused by having a number of peers all at
    > the same local-clock stratum. If one of them is put at a lower stratum


    Isn't the real problem caused by the use of the local clock driver?
    Distributors like including this and users don't understand that ntpd
    will maintain frequency even without any servers, so a lot of people
    include them without really understanding them.

    The other problem is that many people think that ntpd is designed to
    negotiate a consensus time under such circumstances, when except possibly
    for the new isolated networks option, it isn't and one should use timed,
    or similar, instead.

  4. Re: Architecture / best practice for small/medium company setups

    "David Woolley" wrote in message
    news:T1151748999@djwhome.demon.co.uk...
    > In article <44a52073$1$31653$e4fe514c@news.xs4all.nl>,
    > Maarten Wiltink wrote:


    >> [The spiral of doom] is caused by having a number of peers all at
    >> the same local-clock stratum. If one of them is put at a lower stratum

    >
    > Isn't the real problem caused by the use of the local clock driver?
    > Distributors like including this and users don't understand that ntpd
    > will maintain frequency even without any servers, so a lot of people
    > include them without really understanding them.


    Note that even while a peer cluster is descending into a spiral,
    one is in no worse a situation than letting all clients coast by
    themselves.

    But there are gains to be had by configuring the local clock into
    a server. The server will still drift away from UTC, but now all
    the clients will stay close to it and therefore to each other.

    Unfortunately, if local clocks are configured incorrectly into a
    peer cluster, new and exciting problems may be introduced. With
    local clocks all at the same stratum, arguably the worst outcome
    of all is likely to happen: the servers drifting confidently apart,
    with some clients following one and some another, hopping and
    stepping until the servers have drifted too far apart to remain
    believable. All the while, the clients believe that nothing is the
    matter, because they're still synchronised to something.
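
A back-of-the-envelope sketch of how fast two free-running servers move apart. The frequency errors used here (+20 ppm and -15 ppm) are made-up illustrative values, not measurements, and the 128 ms figure is ntpd's default step threshold, used only as a convenient yardstick.

```python
def divergence_seconds(ppm_a, ppm_b, elapsed_seconds):
    """Offset between two free-running clocks after `elapsed_seconds`,
    given their frequency errors in parts per million."""
    return abs(ppm_a - ppm_b) * 1e-6 * elapsed_seconds


def seconds_until_apart(ppm_a, ppm_b, limit_seconds):
    """Time until the two clocks differ by `limit_seconds`."""
    rate = abs(ppm_a - ppm_b) * 1e-6
    if rate == 0:
        return float("inf")  # identical drift: they never separate
    return limit_seconds / rate


if __name__ == "__main__":
    # Assumed drifts: server 1 runs 20 ppm fast, server 2 runs 15 ppm slow.
    print(divergence_seconds(20, -15, 86400))   # offset after one day
    print(seconds_until_apart(20, -15, 0.128))  # time to reach 128 ms
```

Under these assumed drifts the two servers are roughly three seconds apart after a day and pass 128 ms after about an hour.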


    > The other problem is that many people think that ntpd is designed to
    > negotiate a consensus time under such circumstances, when except
    > possibly for the new isolated networks option, it isn't and one
    > should use timed, or similar, instead.


    This is a problem. But it is a problem of misunderstanding and can be
    cured by education.

    Groetjes,
    Maarten Wiltink



  5. Re: Architecture / best practice for small/medium company setups

    >>> In article <44a66f65$0$31643$e4fe514c@news.xs4all.nl>, "Maarten Wiltink" writes:

    Maarten> Note that even while a peer cluster is descending into a spiral,
    Maarten> one is in no worse a situation than letting all clients coast by
    Maarten> themselves.

    The problem is that if all clients coast by themselves and if one is using
    NFS or Kerberos or wants to match log entries or ... then there can be
    Problems.

    If this is not an issue at a site, then it's probably not needed.

    I frequently run NFS so I want all my machines to agree on time if, for
    example, I lose internet connectivity (which at one site, used to happen
    fairly often).

    H

  6. Re: Architecture / best practice for small/medium company setups

    Maarten Wiltink wrote:
    >> The other problem is that many people think that ntpd is designed to
    >> negotiate a consensus time under such circumstances, when except
    >> possibly for the new isolated networks option, it isn't and one
    >> should use timed, or similar, instead.

    >
    > This is a problem. But it is a problem of misunderstanding and can be
    > cured by education.
    >


    And one for which Dave has designed a solution that works well. See
    manycast.

    Danny

    > Groetjes,
    > Maarten Wiltink

    _______________________________________________
    questions mailing list
    questions@lists.ntp.isc.org
    https://lists.ntp.isc.org/mailman/listinfo/questions

