Speed of ntp convergence - NTP


  1. Speed of ntp convergence




    Just another data point on the behaviour of ntp. My ntp went down (due to
    something removing the ntp user from the password file). When I brought it
    back up again, the time was out and I think the drift file was out.
    I have three sources: a nearby stratum 2 server, a distant stratum 1 server and
    a gps refclock with a PPS. The PPS is set up to drive the refclock_shm driver.
    The refclock has a poll level of 4 (poll interval of 16 sec), and both the others
    are at the default (minpoll 6, maxpoll 10).
    I brought the system back up at 54770 78683.101 (the ntp log file date and
    time). The way the PPS works is that it waits about 5 min until it is sure
    that ntp has the time from the other servers to within 250ms. It then
    switches on the shm driver. The refclock started out with an offset of
    about 150ms, which increased to 280ms and was eventually (about 6 min later)
    reset to zero offset because I was running ntp with the -g option.

    This was also the point at which the PPS shm kicked in.
    However, the drift rate was clearly off, because the offset then gradually
    increased until at 82555 (3878 sec after the start, or about 1 hr) it was 52ms off
    (the maximum offset). By the end of the day (86400, or about 7700 sec or 2 hrs
    after the start) it was still 19ms offset from the PPS zero time.

    An hour later, it was still 7ms off, another hour, 2.6ms and another hour
    later, still 1.2 ms off. I.e., only after about 6 hours was it within a ms of
    the correct time. Now, usually this PPS controls the time to within about
    2us (not ms, usec) but it is apparently going to take over 10 hours to get
    there. That is of course completely ridiculous.

    The shm poll interval is 16 sec and even if ntp throws away 7/8 of the samples
    that would give a max time between data points of about 128 sec or 2 min. Thus
    ntp should have a time constant of about 4 min. Instead the time constant is
    about an hour. It would seem that the time constant is selected as far longer
    than the longest poll interval. (The poll interval during this time on the other
    two sources was 64 sec, or poll level 6.)
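
    The "about an hour" figure can be checked against the offsets quoted in this
    post. The following is only a sketch: it assumes a pure exponential decay,
    takes the first two times from the log timestamps given above, and assumes
    that each "an hour later" means exactly 3600 sec.

```python
import math

# (seconds after ntpd start, offset in ms), as reported in the post.
# 7717 is 86400 - 78683; the later spacings of 3600 s are an assumption.
samples = [
    (3878, 52.0),
    (7717, 19.0),
    (7717 + 3600, 7.0),
    (7717 + 7200, 2.6),
    (7717 + 10800, 1.2),
]

def time_constants(points):
    """Return tau for each consecutive pair, assuming offset ~ exp(-t / tau)."""
    taus = []
    for (t0, x0), (t1, x1) in zip(points, points[1:]):
        taus.append((t1 - t0) / math.log(x0 / x1))
    return taus

for tau in time_constants(samples):
    print(f"tau = {tau:.0f} s ({tau / 60:.0f} min)")
```

    Every consecutive pair gives a time constant on the order of an hour, far
    above the 4 min estimated from the poll interval.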

    Within the first minute, I could by hand
    determine the offset of the CPU clock to within a few usec (not
    msec or tens of msec), and the slope to better than 1 PPM.
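
    One way to do such a by-hand estimate is an ordinary least-squares line
    through the (time, offset) readings. This is a minimal sketch, not what ntpd
    does; the function name and the sample values (150 ms initial offset, 140 PPM
    drift, readings every 16 sec) are made up for illustration.

```python
def fit_offset_and_rate(times, offsets):
    """Least-squares line fit: returns (offset at t=0 in s, rate in s/s)."""
    n = len(times)
    if n < 2:
        raise ValueError("need at least two samples")
    mean_t = sum(times) / n
    mean_x = sum(offsets) / n
    sxx = sum((t - mean_t) ** 2 for t in times)
    sxy = sum((t - mean_t) * (x - mean_x) for t, x in zip(times, offsets))
    rate = sxy / sxx
    return mean_x - rate * mean_t, rate

# Hypothetical, noise-free readings taken every 16 s for about one minute.
times = [0, 16, 32, 48, 64]
offsets = [0.150 + 140e-6 * t for t in times]
offset0, rate = fit_offset_and_rate(times, offsets)
print(f"offset = {offset0 * 1e3:.3f} ms, rate = {rate * 1e6:.1f} PPM")
```

    With real readings the fit is only as good as the measurement noise allows,
    so the result should be read together with the residuals.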

    But never mind my concern about the Markovian feedback system ntp
    uses. That argument I am sure everyone is tired of. What concerns me is
    the long (1 hr) time constant of the feedback loop, about 200 times longer
    than the poll interval. I.e., it does not seem to me that ntp is fulfilling
    its design criteria.


    Here, 5.5 hours after startup, is the output of ntpq -p:

    string[root]>ntpq -p
         remote           refid      st t when poll reach   delay   offset  jitter
    ==============================================================================
    +tick.usask.ca   .GPS.            1 u   18   64  377   44.925    1.455   4.252
    +ntp.ubc.ca      140.142.16.34    2 u   44   64  343    0.672    0.260   0.767
    *SHM(0)          .PPS.            0 l    1   16  377    0.000    1.136   0.023




  2. Re: Speed of ntp convergence

    Unruh wrote:
    []
    > An hour later, it was still 7ms off, another hour, 2.6ms and another
    > hour
    > later, still 1.2 ms off. Ie, only after about 6 hours was it within a
    > ms of
    > the correct time. Now, usually this PPS controls the time to within
    > about 2us (not ms, usec) but it is apparently going to take over 10
    > hours to get
    > there. That is of course completely rediculous.


    It sounds as though something is wrong with your system.

    As a comparison, I have a very old Pentium 133 system here running FreeBSD
    with local GPS PPS and some other Internet-based stratum 2/3 servers
    (probably NTP pool and a fixed name). I'm sure it takes well within a few
    minutes to reach its full accuracy (tens of microseconds). For
    interest, I've just (0645 UTC) switched it off and on, and we will be able
    to watch its recovery here (30 minute updates):

    http://www.satsignal.eu/mrtg/pixie_ntp.html

    Here it is about a minute after startup:

    ntpq -p pixie
         remote           refid      st t when poll reach   delay   offset  jitter
    ==============================================================================
    +calx.pulsewidth 193.120.10.3     2 u   62   64    1   22.272    1.700   0.743
    +admin.islay.bit 192.33.96.102    2 u   62   64    1   21.131    0.921   1.112
    +dnscache-london 128.250.33.242   2 u   62   64    1   22.845    3.299   0.666
     88-96-233-89.ds .PPS.            1 u   14  128    7   63.431    0.044   2.789
    *utserv.mcc.ac.u 193.62.22.98     2 u   64   64    1   26.494    4.312   0.829
     GPS_NMEA(1)     .PPS.            0 l   12   64    3    0.000   -0.137   1.654

    ... and a few minutes later ...

    ntpq -p pixie
         remote           refid      st t when poll reach   delay   offset  jitter
    ==============================================================================
    +calx.pulsewidth 193.120.10.3     2 u   54   64   37   22.348    2.946   0.877
    +admin.islay.bit 192.33.96.102    2 u   53   64   37   20.496    1.862   1.057
    +dnscache-london 128.250.33.242   2 u   57   64   37   23.090    3.809   0.662
     88-96-233-89.ds .PPS.            1 u  134  256   17   63.431    0.044   2.007
    +utserv.mcc.ac.u 193.62.22.98     2 u   53   64   37   25.564    5.371   0.868
    *GPS_NMEA(1)     .PPS.            0 l    3   64   77    0.000   -0.001   0.803

    It's using the out-of-the-box NTP code, and probably a rather old version
    of NTP.

    version="ntpd 4.2.0-a Sun May 8 06:01:21 UTC 2005 (1)"

    It's a very simple system, described here:
    http://www.satsignal.eu/ntp/FreeBSD-GPS-PPS.htm

    Cheers,
    David


  3. Re: Speed of ntp convergence

    David J Taylor wrote:
    []
    > As a comparison, I have a very old Pentium 133 system here running
    > FreeBSD with local GPS PPS and some other Internet-based stratum 2/3
    > servers (probably NTP pool and a fixed name). I'm sure it's well
    > within a few minutes for it to reach it's full accuracy (tens of
    > microseconds). For interest, I've just (0645 UTC) switched it off
    > and on, and we will be able to watch its recovery here (30 minute
    > updates):
    > http://www.satsignal.eu/mrtg/pixie_ntp.html

    []
    > Cheers,
    > David


    Checking a little later shows convergence down to noise level within 15
    minutes.

    David


  4. Re: Speed of ntp convergence

    Could it be a poor crystal being affected by the temperature change?
    After all, even with a solid PPS, the system time is controlled by its
    crystals.

    David showed data for his Sun system and even x86 systems by Sun have
    pretty good crystals.

    HTH

  5. Re: Speed of ntp convergence

    "David J Taylor" writes:

    >Unruh wrote:
    >[]
    >> An hour later, it was still 7ms off, another hour, 2.6ms and another
    >> hour
    >> later, still 1.2 ms off. Ie, only after about 6 hours was it within a
    >> ms of
    >> the correct time. Now, usually this PPS controls the time to within
    >> about 2us (not ms, usec) but it is apparently going to take over 10
    >> hours to get
    >> there. That is of course completely rediculous.


    >There sounds to be something wrong with your system.


    Nope, it is not my "system" if by that you mean my computer. The
    convergence is a beautiful exponential convergence with a time scale of 1
    hour almost exactly. That is not hardware. That is the software ntp
    protocol.


    >As a comparison, I have a very old Pentium 133 system here running FreeBSD
    >with local GPS PPS and some other Internet-based stratum 2/3 servers
    >(probably NTP pool and a fixed name). I'm sure it's well within a few
    >minutes for it to reach it's full accuracy (tens of microseconds). For
    >interest, I've just (0645 UTC) switched it off and on, and we will be able
    >to watch its recovery here (30 minute updates):


    Try switching it off, changing the value in the drift file by say 50PPM and
    then switching it on again, and see how long it takes to recover from that.

    Note, if you are running gps, why have a poll level 6? The recommendation
    for refclocks is poll level 4?

    > http://www.satsignal.eu/mrtg/pixie_ntp.html


    >Here it is about a minute after startup:


    >ntpq -p pixie
    > remote refid st t when poll reach delay offset
    >jitter
    >================================================== ============================
    >+calx.pulsewidth 193.120.10.3 2 u 62 64 1 22.272 1.700
    >0.743
    >+admin.islay.bit 192.33.96.102 2 u 62 64 1 21.131 0.921
    >1.112
    >+dnscache-london 128.250.33.242 2 u 62 64 1 22.845 3.299
    >0.666
    > 88-96-233-89.ds .PPS. 1 u 14 128 7 63.431 0.044
    >2.789
    >*utserv.mcc.ac.u 193.62.22.98 2 u 64 64 1 26.494 4.312
    >0.829
    > GPS_NMEA(1) .PPS. 0 l 12 64 3 0.000 -0.137
    >1.654


    >.. and a few minutes later ...


    >ntpq -p pixie
    > remote refid st t when poll reach delay offset
    >jitter
    >================================================== ============================
    >+calx.pulsewidth 193.120.10.3 2 u 54 64 37 22.348 2.946
    >0.877
    >+admin.islay.bit 192.33.96.102 2 u 53 64 37 20.496 1.862
    >1.057
    >+dnscache-london 128.250.33.242 2 u 57 64 37 23.090 3.809
    >0.662
    > 88-96-233-89.ds .PPS. 1 u 134 256 17 63.431 0.044
    >2.007
    >+utserv.mcc.ac.u 193.62.22.98 2 u 53 64 37 25.564 5.371
    >0.868
    >*GPS_NMEA(1) .PPS. 0 l 3 64 77 0.000 -0.001
    >0.803


    >It's using the out-of-the-box NTP code, and probably a rather old version
    >of NTP.


    > version="ntpd 4.2.0-a Sun May 8 06:01:21 UTC 2005 (1)"


    >It's a very simple system, described here:
    > http://www.satsignal.eu/ntp/FreeBSD-GPS-PPS.htm


    And I am using 4.2.4p4



    >Cheers,
    >David



  6. Re: Speed of ntp convergence

    Evandro Menezes writes:

    >Could it be a poor crystal being affected by the temperature change?
    >After all, even with a solid PPS, the system time is controlled by its
    >crystals.


    Nope. As I said it is a beautiful exponential convergence of the phase
    offset with a time scale of almost exactly one hour.


    >David showed data for his Sun system and even x86 systems by Sun have
    >pretty good crystals.


    Sure. Mine is stock Linux PC hardware, running 4.2.4p4, not 4.2.0.

    It is possible a bug has crept into the software.




    >HTH


  7. Re: Speed of ntp convergence

    Unruh wrote:
    []
    > Nope, it is not my "ssytem" if by that you mean my computer. The
    > convergence is a beautiful exponential convergence with a time scale
    > of 1
    > hour almost exactly. That is not hardware. That is the software ntp
    > protocol.


    Just along the lines that if my system converges in 10 minutes, I am at a
    loss to see why yours takes ten hours. It seems to me that there is
    nothing inherently wrong with the NTP version I have.

    []
    > Try switching it off, changing the value int he drift file by say
    > 50PPM and
    > then switching it on again, and see how long it takes to recover from
    > that.


    Why would I do that? The drift values rarely change by more than five,
    certainly not by 50. If you are seeing a change of 50, then perhaps that
    is part of your problem?

    > Note, if you are running gps, why have a poll level 6? The
    > recommendation
    > for ref- clocks is poll level 4?


    Probably because the system is also polling Internet servers, and I didn't
    want to hammer them at 16 second intervals. It seems that the Internet
    poll interval /must/ be the same as the ref-clock poll interval, and that
    it doesn't automatically adjust upwards (which would be nice).

    Where did you get your information from, by the way? I found this:

    http://sunsite.ualberta.ca/Documenta...a/clockopt.htm

    "For most directly connected reference clocks, both minpoll and maxpoll
    default to 6 (64 s)."

    from a quick Google search, but I don't know if that's in the official
    documentation.

    Cheers,
    David


  8. Re: Speed of ntp convergence

    >> Try switching it off, changing the value int he drift file by say
    >> 50PPM and
    >> then switching it on again, and see how long it takes to recover from
    >> that.


    >Why would I do that? The drift values rarely change by more than five,
    >certainly not by 50. If you are seeing a change of 50, then perhaps that
    >it part of your problem?


    A big step like that makes it easy to see how the system responds.
    At least if it's a linear system.

    --
    These are my opinions, not necessarily my employer's. I hate spam.


  9. Re: Speed of ntp convergence


    >Note, if you are running gps, why have a poll level 6? The recommendation
    >for ref- clocks is poll level 4?


    Where/who does that recommendation come from?

    --
    These are my opinions, not necessarily my employer's. I hate spam.


  10. Re: Speed of ntp convergence

    "David J Taylor" writes:

    >Unruh wrote:
    >[]
    >> Nope, it is not my "ssytem" if by that you mean my computer. The
    >> convergence is a beautiful exponential convergence with a time scale
    >> of 1
    >> hour almost exactly. That is not hardware. That is the software ntp
    >> protocol.


    >Just along the lines that if my system converges in 10 minutes, I am at a
    >loss to see why yours takes ten hours. It seems to me that there is
    >nothing inherently wrong with the NTP version I have.


    >[]
    >> Try switching it off, changing the value int he drift file by say
    >> 50PPM and
    >> then switching it on again, and see how long it takes to recover from
    >> that.


    >Why would I do that? The drift values rarely change by more than five,


    To test to see how long your system takes to converge. It is clear that on my
    system ntp is somehow getting a very bad idea of the frequency. For
    example, the time drifts to an offset larger than 125ms and resets to 0 (because
    of the -g), but almost immediately it drifts at about 140PPM. It then takes 10
    hours to get the offset back to 0.

    Why is the drift off so badly? I do not know, but that is irrelevant. ntp
    should NOT take 10 hours to correct a badly drifting clock.

    So how long does it take your system to correct a bad drift file?



    >certainly not by 50. If you are seeing a change of 50, then perhaps that
    >it part of your problem?


    It may be. But ntp's problem is that it is taking far far far too long to
    correct a bad drift.



    >> Note, if you are running gps, why have a poll level 6? The
    >> recommendation
    >> for ref- clocks is poll level 4?


    >Probably because the system is also polling Internet servers, and I didn't
    >want to hammer them at 16 second intervals. It seems that the Internet
    >poll interval /must/ be the same as the ref-clock poll interval, and that
    >it doesn't automatically adjust upwards (which would be nice).


    No, each source has its own poll interval. They do NOT need to be the same.



    >Where did you get your information from, by the way? I found this:


    > http://sunsite.ualberta.ca/Documenta...a/clockopt.htm


    > "For most directly connected reference clocks, both minpoll and maxpoll
    >default to 6 (64 s)."


    I recall from somewhere that it was 4, but I could be wrong. I have it
    as 4 for my refclock. (4.0.99 is a bit old.)




    >from a quick Google search, but I don't know if that's in the official
    >documentation.






  11. Re: Speed of ntp convergence

    Hal Murray wrote:
    >>> Try switching it off, changing the value int he drift file by say
    >>> 50PPM and
    >>> then switching it on again, and see how long it takes to recover
    >>> from that.

    >
    >> Why would I do that? The drift values rarely change by more than
    >> five, certainly not by 50. If you are seeing a change of 50, then
    >> perhaps that it part of your problem?

    >
    > A big step like that makes it easy to see how the system responds.
    > At least if it's a linear system.


    Yes, I appreciate that, Hal, but it doesn't emulate the situation here
    very well, which I understood to be slow convergence after a routine
    start. It sounds as if the OP may have an incorrect drift file - it's
    worth checking that it /is/ being updated.

    Cheers,
    David


  12. Re: Speed of ntp convergence

    Unruh wrote:
    []
    > So how long does it take your system to correct a bad drift file?


    Oh, much longer, but that just suggests that your drift file is bad, not
    any inherent fault in NTP.


    >>> Note, if you are running gps, why have a poll level 6? The
    >>> recommendation
    >>> for ref- clocks is poll level 4?

    >
    >> Probably because the system is also polling Internet servers, and I
    >> didn't want to hammer them at 16 second intervals. It seems that
    >> the Internet poll interval /must/ be the same as the ref-clock poll
    >> interval, and that it doesn't automatically adjust upwards (which
    >> would be nice).

    >
    > No, each source has its own poll interval. They do NOT need to be the
    > same.


    It's not what Dave Mills told me on this newsgroup, when PPS reference
    clocks are included.

    David


  13. Re: Speed of ntp convergence

    "David J Taylor" writes:

    >Hal Murray wrote:
    >>>> Try switching it off, changing the value int he drift file by say
    >>>> 50PPM and
    >>>> then switching it on again, and see how long it takes to recover
    >>>> from that.

    >>
    >>> Why would I do that? The drift values rarely change by more than
    >>> five, certainly not by 50. If you are seeing a change of 50, then
    >>> perhaps that it part of your problem?

    >>
    >> A big step like that makes it easy to see how the system responds.
    >> At least if it's a linear system.


    >Yes, I appreciate that, Hal, but it doesn't emulate the situation here
    >very well, which I understood to be slow convergence after a routine
    >start. It sounds as if the OP may have an incorrect drift file - it's
    >worth checking that it /is/ being updated.




    The drift file read 10. The actual drift was 250 (determined after the
    system had settled down). The drift file never changed even after a day of
    running. ntp does not seem to be rewriting the drift file. Now that is a
    problem (although with the apparent Linux bug in the timing routines, where it
    miscalibrates the clock on bootup, the drift is NOT stable over reboots
    anyway, so the existence of a drift file is irrelevant). However, the question is
    about the behaviour of ntp. ntp should NOT be taking 10 hours to get over a
    wrong value in the drift file. With a GPS refclock the system should be able
    to figure out the drift to better than 1 PPM from two successive readings
    of the clock (16 sec) or at most 3 readings (32 sec), NOT 10 hours.

    It is a design flaw in ntp. A design flaw which Mills absolutely refuses to
    fix, saying ntp works as designed. Clocks jump in frequency-- for example if the
    machine suddenly gets used a lot, the temp can jump by 20-30C which can drive the
    rate out by a few PPM. With a response time of hours, that means that the
    ntp response to that change in temperature takes hours, apparently even if
    the poll level is 4 (16 sec).

    Mills has based his feedback loop on the Markovian theory of simple
    feedback loops of engineering control theory. In most systems from the 20th
    century, memory was in very short supply. The only memory was in the
    parameters of the system itself (voltage across a capacitor, current in an
    inductor, speed of the motor) and the control theory worked well with that. Each
    bit of memory (each additional capacitor, inductor, governor) cost a lot. In a
    digital computer, memory is virtually infinite in supply. It costs
    essentially nothing to remember. Thus you can use your data values from
    almost as far back as you wish. Of course they can become irrelevant
    because the physical situation changes.
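
    The claim in this post that two readings 16 sec apart pin down the drift to
    better than 1 PPM can be sanity-checked with simple error propagation. This is
    a rough sketch under stated assumptions: each reading carries an independent
    error, and the 2 usec figure quoted earlier in the thread for the PPS is used
    as that per-reading error. The function name is made up for illustration.

```python
import math

def rate_uncertainty_ppm(sigma_s, interval_s):
    """1-sigma uncertainty of a rate estimated from two offset readings.

    Assumes each reading has independent error sigma_s and the readings
    are interval_s apart, so their difference has error sqrt(2) * sigma_s.
    """
    return math.sqrt(2) * sigma_s / interval_s * 1e6

# Assumed inputs: 2 usec error per reading, 16 s between readings.
print(f"{rate_uncertainty_ppm(2e-6, 16):.2f} PPM")
```

    Under those assumptions the estimate comes out well below 1 PPM; noisier
    readings scale the result proportionally.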

    The ntp control theory uses only the clock offset and rate as its "memory"
    (the current measurement alters only those two parameters and is then forgotten
    except in its effect on those two parameters; in fact the control comes
    ONLY through control of the drift rate). Chrony remembers up to 64 previous
    measurements-- correcting them for changes in offset and frequency of the
    clock-- and throws them away only when it becomes clear that the parameters
    of the system have changed (error no longer dominated by random noise, but
    by some consistent change) and those old values are useless for the
    prediction of the future behaviour of the clock. This means that it can,
    after three measurements, get a very good estimate of both the frequency and
    phase offset of the clock, and correct them, refining them as more data
    comes in. At present its key limitation is that it does not do refclocks at
    all. (It also only runs on Linux.)

    All of this is largely irrelevant if what you want is millisecond accuracy
    of your clock. ntp is great for that. The same holds if your computer is on all
    the time and ntp has converged-- my measurements indicate chrony is only about
    2-3 times better than ntp in that situation, caused I think primarily by
    the temperature-induced rate fluctuations of the cpu. That matters only if
    you are concerned with usec timing: it is the difference between, say, 10 usec
    accuracy and 5 usec accuracy, which for almost all of us is irrelevant.

    ntp also has the advantage of obeying the KISS principle (Keep it simple,
    stupid) in that direct control of the rate of the clock is far simpler than
    keeping a memory, updating and correcting the memory, trying to figure out
    when to forget and when to remember,... And the more complex, the greater
    the possibility of error. (Although with clock-filter, huff-and-puff,
    server selection,... ntp is getting pretty complex as well.)




  14. Re: Speed of ntp convergence

    "David J Taylor" writes:

    >Unruh wrote:
    >[]
    >> So how long does it take your system to correct a bad drift file?


    >Oh, much longer, but that just suggests that your drift file is bad, not
    >any inherent fault in NTP.



    >>>> Note, if you are running gps, why have a poll level 6? The
    >>>> recommendation
    >>>> for ref- clocks is poll level 4?

    >>
    >>> Probably because the system is also polling Internet servers, and I
    >>> didn't want to hammer them at 16 second intervals. It seems that
    >>> the Internet poll interval /must/ be the same as the ref-clock poll
    >>> interval, and that it doesn't automatically adjust upwards (which
    >>> would be nice).

    >>
    >> No, each source has its own poll interval. They do NOT need to be the
    >> same.


    >It's not what Dave Mills told me on this newsgroup, when PPS reference
    >clocks are included.


    Well, the evidence is that the different clocks DO have and use different
    poll intervals. My refclock is on poll level 4 and a reading is taken from
    it every 16 seconds as recorded in peerstats. My outside reference servers
    are on the standard poll 6-10 and they are read every 64 seconds initially
    and later at longer intervals.
    Here is the output of ntpq -p:

    +tick.usask.ca   .GPS.            1 u   14   64  377   44.904    0.147   0.074
    +ntp.ubc.ca      142.3.100.2      2 u  586 1024  371    0.666    9.912   1.265
    *SHM(0)          .PPS.            0 l    2   16  377    0.000   -0.004   0.002

    Note that the refclock is read every 16 seconds, ntp.ubc.ca is read every
    1024 sec (poll 10) and tick.usask.ca is read every 64 sec (poll 6) and
    looking at peerstats confirms this.
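
    The per-source poll intervals can be read straight off the "poll" column of
    that output; the poll level is the base-2 logarithm of the interval in
    seconds. A small sketch follows, with the three peer lines copied from the
    post. The helper name poll_levels is made up, and the parser assumes the
    usual ntpq layout with the tally character in the first column.

```python
# The three peer lines quoted in the post above.
NTPQ_LINES = """\
+tick.usask.ca .GPS. 1 u 14 64 377 44.904 0.147 0.074
+ntp.ubc.ca 142.3.100.2 2 u 586 1024 371 0.666 9.912 1.265
*SHM(0) .PPS. 0 l 2 16 377 0.000 -0.004 0.002
"""

def poll_levels(text):
    """Map each peer name to (poll seconds, poll level) from ntpq -p rows."""
    result = {}
    for line in text.splitlines():
        if not line:
            continue
        # Column 0 is the tally character (or a space); the rest are fields.
        fields = line[1:].split()
        if len(fields) != 10 or not fields[5].isdigit():
            continue  # skip the header and separator lines
        poll_s = int(fields[5])
        # Poll intervals are powers of two, so bit_length() - 1 is log2.
        result[fields[0]] = (poll_s, poll_s.bit_length() - 1)
    return result

for name, (seconds, level) in poll_levels(NTPQ_LINES).items():
    print(f"{name}: every {seconds} s (poll level {level})")
```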


  15. Re: Speed of ntp convergence

    Unruh wrote:
    > "David J Taylor" writes:
    >
    >> Hal Murray wrote:
    >>>>> Try switching it off, changing the value int he drift file by say
    >>>>> 50PPM and
    >>>>> then switching it on again, and see how long it takes to recover
    >>>>> from that.
    >>>> Why would I do that? The drift values rarely change by more than
    >>>> five, certainly not by 50. If you are seeing a change of 50, then
    >>>> perhaps that it part of your problem?
    >>> A big step like that makes it easy to see how the system responds.
    >>> At least if it's a linear system.

    >
    >> Yes, I appreciate that, Hal, but it doesn't emulate the situation here
    >> very well, which I understood to be slow convergence after a routine
    >> start. It sounds as if the OP may have an incorrect drift file - it's
    >> worth checking that it /is/ being updated.

    >
    >
    >
    > The drift file read 10. The actual drift was 250 (determined after the
    > system had settled down). The drift file never changed even after a day of
    > running. ntp does not seem to be rewriting the drift file. Now that is a
    > problem (although with the apparent Linux bug in the timing routines where is
    > miscalibrates the clock on bootup, the drift is NOT stable over reboots
    > anyway, so the existence of a drift file is irrelevant. ) However, the question is
    > about the bahaviour of ntp. ntp should NOT be taking 10 hours to get over a
    > wrong value in the drift file.


    That's easy to fix! If the drift file is not correct, remove it before
    starting ntpd.

    How do you tell if it's incorrect? Since ntpd is supposed to
    update/rewrite the drift file every sixty minutes, a drift file more
    than sixty minutes old is suspect!
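
    That staleness check is easy to script. A minimal sketch, assuming the file's
    modification time reflects the last rewrite by ntpd; the function name is
    made up, and the path to pass in is whatever the driftfile line in ntp.conf
    names on the system in question.

```python
import os
import time

def drift_file_is_stale(path, max_age_s=3600.0, now=None):
    """True if the file at path was last modified more than max_age_s ago."""
    if now is None:
        now = time.time()
    return now - os.path.getmtime(path) > max_age_s
```

    A stale file is only a hint: it says ntpd has not rewritten the file
    recently, not that the stored value is wrong.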

    > With GPS refclock the system should be able
    > to figure out the drift to better than 1 PPM from two successive readings
    > of the clock ( 16 sec) or atmost 3 reading (32 sec) NOT 10 hours.
    > It is a design flaw in ntp. A design flaw which Mills absolutely refuses to
    > fix, saying ntp works as designed. Clocks jump in frequency-- for example if the machine
    > suddenly gets used a lot, the temp can jump by 20-30C which can drive the
    > rate out by a few PPM. With a response time of hours, that means that the
    > ntp response to that change in temperature takes hours, apparently even if
    > the poll interval is 4 (16 sec).
    > Mills has based his feedback loop on the Markovian theory of simple
    > feedback loops of engineering control theory. In most systems from the 20th
    > century, memory was in very short supply. The only memory was in the
    > parameters of the system itself ( voltage across a capacitor, current in an
    > inductor, speed of the motor) and the control theory worked well with that. Each
    > bit of memory (each additional capacitor, inductor, governor) cost a lot. In a
    > digital computer, memory is virtually infinite in supply. It costs
    > essentially nothing to remember. Thus you can use your data values from
    > almost as far back as you wish. Of course they can become irrelevant
    > because the physical situation changes.
    >
    > The ntp control theory uses only the clock offset and rate as its "memory",
    > (current measurement alters only those two parameters and then is forgotten
    > except in its effect of those two parameters. In fact the control comes
    > ONLY though control of the drift rate). Chrony remembers up to 64 previous
    > measurements-- correcting them for changes in offset and frequency of the
    > clock-- and throws them away only when it becomes clear that the parameters
    > of the system have changed (error no longer dominated by random noise, but
    > by some consistant change) and those old values are useless for the
    > prediction of the future behaviour of the clock. This means that it can
    > after three measurements get a very good estimate of both the frequency and
    > phase offset of the clock, and correct them, refining them as more data
    > comes in. At present its key limitation is that it does not do refclocks at
    > all. (It also only runs on Linux).
    >
    > All of this is largely irrelevant if what you want is millisecond accuracy
    > of your clock. ntp is great for that. Or if your computer is on all the
    > time and ntp has converged-- my measurements indicate chrony is only about
    > 2-3 times better than ntp in that situation, caused I think primarily by
    > the temperature induced rate fluctuations of the cpu. That means that if
    > you are concerned with usec timing, a difference between say a 10 us
    > accuracy and a 5 usec accuracy, which for almost all of us is irrelevant.
    >
    > ntp also has the advantage of obeying the KISS principle (Keep it stimple,
    > stupid) in that direct control of the rate of the clock is far simpler than
    > keeping a memory, updating and correcting the memory, trying to figure out
    > when to forget and when to remember,... And the more complex, the greater
    > the possibility of error. (although with clock-filter, huff-and-puff,
    > server selection,... ntp is getting pretty complex as well.)
    >
    >
    >


    You are at liberty to write your own version of ntpd using your
    preferred algorithm! Dave designed ntpd to cope with the, usually
    horrible, behavior of the internet. This is not necessarily the best
    design for all circumstances. It is, however, what we have to work with
    unless we are willing and able to "roll our own"! I am not!


  16. Re: Speed of ntp convergence

    Richard B. Gilbert wrote:

    > preferred algorithm! Dave designed ntpd to cope with the, usually
    > horrible, behavior of the internet. This is not necessarily the best


    That was the internet of some 20 years ago.

  17. Re: Speed of ntp convergence

    "Richard B. Gilbert" writes:

    >Unruh wrote:
    >> "David J Taylor" writes:
    >>
    >>> Hal Murray wrote:
    >>>>>> Try switching it off, changing the value int he drift file by say
    >>>>>> 50PPM and
    >>>>>> then switching it on again, and see how long it takes to recover
    >>>>>> from that.
    >>>>> Why would I do that? The drift values rarely change by more than
    >>>>> five, certainly not by 50. If you are seeing a change of 50, then
    >>>>> perhaps that it part of your problem?
    >>>> A big step like that makes it easy to see how the system responds.
    >>>> At least if it's a linear system.

    >>
    >>> Yes, I appreciate that, Hal, but it doesn't emulate the situation here
    >>> very well, which I understood to be slow convergence after a routine
    >>> start. It sounds as if the OP may have an incorrect drift file - it's
    >>> worth checking that it /is/ being updated.

    >>
    >>
    >>
    >> The drift file read 10. The actual drift was 250 (determined after the
    >> system had settled down). The drift file never changed even after a day of
    >> running. ntp does not seem to be rewriting the drift file. Now that is a
    >> problem (although with the apparent Linux bug in the timing routines where is
    >> miscalibrates the clock on bootup, the drift is NOT stable over reboots
    >> anyway, so the existence of a drift file is irrelevant. ) However, the question is
    >> about the bahaviour of ntp. ntp should NOT be taking 10 hours to get over a
    >> wrong value in the drift file.


    >That's easy to fix! If the drift file is not correct, remove it before
    >starting ntpd.


    Of course. However, I have no idea it is incorrect until after ntp has
    started up and shown me it was incorrect.

    >How do you tell if it's incorrect? Since ntpd is supposed to
    >update/rewrite the drift file every sixty minutes, a drift file more
    >than sixty minutes old is suspect!


    I think my problem was that the permissions on /etc/ntp/drift were
    incorrect ( owned by root rather than by ntp). But that makes no
    difference to how ntp behaves. ntp should do the "right thing" even if the
    drift file is wrong. It should take a bit longer, but not 10 hours longer.
    And with the current apparent bug in Linux where the system time is
    miscalibrated, it would seem that the drift file on Linux is ALWAYS wrong.
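
    The two symptoms discussed here (a drift file that is not being rewritten,
    and a drift file owned by the wrong user) can both be checked before
    starting ntpd. The following is only a minimal sketch: the function name
    and the 3600 s threshold are illustrative assumptions, the threshold being
    taken from the "every sixty minutes" figure quoted above rather than from
    any ntpd documentation.

```python
import os
import time


def drift_file_status(path, max_age_s=3600, now=None):
    """Return (age_seconds, is_stale, owner_uid) for a drift file.

    max_age_s is an assumed threshold based on the hourly rewrite
    interval mentioned in the thread; adjust it to taste.
    """
    st = os.stat(path)
    now = time.time() if now is None else now
    age = now - st.st_mtime          # seconds since the file was last written
    return age, age > max_age_s, st.st_uid
```

    Called as drift_file_status("/etc/ntp/drift"), a stale result or an owner
    uid that is not the ntp user would suggest that ntpd cannot update the
    file.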



    >> With GPS refclock the system should be able
    >> to figure out the drift to better than 1 PPM from two successive readings
    >> of the clock ( 16 sec) or at most 3 readings (32 sec) NOT 10 hours.
    >> It is a design flaw in ntp. A design flaw which Mills absolutely refuses to
    >> fix, saying ntp works as designed. Clocks jump in frequency-- for example if the machine
    >> suddenly gets used a lot, the temp can jump by 20-30C which can drive the
    >> rate out by a few PPM. With a response time of hours, that means that the
    >> ntp response to that change in temperature takes hours, apparently even if
    >> the poll interval is 4 (16 sec).
    >> Mills has based his feedback loop on the Markovian theory of simple
    >> feedback loops of engineering control theory. In most systems from the 20th
    >> century, memory was in very short supply. The only memory was in the
    >> parameters of the system itself ( voltage across a capacitor, current in an
    >> inductor, speed of the motor) and the control theory worked well with that. Each
    >> bit of memory (each additional capacitor, inductor, governor) cost a lot. In a
    >> digital computer, memory is virtually infinite in supply. It costs
    >> essentially nothing to remember. Thus you can use your data values from
    >> almost as far back as you wish. Of course they can become irrelevant
    >> because the physical situation changes.
    >>
    >> The ntp control theory uses only the clock offset and rate as its "memory",
    >> (the current measurement alters only those two parameters and then is forgotten
    >> except in its effect on those two parameters. In fact the control comes
    >> ONLY through control of the drift rate). Chrony remembers up to 64 previous
    >> measurements-- correcting them for changes in offset and frequency of the
    >> clock-- and throws them away only when it becomes clear that the parameters
    >> of the system have changed (error no longer dominated by random noise, but
    >> by some consistent change) and those old values are useless for the
    >> prediction of the future behaviour of the clock. This means that it can
    >> after three measurements get a very good estimate of both the frequency and
    >> phase offset of the clock, and correct them, refining them as more data
    >> comes in. At present its key limitation is that it does not do refclocks at
    >> all. (It also only runs on Linux).
    >>
    >> All of this is largely irrelevant if what you want is millisecond accuracy
    >> of your clock. ntp is great for that. Or if your computer is on all the
    >> time and ntp has converged-- my measurements indicate chrony is only about
    >> 2-3 times better than ntp in that situation, caused I think primarily by
    >> the temperature induced rate fluctuations of the cpu. That matters only if
    >> you are concerned with usec timing: a difference between, say, 10 usec
    >> accuracy and 5 usec accuracy, which for almost all of us is irrelevant.
    >>
    >> ntp also has the advantage of obeying the KISS principle (Keep it simple,
    >> stupid) in that direct control of the rate of the clock is far simpler than
    >> keeping a memory, updating and correcting the memory, trying to figure out
    >> when to forget and when to remember,... And the more complex, the greater
    >> the possibility of error. (although with clock-filter, huff-and-puff,
    >> server selection,... ntp is getting pretty complex as well.)
    >>
    >>
    >>
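
    The idea in the quoted text of keeping a short history of measurements and
    fitting both phase and frequency to it can be sketched as an ordinary
    least-squares line fit. This is an illustration only, not chrony's or
    ntpd's actual code: the function name is hypothetical, and the sketch
    ignores the outlier handling and the "when to forget" logic described
    above.

```python
def fit_offset_and_drift(times_s, offsets_s):
    """Least-squares line through (time, offset) samples.

    Returns (offset at the last sample in seconds, frequency error in PPM).
    """
    n = len(times_s)
    if n < 2 or n != len(offsets_s):
        raise ValueError("need at least two matching samples")
    t_mean = sum(times_s) / n
    o_mean = sum(offsets_s) / n
    sxx = sum((t - t_mean) ** 2 for t in times_s)
    if sxx == 0:
        raise ValueError("samples must span a nonzero time interval")
    sxy = sum((t - t_mean) * (o - o_mean)
              for t, o in zip(times_s, offsets_s))
    slope = sxy / sxx                      # seconds of offset per second
    offset_now = o_mean + slope * (times_s[-1] - t_mean)
    return offset_now, slope * 1e6         # 1e6 converts s/s to PPM
```

    With noise-free samples at 0, 16 and 32 s and a frequency error of 240 PPM
    (the gap between the drift file value of 10 and the actual 250 mentioned
    earlier), the fit recovers the 240 PPM. With real measurement noise the
    estimate is only as good as the noise divided by the time span, so more
    samples or a longer baseline would be needed for the same precision.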


    >You are at liberty to write your own version of ntpd using your
    >preferred algorithm! Dave designed ntpd to cope with the, usually
    >horrible, behavior of the internet. This is not necessarily the best
    >design for all circumstances. It is, however, what we have to work with
    > unless we are willing and able to "roll our own"! I am not!


    I have other things I am supposed to do with my time, and I am not a good
    enough programmer (Ie, it would take me 10 times longer to program as it
    would a good programmer) to spend the time. But the nice thing about open
    source is that jobs can be partitioned. One person can test (I am
    reasonably good at that) and another can code.
    I disagree that the design is optimal for "the usual horrible behaviour of
    the internet". Many of the design decisions occurred before the internet.
    His example of the horrible Malaysia link is from long ago. I suspect he
    would have trouble finding such a horrible link now even in Malaysia
    (although a link to the moon might well be just as bad).
    That ntp is good I do not dispute. That ntp is optimal I do dispute. Many
    many people have complained about the behaviour of ntp on startup. There is
    no excuse for it from the point of view of principle (although perhaps the
    KISS principle might apply). It does not HAVE to be that bad in order for
    ntp to work well.

    But this is going off track. I have a situation in which my chosen clock
    is a refclock with poll interval 4. The time scale of ntp is supposed to be
    something like 16 times the poll interval, which would be 256 seconds. It
    is an hour, which is roughly 14 times longer. What I am asking is why is the time
    interval so long when from what I have seen of ntp it should be much
    shorter. Have I misunderstood the design (eg, the time scale is 16 times
    the highest possible poll interval, which would be 4 hr. which
    is not right either). or is there some bug in ntp as designed?
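
    The arithmetic behind this question can be laid out explicitly. The sketch
    below assumes, as the post does, that the loop time scale is about 16
    times the poll interval; that factor is the poster's understanding, not a
    figure confirmed by the ntpd documentation, and the function names are
    illustrative.

```python
def poll_interval_s(poll_exponent):
    """NTP poll values are exponents of two: poll 4 means 2**4 = 16 s."""
    return 2 ** poll_exponent


def time_scale_s(poll_exponent, factor=16):
    """Assumed loop time scale: factor times the poll interval."""
    return factor * poll_interval_s(poll_exponent)


refclock_scale = time_scale_s(4)    # 16 * 16 s   = 256 s
maxpoll_scale = time_scale_s(10)    # 16 * 1024 s = 16384 s, about 4.6 hr
observed_ratio = 3600 / refclock_scale   # one hour versus 256 s, about 14
```

    On these figures the observed hour-long time scale sits between the two
    candidates: well above the 256 s expected for poll 4, and well below the
    4.6 hr implied by the default maximum poll of 10.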


  18. Re: Speed of ntp convergence

    David Woolley wrote:
    > Richard B. Gilbert wrote:
    >
    >> preferred algorithm! Dave designed ntpd to cope with the, usually
    >> horrible, behavior of the internet. This is not necessarily the best

    >
    > That was the internet of some 20 years ago.


    The internet of today is similar. It may be a little better but I
    wouldn't count on it!

    Have you ever compared the performance of ntpd, using internet servers,
    between 11PM and 7AM local time with the performance during the rest of
    the day? The difference is quite noticeable; the performance from 11PM
    to 7AM is clearly better!

    The only difference that occurs to me as an explanation is that the
    network is noticeably less busy during that period.

  19. Re: Speed of ntp convergence

    Unruh wrote:
    > "Richard B. Gilbert" writes:
    >
    >> Unruh wrote:
    >>> "David J Taylor" writes:
    >>>
    >>>> Hal Murray wrote:
    >>>>>>> Try switching it off, changing the value in the drift file by say
    >>>>>>> 50PPM and
    >>>>>>> then switching it on again, and see how long it takes to recover
    >>>>>>> from that.
    >>>>>> Why would I do that? The drift values rarely change by more than
    >>>>>> five, certainly not by 50. If you are seeing a change of 50, then
    >>>>>> perhaps that is part of your problem?
    >>>>> A big step like that makes it easy to see how the system responds.
    >>>>> At least if it's a linear system.
    >>>> Yes, I appreciate that, Hal, but it doesn't emulate the situation here
    >>>> very well, which I understood to be slow convergence after a routine
    >>>> start. It sounds as if the OP may have an incorrect drift file - it's
    >>>> worth checking that it /is/ being updated.
    >>>
    >>>
    >>> The drift file read 10. The actual drift was 250 (determined after the
    >>> system had settled down). The drift file never changed even after a day of
    >>> running. ntp does not seem to be rewriting the drift file. Now that is a
    >>> problem (although with the apparent Linux bug in the timing routines where it
    >>> miscalibrates the clock on bootup, the drift is NOT stable over reboots
    >>> anyway, so the existence of a drift file is irrelevant. ) However, the question is
    >>> about the behaviour of ntp. ntp should NOT be taking 10 hours to get over a
    >>> wrong value in the drift file.

    >
    >> That's easy to fix! If the drift file is not correct, remove it before
    >> starting ntpd.

    >
    > Of course. However, I have no idea it is incorrect until after ntp has
    > started up and shown me it was incorrect.
    >
    >> How do you tell if it's incorrect? Since ntpd is supposed to
    >> update/rewrite the drift file every sixty minutes, a drift file more
    >> than sixty minutes old is suspect!

    >
    > I think my problem was that the permissions on /etc/ntp/drift were
    > incorrect ( owned by root rather than by ntp). But that makes no
    > difference to how ntp behaves. ntp should do the "right thing" even if the
    > drift file is wrong. It should take a bit longer, but not 10 hours longer.
    > And with the current apparent bug in Linux where the system time is
    > miscalibrated, it would seem that the drift file on Linux is ALWAYS wrong.
    >


    Do not blame ntpd for the consequences of your errors! If ntpd is
    configured correctly and operated correctly, it behaves quite well.

    And if you can write an ntpd equivalent that works better, I'm sure that
    most of us would be interested and, perhaps, even grateful!

  20. Re: Speed of ntp convergence

    Richard B. Gilbert wrote:

    >
    > The internet of today is similar. It may be a little better but I
    > wouldn't count on it!
    >

    For one thing, the components from which it is made are 1,000 times
    faster and have 1,000 times the memory.

    (There are other differences, like the de-skilling of system management.)
