high precision tracking: trying to understand sudden jumps - NTP


Thread: high precision tracking: trying to understand sudden jumps

  1. high precision tracking: trying to understand sudden jumps

    Hello,

    I'm trying to configure a small network for high precision time.
    Recently acquired an Endrun CDMA time server that runs like
    a dream, tracking CDMA time to about +/- 5 microseconds.

    The clients are a rag-tag assembly of diverse systems including
    a Centos 4.5 Linux i686, Linux x86_64, Sun Ultra 10, Sun Ultra 80,
    IBM RS/6000 44p, Windows 2003 X64, and a Windows XP laptop.

    All are configured to prefer the Endrun clock and poll it on a
    16 second interval. All are attached to a single SMC gigabit
    Ethernet switch with only the Endrun and two Sun systems running
    at a lower speed of 100 MBPS. Close to zero network traffic
    and system loads.

    All systems are running 'ntpd' 4.2.4p4. Compiled NTP native
    64-bit for the Windows X64 system. [A #ifdef tweak to
    'intptr_t' and 'uintptr_t' is required, will provide patch if
    desired].

    It generally is working well, with the systems tracking anywhere
    from +/- 100 microseconds to +/- 500 microseconds most of the
    time.

    However once or twice a day, all the systems experience a
    random, uncorrelated time shift of from one to several
    milliseconds. Had an issue where a UPS voltage correction shift
    and cheap power supply on the Windows X64 box appeared to be a
    problem, but that was fixed by configuring the UPS to consider
    110V nominal instead of 120V.

    Does anyone have any ideas about what could be causing these
    random time jumps and what might be done to eliminate them?

    Something I'm planning to try is to make sure that 'mlock' is
    configured in the daemons--presently 'autoconf' has left it
    disabled for some reason. However I don't believe page
    faults are the culprit. All the daemons are running at
    the highest real-time priority in the respective systems.

    The above configuration is a controlled lab setup. The next
    target is a stack of eight Dell 1950 servers in a production
    data center running Windows 2003 R2 and slaved to a newer Endrun
    time server. Don't have useful data from these systems yet
    because the network jitter is outrageous. Working with the
    network admin to hopefully have the NTP traffic to and from the
    Endrun clock bypass layer 3 switch/router rule checking. They
    have large, complex router ACL rulesets, which I suspect are the
    cause of the jitter.

    Attached are fairly representative graphs of the offset and
    frequency for two of the lab servers.

    Thanks


    P.S. Resent without graphs as the list mailer says
    they're not allowed. Happy to send them or the raw
    'loopstats' to anyone interested.
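
    For anyone who wants to look for these jumps in their own data, here is
    a rough sketch in Python. It assumes the usual 'loopstats' layout, where
    the first three fields are the MJD day, seconds past midnight, and the
    offset in seconds. The function names, the 1 ms threshold, and the demo
    lines are made up for illustration; they are not from the poster's logs.

```python
from typing import Iterable, List, Tuple

Sample = Tuple[int, float, float]  # (mjd, seconds_past_midnight, offset_seconds)


def parse_loopstats(lines: Iterable[str]) -> List[Sample]:
    """Parse loopstats lines, keeping only day, second and offset."""
    samples: List[Sample] = []
    for line in lines:
        fields = line.split()
        if len(fields) < 3:
            continue  # skip blank or truncated lines
        samples.append((int(fields[0]), float(fields[1]), float(fields[2])))
    return samples


def find_jumps(samples: List[Sample], threshold_s: float = 0.001) -> List[Sample]:
    """Return samples whose offset moved more than threshold_s since the previous one."""
    jumps: List[Sample] = []
    for prev, cur in zip(samples, samples[1:]):
        if abs(cur[2] - prev[2]) > threshold_s:
            jumps.append(cur)
    return jumps


if __name__ == "__main__":
    # Invented demo data in loopstats layout, not real measurements.
    demo = [
        "54557 100.000 0.000120 12.345 0.000050 0.001 4",
        "54557 116.000 0.000150 12.346 0.000050 0.001 4",
        "54557 132.000 0.002300 12.350 0.000900 0.002 4",
    ]
    for mjd, sec, off in find_jumps(parse_loopstats(demo)):
        print(f"MJD {mjd} second {sec:.0f}: offset {off * 1e6:.0f} us")
```

    Running this over each machine's file and comparing the timestamps is
    one way to check whether the jumps really are uncorrelated.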

  2. Re: high precision tracking: trying to understand sudden jumps

    starlight@binnacle.cx wrote:
    > Hello,
    >
    > I'm trying to configure a small network for high precision time.
    > Recently acquired an Endrun CDMA time server that runs like
    > a dream, tracking CDMA time to about +/- 5 microseconds.
    >
    > The clients are a rag-tag assembly of diverse systems including
    > a Centos 4.5 Linux i686, Linux x86_64, Sun Ultra 10, Sun Ultra 80,
    > IBM RS/6000 44p, Windows 2003 X64, and a Windows XP laptop.
    >
    > All are configured to prefer the Endrun clock and poll it on a
    > 16 second interval. All are attached to a single SMC gigabit
    > Ethernet switch with only the Endrun and two Sun systems running
    > at a lower speed of 100 MBPS. Close to zero network traffic
    > and system loads.
    >
    > All systems are running 'ntpd' 4.2.4p4. Compiled NTP native
    > 64-bit for the Windows X64 system. [A #ifdef tweak to
    > 'intptr_t' and 'uintptr_t' is required, will provide patch if
    > desired].
    >
    > It generally is working well, with the systems tracking anywhere
    > from +/- 100 microseconds to +/- 500 microseconds most of the
    > time.
    >
    > However once or twice a day, all the systems experience a
    > random, uncorrelated time shift of from one to several
    > milliseconds.



    Forcing the poll interval to 16 seconds is not always a good idea!
    Ntpd will select a poll interval, generally starting at 64 seconds, and
    ramping up to as long as 1024 seconds as the clock is beaten into
    submission!

    Directly connected refclocks are frequently polled at shorter intervals
    but I don't think your refclock is "directly connected" in the same
    sense that a clock working through a serial or parallel port is directly
    connected!

    A clock connected via ethernet with all the latencies and jitter
    thereunto appertaining is no different than any other network server and
    should be polled in the same manner!

    The very short poll intervals correct large errors quickly and the very
    long intervals correct small errors very accurately!


  3. Re: high precision tracking: trying to understand sudden jumps

    starlight@binnacle.cx wrote:

    > The clients are a rag-tag assembly of diverse systems including
    > a Centos 4.5 Linux i686, Linux x86_64, Sun Ultra 10, Sun Ultra 80,
    > IBM RS/6000 44p, Windows 2003 X64, and a Windows XP laptop.


    How are you interpolating the 16ms ticks on the Windows system? How are
    you disabling power management on the laptop?

    >
    > It generally is working well, with the systems tracking anywhere
    > from +/- 100 microseconds to +/- 500 microseconds most of the
    > time.


    How are you measuring the difference from true time? In principle, if
    ntpd can measure it, it will correct it.

    >
    > However once or twice a day, all the systems experience a
    > random, uncorrelated time shift of from one to several
    > milliseconds. Had an issue where a UPS voltage correction shift


    In which direction is the slip? Backward only slips against true time
    (these might appear as forward slips if the real error is in the server)
    are typically due to lost clock interrupts. If that is the case it
    implies you are using a tick rate of other than 100Hz. Please note that
    the Linux kernel code is broken for clock frequencies other than 100Hz
    and the use of 1000Hz significantly increases the likelihood of a lost
    interrupt.

    The normal source of lost interrupts is disk drivers using programmed
    transfers.
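
    As a back-of-envelope check on the lost-interrupt theory, the time lost
    per missed tick is simply one tick period. The sketch below is only
    arithmetic, not a diagnosis, and the function name is made up for
    illustration.

```python
def time_lost_ms(missed_ticks: int, hz: int) -> float:
    """Time the clock falls behind, in ms, if missed_ticks timer interrupts are lost."""
    return missed_ticks * 1000.0 / hz


if __name__ == "__main__":
    for hz in (100, 1000):
        for missed in (1, 3):
            print(f"HZ={hz}: {missed} lost tick(s) = {time_lost_ms(missed, hz)} ms behind")
```

    At 1000Hz each lost tick costs 1 ms, so a backward slip of one to
    several milliseconds would be consistent with a handful of lost ticks.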

    > and cheap power supply on the Windows X64 box appeared to be a
    > problem, but that was fixed by configuring the UPS to consider
    > 110V nominal instead of 120V.
    >


  4. Re: high precision tracking: trying to understand sudden jumps

    Here are URLs for those two sample graphs:

    http://binnacle.cx/file/ntp_hickups_linux.gif
    http://binnacle.cx/file/ntp_hickups_win.gif

    David Woolley wrote:
    >
    >> The clients are a rag-tag assembly of diverse systems including
    >> a Centos 4.5 Linux i686, Linux x86_64, Sun Ultra 10, Sun Ultra 80,
    >> IBM RS/6000 44p, Windows 2003 X64, and a Windows XP laptop.
    >
    >How are you interpolating the 16ms ticks on the Windows system?
    >How are you disabling power management on the laptop?

    The generic version of 'ntpd' has some sophisticated code that
    handles interpolation. See the source. Power management is
    disabled on the laptop using the standard control panel option.
    Don't really care that much about this machine anyway.

    >> It generally is working well, with the systems tracking anywhere
    >> from +/- 100 microseconds to +/- 500 microseconds most of the
    >> time.
    >
    >How are you measuring the difference from true time? In principle, if
    >ntpd can measure it, it will correct it.

    Using 'ntpd' 'loopstats'. It does, check out the graphs.
    Maybe I'll turn on 'peerstats' too, but I really doubt a
    stand-alone good quality switch would be causing random delays.
    Pings are consistently 400 microseconds and 'ntpq -p' reports 800
    microsecond roundtrip delays. I've never heard of a switch
    causing a 5ms delay.

    >>
    >> However once or twice a day, all the systems experience a
    >> random, uncorrelated time shift of from one to several
    >> milliseconds. Had an issue where a UPS voltage correction shift
    >
    >In which direction is the slip? Backward only slips against true time
    >(these might appear as forward slips if the real error is in the server)
    >are typically due to lost clock interrupts. If that is the case it
    >implies you are using a tick rate of other than 100Hz. Please note that
    >the Linux kernel code is broken for clock frequencies other than 100Hz
    >and the use of 1000Hz significantly increases the likelihood of a lost
    >interrupt.

    Perhaps that's a problem. The RHEL/Centos stock kernel seems to
    have a 1000Hz clock interrupt. At least 'vmstat' shows 1000
    ints/sec on an idle system.

    >The normal source of lost interrupts is disk drivers using programmed
    >transfers.

    Think it's all DMA. Remember this is a really diverse bunch
    of machines and OSs. The RS/6000 is working the best.

    These jumps aren't killing me. Just want to figure out if they
    can be eliminated. If we needed super accurate time we'd
    probably have to make use of PTP (precision timing protocol).
    Still très expensive.

  5. Re: high precision tracking: trying to understand sudden jumps

    "Richard B. Gilbert" writes:

    >starlight@binnacle.cx wrote:
    >> Hello,
    >>
    >> I'm trying to configure a small network for high precision time.
    >> Recently acquired an Endrun CDMA time server that runs like
    >> a dream, tracking CDMA time to about +/- 5 microseconds.
    >>
    >> The clients are a rag-tag assembly of diverse systems including
    >> a Centos 4.5 Linux i686, Linux x86_64, Sun Ultra 10, Sun Ultra 80,
    >> IBM RS/6000 44p, Windows 2003 X64, and a Windows XP laptop.
    >>
    >> All are configured to prefer the Endrun clock and poll it on a
    >> 16 second interval. All are attached to a single SMC gigabit
    >> Ethernet switch with only the Endrun and two Sun systems running
    >> at a lower speed of 100 MBPS. Close to zero network traffic
    >> and system loads.
    >>
    >> All systems are running 'ntpd' 4.2.4p4. Compiled NTP native
    >> 64-bit for the Windows X64 system. [A #ifdef tweak to
    >> 'intptr_t' and 'uintptr_t' is required, will provide patch if
    >> desired].
    >>
    >> It generally is working well, with the systems tracking anywhere
    >> from +/- 100 microseconds to +/- 500 microseconds most of the
    >> time.
    >>
    >> However once or twice a day, all the systems experience a
    >> random, uncorrelated time shift of from one to several
    >> milliseconds.

    >


    >Forcing the poll interval to 16 seconds is not always a good idea!
    >Ntpd will select a poll interval, generally starting at 64 seconds, and
    >ramping up to as long as 1024 seconds as the clock is beaten into
    >submission!


    It is his network, he is not going to overload it. So, if he wants a 16 sec
    poll interval that is up to him.
    I agree it is not a good idea for remote servers, but on his own system it
    is fine.


    >Directly connected refclocks are frequently polled at shorter intervals
    >but I don't think your refclock is "directly connected" in the same
    >sense that a clock working through a serial or parallel port is directly
    >connected!


    >A clock connected via ethernet with all the latencies and jitter
    >thereunto appertaining is no different than any other network server and
    >should be polled in the same manner!


    ??? The longer polls are in order not to swamp the remote server with
    10000 people all polling every 16 sec (or 1 sec). There is nothing in ntp
    itself that mandates a longer poll interval. In fact, a shorter poll
    interval makes ntp much more responsive to changes (clock drifts, etc.).



    >The very short poll intervals correct large errors quickly and the very
    >long intervals correct small errors very accurately!


    No, for a properly designed system both should be corrected.



  6. Re: high precision tracking: trying to understand sudden jumps

    "Unruh" wrote in message
    news:woRHj.10373$_v3.4025@edtnps90...

    > [...] Would I really believe that the CDMA cell phone network
    > would care if their time signal were accurate to usec?


    I would. Because IIUC, this is the basis on which they divide
    timeslots between stations.

    Groetjes,
    Maarten Wiltink



  7. Re: high precision tracking: trying to understand sudden jumps

    David Woolley writes:

    >starlight@binnacle.cx wrote:


    >> The clients are a rag-tag assembly of diverse systems including
    >> a Centos 4.5 Linux i686, Linux x86_64, Sun Ultra 10, Sun Ultra 80,
    >> IBM RS/6000 44p, Windows 2003 X64, and a Windows XP laptop.


    >How are you interpolating the 16ms ticks on the Windows system? How are
    >you disabling power management on the laptop?


    >>
    >> It generally is working well, with the systems tracking anywhere
    >> from +/- 100 microseconds to +/- 500 microseconds most of the
    >> time.


    >How are you measuring the difference from true time? In principle, if
    >ntpd can measure it, it will correct it.


    I expect that he means the offsets that ntp measures. NTP does NOT correct
    random offsets. Ie, if there is noise source which makes the offsets vary
    by 500usec ntp will not get rid of them. You will see them in the offsets
    as measured by ntp. Now, the time keeping might (or might not) be more
    accurate than that, but those offsets are what I suspect he means.


    >>
    >> However once or twice a day, all the systems experience a
    >> random, uncorrelated time shift of from one to several
    >> milliseconds. Had an issue where a UPS voltage correction shift


    >In which direction is the slip? Backward only slips against true time
    >(these might appear as forward slips if the real error is in the server)
    >are typically due to lost clock interrupts. If that is the case it
    >implies you are using a tick rate of other than 100Hz. Please note that
    >the Linux kernel code is broken for clock frequencies other than 100Hz
    >and the use of 1000Hz significantly increases the likelihood of a lost
    >interrupt.


    He claims it happens on all the systems.


    >The normal source of lost interrupts is disk drivers using programmed
    >transfers.


    Almost all disk drives on Linux now use dma.


    >> and cheap power supply on the Windows X64 box appeared to be a
    >> problem, but that was fixed by configuring the UPS to consider
    >> 110V nominal instead of 120V.
    >>


  8. Re: high precision tracking: trying to understand sudden jumps

    "Unruh" wrote in message
    news:VLTHj.7067$pb5.722@edtnps89...
    > "Richard B. Gilbert" writes:


    >> Forcing the poll interval to 16 seconds is not always a good idea!
    >> Ntpd will select a poll interval, generally starting at 64 seconds,
    >> and ramping up to as long as 1024 seconds as the clock is beaten
    >> into submission!

    >
    > It is his network, he is not going to overload it. So, if he wants a
    > 16 sec poll interval that is up to him.
    > I agree it is not a good idea for remote servers, but on his own system
    > it is fine.

    [...]
    > ??? The longer polls are in order not to swamp the remote server with
    > 10000 people all polling every 16 sec (or 1 sec). There is nothing in
    > ntp itself that mandates a longer poll interval. In fact, a shorter poll
    > interval makes ntp much more responsive to changes (clock drifts, etc.).


    >> The very short poll intervals correct large errors quickly and the
    >> very long intervals correct small errors very accurately!

    >
    > No, for a properly designed system both should be corrected.


    You seem to be missing the point. Once the large errors have been
    corrected, NTP goes on to the small errors. For that, it _needs_ a
    longer poll interval. That this gives the server more air is a
    happy coincidence, but not why it does it.

    Given the measurement error, you need to let the small error
    accumulate over a longer period. Otherwise it would simply be
    lost in the noise.

    Do the math: assume the (constant!) measurement error to be +/- 1 ms,
    the frequency error in my local host to be 1000 PPM (1/1000). With a
    1 s polling interval, the real value is 1 ms and the measurement
    will be between 0 and 2 ms. Not very good. With a 1000 s polling
    interval, the real value is 1 s and the measurement will be between
    0.999 and 1.001 s. Now that's useful to correct your clock with.

    Now use more realistic numbers, like 50 PPM to start with, a polling
    interval of 64 s and I'm not exactly sure what for the measuring
    jitter. But the gist should be clear: that 50 PPM will go down, the
    SNR will worsen, and the polling interval should go up to improve it
    again.

    Starting with a short interval is good to correct large errors
    quickly. Backing off once you've done so is good to avoid pestering
    the server, but it's also good to correct small errors accurately,
    and _that_ is why it's done. And of course, once a larger than
    expected offset is measured, the polling interval is shortened
    again.
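
    The "do the math" step above can be sketched in a few lines of Python.
    This is only an illustration under the same simplifying assumption of a
    fixed +/- measurement error; the function name and the 64 s case with a
    1 ms error are my own picks, not part of the argument above.

```python
from typing import Tuple


def freq_estimate_bounds_ppm(true_freq_ppm: float,
                             poll_s: float,
                             meas_err_s: float) -> Tuple[float, float]:
    """Lowest and highest frequency error (PPM) one poll interval could suggest.

    The clock drifts true_freq_ppm * poll_s microseconds between polls, and the
    offset measurement itself is off by at most +/- meas_err_s seconds.
    """
    drift_s = true_freq_ppm * 1e-6 * poll_s
    low = (drift_s - meas_err_s) / poll_s * 1e6
    high = (drift_s + meas_err_s) / poll_s * 1e6
    return low, high


if __name__ == "__main__":
    cases = [
        (1000.0, 1.0, 0.001),     # 1000 PPM, 1 s poll, 1 ms error
        (1000.0, 1000.0, 0.001),  # same clock, 1000 s poll
        (50.0, 64.0, 0.001),      # hypothetical: 50 PPM, 64 s poll, 1 ms error
    ]
    for freq, poll, err in cases:
        low, high = freq_estimate_bounds_ppm(freq, poll, err)
        print(f"{freq:7.1f} PPM, poll {poll:6.0f} s: estimate in [{low:.1f}, {high:.1f}] PPM")
```

    The width of the interval is 2 * meas_err_s / poll_s, so it shrinks in
    direct proportion to the poll interval.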

    Groetjes,
    Maarten Wiltink



  9. Re: high precision tracking: trying to understand sudden jumps

    Unruh wrote:
    > "Richard B. Gilbert" writes:
    >
    >
    >>starlight@binnacle.cx wrote:
    >>
    >>>Hello,
    >>>
    >>>I'm trying to configure a small network for high precision time.
    >>>Recently acquired an Endrun CDMA time server that runs like
    >>>a dream, tracking CDMA time to about +/- 5 microseconds.
    >>>
    >>>The clients are a rag-tag assembly of diverse systems including
    >>>a Centos 4.5 Linux i686, Linux x86_64, Sun Ultra 10, Sun Ultra 80,
    >>>IBM RS/6000 44p, Windows 2003 X64, and a Windows XP laptop.
    >>>
    >>>All are configured to prefer the Endrun clock and poll it on a
    >>>16 second interval. All are attached to a single SMC gigabit
    >>>Ethernet switch with only the Endrun and two Sun systems running
    >>>at a lower speed of 100 MBPS. Close to zero network traffic
    >>>and system loads.
    >>>
    >>>All systems are running 'ntpd' 4.2.4p4. Compiled NTP native
    >>>64-bit for the Windows X64 system. [A #ifdef tweak to
    >>>'intptr_t' and 'uintptr_t' is required, will provide patch if
    >>>desired].
    >>>
    >>>It generally is working well, with the systems tracking anywhere
    >>>from +/- 100 microseconds to +/- 500 microseconds most of the
    >>>time.
    >>>
    >>>However once or twice a day, all the systems experience a
    >>>random, uncorrelated time shift of from one to several
    >>>milliseconds.

    >>
    >>

    >
    >
    >>Forcing the poll interval to 16 seconds is not always a good idea!
    >>Ntpd will select a poll interval, generally starting at 64 seconds, and
    >>ramping up to as long as 1024 seconds as the clock is beaten into
    >>submission!

    >
    >
    > It is his network, he is not going to overload it. So, if he wants a 16 sec
    > poll interval that is up to him.
    > I agree it is not a good idea for remote servers, but on his own system it
    > is fine.
    >
    >
    >
    >>Directly connected refclocks are frequently polled at shorter intervals
    >>but I don't think your refclock is "directly connected" in the same
    >>sense that a clock working through a serial or parallel port is directly
    >>connected!

    >
    >
    >>A clock connected via ethernet with all the latencies and jitter
    >>thereunto appertaining is no different than any other network server and
    >>should be polled in the same manner!

    >
    >
    > ??? The longer polls are in order not to swamp the remote server with
    > 10000 people all polling every 16 sec (or 1 sec). There is nothing in ntp
    > itself that mandates a longer poll interval. In fact, a shorter poll
    > interval makes ntp much more responsive to changes (clock drifts, etc.).
    >
    >
    >
    >
    >>The very short poll intervals correct large errors quickly and the very
    >>long intervals correct small errors very accurately!

    >
    >
    > No, for a properly designed system both should be corrected.
    >
    >


    If you don't measure across a long interval, you will never see some of
    those small errors. When you measure across 1024 seconds you overwhelm
    the network jitter. The long interval is part of the design for just
    that reason.

    Suppose your frequency error is 5 PPM or 0.43 seconds per day. Do you
    think you can measure that error accurately with a 64 second poll
    interval? If you are working over the internet, an error that small is
    going to disappear in the jitter. It will be sixteen times more obvious
    at the longer interval.
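
    To put numbers on the 5 PPM example, here is a small Python sketch. It
    only converts a frequency error into accumulated offset; the function
    and constant names are made up for illustration.

```python
SECONDS_PER_DAY = 86_400


def drift_seconds(freq_error_ppm: float, interval_s: float) -> float:
    """Offset accumulated over interval_s by a clock running freq_error_ppm fast or slow."""
    return freq_error_ppm * 1e-6 * interval_s


if __name__ == "__main__":
    ppm = 5.0
    print(f"per day:      {drift_seconds(ppm, SECONDS_PER_DAY):.3f} s")
    for poll in (64, 1024):
        print(f"per {poll:4d} s:   {drift_seconds(ppm, poll) * 1e6:.0f} us")
    ratio = drift_seconds(ppm, 1024) / drift_seconds(ppm, 64)
    print(f"1024 s vs 64 s: {ratio:.0f}x more accumulated offset")
```

    The accumulated offset at 64 seconds is a few hundred microseconds,
    which is easy to lose in Internet jitter; at 1024 seconds it is sixteen
    times larger.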

    You can poll a hardware reference clock at 16 second intervals because
    the network is not involved! The latency and jitter of a PPS signal over
    a serial port are an order of magnitude or two less than what you get
    over a busy network.


  10. Re: high precision tracking: trying to understand sudden jumps

    >However once or twice a day, all the systems experience a
    >random, uncorrelated time shift of from one to several
    >milliseconds.


    What does that mean?

    I'm guessing that "uncorrelated" means the glitches don't happen
    at the same time.

    Are all clients seeing occasional problems? Do they match
    cron jobs or some activity burst on the system?

    Can you try another network switch? Or maybe even run without
    any switches? (plug the CDMA box directly into a second ethernet
    port)

    Can you try another NTP server? How about setting up a PC,
    letting it run for a day to establish a good drift file, and
    then making it run on the local clock only. That will drift,
    slowly, but there won't be any jumps.

    How about adding another client that doesn't do anything?
    (Turn off cron too.)

    --
    These are my opinions, not necessarily my employer's. I hate spam.


  11. Re: high precision tracking: trying to understand sudden jumps

    Unruh wrote:
    > 10000 people all polling every 16 sec (or 1 sec). There is nothing in ntp
    > itself that mandates a longer poll interval. In fact, a shorter poll
    > interval makes ntp much more responsive to changes (clock drifts, etc.).


    As I understand it, locking maxpoll low only slightly improves
    responsiveness. The main effect is simply to oversample, as the time
    constants still adjust to values appropriate to a poll interval of 1024s.

  12. Re: high precision tracking: trying to understand sudden jumps

    Unruh wrote:
    >>

    > I expect that he means the offsets that ntp measures. NTP does NOT correct


    I suspect that too.

    > random offsets. Ie, if there is noise source which makes the offsets vary


    It averages them so as to reduce their effective size.

    > by 500usec ntp will not get rid of them. You will see them in the offsets
    > as measured by ntp. Now, the time keeping might (or might not) be more
    > accurate than that, but those offsets are what I suspect he means.


    The question is about "measured errors" that significantly exceed the
    random offsets. In any case the systematic error can also greatly
    exceed the measured offset - that represents an error that ntpd cannot
    measure.
    >
    >
    > Almost all disk drives on Linux now use dma.


    They need to do both and the drivers that caused this problem were
    capable of using DMA. The problem was, I believe, that certain chipsets
    were unsafe with DMA, so the default, at least used to be, the
    unconditional one of doing programmed transfers; you could enable DMA at
    your own risk.

    My impression is that there are still enough systems with lost disk
    interrupts that someone reporting one tick backward steps can reasonably
    be assumed to have that problem, and it is a reasonable probability for
    someone who doesn't report the direction of the step. The other common
    cause of steps, which are balanced in both directions, is not applicable
    here.

  13. Re: high precision tracking: trying to understand sudden jumps

    Maarten Wiltink wrote:
    >
    > You seem to be missing the point. Once the large errors have been
    > corrected, NTP goes on to the small errors. For that, it _needs_ a
    > longer poll interval. That this gives the server more air is a
    > happy coincidence, but not why it does it.


    I don't believe it *needs* longer poll intervals; I think they are
    simply wasteful in that the offsets are low pass filtered in such a way
    that clamping maxpoll makes very little difference to the result, when
    the time constant goes high.

    I'm not sure that there is any user configurable option that actually
    does what people think they are doing by locking down maxpoll, in terms
    of keeping the loop time constant low.

    A clamped maxpoll may improve the responsiveness to faults causing time
    steps of more than 128ms, but one should be attacking the problem, not
    the symptom.

  14. Re: high precision tracking: trying to understand sudden jumps

    "Maarten Wiltink" writes:

    >"Unruh" wrote in message
    >news:VLTHj.7067$pb5.722@edtnps89...
    >> "Richard B. Gilbert" writes:


    >>> Forcing the poll interval to 16 seconds is not always a good idea!
    >>> Ntpd will select a poll interval, generally starting at 64 seconds,
    >>> and ramping up to as long as 1024 seconds as the clock is beaten
    >>> into submission!

    >>
    >> It is his network, he is not going to overload it. So, if he wants a
    >> 16 sec poll interval that is up to him.
    >> I agree it is not a good idea for remote servers, but on his own system
    >> it is fine.

    >[...]
    >> ??? The longer polls are in order not to swamp the remote server with
    >> 10000 people all polling every 16 sec (or 1 sec). There is nothing in
    >> ntp itself that mandates a longer poll interval. In fact, a shorter poll
    >> interval makes ntp much more responsive to changes (clock drifts, etc.).


    >>> The very short poll intervals correct large errors quickly and the
    >>> very long intervals correct small errors very accurately!

    >>
    >> No, for a properly designed system both should be corrected.


    >You seem to be missing the point. Once the large errors have been
    >corrected, NTP goes on to the small errors. For that, it _needs_ a
    >longer poll interval. That this gives the server more air is a
    >happy coincidence, but not why it does it.



    I have no idea what this means. ntp simply runs a second order feedback
    network. It does not do anything for "large and small" errors.

    >Given the measurement error, you need to let the small error
    >accumulate over a longer period. Otherwise it would simply be
    >lost in the noise.


    No idea what you mean.



    >Do the math: assume the (constant!) measurement error to be +/- 1 ms,
    >the frequency error in my local host to be 1000 PPM (1/1000). With a
    >1 s polling interval, the real value is 1 ms and the measurement
    >will be between 0 and 2 ms. Not very good. With a 1000 s polling
    >interval, the real value is 1 s and the measurement will be between
    >0.999 and 1.001 s. Now that's useful to correct your clock with.


    You are not talking about large and small errors, you are talking about
    phase and frequency errors. And no computer has fixed phase or
    frequency errors. They keep changing. Thus integrating for a longer time
    does not help if the frequency error (drift) keeps changing.



    >Now use more realistic numbers, like 50 PPM to start with, a polling
    >interval of 64 s and I'm not exactly sure what for the measuring
    >jitter. But the gist should be clear: that 50 PPM will go down, the
    >SNR will worsen, and the polling interval should go up to improve it
    >again.


    ??? What you are describing is one of the key problems with the ntp
    algorithm.



    >Starting with a short interval is good to correct large errors
    >quickly. Backing off once you've done so is good to avoid pestering
    >the server, but it's also good to correct small errors accurately,
    >and _that_ is why it's done. And of course, once a larger than
    >expected offset is measured, the polling interval is shortened
    >again.


    Anyway, that is not his problem. He is getting ms spikes in the loopfilter.
    Those wipe out anything else he does. It destroys all attempts by ntp to
    discipline the clock.



    >Groetjes,
    >Maarten Wiltink




  15. Re: high precision tracking: trying to understand sudden jumps

    Unruh wrote:

    >
    > I have no idea what this means. ntp simply runs a second order feedback
    > network. It does not do anything for "large and small" errors.
    >


    See sections G.4, G.5 and following of RFC 1305 (page 95 and onwards in
    the PDF version). A couple of parameters are dynamically adjusted to
    change the time constant of the network, so it is not entirely "simple".

  16. Re: high precision tracking: trying to understand sudden jumps


    >>probably have to make use of PTP (precision timing protocol).

    >
    >No idea what that is. If you had wanted super precision you would have put
    >a GPS onto each machine, I hope.
    >
    >From the Wikipedia entry on PTP it looks absolutely no different from ntp.
    >I have no idea what the idea is.


    The basic idea is to do the time stamping in hardware deep in
    the network adapter. That avoids lots and lots of jitter.

    --
    These are my opinions, not necessarily my employer's. I hate spam.


  17. Re: high precision tracking: trying to understand sudden jumps

    Hal Murray wrote:
    >>>probably have to make use of PTP (precision timing protocol).

    >>
    >>No idea what that is. If you had wanted super precision you would have put
    >>a GPS onto each machine, I hope.
    >>
    >>From the Wikipedia entry on PTP it looks absolutely no different from ntp.
    >>I have no idea what the idea is.

    >
    > The basic idea is to do the time stamping in hardware deep in
    > the network adapter. That avoids lots and lots of jitter.


    Yes, PTP can yield an accuracy better than 100 ns if both the NICs at the
    clients and the server support hardware timestamping of sent/received PTP
    packets.

    On the other hand, also *every* network node between the PTP endpoints has
    to be PTP-aware and compensate the packet delay it introduces, so you will
    probably only get full PTP accuracy in your local network where you have
    control over all the equipment.

    Switches can very well insert a delay in the range of milliseconds. If there
    are incoming packets at different ports at the same time which shall go out
    on the same port then the packets have to be queued. Unless the network is
    really heavily loaded this may happen only occasionally, but it may happen.
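
    A rough way to see how queuing adds up is to compute how long one frame
    occupies the outgoing port. The Python sketch below is a simplification:
    it ignores preamble, inter-frame gap and any internal switch latency, and
    the function names and the 1500-byte frame size are assumptions chosen
    for illustration.

```python
def serialization_delay_us(frame_bytes: int, link_bps: float) -> float:
    """Time in microseconds to clock one frame out onto a link of link_bps bits/s."""
    return frame_bytes * 8 / link_bps * 1e6


def frames_for_delay(target_delay_s: float, frame_bytes: int, link_bps: float) -> float:
    """How many queued frames of frame_bytes it takes to add target_delay_s of waiting."""
    return target_delay_s / (frame_bytes * 8 / link_bps)


if __name__ == "__main__":
    for name, bps in (("100 Mbit/s", 100e6), ("1 Gbit/s", 1e9)):
        per_frame = serialization_delay_us(1500, bps)
        needed = frames_for_delay(0.001, 1500, bps)
        print(f"{name}: {per_frame:.0f} us per 1500-byte frame, "
              f"about {needed:.1f} frames queued ahead for 1 ms of delay")
```

    On a 100 Mbit/s port fewer than ten full-size frames queued ahead of an
    NTP packet are enough to delay it by a millisecond.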

    The switches included in our PTP starter kit
    http://www.meinberg.de/english/ptp-starterkit/
    implement PTP boundary clocks for the ports in order to eliminate the
    queuing delay. Without this special handling PTP would suffer from the same
    latencies as NTP.
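
    To see why an uncompensated queuing delay hurts, here is a minimal sketch
    of the usual two-way time-transfer arithmetic that NTP uses (PTP's delay
    request-response exchange reduces to the same form). All numbers are
    hypothetical: a 250 us true offset, 200 us one-way path delay, and 1 ms
    of extra queuing on the reply only. The helper names are made up for
    illustration.

```python
def offset_and_delay(t1, t2, t3, t4):
    """Two-way exchange, all times in nanoseconds.

    t1: request leaves client   (client clock)
    t2: request reaches server  (server clock)
    t3: reply leaves server     (server clock)
    t4: reply reaches client    (client clock)
    """
    offset = ((t2 - t1) + (t3 - t4)) / 2  # server clock minus client clock
    delay = (t4 - t1) - (t3 - t2)         # round trip minus server hold time
    return offset, delay


def simulate(true_offset, forward, backward, hold=50_000, t1=1_000_000):
    """Build the four timestamps for assumed one-way path delays (ns)."""
    t2 = t1 + forward + true_offset
    t3 = t2 + hold
    t4 = t3 - true_offset + backward
    return offset_and_delay(t1, t2, t3, t4)


TRUE_OFFSET = 250_000  # hypothetical: server is 250 us ahead of the client

# Same delay in both directions: the estimate recovers the true offset.
clean = simulate(TRUE_OFFSET, forward=200_000, backward=200_000)

# Reply waits an extra 1 ms in a switch queue: the path is now asymmetric.
queued = simulate(TRUE_OFFSET, forward=200_000, backward=1_200_000)

print("symmetric path   :", clean)
print("reply queued 1 ms:", queued)
print("offset error (ns):", queued[0] - TRUE_OFFSET)
```

    The estimate assumes both directions take equal time, so a delay that
    hits only one direction shows up as an offset error of half that delay.
    That is the error a delay-compensating switch is meant to remove.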

    On the other hand, NTP yields quite good results without requiring special
    hardware, even over WAN connections.


    Martin
    --
    Martin Burnicki

    Meinberg Funkuhren
    Bad Pyrmont
    Germany

  18. Re: high precision tracking: trying to understand sudden jumps

    starlight@binnacle.cx wrote:
    > The generic version of 'ntpd' has some sophisticated code that
    > handles interpolation. See the source. Power management is

    I know that. But the problem is that normal applications just get a
    more accurate time for the most recent tick, but still don't see any
    times between ticks.
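
    One way to check what an ordinary application actually sees is to spin
    on the clock and record the smallest step between distinct readings.
    This is only a sketch: `smallest_step` is a made-up helper, and the
    result depends on the OS, the clock source, and the Python version, so
    no particular output is implied.

```python
import time


def smallest_step(clock=time.time, changes=20):
    """Spin on clock() and return the smallest forward step seen, in seconds."""
    best = float("inf")
    last = clock()
    seen = 0
    while seen < changes:
        now = clock()
        if now != last:
            if now > last:  # ignore backward steps from clock adjustments
                best = min(best, now - last)
            seen += 1
            last = now
    return best


wall = smallest_step(time.time)
fine = smallest_step(time.perf_counter)
print(f"time.time()         smallest observed step: {wall * 1e6:.3f} us")
print(f"time.perf_counter() smallest observed step: {fine * 1e6:.3f} us")
```

    If the wall clock only advances once per tick, the first figure comes out
    near the tick length no matter how accurately 'ntpd' has placed that
    tick.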

    > Pings are consistently 400 microseconds and 'ntpq -p' reports 800

    Which is excessive for a gigabit network doing essentially nothing but NTP.

    > probably have to make use of PTP (Precision Time Protocol).
    > Still très expensive.


    I assume by PTP you mean Ethernet cards that extract a timestamp with a
    very low latency. I doubt that this will help with lost interrupts. If
    you really want extreme accuracy for applications you need to:

    1) use hardware that maintains a high resolution time completely
    independent of the software and is directly readable by application code
    (I'm not sure if Windows supports such direct reading).

    2) add code to the device drivers that actually communicate the
    real-world events that you are interested in to the software, so that
    they read from that special clock very early in their ISR (better
    still, use devices that will read it using DMA).


  19. Re: high precision tracking: trying to understand sudden jumps

    hal-usenet@ip-64-139-1-69.sjc.megapath.net (Hal Murray) writes:


    >>>probably have to make use of PTP (Precision Time Protocol).
    >>
    >>No idea what that is. If you had wanted super precision you would have put
    >>a GPS onto each machine, I hope.
    >>
    >>From the Wikipedia entry on PTP it looks absolutely no different from ntp.
    >>I have no idea what the idea is.

    >The basic idea is to do the time stamping in hardware deep in
    >the network adapter. That avoids lots and lots of jitter.


    It avoids some jitter. Does that mean that you have to have special
    hardware (special network cards, or special network card drivers)?
    It does nothing for the 300us jitter I see on my ADSL-connected computer.
    It might do something for the 10us jitter I see on my Ethernet-connected
    LAN, probably taking it down to 8us or something. (Has anyone tested where
    the jitter is: in the network cards or in the switches?)





  20. Re: high precision tracking: trying to understand sudden jumps

    Martin Burnicki writes:

    >Hal Murray wrote:
    >>>>probably have to make use of PTP (Precision Time Protocol).
    >>>
    >>>No idea what that is. If you had wanted super precision you would have put
    >>>a GPS onto each machine, I hope.
    >>>
    >>>From the Wikipedia entry on PTP it looks absolutely no different from ntp.
    >>>I have no idea what the idea is.
    >>
    >> The basic idea is to do the time stamping in hardware deep in
    >> the network adapter. That avoids lots and lots of jitter.


    >Yes, PTP can yield an accuracy better than 100 ns if both the NICs at the
    >clients and the server support hardware timestamping of sent/received PTP
    >packets.


    I am still confused. To timestamp you have to read the computer's clock.
    That is a software operation: reading the counter in the CPU, translating
    it to time, returning the result through the kernel, etc. That has all
    kinds of variable latencies. I am having trouble seeing 100ns. The same
    goes for seeing the PPS from the hardware clock and its interrupts. Or are
    you replacing all of the hardware and software of the system (new kernel,
    new interrupt system, new NICs, etc.)?
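
    The variable cost of a software clock read is easy to observe. The sketch
    below takes many back-to-back readings and reports the gaps between them.
    `read_gaps` is a made-up helper, and in Python the figures include
    interpreter overhead, so they only illustrate the spread, not what a
    kernel or driver would see.

```python
import time


def read_gaps(samples=100_000):
    """Gaps in ns between consecutive time.perf_counter_ns() readings."""
    stamps = [time.perf_counter_ns() for _ in range(samples)]
    return [later - earlier for earlier, later in zip(stamps, stamps[1:])]


gaps = sorted(read_gaps())
print("min   :", gaps[0], "ns")
print("median:", gaps[len(gaps) // 2], "ns")
print("max   :", gaps[-1], "ns")
```

    The median is the typical cost of one read; the maximum is usually far
    larger because some reads get interrupted or preempted. That tail is the
    latency in question, and it is what timestamping in the network adapter
    is meant to sidestep, since the time is latched in hardware when the
    packet passes rather than when software gets around to reading a clock.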



    >On the other hand, also *every* network node between the PTP endpoints has
    >to be PTP-aware and compensate the packet delay it introduces, so you will
    >probably only get full PTP accuracy in your local network where you have
    >control over all the equipment.


    >Switches can very well insert a delay in the range of milliseconds. If there
    >are incoming packets at different ports at the same time which shall go out
    >on the same port then the packets have to be queued. Unless the network is
    >really heavily loaded this may happen only occasionally, but it may happen.


    >The switches included in our PTP starter kit
    >http://www.meinberg.de/english/ptp-starterkit/
    >implement PTP boundary clocks for the ports in order to eliminate the
    >queuing delay. Without this special handling PTP would suffer from the same
    >latencies as NTP.


    >On the other hand, NTP yields quite good results without requiring special
    >hardware, even over WAN connections.

