Re: high precision tracking: trying to understand sudden jumps - NTP



  1. Re: high precision tracking: trying to understand sudden jumps

    At 04:51 PM 3/30/2008 -0700, Bill Unruh wrote:
    >Are those on the same day?


    Yes, same day. Uncorrelated to anything I can identify
    or each other. Same story on all the boxes. Running
    a hefty multi-system compile with heavy NFS and Samba
    traffic does not produce these events, though it disturbs
    the Windows boxes slightly when CPU goes to 100%.

    >Which "linux" and which "windows" are those graphs since you
    >have 2 linux and 2 windows clients.


    That's the dual-core AMD 2.4GHz Athlon Tyan mobo whitebox
    running CentOS 4.5 SMP kernel. Similar results on the
    Dell Dimension 2400 2.4GHz Intel P4 running CentOS 4.5
    mono-processor kernel.

    Windows is a dual-core 3.4GHz Pentium D Tyan mobo whitebox
    running 2003 R2 SP2 standard server.

    >As I said, seeing the
    >peerstats files would be helpful (offset and roundtrip)


    Might try them later, but I can't believe a high-quality
    SMC switch is causing multi-millisecond delays. Just not
    possible. Pings are all about 400 microseconds, consistent
    but slightly different on each system. Round trip is
    800 microseconds. The output from my bulk 'ntpq -p' /
    'ntptrace' script is attached below. Note that's 'ntptrace'
    version 4.1 since the 4.2 script has useless offset info.

    >Also these graphs seem to have cut off the spikes. Are the
    >spikes actually higher or is that an illusion?


    Higher. Sometimes 1ms, sometimes 5-6ms.

    >(Note the spikes are hundreds of usec, not many msec)


    That would be the ~1ms example, check out the other one.





         remote           refid      st t when poll reach   delay   offset  jitter
    ==============================================================================
    Endrun CDMA
     LOCAL(0)        LOCAL(0)        10 l   18   64  377    0.000    0.000   0.015
    *HOPF_S(0)       .CDMA.           0 l    6   16  377    0.000    0.000   0.015

    Centos 32
    *eachna          .CDMA.           1 u    3   16  377    0.683   -0.004   0.009
    -tock.usno.navy. .USNO.           1 u  452 1024  377   20.678    1.432   2.822
    +navobs1.wustl.e .GPS.            1 u  479 1024  377   50.136   -1.513   0.164
    +time.nist.gov   .ACTS.           1 u  471 1024  377   66.528   -1.708   0.156
    -tick.ucla.edu   .GPS.            1 u  432 1024  377   87.372    3.296   0.085

    Ultra 10
    *172.29.87.3     .CDMA.           1 u   11   16  377    0.869   -0.016   0.042
    172.29.87.15: stratum 2, offset -0.000007, synch distance 0.00783
    172.29.87.3: stratum 1, offset -0.000018, synch distance 0.00038, refid 'CDMA'

    Ultra 80
    *172.29.87.3     .CDMA.           1 u    4   16  377    0.942   -0.012   0.012
    172.29.87.17: stratum 2, offset -0.000038, synch distance 0.00685
    172.29.87.3: stratum 1, offset -0.000017, synch distance 0.00038, refid 'CDMA'

    44p
    *172.29.87.3     .CDMA.           1 u   13   16  377    0.809   -0.001   0.016
    172.29.87.13: stratum 2, offset -0.000014, synch distance 0.00627
    172.29.87.3: stratum 1, offset -0.000018, synch distance 0.00038, refid 'CDMA'

    Centos 64
    *172.29.87.3     .CDMA.           1 u   12   16  377    0.664    0.003   0.487
    172.29.87.19: stratum 2, offset -0.000009, synch distance 0.00720
    172.29.87.3: stratum 1, offset -0.000018, synch distance 0.00038, refid 'CDMA'

    W2K3 64
    *172.29.87.3     .CDMA.           1 u    4   16  377    0.734    0.053   0.014
    172.29.87.20: stratum 2, offset -0.000060, synch distance 0.00650
    172.29.87.3: stratum 1, offset -0.000019, synch distance 0.00038, refid 'CDMA'

    XP 32 laptop
    *172.29.87.3     .CDMA.           1 u    7   16  377    0.819    0.468   0.256
    172.29.87.12: stratum 2, offset -0.000173, synch distance 0.00655
    172.29.87.3: stratum 1, offset -0.000017, synch distance 0.00038, refid 'CDMA'
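
    A listing like the one above is easier to compare across hosts once
    it is machine-readable. The following is only a rough sketch, not
    the poster's script: it assumes raw, unindented `ntpq -p` output
    (column 0 is the tally character, delay/offset/jitter are in
    milliseconds), and the names `parse_ntpq_row` and `noisy_peers` as
    well as the 0.1 ms jitter threshold are made up for illustration.

```python
def parse_ntpq_row(line):
    """Parse one peer row of raw `ntpq -p` output; return None for other lines."""
    if not line.strip():
        return None
    # Column 0 is the tally character ('*', '+', '-', ' ', ...).
    tally, fields = line[0], line[1:].split()
    if len(fields) != 10:
        return None
    remote, refid, st, kind, when, poll, reach, delay, offset, jitter = fields
    try:
        return {
            "tally": tally,
            "remote": remote,
            "refid": refid,
            "stratum": int(st),
            "type": kind,
            "when": when,              # kept as text: may be '-' or carry a unit
            "poll": int(poll),
            "reach": int(reach, 8),    # reach is printed in octal
            "delay": float(delay),     # milliseconds
            "offset": float(offset),   # milliseconds
            "jitter": float(jitter),   # milliseconds
        }
    except ValueError:
        # Header row, ntptrace lines, host labels and so on.
        return None


def noisy_peers(lines, jitter_ms=0.1):
    """Return the remotes whose jitter exceeds jitter_ms (hypothetical threshold)."""
    rows = (parse_ntpq_row(line) for line in lines)
    return [r["remote"] for r in rows if r and r["jitter"] > jitter_ms]
```

    Fed the "Centos 64" and "W2K3 64" rows above, `noisy_peers` would
    pick out only the former, whose jitter (0.487 ms) is far larger than
    its offset.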

  2. Re: high precision tracking: trying to understand sudden jumps

    On 2008-03-31, David Woolley wrote:

    > Steve Kostecke wrote:
    >
    >> On 2008-03-31, David Woolley wrote:
    >>
    >>> Bill Unruh wrote:
    >>>
    >>>> On Sun, 30 Mar 2008, starlight@binnacle.cx wrote:
    >>>
    >>> You appear to be quoting an off list reply with no indication of
    >>> permission, although it is just possible that the email gateway
    >>> forwarded it to email subscribers without forwarding it to the
    >>> usenet group proper.

    >>
    >> What you are suggesting is not possible.
    >>
    >> The Usenet news-group is just another subscriber to the questions
    >> list.

    >
    > It's certainly very possible that the missing article was private
    > email only, although possibly by mistake.


    Private e-mail can not be a "missing article".

    >The mailing list doesn't seem to be a simple subscriber,


    There is only _one_ type of list subscriber: those who receive mail from
    the list.

    >as an example quoted before showed no sign of attachments in the usenet
    >version, but the mail archive version that I was pointed to mentioned
    >that attachments (a PGP signature) had been suppressed.


    Our mailing lists strip out all manner of MIME cruft. The gateway is a
    bit more stringent to protect those of us who use real (i.e. console)
    news readers.

    > I assume you mean the usenet gateway is a subscriber, as usenet
    > groups can't subscribe to mailing lists on their own. In that case,
    > it is at least theoretically possible that the gateway suppresses the
    > message on the usenet side, but if it is an ordinary subscriber on the
    > mailing list side, the message will still go to other mailing list
    > subscribers. One obvious case in which this would happen is if there
    > was a duplicate message ID.


    Both the mailing-list and the gateway use the original message ID to
    prevent duplicate posts/articles.

    Every post/article is propagated exactly _once_.

    There is no suppression. There is no Cabal.

    --
    Steve Kostecke
    NTP Public Services Project - http://support.ntp.org/

  3. Re: high precision tracking: trying to understand sudden jumps

    On Mar 30, 8:05 pm, starli...@binnacle.cx wrote:

    > Might try them later, but I can't believe a high-quality
    > SMC switch is causing multi-millisecond delays. Just not


    Do you have access to a different (Cisco, Extreme, Foundry, or HP)
    switch for testing? If not, try a crossover cable between the NTP
    server and one of the systems. If the problem disappears, you'll know
    the switch was the culprit.

    We've seen lots of strange issues with less expensive switches
    (NetGear, similar to SMC) that just don't happen with the more
    expensive brands. You often get what you pay for.
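
    The peerstats files requested earlier in the thread can help decide
    between the switch and the host before swapping hardware. An offset
    error caused purely by path asymmetry cannot exceed half the
    measured round trip, so a network-induced spike should show up with
    an inflated delay in the same sample. The sketch below assumes the
    usual ntpd peerstats layout (MJD, seconds past midnight, peer
    address, status, offset, delay, dispersion, all times in seconds);
    the function names and both thresholds are hypothetical.

```python
from statistics import median


def parse_peerstats(line):
    """Parse one peerstats record; return None if the line is too short."""
    f = line.split()
    if len(f) < 7:
        return None
    return {
        "mjd": int(f[0]),
        "sec": float(f[1]),
        "peer": f[2],
        "offset": float(f[4]),  # seconds
        "delay": float(f[5]),   # seconds, round trip
    }


def classify_spikes(samples, offset_limit=0.0005, delay_factor=2.0):
    """Label each offset spike as 'network' or 'local'.

    A spike is any sample with |offset| > offset_limit. It is labelled
    'network' when its delay exceeds delay_factor times the median
    delay, otherwise 'local' (the round trip looked normal).
    """
    baseline = median(s["delay"] for s in samples)
    spikes = []
    for s in samples:
        if abs(s["offset"]) > offset_limit:
            cause = "network" if s["delay"] > delay_factor * baseline else "local"
            spikes.append((s["mjd"], s["sec"], cause))
    return spikes
```

    Spikes labelled 'local' would point away from the switch and toward
    the client itself (scheduling, interrupts, clock source); spikes
    labelled 'network' would make the crossover-cable test worthwhile.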

  4. Re: high precision tracking: trying to understand sudden jumps

    David Woolley wrote:
    > Heiko Gerstung wrote:
    >
    >> time has passed without the signal coming back. This results in the
    >> time server replying with stratum 12 (for example) after a while and
    >> ensures that everybody has the same time, although it might be wrong.
    >> If a user does not want that, they can simply set the local clock
    >> stratum to 15 and the server will not be accepted anymore.
    >>
    >> Can you please let me know why you consider this a "bad implementation"?

    >
    > Because the protocol fails to signal the loss of the time source
    > properly when one has a local clock configured. As such, I believe that
    > enabling a local clock should always be an opt in choice. Basically,
    > when it falls back to the local clock, root dispersion goes to zero,
    > when the true situation is that root dispersion is growing without bound.


    The signal is the higher stratum level, at least for a lot of SNTP
    implementations. Almost no one is looking at the root dispersion value when it
    comes to SNTP ...

    In our web interface you can disable the use of the local clock reference
    completely. I always recommend keeping it active but setting its stratum to 15,
    which should result in being rejected by any standards compliant client.

    Running without the local clock ref means the server signals itself as being
    synchronized by a stratum 0 source (e.g. GPS) and only the root dispersion value
    is increasing. As I said, most embedded/SNTP-only software checks for the SYNC
    status and (sometimes) stratum level.

    > Things can go seriously wrong if there is more than one local clock
    > source on a network, as it becomes possible for them to outvote the real
    > time.


    Yes, but I would not go so far as to say that offering the end user the choice to
    enable the local clock driver in his NTP appliance is a "bad implementation". I
    can, however, fully agree that there are a number of things that could go wrong
    when you use it (something that applies to a number of configuration options
    like tinker or restrict ...).

    Cheers,
    Heiko
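
    To make the two signals discussed in this post concrete, here is a
    minimal sketch of a client-side sanity check that looks at both the
    stratum and the root dispersion of a server reply. It relies only on
    the standard NTP header layout (LI/VN/Mode byte, stratum, poll,
    precision, then root delay and root dispersion as 16.16 fixed-point
    seconds). The function names and the limits `max_stratum` and
    `max_root_dispersion` are assumptions for illustration, not taken
    from any particular client.

```python
import struct


def parse_header(packet):
    """Decode the first 16 bytes of an NTP packet (48 bytes minimum)."""
    if len(packet) < 48:
        raise ValueError("NTP packet must be at least 48 bytes")
    flags, stratum, _poll, _precision, root_delay, root_disp, refid = struct.unpack(
        "!BBbbII4s", packet[:16]
    )
    return {
        "leap": flags >> 6,             # 3 means clock not synchronized
        "version": (flags >> 3) & 0x7,
        "mode": flags & 0x7,            # 4 means server reply
        "stratum": stratum,
        "root_delay": root_delay / 65536.0,       # seconds
        "root_dispersion": root_disp / 65536.0,   # seconds
        "refid": refid,
    }


def acceptable(hdr, max_stratum=15, max_root_dispersion=1.0):
    """Reject unsynchronized, too-high-stratum or too-dispersed servers."""
    if hdr["leap"] == 3:
        return False
    if hdr["stratum"] == 0 or hdr["stratum"] > max_stratum:
        return False
    return hdr["root_dispersion"] <= max_root_dispersion
```

    With checks like these, a server that has fallen back to a local
    clock at stratum 12 is caught only if `max_stratum` is set below
    that, while a server that keeps its original stratum and lets root
    dispersion grow is caught only by the dispersion limit. A client
    that tests just one of the two will miss the other case.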
