Re: high precision tracking: trying to understand sudden jumps - NTP
-
Re: high precision tracking: trying to understand sudden jumps
At 04:51 PM 3/30/2008 -0700, Bill Unruh wrote:
>Are those on the same day?
Yes, same day. Uncorrelated to anything I can identify
or each other. Same story on all the boxes. Running
a hefty multi-system compile with heavy NFS and Samba
traffic does not produce these events, though it disturbs
the Windows boxes slightly when CPU goes to 100%.
>Which "linux" and which "windows" are those graphs since you
>have 2 linux and 2 windows clients.
That's the dual-core AMD 2.4GHz Athlon Tyan mobo whitebox
running the CentOS 4.5 SMP kernel. Similar results on the
Dell Dimension 2400 2.4GHz Intel P4 running the CentOS 4.5
uniprocessor kernel.
Windows is a dual-core 3.4GHz Pentium D Tyan mobo whitebox
running 2003 R2 SP2 standard server.
>As I said, seeing the
>peerstats files would be helpful (offset and roundtrip)
Might try them later, but I can't believe a high-quality
SMC switch is causing multi-millisecond delays. Just not
possible. Pings are all about 400 microseconds, consistent
but slightly different on each system. Round trip is
800 microseconds. Attaching the output from a bulk 'ntpq -p'
'ntptrace' script I have below. Note that's 'ntptrace'
version 4.1 since the 4.2 script has useless offset info.
>Also these graphs seem to have cut off the spikes. Are the
>spikes actually higher or is that an illusion?
Higher. Sometimes 1ms, sometimes 5-6ms.
>(Note the spikes are hundreds of usec, not many msec)
That would be the ~1ms example, check out the other one.
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
Endrun CDMA
 LOCAL(0)        LOCAL(0)        10 l   18   64  377    0.000    0.000   0.015
*HOPF_S(0)       .CDMA.           0 l    6   16  377    0.000    0.000   0.015

Centos 32
*eachna          .CDMA.           1 u    3   16  377    0.683   -0.004   0.009
-tock.usno.navy. .USNO.           1 u  452 1024  377   20.678    1.432   2.822
+navobs1.wustl.e .GPS.            1 u  479 1024  377   50.136   -1.513   0.164
+time.nist.gov   .ACTS.           1 u  471 1024  377   66.528   -1.708   0.156
-tick.ucla.edu   .GPS.            1 u  432 1024  377   87.372    3.296   0.085

Ultra 10
*172.29.87.3     .CDMA.           1 u   11   16  377    0.869   -0.016   0.042
172.29.87.15: stratum 2, offset -0.000007, synch distance 0.00783
172.29.87.3: stratum 1, offset -0.000018, synch distance 0.00038, refid 'CDMA'

Ultra 80
*172.29.87.3     .CDMA.           1 u    4   16  377    0.942   -0.012   0.012
172.29.87.17: stratum 2, offset -0.000038, synch distance 0.00685
172.29.87.3: stratum 1, offset -0.000017, synch distance 0.00038, refid 'CDMA'

44p
*172.29.87.3     .CDMA.           1 u   13   16  377    0.809   -0.001   0.016
172.29.87.13: stratum 2, offset -0.000014, synch distance 0.00627
172.29.87.3: stratum 1, offset -0.000018, synch distance 0.00038, refid 'CDMA'

Centos 64
*172.29.87.3     .CDMA.           1 u   12   16  377    0.664    0.003   0.487
172.29.87.19: stratum 2, offset -0.000009, synch distance 0.00720
172.29.87.3: stratum 1, offset -0.000018, synch distance 0.00038, refid 'CDMA'

W2K3 64
*172.29.87.3     .CDMA.           1 u    4   16  377    0.734    0.053   0.014
172.29.87.20: stratum 2, offset -0.000060, synch distance 0.00650
172.29.87.3: stratum 1, offset -0.000019, synch distance 0.00038, refid 'CDMA'

XP 32 laptop
*172.29.87.3     .CDMA.           1 u    7   16  377    0.819    0.468   0.256
172.29.87.12: stratum 2, offset -0.000173, synch distance 0.00655
172.29.87.3: stratum 1, offset -0.000017, synch distance 0.00038, refid 'CDMA'
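
For anyone who wants to look at the peerstats data requested earlier in the thread (offset and round-trip delay), here is a minimal Python sketch. It assumes the usual ntpd peerstats line layout (MJD day, seconds past midnight, peer address, status word, offset, delay, dispersion, jitter, with times in seconds). The function names and the 0.5 ms threshold are illustrative choices, not anything taken from the posts above.

```python
from statistics import median


def parse_peerstats(lines):
    """Yield (mjd, seconds, peer, offset_s, delay_s) tuples from peerstats lines.

    Lines that do not match the assumed layout are skipped.
    """
    for line in lines:
        fields = line.split()
        if len(fields) < 6:
            continue
        try:
            yield (int(fields[0]), float(fields[1]), fields[2],
                   float(fields[4]), float(fields[5]))
        except ValueError:
            continue


def find_spikes(samples, threshold_s=0.0005):
    """Return the samples whose offset is more than threshold_s from the median offset."""
    samples = list(samples)
    if not samples:
        return []
    mid = median(s[3] for s in samples)
    return [s for s in samples if abs(s[3] - mid) > threshold_s]
```

Printing the delay next to each flagged offset is the useful part. A spike where the delay rises by a comparable amount would point toward the network path, while a spike with unchanged delay would point toward the host itself. That reading is a rule of thumb, not a proof.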
-
Re: high precision tracking: trying to understand sudden jumps
On 2008-03-31, David Woolley wrote:
> Steve Kostecke wrote:
>
>> On 2008-03-31, David Woolley wrote:
>>
>>> Bill Unruh wrote:
>>>
>>>> On Sun, 30 Mar 2008, starlight@binnacle.cx wrote:
>>>
>>> You appear to be quoting an off list reply with no indication of
>>> permission, although it is just possible that the email gateway
>>> forwarded it to email subscribers without forwarding it to the
>>> usenet group proper.
>>
>> What you are suggesting is not possible.
>>
>> The Usenet news-group is just another subscriber to the questions
>> list.
>
> It's certainly very possible that the missing article was private
> email only, although possibly by mistake.
Private e-mail can not be a "missing article".
>The mailing list doesn't seem to be a simple subscriber,
There is only _one_ type of list subscriber: those who receive mail from
the list.
>as an example quoted before showed no sign of attachments in the usenet
>version, but the mail archive version that I was pointed to mentioned
>that attachments (a PGP signature) had been suppressed.
Our mailing lists strip out all manner of MIME cruft. The gateway is a
bit more stringent to protect those of us who use real (i.e. console)
news readers.
> I assume you mean the usenet gateway is a subscriber, as usenet
> groups can't subscribe to mailing lists on their own. In that case,
> it is at least theoretically possible that the gateway suppresses the
> message on the usenet side, but if it is an ordinary subscriber on the
> mailing list side, the message will still go to other mailing list
> subscribers. One obvious case in which this would happen is if there
> was a duplicate message ID.
Both the mailing-list and the gateway use the original message ID to
prevent duplicate posts/articles.
Every post/article is propagated exactly _once_.
There is no suppression. There is no Cabal.
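
As a rough illustration of the duplicate prevention described above, here is a small Python sketch of a gateway that forwards each Message-ID only once. The class name and the choice to drop messages that lack a Message-ID header are assumptions made for the example; this is not the actual list or gateway software.

```python
from email import message_from_string


class DedupGateway:
    """Hypothetical gateway that propagates each Message-ID at most once."""

    def __init__(self):
        self.seen_ids = set()

    def should_propagate(self, raw_message):
        """Return True the first time a Message-ID is seen, False afterwards."""
        msg = message_from_string(raw_message)
        msg_id = (msg.get("Message-ID") or "").strip()
        if not msg_id:
            # Assumption: a message without an ID cannot be deduplicated, so drop it.
            return False
        if msg_id in self.seen_ids:
            return False
        self.seen_ids.add(msg_id)
        return True
```

With a scheme like this, a post that arrives both from the list and from the news side carries the same Message-ID and is passed on only the first time it is seen.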
--
Steve Kostecke
NTP Public Services Project - http://support.ntp.org/
-
Re: high precision tracking: trying to understand sudden jumps
On Mar 30, 8:05 pm, starli...@binnacle.cx wrote:
> Might try them later, but I can't believe a high-quality
> SMC switch is causing multi-millisecond delays. Just not
Do you have access to a different (Cisco, Extreme, Foundry, or HP)
switch for testing? If not, try a crossover cable between the NTP
server and one of the systems. If the problem disappears, you'll know
the switch was the culprit.
We've seen lots of strange issues with less expensive switches
(NetGear, similar to SMC) that just don't happen with the more
expensive brands. You often get what you pay for.
-
Re: high precision tracking: trying to understand sudden jumps
David Woolley wrote:
> Heiko Gerstung wrote:
>
>> time has passed without the signal coming back. This results in the
>> time server replying with stratum 12 (for example) after a while and
>> ensures that everybody has the same time, although it might be wrong.
>> If a user does not want that, they can simply set the local clock
>> stratum to 15 and the server will not be accepted anymore.
>>
>> Can you please let me know why you consider this a "bad implementation"?
>
> Because the protocol fails to signal the loss of the time source
> properly when one has a local clock configured. As such, I believe that
> enabling a local clock should always be an opt in choice. Basically,
> when it falls back to the local clock, root dispersion goes to zero,
> when the true situation is that root dispersion is growing without bound.
The signal is the higher stratum level, at least for a lot of SNTP
implementations. Almost no one looks at the root dispersion value when it
comes to SNTP ...
In our web interface you can disable the use of the local clock reference
completely. I always recommend keeping it active but setting its stratum to 15,
which should cause any standards-compliant client to reject the server.
Running without the local clock ref means the server signals itself as being
synchronized by a stratum 0 source (e.g. GPS) and only the root dispersion value
is increasing. As I said, most embedded/SNTP-only software checks for the SYNC
status and (sometimes) stratum level.
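
The client-side checks discussed here (sync status, stratum, and the rarely inspected root dispersion) might look like the following Python sketch. The function name, the stratum cutoff, and the one-second dispersion limit are illustrative assumptions, not values taken from any particular SNTP client. The header layout used (leap indicator in the top two bits of byte 0, stratum in byte 1, root dispersion as a 16.16 fixed-point value in bytes 8 to 11) follows the standard NTP packet format.

```python
import struct


def check_sntp_reply(packet, max_stratum=14, max_root_dispersion_s=1.0):
    """Return (ok, reason) for the 48-byte header of an NTP server reply.

    max_stratum and max_root_dispersion_s are hypothetical policy knobs.
    """
    if len(packet) < 48:
        return False, "short packet"
    leap = packet[0] >> 6          # leap indicator: 3 means unsynchronized
    stratum = packet[1]
    # Root dispersion: unsigned 16.16 fixed point, in seconds.
    (root_disp_raw,) = struct.unpack("!I", packet[8:12])
    root_dispersion_s = root_disp_raw / 65536.0
    if leap == 3:
        return False, "server reports unsynchronized"
    if stratum == 0 or stratum > max_stratum:
        return False, "stratum %d not acceptable" % stratum
    if root_dispersion_s > max_root_dispersion_s:
        return False, "root dispersion %.3f s too large" % root_dispersion_s
    return True, "ok"
```

A client that only tests the leap indicator and stratum would accept a server that has fallen back to its local clock at stratum 12. Adding the dispersion test catches the other case described above, where the stratum stays low and only the root dispersion grows.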
> Things can go seriously wrong if there is more than one local clock
> source on a network, as it becomes possible for them to outvote the real
> time.
Yes, but I would not go so far as to say that offering the end user the choice to
enable the local clock driver in his NTP appliance is a "bad implementation". I
can, however, fully agree that there are a number of things that could go wrong
when you use it (something that applies to a number of configuration options
like tinker or restrict ...).
Cheers,
Heiko