Odd (mis)behavior when reference clock fails - NTP


  1. Re: Odd (mis)behavior when reference clock fails

    Steve Kostecke writes:

    >On 2008-09-23, Unruh wrote:


    >> That is precisely what ntp is supposed to do for you.
    >> Unless you are suggesting that the clock drift is greater than 500PPM.


    >Unidirectional drift and periodic clock steps are a good indicator that
    >the clock drift exceeds ntpd's correction capabilities. In these cases
    >adjusting the tick to bring the clock drift within +/- 500PPM is
    >sometimes the solution.


    >As suggested elsewhere in this thread stopping ntpd, deleting the
    >drift.file, and starting ntpd is another potential solution.


    Overdrifting does not seem to be the problem as in his recent post. His
    drift of 5ms in a day is only .1PPM. ntp should be able to handle that
    easily and it indicates some other problem. And that is less than one lost
    tick per day, if that were the problem. Strange.
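
    The arithmetic behind these figures can be checked with a short sketch.
    The function names are made up for illustration, and the HZ values in the
    loop are assumptions (the thread does not give the tick rate of the
    affected machine). 5 ms per day works out to roughly 0.058 PPM, which the
    ".1PPM" above rounds up.

```python
SECONDS_PER_DAY = 86_400


def drift_ppm(offset_s: float, interval_s: float) -> float:
    """Average frequency error in PPM implied by an accumulated offset."""
    return offset_s / interval_s * 1e6


def daily_offset_s(ppm: float) -> float:
    """Offset in seconds accumulated over one day at a constant PPM error."""
    return ppm * 1e-6 * SECONDS_PER_DAY


if __name__ == "__main__":
    print(f"5 ms/day -> {drift_ppm(5e-3, SECONDS_PER_DAY):.3f} PPM")
    print(f"500 PPM  -> {daily_offset_s(500):.1f} s/day")
    # Tick length for a few timer rates (these HZ values are assumptions).
    for hz in (100, 250, 1000):
        print(f"HZ={hz:4d}: one tick = {1000 / hz:.0f} ms")
```

    Whether 5 ms is less than one tick depends on the timer rate: it is at
    100 Hz (10 ms ticks) but not at 250 Hz (4 ms ticks).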


  2. Re: Odd (mis)behavior when reference clock fails

    Unruh wrote:
    []
    > Overdrifting does not seem to be the problem as in his recent post.
    > His drift of 5ms in a day is only .1PPM. ntp should be able to handle
    > that easily and it indicates some other problem. And that is less
    > than one lost tick per day, if that were the problem. Strange.


    He also said: "Time is very stable and I see the drift at 93.576 ppm."

    David



  3. Re: Odd (mis)behavior when reference clock fails

    "David J Taylor" writes:

    >Unruh wrote:
    >[]
    >> Overdrifting does not seem to be the problem as in his recent post.
    >> His drift of 5ms in a day is only .1PPM. ntp should be able to handle
    >> that easily and it indicates some other problem. And that is less
    >> than one lost tick per day, if that were the problem. Strange.


    >He also said: "Time is very stable and I see the drift at 93.576 ppm."


    I was referring to the excess drift he was seeing when his clock became
    "unsynchronized". It is tiny. If the real drift rate were over 500PPM I sure
    would not expect it to be 500.06 PPM. I.e., all evidence is that this is NOT
    due to an overdrift. Or timer tick loss.

  4. Re: Odd (mis)behavior when reference clock fails

    Kevin,

    Kevin Oberman wrote:
    [...]
    > Another thought...could it be PPS that is causing it? After all, the pin
    > on the bulkhead connector is still getting the PPS signal. I am using the
    > kernel PPS implementation, so could that be training the kernel even
    > though ntp is not using it?


    That's also what I've got in mind when I read your latest posts.

    Can you disconnect the PPS signal and see what's happening?

    Martin
    --
    Martin Burnicki

    Meinberg Funkuhren
    Bad Pyrmont
    Germany

  5. Re: Odd (mis)behavior when reference clock fails

    > From: Martin Burnicki
    > Date: Wed, 24 Sep 2008 09:24:43 +0200
    > Sender: questions-bounces+oberman=es.net@lists.ntp.org
    >
    >
    > Kevin,
    >
    > Kevin Oberman wrote:
    > [...]
    > > Another thought...could it be PPS that is causing it? After all, the pin
    > > on the bulkhead connector is still getting the PPS signal. I am using the
    > > kernel PPS implementation, so could that be training the kernel even
    > > though ntp is not using it?

    >
    > That's also what I've got in mind when I read your latest posts.
    >
    > Can you disconnect the PPS signal and see what's happening?


    Martin,

    We have a winner! It is the PPS. If I take that out, it syncs correctly
    to all of the other systems.

    Looks like PPS will train whatever sync source is selected, not just the
    reference clock. So it was the reference clock drifting off time with no
    input signal, marking the time as inaccurate so that ntpd was ignoring
    it, but still sending out the PPS, which the system was still
    listening to via the kernel NTP_SYNC, and which was training the clock without
    paying any attention to the validity or presence of time from the
    reference clock.

    It looks like ntpd should be disabling PPS_SYNC when the reference clock
    is bad, but is not doing so. Note: I am referring ONLY to the kernel
    using PPS_SYNC. ntpd itself seems to not pay attention to PPS unless
    the reference clock is selected for sync.

    If I get some time, I'm going to look at the PPS code in ntpd and see if
    this can be done easily.
    --
    R. Kevin Oberman, Network Engineer
    Energy Sciences Network (ESnet)
    Ernest O. Lawrence Berkeley National Laboratory (Berkeley Lab)
    E-mail: oberman@es.net Phone: +1 510 486-8634
    Key fingerprint:059B 2DDF 031C 9BA3 14A4 EADA 927D EBB3 987B 3751

  6. Re: Odd (mis)behavior when reference clock fails

    oberman@es.net (Kevin Oberman) writes:

    >> From: Martin Burnicki
    >> Date: Wed, 24 Sep 2008 09:24:43 +0200
    >> Sender: questions-bounces+oberman=es.net@lists.ntp.org
    >>
    >>
    >> Kevin,
    >>
    >> Kevin Oberman wrote:
    >> [...]
    >> > Another thought...could it be PPS that is causing it? After all, the pin
    >> > on the bulkhead connector is still getting the PPS signal. I am using the
    >> > kernel PPS implementation, so could that be training the kernel even
    >> > though ntp is not using it?

    >>
    >> That's also what I've got in mind when I read your latest posts.
    >>
    >> Can you disconnect the PPS signal and see what's happening?


    >Martin,


    >We have a winner! It is the PPS. If I take that out, it syncs correctly
    >to all of the other systems.


    >Looks like PPS will train whatever sync source is selected, not just the
    >reference clock. So it was the reference clock drifting off time with no
    >input signal, marking the time as inaccurate so that ntpd was ignoring
    >it, but still sending out the PPS, which the system was still
    >listening to via the kernel NTP_SYNC, and which was training the clock without
    >paying any attention to the validity or presence of time from the
    >reference clock.


    I at least am confused. What is generating the PPS signal? I would have
    thought it was the hardware clock that you say is misbehaving. If so it
    should not send out a PPS signal at all. Or is it your computer itself that
    is sending out a PPS based on its own clock? In that case you certainly
    should NOT be using it as a source of time.

    >It looks like ntpd should be disabling PPS_SYNC when the reference clock
    >is bad, but is not doing so. Note: I am referring ONLY to the kernel


    If the reference clock is bad it should not be sending out a PPS. Why is it
    doing so?


    >using PPS_SYNC. ntpd itself seems to not pay attention to PPS unless
    >the reference clock is selected for sync.


    >If I get some time, I'm going to look at the PPS code in ntpd and see if
    >this can be done easily.


    If that PPS is really not a good PPS source coming from an independent
    hardware time source, it should not be enabled at all.

  7. Re: Odd (mis)behavior when reference clock fails

    > From: Unruh
    > Date: Wed, 24 Sep 2008 18:10:24 GMT
    > Sender: questions-bounces+oberman=es.net@lists.ntp.org
    >
    >
    > oberman@es.net (Kevin Oberman) writes:
    >
    > >> From: Martin Burnicki
    > >> Date: Wed, 24 Sep 2008 09:24:43 +0200
    > >> Sender: questions-bounces+oberman=es.net@lists.ntp.org
    > >>
    > >>
    > >> Kevin,
    > >>
    > >> Kevin Oberman wrote:
    > >> [...]
    > >> > Another thought...could it be PPS that is causing it? After all, the pin
    > >> > on the bulkhead connector is still getting the PPS signal. I am using the
    > >> > kernel PPS implementation, so could that be training the kernel even
    > >> > though ntp is not using it?
    > >>
    > >> That's also what I've got in mind when I read your latest posts.
    > >>
    > >> Can you disconnect the PPS signal and see what's happening?

    >
    > >Martin,

    >
    > >We have a winner! It is the PPS. If I take that out, it syncs correctly
    > >to all of the other systems.

    >
    > >Looks like PPS will train whatever sync source is selected, not just the
    > >reference clock. So it was the reference clock drifting off time with no
    > >input signal, marking the time as inaccurate so that ntpd was ignoring
    > >it, but still sending out the PPS, which the system was still
    > >listening to via the kernel NTP_SYNC, and which was training the clock without
    > >paying any attention to the validity or presence of time from the
    > >reference clock.

    >
    > I at least am confused. What is generating the PPS signal? I would have
    > thought it was the hardware clock that you say is misbehaving. If so it
    > should not send out a PPS signal at all. Or is it your computer itself that
    > is sending out a PPS based on its own clock? In that case you certainly
    > should NOT be using it as a source of time.


    The clock, for better or worse, tags the time it supplies with an
    accuracy character which indicates accuracy, but the lack of accuracy
    does not cause the PPS to stop. This includes complete loss of
    accuracy. The clock keeps running and keeps time, but it is now limited
    to the accuracy of the internal clock in the reference clock. This is
    what was drifting, not the system clock. It was simply dragging the
    system clock along for the ride.
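
    A toy model of that "dragging" effect, as a rough sketch only. It assumes
    a simple first-order tracking loop (the real kernel PPS discipline is more
    elaborate), a free-running oscillator error of 0.058 PPM (about 5 ms per
    day, the figure discussed earlier in the thread), and an arbitrary loop
    gain. The function name is made up for illustration.

```python
def simulate_holdover(days: float = 1.0,
                      holdover_ppm: float = 0.058,
                      gain: float = 0.1) -> tuple[float, float]:
    """Return (system error vs. true time, system error vs. PPS) in seconds.

    Each loop iteration is one PPS pulse. The reference clock free-runs with
    a constant frequency error, and the system clock is steered toward every
    pulse with a first-order loop (an assumption, not the kernel algorithm).
    """
    pulses = int(days * 86_400)
    ref_error = 0.0  # reference clock error relative to true time
    sys_error = 0.0  # system clock error relative to true time
    for _ in range(pulses):
        ref_error += holdover_ppm * 1e-6          # oscillator drifts a little
        sys_error += gain * (ref_error - sys_error)  # follow the PPS edge
    return sys_error, sys_error - ref_error


if __name__ == "__main__":
    vs_true, vs_pps = simulate_holdover()
    print(f"system clock vs. true time: {vs_true * 1e3:.3f} ms")
    print(f"system clock vs. PPS edge:  {vs_pps * 1e6:.3f} us")
```

    In this model the system clock stays within a microsecond of the PPS edge
    while wandering about 5 ms away from true time over a day, which is why
    the PPS looks healthy locally even though the peers disagree.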

    > >It looks like ntpd should be disabling PPS_SYNC when the reference clock
    > >is bad, but is not doing so. Note: I am referring ONLY to the kernel

    >
    > If the reference clock is bad it should not be sending out a PPS. Why is it
    > doing so?


    Because it does. I can contact EndRun about it. I agree that it would be
    best if the clock stopped the PPS, but ntpd could do the same thing,
    and I see no reason that it should enable PPS_SYNC until the PPS is
    marked as ready; it should disable it when the PPS is no longer marked
    as valid. It marks the PPS validity in 'ntpq -p', so it knows whether it
    considers PPS valid. Why should it allow the PPS_SYNC when it has PPS not
    marked valid?
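
    The gating proposed here could look roughly like the sketch below. It is
    not ntpd's code. It assumes that the runtime switch for kernel PPS
    discipline corresponds to the STA_PPSFREQ and STA_PPSTIME status bits of
    the kernel timekeeping interface, with the values commonly found in
    timex.h. The function name and the pps_valid input are hypothetical, and
    actually handing the status word to the kernel (for example via
    ntp_adjtime()) is not shown.

```python
# Status bits as commonly defined in <sys/timex.h> (assumed values).
STA_PLL = 0x0001      # enable PLL updates
STA_PPSFREQ = 0x0002  # enable PPS frequency discipline
STA_PPSTIME = 0x0004  # enable PPS time discipline

PPS_BITS = STA_PPSFREQ | STA_PPSTIME


def gate_pps(status: int, pps_valid: bool) -> int:
    """Return the status word with PPS discipline on only while PPS is valid.

    `pps_valid` is a hypothetical flag standing in for whatever validity
    information the reference clock driver has about its source.
    """
    if pps_valid:
        return status | PPS_BITS
    return status & ~PPS_BITS


if __name__ == "__main__":
    locked = gate_pps(STA_PLL, pps_valid=True)
    holdover = gate_pps(locked, pps_valid=False)
    print(f"locked:   {locked:#06x}")
    print(f"holdover: {holdover:#06x}")
```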

    > >using PPS_SYNC. ntpd itself seems to not pay attention to PPS unless
    > >the reference clock is selected for sync.

    >
    > >If I get some time, I'm going to look at the PPS code in ntpd and see if
    > >this can be done easily.

    >
    > If that PPS is really not a good PPS source coming from an independent
    > hardware time source, it should not be enabled at all.


    If the clock is getting a good signal, the PPS is valid. If it is not
    getting a signal, it is only as accurate as the internal clock in the
    device.
    --
    R. Kevin Oberman, Network Engineer
    Energy Sciences Network (ESnet)
    Ernest O. Lawrence Berkeley National Laboratory (Berkeley Lab)
    E-mail: oberman@es.net Phone: +1 510 486-8634
    Key fingerprint:059B 2DDF 031C 9BA3 14A4 EADA 927D EBB3 987B 3751

  8. Re: Odd (mis)behavior when reference clock fails

    oberman@es.net (Kevin Oberman) writes:

    >> From: Unruh
    >> Date: Wed, 24 Sep 2008 18:10:24 GMT
    >> Sender: questions-bounces+oberman=es.net@lists.ntp.org
    >>
    >>
    >> oberman@es.net (Kevin Oberman) writes:
    >>
    >> >> From: Martin Burnicki
    >> >> Date: Wed, 24 Sep 2008 09:24:43 +0200
    >> >> Sender: questions-bounces+oberman=es.net@lists.ntp.org
    >> >>
    >> >>
    >> >> Kevin,
    >> >>
    >> >> Kevin Oberman wrote:
    >> >> [...]
    >> >> > Another thought...could it be PPS that is causing it? After all, the pin
    >> >> > on the bulkhead connector is still getting the PPS signal. I am using the
    >> >> > kernel PPS implementation, so could that be training the kernel even
    >> >> > though ntp is not using it?
    >> >>
    >> >> That's also what I've got in mind when I read your latest posts.
    >> >>
    >> >> Can you disconnect the PPS signal and see what's happening?

    >>
    >> >Martin,

    >>
    >> >We have a winner! It is the PPS. If I take that out, it syncs correctly
    >> >to all of the other systems.

    >>
    >> >Looks like PPS will train whatever sync source is selected, not just the
    >> >reference clock. So it was the reference clock drifting off time with no
    >> >input signal, marking the time as inaccurate so that ntpd was ignoring
    >> >it, but still sending out the PPS, which the system was still
    >> >listening to via the kernel NTP_SYNC, and which was training the clock without
    >> >paying any attention to the validity or presence of time from the
    >> >reference clock.

    >>
    >> I at least am confused. What is generating the PPS signal? I would have
    >> thought it was the hardware clock that you say is misbehaving. If so it
    >> should not send out a PPS signal at all. Or is it your computer itself that
    >> is sending out a PPS based on its own clock? In that case you certainly
    >> should NOT be using it as a source of time.


    >The clock, for better or worse, tags the time it supplies with an
    >accuracy character which indicates accuracy, but the lack of accuracy
    >does not cause the PPS to stop. This includes complete loss of
    >accuracy. The clock keeps running and keeps time, but it is now limited
    >to the accuracy of the internal clock in the reference clock. This is
    >what was drifting, not the system clock. It was simply dragging the
    >system clock along for the ride.


    Ah, ok. I now understand. Unfortunately the PPS in ntp is a separate
    hardware clock from the others.

    There is no necessary link between hardware clocks, i.e. ntp assumes that
    different hardware clocks are independent sources of time. Whether there is
    any way of tying them together (i.e., A is a good source of time if and only
    if B is a good source of time) I do not know. It could be done I suppose.



    >> >It looks like ntpd should be disabling PPS_SYNC when the reference clock
    >> >is bad, but is not doing so. Note: I am referring ONLY to the kernel

    >>
    >> If the reference clock is bad it should not be sending out a PPS. Why is it
    >> doing so?


    >Because it does. I can contact EndRun about it. I agree that it would be
    >best if the clock stopped the PPS, but ntpd could do the same thing,
    >and I see no reason that it should enable PPS_SYNC until the PPS is
    >marked as ready; it should disable it when the PPS is no longer marked
    >as valid. It marks the PPS validity in 'ntpq -p', so it knows whether it
    >considers PPS valid. Why should it allow the PPS_SYNC when it has PPS not
    >marked valid?


    The PPS_SYNC flag is AFAIK an indication as to whether ntp regards the
    internal clock as synchronised to the PPS, not whether it thinks the PPS is
    a good source of time. ntp has no way of knowing if a hardware time source
    is a good source of time or not, especially if there are only two of them.
    It has no independent measures.

    Especially with a hardware time source, it must assume that that source is
    good.

    >> >using PPS_SYNC. ntpd itself seems to not pay attention to PPS unless
    >> >the reference clock is selected for sync.

    >>
    >> >If I get some time, I'm going to look at the PPS code in ntpd and see if
    >> >this can be done easily.

    >>
    >> If that PPS is really not a good PPS source coming from an independent
    >> hardware time source, it should not be enabled at all.


    >If the clock is getting a good signal, the PPS is valid. If it is not
    >getting a signal, it is only as accurate as the internal clock in the
    >device.


    But how does ntp know that? It has no idea that the PPS signal has anything
    to do with the CDMA time. Your PPS could be from a GPS satellite, and thus
    be a better source of time than the CDMA signal. It cannot know that.
    I have a PPS on my system which I drive via a parallel port module. I could
    have another device attached to my serial port, perhaps one of your CDMA
    clock sources. If the PPS stops and no more signals come, then the PPS will
    fail. If the PPS keeps sending signals approx once per second, ntp HAS to
    assume that they are good.
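
    A minimal sketch of why a pulse-spacing check alone cannot catch this
    case. The function name and the 500 PPM tolerance are assumptions chosen
    for illustration, and the holdover figure of 0.058 PPM is the 5 ms per
    day discussed earlier in the thread.

```python
def interval_plausible(interval_s: float, tolerance_ppm: float = 500.0) -> bool:
    """True if the spacing between two pulses is close enough to one second."""
    return abs(interval_s - 1.0) <= tolerance_ppm * 1e-6


if __name__ == "__main__":
    free_running = 1.0 + 0.058e-6   # oscillator in holdover, slightly slow
    missed_pulse = 2.0              # one pulse lost entirely
    print("free-running oscillator accepted:", interval_plausible(free_running))
    print("missed pulse accepted:           ", interval_plausible(missed_pulse))
```

    A free-running oscillator still produces pulses that are almost exactly
    one second apart, so it passes, while only gross failures such as missing
    pulses are rejected.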



  9. Re: Odd (mis)behavior when reference clock fails

    >>> In article <20080924165603.4711445048@ptavv.es.net>, oberman@es.net (Kevin Oberman) writes:

    >> From: Martin Burnicki
    >> Can you disconnect the PPS signal and see what's happening?


    Kevin> We have a winner! It is the PPS. If I take that out, it syncs
    Kevin> correctly to all of the other systems.

    Kevin> Looks like PPS will train whatever sync source is selected, not just
    Kevin> the reference clock. So it was the reference clock drifting off time
    Kevin> with no input signal, marking the time as inaccurate so that ntpd was
    Kevin> ignoring it, but still sending out the PPS, which the system was
    Kevin> still listening to via the kernel NTP_SYNC, and which was training the clock
    Kevin> without paying any attention to the validity or presence of time from
    Kevin> the reference clock.

    Please see http://support.ntp.org/bin/view/Dev/...rDriverPPSCode .

    Dave, here is more evidence that the current behavior is insufficient and
    even potentially wrong.

    If you don't like the current proposal, do you have a suggestion for how we
    can address the current situation?

    --
    Harlan Stenn
    http://ntpforum.isc.org - be a member!

  10. Re: Odd (mis)behavior when reference clock fails


    >Why would you poll a UPS as fast as you can? This sounds like the old
    >joke-- Doctor I have a real headache-- What do you do?-- Before it hurts I
    >bash my head against a brick wall.


    It seemed like a reasonable idea at the time. I'm using it to monitor
    the line voltage. You send it a command. It tells you the min input
    voltage since the last time you asked. The faster I ask the better
    time resolution I get on the duration of glitches.


    >It sounds like you are losing timer interrupts.


    Thanks for the suggestion. It's a 1 GHz VIA C7 system. It was running
    at 250 Hz. I just switched to 100. No change.

    --
    These are my opinions, not necessarily my employer's. I hate spam.


  11. Re: Odd (mis)behavior when reference clock fails

    hal-usenet@ip-64-139-1-69.sjc.megapath.net (Hal Murray) writes:


    >>Why would you poll a UPS as fast as you can? This sounds like the old
    >>joke-- Doctor I have a real headache-- What do you do?-- Before it hurts I
    >>bash my head against a brick wall.


    >It seemed like a reasonable idea at the time. I'm using it to monitor
    >the line voltage. You send it a command. It tells you the min input
    >voltage since the last time you asked. The faster I ask the better
    >time resolution I get on the duration of glitches.


    Does it really make sense to get microsecond resolution on the glitches?

    Try putting a microsleep into your routine to see if that solves the
    clock problem. It is possible that the reading of the UPS is done via some
    sort of badly written interrupt routine which holds open the interrupts too
    long. It is however a bit weird I will admit.
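
    The suggestion amounts to something like the sketch below. The query
    callable is a hypothetical stand-in for whatever sends the command to the
    UPS and reads back the minimum input voltage, and the pause length is an
    arbitrary assumption to be tuned.

```python
import time
from typing import Callable, List


def poll_with_pause(query: Callable[[], float],
                    count: int,
                    pause_s: float = 0.01) -> List[float]:
    """Call `query` `count` times, sleeping briefly after each reading.

    The sleep yields the CPU between requests instead of hammering the
    device back to back.
    """
    readings = []
    for _ in range(count):
        readings.append(query())
        time.sleep(pause_s)
    return readings


if __name__ == "__main__":
    # Fake reading used only so the sketch runs without a UPS attached.
    print(poll_with_pause(lambda: 120.0, count=3))
```

    If the clock problem disappears with even a short pause, that would point
    at the tight polling loop rather than at ntpd.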




    >>It sounds like you are losing timer interrupts.


    >Thanks for the suggestion. It's a 1 GHz VIA C7 system. It was running
    >at 250 Hz. I just switched to 100. No change.






  12. Re: Odd (mis)behavior when reference clock fails

    >Does it really make sense to get microsecond resolution on the glitches?

    I doubt if whatever the UPS is using for an A/D is good for microseconds.

    I've thought of using a wall-wart and some dividers to feed an audio
    input channel. That would probably be good for 10s of microseconds.

    I'm saving that for when I have lots of spare time.


    >Try putting a microsleep into your routine to see if that solves the
    >clock problem. It is possible that the reading of the UPS is done via some
    >sort of badly written interrupt routine which holds open the interrupts too
    >long. It is however a bit weird I will admit.


    I talk to the UPS over a serial port. I suppose I could try
    a USB to serial gizmo.

    Another idea is to make an echo plug and some software that sends
    stuff to itself and see what that does on various systems.

    --
    These are my opinions, not necessarily my employer's. I hate spam.


  13. Re: Odd (mis)behavior when reference clock fails

    hal-usenet@ip-64-139-1-69.sjc.megapath.net (Hal Murray) writes:

    >>Does it really make sense to get microsecond resolution on the glitches?


    >I doubt if whatever the UPS is using for an A/D is good for microseconds.


    >I've thought of using a wall-wart and some dividers to feed an audio
    >input channel. That would probably be good for 10s of microseconds.


    >I'm saving that for when I have lots of spare time.



    >>Try putting a microsleep into your routine to see if that solves the
    >>clock problem. It is possible that the reading of the UPS is done via some
    >>sort of badly written interrupt routine which holds open the interrupts too
    >>long. It is however a bit weird I will admit.


    >I talk to the UPS over a serial port. I suppose I could try
    >a USB to serial gizmo.


    That would be far worse. And serial is good maybe for 100s of milliseconds,
    certainly not microseconds.
    Try decreasing the UPS polling and see if it makes a difference.
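
    A back-of-the-envelope sketch of the serial-line limit. The baud rates
    and message lengths are assumptions (the thread gives neither), and 8N1
    framing with 10 bits per character is assumed.

```python
def char_time_s(baud: int, bits_per_char: int = 10) -> float:
    """Seconds on the wire per character (8N1: start + 8 data + stop bits)."""
    return bits_per_char / baud


def exchange_time_s(baud: int, command_chars: int, reply_chars: int) -> float:
    """Minimum wire time for one command/reply exchange, ignoring latency."""
    return (command_chars + reply_chars) * char_time_s(baud)


if __name__ == "__main__":
    for baud in (2400, 9600):
        per_char_ms = char_time_s(baud) * 1e3
        exchange_ms = exchange_time_s(baud, command_chars=1, reply_chars=5) * 1e3
        print(f"{baud:5d} baud: {per_char_ms:.2f} ms/char, "
              f"{exchange_ms:.1f} ms per exchange")
```

    Even before any device or driver latency, one poll takes milliseconds at
    these rates, so polling faster cannot buy microsecond resolution.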


    >Another idea is to make an echo plug and some software that sends
    >stuff to itself and see what that does on various systems.


    >--
    >These are my opinions, not necessarily my employer's. I hate spam.


