# drift modeling question - NTP


1. ## drift modeling question

Has anyone thought about removing both the linear and quadratic terms in
the drift by utilizing the temperature sensor readings available on many
of the latest motherboards?

Crystal oscillators tend to have both a linear bias and a quadratic bias,
dependent on temperature, leaving a stochastic component, which I
believe is what the operation of ntp has been compensating for all along.

Combining the temperature readings with knowledge of the current
drift should suggest the appropriate coefficients for the corrections.

Just a thought, but it seems a shame that we're not taking advantage of
the thermal data available to us when correcting for clock drift, and
are simply using a linear correction.

2. ## Re: drift modeling question

shy author writes:
>terms in the drift, by utilizing the temperature sensor readings
>available on many of the latest motherboards?

Linear works pretty well. I think it would be hard to get quadratic.

NTP temperature compensation
Mark Martinec, 2001-01-08
http://www.ijs.si/time/temp-compensation/

>Just a thought, but it seems a shame that we're not taking advantage of
>the thermal data available to us, when correcting for clock drift and
>simply using a linear correction.

The temperature sensor is probably in the CPU chip. You probably
want it on the crystal. The sensors on CPUs tend to be coarse.

--
These are my opinions, not necessarily my employer's. I hate spam.

3. ## Re: drift modeling question

shy author writes:

>terms in the drift, by utilizing the temperature sensor readings
>available on many of the latest motherboards?

>Crystal oscillators tend to have both a linear bias and a quadratic bias,
>determinate upon temperature, leaving a stochastic component, which I
>believe we're trying to compensate for all along by the operation of ntp.

>Integration of the temperature readings and knowledge of the current
>drift should suggest the appropriate coefficients for the corrections.

>Just a thought, but it seems a shame that we're not taking advantage of
>the thermal data available to us, when correcting for clock drift and
>simply using a linear correction.

A quadratic correction tends to run away pretty badly if you suddenly
have a period of no ntp readings. But I am not at all sure that is what
you mean. Rather, you seem to mean using the temperature to estimate what
the linear drift is (from past readings of temperature and drift).

4. ## Re: drift modeling question

> A Quadratic correction tends to run away pretty badly if you suddenly
> have a period of no ntp readings. But I am not at all sure that is what
> you mean. Rather you seem to mean that one use the temp to estimate what
> the linear drift is ( from the past readings of temp and drift.)

Correct, in that the linear correction will vary with temperature (and
could be matched closely by a quadratic).

I am thinking of a least-squares fit of the drift data versus
temperature. Another poster mentioned that we really need the temperature
of the crystal; I agree, but there should be a strong correlation between
that (unknown) temperature and the motherboard or processor chip
temperatures.
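A sketch of that least-squares fit: regress measured drift (ppm) against temperature. Everything here (coefficients, noise level, temperature range) is synthetic and purely illustrative; real inputs would come from ntpd's drift statistics and a motherboard sensor.

```python
# Sketch: least-squares fit of clock drift (ppm) against temperature.
# All numbers are made up for illustration.
import numpy as np

rng = np.random.default_rng(0)
temp = rng.uniform(25.0, 55.0, 200)       # sensor readings, deg C

# Assume a quadratic dependence around a turnover temperature, plus
# measurement noise (the coefficients are invented for this example).
true_drift = -0.04 * (temp - 28.0) ** 2 + 12.0
drift = true_drift + rng.normal(0.0, 0.2, temp.size)

# Least-squares fit: drift = c2*T^2 + c1*T + c0
c2, c1, c0 = np.polyfit(temp, drift, 2)

def predict_drift(t):
    """Predicted drift (ppm) at temperature t from the fitted model."""
    return c2 * t * t + c1 * t + c0
```

With a history of (temperature, drift) pairs, `predict_drift` would give the correction coefficient to feed back into the clock discipline.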

5. ## Re: drift modeling question

shy author writes:

>> A Quadratic correction tends to run away pretty badly if you suddenly
>> have a period of no ntp readings. But I am not at all sure that is what
>> you mean. Rather you seem to mean that one use the temp to estimate what
>> the linear drift is ( from the past readings of temp and drift.)

>Correct, in that the linear correction will vary with temperature (and
>could be matched closely by a quadratic)

That is NOT a quadratic correction. It is a linear correction where the
linear coefficient depends on temperature.

>I am thinking of a least-squares fit to the drift data, versus
>temperature. Another poster mentioned that we really need temperature of
>the crystal, I agree, but there should be strong correlation between that
>(unknown) temperature and the motherboard or processor chip temperatures.

And a lag.
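A minimal sketch of handling that lag, assuming the crystal temperature follows the sensor with a first-order (exponential) lag. The time constant below is a guess for illustration, not a measured value:

```python
# Sketch: model the sensor-to-crystal thermal lag as a first-order
# low-pass filter applied to the sensor readings before fitting drift
# against them. tau is an assumed thermal time constant, not measured.
import math

def lagged_temperature(samples, dt=1.0, tau=300.0):
    """Estimate crystal temperature from sensor samples spaced dt
    seconds apart, via dT_xtal/dt = (T_sensor - T_xtal) / tau."""
    alpha = 1.0 - math.exp(-dt / tau)
    out = []
    t_xtal = None
    for t in samples:
        # First sample initializes the state; later samples relax
        # toward the sensor reading with time constant tau.
        t_xtal = t if t_xtal is None else t_xtal + alpha * (t - t_xtal)
        out.append(t_xtal)
    return out
```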

6. ## Re: drift modeling question

>Correct, in that the linear correction will vary with temperature (and
>could be matched closely by a quadratic)

I don't think you can measure the temperature and/or drift accurately
enough to make quadratic corrections interesting.

>I am thinking of a least-squares fit to the drift data, versus
>temperature. Another poster mentioned that we really need temperature of
>the crystal, I agree, but there should be strong correlation between that
>(unknown) temperature and the motherboard or processor chip temperatures.

Years ago, I tried to measure things.

Most PCs have 2 xtals. One at 14.xxx MHz (cheap, 4X color burst)
that drives the CPU and most motherboard logic through a magic clock
generator (PLL) chip, and another that is a 32 KHz watch crystal for
keeping time when the CPU is off. The latter also makes interrupts
for the scheduler.

I had the temperature probe on the 14 MHz xtal. It didn't work very well.

I was assuming that the system used something like a cycle counter
(TSC) for timekeeping. That has troubles in multi-CPU systems.
The code I was actually running used the timer interrupts for
timekeeping and the TSC to interpolate between ticks.
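That scheme can be sketched as follows: timer interrupts advance time in whole ticks, and the cycle counter interpolates within the current tick. The rates below are illustrative, not from any particular machine.

```python
# Sketch: tick-based timekeeping with TSC interpolation between ticks.
# TICK_HZ and CPU_HZ are assumed example values.
TICK_HZ = 100                 # timer interrupt rate (10 ms ticks)
CPU_HZ = 2_000_000_000        # assumed TSC frequency, 2 GHz

class TickClock:
    def __init__(self):
        self.ticks = 0        # whole ticks since boot
        self.tsc_at_tick = 0  # TSC value latched at the last interrupt

    def timer_interrupt(self, tsc_now):
        """Called on each timer interrupt: count the tick, latch the TSC."""
        self.ticks += 1
        self.tsc_at_tick = tsc_now

    def gettime(self, tsc_now):
        """Seconds since boot: tick count plus TSC-based interpolation."""
        frac = (tsc_now - self.tsc_at_tick) / CPU_HZ
        # Clamp: never interpolate past the next expected tick, so a
        # lost interrupt cannot make time run ahead.
        frac = min(frac, 1.0 / TICK_HZ)
        return self.ticks / TICK_HZ + frac
```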

Things got a lot better when I moved the probe to the 32 KHz xtal which
was in a corner of the board far from the CPU.

Here are a couple of graphs:
http://www.megapathdsl.net/~hmurray/ntp/drift-ex.gif
http://www.megapathdsl.net/~hmurray/ntp/drift.gif
http://www.megapathdsl.net/~hmurray/ntp/slope.gif
http://www.megapathdsl.net/~hmurray/ntp/slope2.gif

Again, read Mark Martinec's web page. He's got a lot of good
data.
http://www.ijs.si/time/temp-compensation/

--
These are my opinions, not necessarily my employer's. I hate spam.

7. ## Re: drift modeling question

shy author wrote:
>>A Quadratic correction tends to run away pretty badly if you suddenly
>>have a period of no ntp readings. But I am not at all sure that is what
>>you mean. Rather you seem to mean that one use the temp to estimate what
>>the linear drift is ( from the past readings of temp and drift.)

>
>
> Correct, in that the linear correction will vary with temperature (and
> could be matched closely by a quadratic)
>
> I am thinking of a least-squares fit to the drift data, versus
> temperature. Another poster mentioned that we really need temperature of
> the crystal, I agree, but there should be strong correlation between that
> (unknown) temperature and the motherboard or processor chip temperatures.
>

One could stick a LM75 Temp Sensor ( or similar with better resolution )
to the Xtal and connect it to the MotherBoard SM-Bus.
Should not be all that difficult to integrate.
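A sketch of what reading such a sensor might look like on Linux, assuming the driver exposes it through hwmon sysfs as millidegrees Celsius. The path and hwmon index below are hypothetical and depend entirely on the board and driver:

```python
# Sketch: read an SMBus temperature sensor via the Linux hwmon sysfs
# interface. The path is an assumption; the real hwmon index and sensor
# label vary per board and driver.
HWMON_PATH = "/sys/class/hwmon/hwmon0/temp1_input"  # hypothetical path

def parse_millideg(raw):
    """Convert a raw hwmon reading (millidegrees C, as text) to degrees."""
    return int(raw.strip()) / 1000.0

def read_xtal_temperature(path=HWMON_PATH):
    with open(path) as f:
        return parse_millideg(f.read())
```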

uwe

8. ## Re: drift modeling question

Hal Murray wrote:
>
> Most PCs have 2 xtals. One at 14.xxx MHz (cheap, 4X color burst)
> that drives the CPU and most motherboard logic through a magic clock
> generator (PLL) chip, and another that is a 32 KHz watch crystal for
> keeping time when the CPU is off. The latter also makes interrupts
> for the scheduler.

Historically, interrupts from the 32kHz clock have not been used except,
possibly, in powered-down states to initiate a restart from suspend or
hibernate. It is possible that has changed very recently, but they
certainly weren't used before.

>
> I had the temperature probe on the 14 MHz xtal. It didn't work very well.
>
> I was assuming that the system used something like a cycle counter
> (TSC) for timekeeping. That has troubles in multi-CPU systems.

I think the most common way of doing timekeeping at the 1ms and higher
level is to use the counter timer, which is driven from a signal divided
down to approximately 1MHz. TSC is mainly used to interpolate between
ticks or to detect missed ticks.

> The code I was actually running used the timer interrupts for
> timekeeping and the TSC to interpolate between ticks.

but note that the timer interrupts are not based on the 32kHz
oscillator, in the typical system.

9. ## Re: drift modeling question

>Historically interrupts from the 32kHz clock have not been used, except,
>possibly, in powered down states to initiate a restart from suspend or
>hibernate. It is possible that has changed very recently, but they
>certainly weren't used historically.

Well, at least one of us is confused. Or maybe my history starts
long before yours.

I'm pretty sure that some of the systems I've worked on used
the interrupt from the 32 KHz clock chip to drive the scheduler.

Some/many systems have long had troubles keeping time if interrupts
get lost. That wouldn't make sense if something like the TSC were
used for timekeeping.

Here is the message from Dave Mills that got me thinking in the
right direction:
Message-ID: <3CE6FB0C.C1A206@udel.edu>

After that, I moved the temperature probe over to the 32 KHz crystal
and my temperature data looked much cleaner. The 32 KHz crystal is
off in a corner of the board. The main CPU crystal is reasonably close
to the center of the board, where all the heat is generated.

In the last year or two, the Linux timekeeping stuff has changed
a lot, partly in order to support laptops and such that go into
serious power down mode and don't want to waste a lot of battery
on each tick when there is nothing to do.

--
These are my opinions, not necessarily my employer's. I hate spam.

10. ## Re: drift modeling question

Hal Murray wrote:

>
> I'm pretty sure that some of the systems I've worked on used
> the interrupt from the 32 KHz clock chip to drive the scheduler.

From the beginning, "IBM" PCs did not use the 32kHz clock. It is
possible that some other hardware used a 32kHz-derived clock for
powered-up timing, but I'm not aware of anything before Linux recently
started offering options for every possible source. Historically, what
tended to precede the counter-timer clocks running at about a MHz were
mains-frequency clocks, running at 50 or 60 Hz.

>
> Some/many systems have long had troubles keeping time if interrups
> get lost. That wouldn't make sense if something like the TSC was
> used for timekeeping.

TSC is a relatively recent feature. The interrupts come from a counter
timer, which was originally part of a counter-timer chip (Intel 8254-2)
but is now just part of the overall ASIC. It's clocked at 1.190 MHz
(that's what the PC Technical Reference says, but I suspect that if I
traced the circuit I would find it is 1.1931817 MHz). The original
MS-DOS clock rate is a result of counting this for the full 16 bits of
the counter. (Other channels on the 8254, clocked at the same rate,
were used to generate memory refresh timing and speaker beeps.)
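The arithmetic behind the MS-DOS tick rate follows from those numbers, assuming the input clock is the 14.31818 MHz crystal divided by 12:

```python
# The MS-DOS tick rate: the 8254 input clock divided by a full 16-bit
# counter rollover.
PIT_HZ = 14_318_180 / 12      # ~1.1931817 MHz, from the 14.3 MHz xtal
DIVISOR = 65536               # full 16-bit counter rollover

tick_hz = PIT_HZ / DIVISOR    # ~18.2065 interrupts per second
tick_ms = 1000.0 / tick_hz    # ~54.93 ms per tick
```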

11. ## Re: drift modeling question

David Woolley wrote:

> 1.190 MHz (that's what the PC Technical Reference says, but I suspect,

That's actually the AT, not the basic PC. It looks like the AT used a
completely separate clock for the processor clock, whereas I think the
PC used the 14.3... MHz one as the basic source for that. Modern PCs
tend to have only one crystal besides the 32kHz one, the former being
used for everything except the CMOS clock.

12. ## Re: drift modeling question

> Again, read Mark Martinec's web page. He's got a lot of good data.
> http://www.ijs.si/time/temp-compensation/

This is almost exactly what I was thinking. Mark's 7 years ahead of me
(laugh)

Thanks for the post, I appreciate the information.

13. ## Re: drift modeling question

David and Hal,

David Woolley wrote:
> Hal Murray wrote:
>>
>> Most PCs have 2 xtals. One at 14.xxx MHz (cheap, 4X color burst)
>> that drives the CPU and most motherboard logic through a magic clock
>> generator (PLL) chip, and another that is a 32 KHz watch crystal for
>> keeping time when the CPU is off. The latter also makes interrupts
>> for the scheduler.

>
> Historically interrupts from the 32kHz clock have not been used, except,
> possibly, in powered down states to initiate a restart from suspend or
> hibernate. It is possible that has changed very recently, but they
> certainly weren't used historically.

About which operating system(s) are you talking?

The PC's standard RTC chip can certainly generate cyclic interrupts.
However, whether a cyclic interrupt from the RTC or from another timer
chip is used to drive the scheduler depends on the type, and possibly
the version, of the operating system, doesn't it?

So one system may be using the RTC's interrupts and another one may not.

Martin
--
Martin Burnicki

Meinberg Funkuhren
Germany

14. ## Re: drift modeling question

Martin Burnicki writes:

>David and Hal,

>David Woolley wrote:
>> Hal Murray wrote:
>>>
>>> Most PCs have 2 xtals. One at 14.xxx MHz (cheap, 4X color burst)
>>> that drives the CPU and most motherboard logic through a magic clock
>>> generator (PLL) chip, and another that is a 32 KHz watch crystal for
>>> keeping time when the CPU is off. The latter also makes interrupts
>>> for the scheduler.

>>
>> Historically interrupts from the 32kHz clock have not been used, except,
>> possibly, in powered down states to initiate a restart from suspend or
>> hibernate. It is possible that has changed very recently, but they
>> certainly weren't used historically.

>About which operating system(s) are you talking?

>The PC's standard RTC chip can certainly generate cyclic interrupts.
>However, if a cyclic interrupt from the RTC or from another timer chip is
>used to drive the scheduler depends on the type and eventually on the
>version of an operating system, isn't it?

>So one system may be using the RTC's interrupts and another one may not.

So the question is, do you know of any operating systems which use the RTC
to drive the scheduler?

The rtc has been going through a bunch of changes recently, from the
Motorola MC146818 to HPET to more recent rtc chipsets.

Under Linux, support is a mess. For example, if you turn on an
interrupt, the system returns immediately even though the conditions of
the interrupt have not been met. Some bug in the code.

15. ## Re: drift modeling question

Martin Burnicki wrote:
>
> About which operating system(s) are you talking?

For powered up timing, MS-DOS, its predecessors if they implemented a
software clock at all, the MS-DOS Windows (3.0, 3.1, 95, 98, ME, 98SE),
most, if not all of the NT Windows (NT 3.5, NT 4.0, Windows 2000,
Windows XP, probably Windows 2003), Linux from start to 2.4, and mostly
for 2.6, SCO OpenServer.......

(The tick rate for MS-DOS family systems is a good clue to which timer
they use.)
>
> The PC's standard RTC chip can certainly generate cyclic interrupts.
> However, if a cyclic interrupt from the RTC or from another timer chip is

But it generally isn't used for that. ISTR that some early PCs didn't
have an RTC and had to be set when booted.

> used to drive the scheduler depends on the type and eventually on the
> version of an operating system, isn't it?

Divergence into alternative periodic sources on IBM PC type machines is
very recent.

16. ## Re: drift modeling question

David Woolley wrote:
> Martin Burnicki wrote:
>>
>> About which operating system(s) are you talking?

>
> For powered up timing, MS-DOS, its predecessors if they implemented a
> software clock at all, the MS-DOS Windows (3.0, 3.1, 95, 98, ME, 98SE),
> most, if not all of the NT Windows (NT 3.5, NT 4.0, Windows 2000,
> Windows XP, probably Windows 2003), Linux from start to 2.4, and mostly
> for 2.6, SCO OpenServer.......
>
> (The tick rate for MS-DOS family systems is a good clue to which timer
> they use.)
>>
>> The PC's standard RTC chip can certainly generate cyclic interrupts.
>> However, if a cyclic interrupt from the RTC or from another timer chip is

>
> But generally isn't used for that. ISTR that some early PCs didn't have
> an RTC and had to be set when booted.

The original IBM PC and the PC/XT did not have hardware clocks. If you
wanted a clock, you purchased a "multifunction card" and plugged it into
the bus. I believe the PC/AT was the first IBM PC with a native clock. I
have forgotten what else the multifunction card did, but it supplied a
few goodies that were not native.

Clone manufacturers may have beaten IBM to it and installed clocks as
standard first.

17. ## Re: drift modeling question

Unruh wrote:
>
> Under Linux support is a mess. For example if you turn on an interrupt, the
> system returns immediately even though the conditions of the interrupt have
> not been met. Some bug in the code.
>

could you be more specific?

uwe

18. ## Re: drift modeling question

David Woolley writes:

>Martin Burnicki wrote:
>>
>> About which operating system(s) are you talking?

>For powered up timing, MS-DOS, its predecessors if they implemented a
>software clock at all, the MS-DOS Windows (3.0, 3.1, 95, 98, ME, 98SE),
>most, if not all of the NT Windows (NT 3.5, NT 4.0, Windows 2000,
>Windows XP, probably Windows 2003), Linux from start to 2.4, and mostly
>for 2.6, SCO OpenServer.......

>(The tick rate for MS-DOS family systems is a good clue to which timer
>they use.)
>>
>> The PC's standard RTC chip can certainly generate cyclic interrupts.
>> However, if a cyclic interrupt from the RTC or from another timer chip is

>But generally isn't used for that. ISTR that some early PCs didn't have
>an RTC and had to be set when booted.

>> used to drive the scheduler depends on the type and eventually on the
>> version of an operating system, isn't it?

>Divergence into alternative periodic sources on IBM PC type machines is
>very recent.

There is the rtc, and then there is the timer chip on the PC board. I
thought they were very different. The RTC is the on-board real-time
clock powered by the battery on the cmos. The timer interrupts are
derived from a divisor of the main bus clock that drives the computer, I
believe. I thought we were talking about the rtc, not the bus clock
(int 8, not int 0).

19. ## Re: drift modeling question

Unruh wrote:

>
> There is the rtc, and then there is the timer chip on the PC board. I
> thought that they were very different. The RTC is the on board real time
> clock powered by a battery on the cmos. The timer interrups are a divisor
> of the main bus clock that drives the computer bus I believe. I thought we
> were talking about the rtc, not the bus clock. ( int 8 not int 0)
>

That's what I was saying. Martin was suggesting that the RTC was being
used, and that alternative sources were more common than they are in
reality. The RTC runs off the 32kHz crystal.

Note that on the PC/AT, for which I have detailed circuit diagrams, the
main bus clock and the timer clock are distinct. I think that is because
the original clock was right for the PC but too slow for the AT. They
have merged again, because the crystal frequency is now multiplied to
get the main bus clock.

HPET timers are recent, and generally use the non-32kHz crystal, i.e.
they have the same temperature dependence as the counter timer interrupt
and TSC, because they are derived from the same oscillator.

20. ## Re: drift modeling question

Uwe Klein writes:

>Unruh wrote:
>>
>> Under Linux support is a mess. For example if you turn on an interrupt, the
>> system returns immediately even though the conditions of the interrupt have
>> not been met. Some bug in the code.
>>

>could you be more specific?

Sure:
bugzilla.kernel.org bug 11112.
On more recent systems (e.g. systems with HPET) the kernel code was not
cleaning up old interrupt information when UIE (the update interrupt,
which tells the rtc to issue an interrupt once per second just as the
seconds marker turns over in the rtc) was switched on. This meant that
the first interrupt was returned immediately. The usual way of reading
the rtc for accurate work is to switch on UIE and then do a select() or
a read() on the /dev/rtc fd, waiting for the update. With this bug,
however, the first select/read after UIE was turned on returned
immediately rather than when the rtc updated. I.e., the time could be
out by up to a second. If you do not care whether the rtc is accurate,
that is ok. But if you want to set or read the rtc accurately, it is
terrible: here you discipline your computer to within 1 microsecond, and
you cannot discipline your rtc to better than a second.

I discovered this about 2 years ago while trying to understand chrony,
but at the time I thought it was just some weirdness with the rtc that I
did not understand. Last week, when a new computer could not run chrony
at all, I looked into it further and discovered the above bug in the rtc
drivers (the hpet and also the new rtc-cmos drivers). (There was also a
further bug in the hpet glue and the Mandriva kernel setup, which was
why chrony was not working, but that is another story.)

Note that there are now kernel fixes for the bug, which will probably
make it into the 2.6.28 kernel. In the meantime, if you want to read the
rtc accurately, do a select or read twice; the second one will be at the
update boundary (well, on an hpet system even that is not clear). It
seems that the rtc situation is a bit of a mess these days.
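That double select/read workaround can be sketched generically. This is an illustrative helper, not the actual chrony code; opening /dev/rtc and enabling the update interrupt (the RTC_UIE_ON ioctl) are left out.

```python
# Sketch of the workaround: after enabling UIE, the first select()/read()
# on /dev/rtc may return immediately because of the driver bug, so
# discard the first event and trust only the second, which lands on a
# real update boundary.
import os
import select

def wait_second_event(fd, timeout=3.0):
    """Wait for two readable events on fd, consuming each; only the
    second is the trustworthy update. Returns True if both arrived."""
    for _ in range(2):
        ready, _, _ = select.select([fd], [], [], timeout)
        if not ready:
            return False     # timed out waiting for an update
        os.read(fd, 4)       # /dev/rtc reads return a 4-byte irq record
    return True
```

With /dev/rtc you would open the device, enable the update interrupt, then call this helper before trusting the timestamp.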