[PATCH 0/5] time/ntp changes - Kernel

This is a discussion on [PATCH 0/5] time/ntp changes - Kernel ; I've re-based my timekeeping patch queue ontop of Roman's recent NTP and div64 patches. I've been a little short on time, so I've not been able to get them all tested (ntp testing takes a number of hours), but they ...

+ Reply to Thread
Results 1 to 19 of 19

Thread: [PATCH 0/5] time/ntp changes

  1. [PATCH 0/5] time/ntp changes

    I've re-based my timekeeping patch queue ontop of Roman's recent NTP and
    div64 patches. I've been a little short on time, so I've not been able
    to get them all tested (ntp testing takes a number of hours), but they
    do build and I wanted to get some feedback on the re-base.

    My apologies to Ingo for sitting on his patches for this long.

    I'll get a test setup going over the weekend to make sure all is well.

    The total patch queue is large, so while folks probably won't try it
    themselves, here's the full series I'm using to build:

    #start w/ linus' -git

    #roman's udiv patches
    01-udiv.patchd
    02-convert.patch
    03-remove.patch

    #roman's ntp changes
    1-cleanup.patch
    2-ntp4.patch
    3-timefreq.patch
    4-timeoffset.patch
    5-tai.patch
    6-scaleshift.patch
    7-ticklength.patch
    8-leap.patch

    #my timekeeping changes
    mult-adj-split.patch
    monotonic-raw

    #ntp cleanups from Ingo and me
    ntp-cleanup-ingo1.patch
    ntp-cleanup-ingo2.patch
    ntp_cleanup-john.patch


    --
    To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
    the body of a message to majordomo@vger.kernel.org
    More majordomo info at http://vger.kernel.org/majordomo-info.html
    Please read the FAQ at http://www.tux.org/lkml/

  2. [PATCH 1/5] split clocksource adjustment from clockosurce mult

    The clocksource frequency is represented by
    clocksource->mult/2^(clocksource->shift). Currently, when NTP makes
    adjustments to the clock frequency, they are made directly to the mult
    value.

    This has the drawback that once changed, we cannot know what the orignal
    mult value was, or how much adjustment has been applied.

    This property causes problems in calculating proper ntp intervals when
    switching back and forth between clocksources.

    This patch separates the current mult value into a mult and mult_adj
    pair. The mult value stays constant, while the ntp clocksource
    adjustments are done only to the mult_adj value.

    This allows for correct ntp interval calculation and additionally lays
    the groundwork for a new notion of time, what I'm calling the
    monotonic-raw time, which is introduced in a following patch.

    Thoughts or comments would be appreciated.

    thanks
    -john

    Signed-off-by: John Stultz

    diff --git a/arch/ia64/kernel/time.c b/arch/ia64/kernel/time.c
    index 17fda52..37851e1 100644
    --- a/arch/ia64/kernel/time.c
    +++ b/arch/ia64/kernel/time.c
    @@ -357,7 +357,7 @@ void update_vsyscall(struct timespec *wall, struct clocksource *c)

    /* copy fsyscall clock data */
    fsyscall_gtod_data.clk_mask = c->mask;
    - fsyscall_gtod_data.clk_mult = c->mult;
    + fsyscall_gtod_data.clk_mult = c->mult + c->mult_adj;
    fsyscall_gtod_data.clk_shift = c->shift;
    fsyscall_gtod_data.clk_fsys_mmio = c->fsys_mmio;
    fsyscall_gtod_data.clk_cycle_last = c->cycle_last;
    diff --git a/arch/powerpc/kernel/time.c b/arch/powerpc/kernel/time.c
    index 3b26fbd..36aa8da 100644
    --- a/arch/powerpc/kernel/time.c
    +++ b/arch/powerpc/kernel/time.c
    @@ -814,7 +814,7 @@ void update_vsyscall(struct timespec *wall_time, struct clocksource *clock)

    /* XXX this assumes clock->shift == 22 */
    /* 4611686018 ~= 2^(20+64-22) / 1e9 */
    - t2x = (u64) clock->mult * 4611686018ULL;
    + t2x = (u64) (clock->mult + clock->mult_adj) * 4611686018ULL;
    stamp_xsec = (u64) xtime.tv_nsec * XSEC_PER_SEC;
    do_div(stamp_xsec, 1000000000);
    stamp_xsec += (u64) xtime.tv_sec * XSEC_PER_SEC;
    diff --git a/arch/x86/kernel/vsyscall_64.c b/arch/x86/kernel/vsyscall_64.c
    index 3f82427..14afd48 100644
    --- a/arch/x86/kernel/vsyscall_64.c
    +++ b/arch/x86/kernel/vsyscall_64.c
    @@ -83,7 +83,7 @@ void update_vsyscall(struct timespec *wall_time, struct clocksource *clock)
    vsyscall_gtod_data.clock.vread = clock->vread;
    vsyscall_gtod_data.clock.cycle_last = clock->cycle_last;
    vsyscall_gtod_data.clock.mask = clock->mask;
    - vsyscall_gtod_data.clock.mult = clock->mult;
    + vsyscall_gtod_data.clock.mult = clock->mult + clock->mult_adj;
    vsyscall_gtod_data.clock.shift = clock->shift;
    vsyscall_gtod_data.wall_time_sec = wall_time->tv_sec;
    vsyscall_gtod_data.wall_time_nsec = wall_time->tv_nsec;
    diff --git a/include/linux/clocksource.h b/include/linux/clocksource.h
    index 85778a4..e917a30 100644
    --- a/include/linux/clocksource.h
    +++ b/include/linux/clocksource.h
    @@ -45,7 +45,8 @@ struct clocksource;
    * @read: returns a cycle value
    * @mask: bitmask for two's complement
    * subtraction of non 64 bit counters
    - * @mult: cycle to nanosecond multiplier
    + * @mult: cycle to nanosecond multiplier (unadjusted)
    + * @mult_adj: NTP adjustment factor added to the multiplier
    * @shift: cycle to nanosecond divisor (power of two)
    * @flags: flags describing special properties
    * @vread: vsyscall based read
    @@ -63,6 +64,7 @@ struct clocksource {
    cycle_t (*read)(void);
    cycle_t mask;
    u32 mult;
    + s32 mult_adj;
    u32 shift;
    unsigned long flags;
    cycle_t (*vread)(void);
    @@ -179,7 +181,7 @@ static inline cycle_t clocksource_read(struct clocksource *cs)
    static inline s64 cyc2ns(struct clocksource *cs, cycle_t cycles)
    {
    u64 ret = (u64)cycles;
    - ret = (ret * cs->mult) >> cs->shift;
    + ret = (ret * (cs->mult + cs->mult_adj)) >> cs->shift;
    return ret;
    }

    @@ -199,7 +201,7 @@ static inline void clocksource_calculate_interval(struct clocksource *c,
    {
    u64 tmp;

    - /* XXX - All of this could use a whole lot of optimization */
    + /* Do the ns -> cycle conversion first, ignoring mult_adj */
    tmp = length_nsec;
    tmp <<= c->shift;
    tmp += c->mult/2;
    @@ -209,7 +211,8 @@ static inline void clocksource_calculate_interval(struct clocksource *c,
    if (c->cycle_interval == 0)
    c->cycle_interval = 1;

    - c->xtime_interval = (u64)c->cycle_interval * c->mult;
    + /* Go back from cycles -> shifted ns, this time include mult_adj */
    + c->xtime_interval = (u64)c->cycle_interval * (c->mult + c->mult_adj);
    }


    diff --git a/kernel/time/timekeeping.c b/kernel/time/timekeeping.c
    index 1af9fb0..5b11b0f 100644
    --- a/kernel/time/timekeeping.c
    +++ b/kernel/time/timekeeping.c
    @@ -426,7 +426,7 @@ static void clocksource_adjust(s64 offset)
    } else
    return;

    - clock->mult += adj;
    + clock->mult_adj += adj;
    clock->xtime_interval += interval;
    clock->xtime_nsec -= offset;
    clock->error -= (interval - offset) <<


    --
    To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
    the body of a message to majordomo@vger.kernel.org
    More majordomo info at http://vger.kernel.org/majordomo-info.html
    Please read the FAQ at http://www.tux.org/lkml/

  3. [PATCH 4/5] ntp.c code flow clenaups (from Ingo)

    From: Ingo Molnar
    Subject: [patch] ntp: clean up code flow

    clean up the flow of code by splitting the way too large and way too
    nested do_adjtimex() function.

    no change in the logic.

    Signed-off-by: Ingo Molnar

    Merged on top of Roman's very similar cleanups.
    (In other words: Blame John for the merge screwups).

    Signed-off-by: John Stultz

    Index: linux-2.6/kernel/time/ntp.c
    ================================================== =================
    --- linux-2.6.orig/kernel/time/ntp.c 2008-03-14 20:10:41.000000000 -0700
    +++ linux-2.6/kernel/time/ntp.c 2008-03-14 20:43:48.000000000 -0700
    @@ -275,6 +275,87 @@ static void notify_cmos_timer(void)
    static inline void notify_cmos_timer(void) { }
    #endif

    +static inline void process_adjtimex_adj_offset(struct timex *txc, long sec)
    +{
    + if (txc->modes & ADJ_STATUS) {
    + if ((time_status & STA_PLL) && !(txc->status & STA_PLL)) {
    + time_state = TIME_OK;
    + time_status = STA_UNSYNC;
    + }
    + /* only set allowed bits */
    + time_status &= STA_RONLY;
    + time_status |= txc->status & ~STA_RONLY;
    +
    + switch (time_state) {
    + case TIME_OK:
    + start_timer:
    + if (time_status & STA_INS) {
    + time_state = TIME_INS;
    + sec += 86400 - sec % 86400;
    + hrtimer_start(&leap_timer, ktime_set(sec, 0), HRTIMER_MODE_ABS);
    + } else if (time_status & STA_DEL) {
    + time_state = TIME_DEL;
    + sec += 86400 - (sec + 1) % 86400;
    + hrtimer_start(&leap_timer, ktime_set(sec, 0), HRTIMER_MODE_ABS);
    + }
    + break;
    + case TIME_INS:
    + case TIME_DEL:
    + time_state = TIME_OK;
    + goto start_timer;
    + break;
    + case TIME_WAIT:
    + if (!(time_status & (STA_INS | STA_DEL)))
    + time_state = TIME_OK;
    + break;
    + case TIME_OOP:
    + hrtimer_restart(&leap_timer);
    + break;
    + }
    + }
    +
    + if (txc->modes & ADJ_NANO)
    + time_status |= STA_NANO;
    + if (txc->modes & ADJ_MICRO)
    + time_status &= ~STA_NANO;
    +
    + if (txc->modes & ADJ_FREQUENCY) {
    + time_freq = (s64)txc->freq * PPM_SCALE;
    + time_freq = min(time_freq, MAXFREQ_SCALED);
    + time_freq = max(time_freq, -MAXFREQ_SCALED);
    + }
    +
    + if (txc->modes & ADJ_MAXERROR)
    + time_maxerror = txc->maxerror;
    + if (txc->modes & ADJ_ESTERROR)
    + time_esterror = txc->esterror;
    +
    + if (txc->modes & ADJ_TIMECONST) {
    + time_constant = txc->constant;
    + if (!(time_status & STA_NANO))
    + time_constant += 4;
    + time_constant = min(time_constant, (long)MAXTC);
    + time_constant = max(time_constant, 0l);
    + }
    +
    + if (txc->modes & ADJ_TAI && txc->constant > 0)
    + time_tai = txc->constant;
    +
    + if (txc->modes & ADJ_OFFSET) {
    + if (txc->modes == ADJ_OFFSET_SINGLESHOT)
    + /* adjtime() is independent from ntp_adjtime() */
    + time_adjust = txc->offset;
    + else
    + ntp_update_offset(txc->offset);
    + }
    +
    + if (txc->modes & ADJ_TICK)
    + tick_usec = txc->tick;
    +
    + if (txc->modes & (ADJ_TICK|ADJ_FREQUENCY|ADJ_OFFSET))
    + ntp_update_frequency();
    +}
    +
    /*
    * adjtimex mainly allows reading (and writing, if superuser) of
    * kernel time-keeping variables. used by xntpd.
    @@ -282,7 +363,7 @@ static inline void notify_cmos_timer(voi
    int do_adjtimex(struct timex *txc)
    {
    struct timespec ts;
    - long save_adjust, sec;
    + long save_adjust;
    int result;

    /* In order to modify anything, you gotta be super-user! */
    @@ -306,6 +387,7 @@ int do_adjtimex(struct timex *txc)

    if (time_state != TIME_OK && txc->modes & ADJ_STATUS)
    hrtimer_cancel(&leap_timer);
    +
    getnstimeofday(&ts);

    write_seqlock_irq(&xtime_lock);
    @@ -314,121 +396,42 @@ int do_adjtimex(struct timex *txc)
    save_adjust = time_adjust;

    /* If there are input parameters, then process them */
    - if (txc->modes) {
    - if (txc->modes & ADJ_STATUS) {
    - if ((time_status & STA_PLL) &&
    - !(txc->status & STA_PLL)) {
    - time_state = TIME_OK;
    - time_status = STA_UNSYNC;
    - }
    - /* only set allowed bits */
    - time_status &= STA_RONLY;
    - time_status |= txc->status & ~STA_RONLY;
    -
    - switch (time_state) {
    - case TIME_OK:
    - start_timer:
    - sec = ts.tv_sec;
    - if (time_status & STA_INS) {
    - time_state = TIME_INS;
    - sec += 86400 - sec % 86400;
    - hrtimer_start(&leap_timer, ktime_set(sec, 0), HRTIMER_MODE_ABS);
    - } else if (time_status & STA_DEL) {
    - time_state = TIME_DEL;
    - sec += 86400 - (sec + 1) % 86400;
    - hrtimer_start(&leap_timer, ktime_set(sec, 0), HRTIMER_MODE_ABS);
    - }
    - break;
    - case TIME_INS:
    - case TIME_DEL:
    - time_state = TIME_OK;
    - goto start_timer;
    - break;
    - case TIME_WAIT:
    - if (!(time_status & (STA_INS | STA_DEL)))
    - time_state = TIME_OK;
    - break;
    - case TIME_OOP:
    - hrtimer_restart(&leap_timer);
    - break;
    - }
    - }
    -
    - if (txc->modes & ADJ_NANO)
    - time_status |= STA_NANO;
    - if (txc->modes & ADJ_MICRO)
    - time_status &= ~STA_NANO;
    -
    - if (txc->modes & ADJ_FREQUENCY) {
    - time_freq = (s64)txc->freq * PPM_SCALE;
    - time_freq = min(time_freq, MAXFREQ_SCALED);
    - time_freq = max(time_freq, -MAXFREQ_SCALED);
    - }
    -
    - if (txc->modes & ADJ_MAXERROR)
    - time_maxerror = txc->maxerror;
    - if (txc->modes & ADJ_ESTERROR)
    - time_esterror = txc->esterror;
    -
    - if (txc->modes & ADJ_TIMECONST) {
    - time_constant = txc->constant;
    - if (!(time_status & STA_NANO))
    - time_constant += 4;
    - time_constant = min(time_constant, (long)MAXTC);
    - time_constant = max(time_constant, 0l);
    - }
    -
    - if (txc->modes & ADJ_TAI && txc->constant > 0)
    - time_tai = txc->constant;
    -
    - if (txc->modes & ADJ_OFFSET) {
    - if (txc->modes == ADJ_OFFSET_SINGLESHOT)
    - /* adjtime() is independent from ntp_adjtime() */
    - time_adjust = txc->offset;
    - else
    - ntp_update_offset(txc->offset);
    - }
    - if (txc->modes & ADJ_TICK)
    - tick_usec = txc->tick;
    -
    - if (txc->modes & (ADJ_TICK|ADJ_FREQUENCY|ADJ_OFFSET))
    - ntp_update_frequency();
    - }
    + if (txc->modes)
    + process_adjtimex_adj_offset(txc, ts.tv_sec);

    result = time_state; /* mostly `TIME_OK' */
    if (time_status & (STA_UNSYNC|STA_CLOCKERR))
    result = TIME_ERROR;

    if ((txc->modes == ADJ_OFFSET_SINGLESHOT) ||
    - (txc->modes == ADJ_OFFSET_SS_READ))
    + (txc->modes == ADJ_OFFSET_SS_READ)) {
    txc->offset = save_adjust;
    - else {
    + } else {
    txc->offset = shift_right(time_offset * NTP_INTERVAL_FREQ,
    NTP_SCALE_SHIFT);
    if (!(time_status & STA_NANO))
    txc->offset /= NSEC_PER_USEC;
    }
    - txc->freq = shift_right((s32)(time_freq >> PPM_SCALE_INV_SHIFT) *
    - (s64)PPM_SCALE_INV,
    - NTP_SCALE_SHIFT);
    - txc->maxerror = time_maxerror;
    - txc->esterror = time_esterror;
    - txc->status = time_status;
    - txc->constant = time_constant;
    - txc->precision = 1;
    - txc->tolerance = MAXFREQ_SCALED / PPM_SCALE;
    - txc->tick = tick_usec;
    - txc->tai = time_tai;
    + txc->freq = shift_right((s32)(time_freq >> PPM_SCALE_INV_SHIFT) *
    + (s64)PPM_SCALE_INV, NTP_SCALE_SHIFT);
    + txc->maxerror = time_maxerror;
    + txc->esterror = time_esterror;
    + txc->status = time_status;
    + txc->constant = time_constant;
    + txc->precision = 1;
    + txc->tolerance = MAXFREQ_SCALED / PPM_SCALE;
    + txc->tick = tick_usec;
    + txc->tai = time_tai;

    /* PPS is not implemented, so these are zero */
    - txc->ppsfreq = 0;
    - txc->jitter = 0;
    - txc->shift = 0;
    - txc->stabil = 0;
    - txc->jitcnt = 0;
    - txc->calcnt = 0;
    - txc->errcnt = 0;
    - txc->stbcnt = 0;
    + txc->ppsfreq = 0;
    + txc->jitter = 0;
    + txc->shift = 0;
    + txc->stabil = 0;
    + txc->jitcnt = 0;
    + txc->calcnt = 0;
    + txc->errcnt = 0;
    + txc->stbcnt = 0;
    write_sequnlock_irq(&xtime_lock);

    txc->time.tv_sec = ts.tv_sec;


    --
    To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
    the body of a message to majordomo@vger.kernel.org
    More majordomo info at http://vger.kernel.org/majordomo-info.html
    Please read the FAQ at http://www.tux.org/lkml/

  4. [PATCH 3/5] cleanups ntp.c (from Ingo)

    Make this file more readable by applying a consistent coding style.

    No code changed.

    Signed-off-by: Ingo Molnar

    Merged on top of Roman's very similar cleanups.
    (In other words: Blame John for the merge screwups).

    Signed-off-by: John Stultz


    Index: linux-2.6/kernel/time/ntp.c
    ================================================== =================
    --- linux-2.6.orig/kernel/time/ntp.c 2008-03-14 19:21:02.000000000 -0700
    +++ linux-2.6/kernel/time/ntp.c 2008-03-14 20:10:41.000000000 -0700
    @@ -1,13 +1,10 @@
    /*
    - * linux/kernel/time/ntp.c
    - *
    * NTP state machine interfaces and logic.
    *
    * This code was mainly moved from kernel/timer.c and kernel/time.c
    * Please see those files for relevant copyright info and historical
    * changelogs.
    */
    -
    #include
    #include
    #include
    @@ -22,32 +19,35 @@
    /*
    * Timekeeping variables
    */
    -unsigned long tick_usec = TICK_USEC; /* USER_HZ period (usec) */
    -unsigned long tick_nsec; /* ACTHZ period (nsec) */
    -u64 tick_length;
    -static u64 tick_length_base;
    -
    -static struct hrtimer leap_timer;
    -
    -#define MAX_TICKADJ 500 /* microsecs */
    -#define MAX_TICKADJ_SCALED (((u64)(MAX_TICKADJ * NSEC_PER_USEC) << \
    - NTP_SCALE_SHIFT) / NTP_INTERVAL_FREQ)
    +unsigned long tick_usec = TICK_USEC; /* USER_HZ period (usec) */
    +unsigned long tick_nsec; /* ACTHZ period (nsec) */
    +u64 tick_length;
    +static u64 tick_length_base;
    +static const long max_tickadj = 500; /* (usec) */
    +
    +static struct hrtimer leap_timer;

    /*
    * phase-lock loop variables
    */
    /* TIME_ERROR prevents overwriting the CMOS clock */
    -static int time_state = TIME_OK; /* clock synchronization status */
    -int time_status = STA_UNSYNC; /* clock status bits */
    -static long time_tai; /* TAI offset (s) */
    -static s64 time_offset; /* time adjustment (ns) */
    -static long time_constant = 2; /* pll time constant */
    -long time_maxerror = NTP_PHASE_LIMIT; /* maximum error (us) */
    -long time_esterror = NTP_PHASE_LIMIT; /* estimated error (us) */
    -static s64 time_freq; /* frequency offset (scaled ns/s)*/
    -static long time_reftime; /* time at last adjustment (s) */
    -long time_adjust;
    -static long ntp_tick_adj;
    +static int time_state = TIME_OK; /* clock synch status */
    +int time_status = STA_UNSYNC; /* clock status bits */
    +static long time_tai; /* TAI offset (s) */
    +static s64 time_offset; /* time adjustment (ns) */
    +static long time_constant = 2; /* pll time constant */
    +long time_maxerror = NTP_PHASE_LIMIT;/* maximum error (us) */
    +long time_esterror = NTP_PHASE_LIMIT;/* estimated error (us) */
    +static s64 time_freq; /* frequency offset (scaled ns/s)*/
    +static long time_reftime; /* time at last adjustment (s) */
    +long time_adjust;
    +static long ntp_tick_adj;
    +
    +
    +static inline u64 scale_tickadj(u64 adj)
    +{
    + return ((adj * NSEC_PER_USEC) << NTP_SCALE_SHIFT) / NTP_INTERVAL_FREQ;
    +}

    static void ntp_update_frequency(void)
    {
    @@ -111,15 +111,15 @@ static void ntp_update_offset(long offse
    */
    void ntp_clear(void)
    {
    - time_adjust = 0; /* stop active adjtime() */
    - time_status |= STA_UNSYNC;
    - time_maxerror = NTP_PHASE_LIMIT;
    - time_esterror = NTP_PHASE_LIMIT;
    + time_adjust = 0; /* stop active adjtime() */
    + time_status |= STA_UNSYNC;
    + time_maxerror = NTP_PHASE_LIMIT;
    + time_esterror = NTP_PHASE_LIMIT;

    ntp_update_frequency();

    - tick_length = tick_length_base;
    - time_offset = 0;
    + tick_length = tick_length_base;
    + time_offset = 0;
    }

    /*
    @@ -136,28 +136,32 @@ static enum hrtimer_restart ntp_leap_sec
    switch (time_state) {
    case TIME_OK:
    break;
    +
    case TIME_INS:
    xtime.tv_sec--;
    wall_to_monotonic.tv_sec++;
    time_state = TIME_OOP;
    - printk(KERN_NOTICE "Clock: "
    - "inserting leap second 23:59:60 UTC\n");
    + printk(KERN_NOTICE
    + "Clock: inserting leap second 23:59:60 UTC\n");
    leap_timer.expires = ktime_add_ns(leap_timer.expires,
    NSEC_PER_SEC);
    res = HRTIMER_RESTART;
    break;
    +
    case TIME_DEL:
    xtime.tv_sec++;
    time_tai--;
    wall_to_monotonic.tv_sec--;
    time_state = TIME_WAIT;
    - printk(KERN_NOTICE "Clock: "
    - "deleting leap second 23:59:59 UTC\n");
    + printk(KERN_NOTICE
    + "Clock: deleting leap second 23:59:59 UTC\n");
    break;
    +
    case TIME_OOP:
    time_tai++;
    time_state = TIME_WAIT;
    /* fall through */
    +
    case TIME_WAIT:
    if (!(time_status & (STA_INS | STA_DEL)))
    time_state = TIME_OK;
    @@ -193,21 +197,20 @@ void second_overflow(void)
    * Compute the phase adjustment for the next second. The offset is
    * reduced by a fixed factor times the time constant.
    */
    - tick_length = tick_length_base;
    - time_adj = shift_right(time_offset, SHIFT_PLL + time_constant);
    - time_offset -= time_adj;
    - tick_length += time_adj;
    + tick_length = tick_length_base;
    + time_adj = shift_right(time_offset, SHIFT_PLL + time_constant);
    + time_offset -= time_adj;
    + tick_length += time_adj;

    if (unlikely(time_adjust)) {
    - if (time_adjust > MAX_TICKADJ) {
    - time_adjust -= MAX_TICKADJ;
    - tick_length += MAX_TICKADJ_SCALED;
    - } else if (time_adjust < -MAX_TICKADJ) {
    - time_adjust += MAX_TICKADJ;
    - tick_length -= MAX_TICKADJ_SCALED;
    + if (time_adjust > max_tickadj) {
    + time_adjust -= max_tickadj;
    + tick_length += scale_tickadj(max_tickadj);
    + } else if (time_adjust < -max_tickadj) {
    + time_adjust += max_tickadj;
    + tick_length -= scale_tickadj(max_tickadj);
    } else {
    - tick_length += (s64)(time_adjust * NSEC_PER_USEC /
    - NTP_INTERVAL_FREQ) << NTP_SCALE_SHIFT;
    + tick_length += scale_tickadj(time_adjust);
    time_adjust = 0;
    }
    }
    @@ -216,13 +219,13 @@ void second_overflow(void)
    #ifdef CONFIG_GENERIC_CMOS_UPDATE

    /* Disable the cmos update - used by virtualization and embedded */
    -int no_sync_cmos_clock __read_mostly;
    +int no_sync_cmos_clock __read_mostly;

    -static void sync_cmos_clock(unsigned long dummy);
    +static void sync_cmos_clock(unsigned long unused);

    static DEFINE_TIMER(sync_cmos_timer, sync_cmos_clock, 0, 0);

    -static void sync_cmos_clock(unsigned long dummy)
    +static void sync_cmos_clock(unsigned long unused)
    {
    struct timespec now, next;
    int fail = 1;
    @@ -234,12 +237,13 @@ static void sync_cmos_clock(unsigned lon
    * This code is run on a timer. If the clock is set, that timer
    * may not expire at the correct time. Thus, we adjust...
    */
    - if (!ntp_synced())
    + if (!ntp_synced()) {
    /*
    * Not synced, exit, do not restart a timer (if one is
    * running, let it run out).
    */
    return;
    + }

    getnstimeofday(&now);
    if (abs(now.tv_nsec - (NSEC_PER_SEC / 2)) <= tick_nsec / 2)
    @@ -271,7 +275,8 @@ static void notify_cmos_timer(void)
    static inline void notify_cmos_timer(void) { }
    #endif

    -/* adjtimex mainly allows reading (and writing, if superuser) of
    +/*
    + * adjtimex mainly allows reading (and writing, if superuser) of
    * kernel time-keeping variables. used by xntpd.
    */
    int do_adjtimex(struct timex *txc)
    @@ -293,10 +298,11 @@ int do_adjtimex(struct timex *txc)
    }

    /* if the quartz is off by more than 10% something is VERY wrong ! */
    - if (txc->modes & ADJ_TICK)
    + if (txc->modes & ADJ_TICK) {
    if (txc->tick < 900000/USER_HZ ||
    txc->tick > 1100000/USER_HZ)
    return -EINVAL;
    + }

    if (time_state != TIME_OK && txc->modes & ADJ_STATUS)
    hrtimer_cancel(&leap_timer);


    --
    To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
    the body of a message to majordomo@vger.kernel.org
    More majordomo info at http://vger.kernel.org/majordomo-info.html
    Please read the FAQ at http://www.tux.org/lkml/

  5. [PATCH 5/5] make more ntp values static


    Make time_status, time_maxerror and time_esterror static to ntp.c

    Signed-off-by: John Stultz


    Index: linux-2.6/arch/cris/arch-v32/kernel/time.c
    ================================================== =================
    --- linux-2.6.orig/arch/cris/arch-v32/kernel/time.c 2008-02-13 12:07:00.000000000 -0800
    +++ linux-2.6/arch/cris/arch-v32/kernel/time.c 2008-03-14 20:13:24.000000000 -0700
    @@ -248,8 +248,7 @@ timer_interrupt(int irq, void *dev_id)
    * The division here is not time critical since it will run once in
    * 11 minutes
    */
    - if ((time_status & STA_UNSYNC) == 0 &&
    - xtime.tv_sec > last_rtc_update + 660 &&
    + if (ntp_synced() && xtime.tv_sec > last_rtc_update + 660 &&
    (xtime.tv_nsec / 1000) >= 500000 - (tick_nsec / 1000) / 2 &&
    (xtime.tv_nsec / 1000) <= 500000 + (tick_nsec / 1000) / 2) {
    if (set_rtc_mmss(xtime.tv_sec) == 0)
    Index: linux-2.6/arch/mn10300/kernel/rtc.c
    ================================================== =================
    --- linux-2.6.orig/arch/mn10300/kernel/rtc.c 2008-02-13 12:07:00.000000000 -0800
    +++ linux-2.6/arch/mn10300/kernel/rtc.c 2008-03-14 20:13:24.000000000 -0700
    @@ -117,8 +117,7 @@ void check_rtc_time(void)
    * RTC clock accordingly every ~11 minutes. set_rtc_mmss() has to be
    * called as close as possible to 500 ms before the new second starts.
    */
    - if ((time_status & STA_UNSYNC) == 0 &&
    - xtime.tv_sec > last_rtc_update + 660 &&
    + if (ntp_synced() && xtime.tv_sec > last_rtc_update + 660 &&
    xtime.tv_nsec / 1000 >= 500000 - ((unsigned) TICK_SIZE) / 2 &&
    xtime.tv_nsec / 1000 <= 500000 + ((unsigned) TICK_SIZE) / 2
    ) {
    Index: linux-2.6/include/linux/timex.h
    ================================================== =================
    --- linux-2.6.orig/include/linux/timex.h 2008-03-14 19:21:02.000000000 -0700
    +++ linux-2.6/include/linux/timex.h 2008-03-14 20:18:44.000000000 -0700
    @@ -206,23 +206,8 @@ extern int tickadj; /* amount of adjus
    /*
    * phase-lock loop variables
    */
    -extern int time_status; /* clock synchronization status bits */
    -extern long time_maxerror; /* maximum error */
    -extern long time_esterror; /* estimated error */
    -
    extern long time_adjust; /* The amount of adjtime left */

    -extern void ntp_init(void);
    -extern void ntp_clear(void);
    -
    -/**
    - * ntp_synced - Returns 1 if the NTP status is not UNSYNC
    - *
    - */
    -static inline int ntp_synced(void)
    -{
    - return !(time_status & STA_UNSYNC);
    -}

    /* Required to safely shift negative values */
    #define shift_right(x, s) ({ \
    @@ -243,6 +228,9 @@ static inline int ntp_synced(void)
    /* Returns how long ticks are at present, in ns / 2^NTP_SCALE_SHIFT. */
    extern u64 tick_length;

    +extern void ntp_init(void);
    +extern int ntp_synced(void);
    +extern void ntp_clear(void);
    extern void second_overflow(void);
    extern void update_ntp_one_tick(void);
    extern int do_adjtimex(struct timex *);
    Index: linux-2.6/kernel/time/ntp.c
    ================================================== =================
    --- linux-2.6.orig/kernel/time/ntp.c 2008-03-14 20:13:06.000000000 -0700
    +++ linux-2.6/kernel/time/ntp.c 2008-03-14 20:15:27.000000000 -0700
    @@ -32,13 +32,13 @@ static struct hrtimer leap_timer;
    */
    /* TIME_ERROR prevents overwriting the CMOS clock */
    static int time_state = TIME_OK; /* clock synch status */
    -int time_status = STA_UNSYNC; /* clock status bits */
    +static int time_status = STA_UNSYNC; /* clock status bits */
    static long time_tai; /* TAI offset (s) */
    static s64 time_offset; /* time adjustment (ns) */
    static long time_constant = 2; /* pll time constant */
    -long time_maxerror = NTP_PHASE_LIMIT;/* maximum error (us) */
    -long time_esterror = NTP_PHASE_LIMIT;/* estimated error (us) */
    -static s64 time_freq; /* frequency offset (scaled ns/s)*/
    +static long time_maxerror = NTP_PHASE_LIMIT;/* maximum error (us) */
    +static long time_esterror = NTP_PHASE_LIMIT;/* estimated error (us) */
    +static s64 time_freq; /* frequency offset (scaled ns/s)*/
    static long time_reftime; /* time at last adjustment (s) */
    long time_adjust;
    static long ntp_tick_adj;
    @@ -174,6 +174,15 @@ static enum hrtimer_restart ntp_leap_sec
    return res;
    }

    +/**
    + * ntp_synced - Returns 1 if the NTP status is not UNSYNC
    + *
    + */
    +int ntp_synced(void)
    +{
    + return !(time_status & STA_UNSYNC);
    +}
    +
    /*
    * this routine handles the overflow of the microsecond field
    *


    --
    To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
    the body of a message to majordomo@vger.kernel.org
    More majordomo info at http://vger.kernel.org/majordomo-info.html
    Please read the FAQ at http://www.tux.org/lkml/

  6. Re: [PATCH 1/5] split clocksource adjustment from clockosurce mult

    Hi,

    On Fri, 14 Mar 2008, john stultz wrote:

    > @@ -179,7 +181,7 @@ static inline cycle_t clocksource_read(struct clocksource *cs)
    > static inline s64 cyc2ns(struct clocksource *cs, cycle_t cycles)
    > {
    > u64 ret = (u64)cycles;
    > - ret = (ret * cs->mult) >> cs->shift;
    > + ret = (ret * (cs->mult + cs->mult_adj)) >> cs->shift;
    > return ret;
    > }


    This add an extra memory access and extra instruction to a code path, we
    should keep as short as possible.
    I'd rather add a mult_raw or mult_org and use that for your next patch.

    bye, Roman
    --
    To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
    the body of a message to majordomo@vger.kernel.org
    More majordomo info at http://vger.kernel.org/majordomo-info.html
    Please read the FAQ at http://www.tux.org/lkml/

  7. Re: [PATCH 2/5] introduce CLOCK_MONOTONIC_RAW

    Hi,

    On Fri, 14 Mar 2008, john stultz wrote:

    > My solution is to introduce CLOCK_MONOTONIC_RAW. This exposes a
    > nanosecond based time value, that increments starting at bootup and has
    > no frequency adjustments made to it what so ever.


    A general comment: the raw clock doesn't need any adjustments, so updates
    don't have to be done that frequently and you can move most of that logic
    into second_overflow().

    > @@ -439,6 +475,7 @@ static void clocksource_adjust(s64 offset)
    > void update_wall_time(void)
    > {
    > cycle_t offset;
    > + static u64 raw_snsec; /* shifted raw nanosecnds */
    >
    > /* Make sure we're fully resumed: */
    > if (unlikely(timekeeping_suspended))


    IMO that's really a clock property, so this belongs in the clock
    structure.
    (Some day we may want to have multiple active clocks for various purposes
    and thus export multiple raw clocks.)

    > @@ -466,6 +504,11 @@ void update_wall_time(void)
    > second_overflow();
    > }
    >
    > + if (raw_snsec >= (u64)NSEC_PER_SEC << clock->shift) {
    > + raw_snsec -= (u64)NSEC_PER_SEC << clock->shift;
    > + monotonic_raw.tv_sec++;
    > + }
    > +


    You don't really have to use clock->shift as a scale, thus simplifying the
    shifting here (e.g. by using NTP_SCALE_SHIFT).
    I'm thinking about doing this for e.g. xtime_nsec as well.

    bye, Roman
    --
    To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
    the body of a message to majordomo@vger.kernel.org
    More majordomo info at http://vger.kernel.org/majordomo-info.html
    Please read the FAQ at http://www.tux.org/lkml/

  8. Re: [PATCH 4/5] ntp.c code flow clenaups (from Ingo)

    Hi,

    On Fri, 14 Mar 2008, john stultz wrote:

    > +static inline void process_adjtimex_adj_offset(struct timex *txc, long sec)


    Somewhat ugly name...
    From an userspace perspective it's the core of ntp_adjtime(), so I'd
    prefer something close to that.

    bye, Roman
    --
    To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
    the body of a message to majordomo@vger.kernel.org
    More majordomo info at http://vger.kernel.org/majordomo-info.html
    Please read the FAQ at http://www.tux.org/lkml/

  9. Re: [PATCH 1/5] split clocksource adjustment from clockosurce mult


    On Sat, 2008-03-15 at 05:28 +0100, Roman Zippel wrote:
    > Hi,
    >
    > On Fri, 14 Mar 2008, john stultz wrote:
    >
    > > @@ -179,7 +181,7 @@ static inline cycle_t clocksource_read(struct clocksource *cs)
    > > static inline s64 cyc2ns(struct clocksource *cs, cycle_t cycles)
    > > {
    > > u64 ret = (u64)cycles;
    > > - ret = (ret * cs->mult) >> cs->shift;
    > > + ret = (ret * (cs->mult + cs->mult_adj)) >> cs->shift;
    > > return ret;
    > > }

    >
    > This add an extra memory access and extra instruction to a code path, we
    > should keep as short as possible.
    > I'd rather add a mult_raw or mult_org and use that for your next patch.


    Yea, I played with that at first, but the functoinal duplication (two
    values, both storing close to the same info) was ugly.

    I guess I could go back to mult_orig.

    -john


    --
    To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
    the body of a message to majordomo@vger.kernel.org
    More majordomo info at http://vger.kernel.org/majordomo-info.html
    Please read the FAQ at http://www.tux.org/lkml/

  10. Re: [PATCH 4/5] ntp.c code flow clenaups (from Ingo)


    On Sat, 2008-03-15 at 05:52 +0100, Roman Zippel wrote:
    > Hi,
    >
    > On Fri, 14 Mar 2008, john stultz wrote:
    >
    > > +static inline void process_adjtimex_adj_offset(struct timex *txc, long sec)

    >
    > Somewhat ugly name...


    Oh, that's a mis-merge. Thanks for catching it. The name should have
    been: process_adjtimex_modes().

    process_adjtimex_adj_offset was basically the same as your
    ntp_update_offset() function, so I killed it but grabbed the wrong
    function name.

    > From an userspace perspective it's the core of ntp_adjtime(), so I'd
    > prefer something close to that.


    does process_adjtimex_modes(), or ntp_process_adjtimex_modes work for
    you?

    thanks
    -john


    --
    To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
    the body of a message to majordomo@vger.kernel.org
    More majordomo info at http://vger.kernel.org/majordomo-info.html
    Please read the FAQ at http://www.tux.org/lkml/

  11. Re: [PATCH 5/5] make more ntp values static

    Hi,

    On Fri, 14 Mar 2008, john stultz wrote:

    > Make time_status, time_maxerror and time_esterror static to ntp.c


    time_maxerror and time_esterror are ok, but ntp_synced() is such a small
    function, I'd rather keep it inlined.

    bye, Roman
    --
    To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
    the body of a message to majordomo@vger.kernel.org
    More majordomo info at http://vger.kernel.org/majordomo-info.html
    Please read the FAQ at http://www.tux.org/lkml/

  12. Re: [PATCH 3/5] cleanups ntp.c (from Ingo)

    Hi,

    On Fri, 14 Mar 2008, john stultz wrote:

    > -#define MAX_TICKADJ 500 /* microsecs */
    > +static const long max_tickadj = 500; /* (usec) */


    That's a matter of taste, but MAX_TICKADJ just works as well, so why
    change it?

    > /* TIME_ERROR prevents overwriting the CMOS clock */
    > -static int time_state = TIME_OK; /* clock synchronization status */
    > -int time_status = STA_UNSYNC; /* clock status bits */
    > -static long time_tai; /* TAI offset (s) */
    > -static s64 time_offset; /* time adjustment (ns) */
    > -static long time_constant = 2; /* pll time constant */
    > -long time_maxerror = NTP_PHASE_LIMIT; /* maximum error (us) */
    > -long time_esterror = NTP_PHASE_LIMIT; /* estimated error (us) */
    > -static s64 time_freq; /* frequency offset (scaled ns/s)*/
    > -static long time_reftime; /* time at last adjustment (s) */
    > -long time_adjust;
    > -static long ntp_tick_adj;
    > +static int time_state = TIME_OK; /* clock synch status */
    > +int time_status = STA_UNSYNC; /* clock status bits */
    > +static long time_tai; /* TAI offset (s) */
    > +static s64 time_offset; /* time adjustment (ns) */
    > +static long time_constant = 2; /* pll time constant */
    > +long time_maxerror = NTP_PHASE_LIMIT;/* maximum error (us) */
    > +long time_esterror = NTP_PHASE_LIMIT;/* estimated error (us) */
    > +static s64 time_freq; /* frequency offset (scaled ns/s)*/
    > +static long time_reftime; /* time at last adjustment (s) */
    > +long time_adjust;
    > +static long ntp_tick_adj;


    I must say I really hate this kind of formating, it either gets out of
    sync or requires reformatting. I would very strongly prefer not to
    introduce this sort of formatting here.

    > +static inline u64 scale_tickadj(u64 adj)
    > +{
    > + return ((adj * NSEC_PER_USEC) << NTP_SCALE_SHIFT) / NTP_INTERVAL_FREQ;
    > +}


    The function argument is actually s32 and depending on NTP_INTERVAL_FREQ
    it might produce a link error on 32bit archs.

    bye, Roman
    --
    To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
    the body of a message to majordomo@vger.kernel.org
    More majordomo info at http://vger.kernel.org/majordomo-info.html
    Please read the FAQ at http://www.tux.org/lkml/

  13. Re: [PATCH 3/5] cleanups ntp.c (from Ingo)

    john stultz wrote:
    > Make this file more readable by applying a consistent coding style.
    >
    > No code changed.
    >
    > Signed-off-by: Ingo Molnar
    >
    > Merged on top of Roman's very similar cleanups.
    > (In other words: Blame John for the merge screwups).
    >
    > Signed-off-by: John Stultz
    >
    >
    > Index: linux-2.6/kernel/time/ntp.c
    > ================================================== =================


    > @@ -193,21 +197,20 @@ void second_overflow(void)
    > * Compute the phase adjustment for the next second. The offset is
    > * reduced by a fixed factor times the time constant.
    > */
    > - tick_length = tick_length_base;
    > - time_adj = shift_right(time_offset, SHIFT_PLL + time_constant);
    > - time_offset -= time_adj;
    > - tick_length += time_adj;
    > + tick_length = tick_length_base;
    > + time_adj = shift_right(time_offset, SHIFT_PLL + time_constant);
    > + time_offset -= time_adj;
    > + tick_length += time_adj;
    >
    > if (unlikely(time_adjust)) {
    > - if (time_adjust > MAX_TICKADJ) {
    > - time_adjust -= MAX_TICKADJ;
    > - tick_length += MAX_TICKADJ_SCALED;
    > - } else if (time_adjust < -MAX_TICKADJ) {
    > - time_adjust += MAX_TICKADJ;
    > - tick_length -= MAX_TICKADJ_SCALED;
    > + if (time_adjust > max_tickadj) {
    > + time_adjust -= max_tickadj;
    > + tick_length += scale_tickadj(max_tickadj);
    > + } else if (time_adjust < -max_tickadj) {
    > + time_adjust += max_tickadj;
    > + tick_length -= scale_tickadj(max_tickadj);
    > } else {
    > - tick_length += (s64)(time_adjust * NSEC_PER_USEC /
    > - NTP_INTERVAL_FREQ) << NTP_SCALE_SHIFT;
    > + tick_length += scale_tickadj(time_adjust);
    > time_adjust = 0;
    > }
    > }



    Since you are at it, isn't "time_adj" a poor name since you also have a
    "time_adjust"?
    --
    Regards,
    Jörg-Volker.

    --
    To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
    the body of a message to majordomo@vger.kernel.org
    More majordomo info at http://vger.kernel.org/majordomo-info.html
    Please read the FAQ at http://www.tux.org/lkml/

  14. Re: [PATCH 4/5] ntp.c code flow clenaups (from Ingo)

    Hi John,

    On Saturday 15 March 2008, john stultz wrote:
    > Index: linux-2.6/kernel/time/ntp.c
    > ================================================== =================
    > --- linux-2.6.orig/kernel/time/ntp.c 2008-03-14 20:10:41.000000000 -0700
    > +++ linux-2.6/kernel/time/ntp.c 2008-03-14 20:43:48.000000000 -0700
    > @@ -275,6 +275,87 @@ static void notify_cmos_timer(void)
    > static inline void notify_cmos_timer(void) { }
    > #endif
    >
    > +static inline void process_adjtimex_adj_offset(struct timex *txc, long sec)
    > +{
    > + if (txc->modes & ADJ_STATUS) {
    > + if ((time_status & STA_PLL) && !(txc->status & STA_PLL)) {
    > + time_state = TIME_OK;
    > + time_status = STA_UNSYNC;
    > + }
    > + /* only set allowed bits */
    > + time_status &= STA_RONLY;
    > + time_status |= txc->status & ~STA_RONLY;
    > +
    > + switch (time_state) {
    > + case TIME_OK:
    > + start_timer:
    > + if (time_status & STA_INS) {
    > + time_state = TIME_INS;
    > + sec += 86400 - sec % 86400;
    > + hrtimer_start(&leap_timer, ktime_set(sec, 0), HRTIMER_MODE_ABS);
    > + } else if (time_status & STA_DEL) {
    > + time_state = TIME_DEL;
    > + sec += 86400 - (sec + 1) % 86400;
    > + hrtimer_start(&leap_timer, ktime_set(sec, 0), HRTIMER_MODE_ABS);
    > + }
    > + break;
    > + case TIME_INS:
    > + case TIME_DEL:
    > + time_state = TIME_OK;
    > + goto start_timer;
    > + break;


    I don't understand, why the goto is required here.

    If you move these two cases in front of case "TIME_OK" and
    omit the break, you can remove the goto and the label "start_timer".
    Don't forget the /* fall through */ comment then.

    If you want to keep this patch a simple code movement,
    maybe add a later patch to improve this on top of it.


    Best Regards

    Ingo Oeser
    --
    To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
    the body of a message to majordomo@vger.kernel.org
    More majordomo info at http://vger.kernel.org/majordomo-info.html
    Please read the FAQ at http://www.tux.org/lkml/

  15. Re: [PATCH 3/5] cleanups ntp.c (from Ingo)

    Hi,

    On Sat, 15 Mar 2008, Jörg-Volker Peetz wrote:

    > Since you are at it, isn't "time_adj" a poor name since you also have a
    > "time_adjust"?


    These are two different (and independent) mechanism to adjust time. The
    names are a little confusing, but I'd be a little reluctant to rename
    time_adj, as the same name is used in the reference implementation.

    bye, Roman

  16. Re: [PATCH 4/5] ntp.c code flow clenaups (from Ingo)

    Hi,

    On Sat, 15 Mar 2008, Ingo Oeser wrote:

    > I don't understand, why the goto is required here.
    >
    > If you move these two cases in front of case "TIME_OK" and
    > omit the break, you can remove the goto and the label "start_timer".
    > Don't forget the /* fall through */ comment then.


    Nice catch, I'll do this in the original patch.

    bye, Roman
    --
    To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
    the body of a message to majordomo@vger.kernel.org
    More majordomo info at http://vger.kernel.org/majordomo-info.html
    Please read the FAQ at http://www.tux.org/lkml/

  17. Re: [PATCH 2/5] introduce CLOCK_MONOTONIC_RAW


    On Sat, 2008-03-15 at 05:50 +0100, Roman Zippel wrote:
    > Hi,
    >
    > On Fri, 14 Mar 2008, john stultz wrote:
    >
    > > My solution is to introduce CLOCK_MONOTONIC_RAW. This exposes a
    > > nanosecond based time value, that increments starting at bootup and has
    > > no frequency adjustments made to it what so ever.

    >
    > A general comment: the raw clock doesn't need any adjustments, so updates
    > don't have to be done that frequently and you can move most of that logic
    > into second_overflow().


    You're suggesting adding MONTONIC_RAW code to ntp.c ? That doesn't make
    much sense to me (the whole point is its not ntp adjusted). Even if you
    mean just inside the "if (clock->xtime_nsec >= (u64)NSEC_PER_SEC ..."
    conditional, then that means we don't get to leverage the
    cycles_interval, and cycle_last, values, so all of that would have to be
    duplicated and managed. Which I'm not sure it would save us much more
    then the extra add and compare here.


    > > @@ -439,6 +475,7 @@ static void clocksource_adjust(s64 offset)
    > > void update_wall_time(void)
    > > {
    > > cycle_t offset;
    > > + static u64 raw_snsec; /* shifted raw nanosecnds */
    > >
    > > /* Make sure we're fully resumed: */
    > > if (unlikely(timekeeping_suspended))

    >
    > IMO that's really a clock property, so this belongs in the clock
    > structure.
    > (Some day we may want to have multiple active clocks for various purposes
    > and thus export multiple raw clocks.)


    I disagree. I think that crufts up the clocksource structure (which is
    ideally just a simple hw counter abstraction), with timekeeping state.

    I'm still not sold on the multiple clocks with multiple notions of time
    idea you keep on bringing up. But if/when we cross that bridge, maybe it
    would be better to add a timekeeping_clock mid-layer abstraction that
    keeps the clocksource specific timekeeping state. That way we don't add
    lots of complexity for the clocksource driver writers to deal with and
    we allow the clocksources to be better re-purposed (for maybe more sane
    things like performance counters) without getting too bloated.

    > > @@ -466,6 +504,11 @@ void update_wall_time(void)
    > > second_overflow();
    > > }
    > >
    > > + if (raw_snsec >= (u64)NSEC_PER_SEC << clock->shift) {
    > > + raw_snsec -= (u64)NSEC_PER_SEC << clock->shift;
    > > + monotonic_raw.tv_sec++;
    > > + }
    > > +

    >
    > You don't really have to use clock->shift as a scale, thus simplifying the
    > shifting here (e.g. by using NTP_SCALE_SHIFT).
    > I'm thinking about doing this for e.g. xtime_nsec as well.


    Could you explain this more?

    Given the clocksources have different shift values, why should we not
    keep the extra resolution?

    How does adding the extra shifting to convert from the clocksource scale
    to the NTP_SCALE_SHIFT value simplify the shifting?

    thanks
    -john


    --
    To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
    the body of a message to majordomo@vger.kernel.org
    More majordomo info at http://vger.kernel.org/majordomo-info.html
    Please read the FAQ at http://www.tux.org/lkml/

  18. Re: [PATCH 2/5] introduce CLOCK_MONOTONIC_RAW


    On Mon, 2008-03-17 at 19:03 -0700, john stultz wrote:
    > On Sat, 2008-03-15 at 05:50 +0100, Roman Zippel wrote:
    > > > @@ -439,6 +475,7 @@ static void clocksource_adjust(s64 offset)
    > > > void update_wall_time(void)
    > > > {
    > > > cycle_t offset;
    > > > + static u64 raw_snsec; /* shifted raw nanosecnds */
    > > >
    > > > /* Make sure we're fully resumed: */
    > > > if (unlikely(timekeeping_suspended))

    > >
    > > IMO that's really a clock property, so this belongs in the clock
    > > structure.
    > > (Some day we may want to have multiple active clocks for various purposes
    > > and thus export multiple raw clocks.)

    >
    > I disagree. I think that crufts up the clocksource structure (which is
    > ideally just a simple hw counter abstraction), with timekeeping state.


    Bah. Ok, I've talked myself out of this one.

    I still think it crufts up the clocksource structure, but its more
    consistent that we follow the established cruft (such as the
    pre-calculated cycle_interval/xtime_interval/raw_interval combo) rather
    then me trying to arbitrarily draw the line in the sand at this
    variable.


    > I'm still not sold on the multiple clocks with multiple notions of time
    > idea you keep on bringing up. But if/when we cross that bridge, maybe it
    > would be better to add a timekeeping_clock mid-layer abstraction that
    > keeps the clocksource specific timekeeping state. That way we don't add
    > lots of complexity for the clocksource driver writers to deal with and
    > we allow the clocksources to be better re-purposed (for maybe more sane
    > things like performance counters) without getting too bloated.


    I still think pulling out all of the non-counter-abstraction bits out of
    the clocksource and into a mid-level timekeeping_clock structure would
    still be ideal here, but I'll save our time/energy on that one for
    another day.

    thanks
    -john


    --
    To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
    the body of a message to majordomo@vger.kernel.org
    More majordomo info at http://vger.kernel.org/majordomo-info.html
    Please read the FAQ at http://www.tux.org/lkml/

  19. Re: [PATCH 2/5] introduce CLOCK_MONOTONIC_RAW

    Hi,

    On Tuesday 18. March 2008, john stultz wrote:

    > > A general comment: the raw clock doesn't need any adjustments, so updates
    > > don't have to be done that frequently and you can move most of that logic
    > > into second_overflow().

    >
    > You're suggesting adding MONTONIC_RAW code to ntp.c ? That doesn't make
    > much sense to me (the whole point is its not ntp adjusted).


    I didn't looked at the code, what I meant was moving it close to it, so it's
    done at the same time.

    > Even if you
    > mean just inside the "if (clock->xtime_nsec >= (u64)NSEC_PER_SEC ..."
    > conditional, then that means we don't get to leverage the
    > cycles_interval, and cycle_last, values, so all of that would have to be
    > duplicated and managed. Which I'm not sure it would save us much more
    > then the extra add and compare here.


    You could use a shift (e.g. SHIFT_HZ for non NO_HZ), to scale these value up a
    little.

    > > IMO that's really a clock property, so this belongs in the clock
    > > structure.
    > > (Some day we may want to have multiple active clocks for various purposes
    > > and thus export multiple raw clocks.)

    >
    > I disagree. I think that crufts up the clocksource structure (which is
    > ideally just a simple hw counter abstraction), with timekeeping state.


    It's not cruft, littering the code with unnecessary static variables is IMO
    worse. In OO terms this would be an abstract base class, the actual
    implementation only needs to provide a few functions and otherwise doesn't
    really have to worry about all the details.
    Splitting the structure is certainly an option, but hiding related variables
    as static is IMO not a good choice.

    > I'm still not sold on the multiple clocks with multiple notions of time
    > idea you keep on bringing up.


    Well, here are some ideas where I think it might be useful, it would make it
    possible to chain clocks with different frequencies:
    - SMP systems with slighty different clock frequencies.
    - instead of deactivating an unstable tsc clock, delay activation until we
    know it's stable.
    - deactivate a tsc (which stops during sleep) before sleep and resync when
    needed.
    - use different clocks for timer purposes (e.g. jiffies and a slow clock).

    (The last point would make IMO more acceptable to use hrtimer for more trivial
    purposes, if it were at least possible to select the used clock.)

    > > > + if (raw_snsec >= (u64)NSEC_PER_SEC << clock->shift) {
    > > > + raw_snsec -= (u64)NSEC_PER_SEC << clock->shift;
    > > > + monotonic_raw.tv_sec++;
    > > > + }
    > > > +

    > >
    > > You don't really have to use clock->shift as a scale, thus simplifying
    > > the shifting here (e.g. by using NTP_SCALE_SHIFT).
    > > I'm thinking about doing this for e.g. xtime_nsec as well.

    >
    > Could you explain this more?
    >
    > Given the clocksources have different shift values, why should we not
    > keep the extra resolution?


    The extra resolution is still there, as it doesn't make any sense to use a
    shift larger than 32.

    > How does adding the extra shifting to convert from the clocksource scale
    > to the NTP_SCALE_SHIFT value simplify the shifting?


    NTP_SCALE_SHIFT is a constant and a very simple shift, which produces better
    code.
    There are two ways to use the nsec value here:
    1. nsec_now = (((cycle_new - cycle_last) * mult) + nsec) >> shift
    2. nsec_now = ((cycle_new - cycle_last) * mult) >> shift +
    nsec >> NTP_SCALE_SHIFT
    Both do the same under the condition that the gettime operation is not
    significantly shorter than 1nsec, but the second version can generate
    slightly better code. I intended to use the first for xtime_nsec, but instead
    the value is still splitted at every timer interrupt.
    For the raw time please at least get rid of the splitting of raw_nsec in
    update_wall_time(). From the monotonic_raw value you only need the tv_sec
    part and the tv_nsec part can be calculated as above.

    bye, Roman
    --
    To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
    the body of a message to majordomo@vger.kernel.org
    More majordomo info at http://vger.kernel.org/majordomo-info.html
    Please read the FAQ at http://www.tux.org/lkml/

+ Reply to Thread