Y2038 bug strikes early - NTP


  1. Y2038 bug strikes early


    From the latest RISKS Digest:


    Date: Thu, 29 Jun 2006 13:38:25 -0700
    From: Conrad Heiney
    Subject: Y2038 bug strikes early

    Starting on May 12, 2006, many installations of the AOLServer web server
    failed. Not all versions or all configurations failed, but the ones that did
    became unusable. On start, the server would eat virtual memory and then
    terminate with a memory allocation error. Discussion on the mailing list
    revealed the starting date of the problem, indicating that some part of the
    software had a clock issue. On careful inspection it was discovered that
    database threads were a common factor. It was then noted by a perceptive
    person that the servers all began failing exactly one billion seconds
    before the end of the Unix epoch in 2038. Many installations had very long
    database timeouts, which caused the software to look ahead and see the End
    of Time. Adjusting the timeouts stopped the crashes.

    The risk of the known clock bug striking 32 years early indicates there may
    be other "pre-problems" lurking in software that will show up long before
    the date we have comfortably set as the deadline.

    The thread discussing the problem and its resolution is here:
    http://www.mail-archive.com/aolserve.../msg09812.html
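
The date arithmetic in the report can be checked directly. The sketch below assumes a signed 32-bit seconds counter (maximum 2**31 - 1) and a timeout of exactly 10**9 seconds; the names are illustrative and are not taken from the AOLserver source.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)
INT32_MAX = 2**31 - 1          # last second a signed 32-bit time_t can hold
TIMEOUT = 10**9                # assumed timeout in seconds (hypothetical value)


def to_utc(seconds: int) -> datetime:
    """Convert seconds since the Unix epoch to an aware UTC datetime."""
    return EPOCH + timedelta(seconds=seconds)


def wrap_int32(value: int) -> int:
    """Reduce an integer to the value a signed 32-bit field would hold."""
    return (value + 2**31) % 2**32 - 2**31


# Last moment at which now + TIMEOUT still fits in 32 bits.
last_safe_now = INT32_MAX - TIMEOUT

# One second later, the computed deadline wraps to a large negative number.
wrapped_deadline = wrap_int32(last_safe_now + 1 + TIMEOUT)

print(to_utc(INT32_MAX))         # 2038-01-19 03:14:07+00:00
print(to_utc(last_safe_now))     # 2006-05-13 01:27:27+00:00
print(wrapped_deadline)          # -2147483648
print(to_utc(wrapped_deadline))  # 1901-12-13 20:45:52+00:00
```

01:27:27 UTC on May 13 falls on the evening of May 12 in US time zones, which is consistent with the reported start date.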

  2. Re: Y2038 bug strikes early


    > From: Conrad Heiney
    > Starting on May 12, 2006, many installations of the AOLServer web
    > server failed. Not all versions or all configurations failed, but
    > the ones that did became unusable. On start, the server would eat
    > virtual memory and then terminate with a memory allocation error.


    I would have expected that one server would have failed with an error
    message and millions of others followed with the message "ME TOO".

    -wolfgang

  3. Re: Y2038 bug strikes early

    Marc,

    Unix doesn't have to have a 2038 rollover problem, just as NTP doesn't
    have a 2036 rollover problem. Evidence for this assertion has been
    reported in recent messages to this list and the hackers@ntp.org support
    group. It's all in the carefully designed 64-bit twos complement
    calculations that determine the relative date and time, as long as the
    clock is set first within 68 years of the actual calendar date. See
    http://www.eecis.udel.edu/~mills/y2k.html.

    Dave

    Marc Brett wrote:
    > From the latest RISKS Digest:
    >
    >
    > Date: Thu, 29 Jun 2006 13:38:25 -0700
    > From: Conrad Heiney
    > Subject: Y2038 bug strikes early
    >
    > Starting on May 12, 2006, many installations of the AOLServer web server
    > failed. Not all versions or all configurations failed, but the ones that did
    > became unusable. On start, the server would eat virtual memory and then
    > terminate with a memory allocation error. Discussion on the mailing list
    > revealed the starting date of the problem, indicating that some part of the
    > software had a clock issue. On careful inspection it was discovered that
    > database threads were a common factor. It was then noted by a perceptive
    > person that the servers all began failing exactly one billion seconds
    > before the end of the Unix epoch in 2038. Many installations had very long
    > database timeouts, which caused the software to look ahead and see the End
    > of Time. Adjusting the timeouts stopped the crashes.
    >
    > The risk of the known clock bug striking 32 years early indicates there may
    > be other "pre-problems" lurking in software that will show up long before
    > the date we have comfortably set as the deadline.
    >
    > The thread discussing the problem and its resolution is here:
    > http://www.mail-archive.com/aolserve.../msg09812.html


  4. Re: Y2038 bug strikes early

    In article ,
    David L. Mills wrote:

    >Unix doesn't have to have a 2038 rollover problem, just as NTP doesn't
    >have a 2036 rollover problem. Evidence for this assertion has been
    >reported in recent messages to this list and the hackers@ntp.org support
    >group. It's all in the carefully designed 64-bit twos complement
    >calculations that determine the relative date and time


    I'd like to see some evidence of these Unix(R) systems of which you
    speak, with "carefully designed 64-bit twos complement calculations".

    If you adhere to the Single UNIX Specification, your date and time
    representation is determined by a formula in the POSIX standard.[1]
    The result of evaluating that formula will exceed 2**31 in January,
    2038 -- end of story. One can hope that all systems in use by then
    will have settled on a time_t type wider than that (or even better,
    that time_t becomes an ill-remembered historical relic), and that all
    applications which store times on disk or transmit them over the
    network have done likewise, but I'm not counting on it.

    -GAWollman

    [1] This formula is highly unlikely to change in the ongoing POSIX
    revision, even though its representation of leap seconds is ambiguous.

    --
    Garrett A. Wollman | As the Constitution endures, persons in every
    wollman@csail.mit.edu | generation can invoke its principles in their own
    Opinions not those | search for greater freedom.
    of MIT or CSAIL. | - A. Kennedy, Lawrence v. Texas, 539 U.S. 558 (2003)
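
The formula referred to in this post is the POSIX definition of "Seconds Since the Epoch". As a rough check, the sketch below transcribes that expression into Python (the function name and argument order are illustrative) and evaluates it for 2038-01-19 03:14:08 UTC, the first second that no longer fits in a signed 32-bit time_t.

```python
import calendar


def posix_seconds(tm_year, tm_yday, tm_hour, tm_min, tm_sec):
    """Seconds since the Epoch per the POSIX formula.

    tm_year is years since 1900 and tm_yday is the zero-based day of the
    year, as in struct tm. Floor division stands in for C integer division,
    which agrees for the non-negative operands used here.
    """
    return (tm_sec + tm_min * 60 + tm_hour * 3600 + tm_yday * 86400
            + (tm_year - 70) * 31536000
            + ((tm_year - 69) // 4) * 86400
            - ((tm_year - 1) // 100) * 86400
            + ((tm_year + 299) // 400) * 86400)


# 2038-01-19 03:14:08 UTC: tm_year = 138, tm_yday = 18 (January 19).
t = posix_seconds(138, 18, 3, 14, 8)
print(t, t == 2**31)   # 2147483648 True

# Cross-check against the standard library's UTC conversion.
print(calendar.timegm((2038, 1, 19, 3, 14, 8, 0, 0, 0)))  # 2147483648
```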

  5. Re: Y2038 bug strikes early

    Garrett,

    The issue has nothing to do with Unix or POSIX. It has to do with NTP
    timestamps. There are two sources of evidence: the page I referenced,
    and an actual test with Solaris 10 and the current NTP daemon, ntpd. Set the
    Unix clock in the server to early 2037; set the client to the current
    date and start with the -g option. Try this the other way around as
    well. All this proves only that the NTP rollover will be transparent as
    long as Unix is transparent beyond 2038.

    It is important to note that NTP calculations never assume an absolute
    value, only an offset relative to 136-year eras. Native Unix timekeeping
    could do the same thing, with the result that calculations spanning Unix
    eras would be unambiguous as long as the difference between two timestamps
    did not exceed 34 years (because Unix seconds are signed).
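
The relative calculation described here can be sketched as modular subtraction on a 32-bit seconds field. This is a minimal illustration, not code from ntpd: the helper name is made up, and the sketch uses the 2**31-second (roughly 68-year) bound that 32-bit twos complement arithmetic gives, leaving the figure quoted in the post as stated.

```python
MOD = 2**32      # size of a 32-bit seconds field (one 136-year era)
HALF = 2**31     # about 68 years of seconds


def signed_diff32(a: int, b: int) -> int:
    """Return a - b interpreted as a signed 32-bit twos complement value."""
    return (a - b + HALF) % MOD - HALF


# Two seconds values on either side of an era rollover:
before = MOD - 10   # ten seconds before the 32-bit field wraps
after = 5           # five seconds after the wrap

print(signed_diff32(after, before))   # 15
print(signed_diff32(before, after))   # -15
```

The offset comes out correct even though the raw values differ by almost 2**32, which is why only the difference, and never the absolute value, needs to be trusted.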

    Modern kernels I have seen represent seconds as a 64-bit twos complement
    signed integer, which is the same as in the NTP date format. While the
    base era for NTP is 1900 and for Unix is 1970, the 64-bit signed seconds
    field can represent seconds since before the big bang until after the
    Sun grows cold.
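
A quick estimate of that range, assuming a Julian year of 365.25 days and an approximate figure of 13.8 billion years for the age of the universe (both are assumptions of this sketch, not values from the post):

```python
SECONDS_PER_YEAR = 365.25 * 86400    # Julian year, good enough for an estimate
AGE_OF_UNIVERSE_YEARS = 1.38e10      # approximate, for comparison only

max_seconds = 2**63 - 1              # largest signed 64-bit value
years = max_seconds / SECONDS_PER_YEAR

print(f"{years:.2e} years on either side of the epoch")
print(years > AGE_OF_UNIVERSE_YEARS)   # True
```

The result is on the order of 2.9e11 years in each direction, about twenty times the assumed age of the universe.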

    Dave

    Garrett Wollman wrote:
    > In article ,
    > David L. Mills wrote:
    >
    >
    >>Unix doesn't have to have a 2038 rollover problem, just as NTP doesn't
    >>have a 2036 rollover problem. Evidence for this assertion has been
    >>reported in recent messages to this list and the hackers@ntp.org support
    >>group. It's all in the carefully designed 64-bit twos complement
    >>calculations that determine the relative date and time

    >
    >
    > I'd like to see some evidence of these Unix(R) systems of which you
    > speak, with "carefully designed 64-bit twos complement calculations".
    >
    > If you adhere to the Single UNIX Specification, your date and time
    > representation is determined by a formula in the POSIX standard.[1]
    > The result of evaluating that formula will exceed 2**31 in January,
    > 2038 -- end of story. One can hope that all systems in use by then
    > will have settled on a time_t type wider than that (or even better,
    > that time_t becomes an ill-remembered historical relic), and that all
    > applications which store times on disk or transmit them over the
    > network have done likewise, but I'm not counting on it.
    >
    > -GAWollman
    >
    > [1] This formula is highly unlikely to change in the ongoing POSIX
    > revision, even though its representation of leap seconds is ambiguous.
    >


  6. Re: Y2038 bug strikes early

    In article ,
    David L. Mills wrote:

    >The issue has nothing to do with Unix or POSIX. It has to do with NTP
    >timestamps.


    That's great for NTP but isn't responsive to your original claim:

    >>>Unix doesn't have to have a 2038 rollover problem, just as NTP doesn't
    >>>have a 2036 rollover problem.


    UNIX brand operating systems are not (under current standards)
    permitted to do the sort of epoch-windowing you describe and NTP
    implements. Thus, the only solution to the Y2038 problem which
    comports with the requirements of the standard is to make time_t be a
    wider type.[1] As you note, many operating systems are now using a
    64-bit type internally, but applications and protocols have failed to
    keep up. The concern for the industry is, will we find and fix all
    those systems in time? (I hope, given how much the pace of change has
    increased, that this will be a non-issue in thirty years' time.)

    -GAWollman

    [1] Every time in recent memory that the POSIX committee has tried to
    tackle the leap-second bug, someone always pops up and insists that
    the Y2038 problem must also be solved at the same time (by increasing
    the required range of time_t). This then gets tangled up with the
    issue of finer-resolution file timestamps and the whole mess goes into
    a rathole, not to be seen until the next round of revisions.

    --
    Garrett A. Wollman | As the Constitution endures, persons in every
    wollman@csail.mit.edu | generation can invoke its principles in their own
    Opinions not those | search for greater freedom.
    of MIT or CSAIL. | - A. Kennedy, Lawrence v. Texas, 539 U.S. 558 (2003)
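
The point about applications and protocols can be illustrated with a fixed-width binary field. The sketch below is hypothetical (it does not model any particular file format or wire protocol) and uses Python's struct module to show that a post-2038 timestamp does not fit a 4-byte signed field but does fit an 8-byte one.

```python
import struct

after_2038 = 2**31   # first second past the signed 32-bit limit

try:
    struct.pack("<i", after_2038)   # 4-byte signed field
    fits_32 = True
except struct.error:
    fits_32 = False

packed_64 = struct.pack("<q", after_2038)   # 8-byte signed field
roundtrip = struct.unpack("<q", packed_64)[0]

print(fits_32)                     # False
print(len(packed_64), roundtrip)   # 8 2147483648
```

Widening time_t inside the operating system does nothing for a record layout like the first one; every such stored or transmitted field has to be found and widened separately.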

  7. Re: Y2038 bug strikes early

    Garrett Wollman wrote:

    > In article ,
    > David L. Mills wrote:
    >
    >
    >>The issue has nothing to do with Unix or POSIX. It has to do with NTP
    >>timestamps.

    >
    >
    > That's great for NTP but isn't responsive to your original claim:
    >
    >
    >>>>Unix doesn't have to have a 2038 rollover problem, just as NTP doesn't
    >>>>have a 2036 rollover problem.

    >
    >
    > UNIX brand operating systems are not (under current standards)
    > permitted to do the sort of epoch-windowing you describe and NTP
    > implements. Thus, the only solution to the Y2038 problem which
    > comports with the requirements of the standard is to make time_t be a
    > wider type.[1] As you note, many operating systems are now using a
    > 64-bit type internally, but applications and protocols have failed to
    > keep up. The concern for the industry is, will we find and fix all
    > those systems in time? (I hope, given how much the pace of change has
    > increased, that this will be a non-issue in thirty years' time.)
    >



    The available evidence (Y2K) suggests that the problem will not be
    addressed until 2036 at the earliest. "It's not going to break in my
    working lifetime so why should I fix it?"!!!!! As I recall the Y2K
    problem was first noted sometime in the 1970s. It didn't need to be
    fixed for 25 years so nobody worried about it.

  8. Re: Y2038 bug strikes early

    "Richard B. Gilbert" writes:

    >The available evidence (Y2K) suggests that the problem will not be
    >addressed until 2036 at the earliest. "It's not going to break in my
    >working lifetime so why should I fix it?"!!!!! As I recall the Y2K
    >problem was first noted sometime in the 1970s. It didn't need to be
    >fixed for 25 years so nobody worried about it.


    OSes are moving to 64 bits rapidly, and as such we hope
    to see fewer and fewer 32-bit UNIX OSes and programs with
    bad timestamp handling.

    Casper
    --
    Expressed in this posting are my opinions. They are in no way related
    to opinions held by my employer, Sun Microsystems.
    Statements on Sun products included here are not gospel and may
    be fiction rather than truth.

  9. Re: Y2038 bug strikes early

    In article <8dudnf3focSOlk3ZnZ2dnUVZ_qydnZ2d@comcast.com>,
    Richard B. Gilbert wrote:
    > ... As I recall the Y2K
    >problem was first noted sometime in the 1970s. It didn't need to be
    >fixed for 25 years so nobody worried about it.


    Not true; in the 1980s, mortgage software had to deal with dates
    beyond 1999.

    --
    -- Rod --
    rodd(at)polylogics(dot)com

  10. Re: Y2038 bug strikes early

    Rod Dorman wrote:
    > In article <8dudnf3focSOlk3ZnZ2dnUVZ_qydnZ2d@comcast.com>,
    > Richard B. Gilbert wrote:
    >
    >> ... As I recall the Y2K
    >>problem was first noted sometime in the 1970s. It didn't need to be
    >>fixed for 25 years so nobody worried about it.

    >
    >
    > Not true; in the 1980s, mortgage software had to deal with dates
    > beyond 1999.
    >


    Picky, picky, picky.

    All right, so almost nobody addressed the problem. The point was that
    people were aware as early as the mid-1970s that the two-digit years
    that had been widely used to save space on 80-column punched cards, and
    never changed to four-digit years, were going to be a problem. It wasn't
    so much the old data as the programs that were written to handle
    two-digit years and just assumed that the years involved were all 19xx.

    New software generally used four-digit years, but there was an awful lot
    of legacy stuff that had been around since the 1960s and '70s that
    needed to be fixed. Nobody wanted to spend money fixing something that
    wouldn't break for twenty-five years, or twenty years, or ten or five.
    The effort to test and clean up all the application software didn't
    really get under way until around 1998 or, in many cases, 1999.
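
The "assumed 19xx" failure, and the mortgage case raised earlier in the thread, can be sketched in a few lines. The function names, the pivot value of 70, and the 1985 start year are all hypothetical choices for illustration; windowing is shown as one common style of Y2K fix, not as what any particular system did.

```python
def expand_naive(yy: int) -> int:
    """Legacy assumption: every two-digit year is 19xx."""
    return 1900 + yy


def expand_windowed(yy: int, pivot: int = 70) -> int:
    """Windowing fix: values below the pivot are read as 20xx."""
    return (2000 if yy < pivot else 1900) + yy


start_year = 1985
maturity_yy = (start_year + 30) % 100   # 30-year term, stored as two digits: 15

print(expand_naive(maturity_yy))      # 1915
print(expand_windowed(maturity_yy))   # 2015
```

A 30-year loan written in 1985 already produces a wrong maturity date under the naive expansion, which is the same kind of look-ahead that exposed the 2038 limit in 2006.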
