Database seems to have become corrupted - Kerberos


  1. Database seems to have become corrupted

    I was hoping someone could give me a bit of advice. This morning at
    precisely 4AM our long-running master KDC started spewing these errors
    on every propagation run:

    dump: error performing Kerberos version 5 release 1.3 dump (Database
    record is incomplete or corrupted)
    /usr/kerberos/sbin/kprop: '/var/kerberos/krb5kdc/slave_datatrans' more
    recent than '^X^S'^H/kerberos/krb5kdc/slave_datatrans.dump_ok'.

    If I run kadmin and do a listprincs, I get:

    get_principals: Database record is incomplete or corrupted while
    retrieving list.

    Here's what the server has:

    -rw------- 1 root root 8192 Jul 17 10:22 principal.kadm5
    -rw------- 1 root root 442368 Jul 17 11:13 principal
    -rw------- 1 root root 0 Jul 17 11:13 principal.ok
    -rw------- 1 root root 1 Jul 20 04:00 slave_datatrans.XXX.last_prop
    -rw------- 1 root root 1 Jul 20 04:00 slave_datatrans.dump_ok
    -rw------- 1 root root 210687 Jul 20 09:40 slave_datatrans

    At this point I have backups on tape, which I'm pulling off, and I have
    a slave KDC which, as I understand things, should have a good copy of
    the database. Here's what it has:

    -rw------- 1 root root 0 Jul 20 04:00 principal.kadm5.lock
    -rw------- 1 root root 8192 Jul 20 04:00 principal.kadm5
    -rw------- 1 root root 303104 Jul 20 04:00 principal
    -rw------- 1 root root 444158 Jul 20 04:00 from_master
    -rw------- 1 root root 0 Jul 20 04:00 principal.ok

    I'm operating remotely today and don't want to completely hose the
    system. Could someone give me a few hints as to what the simplest
    thing to do would be?

    - J<
    ________________________________________________
    Kerberos mailing list Kerberos@mit.edu
    https://mailman.mit.edu/mailman/listinfo/kerberos


  2. Re: Database seems to have become corrupted

    >I'm operating remotely today and don't want to completely hose the
    >system. Could someone give me a few hints as to what the simplest
    >thing to do would be?


    The simplest thing to do would be to copy the file "from_master" back to
    the master, and use "kdb5_util load" to load it back into the database.
    That's probably the most recent good copy of the database that you'll have
    available.
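
A minimal sketch of that recovery in Python, run on the master after "from_master" has been copied over from the slave. The `kdb5_util` path is an assumption (guessed from where `kprop` lives in the error output above), and the `corrupt-backup` directory name and the helper functions are hypothetical. It copies the damaged `principal*` files aside first, so nothing is lost if the load goes wrong.

```python
import shutil
import subprocess
from pathlib import Path

# Assumed paths: kprop is in /usr/kerberos/sbin in the error output, so
# kdb5_util is guessed to sit beside it. Adjust for your installation.
KDB5_UTIL = "/usr/kerberos/sbin/kdb5_util"
DB_DIR = Path("/var/kerberos/krb5kdc")


def backup_database_files(db_dir: Path, backup_dir: Path) -> list:
    """Copy every principal* file aside so the damaged database is preserved."""
    backup_dir.mkdir(parents=True, exist_ok=True)
    copied = []
    for src in sorted(db_dir.glob("principal*")):
        if src.is_file():
            copied.append(Path(shutil.copy2(src, backup_dir / src.name)))
    return copied


def load_command(dump_file: Path) -> list:
    """Build the argv for `kdb5_util load <dump_file>`."""
    return [KDB5_UTIL, "load", str(dump_file)]


def restore(dump_file: Path, db_dir: Path = DB_DIR, dry_run: bool = True) -> list:
    """Back up the current database files, then load the text dump.

    With dry_run=True (the default) nothing is touched; the command that
    would be run is returned so it can be inspected first.
    """
    cmd = load_command(dump_file)
    if not dry_run:
        # "corrupt-backup" is a hypothetical directory name.
        backup_database_files(db_dir, db_dir / "corrupt-backup")
        subprocess.run(cmd, check=True)
    return cmd
```

Calling `restore(DB_DIR / "from_master")` only returns the command for review; passing `dry_run=False` performs the backup and the load. Defaulting to a dry run seemed the safer choice for someone working remotely.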

    (Personally, I do a dump of the KDC database once a week to somewhere
    else on the filesystem, just in case of database corruption).

    --Ken


  3. Re: Database seems to have become corrupted

    >>>>> "KH" == Ken Hornstein writes:

    KH> The simplest thing to do would be to copy the file "from_master"
    KH> back to the master, and use "kdb5_util load" to load it back into
    KH> the database.

    Thanks; that seems to have worked.

    KH> (Personally, I do a dump of the KDC database once a week to
    KH> somewhere else on the filesystem, just in case of database
    KH> corruption).

    I do have everything on tape, but now that I think about it, I only
    have it in the internal binary format and it would be useful to have
    the plain text dump around. Time to set up another cron job.
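
A rough sketch of what such a cron job could run, in Python. The `kdb5_util` path, the backup directory, the `kdc-dump-YYYY-MM-DD` file naming and the retention count are all assumptions chosen for illustration; only `kdb5_util dump <file>` itself comes from the tool.

```python
import datetime
import subprocess
from pathlib import Path

KDB5_UTIL = "/usr/kerberos/sbin/kdb5_util"  # assumed location
BACKUP_DIR = Path("/var/backups/kdc")       # hypothetical directory


def dump_path(backup_dir: Path, day: datetime.date) -> Path:
    """Name the dump after its date so names sort chronologically."""
    return backup_dir / f"kdc-dump-{day.isoformat()}"


def dump_command(target: Path) -> list:
    """Build the argv for `kdb5_util dump <target>` (plain-text dump)."""
    return [KDB5_UTIL, "dump", str(target)]


def stale_dumps(backup_dir: Path, keep: int) -> list:
    """Return dated dump files older than the newest `keep` of them."""
    # The pattern matches only names ending in YYYY-MM-DD, so companion
    # files such as "<dump>.dump_ok" are not counted as dumps.
    dumps = sorted(backup_dir.glob("kdc-dump-????-??-??"))
    return dumps[:-keep] if keep > 0 else dumps


def run_weekly_dump(backup_dir: Path = BACKUP_DIR, keep: int = 8) -> Path:
    """Write today's dump, then prune everything but the newest `keep`."""
    # The dump holds key material, so keep the directory private.
    backup_dir.mkdir(mode=0o700, parents=True, exist_ok=True)
    target = dump_path(backup_dir, datetime.date.today())
    subprocess.run(dump_command(target), check=True)
    for old in stale_dumps(backup_dir, keep):
        old.unlink()
        # Remove a companion ".dump_ok" marker too, if one was left behind.
        Path(str(old) + ".dump_ok").unlink(missing_ok=True)
    return target
```

A weekly entry in root's crontab that calls `run_weekly_dump()` from a small wrapper script would cover the "once a week" schedule mentioned above. Copying the result to another machine would also guard against losing the whole filesystem.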

    Thanks again for your advice.

    - J<


  4. Re: Database seems to have become corrupted

    On Jul 20, 2006, at 10:44, Jason L Tibbitts III wrote:
    > get_principals: Database record is incomplete or corrupted while
    > retrieving list.


    Do you have any notion what might've been changed since the previous
    kprop run when you got a successful dump? It's great that you've
    recovered your database, of course, but now, if we can identify a
    possible cause for the corruption, perhaps we can reduce the chances
    of similar problems in the future.... Any interesting operations
    being performed on the database, especially locally via
    kadmin.local? Any interesting events in the OS logs, like disk
    problems or running out of space? Etc....
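
As a small illustration of the OS-level checks meant here, the following Python sketch reports how much of a filesystem is free and picks out log lines that hint at disk trouble. The keyword list is an illustrative assumption, not an exhaustive set of kernel messages, and the function names are hypothetical.

```python
import shutil

# Illustrative substrings only; real disk or filesystem errors may be
# worded differently on your system.
SUSPECT = ("i/o error", "no space left", "read-only file system")


def free_fraction(path: str) -> float:
    """Fraction (0.0 to 1.0) of the filesystem holding `path` that is free."""
    usage = shutil.disk_usage(path)
    return usage.free / usage.total


def suspicious_lines(lines) -> list:
    """Return the log lines that contain any of the SUSPECT substrings."""
    return [
        line.rstrip("\n")
        for line in lines
        if any(keyword in line.lower() for keyword in SUSPECT)
    ]
```

For example, `free_fraction("/var")` shows whether the partition holding the KDC database is close to full, and `suspicious_lines(open("/var/log/messages"))` (log path assumed) narrows the system log down to entries worth reading around the time of the failure.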

    Ken





  5. Re: Database seems to have become corrupted

    >>>>> "KR" == Ken Raeburn writes:

    KR> Do you have any notion what might've been changed since the
    KR> previous kprop run when you got a successful dump?

    I have no idea at all. Lots of stuff tends to happen at 4AM on a Red
    Hat/Fedora system, but I can't pin it down to anything specific.

    The only thing I can possibly come up with is that the logs are
    rotated then, which could result in additional space usage on /var as
    the old logs are compressed. But there's plenty of space there.

    kadmin had nothing logged for two days before the issue. krb5kdc.log
    of course had tons of stuff logged, but nothing out of the ordinary
    around the time of the failure.

    - J<

