Software configuration management tool required - Unix

  1. Software configuration management tool required

    Hello group,

    Our company has more than 100 servers running all different kinds of
    services which are currently all documented. The problem is: after a couple
    of months the piece of paper will be worthless if it doesn't get updated by
    the system administrators logging what they changed.

    There are several administrators working on the servers and the problem is
    that not everything which gets changed will be logged. People forget about
    it, or just don't care to log.

    Management has now issued a new policy requiring *everyone* to log the
    changes. Unfortunately, checking all the servers if the administrators are
    living up to the policy is a very time-consuming task.

    Is there any software out there which is able to check remote servers on
    their running services and their configuration? I need to know which
    services are running, where their configuration lives, where they're
    logging to, where their data is stored (if any), what their dependencies
    are, which cronjobs are planned and when, ...

    Unfortunately I can't use snmp, since that only lists services *currently*
    running, no cronjobs and no configuration files etc.

    I know there probably won't be any tool out there which is able to do all
    the stuff we want, but if it only detects a little bit it would be of great
    help to us.

    The servers are running different versions of Linux and FreeBSD.

    Please let me know if you know any software for this purpose.

    With Regards,
    Vincent van Scherpenseel.

  2. Re: Software configuration management tool required

    On Thu, 18 Aug 2005 10:53:25 +0200, Vincent van Scherpenseel wrote:
    > Hello group,
    >
    > Our company has more than 100 servers running all different kinds of
    > services which are currently all documented.


    ...that you know of...

    > The problem is: after a couple
    > of months the piece of paper will be worthless if it doesn't get updated by
    > the system administrators logging what they changed.


    Yup.

    > There are several administrators working on the servers and the problem is
    > that not everything which gets changed will be logged. People forget about
    > it, or just don't care to log.


    Normal and predictable behavior, yes.

    > Management has now issued a new policy requiring *everyone* to log the
    > changes. Unfortunately, checking all the servers if the administrators are
    > living up to the policy is a very time-consuming task.
    > Is there any software out there which is able to check remote servers on
    > their running services and their configuration? I need to know which
    > services are running, where their configuration lives, where they're
    > logging to, where their data is stored (if any), what their dependencies
    > are, which cronjobs are planned and when, ...


    Couple of thoughts. On a basic level, you could get your logging by
    instituting sudo on your servers - all work done as root is logged in
    that manner. The logs aren't the most human readable but they're
    complete.
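
    A minimal sketch of making those sudo logs easier to read, assuming the
    traditional syslog layout that sudo writes
    ("user : TTY=... ; PWD=... ; USER=... ; COMMAND=..."). The function names
    and the sample host and admin names are invented for illustration, and
    systems that log as "sudo[pid]:" would need the pattern adjusted:

```python
import re
from collections import defaultdict

# Assumed layout of a sudo command entry in syslog, for example:
#   Aug 18 10:53:25 web01 sudo:   dave : TTY=pts/3 ; PWD=/etc/apache ; USER=root ; COMMAND=/usr/bin/vi httpd.conf
# Host "web01" and admin "dave" are hypothetical sample values.
SUDO_LINE = re.compile(
    r"^(?P<when>\w{3}\s+\d+\s[\d:]{8})\s+(?P<host>\S+)\s+sudo:\s+"
    r"(?P<admin>\S+)\s+:\s+TTY=(?P<tty>\S+)\s+;\s+PWD=(?P<pwd>.*?)\s+;\s+"
    r"USER=(?P<runas>\S+)\s+;\s+COMMAND=(?P<command>.*)$"
)


def parse_sudo_line(line):
    """Return the fields of a sudo command entry, or None for any other line."""
    match = SUDO_LINE.match(line.rstrip("\n"))
    return match.groupdict() if match else None


def commands_by_admin(lines):
    """Group (host, time, command) tuples by the admin who ran them."""
    summary = defaultdict(list)
    for line in lines:
        entry = parse_sudo_line(line)
        if entry:
            summary[entry["admin"]].append(
                (entry["host"], entry["when"], entry["command"])
            )
    return dict(summary)
```

    Feeding it the sudo lines gathered from every server gives a per-admin
    list of what was run where, which is easier to compare against the
    change log than the raw syslog files.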

    There's a commercial product called "BladeLogic" (named strangely as it
    has nothing to do with specifically blade servers, but there you go)
    which we'll most likely be putting in place next year here, for our
    100+ unix boxes. It has all the logging, rollback, things like "change
    the encryption on all apache instances in the DMZ" type logic, and a ton
    of other stuff. Scheduling as well. They'll come out & give you the
    dog&pony show; we had the demo and it looks pretty good. A friend of
    mine went to work for them and he's pretty cynical generally, but he's
    very enthused about this; for a while after he went there he'd call and
    tell me "Hey, you know that quarterly patching you guys do? I've got a
    module that does it hands-off", and so on. Looks like a solid tool,
    and not obscenely expensive.

    > Unfortunately I can't use snmp, since that only lists services *currently*
    > running, no cronjobs and no configuration files etc.


    Yup. Same reasons we went looking for something else, and when budget
    allows (next fiscal year) we'll most likely go with it.

    > Please let me know if you know any software for this purpose.


    Likewise; we prefer open source for several reasons, and I'd love to
    hear about other options as well. But, sometimes, buying a commercial
    package makes sense.

    Dave Hinz


  3. Re: Software configuration management tool required

    Dave Hinz writes:

    > On Thu, 18 Aug 2005 10:53:25 +0200, Vincent van Scherpenseel wrote:

    [...]
    >> Please let me know if you know any software for this purpose.

    >
    > Likewise; we prefer open source for several reasons, and I'd love to
    > hear about other options as well. But, sometimes, buying a commercial
    > package makes sense.


    You may be interested in some of the papers and the mailing list over
    at:

    http://www.infrastructures.org/

    There's a similar mailing list for network people:

    http://www.greatcircle.com/lists/network-automation/

    --
    David Magda
    Because the innovator has for enemies all those who have done well under
    the old conditions, and lukewarm defenders in those who may do well
    under the new. -- Niccolo Machiavelli, _The Prince_, Chapter VI

  4. Re: Software configuration management tool required

    Vincent van Scherpenseel writes:

    > Our company has more than 100 servers running all different kinds of
    > services which are currently all documented. The problem is: after a
    > couple of months the piece of paper will be worthless if it doesn't
    > get updated by the system administrators logging what they changed.


    For procedures, a wiki could be useful. A weblog where people can post
    simple "heads up" notes about changes or planned changes could also
    help.

    > There are several administrators working on the servers and the
    > problem is that not everything which gets changed will be
    > logged. People forget about it, or just don't care to log.


    A combination of restricting root access, using sudo, and something
    like RCS/CVS/Subversion may encourage people to 'follow procedures'.

    > Management has now issued a new policy requiring *everyone* to log
    > the changes. Unfortunately, checking all the servers if the
    > administrators are living up to the policy is a very time-consuming
    > task.


    Discipline comes from inside, not from outside. (I think I got that
    from a fortune cookie.)

    > Unfortunately I can't use snmp, since that only lists services
    > *currently* running, no cronjobs and no configuration files etc.


    SNMP (or another monitoring system that uses SNMP) should be looked at to
    help monitor how things are running. The system administrators should
    be one of the first people to know when things aren't working
    properly. Something like Nagios doesn't cost a penny, and isn't too
    difficult to set up.

    > The servers are running different versions of Linux and FreeBSD.


    I would look at radmind:

    http://rsug.itd.umich.edu/software/radmind/

    Perhaps cfengine as well:

    http://www.cfengine.org/

    In another post I mention infrastructures.org; go through the mailing
    list archives as this has been discussed a couple of times.

    --
    David Magda
    Because the innovator has for enemies all those who have done well under
    the old conditions, and lukewarm defenders in those who may do well
    under the new. -- Niccolo Machiavelli, _The Prince_, Chapter VI

  5. Re: Software configuration management tool required

    David Magda wrote:

    > Vincent van Scherpenseel writes:
    >
    > > Our company has more than 100 servers running all different kinds of
    > > services which are currently all documented. The problem is: after a
    > > couple of months the piece of paper will be worthless if it doesn't
    > > get updated by the system administrators logging what they changed.

    >
    > For procedures, a wiki could be useful. A weblog where people can post
    > simple "heads up" notes about changes or planned changes could also
    > help.
    >
    > > There are several administrators working on the servers and the
    > > problem is that not everything which gets changed will be
    > > logged. People forget about it, or just don't care to log.

    >
    > A combination of restricting root access, using sudo, and something
    > like RCS/CVS/Subversion may encourage people to 'follow procedures'.
    >
    > > Management has now issued a new policy requiring *everyone* to log
    > > the changes. Unfortunately, checking all the servers if the
    > > administrators are living up to the policy is a very time-consuming
    > > task.

    >
    > Discipline comes from inside, not from outside. (I think I got that
    > from a fortune cookie.)
    >
    > > Unfortunately I can't use snmp, since that only lists services
    > > *currently* running, no cronjobs and no configuration files etc.

    >
    > SNMP (or another monitoring system that uses SNMP) should be looked at to
    > help monitor how things are running. The system administrators should
    > be one of the first people to know when things aren't working
    > properly. Something like Nagios doesn't cost a penny, and isn't too
    > difficult to set up.
    >
    > > The servers are running different versions of Linux and FreeBSD.

    >
    > I would look at radmind:
    >
    > http://rsug.itd.umich.edu/software/radmind/
    >
    > Perhaps cfengine as well:
    >
    > http://www.cfengine.org/
    >
    > In another post I mention infrastructures.org; go through the mailing
    > list archives as this has been discussed a couple of times.


    All the reporting systems in the world aren't going to help you at all
    if the admins aren't held accountable for any unreported changes they
    make. We had someone like this at my last contract but she was the
    boss's "favorite" and could do no wrong. So she got away with it. The
    rest of us had to do project plans, design review, and change control
    for every change. At least we could test stuff on the "test" systems.
    Development systems were essentially production environments where the
    developers lived and played. They didn't like the fact that they had to
    request changes that had to be reviewed but their boss just told them to
    shut up and he took the heat.

    One of the things they implemented was taking root away from everyone
    but line managers. They had sealed envelopes with root passwords that
    they only opened if there was an outage that precluded sudo or
    Powerbroker from running (networked sudo with a non-shelling vi). Since
    that was the only way to get root and it logged to a central system, we
    had records of changes people made. Another thing that was a lifesaver
    was always requiring a reboot after every change to ensure that (a) the
    change didn't screw anything up and it would boot and (b) no flakey
    hardware had cropped up. Otherwise, each system was usually up for 3-6
    months.

    Be prepared for staff turnover if you implement this type of
    environment. Some admins just won't like it and will rebel, either
    leaving or being fired. If you have older admins, they tend to like
    controlled environments like this. It allows for fewer emergency calls.

    --
    DeeDee, don't press that button! DeeDee! NO! Dee...




  6. Re: Software configuration management tool required

    On Thu, 18 Aug 2005 17:42:51 -0700, Michael Vilain wrote:
    >
    > All the reporting systems in the world aren't going to help you at all
    > if the admins aren't held accountable for any unreported changes they
    > make.


    Right. Which is why if you use a tool which _automates_ the
    documentation, it's the best way to accomplish it. If it automates
    documentation and makes the job easier (let the admin figure out "what
    to do" and then have a "and now go do that 100 times" button) it's more
    likely to be used and welcomed by your staff.

    > One of the things they implemented was taking root away from everyone
    > but line managers.


    I'm not sure what "line manager" means in your world, but here's how we
    do it - we _have_ the root passwords in an encrypted database, but the
    only time I use them is when I'm doing something that involves logging
    in to single-user mode (usually patch clusters). All the day-to-day
    work is done with sudo which is logged (individually on the servers at
    this time, so of limited value except for investigating if something
    went wrong).

    > Another thing that was a lifesaver
    > was always requiring a reboot after every change to ensure that (a) the
    > change didn't screw anything up and it would boot and (b) no flakey
    > hardware had cropped up.


    Every change? Ouch. Our policy is that if you do something that needs
    to start at boot, you test it by running the rc?.d script that init will
    run at boot, to start it up. Has been pretty successful, but we only
    have 6 Unix admins to keep honest, so it's not too bad.

    > Be prepared for staff turnover if you implement this type of
    > environment. Some admins just won't like it and will rebel, either
    > leaving or being fired. If you have older admins, they tend to like
    > controlled environments like this. It allows for fewer emergency calls.


    Well, that's why making it painless and pleasant is preferable to being
    dictatorial. But yeah, if people don't want to be accountable for what
    they do, that's a problem.

    Dave Hinz

  7. Re: Software configuration management tool required

    In article <3mmce5F17l8afU11@individual.net>,
    Dave Hinz wrote:

    > On Thu, 18 Aug 2005 17:42:51 -0700, Michael Vilain wrote:
    > >
    > > All the reporting systems in the world aren't going to help you at all
    > > if the admins aren't held accountable for any unreported changes they
    > > make.

    >
    > Right. Which is why if you use a tool which _automates_ the
    > documentation, it's the best way to accomplish it. If it automates
    > documentation and makes the job easier (let the admin figure out "what
    > to do" and then have a "and now go do that 100 times" button) it's more
    > likely to be used and welcomed by your staff.


    Here's the meat of the problem we had: how do you document changes to a
    system? Unless you have a snapshot before and after of every relevant
    file, software configuration (e.g. oracle layout on raw filesystems) and
    hardware configuration, how do you figure out what's changed via a
    script or program? For some of the big systems, it would take quite a
    long time to run an auditing program and a lot of storage to keep those
    records.
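
    One way to approach the before-and-after question is to store a digest
    per file instead of a full copy, which keeps the stored records small.
    This is only a sketch: the function names are invented, it covers plain
    files only (not raw filesystems or hardware), and the list of "relevant"
    paths is something each site would have to supply:

```python
import hashlib
from pathlib import Path


def snapshot(paths):
    """Map each existing regular file to the SHA-256 digest of its contents."""
    digests = {}
    for path in paths:
        p = Path(path)
        if p.is_file():  # paths that are missing or not regular files are skipped
            digests[str(p)] = hashlib.sha256(p.read_bytes()).hexdigest()
    return digests


def compare(before, after):
    """Report which paths were added, removed, or modified between snapshots."""
    return {
        "added": sorted(set(after) - set(before)),
        "removed": sorted(set(before) - set(after)),
        "modified": sorted(
            p for p in before.keys() & after.keys() if before[p] != after[p]
        ),
    }
```

    Taking a snapshot before a change window and another after it, then
    calling compare() on the two, shows which tracked files changed, though
    not why or by whom.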

    I was a fan of a "site logbook" for each system that required the admin
    to fill out a running dialog of what they were doing as they were doing
    it during a change (aka a "devolution"). But this model breaks down
    with a roomful of servers. We had a summer intern create an Access
    database that we all had to fill out _daily_ of what each of us did. It
    was mailed to managers and section heads daily. If there was a system
    change or outage scheduled for that night, there'd better be an entry in
    the database for it the next morning unless you're still working on it.
    It also helped the various shifts communicate with each other so we knew
    what happened last night just by reading the email (I also scanned log
    files just to check--I caught a few problems by noticing differences in
    "that's not what it's usually like").
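
    The daily-log idea can be sketched with SQLite standing in for the
    Access database. The table layout, function names, and sample entries
    below are assumptions made for illustration, not a description of the
    original database:

```python
import sqlite3


def open_log(db_path=":memory:"):
    """Open (or create) the change log; the schema is a hypothetical one."""
    conn = sqlite3.connect(db_path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS changes ("
        "day TEXT NOT NULL, admin TEXT NOT NULL, "
        "host TEXT NOT NULL, note TEXT NOT NULL)"
    )
    return conn


def record(conn, day, admin, host, note):
    """Add one log entry for the given day."""
    conn.execute("INSERT INTO changes VALUES (?, ?, ?, ?)", (day, admin, host, note))
    conn.commit()


def daily_digest(conn, day):
    """Build the text that would be mailed out for one day."""
    rows = conn.execute(
        "SELECT host, admin, note FROM changes "
        "WHERE day = ? ORDER BY host, admin, rowid",
        (day,),
    ).fetchall()
    return "\n".join(f"{host}: {admin}: {note}" for host, admin, note in rows)


def missing_entries(conn, day, scheduled_hosts):
    """List hosts that had scheduled work but no log entry for that day."""
    logged = {
        row[0]
        for row in conn.execute(
            "SELECT DISTINCT host FROM changes WHERE day = ?", (day,)
        )
    }
    return sorted(set(scheduled_hosts) - logged)
```

    The missing_entries() check corresponds to the rule that a scheduled
    change or outage should have an entry the next morning.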

    The automation method only really works if you have each system audited
    down to the serial number on each board in terms of hardware and total
    software configuration. Many of the servers were "one-offs" running a
    single application (e.g. finance or MRP or documentation or
    trouble-tickets or email & calendaring or the 100+ company-private web
    sites). Knowing what disks had what on them, their layout, memory, disk
    controllers, tape drives, and even the crontabs that ran nightly was all
    important and had to be tracked.

    >
    > > One of the things they implemented was taking root away from everyone
    > > but line managers.

    >
    > I'm not sure what "line manager" means in your world, but here's how we
    > do it - we _have_ the root passwords in an encrypted database, but the
    > only time I use them is when I'm doing something that involves logging
    > in to single-user mode (usually patch clusters). All the day-to-day
    > work is done with sudo which is logged (individually on the servers at
    > this time, so of limited value except for investigating if something
    > went wrong).


    Line managers at this place were those that managed the people that did
    things. They didn't do things themselves except to assign tasks,
    prioritize and go to endless meetings. They may have done the grunt
    work some years ago, but have since become a manager of grunts.

    >
    > > Another thing that was a lifesaver
    > > was always requiring a reboot after every change to ensure that (a) the
    > > change didn't screw anything up and it would boot and (b) no flakey
    > > hardware had cropped up.

    >
    > Every change? Ouch. Our policy is that if you do something that needs
    > to start at boot, you test it by running the rc?.d script that init will
    > run at boot, to start it up. Has been pretty successful, but we only
    > have 6 Unix admins to keep honest, so it's not too bad.


    Well, some developers are rather blithe about changing system parameters
    because Oracle or some vendor tells them to do so. Some of those
    parameters affect things like the SGA or the maximum open files. We had
    no problem changing them on the development systems overnight with a
    reboot to test if the change screwed up the startup of Oracle or the
    backups or other stuff. It was really a life saver.

    Stuff like adding printers or accounts or day-to-day stuff that's in
    written procedures wasn't at issue. It was system changes that merited
    the reboot verification. Developers changing the Oracle environment had
    the wrath of their boss to deal with, and that wasn't really our problem.
    We'd sort of get bent out of shape when they changed something that
    caused Oracle to fail to restart after backups.

    >
    > > Be prepared for staff turnover if you implement this type of
    > > environment. Some admins just won't like it and will rebel, either
    > > leaving or being fired. If you have older admins, they tend to like
    > > controlled environments like this. It allows for fewer emergency calls.

    >
    > Well, that's why making it painless and pleasant is preferable to being
    > dictatorial. But yeah, if people don't want to be accountable for what
    > they do, that's a problem.
    >
    > Dave Hinz


    Well, the last contract _was_ rather dictatorial about such things.

    --
    DeeDee, don't press that button! DeeDee! NO! Dee...




  8. Re: Software configuration management tool required

    On Fri, 19 Aug 2005 13:19:35 -0700, Michael Vilain wrote:
    > In article <3mmce5F17l8afU11@individual.net>,
    > Dave Hinz wrote:


    >> Right. Which is why if you use a tool which _automates_ the
    >> documentation, it's the best way to accomplish it. If it automates
    >> documentation and makes the job easier (let the admin figure out "what
    >> to do" and then have a "and now go do that 100 times" button) it's more
    >> likely to be used and welcomed by your staff.

    >
    > Here's the meat of the problem we had: how do you document changes to a
    > system?


    Document, or document _usably_? Sadly, not a lot of overlap.

    > Unless you have a snapshot before and after of every relevant
    > file, software configuration (e.g. oracle layout on raw filesystems) and
    > hardware configuration, how do you figure out what's changed via a
    > script or program?


    Right. A centralized tool that allows you to make the changes, provides
    snapshots, and easy backout and automation would be the ideal. From
    their claims, BladeLogic is just that, and the fact that a trusted
    friend who now works for them is still enthusiastic about it leads me to
    believe that it's more true than "marketing fluff".


    > For some of the big systems, it would take quite a
    > long time to run an auditing program and a lot of storage to keep those
    > records.


    Well, if all changes are made by a mechanism that tracks, then you know
    what changes are made, by definition. It's a different way of working,
    though, and the best way to get something like that adopted is to have
    using it be less work than doing it the normal way. If it's harder
    _and_ a hassle, it'll get ignored.

    > I was a fan of a "site logbook" for each system that required the admin
    > to fill out a running dialog of what they were doing as they were doing
    > it during a change (aka a "devolution"). But this model breaks down
    > with a roomful of servers. We had a summer intern create an Access
    > database that we all had to fill out _daily_ of what each of us did. It
    > was mailed to managers and section heads daily. If there was a system
    > change or outage scheduled for that night, there'd better be an entry in
    > the database for it the next morning unless you're still working on it.


    Or unless it's Monday and you completely forgot what you did Friday.

    > It also helped the various shifts communicate with each other so we knew
    > what happened last night just by reading the email (I also scanned log
    > files just to check--I caught a few problems by noticing differences in
    > "that's not what it's usually like").


    I wish I had the time to know my logfiles personally, but with 6 guys
    and 100-ish servers, it's just not going to happen. Hell, I can't even
    remember all the sites we host anymore.

    > The automation method only really works if you have each system audited
    > down to the serial number on each board in terms of hardware and total
    > software configuration. Many of the servers were "one-offs" running a
    > single application (e.g. finance or MRP or documentation or
    > trouble-tickets or email & calendaring or the 100+ company-private web
    > sites).


    Well, to some extent, I think. Again, we don't have it in yet so I'm
    somewhat speculating, but...our webserver cluster is a series of
    identical enough boxes. If, for instance, I want to turn off some
    encryption method on all apache instances...let's see, that's close to
    100 of 'em. Too many files to edit by hand for my comfort. In that
    case, the boxes don't need to be identical, and the files are _not_
    identical, but on all of 'em I need to change the line which says
    "blah +blurgh"
    ...to just say
    "blurgh"

    Sure, I could do some foreach server in (list) type thing, but there's
    no tracking. If I use the tool for it, it's tracked, the old version of
    the file is saved, and I can revert if I need to. All of these things
    are, of course, scriptable, this just puts a framework and a boatload of
    sample scripts to start with.
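
    As a rough illustration of what "tracked, old version saved, revertible"
    could mean for a single file, here is a sketch that has nothing to do
    with how BladeLogic works internally. The function names, the backup
    naming scheme, and the log format are all assumptions, and the
    "foreach server" part (getting this to run on each box) is left out:

```python
import shutil
import time
from pathlib import Path


def tracked_replace(config_path, old_text, new_text, admin, log_path):
    """Replace old_text with new_text in one file, with a backup and a log entry.

    Returns the backup path, or None if the file did not contain old_text.
    """
    config = Path(config_path)
    original = config.read_text()
    if old_text not in original:
        return None  # nothing to change, so nothing to back up or log
    stamp = time.strftime("%Y%m%d-%H%M%S")
    backup = config.with_name(f"{config.name}.{stamp}.bak")  # hypothetical naming
    shutil.copy2(config, backup)  # keep the old version so it can be restored
    config.write_text(original.replace(old_text, new_text))
    with open(log_path, "a") as log:
        log.write(
            f"{stamp} {admin} {config} {old_text!r} -> {new_text!r} backup={backup}\n"
        )
    return backup


def revert(config_path, backup_path):
    """Restore a file from the backup made by tracked_replace."""
    shutil.copy2(backup_path, config_path)
```

    With the example above, the call would be something like
    tracked_replace(path, "blah +blurgh", "blurgh", "dave", "changes.log"),
    where the path, admin name, and log file name are placeholders.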

    > Knowing what disks had what on them, their layout, memory, disk
    > controllers, tape drives, and even the crontabs that ran nightly was all
    > important and had to be tracked.


    Yup.

    >> I'm not sure what "line manager" means in your world, but here's how we
    >> do it - we _have_ the root passwords in an encrypted database, but the
    >> only time I use them is when I'm doing something that involves logging
    >> in to single-user mode (usually patch clusters). All the day-to-day
    >> work is done with sudo which is logged (individually on the servers at
    >> this time, so of limited value except for investigating if something
    >> went wrong).


    > Line managers at this place were those that managed the people that did
    > things. They didn't do things themselves except to assign tasks,
    > prioritize and go to endless meetings. They may have done the grunt
    > work some years ago, but have since become a manager of grunts.


    Ah. The position I'm trying to avoid, at least for now. Got it.

    >> Every change? Ouch. Our policy is that if you do something that needs
    >> to start at boot, you test it by running the rc?.d script that init will
    >> run at boot, to start it up. Has been pretty successful, but we only
    >> have 6 Unix admins to keep honest, so it's not too bad.


    > Well, some developers are rather blithe about changing system parameters
    > because Oracle or some vendor tells them to do so. Some of those
    > parameters affect things like the SGA or the maximum open files. We had
    > no problem changing them on the development systems overnight with a
    > reboot to test if the change screwed up the startup of Oracle or the
    > backups or other stuff. It was really a life saver.


    Something to consider, anyway, yes.

    >> Well, that's why making it painless and pleasant is preferable to being
    >> dictatorial. But yeah, if people don't want to be accountable for what
    >> they do, that's a problem.


    > Well, the last contract _was_ rather dictatorial about such things.


    Customer gets to set the rules, after all. We've got some, well, let's
    just say large financial institutions whose names probably appear "in
    your wallet" that we deal with, and the demands of some of them are
    pretty strict. It's doubly ironic when those same companies show up on
    the front page of the WSJ for data security breaches, which, if they
    followed what they force us to follow, couldn't happen.

    Topic drift anyone? Sorry about that.


  9. Re: Software configuration management tool required

    Vincent van Scherpenseel writes:

    > Is there any software out there which is able to check remote servers on
    > their running services and their configuration? I need to know which
    > services are running, where their configuration lives, where they're
    > logging to, where their data is stored (if any), what their dependencies
    > are, which cronjobs are planned and when, ...


    Try ServDoc
    http://servdoc.sourceforge.net/

    It documents many "standard" services, configurations, and so on.
    All you need to do is to run it (it's just one perl script) on a
    regular basis and collect the results centrally.
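
    Once the reports are collected centrally, spotting undocumented changes
    comes down to comparing each host's latest report with the previous one.
    The sketch below assumes the reports are plain text keyed by host name.
    It does not use or describe ServDoc's own output format, and the
    function names are invented:

```python
import difflib


def report_changes(host, previous_report, current_report):
    """Return unified-diff lines between two text reports for one host."""
    return list(
        difflib.unified_diff(
            previous_report.splitlines(),
            current_report.splitlines(),
            fromfile=f"{host} (previous)",
            tofile=f"{host} (current)",
            lineterm="",
        )
    )


def hosts_with_changes(previous, current):
    """Given {host: report_text} for two runs, list hosts whose report differs."""
    return sorted(h for h in current if previous.get(h) != current[h])
```

    A host that shows up in hosts_with_changes() without a matching entry
    in the change log is a candidate for a closer look.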

    It's easy to add documentation for new services.
    Or ask the maintainer :-)

    Uli

    --
    '''
    (0 0)
    +------oOO----(_)--------------+
    | |
    | Ulrich Herbst |
    | |
    | Tel. ++49-7271-940775 |
    | |
    | Ulrich.Herbst@gmx.de |
    +-------------------oOO--------+
    |__|__|
    || ||
    ooO Ooo

  10. Re: Software configuration management tool required

    On 2005-08-18, David Magda wrote:
    [snip!]
    >
    > I would look at radmind:
    >
    > http://rsug.itd.umich.edu/software/radmind/
    >
    > Perhaps cfengine as well:
    >
    > http://www.cfengine.org/
    >
    > In another post I mention infrastructures.org; go through the mailing
    > list archives as this has been discussed a couple of times.


    Thanks for the links and sorry for re-awakening a rather old thread,
    'twas interesting enough to drop the question here;

    Are there experiences with the arusha project/ARK here? I had a shot
    at it once, but then that test-server got usurped in one of the many
    other projects and I kinda forgot about it. It is python based which I
    don't like too much, but if the benefits are big enough that's easily
    overlooked, of course.

    http://ark.sourceforge.net/


    --
    j p d (at) d s b (dot) t u d e l f t (dot) n l .
