How often should I reboot Solaris and LynxOS
Hi,
Would someone please give me some pointers on a trend analysis on resource
leaks for Solaris 9 and LynxOS (or in general on UNIX machines). Basically,
I need statistics to determine how often I need to restart these machines to
avoid unplanned failures. Thanks.
ekl
Re: How often should I reboot Solaris and LynxOS
On Thu, 24 Jul 2003 11:18:08 -0400,
EKL <En-Kuang_Lung@raytheon.com>, in
<QMSTa.3816$c6.3317@bos-service2.ext.raytheon.com> wrote:
+> Would someone please give me some pointers on a trend analysis on resource
+> leaks for Solaris 9 and LynxOS (or in general on UNIX machines). Basically,
+> I need statistics to determine how often I need to restart these machines to
+> avoid unplanned failures. Thanks.
Ummm...you don't? I can't speak to LynxOS, but I've had Solaris boxen
with uptimes on the order of 4-6 months. If they weren't connected to
the commodity internet, and the power stayed on, and I wasn't so
zealous in applying the recommended and security patches, I could
probably get uptimes in the *years*.
What you have to worry about are buggy applications doing bad things
to your system resources.
James
--
Consulting Minister for Consultants, DNRC
I can please only one person per day. Today is not your day. Tomorrow
isn't looking good, either.
I am BOFH. Resistance is futile. Your network will be assimilated.
Re: How often should I reboot Solaris and LynxOS
> +> Would someone please give me some pointers on a trend analysis on resource
> +> leaks for Solaris 9 and LynxOS (or in general on UNIX machines). Basically,
> +> I need statistics to determine how often I need to restart these machines to
> +> avoid unplanned failures. Thanks.
>
> Ummm...you don't? I can't speak to LynxOS, but I've had Solaris boxen
> with uptimes on the order of 4-6 months. If they weren't connected to
> the commodity internet, and the power stayed on, and I wasn't so
> zealous in applying the recommended and security patches, I could
> probably get uptimes in the *years*.
>
> What you have to worry about are buggy applications doing bad things
> to your system resources.
Thanks, James. Even if you don't need to reboot for a long time, do
you notice any performance degradation over time, or unexplained resource
losses or usage? Thanks.
Re: How often should I reboot Solaris and LynxOS
On Thu, 24 Jul 2003, EKL wrote:
> Would someone please give me some pointers on a trend analysis on resource
> leaks for Solaris 9 and LynxOS (or in general on UNIX machines). Basically,
> I need statistics to determine how often I need to restart these machines to
> avoid unplanned failures. Thanks.
I don't know about LynxOS, but you reboot a Solaris machine
when you add hardware that isn't hot swappable, or when you
apply one or more patches that need the machine to be rebooted
to take effect (e.g., the kernel jumbo patch). Depending on
your paranoia level, you should probably plan on applying the
latest recommended patch cluster roughly once a quarter.
Rebooting Solaris machines "just because" is neither necessary
nor desirable. It's a habit from people who look after Windoze
machines.
--
Rich Teer, SCNA, SCSA
President,
Rite Online Inc.
Voice: +1 (250) 979-1638
URL: http://www.rite-online.net
Re: How often should I reboot Solaris and LynxOS
In article <Y%STa.3818$c6.3326@bos-service2.ext.raytheon.com>,
"EKL" <En-Kuang_Lung@raytheon.com> wrote:
> > +> Would someone please give me some pointers on a trend analysis on resource
> > +> leaks for Solaris 9 and LynxOS (or in general on UNIX machines). Basically,
> > +> I need statistics to determine how often I need to restart these machines to
> > +> avoid unplanned failures. Thanks.
> >
> > Ummm...you don't? I can't speak to LynxOS, but I've had Solaris boxen
> > with uptimes on the order of 4-6 months. If they weren't connected to
> > the commodity internet, and the power stayed on, and I wasn't so
> > zealous in applying the recommended and security patches, I could
> > probably get uptimes in the *years*.
> >
> > What you have to worry about are buggy applications doing bad things
> > to your system resources.
>
> Thanks, James. Even if you don't need to reboot for a long time, do
> you notice any performance degradation over time, or unexplained resource
> losses or usage? Thanks.
That only happens when you let evil developers use your systems. They
can write the worst stuff that screws up your filesystems and not clean
up after themselves. Then they want hourly backups of all their files
in case they screw up. If you just run without users, your servers can
go for a _very long_ time without rebooting. If the applications and system
are designed properly, the day-to-day procedures to run the system
are established and maintained, and the environment is OK (e.g.
datacenter with A/C, UPS, and backup generator), you shouldn't need to
reboot your systems for anything less than a hardware failure.
Obviously, that's not possible. You ask an overly-broad, marketing-type
question without specifics of your situation. You must be responding to
something a PHB asked you to find out.
The "general rule" of UNIX is "don't reboot unless you have to (or
you're lazy and don't want to figure out how to fix your problem)".
Sysadmins are forever quoting the longest "uptime" on their systems as a
matter of pride.
In common practice, scheduling downtime at least once a month gets the
users used to the idea of a "downtime" and gives you headroom to plan
projects that require an outage--hardware and software upgrades,
testing, or whatever.
If you can't do this, buy lots of hardware and build a fault tolerant
system. Bring your checkbook.
--
DeeDee, don't press that button! DeeDee! NO! Dee...
Re: How often should I reboot Solaris and LynxOS
EKL wrote:
> Hi,
>
> Would someone please give me some pointers on a trend analysis on resource
> leaks for Solaris 9 and LynxOS (or in general on UNIX machines). Basically,
> I need statistics to determine how often I need to restart these machines to
> avoid unplanned failures. Thanks.
>
> ekl
Dunno Solaris 9 specifically, but if there are no application leaks,
uptime is more determined by the need to perform hardware and/or
software maintenance/upgrades rather than any inherent need.
Have seen servers with uptimes in excess of a year and doubt if
that is a record....hopefully not starting a "longest uptime"
subthread.
Re: How often should I reboot Solaris and LynxOS
"Michael Vilain <vilain@spamcop.net>" wrote in
news:news-0A1283.10123524072003@news.tdl.com:
> If you just run without users
Yup, it's them darned users that are the problem!
Re: How often should I reboot Solaris and LynxOS
In article <QMSTa.3816$c6.3317@bos-service2.ext.raytheon.com>,
En-Kuang_Lung@raytheon.com says...
> Hi,
>
> Would someone please give me some pointers on a trend analysis on resource
> leaks for Solaris 9 and LynxOS (or in general on UNIX machines). Basically,
> I need statistics to determine how often I need to restart these machines to
> avoid unplanned failures. Thanks.
For Solaris: Never. (Unless there's some application software that's
always running and leaking memory.)
At my last job, we had an old Sparc 20 that had been running
continuously for four years before we shut it down and replaced it with
an Ultra 10. I mean literally continuously -- the "uptime" command said
something like 1481 days. The Cisco router on the same subnet had been
running continuously for even longer.
'Course, that Sparc 20 wasn't running Solaris 9, obviously. It was
Solaris 2.5.1. I presume that Solaris 9 is just as stable as 2.5.1
is/was.
Re: How often should I reboot Solaris and LynxOS
In article <news-0A1283.10123524072003@news.tdl.com>, "Michael Vilain
<vilain@spamcop.net>" says...
> The "general rule" of UNIX is "don't reboot unless you have to (or
> you're lazy and don't want to figure out how to fix your problem)".
Hey, you'd be surprised how often a reboot will fix a particularly pesky
problem. Just ask your friendly neighborhood MCSE.
> Sysadmins are forever quoting the longest "uptime" on their systems as a
> matter of pride.
Heh.. 1,481 days. Read it and weep.
> In common practice, scheduling downtime at least once a month gets the
> users used to the idea of a "downtime" and gives you headroom to plan
> projects that require an outage--hardware and software upgrades,
> testing, or whatever.
And it gives you a chance to install the latest set of recommended and
security patches, after which a reboot is usually a good idea.
Re: How often should I reboot Solaris and LynxOS
"EKL" <En-Kuang_Lung@raytheon.com> wrote:
]Would someone please give me some pointers on a trend analysis on resource
]leaks for Solaris 9 and LynxOS (or in general on UNIX machines). Basically,
]I need statistics to determine how often I need to restart these machines to
]avoid unplanned failures. Thanks.
For the statistics, I run a program that checks various things, such as swap
usage, swap activity, CPU usage, disk usage, i-node usage, network errors, etc.,
and logs them using rrdtool. I then display daily and yearly stats so I can
spot trends and predict when a filesystem will fill up, etc. This works very
well.
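A minimal sketch of the kind of trend prediction described above, in Python rather than rrdtool: fit a straight line to (day, used) samples and extrapolate to the day the filesystem fills. The function names, the sample data, and the assumption of linear growth are all illustrative and are not taken from the poster's actual program.

```python
import shutil
import time


def sample_disk(path="/"):
    """Return (timestamp_seconds, used_bytes, total_bytes) for one filesystem."""
    usage = shutil.disk_usage(path)
    return time.time(), usage.used, usage.total


def days_until_full(samples, capacity):
    """Estimate days from the last sample until usage reaches `capacity`.

    samples: list of (day, used) pairs covering at least two distinct days.
    Returns None when usage is flat or shrinking (no fill date to predict).
    """
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")
    mean_x = sum(x for x, _ in samples) / n
    mean_y = sum(y for _, y in samples) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in samples)
    if sxx == 0:
        raise ValueError("samples must span more than one point in time")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in samples)
    slope = sxy / sxx  # least-squares growth per day
    if slope <= 0:
        return None
    intercept = mean_y - slope * mean_x
    full_day = (capacity - intercept) / slope
    last_day = max(x for x, _ in samples)
    return max(0.0, full_day - last_day)


if __name__ == "__main__":
    # Hypothetical data: usage grows by 2 units a day, starting from 10.
    history = [(0, 10), (1, 12), (2, 14), (3, 16)]
    print(days_until_full(history, capacity=100))
```

With the made-up history shown, the line crosses a capacity of 100 on day 45, so the estimate is 42 days after the last sample. The same function works for swap or i-node counts, as long as the samples are logged at known times.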
For re-boots - many will say that you don't need to. From experience, you don't
need regular reboots, but if you install an application or patch, you should
consider scheduling an attended re-boot shortly afterwards when it suits
everyone. You want to make sure it still boots cleanly in case of an
unscheduled unattended reboot later on. You want to make sure that everything
starts up as it should from a re-boot.
Re: How often should I reboot Solaris and LynxOS
Frank-Christian Kruegel wrote:
> On Thu, 24 Jul 2003 11:18:08 -0400, "EKL" <En-Kuang_Lung@raytheon.com> wrote:
>
>>Would someone please give me some pointers on a trend analysis on resource
>>leaks for Solaris 9 and LynxOS (or in general on UNIX machines). Basically,
>>I need statistics to determine how often I need to restart these machines to
>>avoid unplanned failures. Thanks.
>
> My uptime record was 611 days on a Netra T1. On the 612th day I had to swap both
> hard disks. No problems during that time.
See, that's what happens when you allow the hard disks to hit
the 611.5 day timer, at which point they become so fragmented
not even Humpty Dumpty can put them back together.
The only issue I can recall is that after "N" days of operation
the file modification dates would go bogus. Unfortunately I
can't remember what "N" was (240 days?), or whether this was a
SunOS 4.x feature or only on a machine based on SunOS 4.x.
Re: How often should I reboot Solaris and LynxOS
Lon Stowell <lon.stowell@comcast.net> writes:
> The only issue I can recall is that after "N" days of operation
> the file modification dates would go bogus. Unfortunately
> can't remember what "N" was ?240 days? or if this was a
> SunOS 4.x feature or only on a machine based on SunOS 4.x
Sure this wasn't the "248 days, I hang" bug in Solaris 2.x?
The comments in the bug report were something like "customer may
lose satellite if this happens again".
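For what it's worth, a figure of roughly 248 days is what you get if a signed 32-bit counter is incremented 100 times a second. The thread does not say that this was the cause, so the 100 Hz tick rate and the counter width below are assumptions; the sketch only shows the arithmetic.

```python
SECONDS_PER_DAY = 86_400


def days_until_wrap(bits, ticks_per_second, signed=True):
    """Days until a tick counter of the given width overflows."""
    # A signed counter has one bit less of positive range than an unsigned one.
    max_ticks = 2 ** (bits - 1) if signed else 2 ** bits
    return max_ticks / ticks_per_second / SECONDS_PER_DAY


# Assumed values: 32-bit signed counter, 100 ticks per second.
print(round(days_until_wrap(32, 100), 2))  # 248.55
```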
Casper
--
Expressed in this posting are my opinions. They are in no way related
to opinions held by my employer, Sun Microsystems.
Statements on Sun products included here are not gospel and may
be fiction rather than truth.
Re: How often should I reboot Solaris and LynxOS
I R A Darth Aggie wrote:
> On 25 Jul 2003 12:49:10 GMT,
> Casper H.S. Dik <Casper.Dik@Sun.COM>, in
> <3f212746$0$49103$e4fe514c@news.xs4all.nl> wrote:
>
> +> The comments in the bug report were something like "customer may
> +> lose satellite if this happens again".
>
> Oh, great, so NASA already has an excuse if they lose one (or more) of
> the recently launched Mars probes??
Nah. It was a company who run telecommunications satellites. For some
reason they get a bit annoyed if they lose their geostationary
satellites every 248 days.
That got fixed back in Solaris 2.6, I think, and patched back as far
as about 2.3.
--
Tony
Re: How often should I reboot Solaris and LynxOS
On Fri, 25 Jul 2003 15:36:48 +0100,
Tony Walton <tony.walton@s-u-n.com>, in
<3F214080.5020909@s-u-n.com> wrote:
+> I R A Darth Aggie wrote:
+> > On 25 Jul 2003 12:49:10 GMT,
+> > Casper H.S. Dik <Casper.Dik@Sun.COM>, in
+> > <3f212746$0$49103$e4fe514c@news.xs4all.nl> wrote:
+>
+> > +>
+> > +> The comments in the bug report were something like "customer may
+> > +> lose satellite if this happens again".
+> >
+> > Oh, great, so NASA already has an excuse if they lose one (or more) of
+> > the recently launched Mars probes??
+>
+> Nah. It was a company who run telecommunications satellites. For some
+> reason they get a bit annoyed if they lose their geostationary
+> satellites every 248 days.
1. Yeah, I imagine they would get a little peeved if a multi-million
dollar platform went missing...
2. How do you lose a geostationary satellite? by definition, you know
exactly where it is, all the time...unless the computer in question
would send out orbital correction commands to the satellite(s)!!
James
--
Consulting Minister for Consultants, DNRC
I can please only one person per day. Today is not your day. Tomorrow
isn't looking good, either.
I am BOFH. Resistance is futile. Your network will be assimilated.
Re: How often should I reboot Solaris and LynxOS
"EKL" <En-Kuang_Lung@raytheon.com> wrote in message news:<QMSTa.3816$c6.3317@bos-service2.ext.raytheon.com>...
> Hi,
>
> Would someone please give me some pointers on a trend analysis on resource
> leaks for Solaris 9 and LynxOS (or in general on UNIX machines). Basically,
> I need statistics to determine how often I need to restart these machines to
> avoid unplanned failures. Thanks.
>
> ekl
In general, operating systems do not leak resources. That goes for
UNIX (all brands, including Solaris) and other OSs (VMS, Primos,
Multics etc). For maximum reliability just leave the machine switched
on! Perhaps you have been using Microsoft-Windows?
-apm
Re: How often should I reboot Solaris and LynxOS
I R A Darth Aggie wrote:
> 2. How do you lose a geostationary satellite? by definition, you know
> exactly where it is, all the time...unless the computer in question
> would send out orbital correction commands to the satellite(s)!!
That was pretty much the case, IIRC.
--
Tony