Where to ask about a machine lockup - Ubuntu
-
Where to ask about a machine lockup
A machine running Ubuntu Gutsy locked up. It runs some of our
apps, and at the time of the lockup (midnight on a weekend) it was not
running anything.
I have some interesting oom-killer messages from /var/log/messages and
wanted to ask a mailing list or similar place where there are some
knowledgeable people. I am very worried about this and want to get to
the bottom of this issue. If it locks up in the middle of the day, we
will be very screwed.
So.
Where can I ask?
i
--
Due to extreme spam originating from Google Groups, and their inattention
to spammers, I and many others block all articles originating
from Google Groups. If you want your postings to be seen by
more readers you will need to find a different means of
posting on Usenet.
http://improve-usenet.org/
-
Re: Where to ask about a machine lockup
On Mon, 21 Jul 2008 09:58:30 -0500, Ignoramus14558 wrote:
> A very machine running Ubuntu Gutsy locked up. It runs some of our apps
> and at the time of lockup (midning on weekend) it was not running
> anything.
>
> I have some interesting oom-killer messages from /var/log/messages and
> wanted to ask some mailing list or something where there are some
> knowledgeable people. I am very worried about this and want to get to
> the bottom of this issue, if it locks up in themiddle of the day, we
> will be very screwed.
>
> So.
>
> Where can I ask?
>
> i
You could probably ask here, but more details would be in order.
Otherwise, try some of the resources hosted at Ubuntu.
-
Re: Where to ask about a machine lockup
Ignoramus14558 wrote:
> A very machine running Ubuntu Gutsy locked up. It runs some of our apps
> and at the time of lockup (midning on weekend) it was not running
> anything.
I run Red Hat Enterprise Linux 5 on this machine and CentOS 4 on my other
machine. Neither machine has ever locked up and one is over 8 years old and
the other is over 4 years old and they usually run 24/7. Since most
distributions run similar kernels, it is my opinion that it is most
improbable that you are experiencing a software error. I started running
Linux in 1998 with Red Hat Linux 5.0 and 5.2. These locked up once in a
while, usually in the X window system; i.e., the kernel would still be
running OK. Once in a while the kernel would crash. I then ran Red Hat Linux
6.0 and 6.2. I do not recall if they ever crashed or not. But since I
started running Red Hat Linux 7.3 (long since obsolete), I have never had a
crash.
So if your machine is a *86 machine, I suggest the first thing to do is run
memtest86 on it, at least overnight, because memory problems are frequent
for some people, and even if you are not having memory hardware problems, it
would be good to know for sure that you are not having them.
http://www.memtest86.com/
>
> I have some interesting oom-killer messages
I do not know what oom-killer messages are. Are you running out of memory
and the kernel is killing processes to get some? How much memory has your
machine? What processes do you run? How is the memory being used? Do you
have enough swap space? Usually if you have enough swap space, you will not
run out of memory; your performance may be lousy due to thrashing, but it
should not lock up or give you out of memory messages.
For example, here is the memory usage on my machine right now:
$ free
             total       used       free     shared    buffers     cached
Mem:       8185240    7697004     488236          0     183536    6427524
-/+ buffers/cache:    1085944    7099296
Swap:      4096496         40    4096456
You can see I am by no means out of memory. What does your machine say when
running? You might set up cron to run the _free_ command every few minutes
and log it. Then when your machine locks up or crashes, you could see if
something is hogging lots of memory.
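As a worked check of the figures above, the "-/+ buffers/cache" row is simply the Mem row with buffers and cached moved from "used" to "free". A minimal sketch (the function name free_summary is made up for illustration; the numbers are the kB values from the output above):

```python
def free_summary(used, free, buffers, cached):
    """Return (really_used, really_free) as shown in the '-/+ buffers/cache' row."""
    reclaimable = buffers + cached  # memory the kernel can give back on demand
    return used - reclaimable, free + reclaimable


# Values in kB, copied from the Mem: row of the `free` output above.
really_used, really_free = free_summary(
    used=7697004, free=488236, buffers=183536, cached=6427524
)
print(really_used, really_free)  # 1085944 7099296
```

So although "used" looks close to the total, only about 1 GB is held by processes; the rest is cache that can be dropped when needed.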
> from /var/log/messages and wanted to ask some mailing list or something
> where there are some knowledgeable people. I am very worried about this
> and want to get to the bottom of this issue, if it locks up in themiddle
> of the day, we will be very screwed.
>
> So.
>
> Where can I ask?
>
You might ask right here.
--
.~. Jean-David Beyer Registered Linux User 85642.
/V\ PGP-Key: 9A2FC99A Registered Machine 241939.
/( )\ Shrewsbury, New Jersey http://counter.li.org
^^-^^ 11:10:01 up 15:58, 4 users, load average: 4.22, 4.26, 4.26
-
Re: Where to ask about a machine lockup
On 2008-07-21, ray wrote:
> On Mon, 21 Jul 2008 09:58:30 -0500, Ignoramus14558 wrote:
>
>> A very machine running Ubuntu Gutsy locked up. It runs some of our apps
>> and at the time of lockup (midning on weekend) it was not running
>> anything.
>>
>> I have some interesting oom-killer messages from /var/log/messages and
>> wanted to ask some mailing list or something where there are some
>> knowledgeable people. I am very worried about this and want to get to
>> the bottom of this issue, if it locks up in themiddle of the day, we
>> will be very screwed.
>>
>> So.
>>
>> Where can I ask?
>>
>> i
>
> You could probably ask here, but more details would be in order.
> Otherwise, try some of the resources hosted at Ubuntu.
Well, here's the /var/log/messages entry (one of several that occurred
during several minutes). Note that the system has 8 GB of RAM, 5 GB of
swap, has NOT used any swap (see messages below) and just looks to be
in tip top shape. So why oom-killer?
The one interesting message I found is
Normal free:3700kB min:3744kB low:4680kB high:5616kB active:256kB inactive:76kB present:894080kB pages_scanned:506 all_unreclaimable? yes
######################################################################
Jul 20 00:22:33 my_server kernel: [373891.820009] perl invoked oom-killer: gfp_mask=0xd0, order=0, oomkilladj=0
Jul 20 00:22:35 my_server kernel: [373891.820022] [out_of_memory+389/448] out_of_memory+0x185/0x1c0
Jul 20 00:22:35 my_server kernel: [373891.820033] [__alloc_pages+700/784] __alloc_pages+0x2bc/0x310
Jul 20 00:22:35 my_server kernel: [373891.820036] [vma_link+96/256] vma_link+0x60/0x100
Jul 20 00:22:35 my_server kernel: [373891.820044] [__get_free_pages+56/64] __get_free_pages+0x38/0x40
Jul 20 00:22:35 my_server kernel: [373891.820046] [proc_info_read+69/192] proc_info_read+0x45/0xc0
Jul 20 00:22:35 my_server kernel: [373891.820053] [vfs_read+188/352] vfs_read+0xbc/0x160
Jul 20 00:22:35 my_server kernel: [373891.820057] [proc_info_read+0/192] proc_info_read+0x0/0xc0
Jul 20 00:22:35 my_server kernel: [373891.820060] [sys_read+65/112] sys_read+0x41/0x70
Jul 20 00:22:35 my_server kernel: [373891.820065] [sysenter_past_esp+107/161] sysenter_past_esp+0x6b/0xa1
Jul 20 00:22:35 my_server kernel: [373891.820071] =======================
Jul 20 00:22:35 my_server kernel: [373891.820074] Mem-info:
Jul 20 00:22:35 my_server kernel: [373891.820076] DMA per-cpu:
Jul 20 00:22:35 my_server kernel: [373891.820077] CPU 0: Hot: hi: 0, btch: 1 usd: 0 Cold: hi: 0, btch: 1 usd: 0
Jul 20 00:22:35 my_server kernel: [373891.820081] CPU 1: Hot: hi: 0, btch: 1 usd: 0 Cold: hi: 0, btch: 1 usd: 0
Jul 20 00:22:35 my_server kernel: [373891.820084] CPU 2: Hot: hi: 0, btch: 1 usd: 0 Cold: hi: 0, btch: 1 usd: 0
Jul 20 00:22:35 my_server kernel: [373891.820086] CPU 3: Hot: hi: 0, btch: 1 usd: 0 Cold: hi: 0, btch: 1 usd: 0
Jul 20 00:22:35 my_server kernel: [373891.820088] CPU 4: Hot: hi: 0, btch: 1 usd: 0 Cold: hi: 0, btch: 1 usd: 0
Jul 20 00:22:35 my_server kernel: [373891.820092] CPU 5: Hot: hi: 0, btch: 1 usd: 0 Cold: hi: 0, btch: 1 usd: 0
Jul 20 00:22:35 my_server kernel: [373891.820095] CPU 6: Hot: hi: 0, btch: 1 usd: 0 Cold: hi: 0, btch: 1 usd: 0
Jul 20 00:22:35 my_server kernel: [373891.820098] CPU 7: Hot: hi: 0, btch: 1 usd: 0 Cold: hi: 0, btch: 1 usd: 0
Jul 20 00:22:35 my_server kernel: [373891.820101] Normal per-cpu:
Jul 20 00:22:35 my_server kernel: [373891.820103] CPU 0: Hot: hi: 186, btch: 31 usd: 135 Cold: hi: 62, btch: 15 usd: 57
Jul 20 00:22:35 my_server kernel: [373891.820106] CPU 1: Hot: hi: 186, btch: 31 usd: 115 Cold: hi: 62, btch: 15 usd: 57
Jul 20 00:22:35 my_server kernel: [373891.820110] CPU 2: Hot: hi: 186, btch: 31 usd: 130 Cold: hi: 62, btch: 15 usd: 60
Jul 20 00:22:35 my_server kernel: [373891.820113] CPU 3: Hot: hi: 186, btch: 31 usd: 129 Cold: hi: 62, btch: 15 usd: 47
Jul 20 00:22:35 my_server kernel: [373891.820117] CPU 4: Hot: hi: 186, btch: 31 usd: 29 Cold: hi: 62, btch: 15 usd: 53
Jul 20 00:22:35 my_server kernel: [373891.820119] CPU 5: Hot: hi: 186, btch: 31 usd: 92 Cold: hi: 62, btch: 15 usd: 60
Jul 20 00:22:35 my_server kernel: [373891.820123] CPU 6: Hot: hi: 186, btch: 31 usd: 170 Cold: hi: 62, btch: 15 usd: 52
Jul 20 00:22:35 my_server kernel: [373891.820126] CPU 7: Hot: hi: 186, btch: 31 usd: 121 Cold: hi: 62, btch: 15 usd: 53
Jul 20 00:22:35 my_server kernel: [373891.820128] HighMem per-cpu:
Jul 20 00:22:35 my_server kernel: [373891.820131] CPU 0: Hot: hi: 186, btch: 31 usd: 61 Cold: hi: 62, btch: 15 usd: 0
Jul 20 00:22:35 my_server kernel: [373891.820135] CPU 1: Hot: hi: 186, btch: 31 usd: 155 Cold: hi: 62, btch: 15 usd: 8
Jul 20 00:22:35 my_server kernel: [373891.820138] CPU 2: Hot: hi: 186, btch: 31 usd: 8 Cold: hi: 62, btch: 15 usd: 8
Jul 20 00:22:35 my_server kernel: [373891.820142] CPU 3: Hot: hi: 186, btch: 31 usd: 135 Cold: hi: 62, btch: 15 usd: 0
Jul 20 00:22:35 my_server kernel: [373891.820145] CPU 4: Hot: hi: 186, btch: 31 usd: 81 Cold: hi: 62, btch: 15 usd: 8
Jul 20 00:22:35 my_server kernel: [373891.820148] CPU 5: Hot: hi: 186, btch: 31 usd: 71 Cold: hi: 62, btch: 15 usd: 10
Jul 20 00:22:35 my_server kernel: [373891.820151] CPU 6: Hot: hi: 186, btch: 31 usd: 51 Cold: hi: 62, btch: 15 usd: 12
Jul 20 00:22:35 my_server kernel: [373891.820154] CPU 7: Hot: hi: 186, btch: 31 usd: 166 Cold: hi: 62, btch: 15 usd: 12
Jul 20 00:22:35 my_server kernel: [373891.820159] Active:98677 inactive:6131 dirty:1 writeback:0 unstable:0
Jul 20 00:22:35 my_server kernel: [373891.820160] free:1761992 slab:8252 mapped:4651 pagetables:305 bounce:0
Jul 20 00:22:35 my_server kernel: [373891.820164] DMA free:3504kB min:68kB low:84kB high:100kB active:36kB inactive:0kB present:16256kB pages_scanned:51 all_unreclaimable? yes
Jul 20 00:22:35 my_server kernel: [373891.820166] lowmem_reserve[]: 0 873 8874
Jul 20 00:22:35 my_server kernel: [373891.820175] Normal free:3700kB min:3744kB low:4680kB high:5616kB active:256kB inactive:76kB present:894080kB pages_scanned:506 all_unreclaimable? yes
Jul 20 00:22:35 my_server kernel: [373891.820178] lowmem_reserve[]: 0 0 64008
Jul 20 00:22:35 my_server kernel: [373891.820183] HighMem free:7040764kB min:512kB low:9096kB high:17684kB active:394416kB inactive:24448kB present:8193024kB pages_scanned:0 all_unreclaimable? no
Jul 20 00:22:35 my_server kernel: [373891.820186] lowmem_reserve[]: 0 0 0
Jul 20 00:22:35 my_server kernel: [373891.820192] DMA: 2*4kB 6*8kB 8*16kB 1*32kB 0*64kB 0*128kB 1*256kB 0*512kB 1*1024kB 1*2048kB 0*4096kB = 3544kB
Jul 20 00:22:35 my_server kernel: [373891.820204] Normal: 0*4kB 61*8kB 14*16kB 5*32kB 3*64kB 0*128kB 0*256kB 1*512kB 0*1024kB 1*2048kB 0*4096kB = 3624kB
Jul 20 00:22:35 my_server kernel: [373891.820212] HighMem: 50749*4kB 40737*8kB 23922*16kB 9009*32kB 2165*64kB 765*128kB 548*256kB 788*512kB 520*1024kB 325*2048kB 943*4096kB = 7040764kB
Jul 20 00:22:35 my_server kernel: [373891.820226] Swap cache: add 0, delete 0, find 0/0, race 0+0
Jul 20 00:22:35 my_server kernel: [373891.820229] Free swap = 5855652kB
Jul 20 00:22:35 my_server kernel: [373891.820231] Total swap = 5855652kB
Jul 20 00:22:35 my_server kernel: [373891.820232] Free swap: 5855652kB
Jul 20 00:22:35 my_server kernel: [373891.845074] 2293759 pages of RAM
Jul 20 00:22:35 my_server kernel: [373891.845079] 2064383 pages of HIGHMEM
Jul 20 00:22:35 my_server kernel: [373891.845080] 216220 reserved pages
Jul 20 00:22:35 my_server kernel: [373891.845081] 93055 pages shared
Jul 20 00:22:35 my_server kernel: [373891.845085] 0 pages swap cached
Jul 20 00:22:35 my_server kernel: [373891.845087] 1 pages dirty
Jul 20 00:22:35 my_server kernel: [373891.845089] 0 pages writeback
Jul 20 00:22:35 my_server kernel: [373891.845092] 4651 pages mapped
Jul 20 00:22:35 my_server kernel: [373891.845094] 8252 pages slab
Jul 20 00:22:35 my_server kernel: [373891.845096] 305 pages pagetables
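A small sketch for picking the relevant lines out of a dump like the one above: it scans for the per-zone "free/min/low/high" lines and reports any zone whose free memory is below its "min" watermark. The regular expression and the name zones_below_min are assumptions based only on the format of this particular log, not on any kernel tool.

```python
import re

# Matches lines such as "Normal free:3700kB min:3744kB low:4680kB high:5616kB ..."
ZONE_RE = re.compile(
    r"(DMA|Normal|HighMem) free:(\d+)kB min:(\d+)kB low:(\d+)kB high:(\d+)kB"
)


def zones_below_min(log_text):
    """Return {zone: (free_kB, min_kB)} for zones with free memory under 'min'."""
    result = {}
    for zone, free, minimum, _low, _high in ZONE_RE.findall(log_text):
        if int(free) < int(minimum):
            result[zone] = (int(free), int(minimum))
    return result


# Shortened copies of the three zone lines from the log above.
sample = """\
DMA free:3504kB min:68kB low:84kB high:100kB active:36kB
Normal free:3700kB min:3744kB low:4680kB high:5616kB active:256kB
HighMem free:7040764kB min:512kB low:9096kB high:17684kB active:394416kB
"""
print(zones_below_min(sample))  # {'Normal': (3700, 3744)}
```

Only the Normal zone is under its minimum, even though HighMem has about 7 GB free.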
-
Re: Where to ask about a machine lockup
On 2008-07-21, Jean-David Beyer wrote:
> Ignoramus14558 wrote:
>> A very machine running Ubuntu Gutsy locked up. It runs some of our apps
>> and at the time of lockup (midning on weekend) it was not running
>> anything.
>
> I run Red Hat Enterprise Linux 5 on this machine and CentOS 4 on my other
> machine. Neither machine has ever locked up and one is over 8 years old and
> the other is over 4 years old and they usually run 24/7.
I have had some similar experiences also.
> Since most distributions run similar kernels, it is my opinion that
> it is most improbable that your are experiencing a software error.
I basically want to leave no stone unturned. I want to go in all
directions of search at once. First, I want to ascertain that my
system was really out of memory. If it was, I want to find out the
process. If it was not, I want to know why.
> I started running Linux in 1998 with Red Hat Linux 5.0 and
> 5.2. These locked up once in a while, usually in the X window
> system; i.e., the kernel would still be running OK. Once in a while
> the kernel would crash. I then ran Red Hat Linux 6.0 and 6.2. I do
> not recall if they ever crashed or not. But since I started running
> Red Hat Linux 7.3 (long since obsolete), I have never had a crash.
>
> So if your machine is a *86 machine, I suggest the first thing to do is run
> memtest86 on it, at least overnight, because memory problems are frequent
> for some people, and even if you are not having memory hardware problems, it
> would be good to know for sure that you are not having them.
>
> http://www.memtest86.com/
Thanks. I burned the ISO and will try to run it tonight if possible.
>>
>> I have some interesting oom-killer messages
>
> I do not know what oom-killer messages are? Are you running out of
> memory and the kernel is killing processes to get some?
The kernel invoked the oom-killer (see my other post with one such
entry).
> How much memory has your machine?
8GB RAM, 5 GB swap, 32 bit.
> What processes do you run?
We run our own apps (server side).
At the time of the crash no apps were running. There was one perl
script of ours, which would be unlikely to run out of memory as it
does very little, but I would not dismiss it outright.
> How is the memory being used?
Usually out of 8, only 2 is used, and the rest is free. We are not
running enough stuff yet to even approach the available memory size.
> Do you have enough swap space?
5 GB
> Usually if you have enough swap
> space, you will not run out of memory; your performance may be lousy
> due to thrashing, but it should not lock up or give you out of
> memory messages.
I kind of agree with your sentiment!
> For example, here is some of my machine right now:
>
> $ free
> total used free shared buffers cached
> Mem: 8185240 7697004 488236 0 183536 6427524
> -/+ buffers/cache: 1085944 7099296
> Swap: 4096496 40 4096456
here's that machine now:
==] free
             total       used       free     shared    buffers     cached
Mem:       8310156    1501860    6808296          0     178720     671376
-/+ buffers/cache:     651764    7658392
Swap:      5855652          0    5855652
> You can see I am by no means out of memory. What does your machine
> say when running? You might set up cron to run the _free_ command
> every few minutes and log it. Then when your machine locks up or
> crashes, you could see if something is hogging lots of memory.
Yes, I will activate a system activity logger to do top -b | head -15,
ps --sort size auxw |tail -15, and free, every minute.
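A minimal sketch of such a logger, to be run from cron once a minute. The commands are the ones named above, except that I assume `-n 1` on top so it exits after one iteration; the function names and the log path in the usage note are hypothetical.

```python
import subprocess
import time

# Commands from the post; run through the shell so the pipes work.
# "-n 1" is an added assumption so that top prints one batch and exits.
COMMANDS = [
    "top -b -n 1 | head -15",
    "ps --sort size auxw | tail -15",
    "free",
]


def snapshot(commands, timeout=30):
    """Run each shell command once and return one timestamped text block."""
    parts = [time.strftime("=== %Y-%m-%d %H:%M:%S ===")]
    for cmd in commands:
        proc = subprocess.run(
            cmd, shell=True, capture_output=True, text=True, timeout=timeout
        )
        parts.append(f"$ {cmd}\n{proc.stdout}")
    return "\n".join(parts) + "\n"


def append_snapshot(path, commands=COMMANDS):
    """Append one snapshot to the log file at 'path'."""
    with open(path, "a") as log:
        log.write(snapshot(commands))
```

Calling append_snapshot("/var/tmp/activity.log") from a one-minute cron job would leave a trail to read after the next lockup; the path is only an example.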
>> from /var/log/messages and wanted to ask some mailing list or something
>> where there are some knowledgeable people. I am very worried about this
>> and want to get to the bottom of this issue, if it locks up in themiddle
>> of the day, we will be very screwed.
>>
>> So.
>>
>> Where can I ask?
>>
> You might ask right here.
>
ok...
let me know what you think...
-
Re: Where to ask about a machine lockup
I demand that Ignoramus14558 may or may not have written...
> On 2008-07-21, Jean-David Beyer wrote:
>> Ignoramus14558 wrote:
[snip]
>>> I have some interesting oom-killer messages
>> I do not know what oom-killer messages are? Are you running out of memory
>> and the kernel is killing processes to get some?
> The kernel invoked oom-killed (see my another post with one such entry)
>> How much memory has your machine?
> 8GB RAM, 5 GB swap, 32 bit.
Looks to me like lowmem problems. I suggest that you read
http://uwsg.iu.edu/hypermail/linux/k...01.2/3455.html (follow the
thread); that describes a similar problem and suggests a few possible
solutions.
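For a rough idea of how much lowmem this box has, the page counts in the log posted earlier in the thread ("2293759 pages of RAM", "2064383 pages of HIGHMEM") can be subtracted. A sketch assuming 4 kB pages, which matches the smallest 4kB block in that log's buddy lists; lowmem_kb is a hypothetical helper:

```python
PAGE_KB = 4  # assumption: 4 kB pages, as on typical 32-bit x86


def lowmem_kb(ram_pages, highmem_pages, page_kb=PAGE_KB):
    """Memory outside HIGHMEM, in kB, computed from the kernel's page counts."""
    return (ram_pages - highmem_pages) * page_kb


low = lowmem_kb(ram_pages=2293759, highmem_pages=2064383)
print(low, low / 1024)  # 917504 896.0
```

That is roughly 896 MB, close to the DMA and Normal "present" figures combined (16256 kB + 894080 kB), and tiny next to the HighMem zone.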
[snip]
--
| Darren Salt | linux or ds at | nr. Ashington, | Toon
| RISC OS, Linux | youmustbejoking,demon,co,uk | Northumberland | Army
| + Output less CO2 => avoid massive flooding. TIME IS RUNNING OUT *FAST*.
When the gods wish to punish us, they answer our prayers.
-
Re: Where to ask about a machine lockup
On 2008-07-21, Darren Salt wrote:
> I demand that Ignoramus14558 may or may not have written...
>
>> On 2008-07-21, Jean-David Beyer wrote:
>>> Ignoramus14558 wrote:
> [snip]
>>>> I have some interesting oom-killer messages
>>> I do not know what oom-killer messages are? Are you running out of memory
>>> and the kernel is killing processes to get some?
>
>> The kernel invoked oom-killed (see my another post with one such entry)
>
>>> How much memory has your machine?
>
>> 8GB RAM, 5 GB swap, 32 bit.
>
> Looks to me like lowmem problems (as. I suggest that you read
> http://uwsg.iu.edu/hypermail/linux/k...01.2/3455.html (follow the
> thread); that describes a similar problem and suggests a few possible
> solutions.
Thanks. Both (upgrading to 64 bit or changing to RHEL) are kind of
radical.
And, more importantly, Parag's recommendation concerns systems with
MORE THAN 8GB of RAM. And my system is 8GB and not more.
-
Re: Where to ask about a machine lockup
Ignoramus14558 wrote:
> On 2008-07-21, ray wrote:
>> On Mon, 21 Jul 2008 09:58:30 -0500, Ignoramus14558 wrote:
>>
>>> A very machine running Ubuntu Gutsy locked up. It runs some of our apps
>>> and at the time of lockup (midning on weekend) it was not running
>>> anything.
>>>
>>> I have some interesting oom-killer messages from /var/log/messages and
>>> wanted to ask some mailing list or something where there are some
>>> knowledgeable people. I am very worried about this and want to get to
>>> the bottom of this issue, if it locks up in themiddle of the day, we
>>> will be very screwed.
>>>
>>> So.
>>>
>>> Where can I ask?
>>>
>>> i
>> You could probably ask here, but more details would be in order.
>> Otherwise, try some of the resources hosted at Ubuntu.
>
> Well, here's the /var/log/messages entry (one of several that occurred
> during several minutes). Note that the system has 8 GB of RAM, 5 GB of
> swap, has NOT used any swap (see messages below) and just looks to be
> in tip top shape. So why oom-killer?
>
> The one interesting message I found is
>
> Normal free:3700kB min:3744kB low:4680kB high:5616kB active:256kB inactive:76kB present:894080kB pages_scanned:506 all_unreclaimable? yes
>
> ################################################## ####################
>
> Jul 20 00:22:33 my_server kernel: [373891.820009] perl invoked oom-killer: gfp_mask=0xd0, order=0, oomkilladj=0
Is that not strange? I did not think application programs, such as perl,
could invoke oom-killer. There is no man page for it, and no file with that
in the name. I wonder if the message is misleading. Could it mean that the
kernel invoked oom-killer and used it to kill a perl application? Or that a
perl application was running when the kernel called oom-killer?
--
.~. Jean-David Beyer Registered Linux User 85642.
/V\ PGP-Key: 9A2FC99A Registered Machine 241939.
/( )\ Shrewsbury, New Jersey http://counter.li.org
^^-^^ 14:25:01 up 19:13, 4 users, load average: 4.49, 4.39, 4.33
-
Re: Where to ask about a machine lockup
On Mon, 21 Jul 2008 18:32:57 +0000, Jean-David Beyer wrote:
> Ignoramus14558 wrote:
>> On 2008-07-21, ray wrote:
>>> On Mon, 21 Jul 2008 09:58:30 -0500, Ignoramus14558 wrote:
>>>
>>>> A very machine running Ubuntu Gutsy locked up. It runs some of our
>>>> apps and at the time of lockup (midning on weekend) it was not running
>>>> anything.
>>>>
>>>> I have some interesting oom-killer messages from /var/log/messages and
>>>> wanted to ask some mailing list or something where there are some
>>>> knowledgeable people. I am very worried about this and want to get to
>>>> the bottom of this issue, if it locks up in themiddle of the day, we
>>>> will be very screwed.
>>>>
>>>> So.
>>>>
>>>> Where can I ask?
>>>>
>>>> i
>>> You could probably ask here, but more details would be in order.
>>> Otherwise, try some of the resources hosted at Ubuntu.
>>
>> Well, here's the /var/log/messages entry (one of several that occurred
>> during several minutes). Note that the system has 8 GB of RAM, 5 GB of
>> swap, has NOT used any swap (see messages below) and just looks to be in
>> tip top shape. So why oom-killer?
>>
>> The one interesting message I found is
>>
>> Normal free:3700kB min:3744kB low:4680kB high:5616kB active:256kB
>> inactive:76kB present:894080kB pages_scanned:506 all_unreclaimable? yes
>>
>> ################################################## ####################
>>
>> Jul 20 00:22:33 my_server kernel: [373891.820009] perl invoked
>> oom-killer: gfp_mask=0xd0, order=0, oomkilladj=0
>
> Is that not strange? I did not think application programs, such as perl,
> could invoke oom-killer. There is no man page for it, and no file with
> that in the name. I wonder if the message is misleading. Could it mean
> that the kernel invoked oom-killer and used it to kill a perl application?
> Or that a perl application was running when the kernel called oom-killer?
The kernel can invoke the oom-killer on its own, independently of any
application.
-
Re: Where to ask about a machine lockup
On 2008-07-21, Jean-David Beyer wrote:
> Ignoramus14558 wrote:
>> On 2008-07-21, ray wrote:
>>> On Mon, 21 Jul 2008 09:58:30 -0500, Ignoramus14558 wrote:
>>>
>>>> A very machine running Ubuntu Gutsy locked up. It runs some of our apps
>>>> and at the time of lockup (midning on weekend) it was not running
>>>> anything.
>>>>
>>>> I have some interesting oom-killer messages from /var/log/messages and
>>>> wanted to ask some mailing list or something where there are some
>>>> knowledgeable people. I am very worried about this and want to get to
>>>> the bottom of this issue, if it locks up in themiddle of the day, we
>>>> will be very screwed.
>>>>
>>>> So.
>>>>
>>>> Where can I ask?
>>>>
>>>> i
>>> You could probably ask here, but more details would be in order.
>>> Otherwise, try some of the resources hosted at Ubuntu.
>>
>> Well, here's the /var/log/messages entry (one of several that occurred
>> during several minutes). Note that the system has 8 GB of RAM, 5 GB of
>> swap, has NOT used any swap (see messages below) and just looks to be
>> in tip top shape. So why oom-killer?
>>
>> The one interesting message I found is
>>
>> Normal free:3700kB min:3744kB low:4680kB high:5616kB active:256kB inactive:76kB present:894080kB pages_scanned:506 all_unreclaimable? yes
>>
>> ################################################## ####################
>>
>> Jul 20 00:22:33 my_server kernel: [373891.820009] perl invoked oom-killer: gfp_mask=0xd0, order=0, oomkilladj=0
>
> Is that not strange? I did not think application programs, such as perl,
> could invoke oom-killer.
No, it is not strange; it is just a confusing log message.
The oom-killer was invoked by the kernel when a process (such as perl)
innocently requested memory and there was not enough of it.
Look into oom_kill.c:
### static int oom_kill_process(struct task_struct *p, gfp_t gfp_mask, int order,
###                             unsigned long points, const char *message)
### {
###     struct task_struct *c;
###
###     if (printk_ratelimit()) {
###         printk(KERN_WARNING "%s invoked oom-killer: "
###                "gfp_mask=0x%x, order=%d, oomkilladj=%d\n",
###                current->comm, gfp_mask, order, current->oomkilladj);
###         dump_stack();
###         show_mem();
###     }
###     ...
The first %s is current->comm, the command name of the current process,
that is, the process whose memory request triggered the oom-killer.
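To make the fields of that message concrete, here is a sketch that parses the log line and decodes gfp_mask. The flag values in GFP_FLAGS are an assumption, recalled from 2.6-era include/linux/gfp.h, and should be checked against the source of the kernel actually running; parse_oom_line is a made-up name.

```python
import re

OOM_RE = re.compile(
    r"(?P<comm>\S+) invoked oom-killer: "
    r"gfp_mask=0x(?P<gfp>[0-9a-fA-F]+), order=(?P<order>\d+), "
    r"oomkilladj=(?P<adj>-?\d+)"
)

# ASSUMED bit values (2.6-era gfp.h from memory); verify against your kernel.
GFP_FLAGS = {
    0x01: "__GFP_DMA",
    0x02: "__GFP_HIGHMEM",
    0x10: "__GFP_WAIT",
    0x20: "__GFP_HIGH",
    0x40: "__GFP_IO",
    0x80: "__GFP_FS",
}


def parse_oom_line(line):
    """Return the fields of an 'invoked oom-killer' line, or None if no match."""
    m = OOM_RE.search(line)
    if m is None:
        return None
    mask = int(m.group("gfp"), 16)
    return {
        "comm": m.group("comm"),  # current->comm: the requesting process
        "gfp_mask": mask,
        "order": int(m.group("order")),  # allocation size is 2**order pages
        "oomkilladj": int(m.group("adj")),
        "flags": [name for bit, name in sorted(GFP_FLAGS.items()) if mask & bit],
    }


line = ("Jul 20 00:22:33 my_server kernel: [373891.820009] perl invoked "
        "oom-killer: gfp_mask=0xd0, order=0, oomkilladj=0")
info = parse_oom_line(line)
print(info["comm"], info["flags"])
```

If those bit values are right, 0xd0 does not include __GFP_HIGHMEM, so the request was for a single page (order=0) that could not come from the HighMem zone.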
> There is no man page for it, and no file with that in the name. I
> wonder if the message is misleading.
It is.
> Could it mean that the kernel invoked oom-killer and used it to kill
> a perl application? Or that a perl application was running when the
> kernel called oom-killer?
I do not think that it says what it kills. That's a problem.
But the main question is why it is invoked at all.
-
Re: Where to ask about a machine lockup
Ignoramus14558 wrote:
> On 2008-07-21, Jean-David Beyer wrote:
>> Ignoramus14558 wrote:
>>> On 2008-07-21, ray wrote:
>>>> On Mon, 21 Jul 2008 09:58:30 -0500, Ignoramus14558 wrote:
>>>>
>>>>> A very machine running Ubuntu Gutsy locked up. It runs some of our apps
>>>>> and at the time of lockup (midning on weekend) it was not running
>>>>> anything.
>>>>>
>>>>> I have some interesting oom-killer messages from /var/log/messages and
>>>>> wanted to ask some mailing list or something where there are some
>>>>> knowledgeable people. I am very worried about this and want to get to
>>>>> the bottom of this issue, if it locks up in themiddle of the day, we
>>>>> will be very screwed.
>>>>>
>>>>> So.
>>>>>
>>>>> Where can I ask?
>>>>>
>>>>> i
>>>> You could probably ask here, but more details would be in order.
>>>> Otherwise, try some of the resources hosted at Ubuntu.
>>> Well, here's the /var/log/messages entry (one of several that occurred
>>> during several minutes). Note that the system has 8 GB of RAM, 5 GB of
>>> swap, has NOT used any swap (see messages below) and just looks to be
>>> in tip top shape. So why oom-killer?
>>>
>>> The one interesting message I found is
>>>
>>> Normal free:3700kB min:3744kB low:4680kB high:5616kB active:256kB inactive:76kB present:894080kB pages_scanned:506 all_unreclaimable? yes
>>>
>>> ################################################## ####################
>>>
>>> Jul 20 00:22:33 my_server kernel: [373891.820009] perl invoked oom-killer: gfp_mask=0xd0, order=0, oomkilladj=0
>> Is that not strange? I did not think application programs, such as perl,
>> could invoke oom-killer.
>
> No, it is not strange, it is just a confusing log message.
>
> oom-killer was invoked by kernel when a process (such as perl)
> innocently requested memory and there was not enough of it.
>
> Look into oom_kill.c:
>
> ### static int oom_kill_process(struct task_struct *p, gfp_t gfp_mask, int order,
> ### unsigned long points, const char *message)
> ### {
> ### struct task_struct *c;
> ###
> ### if (printk_ratelimit()) {
> ### printk(KERN_WARNING "%s invoked oom-killer: "
> ### "gfp_mask=0x%x, order=%d, oomkilladj=%d\n",
> ### current->comm, gfp_mask, order, current->oomkilladj);
> ### dump_stack();
> ### show_mem();
> ### }
> ### ...
>
> The first %s is current->comm which refers to current process.
>
>
>> There is no man page for it, and no file with that in the name. I
>> wonder if the message is misleading.
>
> It is.
>
>> Could it mean that the kernel invoked oom-killer and used it to kill
>> a perl application? Or that a perl application was running when the
>> kernel called oom-killer?
>
> I do not think that it says what it kills. That's a problem.
>
> But the main question is, why is it even invoked.
>
Agreed. As long as there is sufficient swap space, would oom-killer ever be
invoked?
--
.~. Jean-David Beyer Registered Linux User 85642.
/V\ PGP-Key: 9A2FC99A Registered Machine 241939.
/( )\ Shrewsbury, New Jersey http://counter.li.org
^^-^^ 15:00:02 up 19:48, 5 users, load average: 4.46, 4.28, 4.24
-
Re: Where to ask about a machine lockup
"Ignoramus14558" wrote in message
news:iZCdnVDamMKLORnVnZ2dnUVZ_uWdnZ2d@giganews.com ...
>A very machine running Ubuntu Gutsy locked up. It runs some of our
> apps and at the time of lockup (midning on weekend) it was not
> running anything.
>
> I have some interesting oom-killer messages from /var/log/messages and
> wanted to ask some mailing list or something where there are some
> knowledgeable people. I am very worried about this and want to get to
> the bottom of this issue, if it locks up in themiddle of the day, we
> will be very screwed.
>
> So.
>
> Where can I ask?
>
> i
http://ubuntuforums.org/
That's the only place you'll get any real answers because these clowns
don't know **** and they wouldn't tell you anything if they did.
-
Re: Where to ask about a machine lockup
On Mon, 21 Jul 2008 19:02:03 +0000, Jean-David Beyer wrote:
> Ignoramus14558 wrote:
>> On 2008-07-21, Jean-David Beyer wrote:
>>> Ignoramus14558 wrote:
>>>> On 2008-07-21, ray wrote:
>>>>> On Mon, 21 Jul 2008 09:58:30 -0500, Ignoramus14558 wrote:
>>>>>
>>>>>> A very machine running Ubuntu Gutsy locked up. It runs some of our
>>>>>> apps and at the time of lockup (midning on weekend) it was not
>>>>>> running anything.
>>>>>>
>>>>>> I have some interesting oom-killer messages from /var/log/messages
>>>>>> and wanted to ask some mailing list or something where there are
>>>>>> some knowledgeable people. I am very worried about this and want to
>>>>>> get to the bottom of this issue, if it locks up in themiddle of the
>>>>>> day, we will be very screwed.
>>>>>>
>>>>>> So.
>>>>>>
>>>>>> Where can I ask?
>>>>>>
>>>>>> i
>>>>> You could probably ask here, but more details would be in order.
>>>>> Otherwise, try some of the resources hosted at Ubuntu.
>>>> Well, here's the /var/log/messages entry (one of several that occurred
>>>> during several minutes). Note that the system has 8 GB of RAM, 5 GB of
>>>> swap, has NOT used any swap (see messages below) and just looks to be
>>>> in tip top shape. So why oom-killer?
>>>>
>>>> The one interesting message I found is
>>>>
>>>> Normal free:3700kB min:3744kB low:4680kB high:5616kB active:256kB
>>>> inactive:76kB present:894080kB pages_scanned:506 all_unreclaimable?
>>>> yes
>>>>
>>>> ################################################## ####################
>>>>
>>>> Jul 20 00:22:33 my_server kernel: [373891.820009] perl invoked
>>>> oom-killer: gfp_mask=0xd0, order=0, oomkilladj=0
>>> Is that not strange? I did not think application programs, such as
>>> perl, could invoke oom-killer.
>>
>> No, it is not strange, it is just a confusing log message.
>>
>> oom-killer was invoked by kernel when a process (such as perl)
>> innocently requested memory and there was not enough of it.
>>
>> Look into oom_kill.c:
>>
>> ### static int oom_kill_process(struct task_struct *p, gfp_t gfp_mask,
>> ###                             int order, unsigned long points,
>> ###                             const char *message)
>> ### {
>> ###         struct task_struct *c;
>> ###
>> ###         if (printk_ratelimit()) {
>> ###                 printk(KERN_WARNING "%s invoked oom-killer: "
>> ###                        "gfp_mask=0x%x, order=%d, oomkilladj=%d\n",
>> ###                        current->comm, gfp_mask, order,
>> ###                        current->oomkilladj);
>> ###                 dump_stack();
>> ###                 show_mem();
>> ###         }
>> ### ...
>>
>> The first %s is current->comm which refers to current process.
>>
>>
>>> There is no man page for it, and no file with that in the name. I
>>> wonder if the message is misleading.
>>
>> It is.
>>
>>> Could it mean that the kernel invoked oom-killer and used it to kill a
>>> perl application? Or that a perl application was running when the
>>> kernel called oom-killer?
>>
>> I do not think that it says what it kills. That's a problem.
>>
>> But the main question is, why is it even invoked.
>>
> Agreed. As long as there is sufficient swap space, would oom-killer ever
> be invoked?
Google mem_notify.
It is one way, among others, to signal the kernel so that it doesn't
invoke the oom-killer.
-
Re: Where to ask about a machine lockup
On 2008-07-21, Jean-David Beyer wrote:
> Ignoramus14558 wrote:
>> On 2008-07-21, Jean-David Beyer wrote:
>>> Ignoramus14558 wrote:
>>>> On 2008-07-21, ray wrote:
>>>>> On Mon, 21 Jul 2008 09:58:30 -0500, Ignoramus14558 wrote:
>>>>>
>>>>>> A very machine running Ubuntu Gutsy locked up. It runs some of our apps
>>>>>> and at the time of lockup (midnight on a weekend) it was not running
>>>>>> anything.
>>>>>>
>>>>>> I have some interesting oom-killer messages from /var/log/messages and
>>>>>> wanted to ask some mailing list or something where there are some
>>>>>> knowledgeable people. I am very worried about this and want to get to
>>>>>> the bottom of this issue, if it locks up in the middle of the day, we
>>>>>> will be very screwed.
>>>>>>
>>>>>> So.
>>>>>>
>>>>>> Where can I ask?
>>>>>>
>>>>>> i
>>>>> You could probably ask here, but more details would be in order.
>>>>> Otherwise, try some of the resources hosted at Ubuntu.
>>>> Well, here's the /var/log/messages entry (one of several that occurred
>>>> during several minutes). Note that the system has 8 GB of RAM, 5 GB of
>>>> swap, has NOT used any swap (see messages below) and just looks to be
>>>> in tip top shape. So why oom-killer?
>>>>
>>>> The one interesting message I found is
>>>>
>>>> Normal free:3700kB min:3744kB low:4680kB high:5616kB active:256kB inactive:76kB present:894080kB pages_scanned:506 all_unreclaimable? yes
>>>>
>>>> ################################################## ####################
>>>>
>>>> Jul 20 00:22:33 my_server kernel: [373891.820009] perl invoked oom-killer: gfp_mask=0xd0, order=0, oomkilladj=0
>>> Is that not strange? I did not think application programs, such as perl,
>>> could invoke oom-killer.
>>
>> No, it is not strange, it is just a confusing log message.
>>
>> oom-killer was invoked by kernel when a process (such as perl)
>> innocently requested memory and there was not enough of it.
>>
>> Look into oom_kill.c:
>>
>> ### static int oom_kill_process(struct task_struct *p, gfp_t gfp_mask, int order,
>> ### unsigned long points, const char *message)
>> ### {
>> ### struct task_struct *c;
>> ###
>> ### if (printk_ratelimit()) {
>> ### printk(KERN_WARNING "%s invoked oom-killer: "
>> ### "gfp_mask=0x%x, order=%d, oomkilladj=%d\n",
>> ### current->comm, gfp_mask, order, current->oomkilladj);
>> ### dump_stack();
>> ### show_mem();
>> ### }
>> ### ...
>>
>> The first %s is current->comm which refers to current process.
>>
>>
>>> There is no man page for it, and no file with that in the name. I
>>> wonder if the message is misleading.
>>
>> It is.
>>
>>> Could it mean that the kernel invoked oom-killer and used it to kill
>>> a perl application? Or that a perl application was running when the
>>> kernel called oom-killer?
>>
>> I do not think that it says what it kills. That's a problem.
>>
>> But the main question is, why is it even invoked.
>>
> Agreed. As long as there is sufficient swap space, would oom-killer ever be
> invoked?
>
Exactly when it is invoked is fairly complicated, but the complex cases
seem to cover unusual situations such as programs locking themselves in
memory.
-
Re: Where to ask about a machine lockup
On Mon, 21 Jul 2008 09:58:30 -0500, Ignoramus14558 wrote:
> A very machine running Ubuntu Gutsy locked up. It runs some of our apps
> and at the time of lockup (midnight on a weekend) it was not running
> anything.
>
> I have some interesting oom-killer messages from /var/log/messages and
> wanted to ask some mailing list or something where there are some
> knowledgeable people. I am very worried about this and want to get to
> the bottom of this issue, if it locks up in the middle of the day, we
> will be very screwed.
>
> So.
>
> Where can I ask?
>
> i
I have to tell you that I've been running Linux for over five years on a
bunch of different hardware, and the only time I've ever seen a system
lock up like you describe, it was a hardware problem. I'd start by checking
memory and the hard drive with appropriate diagnostic utilities.
-
Re: Where to ask about a machine lockup
I demand that Ignoramus14558 may or may not have written...
> On 2008-07-21, Darren Salt wrote:
>> I demand that Ignoramus14558 may or may not have written...
>>> On 2008-07-21, Jean-David Beyer wrote:
>>>> Ignoramus14558 wrote:
>> [snip]
>>>>> I have some interesting oom-killer messages
>>>> I do not know what oom-killer messages are. Are you running out of
>>>> memory and the kernel is killing processes to get some?
>>> The kernel invoked oom-killer (see my other post with one such entry)
>>>> How much memory has your machine?
>>> 8GB RAM, 5 GB swap, 32 bit.
>> Looks to me like lowmem problems. I suggest that you read
>> http://uwsg.iu.edu/hypermail/linux/k...01.2/3455.html (follow the
>> thread); that describes a similar problem and suggests a few possible
>> solutions.
> Thanks. Both (upgrading to 64 bit or changing to RHEL) are kind of
> radical.
Changing to RHEL, yes, that would be a bit radical ;-)
But running a 64-bit (well, amd64) kernel should be fine (so long as the
hardware is capable of running it; retaining your existing 32-bit userland
will be fine), and it could be that the modifications made to those RH
kernels are in mainline now.
> And, more importantly, Parag's recommendation concerns systems with MORE
> THAN 8GB of RAM. And my system is 8GB and not more.
That probably doesn't matter; they both have >= 4GB. And even if it does...
is there a memory hole somewhere between 3GB and 4GB? If so, you probably
have some mapped above the 8GB mark, meaning that you "have more than" 8GB
(in terms of the highest available physical address, if I've understood this
remapping stuff properly).
--
| Darren Salt | linux or ds at | nr. Ashington, | Toon
| RISC OS, Linux | youmustbejoking,demon,co,uk | Northumberland | Army
| + Buy less and make it last longer. INDUSTRY CAUSES GLOBAL WARMING.
Are you addicted to taglines? Call Tagliners Anonymous *now*!