CPUs scheduling oddities with 4 cores - Linux


Thread: CPUs scheduling oddities with 4 cores

  1. CPUs scheduling oddities with 4 cores

    I was running a program that is CPU intensive and often runs for a long time.
    When I ran multiple processes I encountered an oddity. This is on a system
    with 2 sockets each populated with a dual core AMD Opteron (so 4 cores total).
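    (For context, the load in these tests can be approximated by one trivial CPU-bound worker per core -- a hypothetical stand-in for factorize, not the actual program:)

```python
# Hypothetical stand-in for the "factorize" workload: one pure
# CPU-bound worker per core, each spinning for a fixed wall-clock time.
import multiprocessing as mp
import time

def burn(seconds):
    deadline = time.monotonic() + seconds
    x = 0
    while time.monotonic() < deadline:
        x += 1  # busy work: no syscalls, no I/O, no sleeping
    return x

if __name__ == "__main__":
    n = mp.cpu_count()  # 4 on the machine described above
    with mp.Pool(n) as pool:
        pool.map(burn, [0.5] * n)  # each worker should show ~100% in top
    print(f"ran {n} CPU-bound workers")
```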

    root@tesla:/root 422# uname -a
    Linux tesla.ipal.net 2.6.26.2 #1 SMP PREEMPT Sat Aug 16 22:54:27 CDT 2008 i686 Dual-Core AMD Opteron(tm) Processor 2220 AuthenticAMD GNU/Linux
    root@tesla:/root 423#


    1. When I run 4 processes as a normal user, all 4 processes use 100%.

    =============================================================================
    top - 17:12:34 up 15 days, 16:48, 6 users, load average: 3.91, 3.60, 2.72
    Tasks: 226 total, 5 running, 221 sleeping, 0 stopped, 0 zombie
    Cpu(s):100.0%us, 0.0%sy, 0.0%ni, 0.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
    Mem: 8310704k total, 1026128k used, 7284576k free, 429476k buffers
    Swap: 0k total, 0k used, 0k free, 138508k cached

    PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
    12431 phil 20 0 1892 528 444 R 100 0.0 7525:47 factorize
    30204 phil 20 0 1892 528 444 R 100 0.0 0:54.62 factorize
    30210 phil 20 0 1892 528 444 R 100 0.0 0:54.63 factorize
    30196 phil 20 0 1892 528 444 R 100 0.0 0:54.39 factorize
    30241 root 8 -12 2460 1288 888 R 0 0.0 0:00.15 top
    =============================================================================

    2. When I run 1 process as a normal user and 3 processes as root, then one
    of the CPUs is not being used.

    =============================================================================
    top - 17:14:54 up 15 days, 16:50, 6 users, load average: 3.89, 3.67, 2.87
    Tasks: 223 total, 5 running, 218 sleeping, 0 stopped, 0 zombie
    Cpu(s): 75.0%us, 0.1%sy, 0.0%ni, 24.9%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
    Mem: 8310704k total, 1024864k used, 7285840k free, 429796k buffers
    Swap: 0k total, 0k used, 0k free, 138508k cached

    PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
    30387 root 20 0 1892 528 444 R 100 0.0 1:32.78 factorize
    30403 root 20 0 1892 532 444 R 100 0.0 1:32.69 factorize
    30398 root 20 0 1892 532 444 R 67 0.0 1:01.87 factorize
    12431 phil 20 0 1892 528 444 R 33 0.0 7527:05 factorize
    30493 root 8 -12 2460 1252 888 R 0 0.0 0:00.07 top
    =============================================================================

    Why would the kernel choose to NOT schedule ONE CPU just because 3 processes
    are running as root instead of a non-root user? I could understand root maybe
    getting special priority (but the priorities here were set the same).
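    One way to see core placement directly, rather than inferring it from %CPU, is the "last ran on" CPU field the kernel exports in /proc (a sketch; `ps -o psr` reports the same field):

```python
# Read the CPU a task last ran on: field 39 ("processor") of
# /proc/<pid>/stat.  Generic Linux procfs, nothing kernel-build-specific.
import os

def last_cpu(pid):
    with open(f"/proc/{pid}/stat") as f:
        # comm can contain spaces/parens, so split after the closing ")".
        after_comm = f.read().rsplit(")", 1)[1].split()
    return int(after_comm[36])  # field 39 overall; index 36 past the comm

print(last_cpu(os.getpid()))
```

    Running this over the four factorize PIDs would show directly whether two of them are stuck sharing one core.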

    And setting the user process to negative niceness does not change it:

    =============================================================================
    top - 17:18:20 up 15 days, 16:54, 6 users, load average: 3.99, 3.83, 3.08
    Tasks: 223 total, 5 running, 218 sleeping, 0 stopped, 0 zombie
    Cpu(s): 75.0%us, 0.0%sy, 0.0%ni, 25.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
    Mem: 8310704k total, 1024596k used, 7286108k free, 428948k buffers
    Swap: 0k total, 0k used, 0k free, 138496k cached

    PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
    30387 root 20 0 1892 528 444 R 100 0.0 4:58.55 factorize
    30403 root 20 0 1892 532 444 R 100 0.0 4:58.33 factorize
    30398 root 20 0 1892 532 444 R 66 0.0 3:19.08 factorize
    12431 phil 10 -10 1892 528 444 R 34 0.0 7528:13 factorize
    30771 root 8 -12 2460 1276 888 R 0 0.0 0:00.03 top
    =============================================================================

    I would have hoped that, at least in this case, it would have given 12431
    more CPU access. Setting the root processes to a positive nice value did
    not change this either:

    =============================================================================
    top - 17:19:44 up 15 days, 16:55, 6 users, load average: 3.99, 3.86, 3.16
    Tasks: 223 total, 6 running, 217 sleeping, 0 stopped, 0 zombie
    Cpu(s): 8.4%us, 0.0%sy, 66.6%ni, 25.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
    Mem: 8310704k total, 1023604k used, 7287100k free, 429040k buffers
    Swap: 0k total, 0k used, 0k free, 138496k cached

    PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
    30387 root 30 10 1892 528 444 R 100 0.0 6:22.81 factorize
    30403 root 30 10 1892 532 444 R 100 0.0 6:22.50 factorize
    30398 root 30 10 1892 532 444 R 67 0.0 4:15.28 factorize
    12431 phil 10 -10 1892 528 444 R 33 0.0 7528:41 factorize
    30843 root 8 -12 2460 1256 888 R 0 0.0 0:00.01 top
    =============================================================================

    I fired up 3 additional non-root processes (now 4 non-root and 3 root) and
    this does let all 4 CPUs run. The prioritizing is strange in this case
    since one of the new (nice 0) processes gets 100% but the priority one
    (nice -10) still only gets 33%.

    =============================================================================
    top - 17:27:21 up 15 days, 17:03, 6 users, load average: 6.38, 4.78, 3.76
    Tasks: 229 total, 9 running, 220 sleeping, 0 stopped, 0 zombie
    Cpu(s): 49.9%us, 0.1%sy, 50.0%ni, 0.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
    Mem: 8310704k total, 1026256k used, 7284448k free, 429824k buffers
    Swap: 0k total, 0k used, 0k free, 138496k cached

    PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
    31274 phil 20 0 1892 528 444 R 100 0.0 1:31.85 factorize
    30403 root 30 10 1892 532 444 R 67 0.0 13:27.16 factorize
    30387 root 30 10 1892 528 444 R 67 0.0 13:28.06 factorize
    30398 root 30 10 1892 532 444 R 67 0.0 9:19.47 factorize
    12431 phil 10 -10 1892 528 444 R 33 0.0 7531:14 factorize
    31270 phil 20 0 1892 528 444 R 33 0.0 0:30.74 factorize
    31264 phil 20 0 1892 524 444 R 33 0.0 0:30.74 factorize
    31051 root 8 -12 2460 1288 888 R 0 0.0 0:00.82 top
    =============================================================================

    Here I let top run for a 120 second measurement interval to see if this is
    a case of juggling which processes get more CPU time:

    =============================================================================
    top - 17:31:13 up 15 days, 17:07, 6 users, load average: 6.99, 5.97, 4.47
    Tasks: 229 total, 9 running, 220 sleeping, 0 stopped, 0 zombie
    Cpu(s): 50.0%us, 0.0%sy, 50.0%ni, 0.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
    Mem: 8310704k total, 1026628k used, 7284076k free, 430192k buffers
    Swap: 0k total, 0k used, 0k free, 138496k cached

    PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
    31274 phil 20 0 1892 528 444 R 100 0.0 5:23.80 factorize
    30398 root 30 10 1892 532 444 R 67 0.0 11:54.46 factorize
    30403 root 30 10 1892 532 444 R 67 0.0 16:02.01 factorize
    30387 root 30 10 1892 528 444 R 67 0.0 16:02.91 factorize
    12431 phil 10 -10 1892 528 444 R 33 0.0 7532:31 factorize
    31264 phil 20 0 1892 524 444 R 33 0.0 1:48.21 factorize
    31270 phil 20 0 1892 528 444 R 33 0.0 1:48.20 factorize
    31582 root 8 -12 2460 1268 892 R 0 0.0 0:00.01 top
    =============================================================================

    We can see that 31274 is being inappropriately favored in this case, as it
    accumulated more CPU time over the 2-minute period than the others.
    Process 12431 should be getting the most, but isn't.
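    For what it's worth, this kernel's scheduler (CFS) converts nice values into load weights, approximately weight = 1024 / 1.25^nice, and on a single runqueue each task's CPU share follows the weight ratio. A sketch of what the shares would look like if all seven tasks competed on one queue -- which they don't; per-CPU runqueues plus load balancing is where the behaviour above appears to diverge:

```python
# Approximation of the kernel's prio_to_weight[] table: each nice
# step scales the weight by about 1.25x, with nice 0 -> 1024.
def cfs_weight(nice):
    return 1024 / (1.25 ** nice)

# The seven factorize tasks in the top listing: one at nice -10,
# three at nice 0, three at nice 10.
nices = [-10, 0, 0, 0, 10, 10, 10]
total = sum(cfs_weight(n) for n in nices)
for n in sorted(set(nices)):
    print(f"nice {n:3d}: ~{cfs_weight(n) / total:.0%} single-queue share")
```

    By that math a nice -10 task outweighs a nice 0 task roughly 9:1, so the 33% it actually receives is hard to square with the weights alone.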

    In all the above cases, process 12431 has been running for a few days.

    --
    |WARNING: Due to extreme spam, googlegroups.com is blocked. Due to ignorance |
    | by the abuse department, bellsouth.net is blocked. If you post to |
    | Usenet from these places, find another Usenet provider ASAP. |
    | Phil Howard KA9WGN (email for humans: first name in lower case at ipal.net) |

  2. Re: CPUs scheduling oddities with 4 cores

    Hi,

    > 1. When I run 4 processes as a normal user, all 4 processes use 100%.


    Not surprising.

    > 2. When I run 1 process as a normal user and 3 processes as root, then one
    > of the CPUs is not being used.
    > Why would the kernel choose to NOT schedule ONE CPU just because 3 processes
    > are running as root instead of a non-root user? I could understand root maybe
    > getting special priority (but the priorities here were set the same).


    Yes, root does get very special priorities, because you want root able
    to "rescue" a system which is on the brink of overload ;-)

    Why would you want to run your processes as root, anyway?

    > I fired up 3 additional non-root processes (now 4 non-root and 3 root) and
    > this does let all 4 CPUs run. The prioritizing is strange in this case
    > since one of the new (nice 0) processes gets 100% but the priority one
    > (nice -10) still only gets 33%.


    Is that surprising? Three of your four CPUs each get to handle two
    CPU-intensive processes, so why should either of those two get 100%?
    Well, as you found out, root gets some extra priority, and you need
    super-user rights to set the nice value for a process beyond a certain
    threshold, so you could experiment with root-priority for a user
    process - but again, why? ;-)

    Have fun...


  3. Re: CPUs scheduling oddities with 4 cores

    On Tue, 02 Sep 2008 12:13:44 +0200 Bernhard Agthe wrote:
    | Hi,
    |
    |> 1. When I run 4 processes as a normal user, all 4 processes use 100%.
    |
    | Not surprising.
    |
    |> 2. When I run 1 process as a normal user and 3 processes as root, then one
    |> of the CPUs is not being used.
    |> Why would the kernel choose to NOT schedule ONE CPU just because 3 processes
    |> are running as root instead of a non-root user? I could understand root maybe
    |> getting special priority (but the priorities here were set the same).
    |
    | Yes, root does get very special priorities, because you want root able
    | to "rescue" a system which is on the brink of overload ;-)

    But that does not explain why a CPU is left idle. Letting root be the first
    up to take a CPU over all others MIGHT be reasonably explained. However,
    do niceness settings mean anything? Does root running at nice 19 still get
    all control of the system in lieu of a user running at nice 0?


    | Why would you want to run your processes as root, anyway?

    Actually, it was a typo. They were run that way unintentionally. I would
    not have discovered the funny behaviour otherwise.


    |> I fired up 3 additional non-root processes (now 4 non-root and 3 root) and
    |> this does let all 4 CPUs run. The prioritizing is strange in this case
    |> since one of the new (nice 0) processes gets 100% but the priority one
    |> (nice -10) still only gets 33%.
    |
    | Is that surprising? Three of your four CPUs each get to handle two
    | CPU-intensive processes, so why should either of those two get 100%?
    | Well, as you found out, root gets some extra priority, and you need
    | super-user rights to set the nice value for a process beyond a certain
    | threshold, so you could experiment with root-priority for a user
    | process - but again, why? ;-)

    There being 4 processes ready to run, and 4 CPUs available, all 4 should
    be running at 100%. That was not the case: 2 CPUs each ran 1 of the
    processes, 1 CPU ran the next 2 processes, and 1 CPU sat idle ... in the
    case where root was owner of 3 of the processes. That does not explain
    why one of the CPUs was left idle doing nothing.
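    One thing worth ruling out before blaming the balancer is a narrowed affinity mask on the busy tasks (a quick check; os.sched_getaffinity is Linux-only, and the PID list here is a placeholder):

```python
# Print the set of CPUs each PID is allowed to run on.  On a 4-core
# box with default settings every task should show [0, 1, 2, 3].
import os

for pid in [os.getpid()]:  # substitute the factorize PIDs here
    print(pid, sorted(os.sched_getaffinity(pid)))
```

    `taskset -p <pid>` reports the same mask from the shell.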


  4. Re: CPUs scheduling oddities with 4 cores

    Hi,

    > But that does not explain why a CPU is left idle. Letting root be the first
    > up to take a CPU over all others MIGHT be reasonably explained. However,
    > do niceness settings mean anything? Does root running at nice 19 still get
    > all control of the system in lieu of a user running at nice 0?


    Actually, I cannot tell you why one core is idle. As to the "nice"
    setting, I think this is a relative measure, so if root starts out with
    a higher priority and you lower it, it might still be higher than a
    user with a (marginally) increased priority. There's actually a maximum
    increase users are allowed; anything beyond that needs root permissions.

    > | Why would you want to run your processes as root, anyway?
    >
    > Actually, it was a typo. They were run that way unintentionally. I would
    > not have discovered the funny behaviour otherwise.


    OK ;-)

    > There being 4 processes ready to run, and 4 CPUs available, all 4 should
    > be running at 100%. That was not the case: 2 CPUs each ran 1 of the
    > processes, 1 CPU ran the next 2 processes, and 1 CPU sat idle ... in the
    > case where root was owner of 3 of the processes. That does not explain
    > why one of the CPUs was left idle doing nothing.


    Yup. I cannot give you an answer to that. Sorry.

    Ciao...


  5. Re: CPUs scheduling oddities with 4 cores

    On Thu, 04 Sep 2008 11:39:47 +0200 Bernhard Agthe wrote:
    | Hi,
    |
    |> But that does not explain why a CPU is left idle. Letting root be the first
    |> up to take a CPU over all others MIGHT be reasonably explained. However,
    |> do niceness settings mean anything? Does root running at nice 19 still get
    |> all control of the system in lieu of a user running at nice 0?
    |
    | Actually, I cannot tell you why one core is idle. As to the "nice"
    | setting, I think this is a relative measure, so if root starts out with
    | a higher priority and you lower it, it might still be higher than a
    | user with a (marginally) increased priority. There's actually a maximum
    | increase users are allowed; anything beyond that needs root permissions.

    I could understand the case where root would be a higher priority than a user,
    at least for a default (0) nice value. If I set a root process to a niceness
    greater than 0 *AND* set a user process to a niceness less than 0, then at
    some point the user process should run in lieu of the root process. I want
    a way to have processes NOT bog down the system even though they have to run
    with root permission.
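    One way to get that (assuming a kernel with SCHED_IDLE, i.e. 2.6.23 or later) is to demote the batch jobs to the SCHED_IDLE class, which runs them only when no task at any nice level wants the CPU -- a sketch using Python's scheduler wrappers:

```python
# Demote the current process to SCHED_IDLE: it keeps its root
# permissions but gets CPU time only when nothing else is runnable.
import os

os.sched_setscheduler(0, os.SCHED_IDLE, os.sched_param(0))
assert os.sched_getscheduler(0) == os.SCHED_IDLE
print("now running under SCHED_IDLE")
```

    `chrt --idle 0 <command>` does the same from the shell; note that on older kernels an unprivileged process cannot switch back out of SCHED_IDLE.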

    It's the fact that one CPU is idle that I take issue with. I suspect there
    may be a bug in the kernel where some logic attempting to manipulate which
    CPUs some processes can run on (affinity) has an error. I suppose I need to
    do further testing with more combinations, and on other machines with other
    numbers of CPUs. For example, if running 1 user and 1 root process on a
    dual-CPU machine leaves one CPU idle, that's telling me (bad) things.
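    For those follow-up tests, pinning takes the balancer out of the picture entirely, which should separate "the balancer chose badly" from "the balancer never moved anything" (a sketch; pinning to CPU 0 is arbitrary):

```python
# Pin the current process to CPU 0 and confirm the new mask.  Pinning
# one of the two test processes on a dual-CPU box forces the other to
# either take the free CPU or expose the idle-core behaviour again.
import os

os.sched_setaffinity(0, {0})
print(sorted(os.sched_getaffinity(0)))
```

    `taskset -pc 0 <pid>` is the shell equivalent.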

