Current Git: BUG: unable to handle kernel paging request at 0000000001a40ca0 - Kernel

This is a discussion on Current Git: BUG: unable to handle kernel paging request at 0000000001a40ca0 - Kernel ; Hi, I just encountered a Bug in latest git: As this is my first bugreport, I am not sure who to cc and which information to provide, so please advise me. Some information is below. ifconfig triggered: BUG: unable to ...

+ Reply to Thread
Results 1 to 9 of 9

Thread: Current Git: BUG: unable to handle kernel paging request at 0000000001a40ca0

  1. Current Git: BUG: unable to handle kernel paging request at 0000000001a40ca0

    Hi,

    I just encountered a Bug in latest git:

    As this is my first bugreport, I am not sure who to cc and which
    information to provide, so please advise me. Some information is below.



    ifconfig triggered:

    BUG: unable to handle kernel paging request at 0000000001a40ca0
    IP: [] kmem_cache_alloc+0x50/0x81
    PGD 79d33067 PUD 79cf7067 PMD 0
    Oops: 0000 [1] SMP
    CPU 0
    Modules linked in: radeon drm uinput snd_hda_intel iwl3945 snd_pcm
    snd_timer rfkill snd led_class snd_page_alloc
    Pid: 3516, comm: ifconfig Not tainted 2.6.26-06077-gc010b2f #23
    RIP: 0010:[] []
    kmem_cache_alloc+0x50/0x81
    RSP: 0000:ffff880079d079e8 EFLAGS: 00010006
    RAX: 0000000000000000 RBX: 0000000000000296 RCX: ffffffff802704ae
    RDX: ffff880001016700 RSI: 0000000001a40ca0 RDI: ffffffff808b5fa0
    RBP: ffff880079d07a08 R08: 000000000000000c R09: 0000000000000001
    R10: ffff880079d07a18 R11: ffff880079cce260 R12: 0000000000011280
    R13: 0000000000000080 R14: 00000000000112d0 R15: 0000000001670328
    FS: 00007fb32edc26f0(0000) GS:ffffffff808b6e80(0000) knlGS:0000000000000000
    CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
    CR2: 0000000001a40ca0 CR3: 0000000079d43000 CR4: 00000000000006e0
    DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
    DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
    Process ifconfig (pid: 3516, threadinfo ffff880079d06000, task
    ffff88007f8ed6c0)
    Stack: ffff880079d07a18 00000000000000d0 ffff88007f99e480 0000000000000001
    ffff880079d07a18 ffffffff802704ae ffff880079d07a78 ffffffff802705c2
    ffff88007d01e400 ffffffff80865dc0 ffff880079cced10 0000000000000008
    Call Trace:
    [] mempool_alloc_slab+0x16/0x18
    [] mempool_alloc+0x3e/0xfa
    [] bio_alloc_bioset+0x27/0x94
    [] bio_alloc+0x15/0x24
    [] mpage_alloc+0x2b/0x84
    [] do_mpage_readpage+0x406/0x508
    [] ? __inc_zone_page_state+0x23/0x25
    [] ? add_to_page_cache+0x81/0x98
    [] ? ext3_get_block+0x0/0xee
    [] mpage_readpages+0xab/0xeb
    [] ? ext3_get_block+0x0/0xee
    [] ? __alloc_pages_internal+0xdb/0x3a1
    [] ext3_readpages+0x1f/0x21
    [] __do_page_cache_readahead+0x100/0x18f
    [] do_page_cache_readahead+0x57/0x67
    [] filemap_fault+0x165/0x342
    [] __do_fault+0x55/0x373
    [] handle_mm_fault+0x1ec/0x673
    [] do_page_fault+0x43c/0x7e2
    [] error_exit+0x0/0x51


    Code: 98 48 8b 94 c7 e0 00 00 00 48 8b 32 44 8b 6a 18 48 85 f6 75 13 49
    89 d0 44 89 e6 83 ca ff e8 b3 f8 ff ff 48 89 c6 eb 0a 8b 42 14 <48> 8b
    04 c6 48 89 02 53 9d 31 c0 41 c1 ec 0f 48 85 f6 0f 95 c0
    RIP [] kmem_cache_alloc+0x50/0x81
    RSP
    CR2: 0000000001a40ca0
    ---[ end trace a3dee77ef1628c0f ]---
    BUG: unable to handle kernel paging request at 0000000001a40ca0
    IP: [] kmem_cache_alloc+0x50/0x81
    PGD 79d33067 PUD 79cf7067 PMD 0
    Oops: 0000 [2] SMP
    CPU 0
    Modules linked in: radeon drm uinput snd_hda_intel iwl3945 snd_pcm
    snd_timer rfkill snd led_class snd_page_alloc
    Pid: 1268, comm: kjournald Tainted: G D 2.6.26-06077-gc010b2f #23
    RIP: 0010:[] []
    kmem_cache_alloc+0x50/0x81
    RSP: 0000:ffff88007d02dcd0 EFLAGS: 00010006
    RAX: 0000000000000000 RBX: 0000000000000282 RCX: ffffffff802704ae
    RDX: ffff880001016700 RSI: 0000000001a40ca0 RDI: ffffffff808b5fa0
    RBP: ffff88007d02dcf0 R08: 0000000000000000 R09: ffff88007b04e840
    R10: ffff88007c9b74d0 R11: ffff88007d02dd60 R12: 0000000000011200
    R13: 0000000000000080 R14: 0000000000011210 R15: 0000000000000001
    FS: 0000000000000000(0000) GS:ffffffff808b6e80(0000) knlGS:0000000000000000
    CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b
    CR2: 0000000001a40ca0 CR3: 0000000079d43000 CR4: 00000000000006e0
    DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
    DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
    Process kjournald (pid: 1268, threadinfo ffff88007d02c000, task
    ffff88007d6b2b60)
    Stack: ffff88007f424900 0000000000000010 ffff88007f99e480 0000000000000001
    ffff88007d02dd00 ffffffff802704ae ffff88007d02dd60 ffffffff802705c2
    ffff88007b0b3000 ffff880079c5ce40 ffff88007b04e840 0000000000000004
    Call Trace:
    [] mempool_alloc_slab+0x16/0x18
    [] mempool_alloc+0x3e/0xfa
    [] bio_alloc_bioset+0x27/0x94
    [] ? journal_write_metadata_buffer+0x30e/0x328
    [] bio_alloc+0x15/0x24
    [] submit_bh+0x78/0x119
    [] journal_commit_transaction+0x76d/0xccd
    [] ? try_to_del_timer_sync+0x5b/0x67
    [] kjournald+0xc8/0x200
    [] ? autoremove_wake_function+0x0/0x3d
    [] ? kjournald+0x0/0x200
    [] kthread+0x4e/0x7c
    [] child_rip+0xa/0x11
    [] ? kthread+0x0/0x7c
    [] ? child_rip+0x0/0x11


    Code: 98 48 8b 94 c7 e0 00 00 00 48 8b 32 44 8b 6a 18 48 85 f6 75 13 49
    89 d0 44 89 e6 83 ca ff e8 b3 f8 ff ff 48 89 c6 eb 0a 8b 42 14 <48> 8b
    04 c6 48 89 02 53 9d 31 c0 41 c1 ec 0f 48 85 f6 0f 95 c0
    RIP [] kmem_cache_alloc+0x50/0x81
    RSP
    CR2: 0000000001a40ca0
    ---[ end trace a3dee77ef1628c0f ]---
    BUG: unable to handle kernel paging request at 0000000001a40ca0
    IP: [] kmem_cache_alloc+0x50/0x81
    PGD 7d1ac067 PUD 7bcb9067 PMD 0
    Oops: 0000 [3] SMP
    CPU 0
    Modules linked in: radeon drm uinput snd_hda_intel iwl3945 snd_pcm
    snd_timer rfkill snd led_class snd_page_alloc
    Pid: 119, comm: kblockd/0 Tainted: G D 2.6.26-06077-gc010b2f #23
    RIP: 0010:[] []
    kmem_cache_alloc+0x50/0x81
    RSP: 0018:ffff88007fadfcb0 EFLAGS: 00010006
    RAX: 0000000000000000 RBX: 0000000000000082 RCX: ffffffff804c6b64
    RDX: ffff880001016700 RSI: 0000000001a40ca0 RDI: ffffffff808b5fa0
    RBP: ffff88007fadfcd0 R08: 0000000000000000 R09: ffff88007c1ef3c0
    R10: ffff88007d44e890 R11: 0000000000000001 R12: 0000000000000020
    R13: 0000000000000080 R14: ffff88007d5aa800 R15: 0000000000000008
    FS: 0000000000000000(0000) GS:ffffffff808b6e80(0000) knlGS:0000000000000000
    CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b
    CR2: 0000000001a40ca0 CR3: 000000007d1ad000 CR4: 00000000000006e0
    DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
    DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
    Process kblockd/0 (pid: 119, threadinfo ffff88007fade000, task
    ffff88007f8dd6c0)
    Stack: 0000000000000001 0000000000000020 ffffffff808894c0 ffff88007c1ef3c0
    ffff88007fadfd00 ffffffff804c6b64 ffff88007fadfd40 ffff88007d60e800
    ffff88007d5aa920 0000000000000000 ffff88007fadfd30 ffffffff804c6c72
    Call Trace:
    [] scsi_pool_alloc_command+0x4d/0x73
    [] __scsi_get_command+0x1e/0x9c
    [] scsi_get_command+0x36/0xa5
    [] scsi_get_cmd_from_req+0x2a/0x5e
    [] scsi_setup_fs_cmnd+0x5d/0x87
    [] sd_prep_fn+0x66/0x449
    [] elv_next_request+0xe3/0x1a4
    [] scsi_request_fn+0x80/0x334
    [] ? blk_unplug_work+0x0/0x1b
    [] __generic_unplug_device+0x29/0x2e
    [] generic_unplug_device+0x2e/0x3c
    [] blk_unplug_work+0x19/0x1b
    [] run_workqueue+0x81/0x10a
    [] worker_thread+0xdd/0xea
    [] ? autoremove_wake_function+0x0/0x3d
    [] ? worker_thread+0x0/0xea
    [] kthread+0x4e/0x7c
    [] child_rip+0xa/0x11
    [] ? kthread+0x0/0x7c
    [] ? child_rip+0x0/0x11


    Code: 98 48 8b 94 c7 e0 00 00 00 48 8b 32 44 8b 6a 18 48 85 f6 75 13 49
    89 d0 44 89 e6 83 ca ff e8 b3 f8 ff ff 48 89 c6 eb 0a 8b 42 14 <48> 8b
    04 c6 48 89 02 53 9d 31 c0 41 c1 ec 0f 48 85 f6 0f 95 c0
    RIP [] kmem_cache_alloc+0x50/0x81
    RSP
    CR2: 0000000001a40ca0
    ---[ end trace a3dee77ef1628c0f ]---



    #git log:
    commit c010b2f76c3032e48097a6eef291d8593d5d79a6
    Merge: 6069fb2... 521c4d9...
    Author: Linus Torvalds
    Date: Tue Jul 22 19:09:51 2008 -0700


    Its a Thinkpad T60 Notebook,
    Core2Duo T7200 CPU
    E1000 pcie nic, e1000e is built in
    gentoo amd64
    gcc (GCC) 4.1.2 (Gentoo 4.1.2 p1.1)





    --
    3rd Law of Computing:
    Anything that can go wr
    fortune: Segmentation violation -- Core dumped
    --
    To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
    the body of a message to majordomo@vger.kernel.org
    More majordomo info at http://vger.kernel.org/majordomo-info.html
    Please read the FAQ at http://www.tux.org/lkml/

  2. Re: Current Git: BUG: unable to handle kernel paging request at 0000000001a40ca0

    On Wed, Jul 23, 2008 at 5:39 PM, Dieter Ries wrote:
    > Hi,
    >
    > I just encountered a Bug in latest git:
    >
    > As this is my first bugreport, I am not sure who to cc and which information
    > to provide, so please advise me. Some information is below.


    Hi,

    Thanks for the report!

    > BUG: unable to handle kernel paging request at 0000000001a40ca0
    > IP: [] kmem_cache_alloc+0x50/0x81
    > PGD 79d33067 PUD 79cf7067 PMD 0
    > Oops: 0000 [1] SMP
    > CPU 0
    > Modules linked in: radeon drm uinput snd_hda_intel iwl3945 snd_pcm snd_timer
    > rfkill snd led_class snd_page_alloc
    > Pid: 3516, comm: ifconfig Not tainted 2.6.26-06077-gc010b2f #23
    > RIP: 0010:[] []
    > kmem_cache_alloc+0x50/0x81
    > RSP: 0000:ffff880079d079e8 EFLAGS: 00010006
    > RAX: 0000000000000000 RBX: 0000000000000296 RCX: ffffffff802704ae
    > RDX: ffff880001016700 RSI: 0000000001a40ca0 RDI: ffffffff808b5fa0
    > RBP: ffff880079d07a08 R08: 000000000000000c R09: 0000000000000001


    [snip]

    > Code: 98 48 8b 94 c7 e0 00 00 00 48 8b 32 44 8b 6a 18 48 85 f6 75 13 49 89
    > d0 44 89 e6 83 ca ff e8 b3 f8 ff ff 48 89 c6 eb 0a 8b 42 14 <48> 8b 04 c6 48
    > 89 02 53 9d 31 c0 41 c1 ec 0f 48 85 f6 0f 95 c0


    The code decodes to:

    mov 0x14(%rdx),%eax
    mov (%rsi,%rax,8),%rax <--- HERE!

    which corresponds to this code in mm/slub.c:

    c->freelist = object[c->offset];

    So the mov 0x14(%rdx) is the loading of c->offset, which means that
    the pointer "c" is held in %rdx (= 0xffff880001016700), and the
    variable c->offset is held in %eax (= 0).

    It also means that the pointer "object" is held in %rsi (= 0x1a40ca0).
    Now, clearly the object pointer is bogus. It was loaded on the line
    above:

    object = c->freelist;

    ...and it may look like c->freelist has become corrupted. This one is
    again loaded from the line:

    c = get_cpu_slab(s, smp_processor_id());

    Everything seems normal, except the c->freelist pointer.

    The rest of the messages are from the same function, but from
    different code paths:

    > [] mempool_alloc_slab+0x16/0x18
    > [] mempool_alloc+0x3e/0xfa
    > [] bio_alloc_bioset+0x27/0x94
    > [] bio_alloc+0x15/0x24
    > [] submit_bh+0x78/0x119
    > [] journal_commit_transaction+0x76d/0xccd
    > [] kjournald+0xc8/0x200
    > [] kthread+0x4e/0x7c
    > [] child_rip+0xa/0x11


    and

    > [] scsi_pool_alloc_command+0x4d/0x73
    > [] __scsi_get_command+0x1e/0x9c
    > [] scsi_get_command+0x36/0xa5
    > [] scsi_get_cmd_from_req+0x2a/0x5e
    > [] scsi_setup_fs_cmnd+0x5d/0x87
    > [] sd_prep_fn+0x66/0x449
    > [] elv_next_request+0xe3/0x1a4
    > [] scsi_request_fn+0x80/0x334
    > [] __generic_unplug_device+0x29/0x2e
    > [] generic_unplug_device+0x2e/0x3c
    > [] blk_unplug_work+0x19/0x1b
    > [] run_workqueue+0x81/0x10a
    > [] worker_thread+0xdd/0xea
    > [] kthread+0x4e/0x7c
    > [] child_rip+0xa/0x11


    ....this seems to suggest that none of the backtraces may actually give
    a good clue as to who caused the corruption to begin with. (In other
    words, I have no more clue than you on who to Cc this.)

    Does the number 0x1a40ca0 look familiar to anybody?

    Dieter: If this is reproducible, it would probably help quite a bit to
    configure the kernel with CONFIG_SLUB_DEBUG and boot with
    slub_debug=FZPUT (unless you already have CONFIG_SLUB_DEBUG_ON set, in
    which case you are already running with the SLUB debugging at boot).
    It might catch the corruption before it becomes fatal, or give us some
    more clues anyway.


    Vegard

    --
    "The animistic metaphor of the bug that maliciously sneaked in while
    the programmer was not looking is intellectually dishonest as it
    disguises that the error is the programmer's own creation."
    -- E. W. Dijkstra, EWD1036
    --
    To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
    the body of a message to majordomo@vger.kernel.org
    More majordomo info at http://vger.kernel.org/majordomo-info.html
    Please read the FAQ at http://www.tux.org/lkml/

  3. Re: Current Git: BUG: unable to handle kernel paging request at 0000000001a40ca0

    Vegard Nossum schrieb:
    > Hi,


    Hi,

    >
    > Dieter: If this is reproducible, it would probably help quite a bit to
    > configure the kernel with CONFIG_SLUB_DEBUG and boot with
    > slub_debug=FZPUT (unless you already have CONFIG_SLUB_DEBUG_ON set, in
    > which case you are already running with the SLUB debugging at boot).
    > It might catch the corruption before it becomes fatal, or give us some
    > more clues anyway.
    >


    I am bisecting the bug right now, as it is kind of reproducible. Its
    always network related, sometimes it happens when init brings up the
    nics, sometimes afterwards with ifconfig etc.

    I will add CONFIG_SLUB_DEBUG_ON for the next runs, next time it happens
    I will post the output.

    >
    > Vegard
    >

    cu
    Dieter
    --
    To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
    the body of a message to majordomo@vger.kernel.org
    More majordomo info at http://vger.kernel.org/majordomo-info.html
    Please read the FAQ at http://www.tux.org/lkml/

  4. Re: Current Git: BUG: unable to handle kernel paging request at 0000000001a40ca0

    Dieter Ries schrieb:
    > Vegard Nossum schrieb:
    >> Hi,

    >

    Hi

    >> Dieter: If this is reproducible, it would probably help quite a bit to
    >> configure the kernel with CONFIG_SLUB_DEBUG and boot with
    >> slub_debug=FZPUT (unless you already have CONFIG_SLUB_DEBUG_ON set, in
    >> which case you are already running with the SLUB debugging at boot).
    >> It might catch the corruption before it becomes fatal, or give us some
    >> more clues anyway.


    I tried to bisect the bug, which failed because there were too many
    kernels not booting with other problems, I guess bisecting just fails in
    the merge window.

    With CONFIG_SLUB_DEBUG_ON the output looks different, unfortunately
    netconsole stops before those are transmitted.

    As there are always some lines about e1000 in the backtraces, I tried to
    boot without LAN cable connected, and it worked, and crashed afterwards
    when I plugged the cable in, with a bug in net/core/dev.c.

    I will search on tomorrow, as it is getting quite late now.

    Should I copy the messages with CONFIG_SLUB_DEBUG_ON by hand, or are
    just some parts important?

    cu
    Dieter
    --
    To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
    the body of a message to majordomo@vger.kernel.org
    More majordomo info at http://vger.kernel.org/majordomo-info.html
    Please read the FAQ at http://www.tux.org/lkml/

  5. Re: Current Git: BUG: unable to handle kernel paging request at 0000000001a40ca0

    On Wed, Jul 23, 2008 at 11:53 PM, Dieter Ries wrote:
    >>> Dieter: If this is reproducible, it would probably help quite a bit to
    >>> configure the kernel with CONFIG_SLUB_DEBUG and boot with
    >>> slub_debug=FZPUT (unless you already have CONFIG_SLUB_DEBUG_ON set, in
    >>> which case you are already running with the SLUB debugging at boot).
    >>> It might catch the corruption before it becomes fatal, or give us some
    >>> more clues anyway.

    >
    > I tried to bisect the bug, which failed because there were too many kernels
    > not booting with other problems, I guess bisecting just fails in the merge
    > window.
    >
    > With CONFIG_SLUB_DEBUG_ON the output looks different, unfortunately
    > netconsole stops before those are transmitted.
    >
    > As there are always some lines about e1000 in the backtraces, I tried to
    > boot without LAN cable connected, and it worked, and crashed afterwards when
    > I plugged the cable in, with a bug in net/core/dev.c.
    >
    > I will search on tomorrow, as it is getting quite late now.
    >
    > Should I copy the messages with CONFIG_SLUB_DEBUG_ON by hand, or are just
    > some parts important?


    There were some e1000 patches in flight on LKML recently; you might be
    able to find them and see if it helps you. It also seems that some
    changes were just committed to -git, so I guess you should try the
    very latest from there.

    You also Cced netdev from the start, so somebody from there should be
    able to help you more from here than I. :-)


    Vegard

    --
    "The animistic metaphor of the bug that maliciously sneaked in while
    the programmer was not looking is intellectually dishonest as it
    disguises that the error is the programmer's own creation."
    -- E. W. Dijkstra, EWD1036
    --
    To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
    the body of a message to majordomo@vger.kernel.org
    More majordomo info at http://vger.kernel.org/majordomo-info.html
    Please read the FAQ at http://www.tux.org/lkml/

  6. Re: Current Git: BUG: unable to handle kernel paging request at 0000000001a40ca0

    Vegard Nossum schrieb:
    > On Wed, Jul 23, 2008 at 11:53 PM, Dieter Ries wrote:
    >>>> Dieter: If this is reproducible, it would probably help quite a bit to
    >>>> configure the kernel with CONFIG_SLUB_DEBUG and boot with
    >>>> slub_debug=FZPUT (unless you already have CONFIG_SLUB_DEBUG_ON set, in
    >>>> which case you are already running with the SLUB debugging at boot).
    >>>> It might catch the corruption before it becomes fatal, or give us some
    >>>> more clues anyway.

    >> I tried to bisect the bug, which failed because there were too many kernels
    >> not booting with other problems, I guess bisecting just fails in the merge
    >> window.
    >>
    >> With CONFIG_SLUB_DEBUG_ON the output looks different, unfortunately
    >> netconsole stops before those are transmitted.


    I think I managed to catch one of those:


    general protection fault: 0000 [1] SMP
    CPU 0
    Modules linked in:
    Pid: 0, comm: swapper Not tainted 2.6.26-06373-gcaf076e #49
    RIP: 0010:[] []
    nf_nat_move_storage+0x21/0x7a
    RSP: 0018:ffffffff8091ab80 EFLAGS: 00010206
    RAX: ffffffff805e08d8 RBX: ffff88007d1fb948 RCX: 000000000000006b
    RDX: ffff88007d175e10 RSI: ffff88007d175e7b RDI: ffff88007d1fb948
    RBP: ffffffff8091aba0 R08: 0000000000000000 R09: ffff88007d175e90
    R10: ffffe20000000008 R11: ffff88007d175e10 R12: 59d2c3ffff88007d
    R13: ffff88007d175e7b R14: 00000000000000a0 R15: 0000000000000001
    FS: 0000000000000000(0000) GS:ffffffff8089ee80(0000) knlGS:0000000000000000
    CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b
    CR2: 0000000000000000 CR3: 0000000000201000 CR4: 00000000000006e0
    DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
    DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
    Process swapper (pid: 0, threadinfo ffffffff808b0000, task ffffffff80842340)
    Stack: 0000000000000002 ffff88007d3d2000 ffff88007d1fb948 0000000000000070
    ffffffff8091abf0 ffffffff8059d3c4 ffffffff8091ac40 0000000100000001
    ffffffff809e3658 ffff88007d3d2000 0000000000000002 ffff88007f9f6500
    Call Trace:
    [] __nf_ct_ext_add+0x15f/0x1f7
    [] nf_nat_fn+0x84/0x152
    [] nf_nat_in+0x2f/0x71
    [] nf_iterate+0x48/0x85
    [] ? ip_rcv_finish+0x0/0x35d
    [] nf_hook_slow+0x63/0xcb
    [] ? ip_rcv_finish+0x0/0x35d
    [] ? __slab_alloc+0x413/0x4bd
    [] ip_rcv+0x257/0x297
    [] netif_receive_skb+0x1f1/0x263
    [] e1000_receive_skb+0x46/0x5d
    [] e1000_clean_rx_irq+0x20e/0x2a6
    [] ? getnstimeofday+0x3f/0xa0
    [] e1000_clean+0x6d/0x218
    [] ? hrtimer_get_next_event+0xa8/0xb8
    [] net_rx_action+0xa9/0x17c
    [] __do_softirq+0x65/0xd5
    [] call_softirq+0x1c/0x28
    [] do_softirq+0x39/0x77
    [] irq_exit+0x44/0x85
    [] do_IRQ+0x147/0x16a
    [] ret_from_intr+0x0/0xa
    [] ? acpi_idle_enter_bm+0x2a7/0x317
    [] ? acpi_idle_enter_bm+0x29d/0x317
    [] ? menu_select+0x75/0x9e
    [] ? cpuidle_idle_call+0x75/0xa7
    [] ? cpu_idle+0x69/0x8c
    [] ? rest_init+0x61/0x63
    [] ? start_kernel+0x2ad/0x2b9
    [] ? x86_64_start_reservations+0x84/0x88
    [] ? x86_64_start_kernel+0xe4/0xeb


    Code: ff 5b 41 5c 41 5d 41 5e c9 c3 55 48 89 e5 41 55 41 54 53 48 83 ec
    08 e8 c6 a8 c2 ff 4c 8b 66 20 48 89 fb 49 89 f5 4d 85 e4 74 51 <49> f7
    44 24 78 80 01 00 00 74 46 48 c7 c7 78 6a 9e 80 e8 8f 2e
    RIP [] nf_nat_move_storage+0x21/0x7a
    RSP
    ---[ end trace 6f6148e13aab302e ]---
    Kernel panic - not syncing: Aiee, killing interrupt handler!

    >>
    >> As there are always some lines about e1000 in the backtraces, I tried to
    >> boot without LAN cable connected, and it worked, and crashed afterwards when
    >> I plugged the cable in, with a bug in net/core/dev.c.
    >>
    >> Should I copy the messages with CONFIG_SLUB_DEBUG_ON by hand, or are just
    >> some parts important?

    >
    > There were some e1000 patches in flight on LKML recently; you might be
    > able to find them and see if it helps you. It also seems that some
    > changes were just committed to -git, so I guess you should try the
    > very latest from there.


    I reverted some of the last patches concerning e1000 one by one, but the
    last ~12 which I did revert yet didnt solve the problem.

    >
    > You also Cced netdev from the start, so somebody from there should be
    > able to help you more from here than I. :-)
    >
    >
    > Vegard
    >

    cu
    Dieter

    --
    3rd Law of Computing:
    Anything that can go wr
    fortune: Segmentation violation -- Core dumped
    --
    To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
    the body of a message to majordomo@vger.kernel.org
    More majordomo info at http://vger.kernel.org/majordomo-info.html
    Please read the FAQ at http://www.tux.org/lkml/

  7. Re: Current Git: BUG: unable to handle kernel paging request at 0000000001a40ca0

    Pekka Enberg wrote:
    > On Thu, Jul 24, 2008 at 9:51 AM, Dieter Ries wrote:
    >> I think I managed to catch one of those:
    >>
    >> general protection fault: 0000 [1] SMP
    >> CPU 0
    >> Modules linked in:
    >> Pid: 0, comm: swapper Not tainted 2.6.26-06373-gcaf076e #49
    >> RIP: 0010:[] []
    >> nf_nat_move_storage+0x21/0x7a
    >> ...
    >> Call Trace:
    >> [] __nf_ct_ext_add+0x15f/0x1f7
    >> [] nf_nat_fn+0x84/0x152
    >> [] nf_nat_in+0x2f/0x71

    > Dieter, can you please try this patch:
    >
    > http://lkml.org/lkml/2008/7/24/186
    >
    > Patrick, it looks like this could be the same bug Ingo is hitting?


    Yes, that most likely the same underlying problem.
    --
    To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
    the body of a message to majordomo@vger.kernel.org
    More majordomo info at http://vger.kernel.org/majordomo-info.html
    Please read the FAQ at http://www.tux.org/lkml/

  8. Re: Current Git: BUG: unable to handle kernel paging request at 0000000001a40ca0

    Hi Dieter,

    On Thu, Jul 24, 2008 at 9:51 AM, Dieter Ries wrote:
    > Vegard Nossum schrieb:
    >>
    >> On Wed, Jul 23, 2008 at 11:53 PM, Dieter Ries wrote:
    >>>>>
    >>>>> Dieter: If this is reproducible, it would probably help quite a bit to
    >>>>> configure the kernel with CONFIG_SLUB_DEBUG and boot with
    >>>>> slub_debug=FZPUT (unless you already have CONFIG_SLUB_DEBUG_ON set, in
    >>>>> which case you are already running with the SLUB debugging at boot).
    >>>>> It might catch the corruption before it becomes fatal, or give us some
    >>>>> more clues anyway.
    >>>
    >>> I tried to bisect the bug, which failed because there were too many
    >>> kernels
    >>> not booting with other problems, I guess bisecting just fails in the
    >>> merge
    >>> window.
    >>>
    >>> With CONFIG_SLUB_DEBUG_ON the output looks different, unfortunately
    >>> netconsole stops before those are transmitted.

    >
    > I think I managed to catch one of those:
    >
    >
    > general protection fault: 0000 [1] SMP
    > CPU 0
    > Modules linked in:
    > Pid: 0, comm: swapper Not tainted 2.6.26-06373-gcaf076e #49
    > RIP: 0010:[] []
    > nf_nat_move_storage+0x21/0x7a
    > RSP: 0018:ffffffff8091ab80 EFLAGS: 00010206
    > RAX: ffffffff805e08d8 RBX: ffff88007d1fb948 RCX: 000000000000006b
    > RDX: ffff88007d175e10 RSI: ffff88007d175e7b RDI: ffff88007d1fb948
    > RBP: ffffffff8091aba0 R08: 0000000000000000 R09: ffff88007d175e90
    > R10: ffffe20000000008 R11: ffff88007d175e10 R12: 59d2c3ffff88007d
    > R13: ffff88007d175e7b R14: 00000000000000a0 R15: 0000000000000001
    > FS: 0000000000000000(0000) GS:ffffffff8089ee80(0000) knlGS:0000000000000000
    > CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b
    > CR2: 0000000000000000 CR3: 0000000000201000 CR4: 00000000000006e0
    > DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
    > DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
    > Process swapper (pid: 0, threadinfo ffffffff808b0000, task ffffffff80842340)
    > Stack: 0000000000000002 ffff88007d3d2000 ffff88007d1fb948 0000000000000070
    > ffffffff8091abf0 ffffffff8059d3c4 ffffffff8091ac40 0000000100000001
    > ffffffff809e3658 ffff88007d3d2000 0000000000000002 ffff88007f9f6500
    > Call Trace:
    > [] __nf_ct_ext_add+0x15f/0x1f7
    > [] nf_nat_fn+0x84/0x152
    > [] nf_nat_in+0x2f/0x71
    > [] nf_iterate+0x48/0x85
    > [] ? ip_rcv_finish+0x0/0x35d
    > [] nf_hook_slow+0x63/0xcb
    > [] ? ip_rcv_finish+0x0/0x35d
    > [] ? __slab_alloc+0x413/0x4bd
    > [] ip_rcv+0x257/0x297
    > [] netif_receive_skb+0x1f1/0x263
    > [] e1000_receive_skb+0x46/0x5d
    > [] e1000_clean_rx_irq+0x20e/0x2a6
    > [] ? getnstimeofday+0x3f/0xa0
    > [] e1000_clean+0x6d/0x218
    > [] ? hrtimer_get_next_event+0xa8/0xb8
    > [] net_rx_action+0xa9/0x17c
    > [] __do_softirq+0x65/0xd5
    > [] call_softirq+0x1c/0x28
    > [] do_softirq+0x39/0x77
    > [] irq_exit+0x44/0x85
    > [] do_IRQ+0x147/0x16a
    > [] ret_from_intr+0x0/0xa


    Dieter, can you please try this patch:

    http://lkml.org/lkml/2008/7/24/186

    Patrick, it looks like this could be the same bug Ingo is hitting?
    --
    To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
    the body of a message to majordomo@vger.kernel.org
    More majordomo info at http://vger.kernel.org/majordomo-info.html
    Please read the FAQ at http://www.tux.org/lkml/

  9. Re: Current Git: BUG: unable to handle kernel paging request at 0000000001a40ca0

    Pekka Enberg schrieb:
    > Hi Dieter,


    Hi,

    >
    > On Thu, Jul 24, 2008 at 9:51 AM, Dieter Ries wrote:
    >> I think I managed to catch one of those:
    >>
    >>
    >> general protection fault: 0000 [1] SMP
    >> CPU 0
    >> Modules linked in:
    >> Pid: 0, comm: swapper Not tainted 2.6.26-06373-gcaf076e #49
    >> RIP: 0010:[] []
    >> nf_nat_move_storage+0x21/0x7a
    >> RSP: 0018:ffffffff8091ab80 EFLAGS: 00010206
    >> RAX: ffffffff805e08d8 RBX: ffff88007d1fb948 RCX: 000000000000006b
    >> RDX: ffff88007d175e10 RSI: ffff88007d175e7b RDI: ffff88007d1fb948
    >> RBP: ffffffff8091aba0 R08: 0000000000000000 R09: ffff88007d175e90
    >> R10: ffffe20000000008 R11: ffff88007d175e10 R12: 59d2c3ffff88007d
    >> R13: ffff88007d175e7b R14: 00000000000000a0 R15: 0000000000000001
    >> FS: 0000000000000000(0000) GS:ffffffff8089ee80(0000) knlGS:0000000000000000
    >> CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b
    >> CR2: 0000000000000000 CR3: 0000000000201000 CR4: 00000000000006e0
    >> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
    >> DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
    >> Process swapper (pid: 0, threadinfo ffffffff808b0000, task ffffffff80842340)
    >> Stack: 0000000000000002 ffff88007d3d2000 ffff88007d1fb948 0000000000000070
    >> ffffffff8091abf0 ffffffff8059d3c4 ffffffff8091ac40 0000000100000001
    >> ffffffff809e3658 ffff88007d3d2000 0000000000000002 ffff88007f9f6500
    >> Call Trace:
    >> [] __nf_ct_ext_add+0x15f/0x1f7
    >> [] nf_nat_fn+0x84/0x152
    >> [] nf_nat_in+0x2f/0x71
    >> [] nf_iterate+0x48/0x85
    >> [] ? ip_rcv_finish+0x0/0x35d
    >> [] nf_hook_slow+0x63/0xcb
    >> [] ? ip_rcv_finish+0x0/0x35d
    >> [] ? __slab_alloc+0x413/0x4bd
    >> [] ip_rcv+0x257/0x297
    >> [] netif_receive_skb+0x1f1/0x263
    >> [] e1000_receive_skb+0x46/0x5d
    >> [] e1000_clean_rx_irq+0x20e/0x2a6
    >> [] ? getnstimeofday+0x3f/0xa0
    >> [] e1000_clean+0x6d/0x218
    >> [] ? hrtimer_get_next_event+0xa8/0xb8
    >> [] net_rx_action+0xa9/0x17c
    >> [] __do_softirq+0x65/0xd5
    >> [] call_softirq+0x1c/0x28
    >> [] do_softirq+0x39/0x77
    >> [] irq_exit+0x44/0x85
    >> [] do_IRQ+0x147/0x16a
    >> [] ret_from_intr+0x0/0xa

    >
    > Dieter, can you please try this patch:
    >
    > http://lkml.org/lkml/2008/7/24/186
    >
    > Patrick, it looks like this could be the same bug Ingo is hitting?
    >


    it looks like it is fixed with the patch. But I (still/now) have the
    following Warning:


    WARNING: at net/core/dev.c:1344 __netif_schedule+0x31/0x79()
    Modules linked in:
    Pid: 9, comm: events/0 Not tainted 2.6.26-06371-g338b9bb-dirty #51

    Call Trace:
    [] warn_on_slowpath+0x5d/0x92
    [] ? lock_timer_base+0x2b/0x4f
    [] ? __mod_timer+0xc6/0xd8
    [] ? ftrace_record_ip+0x211/0x262
    [] ? mcount_call+0x5/0x31
    [] __netif_schedule+0x31/0x79
    [] e1000_watchdog_task+0x47d/0x695
    [] ? e1000_watchdog_task+0x0/0x695
    [] run_workqueue+0x81/0x10a
    [] worker_thread+0xdd/0xea
    [] ? autoremove_wake_function+0x0/0x3d
    [] ? worker_thread+0x0/0xea
    [] kthread+0x4e/0x7c
    [] child_rip+0xa/0x11
    [] ? kthread+0x0/0x7c
    [] ? child_rip+0x0/0x11

    ---[ end trace a344035e20d497ce ]---

    but this is work in progress I guess.

    Thanks for support!

    cu
    Dieter
    --
    To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
    the body of a message to majordomo@vger.kernel.org
    More majordomo info at http://vger.kernel.org/majordomo-info.html
    Please read the FAQ at http://www.tux.org/lkml/

+ Reply to Thread