[BUG] re-modprobe a nand controller driver module will cause system crash.


  1. [BUG] re-modprobe a nand controller driver module will cause system crash.

    Hi folks,

    I recently found a subtle bug that appears to be in the mtdcore layer.
    The full details are at
    https://blackfin.uclinux.org/gf/proj...r_item_id=4463.

    Briefly speaking,
    1) modprobe a nand controller driver, which calls add_mtd_partition().
    2) add_mtd_partition->add_devices->blktrans_notify_add->mtdblock_add_mtd->add_mtd_blktrans_dev
    3) In add_mtd_blktrans_dev, alloc_disk is called to create a new
    gendisk structure for each partition.
    4) "gd->queue = tr->blkcore_priv->rq;"
    However many partitions there are (2 in my test), there will be
    that many gendisk structures but only 1 queue.
    They all share the request_queue that is created in
    register_mtd_blktrans.
    5) The mtdblockd kthread services this request_queue for the mtdblock layer.
    6) request_queue contains a backing_dev_info structure as a member
    (not a pointer), so several mtd partitions (several gendisks) share
    a single bdi structure instance.
    7) So the problem is in add_disk():
    bdi_register_dev(bdi, MKDEV(disk->major, disk->first_minor));
    For the 1st partition, mtdblock0, it creates /sys/class/bdi/31:0
    and records the registration in the bdi structure instance.
    For the 2nd partition, mtdblock1, the bdi structure instance is
    the same one as for the 1st partition, so it overwrites the bdi
    structure and creates /sys/class/bdi/31:1.
    The bdi info for the 1st partition is then completely lost.
    8) When we rmmod the nand controller driver, del_mtd_partition
    removes only /sys/class/bdi/31:1 and leaves the 1st partition's
    /sys/class/bdi/31:0 behind.
    9) Running modprobe again then triggers the bug, because the stale
    31:0 entry is still registered.
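
The sequence in steps 4) to 9) can be sketched as a small user-space C model. This is not kernel code: every name in it (model_bdi, model_queue, model_disk, model_add_disk, model_del_disk, reload_cycle, and the bdi_dir array standing in for /sys/class/bdi) is invented for illustration. It assumes, as described above, that the shared bdi remembers only the most recently registered name and that removal deletes only that one entry.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>

#define MAX_ENTRIES 8
#define MAX_PARTS   4
#define NAME_LEN    32

/* Hypothetical stand-in for /sys/class/bdi: a flat set of unique names. */
static char bdi_dir[MAX_ENTRIES][NAME_LEN];

static int dir_find(const char *name)
{
    for (int i = 0; i < MAX_ENTRIES; i++)
        if (bdi_dir[i][0] != '\0' && strcmp(bdi_dir[i], name) == 0)
            return i;
    return -1;
}

/* Adding a name that already exists fails, as a sysfs directory would. */
static int dir_add(const char *name)
{
    if (dir_find(name) >= 0)
        return -EEXIST;
    for (int i = 0; i < MAX_ENTRIES; i++) {
        if (bdi_dir[i][0] == '\0') {
            snprintf(bdi_dir[i], NAME_LEN, "%s", name);
            return 0;
        }
    }
    return -ENOSPC;
}

static void dir_remove(const char *name)
{
    int i = dir_find(name);
    if (i >= 0)
        bdi_dir[i][0] = '\0';
}

/* Model types: the bdi is embedded in the queue, and all disks share the queue. */
struct model_bdi   { char name[NAME_LEN]; };   /* remembers ONE registration */
struct model_queue { struct model_bdi bdi; };  /* member, not a pointer */
struct model_disk  { int major, first_minor; struct model_queue *queue; };

/* Models the registration step in add_disk(): creates "major:minor"
 * and overwrites whatever the shared bdi recorded before. */
static int model_add_disk(struct model_disk *d)
{
    char name[NAME_LEN];
    int err;

    snprintf(name, sizeof(name), "%d:%d", d->major, d->first_minor);
    err = dir_add(name);
    if (err)
        return err;
    memcpy(d->queue->bdi.name, name, sizeof(name));
    return 0;
}

/* Models removal: only the name currently recorded in the bdi is deleted. */
static void model_del_disk(struct model_disk *d)
{
    struct model_bdi *bdi = &d->queue->bdi;

    if (bdi->name[0] != '\0') {
        dir_remove(bdi->name);
        bdi->name[0] = '\0';
    }
}

/* modprobe, rmmod, modprobe again; returns the result of the second load. */
int reload_cycle(int nparts)
{
    struct model_queue q;
    struct model_disk disks[MAX_PARTS];

    assert(nparts >= 1 && nparts <= MAX_PARTS);
    memset(bdi_dir, 0, sizeof(bdi_dir));
    memset(&q, 0, sizeof(q));

    for (int i = 0; i < nparts; i++) {
        disks[i] = (struct model_disk){ 31, i, &q };
        if (model_add_disk(&disks[i]))      /* first modprobe */
            return -EINVAL;
    }
    for (int i = 0; i < nparts; i++)
        model_del_disk(&disks[i]);          /* rmmod */
    for (int i = 0; i < nparts; i++) {
        int err = model_add_disk(&disks[i]); /* second modprobe */
        if (err)
            return err;
    }
    return 0;
}
```

In this model, reload_cycle(1) returns 0, while reload_cycle(2) returns -EEXIST because the 31:0 entry survives the unload. That matches the observation that a single partition does not trigger the problem.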

    This bug is not specific to my nand flash controller driver; it
    should be fixed in the mtdblock layer.
    With only one partition the bug does not occur at all. I tried to
    fix it, but the fix touches mtdblock/mtd_blktrans/block/bdi, and it
    is difficult for me to find a way to satisfy all of those parts
    with minimal changes.

    IMHO, can we just simply remove the bdi_register_dev (in add_disk) and
    bdi_unregister_dev (in unlink_disk)?

    P.S. The bug is also present in the latest 2.6.27 mainline kernel.

    Thanks
    -Bryan
    --
    To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
    the body of a message to majordomo@vger.kernel.org
    More majordomo info at http://vger.kernel.org/majordomo-info.html
    Please read the FAQ at http://www.tux.org/lkml/

  2. Re: [BUG] re-modprobe a nand controller driver module will cause system crash.

    Hi folks,

    Did anyone meet this issue before?

    -Bryan


  3. Re: [BUG] re-modprobe a nand controller driver module will cause system crash.

    On Tue, Oct 21, 2008 at 23:24, Bryan Wu wrote:
    > Did anyone meet this issue before?


    actually i think we can see the same issue with the m25p80 driver ?
    if i build it as a module and load/unload it, it crashes with same
    name errors ...

    root:/> rmmod m25p80
    root:/> modprobe m25p80
    m25p80 spi0.2: w25x10 (128 Kbytes)
    Creating 3 MTD partitions on "m25p80":
    0x00000000-0x00040000 : "bootloader(spi)"
    mtd: partition "bootloader(spi)" extends beyond the end of device
    "m25p80" -- size truncated to 0x20000
    kobject_add_internal failed for 31:0 with -EEXIST, don't try to
    register things with the same name in the same directory.

    -mike

  4. Re: [BUG] re-modprobe a nand controller driver module will cause system crash.

    On Wed, Oct 22, 2008 at 10:59 PM, Mike Frysinger wrote:
    > On Tue, Oct 21, 2008 at 23:24, Bryan Wu wrote:
    >> Did anyone meet this issue before?

    >
    > actually i think we can see the same issue with the m25p80 driver ?
    > if i build it as a module and load/unload it, it crashes with same
    > name errors ...
    >


    Right, this is a common bug in the mtd core.

    -Bryan
