[PATCH]IB/ehca:reject dynamic memory add/remove


  1. [PATCH]IB/ehca:reject dynamic memory add/remove

    Since the ehca device driver does not support dynamic memory add and remove
    operations, the driver must explicitly reject such requests in order to prevent
    unpredictable behaviors related to memory regions already occupied and being
    used by InfiniBand applications.
    The solution is to add a memory notifier to the ehca device driver: if a request
    for dynamic memory add or remove comes in while an ehca adapter is attached to
    the LPAR, ehca rejects it.

    Signed-off-by: Stefan Roscher
    ---

    diff -Nurp linux-2.6.27-rc6-7/drivers/infiniband/hw/ehca/ehca_main.c linux-2.6.27-rc6-7.new/drivers/infiniband/hw/ehca/ehca_main.c
    --- linux-2.6.27-rc6-7/drivers/infiniband/hw/ehca/ehca_main.c 2008-09-16 18:19:27.000000000 +0200
    +++ linux-2.6.27-rc6-7.new/drivers/infiniband/hw/ehca/ehca_main.c 2008-10-03 13:52:50.000000000 +0200
    @@ -44,6 +44,8 @@
    #include
    #endif

    +#include
    +#include
    #include "ehca_classes.h"
    #include "ehca_iverbs.h"
    #include "ehca_mrmw.h"
    @@ -964,6 +966,41 @@ void ehca_poll_eqs(unsigned long data)
    spin_unlock(&shca_list_lock);
    }

    +static int ehca_mem_notifier(struct notifier_block *nb,
    + unsigned long action, void *data)
    +{
    + static unsigned long ehca_dmem_warn_time;
    +
    + switch (action) {
    + case MEM_CANCEL_OFFLINE:
    + case MEM_CANCEL_ONLINE:
    + case MEM_ONLINE:
    + case MEM_OFFLINE:
    + return NOTIFY_OK;
    + case MEM_GOING_ONLINE:
    + case MEM_GOING_OFFLINE:
    + /* only ok if no hca is attached to the lpar */
    + spin_lock(&shca_list_lock);
    + if (list_empty(&shca_list)) {
    + spin_unlock(&shca_list_lock);
    + return NOTIFY_OK;
    + } else {
    + spin_unlock(&shca_list_lock);
    + if (printk_timed_ratelimit(&ehca_dmem_warn_time,
    + 30 * 1000))
    + ehca_gen_err("DMEM operations are not allowed "
    + "as long as an ehca adapter is "
    + "attached to the LPAR");
    + return NOTIFY_BAD;
    + }
    + }
    + return NOTIFY_OK;
    +}
    +
    +static struct notifier_block ehca_mem_nb = {
    + .notifier_call = ehca_mem_notifier,
    +};
    +
    static int __init ehca_module_init(void)
    {
    int ret;
    @@ -991,6 +1028,12 @@ static int __init ehca_module_init(void)
    goto module_init2;
    }

    + ret = register_memory_notifier(&ehca_mem_nb);
    + if (ret) {
    + ehca_gen_err("Failed registering memory add/remove notifier");
    + goto module_init3;
    + }
    +
    if (ehca_poll_all_eqs != 1) {
    ehca_gen_err("WARNING!!!");
    ehca_gen_err("It is possible to lose interrupts.");
    @@ -1003,6 +1046,9 @@ static int __init ehca_module_init(void)

    return 0;

    +module_init3:
    + ibmebus_unregister_driver(&ehca_driver);
    +
    module_init2:
    ehca_destroy_slab_caches();

    @@ -1018,6 +1064,8 @@ static void __exit ehca_module_exit(void

    ibmebus_unregister_driver(&ehca_driver);

    + unregister_memory_notifier(&ehca_mem_nb);
    +
    ehca_destroy_slab_caches();

    ehca_destroy_comp_pool();
    --
    To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
    the body of a message to majordomo@vger.kernel.org
    More majordomo info at http://vger.kernel.org/majordomo-info.html
    Please read the FAQ at http://www.tux.org/lkml/
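
The veto logic in the patch above can be illustrated outside the kernel. What follows is a minimal userspace sketch, not the driver code: the enum values, `struct notifier`, `adapter_veto`, and `memory_change_allowed` are hypothetical names invented for this model and do not correspond to the kernel's `MEM_*` or `NOTIFY_*` definitions. It assumes only what the patch shows: the "going online/offline" phases are refused while at least one adapter is attached, and every other phase is let through.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Model-only codes: hypothetical stand-ins, not the kernel's values. */
enum mem_action {
    ACT_GOING_ONLINE,
    ACT_GOING_OFFLINE,
    ACT_ONLINE,
    ACT_OFFLINE,
    ACT_CANCEL_ONLINE,
    ACT_CANCEL_OFFLINE
};

enum verdict { VERDICT_OK, VERDICT_BAD };

typedef enum verdict (*notifier_fn)(enum mem_action action, void *ctx);

/* One entry in the (modelled) notifier chain. */
struct notifier {
    notifier_fn call;
    void *ctx;
};

/* Stand-in for the driver's adapter list: only the count matters here. */
struct adapter_state {
    size_t attached;
};

/* Refuse only the "going" phases, and only while an adapter is attached. */
enum verdict adapter_veto(enum mem_action action, void *ctx)
{
    const struct adapter_state *st = ctx;

    switch (action) {
    case ACT_GOING_ONLINE:
    case ACT_GOING_OFFLINE:
        return st->attached > 0 ? VERDICT_BAD : VERDICT_OK;
    default:
        return VERDICT_OK;
    }
}

/* Walk the chain; a single veto refuses the whole memory operation. */
bool memory_change_allowed(const struct notifier *chain, size_t n,
                           enum mem_action action)
{
    for (size_t i = 0; i < n; i++) {
        if (chain[i].call(action, chain[i].ctx) == VERDICT_BAD)
            return false;
    }
    return true;
}

/* Usage example: attach one adapter, then detach it again. */
void veto_model_selfcheck(void)
{
    struct adapter_state st = { .attached = 1 };
    struct notifier chain[] = { { adapter_veto, &st } };

    assert(!memory_change_allowed(chain, 1, ACT_GOING_OFFLINE));
    assert(memory_change_allowed(chain, 1, ACT_OFFLINE));
    st.attached = 0;
    assert(memory_change_allowed(chain, 1, ACT_GOING_OFFLINE));
}
```

Calling `veto_model_selfcheck()` from a `main()` exercises the model. The real notifier additionally takes `shca_list_lock` around the list check and rate-limits its error message, both of which this sketch leaves out.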

  2. Re: [PATCH]IB/ehca:reject dynamic memory add/remove

    On Mon, 2008-10-13 at 13:10 +0200, Stefan Roscher wrote:
    > Since the ehca device driver does not support dynamic memory add and remove
    > operations, the driver must explicitly reject such requests in order to prevent
    > unpredictable behaviors related to memory regions already occupied and being
    > used by InfiniBand applications.
    > The solution is to add a memory notifier to the ehca device driver and if a request
    > for dynamic memory add or remove comes in, ehca will always reject it.


    Why doesn't the driver support it?

    This seems like an awfully extreme action to take. Do you have plans to
    support this in the driver soon?

    -- Dave


  3. Re: [PATCH]IB/ehca:reject dynamic memory add/remove

    On Tue, 2008-10-14 at 14:23 +0200, Stefan Roscher wrote:
    > On Monday 13 October 2008 07:09:26 pm Dave Hansen wrote:
    > > On Mon, 2008-10-13 at 13:10 +0200, Stefan Roscher wrote:
    > > > Since the ehca device driver does not support dynamic memory add and remove
    > > > operations, the driver must explicitly reject such requests in order to prevent
    > > > unpredictable behaviors related to memory regions already occupied and being
    > > > used by InfiniBand applications.
    > > > The solution is to add a memory notifier to the ehca device driver and if a request
    > > > for dynamic memory add or remove comes in, ehca will always reject it.

    > >
    > > Why doesn't the driver support it?
    > >
    > > This seems like an awfully extreme action to take. Do you have plans to
    > > support this in the driver soon?
    > >

    > There is currently a slight incompatibility between how openfabrics uses MRs
    > and how System p does DMEM add/remove, which basically disables this
    > support.
    > If you want to talk to the firmware developers, I can give you the right contacts.


    I wish I knew what an 'MR' is.

    Could you be a bit more specific so we can get a better changelog?
    Perhaps if we understand the situation better, we can come up with a
    better solution.

    Does this have anything in common with the problems with 16GB pages?

    -- Dave


  4. Re: [PATCH]IB/ehca:reject dynamic memory add/remove

    On Tue, 2008-10-14 at 14:23 +0200, Stefan Roscher wrote:
    > On Monday 13 October 2008 07:09:26 pm Dave Hansen wrote:
    > > On Mon, 2008-10-13 at 13:10 +0200, Stefan Roscher wrote:
    > > > Since the ehca device driver does not support dynamic memory add and remove
    > > > operations, the driver must explicitly reject such requests in order to prevent
    > > > unpredictable behaviors related to memory regions already occupied and being
    > > > used by InfiniBand applications.
    > > > The solution is to add a memory notifier to the ehca device driver and if a request
    > > > for dynamic memory add or remove comes in, ehca will always reject it.

    > >
    > > Why doesn't the driver support it?
    > >
    > > This seems like an awfully extreme action to take. Do you have plans to
    > > support this in the driver soon?
    > >

    > There is currently a slight incompatibility between how openfabrics uses MRs and how System p does DMEM add/remove,
    > which basically disables this support.
    > If you want to talk to the firmware developers, I can give you the right contacts.


    OK, Stefan and Christoph have very patiently explained the whole
    situation to me.

    The ehca driver needs to register any memory to which it might write
    with the hypervisor (which then talks to the hardware). For normal apps,
    it does get_user_pages() on the userspace memory and tells the
    hypervisor which pages it got.

    But there are in-kernel users of the hardware as well, such as NFS and
    the IP stack. These might potentially write anywhere in memory since,
    for instance, an skbuf can be allocated anywhere. Due to limitations in
    the Infiniband software stack, these users must all share the same
    "L_KEY", which is basically the identifier of the individual Infiniband
    "user".

    So, ehca registers all of the partition's memory with the hypervisor
    when it is loaded to prepare for these in-kernel users. (I think of
    this as mmap("/dev/mem") from a device to kernel memory.) The size of
    this table is restricted by the starting size of the physical memory
    allocated to the partition, so we can't oversize it and just fill it in
    later as memory is added (hypervisor limitation). We also can't resize
    it at runtime because of other hypervisor limitations.

    The only way to change it is basically to shut the adapter down, which
    Infiniband wouldn't deal well with since it doesn't have any
    retransmitting (Infiniband limitation).

    We could restrict the kernel area to which the ehca driver could write.
    We would then just bounce buffer things in and out of it. But, that'd
    be a latency and complexity nightmare. We could probably also modify
    each of the existing in-kernel users (NFS, etc...) to check to see
    whether the memory they're about to touch has been registered with the
    hypervisor. They could only bounce in cases where it hadn't. We could
    probably also detect these in-kernel users and only deny hotplugging in
    case one of them is actually active.

    But, for now, we take the cowardly approach and simply disable memory
    hotplug. You can still hotplug memory to the system; you just need to
    un-hotplug the ehca adapter from the partition first. This will, of
    course, be well documented in the already huge IBM manual.

    Back to the patch... Could we be a bit more explicit that a user can go
    to the HMC (the IBM control console) and remove the adapter? I'm just
    trying to think of the poor user looking at dmesg. The dude/dudette
    doing this is going to be sitting at the HMC. Can we get a helpful
    message to pop up to them? Will they even see dmesg output?

    -- Dave
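
One of the alternatives raised in the post above, having in-kernel users bounce only when a buffer lies outside the memory registered at load time, can be sketched in plain userspace C. This is a hedged illustration under stated assumptions, not a proposal for the driver: `struct registered_range`, `range_covers`, and `prepare_send` are hypothetical names, the registration is modelled as a single contiguous address range, and the bounce buffer is assumed to live inside that range.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical: the one range the adapter was told about at load time. */
struct registered_range {
    uintptr_t start;
    size_t len;
};

/* True if [buf, buf + len) lies entirely inside the registered range. */
bool range_covers(const struct registered_range *r, const void *buf,
                  size_t len)
{
    uintptr_t p = (uintptr_t)buf;

    /* Written to avoid overflow in p + len. */
    return p >= r->start && len <= r->len && p - r->start <= r->len - len;
}

/*
 * Return the address the hardware should be given: buf itself when it is
 * registered, otherwise a copy placed in the bounce buffer. Returns NULL
 * if the data does not fit in the bounce buffer. *bounced reports which
 * path was taken and is left untouched on failure.
 */
const void *prepare_send(const struct registered_range *r, const void *buf,
                         size_t len, void *bounce, size_t bounce_len,
                         bool *bounced)
{
    if (range_covers(r, buf, len)) {
        *bounced = false;
        return buf;
    }
    if (len > bounce_len)
        return NULL;
    memcpy(bounce, buf, len);
    *bounced = true;
    return bounce;
}

/* Usage example: one buffer inside the registered arena, one outside. */
void bounce_model_selfcheck(void)
{
    unsigned char arena[256] = { 0 };
    unsigned char outside[16];
    struct registered_range r = { (uintptr_t)arena, sizeof arena };
    bool bounced = true;

    memset(outside, 0xAB, sizeof outside);
    assert(prepare_send(&r, arena, 16, arena + 128, 128, &bounced) == arena);
    assert(!bounced);
    assert(prepare_send(&r, outside, sizeof outside, arena + 128, 128,
                        &bounced) == arena + 128);
    assert(bounced);
}
```

The extra copy on the unregistered path is the latency cost the post refers to; the check itself is cheap, but every in-kernel user would have to make it before handing a buffer to the hardware.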


  5. Re: [PATCH]IB/ehca:reject dynamic memory add/remove

    thanks, applied with a slightly expanded changelog.
