[PATCH 0/4] futex: get_user_pages_fast() for shared futexes - Kernel

This is a discussion on [PATCH 0/4] futex: get_user_pages_fast() for shared futexes - Kernel ; Since get_user_pages_fast() made it in, I thought to give this another try. Lightly tested by disabling the private futexes and running some java proglets. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a ...

+ Reply to Thread
Results 1 to 14 of 14

Thread: [PATCH 0/4] futex: get_user_pages_fast() for shared futexes

  1. [PATCH 0/4] futex: get_user_pages_fast() for shared futexes

    Since get_user_pages_fast() made it in, I thought to give this another try.
    Lightly tested by disabling the private futexes and running some java proglets.


    --
    To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
    the body of a message to majordomo@vger.kernel.org
    More majordomo info at http://vger.kernel.org/majordomo-info.html
    Please read the FAQ at http://www.tux.org/lkml/

  2. [PATCH 1/4] futex: rely on get_user_pages() for shared futexes

    On the way of getting rid of the mmap_sem requirement for shared futexes,
    start by relying on get_user_pages().

    Signed-off-by: Peter Zijlstra
    ---
    include/linux/futex.h | 2
    kernel/futex.c | 162 ++++++++++++++++++++++++--------------------------
    2 files changed, 82 insertions(+), 82 deletions(-)

    Index: linux-2.6/kernel/futex.c
    ================================================== =================
    --- linux-2.6.orig/kernel/futex.c
    +++ linux-2.6/kernel/futex.c
    @@ -161,6 +161,45 @@ static inline int match_futex(union fute
    && key1->both.offset == key2->both.offset);
    }

    +/*
    + * Take a reference to the resource addressed by a key.
    + * Can be called while holding spinlocks.
    + *
    + */
    +static void get_futex_key_refs(union futex_key *key)
    +{
    + if (!key->both.ptr)
    + return;
    +
    + switch (key->both.offset & (FUT_OFF_INODE|FUT_OFF_MMSHARED)) {
    + case FUT_OFF_INODE:
    + atomic_inc(&key->shared.inode->i_count);
    + break;
    + case FUT_OFF_MMSHARED:
    + atomic_inc(&key->private.mm->mm_count);
    + break;
    + }
    +}
    +
    +/*
    + * Drop a reference to the resource addressed by a key.
    + * The hash bucket spinlock must not be held.
    + */
    +static void drop_futex_key_refs(union futex_key *key)
    +{
    + if (!key->both.ptr)
    + return;
    +
    + switch (key->both.offset & (FUT_OFF_INODE|FUT_OFF_MMSHARED)) {
    + case FUT_OFF_INODE:
    + iput(key->shared.inode);
    + break;
    + case FUT_OFF_MMSHARED:
    + mmdrop(key->private.mm);
    + break;
    + }
    +}
    +
    /**
    * get_futex_key - Get parameters which are the keys for a futex.
    * @uaddr: virtual address of the futex
    @@ -184,7 +223,6 @@ static int get_futex_key(u32 __user *uad
    {
    unsigned long address = (unsigned long)uaddr;
    struct mm_struct *mm = current->mm;
    - struct vm_area_struct *vma;
    struct page *page;
    int err;

    @@ -210,98 +248,47 @@ static int get_futex_key(u32 __user *uad
    key->private.address = address;
    return 0;
    }
    - /*
    - * The futex is hashed differently depending on whether
    - * it's in a shared or private mapping. So check vma first.
    - */
    - vma = find_extend_vma(mm, address);
    - if (unlikely(!vma))
    - return -EFAULT;

    - /*
    - * Permissions.
    - */
    - if (unlikely((vma->vm_flags & (VM_IO|VM_READ)) != VM_READ))
    - return (vma->vm_flags & VM_IO) ? -EPERM : -EACCES;
    +again:
    + err = get_user_pages(current, mm, address, 1, 0, 0, &page, NULL);
    + if (err < 0)
    + return err;
    +
    + lock_page(page);
    + if (!page->mapping) {
    + unlock_page(page);
    + put_page(page);
    + goto again;
    + }

    /*
    * Private mappings are handled in a simple way.
    *
    * NOTE: When userspace waits on a MAP_SHARED mapping, even if
    * it's a read-only handle, it's expected that futexes attach to
    - * the object not the particular process. Therefore we use
    - * VM_MAYSHARE here, not VM_SHARED which is restricted to shared
    - * mappings of _writable_ handles.
    + * the object not the particular process.
    */
    - if (likely(!(vma->vm_flags & VM_MAYSHARE))) {
    - key->both.offset |= FUT_OFF_MMSHARED; /* reference taken on mm */
    + if (PageAnon(page)) {
    + key->both.offset |= FUT_OFF_MMSHARED; /* ref taken on mm */
    key->private.mm = mm;
    key->private.address = address;
    - return 0;
    - }
    -
    - /*
    - * Linear file mappings are also simple.
    - */
    - key->shared.inode = vma->vm_file->f_path.dentry->d_inode;
    - key->both.offset |= FUT_OFF_INODE; /* inode-based key. */
    - if (likely(!(vma->vm_flags & VM_NONLINEAR))) {
    - key->shared.pgoff = (((address - vma->vm_start) >> PAGE_SHIFT)
    - + vma->vm_pgoff);
    - return 0;
    + } else {
    + key->both.offset |= FUT_OFF_INODE; /* inode-based key */
    + key->shared.inode = page->mapping->host;
    + key->shared.pgoff = page->index;
    }

    - /*
    - * We could walk the page table to read the non-linear
    - * pte, and get the page index without fetching the page
    - * from swap. But that's a lot of code to duplicate here
    - * for a rare case, so we simply fetch the page.
    - */
    - err = get_user_pages(current, mm, address, 1, 0, 0, &page, NULL);
    - if (err >= 0) {
    - key->shared.pgoff =
    - page->index << (PAGE_CACHE_SHIFT - PAGE_SHIFT);
    - put_page(page);
    - return 0;
    - }
    - return err;
    -}
    + get_futex_key_refs(key);

    -/*
    - * Take a reference to the resource addressed by a key.
    - * Can be called while holding spinlocks.
    - *
    - */
    -static void get_futex_key_refs(union futex_key *key)
    -{
    - if (key->both.ptr == NULL)
    - return;
    - switch (key->both.offset & (FUT_OFF_INODE|FUT_OFF_MMSHARED)) {
    - case FUT_OFF_INODE:
    - atomic_inc(&key->shared.inode->i_count);
    - break;
    - case FUT_OFF_MMSHARED:
    - atomic_inc(&key->private.mm->mm_count);
    - break;
    - }
    + unlock_page(page);
    + put_page(page);
    + return 0;
    }

    -/*
    - * Drop a reference to the resource addressed by a key.
    - * The hash bucket spinlock must not be held.
    - */
    -static void drop_futex_key_refs(union futex_key *key)
    +static inline
    +void put_futex_key(struct rw_semaphore *fshared, union futex_key *key)
    {
    - if (!key->both.ptr)
    - return;
    - switch (key->both.offset & (FUT_OFF_INODE|FUT_OFF_MMSHARED)) {
    - case FUT_OFF_INODE:
    - iput(key->shared.inode);
    - break;
    - case FUT_OFF_MMSHARED:
    - mmdrop(key->private.mm);
    - break;
    - }
    + drop_futex_key_refs(key);
    }

    static u32 cmpxchg_futex_value_locked(u32 __user *uaddr, u32 uval, u32 newval)
    @@ -385,6 +372,7 @@ static int refill_pi_state_cache(void)
    /* pi_mutex gets initialized later */
    pi_state->owner = NULL;
    atomic_set(&pi_state->refcount, 1);
    + pi_state->key = FUTEX_KEY_INIT;

    current->pi_state_cache = pi_state;

    @@ -462,7 +450,7 @@ void exit_pi_state_list(struct task_stru
    struct list_head *next, *head = &curr->pi_state_list;
    struct futex_pi_state *pi_state;
    struct futex_hash_bucket *hb;
    - union futex_key key;
    + union futex_key key = FUTEX_KEY_INIT;

    if (!futex_cmpxchg_enabled)
    return;
    @@ -725,7 +713,7 @@ static int futex_wake(u32 __user *uaddr,
    struct futex_hash_bucket *hb;
    struct futex_q *this, *next;
    struct plist_head *head;
    - union futex_key key;
    + union futex_key key = FUTEX_KEY_INIT;
    int ret;

    if (!bitset)
    @@ -760,6 +748,7 @@ static int futex_wake(u32 __user *uaddr,

    spin_unlock(&hb->lock);
    out:
    + put_futex_key(fshared, &key);
    futex_unlock_mm(fshared);
    return ret;
    }
    @@ -773,7 +762,7 @@ futex_wake_op(u32 __user *uaddr1, struct
    u32 __user *uaddr2,
    int nr_wake, int nr_wake2, int op)
    {
    - union futex_key key1, key2;
    + union futex_key key1 = FUTEX_KEY_INIT, key2 = FUTEX_KEY_INIT;
    struct futex_hash_bucket *hb1, *hb2;
    struct plist_head *head;
    struct futex_q *this, *next;
    @@ -873,6 +862,8 @@ retry:
    if (hb1 != hb2)
    spin_unlock(&hb2->lock);
    out:
    + put_futex_key(fshared, &key2);
    + put_futex_key(fshared, &key1);
    futex_unlock_mm(fshared);

    return ret;
    @@ -886,7 +877,7 @@ static int futex_requeue(u32 __user *uad
    u32 __user *uaddr2,
    int nr_wake, int nr_requeue, u32 *cmpval)
    {
    - union futex_key key1, key2;
    + union futex_key key1 = FUTEX_KEY_INIT, key2 = FUTEX_KEY_INIT;
    struct futex_hash_bucket *hb1, *hb2;
    struct plist_head *head1;
    struct futex_q *this, *next;
    @@ -974,6 +965,8 @@ out_unlock:
    drop_futex_key_refs(&key1);

    out:
    + put_futex_key(fshared, &key2);
    + put_futex_key(fshared, &key1);
    futex_unlock_mm(fshared);
    return ret;
    }
    @@ -1220,6 +1213,7 @@ static int futex_wait(u32 __user *uaddr,
    retry:
    futex_lock_mm(fshared);

    + q.key = FUTEX_KEY_INIT;
    ret = get_futex_key(uaddr, fshared, &q.key);
    if (unlikely(ret != 0))
    goto out_release_sem;
    @@ -1359,6 +1353,7 @@ static int futex_wait(u32 __user *uaddr,
    queue_unlock(&q, hb);

    out_release_sem:
    + put_futex_key(fshared, &q.key);
    futex_unlock_mm(fshared);
    return ret;
    }
    @@ -1410,6 +1405,7 @@ static int futex_lock_pi(u32 __user *uad
    retry:
    futex_lock_mm(fshared);

    + q.key = FUTEX_KEY_INIT;
    ret = get_futex_key(uaddr, fshared, &q.key);
    if (unlikely(ret != 0))
    goto out_release_sem;
    @@ -1624,6 +1620,7 @@ static int futex_lock_pi(u32 __user *uad
    queue_unlock(&q, hb);

    out_release_sem:
    + put_futex_key(fshared, &q.key);
    futex_unlock_mm(fshared);
    if (to)
    destroy_hrtimer_on_stack(&to->timer);
    @@ -1670,7 +1667,7 @@ static int futex_unlock_pi(u32 __user *u
    struct futex_q *this, *next;
    u32 uval;
    struct plist_head *head;
    - union futex_key key;
    + union futex_key key = FUTEX_KEY_INIT;
    int ret, attempt = 0;

    retry:
    @@ -1743,6 +1740,7 @@ retry_unlocked:
    out_unlock:
    spin_unlock(&hb->lock);
    out:
    + put_futex_key(fshared, &key);
    futex_unlock_mm(fshared);

    return ret;
    Index: linux-2.6/include/linux/futex.h
    ================================================== =================
    --- linux-2.6.orig/include/linux/futex.h
    +++ linux-2.6/include/linux/futex.h
    @@ -164,6 +164,8 @@ union futex_key {
    } both;
    };

    +#define FUTEX_KEY_INIT (union futex_key) { .both = { .ptr = NULL } }
    +
    #ifdef CONFIG_FUTEX
    extern void exit_robust_list(struct task_struct *curr);
    extern void exit_pi_state_list(struct task_struct *curr);


    --
    To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
    the body of a message to majordomo@vger.kernel.org
    More majordomo info at http://vger.kernel.org/majordomo-info.html
    Please read the FAQ at http://www.tux.org/lkml/

  3. [PATCH 3/4] futex: use fast_gup()

    Change the get_user_pages() call with fast_gup() which doesn't require holding
    the mmap_sem thereby removing the mmap_sem from all fast paths.

    Signed-off-by: Peter Zijlstra
    ---
    kernel/futex.c | 4 +---
    1 file changed, 1 insertion(+), 3 deletions(-)

    Index: linux-2.6/kernel/futex.c
    ================================================== =================
    --- linux-2.6.orig/kernel/futex.c
    +++ linux-2.6/kernel/futex.c
    @@ -232,9 +232,7 @@ static int get_futex_key(u32 __user *uad
    }

    again:
    - down_read(&mm->mmap_sem);
    - err = get_user_pages(current, mm, address, 1, 0, 0, &page, NULL);
    - up_read(&mm->mmap_sem);
    + err = get_user_pages_fast(address, 1, 0, &page);
    if (err < 0)
    return err;



    --
    To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
    the body of a message to majordomo@vger.kernel.org
    More majordomo info at http://vger.kernel.org/majordomo-info.html
    Please read the FAQ at http://www.tux.org/lkml/

  4. [PATCH 4/4] futex: cleanup fshared

    fshared doesn't need to be a rw_sem pointer anymore, so clean that up.

    Signed-off-by: Peter Zijlstra
    ---
    kernel/futex.c | 48 +++++++++++++++++++-----------------------------
    1 file changed, 19 insertions(+), 29 deletions(-)

    Index: linux-2.6/kernel/futex.c
    ================================================== =================
    --- linux-2.6.orig/kernel/futex.c
    +++ linux-2.6/kernel/futex.c
    @@ -200,8 +200,7 @@ static void drop_futex_key_refs(union fu
    * For other futexes, it points to &current->mm->mmap_sem and
    * caller must have taken the reader lock. but NOT any spinlocks.
    */
    -static int get_futex_key(u32 __user *uaddr, struct rw_semaphore *fshared,
    - union futex_key *key)
    +static int get_futex_key(u32 __user *uaddr, int fshared, union futex_key *key)
    {
    unsigned long address = (unsigned long)uaddr;
    struct mm_struct *mm = current->mm;
    @@ -268,7 +267,7 @@ again:
    }

    static inline
    -void put_futex_key(struct rw_semaphore *fshared, union futex_key *key)
    +void put_futex_key(int fshared, union futex_key *key)
    {
    drop_futex_key_refs(key);
    }
    @@ -297,10 +296,8 @@ static int get_futex_value_locked(u32 *d

    /*
    * Fault handling.
    - * if fshared is non NULL, current->mm->mmap_sem is already held
    */
    -static int futex_handle_fault(unsigned long address,
    - struct rw_semaphore *fshared, int attempt)
    +static int futex_handle_fault(unsigned long address, int attempt)
    {
    struct vm_area_struct * vma;
    struct mm_struct *mm = current->mm;
    @@ -687,8 +684,7 @@ double_lock_hb(struct futex_hash_bucket
    * Wake up all waiters hashed on the physical page that is mapped
    * to this virtual address:
    */
    -static int futex_wake(u32 __user *uaddr, struct rw_semaphore *fshared,
    - int nr_wake, u32 bitset)
    +static int futex_wake(u32 __user *uaddr, int fshared, int nr_wake, u32 bitset)
    {
    struct futex_hash_bucket *hb;
    struct futex_q *this, *next;
    @@ -735,8 +731,7 @@ out:
    * to this virtual address:
    */
    static int
    -futex_wake_op(u32 __user *uaddr1, struct rw_semaphore *fshared,
    - u32 __user *uaddr2,
    +futex_wake_op(u32 __user *uaddr1, int fshared, u32 __user *uaddr2,
    int nr_wake, int nr_wake2, int op)
    {
    union futex_key key1 = FUTEX_KEY_INIT, key2 = FUTEX_KEY_INIT;
    @@ -790,7 +785,7 @@ retry:
    */
    if (attempt++) {
    ret = futex_handle_fault((unsigned long)uaddr2,
    - fshared, attempt);
    + attempt);
    if (ret)
    goto out;
    goto retry;
    @@ -841,8 +836,7 @@ out:
    * Requeue all waiters hashed on one physical page to another
    * physical page.
    */
    -static int futex_requeue(u32 __user *uaddr1, struct rw_semaphore *fshared,
    - u32 __user *uaddr2,
    +static int futex_requeue(u32 __user *uaddr1, int fshared, u32 __user *uaddr2,
    int nr_wake, int nr_requeue, u32 *cmpval)
    {
    union futex_key key1 = FUTEX_KEY_INIT, key2 = FUTEX_KEY_INIT;
    @@ -1048,8 +1042,7 @@ static void unqueue_me_pi(struct futex_q
    * private futexes.
    */
    static int fixup_pi_state_owner(u32 __user *uaddr, struct futex_q *q,
    - struct task_struct *newowner,
    - struct rw_semaphore *fshared)
    + struct task_struct *newowner, int fshared)
    {
    u32 newtid = task_pid_vnr(newowner) | FUTEX_WAITERS;
    struct futex_pi_state *pi_state = q->pi_state;
    @@ -1128,7 +1121,7 @@ retry:
    handle_fault:
    spin_unlock(q->lock_ptr);

    - ret = futex_handle_fault((unsigned long)uaddr, fshared, attempt++);
    + ret = futex_handle_fault((unsigned long)uaddr, attempt++);

    spin_lock(q->lock_ptr);

    @@ -1152,7 +1145,7 @@ handle_fault:

    static long futex_wait_restart(struct restart_block *restart);

    -static int futex_wait(u32 __user *uaddr, struct rw_semaphore *fshared,
    +static int futex_wait(u32 __user *uaddr, int fshared,
    u32 val, ktime_t *abs_time, u32 bitset)
    {
    struct task_struct *curr = current;
    @@ -1306,13 +1299,13 @@ static int futex_wait(u32 __user *uaddr,
    static long futex_wait_restart(struct restart_block *restart)
    {
    u32 __user *uaddr = (u32 __user *)restart->futex.uaddr;
    - struct rw_semaphore *fshared = NULL;
    + int fshared = 0;
    ktime_t t;

    t.tv64 = restart->futex.time;
    restart->fn = do_no_restart_syscall;
    if (restart->futex.flags & FLAGS_SHARED)
    - fshared = &current->mm->mmap_sem;
    + fshared = 1;
    return (long)futex_wait(uaddr, fshared, restart->futex.val, &t,
    restart->futex.bitset);
    }
    @@ -1324,7 +1317,7 @@ static long futex_wait_restart(struct re
    * if there are waiters then it will block, it does PI, etc. (Due to
    * races the kernel might see a 0 value of the futex too.)
    */
    -static int futex_lock_pi(u32 __user *uaddr, struct rw_semaphore *fshared,
    +static int futex_lock_pi(u32 __user *uaddr, int fshared,
    int detect, ktime_t *time, int trylock)
    {
    struct hrtimer_sleeper timeout, *to = NULL;
    @@ -1570,8 +1563,7 @@ static int futex_lock_pi(u32 __user *uad
    queue_unlock(&q, hb);

    if (attempt++) {
    - ret = futex_handle_fault((unsigned long)uaddr, fshared,
    - attempt);
    + ret = futex_handle_fault((unsigned long)uaddr, attempt);
    if (ret)
    goto out_release_sem;
    goto retry_unlocked;
    @@ -1591,7 +1583,7 @@ static int futex_lock_pi(u32 __user *uad
    * This is the in-kernel slowpath: we look up the PI state (if any),
    * and do the rt-mutex unlock.
    */
    -static int futex_unlock_pi(u32 __user *uaddr, struct rw_semaphore *fshared)
    +static int futex_unlock_pi(u32 __user *uaddr, int fshared)
    {
    struct futex_hash_bucket *hb;
    struct futex_q *this, *next;
    @@ -1682,8 +1674,7 @@ pi_faulted:
    spin_unlock(&hb->lock);

    if (attempt++) {
    - ret = futex_handle_fault((unsigned long)uaddr, fshared,
    - attempt);
    + ret = futex_handle_fault((unsigned long)uaddr, attempt);
    if (ret)
    goto out;
    uval = 0;
    @@ -1815,8 +1806,7 @@ retry:
    * PI futexes happens in exit_pi_state():
    */
    if (!pi && (uval & FUTEX_WAITERS))
    - futex_wake(uaddr, &curr->mm->mmap_sem, 1,
    - FUTEX_BITSET_MATCH_ANY);
    + futex_wake(uaddr, 1, 1, FUTEX_BITSET_MATCH_ANY);
    }
    return 0;
    }
    @@ -1912,10 +1902,10 @@ long do_futex(u32 __user *uaddr, int op,
    {
    int ret = -ENOSYS;
    int cmd = op & FUTEX_CMD_MASK;
    - struct rw_semaphore *fshared = NULL;
    + int fshared = 0;

    if (!(op & FUTEX_PRIVATE_FLAG))
    - fshared = &current->mm->mmap_sem;
    + fshared = 1;

    switch (cmd) {
    case FUTEX_WAIT:


    --
    To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
    the body of a message to majordomo@vger.kernel.org
    More majordomo info at http://vger.kernel.org/majordomo-info.html
    Please read the FAQ at http://www.tux.org/lkml/

  5. [PATCH 2/4] futex: reduce mmap_sem usage

    now that we rely on get_user_pages() for the shared key handling
    move all the mmap_sem stuff closely around the slow paths.

    Signed-off-by: Peter Zijlstra
    ---
    kernel/futex.c | 83 ++-------------------------------------------------------
    1 file changed, 4 insertions(+), 79 deletions(-)

    Index: linux-2.6/kernel/futex.c
    ================================================== =================
    --- linux-2.6.orig/kernel/futex.c
    +++ linux-2.6/kernel/futex.c
    @@ -123,24 +123,6 @@ struct futex_hash_bucket {
    static struct futex_hash_bucket futex_queues[1<
    /*
    - * Take mm->mmap_sem, when futex is shared
    - */
    -static inline void futex_lock_mm(struct rw_semaphore *fshared)
    -{
    - if (fshared)
    - down_read(fshared);
    -}
    -
    -/*
    - * Release mm->mmap_sem, when the futex is shared
    - */
    -static inline void futex_unlock_mm(struct rw_semaphore *fshared)
    -{
    - if (fshared)
    - up_read(fshared);
    -}
    -
    -/*
    * We hash on the keys returned from get_futex_key (see below).
    */
    static struct futex_hash_bucket *hash_futex(union futex_key *key)
    @@ -250,7 +232,9 @@ static int get_futex_key(u32 __user *uad
    }

    again:
    + down_read(&mm->mmap_sem);
    err = get_user_pages(current, mm, address, 1, 0, 0, &page, NULL);
    + up_read(&mm->mmap_sem);
    if (err < 0)
    return err;

    @@ -327,8 +311,7 @@ static int futex_handle_fault(unsigned l
    if (attempt > 2)
    return ret;

    - if (!fshared)
    - down_read(&mm->mmap_sem);
    + down_read(&mm->mmap_sem);
    vma = find_vma(mm, address);
    if (vma && address >= vma->vm_start &&
    (vma->vm_flags & VM_WRITE)) {
    @@ -348,8 +331,7 @@ static int futex_handle_fault(unsigned l
    current->min_flt++;
    }
    }
    - if (!fshared)
    - up_read(&mm->mmap_sem);
    + up_read(&mm->mmap_sem);
    return ret;
    }

    @@ -719,8 +701,6 @@ static int futex_wake(u32 __user *uaddr,
    if (!bitset)
    return -EINVAL;

    - futex_lock_mm(fshared);
    -
    ret = get_futex_key(uaddr, fshared, &key);
    if (unlikely(ret != 0))
    goto out;
    @@ -749,7 +729,6 @@ static int futex_wake(u32 __user *uaddr,
    spin_unlock(&hb->lock);
    out:
    put_futex_key(fshared, &key);
    - futex_unlock_mm(fshared);
    return ret;
    }

    @@ -769,8 +748,6 @@ futex_wake_op(u32 __user *uaddr1, struct
    int ret, op_ret, attempt = 0;

    retryfull:
    - futex_lock_mm(fshared);
    -
    ret = get_futex_key(uaddr1, fshared, &key1);
    if (unlikely(ret != 0))
    goto out;
    @@ -821,12 +798,6 @@ retry:
    goto retry;
    }

    - /*
    - * If we would have faulted, release mmap_sem,
    - * fault it in and start all over again.
    - */
    - futex_unlock_mm(fshared);
    -
    ret = get_user(dummy, uaddr2);
    if (ret)
    return ret;
    @@ -864,7 +835,6 @@ retry:
    out:
    put_futex_key(fshared, &key2);
    put_futex_key(fshared, &key1);
    - futex_unlock_mm(fshared);

    return ret;
    }
    @@ -884,8 +854,6 @@ static int futex_requeue(u32 __user *uad
    int ret, drop_count = 0;

    retry:
    - futex_lock_mm(fshared);
    -
    ret = get_futex_key(uaddr1, fshared, &key1);
    if (unlikely(ret != 0))
    goto out;
    @@ -908,12 +876,6 @@ static int futex_requeue(u32 __user *uad
    if (hb1 != hb2)
    spin_unlock(&hb2->lock);

    - /*
    - * If we would have faulted, release mmap_sem, fault
    - * it in and start all over again.
    - */
    - futex_unlock_mm(fshared);
    -
    ret = get_user(curval, uaddr1);

    if (!ret)
    @@ -967,7 +929,6 @@ out_unlock:
    out:
    put_futex_key(fshared, &key2);
    put_futex_key(fshared, &key1);
    - futex_unlock_mm(fshared);
    return ret;
    }

    @@ -1211,8 +1172,6 @@ static int futex_wait(u32 __user *uaddr,
    q.pi_state = NULL;
    q.bitset = bitset;
    retry:
    - futex_lock_mm(fshared);
    -
    q.key = FUTEX_KEY_INIT;
    ret = get_futex_key(uaddr, fshared, &q.key);
    if (unlikely(ret != 0))
    @@ -1245,12 +1204,6 @@ static int futex_wait(u32 __user *uaddr,
    if (unlikely(ret)) {
    queue_unlock(&q, hb);

    - /*
    - * If we would have faulted, release mmap_sem, fault it in and
    - * start all over again.
    - */
    - futex_unlock_mm(fshared);
    -
    ret = get_user(uval, uaddr);

    if (!ret)
    @@ -1265,12 +1218,6 @@ static int futex_wait(u32 __user *uaddr,
    queue_me(&q, hb);

    /*
    - * Now the futex is queued and we have checked the data, we
    - * don't want to hold mmap_sem while we sleep.
    - */
    - futex_unlock_mm(fshared);
    -
    - /*
    * There might have been scheduling since the queue_me(), as we
    * cannot hold a spinlock across the get_user() in case it
    * faults, and we cannot just set TASK_INTERRUPTIBLE state when
    @@ -1354,7 +1301,6 @@ static int futex_wait(u32 __user *uaddr,

    out_release_sem:
    put_futex_key(fshared, &q.key);
    - futex_unlock_mm(fshared);
    return ret;
    }

    @@ -1403,8 +1349,6 @@ static int futex_lock_pi(u32 __user *uad

    q.pi_state = NULL;
    retry:
    - futex_lock_mm(fshared);
    -
    q.key = FUTEX_KEY_INIT;
    ret = get_futex_key(uaddr, fshared, &q.key);
    if (unlikely(ret != 0))
    @@ -1494,7 +1438,6 @@ static int futex_lock_pi(u32 __user *uad
    * exit to complete.
    */
    queue_unlock(&q, hb);
    - futex_unlock_mm(fshared);
    cond_resched();
    goto retry;

    @@ -1526,12 +1469,6 @@ static int futex_lock_pi(u32 __user *uad
    */
    queue_me(&q, hb);

    - /*
    - * Now the futex is queued and we have checked the data, we
    - * don't want to hold mmap_sem while we sleep.
    - */
    - futex_unlock_mm(fshared);
    -
    WARN_ON(!q.pi_state);
    /*
    * Block on the PI mutex:
    @@ -1544,7 +1481,6 @@ static int futex_lock_pi(u32 __user *uad
    ret = ret ? 0 : -EWOULDBLOCK;
    }

    - futex_lock_mm(fshared);
    spin_lock(q.lock_ptr);

    if (!ret) {
    @@ -1610,7 +1546,6 @@ static int futex_lock_pi(u32 __user *uad

    /* Unqueue and drop the lock */
    unqueue_me_pi(&q);
    - futex_unlock_mm(fshared);

    if (to)
    destroy_hrtimer_on_stack(&to->timer);
    @@ -1621,7 +1556,6 @@ static int futex_lock_pi(u32 __user *uad

    out_release_sem:
    put_futex_key(fshared, &q.key);
    - futex_unlock_mm(fshared);
    if (to)
    destroy_hrtimer_on_stack(&to->timer);
    return ret;
    @@ -1645,8 +1579,6 @@ static int futex_lock_pi(u32 __user *uad
    goto retry_unlocked;
    }

    - futex_unlock_mm(fshared);
    -
    ret = get_user(uval, uaddr);
    if (!ret && (uval != -EFAULT))
    goto retry;
    @@ -1678,10 +1610,6 @@ retry:
    */
    if ((uval & FUTEX_TID_MASK) != task_pid_vnr(current))
    return -EPERM;
    - /*
    - * First take all the futex related locks:
    - */
    - futex_lock_mm(fshared);

    ret = get_futex_key(uaddr, fshared, &key);
    if (unlikely(ret != 0))
    @@ -1741,7 +1669,6 @@ out_unlock:
    spin_unlock(&hb->lock);
    out:
    put_futex_key(fshared, &key);
    - futex_unlock_mm(fshared);

    return ret;

    @@ -1765,8 +1692,6 @@ pi_faulted:
    goto retry_unlocked;
    }

    - futex_unlock_mm(fshared);
    -
    ret = get_user(uval, uaddr);
    if (!ret && (uval != -EFAULT))
    goto retry;


    --
    To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
    the body of a message to majordomo@vger.kernel.org
    More majordomo info at http://vger.kernel.org/majordomo-info.html
    Please read the FAQ at http://www.tux.org/lkml/

  6. Re: [PATCH 0/4] futex: get_user_pages_fast() for shared futexes


    * Peter Zijlstra wrote:

    > Since get_user_pages_fast() made it in, I thought to give this another
    > try. Lightly tested by disabling the private futexes and running some
    > java proglets.


    hm, very interesting. Since this is an important futex usecase i started
    testing it in tip/core/futexes:

    cd33272: futex: cleanup fshared
    a135356: futex: use fast_gup()
    39ce77b: futex: reduce mmap_sem usage
    0d7a336: futex: rely on get_user_pages() for shared futexes

    Nick, it would be nice to get an Acked-by/Reviewed-by from you, before
    we think about whether it should go upstream.

    Ingo
    --
    To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
    the body of a message to majordomo@vger.kernel.org
    More majordomo info at http://vger.kernel.org/majordomo-info.html
    Please read the FAQ at http://www.tux.org/lkml/

  7. Re: [PATCH 0/4] futex: get_user_pages_fast() for shared futexes

    On Sunday 28 September 2008 02:17, Ingo Molnar wrote:
    > * Peter Zijlstra wrote:
    > > Since get_user_pages_fast() made it in, I thought to give this another
    > > try. Lightly tested by disabling the private futexes and running some
    > > java proglets.

    >
    > hm, very interesting. Since this is an important futex usecase i started
    > testing it in tip/core/futexes:
    >
    > cd33272: futex: cleanup fshared
    > a135356: futex: use fast_gup()
    > 39ce77b: futex: reduce mmap_sem usage
    > 0d7a336: futex: rely on get_user_pages() for shared futexes
    >
    > Nick, it would be nice to get an Acked-by/Reviewed-by from you, before
    > we think about whether it should go upstream.


    Yeah, these all look pretty good. It's nice to get rid of mmap sem here.

    Which reminds me, we need to put a might_lock mmap_sem into
    get_user_pages_fast...

    But these patches look good to me (last time we discussed them I thought
    there was a race with page truncate, but it looks like you've closed that
    by holding page lock over the whole operation...)

    Nice work, Peter.

    BTW. what kinds of things use inter-process futexes as of now?
    --
    To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
    the body of a message to majordomo@vger.kernel.org
    More majordomo info at http://vger.kernel.org/majordomo-info.html
    Please read the FAQ at http://www.tux.org/lkml/

  8. Re: [PATCH 0/4] futex: get_user_pages_fast() for shared futexes

    On Tue, 2008-09-30 at 17:21 +1000, Nick Piggin wrote:
    > On Sunday 28 September 2008 02:17, Ingo Molnar wrote:
    > > * Peter Zijlstra wrote:
    > > > Since get_user_pages_fast() made it in, I thought to give this another
    > > > try. Lightly tested by disabling the private futexes and running some
    > > > java proglets.

    > >
    > > hm, very interesting. Since this is an important futex usecase i started
    > > testing it in tip/core/futexes:
    > >
    > > cd33272: futex: cleanup fshared
    > > a135356: futex: use fast_gup()
    > > 39ce77b: futex: reduce mmap_sem usage
    > > 0d7a336: futex: rely on get_user_pages() for shared futexes
    > >
    > > Nick, it would be nice to get an Acked-by/Reviewed-by from you, before
    > > we think about whether it should go upstream.

    >
    > Yeah, these all look pretty good. It's nice to get rid of mmap sem here.
    >
    > Which reminds me, we need to put a might_lock mmap_sem into
    > get_user_pages_fast...


    Yeah..

    > But these patches look good to me (last time we discussed them I thought
    > there was a race with page truncate, but it looks like you've closed that
    > by holding page lock over the whole operation...)


    Just to be sure, I only hold the page lock over the get_futex_key() op,
    and drop it after getting a ref on the futex key.

    I then drop the futex key ref after the futex op is complete.

    This assumes the futex key ref is suffucient to guarantee whatever is
    needed - which is the point I'm still not quite sure about myself.

    The futex key ref was used between futex ops, with I assume the intent
    to ensure the futex backing stays valid. However, the key ref only takes
    a ref on either the inode or the mm, neither which avoid the specific
    address of the futex to get unmapped between ops.

    So in that respect we're not worse off than before, and any application
    doing: futex_wait(), munmap(), futex_wake() is going to suffer. And as
    far as I understand it get the waiting task stuck in D state for
    ever-more or somesuch.

    By now not holding the mmap_sem over the full futex op, but only over
    the get_futex_key(), that munmap() race gets larger and the actual futex
    could disappear while we're working on it, but in all cases I looked at
    that will make the futex op return -EFAULT, so we should be good there.

    Gah, now that I look at it, it looks like I made get_futex_key()
    asymetric wrt private futexes, they don't take a ref on the key, but
    then do drop one... ouch.. Patch below.

    > Nice work, Peter.


    Thanks!

    > BTW. what kinds of things use inter-process futexes as of now?


    On a regular modern Linux system, not much. But I've been told there are
    applications out there that do indeed make heavy use of them - as
    they're part of POSIX etc.. blah blah :-)

    Also some legacy stuff that's stuck on an ancient glibc (but somehow did
    manage to upgrade the kernel) might benefit.


    ---
    Subject: futex: fixup get_futex_key() for private futexes
    From: Peter Zijlstra

    With the get_user_pages_fast() patches we made get_futex_key() obtain a
    reference on the returned key, but failed to do so for private futexes.

    Signed-off-by: Peter Zijlstra
    ---
    diff --git a/kernel/futex.c b/kernel/futex.c
    index 197fdab..beee9af 100644
    --- a/kernel/futex.c
    +++ b/kernel/futex.c
    @@ -227,6 +227,7 @@ static int get_futex_key(u32 __user *uaddr, int
    fshared, union futex_key *key)
    return -EFAULT;
    key->private.mm = mm;
    key->private.address = address;
    + get_futex_key_refs(key);
    return 0;
    }



    --
    To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
    the body of a message to majordomo@vger.kernel.org
    More majordomo info at http://vger.kernel.org/majordomo-info.html
    Please read the FAQ at http://www.tux.org/lkml/

  9. Re: [PATCH 0/4] futex: get_user_pages_fast() for shared futexes


    * Nick Piggin wrote:

    > On Sunday 28 September 2008 02:17, Ingo Molnar wrote:
    > > * Peter Zijlstra wrote:
    > > > Since get_user_pages_fast() made it in, I thought to give this another
    > > > try. Lightly tested by disabling the private futexes and running some
    > > > java proglets.

    > >
    > > hm, very interesting. Since this is an important futex usecase i started
    > > testing it in tip/core/futexes:
    > >
    > > cd33272: futex: cleanup fshared
    > > a135356: futex: use fast_gup()
    > > 39ce77b: futex: reduce mmap_sem usage
    > > 0d7a336: futex: rely on get_user_pages() for shared futexes
    > >
    > > Nick, it would be nice to get an Acked-by/Reviewed-by from you, before
    > > we think about whether it should go upstream.

    >
    > Yeah, these all look pretty good. It's nice to get rid of mmap sem here.
    >
    > Which reminds me, we need to put a might_lock mmap_sem into
    > get_user_pages_fast...
    >
    > But these patches look good to me (last time we discussed them I thought
    > there was a race with page truncate, but it looks like you've closed that
    > by holding page lock over the whole operation...)
    >
    > Nice work, Peter.


    great - i've added your Acked-by to the patches and have activated the
    tip/core/futexes branch for tip/auto-core-next linux-next integration.

    here are the commits:

    42569c3: futex: fixup get_futex_key() for private futexes
    c2f9f20: futex: cleanup fshared
    734b05b: futex: use fast_gup()
    6127070: futex: reduce mmap_sem usage
    38d47c1: futex: rely on get_user_pages() for shared futexes

    Thanks,

    Ingo
    --
    To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
    the body of a message to majordomo@vger.kernel.org
    More majordomo info at http://vger.kernel.org/majordomo-info.html
    Please read the FAQ at http://www.tux.org/lkml/

  10. Re: [PATCH 0/4] futex: get_user_pages_fast() for shared futexes


    * Peter Zijlstra wrote:

    > On Tue, 2008-09-30 at 17:21 +1000, Nick Piggin wrote:
    > > On Sunday 28 September 2008 02:17, Ingo Molnar wrote:
    > > > * Peter Zijlstra wrote:
    > > > > Since get_user_pages_fast() made it in, I thought to give this another
    > > > > try. Lightly tested by disabling the private futexes and running some
    > > > > java proglets.
    > > >
    > > > hm, very interesting. Since this is an important futex usecase i started
    > > > testing it in tip/core/futexes:
    > > >
    > > > cd33272: futex: cleanup fshared
    > > > a135356: futex: use fast_gup()
    > > > 39ce77b: futex: reduce mmap_sem usage
    > > > 0d7a336: futex: rely on get_user_pages() for shared futexes
    > > >
    > > > Nick, it would be nice to get an Acked-by/Reviewed-by from you, before
    > > > we think about whether it should go upstream.

    > >
    > > Yeah, these all look pretty good. It's nice to get rid of mmap sem here.
    > >
    > > Which reminds me, we need to put a might_lock mmap_sem into
    > > get_user_pages_fast...

    >
    > Yeah..
    >
    > > But these patches look good to me (last time we discussed them I thought
    > > there was a race with page truncate, but it looks like you've closed that
    > > by holding page lock over the whole operation...)

    >
    > Just to be sure, I only hold the page lock over the get_futex_key() op,
    > and drop it after getting a ref on the futex key.
    >
    > I then drop the futex key ref after the futex op is complete.
    >
    > This assumes the futex key ref is suffucient to guarantee whatever is
    > needed - which is the point I'm still not quite sure about myself.
    >
    > The futex key ref was used between futex ops, with I assume the intent
    > to ensure the futex backing stays valid. However, the key ref only takes
    > a ref on either the inode or the mm, neither which avoid the specific
    > address of the futex to get unmapped between ops.
    >
    > So in that respect we're not worse off than before, and any application
    > doing: futex_wait(), munmap(), futex_wake() is going to suffer. And as
    > far as I understand it get the waiting task stuck in D state for
    > ever-more or somesuch.
    >
    > By now not holding the mmap_sem over the full futex op, but only over
    > the get_futex_key(), that munmap() race gets larger and the actual futex
    > could disappear while we're working on it, but in all cases I looked at
    > that will make the futex op return -EFAULT, so we should be good there.
    >
    > Gah, now that I look at it, it looks like I made get_futex_key()
    > asymetric wrt private futexes, they don't take a ref on the key, but
    > then do drop one... ouch.. Patch below.
    >
    > > Nice work, Peter.

    >
    > Thanks!
    >
    > > BTW. what kinds of things use inter-process futexes as of now?

    >
    > On a regular modern Linux system, not much. But I've been told there are
    > applications out there that do indeed make heavy use of them - as
    > they're part of POSIX etc.. blah blah :-)
    >
    > Also some legacy stuff that's stuck on an ancient glibc (but somehow did
    > manage to upgrade the kernel) might benefit.
    >
    >
    > ---
    > Subject: futex: fixup get_futex_key() for private futexes
    > From: Peter Zijlstra
    >
    > With the get_user_pages_fast() patches we made get_futex_key() obtain a
    > reference on the returned key, but failed to do so for private futexes.
    >
    > Signed-off-by: Peter Zijlstra


    applied to tip/core/futexes, thanks Peter!

    Ingo
    --
    To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
    the body of a message to majordomo@vger.kernel.org
    More majordomo info at http://vger.kernel.org/majordomo-info.html
    Please read the FAQ at http://www.tux.org/lkml/

  11. Re: [PATCH 0/4] futex: get_user_pages_fast() for shared futexes

    On Tuesday 30 September 2008 18:51, Peter Zijlstra wrote:
    > On Tue, 2008-09-30 at 17:21 +1000, Nick Piggin wrote:


    > > But these patches look good to me (last time we discussed them I thought
    > > there was a race with page truncate, but it looks like you've closed that
    > > by holding page lock over the whole operation...)

    >
    > Just to be sure, I only hold the page lock over the get_futex_key() op,
    > and drop it after getting a ref on the futex key.
    >
    > I then drop the futex key ref after the futex op is complete.


    I think that's fine.


    > This assumes the futex key ref is suffucient to guarantee whatever is
    > needed - which is the point I'm still not quite sure about myself.


    It is enough to guarantee enough to function normally I guess. Actually
    futex syscalls are interesting in that they logically perform 2 totally
    different operations and actually want 2 keys but it manages to mash
    them into one (the user address).

    They need a futex identifier, on which to wait/wake/etc. For anonymous
    futexes, this is basically an arbitrary number which is taken from the
    user address (but note that if that address is subsequently mremap()ed
    for example, then the futex identifier does not follow the address). For
    shared futexes, the pagecache address is used for the futex, which is
    derived from the address.

    Then they also need a memory address which is the target of the
    particular operation requested. This doesn't actually logically have to
    be directly related to the futex identifier...

    I guess for all practical purposes, this is fine. "if you mremap a live
    mutex, you lose" and similar probably doesn't constrain too many people
    (although I don't think any of that is actually documented). But anyway,
    just to point out that we do already have constraints on how existing
    futexes are going to work.


    > The futex key ref was used between futex ops, with I assume the intent
    > to ensure the futex backing stays valid. However, the key ref only takes
    > a ref on either the inode or the mm, neither which avoid the specific
    > address of the futex to get unmapped between ops.
    >
    > So in that respect we're not worse off than before, and any application
    > doing: futex_wait(), munmap(), futex_wake() is going to suffer. And as
    > far as I understand it get the waiting task stuck in D state for
    > ever-more or somesuch.


    Yeah, I think you're fine there (the sleep is interruptible, so it should
    not be a DoS or anything). Just so long as you always pin things like the
    inode correctly while taking a ref on it...

    --
    To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
    the body of a message to majordomo@vger.kernel.org
    More majordomo info at http://vger.kernel.org/majordomo-info.html
    Please read the FAQ at http://www.tux.org/lkml/

  12. Re: [PATCH 0/4] futex: get_user_pages_fast() for shared futexes

    Peter Zijlstra a écrit :
    > Just to be sure, I only hold the page lock over the get_futex_key() op,
    > and drop it after getting a ref on the futex key.
    >
    > I then drop the futex key ref after the futex op is complete.
    >
    > This assumes the futex key ref is suffucient to guarantee whatever is
    > needed - which is the point I'm still not quite sure about myself.
    >
    > The futex key ref was used between futex ops, with I assume the intent
    > to ensure the futex backing stays valid. However, the key ref only takes
    > a ref on either the inode or the mm, neither which avoid the specific
    > address of the futex to get unmapped between ops.
    >
    > So in that respect we're not worse off than before, and any application
    > doing: futex_wait(), munmap(), futex_wake() is going to suffer. And as
    > far as I understand it get the waiting task stuck in D state for
    > ever-more or somesuch.
    >
    > By now not holding the mmap_sem over the full futex op, but only over
    > the get_futex_key(), that munmap() race gets larger and the actual futex
    > could disappear while we're working on it, but in all cases I looked at
    > that will make the futex op return -EFAULT, so we should be good there.
    >
    > Gah, now that I look at it, it looks like I made get_futex_key()
    > asymetric wrt private futexes, they don't take a ref on the key, but
    > then do drop one... ouch.. Patch below.
    >
    >> Nice work, Peter.

    >
    > Thanks!
    >
    >> BTW. what kinds of things use inter-process futexes as of now?

    >
    > On a regular modern Linux system, not much. But I've been told there are
    > applications out there that do indeed make heavy use of them - as
    > they're part of POSIX etc.. blah blah :-)


    inter-process futexes are still used for pthread creation/join
    (aka clear_child_tid / CLONE_CHILD_CLEARTID)

    kernel/fork.c, functions mm_release() & sys_set_tid_address()

    I am not sure how it could be converted to private futexes, since
    old binaries (static glibc) will use FUTEX_WAKE like calls.

    >
    > Also some legacy stuff that's stuck on an ancient glibc (but somehow did
    > manage to upgrade the kernel) might benefit.
    >
    >
    > ---
    > Subject: futex: fixup get_futex_key() for private futexes
    > From: Peter Zijlstra
    >
    > With the get_user_pages_fast() patches we made get_futex_key() obtain a
    > reference on the returned key, but failed to do so for private futexes.
    >


    Sorry I am lost...
    private futexes dont need to get references at all...

    > Signed-off-by: Peter Zijlstra
    > ---
    > diff --git a/kernel/futex.c b/kernel/futex.c
    > index 197fdab..beee9af 100644
    > --- a/kernel/futex.c
    > +++ b/kernel/futex.c
    > @@ -227,6 +227,7 @@ static int get_futex_key(u32 __user *uaddr, int
    > fshared, union futex_key *key)
    > return -EFAULT;
    > key->private.mm = mm;
    > key->private.address = address;
    > + get_futex_key_refs(key);
    > return 0;
    > }
    >
    >
    >
    >





    --
    To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
    the body of a message to majordomo@vger.kernel.org
    More majordomo info at http://vger.kernel.org/majordomo-info.html
    Please read the FAQ at http://www.tux.org/lkml/

  13. Re: [PATCH 0/4] futex: get_user_pages_fast() for shared futexes

    On Tue, 2008-09-30 at 12:55 +0200, Eric Dumazet wrote:
    > Peter Zijlstra a écrit :


    > > On a regular modern Linux system, not much. But I've been told there are
    > > applications out there that do indeed make heavy use of them - as
    > > they're part of POSIX etc.. blah blah :-)

    >
    > inter-process futexes are still used for pthread creation/join
    > (aka clear_child_tid / CLONE_CHILD_CLEARTID)
    >
    > kernel/fork.c, functions mm_release() & sys_set_tid_address()
    >
    > I am not sure how it could be converted to private futexes, since
    > old binaries (static glibc) will use FUTEX_WAKE like calls.


    Ah, thanks, didn't know that.

    > > ---
    > > Subject: futex: fixup get_futex_key() for private futexes
    > > From: Peter Zijlstra
    > >
    > > With the get_user_pages_fast() patches we made get_futex_key() obtain a
    > > reference on the returned key, but failed to do so for private futexes.
    > >

    >
    > Sorry I am lost...
    > private futexes dont need to get references at all...


    Ah, right - its a NOP, that's why it didn't show up in testing.

    The thing is, I changed the semantics of get_futex_key() to return a key
    with reference taken. And then noticed I didn't take one in the private
    futex path, and failed to notice the ref ops are nops for private
    futexes.

    So yeah, the below patch is basically a NOP, but we might consider
    retaining it to maintain the symmetry... dunno

    > > Signed-off-by: Peter Zijlstra
    > > ---
    > > diff --git a/kernel/futex.c b/kernel/futex.c
    > > index 197fdab..beee9af 100644
    > > --- a/kernel/futex.c
    > > +++ b/kernel/futex.c
    > > @@ -227,6 +227,7 @@ static int get_futex_key(u32 __user *uaddr, int
    > > fshared, union futex_key *key)
    > > return -EFAULT;
    > > key->private.mm = mm;
    > > key->private.address = address;
    > > + get_futex_key_refs(key);
    > > return 0;
    > > }



    --
    To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
    the body of a message to majordomo@vger.kernel.org
    More majordomo info at http://vger.kernel.org/majordomo-info.html
    Please read the FAQ at http://www.tux.org/lkml/

  14. Re: [PATCH 0/4] futex: get_user_pages_fast() for shared futexes

    On Tue, Sep 30, 2008 at 3:55 AM, Eric Dumazet wrote:
    > I am not sure how it could be converted to private futexes, since
    > old binaries (static glibc) will use FUTEX_WAKE like calls.


    We considered this back when but any effort seems too much. We'd
    either need a clone flag (a scarce resource) or replace the set_tid
    _address syscall. Given that the futex is woken once per thread
    lifetime it shouldn't be an issue. If the semaphore shows up even
    after this patch feel free to introduce a new set_tid_address syscall.
    --
    To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
    the body of a message to majordomo@vger.kernel.org
    More majordomo info at http://vger.kernel.org/majordomo-info.html
    Please read the FAQ at http://www.tux.org/lkml/

+ Reply to Thread