[PATCH 00/23] tracehook - Kernel

This is a discussion on [PATCH 00/23] tracehook - Kernel ; This patch series introduces the "tracehook" interface layer of inlines in . There are more details in the log entry for patch 01/23 and in the header file comments inside that patch. Most of these changes move code around with ...

+ Reply to Thread
Page 1 of 2 1 2 LastLast
Results 1 to 20 of 38

Thread: [PATCH 00/23] tracehook

  1. [PATCH 00/23] tracehook

    This patch series introduces the "tracehook" interface layer of inlines
    in . There are more details in the log entry for
    patch 01/23 and in the header file comments inside that patch.
    Most of these changes move code around with little or no change,
    and they should not break anything or change any behavior.

    This sets a new standard for uniform arch support to enable clean
    arch-independent implementations of new debugging and tracing stuff,
    denoted by CONFIG_HAVE_ARCH_TRACEHOOK. Patch 20/23 adds that symbol to
    arch/Kconfig, with comments listing everything an arch has to do before
    setting "select HAVE_ARCH_TRACEHOOK". These are elaborted a bit at:
    http://sourceware.org/systemtap/wiki/utrace/arch/HowTo
    The new inlines that arch code must define or call have detailed
    kerneldoc comments in the generic header files that say what is required.

    No arch is obligated to do any work, and no arch's build should be
    broken by these changes. There are several steps that each arch should
    take so it can set HAVE_ARCH_TRACEHOOK. Most of these are simple.
    Providing this support will let new things people add for doing
    debugging and tracing of user-level threads "just work" for your arch
    in the future. For an arch that does not provide HAVE_ARCH_TRACEHOOK,
    some new options for such features will not be available for config.

    I have done some arch work and will submit this to the arch maintainers
    after the generic tracehook series settles in. For now, that work is
    available in my GIT repositories, and in patch and mbox-of-patches form
    at http://people.redhat.com/roland/utrace/2.6-current/

    This paves the way for my "utrace" work, to be submitted later. But it
    is not innately tied to that. I hope that the tracehook series can go
    in soon regardless of what eventually does or doesn't go on top of it.
    For anyone implementing any kind of new tracing/debugging plan, or just
    understanding all the context of the existing ptrace implementation,
    having tracehook.h makes things much easier to find and understand.

    This series must be applied after the "ptrace & wait cleanup" series,
    but only because of patch conflicts. They can be reviewed separately.

    This series of patches is also available in GIT at the URL below, and in a
    mail folder of patch messages (use git-am or formail -s patch -p1) at:
    http://people.redhat.com/roland/utra...tracehook.mbox

    The following changes since commit 666f164f4fbfa78bd00fb4b74788b42a39842c64:
    Roland McGrath (1):
    fix dangling zombie when new parent ignores children

    are available in the git repository at:

    git://git.kernel.org/pub/scm/linux/kernel/git/frob/linux-2.6-utrace.git tracehook

    Roland McGrath (23):
    tracehook: add linux/tracehook.h
    tracehook: exec
    tracehook: unexport ptrace_notify
    tracehook: exit
    tracehook: clone
    tracehook: vfork-done
    tracehook: release_task
    tracehook: tracehook_tracer_task
    tracehook: tracehook_expect_breakpoints
    tracehook: tracehook_signal_handler
    tracehook: tracehook_consider_ignored_signal
    tracehook: tracehook_consider_fatal_signal
    tracehook: syscall
    tracehook: get_signal_to_deliver
    tracehook: job control
    tracehook: death
    tracehook: force signal_pending()
    tracehook: TIF_NOTIFY_RESUME
    tracehook: asm/syscall.h
    tracehook: CONFIG_HAVE_ARCH_TRACEHOOK
    tracehook: wait_task_inactive
    task_current_syscall
    /proc/PID/syscall

    arch/Kconfig | 18 ++
    arch/ia64/kernel/perfmon.c | 4 +-
    arch/x86/ia32/ia32_aout.c | 6 -
    fs/binfmt_aout.c | 6 -
    fs/binfmt_elf.c | 6 -
    fs/binfmt_elf_fdpic.c | 7 -
    fs/binfmt_flat.c | 3 -
    fs/binfmt_som.c | 2 -
    fs/exec.c | 12 +-
    fs/proc/array.c | 9 +-
    fs/proc/base.c | 39 +++-
    include/asm-generic/syscall.h | 141 ++++++++++
    include/linux/ptrace.h | 72 +++++
    include/linux/sched.h | 10 +-
    include/linux/tracehook.h | 575 +++++++++++++++++++++++++++++++++++++++++
    kernel/exit.c | 53 ++---
    kernel/fork.c | 74 ++----
    kernel/kthread.c | 2 +-
    kernel/ptrace.c | 2 +-
    kernel/sched.c | 29 ++-
    kernel/signal.c | 99 +++++---
    lib/Makefile | 2 +
    lib/syscall.c | 75 ++++++
    mm/nommu.c | 4 +-
    security/selinux/hooks.c | 22 +--
    25 files changed, 1084 insertions(+), 188 deletions(-)
    create mode 100644 include/asm-generic/syscall.h
    create mode 100644 include/linux/tracehook.h
    create mode 100644 lib/syscall.c


    Thanks,
    Roland
    --
    To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
    the body of a message to majordomo@vger.kernel.org
    More majordomo info at http://vger.kernel.org/majordomo-info.html
    Please read the FAQ at http://www.tux.org/lkml/

  2. [PATCH 08/23] tracehook: tracehook_tracer_task

    This adds the tracehook_tracer_task() hook to consolidate all forms of
    "Who is using ptrace on me?" logic. This is used for "TracerPid:" in
    /proc and for permission checks. We also clean up the selinux code
    the called an identical accessor.

    Signed-off-by: Roland McGrath
    ---
    fs/proc/array.c | 9 +++++++--
    fs/proc/base.c | 13 +++++++++----
    include/linux/tracehook.h | 18 ++++++++++++++++++
    security/selinux/hooks.c | 22 +++-------------------
    4 files changed, 37 insertions(+), 25 deletions(-)

    diff --git a/fs/proc/array.c b/fs/proc/array.c
    index 797d775..0d6eb33 100644
    --- a/fs/proc/array.c
    +++ b/fs/proc/array.c
    @@ -80,6 +80,7 @@
    #include
    #include
    #include
    +#include

    #include
    #include
    @@ -168,8 +169,12 @@ static inline void task_state(struct seq_file *m, struct pid_namespace *ns,
    rcu_read_lock();
    ppid = pid_alive(p) ?
    task_tgid_nr_ns(rcu_dereference(p->real_parent), ns) : 0;
    - tpid = pid_alive(p) && p->ptrace ?
    - task_pid_nr_ns(rcu_dereference(p->parent), ns) : 0;
    + tpid = 0;
    + if (pid_alive(p)) {
    + struct task_struct *tracer = tracehook_tracer_task(p);
    + if (tracer)
    + tpid = task_pid_nr_ns(tracer, ns);
    + }
    seq_printf(m,
    "State:\t%s\n"
    "Tgid:\t%d\n"
    diff --git a/fs/proc/base.c b/fs/proc/base.c
    index 58c3e6a..6f21daa 100644
    --- a/fs/proc/base.c
    +++ b/fs/proc/base.c
    @@ -69,6 +69,7 @@
    #include
    #include
    #include
    +#include
    #include
    #include
    #include
    @@ -231,10 +232,14 @@ static int check_mem_permission(struct task_struct *task)
    * If current is actively ptrace'ing, and would also be
    * permitted to freshly attach with ptrace now, permit it.
    */
    - if (task->parent == current && (task->ptrace & PT_PTRACED) &&
    - task_is_stopped_or_traced(task) &&
    - ptrace_may_access(task, PTRACE_MODE_ATTACH))
    - return 0;
    + if (task_is_stopped_or_traced(task)) {
    + int match;
    + rcu_read_lock();
    + match = (tracehook_tracer_task(task) == current);
    + rcu_read_unlock();
    + if (match && ptrace_may_access(task, PTRACE_MODE_ATTACH))
    + return 0;
    + }

    /*
    * Noone else is allowed.
    diff --git a/include/linux/tracehook.h b/include/linux/tracehook.h
    index 9a5b3be..6468ca0 100644
    --- a/include/linux/tracehook.h
    +++ b/include/linux/tracehook.h
    @@ -73,6 +73,24 @@ static inline int tracehook_unsafe_exec(struct task_struct *task)
    }

    /**
    + * tracehook_tracer_task - return the task that is tracing the given task
    + * @tsk: task to consider
    + *
    + * Returns NULL if noone is tracing @task, or the &struct task_struct
    + * pointer to its tracer.
    + *
    + * Must called under rcu_read_lock(). The pointer returned might be kept
    + * live only by RCU. During exec, this may be called with task_lock()
    + * held on @task, still held from when tracehook_unsafe_exec() was called.
    + */
    +static inline struct task_struct *tracehook_tracer_task(struct task_struct *tsk)
    +{
    + if (task_ptrace(tsk) & PT_PTRACED)
    + return rcu_dereference(tsk->parent);
    + return NULL;
    +}
    +
    +/**
    * tracehook_report_exec - a successful exec was completed
    * @fmt: &struct linux_binfmt that performed the exec
    * @bprm: &struct linux_binprm containing exec details
    diff --git a/security/selinux/hooks.c b/security/selinux/hooks.c
    index 63f131f..3481cde 100644
    --- a/security/selinux/hooks.c
    +++ b/security/selinux/hooks.c
    @@ -25,7 +25,7 @@

    #include
    #include
    -#include
    +#include
    #include
    #include
    #include
    @@ -1971,22 +1971,6 @@ static int selinux_vm_enough_memory(struct mm_struct *mm, long pages)
    return __vm_enough_memory(mm, pages, cap_sys_admin);
    }

    -/**
    - * task_tracer_task - return the task that is tracing the given task
    - * @task: task to consider
    - *
    - * Returns NULL if noone is tracing @task, or the &struct task_struct
    - * pointer to its tracer.
    - *
    - * Must be called under rcu_read_lock().
    - */
    -static struct task_struct *task_tracer_task(struct task_struct *task)
    -{
    - if (task->ptrace & PT_PTRACED)
    - return rcu_dereference(task->parent);
    - return NULL;
    -}
    -
    /* binprm security operations */

    static int selinux_bprm_alloc_security(struct linux_binprm *bprm)
    @@ -2238,7 +2222,7 @@ static void selinux_bprm_apply_creds(struct linux_binprm *bprm, int unsafe)
    u32 ptsid = 0;

    rcu_read_lock();
    - tracer = task_tracer_task(current);
    + tracer = tracehook_tracer_task(current);
    if (likely(tracer != NULL)) {
    sec = tracer->security;
    ptsid = sec->sid;
    @@ -5247,7 +5231,7 @@ static int selinux_setprocattr(struct task_struct *p,
    Otherwise, leave SID unchanged and fail. */
    task_lock(p);
    rcu_read_lock();
    - tracer = task_tracer_task(p);
    + tracer = tracehook_tracer_task(p);
    if (tracer != NULL) {
    struct task_security_struct *ptsec = tracer->security;
    u32 ptsid = ptsec->sid;
    --
    To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
    the body of a message to majordomo@vger.kernel.org
    More majordomo info at http://vger.kernel.org/majordomo-info.html
    Please read the FAQ at http://www.tux.org/lkml/

  3. [PATCH 21/23] tracehook: wait_task_inactive

    This extends wait_task_inactive() with a new argument so it
    can be used in a "soft" mode where it will check for the task
    changing state unexpectedly and back off. There is no change
    to existing callers. This lays the groundwork to allow robust,
    noninvasive tracing that can try to sample a blocked thread but
    back off safely if it wakes up.

    Signed-off-by: Roland McGrath
    ---
    arch/ia64/kernel/perfmon.c | 4 ++--
    include/linux/sched.h | 8 ++++++--
    kernel/kthread.c | 2 +-
    kernel/ptrace.c | 2 +-
    kernel/sched.c | 29 +++++++++++++++++++++++++++--
    5 files changed, 37 insertions(+), 8 deletions(-)

    diff --git a/arch/ia64/kernel/perfmon.c b/arch/ia64/kernel/perfmon.c
    index 19d4493..fc8f350 100644
    --- a/arch/ia64/kernel/perfmon.c
    +++ b/arch/ia64/kernel/perfmon.c
    @@ -2626,7 +2626,7 @@ pfm_task_incompatible(pfm_context_t *ctx, struct task_struct *task)
    /*
    * make sure the task is off any CPU
    */
    - wait_task_inactive(task);
    + wait_task_inactive(task, 0);

    /* more to come... */

    @@ -4774,7 +4774,7 @@ recheck:

    UNPROTECT_CTX(ctx, flags);

    - wait_task_inactive(task);
    + wait_task_inactive(task, 0);

    PROTECT_CTX(ctx, flags);

    diff --git a/include/linux/sched.h b/include/linux/sched.h
    index c58e8ab..48832a2 100644
    --- a/include/linux/sched.h
    +++ b/include/linux/sched.h
    @@ -1872,9 +1872,13 @@ extern void set_task_comm(struct task_struct *tsk, char *from);
    extern char *get_task_comm(char *to, struct task_struct *tsk);

    #ifdef CONFIG_SMP
    -extern void wait_task_inactive(struct task_struct * p);
    +extern unsigned long wait_task_inactive(struct task_struct *, long match_state);
    #else
    -#define wait_task_inactive(p) do { } while (0)
    +static inline unsigned long wait_task_inactive(struct task_struct *p,
    + long match_state)
    +{
    + return 1;
    +}
    #endif

    #define next_task(p) list_entry(rcu_dereference((p)->tasks.next), struct task_struct, tasks)
    diff --git a/kernel/kthread.c b/kernel/kthread.c
    index ac3fb73..88538db 100644
    --- a/kernel/kthread.c
    +++ b/kernel/kthread.c
    @@ -176,7 +176,7 @@ void kthread_bind(struct task_struct *k, unsigned int cpu)
    return;
    }
    /* Must have done schedule() in kthread() before we set_task_cpu */
    - wait_task_inactive(k);
    + wait_task_inactive(k, 0);
    set_task_cpu(k, cpu);
    k->cpus_allowed = cpumask_of_cpu(cpu);
    k->rt.nr_cpus_allowed = 1;
    diff --git a/kernel/ptrace.c b/kernel/ptrace.c
    index 8392a9d..082b3fc 100644
    --- a/kernel/ptrace.c
    +++ b/kernel/ptrace.c
    @@ -107,7 +107,7 @@ int ptrace_check_attach(struct task_struct *child, int kill)
    read_unlock(&tasklist_lock);

    if (!ret && !kill)
    - wait_task_inactive(child);
    + ret = wait_task_inactive(child, TASK_TRACED) ? 0 : -ESRCH;

    /* All systems go.. */
    return ret;
    diff --git a/kernel/sched.c b/kernel/sched.c
    index 99e6d85..33aec6d 100644
    --- a/kernel/sched.c
    +++ b/kernel/sched.c
    @@ -1946,16 +1946,24 @@ migrate_task(struct task_struct *p, int dest_cpu, struct migration_req *req)
    /*
    * wait_task_inactive - wait for a thread to unschedule.
    *
    + * If @match_state is nonzero, it's the @p->state value just checked and
    + * not expected to change. If it changes, i.e. @p might have woken up,
    + * then return zero. When we succeed in waiting for @p to be off its CPU,
    + * we return a positive number (its total switch count). If a second call
    + * a short while later returns the same number, the caller can be sure that
    + * @p has remained unscheduled the whole time.
    + *
    * The caller must ensure that the task *will* unschedule sometime soon,
    * else this function might spin for a *long* time. This function can't
    * be called with interrupts off, or it may introduce deadlock with
    * smp_call_function() if an IPI is sent by the same process we are
    * waiting to become inactive.
    */
    -void wait_task_inactive(struct task_struct *p)
    +unsigned long wait_task_inactive(struct task_struct *p, long match_state)
    {
    unsigned long flags;
    int running, on_rq;
    + unsigned long ncsw;
    struct rq *rq;

    for (; {
    @@ -1978,8 +1986,11 @@ void wait_task_inactive(struct task_struct *p)
    * return false if the runqueue has changed and p
    * is actually now running somewhere else!
    */
    - while (task_running(rq, p))
    + while (task_running(rq, p)) {
    + if (match_state && unlikely(p->state != match_state))
    + return 0;
    cpu_relax();
    + }

    /*
    * Ok, time to look more closely! We need the rq
    @@ -1989,9 +2000,21 @@ void wait_task_inactive(struct task_struct *p)
    rq = task_rq_lock(p, &flags);
    running = task_running(rq, p);
    on_rq = p->se.on_rq;
    + ncsw = 0;
    + if (!match_state || p->state == match_state) {
    + ncsw = p->nivcsw + p->nvcsw;
    + if (unlikely(!ncsw))
    + ncsw = 1;
    + }
    task_rq_unlock(rq, &flags);

    /*
    + * If it changed from the expected state, bail out now.
    + */
    + if (unlikely(!ncsw))
    + break;
    +
    + /*
    * Was it really running after all now that we
    * checked with the proper locks actually held?
    *
    @@ -2023,6 +2046,8 @@ void wait_task_inactive(struct task_struct *p)
    */
    break;
    }
    +
    + return ncsw;
    }

    /***
    --
    To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
    the body of a message to majordomo@vger.kernel.org
    More majordomo info at http://vger.kernel.org/majordomo-info.html
    Please read the FAQ at http://www.tux.org/lkml/

  4. [PATCH 17/23] tracehook: force signal_pending()

    This defines a new hook tracehook_force_sigpending() that lets
    tracing code decide to force TIF_SIGPENDING on in recalc_sigpending().

    This is not used yet, so it compiles away to nothing for now.
    It lays the groundwork for new tracing code that can interrupt
    a task synthetically without actually sending a signal.

    Signed-off-by: Roland McGrath
    ---
    include/linux/tracehook.h | 14 ++++++++++++++
    kernel/signal.c | 4 +++-
    2 files changed, 17 insertions(+), 1 deletions(-)

    diff --git a/include/linux/tracehook.h b/include/linux/tracehook.h
    index 4c50e1b..43bc51b 100644
    --- a/include/linux/tracehook.h
    +++ b/include/linux/tracehook.h
    @@ -423,6 +423,20 @@ static inline int tracehook_consider_fatal_signal(struct task_struct *task,
    }

    /**
    + * tracehook_force_sigpending - let tracing force signal_pending(current) on
    + *
    + * Called when recomputing our signal_pending() flag. Return nonzero
    + * to force the signal_pending() flag on, so that tracehook_get_signal()
    + * will be called before the next return to user mode.
    + *
    + * Called with @current->sighand->siglock held.
    + */
    +static inline int tracehook_force_sigpending(void)
    +{
    + return 0;
    +}
    +
    +/**
    * tracehook_get_signal - deliver synthetic signal to traced task
    * @task: @current
    * @regs: task_pt_regs(@current)
    diff --git a/kernel/signal.c b/kernel/signal.c
    index cbec18e..44b5823 100644
    --- a/kernel/signal.c
    +++ b/kernel/signal.c
    @@ -134,7 +134,9 @@ void recalc_sigpending_and_wake(struct task_struct *t)

    void recalc_sigpending(void)
    {
    - if (!recalc_sigpending_tsk(current) && !freezing(current))
    + if (unlikely(tracehook_force_sigpending()))
    + set_thread_flag(TIF_SIGPENDING);
    + else if (!recalc_sigpending_tsk(current) && !freezing(current))
    clear_thread_flag(TIF_SIGPENDING);

    }
    --
    To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
    the body of a message to majordomo@vger.kernel.org
    More majordomo info at http://vger.kernel.org/majordomo-info.html
    Please read the FAQ at http://www.tux.org/lkml/

  5. [PATCH 11/23] tracehook: tracehook_consider_ignored_signal

    This defines tracehook_consider_ignored_signal() has a fine-grained
    hook for deciding to prevent the normal short-circuit of sending an
    ignored signal, as ptrace does. There is no change, only cleanup.

    Signed-off-by: Roland McGrath
    ---
    include/linux/tracehook.h | 19 +++++++++++++++++++
    kernel/signal.c | 27 ++++++++++++++++-----------
    2 files changed, 35 insertions(+), 11 deletions(-)

    diff --git a/include/linux/tracehook.h b/include/linux/tracehook.h
    index 2d1426f..8cffd34 100644
    --- a/include/linux/tracehook.h
    +++ b/include/linux/tracehook.h
    @@ -312,4 +312,23 @@ static inline void tracehook_signal_handler(int sig, siginfo_t *info,
    ptrace_notify(SIGTRAP);
    }

    +/**
    + * tracehook_consider_ignored_signal - suppress short-circuit of ignored signal
    + * @task: task receiving the signal
    + * @sig: signal number being sent
    + * @handler: %SIG_IGN or %SIG_DFL
    + *
    + * Return zero iff tracing doesn't care to examine this ignored signal,
    + * so it can short-circuit normal delivery and never even get queued.
    + * Either @handler is %SIG_DFL and @sig's default is ignore, or it's %SIG_IGN.
    + *
    + * Called with @task->sighand->siglock held.
    + */
    +static inline int tracehook_consider_ignored_signal(struct task_struct *task,
    + int sig,
    + void __user *handler)
    +{
    + return (task_ptrace(task) & PT_PTRACED) != 0;
    +}
    +
    #endif /* */
    diff --git a/kernel/signal.c b/kernel/signal.c
    index 02d3a11..f68fb01 100644
    --- a/kernel/signal.c
    +++ b/kernel/signal.c
    @@ -22,6 +22,7 @@
    #include
    #include
    #include
    +#include
    #include
    #include
    #include
    @@ -39,24 +40,21 @@

    static struct kmem_cache *sigqueue_cachep;

    -static int __sig_ignored(struct task_struct *t, int sig)
    +static void __user *sig_handler(struct task_struct *t, int sig)
    {
    - void __user *handler;
    + return t->sighand->action[sig - 1].sa.sa_handler;
    +}

    +static int sig_handler_ignored(void __user *handler, int sig)
    +{
    /* Is it explicitly or implicitly ignored? */
    -
    - handler = t->sighand->action[sig - 1].sa.sa_handler;
    return handler == SIG_IGN ||
    (handler == SIG_DFL && sig_kernel_ignore(sig));
    }

    static int sig_ignored(struct task_struct *t, int sig)
    {
    - /*
    - * Tracers always want to know about signals..
    - */
    - if (t->ptrace & PT_PTRACED)
    - return 0;
    + void __user *handler;

    /*
    * Blocked signals are never ignored, since the
    @@ -66,7 +64,14 @@ static int sig_ignored(struct task_struct *t, int sig)
    if (sigismember(&t->blocked, sig) || sigismember(&t->real_blocked, sig))
    return 0;

    - return __sig_ignored(t, sig);
    + handler = sig_handler(t, sig);
    + if (!sig_handler_ignored(handler, sig))
    + return 0;
    +
    + /*
    + * Tracers may want to know about even ignored signals.
    + */
    + return !tracehook_consider_ignored_signal(t, sig, handler);
    }

    /*
    @@ -2324,7 +2329,7 @@ int do_sigaction(int sig, struct k_sigaction *act, struct k_sigaction *oact)
    * (for example, SIGCHLD), shall cause the pending signal to
    * be discarded, whether or not it is blocked"
    */
    - if (__sig_ignored(t, sig)) {
    + if (sig_handler_ignored(sig_handler(t, sig), sig)) {
    sigemptyset(&mask);
    sigaddset(&mask, sig);
    rm_from_queue_full(&mask, &t->signal->shared_pending);
    --
    To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
    the body of a message to majordomo@vger.kernel.org
    More majordomo info at http://vger.kernel.org/majordomo-info.html
    Please read the FAQ at http://www.tux.org/lkml/

  6. [PATCH 15/23] tracehook: job control

    This defines the tracehook_notify_jctl() hook to formalize
    the ptrace effects on the job control notifications.
    There is no change, only cleanup.

    Signed-off-by: Roland McGrath
    ---
    include/linux/tracehook.h | 20 ++++++++++++++++++++
    kernel/signal.c | 10 +++++-----
    2 files changed, 25 insertions(+), 5 deletions(-)

    diff --git a/include/linux/tracehook.h b/include/linux/tracehook.h
    index 42a0d7b..6dc428d 100644
    --- a/include/linux/tracehook.h
    +++ b/include/linux/tracehook.h
    @@ -451,4 +451,24 @@ static inline int tracehook_get_signal(struct task_struct *task,
    return 0;
    }

    +/**
    + * tracehook_notify_jctl - report about job control stop/continue
    + * @notify: nonzero if this is the last thread in the group to stop
    + * @why: %CLD_STOPPED or %CLD_CONTINUED
    + *
    + * This is called when we might call do_notify_parent_cldstop().
    + * It's called when about to stop for job control; we are already in
    + * %TASK_STOPPED state, about to call schedule(). It's also called when
    + * a delayed %CLD_STOPPED or %CLD_CONTINUED report is ready to be made.
    + *
    + * Return nonzero to generate a %SIGCHLD with @why, which is
    + * normal if @notify is nonzero.
    + *
    + * Called with no locks held.
    + */
    +static inline int tracehook_notify_jctl(int notify, int why)
    +{
    + return notify || (current->ptrace & PT_PTRACED);
    +}
    +
    #endif /* */
    diff --git a/kernel/signal.c b/kernel/signal.c
    index 1766c47..9b8d94a 100644
    --- a/kernel/signal.c
    +++ b/kernel/signal.c
    @@ -605,9 +605,6 @@ static int check_kill_permission(int sig, struct siginfo *info,
    return security_task_kill(t, info, sig, 0);
    }

    -/* forward decl */
    -static void do_notify_parent_cldstop(struct task_struct *tsk, int why);
    -
    /*
    * Handle magic process-wide effects of stop/continue signals. Unlike
    * the signal actions, these happen immediately at signal-generation
    @@ -1629,7 +1626,7 @@ finish_stop(int stop_count)
    * a group stop in progress and we are the last to stop,
    * report to the parent. When ptraced, every thread reports itself.
    */
    - if (stop_count == 0 || (current->ptrace & PT_PTRACED)) {
    + if (tracehook_notify_jctl(stop_count == 0, CLD_STOPPED)) {
    read_lock(&tasklist_lock);
    do_notify_parent_cldstop(current, CLD_STOPPED);
    read_unlock(&tasklist_lock);
    @@ -1766,6 +1763,9 @@ relock:
    signal->flags &= ~SIGNAL_CLD_MASK;
    spin_unlock_irq(&sighand->siglock);

    + if (unlikely(!tracehook_notify_jctl(1, why)))
    + goto relock;
    +
    read_lock(&tasklist_lock);
    do_notify_parent_cldstop(current->group_leader, why);
    read_unlock(&tasklist_lock);
    @@ -1931,7 +1931,7 @@ void exit_signals(struct task_struct *tsk)
    out:
    spin_unlock_irq(&tsk->sighand->siglock);

    - if (unlikely(group_stop)) {
    + if (unlikely(group_stop) && tracehook_notify_jctl(1, CLD_STOPPED)) {
    read_lock(&tasklist_lock);
    do_notify_parent_cldstop(tsk, CLD_STOPPED);
    read_unlock(&tasklist_lock);
    --
    To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
    the body of a message to majordomo@vger.kernel.org
    More majordomo info at http://vger.kernel.org/majordomo-info.html
    Please read the FAQ at http://www.tux.org/lkml/

  7. [PATCH 14/23] tracehook: get_signal_to_deliver

    This defines the tracehook_get_signal() hook to allow tracing code to
    slip in before normal signal dequeuing. This lays the groundwork for
    new tracing features that can inject synthetic signals outside the
    normal queue or control the disposition of delivered signals. The
    calling convention lets tracehook_get_signal() decide both exactly
    what will happen and what signal number to report in the handler/exit.

    Signed-off-by: Roland McGrath
    ---
    include/linux/tracehook.h | 29 +++++++++++++++++++++++++++++
    kernel/signal.c | 38 +++++++++++++++++++++++++++-----------
    2 files changed, 56 insertions(+), 11 deletions(-)

    diff --git a/include/linux/tracehook.h b/include/linux/tracehook.h
    index 3548694..42a0d7b 100644
    --- a/include/linux/tracehook.h
    +++ b/include/linux/tracehook.h
    @@ -422,4 +422,33 @@ static inline int tracehook_consider_fatal_signal(struct task_struct *task,
    return (task_ptrace(task) & PT_PTRACED) != 0;
    }

    +/**
    + * tracehook_get_signal - deliver synthetic signal to traced task
    + * @task: @current
    + * @regs: task_pt_regs(@current)
    + * @info: details of synthetic signal
    + * @return_ka: sigaction for synthetic signal
    + *
    + * Return zero to check for a real pending signal normally.
    + * Return -1 after releasing the siglock to repeat the check.
    + * Return a signal number to induce an artifical signal delivery,
    + * setting *@info and *@return_ka to specify its details and behavior.
    + *
    + * The @return_ka->sa_handler value controls the disposition of the
    + * signal, no matter the signal number. For %SIG_DFL, the return value
    + * is a representative signal to indicate the behavior (e.g. %SIGTERM
    + * for death, %SIGQUIT for core dump, %SIGSTOP for job control stop,
    + * %SIGTSTP for stop unless in an orphaned pgrp), but the signal number
    + * reported will be @info->si_signo instead.
    + *
    + * Called with @task->sighand->siglock held, before dequeuing pending signals.
    + */
    +static inline int tracehook_get_signal(struct task_struct *task,
    + struct pt_regs *regs,
    + siginfo_t *info,
    + struct k_sigaction *return_ka)
    +{
    + return 0;
    +}
    +
    #endif /* */
    diff --git a/kernel/signal.c b/kernel/signal.c
    index 860afb8..1766c47 100644
    --- a/kernel/signal.c
    +++ b/kernel/signal.c
    @@ -1779,17 +1779,33 @@ relock:
    do_signal_stop(0))
    goto relock;

    - signr = dequeue_signal(current, &current->blocked, info);
    - if (!signr)
    - break; /* will return 0 */
    + /*
    + * Tracing can induce an artifical signal and choose sigaction.
    + * The return value in @signr determines the default action,
    + * but @info->si_signo is the signal number we will report.
    + */
    + signr = tracehook_get_signal(current, regs, info, return_ka);
    + if (unlikely(signr < 0))
    + goto relock;
    + if (unlikely(signr != 0))
    + ka = return_ka;
    + else {
    + signr = dequeue_signal(current, &current->blocked,
    + info);

    - if (signr != SIGKILL) {
    - signr = ptrace_signal(signr, info, regs, cookie);
    if (!signr)
    - continue;
    + break; /* will return 0 */
    +
    + if (signr != SIGKILL) {
    + signr = ptrace_signal(signr, info,
    + regs, cookie);
    + if (!signr)
    + continue;
    + }
    +
    + ka = &sighand->action[signr-1];
    }

    - ka = &sighand->action[signr-1];
    if (ka->sa.sa_handler == SIG_IGN) /* Do nothing. */
    continue;
    if (ka->sa.sa_handler != SIG_DFL) {
    @@ -1837,7 +1853,7 @@ relock:
    spin_lock_irq(&sighand->siglock);
    }

    - if (likely(do_signal_stop(signr))) {
    + if (likely(do_signal_stop(info->si_signo))) {
    /* It released the siglock. */
    goto relock;
    }
    @@ -1858,7 +1874,7 @@ relock:

    if (sig_kernel_coredump(signr)) {
    if (print_fatal_signals)
    - print_fatal_signal(regs, signr);
    + print_fatal_signal(regs, info->si_signo);
    /*
    * If it was able to dump core, this kills all
    * other threads in the group and synchronizes with
    @@ -1867,13 +1883,13 @@ relock:
    * first and our do_group_exit call below will use
    * that value and ignore the one we pass it.
    */
    - do_coredump((long)signr, signr, regs);
    + do_coredump(info->si_signo, info->si_signo, regs);
    }

    /*
    * Death signals, no core dump.
    */
    - do_group_exit(signr);
    + do_group_exit(info->si_signo);
    /* NOTREACHED */
    }
    spin_unlock_irq(&sighand->siglock);
    --
    To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
    the body of a message to majordomo@vger.kernel.org
    More majordomo info at http://vger.kernel.org/majordomo-info.html
    Please read the FAQ at http://www.tux.org/lkml/

  8. [PATCH 06/23] tracehook: vfork-done

    This moves the PTRACE_EVENT_VFORK_DONE tracing into a tracehook.h inline,
    tracehook_report_vfork_done(). The change has no effect, just clean-up.

    Signed-off-by: Roland McGrath
    ---
    include/linux/tracehook.h | 18 ++++++++++++++++++
    kernel/fork.c | 5 +----
    2 files changed, 19 insertions(+), 4 deletions(-)

    diff --git a/include/linux/tracehook.h b/include/linux/tracehook.h
    index 3ebc58b..830e6e1 100644
    --- a/include/linux/tracehook.h
    +++ b/include/linux/tracehook.h
    @@ -210,4 +210,22 @@ static inline void tracehook_report_clone_complete(int trace,
    ptrace_event(0, trace, pid);
    }

    +/**
    + * tracehook_report_vfork_done - vfork parent's child has exited or exec'd
    + * @child: child task, already running
    + * @pid: new child's PID in the parent's namespace
    + *
    + * Called after a %CLONE_VFORK parent has waited for the child to complete.
    + * The clone/vfork system call will return immediately after this.
    + * The @child pointer may be invalid if a self-reaping child died and
    + * tracehook_report_clone() took no action to prevent it from self-reaping.
    + *
    + * Called with no locks held.
    + */
    +static inline void tracehook_report_vfork_done(struct task_struct *child,
    + pid_t pid)
    +{
    + ptrace_event(PT_TRACE_VFORK_DONE, PTRACE_EVENT_VFORK_DONE, pid);
    +}
    +
    #endif /* */
    diff --git a/kernel/fork.c b/kernel/fork.c
    index 56b6ab0..5345b7d 100644
    --- a/kernel/fork.c
    +++ b/kernel/fork.c
    @@ -1376,10 +1376,7 @@ long do_fork(unsigned long clone_flags,
    freezer_do_not_count();
    wait_for_completion(&vfork);
    freezer_count();
    - if (unlikely (current->ptrace & PT_TRACE_VFORK_DONE)) {
    - current->ptrace_message = nr;
    - ptrace_notify ((PTRACE_EVENT_VFORK_DONE << 8) | SIGTRAP);
    - }
    + tracehook_report_vfork_done(p, nr);
    }
    } else {
    nr = PTR_ERR(p);
    --
    To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
    the body of a message to majordomo@vger.kernel.org
    More majordomo info at http://vger.kernel.org/majordomo-info.html
    Please read the FAQ at http://www.tux.org/lkml/

  9. [PATCH 18/23] tracehook: TIF_NOTIFY_RESUME

    This adds tracehook.h inlines to enable a new arch feature in support
    of user debugging/tracing. This is not used yet, but it lays the
    groundwork for a debugger to be able to wrangle a task that's possibly
    running, without interrupting its syscalls in progress.

    Each arch should define TIF_NOTIFY_RESUME, and in their entry.S code
    treat it much like TIF_SIGPENDING. That is, it causes you to take the
    slow path when returning to user mode, where you get the full user-mode
    state accessible as for signal handling or ptrace. The arch code
    should check TIF_NOTIFY_RESUME after handling TIF_SIGPENDING.
    When it's set, clear it and then call tracehook_notify_resume().

    In future, tracing code will call set_notify_resume() when it
    wants to get a callback in tracehook_notify_resume().

    Signed-off-by: Roland McGrath
    ---
    include/linux/tracehook.h | 34 ++++++++++++++++++++++++++++++++++
    1 files changed, 34 insertions(+), 0 deletions(-)

    diff --git a/include/linux/tracehook.h b/include/linux/tracehook.h
    index 43bc51b..32867ab 100644
    --- a/include/linux/tracehook.h
    +++ b/include/linux/tracehook.h
    @@ -537,4 +537,38 @@ static inline void tracehook_report_death(struct task_struct *task,
    {
    }

    +#ifdef TIF_NOTIFY_RESUME
    +/**
    + * set_notify_resume - cause tracehook_notify_resume() to be called
    + * @task: task that will call tracehook_notify_resume()
    + *
    + * Calling this arranges that @task will call tracehook_notify_resume()
    + * before returning to user mode. If it's already running in user mode,
    + * it will enter the kernel and call tracehook_notify_resume() soon.
    + * If it's blocked, it will not be woken.
    + */
    +static inline void set_notify_resume(struct task_struct *task)
    +{
    + if (!test_and_set_tsk_thread_flag(task, TIF_NOTIFY_RESUME))
    + kick_process(task);
    +}
    +
    +/**
    + * tracehook_notify_resume - report when about to return to user mode
    + * @regs: user-mode registers of @current task
    + *
    + * This is called when %TIF_NOTIFY_RESUME has been set. Now we are
    + * about to return to user mode, and the user state in @regs can be
    + * inspected or adjusted. The caller in arch code has cleared
    + * %TIF_NOTIFY_RESUME before the call. If the flag gets set again
    + * asynchronously, this will be called again before we return to
    + * user mode.
    + *
    + * Called without locks.
    + */
    +static inline void tracehook_notify_resume(struct pt_regs *regs)
    +{
    +}
    +#endif /* TIF_NOTIFY_RESUME */
    +
    #endif /* */
    --
    To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
    the body of a message to majordomo@vger.kernel.org
    More majordomo info at http://vger.kernel.org/majordomo-info.html
    Please read the FAQ at http://www.tux.org/lkml/

  10. [PATCH 05/23] tracehook: clone

    This moves all the ptrace initialization and tracing logic for task
    creation into tracehook.h and ptrace.h inlines. It reorganizes the code
    slightly, but should not change any behavior.

    There are four tracehook entry points, at each important stage of task
    creation. This keeps the interface from the core fork.c code fairly clean,
    while supporting the complex setup required for ptrace or something like it.

    Signed-off-by: Roland McGrath
    ---
    include/linux/ptrace.h | 22 ++++++++++
    include/linux/tracehook.h | 100 +++++++++++++++++++++++++++++++++++++++++++++
    kernel/fork.c | 69 +++++++++++++------------------
    3 files changed, 150 insertions(+), 41 deletions(-)

    diff --git a/include/linux/ptrace.h b/include/linux/ptrace.h
    index c74abfc..dae6d85 100644
    --- a/include/linux/ptrace.h
    +++ b/include/linux/ptrace.h
    @@ -154,6 +154,28 @@ static inline int ptrace_event(int mask, int event, unsigned long message)
    return 1;
    }

    +/**
    + * ptrace_init_task - initialize ptrace state for a new child
    + * @child: new child task
    + * @ptrace: true if child should be ptrace'd by parent's tracer
    + *
    + * This is called immediately after adding @child to its parent's children
    + * list. @ptrace is false in the normal case, and true to ptrace @child.
    + *
    + * Called with current's siglock and write_lock_irq(&tasklist_lock) held.
    + */
    +static inline void ptrace_init_task(struct task_struct *child, bool ptrace)
    +{
    + INIT_LIST_HEAD(&child->ptrace_entry);
    + INIT_LIST_HEAD(&child->ptraced);
    + child->parent = child->real_parent;
    + child->ptrace = 0;
    + if (unlikely(ptrace)) {
    + child->ptrace = current->ptrace;
    + __ptrace_link(child, current->parent);
    + }
    +}
    +
    #ifndef force_successful_syscall_return
    /*
    * System call handlers that, upon successful completion, need to return a
    diff --git a/include/linux/tracehook.h b/include/linux/tracehook.h
    index 967ab47..3ebc58b 100644
    --- a/include/linux/tracehook.h
    +++ b/include/linux/tracehook.h
    @@ -110,4 +110,104 @@ static inline void tracehook_report_exit(long *exit_code)
    ptrace_event(PT_TRACE_EXIT, PTRACE_EVENT_EXIT, *exit_code);
    }

    +/**
    + * tracehook_prepare_clone - prepare for new child to be cloned
    + * @clone_flags: %CLONE_* flags from clone/fork/vfork system call
    + *
    + * This is called before a new user task is to be cloned.
    + * Its return value will be passed to tracehook_finish_clone().
    + *
    + * Called with no locks held.
    + */
    +static inline int tracehook_prepare_clone(unsigned clone_flags)
    +{
    + if (clone_flags & CLONE_UNTRACED)
    + return 0;
    +
    + if (clone_flags & CLONE_VFORK) {
    + if (current->ptrace & PT_TRACE_VFORK)
    + return PTRACE_EVENT_VFORK;
    + } else if ((clone_flags & CSIGNAL) != SIGCHLD) {
    + if (current->ptrace & PT_TRACE_CLONE)
    + return PTRACE_EVENT_CLONE;
    + } else if (current->ptrace & PT_TRACE_FORK)
    + return PTRACE_EVENT_FORK;
    +
    + return 0;
    +}
    +
    +/**
    + * tracehook_finish_clone - new child created and being attached
    + * @child: new child task
    + * @clone_flags: %CLONE_* flags from clone/fork/vfork system call
    + * @trace: return value from tracehook_clone_prepare()
    + *
    + * This is called immediately after adding @child to its parent's children list.
    + * The @trace value is that returned by tracehook_prepare_clone().
    + *
    + * Called with current's siglock and write_lock_irq(&tasklist_lock) held.
    + */
    +static inline void tracehook_finish_clone(struct task_struct *child,
    + unsigned long clone_flags, int trace)
    +{
    + ptrace_init_task(child, (clone_flags & CLONE_PTRACE) || trace);
    +}
    +
    +/**
    + * tracehook_report_clone - in parent, new child is about to start running
    + * @trace: return value from tracehook_clone_prepare()
    + * @regs: parent's user register state
    + * @clone_flags: flags from parent's system call
    + * @pid: new child's PID in the parent's namespace
    + * @child: new child task
    + *
    + * Called after a child is set up, but before it has been started running.
    + * The @trace value is that returned by tracehook_clone_prepare().
    + * This is not a good place to block, because the child has not started yet.
    + * Suspend the child here if desired, and block in tracehook_clone_complete().
    + * This must prevent the child from self-reaping if tracehook_clone_complete()
    + * uses the @child pointer; otherwise it might have died and been released by
    + * the time tracehook_report_clone_complete() is called.
    + *
    + * Called with no locks held, but the child cannot run until this returns.
    + */
    +static inline void tracehook_report_clone(int trace, struct pt_regs *regs,
    + unsigned long clone_flags,
    + pid_t pid, struct task_struct *child)
    +{
    + if (unlikely(trace)) {
    + /*
    + * The child starts up with an immediate SIGSTOP.
    + */
    + sigaddset(&child->pending.signal, SIGSTOP);
    + set_tsk_thread_flag(child, TIF_SIGPENDING);
    + }
    +}
    +
    +/**
    + * tracehook_report_clone_complete - new child is running
    + * @trace: return value from tracehook_clone_prepare()
    + * @regs: parent's user register state
    + * @clone_flags: flags from parent's system call
    + * @pid: new child's PID in the parent's namespace
    + * @child: child task, already running
    + *
    + * This is called just after the child has started running. This is
    + * just before the clone/fork syscall returns, or blocks for vfork
    + * child completion if @clone_flags has the %CLONE_VFORK bit set.
    + * The @child pointer may be invalid if a self-reaping child died and
    + * tracehook_report_clone() took no action to prevent it from self-reaping.
    + *
    + * Called with no locks held.
    + */
    +static inline void tracehook_report_clone_complete(int trace,
    + struct pt_regs *regs,
    + unsigned long clone_flags,
    + pid_t pid,
    + struct task_struct *child)
    +{
    + if (unlikely(trace))
    + ptrace_event(0, trace, pid);
    +}
    +
    #endif /* */
    diff --git a/kernel/fork.c b/kernel/fork.c
    index adefc11..56b6ab0 100644
    --- a/kernel/fork.c
    +++ b/kernel/fork.c
    @@ -55,6 +55,7 @@
    #include
    #include
    #include
    +#include

    #include
    #include
    @@ -833,8 +834,7 @@ static void copy_flags(unsigned long clone_flags, struct task_struct *p)

    new_flags &= ~PF_SUPERPRIV;
    new_flags |= PF_FORKNOEXEC;
    - if (!(clone_flags & CLONE_PTRACE))
    - p->ptrace = 0;
    + new_flags |= PF_STARTING;
    p->flags = new_flags;
    clear_freeze_flag(p);
    }
    @@ -875,7 +875,8 @@ static struct task_struct *copy_process(unsigned long clone_flags,
    struct pt_regs *regs,
    unsigned long stack_size,
    int __user *child_tidptr,
    - struct pid *pid)
    + struct pid *pid,
    + int trace)
    {
    int retval;
    struct task_struct *p;
    @@ -1125,8 +1126,6 @@ static struct task_struct *copy_process(unsigned long clone_flags,
    */
    p->group_leader = p;
    INIT_LIST_HEAD(&p->thread_group);
    - INIT_LIST_HEAD(&p->ptrace_entry);
    - INIT_LIST_HEAD(&p->ptraced);

    /* Now that the task is set up, run cgroup callbacks if
    * necessary. We need to run them before the task is visible
    @@ -1157,7 +1156,6 @@ static struct task_struct *copy_process(unsigned long clone_flags,
    p->real_parent = current->real_parent;
    else
    p->real_parent = current;
    - p->parent = p->real_parent;

    spin_lock(&current->sighand->siglock);

    @@ -1199,8 +1197,7 @@ static struct task_struct *copy_process(unsigned long clone_flags,

    if (likely(p->pid)) {
    list_add_tail(&p->sibling, &p->real_parent->children);
    - if (unlikely(p->ptrace & PT_PTRACED))
    - __ptrace_link(p, current->parent);
    + tracehook_finish_clone(p, clone_flags, trace);

    if (thread_group_leader(p)) {
    if (clone_flags & CLONE_NEWPID)
    @@ -1285,29 +1282,13 @@ struct task_struct * __cpuinit fork_idle(int cpu)
    struct pt_regs regs;

    task = copy_process(CLONE_VM, 0, idle_regs(&regs), 0, NULL,
    - &init_struct_pid);
    + &init_struct_pid, 0);
    if (!IS_ERR(task))
    init_idle(task, cpu);

    return task;
    }

    -static int fork_traceflag(unsigned clone_flags)
    -{
    - if (clone_flags & CLONE_UNTRACED)
    - return 0;
    - else if (clone_flags & CLONE_VFORK) {
    - if (current->ptrace & PT_TRACE_VFORK)
    - return PTRACE_EVENT_VFORK;
    - } else if ((clone_flags & CSIGNAL) != SIGCHLD) {
    - if (current->ptrace & PT_TRACE_CLONE)
    - return PTRACE_EVENT_CLONE;
    - } else if (current->ptrace & PT_TRACE_FORK)
    - return PTRACE_EVENT_FORK;
    -
    - return 0;
    -}
    -
    /*
    * Ok, this is the main fork-routine.
    *
    @@ -1342,14 +1323,14 @@ long do_fork(unsigned long clone_flags,
    }
    }

    - if (unlikely(current->ptrace)) {
    - trace = fork_traceflag (clone_flags);
    - if (trace)
    - clone_flags |= CLONE_PTRACE;
    - }
    + /*
    + * When called from kernel_thread, don't do user tracing stuff.
    + */
    + if (likely(user_mode(regs)))
    + trace = tracehook_prepare_clone(clone_flags);

    p = copy_process(clone_flags, stack_start, regs, stack_size,
    - child_tidptr, NULL);
    + child_tidptr, NULL, trace);
    /*
    * Do this prior waking up the new thread - the thread pointer
    * might get invalid after that point, if the thread exits quickly.
    @@ -1367,24 +1348,30 @@ long do_fork(unsigned long clone_flags,
    init_completion(&vfork);
    }

    - if ((p->ptrace & PT_PTRACED) || (clone_flags & CLONE_STOPPED)) {
    + tracehook_report_clone(trace, regs, clone_flags, nr, p);
    +
    + /*
    + * We set PF_STARTING at creation in case tracing wants to
    + * use this to distinguish a fully live task from one that
    + * hasn't gotten to tracehook_report_clone() yet. Now we
    + * clear it and set the child going.
    + */
    + p->flags &= ~PF_STARTING;
    +
    + if (unlikely(clone_flags & CLONE_STOPPED)) {
    /*
    * We'll start up with an immediate SIGSTOP.
    */
    sigaddset(&p->pending.signal, SIGSTOP);
    set_tsk_thread_flag(p, TIF_SIGPENDING);
    - }
    -
    - if (!(clone_flags & CLONE_STOPPED))
    - wake_up_new_task(p, clone_flags);
    - else
    __set_task_state(p, TASK_STOPPED);
    -
    - if (unlikely (trace)) {
    - current->ptrace_message = nr;
    - ptrace_notify ((trace << 8) | SIGTRAP);
    + } else {
    + wake_up_new_task(p, clone_flags);
    }

    + tracehook_report_clone_complete(trace, regs,
    + clone_flags, nr, p);
    +
    if (clone_flags & CLONE_VFORK) {
    freezer_do_not_count();
    wait_for_completion(&vfork);
    --
    To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
    the body of a message to majordomo@vger.kernel.org
    More majordomo info at http://vger.kernel.org/majordomo-info.html
    Please read the FAQ at http://www.tux.org/lkml/

  11. [PATCH 20/23] tracehook: CONFIG_HAVE_ARCH_TRACEHOOK

    This adds the generic HAVE_ARCH_TRACEHOOK kconfig item.
    Each arch should add to some Kconfig file:
    select HAVE_ARCH_TRACEHOOK
    if the arch code uses all the latest hooks to enable newfangled
    tracing and debugging code. The comment in arch/Kconfig lists
    all the prerequisite arch support. When all these are available,
    setting HAVE_ARCH_TRACEHOOK will allow enabling any new features
    that depend on the modern arch interfaces.

    Signed-off-by: Roland McGrath
    ---
    arch/Kconfig | 18 ++++++++++++++++++
    1 files changed, 18 insertions(+), 0 deletions(-)

    diff --git a/arch/Kconfig b/arch/Kconfig
    index ad89a33..4641edf 100644
    --- a/arch/Kconfig
    +++ b/arch/Kconfig
    @@ -37,6 +37,24 @@ config HAVE_KPROBES
    config HAVE_KRETPROBES
    def_bool n

    +#
    +# An arch should select this if it provides all these things:
    +#
    +# task_pt_regs() in asm/processor.h or asm/ptrace.h
    +# arch_has_single_step() if there is hardware single-step support
    +# arch_has_block_step() if there is hardware block-step support
    +# arch_ptrace() and not #define __ARCH_SYS_PTRACE
    +# compat_arch_ptrace() and #define __ARCH_WANT_COMPAT_SYS_PTRACE
    +# asm/syscall.h supplying asm-generic/syscall.h interface
    +# linux/regset.h user_regset interfaces
    +# CORE_DUMP_USE_REGSET #define'd in linux/elf.h
    +# TIF_SYSCALL_TRACE calls tracehook_report_syscall_{entry,exit}
    +# TIF_NOTIFY_RESUME calls tracehook_notify_resume()
    +# signal delivery calls tracehook_signal_handler()
    +#
    +config HAVE_ARCH_TRACEHOOK
    + def_bool n
    +
    config HAVE_DMA_ATTRS
    def_bool n

    --
    To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
    the body of a message to majordomo@vger.kernel.org
    More majordomo info at http://vger.kernel.org/majordomo-info.html
    Please read the FAQ at http://www.tux.org/lkml/

  12. [PATCH 13/23] tracehook: syscall

    This adds standard tracehook.h inlines for arch code to call when
    TIF_SYSCALL_TRACE has been set. This replaces having each arch
    implement the ptrace guts for its syscall tracing support.

    Signed-off-by: Roland McGrath
    ---
    include/linux/tracehook.h | 70 +++++++++++++++++++++++++++++++++++++++++++++
    1 files changed, 70 insertions(+), 0 deletions(-)

    diff --git a/include/linux/tracehook.h b/include/linux/tracehook.h
    index 8b4c15e..3548694 100644
    --- a/include/linux/tracehook.h
    +++ b/include/linux/tracehook.h
    @@ -66,6 +66,76 @@ static inline int tracehook_expect_breakpoints(struct task_struct *task)
    return (task_ptrace(task) & PT_PTRACED) != 0;
    }

    +/*
    + * ptrace report for syscall entry and exit looks identical.
    + */
    +static inline void ptrace_report_syscall(struct pt_regs *regs)
    +{
    + int ptrace = task_ptrace(current);
    +
    + if (!(ptrace & PT_PTRACED))
    + return;
    +
    + ptrace_notify(SIGTRAP | ((ptrace & PT_TRACESYSGOOD) ? 0x80 : 0));
    +
    + /*
    + * this isn't the same as continuing with a signal, but it will do
    + * for normal use. strace only continues with a signal if the
    + * stopping signal is not SIGTRAP. -brl
    + */
    + if (current->exit_code) {
    + send_sig(current->exit_code, current, 1);
    + current->exit_code = 0;
    + }
    +}
    +
    +/**
    + * tracehook_report_syscall_entry - task is about to attempt a system call
    + * @regs: user register state of current task
    + *
    + * This will be called if %TIF_SYSCALL_TRACE has been set, when the
    + * current task has just entered the kernel for a system call.
    + * Full user register state is available here. Changing the values
    + * in @regs can affect the system call number and arguments to be tried.
    + * It is safe to block here, preventing the system call from beginning.
    + *
    + * Returns zero normally, or nonzero if the calling arch code should abort
    + * the system call. That must prevent normal entry so no system call is
    + * made. If @task ever returns to user mode after this, its register state
    + * is unspecified, but should be something harmless like an %ENOSYS error
    + * return.
    + *
    + * Called without locks, just after entering kernel mode.
    + */
    +static inline __must_check int tracehook_report_syscall_entry(
    + struct pt_regs *regs)
    +{
    + ptrace_report_syscall(regs);
    + return 0;
    +}
    +
    +/**
    + * tracehook_report_syscall_exit - task has just finished a system call
    + * @regs: user register state of current task
    + * @step: nonzero if simulating single-step or block-step
    + *
    + * This will be called if %TIF_SYSCALL_TRACE has been set, when the
    + * current task has just finished an attempted system call. Full
    + * user register state is available here. It is safe to block here,
    + * preventing signals from being processed.
    + *
    + * If @step is nonzero, this report is also in lieu of the normal
    + * trap that would follow the system call instruction because
    + * user_enable_block_step() or user_enable_single_step() was used.
    + * In this case, %TIF_SYSCALL_TRACE might not be set.
    + *
    + * Called without locks, just before checking for pending signals.
    + */
    +static inline void tracehook_report_syscall_exit(struct pt_regs *regs, int step)
    +{
    + ptrace_report_syscall(regs);
    +}
    +
    /**
    * tracehook_unsafe_exec - check for exec declared unsafe due to tracing
    * @task: current task doing exec
    --
    To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
    the body of a message to majordomo@vger.kernel.org
    More majordomo info at http://vger.kernel.org/majordomo-info.html
    Please read the FAQ at http://www.tux.org/lkml/

  13. [PATCH 23/23] /proc/PID/syscall

    This adds /proc/PID/syscall and /proc/PID/task/TID/syscall magic files.
    These use task_current_syscall() to show the task's current system call
    number and argument registers, stack pointer and PC. For a task blocked
    but not in a syscall, the file shows "-1" in place of the syscall number,
    followed by only the SP and PC. For a task that's not blocked, it shows
    "running".

    Signed-off-by: Roland McGrath
    ---
    fs/proc/base.c | 26 ++++++++++++++++++++++++++
    1 files changed, 26 insertions(+), 0 deletions(-)

    diff --git a/fs/proc/base.c b/fs/proc/base.c
    index 6f21daa..6c950c7 100644
    --- a/fs/proc/base.c
    +++ b/fs/proc/base.c
    @@ -509,6 +509,26 @@ static int proc_pid_limits(struct task_struct *task, char *buffer)
    return count;
    }

    +#ifdef CONFIG_HAVE_ARCH_TRACEHOOK
    +static int proc_pid_syscall(struct task_struct *task, char *buffer)
    +{
    + long nr;
    + unsigned long args[6], sp, pc;
    +
    + if (task_current_syscall(task, &nr, args, 6, &sp, &pc))
    + return sprintf(buffer, "running\n");
    +
    + if (nr < 0)
    + return sprintf(buffer, "%ld 0x%lx 0x%lx\n", nr, sp, pc);
    +
    + return sprintf(buffer,
    + "%ld 0x%lx 0x%lx 0x%lx 0x%lx 0x%lx 0x%lx 0x%lx 0x%lx\n",
    + nr,
    + args[0], args[1], args[2], args[3], args[4], args[5],
    + sp, pc);
    +}
    +#endif /* CONFIG_HAVE_ARCH_TRACEHOOK */
    +
    /************************************************** **********************/
    /* Here the fs part begins */
    /************************************************** **********************/
    @@ -2425,6 +2445,9 @@ static const struct pid_entry tgid_base_stuff[] = {
    #ifdef CONFIG_SCHED_DEBUG
    REG("sched", S_IRUGO|S_IWUSR, pid_sched),
    #endif
    +#ifdef CONFIG_HAVE_ARCH_TRACEHOOK
    + INF("syscall", S_IRUSR, pid_syscall),
    +#endif
    INF("cmdline", S_IRUGO, pid_cmdline),
    ONE("stat", S_IRUGO, tgid_stat),
    ONE("statm", S_IRUGO, pid_statm),
    @@ -2757,6 +2780,9 @@ static const struct pid_entry tid_base_stuff[] = {
    #ifdef CONFIG_SCHED_DEBUG
    REG("sched", S_IRUGO|S_IWUSR, pid_sched),
    #endif
    +#ifdef CONFIG_HAVE_ARCH_TRACEHOOK
    + INF("syscall", S_IRUSR, pid_syscall),
    +#endif
    INF("cmdline", S_IRUGO, pid_cmdline),
    ONE("stat", S_IRUGO, tid_stat),
    ONE("statm", S_IRUGO, pid_statm),
    --
    To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
    the body of a message to majordomo@vger.kernel.org
    More majordomo info at http://vger.kernel.org/majordomo-info.html
    Please read the FAQ at http://www.tux.org/lkml/

  14. [PATCH 22/23] task_current_syscall

    This adds the new function task_current_syscall() on machines where the
    asm/syscall.h interface is supported (CONFIG_HAVE_ARCH_TRACEHOOK).
    It's exported for modules to use in the future. This function safely
    samples the state of a blocked thread to collect what system call it
    is blocked in, and the six system call argument registers.

    Signed-off-by: Roland McGrath
    ---
    include/linux/ptrace.h | 4 ++
    lib/Makefile | 2 +
    lib/syscall.c | 75 ++++++++++++++++++++++++++++++++++++++++++++++++
    3 files changed, 81 insertions(+), 0 deletions(-)

    diff --git a/include/linux/ptrace.h b/include/linux/ptrace.h
    index ed69c03..fd31756 100644
    --- a/include/linux/ptrace.h
    +++ b/include/linux/ptrace.h
    @@ -314,6 +314,10 @@ static inline void user_enable_block_step(struct task_struct *task)
    #define arch_ptrace_stop(code, info) do { } while (0)
    #endif

    +extern int task_current_syscall(struct task_struct *target, long *callno,
    + unsigned long args[6], unsigned int maxargs,
    + unsigned long *sp, unsigned long *pc);
    +
    #endif

    #endif
    diff --git a/lib/Makefile b/lib/Makefile
    index 2c62a9c..0dd88c4 100644
    --- a/lib/Makefile
    +++ b/lib/Makefile
    @@ -82,6 +82,8 @@ lib-$(CONFIG_GENERIC_BUG) += bug.o

    obj-$(CONFIG_HAVE_LMB) += lmb.o

    +obj-$(CONFIG_HAVE_ARCH_TRACEHOOK) += syscall.o
    +
    hostprogs-y := gen_crc32table
    clean-files := crc32table.h

    diff --git a/lib/syscall.c b/lib/syscall.c
    new file mode 100644
    index 0000000..a4f7067
    --- /dev/null
    +++ b/lib/syscall.c
    @@ -0,0 +1,75 @@
    +#include
    +#include
    +#include
    +#include
    +
    +static int collect_syscall(struct task_struct *target, long *callno,
    + unsigned long args[6], unsigned int maxargs,
    + unsigned long *sp, unsigned long *pc)
    +{
    + struct pt_regs *regs = task_pt_regs(target);
    + if (unlikely(!regs))
    + return -EAGAIN;
    +
    + *sp = user_stack_pointer(regs);
    + *pc = instruction_pointer(regs);
    +
    + *callno = syscall_get_nr(target, regs);
    + if (*callno != -1L && maxargs > 0)
    + syscall_get_arguments(target, regs, 0, maxargs, args);
    +
    + return 0;
    +}
    +
    +/**
    + * task_current_syscall - Discover what a blocked task is doing.
    + * @target: thread to examine
    + * @callno: filled with system call number or -1
    + * @args: filled with @maxargs system call arguments
    + * @maxargs: number of elements in @args to fill
    + * @sp: filled with user stack pointer
    + * @pc: filled with user PC
    + *
    + * If @target is blocked in a system call, returns zero with *@callno
    + * set to the the call's number and @args filled in with its arguments.
    + * Registers not used for system call arguments may not be available and
    + * it is not kosher to use &struct user_regset calls while the system
    + * call is still in progress. Note we may get this result if @target
    + * has finished its system call but not yet returned to user mode, such
    + * as when it's stopped for signal handling or syscall exit tracing.
    + *
    + * If @target is blocked in the kernel during a fault or exception,
    + * returns zero with *@callno set to -1 and does not fill in @args.
    + * If so, it's now safe to examine @target using &struct user_regset
    + * get() calls as long as we're sure @target won't return to user mode.
    + *
    + * Returns -%EAGAIN if @target does not remain blocked.
    + *
    + * Returns -%EINVAL if @maxargs is too large (maximum is six).
    + */
    +int task_current_syscall(struct task_struct *target, long *callno,
    + unsigned long args[6], unsigned int maxargs,
    + unsigned long *sp, unsigned long *pc)
    +{
    + long state;
    + unsigned long ncsw;
    +
    + if (unlikely(maxargs > 6))
    + return -EINVAL;
    +
    + if (target == current)
    + return collect_syscall(target, callno, args, maxargs, sp, pc);
    +
    + state = target->state;
    + if (unlikely(!state))
    + return -EAGAIN;
    +
    + ncsw = wait_task_inactive(target, state);
    + if (unlikely(!ncsw) ||
    + unlikely(collect_syscall(target, callno, args, maxargs, sp, pc)) ||
    + unlikely(wait_task_inactive(target, state) != ncsw))
    + return -EAGAIN;
    +
    + return 0;
    +}
    +EXPORT_SYMBOL_GPL(task_current_syscall);
    --
    To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
    the body of a message to majordomo@vger.kernel.org
    More majordomo info at http://vger.kernel.org/majordomo-info.html
    Please read the FAQ at http://www.tux.org/lkml/

  15. [PATCH 19/23] tracehook: asm/syscall.h

    This adds asm-generic/syscall.h, which documents what a real
    asm-ARCH/syscall.h file should define. This is not used yet,
    but will provide all the machine-dependent details of examining
    a user system call about to begin, in progress, or just ended.

    Each arch should add an asm-ARCH/syscall.h that defines all the
    entry points documented in asm-generic/syscall.h, as short inlines
    if possible. This lets us write new tracing code that understands
    user system call registers, without any new arch-specific work.

    Signed-off-by: Roland McGrath
    ---
    include/asm-generic/syscall.h | 141 +++++++++++++++++++++++++++++++++++++++++
    include/linux/tracehook.h | 3 +-
    2 files changed, 143 insertions(+), 1 deletions(-)

    diff --git a/include/asm-generic/syscall.h b/include/asm-generic/syscall.h
    new file mode 100644
    index 0000000..abcf34c
    --- /dev/null
    +++ b/include/asm-generic/syscall.h
    @@ -0,0 +1,141 @@
    +/*
    + * Access to user system call parameters and results
    + *
    + * Copyright (C) 2008 Red Hat, Inc. All rights reserved.
    + *
    + * This copyrighted material is made available to anyone wishing to use,
    + * modify, copy, or redistribute it subject to the terms and conditions
    + * of the GNU General Public License v.2.
    + *
    + * This file is a stub providing documentation for what functions
    + * asm-ARCH/syscall.h files need to define. Most arch definitions
    + * will be simple inlines.
    + *
    + * All of these functions expect to be called with no locks,
    + * and only when the caller is sure that the task of interest
    + * cannot return to user mode while we are looking at it.
    + */
    +
    +#ifndef _ASM_SYSCALL_H
    +#define _ASM_SYSCALL_H 1
    +
    +struct task_struct;
    +struct pt_regs;
    +
    +/**
    + * syscall_get_nr - find what system call a task is executing
    + * @task: task of interest, must be blocked
    + * @regs: task_pt_regs() of @task
    + *
    + * If @task is executing a system call or is at system call
    + * tracing about to attempt one, returns the system call number.
    + * If @task is not executing a system call, i.e. it's blocked
    + * inside the kernel for a fault or signal, returns -1.
    + *
    + * It's only valid to call this when @task is known to be blocked.
    + */
    +long syscall_get_nr(struct task_struct *task, struct pt_regs *regs);
    +
    +/**
    + * syscall_rollback - roll back registers after an aborted system call
    + * @task: task of interest, must be in system call exit tracing
    + * @regs: task_pt_regs() of @task
    + *
    + * It's only valid to call this when @task is stopped for system
    + * call exit tracing (due to TIF_SYSCALL_TRACE or TIF_SYSCALL_AUDIT),
    + * after tracehook_report_syscall_entry() returned nonzero to prevent
    + * the system call from taking place.
    + *
    + * This rolls back the register state in @regs so it's as if the
    + * system call instruction was a no-op. The registers containing
    + * the system call number and arguments are as they were before the
    + * system call instruction. This may not be the same as what the
    + * register state looked like at system call entry tracing.
    + */
    +void syscall_rollback(struct task_struct *task, struct pt_regs *regs);
    +
    +/**
    + * syscall_get_error - check result of traced system call
    + * @task: task of interest, must be blocked
    + * @regs: task_pt_regs() of @task
    + *
    + * Returns 0 if the system call succeeded, or -ERRORCODE if it failed.
    + *
    + * It's only valid to call this when @task is stopped for tracing on exit
    + * from a system call, due to %TIF_SYSCALL_TRACE or %TIF_SYSCALL_AUDIT.
    + */
    +long syscall_get_error(struct task_struct *task, struct pt_regs *regs);
    +
    +/**
    + * syscall_get_return_value - get the return value of a traced system call
    + * @task: task of interest, must be blocked
    + * @regs: task_pt_regs() of @task
    + *
    + * Returns the return value of the successful system call.
    + * This value is meaningless if syscall_get_error() returned nonzero.
    + *
    + * It's only valid to call this when @task is stopped for tracing on exit
    + * from a system call, due to %TIF_SYSCALL_TRACE or %TIF_SYSCALL_AUDIT.
    + */
    +long syscall_get_return_value(struct task_struct *task, struct pt_regs *regs);
    +
    +/**
    + * syscall_set_return_value - change the return value of a traced system call
    + * @task: task of interest, must be blocked
    + * @regs: task_pt_regs() of @task
    + * @error: negative error code, or zero to indicate success
    + * @val: user return value if @error is zero
    + *
    + * This changes the results of the system call that user mode will see.
    + * If @error is zero, the user sees a successful system call with a
    + * return value of @val. If @error is nonzero, it's a negated errno
    + * code; the user sees a failed system call with this errno code.
    + *
    + * It's only valid to call this when @task is stopped for tracing on exit
    + * from a system call, due to %TIF_SYSCALL_TRACE or %TIF_SYSCALL_AUDIT.
    + */
    +void syscall_set_return_value(struct task_struct *task, struct pt_regs *regs,
    + int error, long val);
    +
    +/**
    + * syscall_get_arguments - extract system call parameter values
    + * @task: task of interest, must be blocked
    + * @regs: task_pt_regs() of @task
    + * @i: argument index [0,5]
    + * @n: number of arguments; n+i must be [1,6].
    + * @args: array filled with argument values
    + *
    + * Fetches @n arguments to the system call starting with the @i'th argument
    + * (from 0 through 5). Argument @i is stored in @args[0], and so on.
    + * An arch inline version is probably optimal when @i and @n are constants.
    + *
    + * It's only valid to call this when @task is stopped for tracing on
    + * entry to a system call, due to %TIF_SYSCALL_TRACE or %TIF_SYSCALL_AUDIT.
    + * It's invalid to call this with @i + @n > 6; we only support system calls
    + * taking up to 6 arguments.
    + */
    +void syscall_get_arguments(struct task_struct *task, struct pt_regs *regs,
    + unsigned int i, unsigned int n, unsigned long *args);
    +
    +/**
    + * syscall_set_arguments - change system call parameter value
    + * @task: task of interest, must be in system call entry tracing
    + * @regs: task_pt_regs() of @task
    + * @i: argument index [0,5]
    + * @n: number of arguments; n+i must be [1,6].
    + * @args: array of argument values to store
    + *
    + * Changes @n arguments to the system call starting with the @i'th argument.
    + * @n'th argument to @val. Argument @i gets value @args[0], and so on.
    + * An arch inline version is probably optimal when @i and @n are constants.
    + *
    + * It's only valid to call this when @task is stopped for tracing on
    + * entry to a system call, due to %TIF_SYSCALL_TRACE or %TIF_SYSCALL_AUDIT.
    + * It's invalid to call this with @i + @n > 6; we only support system calls
    + * taking up to 6 arguments.
    + */
    +void syscall_set_arguments(struct task_struct *task, struct pt_regs *regs,
    + unsigned int i, unsigned int n,
    + const unsigned long *args);
    +
    +#endif /* _ASM_SYSCALL_H */
    diff --git a/include/linux/tracehook.h b/include/linux/tracehook.h
    index 32867ab..589f429 100644
    --- a/include/linux/tracehook.h
    +++ b/include/linux/tracehook.h
    @@ -103,7 +103,8 @@ static inline void ptrace_report_syscall(struct pt_regs *regs)
    * the system call. That must prevent normal entry so no system call is
    * made. If @task ever returns to user mode after this, its register state
    * is unspecified, but should be something harmless like an %ENOSYS error
    - * return.
    + * return. It should preserve enough information so that syscall_rollback()
    + * can work (see asm-generic/syscall.h).
    *
    * Called without locks, just after entering kernel mode.
    */
    --
    To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
    the body of a message to majordomo@vger.kernel.org
    More majordomo info at http://vger.kernel.org/majordomo-info.html
    Please read the FAQ at http://www.tux.org/lkml/

  16. [PATCH 10/23] tracehook: tracehook_signal_handler

    This defines tracehook_signal_handler() as a hook for the arch signal
    handling code to call. It gives ptrace the opportunity to stop for a
    pseudo-single-step trap immediately after signal handler setup is done.

    Signed-off-by: Roland McGrath
    ---
    include/linux/tracehook.h | 23 +++++++++++++++++++++++
    1 files changed, 23 insertions(+), 0 deletions(-)

    diff --git a/include/linux/tracehook.h b/include/linux/tracehook.h
    index e113e09..2d1426f 100644
    --- a/include/linux/tracehook.h
    +++ b/include/linux/tracehook.h
    @@ -289,4 +289,27 @@ static inline void tracehook_finish_release_task(struct task_struct *task)
    ptrace_release_task(task);
    }

    +/**
    + * tracehook_signal_handler - signal handler setup is complete
    + * @sig: number of signal being delivered
    + * @info: siginfo_t of signal being delivered
    + * @ka: sigaction setting that chose the handler
    + * @regs: user register state
    + * @stepping: nonzero if debugger single-step or block-step in use
    + *
    + * Called by the arch code after a signal handler has been set up.
    + * Register and stack state reflects the user handler about to run.
    + * Signal mask changes have already been made.
    + *
    + * Called without locks, shortly before returning to user mode
    + * (or handling more signals).
    + */
    +static inline void tracehook_signal_handler(int sig, siginfo_t *info,
    + const struct k_sigaction *ka,
    + struct pt_regs *regs, int stepping)
    +{
    + if (stepping)
    + ptrace_notify(SIGTRAP);
    +}
    +
    #endif /* */
    --
    To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
    the body of a message to majordomo@vger.kernel.org
    More majordomo info at http://vger.kernel.org/majordomo-info.html
    Please read the FAQ at http://www.tux.org/lkml/

  17. [PATCH 16/23] tracehook: death

    This moves the ptrace logic in task death (exit_notify) into
    tracehook.h inlines. Some code is rearranged slightly to make
    things nicer. There is no change, only cleanup.

    There is one hook called with the tasklist_lock write-locked, as ptrace
    needs. There is also a new hook called after exit_state changes and
    without locks. This is a better place for tracing work to be in the
    future, since it doesn't delay the whole system with locking.

    Signed-off-by: Roland McGrath
    ---
    include/linux/sched.h | 2 +-
    include/linux/tracehook.h | 52 +++++++++++++++++++++++++++++++++++++++++++++
    kernel/exit.c | 26 +++++++--------------
    kernel/signal.c | 10 ++++++--
    4 files changed, 69 insertions(+), 21 deletions(-)

    diff --git a/include/linux/sched.h b/include/linux/sched.h
    index 1941d8b..c58e8ab 100644
    --- a/include/linux/sched.h
    +++ b/include/linux/sched.h
    @@ -1785,7 +1785,7 @@ extern int kill_pid_info_as_uid(int, struct siginfo *, struct pid *, uid_t, uid_
    extern int kill_pgrp(struct pid *pid, int sig, int priv);
    extern int kill_pid(struct pid *pid, int sig, int priv);
    extern int kill_proc_info(int, struct siginfo *, pid_t);
    -extern void do_notify_parent(struct task_struct *, int);
    +extern int do_notify_parent(struct task_struct *, int);
    extern void force_sig(int, struct task_struct *);
    extern void force_sig_specific(int, struct task_struct *);
    extern int send_sig(int, struct task_struct *, int);
    diff --git a/include/linux/tracehook.h b/include/linux/tracehook.h
    index 6dc428d..4c50e1b 100644
    --- a/include/linux/tracehook.h
    +++ b/include/linux/tracehook.h
    @@ -471,4 +471,56 @@ static inline int tracehook_notify_jctl(int notify, int why)
    return notify || (current->ptrace & PT_PTRACED);
    }

    +/**
    + * tracehook_notify_death - task is dead, ready to notify parent
    + * @task: @current task now exiting
    + * @death_cookie: value to pass to tracehook_report_death()
    + * @group_dead: nonzero if this was the last thread in the group to die
    + *
    + * Return the signal number to send our parent with do_notify_parent(), or
    + * zero to send no signal and leave a zombie, or -1 to self-reap right now.
    + *
    + * Called with write_lock_irq(&tasklist_lock) held.
    + */
    +static inline int tracehook_notify_death(struct task_struct *task,
    + void **death_cookie, int group_dead)
    +{
    + if (task->exit_signal == -1)
    + return task->ptrace ? SIGCHLD : -1;
    +
    + /*
    + * If something other than our normal parent is ptracing us, then
    + * send it a SIGCHLD instead of honoring exit_signal. exit_signal
    + * only has special meaning to our real parent.
    + */
    + if (thread_group_empty(task) && !ptrace_reparented(task))
    + return task->exit_signal;
    +
    + return task->ptrace ? SIGCHLD : 0;
    +}
    +
    +/**
    + * tracehook_report_death - task is dead and ready to be reaped
    + * @task: @current task now exiting
    + * @signal: signal number sent to parent, or 0 or -1
    + * @death_cookie: value passed back from tracehook_notify_death()
    + * @group_dead: nonzero if this was the last thread in the group to die
    + *
    + * Thread has just become a zombie or is about to self-reap. If positive,
    + * @signal is the signal number just sent to the parent (usually %SIGCHLD).
    + * If @signal is -1, this thread will self-reap. If @signal is 0, this is
    + * a delayed_group_leader() zombie. The @death_cookie was passed back by
    + * tracehook_notify_death().
    + *
    + * If normal reaping is not inhibited, @task->exit_state might be changing
    + * in parallel.
    + *
    + * Called without locks.
    + */
    +static inline void tracehook_report_death(struct task_struct *task,
    + int signal, void *death_cookie,
    + int group_dead)
    +{
    +}
    +
    #endif /* */
    diff --git a/kernel/exit.c b/kernel/exit.c
    index a416a84..78c6307 100644
    --- a/kernel/exit.c
    +++ b/kernel/exit.c
    @@ -861,7 +861,8 @@ static void forget_original_parent(struct task_struct *father)
    */
    static void exit_notify(struct task_struct *tsk, int group_dead)
    {
    - int state;
    + int signal;
    + void *cookie;

    /*
    * This does two things:
    @@ -898,22 +899,11 @@ static void exit_notify(struct task_struct *tsk, int group_dead)
    !capable(CAP_KILL))
    tsk->exit_signal = SIGCHLD;

    - /* If something other than our normal parent is ptracing us, then
    - * send it a SIGCHLD instead of honoring exit_signal. exit_signal
    - * only has special meaning to our real parent.
    - */
    - if (!task_detached(tsk) && thread_group_empty(tsk)) {
    - int signal = ptrace_reparented(tsk) ?
    - SIGCHLD : tsk->exit_signal;
    - do_notify_parent(tsk, signal);
    - } else if (tsk->ptrace) {
    - do_notify_parent(tsk, SIGCHLD);
    - }
    + signal = tracehook_notify_death(tsk, &cookie, group_dead);
    + if (signal > 0)
    + signal = do_notify_parent(tsk, signal);

    - state = EXIT_ZOMBIE;
    - if (task_detached(tsk) && likely(!tsk->ptrace))
    - state = EXIT_DEAD;
    - tsk->exit_state = state;
    + tsk->exit_state = signal < 0 ? EXIT_DEAD : EXIT_ZOMBIE;

    /* mt-exec, de_thread() is waiting for us */
    if (thread_group_leader(tsk) &&
    @@ -923,8 +913,10 @@ static void exit_notify(struct task_struct *tsk, int group_dead)

    write_unlock_irq(&tasklist_lock);

    + tracehook_report_death(tsk, signal, cookie, group_dead);
    +
    /* If the process is dead, release it - nobody will wait for it */
    - if (state == EXIT_DEAD)
    + if (signal < 0)
    release_task(tsk);
    }

    diff --git a/kernel/signal.c b/kernel/signal.c
    index 9b8d94a..cbec18e 100644
    --- a/kernel/signal.c
    +++ b/kernel/signal.c
    @@ -1346,9 +1346,11 @@ static inline void __wake_up_parent(struct task_struct *p,
    /*
    * Let a parent know about the death of a child.
    * For a stopped/continued status change, use do_notify_parent_cldstop instead.
    + *
    + * Returns -1 if our parent ignored us and so we've switched to
    + * self-reaping, or else @sig.
    */
    -
    -void do_notify_parent(struct task_struct *tsk, int sig)
    +int do_notify_parent(struct task_struct *tsk, int sig)
    {
    struct siginfo info;
    unsigned long flags;
    @@ -1420,12 +1422,14 @@ void do_notify_parent(struct task_struct *tsk, int sig)
    */
    tsk->exit_signal = -1;
    if (psig->action[SIGCHLD-1].sa.sa_handler == SIG_IGN)
    - sig = 0;
    + sig = -1;
    }
    if (valid_signal(sig) && sig > 0)
    __group_send_sig_info(sig, &info, tsk->parent);
    __wake_up_parent(tsk, tsk->parent);
    spin_unlock_irqrestore(&psig->siglock, flags);
    +
    + return sig;
    }

    static void do_notify_parent_cldstop(struct task_struct *tsk, int why)
    --
    To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
    the body of a message to majordomo@vger.kernel.org
    More majordomo info at http://vger.kernel.org/majordomo-info.html
    Please read the FAQ at http://www.tux.org/lkml/

  18. [PATCH 09/23] tracehook: tracehook_expect_breakpoints

    This adds tracehook_expect_breakpoints() as a formal hook for
    the nommu code to use for its, "Is text-poking likely?" check
    at mmap time. This names the actual semantics the code means
    to test, and documents it.

    Signed-off-by: Roland McGrath
    ---
    include/linux/tracehook.h | 15 +++++++++++++++
    mm/nommu.c | 4 ++--
    2 files changed, 17 insertions(+), 2 deletions(-)

    diff --git a/include/linux/tracehook.h b/include/linux/tracehook.h
    index 6468ca0..e113e09 100644
    --- a/include/linux/tracehook.h
    +++ b/include/linux/tracehook.h
    @@ -52,6 +52,21 @@
    struct linux_binprm;

    /**
    + * tracehook_expect_breakpoints - guess if task memory might be touched
    + * @task: current task, making a new mapping
    + *
    + * Return nonzero if @task is expected to want breakpoint insertion in
    + * its memory at some point. A zero return is no guarantee it won't
    + * be done, but this is a hint that it's known to be likely.
    + *
    + * May be called with @task->mm->mmap_sem held for writing.
    + */
    +static inline int tracehook_expect_breakpoints(struct task_struct *task)
    +{
    + return (task_ptrace(task) & PT_PTRACED) != 0;
    +}
    +
    +/**
    * tracehook_unsafe_exec - check for exec declared unsafe due to tracing
    * @task: current task doing exec
    *
    diff --git a/mm/nommu.c b/mm/nommu.c
    index 4462b6a..5edccd9 100644
    --- a/mm/nommu.c
    +++ b/mm/nommu.c
    @@ -22,7 +22,7 @@
    #include
    #include
    #include
    -#include
    +#include
    #include
    #include
    #include
    @@ -745,7 +745,7 @@ static unsigned long determine_vm_flags(struct file *file,
    * it's being traced - otherwise breakpoints set in it may interfere
    * with another untraced process
    */
    - if ((flags & MAP_PRIVATE) && (current->ptrace & PT_PTRACED))
    + if ((flags & MAP_PRIVATE) && tracehook_expect_breakpoints(current))
    vm_flags &= ~VM_MAYSHARE;

    return vm_flags;
    --
    To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
    the body of a message to majordomo@vger.kernel.org
    More majordomo info at http://vger.kernel.org/majordomo-info.html
    Please read the FAQ at http://www.tux.org/lkml/

  19. [PATCH 12/23] tracehook: tracehook_consider_fatal_signal

    This defines tracehook_consider_fatal_signal() has a fine-grained
    hook for deciding to skip the special cases for a fatal signal,
    as ptrace does. There is no change, only cleanup.

    Signed-off-by: Roland McGrath
    ---
    include/linux/tracehook.h | 21 +++++++++++++++++++++
    kernel/signal.c | 9 +++++----
    2 files changed, 26 insertions(+), 4 deletions(-)

    diff --git a/include/linux/tracehook.h b/include/linux/tracehook.h
    index 8cffd34..8b4c15e 100644
    --- a/include/linux/tracehook.h
    +++ b/include/linux/tracehook.h
    @@ -331,4 +331,25 @@ static inline int tracehook_consider_ignored_signal(struct task_struct *task,
    return (task_ptrace(task) & PT_PTRACED) != 0;
    }

    +/**
    + * tracehook_consider_fatal_signal - suppress special handling of fatal signal
    + * @task: task receiving the signal
    + * @sig: signal number being sent
    + * @handler: %SIG_DFL or %SIG_IGN
    + *
    + * Return nonzero to prevent special handling of this termination signal.
    + * Normally @handler is %SIG_DFL. It can be %SIG_IGN if @sig is ignored,
    + * in which case force_sig() is about to reset it to %SIG_DFL.
    + * When this returns zero, this signal might cause a quick termination
    + * that does not give the debugger a chance to intercept the signal.
    + *
    + * Called with or without @task->sighand->siglock held.
    + */
    +static inline int tracehook_consider_fatal_signal(struct task_struct *task,
    + int sig,
    + void __user *handler)
    +{
    + return (task_ptrace(task) & PT_PTRACED) != 0;
    +}
    +
    #endif /* */
    diff --git a/kernel/signal.c b/kernel/signal.c
    index f68fb01..860afb8 100644
    --- a/kernel/signal.c
    +++ b/kernel/signal.c
    @@ -300,12 +300,12 @@ flush_signal_handlers(struct task_struct *t, int force_default)

    int unhandled_signal(struct task_struct *tsk, int sig)
    {
    + void __user *handler = tsk->sighand->action[sig-1].sa.sa_handler;
    if (is_global_init(tsk))
    return 1;
    - if (tsk->ptrace & PT_PTRACED)
    + if (handler != SIG_IGN && handler != SIG_DFL)
    return 0;
    - return (tsk->sighand->action[sig-1].sa.sa_handler == SIG_IGN) ||
    - (tsk->sighand->action[sig-1].sa.sa_handler == SIG_DFL);
    + return !tracehook_consider_fatal_signal(tsk, sig, handler);
    }


    @@ -770,7 +770,8 @@ static void complete_signal(int sig, struct task_struct *p, int group)
    if (sig_fatal(p, sig) &&
    !(signal->flags & (SIGNAL_UNKILLABLE | SIGNAL_GROUP_EXIT)) &&
    !sigismember(&t->real_blocked, sig) &&
    - (sig == SIGKILL || !(t->ptrace & PT_PTRACED))) {
    + (sig == SIGKILL ||
    + !tracehook_consider_fatal_signal(t, sig, SIG_DFL))) {
    /*
    * This signal will be fatal to the whole group.
    */
    --
    To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
    the body of a message to majordomo@vger.kernel.org
    More majordomo info at http://vger.kernel.org/majordomo-info.html
    Please read the FAQ at http://www.tux.org/lkml/

  20. Re: [PATCH 00/23] tracehook

    On Thu, 17 Jul 2008 00:25:41 -0700 (PDT) Roland McGrath wrote:

    > This patch series introduces the "tracehook" interface layer of inlines
    > in .
    >
    > ...
    >
    > 25 files changed, 1084 insertions(+), 188 deletions(-)


    It's a strange time to be sending this.

    We're in the middle of a massive dump from linux-next into mainline and
    soon we'll be doing a more modest dump of -mm into mainline and I'm
    tiptoeing around getting all anxious when people try to sneak
    wasnt-in-linux-next stuff into mainline thus wrecking all my junk.

    I'm largely in ignore-new-stuff mode as I'm trying to stabilise
    the existing stuff for 2.6.27-rc1 and hopefully the subsystem
    maintainers are in the same mode.

    So, hum. It's not a good time for anyone to be merging new
    probably-2.6.28 material like this!

    So for now, please let's just review the code and if it does get decent
    review and if there are no unresolveable objections, please ask Stephen
    to include this tree in linux-next.


    That being said, the impact on existing code is fairly modest and we
    could perhaps slip it into 2.6.27-rc1. I haven't looked at it yet.

    Still. Strange timing!
    --
    To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
    the body of a message to majordomo@vger.kernel.org
    More majordomo info at http://vger.kernel.org/majordomo-info.html
    Please read the FAQ at http://www.tux.org/lkml/

+ Reply to Thread
Page 1 of 2 1 2 LastLast