Re: [PATCH] arch_ptrace_stop
Roland McGrath wrote:[color=blue]
>
> +static int sigkill_pending(struct task_struct *tsk)
> +{
> + return ((sigismember(&tsk->pending.signal, SIGKILL) ||
> + sigismember(&tsk->signal->shared_pending.signal, SIGKILL)) &&
> + !unlikely(sigismember(&tsk->blocked, SIGKILL)));
> +}[/color]
How is it possible that SIGKILL is blocked?
[color=blue]
> static void ptrace_stop(int exit_code, int nostop_code, siginfo_t *info)
> {
> + int killed = 0;
> +
> + if (arch_ptrace_stop_needed(exit_code, info)) {
> + /*
> + * The arch code has something special to do before a
> + * ptrace stop. This is allowed to block, e.g. for faults
> + * on user stack pages. We can't keep the siglock while
> + * calling arch_ptrace_stop, so we must release it now.
> + * To preserve proper semantics, we must do this before
> + * any signal bookkeeping like checking group_stop_count.
> + * Meanwhile, a SIGKILL could come in before we retake the
> + * siglock. That must prevent us from sleeping in TASK_TRACED.
> + * So after regaining the lock, we must check for SIGKILL.
> + */
> + spin_unlock_irq(¤t->sighand->siglock);
> + arch_ptrace_stop(exit_code, info);
> + spin_lock_irq(¤t->sighand->siglock);
> + killed = sigkill_pending(current);
> + }
> +
> /*
> * If there is a group stop in progress,
> * we must participate in the bookkeeping.
> @@ -1604,7 +1635,7 @@ static void ptrace_stop(int exit_code, i
> spin_unlock_irq(¤t->sighand->siglock);
> try_to_freeze();
> read_lock(&tasklist_lock);
> - if (may_ptrace_stop()) {
> + if (!unlikely(killed) && may_ptrace_stop()) {[/color]
Could you please explain this change in more details?
Currently ptrace_stop() schedules in TASK_TRACED state even if we have a
pending SIGKILL. With this patch this is still possible, but unless
arch_ptrace_stop_needed() is true and thus we will check sigkill_pending().
Suppose the task was SIGKILL'ed and does ptrace_notify(PTRACE_EVENT_EXIT),
now the resulting action depends on arch_ptrace_stop_needed().
I don't claim this is wrong, just trying to understand.
Thanks,
Oleg.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [email]majordomo@vger.kernel.org[/email]
More majordomo info at [url]http://vger.kernel.org/majordomo-info.html[/url]
Please read the FAQ at [url]http://www.tux.org/lkml/[/url]
Re: [PATCH] arch_ptrace_stop
On Thu, Dec 13, 2007 at 08:29:26PM +0300, Oleg Nesterov wrote:[color=blue]
> How is it possible that SIGKILL is blocked?[/color]
I *think* it's possible that kernel threads may block SIGKILL.
And I think init (pid 1) gets SIGKILL blocked.
--
Intel are signing my paycheques ... these opinions are still mine
"Bill, look, we understand that you're interested in selling us this
operating system, but compare it to ours. We can't possibly take such
a retrograde step."
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [email]majordomo@vger.kernel.org[/email]
More majordomo info at [url]http://vger.kernel.org/majordomo-info.html[/url]
Please read the FAQ at [url]http://www.tux.org/lkml/[/url]
Re: [PATCH] arch_ptrace_stop
On 12/13, Matthew Wilcox wrote:[color=blue]
>
> On Thu, Dec 13, 2007 at 08:29:26PM +0300, Oleg Nesterov wrote:[color=green]
> > How is it possible that SIGKILL is blocked?[/color]
>
> I *think* it's possible that kernel threads may block SIGKILL.
> And I think init (pid 1) gets SIGKILL blocked.[/color]
Yes. But this shouldn't matter because we can't ptrace them. /sbin/init
doesn't block SIGKILL, but it can't be ptraced either.
Also, SIGKILL is blocked only if kthread was created by kernel_thread() and
then it does daemonize(). kthread_create()'ed kthreads don't block SIGKILL
but ignore it. The latter behaviour is more correct, but we can't do the same
for kernel_thread() threads.
Oleg.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [email]majordomo@vger.kernel.org[/email]
More majordomo info at [url]http://vger.kernel.org/majordomo-info.html[/url]
Please read the FAQ at [url]http://www.tux.org/lkml/[/url]
Re: [PATCH] arch_ptrace_stop
> How is it possible that SIGKILL is blocked?
It is probably not possible here. It's impossible for normal user threads,
and kernel threads or ones doing something weird inside the kernel can
probably never get here. But I'm just not trying to rely on any fancy
guarantees here. In this change I'm being conservative in just matching
the low-level logic that affects get_signal_to_deliver directly. I'd
prefer to get this in place now, where it's easy to be sure it's correct
vis a vis current behavior. Later on we can convince ourselves about the
correctness of simplifications to clean the logic up.
[color=blue]
> Could you please explain this change in more details?[/color]
I'm glad to. I appreciate the review.
[color=blue]
> Currently ptrace_stop() schedules in TASK_TRACED state even if we have a
> pending SIGKILL. With this patch this is still possible, but unless
> arch_ptrace_stop_needed() is true and thus we will check sigkill_pending().[/color]
Currently the siglock is always held throughout. The case this change
addresses is when no SIGKILL was already pending before we took the lock.
Currently, a new SIGKILL cannot come in until we've released the lock,
which is after we've set TASK_TRACED. The signal's sender will hold the
lock while checking each thread's state, waking up any in TASK_TRACED.
When arch_ptrace_stop_needed() is true, we release the siglock for an
unknown period (might block, etc). If a SIGKILL is sent there, it becomes
pending while we are in TASK_RUNNING or a normal blocked state. Next we
finish arch_ptrace_stop() and reacquire the siglock. Now entering
TASK_TRACED would leave us unkillable because SIGKILL is already pending
and nothing else (except PTRACE_CONT et al) will try to wake us up.
This was tested on ia64 by inserting an explicit sleep in arch_ptrace_stop
to simulate a very long block in a page fault, and then sending SIGKILL in
this interval. It was verified that this left the thread unkillable, and
that the sigkill_pending() check fixed that.
[color=blue]
> Suppose the task was SIGKILL'ed and does ptrace_notify(PTRACE_EVENT_EXIT),
> now the resulting action depends on arch_ptrace_stop_needed().[/color]
For a SIGKILL that caused the exit to happen, then that SIGKILL was already
dequeued and is no longer pending by the time we get here, so nothing is
different.
For a new SIGKILL racing with an exit already in progress, then either it
comes before ptrace_notify(PTRACE_EVENT_EXIT) has taken the siglock, or
after it's released it (in ptrace_stop). If it comes after ptrace_stop
releases the siglock, then it wakes the thread up from that ptrace stop.
If it comes before ptrace_notify takes the siglock, then the thread stops
anyway with a pending SIGKILL. (That may be a problem, but it is not new.
We can discuss that case separately.)
The new third possibility is that it comes after ptrace_stop releases the
siglock to call arch_ptrace_stop. Then the new code will notice the
pending SIGKILL and skip the stop. The only way this differs from the
SIGKILL having arrived immediately after the stop is that the parent
SIGCHLD/wait notification with CLD_TRAPPED didn't happen. It's a race for
the parent even to notice that, since exit_code and the queued SIGCHLD's
siginfo_t will very quickly be replaced by the fresh notification sent for
the child's death.
[color=blue]
> I don't claim this is wrong, just trying to understand.[/color]
I don't claim the logic is now perfect, just that this change is better for
the case it's intended for and not materially worse for others. We can
certainly do more to clean things up. But I don't want that complex
subject to delay the fixing and cleaning up of ia64 that this change
enables.
Thanks,
Roland
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [email]majordomo@vger.kernel.org[/email]
More majordomo info at [url]http://vger.kernel.org/majordomo-info.html[/url]
Please read the FAQ at [url]http://www.tux.org/lkml/[/url]
Re: [PATCH] arch_ptrace_stop
Roland, I am sorry for delay,
On 12/13, Roland McGrath wrote:[color=blue]
>[color=green]
> > Currently ptrace_stop() schedules in TASK_TRACED state even if we have a
> > pending SIGKILL. With this patch this is still possible, but unless
> > arch_ptrace_stop_needed() is true and thus we will check sigkill_pending().[/color]
>
> Currently the siglock is always held throughout. The case this change
> addresses is when no SIGKILL was already pending before we took the lock.
> Currently, a new SIGKILL cannot come in until we've released the lock,
> which is after we've set TASK_TRACED. The signal's sender will hold the
> lock while checking each thread's state, waking up any in TASK_TRACED.
>
> When arch_ptrace_stop_needed() is true, we release the siglock for an
> unknown period (might block, etc). If a SIGKILL is sent there, it becomes
> pending while we are in TASK_RUNNING or a normal blocked state. Next we
> finish arch_ptrace_stop() and reacquire the siglock.[/color]
Yes, yes, I see.
[color=blue]
> Now entering
> TASK_TRACED would leave us unkillable because SIGKILL is already pending
> and nothing else (except PTRACE_CONT et al) will try to wake us up.[/color]
But this doesn't differ from the case when SIGKILL was already pending when
we enter ptrace_stop, and arch_ptrace_stop_needed() == false, that was my
point.
Yes, arch_ptrace_stop() can take a long time, might block, etc. But what
about TIF_SYSCALL_TRACE ? The task can recieve SIGKILL while executing the
syscall which can also block and so on, but do_syscall_trace(entryexit == 1)
doesn't check the pending signal.
I should clarify my question. What I can't understand is the subtle dependency
on the result of arch_ptrace_stop_needed(). This means that it is hard to
predict the behaviour.
IOW, can't we
- ignore the pending SIGKILL (current behaviour)
-- OR --
- always check it unconditionally, before setting TASK_TRACED
? This looks a bit more consistent to me.
Please also note "before setting TASK_TRACED" above. With this patch we set
TASK_TRACED under ->siglock, and then change the ->state to TASK_RUNNING if
killed == 1. Minor, but this doesn't look correct, we can fool the tracer
which does ptrace_check_attach().
Oleg.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [email]majordomo@vger.kernel.org[/email]
More majordomo info at [url]http://vger.kernel.org/majordomo-info.html[/url]
Please read the FAQ at [url]http://www.tux.org/lkml/[/url]