[regression?] distcc says: (dcc_pump_sendfile) ERROR: sendfile returned 0? can't cope (bisected) - Kernel

This is a discussion on [regression?] distcc says: (dcc_pump_sendfile) ERROR: sendfile returned 0? can't cope (bisected) - Kernel ; Hi Tom, Jens, My build system started reporting these error messages recently. Reverting commit c3270e577c18b3d0e984c3371493205a4807db9d on top of 2.6.26-rc1 gets things working for me again. -- Dan -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the ...

+ Reply to Thread
Results 1 to 8 of 8

Thread: [regression?] distcc says: (dcc_pump_sendfile) ERROR: sendfile returned 0? can't cope (bisected)

  1. [regression?] distcc says: (dcc_pump_sendfile) ERROR: sendfile returned 0? can't cope (bisected)

    Hi Tom, Jens,

    My build system started reporting these error messages recently.
    Reverting commit c3270e577c18b3d0e984c3371493205a4807db9d on top of
    2.6.26-rc1 gets things working for me again.

    --
    Dan
    --
    To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
    the body of a message to majordomo@vger.kernel.org
    More majordomo info at http://vger.kernel.org/majordomo-info.html
    Please read the FAQ at http://www.tux.org/lkml/

  2. Re: [regression?] distcc says: (dcc_pump_sendfile) ERROR: sendfile returned 0? can't cope (bisected)

    On Wed, May 07 2008, Dan Williams wrote:
    > Hi Tom, Jens,
    >
    > My build system started reporting these error messages recently.
    > Reverting commit c3270e577c18b3d0e984c3371493205a4807db9d on top of
    > 2.6.26-rc1 gets things working for me again.


    Irk, that patch did scare me a bit (hence I asked Tom to double check as
    wel :-). I'll take a look in the morning, all test boxes are off at this
    point in time.

    --
    Jens Axboe

    --
    To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
    the body of a message to majordomo@vger.kernel.org
    More majordomo info at http://vger.kernel.org/majordomo-info.html
    Please read the FAQ at http://www.tux.org/lkml/

  3. Re: [regression?] distcc says: (dcc_pump_sendfile) ERROR: sendfile returned 0? can't cope (bisected)


    On Wed, 2008-05-07 at 23:16 +0200, Jens Axboe wrote:
    > On Wed, May 07 2008, Dan Williams wrote:
    > > Hi Tom, Jens,
    > >
    > > My build system started reporting these error messages recently.
    > > Reverting commit c3270e577c18b3d0e984c3371493205a4807db9d on top of
    > > 2.6.26-rc1 gets things working for me again.

    >
    > Irk, that patch did scare me a bit (hence I asked Tom to double check as
    > wel :-). I'll take a look in the morning, all test boxes are off at this
    > point in time.
    >


    I did, and it still looks ok to me, but obviously it's not, so I'll have
    to do some more digging.

    The only thing I can think of right now that might be a possible cause
    would be in splice_direct_to_actor(), if we had an incomplete transfer,
    the sd->pos returned and assigned would have the value set by the failed
    actor(). Maybe something like the following would take care of that
    case, but I haven't had a chance to test it yet - will do that tomorrow
    night...

    Tom

    diff --git a/fs/splice.c b/fs/splice.c
    index 633f58e..1bb3f34 100644
    --- a/fs/splice.c
    +++ b/fs/splice.c
    @@ -986,7 +986,7 @@ ssize_t splice_direct_to_actor(struct file *in, struct splice_desc *sd,

    while (len) {
    size_t read_len;
    - loff_t pos = sd->pos;
    + loff_t pos = sd->pos, prev_pos = pos;

    ret = do_splice_to(in, &pos, pipe, len, flags);
    if (unlikely(ret <= 0))
    @@ -1001,8 +1001,10 @@ ssize_t splice_direct_to_actor(struct file *in, struct splice_desc *sd,
    * could get stuck data in the internal pipe:
    */
    ret = actor(pipe, sd);
    - if (unlikely(ret <= 0))
    + if (unlikely(ret <= 0)) {
    + sd->pos = prev_pos;
    goto out_release;
    + }

    bytes += ret;
    len -= ret;


    --
    To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
    the body of a message to majordomo@vger.kernel.org
    More majordomo info at http://vger.kernel.org/majordomo-info.html
    Please read the FAQ at http://www.tux.org/lkml/

  4. Re: [regression?] distcc says: (dcc_pump_sendfile) ERROR: sendfile returned 0? can't cope (bisected)


    On Thu, 2008-05-08 at 02:21 -0500, Tom Zanussi wrote:
    > On Wed, 2008-05-07 at 23:16 +0200, Jens Axboe wrote:
    > > On Wed, May 07 2008, Dan Williams wrote:
    > > > Hi Tom, Jens,
    > > >
    > > > My build system started reporting these error messages recently.
    > > > Reverting commit c3270e577c18b3d0e984c3371493205a4807db9d on top of
    > > > 2.6.26-rc1 gets things working for me again.

    > >
    > > Irk, that patch did scare me a bit (hence I asked Tom to double check as
    > > wel :-). I'll take a look in the morning, all test boxes are off at this
    > > point in time.
    > >

    >
    > I did, and it still looks ok to me, but obviously it's not, so I'll have
    > to do some more digging.
    >
    > The only thing I can think of right now that might be a possible cause
    > would be in splice_direct_to_actor(), if we had an incomplete transfer,
    > the sd->pos returned and assigned would have the value set by the failed
    > actor(). Maybe something like the following would take care of that
    > case, but I haven't had a chance to test it yet - will do that tomorrow
    > night...
    >


    Looks like I was on the right track - can you try this patch out
    instead? It makes sure sd.pos is updated correctly if the transfer was
    incomplete or failed. I ran some kernel compiles using distcc while
    running blktrace in sendfile mode and didn't see any problems with
    either.

    Tom

    diff --git a/fs/splice.c b/fs/splice.c
    index 633f58e..3bd95a7 100644
    --- a/fs/splice.c
    +++ b/fs/splice.c
    @@ -986,7 +986,7 @@ ssize_t splice_direct_to_actor(struct file *in, struct splice_desc *sd,

    while (len) {
    size_t read_len;
    - loff_t pos = sd->pos;
    + loff_t pos = sd->pos, prev_pos = pos;

    ret = do_splice_to(in, &pos, pipe, len, flags);
    if (unlikely(ret <= 0))
    @@ -1001,15 +1001,19 @@ ssize_t splice_direct_to_actor(struct file *in, struct splice_desc *sd,
    * could get stuck data in the internal pipe:
    */
    ret = actor(pipe, sd);
    - if (unlikely(ret <= 0))
    + if (unlikely(ret <= 0)) {
    + sd->pos = prev_pos;
    goto out_release;
    + }

    bytes += ret;
    len -= ret;
    sd->pos = pos;

    - if (ret < read_len)
    + if (ret < read_len) {
    + sd->pos = prev_pos + ret;
    goto out_release;
    + }
    }

    done:


    --
    To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
    the body of a message to majordomo@vger.kernel.org
    More majordomo info at http://vger.kernel.org/majordomo-info.html
    Please read the FAQ at http://www.tux.org/lkml/

  5. Re: [regression?] distcc says: (dcc_pump_sendfile) ERROR: sendfile returned 0? can't cope (bisected)

    On Thu, May 08 2008, Tom Zanussi wrote:
    >
    > On Thu, 2008-05-08 at 02:21 -0500, Tom Zanussi wrote:
    > > On Wed, 2008-05-07 at 23:16 +0200, Jens Axboe wrote:
    > > > On Wed, May 07 2008, Dan Williams wrote:
    > > > > Hi Tom, Jens,
    > > > >
    > > > > My build system started reporting these error messages recently.
    > > > > Reverting commit c3270e577c18b3d0e984c3371493205a4807db9d on top of
    > > > > 2.6.26-rc1 gets things working for me again.
    > > >
    > > > Irk, that patch did scare me a bit (hence I asked Tom to double check as
    > > > wel :-). I'll take a look in the morning, all test boxes are off at this
    > > > point in time.
    > > >

    > >
    > > I did, and it still looks ok to me, but obviously it's not, so I'll have
    > > to do some more digging.
    > >
    > > The only thing I can think of right now that might be a possible cause
    > > would be in splice_direct_to_actor(), if we had an incomplete transfer,
    > > the sd->pos returned and assigned would have the value set by the failed
    > > actor(). Maybe something like the following would take care of that
    > > case, but I haven't had a chance to test it yet - will do that tomorrow
    > > night...
    > >

    >
    > Looks like I was on the right track - can you try this patch out
    > instead? It makes sure sd.pos is updated correctly if the transfer was
    > incomplete or failed. I ran some kernel compiles using distcc while
    > running blktrace in sendfile mode and didn't see any problems with
    > either.


    Dan, can I talk you into re-trying current git with this patch applied?
    It's basically the (now) reverted broken bits plus the fix from Tom from
    this email.

    diff --git a/fs/splice.c b/fs/splice.c
    index 7815003..a048ad2 100644
    --- a/fs/splice.c
    +++ b/fs/splice.c
    @@ -983,7 +983,7 @@ ssize_t splice_direct_to_actor(struct file *in, struct splice_desc *sd,

    while (len) {
    size_t read_len;
    - loff_t pos = sd->pos;
    + loff_t pos = sd->pos, prev_pos = pos;

    ret = do_splice_to(in, &pos, pipe, len, flags);
    if (unlikely(ret <= 0))
    @@ -998,15 +998,19 @@ ssize_t splice_direct_to_actor(struct file *in, struct splice_desc *sd,
    * could get stuck data in the internal pipe:
    */
    ret = actor(pipe, sd);
    - if (unlikely(ret <= 0))
    + if (unlikely(ret <= 0)) {
    + sd->pos = prev_pos;
    goto out_release;
    + }

    bytes += ret;
    len -= ret;
    sd->pos = pos;

    - if (ret < read_len)
    + if (ret < read_len) {
    + sd->pos = prev_pos + ret;
    goto out_release;
    + }
    }

    done:
    @@ -1072,7 +1076,7 @@ long do_splice_direct(struct file *in, loff_t *ppos, struct file *out,

    ret = splice_direct_to_actor(in, &sd, direct_splice_actor);
    if (ret > 0)
    - *ppos += ret;
    + *ppos = sd.pos;

    return ret;
    }
    diff --git a/kernel/relay.c b/kernel/relay.c
    index bc24dcd..7de644c 100644
    --- a/kernel/relay.c
    +++ b/kernel/relay.c
    @@ -1191,7 +1191,7 @@ static ssize_t relay_file_splice_read(struct file *in,
    ret = 0;
    spliced = 0;

    - while (len) {
    + while (len && !spliced) {
    ret = subbuf_splice_actor(in, ppos, pipe, len, flags, &nonpad_ret);
    if (ret < 0)
    break;

    --
    Jens Axboe

    --
    To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
    the body of a message to majordomo@vger.kernel.org
    More majordomo info at http://vger.kernel.org/majordomo-info.html
    Please read the FAQ at http://www.tux.org/lkml/

  6. Re: [regression?] distcc says: (dcc_pump_sendfile) ERROR: sendfile returned 0? can't cope (bisected)

    On Fri, May 9, 2008 at 4:26 AM, Jens Axboe wrote:
    > Dan, can I talk you into re-trying current git with this patch applied?
    > It's basically the (now) reverted broken bits plus the fix from Tom from
    > this email.
    >


    distcc appears to be happy.

    Tested-by: Dan Williams
    --
    To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
    the body of a message to majordomo@vger.kernel.org
    More majordomo info at http://vger.kernel.org/majordomo-info.html
    Please read the FAQ at http://www.tux.org/lkml/

  7. Re: [regression?] distcc says: (dcc_pump_sendfile) ERROR: sendfile returned 0? can't cope (bisected)

    On Fri, May 09 2008, Dan Williams wrote:
    > On Fri, May 9, 2008 at 4:26 AM, Jens Axboe wrote:
    > > Dan, can I talk you into re-trying current git with this patch applied?
    > > It's basically the (now) reverted broken bits plus the fix from Tom from
    > > this email.
    > >

    >
    > distcc appears to be happy.
    >
    > Tested-by: Dan Williams


    Super, Tom I already committed this patch as coming from you. If you
    want a SOB added, let me know.

    --
    Jens Axboe

    --
    To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
    the body of a message to majordomo@vger.kernel.org
    More majordomo info at http://vger.kernel.org/majordomo-info.html
    Please read the FAQ at http://www.tux.org/lkml/

  8. Re: [regression?] distcc says: (dcc_pump_sendfile) ERROR: sendfile returned 0? can't cope (bisected)


    On Fri, 2008-05-09 at 21:17 +0200, Jens Axboe wrote:
    > On Fri, May 09 2008, Dan Williams wrote:
    > > On Fri, May 9, 2008 at 4:26 AM, Jens Axboe wrote:
    > > > Dan, can I talk you into re-trying current git with this patch applied?
    > > > It's basically the (now) reverted broken bits plus the fix from Tom from
    > > > this email.
    > > >

    > >
    > > distcc appears to be happy.
    > >
    > > Tested-by: Dan Williams

    >
    > Super, Tom I already committed this patch as coming from you. If you
    > want a SOB added, let me know.
    >


    OK, great, thanks.

    Signed-off-by: Tom Zanussi


    --
    To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
    the body of a message to majordomo@vger.kernel.org
    More majordomo info at http://vger.kernel.org/majordomo-info.html
    Please read the FAQ at http://www.tux.org/lkml/

+ Reply to Thread