Csup truncates files on RELENG_7 - BSD

  1. Csup truncates files on RELENG_7

    I have been tracking STABLE for about the last 5 years with only a few
    build failures due to an update in the middle of a commit. After
    migrating to the RELENG_7 branch, I was never able to successfully build
    after any csup updates. I noticed that parts of edited files were
    'missing'. No syslog messages were observed. After deleting the
    truncated file and running csup again, the buildworld was mostly
    successful. Occasionally I would have another truncated file, requiring
    deletion and running csup again. I remembered something about 'write
    cache enable' sysctl default getting changed at the start of the
    RELENG_5 branch. After turning off the write cache, I had no further
    problems with building world after a csup of the source tree. It has
    been 4 weeks without any issues. I do a weekly csup and buildworld.

    My hard disk is an IBM Deskstar Model IC35L040AVVA07.

    DMESG shows:

    ad0: 39266MB at ata0-master UDMA100

    I have been using this drive for several years and never had this
    truncation issue until I went to the RELENG_7 branch. I wanted to get
    this into this newsgroup's archives in case someone else has the problem.
    It is strange that csup never detects that the file has been damaged in
    this way, requiring a download of the complete file from the CVS Server.
    Setting hw.ata.wc="0" in /boot/loader.conf cured my problem.
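
    For anyone wanting to confirm the same setting, here is a minimal sketch
    (not part of the original post) that reads loader.conf-style text and
    reports the value of a tunable. The function name read_tunable and the
    sample file contents are made up for illustration, and the rule that the
    last assignment wins is an assumption. On a running system,
    "sysctl hw.ata.wc" shows the value actually in effect.

```python
import re

# Matches lines such as: hw.ata.wc="0"  (quotes optional, trailing comment allowed)
_ASSIGNMENT = re.compile(r'^\s*([A-Za-z0-9_.]+)\s*=\s*"?([^"#\s]*)"?\s*(?:#.*)?$')


def read_tunable(conf_text, name):
    """Return the value assigned to `name` in loader.conf-style text, or None."""
    value = None
    for line in conf_text.splitlines():
        match = _ASSIGNMENT.match(line)
        if match and match.group(1) == name:
            # Assumption: if a name is assigned twice, the last one is used.
            value = match.group(2)
    return value


# Sample contents, invented for this example.
sample = '''
# /boot/loader.conf
hw.ata.wc="0"   # disable the ATA write cache
'''

print(read_tunable(sample, "hw.ata.wc"))
```

    To check a real file, pass open("/boot/loader.conf").read() in place of
    the sample text.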

    Tom



    --
    Public Keys:
    PGP KeyID = 0x5F22FDC1
    GnuPG KeyID = 0x620836CF

  2. Re: Csup truncates files on RELENG_7

    Begin
    On Wed, 03 Sep 2008 10:26:00 -0500, Thomas Laus wrote:
    > I wanted to get this into this newsgroup's archives in case someone
    > else has the problem.


    Please please please file it as a PR instead.

    The website has hints&tips on how to write a PR and where to submit it.


    --
    j p d (at) d s b (dot) t u d e l f t (dot) n l .
    This message was originally posted on Usenet in plain text.
    Any other representation, additions, or changes do not have my
    consent and may be a violation of international copyright law.

  3. Re: Csup truncates files on RELENG_7

    jpd wrote:
    > Begin
    > On Wed, 03 Sep 2008 10:26:00 -0500, Thomas Laus wrote:
    >> I wanted to get this into this newsgroup's archives in case someone
    >> else has the problem.

    >
    > Please please please file it as a PR instead.


    Hmmm.

    Why?

    bin/113345

    apparently nobody ever bothers. I had this one sitting there,
    for more than a year now, and would repeat it on any darned
    box that had a newly installed 6.2 .

    Csup seems unmaintained. IIRC it's a summer of code product, and
    apparently, the summer is over.

    I still install cvsup on every new server.

    Regards

    Christoph Weber-Fahr

  4. Re: Csup truncates files on RELENG_7

    Begin
    On Wed, 03 Sep 2008 23:07:09 +0200, Christoph wrote:
    > jpd wrote:
    >> Please please please file it as a PR instead.

    >
    > Hmmm.
    >
    > Why?


    If nothing else, it has a better chance of getting attention at all, and
    is generally a better place to store things like that, not least because
    it doesn't depend on possibly-evil-after-all third parties.

    s/instead\./also./ might be a pragmatic compromise.


    > bin/113345
    >
    > apparently nobody ever bothers. I had this one sitting there,
    > for more than a year now, and would repeat it on any darned
    > box that had a newly installed 6.2 .


    A generalisation that's not quite correct, even though I too have an open
    PR (ata on sparc64) that hasn't seen any activity in quite a while. I'm
    told 7.1-BETA1 builds will be available shortly so that might be worth
    another shot on the u5.


    > Csup seems unmaintained. IIRC it's a summer of code product, and
    > apparently, the summer is over.


    So it would be good if someone picked it up. I'm actually not following
    mls and such, so I must admit having no clue how that is or isn't handled.


    --
    j p d (at) d s b (dot) t u d e l f t (dot) n l .

  5. Re: Csup truncates files on RELENG_7

    Thomas Laus wrote:

    > I have been tracking STABLE for about the last 5 years with only a few
    > build failures due to an update in the middle of a commit. After
    > migrating to the RELENG_7 branch, I was never able to successfully build
    > after any csup updates. I noticed that parts of edited files were
    > 'missing'. No syslog messages were observed. After deleting the
    > truncated file and running csup again, the buildworld was mostly
    > successful. Occasionally I would have another truncated file, requiring
    > deletion and running csup again. I remembered something about 'write
    > cache enable' sysctl default getting changed at the start of the
    > RELENG_5 branch. After turning off the write cache, I had no further
    > problems with building world after a csup of the source tree. It has
    > been 4 weeks without any issues. I do a weekly csup and buildworld.
    >
    > My hard disk is an IBM Deskstar Model IC35L040AVVA07.
    >
    > DMESG shows:
    >
    > ad0: 39266MB at ata0-master UDMA100
    >
    > I have been using this drive for several years and never had this
    > truncation issue until I went to the RELENG_7 branch. I wanted to get
    > this into this newsgroup's archives in case someone else has the problem.
    > It is strange that csup never detects that the file has been damaged in
    > this way, requiring a download of the complete file from the CVS Server.
    > Setting hw.ata.wc="0" in /boot/loader.conf cured my problem.
    >
    > Tom


    I kind of agree with jpd here about the PR. I don't think the problem is
    csup, per se. Many people use it, as do I, who have never experienced what
    you're describing.

    It also would appear that you've identified something rather more insidious
    that may be semi-related to several other situations. For example, I've
    been wondering about Nvidia chipset controllers with the "silent data
    corruption" errors. You've done a fair amount of legwork here and it's like
    passing the baton in a relay race. Time to pass it on.

    Individually it might be easy to disregard a single PR, but when a pattern
    emerges where many disparate PRs seem to have some commonality it is
    possible for them collectively to garner some attention.

    Sounds to me like your problem is write caching. So a bug in the disk
    controller software(s) may be indicated. Since it seems to have appeared
    when you went to RELENG_7 it is also quite likely a regression. If you can
    narrow down a timeline such as "it worked prior to this date and failed
    after such and such..." it gives the developers a place to start looking.
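
    One generic way to narrow down such a timeline is a binary search over
    dates, sketched below. This is only an illustration, not a procedure
    anyone in this thread used: first_bad_date and pretend_build_works are
    hypothetical names, and the cutoff date is invented so the example runs.
    In practice the check would mean fetching the sources as of that day,
    building, and testing, and with an intermittent failure like this one
    each check would have to be repeated several times before trusting it.

```python
from datetime import date, timedelta


def first_bad_date(good, bad, is_good):
    """Binary-search the first date for which is_good(day) is False.

    Assumes `good` passes, `bad` fails, and that once the failure is
    introduced it stays present on every later date.
    """
    while (bad - good).days > 1:
        middle = good + timedelta(days=(bad - good).days // 2)
        if is_good(middle):
            good = middle
        else:
            bad = middle
    return bad


# Stand-in for "check out the tree as of `day`, build it, and test it".
# The cutoff is made up purely so that the example is runnable.
def pretend_build_works(day):
    return day < date(2008, 5, 17)


print(first_bad_date(date(2008, 1, 1), date(2008, 9, 3), pretend_build_works))
```

    Halving the interval each time means roughly eight checks cover an
    eight-month window, which matters when every check is a full buildworld.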

    Whether or not it gets looked at, or ignored, you will have done what you
    could. You will have at least given it a chance, and some chance is better
    than none at all.

    Just my $.02 - please feel free to disregard. :-)

    -Jason



  6. Re: Csup truncates files on RELENG_7

    Christoph wrote:
    > jpd wrote:
    > > Begin
    > > On Wed, 03 Sep 2008 10:26:00 -0500, Thomas Laus wrote:
    > >> I wanted to get this into this newsgroup's archives in case someone
    > >> else has the problem.

    > >
    > > Please please please file it as a PR instead.

    >
    > Hmmm.
    >
    > Why?
    >
    > bin/113345
    >
    > apparently nobody ever bothers. I had this one sitting there,
    > for more than a year now, and would repeat it on any darned
    > box that had a newly installed 6.2 .
    >
    > Csup seems unmaintained. IIRC it's a summer of code product, and
    > apparently, the summer is over.


    Csup is certainly not a summer of code product, it is the result of long
    work by an experienced FreeBSD developer. Personally I have never seen
    any problem whatsoever using csup.

    --

    Michel TALON


  7. Re: Csup truncates files on RELENG_7

    On 2008-09-03, Jason Bourne wrote:
    >
    > I kind of agree with jpd here about the PR. I don't think the problem is
    > csup, per se. Many people use it, as do I, who have never experienced what
    > you're describing.
    >
    > It also would appear that you've identified something rather more insidious
    > that may be semi-related to several other situations. For example, I've
    > been wondering about Nvidia chipset controllers with the "silent data
    > corruption" errors. You've done a fair amount of legwork here and it's like
    > passing the baton in a relay race. Time to pass it on.
    >
    > Individually it might be easy to disregard a single PR, but when a pattern
    > emerges where many disparate PRs seem to have some commonality it is
    > possible for them collectively to garner some attention.
    >
    > Sounds to me like your problem is write caching. So a bug in the disk
    > controller software(s) may be indicated. Since it seems to have appeared
    > when you went to RELENG_7 it is also quite likely a regression. If you can
    > narrow down a timeline such as "it worked prior to this date and failed
    > after such and such..." it gives the developers a place to start looking.
    >
    > Whether or not it gets looked at, or ignored, you will have done what you
    > could. You will have at least given it a chance, and some chance is better
    > than none at all.
    >
    > Just my $.02 - please feel free to disregard. :-)
    >

    I agree with Jason. I tried cvsup and the file truncation issue was
    still present. I had this happen a few times while at RELENG_6, but it was
    too random to reproduce. I feel that my problem is write caching and
    is isolated to my IBM Deskstar. I have never seen the problem on any of
    my other computers that are also tracking RELENG_7. I checked their
    sysctl parameters and all have hw.ata.wc set to '1'. I am not bright
    enough to suggest a method of testing for the correct setting of this
    variable.
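
    One possible way to test for the symptom itself (truncated or altered
    files) is a write-then-verify run, sketched below. It is an
    illustration only, not something tried in this thread, and the names
    write_tree, verify_tree and expected_bytes are made up. Note the
    limitation: reading files back immediately is likely to be served from
    the operating system's buffer cache, so to exercise the disk the
    verification step should be run later, for example after a reboot or
    with a data set larger than RAM.

```python
import os
import random
import tempfile


def expected_bytes(seed, index, size):
    """Deterministic pseudo-random content for file number `index`."""
    rng = random.Random(f"{seed}-{index}")
    return bytes(rng.getrandbits(8) for _ in range(size))


def write_tree(root, count, size, seed=0):
    """Write `count` files of `size` bytes each and ask the OS to flush them."""
    os.makedirs(root, exist_ok=True)
    for i in range(count):
        with open(os.path.join(root, f"f{i:05d}.bin"), "wb") as fh:
            fh.write(expected_bytes(seed, i, size))
            fh.flush()
            os.fsync(fh.fileno())


def verify_tree(root, count, size, seed=0):
    """Return (name, problem) pairs for files that are missing, short, or altered."""
    problems = []
    for i in range(count):
        name = f"f{i:05d}.bin"
        path = os.path.join(root, name)
        if not os.path.exists(path):
            problems.append((name, "missing"))
            continue
        with open(path, "rb") as fh:
            data = fh.read()
        if len(data) != size:
            problems.append((name, f"length {len(data)}, expected {size}"))
        elif data != expected_bytes(seed, i, size):
            problems.append((name, "content differs"))
    return problems


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as root:
        write_tree(root, count=50, size=4096)
        print(verify_tree(root, count=50, size=4096) or "no problems found")
```

    Because the contents are regenerated from the seed, verify_tree can be
    run in a separate session from write_tree, once with hw.ata.wc at 1 and
    once at 0, and the two results compared.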

    I'll file a PR to get it documented and tracked and maybe it can be
    correlated with another condition or someone else's PR.

    Tom


  8. Re: Csup truncates files on RELENG_7

    Thomas Laus wrote:

    > On 2008-09-03, Jason Bourne wrote:
    >>
    >> I kind of agree with jpd here about the PR. I don't think the problem is
    >> csup, per se. Many people use it, as do I, who have never experienced
    >> what you're describing.

    [snip]
    > I agree with Jason. I tried cvsup and the file truncation issue was
    > still present. I had this happen a few times while at RELENG_6, but it was
    > too random to reproduce. I feel that my problem is write caching and
    > is isolated to my IBM Deskstar. I have never seen the problem on any of
    > my other computers that are also tracking RELENG_7. I checked their
    > sysctl parameters and all have hw.ata.wc set to '1'. I am not bright
    > enough to suggest a method of testing for the correct setting of this
    > variable.
    >
    > I'll file a PR to get it documented and tracked and maybe it can be
    > correlated with another condition or someone else's PR.
    >
    > Tom
    >


    This is sounding more and more like a firmware issue with the Deskstar
    drives themselves. It may even be known. In this case it's also helpful to know
    what kind of controller chip is on the other end of the bus. Sometimes this
    kind of thing only presents with a specific controller/drive combination.

    If this is something that has been seen and investigated before the mfr may
    possibly have a firmware update which fixes it, though I wouldn't hold my
    breath. If it is indeed an obscure hardware fault with one specific model
    of something I doubt the developers will be inclined to spend any time on
    it. Kind of a "It's an IBM problem and if they won't deal with it then it
    just shouldn't be used with FreeBSD".

    -Jason


  9. Re: Csup truncates files on RELENG_7

    On Thu, 04 Sep 2008 14:47:00 GMT,
    Jason Bourne wrote:
    > This is sounding more and more like a firmware issue with the Deskstar
    > drives themselves. It may even be known. In this case it's also helpful to know
    > what kind of controller chip is on the other end of the bus. Sometimes this
    > kind of thing only presents with a specific controller/drive combination.


    If it's a ``deathstar'' series drive, then that probably counts as a
    known problem. But even so it'd be interesting to know why csup tickles
    the drive into misbehaving, and why other things don't.


    --
    j p d (at) d s b (dot) t u d e l f t (dot) n l .

  10. Re: Csup truncates files on RELENG_7

    On 2008-09-04, jpd wrote:
    > On Thu, 04 Sep 2008 14:47:00 GMT,
    >
    > If it's a ``deathstar'' series drive, then that probably counts as a
    > known problem. But even so it'd be interesting to know why csup tickles
    > the drive into misbehaving, and why other things don't.
    >

    It is one of the infamous 'deathstar' drives. I thought all of the
    issues with this drive were mechanical and the failures all happened
    while the drive was still under warranty. Hitachi bought the hard drive
    line from IBM, so I would doubt that there is any ongoing support for
    this drive. It has worked flawlessly for me ever since it was installed
    nearly 5 years ago.

    I suspect that csup performs an intense amount of disk I/O and the write
    cache gets filled up and is unable to signal the IDE controller to wait
    for the cache to empty before resuming. When I csup an empty source
    tree, the data flow is all in the write direction and does not involve
    reads to see if a particular file needs to be edited or deleted.

    I will file a PR for the experts to review. I checked the Hitachi site
    and found no firmware upgrades listed for this model drive.

    Tom


  11. Re: Csup truncates files on RELENG_7

    Begin
    On Thu, 04 Sep 2008 14:20:15 -0500, Thomas Laus wrote:
    > On 2008-09-04, jpd wrote:
    >> On Thu, 04 Sep 2008 14:47:00 GMT,
    >>
    >> If it's a ``deathstar'' series drive, then that probably counts as a
    >> known problem. But even so it'd be interesting to know why csup tickles
    >> the drive into misbehaving, and why other things don't.
    >>

    > It is one of the infamous 'deathstar' drives. I thought all of the
    > issues with this drive were mechanical and the failures all happened
    > while the drive was still under warranty.


    No, they also have broken somethingorother in the controller. Something
    with tagged queueing comes to mind. ISTR this to be in the manpages
    somewhere, too, but I can't recall where.


    > Hitachi bought the hard drive line from IBM, so I would doubt that
    > there is any ongoing support for this drive. It has worked flawlessly
    > for me ever since it was installed nearly 5 years ago.


    The last few of that line had problems and got killed by kneejerking and
    generally doing everything wrong on the damage control front. I'm told
    the earlier models were in fact quite good.

    Since the brand got so thoroughly killed dead IBM apparently saw little
    choice but to sell the department. The technology still was worth something.


    > I suspect that csup performs an intense amount of disk I/O and the write
    > cache gets filled up and is unable to signal the IDE controller to wait
    > for the cache to empty before resuming. When I csup an empty source
    > tree, the data flow is all in the write direction and does not involve
    > reads to see if a particular file needs to be edited or deleted.


    Might be because it's pretty heavy on the random writes, then, though
    then one could expect to see trouble with heavy RDBMS updates too.


    > I will file a PR for the experts to review. I checked the Hitachi site
    > and found no firmware upgrades listed for this model drive.


    Did you check the IBM site, too? I have on occasion fetched various
    images with bios updates from there even well after the department had
    been sold.


    --
    j p d (at) d s b (dot) t u d e l f t (dot) n l .
