The 9997th file specific DOSFS problem - VxWorks


  1. The 9997th file specific DOSFS problem

    Setup: Tornado 2.2.1, VxWorks 5.5, dosFs 2.0, FAT32, connecting to a
    ~500 GB RAID, single partition (no dpartLib).

    Problem: it takes a very long time (~30 seconds) to write the 9997th
    file, versus ~0.1 seconds for the other (4 MB) files.

    Details:
    The time it takes actually depends on the size of the storage,
    ranging from ~16 seconds for a 128 GB RAID to ~35 seconds for a
    ~900 GB RAID. It has to be dosFs, since both the CPU and the RAID
    are busy when the problem happens. It appears that cbio is
    continuously reading/writing, possibly accessing the FAT?

    The problem happens only once if the RAID is written continuously
    from beginning to end. If the RAID is unmounted / remounted in
    between, the problem can happen multiple times, but still around the
    ~10000th file. Sometimes it also happens around the 20000th,
    30000th ... file, or continuously for consecutive files.

    The files are stored in different directories (at most 8000 files
    per directory). The time is spent in write, not in file open / close.

    Has anyone seen this problem, got a solution / workaround, or any
    suggestions for things to try?

    Thanks!







  2. Re: The 9997th file specific DOSFS problem

    I used to be part of the VxWorks file systems team and I did my fair
    share of time working with DosFS (though not-so-much with the 5.x
    version). It has been over a year since I last worked with VxWorks,
    but maybe some of the following comments will help shed some light.

    One of the things that we noticed was that the size of the storage did
    affect the file system performance to one degree or another. We
    thought that this was due to two things: resulting cluster size, and
    position seeking. The larger the cluster size, the more efficient
    VxWorks becomes at reading and writing its data. If I remember
    correctly, the maximum recommended cluster size for DosFS is 32 kB.
    VxWorks does allow a cluster size of 64 kB, but this is not
    universally supported with other operating systems. With regards to
    position seeking, the larger the storage, the larger the FAT. This
    generally means that the disk head must physically travel longer
    distances when writing/reading data, as it must read the FAT to get
    the number of the next cluster to read. Caching helps with this. So
    too can partitioning.
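
    To picture why a growing FAT hurts, here is a minimal, hypothetical
    sketch of walking a FAT32 cluster chain; readSector() and the layout
    constants are placeholders for illustration, not dosFsLib or cbio
    internals:

        /* Hypothetical sketch: each hop along a FAT32 cluster chain needs
         * the FAT sector holding that entry (from disk or a cache) before
         * the next cluster number is known.  readSector() is an assumed
         * helper, not a dosFs/cbio routine.                               */
        #include "vxWorks.h"
        #include <string.h>

        #define BYTES_PER_SECTOR        512
        #define FAT_ENTRIES_PER_SECTOR  (BYTES_PER_SECTOR / 4)  /* FAT32: 4-byte entries */
        #define FAT32_EOC               0x0FFFFFF8UL            /* end-of-chain marker   */

        extern void readSector (UINT32 sectorNum, UINT8 *buf);  /* assumed helper */

        /* Follow a file's cluster chain starting at 'cluster'; return its length. */
        UINT32 walkChain (UINT32 fatStartSector, UINT32 cluster)
        {
            UINT8  fatSector [BYTES_PER_SECTOR];
            UINT32 count = 0;
            UINT32 entry;

            while (cluster < FAT32_EOC)
            {
                UINT32 sector = fatStartSector + cluster / FAT_ENTRIES_PER_SECTOR;
                UINT32 offset = (cluster % FAT_ENTRIES_PER_SECTOR) * 4;

                readSector (sector, fatSector);            /* one FAT sector per hop    */
                memcpy (&entry, &fatSector [offset], sizeof (entry));
                cluster = entry & 0x0FFFFFFFUL;            /* low 28 bits are the link  */
                count++;
            }
            return (count);
        }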

    I don't know offhand why you are encountering such sudden, large
    delays at the ~10000th file. My suspicion, and it is only a
    suspicion, would be some sort of interplay between the various
    caches. The VxWorks implementation of DosFS uses a "tiny cache" (512
    bytes) for one sector of the FAT, and one big cache for other FAT,
    data and directory sectors. There have been problems in that area in
    the past. To toss out an idea, the tiny cache could potentially be
    getting flushed more often than necessary (causing both extra writes
    and disk seeks).

    Another thought that crossed my mind is that it could simply have to
    do with the cluster allocation strategy. Perhaps it is spending too
    much time trying to find available clusters. But if that were the
    case, I would expect more files to be experiencing the same slow
    behaviour from that point forward.

    Without any more access to VxWorks, the source code or a target
    machine, I can't really add anything else. I do know that there were
    a LOT of DosFS fixes between 5.5 and 6.2 and later. Many of these
    were for stability and some for performance. Unfortunately, that
    probably does not help your situation much.

    Hopefully something in the above will be useful.

    Peter Mitsis (pcm)



  3. Re: The 9997th file specific DOSFS problem

    Thanks a lot Peter. You helped me understand the problem better.
    A few things I need to clarify:

    1) The average performance of accessing the RAID, single-file I/O, is
    ~40 MB/s, except when it hits the ~10000th file. I.e., it normally
    takes ~0.1 seconds to write a 4 MB file; when it hits the ~10000th
    file, it takes ~30 seconds, which is ~300 times slower. Writing the
    file right after it will then take the normal 0.1 seconds, until it
    hits the next one. It does not seem to me to be a pure performance
    problem.

    2) I thought it is already using a 32 KB cluster size; I will
    double-check on that. We do need it to be compatible with Windows /
    Solaris, so 32 KB is the max. We need it to be on a single partition
    as well, for a few reasons.

    3) Adjusting the size of the dcacheLib(?) cache didn't affect the
    problem much.


    If we count the volume, the two FATs and the 9997 files, that's
    exactly 10000 (speculating). This makes me think there has to be a
    magic number somewhere in the code that's related to the problem.

    The 512-byte "tiny cache" for the FAT is very interesting. Is it
    possible that a breakdown of the algorithm, say an overflow /
    underflow, is causing it to re-cache every sector of the FAT? I just
    realized the 30-second duration is very consistent every time it hits
    this magic file.

    Any idea whether this 512 B "tiny cache" can be adjusted? Does dosFs
    try to remember / use the last searched sector?

    Any other ideas? Appreciate your help!






  4. Re: The 9997th file specific DOSFS problem

    Unfortunately, I'm running out of ideas.



    If there is a magic number, I am thinking that it is non-obvious (or
    perhaps my memory is going); I don't recall seeing anything that would
    suggest this behaviour.



    The tiny cache is not configurable. It is the same size as the sector
    (512 bytes in most cases), and should not be confused with the cluster
    size. Yes, the VxWorks DosFS implementation does try to track and
    begin the search from the last searched FAT entry.



    This may not be relevant, but according to Wikipedia's entry on the
    File Allocation Table
    (http://en.wikipedia.org/wiki/File_Allocation_Table), earlier
    versions of Windows had a 128 GB limit on FAT32 partition size. I do
    recall the group mentioning trying to abide by that limitation, but I
    don't know off the top of my head whether such a limitation was
    actually inserted into the code. I note this as you mentioned single
    partitions and RAID volumes of 128 GB -> 900 GB. This avenue might be
    a red herring. Even so, I don't really see it coming into play that
    much: 10000 files * 4 MB each = 40 GB, which is well under these
    limits. Even with any preallocation that may have been done (to
    reduce possible fragmentation), we should still be well shy of that
    128 GB. I think one of the show routines displays the preferred
    number of clusters that are allocated in one go.

    Hmm, playing with some more numbers, I get ....
    Assuming 1 sector = 512 bytes, ...
    FAT32 = 4 byte FAT entries --> 128 FAT entries per sector
    32 kB cluster size means that each FAT sector (and tiny cache) can
    hold cluster numbers to address up to 4 MB of data.
    If the normal files are 4 MB, that is one FAT sector per file.
    9997 mod 128 = 13
    9997 / 128 = 78 + some change

    I'm not really sure where I am going with these numbers ... just
    looking for something that may give a better clue as to where to go
    next.
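
    Spelling those numbers out (plain arithmetic only, not dosFs code; it
    just restates the figures above):

        /* The arithmetic above, spelled out.  Plain C, nothing dosFs-specific. */
        #include <stdio.h>

        int main (void)
        {
            const int sectorBytes         = 512;
            const int fatEntryBytes       = 4;                  /* FAT32 entry size */
            const int entriesPerFatSector = sectorBytes / fatEntryBytes;    /* 128  */
            const int clusterBytes        = 32 * 1024;          /* 32 kB clusters   */

            /* one FAT sector covers 128 clusters * 32 kB = 4 MB of file data */
            printf ("data mapped per FAT sector: %d bytes\n",
                    entriesPerFatSector * clusterBytes);
            printf ("9997 mod 128 = %d\n", 9997 % entriesPerFatSector);     /* 13   */
            printf ("9997 / 128   = %d\n", 9997 / entriesPerFatSector);     /* 78   */
            return 0;
        }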

    Peter

  5. Re: The 9997th file specific DOSFS problem

    It can handle the size without problems. We are able to connect to
    devices up to 1 TB in size.

    A bit more info: if the disk is unmounted and then remounted and
    there are more than ~10000 files on there already, the problem will
    happen right away. It happens when writing the very first file, and
    sometimes on the second file as well.

    Also, if a file is deleted or rewritten, the problem will happen
    after that.

    Because 30 seconds is a long time on a 1 GHz PPC machine, it must be
    doing some very CPU-intensive work (searching?) in addition to disk
    I/O. This is shown by the very busy CPU and I/O during the 30
    seconds.

    You mentioned that the last allocated/searched/accessed sector is
    cached. I wonder if this variable is reset under the above scenarios,
    and it is then trying to search the whole FAT?

    Some kind soul sent me pieces of the dosFs source code. In
    dosFsLib.c, I see the definition

        int fatClugFac = 10000;     /* cluster allocation group size factor */

    I wonder if this has anything to do with the problem.




  6. Re: The 9997th file specific DOSFS problem


    AHA! Thank you, kind soul.

    I think I have a potential solution.

    Short answer: change fatClugFac to a larger value. If you change
    it to X, then the performance penalty will occur about every X files.
    See the long answer for more details.

    Long answer:
    Whoever that kind soul was has delivered the final piece of the
    puzzle. Here is what I believe is happening. To reduce both
    fragmentation and head-seek penalties, DosFS tries to allocate more
    clusters than are immediately needed. Unused clusters are freed when
    the file is closed. You can find out what this value is by calling
    dosFsShow() and checking the displayed item called something like
    "allocation group size"; I'm pretty sure that command exists on 5.5.
    This value is essentially the number of FAT entries divided by
    fatClugFac. And since VxWorks tracks the last allocated cluster, I
    think this will give us periodic behaviour with slow-downs every
    fatClugFac files (for the most part), as the search for the next free
    cluster will have to "wrap around".

    Playing with some numbers ...
    512 GB disk --> ~2^24 FAT entries (~16 million)
    FAT allocation group size = 2^24 / 10000 = ~1600 FAT entries
    1600 FAT entries @ 32 kB cluster size = ~52 MB

    If your typical file size is 4 MB and you set fatClugFac to roughly
    130000 on a 512 GB disk, then you should see a significant slowdown
    around the 130000th file. If you did not delete any files, this
    should mean that your disk will be almost completely full.
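
    A quick numeric sketch of that estimate, assuming 32 kB clusters and
    the fatClugFac = 10000 default from the posted source fragment (this
    is just the arithmetic, not the dosFs allocator itself):

        /* Rough model of the theory above: allocation group size is
         * numFatEntries / fatClugFac clusters, and (per the theory) the
         * free-cluster search hint wraps about every fatClugFac files. */
        #include <stdio.h>

        int main (void)
        {
            const double diskBytes    = 512e9;            /* ~512 GB volume   */
            const double clusterBytes = 32.0 * 1024.0;    /* 32 kB clusters   */
            const int    fatClugFac   = 10000;            /* dosFsLib default */

            double fatEntries = diskBytes / clusterBytes;      /* ~16 million   */
            double groupSize  = fatEntries / fatClugFac;       /* ~1600 entries */

            printf ("FAT entries            : %.0f\n", fatEntries);
            printf ("allocation group size  : %.0f clusters (~%.0f MB)\n",
                    groupSize, groupSize * clusterBytes / 1e6);
            printf ("expected slowdown every: %d files\n", fatClugFac);
            return 0;
        }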

    Peter

  7. Re: The 9997th file specific DOSFS problem

    Thanks a lot Peter! That makes a lot of sense. Now I am thinking how
    to hack the code to verify your theory and eventually find a
    solution. As I don't have the complete source code, and am not sure
    the source I have is the same version as the binary we are using, one
    easy way seems to be to just overwrite the value of

    pFatDesc->fatGroupSize

    i.e., use dosFsVolDescGet(volumeName) to get the pointer to the
    volume descriptor and overwrite the value of fatGroupSize.

    We do have small files as well as large ones stored. Assuming the
    search-time penalty is not very significant, we can probably increase
    fatGroupSize to a big value. Alternatively, can we also set it to a
    very small value, such that it takes a small performance hit
    periodically rather than one very big hit?

    Here is the dosFsShow output:

    volume descriptor ptr (pVolDesc): 0x42816c0
    cache block I/O descriptor ptr (cbio): 0x42817c0
    auto disk check on mount: NOT ENABLED
    max # of simultaneously open files: 42
    file descriptors in use: 2
    # of different files in use: 2
    # of descriptors for deleted files: 0
    # of obsolete descriptors: 0

    current volume configuration:
    - volume label: test ; (in boot sector: test)
    - volume Id: 0x69747261
    - total number of sectors: 286,749,487
    - bytes per sector: 512
    - # of sectors per cluster: 64
    - # of reserved sectors: 32
    - FAT entry size: FAT32
    - # of sectors per FAT copy: 35,004
    - # of FAT table copies: 2
    - # of hidden sectors: 0
    - first cluster is in sector # 70,040
    - Update last access date for open-read-close = FALSE
    - directory structure: VFAT
    - root dir start cluster: 2

    FAT handler information:
    ------------------------
    - allocation group size: 448 clusters
    - free space on volume: 115,074,957,396 bytes
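
    As an aside, the 448-cluster allocation group shown above lines up
    with the numFatEntries / fatClugFac relation from the previous post,
    assuming fatClugFac is still at its default of 10000 (a rough check,
    ignoring the reserved and FAT areas):

        /* Rough consistency check of the dosFsShow output against the
         * groupSize = numClusters / fatClugFac theory.                 */
        #include <stdio.h>

        int main (void)
        {
            const long totalSectors      = 286749487L;
            const int  sectorsPerCluster = 64;
            const int  fatClugFac        = 10000;   /* default in dosFsLib.c */

            long clusters  = totalSectors / sectorsPerCluster;  /* ~4.48 million */
            long groupSize = clusters / fatClugFac;

            printf ("clusters (approx)     : %ld\n", clusters);
            printf ("clusters / fatClugFac : %ld\n", groupSize); /* 448, as shown */
            return 0;
        }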

    Other than increasing the value of pFatDesc->fatGroupSize, is there a
    not-too-complicated way of improving the search algorithm and
    avoiding the performance hit altogether? Currently we do not delete
    any files from the device, to avoid fragmentation, and it's not a
    concern if we leave some empty sectors here and there.

    Yuejin



  8. Re: The 9997th file specific DOSFS problem

    If you don't have the source code (which from what I gather is your
    case), I can think of two ways to hack this. Someone more clever than
    I can probably think of better ways.

    #1. Increase the value of fatClugFac before the device is mounted
    with DosFS. If it is done after the device has been mounted with
    DosFS, it will have no effect. (Unfortunately 5.x does not support
    removability, something that was added in 6.2.) Off the top of my
    head, the best place for that would probably be the configlette files
    in target/comps/src.
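
    A minimal sketch of what #1 might look like, assuming fatClugFac is
    the plain (non-static) global shown in the dosFsLib.c fragment quoted
    earlier, and that this runs before the RAID volume is mounted:

        /* Sketch of option #1: bump the global before the volume is mounted
         * (e.g. from usrRoot() or a configlette in target/comps/src).
         * Assumes fatClugFac is the global from the quoted dosFsLib.c.     */
        extern int fatClugFac;      /* cluster allocation group size factor */

        void myDosFsTuneEarly (void)
        {
            /* per the theory above: penalty roughly every 50000 files */
            fatClugFac = 50000;
        }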

    #2. Create a routine that will set the value of
    pFatDesc->fatGroupSize. This would be the more flexible, and possibly
    more useful, of the two ways; when the system is up, you would be
    able to tune the number of FAT entries allocated at once at any time
    after the device has been mounted. The smaller the value, the less
    often you should encounter the "big slowdown". However, if it is too
    small you may run into performance issues related to file
    fragmentation or other causes. The optimal value (or values) will
    depend upon your system (# files, file sizes, file access, type of
    file data, ...).
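
    And a very rough sketch of #2, following the dosFsVolDescGet() idea
    from the previous post. The pFatDesc->fatGroupSize access comes from
    the quoted source fragment, needs the private dosFs headers, and must
    match the binary actually in use (it may require a cast to the FAT
    handler's own descriptor type), so treat it as an assumption rather
    than a documented API:

        /* Sketch of option #2: poke the FAT handler's group size at run time. */
        #include "vxWorks.h"
        #include "dosFsLib.h"
        #include "private/dosFsLibP.h"      /* DOS_VOLUME_DESC / pFatDesc */

        STATUS fatGroupSizeSet (char *volName, int newGroupSize)
        {
            DOS_VOLUME_DESC_ID pVolDesc = dosFsVolDescGet (volName, NULL);

            if (pVolDesc == NULL)
                return (ERROR);             /* not a dosFs volume */

            /* Field access per the quoted source; a cast to the handler's
             * private descriptor type may be needed with your headers.   */
            pVolDesc->pFatDesc->fatGroupSize = newGroupSize;

            return (OK);
        }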

    Peter



