How to calculate number of files on a disk - VMS


  1. How to calculate number of files on a disk

    Hello all,

    I'm working on gathering disk stats. I've used sys$getdviw to gather
    freeblocks, maxblocks, maxfiles, errorcount, mountcount...

    But I'd like to also get the number of files on a disk as well, in
    case I'm getting close to maxfiles, or in case the number of files
    increases/decreases over time.


    A sample of C code would be appreciated...


    Thanks,

    Lyndon

  2. Re: How to calculate number of files on a disk


    Use DFU. http://www.digiater.nl/dfu.html
    Spawn that if need be.

    Of course you _could_ open [000000]INDEXF.SYS and read
    maxfiles/4096 blocks starting at VBN cluster_size*4+1.
    Then proceed to count the bits.
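
    A minimal sketch of the bit-counting step only, in portable C. It
    assumes the bitmap blocks have already been read into memory; the read
    from INDEXF.SYS itself is not shown, and count_set_bits is a
    hypothetical helper name, not a VMS routine. On ODS-2 a set bit is
    generally documented as marking a file number in use, so the result
    would approximate the number of headers in use.

```c
#include <stddef.h>

/* Count the 1 bits in a buffer, for example header-bitmap blocks that
 * were already read from INDEXF.SYS. Hypothetical helper; the disk read
 * is deliberately left out. */
static unsigned long count_set_bits(const unsigned char *buf, size_t len)
{
    unsigned long total = 0;
    size_t i;

    for (i = 0; i < len; i++) {
        unsigned char b = buf[i];
        while (b) {
            /* Clear the lowest set bit; one iteration per 1 bit. */
            b &= (unsigned char)(b - 1);
            total++;
        }
    }
    return total;
}
```

    Subtracting the result from maxfiles would then give a rough count of
    free header slots.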

    hth,
    Hein.




  3. Re: How to calculate number of files on a disk


    Extract the following code to count_files.c.
    Compile and link as follows:

    $ cc count_files
    $ link count_files

    An example run would be:

    $ write sys$output f$getdvi ("dsa0:", "devnam")
    _DSA0:
    $ set process/priv=readall
    $ mcr []count_files dsa0:
    Number of files on dsa0: is 41376


    -------------------begin code---------------------------
    #define __NEW_STARLET 1

    /* Header names are inferred from the identifiers used below. */
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include <descrip.h>
    #include <namdef.h>
    #include <rmsdef.h>
    #include <stsdef.h>
    #include <lib$routines.h>

    #ifndef TRUE
    #define TRUE 1
    #define FALSE 0
    #endif

    #define MAX_DEVICE_LEN 64

    /* Signal any VMS condition value that does not indicate success. */
    #define errchk_sig(arg) if (!$VMS_STATUS_SUCCESS(arg)) (void)lib$signal(arg);

    /******************************************************************************/
    static void usage (void) {

        (void)printf ("Usage: $ mcr []count_files device_name\n"
                      "\twhere device name is the name of a disk, including\n"
                      "\tthe colon. For example, DSA0:\n");
    }

    /******************************************************************************/
    int main (int argc, char *argv[]) {

        static int r0_status;
        static unsigned int file_count = 0;
        static unsigned int context = 0;
        static int finished = FALSE;

        static char wild[] = "[000000...]*.*;*";
        static char file[NAML$C_MAXRSS+1];
        static char spec[MAX_DEVICE_LEN+sizeof(wild)];

        static struct dsc$descriptor_s spec_d = { 0,
                                                  DSC$K_DTYPE_T,
                                                  DSC$K_CLASS_S,
                                                  spec };
        static struct dsc$descriptor_s file_d = { 0,
                                                  DSC$K_DTYPE_T,
                                                  DSC$K_CLASS_S,
                                                  file };

        /* Reject a missing device name, or one too long for spec[]. */
        if (argc < 2 || strlen (argv[1]) > MAX_DEVICE_LEN) {
            usage ();
            exit (EXIT_FAILURE);
        }

        spec_d.dsc$w_length = sprintf (spec, "%s%s", argv[1], wild);

        /* Walk every directory entry on the volume, counting each hit. */
        while (!finished) {
            file_d.dsc$w_length = NAML$C_MAXRSS;
            r0_status = lib$find_file (&spec_d,
                                       &file_d,
                                       &context,
                                       0,
                                       0,
                                       0,
                                       0);
            if (r0_status == RMS$_NMF) {
                finished = TRUE;
                continue;
            } else {
                errchk_sig (r0_status);
            }
            file_count++;
        }

        /* Release the search context held by lib$find_file. */
        (void)lib$find_file_end (&context);

        (void)printf ("Number of files on %s is %u\n",
                      argv[1],
                      file_count);
        return (EXIT_SUCCESS);
    }
    -----------end code----------------------------------

    Additional examples of calls to system services and LIB$ routines can be
    found at

    You might like to look at the LIB$SET_SYMBOL call to set a DCL symbol
    rather than printing out the result.

    Cheers,
    Jim.
    --
    www.eight-cubed.com

  4. Re: How to calculate number of files on a disk

    Jim Duff wrote:
    > Extract the following code to count_files.c.


    If I understand this correctly, this is doing a wildcard lookup
    on [000000...]*.*;* and counting the hits, i.e. traversing
    the directory tree. It won't find files that aren't in a directory
    (lost files), and it will count multiple times any file that has
    alias entries (produced by set file/enter...)
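
    One way around the alias double-counting, sketched under the
    assumption that the walk also records each entry's file ID (how the
    caller obtains the IDs is not shown). The struct and function names
    here are hypothetical. Sorting the IDs and counting distinct ones
    makes an aliased file count once; lost files are still missed.

```c
#include <stdlib.h>
#include <stddef.h>

/* A Files-11 file ID: file number, sequence number, relative volume. */
struct file_id {
    unsigned short num;
    unsigned short seq;
    unsigned short rvn;
};

/* Total order on file IDs so qsort can group duplicates together. */
static int fid_compare(const void *a, const void *b)
{
    const struct file_id *x = a;
    const struct file_id *y = b;

    if (x->num != y->num) return x->num < y->num ? -1 : 1;
    if (x->seq != y->seq) return x->seq < y->seq ? -1 : 1;
    if (x->rvn != y->rvn) return x->rvn < y->rvn ? -1 : 1;
    return 0;
}

/* Sorts fids in place and returns the number of distinct file IDs. */
static size_t count_unique_fids(struct file_id *fids, size_t n)
{
    size_t i, unique;

    if (n == 0)
        return 0;
    qsort(fids, n, sizeof fids[0], fid_compare);
    unique = 1;
    for (i = 1; i < n; i++) {
        if (fid_compare(&fids[i - 1], &fids[i]) != 0)
            unique++;
    }
    return unique;
}
```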

    What the OP didn't ask for, but what he actually needs is the
    number of file headers available. The only way to get this is
    to count the clear bits in the bitmap (as Hein proposed.) This
    could still be off, since the system may not be able to actually
    allocate that many headers for new files. If the free space is
    badly fragmented, the new files will also be badly fragmented and
    if severe enough, many of them may require multiple headers to
    map all the fragments. Also, if you use lots of ACLs, this can
    increase the number of headers required for an average file.
    A third limit: if INDEXF.SYS has not yet been fully extended, then
    by the time more headers are needed the disk may already be full,
    or so fragmented that INDEXF.SYS would require another header of
    its own, which I think is prohibited. Every file requires at least one
    header, but it is hard to predict how many will be required for
    future files!




    --
    John Santos
    Evans Griffiths & Hart, Inc.
    781-861-0670 ext 539

  5. Re: How to calculate number of files on a disk

    I was wondering if, due to caching, could the bitmap be out of date?


  6. Re: How to calculate number of files on a disk

    On 7 May, 10:01, IanMiller wrote:
    > I was wondering if, due to caching, could the bitmap be out of date?


    Not necessarily a bad assumption to make, but wouldn't this be risky
    for the OS? If it doesn't know at some level or other what headers
    are available and which aren't it's potentially going to try and use
    one that's already being used...

  7. Re: How to calculate number of files on a disk


    Not C code, but wouldn't DIR/GRAND [000000...] be a starting point?
    --
    David Biddulph



  8. Re: How to calculate number of files on a disk

    On May 7, 7:47 am, "David Biddulph" wrote:
    > Not C code, but wouldn't DIR/GRAND [000000...] be a starting point?
    > David Biddulph


    That's what our friend Jim Duff already proposed.

    It is not entirely correct for the reasons outlined by John Santos

    John S> What the OP didn't ask for, but what he actually needs is the
    number of file headers available. The only way to get this is
    to count the clear bits in the bitmap.

    Exactly.

    Ian Miller asked: "I was wondering if, due to caching, could the
    bitmap be out of date?"

    It would be, if you opened INDEXF read-only, but it should be 'close
    enough' for the intended usage purpose.

    If a tool, such as the one briefly outlined, were to open INDEXF with
    WRITE access, then the XQP would flush the caches right there.
    See... Black Bible 8.6.7 User Invalidation of Cached Buffers

    Hein.







  9. Re: How to calculate number of files on a disk


    First, I thank all of you for your responses. Especially Jim Duff, I
    found your sight, and your examples have been invaluable...!

    Second, I should further define my needs/limitations.

    I'm writing an extension onto the T4 monitoring suite. So:
    1. It must be very fast.
    2. Must use very little I/O.
    3. It must be very fast.

    As others have stated, I don't need an absolutely perfectly accurate
    number. Inaccuracy due to caching, etc. is acceptable. I simply want a
    real close number I can track.

    Oh, did I mention it has to be fast???

    T4, by default, samples every second, so I need to be able to loop
    through all the disks on a system, and get this information (If it's
    mounted) and write this info to an output csv file every sampling.

    This is the last piece of the performance data I want to gather, it's
    a "It'd be nice to have." If I can't gather this data quickly and
    efficiently, I can leave it out of the final product.


    Thanks in advance,

    Lyndon

  10. Re: How to calculate number of files on a disk

    In article <8bc540b0-5e7c-4853-9ada-6e255430a87a@e53g2000hsa.googlegroups.com>, IanMiller writes:
    > I was wondering if, due to caching, could the bitmap be out of date?
    >


    The bitmap on the platters is always out of date when the disk is mounted
    unless /nowrite was used in the mount (or hardware write lock is used).

    The bitmap will show allocated blocks that are actually preallocated
    (in the cache for future use). The bitmap is corrected on proper
    dismount, mount without /norebuild, or an explicit set volume/rebuild.

    So you don't have to worry about blocks in use that are marked free,
    but if you need a rebuild and haven't done it yet there are blocks
    marked used that aren't in use.

    I generally mount/norebuild and do the explicit rebuild once a week
    prior to full backups (overnight) at work, and let the rebuild happen
    during mount at home.

    It is possible (I've seen it about twice in the last 30 years) for
    hardware failures to corrupt the bitmap, leading to blocks marked
    free that actually have data in them. No file system can prevent
    that kind of hardware failure. That's one of the reasons I do
    backups.



  11. Re: How to calculate number of files on a disk

    On May 7, 9:38 am, lyndonbart...@yahoo.com wrote:
    > On May 6, 10:29 pm, lyndonbart...@yahoo.com wrote:
    >
    > > Hello all,

    >
    > > I'm working on gathering disk stats. I've used sys$getdviw to gather
    > > freeblocks, maxblocks, maxfiles, errorcount, mountcount...


    Along with the cluster size, that's all you need to know to read the
    header bitmap in a single QIO, or IO_PERFORM, from INDEXF.SYS.

    Open the file by FID (1,1) or better still, keep it open.
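
    The arithmetic behind that read can be sketched as below. It assumes
    the usual ODS-2 layout, with the header bitmap starting at VBN
    cluster_size*4+1 and holding one bit per possible file, so 4096 bits
    per 512-byte block. The struct and function names are hypothetical;
    both inputs would come from the GETDVI values already being gathered.

```c
#include <stdint.h>

#define BITS_PER_BLOCK 4096u   /* 512 bytes * 8 bits */

/* Where the header bitmap sits inside INDEXF.SYS, in 512-byte blocks. */
struct bitmap_extent {
    uint32_t start_vbn;    /* first virtual block of the bitmap (1-based) */
    uint32_t block_count;  /* number of blocks the bitmap occupies */
};

/* Assumed layout: bitmap begins right after four clusters of boot, home
 * and backup blocks, and is sized from maxfiles, rounded up to a block. */
static struct bitmap_extent header_bitmap_extent(uint32_t cluster_size,
                                                 uint32_t max_files)
{
    struct bitmap_extent e;

    e.start_vbn = cluster_size * 4u + 1u;
    e.block_count = max_files / BITS_PER_BLOCK
                  + (max_files % BITS_PER_BLOCK != 0u ? 1u : 0u);
    return e;
}
```

    The extent is fixed for a mounted volume, so it only needs computing
    once per disk.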


    > First, I thank all of you for your responses. Especially Jim Duff, I
    > found your sight, and your examples have been invaluable...!


    "website" or "insight" would work.

    > Oh, did I mention it has to be fast???
    > T4, by default, samples every second, ...


    Isn't it every minute?

    Anyway, either interval rules out any [*...] walk.

    IMHO it also rules out opening INDEXF.SYS for write,
    because I would not want to risk flushing those caches every second.

    I would just open for read and accept any inaccuracies.
    They can (rightfully!) be explained as 'active files'.

    > This is the last piece of the performance data I want to gather, it's
    > a "It'd be nice to have." If I can't gather this data quickly and
    > efficiently, I can leave it out of the final product.


    You could perhaps fake it (cache it!) somewhat.

    Get the easy (GETDVI) stuff every interval.
    Read the bitmaps every 30 or 60 intervals and just repeat the same
    info 29 (59) times more.
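
    A small sketch of that "read rarely, repeat often" idea. All names
    are hypothetical: read_count_fn stands for whatever routine does the
    expensive bitmap read, and fake_reader is only a stand-in so the
    sketch is self-contained.

```c
#include <stdint.h>

/* Hypothetical callback that performs the expensive bitmap read. */
typedef uint32_t (*read_count_fn)(void *ctx);

struct cached_count {
    uint32_t value;          /* last value returned by the reader */
    unsigned age;            /* samples served since the last real read */
    unsigned refresh_every;  /* do one real read per this many samples */
    int valid;               /* 0 until the first real read has happened */
};

/* Return the cached value, refreshing it once every refresh_every calls. */
static uint32_t sample_count(struct cached_count *c, read_count_fn read,
                             void *ctx)
{
    if (!c->valid || c->age >= c->refresh_every) {
        c->value = read(ctx);
        c->age = 0;
        c->valid = 1;
    }
    c->age++;
    return c->value;
}

/* Stand-in reader for illustration: counts its calls and returns a
 * value derived from the call number. */
static uint32_t fake_reader(void *ctx)
{
    unsigned *calls = ctx;

    *calls += 1;
    return *calls * 100u;
}
```

    With refresh_every set to 30 or 60, the sampler would hit the disk
    once and hand back the same number for the remaining intervals.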

    Bob K> The bitmap will show allocated blocks that are actually
    preallocated

    Bob, it seems to me you are referring to the STORAGE bitmap from
    BITMAP.SYS, where every bit represents an allocation cluster.
    We are talking about the header bitmap, which lives inside INDEXF.SYS,
    and where each set bit represents a header block that is in use
    (or held in the cache) and each cleared bit represents a free one.

    Hein.

  12. RE: How to calculate number of files on a disk

    >...
    > As others have stated, I don't need an absolutely perfectly accurate
    > number. Inaccuracy due to caching, etc. is acceptable. I simply want a
    > real close number I can track.
    >
    > Oh, did I mention it has to be fast???
    >
    > T4, by default, samples every second, so I need to be able to loop
    > through all the disks on a system, and get this information (If it's
    > mounted) and write this info to an output csv file every sampling.
    >...


    Just because T4 records some information fast (I have not checked
    but I do not remember any T4 data being sampled every second, but
    it has been a couple of years since I last did anything serious
    with T4) doesn't mean that every piece of information needs to be
    collected that fast. I sample the disk space information once a
    day using T4 and that is good enough for my application, other
    sites may want the information recorded less often or more often.


    Peter Weaver
    www.WeaverConsulting.ca www.OpenVMSvirtualization.com
    www.VAXvirtualization.com www.AlphaVirtualization.com


  13. Re: How to calculate number of files on a disk


    > > Oh, did I mention it has to be fast???
    > > T4, by default, samples every second, ...

    >
    > Isn't it every minute?
    >



    You're right.. My typo. I meant every minute.


    But my earlier desires for high speed and low I/O still apply. I do
    NOT want the process of collecting data to become a performance burden.
    I don't want T4 collecting to become a statistic on the performance
    graphs.

  14. Re: How to calculate number of files on a disk

    Hein RMS van den Heuvel wrote:
    > Anyway, either interval rules out any [*...] walk.
    >
    > IMHO it also rules out opening INDEXF.SYS for write,
    > because I would not want to risk flushing those caches every second.
    >
    > I would just open for read and accept any inaccuracies.
    > They can (rightfully!) be explained as 'active files'.

    Of course, knowing the speed requirements for this now, doing
    lib$find_file or alternatively, sys$search (so we can eliminate aliased
    files) is out of the question.

    However, reading the bitmap is only going to give us an approximation as
    well, as the bitmap does not distinguish between primary and extension
    headers.

    To get an accurate number, you would have to take out the volume
    serialization lock, read the bitmap, and read each in-use header to see
    if it's a primary header, then release the volume serialization lock.
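
    The per-header check could look roughly like this. It assumes the
    ODS-2 header keeps its segment number (FH2$W_SEG_NUM) as a
    little-endian word at byte offset 4, with zero meaning a primary
    header; that offset is an assumption to verify against FH2DEF, and
    is_primary_header is a hypothetical name. Taking and releasing the
    volume lock and reading the headers are not shown.

```c
#include <stdint.h>

/* Assumed byte offset of the segment-number word (FH2$W_SEG_NUM) in an
 * ODS-2 file header. Check FH2DEF before relying on this value. */
#define SEG_NUM_OFFSET 4

/* Returns 1 if the header block looks like a primary header (segment
 * number 0), or 0 if it looks like an extension header. The caller is
 * expected to pass a full 512-byte header block already in memory. */
static int is_primary_header(const unsigned char *hdr)
{
    uint16_t seg = (uint16_t)(hdr[SEG_NUM_OFFSET] |
                              (hdr[SEG_NUM_OFFSET + 1] << 8));

    return seg == 0;
}
```

    Counting only the in-use headers that pass this test would give
    files rather than headers, at the cost of one read per header.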

    Cheers,
    Jim.
    --
    www.eight-cubed.com

  15. Re: How to calculate number of files on a disk

    On May 7, 5:55 pm, Jim Duff wrote:
    :
    > Cheers,
    > Jim.
    > --
    > www.eight-cubed.com


    Jim, I tried to Email you but got a 'Mailbox full' message.
    Fix that, or send me an Email from another account to reply to?
    Nothing special, cheers, Hein.


  16. Re: How to calculate number of files on a disk

    Hein RMS van den Heuvel wrote:
    >
    > Jim, I tried to Email you but got a 'Mailbox full' message?
    > Fix that or send me an Email from an other account to reply to?
    > Nothing special, cheers, Hein.
    >


    Fixed. Thanks for mentioning this. I hate spam. I especially hate
    spam when I get 4000+ of them in less than an hour :-(

    Cheers,
    Jim.
    --
    www.eight-cubed.com

  17. Re: How to calculate number of files on a disk

    On May 7, 5:55 pm, Jim Duff wrote:
    > Of course, knowing the speed requirements for this now, doing
    > lib$find_file or alternatively, sys$search (so we can eliminate aliased
    > files) is out of the question.
    >
    > However, reading the bitmap is only going to give us an approximation as
    > well, as the bitmap does not distinguish between primary and extension
    > headers.
    >
    > To get an accurate number, you would have to take out the volume
    > serialization lock, read the bitmap, and read each in-use header to see
    > if it's a primary header, then release the volume serialization lock.
    >
    > Cheers,
    > Jim.
    > --www.eight-cubed.com


    But isn't the number of headers what he really needs? Isn't that what
    he's worried about running out of? My question is: How does he get so
    many headers in the first place?

    AEF

  18. Re: How to calculate number of files on a disk


    "Bob Koehler" wrote in message
    news:rAPCDs2gP637@eisner.encompasserve.org...
    > In article <8bc540b0-5e7c-4853-9ada-6e255430a87a@e53g2000hsa.googlegroups.com>, IanMiller writes:
    > > I was wondering if, due to caching, could the bitmap be out of date?
    >
    > The bitmap on the platters is always out of date when the disk is mounted
    > unless /nowrite was used in the mount (or hardware write lock is used).
    >
    > The bitmap will show allocated blocks that are actually preallocated
    > (in the cache for future use). The bitmap is corrected on proper
    > dismount, mount without /norebuild, or an explicit set volume/rebuild.
    >
    > So you don't have to worry about blocks in use that are marked free,
    > but if you need a rebuild and haven't done it yet there are blocks
    > marked used that aren't in use.


    It's been a while since I read the fiche on this, but back when I was
    writing utilities to read & modify the Files-11 disk metadata files, opening
    them for write caused them to flush their contents to disk. Don't know if
    this is true past V4.4...it's been a while. Naturally, if you open them for
    write, you have to Be Careful.
    --
    Lee K. Gleason N5ZMR
    Control-G Consultants
    lee.gleason@comcast.net


