Maximum amount of files per directory - Suse



Thread: Maximum amount of files per directory

  1. Maximum amount of files per directory

    Hi there,

    is there a maximum number of files per directory in SUSE Linux?
    Will the performance stay (nearly) the same on a SUSE 9.1 server
    with about 500,000 files in one directory compared to 50,000?

    Thank you for any hint,

    Merlin

  2. Re: Maximum amount of files per directory

    Hi,

    Merlin Morgenstern wrote:
    > is there a maximum number of files per directory in SUSE Linux?


    That depends on your file system.

    > Will the performance stay (nearly) the same on a SUSE 9.1 server
    > with about 500,000 files in one directory compared to 50,000?


    Lookups should be in some O(log n).

    kind regards,
    Andreas

  3. Re: Maximum amount of files per directory

    Andreas schrieb:
    > Hi,
    >
    > Merlin Morgenstern wrote:
    >> is there a maximum number of files per directory in SUSE Linux?

    >
    > That depends on your file system.
    >
    >> Will the performance stay (nearly) the same on a SUSE 9.1 server
    >> with about 500,000 files in one directory compared to 50,000?

    >
    > Lookups should be in some O(log n).
    >
    > kind regards,
    > Andreas


    Hello Andreas,

    thank you for your reply. How can I find out which file system I am
    running on the machine?

    > Lookups should be in some O(log n).


    I am sorry, what does that mean? Can you be more specific?

    Best regards,

    Merlin

  4. Re: Maximum amount of files per directory

    Hi,

    Merlin Morgenstern wrote:
    > thank you for your reply. How can I find out which file system I am
    > running on the machine?


    The mount command or the /etc/fstab file.
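
    For instance, a quick check from a shell (using /home as the example path):

```shell
# Report the filesystem type for the partition holding /home.
# "df -T" prints the type in its second column.
fstype=$(df -T /home | tail -1 | awk '{print $2}')
echo "$fstype"
```

    Running `mount` with no arguments, or `grep /home /etc/fstab`, shows the
    same information.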

    > > Lookups should be in some O(log n).

    >
    > I am sorry, what does that mean? Can you be more specific?


    It means that the time would increase, but not linearly. Having ten times
    as many files won't make lookups take ten times longer.
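
    As a rough illustration of that logarithmic claim (just arithmetic, not a
    benchmark):

```shell
# If a lookup costs about log(n), going from 50,000 to 500,000 files
# multiplies the cost by log(500000)/log(50000), i.e. about 1.21x,
# not 10x.
awk 'BEGIN { printf "%.2f\n", log(500000) / log(50000) }'
# prints 1.21
```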

    kind regards,
    Andreas

  5. Re: Maximum amount of files per directory

    Andreas schrieb:
    > Hi,
    >
    > Merlin Morgenstern wrote:
    >> thank you for your reply. How can I find out which file system I am
    >> running on the machine?

    >
    > The mount command or the /etc/fstab file.
    >
    >>> Lookups should be in some O(log n).

    >> I am sorry, what does that mean? Can you be more specific?

    >
    > It means that the time would increase, but not linearly. Having ten times
    > as many files won't make lookups take ten times longer.
    >
    > kind regards,
    > Andreas


    Hi Andreas,

    the filesystem is:
    /dev/sda7 on /home type ext2 (rw,acl,user_xattr)

    Do you believe I will notice a performance problem, let's say with 500,000
    files in one directory?

    Best regards,

    Merlin

  6. Re: Maximum amount of files per directory

    On 2007-11-06 10:22, Merlin Morgenstern wrote:
    > Andreas schrieb:
    >> Hi,
    >>
    >> Merlin Morgenstern wrote:
    >>> thank you for your reply. How can I find out which file system I am
    >>> running on the machine?

    >>
    >> The mount command or the /etc/fstab file.
    >>
    >>>> Lookups should be in some O(log n).
    >>> I am sorry, what does that mean? Can you be more specific?

    >>
    >> It means that the time would increase, but not linearly. Having ten times
    >> as many files won't make lookups take ten times longer.
    >>
    >> kind regards,
    >> Andreas

    >
    > Hi Andreas,
    >
    > the filesystem is:
    > /dev/sda7 on /home type ext2 (rw,acl,user_xattr)
    >
    > Do you believe I will notice a performance problem, let's say with 500,000
    > files in one directory?
    >
    > Best regards,
    >
    > Merlin


    If you once have 500,000 files in a directory, it will be terribly slow to
    do ls, even after you delete 499,999 of them, since the directory itself
    will be huge and never shrinks when files are removed.
    The only way to clean it up is to make a new directory.
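
    A minimal sketch of that rebuild trick, shown here in a throwaway temp
    directory (the names are made up for the example):

```shell
# A huge directory entry never shrinks on ext2/ext3/xfs, so rebuild it:
cd "$(mktemp -d)"
mkdir data && touch data/a data/b data/c   # stand-in for the big directory
mkdir data.new
mv data/* data.new/     # move the surviving files across
rmdir data              # discard the bloated directory entry
mv data.new data        # rename the fresh one into place
ls data
# prints: a b c (one per line)
```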

    If your application knows the name and opens the file directly, it will not
    be so slow; it's only the other way round, when listing the contents of a
    directory, that it hurts.

    Reiserfs handles this better, and if you turn off acl and user_xattr you
    make it a bit faster (unless you need those features).

    /bb

  7. Re: Maximum amount of files per directory

    On Tue, 06 Nov 2007 08:56:23 +0100, Merlin Morgenstern wrote:

    > Hi there,
    >
    > is there a maximum number of files per directory in SUSE Linux?
    > Will the performance stay (nearly) the same on a SUSE 9.1 server
    > with about 500,000 files in one directory compared to 50,000?
    >
    > Thank you for any hint,
    >
    > Merlin


    I'd suggest you reevaluate the problem - it seldom makes sense to have
    that many files in one directory. What are you trying to do?
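
    One common restructuring, offered here purely as an illustration (the
    bucket scheme and paths are my own, not something suggested in this
    thread): spread the files over subdirectories keyed by a hash prefix of
    the name, so no single directory grows huge.

```shell
# Compute a 2-hex-digit bucket (256 buckets) from the file name,
# then store/look up the file under that bucket directory.
name="123456.jpg"
bucket=$(printf '%s' "$name" | md5sum | cut -c1-2)
echo "/gallery/pictures_thumb/$bucket/$name"
```

    Since the bucket is derived from the name alone, a file can still be
    opened directly without listing anything.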



  8. Re: Maximum amount of files per directory

    On Tue, 2007-11-06 at 09:22 +0100, Andreas wrote:
    > Hi,
    >
    > Merlin Morgenstern wrote:
    > > is there a maximum number of files per directory in SUSE Linux?

    >
    > That depends on your file system.
    >
    > > Will the performance stay (nearly) the same on a SUSE 9.1 server
    > > with about 500,000 files in one directory compared to 50,000?

    >
    > Lookups should be in some O(log n).


    Faster I believe if using reiserfs though (??).... I guess
    I didn't know ext3 was so fast (if that's what you mean).


  9. Re: Maximum amount of files per directory

    On Tue, 6 Nov 2007, birre wrote:-



    >If you once have 500,000 files in a directory, it will be terribly slow to
    >do ls


    Not really. The time is actually taken up by the formatting and
    displaying on a console.

    >, even after you delete 499999 of them, since the directory itself
    >will be huge, and never shrink when removing files.


    That's wrong, or at least it is for reiserfs. Both ext3 and xfs retain the
    directory size, but directories shrink as files are deleted on a
    reiserfs file system.

    >The only way to clean it is to make a new directory.


    For ext3 and xfs, yes. For reiserfs, no.

    ===== start test =====
    davjam@playing:/media/xfs-test> grep /media/xfs-test /etc/fstab
    /dev/system/xfs-test /media/xfs-test xfs defaults 1 2
    davjam@playing:/media/xfs-test> mkdir zzz_directory_test
    davjam@playing:/media/xfs-test> cd zzz_directory_test
    davjam@playing:/media/xfs-test/zzz_directory_test> ls -l ../|tail -1
    drwxr-xr-x 2 davjam users 6 2007-11-06 16:06 zzz_directory_test
    davjam@playing:/media/xfs-test/zzz_directory_test> time /usr/src/packages/SOURCES/create_files

    real 3m24.942s
    user 0m2.152s
    sys 2m39.494s
    davjam@playing:/media/xfs-test/zzz_directory_test> ls -l ../|tail -1
    drwxr-xr-x 2 davjam users 12050432 2007-11-06 16:11 zzz_directory_test
    davjam@playing:/media/xfs-test/zzz_directory_test> find -type f|wc -l
    500000
    davjam@playing:/media/xfs-test/zzz_directory_test> time (ls >/dev/null)
    real 0m5.236s
    user 0m2.760s
    sys 0m0.260s
    davjam@playing:/media/xfs-test/zzz_directory_test> find -type f -name "[0-3]*"|sort -u|xargs rm
    davjam@playing:/media/xfs-test/zzz_directory_test> find -type f -name "4[0-8]*"|xargs rm
    davjam@playing:/media/xfs-test/zzz_directory_test> find -type f -name "49[0-8]*"|xargs rm
    davjam@playing:/media/xfs-test/zzz_directory_test> find -type f -name "499[0-8]*"|xargs rm
    davjam@playing:/media/xfs-test/zzz_directory_test> find -type f -name "4999[0-8]*"|xargs rm
    davjam@playing:/media/xfs-test/zzz_directory_test> find -type f -name "49999[0-8]*"|xargs rm
    davjam@playing:/media/xfs-test/zzz_directory_test> find -type f|wc -l
    1
    davjam@playing:/media/xfs-test/zzz_directory_test> ls -l ../|tail -1
    drwxr-xr-x 2 davjam users 12050432 2007-11-06 16:37 zzz_directory_test
    davjam@playing:/media/xfs-test/zzz_directory_test> cd ../
    davjam@playing:/media/xfs-test> rm -r zzz_directory_test
    davjam@playing:/media/xfs-test> grep /media/share /etc/fstab
    /dev/hdb6 /media/share ext3 rw 1 2
    /dev/sda1 /media/share2 reiserfs rw 1 2
    davjam@playing:/media/xfs-test> cd ../share
    davjam@playing:/media/share> mkdir zzz_directory_test
    davjam@playing:/media/share> cd zzz_directory_test
    davjam@playing:/media/share/zzz_directory_test> ls -l ../|tail -1
    drwxr-xr-x 2 davjam users 4096 2007-11-06 16:55 zzz_directory_test
    davjam@playing:/media/share/zzz_directory_test> time /usr/src/packages/SOURCES/create_files

    real 0m19.244s
    user 0m0.872s
    sys 0m14.901s
    davjam@playing:/media/share/zzz_directory_test> ls -l ../|tail -1
    drwxr-xr-x 2 davjam users 11317248 2007-11-06 16:57 zzz_directory_test
    davjam@playing:/media/share/zzz_directory_test> find -type f|wc -l
    500001
    davjam@playing:/media/share/zzz_directory_test> time (ls >/dev/null)

    real 0m7.144s
    user 0m5.624s
    sys 0m0.752s
    davjam@playing:/media/share/zzz_directory_test> find -type f -name "[0-3]*"|sort -u|xargs rm
    davjam@playing:/media/share/zzz_directory_test> find -type f -name "4[0-8]*"|xargs rm
    davjam@playing:/media/share/zzz_directory_test> find -type f -name "49[0-8]*"|xargs rm
    davjam@playing:/media/share/zzz_directory_test> find -type f -name "499[0-8]*"|xargs rm
    davjam@playing:/media/share/zzz_directory_test> find -type f -name "4999[0-8]*"|xargs rm
    davjam@playing:/media/share/zzz_directory_test> find -type f -name "49999[0-8]*"|xargs rm
    davjam@playing:/media/share/zzz_directory_test> find -type f|wc -l
    1
    davjam@playing:/media/share/zzz_directory_test> ls -l ../|tail -1
    drwxr-xr-x 2 davjam users 11317248 2007-11-06 16:58 zzz_directory_test
    davjam@playing:/media/share/zzz_directory_test> cd ../
    davjam@playing:/media/share> rm -r zzz_directory_test
    davjam@playing:/media/share> cd ../share2
    davjam@playing:/media/share2> mkdir zzz_directory_test
    davjam@playing:/media/share2> cd zzz_directory_test
    davjam@playing:/media/share2/zzz_directory_test> ls -l ../|tail -1
    drwxr-xr-x 2 davjam users 48 2007-11-06 16:59 zzz_directory_test
    davjam@playing:/media/share2/zzz_directory_test> time /usr/src/packages/SOURCES/create_files

    real 0m17.665s
    user 0m0.972s
    sys 0m15.305s
    davjam@playing:/media/share2/zzz_directory_test> ls -l ../|tail -1
    drwxr-xr-x 2 davjam users 12000048 2007-11-06 17:00 zzz_directory_test
    davjam@playing:/media/share2/zzz_directory_test> find -type f|wc -l
    500000
    davjam@playing:/media/share2/zzz_directory_test> time (ls >/dev/null)

    real 0m3.441s
    user 0m2.740s
    sys 0m0.364s
    davjam@playing:/media/share2/zzz_directory_test> find -type f -name "[0-3]*"|sort -u|xargs rm
    davjam@playing:/media/share2/zzz_directory_test> find -type f -name "4[0-8]*"|xargs rm
    davjam@playing:/media/share2/zzz_directory_test> find -type f -name "49[0-8]*"|xargs rm
    davjam@playing:/media/share2/zzz_directory_test> find -type f -name "499[0-8]*"|xargs rm
    davjam@playing:/media/share2/zzz_directory_test> find -type f -name "4999[0-8]*"|xargs rm
    davjam@playing:/media/share2/zzz_directory_test> find -type f -name "49999[0-8]*"|xargs rm
    davjam@playing:/media/share2/zzz_directory_test> find -type f|wc -l
    1
    davjam@playing:/media/share2/zzz_directory_test> ls -l ../|tail -1
    drwxr-xr-x 2 davjam users 72 2007-11-06 17:02 zzz_directory_test
    ===== end test =====

    Oh, and here's the source code for the create_files binary.

    ===== start code =====
    davjam@playing:/media/share2/zzz_directory_test> cat /usr/src/packages/SOURCES/create_files.c
    #include <stdio.h>   /* sprintf(), fopen(), fclose() */
    #include <stdlib.h>  /* exit() */

    char fname[8]; /* only need 6 chars + \0, but round up anyway */

    int
    main(int argc, char **argv)
    {
        FILE *file; /* holds file pointer */
        int count;  /* holds counter value */

        for (count = 0; count < 500000; count++)
        {
            sprintf(fname, "%06u", count);
            if (!(file = fopen(fname, "wb")))
            {
                exit(1);
            }
            fclose(file);
        }
        exit(0);
    }
    ===== end code =====

    Now, the observant amongst you should have noticed that the ext3 count
    is 500001. The reason for this (and I'm going to run the same test on my
    other systems) is that the directory contains two files with the name
    353572. Since it shouldn't be possible to have two files with an
    identical name, this looks like a bug in the ext3 code.


    Regards,
    David Bolt

    --
    www.davjam.org/lifetype/ www.distributed.net: OGR@100Mnodes, RC5-72@15Mkeys
    | SUSE 10.1 32bit | openSUSE 10.2 32bit | openSUSE 10.3 32bit
    SUSE 10.0 64bit | SUSE 10.1 64bit | openSUSE 10.2 64bit |
    RISC OS 3.11 | RISC OS 3.6 | TOS 4.02 | openSUSE 10.3 PPC

  10. Re: Maximum amount of files per directory

    On Tue, 6 Nov 2007, Chris Cox wrote:-

    >On Tue, 2007-11-06 at 09:22 +0100, Andreas wrote:


    >> Lookups should be in some O(log n).

    >
    >Faster I believe if using reiserfs though (??)....


    Doing an "ls >/dev/null" on ext3, reiserfs and xfs with 500000 files in
    a directory, all have fairly similar times. On my 10.1 system, reiserfs
    is the fastest taking only 3.441 seconds. Using ext3 on the same system
    is by far the slowest taking 7.144 seconds, and xfs sits in between at
    4.313 seconds.

    However, actually creating the files takes significantly longer on xfs
    than either reiserfs or ext3. Both reiserfs and ext3 take less than 30
    seconds, compared to the almost 3 and a half minutes taken using xfs.

    >I guess
    >I didn't know ext3 was so fast (if that's what you mean).


    From the limited testing, just to satisfy my curiosity, it doesn't
    appear to be that much slower than reiserfs. In fact, the file deletion
    in ext3 is faster than reiserfs, probably because ext3 doesn't shrink
    the directories down to fit the number of files/directories contained
    within them.


    Regards,
    David Bolt


  11. Re: Maximum amount of files per directory

    On Tue, 2007-11-06 at 17:32 +0000, David Bolt wrote:
    > On Tue, 6 Nov 2007, Chris Cox wrote:-
    >
    > >On Tue, 2007-11-06 at 09:22 +0100, Andreas wrote:

    >
    > >> Lookups should be in some O(log n).

    > >
    > >Faster I believe if using reiserfs though (??)....

    >
    > Doing an "ls >/dev/null" on ext3, reiserfs and xfs with 500000 files in
    > a directory, all have fairly similar times. On my 10.1 system, reiserfs
    > is the fastest taking only 3.441 seconds. Using ext3 on the same system
    > is by far the slowest taking 7.144 seconds, and xfs sits in between at
    > 4.313 seconds.
    >
    > However, actually creating the files takes significantly longer on xfs
    > than either reiserfs or ext3. Both reiserfs and ext3 take less than 30
    > seconds, compared to the almost 3 and a half minutes taken using xfs.
    >
    > >I guess
    > >I didn't know ext3 was so fast (if that's what you mean).

    >
    > From the limited testing, just to satisfy my curiosity, it doesn't
    > appear to be that much slower than reiserfs. In fact, the file deletion
    > in ext3 is faster than reiserfs, probably because ext3 doesn't shrink
    > the directories down to fit the number of files/directories contained
    > within them.


    Could also be tail packing.... you can try disabling that in reiserfs
    and see if that speeds things up.



  12. Re: Maximum amount of files per directory

    On Tue, 6 Nov 2007, Chris Cox wrote:-



    >Could also be tail packing.... you can try disabling that in reiserfs
    >and see if that speeds things up.


    I don't think so since the test files were zero-byte and so there's
    nothing to store except the name, but there's no harm in doing a test to
    find out:

    playing:~ # mkreiserfs /dev/hda7
    mkreiserfs 3.6.19 (2003 www.namesys.com)

    A pair of credits:
    Many persons came to www.namesys.com/support.html, and got a question answered
    for $25, or just gave us a small donation there.

    Jeremy Fitzhardinge wrote the teahash.c code for V3. Colin Plumb also
    contributed to that.


    Guessing about desired format.. Kernel 2.6.16.21-0.25-default is running.
    Format 3.6 with standard journal
    Count of blocks on the device: 1253056
    Number of blocks consumed by mkreiserfs formatting process: 8250
    Blocksize: 4096
    Hash function used to sort names: "r5"
    Journal Size 8193 blocks (first block 18)
    Journal Max transaction length 1024
    inode generation number: 0
    UUID: ee399e30-eef0-4448-866e-aa61306d0538
    ATTENTION: YOU SHOULD REBOOT AFTER FDISK!
    ALL DATA WILL BE LOST ON '/dev/hda7'!
    Continue (y/n):y
    Initializing journal - 0%....20%....40%....60%....80%....100%
    Syncing..ok
    ReiserFS is successfully created on /dev/hda7.
    playing:~ # mount /dev/hda7 /mnt -o notail
    playing:~ # cd /mnt
    playing:/mnt # ls -l /|grep mnt
    drwxr-xr-x 4 root root 80 Nov 6 22:47 mnt
    playing:/mnt # time /usr/src/packages/SOURCES/create_files

    real 0m15.998s
    user 0m0.996s
    sys 0m13.745s
    playing:/mnt # ls -l /|grep mnt
    drwxr-xr-x 4 root root 12000080 Nov 6 22:48 mnt
    playing:/mnt # find -type f | wc -l
    500000
    playing:/mnt # time ( (find -type f -name "[0-3]*"; find -type f -name "4[0-8]*"; find -type f -name "49[0-8]*"; find -type f -name "499[0-8]*";
    find -type f -name "4999[0-8]*"; find -type f -name "49999[0-8]" ) 2>/dev/null | xargs rm)

    real 0m28.672s
    user 0m1.060s
    sys 0m26.614s
    playing:/mnt # find -type f | wc -l
    1
    playing:/mnt # ls -l /|grep mnt
    drwxr-xr-x 4 root root 104 Nov 6 22:50 mnt
    playing:/mnt # cd ~
    playing:~ # umount /mnt

    That's using notail. And now for one using the defaults:

    playing:~ # mkreiserfs /dev/hda7
    mkreiserfs 3.6.19 (2003 www.namesys.com)

    A pair of credits:
    Vladimir Demidov wrote the parser for sys_reiser4(), the V3 alpha port, part of
    the V3 journal relocation code, and helped Hans keep the business side of
    things running.

    Many persons came to www.namesys.com/support.html, and got a question answered
    for $25, or just gave us a small donation there.


    Guessing about desired format.. Kernel 2.6.16.21-0.25-default is running.
    Format 3.6 with standard journal
    Count of blocks on the device: 1253056
    Number of blocks consumed by mkreiserfs formatting process: 8250
    Blocksize: 4096
    Hash function used to sort names: "r5"
    Journal Size 8193 blocks (first block 18)
    Journal Max transaction length 1024
    inode generation number: 0
    UUID: 39e6c102-b64e-43c2-b6f4-914fa5fc9481
    ATTENTION: YOU SHOULD REBOOT AFTER FDISK!
    ALL DATA WILL BE LOST ON '/dev/hda7'!
    Continue (y/n):y
    Initializing journal - 0%....20%....40%....60%....80%....100%
    Syncing..ok
    ReiserFS is successfully created on /dev/hda7.
    playing:~ # mount /dev/hda7 /mnt
    playing:~ # cd /mnt
    playing:/mnt # ls -l /|grep mnt
    drwxr-xr-x 4 root root 80 Nov 6 22:51 mnt
    playing:/mnt # time /usr/src/packages/SOURCES/create_files

    real 0m16.251s
    user 0m1.016s
    sys 0m13.985s
    playing:/mnt # ls -l /|grep mnt
    drwxr-xr-x 4 root root 12000080 Nov 6 22:52 mnt
    playing:/mnt # find -type f | wc -l
    500000
    playing:/mnt # time ( (find -type f -name "[0-3]*"; find -type f -name "4[0-8]*"; find -type f -name "49[0-8]*"; find -type f -name "499[0-8]*";
    find -type f -name "4999[0-8]*"; find -type f -name "49999[0-8]" ) 2>/dev/null | xargs rm)

    real 0m29.607s
    user 0m1.044s
    sys 0m27.502s
    playing:/mnt # find -type f | wc -l
    1
    playing:/mnt # ls -l /|grep mnt
    drwxr-xr-x 4 root root 104 Nov 6 22:53 mnt

    As for a re-test using ext3:

    playing:/mnt # cd ~
    playing:~ # umount /mnt
    playing:~ # mke2fs -j /dev/hda7
    mke2fs 1.38 (30-Jun-2005)
    Filesystem label=
    OS type: Linux
    Block size=4096 (log=2)
    Fragment size=4096 (log=2)
    627744 inodes, 1253062 blocks
    62653 blocks (5.00%) reserved for the super user
    First data block=0
    39 block groups
    32768 blocks per group, 32768 fragments per group
    16096 inodes per group
    Superblock backups stored on blocks:
    32768, 98304, 163840, 229376, 294912, 819200, 884736

    Writing inode tables: done
    Creating journal (32768 blocks): done
    Writing superblocks and filesystem accounting information: done

    This filesystem will be automatically checked every 20 mounts or
    180 days, whichever comes first. Use tune2fs -c or -i to override.
    playing:~ # mount /dev/hda7 /mnt
    playing:~ # cd /mnt
    playing:/mnt # ls -l /|grep mnt
    drwxr-xr-x 3 root root 4096 Nov 6 22:55 mnt
    playing:/mnt # time /usr/src/packages/SOURCES/create_files

    And, after almost 90 minutes, there are still around 110,000 files to create!

    Previous tests were done using a sub-directory, so I'm guessing that
    while ext3 can handle large numbers of files in directories, it doesn't
    like having a huge number of files created in the top level of a file
    system.

    Once it's finished, I'll reformat the partition and perform the same
    test using a sub-directory again. Unless there's something very strange
    going on, I expect the times to be much faster.


    Regards,
    David Bolt


  13. Re: Maximum amount of files per directory

    On Wed, 7 Nov 2007, David Bolt wrote:-

    >And, after almost 90 minutes, there are still around 110,000 files to create!


    After another hour of waiting:

    playing:/mnt # time /usr/src/packages/SOURCES/create_files

    real 152m48.024s
    user 0m5.172s
    sys 144m57.292s
    playing:/mnt # ls -l /|grep mnt
    drwxr-xr-x 3 root root 8003584 Nov 7 01:29 mnt
    playing:/mnt # find -type f | wc -l
    500000
    playing:/mnt # time ( (find -type f -name "[0-3]*"; find -type f -name "4[0-8]*"; find -type f -name "49[0-8]*"; find -type f -name "499[0-8]*";
    find -type f -name "4999[0-8]*"; find -type f -name "49999[0-8]" ) 2>/dev/null | xargs rm)

    real 0m11.388s
    user 0m0.876s
    sys 0m8.597s
    playing:/mnt # ls -l /|grep mnt
    drwxr-xr-x 3 root root 8003584 Nov 7 02:23 mnt
    playing:/mnt # find -type f | wc -l
    1
    playing:/mnt # cd ~
    playing:~ # umount /mnt

    >Previous tests were done using a sub-directory, so I'm guessing that
    >while ext3 can handle large numbers of files in directories, it doesn't
    >like having a huge number of files created in the top level of file
    >system.


    Well, it can't handle _creating_ large numbers of files in . so...

    >Once it's finished, I'll reformat the partition and perform the same
    >test using a sub-directory again. Unless there's something very strange
    >going on, I expect the times to be much faster.


    And it's taking its time again. Damned confusing, especially since a
    previous test was performed using a file system on /dev/hda and it
    wasn't this slow. It's something to look into later tomorrow.


    Regards,
    David Bolt


  14. Re: Maximum amount of files per directory

    Hello David,

    that is quite an amount of testing you have done. Can you summarize
    what you found out?

    My basic question was:
    Will it be noticeably slower to access a file with a given name inside a
    folder with 500,000 files than in one that has only 5,000 files?

    Right now my structure looks like this:
    /gallery/$cc/pictures_thumb
    /gallery/$cc/pictures_400
    /gallery/$cc/pictures_600
    /gallery/$cc/pictures_1024
    Where $cc is a country code. There are about 260 country codes and
    therefore 260*4 = 1040 directories.

    I would like to switch to a structure that only contains the 4 folders
    /gallery/pictures_thumb
    /gallery/pictures_400
    /gallery/pictures_600
    /gallery/pictures_1024

    The reason for this is that I believe it will be faster to copy the
    files for backup with tar and with cp.

    The files are only accessed with a given name.

    Do you believe I will gain if I switch to the new structure with 4
    folders but more files in it?

    Best regards,

    Merlin



    David Bolt schrieb:
    > On Wed, 7 Nov 2007, David Bolt wrote:-
    >
    >> And, after almost 90 minutes, it's still around 110000 files to create!

    >
    > After another hour of waiting:
    >
    > playing:/mnt # time /usr/src/packages/SOURCES/create_files
    >
    > real 152m48.024s
    > user 0m5.172s
    > sys 144m57.292s
    > playing:/mnt # ls -l /|grep mnt
    > drwxr-xr-x 3 root root 8003584 Nov 7 01:29 mnt
    > playing:/mnt # find -type f | wc -l
    > 500000
    > playing:/mnt # time ( (find -type f -name "[0-3]*"; find -type f -name "4[0-8]*"; find -type f -name "49[0-8]*"; find -type f -name "499[0-8]*";
    > find -type f -name "4999[0-8]*"; find -type f -name "49999[0-8]" ) 2>/dev/null | xargs rm)
    >
    > real 0m11.388s
    > user 0m0.876s
    > sys 0m8.597s
    > playing:/mnt # ls -l /|grep mnt
    > drwxr-xr-x 3 root root 8003584 Nov 7 02:23 mnt
    > playing:/mnt # find -type f | wc -l
    > 1
    > playing:/mnt # cd ~
    > playing:~ # umount /mnt
    >
    >> Previous tests were done using a sub-directory, so I'm guessing that
    >> while ext3 can handle large numbers of files in directories, it doesn't
    >> like having a huge number of files created in the top level of file
    >> system.

    >
    > Well, it can't handle _creating_ large numbers of files in . so...
    >
    >> Once it's finished, I'll reformat the partition and perform the same
    >> test using a sub-directory again. Unless there's something very strange
    >> going on, I expect the times to be much faster.

    >
    > And it's taking its time again. Damned confusing, especially since a
    > previous test was performed using a file system on /dev/hda and it
    > wasn't this slow. It's something to look into later tomorrow.
    >
    >
    > Regards,
    > David Bolt
    >


  15. Re: Maximum amount of files per directory

    On 2007-11-06 18:21, David Bolt wrote:
    > On Tue, 6 Nov 2007, birre wrote:-
    >
    >
    >
    >> If you once have 500,000 files in a directory, it will be terribly slow to
    >> do ls

    >
    > Not really. The time is actually taken up by the formatting and
    > displaying on a console.
    >
    >> , even after you delete 499999 of them, since the directory itself
    >> will be huge, and never shrink when removing files.

    >
    > That's wrong, or it is for reiserfs. Both ext3 and xfs retain the
    > directory size, but directories shrink as files are deleted on a
    > reiserfs file system.


    Merlin wrote that he is using ext2, and that is what I was replying to.
    ext2 is very fast for normal operations, but not so much fun after a
    computer crash.

    I recommended reiserfs, but I'm not sure, since I don't know how
    he uses the filesystem.

    Reiser has no directories or inodes in the same form as the others;
    it's more like a database that delivers the results to the OS,
    so it's no surprise the directory does not include deleted files.
    ( No lost+found either, only loss is possible :-)


    > drwxr-xr-x 2 davjam users 12050432 2007-11-06 16:11


    Ok, I can buy that a normal machine can maybe handle a directory 12MB
    big, and it's not as terribly slow as I wrote, but all the tests you did
    ran from RAM, since everything was cached.

    I have seen machines that don't respond for 30 seconds when doing ls -l.

    /bb

  16. Re: Maximum amount of files per directory

    On Wed, 7 Nov 2007, birre wrote:-

    >On 2007-11-06 18:21, David Bolt wrote:


    >> That's wrong, or it is for reiserfs. Both ext3 and xfs retain the
    >> directory size, but directories shrink as files are deleted on a
    >> reiserfs file system.

    >
    >Merlin wrote that he is using ext2, and that is what I was replying to.
    >ext2 is very fast for normal operations, but not so much fun after a
    >computer crash.


    That I can attest to. Several times, with much earlier versions of SuSE
    (6.x, 7.x), I had a crash and had to endure the forced fsck :-| The good
    news was that I quite rarely actually lost anything. More often than not
    I could figure out what the "found" files were supposed to be.

    >I recommended reiserfs, but I'm not sure, since I don't know how
    >he uses the filesystem.


    Looking at his reply to me, I think either ext3 or reiserfs would
    probably do just as well as each other.

    >Reiser has no directories or inodes in the same form as the others;
    >it's more like a database that delivers the results to the OS,
    >so it's no surprise the directory does not include deleted files.
    >( No lost+found either, only loss is possible :-)


    Well, I don't know. I've not yet lost anything using reiserfs, whereas
    with ext2/3 I've had a lost+found containing several thousand recovered
    inodes.

    >> drwxr-xr-x 2 davjam users 12050432 2007-11-06 16:11

    >
    >Ok, I can buy that a normal machine can maybe handle a directory 12MB
    >big, and it's not as terribly slow as I wrote, but all the tests you
    >did ran from RAM, since everything was cached.


    Okay, one more test, this time with some attempt to get the stuff out of
    the cache :-)
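
    For what it's worth, on kernels from 2.6.16 onwards there is a more
    direct way to evict the page/dentry/inode caches before timing (needs
    root; this is the documented /proc interface, not something used in the
    transcript below):

```shell
sync                                   # flush dirty pages to disk first
if [ "$(id -u)" -eq 0 ]; then
    echo 3 > /proc/sys/vm/drop_caches  # 3 = pagecache + dentries/inodes
fi
```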

    davjam@playing:/usr/src/packages/SOURCES/zzz_directory_test> grep $(df .|tail -1|awk '{print $1}') /etc/fstab
    /dev/hda12 /usr/src reiserfs defaults 1 2
    davjam@playing:/usr/src/packages/SOURCES/zzz_directory_test> time ../create_files

    real 0m17.315s
    user 0m0.932s
    sys 0m13.685s
    davjam@playing:/usr/src/packages/SOURCES/zzz_directory_test> free -t
    total used free shared buffers cached
    Mem: 995580 972656 22924 0 100532 82924
    -/+ buffers/cache: 789200 206380
    Swap: 4964036 391812 4572224
    Total: 5959616 1364468 4595148
    davjam@playing:/usr/src/packages/SOURCES/zzz_directory_test> find / -type f &>/dev/null
    davjam@playing:/usr/src/packages/SOURCES/zzz_directory_test> free -t
    total used free shared buffers cached
    Mem: 995580 984804 10776 0 213408 38952
    -/+ buffers/cache: 732444 263136
    Swap: 4964036 392964 4571072
    Total: 5959616 1377768 4581848
    davjam@playing:/usr/src/packages/SOURCES/zzz_directory_test> time ls
    ....
    045447 090902 136357 181812 227267 272722 318177 363632 409087 454542 499997
    045448 090903 136358 181813 227268 272723 318178 363633 409088 454543 499998
    045449 090904 136359 181814 227269 272724 318179 363634 409089 454544 499999
    045450 090905 136360 181815 227270 272725 318180 363635 409090 454545
    045451 090906 136361 181816 227271 272726 318181 363636 409091 454546
    045452 090907 136362 181817 227272 272727 318182 363637 409092 454547
    045453 090908 136363 181818 227273 272728 318183 363638 409093 454548
    045454 090909 136364 181819 227274 272729 318184 363639 409094 454549

    real 0m30.632s
    user 0m4.876s
    sys 0m6.088s
    davjam@playing:/usr/src/packages/SOURCES/zzz_directory_test> find / -type f &>/dev/null
    davjam@playing:/usr/src/packages/SOURCES/zzz_directory_test> time ls -l
    ....
    -rw-r--r-- 1 davjam users 0 2007-11-07 14:42 499990
    -rw-r--r-- 1 davjam users 0 2007-11-07 14:42 499991
    -rw-r--r-- 1 davjam users 0 2007-11-07 14:42 499992
    -rw-r--r-- 1 davjam users 0 2007-11-07 14:42 499993
    -rw-r--r-- 1 davjam users 0 2007-11-07 14:42 499994
    -rw-r--r-- 1 davjam users 0 2007-11-07 14:42 499995
    -rw-r--r-- 1 davjam users 0 2007-11-07 14:42 499996
    -rw-r--r-- 1 davjam users 0 2007-11-07 14:42 499997
    -rw-r--r-- 1 davjam users 0 2007-11-07 14:42 499998
    -rw-r--r-- 1 davjam users 0 2007-11-07 14:42 499999

    real 0m53.351s
    user 0m5.556s
    sys 0m8.141s
    davjam@playing:/usr/src/packages/SOURCES/zzz_directory_test> find / -maxdepth 4 -type f 2>/dev/null | wc -l
    25202
    davjam@playing:/usr/src/packages/SOURCES/zzz_directory_test> find / -maxdepth 5 -type f 2>/dev/null | wc -l
    193293
    davjam@playing:/usr/src/packages/SOURCES/zzz_directory_test> free -t
    total used free shared buffers cached
    Mem: 995580 985588 9992 0 103668 32280
    -/+ buffers/cache: 849640 145940
    Swap: 4964036 154036 4810000
    Total: 5959616 1139624 4819992
    davjam@playing:/usr/src/packages/SOURCES/zzz_directory_test> cd ../
    davjam@playing:/usr/src/packages/SOURCES> time rm -rf zzz_directory_test

    real 0m32.773s
    user 0m0.168s
    sys 0m31.022s
    davjam@playing:/usr/src/packages/SOURCES> popd
    ~
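    The `create_files` helper itself never appears in the thread; judging
    from the output it produced (half a million zero-byte files with
    zero-padded six-digit names), a hypothetical reconstruction could be as
    simple as:

```shell
#!/bin/sh
# Hypothetical reconstruction of create_files: makes $2 empty files named
# 000000, 000001, ... inside directory $1. The thread's runs used 500000.
create_files() {
    dir=$1 count=$2 i=0
    mkdir -p "$dir"
    while [ "$i" -lt "$count" ]; do
        : > "$dir/$(printf '%06d' "$i")"   # ':' redirect creates an empty file
        i=$((i + 1))
    done
}

# small demonstration run (pass 500000 to reproduce the thread's test)
create_files /tmp/zzz_directory_demo 100
ls /tmp/zzz_directory_demo | wc -l         # should print 100
```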

    Okay, that's using reiserfs. Now for ext3:

    davjam@playing:~> pushd /media/share
    /media/share ~
    davjam@playing:/media/share> grep $(df .|tail -1|awk '{print $1}') /etc/fstab
    /dev/hdb6 /media/share ext3 rw 1 2
    davjam@playing:/media/share> mkdir -p zzz_directory_test
    davjam@playing:/media/share> cd zzz_directory_test
    davjam@playing:/media/share/zzz_directory_test> time /usr/src/packages/SOURCES/create_files

    real 0m21.155s
    user 0m0.832s
    sys 0m15.273s
    davjam@playing:/media/share/zzz_directory_test> find -type f | wc -l
    500001
    davjam@playing:/media/share/zzz_directory_test> ls -l .. | tail -1
    drwxr-xr-x 2 davjam users 11317248 2007-11-07 15:07 zzz_directory_test
    davjam@playing:/media/share/zzz_directory_test> free -t
    total used free shared buffers cached
    Mem: 995580 964460 31120 0 58952 28080
    -/+ buffers/cache: 877428 118152
    Swap: 4964036 66000 4898036
    Total: 5959616 1030460 4929156
    davjam@playing:/media/share/zzz_directory_test> find / -maxdepth 3 -type f &>/dev/null
    davjam@playing:/media/share/zzz_directory_test> free -t
    total used free shared buffers cached
    Mem: 995580 974336 21244 0 68044 28152
    -/+ buffers/cache: 878140 117440
    Swap: 4964036 66000 4898036
    Total: 5959616 1040336 4919280
    davjam@playing:/media/share/zzz_directory_test> free -t
    total used free shared buffers cached
    Mem: 995580 984864 10716 0 50124 29560
    -/+ buffers/cache: 905180 90400
    Swap: 4964036 0 4964036
    Total: 5959616 984864 4974752
    davjam@playing:/media/share/zzz_directory_test> find / -maxdepth 3 -type f 2>/dev/null | wc -l
    10244
    davjam@playing:/media/share/zzz_directory_test> time ls -l
    ....
    -rw-r--r-- 1 davjam users 0 2007-11-07 15:07 499990
    -rw-r--r-- 1 davjam users 0 2007-11-07 15:07 499991
    -rw-r--r-- 1 davjam users 0 2007-11-07 15:07 499992
    -rw-r--r-- 1 davjam users 0 2007-11-07 15:07 499993
    -rw-r--r-- 1 davjam users 0 2007-11-07 15:07 499994
    -rw-r--r-- 1 davjam users 0 2007-11-07 15:07 499995
    -rw-r--r-- 1 davjam users 0 2007-11-07 15:07 499996
    -rw-r--r-- 1 davjam users 0 2007-11-07 15:07 499997
    -rw-r--r-- 1 davjam users 0 2007-11-07 15:07 499998
    -rw-r--r-- 1 davjam users 0 2007-11-07 15:07 499999

    real 11m50.084s
    user 0m8.313s
    sys 0m9.145s
    davjam@playing:/media/share/zzz_directory_test> free -t
    total used free shared buffers cached
    Mem: 995580 843504 152076 0 16956 64860
    -/+ buffers/cache: 761688 233892
    Swap: 4964036 4656 4959380
    Total: 5959616 848160 5111456
    davjam@playing:/media/share/zzz_directory_test> free -t
    total used free shared buffers cached
    Mem: 995580 845604 149976 0 17052 66228
    -/+ buffers/cache: 762324 233256
    Swap: 4964036 0 4964036
    Total: 5959616 845604 5114012
    davjam@playing:/media/share/zzz_directory_test> cd ../
    davjam@playing:/media/share> find / -maxdepth 3 -type f 2>/dev/null | wc -l
    10244
    davjam@playing:/media/share> time rm -rf zzz_directory_test

    real 8m42.703s
    user 0m0.140s
    sys 0m14.921s

    Okay, with ext3 it's roughly 16 times slower deleting the directory
    (8m43s versus 33s), and about 13 times slower at producing a directory
    listing.
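    One factor worth checking (not mentioned in the thread): part of ext3's
    poor showing here may be the absence of the dir_index (hashed b-tree
    directories) feature, which older ext3 filesystems were often created
    without. A sketch of how to check and enable it with tune2fs follows;
    the device name is the one from the transcript above, so substitute
    your own, and the conversion must be done as root on an unmounted
    filesystem:

```shell
#!/bin/sh
DEV=/dev/hdb6   # device from the transcript; substitute your own
if command -v tune2fs >/dev/null 2>&1 && [ -r "$DEV" ]; then
    # list the feature flags; look for "dir_index" among them
    tune2fs -l "$DEV" | grep -i 'features'
else
    echo "skipping: tune2fs unavailable or $DEV not readable here"
fi
# To enable it and rebuild indexes for existing directories
# (unmounted filesystem, as root):
#   tune2fs -O dir_index /dev/hdb6
#   e2fsck -fD /dev/hdb6
```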

    >I have seen machines that don't respond for 30 sec. when doing ls -l.


    As have I, and if you look at the time above, you'll see it took almost
    12 minutes to do a directory listing on ext3, compared to less than a
    minute on reiserfs.

    As for the delay, my guess is that ls's sorting and formatting of the
    output for the console really takes its toll. After all, it was a lot
    faster when the output was sent to either a file or /dev/null.
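    That guess is easy to test: GNU ls accepts -f, which disables sorting
    entirely (and also implies -a). A small sketch comparing a sorted and an
    unsorted listing of the same large directory, scaled down to 1000 files
    here (the thread used 500000):

```shell
#!/bin/sh
# Compare a sorted listing against an unsorted one (ls -f) to see how much
# of the wall time is ls itself rather than the filesystem.
dir=/tmp/zzz_ls_demo
mkdir -p "$dir"
i=0
while [ "$i" -lt 1000 ]; do            # scaled down from the thread's 500000
    : > "$dir/$(printf '%06d' "$i")"
    i=$((i + 1))
done
time ls "$dir"    > /dev/null          # sorted
time ls -f "$dir" > /dev/null          # unsorted (also implies -a)
```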


    Regards,
    David Bolt

    --
    www.davjam.org/lifetype/ www.distributed.net: OGR@100Mnodes, RC5-72@15Mkeys
    | SUSE 10.1 32bit | openSUSE 10.2 32bit | openSUSE 10.3 32bit
    SUSE 10.0 64bit | SUSE 10.1 64bit | openSUSE 10.2 64bit |
    RISC OS 3.11 | RISC OS 3.6 | TOS 4.02 | openSUSE 10.3 PPC

  17. Re: Maximum amount of files per directory

    Andreas wrote:
    >Hi,
    >
    >Merlin Morgenstern wrote:
    >> thank you for your reply. How can I find out which file system I am
    >> running on the machine?
    >
    >The mount command or the /etc/fstab file.
    >
    >> > Lookups should be in some O(log n).
    >>
    >> I am sorry, what does that mean? Can you be more specific?
    >
    >It means that the time would increase, but not linearly. Having ten times
    >as many files won't take ten times longer.


    But it will take something like two or three times longer.
    --
    --- Paul J. Gans

  18. Re: Maximum amount of files per directory

    On Wed, 07 Nov 2007 17:07:20 +0000, David Bolt wrote:

    >>Reiser has no directories or inodes in the same form as the others;
    >>it's more like a database that delivers the result to the OS,
    >>so it's no surprise the directory does not include deleted files.
    >>( No lost+found either, only loss is possible

    >
    > Well, I don't know. I've not yet lost anything using reiserfs, whereas
    > with ext2/3 I've had a lost+found containing several thousand recovered
    > inodes.


    My experience is exactly the opposite. The only time I ever suffered
    severe data loss after a power surge (nearby lightning strike, surge went
    straight through surge protector!) was with reiserfs, whereas ext3 has
    always been able to recover cleanly.
