Maximum amount of files per directory - Suse
-
Maximum amount of files per directory
Hi there,
is there a maximum number of files per directory in SUSE Linux?
Will the performance stay (nearly) the same on a SUSE 9.1 server
with about 500000 files in one directory compared to 50000?
Thank you for any hint,
Merlin
-
Re: Maximum amount of files per directory
Hi,
Merlin Morgenstern wrote:
> is there a maximum number of files per directory in SUSE Linux?
Depends on your file system.
> Will the performance stay (nearly) the same on a SUSE 9.1 server
> with about 500000 files in one directory compared to 50000?
Lookups should be in some O(log n).
kind regards,
Andreas
-
Re: Maximum amount of files per directory
Hello Andreas,
thank you for your reply. How can I find out which file system I am
running on the machine?
> Lookups should be in some O(log n).
I am sorry, what does that mean? Can you be more specific?
Best regards,
Merlin
-
Re: Maximum amount of files per directory
Hi,
Merlin Morgenstern wrote:
> thank you for your reply. How can I find out which file system I am
> running on the machine?
Check the output of the mount command, or look in the /etc/fstab file.
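For illustration only: on Linux, mount prints one line per file system in the form "device on mountpoint type fstype (options)", so the type is the word after "type". A minimal sketch of pulling it out follows; the helper name fstype_from_mount_line is made up here and is not a standard tool.

```sh
# Hypothetical helper: print the file system type from one line of `mount` output.
fstype_from_mount_line() {
    printf '%s\n' "$1" | awk '{
        # the type is the field that follows the literal word "type"
        for (i = 1; i < NF; i++) if ($i == "type") { print $(i + 1); exit }
    }'
}

fstype_from_mount_line '/dev/sda7 on /home type ext2 (rw,acl,user_xattr)'   # prints ext2
```

On a live system the same idea would be applied to the line of mount output for the mount point in question.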
> > Lookups should be in some O(log n).
>
> I am sorry, what does that mean? Can you be more specific?
It means that the time would increase, but not linearly. Having ten times
as many files won't take ten times longer.
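To put a rough number on that: under a pure O(log n) model, with constant factors and caching ignored, the relative cost of one lookup is log(500000) / log(50000). The sketch below just does that arithmetic with awk; log_ratio is a made-up helper name. It only applies if the file system really keeps an indexed directory; one that scans the entries one after another would behave closer to O(n).

```sh
# Ratio log(big) / log(small): relative cost of a single lookup under an O(log n) model.
log_ratio() {
    awk -v small="$1" -v big="$2" 'BEGIN { printf "%.2f\n", log(big) / log(small) }'
}

log_ratio 50000 500000   # prints 1.21
```

So ten times as many files would cost about 21% more per lookup in this idealised model, not ten times more.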
kind regards,
Andreas
-
Re: Maximum amount of files per directory
Hi Andreas,
the filesystem is:
/dev/sda7 on /home type ext2 (rw,acl,user_xattr)
Do you believe I will notice a performance problem with, let's say, 500,000
files in one directory?
Best regards,
Merlin
-
Re: Maximum amount of files per directory
On 2007-11-06 10:22, Merlin Morgenstern wrote:
> the filesystem is:
> /dev/sda7 on /home type ext2 (rw,acl,user_xattr)
>
> Do you believe I will notice a performance problem with, let's say, 500,000
> files in one directory?
Once you have had 500000 files in a directory, ls will be terribly slow,
even after you delete 499999 of them, since the directory itself
will be huge and never shrinks when files are removed.
The only way to clean it up is to make a new directory.
If your application knows the file name and opens the file directly, it will not be so slow;
the slowness shows up the other way round, when listing the contents of a directory.
Reiserfs handles this better, and turning off acl and xattr makes
it a bit faster (unless you need those features).
/bb
-
Re: Maximum amount of files per directory
On Tue, 06 Nov 2007 08:56:23 +0100, Merlin Morgenstern wrote:
> Hi there,
>
> is there a maximum number of files per directory in SUSE Linux?
> Will the performance stay (nearly) the same on a SUSE 9.1 server
> with about 500000 files in one directory compared to 50000?
>
> Thank you for any hint,
>
> Merlin
I'd suggest you reevaluate the problem - it seldom makes sense to have
that many files in one directory. What are you trying to do?
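One common way to avoid a single huge directory, sketched here under assumptions that are not from this thread, is to spread the files over subdirectories named after the first characters of the file name. The helper name shard_path, the /srv/files root and the two-character prefix are all hypothetical choices for illustration.

```sh
# Hypothetical layout: <root>/<first two characters of the name>/<name>
shard_path() {
    root=$1
    name=$2
    prefix=$(printf '%s' "$name" | cut -c1-2)   # first two characters of the file name
    printf '%s/%s/%s\n' "$root" "$prefix" "$name"
}

shard_path /srv/files 353572.jpg   # prints /srv/files/35/353572.jpg
```

With six-digit numeric names this would give at most 100 subdirectories, each holding a fraction of the files, while a file can still be found from its name alone.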
-
Re: Maximum amount of files per directory
On Tue, 2007-11-06 at 09:22 +0100, Andreas wrote:
> Hi,
>
> Merlin Morgenstern wrote:
> > is there a maximum number of files per directory in SUSE Linux?
>
> Depends on your file system.
>
> > Will the performance stay (nearly) the same on a SUSE 9.1 server
> > with about 500000 files in one directory compared to 50000?
>
> Lookups should be in some O(log n).
Faster I believe if using reiserfs though (??).... I guess
I didn't know ext3 was so fast (if that's what you mean).
-
Re: Maximum amount of files per directory
On Tue, 6 Nov 2007, birre wrote:-
>If you once have 500000 files in a directory, it will be terrible slow to
>do ls
Not really. The time is actually taken up by the formatting and
displaying on a console.
>, even after you delete 499999 of them, since the directory itself
>will be huge, and never shrink when removing files.
That's wrong, at least for reiserfs. Both ext3 and xfs retain the
directory size, but directories shrink as files are deleted on a
reiserfs file system.
>The only way to clean it is to make a new directory.
For ext3 and xfs, yes. For reiserfs, no.
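For the file systems where the directory does not shrink, "make a new directory" could look roughly like the sketch below. rebuild_dir is a made-up helper name, it assumes GNU or BSD find (for -mindepth and -maxdepth), and it does not copy the owner, mode or ACLs of the old directory, so treat it as an outline rather than a finished tool.

```sh
# Hypothetical helper: replace a bloated directory with a fresh one holding the same entries.
rebuild_dir() {
    old=${1%/}                       # strip one trailing slash, if any
    new="${old}.rebuild.$$"          # temporary sibling directory
    mkdir "$new" || return 1
    # Move every surviving entry across, one at a time, so argument-list limits never matter.
    find "$old" -mindepth 1 -maxdepth 1 -exec mv {} "$new"/ \;
    rmdir "$old" || return 1         # fails (and stops here) if anything was left behind
    mv "$new" "$old"
}

# Small demonstration in a scratch area.
scratch=$(mktemp -d)
mkdir "$scratch/zzz_directory_test"
: > "$scratch/zzz_directory_test/499999"
rebuild_dir "$scratch/zzz_directory_test" && ls "$scratch/zzz_directory_test"   # prints 499999
rm -r "$scratch"
```

Nothing should be writing to the directory while this runs, since the old name briefly does not exist.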
===== start test =====
davjam@playing:/media/xfs-test> grep /media/xfs-test /etc/fstab
/dev/system/xfs-test /media/xfs-test xfs defaults 1 2
davjam@playing:/media/xfs-test> mkdir zzz_directory_test
davjam@playing:/media/xfs-test> cd zzz_directory_test
davjam@playing:/media/xfs-test/zzz_directory_test> ls -l ../|tail -1
drwxr-xr-x 2 davjam users 6 2007-11-06 16:06 zzz_directory_test
davjam@playing:/media/xfs-test/zzz_directory_test> time /usr/src/packages/SOURCES/create_files
real 3m24.942s
user 0m2.152s
sys 2m39.494s
davjam@playing:/media/xfs-test/zzz_directory_test> ls -l ../|tail -1
drwxr-xr-x 2 davjam users 12050432 2007-11-06 16:11 zzz_directory_test
davjam@playing:/media/xfs-test/zzz_directory_test> find -type f|wc -l
500000
davjam@playing:/media/xfs-test/zzz_directory_test> time (ls >/dev/null)
real 0m5.236s
user 0m2.760s
sys 0m0.260s
davjam@playing:/media/xfs-test/zzz_directory_test> find -type f -name "[0-3]*"|sort -u|xargs rm
davjam@playing:/media/xfs-test/zzz_directory_test> find -type f -name "4[0-8]*"|xargs rm
davjam@playing:/media/xfs-test/zzz_directory_test> find -type f -name "49[0-8]*"|xargs rm
davjam@playing:/media/xfs-test/zzz_directory_test> find -type f -name "499[0-8]*"|xargs rm
davjam@playing:/media/xfs-test/zzz_directory_test> find -type f -name "4999[0-8]*"|xargs rm
davjam@playing:/media/xfs-test/zzz_directory_test> find -type f -name "49999[0-8]*"|xargs rm
davjam@playing:/media/xfs-test/zzz_directory_test> find -type f|wc -l
1
davjam@playing:/media/xfs-test/zzz_directory_test> ls -l ../|tail -1
drwxr-xr-x 2 davjam users 12050432 2007-11-06 16:37 zzz_directory_test
davjam@playing:/media/xfs-test/zzz_directory_test> cd ../
davjam@playing:/media/xfs-test> rm -r zzz_directory_test
davjam@playing:/media/xfs-test> grep /media/share /etc/fstab
/dev/hdb6 /media/share ext3 rw 1 2
/dev/sda1 /media/share2 reiserfs rw 1 2
davjam@playing:/media/xfs-test> cd ../share
davjam@playing:/media/share> mkdir zzz_directory_test
davjam@playing:/media/share> cd zzz_directory_test
davjam@playing:/media/share/zzz_directory_test> ls -l ../|tail -1
drwxr-xr-x 2 davjam users 4096 2007-11-06 16:55 zzz_directory_test
davjam@playing:/media/share/zzz_directory_test> time /usr/src/packages/SOURCES/create_files
real 0m19.244s
user 0m0.872s
sys 0m14.901s
davjam@playing:/media/share/zzz_directory_test> ls -l ../|tail -1
drwxr-xr-x 2 davjam users 11317248 2007-11-06 16:57 zzz_directory_test
davjam@playing:/media/share/zzz_directory_test> find -type f|wc -l
500001
davjam@playing:/media/share/zzz_directory_test> time (ls >/dev/null)
real 0m7.144s
user 0m5.624s
sys 0m0.752s
davjam@playing:/media/share/zzz_directory_test> find -type f -name "[0-3]*"|sort -u|xargs rm
davjam@playing:/media/share/zzz_directory_test> find -type f -name "4[0-8]*"|xargs rm
davjam@playing:/media/share/zzz_directory_test> find -type f -name "49[0-8]*"|xargs rm
davjam@playing:/media/share/zzz_directory_test> find -type f -name "499[0-8]*"|xargs rm
davjam@playing:/media/share/zzz_directory_test> find -type f -name "4999[0-8]*"|xargs rm
davjam@playing:/media/share/zzz_directory_test> find -type f -name "49999[0-8]*"|xargs rm
davjam@playing:/media/share/zzz_directory_test> find -type f|wc -l
1
davjam@playing:/media/share/zzz_directory_test> ls -l ../|tail -1
drwxr-xr-x 2 davjam users 11317248 2007-11-06 16:58 zzz_directory_test
davjam@playing:/media/share/zzz_directory_test> cd ../
davjam@playing:/media/share> rm -r zzz_directory_test
davjam@playing:/media/share> cd ../share2
davjam@playing:/media/share2> mkdir zzz_directory_test
davjam@playing:/media/share2> cd zzz_directory_test
davjam@playing:/media/share2/zzz_directory_test> ls -l ../|tail -1
drwxr-xr-x 2 davjam users 48 2007-11-06 16:59 zzz_directory_test
davjam@playing:/media/share2/zzz_directory_test> time /usr/src/packages/SOURCES/create_files
real 0m17.665s
user 0m0.972s
sys 0m15.305s
davjam@playing:/media/share2/zzz_directory_test> ls -l ../|tail -1
drwxr-xr-x 2 davjam users 12000048 2007-11-06 17:00 zzz_directory_test
davjam@playing:/media/share2/zzz_directory_test> find -type f|wc -l
500000
davjam@playing:/media/share2/zzz_directory_test> time (ls >/dev/null)
real 0m3.441s
user 0m2.740s
sys 0m0.364s
davjam@playing:/media/share2/zzz_directory_test> find -type f -name "[0-3]*"|sort -u|xargs rm
davjam@playing:/media/share2/zzz_directory_test> find -type f -name "4[0-8]*"|xargs rm
davjam@playing:/media/share2/zzz_directory_test> find -type f -name "49[0-8]*"|xargs rm
davjam@playing:/media/share2/zzz_directory_test> find -type f -name "499[0-8]*"|xargs rm
davjam@playing:/media/share2/zzz_directory_test> find -type f -name "4999[0-8]*"|xargs rm
davjam@playing:/media/share2/zzz_directory_test> find -type f -name "49999[0-8]*"|xargs rm
davjam@playing:/media/share2/zzz_directory_test> find -type f|wc -l
1
davjam@playing:/media/share2/zzz_directory_test> ls -l ../|tail -1
drwxr-xr-x 2 davjam users 72 2007-11-06 17:02 zzz_directory_test
===== end test =====
Oh, and here's the source code for the create_files binary.
===== start code =====
davjam@playing:/media/share2/zzz_directory_test> cat /usr/src/packages/SOURCES/create_files.c
#include <stdio.h>
#include <stdlib.h>

char fname[8]; /* only need 6 chars + \0, but round up anyway */

int
main(int argc, char **argv)
{
    FILE *file; /* holds file pointer */
    int count;  /* holds counter value */

    /* create 500000 empty files named 000000 .. 499999 */
    for (count = 0; count < 500000; count++)
    {
        sprintf(fname, "%06d", count);
        if (!(file = fopen(fname, "wb")))
        {
            exit(1);
        }
        fclose(file);
    }
    exit(0);
}
===== end code =====
Now, the observant amongst you will have noticed that the ext3 count
is 500001. The reason for this (and I'm going to run the same test on my
other systems) is that the directory contains two files with the name
353572. Since it shouldn't have two files with an identical name, this
looks like a bug in the ext3 code.
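A quick way to spot such a repeated name in a listing, as a minimal sketch: sort the names and keep only those that occur more than once. duplicate_names is a made-up helper name, and the sample input below is invented apart from the name 353572.

```sh
# Print every name that appears more than once on standard input.
duplicate_names() {
    sort | uniq -d
}

printf '%s\n' 353571 353572 353572 353573 | duplicate_names   # prints 353572
```

In the test directory the input would come from something like `find -type f`, piped into the function.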
Regards,
David Bolt
--
www.davjam.org/lifetype/ www.distributed.net: OGR@100Mnodes, RC5-72@15Mkeys
| SUSE 10.1 32bit | openSUSE 10.2 32bit | openSUSE 10.3 32bit
SUSE 10.0 64bit | SUSE 10.1 64bit | openSUSE 10.2 64bit |
RISC OS 3.11 | RISC OS 3.6 | TOS 4.02 | openSUSE 10.3 PPC
-
Re: Maximum amount of files per directory
On Tue, 6 Nov 2007, Chris Cox wrote:-
>On Tue, 2007-11-06 at 09:22 +0100, Andreas wrote:
>> Lookups should be in some O(log n).
>
>Faster I believe if using reiserfs though (??)....
Doing an "ls >/dev/null" on ext3, reiserfs and xfs with 500000 files in
a directory, all have fairly similar times. On my 10.1 system, reiserfs
is the fastest taking only 3.441 seconds. Using ext3 on the same system
is by far the slowest taking 7.144 seconds, and xfs sits in between at
4.313 seconds.
However, actually creating the files takes significantly longer on xfs
than either reiserfs or ext3. Both reiserfs and ext3 take less than 30
seconds, compared to the almost 3 and a half minutes taken using xfs.
>I guess
>I didn't know ext3 was so fast (if that's what you mean).
From the limited testing, just to satisfy my curiosity, it doesn't
appear to be that much slower than reiserfs. In fact, the file deletion
in ext3 is faster than reiserfs, probably because ext3 doesn't shrink
the directories down to fit the number of files/directories contained
within them.
Regards,
David Bolt
-
Re: Maximum amount of files per directory
On Tue, 2007-11-06 at 17:32 +0000, David Bolt wrote:
> On Tue, 6 Nov 2007, Chris Cox wrote:-
>
> >On Tue, 2007-11-06 at 09:22 +0100, Andreas wrote:
>
> >> Lookups should be in some O(log n).
> >
> >Faster I believe if using reiserfs though (??)....
>
> Doing an "ls >/dev/null" on ext3, reiserfs and xfs with 500000 files in
> a directory, all have fairly similar times. On my 10.1 system, reiserfs
> is the fastest taking only 3.441 seconds. Using ext3 on the same system
> is by far the slowest taking 7.144 seconds, and xfs sits in between at
> 4.313 seconds.
>
> However, actually creating the files takes significantly longer on xfs
> than either reiserfs or ext3. Both reiserfs and ext3 take less than 30
> seconds, compared to the almost 3 and a half minutes taken using xfs.
>
> >I guess
> >I didn't know ext3 was so fast (if that's what you mean).
>
> From the limited testing, just to satisfy my curiosity, it doesn't
> appear to be that much slower than reiserfs. In fact, the file deletion
> in ext3 is faster than reiserfs, probably because ext3 doesn't shrink
> the directories down to fit the number of files/directories contained
> within them.
Could also be tail packing.... you can try disabling that in reiserfs
and see if that speeds things up.
-
Re: Maximum amount of files per directory
On Tue, 6 Nov 2007, Chris Cox wrote:-
>Could also be tail packing.... you can try disabling that in reiserfs
>and see if that speeds things up.
I don't think so since the test files were zero-byte and so there's
nothing to store except the name, but there's no harm in doing a test to
find out:
playing:~ # mkreiserfs /dev/hda7
mkreiserfs 3.6.19 (2003 www.namesys.com)
A pair of credits:
Many persons came to www.namesys.com/support.html, and got a question answered
for $25, or just gave us a small donation there.
Jeremy Fitzhardinge wrote the teahash.c code for V3. Colin Plumb also
contributed to that.
Guessing about desired format.. Kernel 2.6.16.21-0.25-default is running.
Format 3.6 with standard journal
Count of blocks on the device: 1253056
Number of blocks consumed by mkreiserfs formatting process: 8250
Blocksize: 4096
Hash function used to sort names: "r5"
Journal Size 8193 blocks (first block 18)
Journal Max transaction length 1024
inode generation number: 0
UUID: ee399e30-eef0-4448-866e-aa61306d0538
ATTENTION: YOU SHOULD REBOOT AFTER FDISK!
ALL DATA WILL BE LOST ON '/dev/hda7'!
Continue (y/n):y
Initializing journal - 0%....20%....40%....60%....80%....100%
Syncing..ok
ReiserFS is successfully created on /dev/hda7.
playing:~ # mount /dev/hda7 /mnt -o notail
playing:~ # cd /mnt
playing:/mnt # ls -l /|grep mnt
drwxr-xr-x 4 root root 80 Nov 6 22:47 mnt
playing:/mnt # time /usr/src/packages/SOURCES/create_files
real 0m15.998s
user 0m0.996s
sys 0m13.745s
playing:/mnt # ls -l /|grep mnt
drwxr-xr-x 4 root root 12000080 Nov 6 22:48 mnt
playing:/mnt # find -type f | wc -l
500000
playing:/mnt # time ( (find -type f -name "[0-3]*"; find -type f -name "4[0-8]*"; find -type f -name "49[0-8]*"; find -type f -name "499[0-8]*"; find -type f -name "4999[0-8]*"; find -type f -name "49999[0-8]" ) 2>/dev/null | xargs rm)
real 0m28.672s
user 0m1.060s
sys 0m26.614s
playing:/mnt # find -type f | wc -l
1
playing:/mnt # ls -l /|grep mnt
drwxr-xr-x 4 root root 104 Nov 6 22:50 mnt
playing:/mnt # cd ~
playing:~ # umount /mnt
That's using notail. And now for one using the defaults:
playing:~ # mkreiserfs /dev/hda7
mkreiserfs 3.6.19 (2003 www.namesys.com)
A pair of credits:
Vladimir Demidov wrote the parser for sys_reiser4(), the V3 alpha port, part of
the V3 journal relocation code, and helped Hans keep the business side of
things running.
Many persons came to www.namesys.com/support.html, and got a question answered
for $25, or just gave us a small donation there.
Guessing about desired format.. Kernel 2.6.16.21-0.25-default is running.
Format 3.6 with standard journal
Count of blocks on the device: 1253056
Number of blocks consumed by mkreiserfs formatting process: 8250
Blocksize: 4096
Hash function used to sort names: "r5"
Journal Size 8193 blocks (first block 18)
Journal Max transaction length 1024
inode generation number: 0
UUID: 39e6c102-b64e-43c2-b6f4-914fa5fc9481
ATTENTION: YOU SHOULD REBOOT AFTER FDISK!
ALL DATA WILL BE LOST ON '/dev/hda7'!
Continue (y/n):y
Initializing journal - 0%....20%....40%....60%....80%....100%
Syncing..ok
ReiserFS is successfully created on /dev/hda7.
playing:~ # mount /dev/hda7 /mnt
playing:~ # cd /mnt
playing:/mnt # ls -l /|grep mnt
drwxr-xr-x 4 root root 80 Nov 6 22:51 mnt
playing:/mnt # time /usr/src/packages/SOURCES/create_files
real 0m16.251s
user 0m1.016s
sys 0m13.985s
playing:/mnt # ls -l /|grep mnt
drwxr-xr-x 4 root root 12000080 Nov 6 22:52 mnt
playing:/mnt # find -type f | wc -l
500000
playing:/mnt # time ( (find -type f -name "[0-3]*"; find -type f -name "4[0-8]*"; find -type f -name "49[0-8]*"; find -type f -name "499[0-8]*"; find -type f -name "4999[0-8]*"; find -type f -name "49999[0-8]" ) 2>/dev/null | xargs rm)
real 0m29.607s
user 0m1.044s
sys 0m27.502s
playing:/mnt # find -type f | wc -l
1
playing:/mnt # ls -l /|grep mnt
drwxr-xr-x 4 root root 104 Nov 6 22:53 mnt
As for a re-test using ext3:
playing:/mnt # cd ~
playing:~ # umount /mnt
playing:~ # mke2fs -j /dev/hda7
mke2fs 1.38 (30-Jun-2005)
Filesystem label=
OS type: Linux
Block size=4096 (log=2)
Fragment size=4096 (log=2)
627744 inodes, 1253062 blocks
62653 blocks (5.00%) reserved for the super user
First data block=0
39 block groups
32768 blocks per group, 32768 fragments per group
16096 inodes per group
Superblock backups stored on blocks:
32768, 98304, 163840, 229376, 294912, 819200, 884736
Writing inode tables: done
Creating journal (32768 blocks): done
Writing superblocks and filesystem accounting information: done
This filesystem will be automatically checked every 20 mounts or
180 days, whichever comes first. Use tune2fs -c or -i to override.
playing:~ # mount /dev/hda7 /mnt
playing:~ # cd /mnt
playing:/mnt # ls -l /|grep mnt
drwxr-xr-x 3 root root 4096 Nov 6 22:55 mnt
playing:/mnt # time /usr/src/packages/SOURCES/create_files
And, after almost 90 minutes, there are still around 110000 files left to create!
Previous tests were done using a sub-directory, so I'm guessing that
while ext3 can handle large numbers of files in directories, it doesn't
like having a huge number of files created in the top level of the file
system.
Once it's finished, I'll reformat the partition and perform the same
test using a sub-directory again. Unless there's something very strange
going on, I expect the times to be much faster.
Regards,
David Bolt
-
Re: Maximum amount of files per directory
On Wed, 7 Nov 2007, David Bolt wrote:-
>And, after almost 90 minutes, it's still around 110000 files to create!
After another hour of waiting:
playing:/mnt # time /usr/src/packages/SOURCES/create_files
real 152m48.024s
user 0m5.172s
sys 144m57.292s
playing:/mnt # ls -l /|grep mnt
drwxr-xr-x 3 root root 8003584 Nov 7 01:29 mnt
playing:/mnt # find -type f | wc -l
500000
playing:/mnt # time ( (find -type f -name "[0-3]*"; find -type f -name "4[0-8]*"; find -type f -name "49[0-8]*"; find -type f -name "499[0-8]*"; find -type f -name "4999[0-8]*"; find -type f -name "49999[0-8]" ) 2>/dev/null | xargs rm)
real 0m11.388s
user 0m0.876s
sys 0m8.597s
playing:/mnt # ls -l /|grep mnt
drwxr-xr-x 3 root root 8003584 Nov 7 02:23 mnt
playing:/mnt # find -type f | wc -l
1
playing:/mnt # cd ~
playing:~ # umount /mnt
>Previous tests were done using a sub-directory, so I'm guessing that
>while ext3 can handle large numbers of files in directories, it doesn't
>like having a huge number of files created in the top level of file
>system.
Well, it can't handle _creating_ large numbers of files in . so...
>Once it's finished, I'll reformat the partition and perform the same
>test using a sub-directory again. Unless there's something very strange
>going on, I expect the times to be much faster.
And it's taking its time again. Damned confusing, especially since a
previous test was performed using a file system on /dev/hda and it
wasn't this slow. It's something to look into later tomorrow.
Regards,
David Bolt
-
Re: Maximum amount of files per directory
Hello David,
that is quite an amount of testing you have done. Can you summarize
what you found out?
My basic question was:
Will it be noticeably slower to access a file with a given name inside a
folder with 500,000 files than inside one with only 5,000 files?
Right now my structure looks like this:
/gallery/$cc/pictures_thumb
/gallery/$cc/pictures_400
/gallery/$cc/pictures_600
/gallery/$cc/pictures_1024
Where $cc is a country code. There are about 260 country codes and
therefore 260*4 = 1040 directories.
I would like to switch to a structure that only contains the 4 folders
/gallery/pictures_thumb
/gallery/pictures_400
/gallery/pictures_600
/gallery/pictures_1024
The reason for this is that I believe it will be faster to copy the files
for backup with tar and with cp.
The files are only ever accessed by a given name.
Do you believe I will gain anything if I switch to the new structure with 4
folders but more files in each?
Best regards,
Merlin
-
Re: Maximum amount of files per directory
On 2007-11-06 18:21, David Bolt wrote:
> On Tue, 6 Nov 2007, birre wrote:-
>
>
>
>> If you once have 500000 files in a directory, it will be terrible slow to
>> do ls
>
> Not really. The time is actually taken up by the formatting and
> displaying on a console.
>
>> , even after you delete 499999 of them, since the directory itself
>> will be huge, and never shrink when removing files.
>
> That's wrong, or it is for reiserfs. Both ext3 and xfs retain the
> directory size, but directories shrink as files are deleted on a
> reiserfs file system.
Merlin wrote that he was using ext2, and that is what I was replying to.
ext2 is very fast for normal operations, but not so much fun after a
computer crash.
I recommended reiserfs, but I am not sure, since I don't know how
he uses the filesystem.
Reiser has no directories or inodes in the same form as the others;
it's more like a database that delivers the result to the OS,
so it is no surprise that the directory does not include deleted files.
( No lost+found either, only loss is possible :-)
> drwxr-xr-x 2 davjam users 12050432 2007-11-06 16:11
OK, I can accept that a normal machine can probably handle a 12 MB
directory, and not as terribly slowly as I wrote, but all the tests you
did ran from RAM, since the data was cached.
I have seen machines that don't respond for 30 seconds when doing ls -l.
/bb
-
Re: Maximum amount of files per directory
On Wed, 7 Nov 2007, birre wrote:-
>On 2007-11-06 18:21, David Bolt wrote:
>> That's wrong, or it is for reiserfs. Both ext3 and xfs retain the
>> directory size, but directories shrink as files are deleted on a
>> reiserfs file system.
>
>Merlin wrote he was using ext2 , and this was what I did reply to,
>ext2 is very fast for normal operations, but not so fun after a
>computer crash.
That I can attest to. Several times, with much earlier versions of SuSE
(6.x, 7.x), I had a crash and had to endure the forced fsck :-| The good
news was that I quite rarely actually lost anything. More often than not
I could figure out what the "found" files were supposed to be.
>I recommended reiserfs, but are not sure, since I don't know how
>he use the filesystem.
Looking at his reply to me, I think ext3 and reiserfs would
probably do about as well as each other.
>Reiser has no directory or inodes in the same form as the others,
>it's more like a database that deliver the result to the OS,
>so no surprise the directory does not include deleted files.
>( No lost+found either, only loss is possible :-)
Well, I don't know. I've not yet lost anything using reiserfs, whereas with
ext2/3 I've had a lost+found containing several thousand recovered
inodes.
>> drwxr-xr-x 2 davjam users 12050432 2007-11-06 16:11
>
>Ok, I can buy that a normal machine maybe can handle a directory
>12MB big , and not so terrible slow as I wrote, but all tests you
>did was from RAM , since it was cached.
Okay, one more test, this time with some attempt to get the stuff out of
the cache :-)
davjam@playing:/usr/src/packages/SOURCES/zzz_directory_test> grep $(df .|tail -1|awk '{print $1}') /etc/fstab
/dev/hda12 /usr/src reiserfs defaults 1 2
davjam@playing:/usr/src/packages/SOURCES/zzz_directory_test> time ../create_files
real 0m17.315s
user 0m0.932s
sys 0m13.685s
davjam@playing:/usr/src/packages/SOURCES/zzz_directory_test> free -t
total used free shared buffers cached
Mem: 995580 972656 22924 0 100532 82924
-/+ buffers/cache: 789200 206380
Swap: 4964036 391812 4572224
Total: 5959616 1364468 4595148
davjam@playing:/usr/src/packages/SOURCES/zzz_directory_test> find / -type f &>/dev/null
free -t
davjam@playing:/usr/src/packages/SOURCES/zzz_directory_test> free -t
total used free shared buffers cached
Mem: 995580 984804 10776 0 213408 38952
-/+ buffers/cache: 732444 263136
Swap: 4964036 392964 4571072
Total: 5959616 1377768 4581848
davjam@playing:/usr/src/packages/SOURCES/zzz_directory_test> time ls
....
045447 090902 136357 181812 227267 272722 318177 363632 409087 454542 499997
045448 090903 136358 181813 227268 272723 318178 363633 409088 454543 499998
045449 090904 136359 181814 227269 272724 318179 363634 409089 454544 499999
045450 090905 136360 181815 227270 272725 318180 363635 409090 454545
045451 090906 136361 181816 227271 272726 318181 363636 409091 454546
045452 090907 136362 181817 227272 272727 318182 363637 409092 454547
045453 090908 136363 181818 227273 272728 318183 363638 409093 454548
045454 090909 136364 181819 227274 272729 318184 363639 409094 454549
real 0m30.632s
user 0m4.876s
sys 0m6.088s
davjam@playing:/usr/src/packages/SOURCES/zzz_directory_test> find / -type f &>/dev/null
davjam@playing:/usr/src/packages/SOURCES/zzz_directory_test> time ls -l
....
-rw-r--r-- 1 davjam users 0 2007-11-07 14:42 499990
-rw-r--r-- 1 davjam users 0 2007-11-07 14:42 499991
-rw-r--r-- 1 davjam users 0 2007-11-07 14:42 499992
-rw-r--r-- 1 davjam users 0 2007-11-07 14:42 499993
-rw-r--r-- 1 davjam users 0 2007-11-07 14:42 499994
-rw-r--r-- 1 davjam users 0 2007-11-07 14:42 499995
-rw-r--r-- 1 davjam users 0 2007-11-07 14:42 499996
-rw-r--r-- 1 davjam users 0 2007-11-07 14:42 499997
-rw-r--r-- 1 davjam users 0 2007-11-07 14:42 499998
-rw-r--r-- 1 davjam users 0 2007-11-07 14:42 499999
real 0m53.351s
user 0m5.556s
sys 0m8.141s
davjam@playing:/usr/src/packages/SOURCES/zzz_directory_test> find / -maxdepth 4 -type f 2>/dev/null | wc -l
25202
davjam@playing:/usr/src/packages/SOURCES/zzz_directory_test> find / -maxdepth 5 -type f 2>/dev/null | wc -l
193293
davjam@playing:/usr/src/packages/SOURCES/zzz_directory_test> free -t
total used free shared buffers cached
Mem: 995580 985588 9992 0 103668 32280
-/+ buffers/cache: 849640 145940
Swap: 4964036 154036 4810000
Total: 5959616 1139624 4819992
davjam@playing:/usr/src/packages/SOURCES/zzz_directory_test> cd ../
davjam@playing:/usr/src/packages/SOURCES> time rm -rf zzz_directory_test
real 0m32.773s
user 0m0.168s
sys 0m31.022s
davjam@playing:/usr/src/packages/SOURCES> popd
~
Okay, that's using reiserfs. Now for ext3:
davjam@playing:~> pushd /media/share
/media/share ~
davjam@playing:/media/share> grep $(df .|tail -1|awk '{print $1}') /etc/fstab
/dev/hdb6 /media/share ext3 rw 1 2
davjam@playing:/media/share> mkdir -p zzz_directory_test
davjam@playing:/media/share> cd zzz_directory_test
davjam@playing:/media/share/zzz_directory_test> time /usr/src/packages/SOURCES/create_files
real 0m21.155s
user 0m0.832s
sys 0m15.273s
davjam@playing:/media/share/zzz_directory_test> find -type f | wc -l
500001
davjam@playing:/media/share/zzz_directory_test> ls -l .. | tail -1
drwxr-xr-x 2 davjam users 11317248 2007-11-07 15:07 zzz_directory_test
davjam@playing:/media/share/zzz_directory_test> free -t
total used free shared buffers cached
Mem: 995580 964460 31120 0 58952 28080
-/+ buffers/cache: 877428 118152
Swap: 4964036 66000 4898036
Total: 5959616 1030460 4929156
davjam@playing:/media/share/zzz_directory_test> find / -maxdepth 3 -type f &>/dev/null
davjam@playing:/media/share/zzz_directory_test> free -t
total used free shared buffers cached
Mem: 995580 974336 21244 0 68044 28152
-/+ buffers/cache: 878140 117440
Swap: 4964036 66000 4898036
Total: 5959616 1040336 4919280
davjam@playing:/media/share/zzz_directory_test> free -t
total used free shared buffers cached
Mem: 995580 984864 10716 0 50124 29560
-/+ buffers/cache: 905180 90400
Swap: 4964036 0 4964036
Total: 5959616 984864 4974752
davjam@playing:/media/share/zzz_directory_test> find / -maxdepth 3 -type f 2>/dev/null | wc -l
10244
davjam@playing:/media/share/zzz_directory_test> time ls -l
....
-rw-r--r-- 1 davjam users 0 2007-11-07 15:07 499990
-rw-r--r-- 1 davjam users 0 2007-11-07 15:07 499991
-rw-r--r-- 1 davjam users 0 2007-11-07 15:07 499992
-rw-r--r-- 1 davjam users 0 2007-11-07 15:07 499993
-rw-r--r-- 1 davjam users 0 2007-11-07 15:07 499994
-rw-r--r-- 1 davjam users 0 2007-11-07 15:07 499995
-rw-r--r-- 1 davjam users 0 2007-11-07 15:07 499996
-rw-r--r-- 1 davjam users 0 2007-11-07 15:07 499997
-rw-r--r-- 1 davjam users 0 2007-11-07 15:07 499998
-rw-r--r-- 1 davjam users 0 2007-11-07 15:07 499999
real 11m50.084s
user 0m8.313s
sys 0m9.145s
davjam@playing:/media/share/zzz_directory_test> free -t
total used free shared buffers cached
Mem: 995580 843504 152076 0 16956 64860
-/+ buffers/cache: 761688 233892
Swap: 4964036 4656 4959380
Total: 5959616 848160 5111456
davjam@playing:/media/share/zzz_directory_test> free -t
total used free shared buffers cached
Mem: 995580 845604 149976 0 17052 66228
-/+ buffers/cache: 762324 233256
Swap: 4964036 0 4964036
Total: 5959616 845604 5114012
davjam@playing:/media/share/zzz_directory_test> cd ../
davjam@playing:/media/share> find / -maxdepth 3 -type f 2>/dev/null | wc -l
10244
real 8m42.703s
user 0m0.140s
sys 0m14.921s
Okay, going by the times above, ext3 is about 16 times slower than reiserfs at
deleting the directory, and about 13 times slower at producing a long directory listing.
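For what it's worth, the ratios can be recomputed from the "real" times in the transcripts above. This is only a sketch: to_seconds and slowdown are helper names made up here, and it assumes the 8m42.703s figure belongs to the ext3 directory removal, whose command line does not appear in the transcript.

```sh
# Convert a `time`-style value such as 11m50.084s into seconds.
to_seconds() {
    printf '%s\n' "$1" | awk -F'm' '{ sub(/s$/, "", $2); printf "%.3f\n", $1 * 60 + $2 }'
}

# How many times longer the first duration is than the second.
slowdown() {
    awk -v s="$(to_seconds "$1")" -v f="$(to_seconds "$2")" 'BEGIN { printf "%.1f\n", s / f }'
}

slowdown 8m42.703s 0m32.773s    # directory removal, ext3 vs reiserfs: prints 15.9
slowdown 11m50.084s 0m53.351s   # ls -l, ext3 vs reiserfs: prints 13.3
```

These are single runs on one machine, so the ratios are indicative at best.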
>I have seen machines that don't respond for 30 sec. when doing ls -l.
As have I, and if you look at the time above, you'll see it took almost
12 minutes to do a directory listing on ext3, compared to less than a
minute on reiserfs.
As for the delay, my guess is that ls sorting and formatting the output
for a console really takes its toll. After all, it was a lot faster
sending the output to either a file or /dev/null.
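If sorting and formatting are the suspects, one way to take them out of the picture is to enumerate the directory without either. GNU ls has -U (and POSIX ls has -f) to skip sorting; the sketch below uses find instead, which does not sort and needs no per-file details beyond the name. count_entries is a made-up helper name, -mindepth and -maxdepth assume GNU or BSD find, and names containing newlines would be miscounted.

```sh
# Count the entries directly inside a directory without sorting them.
count_entries() {
    find "$1" -mindepth 1 -maxdepth 1 | wc -l | tr -d ' '
}

# Small demonstration with 100 empty files named 000000 .. 000099.
demo=$(mktemp -d)
i=0
while [ "$i" -lt 100 ]; do
    : > "$demo/$(printf '%06d' "$i")"
    i=$((i + 1))
done
count_entries "$demo"   # prints 100
rm -r "$demo"
```

Note that ls -l also has to fetch size, owner and time stamp for every single file, which may account for part of the difference between ls and ls -l on a cold cache.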
Regards,
David Bolt
-
Re: Maximum amount of files per directory
Andreas wrote:
>Hi,
>Merlin Morgenstern wrote:
>> thank you for your reply. How can I find out which file system I am
>> running on the machine?
>The mount command or the /etc/fstab file.
>
>> > Lookups should be in some O(log n).
>>
>> I am sorry, what does that mean? Can you be more specific?
>It means that the time would increase, but not linear. Having ten times
>as many files won't take ten times longer.
But it will take something like two or three times longer.
--
--- Paul J. Gans
-
Re: Maximum amount of files per directory
On Wed, 07 Nov 2007 17:07:20 +0000, David Bolt wrote:
>>Reiser has no directory or inodes in the same form as the others,
>>it's more like a database that deliver the result to the OS,
>>so no surprise the directory does not include deleted files.
>>( No lost+found either, only loss is possible :-)
>
> Well, I don't know. I've not yet lost anything use reiserfs whereas with
> ext2/3 I've had a lost+found containing several thousand recovered
> inodes.
My experience is exactly the opposite. The only time I ever suffered
severe data loss after a power surge (nearby lightning strike, surge went
straight through surge protector!) was with reiserfs, whereas ext3 has
always been able to recover cleanly.