MTU / rsize / wsize problems
Lo,
Using linux 2.4.25, I ran into trouble with an ordinary nfs mount:
mount -t nfs -o hard,intr,nolock,noatime someserver:/mnt /mnt
(Server: /mnt 128.0.66.1/24(rw,no_root_squash) )
Although the mount completes and I can do an 'ls', I can't actually
transfer files or start executables; the connect then seems to (b)lock
and I get messages on the console like "kernel: nfs: server 128.0.66.18
not responding, still trying".
I ran tcpdump and found out that I had a lot of "ip reassembly time
exceeded" ICMP-messages which basically stalled the transfer. I thought
about this and searched Google, where I found a solution: make
rsize=wsize=1024. My MTU is 1500 so no fragments will occur.
Now the question is: what is the penalty of this? Or should I increase
the MTU to e.g. 8500 to allow a rsize=wsize=8192? Is that safe to do or
am I going insane here?
Thanks,
Marc.
--- If you don't like my email headers, reply to me via email. ;-)
Re: MTU / rsize / wsize problems
Marc L. de Bruin <support@microsoft.com> wrote:
> Using linux 2.4.25, I ran into trouble with an ordinary nfs mount:
> mount -t nfs -o hard,intr,nolock,noatime someserver:/mnt /mnt
> (Server: /mnt 128.0.66.1/24(rw,no_root_squash) )
> Although the mount completes and I can do an 'ls', I can't actually
> transfer files or start executables; the connect then seems to
> (b)lock and I get messages on the console like "kernel: nfs: server
> 128.0.66.18 not responding, still trying".
> I ran tcpdump and found out that I had a lot of "ip reassembly time
> exceeded" ICMP-messages which basically stalled the transfer.
Well, the NFS retransmissions should have kicked in. It does suggest
you have a rather non-trivial packet loss rate. And IP fragmentation
takes a packet loss rate and (multiplicatively? exponentially?)
increases it. Since all fragments of a fragmented IP datagram must
get through, if you have a packet loss probability of p, then the
chance of the entire datagram getting through is (1-p)^numfrag. Or,
the probability of the NFS message not making it through is
(1-((1-p)^numfrag)), which for small p is roughly numfrag*p. (At least
I think I got that math right :)
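To put rough numbers on that formula, here is a small Python sketch. It is only an estimate: the 128 bytes of RPC/NFS header overhead is an assumed round figure, not a measured one, and the function names are made up for illustration. The 1% loss rate in the usage note is likewise just an example value.

```python
import math

def fragments_needed(payload_bytes, mtu=1500, rpc_overhead=128,
                     udp_header=8, ip_header=20):
    """Estimate how many IP fragments one NFS-over-UDP message needs.

    rpc_overhead is an assumed round figure for the RPC/NFS headers.
    """
    datagram = payload_bytes + rpc_overhead + udp_header
    # Every fragment carries its own IP header; fragment payloads
    # (except the last) must be a multiple of 8 bytes.
    per_fragment = (mtu - ip_header) // 8 * 8
    return math.ceil(datagram / per_fragment)

def message_loss_probability(p, numfrag):
    """Probability that at least one of numfrag fragments is lost,
    assuming each fragment is lost independently with probability p."""
    return 1 - (1 - p) ** numfrag
```

With these assumptions, fragments_needed(8192) gives 6 on a 1500-byte MTU, fragments_needed(1024) gives 1, and fragments_needed(8192, mtu=8500) also gives 1. At an example per-packet loss rate of 1%, message_loss_probability(0.01, 6) comes out a little under 6%, so each 8192-byte request is lost nearly six times as often as a single unfragmented packet.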
> I thought about this and searched Google where I found a solution:
> make rsize=wsize=1024. My MTU is 1500 so no fragments will occur.
> Now the question is: what is the penalty of this?
Well, where you would have issued one read or write request for 8192
bytes, you would now issue 8, each for 1024 bytes.
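The same arithmetic scales with the size of the transfer. A minimal sketch, with a made-up helper name and a 1 MiB file chosen purely as an example:

```python
def requests_needed(transfer_bytes, rsize):
    """Number of NFS read (or write) requests needed to move
    transfer_bytes when each request carries at most rsize bytes."""
    # Integer ceiling division.
    return -(-transfer_bytes // rsize)
```

For example, requests_needed(8192, 1024) is 8, and a 1 MiB file takes requests_needed(1 << 20, 8192) = 128 requests at rsize=8192 versus requests_needed(1 << 20, 1024) = 1024 at rsize=1024. Each of those requests carries its own header overhead and needs its own reply.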
> Or should I increase the MTU to e.g. 8500 to allow a
> rsize=wsize=8192? Is that safe to do or am I going insane here?
It would only be insane if you have other equipment (switch included)
in the same broadcast domain that doesn't support the larger MTU.
You could also switch to TCP mounts for your NFS, and perhaps even
enable "large send" on your NIC if it supports it. TCP is probably
the better transport to use if you have packet loss issues anyway.
rick jones
--
a wide gulf separates "what if" from "if only"
these opinions are mine, all mine; HP might not want them anyway... :)
feel free to post, OR email to raj in cup.hp.com but NOT BOTH...