-
Measuring latency
Hi Guys
Has anybody here had any success measuring latency for TCP connections
on Linux(2.6.9)? It appears that the SO_TIMESTAMP option combined with
recvmsg doesn't work for SOCK_STREAM connections. My ultimate goal is
to be able to measure the latency between a packet arriving at a
socket and my application reading the data so I can determine whether
the app is network or CPU bound. Currently I am attempting to infer this
by measuring how full the receive buffer is, using ioctl with FIONREAD
immediately after a select call unblocks. My logic is that if the
level of the socket buffer is gradually increasing then the app is
probably cpu bound and if the socket is pretty much empty then it is
network bound.
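For anyone wanting to try the same measurement, here is a minimal sketch in Python rather than C (the underlying calls are the same select and ioctl/FIONREAD). The helper names `bytes_pending` and `sample_backlog` are made up for illustration, and the 1-second timeout is an arbitrary assumption.

```python
import fcntl
import select
import struct
import termios


def bytes_pending(sock):
    """Return how many bytes are queued in the socket's receive buffer (FIONREAD)."""
    raw = fcntl.ioctl(sock.fileno(), termios.FIONREAD, struct.pack("i", 0))
    return struct.unpack("i", raw)[0]


def sample_backlog(sock, timeout=1.0):
    """Wait until the socket is readable, then report the backlog before reading.

    Returns None if nothing arrived within the timeout (an assumed default).
    """
    readable, _, _ = select.select([sock], [], [], timeout)
    if not readable:
        return None
    return bytes_pending(sock)
```

Logging the value returned by `sample_backlog` on every wake-up gives the trend described above: a steadily growing number suggests the reader is falling behind.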
Does anybody have any other suggestions, please?
TIA
AJ
-
Re: Measuring latency
ajcppmod@gmail.com wrote:
> Has anybody here had any success measuring latency for TCP connections
> on Linux(2.6.9)? It appears that the SO_TIMESTAMP option combined with
> recvmsg doesn't work for SOCK_STREAM connections. My ultimate goal is
> to be able to measure the latency between a packet arriving at a
> socket and my application reading the data so I can determine whether
> the app is network or cpu bound. Currently I am attempting to infer this
> by measuring how full the receive buffer is using ioctl and FIONREAD
> immediately after a select call unblocks. My logic is that if the
> level of the socket buffer is gradually increasing then the app is
> probably cpu bound and if the socket is pretty much empty then it is
> network bound.
>
> Does anybody have any other suggestions, please?
If you were to "speculatively" read from the socket(s) (non-blocking)
before ever going into select(), the simple act of calling select()
could be interpreted as being network bound.
E.g.:
begin:
    do
        read from sockets (non-blocking)
    while (at least one had data)
    select()
    goto begin
(or something along those lines).
If it does not go into select() at all you could assume it was not
network bound and perhaps bound somehow else (CPU, disc, whatever).
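The loop above could be sketched roughly as follows, in Python for brevity. The names `drain` and `pump`, the 4096-byte chunk size, and the 1-second timeout are all assumptions for illustration; the two counters are one possible way to record how often the loop found data waiting versus how often it had to fall into select().

```python
import select


def drain(sock, chunk=4096):
    """Read until the socket would block; return (bytes_read, peer_closed)."""
    total = 0
    while True:
        try:
            data = sock.recv(chunk)
        except BlockingIOError:
            return total, False  # nothing more queued right now
        if not data:
            return total, True  # orderly shutdown by the peer
        total += len(data)


def pump(socks, timeout=1.0):
    """Read speculatively; call select() only when every socket was empty.

    Returns (passes_with_data, select_calls).
    """
    for s in socks:
        s.setblocking(False)
    open_socks = list(socks)
    busy_passes = select_calls = 0
    while open_socks:
        had_data = False
        for s in list(open_socks):
            nread, closed = drain(s)
            had_data = had_data or nread > 0
            if closed:
                open_socks.remove(s)
        if had_data:
            busy_passes += 1
            continue  # data was already waiting: skip select()
        select_calls += 1  # idle: we are waiting on the network
        ready, _, _ = select.select(open_socks, [], [], timeout)
        if not ready:
            break  # timed out with nothing to do
    return busy_passes, select_calls
```

A high ratio of `select_calls` to `busy_passes` would point towards being network bound; a loop that rarely reaches select() is being held up by something else.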
rick jones
--
a wide gulf separates "what if" from "if only"
these opinions are mine, all mine; HP might not want them anyway... :)
feel free to post, OR email to rick.jones2 in hp.com but NOT BOTH...
-
Re: Measuring latency
On Jun 9, 12:10 pm, ajcpp...@gmail.com wrote:
> Hi Guys
>
> Has anybody here had any success measuring latency for TCP connections
> on Linux(2.6.9)? It appears that the SO_TIMESTAMP option combined with
> recvmsg doesn't work for SOCK_STREAM connections. My ultimate goal is
> to be able to measure the latency between a packet arriving at a
> socket and my application reading the data so I can determine whether
> the app is network or cpu bound. Currently I am attempting to infer this
> by measuring how full the receive buffer is using ioctl and FIONREAD
> immediately after a select call unblocks. My logic is that if the
> level of the socket buffer is gradually increasing then the app is
> probably cpu bound and if the socket is pretty much empty then it is
> network bound.
>
> Does anybody have any other suggestions, please?
>
> TIA
>
> AJ
I would simply use the following heuristic: if the receive buffer is
almost never empty, I'm CPU bound.
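Taking the heuristic exactly as worded here, a rough sketch in Python might look like this. It assumes you have collected samples of how many bytes were still queued right after each read completed (for example via FIONREAD); the function name `classify` and the 0.9 threshold for "almost never" are invented for illustration.

```python
def classify(backlog_samples, threshold=0.9):
    """Classify from samples of bytes still queued right after each read.

    threshold is an assumed cut-off for "almost never empty".
    """
    if not backlog_samples:
        return "unknown"
    nonempty = sum(1 for n in backlog_samples if n > 0)
    frac = nonempty / len(backlog_samples)
    if frac >= threshold:
        return "cpu-bound"  # data is nearly always waiting for us
    if frac <= 1 - threshold:
        return "network-bound"  # we nearly always catch up with the sender
    return "mixed"
```

Sampling after the read rather than right after select() matters: the buffer is never empty at the moment select() reports the socket readable.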
DS
-
Re: Measuring latency
David Schwartz wrote:
> I would simply use the following heuristic: if the receive buffer is
> almost never empty, I'm CPU bound.
Don't you mean 'almost *always* empty'?
It's impossible to get this information in some environments. In Java
all you can see is that OP_WRITE doesn't fire, which tells you you are
network-bound, or receiver-bound.
-
Re: Measuring latency
On Jun 10, 8:06 pm, EJP <esmond.not.p...@not.bigpond.com> wrote:
> David Schwartz wrote:
> > I would simply use the following heuristic: if the receive buffer is
> > almost never empty, I'm CPU bound.
>
> Don't you mean 'almost *always* empty'?
Yeah, exactly.
> It's impossible to get this information in some environments. In Java
> all you can see is that OP_WRITE doesn't fire, which tells you you are
> network-bound, or receiver-bound.
True, I guess in that case the thing to look at would be whether the
receive queue is full when OP_WRITE does fire. If so, your program is
probably holding up the TCP implementation.
Of course, if the protocol is query/response, and the other side is
waiting for you and so can't fill the receive buffer anyway, this
won't work.
I guess there's no good general way.
DS
-
Re: Measuring latency
David Schwartz wrote:
> True, I guess in that case the thing to look at would be whether the
> receive queue is full when OP_WRITE does fire.
Sorry, I don't understand that either. Do you mean whether the socket
receive buffer is full? That is another thing you can't see in Java.
All you can see is whether OP_READ & OP_WRITE fire.
> If so, your program is probably holding up the TCP implementation.
If OP_WRITE isn't firing, you are output bound.
If OP_READ isn't firing, you are input bound.
If OP_READ fires and you don't get zero length writes you are not output
bound: either the reader is fast enough to keep up with you, or you are
too CPU bound to discover its maximum rate. You can't tell the
difference between these two situations: all it means is that the
receiver is >= as fast as you.
If OP_READ fires and you get a zero length write, you are now output
bound, and should start selecting for OP_WRITE *instead* of OP_READ.
If OP_WRITE is firing and OP_READ is also firing you have in my opinion
a design error. If you're waiting for OP_WRITE it's because you have
something to write that won't write, and you don't really have any
business reading from that socket until you've finished the write.
Otherwise you're just piling up data inside your application that would be
better left in the receive buffer and ignored until you've finished the
write. It's like letting the passengers off the bus first before
admitting new ones.
-
Re: Measuring latency
On Jun 11, 7:04 pm, EJP <esmond.not.p...@not.bigpond.com> wrote:
> Sorry, I don't understand that either. Do you mean whether the socket
> receive buffer is full? which is another thing you can't see in Java.
> All you can see is whether OP_READ & OP_WRITE fire.
You cannot tell that it was ever full with 100% certainty, but if you
go to read the socket and get a number of bytes equal to the receive
buffer size, you can be sure enough for statistical purposes.
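As a rough illustration of that check (in Python here, not Java), one could size a single read to SO_RCVBUF and see how close the result comes. The function name `read_and_check_full` and the `full_fraction` parameter are assumptions. The fraction is there because on Linux getsockopt reports double the requested buffer size to cover bookkeeping overhead, so a read exactly equal to the reported size may never be seen.

```python
import socket


def read_and_check_full(sock, full_fraction=0.5):
    """Do one recv() sized to SO_RCVBUF; return (data, looked_full).

    full_fraction is an assumed fudge factor, not a documented constant.
    """
    rcvbuf = sock.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF)
    data = sock.recv(rcvbuf)
    return data, len(data) >= full_fraction * rcvbuf
```

Counting how often `looked_full` comes back true over many reads gives the statistical picture described above.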
> If OP_WRITE is firing and OP_READ is also firing you have in my opinion
> a design error. If you're waiting for OP_WRITE it's because you have
> something to write that won't write, and you don't really have any
> business reading from that socket until you've finished the write.
WHAT?! NO!!
You *MUST* read from the socket.
> Otherwise you're just piling up data inside your application that would be
> better left in the receive buffer and ignored until you've finished the
> write. It's like letting the passengers off the bus first before
> admitting new ones.
Nonsense! That's insanity. If everyone did that, deadlock would be
rampant. Imagine if each side has data to send and all stack buffers
are full. Both sides would be waiting for the other side to receive so
it could finish sending.
You *MUST* receive all data, you *MUST NOT* ever wait for the other
side to receive before you receive unless your protocol explicitly
allows one side to do this and you are implementing that side *only*.
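To make the scenario concrete, here is a small sketch (Python's selectors module standing in for Java's Selector) in which both ends have more to send than the stack buffers can hold. The `Peer` class and `exchange` function are invented for illustration. Each peer stays registered for reads the whole time and adds write interest only while it has data pending, so neither side ends up waiting on the other.

```python
import selectors


class Peer:
    """One end of a connection with a pending outbound payload."""

    def __init__(self, sock, outbound):
        sock.setblocking(False)
        self.sock = sock
        self.out = memoryview(outbound)  # bytes still to send
        self.received = 0

    def events(self):
        # Always stay interested in reads; add writes only while data is pending.
        ev = selectors.EVENT_READ
        if self.out:
            ev |= selectors.EVENT_WRITE
        return ev


def exchange(a_sock, b_sock, payload_size):
    """Both ends send payload_size bytes at once; return bytes received by each."""
    sel = selectors.DefaultSelector()
    peers = [Peer(a_sock, bytes(payload_size)), Peer(b_sock, bytes(payload_size))]
    for p in peers:
        sel.register(p.sock, p.events(), p)
    while any(p.out or p.received < payload_size for p in peers):
        events = sel.select(timeout=5)
        if not events:
            break  # no progress within the timeout: give up
        for key, mask in events:
            p = key.data
            if mask & selectors.EVENT_READ:
                try:
                    p.received += len(p.sock.recv(65536))  # drain even mid-write
                except BlockingIOError:
                    pass
            if mask & selectors.EVENT_WRITE and p.out:
                try:
                    sent = p.sock.send(p.out)
                    p.out = p.out[sent:]
                except BlockingIOError:
                    pass
            sel.modify(p.sock, p.events(), p)
    sel.close()
    return [p.received for p in peers]
```

If each peer instead refused to read until its own write had finished, both would stall as soon as the send and receive buffers filled.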
DS
-
Re: Measuring latency
On 2008-06-10 07:22:29 -0400, David Schwartz <davids@webmaster.com> said:
> I would simply use the following heuristic: if the receive buffer is
> almost never empty, I'm CPU bound.
Depending on the criticality of the application you're writing (its
latency characteristics / requirements), I don't trust the application
to measure itself. The one notable exception to this is the
smart folks @ 29-West who make a low-latency message bus infrastructure
that actually does do this quite reliably.
In other cases, I insist on having packet capture products around to
take accurate measurements and report back to my infrastructure what's
up. This is also a good way to build reasonable measurement into your
software and have a solid basis of comparison. There are too many
variables in OS implementations, compiler output, TCP/IP stack
tweaks, NIC drivers, etc. to just trust a software measurement.
/dmfh
-
Re: Measuring latency
David Schwartz wrote:
> Nonsense! That's insanity. If everyone did that, deadlock would be
> rampant. Imagine if each side has data to send and all stack buffers
> are full. Both sides would be waiting for the other side to receive so
> it could finish sending.
So both sides should keep reading and run out of memory? Where does this
policy stop?
If one peer is writing blindly and not reading whatever it should be
reading, why should either the network or the other peer have to cope
with that?