I have a Linux app, let's call it App1, that uses a TCP/IP socket to
communicate with the socket of another app, App2. The problem is that
App2's receive socket buffer eventually overflows because it cannot
keep up with the rate at which App1 is sending TCP/IP messages. When I
view the CPU utilization using top, I notice that the CPU is idle about
75% of the time. Because of this large amount of idle time, it seems to
me that there is plenty of CPU power to prevent the receive buffer
overflow if the scheduler can somehow be made aware of the receive
buffer state (say via some kind of high watermark indication).
Is there a way to tune the scheduler so that it gives App2 the time it
needs to prevent buffer overflow? BTW, App2 is not waiting on anything
which would prevent it from being run as its receive buffer begins to
fill.
Thanks,
Darol
Also, App1 and App2 are running on the same CPU and are using localhost
for the TCP/IP communications mentioned earlier.
Darol wrote:
> I have a Linux app, let's call it App1, that uses a TCP/IP socket to
> communicate with the socket of another app, App2. The problem is that
> the App2's receive socket buffer eventually overflows because it cannot
> keep up with the rate at which App1 is sending TCP/IP messages. When I
[snip]
> Is there a way to tune the scheduler so that it gives App2 the time it
> needs to prevent buffer overflow
[snip]
Why not just decrease the "nice value" for App2? That will give it a
higher priority within the scheduler, and may help prevent the overrun
you see.
--
Lew Pitcher, IT Specialist, Corporate Technology Solutions,
Enterprise Technology Solutions, TD Bank Financial Group
(Opinions expressed here are my own, not my employer's)
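For reference, the nice value suggested above can also be changed from inside a program. The following is only a rough sketch: the helper names get_nice and set_nice are made up for illustration, and lowering the nice value (raising priority) normally requires root or CAP_SYS_NICE, so set_nice may fail with EACCES or EPERM for an ordinary user.

```c
#include <errno.h>
#include <sys/resource.h>
#include <sys/types.h>

/* Hypothetical helper: read the nice value of `pid` (0 = calling process).
 * getpriority() can legitimately return -1, so errno has to be cleared
 * first and checked afterwards to tell a real error from a nice of -1. */
int get_nice(pid_t pid, int *out)
{
    errno = 0;
    int value = getpriority(PRIO_PROCESS, (id_t)pid);
    if (value == -1 && errno != 0)
        return -1;
    *out = value;
    return 0;
}

/* Hypothetical helper: set the nice value of `pid` (0 = calling process).
 * Returns 0 on success, -1 on failure with errno set by setpriority(). */
int set_nice(pid_t pid, int nice_value)
{
    return setpriority(PRIO_PROCESS, (id_t)pid, nice_value);
}
```

From the shell, the renice command does the same job for an already running process.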
Darol wrote in part:
> I have a Linux app, let's call it App1, that uses a TCP/IP socket
> to communicate with the socket of another app, App2. The problem
> is that the App2's receive socket buffer eventually overflows
> because it cannot keep up with the rate at which App1 is sending
> TCP/IP messages. When I view the CPU utilization using Top,
> I notice that the CPU is idle about 75% of the time. Because
> of this large amount of idle time, It seems to me that there is
> plenty of CPU power to prevent the receive buffer overflow if the
I'm a bit mystified: What is triggering App2's socket read?
It should be reading all it can, and in a blocked wait most of
the time.
Perhaps it is App1, which during its timeslice sends too much
data. The kernel may not block the sender, and it may also apply other
`localhost` optimizations. Try adding a yield(), sched_yield() or
sleep(0) call after the write.
-- Robert
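A minimal sketch of the "yield after the write" idea, assuming App1 sends with plain write() on a connected socket. The function name write_then_yield is invented here, and nothing in the thread shows how App1 actually sends its data.

```c
#include <sched.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical sender-side helper: write one message, then give up the
 * rest of the timeslice so that the receiver gets a chance to run.
 * Returns whatever write() returned (bytes written, or -1 on error). */
ssize_t write_then_yield(int fd, const void *buf, size_t len)
{
    ssize_t written = write(fd, buf, len);
    if (written >= 0)
        sched_yield();  /* only yield when the write itself succeeded */
    return written;
}
```

Whether this helps depends on the scheduler and on what else is runnable; it is an experiment to try, not a guaranteed fix.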
Already tried adjusting the nice value - didn't work.
I believe App2 blocks on the read, but I'll verify this with the
developer. I don't believe App1 is sending too much data at one time
because I see (using netstat) App2's buffer gradually fill up and
eventually overflow.
Darol wrote in part:
> I believe App2 blocks on the read, but I'll verify this with
> the developer. I don't believe App1 is sending too much data at
> one time because I see (using netstat) App2's buffer gradually
> fill up and eventually overflow.
If it happens this slowly, I'd strongly suspect App2 is
not blocking on read but doing some sort of timeout.
-- Robert
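One quick way to check the "not blocking" suspicion from inside App2 is to look at the file status flags of the socket. This is only a sketch: is_nonblocking is a made-up name, and it only detects O_NONBLOCK on the descriptor. It will not reveal a sleep()-style polling loop built around an otherwise blocking socket.

```c
#include <fcntl.h>

/* Hypothetical diagnostic: returns 1 if O_NONBLOCK is set on fd,
 * 0 if the descriptor is in blocking mode, -1 if fcntl() fails. */
int is_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL);
    if (flags == -1)
        return -1;
    return (flags & O_NONBLOCK) ? 1 : 0;
}
```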
Very interesting. The developer just told me it's not blocking and is
using a timeout of some sort - we're investigating.
Darol wrote in part:
> Very interesting. The developer just told me it's not blocking
> and using a timeout of some sort - we're investigating.
Timeouts (sleep(x)) are _very_ poor programming practice
except where there is some realtime basis for `x`.
Perhaps this code was written for some bletcherous OS
(MS-Win9*?) that has poor blocking?
-- Robert
It must be the case that the code is waiting on some timer
instead of blocking on read. Make it block on read and I think the
problem would be solved. But it would be interesting to see the code
and stats of your problem.
shiva
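To illustrate the "block on read" advice, here is a rough sketch of a receive loop that sleeps inside read() and wakes as soon as data arrives, instead of waking up on a timer. App2's real code was never posted, so drain_blocking and its 4096-byte buffer are assumptions, and the sketch assumes fd is in blocking mode (O_NONBLOCK not set).

```c
#include <errno.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical receive loop: read from fd until the peer closes it.
 * Each read() blocks in the kernel until data is available, so the
 * process is woken as soon as there is something to consume.
 * Returns the total number of bytes read, or -1 on a read error. */
long long drain_blocking(int fd)
{
    char buf[4096];
    long long total = 0;

    for (;;) {
        ssize_t n = read(fd, buf, sizeof buf);
        if (n > 0) {
            total += n;        /* a real app would process buf[0..n) here */
        } else if (n == 0) {
            return total;      /* peer closed the connection */
        } else if (errno == EINTR) {
            continue;          /* interrupted by a signal, retry */
        } else {
            return -1;         /* genuine error, errno is set */
        }
    }
}
```

With a loop like this the receiver only sleeps while the buffer is empty, so the idle CPU time goes to draining the socket rather than to a fixed timeout.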