Hi,

Summary of my problem:
Remote X forwarding apparently fails at random for individual display
numbers.
At the end of this mail you will find a recipe to reproduce this
behaviour easily.


I use SuSE 10.2 with the following OpenSSH version:
OpenSSH_4.4p1, OpenSSL 0.9.8d 28 Sep 2006

Clients (Linux and Windows (Cygwin)) connect to the server with
X forwarding enabled ("-X" or "-Y").
The ssh server allocates local ports from 6010 upwards for the X
connections of these clients (default setup).


This setup has been very stable for years, BUT sometimes (every few
weeks) I receive "can't connect" errors after successfully opening an
ssh connection and trying to run a remote X program (e.g. xev).

For example: after connecting to the server (ssh -X ...), the DISPLAY
environment setting is "localhost:18". See the following output:


jackdaw:~ # netstat -lpn| grep 60
tcp 0 0 127.0.0.1:6016 0.0.0.0:* LISTEN 24607/sshd: jens@no
tcp 0 0 127.0.0.1:6017 0.0.0.0:* LISTEN 25900/sshd: michael
tcp 0 0 127.0.0.1:6019 0.0.0.0:* LISTEN 18030/sshd: lars@no
tcp 0 0 127.0.0.1:6010 0.0.0.0:* LISTEN 519/sshd: steffen@n
tcp 0 0 127.0.0.1:6011 0.0.0.0:* LISTEN 12190/sshd: ansgar@
tcp 0 0 127.0.0.1:6012 0.0.0.0:* LISTEN 25795/sshd: norbert
tcp 0 0 127.0.0.1:6013 0.0.0.0:* LISTEN 13587/sshd: henning
tcp 0 0 127.0.0.1:6014 0.0.0.0:* LISTEN 14594/sshd: diana@n
tcp 0 0 127.0.0.1:6015 0.0.0.0:* LISTEN 15447/sshd: axel@no
tcp 0 0 ::1:6016 :::* LISTEN 24607/sshd: jens@no
tcp 0 0 ::1:6017 :::* LISTEN 25900/sshd: michael
tcp 0 0 ::1:6018 :::* LISTEN 26589/sshd: lars@no
tcp 0 0 ::1:6019 :::* LISTEN 18030/sshd: lars@no
tcp 0 0 ::1:6010 :::* LISTEN 519/sshd: steffen@n
tcp 0 0 ::1:6011 :::* LISTEN 12190/sshd: ansgar@
tcp 0 0 ::1:6012 :::* LISTEN 25795/sshd: norbert
tcp 0 0 ::1:6013 :::* LISTEN 13587/sshd: henning
tcp 0 0 ::1:6014 :::* LISTEN 14594/sshd: diana@n
tcp 0 0 ::1:6015 :::* LISTEN 15447/sshd: axel@no


For some reason, port 6018 on 127.0.0.1 is not held by sshd, although it
should be (see "::1:6018" in the output above).
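
A small sketch of how such a gap could be spotted automatically. It reads
"netstat -lpn" style lines and reports ports that listen on only one of
127.0.0.1 and ::1. The function names and the shortened SAMPLE text are
made up for illustration; the parsing assumes the column layout shown above.

```python
def loopback_listeners(netstat_output):
    """Map each listening port to the set of loopback addresses it is bound on."""
    ports = {}
    for line in netstat_output.splitlines():
        fields = line.split()
        # Expected columns: proto recv-q send-q local foreign state pid/program
        if len(fields) < 6 or fields[5] != "LISTEN":
            continue
        # Split on the last colon so that "::1:6018" gives ("::1", "6018").
        addr, _, port = fields[3].rpartition(":")
        if addr in ("127.0.0.1", "::1") and port.isdigit():
            ports.setdefault(int(port), set()).add(addr)
    return ports


def half_bound_ports(netstat_output):
    """Return the ports that listen on only one of 127.0.0.1 and ::1."""
    ports = loopback_listeners(netstat_output)
    return sorted(p for p, addrs in ports.items() if len(addrs) == 1)


# Shortened excerpt of the listing above (illustration only).
SAMPLE = """\
tcp 0 0 127.0.0.1:6017 0.0.0.0:* LISTEN 25900/sshd: michael
tcp 0 0 127.0.0.1:6019 0.0.0.0:* LISTEN 18030/sshd: lars@no
tcp 0 0 ::1:6017 :::* LISTEN 25900/sshd: michael
tcp 0 0 ::1:6018 :::* LISTEN 26589/sshd: lars@no
tcp 0 0 ::1:6019 :::* LISTEN 18030/sshd: lars@no
"""

print(half_bound_ports(SAMPLE))
```

For the excerpt, the only port reported is 6018, which matches the
display that fails.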

Further investigation showed the following:


jackdaw:~ # netstat -pn | grep ":6016"
tcp 0 0 127.0.0.1:6016 127.0.0.1:6039 ESTABLISHED 14279/sshd: jens@no
tcp 0 0 127.0.0.1:6016 127.0.0.1:6038 ESTABLISHED 14279/sshd: jens@no
tcp 0 0 127.0.0.1:6016 127.0.0.1:6037 ESTABLISHED 14279/sshd: jens@no
tcp 0 0 127.0.0.1:6016 127.0.0.1:6047 ESTABLISHED 14279/sshd: jens@no
tcp 0 0 127.0.0.1:6016 127.0.0.1:24990 ESTABLISHED 14279/sshd: jens@no
tcp 0 0 127.0.0.1:6016 127.0.0.1:6045 ESTABLISHED 14279/sshd: jens@no
tcp 0 0 127.0.0.1:6016 127.0.0.1:6044 ESTABLISHED 14279/sshd: jens@no
tcp 0 0 127.0.0.1:6016 127.0.0.1:6040 ESTABLISHED 14279/sshd: jens@no
tcp 0 0 127.0.0.1:6016 127.0.0.1:6023 ESTABLISHED 14279/sshd: jens@no
tcp 0 0 127.0.0.1:6016 127.0.0.1:6022 ESTABLISHED 14279/sshd: jens@no
tcp 0 0 127.0.0.1:6016 127.0.0.1:6018 ESTABLISHED 14279/sshd: jens@no
tcp 0 0 127.0.0.1:6016 127.0.0.1:6017 ESTABLISHED 14279/sshd: jens@no
tcp 0 0 127.0.0.1:6017 127.0.0.1:6016 ESTABLISHED 14353/kdeinit Runni
tcp 0 0 127.0.0.1:6018 127.0.0.1:6016 ESTABLISHED 14360/kded [kdeinit
tcp 0 0 127.0.0.1:6022 127.0.0.1:6016 ESTABLISHED 14367/ksmserver [kd
tcp 0 0 127.0.0.1:6023 127.0.0.1:6016 ESTABLISHED 14368/kwin [kdeinit
tcp 0 0 127.0.0.1:6024 127.0.0.1:6016 ESTABLISHED 14370/kdesktop [kde
tcp 0 0 127.0.0.1:6025 127.0.0.1:6016 ESTABLISHED 14372/kicker [kdein
tcp 0 0 127.0.0.1:6037 127.0.0.1:6016 ESTABLISHED 14380/amarokapp
tcp 0 0 127.0.0.1:6038 127.0.0.1:6016 ESTABLISHED 14382/kerry [kdeini
tcp 0 0 127.0.0.1:6039 127.0.0.1:6016 ESTABLISHED 14360/kded [kdeinit
tcp 0 0 127.0.0.1:6040 127.0.0.1:6016 ESTABLISHED 14358/klauncher [kd
tcp 0 0 127.0.0.1:6044 127.0.0.1:6016 ESTABLISHED 14392/knotify [kdei
tcp 0 0 127.0.0.1:6045 127.0.0.1:6016 ESTABLISHED 14396/konqueror [kd
tcp 0 0 127.0.0.1:6047 127.0.0.1:6016 ESTABLISHED 14407/klipper [kdei
tcp 0 0 127.0.0.1:6058 127.0.0.1:6016 ESTABLISHED 14436/beagled
tcp 0 0 127.0.0.1:6068 127.0.0.1:6016 ESTABLISHED 14450/firefox-bin
tcp 0 0 127.0.0.1:6016 127.0.0.1:6025 ESTABLISHED 14279/sshd: jens@no
tcp 0 0 127.0.0.1:6016 127.0.0.1:6024 ESTABLISHED 14279/sshd: jens@no
tcp 0 0 127.0.0.1:6016 127.0.0.1:6068 ESTABLISHED 14279/sshd: jens@no
tcp 0 0 127.0.0.1:6016 127.0.0.1:6058 ESTABLISHED 14279/sshd: jens@no
tcp 0 0 127.0.0.1:17119 127.0.0.1:6016 ESTABLISHED 14487/beagled-helpe
tcp 0 0 127.0.0.1:17103 127.0.0.1:6016 ESTABLISHED 14450/firefox-bin
tcp 0 0 127.0.0.1:6016 127.0.0.1:17119 ESTABLISHED 14279/sshd: jens@no
tcp 0 0 127.0.0.1:6016 127.0.0.1:17103 ESTABLISHED 14279/sshd: jens@no
tcp 0 0 127.0.0.1:24990 127.0.0.1:6016 ESTABLISHED 16717/sunbird-bin


It seems that the user (in this example: "jens") was connected with
DISPLAY "localhost:16" via "ssh -X ...". Every X program within the
session opened its own connection to port 6016, each from a separate
local source port.
Now comes the problem:
The source ports that were used lie within the range that sshd uses
for new X forwarding listeners as well. This causes problems for
users who try to connect to the server later.
After the user on port 6016 disconnected and reconnected, the problem
was gone: his programs used a different (random?) port range for their
connections. New sessions could be created without problems again.
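
To illustrate the suspected mechanism, here is a minimal sketch (not taken
from sshd) that opens a loopback connection and then tries to bind a new
socket to the client's source port on 127.0.0.1. The function name is
hypothetical, and the behaviour is what I would expect on Linux when
SO_REUSEADDR is not set.

```python
import socket


def source_port_blocks_bind():
    """Open a loopback connection, then try to bind the client's source port.

    Returns (source_port, error); error is the OSError raised by bind(),
    or None if the bind unexpectedly succeeded.
    """
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("127.0.0.1", 0))          # let the kernel pick a free port
    server.listen(1)

    client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    client.connect(server.getsockname())   # kernel assigns the source port
    accepted, _ = server.accept()
    source_port = client.getsockname()[1]

    probe = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    error = None
    try:
        # This is what a new listener on the same address and port would do.
        probe.bind(("127.0.0.1", source_port))
    except OSError as exc:
        error = exc
    finally:
        for sock in (probe, accepted, client, server):
            sock.close()
    return source_port, error


port, err = source_port_blocks_bind()
print("source port:", port, "bind error:", err)
```

If the bind fails here, the same would happen to a listener that wants
127.0.0.1:6018 while kded holds 6018 as the source port of its connection.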

Maybe the real root of the problem is that the ssh server does not
check whether a port is already in use when it creates the DISPLAY
setting for a new connection.
In this case, it should have noticed that ports 6017 and 6018 are
already in use and should have announced a "localhost:19" DISPLAY
setting to the next new X forwarding session (skipping the unusable
"localhost:17" and "..:18").
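
The check I have in mind could look roughly like the following sketch. It
is not how sshd is implemented; the function names, the base port 6000 and
the search limit are assumptions for illustration. A display number is
accepted only if its port can be bound on every available loopback address.

```python
import errno
import socket

LOOPBACKS = ((socket.AF_INET, "127.0.0.1"), (socket.AF_INET6, "::1"))


def display_is_usable(display, base_port=6000):
    """Return True if base_port + display can be bound on all loopbacks."""
    port = base_port + display
    if not 0 < port <= 65535:
        return False
    for family, addr in LOOPBACKS:
        try:
            sock = socket.socket(family, socket.SOCK_STREAM)
        except OSError:
            continue  # address family not supported on this host
        try:
            sock.bind((addr, port))
        except OSError as exc:
            if exc.errno == errno.EADDRNOTAVAIL:
                continue  # loopback address not configured for this family
            return False  # port taken (or not permitted): reject the display
        finally:
            sock.close()
    return True


def first_free_display(offset=10, limit=1000):
    """Return the first usable display number at or above offset, or None."""
    for display in range(offset, offset + limit):
        if display_is_usable(display):
            return display
    return None


print("first usable display:", first_free_display())
```

Note that this only probes and closes the sockets again, so another process
could grab the port in between. A real server would have to keep the
successfully bound sockets and listen on them.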


How to reproduce the problem (using netcat):
1) set up an ssh server with X forwarding enabled
2) check open X forwarded sessions with "netstat -lpn | grep ':60'"
3) run "netcat -l -p 6010" (use the lowest free port number greater
than or equal to 6010) - this blocks that specific port
4) connect to the server and run an X program, e.g.: "ssh -X $HOST xeyes"

Result: sshd cannot use the (blocked) port, so the client cannot run X
programs.


Is there a workaround that tells the server not to use ports within a
specific range (in this case maybe 6000-6100) for local X connections?
Maybe it would be better if the dynamic port numbers assigned to new
processes were kept far away from the port range that the ssh daemon
needs for new connections?
Or are there any other solutions?
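
On Linux the range for dynamically assigned source ports can be read from
/proc/sys/net/ipv4/ip_local_port_range. The following sketch checks whether
that range overlaps the ports needed for X forwarding. The 6010-6100 range
and the helper names are assumptions for illustration, not values taken
from sshd.

```python
X11_FORWARD_PORTS = (6010, 6100)  # assumed range, see the question above


def parse_port_range(text):
    """Parse the two whitespace-separated integers of ip_local_port_range."""
    low, high = (int(field) for field in text.split())
    return low, high


def ranges_overlap(a, b):
    """Return True if the inclusive ranges a and b share at least one port."""
    return a[0] <= b[1] and b[0] <= a[1]


def ephemeral_range_collides(text, x11=X11_FORWARD_PORTS):
    """Return True if the kernel's dynamic port range overlaps the X11 range."""
    return ranges_overlap(parse_port_range(text), x11)


if __name__ == "__main__":
    try:
        with open("/proc/sys/net/ipv4/ip_local_port_range") as handle:
            current = handle.read()
    except OSError:
        print("ip_local_port_range is not readable on this system")
    else:
        print("dynamic range:", parse_port_range(current))
        print("collides with X11 ports:", ephemeral_range_collides(current))
```

If the two ranges overlap, moving the dynamic range upwards via the sysctl
net.ipv4.ip_local_port_range might be one way to keep the two apart.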

Thanks for your hard work,
Lars
_______________________________________________
openssh-unix-dev mailing list
openssh-unix-dev@mindrot.org
http://lists.mindrot.org/mailman/lis...enssh-unix-dev