I have just set up MPPP (Multilink PPP) over two ADSL lines using the
Roaring Penguin RP-PPPoE client 3.5 [1] on Linux 2.4.29, together with
the latest version of PPPD [2]. On the ISP's side of the PPPoE link
there is a Cisco system that has been configured to support MPPP clients.

So far, the whole thing works quite well: I've got one logical PPP
device whose bandwidth is nearly the sum of the two ADSL lines'
bandwidths.

One small problem remains: the "adsl-start" wrapper script that comes
with the RP-PPPoE client dies when setting up the second PPP daemon (it
runs into a timeout) because it doesn't recognize that daemon correctly.

My setup is as follows:

- I've got two ADSL modems: the first is connected to the Linux
box's "eth2", the second to "eth3".

- I've got two config files for RP-PPPoE:

/etc/ppp/pppoe-eth2.conf [3]
/etc/ppp/pppoe-eth3.conf [4]

- The first one is configured to use "eth2", the second one to use
"eth3". In both config files I've added "multilink" as an extra
option for PPPD.

- I establish the connection in the following way:

- adsl-start /etc/ppp/pppoe-eth2.conf
("ppp0" comes up, no MPPP so far)

- adsl-start /etc/ppp/pppoe-eth3.conf
(PPPD recognizes that "ppp0" is up and that we want to have
MPPP (and the other side supports it). It then adds this
PPPoE connection to "ppp0").
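
To make the configuration concrete, here is a minimal sketch of the two
settings described above. It shows only the interface and the extra PPPD
option; ETH and PPPD_EXTRA are the variable names RP-PPPoE uses in its
config files, and all other settings in [3] and [4] are omitted here.

```sh
# --- /etc/ppp/pppoe-eth2.conf (excerpt, other settings omitted) ---
ETH=eth2
PPPD_EXTRA="multilink"

# --- /etc/ppp/pppoe-eth3.conf (excerpt, other settings omitted) ---
ETH=eth3
PPPD_EXTRA="multilink"
```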

But although the bundle comes up properly, the second "adsl-start"
wrapper runs into a timeout and dies (the second PPPD stays up,
though). So only the first PPPD is monitored by a wrapper script...

Now here is my problem: our MPPP bundle runs without any trouble until
the first disconnection occurs. A disconnection is guaranteed to happen
after 24 hours (our ADSL carrier, T-Com, terminates every PPPoE
connection after 24 hours), but of course it may also happen earlier,
for example due to maintenance on the carrier's or ISP's side or due to
a link failure.

In my situation this means that after a disconnection only the
monitored PPPD (i.e. the first one) will come up again, so there
will be only half of the bandwidth then...

My plan is now to try patching the Roaring Penguin scripts (i.e.
"adsl-start" [5], "adsl-connect" [6] and "adsl-status" [7]) so that
they also recognize and monitor bundled connections...

I've found out that "adsl-status" (its task is to check whether a
specific PPP connection is up; it is called from "adsl-connect")
is the "problem", specifically the following lines of the script:

| # Sigh. Some versions of pppd put PID files in /var/run; others put them
| # in /etc/ppp. Since it's too messy to figure out what pppd does, we
| # try both locations.
| for i in /etc/ppp/ppp*.pid /var/run/ppp*.pid ; do
|     if [ -r $i ] ; then
|         PID=`cat $i`
|         if [ "$PID" = "$PPPD_PID" ] ; then
|             IF=`basename $i .pid`
|             netstat -rn | grep " ${IF}\$" > /dev/null
|             # /sbin/ifconfig $IF | grep "UP.*POINTOPOINT" > /dev/null
|             if [ "$?" != "0" ] ; then
|                 echo "adsl-status: Link is attached to $IF, but $IF is down"
|                 exit 1
|             fi
|             echo "adsl-status: Link is up and running on interface $IF"
|             /sbin/ifconfig $IF
|             exit 0
|         fi
|     fi
| done
| echo "adsl-status: Link is down -- could not find interface corresponding to"
| echo "pppd pid $PPPD_PID"
| exit 1

The script collects all "/etc/ppp/ppp*.pid" and "/var/run/ppp*.pid"
files, then checks whether the PID stored within each of these files
equals $PPPD_PID (which contains the PID of the PPPD we want to monitor).
Once it has found "its" PPPD's PID file, it extracts the interface's
name from the PID file's name (IF=`basename $i .pid`) and checks
(using "netstat -rn") whether the interface is up.
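
The interface test itself can be isolated as a small helper, which makes
it easier to see what it does and does not prove. This is only a sketch:
"iface_has_route" is a hypothetical name, and it uses the same grep
pattern as the quoted script, reading the routing table from stdin.

```sh
#!/bin/sh
# iface_has_route IFNAME
# Succeeds if some line of the routing table on stdin ends in
# " IFNAME", i.e. the interface appears in the last column.
# Same pattern as in adsl-status: grep " ${IF}\$".
iface_has_route() {
    grep " $1\$" > /dev/null
}

# Intended use (not run here):
#   netstat -rn | iface_has_route ppp0 && echo "ppp0 has a route"
```

Note that in an MPPP bundle both PPPDs serve the same "ppp0", so this
test alone cannot tell whether one link or both links are up.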

That's the problem: the first PPPD creates a file called "ppp0.pid",
but the second one (which also serves the interface "ppp0") doesn't
create a PID file (I suppose because "ppp0.pid" already exists), so
the code quoted above from "adsl-status" doesn't find the
corresponding PID file and dies.

I tried the PPPD option "linkname": I added the option
"linkname conn1" to "PPPD_EXTRA" in [3] and the option
"linkname conn2" to "PPPD_EXTRA" in [4].

This leads to the following observation: the first PPPD creates
a file called "ppp-conn1.pid" (within "/var/run/") that contains
not only its PID but also (in the second line of the file) the
name of the device it serves ("ppp0"). That makes sense, because
the device's name can no longer be reconstructed from the PID file's
name, as it could before (without the "linkname" option).

The second PPPD creates a file called "ppp-conn2.pid" (also within
"/var/run/", of course), but this file contains only its PID, not
the name of the interface (which would also be "ppp0" in this case).

At first I thought I could modify the part of "adsl-status" quoted
above so that it uses the PID files that are generated when using
the "linkname" option (e.g. "/var/run/ppp-conn1.pid" and
"/var/run/ppp-conn2.pid"). But I would need a workaround, as the PID
file belonging to the second PPPD doesn't contain the name of the
interface.
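
A minimal sketch of such a workaround, assuming the file layout I
observed (PID on the first line, interface name on an optional second
line). The helper name "read_linkname_pidfile" and the idea of passing
a fallback interface name are my own assumptions, not something PPPD
or RP-PPPoE provides.

```sh
#!/bin/sh
# read_linkname_pidfile FILE [FALLBACK_IF]
# Prints "PID IFNAME" for a linkname-style PID file.
# Line 1 of FILE is the pppd PID; line 2, when present, is the
# interface name. If line 2 is missing (as seen for the second
# link of the bundle), FALLBACK_IF is printed instead.
read_linkname_pidfile() {
    file=$1
    fallback=${2:-}
    [ -r "$file" ] || return 1
    pid=$(sed -n '1p' "$file")
    ifname=$(sed -n '2p' "$file")
    [ -n "$pid" ] || return 1
    [ -n "$ifname" ] || ifname=$fallback
    echo "$pid $ifname"
}

# Intended use (not run here):
#   read_linkname_pidfile /var/run/ppp-conn2.pid ppp0
```

The fallback has to come from somewhere, for instance from the PID file
of the first link, which is exactly the kind of special-casing I would
like to keep as small as possible.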

Another (perhaps the best?) solution would be to access the trivial
database "/var/run/pppd2.tdb" that contains detailed information
about the MPPP bundle.
BTW, is there any documentation of the structure of this "pppd2.tdb"
(especially assurances regarding the keys that PPPD-2.4.3 stores)?

Now, my questions are:

- In your opinion, what is the best way to handle this situation? How
would you handle it (I'm not looking for a dirty hack, but for a
reasonably stable solution)? How do I check whether both PPPDs are up
and running, so that the wrapper can stay active and restart the PPPDs
if one (or both) of them dies and/or the connection is lost?

I would like to do as little specialized coding as possible, so that I
can offer an MPPP patch (to PPPD or, better, only to RP-PPPoE) to the
public that neither restricts the user (e.g. regarding device names
("ppp1" instead of "ppp0") or the number of connections belonging to
a bundle) nor makes configuration complicated.

It would be very good if the patched version also ran well on a
non-MPPP system and on systems with more than one PPP interface
(e.g. if someone wants to do route-based load balancing over two or
more xDSL lines because the ISP doesn't support MPPP on its side).
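
To illustrate the kind of check I have in mind, here is a minimal
sketch that tests every link's PPPD by PID rather than by interface
name. It assumes one PID file per link (as produced by "linkname");
"all_links_alive" is a hypothetical helper name.

```sh
#!/bin/sh
# all_links_alive PIDFILE...
# Returns 0 only if every PID file is readable and its first line
# names an existing process. "kill -0" sends no signal; it only
# checks that the process exists and that we may signal it.
all_links_alive() {
    for f in "$@"; do
        [ -r "$f" ] || return 1
        pid=$(sed -n '1p' "$f")
        [ -n "$pid" ] || return 1
        kill -0 "$pid" 2>/dev/null || return 1
    done
    return 0
}

# Intended use (not run here):
#   all_links_alive /var/run/ppp-conn1.pid /var/run/ppp-conn2.pid \
#       || echo "at least one link is down"
```

This works for any number of links and does not depend on device names,
but it only shows that the processes exist. It does not prove that a
link has actually joined the bundle, and a stale PID file could point at
an unrelated process that reused the PID.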

- Could this lead to a race condition (for example, if both lines die
at almost the same time)?
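
One way I could imagine guarding against this is to serialize the
restart logic with a lock, so that two wrappers never tear down or
rebuild the bundle at the same time. The following is only a sketch of
that idea: "with_lock", the lock directory path and the return value 75
are arbitrary choices of mine.

```sh
#!/bin/sh
# with_lock LOCKDIR COMMAND [ARGS...]
# Runs COMMAND only if LOCKDIR can be created. mkdir is atomic, so
# of two concurrent callers exactly one succeeds. Returns 75 if the
# lock is already held, otherwise COMMAND's exit status.
with_lock() {
    lockdir=$1
    shift
    mkdir "$lockdir" 2>/dev/null || return 75
    "$@"
    status=$?
    rmdir "$lockdir"
    return $status
}

# Intended use (not run here; the lock path is an example only):
#   with_lock /var/run/mppp-restart.lock adsl-start /etc/ppp/pppoe-eth3.conf
```

A wrapper that is killed while holding the lock leaves a stale
directory behind, so a real patch would also need cleanup (e.g. a trap)
or a staleness check.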

Any help is greatly appreciated!


[1] http://www.roaringpenguin.com/pengui...e_rp-pppoe.php
[2] http://ppp.samba.org/

[3] http://www.steffen-moser.de/temp/ml/...ppoe-eth2.conf
[4] http://www.steffen-moser.de/temp/ml/...ppoe-eth3.conf

[5] http://www.steffen-moser.de/temp/ml/..._01/adsl-start
[6] http://www.steffen-moser.de/temp/ml/...1/adsl-connect
[7] http://www.steffen-moser.de/temp/ml/...01/adsl-status