pppd locking up randomly after hours of use in Linux - PPP

  1. pppd locking up randomly after hours of use in Linux

    I am currently working on a project that uses multiple cellular datacards in
    a mobile environment. Specifically, we have a PC104 Pentium 3 stack that
    has 8 Sierra Wireless PCMCIA Aircards. 4 of the Aircards are model 550s
    through Sprint, and 4 of the Aircards are model 555s through Verizon. We
    are multiplexing real time video over all 8 of these cellular devices using
    custom software which is irrelevant to the problem I am having.

    Basically, all 8 Aircards are connected to the cellular network using pppd
    with a simple chat script provided by Sierra Wireless from their
    unsupported Linux section of their web site. The pppd uses persist to
    maintain the connection. Because the environment is mobile and the testing
    location is rural, the cellular connectivity will disconnect and reconnect
    relatively often.

    The problem is that after a while if the cellular datacards are out of
    coverage, the pppd process will freeze. Even after returning to an area
    with coverage, the frozen pppd process will never redial the modem, while
    the other pppd processes redial and reconnect just fine. This does not
    happen all the time, but over the course of a day, it is possible for 3 or
    4 of the 8 pppd processes to stop responding. We are using a 2.4.20 kernel
    with pppd 2.4.1.

    I have identified 2 possible solutions, but would like more feedback from
    the experts. First, I plan to add "AT&D2&C1" to the chat script, on the
    theory that our datacards are randomly failing to switch to command mode.
    If the chat script changes do not work, I plan to incorporate Clifford
    Kite's patch.

    pppd options:

    -detach
    /dev/modem1
    persist
    maxfail 0
    lcp-echo-interval 3
    lcp-echo-failure 3
    debug
    usepeerdns
    user blah
    show-password
    crtscts
    lock
    connect '/usr/sbin/chat -v -t3 -f /etc/ppp/peers/ac550chat'
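
    For reference, the two lcp-echo options above set how quickly pppd
    notices a dead link: one echo-request is sent every lcp-echo-interval
    seconds, and the link is declared down after lcp-echo-failure unanswered
    requests, so roughly 3 x 3 = 9 seconds with these settings. A minimal
    shell sketch of that arithmetic follows; the helper name echo_window is
    hypothetical and not part of pppd.

```shell
#!/bin/sh
# echo_window FILE
# Print lcp-echo-interval * lcp-echo-failure (in seconds) for a pppd
# options file with one option per line. Illustrative helper only.
echo_window() {
    awk '
        $1 == "lcp-echo-interval" { interval = $2 }
        $1 == "lcp-echo-failure"  { failure  = $2 }
        END {
            # Fail if either option is missing from the file.
            if (interval == "" || failure == "") exit 1
            print interval * failure
        }
    ' "$1"
}
```

    Called on a file holding the options listed above, it should print the
    approximate number of seconds before "No response to 3 echo-requests"
    appears.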


    chat script:

    '' AT
    OK ATD#777
    CONNECT ''
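
    A hedged sketch of what the planned chat script change might look like.
    ABORT and TIMEOUT are standard chat directives, but the particular abort
    strings, the 10 second timeout, and whether the Aircards honor &D2 and
    &C1 at all are assumptions that have not been tested on this hardware.
    The helper name write_chat_script is hypothetical. Note that a TIMEOUT
    line in the script replaces the 3 seconds given with -t3 on the command
    line.

```shell
#!/bin/sh
# write_chat_script PATH
# Write a chat script that gives up quickly on common modem error strings
# and sends AT&D2&C1 before dialing. All values are illustrative.
write_chat_script() {
    cat > "$1" <<'EOF'
ABORT 'NO CARRIER'
ABORT 'NO DIALTONE'
ABORT 'BUSY'
ABORT 'ERROR'
TIMEOUT 10
'' AT
OK AT&D2&C1
OK ATD#777
CONNECT ''
EOF
}
```

    With chat -v, each ABORT match is logged, which should at least show
    which step of the dialog fails when the connect script gives up.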


    pppd debug output:

    Serial connection established.
    using channel 62
    Using interface ppp3
    Connect: ppp3 <--> /dev/modem1
    rcvd [LCP ConfReq id=0x2d ]
    sent [LCP ConfReq id=0xa ]
    sent [LCP ConfAck id=0x2d ]
    rcvd [LCP ConfAck id=0xa ]
    sent [LCP EchoReq id=0x0 magic=0xe80562be]
    sent [IPCP ConfReq id=0x7 0.0.0.0> ]
    sent [CCP ConfReq id=0x4 ]
    rcvd [LCP DiscReq id=0x2e magic=0x90287b]
    rcvd [LCP EchoRep id=0x0 magic=0x90287b e8 05 62 be]
    rcvd [IPCP ConfReq id=0x2f ]
    sent [IPCP ConfAck id=0x2f ]
    rcvd [LCP ProtRej id=0x30 80 fd 01 04 00 0c 1a 04 78 00 18 04 78 00]
    rcvd [IPCP ConfNak id=0x7
    ]
    sent [IPCP ConfReq id=0x8 68.28.186.11> ]
    rcvd [IPCP ConfAck id=0x8 68.28.186.11> ]
    local IP address 68.240.48.222
    remote IP address 68.28.160.192
    primary DNS address 68.28.186.11
    secondary DNS address 68.28.178.11
    Script /etc/ppp/ip-up started (pid 30298)
    Script /etc/ppp/ip-up finished (pid 30298), status = 0x0
    sent [LCP EchoReq id=0x1 magic=0xe80562be]
    rcvd [LCP EchoRep id=0x1 magic=0x90287b]
    sent [LCP EchoReq id=0x2 magic=0xe80562be]
    rcvd [LCP EchoRep id=0x2 magic=0x90287b]
    sent [LCP EchoReq id=0x3 magic=0xe80562be]
    rcvd [LCP EchoRep id=0x3 magic=0x90287b]
    sent [LCP EchoReq id=0x4 magic=0xe80562be]
    rcvd [LCP EchoRep id=0x4 magic=0x90287b]
    sent [LCP EchoReq id=0x5 magic=0xe80562be]
    rcvd [LCP EchoRep id=0x5 magic=0x90287b]
    sent [LCP EchoReq id=0x6 magic=0xe80562be]
    rcvd [LCP EchoRep id=0x6 magic=0x90287b]
    sent [LCP EchoReq id=0x7 magic=0xe80562be]
    rcvd [LCP EchoRep id=0x7 magic=0x90287b]
    sent [LCP EchoReq id=0x8 magic=0xe80562be]
    rcvd [LCP EchoRep id=0x8 magic=0x90287b]
    sent [LCP EchoReq id=0x9 magic=0xe80562be]
    rcvd [LCP EchoRep id=0x9 magic=0x90287b]
    sent [LCP EchoReq id=0xa magic=0xe80562be]
    sent [LCP EchoReq id=0xb magic=0xe80562be]
    rcvd [LCP EchoRep id=0xa magic=0x90287b]
    sent [LCP EchoReq id=0xc magic=0xe80562be]
    sent [LCP EchoReq id=0xd magic=0xe80562be]
    sent [LCP EchoReq id=0xe magic=0xe80562be]
    rcvd [LCP EchoRep id=0xb magic=0x90287b]
    rcvd [LCP EchoRep id=0xc magic=0x90287b]
    sent [LCP EchoReq id=0xf magic=0xe80562be]
    sent [LCP EchoReq id=0x10 magic=0xe80562be]
    rcvd [LCP EchoRep id=0xd magic=0x90287b]
    rcvd [LCP EchoRep id=0xe magic=0x90287b]
    sent [LCP EchoReq id=0x11 magic=0xe80562be]
    sent [LCP EchoReq id=0x12 magic=0xe80562be]
    sent [LCP EchoReq id=0x13 magic=0xe80562be]
    No response to 3 echo-requests
    Serial link appears to be disconnected.
    Script /etc/ppp/ip-down started (pid 30536)
    sent [LCP TermReq id=0xb "Peer not responding"]
    Script /etc/ppp/ip-down finished (pid 30536), status = 0x0
    sent [LCP TermReq id=0xc "Peer not responding"]
    Connection terminated.
    Connect time 1.1 minutes.
    Sent 92843 bytes, received 379 bytes.
    Connect script failed
    Connect script failed
    Connect script failed
    Connect script failed
    Connect script failed
    Connect script failed
    Connect script failed
    Connect script failed
    Connect script failed

    As you can see above, the pppd process tried to reconnect multiple times,
    but eventually quit responding.
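
    One way to tell a frozen pppd from one that is still redialing: with
    debug and lcp-echo-interval 3, a healthy pppd writes something every few
    seconds, either echo traffic or "Connect script failed". The sketch
    below flags a log that has gone quiet. It assumes each pppd's output is
    captured in its own file and that GNU stat is available; the function
    name and the log path in the usage note are hypothetical.

```shell
#!/bin/sh
# log_is_stale FILE MAX_AGE
# Succeed (exit 0) if FILE was last modified more than MAX_AGE seconds ago.
# Returns 2 if FILE cannot be examined. Relies on GNU stat (-c %Y).
log_is_stale() {
    now=$(date +%s)
    mtime=$(stat -c %Y "$1") || return 2
    [ $((now - mtime)) -gt "$2" ]
}
```

    For example, "log_is_stale /var/log/ppp-modem1.log 60 && echo frozen"
    could be run from cron; what to do with a frozen process (kill and
    restart, for instance) is left open here.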

    Any help would be greatly appreciated.

  2. Re: pppd locking up randomly after hours of use in Linux

    Larry Goats wrote:
    > I have identified 2 possible solutions, but would like more feedback from
    > the experts. I plan to add "AT&D2&C1" to the chat script hoping that our
    > datacards are randomly failing to switch to a command mode. Lastly, I plan
    > to incorporate Clifford Kite's patch if the chat script changes do not
    > work.


    I don't know that the patch will help you. It was generated because
    at the time of the first patch the stty program lacked the -F option.
    So if pppd didn't restore the clocal terminal line setting then stty
    failed to show the terminal line settings (standard input/output had
    to be used). Pppd itself had no problem reusing the line without
    clocal when started from scratch.
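
    With a stty that does have -F, the clocal setting can be checked on the
    device directly. A small sketch, assuming GNU stty output in which a
    disabled flag is printed with a leading "-"; the function name
    clocal_is_set is made up for illustration.

```shell
#!/bin/sh
# clocal_is_set
# Read "stty -a" output on stdin; succeed if clocal is enabled.
# A disabled flag appears as "-clocal", which this pattern does not match
# because the word must be preceded by whitespace or start of line.
clocal_is_set() {
    grep -Eq '(^|[[:space:]])clocal([[:space:]]|$)'
}
```

    Usage would be along the lines of
    "stty -F /dev/modem1 -a | clocal_is_set && echo clocal is on".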

    Since the problem occurs when you lose the "physical connection,"
    as a test I tried the nearest thing I can do to imitate that, which
    was to connect to my ISP using "persist" and "maxfail 0" and then
    unplug the modem from the wall jack. Pppd tried to connect 15 times
    before I plugged the modem back in, and then was able to reconnected
    on the first try.

    You showed 9 attempts to reconnect, but I don't know whether that was
    typical or not. The S/N ratio will degrade slowly when a wireless
    looses the connection, and it's remotely possible that that may make
    a difference. Plus what you are doing is, IMHO, rather unusual.

    The bottom line is that I know very little about cellular technology
    and can't say what's going wrong. But I have come to believe, from
    reading posts to this newsgroup, that some of the PPP implementations
    used in connection with that technology do strange things.

    FWIW, using minicom to configure my modem with the profile I use
    with PPP shows &C1 &D1 %E1. Considering the divergence from the
    old Hayes standard, this may be less than worthless.

    --
    Clifford Kite Email: "echo xvgr_yvahk-ccc@ri1.arg|rot13"
    PPP-Q&A links, downloads: http://ckite.no-ip.net/
    /* Better is the enemy of good enough. */

  3. Re: pppd locking up randomly after hours of use in Linux

    Clifford Kite wrote:

    > Since the problem occurs when you lose the "physical connection,"
    > as a test I tried the nearest thing I can do to imitate that, which
    > was to connect to my ISP using "persist" and "maxfail 0" and then
    > unplug the modem from the wall jack. Pppd tried to connect 15 times
    > before I plugged the modem back in, and then was able to reconnected
    > on the first try.


    I forgot to add that I did this for both the modified pppd and the
    unmodified pppd. No difference, except that the modified pppd did
    revert to the original ttyS1 configuration before trying to reconnect
    while the unmodified one did not.

    > You showed 9 attempts to reconnect, but I don't know whether that was
    > typical or not. The S/N ratio will degrade slowly when a wireless
    > looses the connection, and it's remotely possible that that may make
    > a difference. Plus what you are doing is, IMHO, rather unusual.


    Gak! It's really disconcerting to see editing-for-clarity-errors
    (reconnected) as well as plain typos (looses) in my posts that should
    have been caught prior to posting. :/

    --
    Clifford Kite Email: "echo xvgr_yvahk-ccc@ri1.arg|rot13"
    PPP-Q&A links, downloads: http://ckite.no-ip.net/
    /* 97.3% of all statistics are made up. */

  4. Re: pppd locking up randomly after hours of use in Linux

    Larry Goats writes:

    ]As you can see above, the pppd process tried to reconnect multiple times,
    ]but eventually quit responding.

    ]Any help would be greatly appreciated.

    There are many reasons why that connect script could fail: the remote end
    never answers, the cell phone stops working, the port freezes up, etc.
    There is not enough info here.

    You should also have chat script reporting, i.e. chat should be run with
    the -v option, and you should have syslog steer local2 somewhere (e.g. the
    same file the ppp debug output is going to).

    I.e., have the line
    local2.*;daemon.* /var/log/ppplog
    in /etc/syslog.conf and then do
    killall -1 syslogd
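
    The same step as a small shell sketch that can be run more than once
    without duplicating the line. The function name ensure_ppp_logging is
    hypothetical; a tab is used as the separator because some older syslogd
    implementations insist on it.

```shell
#!/bin/sh
# ensure_ppp_logging CONF
# Append the local2/daemon selector to CONF unless it is already there.
ensure_ppp_logging() {
    conf="$1"
    # -F treats the selector as a fixed string, so ".*" is matched literally.
    if ! grep -qF 'local2.*;daemon.*' "$conf" 2>/dev/null; then
        printf 'local2.*;daemon.*\t/var/log/ppplog\n' >> "$conf"
    fi
}
```

    As root: "ensure_ppp_logging /etc/syslog.conf && killall -1 syslogd".
    Signal 1 is SIGHUP, which makes syslogd reread its configuration.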

    It looks to me like pppd is NOT the problem. Rather the chat script is
    failing due to one of the above reasons.


    Note that pppd can sometimes leave a serial port in a weird state.
    Adapting a script of Carlson's, I have a serial port wakeup/resetting Perl
    program at
    www.theory.physics.ubc.ca/modem-chk.html

    You could try running it in the ip-down script to reset the port (if the
    cell cards operate via a serial port).
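
    As a rough illustration of the ip-down idea (this is NOT the modem-chk
    program, just a stand-in using stty): pppd passes the tty device to
    /etc/ppp/ip-down as the second argument, so a hook could reset the line
    there. The function name reset_port, the 115200 speed, and the choice of
    settings are assumptions; setting speed 0 asks the driver to hang up,
    i.e. drop DTR.

```shell
#!/bin/sh
# STTY can be overridden (e.g. STTY=echo) to see what would be run.
STTY=${STTY:-stty}

# reset_port TTY
# Restore sane line settings, hang up, then set speed and flow control.
reset_port() {
    tty="$1"
    # Accept either "modem1" or "/dev/modem1".
    case "$tty" in
        /dev/*) ;;
        *) tty="/dev/$tty" ;;
    esac
    "$STTY" -F "$tty" sane || return 1
    "$STTY" -F "$tty" 0 || return 1       # speed 0: hang up the line
    sleep 1                               # give the card a moment
    "$STTY" -F "$tty" 115200 crtscts      # assumed speed, hardware flow control
}

# In /etc/ppp/ip-down one might then call:
#   [ -n "$2" ] && reset_port "$2"
```

    Whether this helps depends on whether the Aircards react to DTR at all,
    which ties back to the &D setting discussed earlier in the thread.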


