queued jobs not going active - Veritas Net Backup



Thread: queued jobs not going active

  1. queued jobs not going active


    We are running NetBackup 5.1 MP6 with 5 media servers. We back up about
    370 clients a night, and all the backups go to disk. What happens is that
    about 500 jobs end up queued and never start. My workaround has been to
    kill the jobs and restart them, after which they seem to work okay. I've
    had this problem before and adjusted how often bpsched checks for more
    jobs to run, or at least I think that is what I did: Wakeup Interval is
    set to 45 minutes. I have also set SLAVE_CONNECT_TIMEOUT = 180. I do not
    see anything in the logs to suggest that the scheduler is stepping on
    itself. Maximum jobs per client is set to 100. I can't really find
    anything on the Veritas support site. The one thing I did notice is that
    this didn't start happening until around the time I installed MP6, though
    I'm not sure whether that has something to do with it. The environment
    basically runs 24/7 because of the duplications that are going on. I'm
    just not sure what the problem is. I have thought about backing the MP6
    patch out of the master and all media servers. Does anyone have any
    thoughts or suggestions?

  2. Re: queued jobs not going active


    I have seen backups either not being queued or remaining stuck in the
    queued state when processes from previous jobs did not terminate after
    those jobs completed. Since bprd and bpsched spawn child processes for
    each job they run, I would start there. When the system is idle, look for
    excess bpsched or bprd processes that were not cleaned up after previous
    jobs completed. This has happened to me on more than one occasion with
    NetBackup 5.1; each time, I killed all NetBackup processes and restarted
    them, and the issue cleared up.
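
    The check for leftover processes can be scripted. What follows is only a
    minimal sketch, not a NetBackup tool: it assumes you can capture the text
    of `ps -eo pid,ppid,comm` on the master server, and the function name
    count_watched and the sample listing are made up for illustration. It
    counts how many bpsched and bprd processes appear in that listing so the
    numbers can be compared between an idle period and a busy one.

```python
# Daemon names taken from the post above; adjust for your own environment.
WATCHED = ("bpsched", "bprd")


def count_watched(ps_output, watched=WATCHED):
    """Count processes per watched name in `ps -eo pid,ppid,comm` text."""
    counts = {name: 0 for name in watched}
    for line in ps_output.splitlines()[1:]:  # skip the header row
        fields = line.split(None, 2)
        if len(fields) < 3:
            continue  # ignore blank or malformed lines
        # comm can be a full path on some systems, so keep the basename only
        command = fields[2].strip().rsplit("/", 1)[-1]
        if command in counts:
            counts[command] += 1
    return counts


# Hypothetical listing, invented for this example (not real server output).
sample = """  PID  PPID COMMAND
  101     1 bprd
  202     1 bpsched
  303   202 bpsched
  404   202 bpsched
  505     1 sshd
"""

if __name__ == "__main__":
    print(count_watched(sample))
```

    On a live system the text could come from
    subprocess.run(["ps", "-eo", "pid,ppid,comm"], capture_output=True,
    text=True).stdout. How many bpsched or bprd processes count as "excess"
    depends on the environment, so the script only reports the counts and
    leaves the judgment to you.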




  3. Re: queued jobs not going active


    Hi there Jesse,

    We have had this problem, and we found that a large number of jobs
    starting at the same time is what causes it. On the master server you can
    set the Wakeup Interval global attribute. You are probably submitting too
    many jobs at the same time, so the next wakeup interval comes around
    before the scheduler has finished processing the previous one. As a
    result, no jobs run at all.

    Only a job that you submit manually while no backups are running will
    start right away. Try not to start all the jobs at the same time: look at
    the wakeup interval and use it to balance the policies better.
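
    One way to plan that balance is to spread the policy start times out in
    batches. The sketch below is only a planning aid under stated
    assumptions: it does not talk to NetBackup, the start windows would still
    have to be entered in each policy's schedule by hand, and the function
    name stagger_starts, the policy names, the batch size, and the 20:00
    window are all hypothetical. The 45-minute gap is the Wakeup Interval
    mentioned in the original post.

```python
from datetime import datetime, timedelta


def stagger_starts(policies, window_open, batch_size, gap_minutes):
    """Give each policy a start time, releasing batch_size policies per gap."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    schedule = {}
    for index, policy in enumerate(policies):
        # Policies in the same batch share one start time; each later batch
        # starts gap_minutes after the previous one.
        offset = timedelta(minutes=gap_minutes * (index // batch_size))
        schedule[policy] = window_open + offset
    return schedule


if __name__ == "__main__":
    # Hypothetical policy names and window, for illustration only.
    names = ["policy_%02d" % n for n in range(1, 7)]
    plan = stagger_starts(names, datetime(2000, 1, 1, 20, 0),
                          batch_size=2, gap_minutes=45)
    for name, start in plan.items():
        print(name, start.strftime("%H:%M"))
```

    With six policies in batches of two, the first pair would open at 20:00,
    the second at 20:45, and the third at 21:30. Whether a gap of one full
    wakeup interval is the right spacing is a judgment call for your own
    environment.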

    Cheers

    Ray




