problem with db2 commit on aix (lock and cuncurrency) - Aix
This is a discussion on problem with db2 commit on aix (lock and cuncurrency) - Aix ; Dear all,
I got a problem on db2 for aix running a high workload messaging
system with more than 5,000,000 of deliveries per day. During high
peak hours it happens frequently that the application,
the application threads, are all waiting ...
-
problem with db2 commit on aix (lock and cuncurrency)
Dear all,
I got a problem on db2 for aix running a high workload messaging
system with more than 5,000,000 of deliveries per day. During high
peak hours it happens frequently that the application,
the application threads, are all waiting for a commit to complete.
This seems a typical case of a
missing or locked shared resource. The lock anyway is not at row or
table level during an SQL operation, but it is at commit time. In
fact, different applications at the same time are affected by
this problem. The db2 is v8.1 fixpak 14 and aix is 5.3 fixpak 05 with
5 processor (power5+ 1.6Ghz)
and 64 GB of memory. While the commit hang the db2 machine has no I/O
wait (no disk usage),
plenty of memory and processing power and plenty of network resources
(that's why I suspect
a lock).
To complete the scenario, the databases (same hw, same sw) are two in
continuos replication. The apply programs are running at the target
database. MIN_COMMIT is 5
The question is which monitoring parameter should I check to
understand what happen to
the db. Is there any operating-system resource that I can check?
As last thing, why the rootvg disks are working a lot? iostat says
that they write 8/10 MB per
second... I suspect something related to authentication due to
continuos connect/disconnect of replication, but I don't understand
what it matters with commit.
Thanks in advance
Matteo Gelosa
-
Re: problem with db2 commit on aix (lock and cuncurrency)
I'm not sure I understand all that you are saying but I would get a
snapshot for locks and also look in the diag log to see if there are
lock escalations. Even though you have plenty of memory on the box if
you misconfigure your locklist and maxlocks parms you may still
encounter lock escalations. As far as the high i/o rates to rootvg ,
typically I see that when a system is paging since most systems have
their paging space in rootvg. It could be other obscure things like
maybe someone put database temp on rootvg or something like that.
On Jun 1, 12:24 pm, m.gel...@gmail.com wrote:
> Dear all,
>
> I got a problem on db2 for aix running a high workload messaging
> system with more than 5,000,000 of deliveries per day. During high
> peak hours it happens frequently that the application,
> the application threads, are all waiting for a commit to complete.
> This seems a typical case of a
> missing or locked shared resource. The lock anyway is not at row or
> table level during an SQL operation, but it is at commit time. In
> fact, different applications at the same time are affected by
> this problem. The db2 is v8.1 fixpak 14 and aix is 5.3 fixpak 05 with
> 5 processor (power5+ 1.6Ghz)
> and 64 GB of memory. While the commit hang the db2 machine has no I/O
> wait (no disk usage),
> plenty of memory and processing power and plenty of network resources
> (that's why I suspect
> a lock).
>
> To complete the scenario, the databases (same hw, same sw) are two in
> continuos replication. The apply programs are running at the target
> database. MIN_COMMIT is 5
>
> The question is which monitoring parameter should I check to
> understand what happen to
> the db. Is there any operating-system resource that I can check?
>
> As last thing, why the rootvg disks are working a lot? iostat says
> that they write 8/10 MB per
> second... I suspect something related to authentication due to
> continuos connect/disconnect of replication, but I don't understand
> what it matters with commit.
>
> Thanks in advance
> Matteo Gelosa
-
Re: problem with db2 commit on aix (lock and cuncurrency)
On Jun 1, 1:24 pm, m.gel...@gmail.com wrote:
> Dear all,
>
> I got a problem on db2 for aix running a high workload messaging
> system with more than 5,000,000 of deliveries per day. During high
> peak hours it happens frequently that the application,
> the application threads, are all waiting for a commit to complete.
> This seems a typical case of a
> missing or locked shared resource. The lock anyway is not at row or
> table level during an SQL operation, but it is at commit time. In
> fact, different applications at the same time are affected by
> this problem. The db2 is v8.1 fixpak 14 and aix is 5.3 fixpak 05 with
> 5 processor (power5+ 1.6Ghz)
> and 64 GB of memory. While the commit hang the db2 machine has no I/O
> wait (no disk usage),
> plenty of memory and processing power and plenty of network resources
> (that's why I suspect
> a lock).
>
> To complete the scenario, the databases (same hw, same sw) are two in
> continuos replication. The apply programs are running at the target
> database. MIN_COMMIT is 5
>
> The question is which monitoring parameter should I check to
> understand what happen to
> the db. Is there any operating-system resource that I can check?
>
> As last thing, why the rootvg disks are working a lot? iostat says
> that they write 8/10 MB per
> second... I suspect something related to authentication due to
> continuos connect/disconnect of replication, but I don't understand
> what it matters with commit.
>
> Thanks in advance
> Matteo Gelosa
Are you by chance running SDD version 1.6.1.2 or 1.6.2.0 on this
machine? There is a bug in these versions that caused the exact same
symptoms on our db2 instance. Our solution was to back level to
1.6.0.8 but i believe 1.6.2.1 has since been released to fix this
issue.
Paul
-
Re: problem with db2 commit on aix (lock and cuncurrency)
On 1 Giu, 18:44, Lew wrote:
Well, the db2diag.log is empty and doesn't report any kind of error.
The suspected lock doesn't seem to be at row or table level. What I'm
sure
of is that the client applications are executing the commit (the CLI
api) and they
are waiting for an answer from the database server.
About paging, the machine is not paging. I tried with filemon but I
couldn't find
any relevant info. Anyway, thanks for your answer. I will try with
snapshot.
About the snapshots I was wondering if there is any COMMIT related
parameter
I can explore in order to solve my load problem.
Tnx
Matteo
> I'm not sure I understand all that you are saying but I would get a
> snapshot for locks and also look in the diag log to see if there are
> lock escalations. Even though you have plenty of memory on the box if
> you misconfigure your locklist and maxlocks parms you may still
> encounter lock escalations. As far as the high i/o rates to rootvg ,
> typically I see that when a system is paging since most systems have
> their paging space in rootvg. It could be other obscure things like
> maybe someone put database temp on rootvg or something like that.
>
> On Jun 1, 12:24 pm, m.gel...@gmail.com wrote:
-
Re: problem with db2 commit on aix (lock and cuncurrency)
On 1 Giu, 19:10, stlh...@gmail.com wrote:
> On Jun 1, 1:24 pm, m.gel...@gmail.com wrote:
>
>
>
>
>
> > Dear all,
>
> > I got a problem on db2 foraixrunning a high workload messaging
> > system with more than 5,000,000 of deliveries per day. During high
> > peak hours it happens frequently that the application,
> > the application threads, are all waiting for acommitto complete.
> > This seems a typical case of a
> > missing or locked shared resource. The lock anyway is not at row or
> > table level during an SQL operation, but it is atcommittime. In
> > fact, different applications at the same time are affected by
> > this problem. The db2 is v8.1 fixpak 14 andaixis 5.3 fixpak 05 with
> > 5 processor (power5+ 1.6Ghz)
> > and 64 GB of memory. While thecommithang the db2 machine has no I/O
> > wait (no disk usage),
> > plenty of memory and processing power and plenty of network resources
> > (that's why I suspect
> > a lock).
>
> > To complete the scenario, the databases (same hw, same sw) are two in
> > continuos replication. The apply programs are running at the target
> > database. MIN_COMMIT is 5
>
> > The question is which monitoring parameter should I check to
> > understand what happen to
> > the db. Is there any operating-system resource that I can check?
>
> > As last thing, why the rootvg disks are working a lot? iostat says
> > that they write 8/10 MB per
> > second... I suspect something related to authentication due to
> > continuos connect/disconnect of replication, but I don't understand
> > what it matters withcommit.
>
> > Thanks in advance
> > Matteo Gelosa
>
> Are you by chance running SDD version 1.6.1.2 or 1.6.2.0 on this
> machine? There is a bug in these versions that caused the exact same
> symptoms on our db2 instance. Our solution was to back level to
> 1.6.0.8 but i believe 1.6.2.1 has since been released to fix this
> issue.
>
> Paul- Nascondi testo tra virgolette -
>
> - Mostra testo tra virgolette -
Unfortunately, we're not using SDD...
Tnx
Matteo
-
Re: problem with db2 commit on aix (lock and cuncurrency)
On 1 Giu, 18:44, Lew wrote:
To add something: the application status with db2 list application
says
"Commit Active" for around the 400 transactions blocked. The I/O load
of the
log related disks (with iostat) is none... That's why I suspected a
lock
in a different place than rows/tables.
Bye,
Matteo
> I'm not sure I understand all that you are saying but I would get a
> snapshot for locks and also look in the diag log to see if there are
> lock escalations. Even though you have plenty of memory on the box if
> you misconfigure your locklist and maxlocks parms you may still
> encounter lock escalations. As far as the high i/o rates to rootvg ,
> typically I see that when a system is paging since most systems have
> their paging space in rootvg. It could be other obscure things like
> maybe someone put database temp on rootvg or something like that.
-
Re: problem with db2 commit on aix (lock and cuncurrency)
m.gelosa@gmail.com wrote:
>
> To complete the scenario, the databases (same hw, same sw) are two in
> continuos replication. The apply programs are running at the target
> database. MIN_COMMIT is 5
Have you read the documentation on MINCOMMIT? Setting it to > 1 means
that transactions must wait for either MINCOMMIT other transactions to
issue a commit, OR 1 second has passed.
Even on very busy systems (> 200 trxn/s), MINCOMMIT=1 is often sufficient.
I would suggest tuning MINCOMMIT way down -- perhaps back to 1 to check
performance. *Maybe* increase it to 2, if absolutely necessary.
-
Re: problem with db2 commit on aix (lock and cuncurrency)
On 4 Giu, 19:09, Ian wrote:
I tried, but the behavior is the same. Anyway, I suspect something
related to replication. When the replication is down (no apply program
running) the db is working well.
As just as I turn on the replication, it starts to have the behavior I
mentioned, by blocking 400 tnxs in a Commit Active state. I googled
for "Commit Active" and I found one message of mine of some years ago
related to db2 replication hang in a "Commit Active" state, solved
with a fixpak downgrade. I don't want to say it's a db2 replication
bug, but I suspect something wrong in the configuration of this area.
Besides, the control database for replication resides is the source
database (so the replication control tables reside on it), that has a
much higher frequence of the hang I experience.
As last info I discovered, even the "get snapshot for locks on "
is blocked until the "commit active" finished,
while other snapshots (bufferpools, applications, ...) work.
Tnx,
Matteo
> m.gel...@gmail.com wrote:
>
> > To complete the scenario, the databases (same hw, same sw) are two in
> > continuos replication. The apply programs are running at the target
> > database. MIN_COMMIT is 5
>
> Have you read the documentation on MINCOMMIT? Setting it to > 1 means
> that transactions must wait for either MINCOMMIT other transactions to
> issue acommit, OR 1 second has passed.
>
> Even on very busy systems (> 200 trxn/s), MINCOMMIT=1 is often sufficient.
>
> I would suggest tuning MINCOMMIT way down -- perhaps back to 1 to check
> performance. *Maybe* increase it to 2, if absolutely necessary.