How accurate are AIX disk reporting metrics with a SAN? - Aix
-
How accurate are AIX disk reporting metrics with a SAN?
Are the disk performance and configuration data reported by AIX valid
for SAN storage configured as hypervolumes and presented to the
operating system as hdiskpower logical disks?
For example:
Does the data distribution reported by lslv accurately represent the
actual data distribution on the physical disks within the SAN?
Does the fact that hdisk17 has an in-band distribution of 6% indicate
that the data on the hypervolume has an in-band distribution of 6%?
Is "INTRA-POLICY" valid for logical disks in a SAN?
lslv -l xxxx_yyyy
xxxx_yyyy /myfilesystem.
PV        COPIES        IN BAND   DISTRIBUTION
hdisk17   030:000:000   6%        004:002:006:004:014
hdisk22   030:000:000   6%        004:002:006:004:014
hdisk31   272:000:000   19%       055:054:054:054:055
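As a rough illustration of what the IN BAND column measures, the sketch below recomputes it from the DISTRIBUTION column. It assumes the five fields are outer edge, outer middle, center, inner middle and inner edge, and that an INTRA-POLICY of "middle" refers to the outer-middle region; the function name and the truncating division are choices made for this example. The result only describes where partitions sit in the address space of the disk as AIX sees it, not where the array places the data.

```python
# Sketch: recompute IN BAND from the DISTRIBUTION column of `lslv -l`.
# Assumption: field order is outer edge, outer middle, center,
# inner middle, inner edge, and INTRA-POLICY "middle" = outer middle.
REGIONS = ("outer_edge", "outer_middle", "center", "inner_middle", "inner_edge")


def in_band_percent(distribution, policy_region="outer_middle"):
    """Percentage of the LV's partitions on this PV that lie in the policy region."""
    counts = [int(field) for field in distribution.split(":")]
    if len(counts) != len(REGIONS):
        raise ValueError("expected five colon-separated region counts")
    total = sum(counts)
    if total == 0:
        return 0
    in_region = counts[REGIONS.index(policy_region)]
    # Integer division truncates; this is an assumption that happens to
    # reproduce the percentages shown in the listing.
    return in_region * 100 // total


for pv, dist in [
    ("hdisk17", "004:002:006:004:014"),
    ("hdisk22", "004:002:006:004:014"),
    ("hdisk31", "055:054:054:054:055"),
]:
    print(pv, str(in_band_percent(dist)) + "%")
```

With the figures from the listing this gives 6% for hdisk17 and hdisk22 and 19% for hdisk31.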
If the logical volume INTER-POLICY is minimum, will the data be spread
out over the disks in the hypervolume or concentrated on a minimum
number of disks in the hypervolume?
If the disk statistics below show two disks as being busy all the time
and the rest idle, does that mean that the data in the hypervolume is
concentrated on a few spindles?
Disk      Busy%     KBPS    TPS  KB-Read  KB-Writ
hdisk25    79.5  14382.0  272.0  13940.0    442.0
hdisk31    37.0    448.5  123.5      6.5    442.0
hdisk17     1.5     18.0    1.5      0.0     18.0
hdisk102    0.5     66.0    5.5      0.0     66.0
hdisk38     0.5     18.0    1.5      0.0     18.0
hdisk36     0.5     18.0    1.5      0.0     18.0
hdisk37     0.0      0.0    0.0      0.0      0.0
hdisk3      0.0      0.0    0.0      0.0      0.0
hdisk19     0.0      0.0    0.0      0.0      0.0
hdisk18     0.0      0.0    0.0      0.0      0.0
What about "iostat -d 2 30"? Are the data transfer rates reported
valid?
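To put a number on how lopsided the disk listing above is, here is a minimal sketch that computes each disk's share of the total KBPS. The figures are copied from the listing; the variable and function names are made up for this example. It says nothing about placement inside the array, only about which of the disks seen by AIX carry the traffic.

```python
# Figures copied from the disk listing above:
# (disk, busy %, KB/s, transfers/s)
SAMPLE = [
    ("hdisk25", 79.5, 14382.0, 272.0),
    ("hdisk31", 37.0, 448.5, 123.5),
    ("hdisk17", 1.5, 18.0, 1.5),
    ("hdisk102", 0.5, 66.0, 5.5),
    ("hdisk38", 0.5, 18.0, 1.5),
    ("hdisk36", 0.5, 18.0, 1.5),
    ("hdisk37", 0.0, 0.0, 0.0),
    ("hdisk3", 0.0, 0.0, 0.0),
    ("hdisk19", 0.0, 0.0, 0.0),
    ("hdisk18", 0.0, 0.0, 0.0),
]


def throughput_share(rows):
    """Map each disk to its fraction of the summed KB/s."""
    total = sum(kbps for _, _, kbps, _ in rows)
    return {disk: (kbps / total if total else 0.0) for disk, _, kbps, _ in rows}


shares = throughput_share(SAMPLE)
for disk, share in sorted(shares.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{disk:9s} {share:6.1%}")
```

In this sample hdisk25 alone carries roughly 96% of the throughput.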
-
Re: How accurate are AIX disk reporting metrics with a SAN?
On Sep 10, 5:19 pm, Patrick Finnegan wrote:
> Are the disk performance and configuration data reported by AIX valid
> for SAN
> storage configured as hypervolumes and presented to the operating
> system as
> hdiskpower logical disks?
From the view of the OS: certainly. From the view of the storage
system (SAN): it depends.
1) Disk
If you pass the SAN disks to AIX as JBOD, the view from both sides
should be the same.
But as soon as you create disk volumes on the SAN and pass parts or
the whole of them to AIX, any AIX policy for the disk is IMHO useless.
Does an intra-policy on a RAID 5 make sense?
2) I/O
The I/O, whether IOPS or transfer speed, is measured from the AIX
point of view. If you have assigned parts of this array to another
node, you might suffer in terms of speed and IOPS when the other
node is using the same storage group.
> For example:
>
> Does the data distribution reported by lslv accurately represent the
> actual
> data distribution on the physical disks within the SAN?
Normally, no!
> Does the fact that hdisk17 has an inband distribution of 6 % indicate
> that the
> data on the hypervolume has an in band distribution of 6 %. Is "INTRA-
> POLICY"
> valid for logical disks in a SAN?
As I said: depending on the disk configuration within the SAN, any
disk policy is useless.
> If shows two disks as being busy all the time and the rest idle does
> that mean
> that data in the hypervolume is concentrated over a few spindles?
Judging from this information, you are using only 2 of the disks
provided by your SAN.
A simple reason might be that you are using JFS[2] without any
striping, so the load is not spread across all disks.
> What about "iostat -d 2 30"? Are the data transfer rates reported
> valid?
Yes.
No more from my crystal ball so far.
cheers
Hajo
-
Re: How accurate are AIX disk reporting metrics with a SAN?
On Sep 10, 6:34 pm, Hajo Ehlers wrote:
> On Sep 10, 5:19 pm, Patrick Finnegan
> wrote:
>
> > Are the disk performance and configuration data reported by AIX valid
> > for SAN
> > storage configured as hypervolumes and presented to the operating
> > system as
> > hdiskpower logical disks?
>
> From the view of the OS. Certainly. From the view of the Storage
> System ( SAN) : It depends.
> 1) Disk
> In case you pass the SAN disks as JBOD to AIX the view from both side
> should be the same.
> But as soon as you create disk volumes on the SAN and passes parts or
> the whole to AIX any AIX policy for the disk is IMHO useless.
> Does a intra-policy on a RAID5 makes sense ?
>
> 2) I/O
> The I/O - either IOPS or Transfer Speed is measured from a AIX point
> of view. In case you have assigned parts of this array to another
> node you might suffer in terms of speed and IOPS in case the other
> node is utilizing the same storage group
>
> > For example:
>
> > Does the data distribution reported by lslv accurately represent the
> > actual
> > data distribution on the physical disks within the SAN?
>
> Normaly No !
>
> > Does the fact that hdisk17 has an inband distribution of 6 % indicate
> > that the
> > data on the hypervolume has an in band distribution of 6 %. Is "INTRA-POLICY"
> > valid for logical disks in a SAN?
>
> Like i said. Depending on the Diskconfiguration within the SAN any
> disk policy is useless.
>
> > If shows two disks as being busy all the time and the rest idle does
> > that mean
> > that data in the hypervolume is concentrated over a few spindles?
>
> You are using from this information only 2 disk provided from your
> SAN.
> A simple reason might be that you are using JFS[2] without any
> striping. Thus the load is not spread across all disk.
>
> > What about "iostat -d 2 30"? Are the data transfer rates reported
> > valid?
>
> yes
>
> So far no more from my crystal ball.
>
> cheers
> Hajo
Thanks for your answers.
We are using JFS with no striping because MWC is turned on, but does
MWC mean anything on a SAN?
The logical volume config is:
#lslv xxxxx
LOGICAL VOLUME:     xxxx                   VOLUME GROUP:   yyyy
LV IDENTIFIER:      00cde1ea00004c000000010fa4bf4913.4 PERMISSION:     read/write
VG STATE:           active/complete        LV STATE:       opened/syncd
TYPE:               jfs2                   WRITE VERIFY:   off
MAX LPs:            3072                   PP SIZE:        128 megabyte(s)
COPIES:             2                      SCHED POLICY:   parallel
LPs:                1184                   PPs:            2368
STALE PPs:          0                      BB POLICY:      relocatable
INTER-POLICY:       minimum                RELOCATABLE:    yes
INTRA-POLICY:       middle                 UPPER BOUND:    32
MOUNT POINT:        /yabba                 LABEL:          /yabba
MIRROR WRITE CONSISTENCY: on/ACTIVE
EACH LP COPY ON A SEPARATE PV ?: yes
Serialize IO ?:     NO
-
Re: How accurate are AIX disk reporting metrics with a SAN?
> We are using JFS with no striping cos MWC is turned on but does MWC
> mean anything on a SAN?
Back to basics:
What are you trying to achieve anyway?
Do you have any performance issues?
-
Re: How accurate are AIX disk reporting metrics with a SAN?
On Sep 10, 9:10 pm, Hajo Ehlers wrote:
> > We are using JFS with no striping cos MWC is turned on but does MWC
> > mean anything on a SAN?
>
> Back to the basics:
> What are you trying to archive anyway. ?
> Do you have any performance issues ?
We have a DB2 database performance issue that is related to slow disk
I/O times. These could be caused by a technical issue with the SAN, or
simply by contention if for some reason the data is clustered on one
small area of the SAN. We have to prove that there is a problem
before the management company who look after the SAN will allocate
resources to look at the issue. Working on the principle that we will
get the best performance by allocating the data over as many disks as
possible, we were wondering whether the AIX system utilities would give
us any worthwhile information about how the data is distributed on the
SAN. It looks like they don't, so we will have to escalate up the
management chain.
The disks in the SAN are RAID 5, so the logical volume MWC setting is
probably irrelevant.
-
Re: How accurate are AIX disk reporting metrics with a SAN?
Patrick Finnegan wrote:
> On Sep 10, 9:10 pm, Hajo Ehlers wrote:
>>> We are using JFS with no striping cos MWC is turned on but does MWC
>>> mean anything on a SAN?
>> Back to the basics:
>> What are you trying to archive anyway. ?
>> Do you have any performance issues ?
>
> We have a DB2 database performance issue that is related to slow disk
> i/o times which could be caused by a technical issue with the SAN or
> simply by contention if for some reason the data is clustered on one
> small area of the san. We have to prove that there is a problem
> before the Management Company who look after the san will allocate
> resources to look at the issue. Working on the principle that we will
> get the best performance by allocating the data over as many disk as
> possible we were wondering whether the AIX system utilities would give
> us any worthwhile information about how the data is distributed on the
> san but it looks like they don't so we will have to escalate up the
> management chain.
>
> The disks in the SAN are raid 5 so the logical volume MWC setting is
> probably irrelevant.
>
Hi,
MWC is not your issue. As Hajo already pointed out, you have no
striping of any kind. It is a common argument that striping has "no
meaning" on a SAN (and I argued the same way when I started working
with SANs). That is in fact not true. Practical experience has taught
us that you should at least change your
INTER-POLICY: minimum
to
INTER-POLICY: maximum
if xxxxx is your data logical volume. This implements what is sometimes
called "PP striping" (you will need to restore your data to the
redefined filesystem).
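The difference between the two inter-policy settings can be sketched with a deliberately simplified allocation model. This is not the LVM's actual algorithm: it ignores mirror copies, the upper bound and existing allocations, and the PV names and the 400 free partitions per PV are hypothetical. It only shows the general idea that "minimum" fills as few disks as possible while "maximum" deals partitions out across all of them.

```python
from collections import Counter
from typing import Sequence


def allocate(num_lps: int, pvs: Sequence[str], free_pps_per_pv: int, policy: str) -> Counter:
    """Count partitions placed on each PV under a simplified inter-policy model."""
    if num_lps > len(pvs) * free_pps_per_pv:
        raise ValueError("not enough free partitions")
    placed = Counter({pv: 0 for pv in pvs})
    if policy == "minimum":
        # Fill one PV completely before moving on to the next.
        remaining = num_lps
        for pv in pvs:
            take = min(remaining, free_pps_per_pv)
            placed[pv] = take
            remaining -= take
    elif policy == "maximum":
        # Deal partitions out round-robin across all PVs ("PP striping").
        for lp in range(num_lps):
            placed[pvs[lp % len(pvs)]] += 1
    else:
        raise ValueError("policy must be 'minimum' or 'maximum'")
    return placed


# Hypothetical layout: 8 PVs with 400 free partitions each, and the
# 1184 LPs reported by lslv earlier in the thread.
PVS = ["pv%d" % i for i in range(8)]
for policy in ("minimum", "maximum"):
    print(policy, dict(allocate(1184, PVS, 400, policy)))
```

Under these assumptions "minimum" puts everything on three PVs (400, 400 and 384 partitions), while "maximum" places 148 partitions on each of the eight.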
I have no really good, validated explanation of why this helps even
in SAN environments. But there must be some algorithm in the SAN
storage system and/or the LVM that makes I/O to a single disk a
bottleneck even though the overall hardware I/O capacity has not been
reached. (Keep in mind that the storage subsystem is not serving just
your system but others too, and needs to maintain I/O queues for
*all* systems and *all* disks.)
Regards,
Uwe Auer
-
Re: How accurate are AIX disk reporting metrics with a SAN?
On Sep 11, 3:04 pm, Patrick Finnegan wrote:
> On Sep 10, 9:10 pm, Hajo Ehlers wrote:
>
> > > We are using JFS with no striping cos MWC is turned on but does MWC
> > > mean anything on a SAN?
>
> > Back to the basics:
> > What are you trying to archive anyway. ?
> > Do you have any performance issues ?
>
> We have a DB2 database performance issue that is related to slow disk
> i/o times which could be caused by a technical issue with the SAN or
....
It looks like you are focusing on the data distribution on the disk.
That is irrelevant for SAN-provided disks.
But AIX, at least from this brief view, told you that it is using
only 1 disk. If you have 20 disks, that looks like a pretty badly
configured system.
So you should check how the system is configured and WHY. Only you can
answer this, since I know neither your SAN configuration, your DB2
configuration nor your AIX configuration, nor whether the final
configuration is even able to handle the load.
It might be a good idea to ask the person(s) who configured the
system why they did so.
A very, very simple approach regarding disks and disk access is:
! Use as many spindles as you can.
- Do not mirror on the OS side unless it is really needed.
- Use RAID 10 (mirror/stripe) on the SAN, or RAID 5 with striping on
the AIX side.
- Configure the cache on the SAN to your needs. Is read cache really
needed, or would more write cache be better?
- Check the block size on the SAN carefully. A block size of 256K on
a RAID 5 might be bad if you are going to write in 64K blocks or even
smaller.
Check the DB2 guides for block sizes, because they determine the SAN
block size.
- If JFS2 is used, check whether special options are needed to
prevent the OS from caching data in its file cache (CIO option).
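One way to see why the block size point in the list above matters is the textbook RAID 5 small-write cost model, sketched below. It makes several assumptions that are not from the thread: "block size" is taken to mean the per-disk segment size, the array is a hypothetical 4+1 layout, writes start on a stripe boundary, partial stripes use read-modify-write, and the controller cache is ignored. A real array with write-back cache may behave quite differently.

```python
def backend_ios_for_write(write_kb, segment_kb, data_disks):
    """Back-end disk I/Os for one host write under a simplified RAID 5 model.

    Assumptions: the write starts on a stripe boundary, there is no
    controller cache, and partial stripes use read-modify-write.
    """
    stripe_kb = segment_kb * data_disks
    full_stripes, remainder_kb = divmod(write_kb, stripe_kb)
    # Full stripe: write every data segment plus parity, no reads needed.
    ios = full_stripes * (data_disks + 1)
    if remainder_kb:
        segments_touched = -(-remainder_kb // segment_kb)  # ceiling division
        # Read and rewrite each touched data segment, plus read and
        # rewrite the parity segment.
        ios += 2 * segments_touched + 2
    return ios


SEGMENT_KB = 256   # the "block size" from the post, read as segment size
DATA_DISKS = 4     # hypothetical 4+1 RAID 5 group

print("one 64K write:      ", backend_ios_for_write(64, SEGMENT_KB, DATA_DISKS))
print("sixteen 64K writes: ", 16 * backend_ios_for_write(64, SEGMENT_KB, DATA_DISKS))
print("one 1024K write:    ", backend_ios_for_write(1024, SEGMENT_KB, DATA_DISKS))
```

In this model, writing 1 MiB as sixteen separate 64K writes costs 64 back-end I/Os, against 5 for a single full-stripe write.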
EOD
Hajo
-
Re: How accurate are AIX disk reporting metrics with a SAN?
On Sep 11, 4:51 pm, Uwe Auer wrote:
> Patrick Finnegan wrote:
>
> > On Sep 10, 9:10 pm, Hajo Ehlers wrote:
> >>> We are using JFS with no striping cos MWC is turned on but does MWC
> >>> mean anything on a SAN?
> >> Back to the basics:
> >> What are you trying to archive anyway. ?
> >> Do you have any performance issues ?
>
> > We have a DB2 database performance issue that is related to slow disk
> > i/o times which could be caused by a technical issue with the SAN or
> > simply by contention if for some reason the data is clustered on one
> > small area of the san. We have to prove that there is a problem
> > before the Management Company who look after the san will allocate
> > resources to look at the issue. Working on the principle that we will
> > get the best performance by allocating the data over as many disk as
> > possible we were wondering whether the AIX system utilities would give
> > us any worthwhile information about how the data is distributed on the
> > san but it looks like they don't so we will have to escalate up the
> > management chain.
>
> > The disks in the SAN are raid 5 so the logical volume MWC setting is
> > probably irrelevant.
>
> Hi,
>
> MWC is not your issue. As Hajo already pointed out, you have no kind of
> striping. It is a common argument that striping has "no meaning" on SAN
> (and i argued the same way, when starting to work with SANs). That's in
> fact not true. Practical experience teached us, that you should have at
> least changed your
> INTER-POLICY: minimum
> to
> INTER-POLICY: maximum
> if xxxxx is your data logical volume. This implements what sometimes is
> called "PP striping" (you will need to recover your data to the
> redefined filesystem).
> In fact i got no real good and validated explanation, why this even
> helps in SAN environments. But there must be some algorithm in SAN
> storage system and/or LVM, which makes having i/o to a single disk only,
> being a bottleneck although overall hardware i/o capacity is still not
> reached (Keep in mind, that the storage subsystem is not just serving
> your system but also others and needs to hold something i/o queues for
> *all* system and *all* disk)
>
> Regards,
> Uwe Auer
You might be on to something. We did change the inter-policy to
maximum on the test system, which has the same type of SAN, and then
reorged the logical volume. The reorg failed, and IBM got back to us
with this:
"A conditional in reorgvg assumes the value in the ODM for an LVs
stripe_width will be a number. But if the CuAt entry
is removed, the default in PdAt is "", which causes the
comparison to fail and reorgvg to mistakenly think the LV is striped."
APAR number IZ03602 records the issue and U812873 bos.rte.lvm
5.3.8.0 holds the fix.
We are going to apply the fix and run the reorg again. Of course we
realise that we are attempting to reorg the data on the SAN, and we
are assuming that the LVM will magically do this for us.
What we have noticed is that more disks have come into play since we
changed the inter-policy to maximum. As data is added and the
databases are reorged, the data does seem to be spreading across
more disks.
Will keep you posted.
-
Re: How accurate are AIX disk reporting metrics with a SAN?
On Sep 11, 5:46 pm, Hajo Ehlers wrote:
> On Sep 11, 3:04 pm, Patrick Finnegan wrote:
> > On Sep 10, 9:10 pm, Hajo Ehlers wrote:
>
> > > > We are using JFS with no striping cos MWC is turned on but does MWC
> > > > mean anything on a SAN?
>
> > > Back to the basics:
> > > What are you trying to archive anyway. ?
> > > Do you have any performance issues ?
>
> > We have a DB2 database performance issue that is related to slow disk
> > i/o times which could be caused by a technical issue with the SAN or
>
> ...
> Its looks like your focusing on the data distribution on the disk. Its
> irrelevant for SAN provided disks.
> But AIX - at least from this short view - told you that it is using
> only 1 disk. In case you have 20 disk that looks like a pretty bad
> configured system.
>
> So you should check how the system is configured and WHY. This can be
> answered only by you since i do not know your SAN configuration, DB2
> configuration nor the AIX configuration and whether or not the final
> configuration is even able to handle the load.
> Might be a good idea to ask the person(s) who had configured the
> system why they did so.
>
> A very very simple approach regarding disks/ disk access is:
> ! Use as much spindel as you can.
> - Do not mirror on the OS side except it is really needed.
> - Use RAID10 ( mirror/stripe ) on the SAN or RAID5 with striping on
> the AIX side.
> - configure the cache on the SAN to your needs. Is read cache really
> needed or would more write cache be better ?
> - Check carefully the block size on the SAN. A block size of 256K on
> a RAID5 might be bad if you are going to write in 64k blocks or even
> lower.
> Check the DB2 Guides for block sizes because they determine the SAN
> block size.
> - In case JFS2 is used. Check if special options are needed to
> prevent the OS to cache data in its filecache ( CIO option )
>
> EOD
> Hajo
Concurrent I/O is not enabled. We plan to work on that next week.
-
Re: How accurate are AIX disk reporting metrics with a SAN?
On 2008-09-11, Uwe Auer wrote:
> INTER-POLICY: maximum
> if xxxxx is your data logical volume. This implements what sometimes is
> called "PP striping" (you will need to recover your data to the
> redefined filesystem).
> In fact i got no real good and validated explanation, why this even
> helps in SAN environments. But there must be some algorithm in SAN
> storage system and/or LVM, which makes having i/o to a single disk only,
> being a bottleneck although overall hardware i/o capacity is still not
> reached
One possible factor (often overlooked) is queueing. This can happen in
multiple places, but in your example queueing on the AIX host itself is
most likely.
Suppose the SAN and storage array have a guaranteed response time of
1 ms. Every I/O takes 1 ms, no more, no less. Further suppose the
host has 1 HBA (Fibre Channel card), and that this HBA is configured
with a queue depth of 256 I/Os per LUN.
Imagine an application which issues I/O-requests as fast as it can. The
application will most likely be able to issue more than 1 I/O per ms.
This means the queue will start to fill up, since the SAN and storage
array will handle only one I/O per ms. When the queue is full, every
new I/O will be added to the end of the queue. That new I/O has to wait
for the other 255 I/Os in the queue to be serviced. Since each I/O
takes 1 ms, this means that the new I/O will have a response time of
256 ms. When the queue is (nearly) full, every new I/O will have a
high response time because of the I/Os in the queue which will be
serviced first.
This phenomenon is often a source of misunderstanding between
system administrators and storage administrators. In the above
example, the system administrators measure I/O-times of around 256 ms
per I/O while the storage administrators measure I/O-times of 1 ms
per I/O.
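The example above can be reproduced with a small closed-loop simulation. This is only a sketch under the same idealised assumptions: a fixed 1 ms service time, one I/O serviced at a time, and an application that refills the queue the moment a slot frees up. The function name and the total of 1000 I/Os are arbitrary choices for this example.

```python
from collections import deque


def simulate(queue_depth, service_ms, total_ios):
    """Return (host-side response times in ms, total elapsed ms)."""
    queue = deque()          # holds the submit time of each queued I/O
    clock = 0.0
    issued = 0
    response_times = []
    # The application fills the queue immediately.
    while issued < min(queue_depth, total_ios):
        queue.append(clock)
        issued += 1
    while queue:
        submitted = queue.popleft()
        clock += service_ms                  # the array needs exactly service_ms
        response_times.append(clock - submitted)
        if issued < total_ios:               # a slot is free, refill it at once
            queue.append(clock)
            issued += 1
    return response_times, clock


for depth in (256, 1):
    times, elapsed = simulate(depth, 1.0, 1000)
    print(f"depth {depth:3d}: last I/O took {times[-1]:.0f} ms, "
          f"{len(times)} I/Os done in {elapsed:.0f} ms")
```

With a queue depth of 256 the host sees 256 ms per I/O once the queue is full, while the array still spends 1 ms on each. With a queue depth of 1 the host sees 1 ms, and in both cases the 1000 I/Os complete in the same 1000 ms.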
You now probably see why it can be beneficial to spread the
load across multiple LUNs, even on a SAN. Since queueing happens
on a LUN level (but can also occur on other levels), when you
distribute the load you're keeping the number of queued I/Os
down, and each I/O is serviced earlier.
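As a back-of-the-envelope sketch of that effect: if the same number of outstanding I/Os is split evenly over several LUN queues, the longest wait shrinks in proportion. The big assumption here, which is not guaranteed on a real array, is that each LUN queue is drained independently at 1 ms per I/O, i.e. that the back end is not itself the shared bottleneck. The function name is made up for this example.

```python
import math


def worst_case_response_ms(outstanding_ios, num_luns, service_ms=1.0):
    """Longest wait when I/Os are spread evenly over independent LUN queues."""
    per_lun = math.ceil(outstanding_ios / num_luns)
    return per_lun * service_ms


for luns in (1, 2, 4, 8, 16):
    print(f"{luns:2d} LUN(s): up to {worst_case_response_ms(256, luns):.0f} ms")
```

Under these assumptions, 256 outstanding I/Os mean up to 256 ms on one LUN but only 16 ms when spread over 16 LUNs.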
Of course, in the real world it isn't as black/white as described
above. In the example, you just reconfigure to a queue depth of 1
and everybody measures response times of 1ms/IO, with the same amount
of I/Os per second being serviced. However, there are also advantages
with a larger queue size. For example, different I/O's can be grouped
together into 1 I/O, which reduces the number of I/O's.
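The grouping mentioned above can be illustrated with a minimal sketch that merges queued requests whose byte ranges touch or overlap. It is a generic illustration, not how any particular driver or HBA does it; requests are represented as hypothetical (offset, length) pairs in bytes.

```python
def coalesce(requests):
    """Merge (offset, length) requests whose byte ranges touch or overlap."""
    merged = []
    for offset, length in sorted(requests):
        if merged and offset <= merged[-1][0] + merged[-1][1]:
            # Extends or overlaps the previous request: grow it.
            last_offset, last_length = merged[-1]
            new_end = max(last_offset + last_length, offset + length)
            merged[-1] = (last_offset, new_end - last_offset)
        else:
            merged.append((offset, length))
    return merged


queued = [(0, 4096), (4096, 4096), (8192, 4096), (65536, 4096), (12288, 4096)]
print(len(queued), "queued requests ->", coalesce(queued))
```

Here five queued 4K requests collapse into two: one 16K request at offset 0 and one 4K request at offset 65536. A deeper queue gives more chances for this kind of merge.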
--
Jurjen Oskam
Savage's Law of Expediency:
You want it bad, you'll get it bad.