2015-11-26

handling disks for ASM - when DB, Linux and Storage admins work together

Proper management of ASM disks can be a complicated task.

At DOAG2015 I discussed with Martin Bach the concept used at my current company, where we implemented a setup which is consistent, robust and enables Storage, Linux and DB admins to work together easily.

When we started to think about ASM after 10.1 came out, we evaluated our options. asmlib was discarded quite early as it only increased complexity without adding value: we have a SAN (fibre channel) infrastructure with 2 separate fabrics, so a multipath configuration is needed. ASM (or asmlib) cannot handle multipathing by itself, so a proper multipath configuration is required anyway. asmlib also hides storage details from DBAs/ASM admins, whereas we wanted every person involved to see as many details as possible, easily.

We also saw that ASM sometimes takes a long time to scan for new disks if there are many "files" (devices) matching asm_diskstring. The scan happens every time you access v$asm_disk, so use v$asm_disk_stat instead: this view does not rescan but only shows information about devices the instance already has in its SGA.
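For example, this query reports what the instance already knows without triggering a rescan:

select path, mount_status, total_mb from v$asm_disk_stat;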

asm_diskstring

We set asm_diskstring to a dedicated directory; in our case it's /appl/oracle/asm_disks/*. This speeds up a rescan of all "disks", and it's also a clear indicator of all disks ASM uses. There we have symlinks to devices in /dev/mapper/.
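For illustration, the parameter can be set in the ASM instance like this (a sketch; adjust scope and sid to your environment):

SQL> alter system set asm_diskstring='/appl/oracle/asm_disks/*';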

symlinks

The symlink has this format:
/appl/oracle/asm_disks/360060e80167bd70000017bd700000007p1_p9500_b52_MONIQP01_000 -> /dev/mapper/360060e80167bd70000017bd700000007p1
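Creating such a symlink is a plain ln call, e.g. for the device above:

ln -s /dev/mapper/360060e80167bd70000017bd700000007p1 \
   /appl/oracle/asm_disks/360060e80167bd70000017bd700000007p1_p9500_b52_MONIQP01_000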

Some information about the values we store there:
360060e80167bd70000017bd700000007p1 is the WWN of the disk, together with its partition (p1). The WWN is very useful in every discussion with Storage admins, as it identifies the LUN from their perspective. We decided to partition the disks: our records show that Linux admins touch unpartitioned devices more often than devices which already carry a partition, and in early tests there were also some cases where the first block of a disk was cached by the kernel. Partitioning every disk addresses both issues, and if required, partitioning can also help to adapt alignment (see the sketch after this list).
p9500 is a short name which identifies the storage box with a name we can use during discussions. It's encoded within the WWN as well, so it's pure redundancy, but it makes discussions much easier.
b52 is a short name to identify the datacenter. As our fabrics span several datacenters, it's sometimes nice to get a quick picture of the topology.
MONIQP01_000 is the label used in some storage boxes. It contains the diskgroup name and a sequence number. At the moment it's NOT the NAME of an ASM disk, but this could be introduced easily.
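To give an idea how such a partition could be created, here is a sketch with parted and kpartx (the msdos label and the 1MiB start offset are assumptions; the real offset depends on the required alignment):

# create a partition table and one partition spanning the whole LUN
parted -s /dev/mapper/360060e80167bd70000017bd700000007 mklabel msdos
parted -s /dev/mapper/360060e80167bd70000017bd700000007 mkpart primary 1MiB 100%
# let device-mapper create the p1 mapping for the new partition
kpartx -a /dev/mapper/360060e80167bd70000017bd700000007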

As the name of a diskgroup is coded into our naming schema, it's not acceptable to reuse a disk for some other diskgroup. (Technically it's still possible, we just agreed not to do so.) Even if this seems to limit the DBAs' flexibility, there are good reasons for it: disks are sometimes created with dedicated settings/parameters for a special purpose, and reusing such disks in other DGs would cause strange and hard-to-find performance symptoms. So if disks are not needed anymore, we always "destroy" them and create new ones when needed.
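One simple way to "destroy" a disk is to wipe the beginning of the device, which removes the ASM header (a sketch only, and obviously destructive, so double-check the device name before running it):

dd if=/dev/zero of=/dev/mapper/360060e80167bd70000017bd700000007p1 bs=1M count=10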

udev rules

Our udev ruleset on RedHat 6 is quite simple.
The file /etc/udev/rules.d/41-multipath.rules contains lines like this:
ACTION=="add|change", ENV{DM_NAME}=="360060e80167bd70000017bd700000007p1", OWNER:="oracle", MODE:="0660", GROUP:="asmadmin"
We do not do any mapping of names here - it's only there to set permissions.
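After changing the rules they can be applied without a reboot (shown for RHEL6; udevadm options differ slightly between versions):

udevadm control --reload-rules
udevadm trigger --subsystem-match=block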

multipath

The config in /etc/multipath.conf is quite simple: only the parameters required for the specific storage vendor / product.
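Just to show the shape of it, a minimal sketch (the vendor/product strings and values here are illustrative only; follow the recommendations of your storage vendor):

defaults {
    user_friendly_names no    # keep the WWID-based names in /dev/mapper
}
devices {
    device {
        vendor               "HP"         # illustrative values only
        product              "OPEN-.*"
        path_grouping_policy multibus
        no_path_retry        18
    }
}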


I can not say a lot about configurations outside of the Linux server; from my point of view, both SAN fabrics and the storage system are "just working".



