You can encounter ORA-15040: DiskGroup is Incomplete when ASM detects a missing disk while attempting to mount a DiskGroup. Usually, the cause is that ASM cannot read the header of one of the member disks.
Troubleshooting ORA-15040
To troubleshoot ORA-15040, we first checked whether ASMLib could see all the disks, and found that one of them was not visible. We ran oracleasm scandisks and oracleasm listdisks, and then listed the disk devices.
ls -l /dev/oracleasm/disks/
total 0
brw-rw----. 1 oracle oinstall 251, 129 Mar 18 13:20 DATA02
brw-rw----. 1 oracle oinstall 251, 145 Mar 18 13:20 DATA03
brw-rw----. 1 oracle oinstall 251, 161 Mar 18 13:20 DATA04
brw-rw----. 1 oracle oinstall 251, 177 Mar 18 13:20 DATA05
brw-rw----. 1 oracle oinstall 251, 193 Mar 18 13:20 DATA06
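A quick way to spot which labels are missing is to compare the labels you expect against what ASMLib reports. The following is a minimal sketch, not part of the original troubleshooting session: missing_labels is a hypothetical helper name, and the expected labels passed to it are assumptions that you would replace with your own.

```shell
# Hypothetical helper: print every expected label (given as arguments)
# that is absent from the list of labels read on stdin.
# Example (expected labels are an assumption):
#   oracleasm listdisks | missing_labels DATA01 DATA02 DATA03
missing_labels() {
    present=$(cat)
    for label in "$@"; do
        # -x matches whole lines only, -F treats the label as a fixed string
        printf '%s\n' "$present" | grep -qxF -- "$label" || printf '%s\n' "$label"
    done
}
```

An empty result means every expected label is visible to ASMLib. Any label that is printed is worth checking with oracleasm querydisk and kfed.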
Then we tried to read the headers of the devices using kfed to verify the permissions. Here is the output for a disk that ASMLib could read, followed by that of a device that ASMLib couldn’t read.
#### Good Disk
# oracleasm querydisk -v /dev/vdi1
Device "/dev/vdi1" is marked an ASM disk with the label "DATA02"
# kfed read /dev/vdi1 | more
kfbh.endian: 1 ; 0x000: 0x01
kfbh.hard: 130 ; 0x001: 0x82
kfbh.type: 1 ; 0x002: KFBTYP_DISKHEAD
kfbh.datfmt: 1 ; 0x003: 0x01
kfbh.block.blk: 0 ; 0x004: blk=0
kfbh.block.obj: 2147483649 ; 0x008: disk=1
kfbh.check: 3924513787 ; 0x00c: 0xe9eb53fb
kfbh.fcn.base: 1163 ; 0x010: 0x0000048b
kfbh.fcn.wrap: 0 ; 0x014: 0x00000000
kfbh.spare1: 0 ; 0x018: 0x00000000
kfbh.spare2: 0 ; 0x01c: 0x00000000
kfdhdb.driver.provstr: ORCLDISKDATA02 ; 0x000: length=14
kfdhdb.driver.reserved[0]: 1096040772 ; 0x008: 0x41544144
kfdhdb.driver.reserved[1]: 12848 ; 0x00c: 0x00003230
kfdhdb.driver.reserved[2]: 0 ; 0x010: 0x00000000
kfdhdb.driver.reserved[3]: 0 ; 0x014: 0x00000000
kfdhdb.driver.reserved[4]: 0 ; 0x018: 0x00000000
kfdhdb.driver.reserved[5]: 0 ; 0x01c: 0x00000000
kfdhdb.compat: 318767104 ; 0x020: 0x13000000
kfdhdb.dsknum: 1 ; 0x024: 0x0001
kfdhdb.grptyp: 1 ; 0x026: KFDGTP_EXTERNAL
kfdhdb.hdrsts: 3 ; 0x027: KFDHDR_MEMBER
kfdhdb.dskname: DATA_0001 ; 0x028: length=9
kfdhdb.grpname: DATA ; 0x048: length=4
kfdhdb.fgname: DATA_0001 ; 0x068: length=9
kfdhdb.siteguid[0]: 0 ; 0x088: 0x00
kfdhdb.siteguid[1]: 0 ; 0x089: 0x00
kfdhdb.siteguid[2]: 0 ; 0x08a: 0x00
#### Bad Disk
# oracleasm querydisk -v /dev/vdh1
Device "/dev/vdh1" defines a device with no label
# kfed read /dev/vdh1 | more
kfbh.endian: 1 ; 0x000: 0x01
kfbh.hard: 130 ; 0x001: 0x82
kfbh.type: 1 ; 0x002: KFBTYP_DISKHEAD
kfbh.datfmt: 2 ; 0x003: 0x02
kfbh.block.blk: 0 ; 0x004: blk=0
kfbh.block.obj: 2147483648 ; 0x008: disk=0
kfbh.check: 3909078280 ; 0x00c: 0xe8ffcd08
kfbh.fcn.base: 1650 ; 0x010: 0x00000672
kfbh.fcn.wrap: 0 ; 0x014: 0x00000000
kfbh.spare1: 0 ; 0x018: 0x00000000
kfbh.spare2: 0 ; 0x01c: 0x00000000
kfdhdb.driver.provstr: ORCLDISK ; 0x000: length=8. >>>>>Issue
kfdhdb.driver.reserved[0]: 0 ; 0x008: 0x00000000
kfdhdb.driver.reserved[1]: 0 ; 0x00c: 0x00000000
kfdhdb.driver.reserved[2]: 0 ; 0x010: 0x00000000
kfdhdb.driver.reserved[3]: 0 ; 0x014: 0x00000000
kfdhdb.driver.reserved[4]: 0 ; 0x018: 0x00000000
kfdhdb.driver.reserved[5]: 0 ; 0x01c: 0x00000000
kfdhdb.compat: 318767104 ; 0x020: 0x13000000
kfdhdb.dsknum: 0 ; 0x024: 0x0000
kfdhdb.grptyp: 1 ; 0x026: KFDGTP_EXTERNAL
kfdhdb.hdrsts: 3 ; 0x027: KFDHDR_MEMBER
kfdhdb.dskname: DATA_0000 ; 0x028: length=9
kfdhdb.grpname: DATA ; 0x048: length=4
kfdhdb.fgname: DATA_0000 ; 0x068: length=9
kfdhdb.siteguid[0]: 0 ; 0x088: 0x00
kfdhdb.siteguid[1]: 0 ; 0x089: 0x00
kfdhdb.siteguid[2]: 0 ; 0x08a: 0x00
The fact that the kfed command could read the disk header indicated that the permissions were correct. Comparing the two outputs, we made the following observations.
- The disk /dev/vdh1 was a member (kfdhdb.hdrsts) of the Diskgroup DATA (kfdhdb.grpname), and it was the zeroth disk (kfdhdb.dsknum).
- However, kfdhdb.driver.provstr contained only ORCLDISK, with no label appended, rather than ORCLDISKDATA01 as we would expect.
This indicated a problem with the disk header.
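The comparison above can also be scripted. The sketch below assumes the kfed read output format shown earlier (field name, colon, value, semicolon) and that the ASMLib label is whatever follows the ORCLDISK prefix in kfdhdb.driver.provstr, as the good disk's output suggests. The function names kfed_field and asmlib_label are hypothetical.

```shell
# Hypothetical helper: print the value of one field from `kfed read`
# output supplied on stdin. Lines are expected to look like:
#   kfdhdb.driver.provstr: ORCLDISKDATA02 ; 0x000: length=14
kfed_field() {
    awk -v key="$1:" '$1 == key { print $2; exit }'
}

# Hypothetical helper: print the label embedded in provstr, i.e. the
# text after the "ORCLDISK" prefix. The result is empty when the
# header carries no label.
asmlib_label() {
    kfed_field kfdhdb.driver.provstr | sed 's/^ORCLDISK//'
}

# Example, run as a user that can read the device:
#   kfed read /dev/vdh1 | asmlib_label
#   kfed read /dev/vdh1 | kfed_field kfdhdb.grpname
```

With the outputs shown above, the good disk would yield DATA02 and the bad disk would yield an empty label, which is the same discrepancy we spotted by eye.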
Solution
In this case, only the disk header appeared to be corrupted, not the data, and we were certain that the disk belonged to the DATA Diskgroup and was DATA01. Hence, we contacted Oracle Support for confirmation, and they verified that the disk header had been overwritten and was missing the ASMLib driver information.
To fix this, we forcibly changed the label of the disk with oracleasm renamedisk, using the -f option to force the change. The command can also be run in verbose mode to help with debugging.
Please Note: Be extremely cautious before running this command, as it can leave the disk unusable. Get in touch with Oracle Support before executing it on a live system.
# oracleasm renamedisk -f /dev/vdh1 DATA01
Writing disk header: done
Instantiating disk "DATA01": done
# oracleasm scandisks
Reloading disk partitions: done
Cleaning any stale ASM disks...
Scanning system for ASM disks...
# oracleasm listdisks
DATA01
DATA02
DATA03
DATA04
DATA05
DATA06
DATA07
DATA08
DATA09
DATA10
## Mount the disk
SQL> select name, state from v$asm_diskgroup;
NAME STATE
------------------------------ -----------
DATA MOUNTED
If there is further corruption in your diskgroup, you may need to troubleshoot it further. It is therefore recommended to run commands like “validate database” in RMAN to check that all the database blocks are okay.