Overview
Problem: Extra paths are visible in powermt display dev=all output. We are still working on determining the root cause. What seems to happen is:
1) Initially, the number of paths is correct. Assuming that each LUN should be using four paths, you can determine the correct number as follows:
echo $(( $(powermt display dev=all | grep emcpower | wc -l) * 4 ))
And the current number:
powermt display dev=all | grep qla | wc -l
The current number of SCSI devices:
lsscsi
The HBA reports the correct number of paths:
echo $(( $(scli -l 0 | grep ^LUN | wc -l) - 4 + $(scli -l 1 | grep ^LUN | wc -l) ))
ls -l /dev/sg*
2) A LIP is issued:
echo 1 | tee /sys/class/fc_host/host?/issue_lip
# This operation performs a Loop Initialization Protocol (LIP)
# and then scans the interconnect and causes the SCSI layer to be updated
# to reflect the devices currently on the bus. A LIP is, essentially, a bus reset,
# and will cause device addition and removal. This procedure is necessary to configure
# a new SCSI target on a Fibre Channel interconnect. Bear in mind that issue_lip is
# an asynchronous operation. The command may complete before the entire scan has completed.
# You must monitor /var/log/messages to determine when it is done.
# The lpfc and qla2xxx drivers support issue_lip.
# For more information about the API capabilities supported by each driver in Red Hat Enterprise Linux,
# refer to Table 1, "Fibre-Channel API Capabilities".
3) Several new paths appear.
The extra /dev/sg* devices are created when the LIP reports that new devices have been discovered on the SCSI bus. Since the HBA driver is responsible for reporting the paths to the system, we currently believe that the HBA driver and/or storage frame is at fault. EMC specifically mentions that the SPC2 bit must be set to "enabled", which is not an online change; the hosts must be rebooted to pick up the change. This doesn't seem to be related, however, since the paths spontaneously appear even though we can query the HBA with scli shortly afterward and it reports the correct number of paths.
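The expected-versus-actual comparison from step 1 can be wrapped up so that it runs against a saved copy of the powermt output. This is only a minimal sketch: the function names, the PATHS_PER_LUN variable, and the idea of working from a saved file are assumptions made for illustration, and it relies on the same grep patterns used above (emcpower for pseudo devices, qla for paths).

```shell
#!/usr/bin/env bash
# Sketch: compare expected and actual path counts in saved
# `powermt display dev=all` output. All names here are illustrative.

# Assumption from the text above: each LUN should have four paths.
PATHS_PER_LUN="${PATHS_PER_LUN:-4}"

# Read powermt output on stdin; print (pseudo devices * paths per LUN).
expected_paths() {
    local luns
    luns=$(grep -c 'emcpower' || true)   # grep -c prints 0 when nothing matches
    echo $(( luns * PATHS_PER_LUN ))
}

# Read powermt output on stdin; print the number of qla path lines.
actual_paths() {
    grep -c 'qla' || true
}

# Compare the two counts for a saved output file; return 1 on mismatch.
check_paths() {
    local file="$1" expected actual
    expected=$(expected_paths < "$file")
    actual=$(actual_paths < "$file")
    if [ "$actual" -eq "$expected" ]; then
        echo "OK: ${actual} paths"
    else
        echo "MISMATCH: expected ${expected}, found ${actual}"
        return 1
    fi
}
```

One way to use it would be to save the output first (powermt display dev=all > /tmp/powermt.out) and then run check_paths /tmp/powermt.out before and after a LIP. The file path is a placeholder.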
Example and Mitigation
This was happening on [server name redacted].
# The number of paths that "scli -l" shows that the HBA is reporting to the server.
# Subtracting 4 because the last 4 devices are LUNZ/VRAID devices instead of Symmetrix or Clariion.
root@xxxxx:TEST:scsi_device> echo $(( $(scli -l 0 | grep ^LUN | wc -l) - 4 + $(scli -l 1 | grep ^LUN | wc -l) ))
440

# The number of paths that should exist
root@xxxxx:TEST:scsi_device> echo $(( $(powermt display dev=all | grep emcpower | wc -l) * 4 ))
440

# The number of paths that PowerPath is reporting
root@xxxxx:TEST:scsi_device> powermt display dev=all | grep qla | wc -l
1419

# The number of SCSI devices that the system sees
# Subtracting 4 because the last 4 devices are LUNZ/VRAID devices
root@xxxxx:TEST:scsi_device> echo $(( $(lsscsi | wc -l) - 4 ))
1419
And another "healthy" server in the same cluster:
# The number of paths that "scli -l" shows that the HBA is reporting to the server.
# Subtracting 4 because the last 4 devices are LUNZ/VRAID devices instead of Symmetrix or Clariion.
root@xxxxx:TEST:~> echo $(( $(scli -l 0 | grep ^LUN | wc -l) - 4 + $(scli -l 1 | grep ^LUN | wc -l) ))
440

# The number of paths that should exist
root@xxxxx:TEST:~> echo $(( $(powermt display dev=all | grep emcpower | wc -l) * 4 ))
440

# The number of paths that PowerPath is reporting
root@xxxxx:TEST:~> powermt display dev=all | grep qla | wc -l
440

# The number of SCSI devices that the system sees
# Subtracting 4 because the last 4 devices are LUNZ/VRAID devices
root@xxxxx:TEST:~> echo $(( $(lsscsi | wc -l) - 4 ))
440
The first 440 /dev/sg* devices were created on system bootup. The extraneous 900+ were created on November 15th, 2012, at the same time some LUNs were added:
xxxxx,xxxxx,2012-11-15 21:29:57.398563,"Added LUN(s) [redacted]..."
The script we use to add storage issues a LIP, so that caused the extra /dev/sg* paths to be discovered. Here's an example emcpower device with too many paths:
root@xxxxx:TEST:device> powermt display dev=emcpowercz
Pseudo name=emcpowercz
Symmetrix ID=xxxxx
Logical device ID=YYYY
state=alive; policy=SymmOpt; priority=0; queued-IOs=0
==============================================================================
---------------- Host ---------------   - Stor -   -- I/O Path -  -- Stats ---
###  HW Path                I/O Paths    Interf.   Mode    State  Q-IOs Errors
==============================================================================
   1 qla2xxx                sdabg     FA 15gB   active  alive      0      0
   1 qla2xxx                sdabt     FA 15gB   active  alive      0      0
   1 qla2xxx                sdajq     FA 15gB   active  alive      0      0
   1 qla2xxx                sdakd     FA 15gB   active  alive      0      0
   1 qla2xxx                sdasa     FA 15gB   active  alive      0      0
   1 qla2xxx                sdasn     FA 15gB   active  alive      0      0
   1 qla2xxx                sdbak     FA 15gB   active  alive      0      0
   1 qla2xxx                sdbax     FA 15gB   active  alive      0      0
   0 qla2xxx                sdgu      FA  5eB   active  alive      0      0
   1 qla2xxx                sdgv      FA 12eB   active  alive      0      0
   0 qla2xxx                sdor      FA  2gB   active  alive      0      0
   1 qla2xxx                sdox      FA 15gB   active  alive      0      0
   1 qla2xxx                sdtj      FA 15gB   active  alive      0      0
Let's figure out how those paths correlate to /dev/sg* devices:
root@sxxxxx:TEST:device> lsscsi -g | egrep "sdabg|sdabt|sdajq|sdakd|sdasa|sdasn|sdbak|sdbax|sdgu|sdgv|sdor|sdox|sdtj"
[0:0:0:101]  disk    EMC      SYMMETRIX        5874  /dev/sdgu   /dev/sg202
[0:0:1:101]  disk    EMC      SYMMETRIX        5874  /dev/sdor   /dev/sg407
[1:0:0:101]  disk    EMC      SYMMETRIX        5874  /dev/sdgv   /dev/sg203
[1:0:1:101]  disk    EMC      SYMMETRIX        5874  /dev/sdox   /dev/sg413
[1:0:1:12389]disk    EMC      SYMMETRIX        5874  /dev/sdtj   /dev/sg529
[1:0:1:34309]disk    EMC      SYMMETRIX        5874  /dev/sdabg  /dev/sg734
[1:0:1:34325]disk    EMC      SYMMETRIX        5874  /dev/sdabt  /dev/sg747
[1:0:1:38405]disk    EMC      SYMMETRIX        5874  /dev/sdajq  /dev/sg952
[1:0:1:38421]disk    EMC      SYMMETRIX        5874  /dev/sdakd  /dev/sg965
[1:0:1:42501]disk    EMC      SYMMETRIX        5874  /dev/sdasa  /dev/sg1170
[1:0:1:42517]disk    EMC      SYMMETRIX        5874  /dev/sdasn  /dev/sg1183
[1:0:1:46597]disk    EMC      SYMMETRIX        5874  /dev/sdbak  /dev/sg1388
[1:0:1:46613]disk    EMC      SYMMETRIX        5874  /dev/sdbax  /dev/sg1401
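A listing like this can also be scanned mechanically for high /dev/sg numbers. The sketch below is a rough heuristic, not a definitive test: it assumes the paths created at boot received the lowest sg numbers, so anything at or above the expected path count deserves a closer look. The function name suspect_paths and the threshold argument are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: read `lsscsi -g` output on stdin and print the sd/sg pair of every
# line whose /dev/sg number is at or above a threshold.
# Heuristic only: assumes boot-time paths were assigned the lowest sg numbers.
suspect_paths() {
    local threshold="$1"
    awk -v max="$threshold" '
        NF < 2 { next }                          # skip blank or short lines
        {
            sg = $NF                             # last column: /dev/sgN
            sd = $(NF - 1)                       # second to last: /dev/sdX
            if (sg !~ /^\/dev\/sg[0-9]+$/) next  # not a generic SCSI device
            n = sg
            sub(/^\/dev\/sg/, "", n)             # keep only the number
            if (n + 0 >= max) print sd, sg
        }'
}
```

For example, lsscsi -g | suspect_paths 440 would list only the candidates. Each remaining low-numbered path should still be verified by hand before anything is deleted.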
Look suspicious? There are four /dev/sg* devices numbered under 440, which is the number of total SCSI paths that should be on the system. Let's make sure those are valid:
for dev in sdgu sdor sdgv sdox; do oracleasm querydisk /dev/${dev}1; done
Device "/dev/sdgu1" is marked an ASM disk with the label "XXXXXXXXX"
Device "/dev/sdor1" is marked an ASM disk with the label "XXXXXXXXX"
Device "/dev/sdgv1" is marked an ASM disk with the label "XXXXXXXXX"
Device "/dev/sdox1" is marked an ASM disk with the label "XXXXXXXXX"
That's just an example for RAC hosts using Oracle ASM. The point is to make sure that the /dev/sd* devices are accessible before we blow away the extraneous ones. Something like this would work as well:
for dev in sdgu sdor sdgv sdox; do od -c /dev/${dev}1 | head -n 10; done
All four paths valid? Good. Now that we know how to suss out invalid paths manually, let's do it the easy way:
Note: This will delete extra SCSI paths; the actual 'delete' command has been commented out below. Make sure you're okay with the possibility of a server crash before uncommenting and running it.
VALID_TMP="/tmp/.valid_devices.$(date "+%m%d%Y")"
ALL_TMP="/tmp/.all_devices.$(date "+%m%d%Y")"

# Discover the paths that the HBA is reporting
for hba in 0 1; do
    sudo scli -l ${hba} | grep -Po "sd\w+"
done > "${VALID_TMP}"

# Discover the paths that the OS is reporting
sudo powermt display dev=all | grep -Po "sd\w+" > "${ALL_TMP}"

# Flag every path the OS knows about that the HBA does not report
for device in $(cat "${ALL_TMP}"); do
    grep -P "^${device}$" "${VALID_TMP}" &>/dev/null
    if [ $? -eq 1 ]; then
        echo "Device ${device} is invalid. Deleting..."
        # Destructive step, intentionally commented out (see the note above)
        #echo 1 | sudo tee /sys/block/${device}/device/delete &>/dev/null
    fi
done

rm -f "${VALID_TMP}" "${ALL_TMP}"
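The comparison at the heart of the script is a set difference, so it can be tried on its own as a dry run before any delete line is enabled. The following is a sketch under stated assumptions: list_invalid_paths is a hypothetical helper, and the two input files are assumed to hold one sd name per line, as the temp files above do.

```shell
#!/usr/bin/env bash
# Sketch: print every device name in all_file that is absent from valid_file.
# Nothing is deleted here; the output is only a candidate list.
list_invalid_paths() {
    local valid_file="$1" all_file="$2"

    # An empty valid list would make every path look invalid, so refuse.
    if [ ! -s "$valid_file" ]; then
        echo "valid device list is empty; refusing to continue" >&2
        return 1
    fi

    # -F fixed strings, -x whole-line match, -v invert, -f patterns from file.
    # Whole-line matching keeps a name like sdg from matching sdgu.
    sort -u "$all_file" | grep -vxFf "$valid_file" || true
}
```

The empty-list guard matters because a failed or empty HBA query would otherwise flag every path on the system as extraneous.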
And finally, issue another LIP:
echo 1 | sudo tee /sys/class/fc_host/host?/issue_lip
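If the LIP needs to be issued from a script, a loop over the fc_host entries makes it clearer which hosts were touched. This is a sketch only: issue_lip_all and its optional sysfs root argument are assumptions, the latter added so the loop can be exercised against a scratch directory, and it must run as root because it writes to the files directly instead of going through sudo tee.

```shell
#!/usr/bin/env bash
# Sketch: issue a LIP on every Fibre Channel host under a sysfs root.
# The root defaults to /sys; passing another directory allows a dry run.
issue_lip_all() {
    local root="${1:-/sys}" lip count=0
    for lip in "$root"/class/fc_host/host*/issue_lip; do
        [ -e "$lip" ] || continue          # glob matched nothing: no FC hosts
        echo 1 > "$lip" || return 1
        echo "LIP issued on $(basename "$(dirname "$lip")")"
        count=$(( count + 1 ))
    done
    [ "$count" -gt 0 ]                     # fail if no host was found
}
```

Unlike the host? pattern above, host* also matches host numbers with more than one digit. Because issue_lip is asynchronous, the function returning does not mean the scan has finished; /var/log/messages still needs to be watched.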
Our device should be back to normal:
root@xxxxxx:TEST:~> powermt display dev=emcpowercz
Pseudo name=emcpowercz
Symmetrix ID=xxxxxx
Logical device ID=YYYY
state=alive; policy=SymmOpt; priority=0; queued-IOs=0
==============================================================================
---------------- Host ---------------   - Stor -   -- I/O Path -  -- Stats ---
###  HW Path                I/O Paths    Interf.   Mode    State  Q-IOs Errors
==============================================================================
   0 qla2xxx                sdgu      FA  5eB   active  alive      0      0
   1 qla2xxx                sdgv      FA 12eB   active  alive      0      0
   0 qla2xxx                sdor      FA  2gB   active  alive      0      0
   1 qla2xxx                sdox      FA 15gB   active  alive      0      0