Impending Physical Disk Failure (High Data Availability Risk)

What Caused the Problem?

A physical disk is reporting internal errors that could cause the physical disk to fail. If this physical disk fails before you follow these recovery steps, the virtual disks in the disk group will fail and all data on the virtual disks will be lost. The Recovery Guru Details area provides specific information you will need as you follow the Recovery Steps.

  Caution: Possible loss of data accessibility. No data has been lost; however, this failure needs to be resolved immediately. A loss of data accessibility may occur if the indicated physical disk fails before you follow these recovery steps.

  Caution: Electrostatic discharge can damage sensitive components. Always use proper antistatic protection when handling components. Touching components without using a proper ground may damage the equipment.

Important Notes

Check the Recovery Guru Details area to determine the current status and RAID level of the affected disk groups and virtual disks.

If: The current status/RAID level of the virtual disks is Optimal/RAID 0
Then: Go to the Recovering RAID 0 recovery steps.

If: The current status/RAID level of the virtual disks is Optimal/RAID 1, 5, 6, or 10
Then:
  • If a hot spare is rebuilding in the affected disk group, wait for the operation to complete before proceeding.
  • Although it is not required, you should stop all I/O to all virtual disks in the disk group associated with the affected physical disk and back up the data. If another physical disk fails in this disk group while you are performing this procedure, you may lose data accessibility.
  • Go to the Recovering RAID 1, 5, 6, or 10 recovery steps.

If: The current status/RAID level of the virtual disks is Degraded/RAID 1, 5, 6, or 10
Then: Go to the Recovery Guru procedure for Degraded Virtual Disk, which should also be listed in the Recovery Guru Summary area. Do not continue with this procedure.
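
The decision in the notes above can be summarized as a small lookup. The following is only an illustrative sketch: the function name choose_procedure and the returned strings are hypothetical, and the status and RAID level must still be read from the Recovery Guru Details area.

```python
def choose_procedure(status: str, raid_level: int) -> str:
    """Return the procedure the Important Notes point to for a status/RAID level pair."""
    redundant = raid_level in (1, 5, 6, 10)
    if status == "Optimal" and raid_level == 0:
        return "Recovering RAID 0"
    if status == "Optimal" and redundant:
        return "Recovering RAID 1, 5, 6, or 10"
    if status == "Degraded" and redundant:
        # This procedure must not be continued for degraded virtual disks.
        return "Degraded Virtual Disk (separate Recovery Guru procedure)"
    raise ValueError(
        f"combination not covered by this procedure: {status}/RAID {raid_level}"
    )
```

Any combination that the notes do not list raises an error rather than guessing a procedure.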

Recovering RAID 0

Use the following procedure if the affected virtual disks are RAID 0.

Recovery Steps

1

Stop all I/O to the affected virtual disks.

2

Back up all data on the affected virtual disks. (Step 5 will destroy all data on the affected virtual disks.)

Note: To the operating system (OS), a failed virtual disk is exactly the same as a failed non-RAID physical disk. Refer to the OS documentation for any special requirements concerning failed physical disks and perform them where necessary.

3

If any of the affected virtual disks are also source or target virtual disks in a copy operation that is either Pending or In Progress, you must stop the copy operation before continuing.

From the Modify tab, click the Manage virtual disk copies link, and select the virtual disks that are in a copy operation that you wish to stop.

4

If you have snapshot virtual disks associated with the affected virtual disks, these snapshot virtual disks will no longer be valid once you fail the physical disk in step 5.

Perform any necessary operations (such as backup) on the snapshot virtual disks, and then delete them. From the Modify tab, click the Delete virtual disks link to select the snapshot virtual disks for deletion.
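
Steps 1 through 4 are preconditions for the destructive step that follows. One way to keep track of them is an operator-confirmed checklist. The sketch below is hypothetical (the class Raid0Preflight and its field names are not part of Storage Manager) and only records what the operator has confirmed by hand.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Raid0Preflight:
    """Operator-confirmed checklist for steps 1 through 4 (hypothetical helper)."""
    io_stopped: bool = False         # step 1
    data_backed_up: bool = False     # step 2
    copies_stopped: bool = False     # step 3; set True if no copy operation exists
    snapshots_removed: bool = False  # step 4; set True if no snapshots exist

    def missing(self) -> List[str]:
        """Return the names of the checks that have not been confirmed."""
        return [name for name, done in vars(self).items() if not done]

    def safe_to_fail_disk(self) -> bool:
        """True only when every precondition has been confirmed."""
        return not self.missing()
```

For example, Raid0Preflight(io_stopped=True, data_backed_up=True).missing() lists the two checks that are still open, so the physical disk should not yet be failed.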

5

Perform the following steps on the affected physical disk shown in the Recovery Guru Details area to manually fail the physical disk prior to replacement.

  Caution: The data on the affected virtual disks will be lost once you perform this step. Be sure you have backed up your data before performing this step.

a

Open a Command Prompt, and type the following command:

SMcli -n <storageArray_name>;

where storageArray_name is the name of the storage array listed in the Details area.

Note: If you receive an error from this command, change your working directory to the directory that contains the SMcli executable.

b

Execute the following command in order to manually fail the physical disk:

set physicalDisk [enclosure_ID,slot_ID] operationalState=failed;

where enclosure_ID is the enclosure ID for the enclosure where the physical disk resides and slot_ID is the slot position within the enclosure.

Note: The commands above are case sensitive, and must be entered exactly as shown.
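
Because the command is case sensitive, it can help to generate it rather than type it. The sketch below builds the exact command string shown in step 5b from an enclosure ID and a slot ID. Passing the command to SMcli with the -c option is an assumption here (this procedure only shows -n), so check it against the CLI guide for your installed version. The array name Array1 in the usage note is a hypothetical example.

```python
from typing import List


def fail_disk_command(enclosure_id: int, slot_id: int) -> str:
    """Build the step 5b command with the exact casing this procedure shows."""
    if enclosure_id < 0 or slot_id < 0:
        raise ValueError("enclosure and slot IDs must be non-negative")
    return f"set physicalDisk [{enclosure_id},{slot_id}] operationalState=failed;"


def smcli_argv(storage_array_name: str, script_command: str) -> List[str]:
    """Build an argument list for SMcli.

    Assumption: -c passes a script command on the command line.
    This procedure itself only shows the -n option.
    """
    if not storage_array_name:
        raise ValueError("storage array name is required")
    return ["SMcli", "-n", storage_array_name, "-c", script_command]
```

For instance, smcli_argv("Array1", fail_disk_command(0, 3)) yields an argument list that could be reviewed by eye before anything is run. The sketch does not execute the command.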

6

If: You want to replace the failed physical disk with a new physical disk
Then:
  • Remove the physical disk (its status LED may be flashing amber).
  • Wait 30 seconds, and then insert the new physical disk.
  • Note: Wait until the replaced physical disk is ready (status LED is green) before going to step 7.

If: You want to use an existing unassigned physical disk or the in-use hot spare to replace the failed disk in the disk group
Then:
  • Click the Modify tab, and then select Replace Physical Disks.
  • Under Failed and Missing Physical Disks, select the physical disk that you want to replace.
  • Under Available replacement physical disks, select the physical disk that you want to use to replace the failed or missing physical disk.
  • Click Replace Physical Disk.
  • Note: If you choose the Hot Spare as a replacement for the failed or missing physical disk, the Hot Spare role will be changed to Assigned. A new Hot Spare must be assigned if that functionality is desired.
  • Note: Wait until the replaced physical disk is ready (status LED is green) before going to step 7.
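
If the wait in step 6 is scripted rather than watched at the status LED, it reduces to polling a readiness check with a timeout. The sketch below is generic: the is_ready callable is hypothetical and must be supplied by the operator, since this procedure only describes checking the LED, and the default timeout and polling interval are arbitrary assumptions.

```python
import time
from typing import Callable


def wait_until_ready(
    is_ready: Callable[[], bool],
    timeout_s: float = 3600.0,
    poll_s: float = 10.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> bool:
    """Poll is_ready until it returns True or timeout_s elapses.

    Returns True if the check succeeded, False if the wait timed out.
    """
    deadline = clock() + timeout_s
    while True:
        if is_ready():
            return True
        if clock() >= deadline:
            return False
        sleep(poll_s)
```

A return value of False means the replacement did not become ready within the timeout, in which case step 7 should not be started.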

7

a

Open a Command Prompt, and type the following command:

SMcli -n <storageArray_name>;

where storageArray_name is the name of the storage array listed in the Details area.

Note: If you receive an error from this command, change your working directory to the directory that contains the SMcli executable.

b

Execute the following command in order to initialize a virtual disk in the disk group:

start virtualDisk [virtualDiskName] initialize;

where virtualDiskName is a virtual disk in the disk group you wish to initialize.

Note: When initialization starts on a virtual disk, the icon changes to Operation in Progress in the Disk groups and Virtual Disks dialog. When initialization is completed, the virtual disk becomes Optimal.

c

Repeat step b for each virtual disk in the disk group.

d

Save this procedure by clicking the Save As button. After you perform step 8 and the failure is fixed, this procedure, including the steps that follow step 8, will no longer be available from the Recovery Guru.

Go to step 8.
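
Sub-steps b and c of step 7 amount to issuing one initialize command per virtual disk in the disk group. The sketch below generates those commands in the form shown in step 7b. The function name initialize_commands and the example names vd1 and vd2 are hypothetical. Use the virtual disk names that Storage Manager reports for the disk group.

```python
from typing import Iterable, List


def initialize_commands(virtual_disk_names: Iterable[str]) -> List[str]:
    """Return one 'start virtualDisk [...] initialize;' command per virtual disk."""
    commands = []
    for name in virtual_disk_names:
        if not name.strip():
            raise ValueError("virtual disk name must not be empty")
        commands.append(f"start virtualDisk [{name}] initialize;")
    return commands
```

Calling initialize_commands(["vd1", "vd2"]) produces two commands, one per virtual disk, which can be reviewed before they are entered.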

8

Click the Recheck button to rerun the Recovery Guru. The failure should no longer appear in the Summary area.

9

Add the affected virtual disks back to the operating system (refer to your Storage Manager software user guide for specific instructions on how to do this). You may need to reboot the system to see the re-initialized virtual disks.

Note: Do not start I/O to these virtual disks until after you restore from backup.

10

Restore the data for the affected virtual disks from backup.

11

If desired, create new snapshot virtual disks to replace the ones that you may have deleted in step 4.

12

If desired, re-create any copies you stopped by clicking the Manage virtual disk copies link on the Modify tab, and then selecting the virtual disks you wish to re-copy.

Recovering RAID 1, 5, 6, or 10

Use the following procedure if the affected virtual disks are RAID 1, 5, 6, or 10 and their current status is Optimal (see Important Notes).

Recovery Steps

1

Perform the following steps on the affected physical disk shown in the Recovery Guru Details area to manually fail the physical disk prior to replacement.

a

Open a Command Prompt, and type the following command:

SMcli -n <storageArray_name>;

where storageArray_name is the name of the storage array listed in the Details area.

Note: If you receive an error from this command, change your working directory to the directory that contains the SMcli executable.

b

Execute the following command in order to manually fail the physical disk:

set physicalDisk [enclosure_ID,slot_ID] operationalState=failed;

where enclosure_ID is the enclosure ID for the enclosure where the physical disk resides and slot_ID is the slot position within the enclosure.

Note: The commands above are case sensitive, and must be entered exactly as shown.

2

If: You want to replace the failed physical disk with a new physical disk
Then:
  • Remove the physical disk (its status LED may be flashing amber).
  • Wait 30 seconds, and then insert the new physical disk.
  • Wait until the replaced physical disk is ready (status LED is green).
  • Click the Recheck button to rerun the Recovery Guru to ensure that the failure has been fixed.

If: You want to use an existing unassigned physical disk or the in-use hot spare to replace the failed disk in the disk group
Then:
  • Click the Modify tab, and then select Replace Physical Disks.
  • Under Failed and Missing Physical Disks, select the physical disk that you want to replace.
  • Under Available replacement physical disks, select the physical disk that you want to use to replace the failed or missing physical disk.
  • Click Replace Physical Disk.
  • If you choose the Hot Spare as a replacement for the failed or missing physical disk, the Hot Spare role will be changed to Assigned. A new Hot Spare must be assigned if that functionality is desired.
  • Wait until the replaced physical disk is ready (status LED is green).
  • Click the Recheck button to rerun the Recovery Guru to ensure that the failure has been fixed.

Note: Additional information on this issue may be available. Please visit the Dell support website at support.dell.com and select your product model. Choose "troubleshooting" as your tool option, then search by this procedure title.