Case Study 1 – Replacement of SPS in EMC CX4-120 Storage

Client: – Net4India

Location: – Chennai

The Challenge: – The EMC storage shows an alert on the Standby Power Supply (SPS).

Scenario: – Enclosure SPE SPS A Failure

Error: – SPS A: (1.2KW) FLT

How To Replace an EMC CX3 / CX4 SPS:-

Replacing a faulty or dead EMC CX4-120 SPS is crucial to keeping your EMC CX4 series storage system up and running. A faulty SPS can cause performance issues, such as the write cache being disabled, and can prevent your system from working altogether.

Plan of Action:-

  1. Asked the customer to send the latest SP logs for analysis.
  2. Analyzed the logs once they were received from the customer.
  3. Found SPS A reporting a fault and SPS B cabling in an unknown state.
  4. Physically inspected the storage and identified the failed SPS by its amber fault indicator.

SPS B:  (1.2KW) OK but showing Unknown Configuration State.
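
The status lines quoted above can be checked mechanically when the same review has to be repeated across several log bundles. The following is a minimal Python sketch, assuming the log text contains lines shaped like the alerts quoted in this case study (`SPS A: (1.2KW) FLT`); real SP log output may be formatted differently, and the function names `parse_sps_status` and `faulted_units` are hypothetical helpers, not part of any EMC tool.

```python
import re

# Assumed line shape, modeled on the alerts quoted in this case study,
# e.g. "SPS A: (1.2KW) FLT". Real SP log output may be formatted differently.
SPS_LINE = re.compile(r"SPS\s+([AB]):\s*\(([^)]+)\)\s+(\w+)")


def parse_sps_status(text):
    """Return {unit: (rating, state)} for every SPS status line found in text."""
    return {m.group(1): (m.group(2), m.group(3)) for m in SPS_LINE.finditer(text)}


def faulted_units(status):
    """List the SPS units whose reported state is anything other than OK."""
    return sorted(unit for unit, (_, state) in status.items() if state != "OK")


sample = """SPS A: (1.2KW) FLT
SPS B:  (1.2KW) OK"""

status = parse_sps_status(sample)
print(status)
print(faulted_units(status))
```

An OK state on its own does not rule out a problem: SPS B reported OK here while its configuration state was unknown, so the cabling state still has to be reviewed separately.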

Solution: – Need to Replace SPS A.

After SPS A was replaced, the customer found that the EMC storage was performing slowly. We discussed this with the customer and asked him to share the latest SP logs taken after the replacement. The onsite engineer reported that the SPS LED was green, but when we analyzed the logs we found that the cabling state of both SPS units was unknown and the write cache was disabled.

Scenario: – Enclosure SPE SPS A Cabling State: Cabling Status is unknown

Enclosure SPE SPS B Cabling State: Cabling Status is unknown

Error: – Write Cache Disabled

This is a documented issue in the initial release of FLARE OS. The fix is to reboot the SP on the side reporting the error. Before restarting the SP, check for trespassed LUNs, multipathing, host connectivity, and any HDD predictive failures.


We took a remote session from the customer and checked the storage. The host had 4 paths connected. Some LUNs were connected to the Celerra NAS storage, and multipathing could not be confirmed for those LUNs, so we told the customer that we needed only an SP restart, with IOPS stopped for its duration.


The customer raised some queries:-

Our answers to their questions are below.

  1. How can we stop IOPS from the customers' end?
  • Ask the customers not to use the mapped LUNs until the reboot completes.

  2. How are you sure the Celerra (NFS) is not multipathed?
  • At the CX storage level we found 4 paths assigned to the Celerra NAS box, as seen in the SP logs.

  3. If rebooting SP A and SP B does not resolve the problem, what is your next plan?
  • Rebooting is the fix recommended by EMC, and we are confident it will resolve the issue.
  • We are rebooting only the controllers, one at a time, not power cycling the entire storage system.

 

Solution: – SP restart required.

POA: –

1)    Check that both SPs respond to ping.

2)    Collect fresh SP logs before starting the activity.

3)    Confirm that multipathing is configured on the host side.

4)    Check host connectivity.

5)    If the customer cannot confirm multipathing, confirm that IOPS are stopped from the user end.

6)    Restart SP A, and once SP A is back online, restart SP B.

7)    After the restart, check both SPs and confirm that the write cache is enabled.

After confirming the above points with the customer, we took a remote session and restarted SP A.

After the SP A reboot, the write cache was enabled on both SPs and the storage was working fine.
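
The plan of action above is essentially a gate: the restart starts only when every pre-check passes, and stopping IOPS is mandatory only when multipathing cannot be confirmed. The following Python sketch shows that decision logic under stated assumptions. The `PreChecks` fields and the `blockers` and `restart_order` helpers are hypothetical names introduced for illustration. Nothing here talks to the array; the engineer fills in the values after performing each check manually.

```python
from dataclasses import dataclass


@dataclass
class PreChecks:
    # Hypothetical fields mirroring the checklist; set by the engineer.
    both_sp_pingable: bool
    fresh_logs_collected: bool
    multipathing_confirmed: bool
    host_connectivity_ok: bool
    io_stopped: bool


def blockers(c):
    """Return the reasons the SP restart should not start yet."""
    reasons = []
    if not c.both_sp_pingable:
        reasons.append("an SP does not respond to ping")
    if not c.fresh_logs_collected:
        reasons.append("fresh SP logs not collected")
    if not c.host_connectivity_ok:
        reasons.append("host connectivity not verified")
    # Without confirmed multipathing, hosts may lose access while an SP is
    # down, so I/O has to be stopped from the user end first.
    if not c.multipathing_confirmed and not c.io_stopped:
        reasons.append("multipathing unconfirmed and I/O still running")
    return reasons


def restart_order(c):
    """SP A first, then SP B, and only when nothing blocks the activity."""
    return [] if blockers(c) else ["SP A", "SP B"]


# This case: multipathing unconfirmed for the Celerra LUNs, I/O stopped.
this_case = PreChecks(True, True, False, True, True)
print(blockers(this_case))
print(restart_order(this_case))
```

Returning the list of blockers, rather than a single yes/no, gives the engineer something concrete to report back to the customer before rescheduling the activity.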
