Cisco MDS 9513 - Zone changes issue

desingraja
Level 1

Dear team,

 

We are using Cisco MDS 9513 switches in our environment to connect the hosts to the Pure Storage array.

Whenever we make any zone changes, other hosts are affected: their paths log out of and back in to the storage array.

This is especially a problem with AIX and Solaris hosts, which take a long time to recover their paths.

Any help is highly appreciated.

 


Walter Dey
VIP Alumni

Hi

This is very strange and should not happen!

Can you please provide more information?

- What kind of changes are being made? Adding zones to the zoneset? Modifying zones?

- Are you using device aliases or fcaliases?

- Which version are you using?

- Are all devices doing a new FLOGI?

Hi Walter,

 

Please find the answers below.

Can you please provide more information?

- What kind of changes are being made? Adding zones to the zoneset? Modifying zones? -> Yes, adding zones to the zoneset.

- Are you using device aliases or fcaliases? -> Device aliases.

- Which version are you using? -> 6.2(3)

- Are all devices doing a new FLOGI? -> No.

 

Detailed description:

We have connected our Pure Storage array via Cisco MDS 9513 switches.

The host WWNs are logged in to the FC ports on the Cisco MDS switches, but sometimes the WWNs do not log in to the Pure Storage array. Sometimes we have to manually force the WWN to appear on the storage array (by disabling and enabling the FC port on the switch).

Especially with AIX and Solaris, the paths are detected very slowly on the storage end.

 

Did you check the MDS log file (show logging logfile)?

I hope you are also aware of this:

https://www.cisco.com/en/US/docs/storage/san_switches/mds9000/sw/san-os/quick/guide/qcg_ids.html

Warning: HP-UX and AIX are two operating systems that utilize the FC ID in the device path to the storage. For a switch to always assign the same FC ID to a device, persistent FC IDs and a static domain ID must be configured for the VSAN.

 

Can you please also provide the following information:

show flogi database vsan x

show fcns database vsan x

show zoneset active vsan x

show zoneset vsan x

 

Regarding AIX, see also:

https://supportforums.cisco.com/t5/storage-networking/ibm-power-8-vios-problem/td-p/2566824

 

https://supportforums.cisco.com/t5/storage-networking/problems-with-aix-hosts/td-p/1015757

 

 

Hi, attached is the output from both of our switches.

Q. Are these 2 MDS simply fabric A and fabric B, with all hosts and storage dual-homed, and no ISL between the two?

One MDS has 45 FLOGIs, the other 58. Why?

How often do the errors with the Pure storage happen (where you have to recover with shut / no shut)?

Did you check the MDS error log? I'm convinced that this should be seen there.

Q. Are these 2 MDS simply fabric A and fabric B, with all hosts and storage dual-homed, and no ISL between the two? - No, these 2 MDS are separate fabrics, with hosts and storage dual-homed. We have ISLs between the Cisco UCS and the MDS.

One MDS has 45 FLOGIs, the other 58. Why? - For DR purposes and vMotion, only a single HBA is connected from the VSAN ESX hosts.

How often do the errors with the Pure storage happen (where you have to recover with shut / no shut)? - Whenever there are any zone changes.

Did you check the MDS error log? I'm convinced that this should be seen there. - Yes.

Is the UCS system in FC end-host mode, and is NPIV enabled on the MDS?

Is Fabric Interconnect A connected to MDS 1 and B to MDS 2?

No ISL between MDS 1 and 2?

Do you have a static domain ID and persistent FC IDs?

It seems that this problem is reproducible?

Can you do a zoneset activation that produces the error, and then please post the error log entries?

The documentation clearly states:

Zone changes can be configured nondisruptively without interrupting traffic on unaffected ports or devices

Please find the answers below.

Is the UCS system in FC end-host mode, and is NPIV enabled on the MDS? - Yes, the UCS is in end-host mode and NPIV is enabled on the MDS.

Is Fabric Interconnect A connected to MDS 1 and B to MDS 2? - Yes.

No ISL between MDS 1 and 2? - Correct, no ISL.

Do you have a static domain ID and persistent FC IDs? - Can you please help me with how to check this?

It seems that this problem is reproducible? - Yes, sometimes it works and sometimes it doesn't. Yesterday we were setting up a new AIX VIO environment. After zoning, the LUNs were visible from the VIO end. But after a reboot of the VIO, the LUNs were missing. I can see that the virtual WWPNs from the VIO are logged in to the MDS, but on the Pure storage the WWPNs are not logged in. We have zoned 1 virtual WWPN.
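
On the question of how to check for a static domain ID and persistent FC IDs, here is a minimal sketch of the MDS commands that could be used. VSAN 10 and domain ID 10 are placeholder assumptions, not values from this thread, and the exact output wording may differ between releases.

```
! Placeholder values: VSAN 10 and domain ID 10 are assumptions, not from this thread.

! Check whether the configured domain ID is shown as "preferred" or "static"
show fcdomain vsan 10

! List the persistent FC ID entries recorded for the VSAN
show fcdomain fcid persistent vsan 10

! Possible configuration if the domain ID is not static
! (plan this for a maintenance window; see the note below)
configure terminal
 fcdomain domain 10 static vsan 10
 fcdomain fcid persistent vsan 10
end
```

As far as I know, a changed domain ID only takes effect after an fcdomain restart for that VSAN, which can be disruptive, so it is worth checking the configuration guide for your release before changing anything on a production fabric.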

Can you do a zoneset activation that produces the error, and then please post the error log entries?

Can you also post the error message seen on the AIX system?

I have seen cases where AIX used FC class 2 and class 3 frames (during FLOGI); you can force at least some HBAs to use only class 3!

Hi Walter,

Thanks for guiding me.
Unfortunately our AIX admins are not available to get the error right now, but I will certainly get it for you.

Can you help me with how to force the HBA to send only class 3 frames?


Also, from searching Google for various issues related to my situation, I suspect the following but am not sure how to proceed:

Could it be a problem related to RSCN?
A slow drain issue?
I ask because I can sometimes see the Solaris / AIX / Unix hosts lose paths when we do a reboot of the hosts, a failover of the storage controller, or a storage upgrade. At those times I need to manually force the logins by disabling and enabling the ports on the switch.
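
For the RSCN and slow drain suspicions above, a rough sketch of show commands that could be run on the MDS before and after a zoneset activation to compare the results. VSAN 10 and interface fc1/1 are placeholder assumptions, and this is only a starting point, not a complete diagnosis.

```
! Placeholder values: VSAN 10 and fc1/1 are assumptions, not from this thread.

! RSCN counters for the VSAN (compare before and after the zone change)
show rscn statistics vsan 10

! Devices that have registered to receive RSCNs (SCR table)
show rscn scr-table vsan 10

! Error and discard counters on a storage-facing or host-facing port
show interface fc1/1 counters

! Buffer-to-buffer credit state on the same port (relevant for slow drain)
show interface fc1/1 bbcredit
```

If a host or storage port is missing from the SCR table, it would not be notified of fabric changes, which might explain paths that only come back after a shut / no shut.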

Hi

Very likely this is not an MDS issue; at least I don't see any obvious errors. Therefore it is most likely AIX related (I have had tons of issues with such installations).

E.g. MDS supports class 2 and class 3 FC frames, Nexus 5k only class 3. Some HBAs can be forced to use only class 3! And I seem to remember that if the HBA supports both, AIX starts FLOGI with class 2!

Of course, the re-FLOGI of the Pure storage is odd and has to be resolved.

Can you confirm that the majority of issues are AIX related?

Yes, AIX is new. But we have faced issues with almost all Unix / Solaris servers.

Before the MDS, we were using Brocade, and these Solaris / Unix hosts did not have any issues with those switches. However, that was 3 years ago.

Sure, I will.

In the meantime, is there anything to be changed at the MDS switch port level related to class 2 / 3 frames?