Cisco Fabric Interconnect is showing six (6) pmon services as failed

alam1
Level 1

Our Cisco Fabric Interconnect shows several pmon services in the failed state:

==================================================

FI-B(local-mgmt)# show pmon state

SERVICE NAME STATE RETRY(MAX) EXITCODE SIGNAL CORE
------------ ----- ---------- -------- ------ ----
svc_sam_controller running 0(4) 0 0 no
svc_sam_dme running 1(4) 0 15 no
svc_sam_dcosAG failed 5(4) 0 15 no
svc_sam_bladeAG running 0(4) 0 0 no
svc_sam_portAG failed 5(4) 0 15 no
svc_sam_statsAG running 0(4) 0 0 no
svc_sam_hostagentAG running 0(4) 0 0 no
svc_sam_nicAG running 0(4) 0 0 no
svc_sam_licenseAG running 0(4) 0 0 no
svc_sam_extvmmAG failed 5(4) 0 15 no
httpd.sh running 0(4) 0 0 no
httpd_cimc.sh running 0(4) 0 0 no
svc_sam_sessionmgrAG failed 5(4) 0 6 yes
svc_sam_pamProxy running 0(4) 0 0 no
dhcpd running 0(4) 0 0 no
sam_core_mon running 0(4) 0 0 no
svc_sam_netSnmpAG running 0(4) 0 0 no
svc_sam_rsdAG failed 5(4) 0 15 no
svc_sam_svcmonAG running 0(4) 0 0 no
svc_sam_samcproxy failed 5(4) 0 11 yes
svc_sam_samcstatsproxy running 0(4) 0 0 no
mtuTune running 0(10) 0 0 no

=============================================

We have a cluster of two 6454 fabric interconnects running firmware 4.2(3e). The primary fabric interconnect does not seem to accept commands such as:

cluster lead a

pmon stop

pmon start

Can anyone confirm what is causing this issue and what the fix would be? Right now both fabric interconnects are passing traffic.

 

Note: a TAC case is in progress.

Accepted Solution

Steven Tardy
Cisco Employee

Fixed width output in non-fixed width font and consecutive spaces reduced to one space breaks my brain.
Cisco web tools don't do this any justice by not pasting properly.
Reformatted your output so I could see what was happening in that jumbled output:

SERVICE NAME             STATE  RETRY(MAX) EXITCODE SIGNAL CORE
------------             -----  ---------- -------- ------ ----
svc_sam_controller     running        0(4)        0      0   no
svc_sam_dme            running        1(4)        0     15   no
svc_sam_dcosAG          failed        5(4)        0     15   no
svc_sam_bladeAG        running        0(4)        0      0   no
svc_sam_portAG          failed        5(4)        0     15   no
svc_sam_statsAG        running        0(4)        0      0   no
svc_sam_hostagentAG    running        0(4)        0      0   no
svc_sam_nicAG          running        0(4)        0      0   no
svc_sam_licenseAG      running        0(4)        0      0   no
svc_sam_extvmmAG        failed        5(4)        0     15   no
httpd.sh               running        0(4)        0      0   no
httpd_cimc.sh          running        0(4)        0      0   no
svc_sam_sessionmgrAG    failed        5(4)        0      6  yes
svc_sam_pamProxy       running        0(4)        0      0   no
dhcpd                  running        0(4)        0      0   no
sam_core_mon           running        0(4)        0      0   no
svc_sam_netSnmpAG      running        0(4)        0      0   no
svc_sam_rsdAG           failed        5(4)        0     15   no
svc_sam_svcmonAG       running        0(4)        0      0   no
svc_sam_samcproxy       failed        5(4)        0     11  yes
svc_sam_samcstatsproxy running        0(4)        0      0   no
mtuTune                running       0(10)        0      0   no

I doubt this is the older bug CSCwa58954.

More likely it is the newer bug:

CSCwf39250: samcproxy fails due to multiple failed SSH login attempts

which is fixed in UCSM 4.3(2b).

TAC should be able to clear the issue without rebooting the Fabric Interconnect.
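
For anyone who wants to pick out the failed services from a pasted `show pmon state` dump without realigning it by hand, here is a minimal Python sketch. It is only an illustration, not a Cisco tool: the function names (`parse_pmon_state`, `failed_services`) are made up for this example, and the signal names assume the SIGNAL column carries standard Linux signal numbers (6 = SIGABRT, 11 = SIGSEGV, 15 = SIGTERM). It works on the single-space version of the output as well as the aligned one.

```python
import re

# One data row of `show pmon state`:
# name, state, retry(max), exit code, signal, core.
# The header and dashed separator lines do not match and are skipped.
ROW = re.compile(
    r"^(?P<name>\S+)\s+(?P<state>\S+)\s+(?P<retry>\d+)\((?P<max>\d+)\)\s+"
    r"(?P<exitcode>\d+)\s+(?P<signal>\d+)\s+(?P<core>yes|no)\s*$"
)

# Assumption: the SIGNAL column uses standard Linux signal numbers.
SIGNAL_NAMES = {6: "SIGABRT", 11: "SIGSEGV", 15: "SIGTERM"}

# A few rows copied from the output in this thread.
SAMPLE = """\
SERVICE NAME STATE RETRY(MAX) EXITCODE SIGNAL CORE
------------ ----- ---------- -------- ------ ----
svc_sam_controller running 0(4) 0 0 no
svc_sam_dme running 1(4) 0 15 no
svc_sam_dcosAG failed 5(4) 0 15 no
svc_sam_sessionmgrAG failed 5(4) 0 6 yes
svc_sam_samcproxy failed 5(4) 0 11 yes
mtuTune running 0(10) 0 0 no
"""


def parse_pmon_state(text):
    """Return one dict per service row found in the pasted output."""
    rows = []
    for line in text.splitlines():
        m = ROW.match(line.strip())
        if not m:
            continue  # header, separator, prompt, or blank line
        sig = int(m["signal"])
        rows.append({
            "name": m["name"],
            "state": m["state"],
            "retry": int(m["retry"]),
            "max_retry": int(m["max"]),
            "exitcode": int(m["exitcode"]),
            "signal": sig,
            "signal_name": SIGNAL_NAMES.get(sig, "none" if sig == 0 else f"signal {sig}"),
            "core": m["core"] == "yes",
        })
    return rows


def failed_services(rows):
    """Keep only the services whose STATE column reads 'failed'."""
    return [r for r in rows if r["state"] == "failed"]


if __name__ == "__main__":
    for r in failed_services(parse_pmon_state(SAMPLE)):
        core = "core dumped" if r["core"] else "no core"
        print(f'{r["name"]}: retry {r["retry"]}({r["max_retry"]}), {r["signal_name"]}, {core}')
```

In the output above, the two services that left core files (svc_sam_sessionmgrAG and svc_sam_samcproxy) are the ones that ended on signals 6 and 11, while the other failed services ended on signal 15.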

 

 


5 Replies

marce1000
VIP

 

 - Possibly not a complete match, but I noted it: https://bst.cloudapps.cisco.com/bugsearch/bug/CSCwa58954

 M.



-- ' 'Good body every evening' ' this sentence was once spotted on a logo at the entrance of a Weight Watchers Club !


Thanks, that is helpful. But how do we get into the debug shell from there? It looks like we are not able to switch over from B to A as the primary fabric interconnect, as it is not accepting this command.

The debug shell on FI-6454 is only available to TAC via a challenge-response mechanism.

I am happy to inform you that the issue with the UCS GUI has been resolved. I worked with TAC and they confirmed that it was the same bug that Steven had identified. We followed the steps suggested to kill and restart the pmon process for user1&2 &  and clear the cores. After that, the UCS GUI was accessible again. Thank you for your patience and support.
