Devices stuck in UNKNOWN state

Vag
Level 1

I have 2 devices stuck in unknown state. I've tried to re-discover them, but there was no change in their status.

Then I tried to delete them and I got the following error message:

apic-em 1.jpg

apic-em 2.jpg

I have restarted the APIC-EM server, but I still get the same error messages. The discovery doesn't show any errors.

Any ideas how to fix this?

23 Replies

I went through the requested logs and they were enormous, so I tried to narrow them down to make troubleshooting easier.

I took a backup of APIC-EM and a snapshot in VMware, deleted all of the devices except those that could not be deleted, cleared the logs, and then performed discovery, resync and delete operations on the remaining stuck devices.

Then I restarted the APIC-EM server and repeated.

I attached the requested log files that were produced.

pmuthuva
Cisco Employee

Hi Vag,

The logs show some exceptions related to RawCliInfo table entries, and we have an open defect for this issue of devices being stuck in the 'Unknown' state.

For the device deletion issue, there are no relevant logs available. Enabling DEBUG logs and sharing the same set of logs again would help us debug the delete failure.

Commands to enable DEBUG logs:

sudo /opt/CSCOlumos/bin/setLogLevel.sh inventory DEBUG

sudo /opt/CSCOlumos/bin/setLogLevel.sh discovery DEBUG

These commands have to be run on the VM where apic-em-inventory-manager-service is running.

I have attached the requested logs, produced by following the same procedure and enabling the debug options (delete, discovery etc).

pmuthuva
Cisco Employee

Looking at the logs, the root cause is the same for both issues: the devices going to the 'Unknown' state and the failure to delete them. There is an open defect for this issue.

Is this related to an issue that is expected to be fixed in the next update release?

pmuthuva
Cisco Employee

Work on the bug behind these issues is still in progress.

The workaround is to clean up the corrupted entries from the DB manually. Please schedule a WebEx if you would like the data cleaned up manually.

Thanks,

Pragatheeswary M

Hello pmuthuva,

I am having the same issue. Can you please help me with that validation? My devices also stay in the "Unknown" state.

Best Regards,

ddeitsch1
Level 1

Did you run into this issue after an upgrade? I feel like I ran into something similar after upgrading from 1.2.X to 1.3.X and then to 1.4.X in succession. I ended up doing a reset grapevine to get the entries cleared and that worked.

I've performed several upgrades since the initial installation, so I am not certain when it first occurred.

I have about 50 sites and 300+ network devices, so resetting grapevine and configuring everything again is not something I would like to do.