
PXGrid 2.0 High Availability - Primary Pan Outage

ryanbess
Level 1

I have an EVE-NG lab that consists of a Panorama, one firewall, and four ISE nodes (see the attached lab.jpg). The ISE nodes are running 3.2 Patch 5, and Panorama and the firewall are running 10.2.7-h3. The 3.1 and 3.2 admin guides both state the following:

High Availability for pxGrid 2.0
pxGrid 2.0 nodes operate in an Active/Active configuration. For high availability, there should be at least two
pxGrid nodes in the deployment. Large deployments can have up to four nodes for increased scale and
redundancy. We recommend that you configure IP addresses for all the nodes, so that if one node goes down,
that node's clients connect to the working node. When the PAN goes down, the pxGrid server stops handling
the activations. Manually promote the PAN to activate the pxGrid server. For more information about pxGrid
deployments, see Performance and Scalability Guide for Cisco Identity Services Engine
All the pxGrid service provider clients periodically reregister themselves with the pxGrid controller within a
span of 7.5 minutes. If the client does not reregister, the PAN node assumes that the client is inactive and
deletes the client. If the PAN node goes down for more than 7.5 minutes, when it comes back up, it deletes
all the clients with timestamp values older than 7.5 minutes. All those clients must then register again with
the pxGrid controller.
pxGrid 2.0 clients use WebSocket and REST-based APIs for pub/sub and query. These APIs are served by
the ISE application server on port 8910. The pxGrid processes shown by show logging application pxgrid
don’t apply to pxGrid 2.0.
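
To make the 7.5-minute rule in the excerpt concrete, here is a minimal sketch of the cleanup it describes. This is only an illustration of the documented behavior, not ISE code: the function name, the client names, and the dictionary layout are all hypothetical.

```python
from datetime import datetime, timedelta

# Re-registration window quoted in the admin guide (7.5 minutes).
REREGISTER_WINDOW = timedelta(minutes=7.5)

def prune_stale_clients(last_seen, now):
    """Keep only clients whose last registration falls inside the window.

    last_seen is a hypothetical dict: client name -> last registration time.
    """
    return {
        name: ts
        for name, ts in last_seen.items()
        if now - ts <= REREGISTER_WINDOW
    }

now = datetime(2024, 1, 1, 12, 0, 0)
clients = {
    "client-a": now - timedelta(minutes=2),  # re-registered recently
    "client-b": now - timedelta(minutes=9),  # older than 7.5 minutes
}
remaining = prune_stale_clients(clients, now)
print(sorted(remaining))
```

Per the guide, a client dropped this way would have to register with the pxGrid controller again.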

In addition, https://youtu.be/_aO6oZrYCPE?si=qNlpxHO8ECX2sSXV&t=751 states that pxGrid stops working when the PAN goes down.

In my lab, I took down node-1, which is the primary PAN (automatic failover is not enabled). I expected NOT to see any newly registered endpoints, but I DO. Am I misunderstanding the admin docs?

4 Replies

I'm not 100% sure, but in my experience with pxGrid 2.0 everything still works even when the primary PAN is down. That said, I don't remember ever testing this thoroughly. Based on the documentation you pointed out, an active PAN seems to be required for new registrations to happen, so I wonder whether there is a timer that has to expire before new registrations stop. Not sure!

I have a case open with TAC to confirm whether what I'm seeing is expected. If it is, I will ask them to update their documentation.

Just got off the call with TAC. They acknowledged that their public documentation is wrong and will get it fixed. That said, it's pretty clear that even the internal TAC documentation is incorrect, as we were able to disprove it as well. The engineer is going to discuss it internally.

OK, so it turns out this has nothing to do with the PAN at all. As long as both an M&T node and a pxGrid node are online, pxGrid services continue to function (for new and existing sessions). The interesting thing is that for IP-to-SGT mappings, the PSNs update the secondary M&T with data, which then pushes it out to the available pxGrid nodes. More to come.
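
For anyone who wants to reason about which outages matter, the condition above can be sketched as a simple check. This is only a model of what I observed, not anything from ISE itself: the function, the persona labels, and the node-to-persona layout below are hypothetical and do not reflect the actual lab.jpg.

```python
def pxgrid_available(nodes):
    """Return True if at least one M&T node and one pxGrid node are online.

    nodes is a hypothetical list of dicts with 'name', 'personas', 'online'.
    """
    online = [n for n in nodes if n["online"]]
    has_mnt = any("MnT" in n["personas"] for n in online)
    has_pxgrid = any("pxGrid" in n["personas"] for n in online)
    return has_mnt and has_pxgrid

# Hypothetical persona layout; node-1 (primary PAN) is down.
lab = [
    {"name": "node-1", "personas": {"PAN", "MnT"}, "online": False},
    {"name": "node-2", "personas": {"PAN", "MnT"}, "online": True},
    {"name": "node-3", "personas": {"PSN", "pxGrid"}, "online": True},
    {"name": "node-4", "personas": {"PSN", "pxGrid"}, "online": True},
]
print(pxgrid_available(lab))
```

In this model the PAN persona never enters the check, which matches the finding that the primary PAN outage by itself did not stop pxGrid.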