Application Centric Operations!

igino · ‎04-09-2015

Fresh from Cisco Live, I’m having difficulty finding the time to catch my breath. With so many exciting things happening, I’ve been challenged to keep up with e-mails, let alone sleep! Sleep can wait. One of the projects I’m most enthused by is our progress in the area of monitoring, managing and troubleshooting in ACI. At Cisco, we call this Day 2 Operations.

Among the driving motivations in designing an Application Centric Infrastructure (ACI) has been to empower network administrators. Our aim is to put the confusion of inherited access control lists and the waste of overprovisioning behind us. To do this, we’ve needed to shift our thinking, and while changes can take us out of our comfort zones, the progress we’ll make is well worth it.

In the adoption of any new technology there are two forces at work: push and pull. I think of the push as the external pressures, from bosses and from the competition, to make sure our organizations don’t get left behind and that we are ‘future proof’. While this force is a considerable motivator, it’s not what drives our passion as network engineers. Progress for progress’ sake is stressful. Rather, what gets us out of bed in the morning is the pull: the moment we see new possibilities through a technology. One of the largest pulls of ACI has been the fundamental shift in the way that network administrators control their networks. That change in mentality has been the motivator in the conversations I have been having lately.

When a network administrator asks me, “So…why ACI?”, as an engineer, my temptation is to run deep: logical and concrete models, Clos topology, whitelist policy. But that’s the what of ACI, and it misses the crucial word: why. When the discussion starts with, “ACI is how you are going to take control of your data center”, the why quickly follows: “so that you can sleep at night, knowing exactly what is happening on your network”. If you’re reading this and thinking, “I already know what’s going on in my network”, I hope you’re enjoying those Ramen noodles, because chances are you’re an academic. Day 2 operations in the real world can be a scary place. Let me give you an example.

The video (link below) takes the user through one of the toughest situations to troubleshoot: degraded performance between hosts. Network admins dread the late-night page: a key service or application is acting sluggishly. If it has high visibility, failure to rectify the problem could turn this into a career-limiting night. The conventional approach is to draw out the network, complete with L2 and L3 addresses, and log in to each machine one by one to check for warnings and monitor counters. This process is tedious at best, futile at worst. If you’re unlucky enough to have a 5% packet drop, finding the culprit can take quite a while.

The troubleshooter demo shows how ACI makes this a much more manageable scenario. In the video, a user enters the source and destination IP or physical addresses as well as the time period in which he or she would like to view activity. From there, the APIC is able to display a network topology with only the components affecting the two hosts. Included in the view will be all spines, the relevant leaves, any fabric extenders, and services such as firewalls and load balancers. Any problem areas will be color-coded to show the severity of their warnings. In this specific case, the problem happens to be ingress packet drop on the spine. What could have taken hours is discovered in a matter of minutes. Even more powerful is the ability to demonstrate that the network is behaving properly. Of course, if the user would rather send all the information directly to TAC or an integration partner, there is an easy way to do that as well.
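To make the idea concrete, here is a minimal Python sketch of the kind of filtering described above: given two endpoints and a time window, keep only the nodes that sit on a path between them, then report the worst warning on each. This is not how the APIC implements the troubleshooter, and it does not call any APIC API. The toy topology, the fault records, the node names and the helper functions (`hop_counts`, `nodes_between`, `worst_faults`) are all assumptions made up for illustration.

```python
from collections import deque

# Hypothetical toy fabric (node -> directly connected peers). Illustrative only.
LINKS = {
    "host-a": ["leaf-101"],
    "host-b": ["leaf-102"],
    "leaf-101": ["host-a", "spine-201", "spine-202"],
    "leaf-102": ["host-b", "spine-201", "spine-202"],
    "leaf-103": ["spine-201", "spine-202"],
    "spine-201": ["leaf-101", "leaf-102", "leaf-103"],
    "spine-202": ["leaf-101", "leaf-102", "leaf-103"],
}

# Hypothetical fault records: (node, severity, description, minute observed).
FAULTS = [
    ("spine-201", "major", "ingress packet drop", 130),
    ("leaf-103", "minor", "fan warning", 125),        # not between the two hosts
    ("spine-202", "warning", "interface flap", 10),   # outside the time window
]

# Higher rank means more severe; used to pick the "color" shown for a node.
SEVERITY_RANK = {"warning": 1, "minor": 2, "major": 3, "critical": 4}


def hop_counts(links, start):
    """Breadth-first search: number of hops from start to each reachable node."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for peer in links.get(node, []):
            if peer not in dist:
                dist[peer] = dist[node] + 1
                queue.append(peer)
    return dist


def nodes_between(links, src, dst):
    """Return every node that lies on some shortest path from src to dst."""
    from_src = hop_counts(links, src)
    from_dst = hop_counts(links, dst)
    if dst not in from_src:
        return set()  # the two endpoints are not connected at all
    total = from_src[dst]
    return {
        node for node in from_src
        if node in from_dst and from_src[node] + from_dst[node] == total
    }


def worst_faults(faults, nodes, start, end):
    """Most severe fault per relevant node, limited to the chosen time window."""
    worst = {}
    for node, severity, description, minute in faults:
        if node not in nodes or not (start <= minute <= end):
            continue
        if node not in worst or SEVERITY_RANK[severity] > SEVERITY_RANK[worst[node][0]]:
            worst[node] = (severity, description)
    return worst


relevant = nodes_between(LINKS, "host-a", "host-b")
print(sorted(relevant))
print(worst_faults(FAULTS, relevant, start=120, end=180))
```

With this made-up data, `leaf-103` drops out of the view because no path between the two hosts crosses it, and the only fault left standing is the ingress packet drop on `spine-201`. An empty result is useful too: it is the sketch's version of demonstrating that the network is behaving properly.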

Despite being so easy to use, the troubleshooter has some very advanced functionality. In future posts, I look forward to diving deeper into these, but for now I’ll just list them:

iTraceroute
Service Ping
Statistics
Change Tracker
Atomic Counters
Contract Visibility

The feedback has been very positive and I’m extremely excited to share the troubleshooter’s current capabilities as well as work together with users to unlock its full potential. My plan is to use this space to include technical explanations as well as real-world examples designed to show the troubleshooter tool in action. As always, comments are welcome. I look forward to working with you!
