
ASA5510 - Packet loss and underrun but low CPU

AK59
Level 1

Dear all,

 

I'm writing to you about a big headache I have with an active/passive ASA 5510 cluster.

Both units have been updated to the latest version (9.1.7).

 

For the past 4 to 5 months, users have been complaining that their IP phones reboot 5 to 6 times a day. This happens when the IP phone cannot reach its server; the firewall sits between the server and the IP phones. We also get complaints about random latency when other devices at the same location as the IP phones try to reach their servers (which are at the same location as the IP phone server).

 

After some investigation, I observed a lot of CPU hogs on the Dispatch Unit, as well as overruns.

 

These are the statistics from the interface on the server side:

 

771705546 packets input, 169543247259 bytes, 0 no buffer
Received 8175 broadcasts, 0 runts, 0 giants
604 input errors, 0 CRC, 0 frame, 604 overrun, 0 ignored, 0 abort
0 pause input, 0 resume input
0 L2 decode drops
543216229 packets output, 132318002309 bytes, 0 underruns
0 pause output, 0 resume output
0 output errors, 0 collisions, 2 interface resets
0 late collisions, 0 deferred
0 input reset drops, 0 output reset drops, 0 tx hangs
input queue (blocks free curr/low): hardware (255/230)
output queue (blocks free curr/low): hardware (255/108)
 

These are the statistics from the interface on the device side:


528796173 packets input, 128631811627 bytes, 0 no buffer
Received 24794 broadcasts, 0 runts, 0 giants
256 input errors, 0 CRC, 0 frame, 256 overrun, 0 ignored, 0 abort
0 pause input, 0 resume input
0 L2 decode drops
762513350 packets output, 168836484510 bytes, 5829 underruns
0 pause output, 0 resume output
0 output errors, 0 collisions, 2 interface resets
0 late collisions, 0 deferred
1 input reset drops, 0 output reset drops, 0 tx hangs
input queue (blocks free curr/low): hardware (255/230)
output queue (blocks free curr/low): hardware (255/0)
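To put the overrun and underrun counters above in proportion to the traffic, here is a minimal Python sketch that pulls the counters out of pasted `show interface` text and expresses them per million packets. The function name `error_rates` and the regular expressions are illustrative assumptions based only on the output format shown in this post, not a Cisco tool.

```python
import re

def error_rates(show_interface_text):
    """Return overruns/underruns per million packets (hypothetical helper)."""
    def counter(pattern):
        # First integer that precedes the given keyword, or 0 if absent.
        match = re.search(pattern, show_interface_text)
        return int(match.group(1)) if match else 0

    packets_in = counter(r"(\d+) packets input")
    packets_out = counter(r"(\d+) packets output")
    overruns = counter(r"(\d+) overrun")
    underruns = counter(r"(\d+) underruns")

    return {
        "overrun_ppm": 1e6 * overruns / packets_in if packets_in else 0.0,
        "underrun_ppm": 1e6 * underruns / packets_out if packets_out else 0.0,
    }

if __name__ == "__main__":
    # Counters copied from the device-side interface above.
    device_side = """
    528796173 packets input, 128631811627 bytes, 0 no buffer
    256 input errors, 0 CRC, 0 frame, 256 overrun, 0 ignored, 0 abort
    762513350 packets output, 168836484510 bytes, 5829 underruns
    """
    print(error_rates(device_side))
```

Keep in mind that these are averages since the counters were last cleared; overruns and underruns tend to happen in bursts, so a low average can still hide short periods of heavy loss.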

 

 

I don't have much throughput (averaging 30 Mbps overall), the CPU is fine (22%) and so is the memory (350 MB used out of 1024 MB). I'm averaging 6,000 connections.

 

The usage is good too I guess

 

Resource          Current    Peak      Limit
SSH Server              1       1          5
ASDM                    1       1         30
Syslogs [rate]        288    3847        N/A
Conns                5729   11205     130000
Xlates                  4       4        N/A
Hosts                4346    4370        N/A
Conns [rate]          281    1232        N/A
Inspects [rate]       317     919        N/A
Routes                 36      36  unlimited

 

But I'm experiencing cpu-hog on dispatch unit 

 


Process: Dispatch Unit, PROC_PC_TOTAL: 765852, MAXHOG: 65, LASTHOG: 3
LASTHOG At: 13:12:49 CEDT Aug 19 2021
PC: 0x082a4838 (suspend)

Process: Dispatch Unit, NUMHOG: 741108, MAXHOG: 65, LASTHOG: 3
LASTHOG At: 13:12:49 CEDT Aug 19 2021
PC: 0x082a4838 (suspend)
Call stack: 0x082a4838 0x0806a65c

Process: Dispatch Unit, PROC_PC_TOTAL: 547272, MAXHOG: 52, LASTHOG: 4
LASTHOG At: 13:12:50 CEDT Aug 19 2021
PC: 0x082a4a8c (suspend)

Process: Dispatch Unit, NUMHOG: 544708, MAXHOG: 52, LASTHOG: 4
LASTHOG At: 13:12:50 CEDT Aug 19 2021
PC: 0x082a4a8c (suspend)
Call stack: 0x082a4a8c 0x0806a65c
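For anyone who wants to compare hog output between the two units or over time, the following Python sketch summarizes the entries above. It is only an illustration: the helper name `summarize_hogs` is hypothetical, the regular expression is written against the exact layout pasted here, and MAXHOG/LASTHOG are assumed to be in milliseconds.

```python
import re

# Matches only the "NUMHOG" lines; the "PROC_PC_TOTAL" lines are skipped.
HOG_RE = re.compile(
    r"Process:\s*(?P<proc>[^,]+),\s*NUMHOG:\s*(?P<numhog>\d+),"
    r"\s*MAXHOG:\s*(?P<maxhog>\d+),\s*LASTHOG:\s*(?P<lasthog>\d+)"
)

def summarize_hogs(text):
    """Return (process, numhog, maxhog, lasthog) tuples, most hogs first."""
    rows = [
        (m["proc"].strip(), int(m["numhog"]), int(m["maxhog"]), int(m["lasthog"]))
        for m in HOG_RE.finditer(text)
    ]
    return sorted(rows, key=lambda row: row[1], reverse=True)

if __name__ == "__main__":
    pasted = """
    Process: Dispatch Unit, NUMHOG: 544708, MAXHOG: 52, LASTHOG: 4
    Process: Dispatch Unit, NUMHOG: 741108, MAXHOG: 65, LASTHOG: 3
    """
    for proc, numhog, maxhog, lasthog in summarize_hogs(pasted):
        # Durations printed under the assumption that they are milliseconds.
        print(f"{proc}: {numhog} hogs, longest {maxhog} ms, last {lasthog} ms")
```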

 

I also tried to make sense of the ASP drops:

 

Frame drop:
Flow is denied by configured rule (acl-drop) 392040
First TCP packet not SYN (tcp-not-syn) 274395
TCP failed 3 way handshake (tcp-3whs-failed) 1502
TCP RST/FIN out of order (tcp-rstfin-ooo) 215421
TCP RST/SYN in window (tcp-rst-syn-in-win) 35
ICMP Error Inspect no existing conn (inspect-icmp-error-no-existing-conn) 4
Dropped pending packets in a closed socket (np-socket-closed) 54

Last clearing: 17:18:05 CEDT Aug 18 2021 by enable_15

Flow drop:
Inspection failure (inspect-fail) 3968
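To see at a glance which drop reasons dominate, the ASP drop output above can be ranked by count. This is a rough Python sketch; the name `rank_asp_drops` and the line pattern are assumptions derived from the output pasted here, so other ASA versions may need a different pattern.

```python
import re

# One drop line looks like: "<description> (<reason-name>) <count>"
ASP_RE = re.compile(
    r"^\s*(?P<desc>.+?)\s+\((?P<name>[a-z0-9-]+)\)\s+(?P<count>\d+)\s*$",
    re.MULTILINE,
)

def rank_asp_drops(text):
    """Return (reason, count, percent_of_total) tuples, largest count first."""
    drops = [(m["name"], int(m["count"])) for m in ASP_RE.finditer(text)]
    total = sum(count for _, count in drops)
    ranked = sorted(drops, key=lambda drop: drop[1], reverse=True)
    return [
        (name, count, 100.0 * count / total if total else 0.0)
        for name, count in ranked
    ]

if __name__ == "__main__":
    pasted = """
    Frame drop:
    Flow is denied by configured rule (acl-drop) 392040
    First TCP packet not SYN (tcp-not-syn) 274395
    TCP RST/FIN out of order (tcp-rstfin-ooo) 215421
    Last clearing: 17:18:05 CEDT Aug 18 2021 by enable_15
    """
    for name, count, share in rank_asp_drops(pasted):
        print(f"{name:20s} {count:8d} {share:5.1f}%")
```

Header lines such as "Frame drop:" and the "Last clearing" line do not match the pattern and are ignored.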

 

I just can't find where my problem is... can someone help me, please?


balaji.bandi
Hall of Fame

Do you have a high-level network diagram? What do you see on the switch the device is connected to?

Is the switch OK? Any logs?

 

If you reboot the ASA, does the problem go away?

 

Here is some troubleshooting guidance:

 

https://www.cisco.com/c/en/us/support/docs/security/asa-5500-x-series-next-generation-firewalls/115985-asa-overrun-product-tech-note-00.html

 

 

BB

***** Rate All Helpful Responses *****

How to Ask The Cisco Community for Help

Well, I don't have a network diagram to show, but it goes like this:

 

Servers<=>Nexus<=>ASA<=>Router<=>Switches<=>Devices

 

The devices come from various locations and various switches. 

The servers come from the same farm and the nexus interface statistics show no errors. 

 

If there is a problem, it would be either the ASA or the router.

 

The ASA has been reloaded more than once...

After the ASA reboot, was the issue fixed? Even for some time?

 

Router<=>Switches<=>Devices

 

We need to check the router and switch logs.

 

BB


It did for about 24 hours... I don't have control over any of the routers or switches... I would like to focus on the ASA, as I suspect the problem comes from there.

The information so far is not enough for us to suggest anything. Unless we can see more details from the devices connected around the ASA, it is very hard to identify the issue (I am afraid I can't be of much help here).

 

BB


AK59
Level 1

I just checked the adjacent router (brand: Hirschmann); there are no errors on its interfaces, nor on the Nexus ones.

 

My next possibilities are:

 

  1. Enable flow control. My question is: is it compatible with a non-Cisco device? Do I have to change any parameters on the adjacent device?
  2. Switch over to the secondary unit. Is it worth trying, or will the problem remain within the cluster?
  3. Change the port on the switch. Is it possible that a faulty NIC is the problem? (I have underruns and overruns on both interfaces, in and out.)

 

Thanks in advance, 

AK59
Level 1

Hi everyone,

 

Here is a little update on the situation.

I haven't enabled flow control yet.

I did a failover to check whether the problem also occurs on the second unit... and it does: I still have underruns and overruns when running on the second unit.

 

But now I'm really suspicious of the ASP drop rate. I get a lot of "First TCP packet not SYN".

When I capture the traffic, the offending packets come from both sides (inside to outside and outside to inside).

 

 

 

Hi,

This could be hardware oversubscription on your 5510.

Was there any recent change in your network environment, e.g. additional VLANs/users or an application eating up bandwidth?

Do you have port/traffic monitoring on the ASA, e.g. PRTG or SolarWinds?

Could you post the output of these commands:

show process cpu-usage sorted non-zero 

show conn count

show local-host | include host|count/limit

 
