Repeating syslog Message and then Crash

jay-berringer
Level 1

We're seeing a continuous string of the following messages on a 9407 running 17.9.5. Eventually the switch crashes. I can't find any info on what these messages mean exactly or whether we're looking at a software or hardware fault. The supervisor in question is new, so I'm leaning towards hardware, but I would like to know what these messages mean.

Apr 24 12:23:28: %IOSXE-3-PLATFORM: R0/0: kernel: ardbeg_iodma_cache_update (line 1930): IO Desc in cache: 0x740000000716020e 0x0000000000000500 0x0000000000000000 0x0000000000000000
Apr 24 12:23:58: %IOSXE-3-PLATFORM: R0/0: kernel: ardbeg_iodma_cache_update (line 1921): Returning IODMA io desc mismatch, bus_id 4 start 0 cnt 16 ndx 12 jiffies 4305641022
Apr 24 12:23:58: %IOSXE-3-PLATFORM: R0/0: kernel: ardbeg_iodma_cache_update (line 1925): updating IO Desc: 0x740000000406020e 0x0000000000000000 0x0000000000000000 0x0000000000000000
Apr 24 12:23:58: %IOSXE-3-PLATFORM: R0/0: kernel: ardbeg_iodma_cache_update (line 1930): IO Desc in cache: 0x740000000416020e 0x0000000000000000 0x0000000000000000 0x0000000000000000
Apr 24 12:23:59: %IOSXE-3-PLATFORM: R0/0: kernel: ardbeg_iodma_cache_update (line 1921): Returning IODMA io desc mismatch, bus_id 5 start 0 cnt 16 ndx 12 jiffies 4305641084
Apr 24 12:23:59: %IOSXE-3-PLATFORM: R0/0: kernel: ardbeg_iodma_cache_update (line 1925): updating IO Desc: 0x740000000506020e 0x0000000000000000 0x0000000000000000 0x0000000000000000
Apr 24 12:23:59: %IOSXE-3-PLATFORM: R0/0: kernel: ardbeg_iodma_cache_update (line 1930): IO Desc in cache: 0x740000000516020e 0x0000000000000000 0x0000000000000000 0x0000000000000000
Apr 24 12:23:59: %IOSXE-3-PLATFORM: R0/0: kernel: ardbeg_iodma_cache_update (line 1921): Returning IODMA io desc mismatch, bus_id 4 start 16 cnt 16 ndx 12 jiffies 4305641146
Apr 24 12:23:59: %IOSXE-3-PLATFORM: R0/0: kernel: ardbeg_iodma_cache_update (line 1925): updating IO Desc: 0x740000000406020e 0x0000000000000500 0x0000000000000000 0x0000000000000000
Apr 24 12:23:59: %IOSXE-3-PLATFORM: R0/0: kernel: ardbeg_iodma_cache_update (line 1930): IO Desc in cache: 0x740000000416020e 0x0000000000000500 0x0000000000000000 0x0000000000000000
Apr 24 12:23:59: %IOSXE-3-PLATFORM: R0/0: kernel: ardbeg_iodma_cache_update (line 1921): Returning IODMA io desc mismatch, bus_id 6 start 0 cnt 16 ndx 12 jiffies 4305641208
Apr 24 12:23:59: %IOSXE-3-PLATFORM: R0/0: kernel: ardbeg_iodma_cache_update (line 1925): updating IO Desc: 0x740000000606020e 0x0000000000000000 0x0000000000000000 0x0000000000000000
Apr 24 12:23:59: %IOSXE-3-PLATFORM: R0/0: kernel: ardbeg_iodma_cache_update (line 1930): IO Desc in cache: 0x740000000616020e 0x0000000000000000 0x0000000000000000 0x0000000000000000
Apr 24 12:23:59: %IOSXE-3-PLATFORM: R0/0: kernel: ardbeg_iodma_cache_update (line 1921): Returning IODMA io desc mismatch, bus_id 5 start 16 cnt 16 ndx 12 jiffies 4305641270
Apr 24 12:23:59: %IOSXE-3-PLATFORM: R0/0: kernel: ardbeg_iodma_cache_update (line 1925): updating IO Desc: 0x740000000506020e 0x0000000000000500 0x0000000000000000 0x0000000000000000
Apr 24 12:23:59: %IOSXE-3-PLATFORM: R0/0: kernel: ardbeg_iodma_cache_update (line 1930): IO Desc in cache: 0x740000000516020e 0x0000000000000500 0x0000000000000000 0x0000000000000000
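
For anyone trying to read the dump above, here is a minimal sketch (Python, not from the original post) that pairs each "mismatch" line with the "updating" and "in cache" descriptors that follow it and XORs the two. The function names `parse` and `diff_words` are made up for illustration, and the sample is two of the triples from the log with the timestamp prefix trimmed. On the lines posted, each pair differs only in one bit (0x100000) of the first word. What that bit means is not publicly documented as far as this thread could find, so treat it as a pattern to pass along to TAC, not as a diagnosis.

```python
import re

# Two mismatch/updating/cache triples copied from the log above
# (syslog timestamp and facility prefix trimmed for brevity).
SAMPLE = """\
ardbeg_iodma_cache_update (line 1921): Returning IODMA io desc mismatch, bus_id 4 start 0 cnt 16 ndx 12 jiffies 4305641022
ardbeg_iodma_cache_update (line 1925): updating IO Desc: 0x740000000406020e 0x0000000000000000 0x0000000000000000 0x0000000000000000
ardbeg_iodma_cache_update (line 1930): IO Desc in cache: 0x740000000416020e 0x0000000000000000 0x0000000000000000 0x0000000000000000
ardbeg_iodma_cache_update (line 1921): Returning IODMA io desc mismatch, bus_id 5 start 16 cnt 16 ndx 12 jiffies 4305641270
ardbeg_iodma_cache_update (line 1925): updating IO Desc: 0x740000000506020e 0x0000000000000500 0x0000000000000000 0x0000000000000000
ardbeg_iodma_cache_update (line 1930): IO Desc in cache: 0x740000000516020e 0x0000000000000500 0x0000000000000000 0x0000000000000000
"""

MISMATCH = re.compile(r"io desc mismatch, bus_id (\d+) start (\d+)")
DESC = re.compile(r"(updating IO Desc|IO Desc in cache): ((?:0x[0-9a-fA-F]+\s*)+)")


def parse(text):
    """Group each mismatch line with the two descriptor lines that follow it."""
    events, current = [], None
    for line in text.splitlines():
        m = MISMATCH.search(line)
        if m:
            current = {"bus_id": int(m.group(1)), "start": int(m.group(2))}
            events.append(current)
            continue
        d = DESC.search(line)
        if d and current is not None:
            key = "new" if d.group(1).startswith("updating") else "cached"
            current[key] = [int(word, 16) for word in d.group(2).split()]
    # Keep only events where both descriptors were seen.
    return [e for e in events if "new" in e and "cached" in e]


def diff_words(event):
    """XOR the new and cached descriptor words; non-zero bits are the differences."""
    return [a ^ b for a, b in zip(event["new"], event["cached"])]


if __name__ == "__main__":
    for event in parse(SAMPLE):
        print(event["bus_id"], event["start"], [hex(w) for w in diff_words(event)])
```

Feeding it a full log capture instead of `SAMPLE` would show whether the same bit differs on every bus_id or whether the pattern changes before the crash.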

5 Replies

Hello,

I searched with 25 different global search engines, but none of them returned anything on what that log message means. A Cisco bug search does not return anything either.

Which supervisor do you have installed?

It’s a SUP1 in a 9407R. I hit every source I could find and couldn’t decipher the message.

Probably the only ones who would know (or be able to find out) would be TAC.

Between the sup being "new" and the nature of the error (kernel I/O DMA error), I too would suspect the sup.

What you might try: power off the chassis, reset the sup, restart the chassis, and monitor the console during startup for POST errors.

Problem is that it’s new and so not added to a service contract yet. Thus TAC refuses anything other than issuing an RMA under warranty. It’s a bit of a vicious circle as I don’t want to swap hardware needlessly but can’t open an SR except to create an RMA.

Other items in the “show tech” also seem to point to hardware. The restart reason is “Unknown” even though the switch has steady power and more than enough headroom in the power budget. No crashinfo file is produced, which again leans me towards hardware.

Just wish Cisco documentation of error messages was better.

Well, like many other things, better documentation increases cost. Assuming this error is nothing you as an end user can do anything about, beyond replacing hardware, I would suggest it's adequate for such a purpose. In fact, internal TAC documentation might be no more informative, i.e. tell the customer the hardware needs to be replaced.

Of course, issues like this error likely go beyond TAC, as vendors usually want to mitigate a recurring issue.

Such mitigations might provide for later hardware revisions for the "same" part number.  But how often do you see any documentation why hardware has been revised (when there's no spec change)?

There are other good reasons for "poor" public documentation including trying to keep proprietary info secret.

Heck, I don't know if you've been in a similar situation, but I've worked in large companies buying hardware from Cisco, where Cisco required NDAs to discuss their roadmaps and to work with them on what the roadmap might be. As I've signed such NDAs, I cannot be more specific. However, such practices aren't limited to dealing with Cisco.

I mention the foregoing because your "wish" is reasonable but there are various reasonable reasons it's unlikely to be met.
