cancel
Showing results for 
Search instead for 
Did you mean: 
cancel
3050
Views
0
Helpful
4
Replies

lxc fails

huntc
Level 4
Level 4

Hi there,

I've been able to install/activate/start my LXC based application on IOx until recently. I was hoping that someone might point me in the direction of what to look at that could help me diagnose its issue. The app appears to start up and then, within a few seconds, it stops with an exit code of 1 - which of course could be anything.

If I use the "system troubleshoot" page of the Local Manager UI, the CAF log reports activation and startup of my app. There is a worrying message around SIGTERM not killing the process, but I'm unsure if that's the issue:

2018-03-20 22:15:35,055 [runtime.hosting:INFO] [CP Server Thread-10] [platformcapabilities.py:227 - check_disk_status()] Calling disk status script: /home/root/iox/caf/scripts/disk_status.sh

2018-03-20 22:15:35,082 [runtime.hosting:INFO] [CP Server Thread-10] [platformcapabilities.py:230 - check_disk_status()] Executed /home/root/iox/caf/scripts/disk_status.sh: returncode: 0, message:

2018-03-20 22:15:38,032 [runtime.hosting:INFO] [CP Server Thread-8] [controller.py:656 - start_app_async()] Starting app : loraserver

2018-03-20 22:15:38,038 [runtime.hosting:INFO] [Thread-12] [libvirtcontainer.py:144 - start()] Starting lxc container : LXC Container : Name: loraserver, UUID: 5e93f734-4cdf-4577-afae-104305acb7a2

2018-03-20 22:15:43,280 [command_wrapper:INFO] [Thread-12] [commandwrappers.py:941 - kill()] Sigterm did not kill the process 3020, so passing sigkill

2018-03-20 22:15:43,787 [pdservices:INFO] [Worker-1] [autoconfigcli.py:458 - container_event_cb()] Autoconfig network_event_cb. Event [started] ignored.

2018-03-20 22:15:50,291 [command_wrapper:INFO] [Thread-12] [commandwrappers.py:941 - kill()] Sigterm did not kill the process 3101, so passing sigkill

2018-03-20 22:15:54,243 [runtime.hosting:INFO] [Thread-12] [connectorwrapper.py:854 - setConnectorState()] Connector 'loraserver' state change: STOPPED-->RUNNING

2018-03-20 22:15:54,271 [runtime.hosting:INFO] [Thread-12] [connectorwrapper.py:777 - startConnector()] Connector 'loraserver' successfully started.

2018-03-20 22:15:54,320 [runtime.api.resources:INFO] [CP Server Thread-8] [connector.py:714 - on_post()] App: loraserver started

2018-03-20 22:15:55,824 [pdservices:INFO] [Worker-1] [autoconfigcli.py:458 - container_event_cb()] Autoconfig network_event_cb. Event [started] ignored.

...and then about a second later my program is stopped:

2018-03-20 22:16:13,389 [runtime.hosting:INFO] [MonitoringService] [connectorwrapper.py:854 - setConnectorState()] Connector 'loraserver' state change: RUNNING-->STOPPED

2018-03-20 22:16:13,391 [runtime.hosting:INFO] [MonitoringService] [connectorwrapper.py:813 - stopConnector()] Connector 'loraserver' successfully stopped.

2018-03-20 22:16:13,392 [runtime.hosting:INFO] [MonitoringService] [monitoring.py:332 - start_monitoring()] Exhausted the max restart attempts 5 for the app loraserver

2018-03-20 22:16:13,863 [pdservices:INFO] [Worker-1] [autoconfigcli.py:449 - container_event_cb()] Autoconfig network_event_cb. app_id:[loraserver] Event container stopped Payload:[[]]

2018-03-20 22:16:13,864 [pdservices:INFO] [Worker-1] [autoconfigcli.py:394 - delete()] AutoconfigcliService service. delete()

2018-03-20 22:16:13,864 [pdservices:INFO] [Worker-1] [autoconfigcli.py:404 - delete()] Autoconfig delete not done. Service not enabled


So why was the app attempting to restart? Where should I look next as none of the other logs are yielding anything obviously interesting.

Thanks for your guidance.

Kind regards,

Christopher

1 Accepted Solution

Accepted Solutions

Hi Christopher,

You may also access the app console to see what happened for the LXC and the running software.

See the part of "How to access application console?" in https://developer.cisco.com/docs/iox/#troubleshooting-guide/troubleshooting-guide.

View solution in original post

4 Replies 4

huntc
Level 4
Level 4

I should add that non of the app's logs yield anything interesting to me (container log):

2018-03-20 22:15:38.073+0000: starting up

PATH=/sbin:/usr/sbin:/bin:/usr/bin LIBVIRT_DEBUG=3 LIBVIRT_LOG_OUTPUTS=3:stderr /usr/lib64/libvirt/libvirt_lxc --name loraserver --console 22 --security=smack --handshake 25 --veth vnet1

PATH=/bin:/sbin TERM=linux container=lxc-libvirt HOME=/ container_uuid=5e93f734-4cdf-4577-afae-104305acb7a2 LIBVIRT_LXC_UUID=5e93f734-4cdf-4577-afae-104305acb7a2 LIBVIRT_LXC_NAME=loraserver startcontainer.sh

2018-03-20 22:15:38.137+0000: 1: info : libvirt version: 1.2.19

2018-03-20 22:15:38.137+0000: 1: warning : lxcContainerUnmountSubtree:617 : Failed to unmount '/.oldroot/var/volatile/run/libvirt/lxc/loraserver.devpts', trying to detach subtree '/.oldroot': Invalid argument

and watch dog:

Waiting to get the ip for interfaces: eth0

Waiting to get the ip for interfaces: eth0

Waiting to get the ip for interfaces: eth0

Waiting to get the ip for interfaces: eth0

Waiting to get the ip for interfaces: eth0

Got the ip address for interface eth0

All interfaces got the ips

APP START TIME:1521584153

App loraserver started with PID : 92

Monitoring this process now

App loraserver completed with exit code: 1

APP END TIME:1521584170

Time taken by App : 0 minutes and 17 seconds.

I do note that syslog reports a problem eth0, and there was some configuration done with the 829 yesterday (my app requires eth0 as per its package.yaml):

Mar 20 22:15:12 qemux86-64 dhclient: XMT: Solicit on eth0, interval 129960ms.

Mar 20 22:15:12 qemux86-64 dhclient: send_packet6: No such device

Mar 20 22:15:12 qemux86-64 dhclient: dhc6: send_packet6() sent -1 of 56 bytes

Mar 20 22:15:38 qemux86-64 kernel: device vnet0 entered promiscuous mode

Mar 20 22:15:38 qemux86-64 kernel: IPv6: ADDRCONF(NETDEV_UP): vnet0: link is not ready

Mar 20 22:15:38 qemux86-64 kernel: IPVS: Creating netns size=1376 id=10

Mar 20 22:15:38 qemux86-64 kernel: eth0: renamed from vnet1

Mar 20 22:15:52 qemux86-64 dhclient: Internet Systems Consortium DHCP Client 4.3.2

Mar 20 22:15:52 qemux86-64 dhclient: Copyright 2004-2015 Internet Systems Consortium.

Mar 20 22:15:52 qemux86-64 dhclient: All rights reserved.

Mar 20 22:15:52 qemux86-64 dhclient: For info, please visit https://www.isc.org/software/dhcp/

Mar 20 22:15:52 qemux86-64 dhclient:

Mar 20 22:15:52 qemux86-64 kernel: IPv6: ADDRCONF(NETDEV_CHANGE): vnet0: link becomes ready

Mar 20 22:15:52 qemux86-64 kernel: dpbr_0: port 3(vnet0) entered forwarding state

Mar 20 22:15:52 qemux86-64 kernel: dpbr_0: port 3(vnet0) entered forwarding state

Mar 20 22:15:52 qemux86-64 dhclient: Listening on LPF/eth0/52:54:dd:24:b6:c9

Mar 20 22:15:52 qemux86-64 dhclient: Sending on   LPF/eth0/52:54:dd:24:b6:c9

Mar 20 22:15:52 qemux86-64 dhclient: Sending on   Socket/fallback

Mar 20 22:15:52 qemux86-64 dhclient: DHCPREQUEST on eth0 to 255.255.255.255 port 67

Mar 20 22:15:52 qemux86-64 dhclient: DHCPACK from 192.168.1.1

Mar 20 22:15:52 qemux86-64 dhclient: bound to 192.168.1.9 -- renewal in 37440 seconds.

Mar 20 22:15:53 qemux86-64 dhclient: Internet Systems Consortium DHCP Client 4.3.2

Mar 20 22:15:53 qemux86-64 dhclient: Copyright 2004-2015 Internet Systems Consortium.

Mar 20 22:15:53 qemux86-64 dhclient: All rights reserved.

Mar 20 22:15:53 qemux86-64 dhclient: For info, please visit https://www.isc.org/software/dhcp/

Mar 20 22:15:53 qemux86-64 dhclient:

Mar 20 22:15:54 qemux86-64 dhclient: Listening on Socket/eth0

Mar 20 22:15:54 qemux86-64 dhclient: Sending on   Socket/eth0

Mar 20 22:15:54 qemux86-64 dhclient: XMT: Solicit on eth0, interval 1050ms.

Mar 20 22:15:55 qemux86-64 dhclient: XMT: Solicit on eth0, interval 2170ms.

Mar 20 22:15:57 qemux86-64 dhclient: XMT: Solicit on eth0, interval 4220ms.

Mar 20 22:16:01 qemux86-64 dhclient: XMT: Solicit on eth0, interval 8200ms.

Mar 20 22:16:10 qemux86-64 dhclient: XMT: Solicit on eth0, interval 17100ms.

Mar 20 22:16:11 qemux86-64 kernel: dpbr_0: port 3(vnet0) entered disabled state

Mar 20 22:16:11 qemux86-64 kernel: device vnet0 left promiscuous mode

Mar 20 22:16:11 qemux86-64 kernel: dpbr_0: port 3(vnet0) entered disabled state

Mar 20 22:16:11 qemux86-64 dhclient: receive_packet failed on eth0: Network is down

Mar 20 22:16:27 qemux86-64 dhclient: XMT: Solicit on eth0, interval 33590ms.

Mar 20 22:16:27 qemux86-64 dhclient: send_packet6: No such device

Mar 20 22:16:27 qemux86-64 dhclient: dhc6: send_packet6() sent -1 of 56 bytes

Thanks.

Hi Christopher,

You may also access the app console to see what happened for the LXC and the running software.

See the part of "How to access application console?" in https://developer.cisco.com/docs/iox/#troubleshooting-guide/troubleshooting-guide.

Thanks for the reply Steve. I should have mentioned that I tried that. Unfortunately, the app doesn't stay up long enough to be able to - even when I activate it with debug.

It turns out that I used `debug` as an argument to activation, and not `--debug`. I was able to login to the app after all, and discovered my issue. Thanks.