Qcpwrdn shut down issue on WP7603


#1

Hi,

We’re seeing an issue on our mangOH Red units that shutdown for unexplained reasons. Here is a crash log we were able to extract.

Nov  5 17:31:31 swi-mdm9x28 user.warn kernel: [   61.011285] irq 50, desc: cefdb000, depth: 0, count: 0, unhandled: 0
Nov  5 17:31:31 swi-mdm9x28 user.warn kernel: [   61.011308] ->handle_irq():  c025ac68, msm_gpio_irq_handler+0x0/0x150
Nov  5 17:31:31 swi-mdm9x28 user.warn kernel: [   61.011338] ->irq_data.chip(): c0ed8e68, gic_chip+0x0/0x74
Nov  5 17:31:31 swi-mdm9x28 user.warn kernel: [   61.011361] ->action():   (null)
Nov  5 17:31:31 swi-mdm9x28 user.warn kernel: [   61.011372]    IRQ_NOPROBE set
Nov  5 17:31:31 swi-mdm9x28 user.warn kernel: [   61.011383]  IRQ_NOREQUEST set
Nov  5 17:31:31 swi-mdm9x28 user.warn kernel: [   61.011394]   IRQ_NOTHREAD set
Nov  5 17:31:32 swi-mdm9x28 user.info kernel: [   62.443635] gpio_check_and_wake: wake-n_gpio26 STATE=WAKEUP
Nov  5 17:31:32 swi-mdm9x28 user.info kernel: [   62.465129] gpio_check_and_wake: wake-n_gpio26 STATE=SLEEP
Nov  5 17:31:33 swi-mdm9x28 user.info swiapp: qmi_at_unsol_ind_cb: ind id:33 
Nov  5 17:31:33 swi-mdm9x28 user.info swiapp: Received AT command forward request from modem
Nov  5 17:31:33 swi-mdm9x28 user.info swiapp: fwdcmd.opcode=1
Nov  5 17:31:33 swi-mdm9x28 user.info swiapp: fwdcmd.name=$QCPWRDN
Nov  5 17:31:33 swi-mdm9x28 user.info swiapp: fwdcmd.ntokens=0
Nov  5 17:31:33 swi-mdm9x28 user.info swiapp: ctrCond signalling complete.
Nov  5 17:31:33 swi-mdm9x28 user.info swiapp: Recieved ctrCond: p: 0, S:0, nr: 1
Nov  5 17:31:33 swi-mdm9x28 user.info swiapp: QCPWRDN has been detected
Nov  5 17:31:33 swi-mdm9x28 user.warn kernel: [   63.685455] PSM: Modem oprt mode - 3
Nov  5 17:31:33 swi-mdm9x28 user.info swiapp: sending QMI_AT_FWD_RESP_AT_CMD_RESP_V01 message
Nov  5 17:31:33 swi-mdm9x28 user.info swiapp: qmi_client_send_raw_msg_sync returned: 0
Nov  5 17:31:33 swi-mdm9x28 user.info swiapp: New request processing complete.
Nov  5 17:31:33 swi-mdm9x28 user.info swiapp: Waiting for ctrCond
Nov  5 17:31:33 swi-mdm9x28 daemon.info init: starting pid 1755, tty '': '/etc/init.d/rcK'
Nov  5 17:31:33 swi-mdm9x28 user.info kernel: [   63.872044] gpio_check_and_wake: wake-n_gpio26 STATE=WAKEUP
Nov  5 17:31:33 swi-mdm9x28 user.info kernel: [   63.879719] swimcu_device_init: start 0x7ff
Nov  5 17:31:33 swi-mdm9x28 user.info kernel: [   63.883297] swimcu_device_init: mcufw ver=2.009 target=1 opt=0xF
Nov  5 17:31:33 swi-mdm9x28 user.info kernel: [   63.887979] swimcu_device_init: success
Nov  5 17:31:33 swi-mdm9x28 user.info kernel: [   63.895223] swimcu_set_fault_mask: 0x100, cnt 1
Nov  5 17:31:33 swi-mdm9x28 user.info kernel: [   63.895241] swimcu reset_recovery: complete
Nov  5 17:31:33 swi-mdm9x28 user.info kernel: [   63.895250] swimcu_set_reset_source: 0x40
Nov  5 17:31:33 swi-mdm9x28 user.info kernel: [   63.895271] gpio_check_and_wake: wake-n_gpio26 STATE=SLEEP

Does anyone know what QCPWRDN means, and why it would cause a random shutdown?

cm info details:

root@swi-mdm9x28:~# cm info
Device:                        WP7603-1
IMEI:                          351711090107910
IMEISV:                        4
FSN:                           WD752585161410
Firmware Version:              SWI9X07Y_02.16.02.00 000000 jenkins 2018/04/19 19:59:02
Bootloader Version:            SWI9X07Y_02.16.02.00 000000 jenkins 2018/04/19 19:59:02
MCU Version:                   002.009
PRI Part Number (PN):          9907595
PRI Revision:                  001.004 
Carrier PRI Name:              GENERIC
Carrier PRI Revision:          002.032_000
SKU:                           1103727
Last Reset Cause:              Power Down
Resets Count:                  Expected: 248    Unexpected: 2

Legato version

root@swi-mdm9x28:~# legato version
18.06.1_302932300e810a4fb0cba7b4960af4fd

#2

Hi @ktanikel,

As you’ve identified, the source of the shutdown is the $QCPWRDN command which coordinates the system power down routine. This isn’t something I’ve seen before, without a trigger, so to understand your platform a bit better:

  1. Do you have any custom applications? If you have any custom apps that are triggering Ultra-Low power mode or PSM, those could be the triggers for the coordinated shutdown behaviour.
  2. Is the timing consistent? Log shows ~60 seconds - is this always true?
  3. Is there a USB host connection? If so, can you disconnect that and use only the UART console to communicate with the device temporarily? In that configuration, is the power down still occurring (i.e. removing any host-supplied commands to determine if the origin is host or on-target).

Those are the first things that come to mind.

Ryan


#3

Thank you for your reply Ryan.

To address your questions

  1. Yes, we have a custom application that uses ULPM. We set GPIO36 and a timer as boot sources before entering ULPM. However, the problem reported here seems to occur outside of our custom application control i.e. something else is initiating the shutdown and we’re trying to understand what that could be. Here’s what the log looks like when our applications triggers low power mode.

    Nov 7 19:13:03 swi-mdm9x28 user.warn kernel: [ 950.204461] PSM: Server Healthcheck timerid : 2136631240
    Nov 7 19:13:03 swi-mdm9x28 user.warn kernel: [ 950.204552] PSM: Supported: 0
    Nov 7 19:13:03 swi-mdm9x28 user.warn kernel: [ 950.205951] PSM: Client health check timerid : 808896
    Nov 7 19:13:03 swi-mdm9x28 user.debug kernel: [ 950.206227] gpio-200 (sysfs): gpiod_request: status -16
    Nov 7 19:13:03 swi-mdm9x28 user.err kernel: [ 950.207474] i2c-msm-v2 78b8000.i2c: NACK: slave not responding, ensure its powered: msgs(n:1 cur:0 tx) bc(rx:0 tx:18) mode:FIFO slv_addr:0x3a MSTR_STS:0x0c$300c8 OPER:0x00000090
    Nov 7 19:13:03 swi-mdm9x28 user.info kernel: [ 950.243739] swimcu_mdm_sec_to_mcu_time_ms: mdm time=43200 seconds to be calibrated 500/489
    Nov 7 19:13:03 swi-mdm9x28 user.info kernel: [ 950.243766] swimcu_mdm_sec_to_mcu_time_ms: mcu time 44171000 ms + remainder time 44171 ms = 44215171 ms
    Nov 7 19:13:03 swi-mdm9x28 user.info kernel: [ 950.253472] pm_set_mcu_ulpm_enable: sending wait_time_config
    Nov 7 19:13:03 swi-mdm9x28 user.info kernel: [ 950.265234] swimcu_pm_data_store: sending persistent data to MCU
    Nov 7 19:13:03 swi-mdm9x28 user.info kernel: [ 950.277847] pm_set_mcu_ulpm_enable: sending ulpm_config
    Nov 7 19:13:03 swi-mdm9x28 user.warn kernel: [ 950.288790] PSM: Supported: 0
    Nov 7 19:13:03 swi-mdm9x28 daemon.info init: starting pid 3760, tty ‘’: ‘/etc/init.d/rcK’
    Nov 7 19:13:03 swi-mdm9x28 user.info qmi_shutdown_modem: subsystem_control_shutdown
    Nov 7 19:13:03 swi-mdm9x28 user.err qmi_shutdown_modem: QMUXD: WARNING qmi_qmux_if_pwr_up_init failed! rc=-6
    Nov 7 19:13:03 swi-mdm9x28 user.info qmi_shutdown_modem: Initialize QMI DMS Client
    Nov 7 19:13:03 swi-mdm9x28 user.info qmi_shutdown_modem: Current Modem status - 0
    Nov 7 19:13:03 swi-mdm9x28 user.info qmi_shutdown_modem: Move modem status to LPM
    Nov 7 19:13:03 swi-mdm9x28 user.info qmi_shutdown_modem: Wait for modem LPM complete

    Nov 7 19:13:03 swi-mdm9x28 user.debug Legato: DBUG | ampmanager[822]/ampManagerComponent T=main | ampManagerComponent.cpp dataConnectionStateHandlerFunc() 153 | Data connection state event received: 0
    Nov 7 19:13:03 swi-mdm9x28 user.info qmi_shutdown_modem: DMS indication id - 1
    Nov 7 19:13:03 swi-mdm9x28 user.info qmi_shutdown_modem: Modem in LPM state… proceed with rest of shutdown procedure
    Nov 7 19:13:03 swi-mdm9x28 user.info qmi_shutdown_modem: In the shutdown path
    Nov 7 19:13:03 swi-mdm9x28 user.warn kernel: [ 950.716868] PSM: Modem oprt mode - 1

    Nov 7 19:13:03 swi-mdm9x28 user.err Legato: =ERR= | posDaemon[914]/posDaemon T=main | le_pos.c ComputeMove() 663 | Longitude or Latitude are not relevant
    Nov 7 19:13:03 swi-mdm9x28 user.warn Legato: -WRN- | posDaemon[914]/le_pa_gnss T=unknown | pa_gnss_qmi.c PositionHandler() 1359 | Bad position indication
    Nov 7 19:13:03 swi-mdm9x28 user.warn Legato: -WRN- | modemDaemon[900]/le_pa T=main | pa_sim_qmi.c MapSimState() 879 | More than one application 2
    Nov 7 19:13:04 swi-mdm9x28 user.info qmi_shutdown_modem: Received subsystem shutdown complete in 3 second(s)
    Nov 7 19:13:04 swi-mdm9x28 user.info qmi_shutdown_modem: Success.
    Nov 7 19:13:04 swi-mdm9x28 user.info qmi_shutdown_modem: Modem shutdown complete
    Nov 7 19:13:04 swi-mdm9x28 user.warn kernel: [ 951.267980] Stopping QCMAP_ConnectionManager:
    Nov 7 19:13:04 swi-mdm9x28 user.warn kernel: [ 951.404030] Stopping qti: done

As you can see, the boot sources are being set and log clearly indicates it is going in to a LPM state. Also, there seems to some difference with respect to “Modem oprt mode” going to -3 in the problem condition vs -1 in the normal operation. I don’t know what the significance of that is.

  1. I don’t have enough information on this yet. We’ve only been able to capture the log once which is what you see above. I’ll update as we got more data.
  2. There is no USB host connection. The device is running on an internal battery and is also connected to an external power supply.

We haven’t yet found the exact conditions to reproduce this problem. However, we have several devices out in production that are exhibiting this behavior so we’re trying to address this as best we can with the information at hand.


#4

Hi @ktanikel
Do you have "“logread -f” for the shutdown scenario? The logs could help identify the trigger for the shutdown.


#5

Hi @ajoseph
We don’t have the output of "logread -f " because we’ve never encountered this problem during development or testing with the USB connected. The problem seems to occur when the unit is operating off a battery and the only logs we can get are /mnt/flash/legato_logs/, which is what I’ve posted here. As per my understanding, ‘logread’ gets written to legato_logs when there is crash, so what you’re seeing here is essentially logread.


#6

This scenario looks like an organized shutdown. The “$QCPWRDN” is the final command which actually shutdown the modem, but the trigger itself is somewhere else. Legato logs leading up to “$QCPWRDN” being called may provide an idea as to what the trigger is.

Logs for the app is stored in case of an application crash at /mnt/flash/legato_logs/. https://docs.legato.io/latest/c_logging.html#c_log_debugFiles. To ensure Legato logs are captured(which should show the trigger), you will have to redirect “logread -f” to a file (by running script at startup) or modifying syslogd to store logs to a fie.