Frequent hardfaults when using mqtt basic functions - bw16 virtually USELESS

Hello, I am experiencing a high frequency of hardfaults when running simple code that receives UART data from a peripheral device and publishes it to a Mosquitto MQTT server. These hardfaults do not allow for a watchdog reset. The only reset possible is to physically reset the device. NO FINAL PRODUCTION PRODUCT is possible if a constant physical reset is required. How can such a faulty device be on the market? Am I missing something? I'm so disappointed by the BW16 right now. This is really a simple use case and it does not seem possible to make a reliable implementation with this device. Why in the world does the standard behavior for the hardfault handler not include a system reset???

#include <WiFi.h>
#include <PubSubClient.h>

#include <SoftwareSerial.h>

SoftwareSerial mySerial(PB2, PB1); // RX, TX

char c;
String dataIn;


char ssid[] = "xxxxxx";      
char pass[] = "xxxxxx";  


int status = WL_IDLE_STATUS;        


char mqttServer[]     = "test.mosquitto.org";
char clientId[]       = "amebaClient";
char publishTopic[]   = "Server";
char publishPayload[] = "initialize";
char subscribeTopic[] = "INPUT";

char key[10] = {0};


void callback(char* topic, byte* payload, unsigned int length) {
    Serial.print("Message arrived [");
    Serial.print(topic);
    Serial.print("] ");
    for (unsigned int i = 0; i < length; i++) {
        Serial.print((char)(payload[i]));
    }
    Serial.println();
}

WiFiClient wifiClient;
PubSubClient client(wifiClient);

void reconnect() {
    // Loop until we're reconnected
    while (!(client.connected())) {
        Serial.print("\r\nAttempting MQTT connection...");
        // Attempt to connect
        if (client.connect(clientId)) {
            Serial.println("connected");
            //Once connected, publish an announcement and resubscribe
            client.publish(publishTopic, publishPayload);
            client.subscribe(subscribeTopic);
        } else {
            Serial.println("failed, rc=");
            Serial.print(client.state());
            Serial.println(" try again in 5 seconds");
            //Wait 5 seconds before retrying
            delay(5000);
        }
    }
}



void setup() {
    //Initialize serial and wait for port to open:
    Serial.begin(115200);
    mySerial.begin(115200);
    //Attempt to connect to WiFi network
    while (status != WL_CONNECTED) {
        Serial.print("\r\nAttempting to connect to SSID: ");
        Serial.println(ssid);
        // Connect to WPA/WPA2 network. Change this line if using open or WEP network:
        status = WiFi.begin(ssid, pass);
        // wait 10 seconds for connection:
        delay(10000);   
    }
    client.setServer(mqttServer, 1883);
    client.setCallback(callback);
    
    //Allow Hardware to sort itself out
    delay(1500);
    
}

void loop() 
{

  while (mySerial.available()>0) 
  {
    c = mySerial.read();

    if (c == '\n') {break;}
    else {dataIn+=c;}
  }

  if (c=='\n')
  {
    Serial.println(dataIn);
    int str_len = dataIn.length() + 1;
    dataIn.toCharArray(key, str_len);     
    if (client.connect(clientId)) 
    {
      client.publish(publishTopic, key);
    }
            
    c=0;
    dataIn="";
  }

   
  if (!(client.connected())) 
  {
    reconnect();
  }
  client.loop();
   
  
}

Hi @microPC,

Can you try to call this API for software reset?

#include <sys_api.h>
sys_reset();

Thank you.

This is not a solution at all. The firmware hang does not allow execution of that function or any other function. Of course, I tried calling the function before I tried using the watchdog. Neither works. Was that not clear in my post?

Hi @microPC,

Sure, no worries. I will take a look at this issue tomorrow. I originally thought you meant you are unable to do a software reset. If you can’t even reach the main function, the current implementation probably has some issue that needs to be looked into.

Thank you.

That would be amazing if you could find the issue. I am concerned because another user also posted here in 2022 that the MQTT functions were causing a similar firmware hang and nobody was able to help him :slightly_frowning_face: At this point I am thinking I have to resort to an external power cycling mechanism, which would not be ideal… please help :grin:

Hi @microPC,

What peripheral device are you using to generate the UART signal? Have you tested whether publishing data to the Mosquitto MQTT server on its own, OR receiving the UART signal from your peripheral device on its own, causes any hard fault?

May I know which SDK version you are using so I can try to replicate your setup?

Thank you.

I am using a Teensy 4.1, with its pins 7 and 8 connected to pins PB1 and PB2 on the RTL8720DN. I am programming both of them using Arduino IDE 2.3.2.

Using the SoftwareSerial library.

I have tried running the program with just the MQTT functionality: the RTL8720DN can send and receive messages from the MQTT broker without experiencing any firmware hangs.

I have tried running the software serial communication code between the Teensy 4.1 and the RTL8720DN without the MQTT functions: everything functions flawlessly and has worked for days on end. I never experienced a single hang.

When I combine the two, I get frequent hangs. Watching the RTL8720DN serial monitor, I see different situations where it stops functioning. Sometimes I get a cycle like this:

[INFO] Create socket successfully
[ERROR] Connect to server failed
[INFO] [ard_socket.c][send_data] err = 0
failed, rc = 4

Other times I get:

RTL8721D[Driver]: no beacon for a long time, disconnect or roaming
12:38:41.420 -> dissconn reason code: 65535
12:38:41.420 -> connected stage, loss beacon
12:38:41.420 -> 

12:39:49.457 -> Attempting MQTT connection...
12:39:49.457 -> [INFO]server_drv.cpp:  start_client
12:39:49.457 -> [INFO] Create socket successfully
12:39:49.457 -> 
12:40:07.496 ->  [ERROR] Connect to server failed
12:40:07.497 -> 
12:40:07.497 -> [INFO] [ard_socket.c][send_data] err = 1583184219

And other times the serial monitor simply stops showing any messages whatsoever, with no error messages at all. Everything just stops. It stays nonresponsive until I disconnect and reconnect the USB cable. If I use the reset pin, it will often go into a cycle of trying and failing to connect to WiFi. Often the serial monitor will just say "[Driver]:" without any other message. But if I unplug and replug, it can connect to WiFi just fine.

MOST IMPORTANTLY, in all of these scenarios the watchdog does not work. This is the crucial part. Otherwise I would just program a reset and let it work itself out. But in this case someone has to physically access the device or cut its power, which means it is not production ready.

Hi @microPC,

Alright, thanks for the info. I do not have a Teensy, but I will test with other peripheral boards first.

Will let you know if I find a fix to this issue.

Thank you.

Hi,
I don’t know about the MQTT disconnects, but Hard Faults can occur on memory access violations, which you may have here:

dataIn.toCharArray(key, str_len);

key is only 10 bytes and str_len can be larger than that, in which case the copy writes past the end of key. The call should instead be

dataIn.toCharArray(key, 10);

I hope this helps, but I don’t have your exact setup so I can’t guarantee it.
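
To illustrate the bounded copy, here is a minimal sketch in plain C++ rather than Arduino code, so std::string stands in for the Arduino String class. copyBounded is a hypothetical helper I made up for this post, not part of any library. It mirrors what toCharArray does when it is given the real buffer size: copy at most size - 1 characters and always null-terminate.

```cpp
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical helper (not part of the Arduino String API).
// Copies at most dst_size - 1 characters of src into dst and always
// null-terminates, so dst is never written past its end.
// Returns true if the whole string fit, false if it was truncated.
bool copyBounded(const std::string& src, char* dst, std::size_t dst_size) {
    if (dst == nullptr || dst_size == 0) {
        return false;  // no room to write anything, not even a terminator
    }
    std::size_t n = src.size();
    const bool fits = n < dst_size;  // one byte is reserved for '\0'
    if (!fits) {
        n = dst_size - 1;            // truncate instead of overflowing
    }
    std::memcpy(dst, src.data(), n);
    dst[n] = '\0';
    return fits;
}
```

With char key[10], copyBounded(line, key, sizeof(key)) stores at most 9 characters plus the terminator, however long line is. If the UART lines can be longer than that, key needs to be made larger, or the truncated case (return value false) handled explicitly before publishing.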
