UDP packet loss in FreeRTOS?

My application requires the BW16 to receive up to 8 UDP packets in very quick succession.
For test purposes I am only sending a burst once per second.
Wireshark tells me the burst takes about 870us to be sent from my PC (around 100us per packet).
My test program simply sits in a loop using recvfrom() in blocking mode, calling a simple ‘copy’ routine when a packet arrives, and toggles a GPIO bit each time a packet is received.
In this test environment, I have no other (user) tasks running except the WiFi Interactive.
I am finding that often only the first three or four packets in the series are received.
This is indicated by all of the following:

  • rx_packets= in output from wifi_info command (rx_dropped= sits at 1 but does not increment)
  • Scope trace on GPIO pin
  • My own packet counter around the recvfrom() calls.

Test conditions

  • The sending PC and the receiving BW16 are associated with the same (Ubiquiti) access point on the same channel and are in close proximity (less than 2m). The AP is reporting RSSI of -48dBm (i.e. very loud!)
  • Both are using IEEE 802.11n on 5GHz (11na).
  • The sending PC is reporting a “link speed” of 216Mbit/s

Is there some ‘tuning’ I can do to the WiFi settings to avoid this packet loss? It is a very severe problem for our application.
Thanks in advance.

Further information - if I ‘divert’ the burst of packets to a Linux VM on a wired connection, tcpdump shows no missing packets, so I know they are being launched correctly by the PC.
The packet length is shown as 572 bytes by Wireshark.
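
To put the burst in perspective, the figures quoted so far (8 packets, 572 bytes each as shown by Wireshark, about 870us in total) can be turned into an average data rate. The helper names below are illustrative, and the calculation ignores any 802.11 framing overhead, so treat it as a rough sketch rather than a measurement.

```c
/* Average rate of a burst in Mbit/s: bits sent divided by microseconds taken. */
double burst_rate_mbps(unsigned int packets, unsigned int bytes_per_packet,
                       double duration_us)
{
    if (duration_us <= 0.0) {
        return 0.0;
    }
    return (double)packets * (double)bytes_per_packet * 8.0 / duration_us;
}

/* Average time budget per packet in microseconds. */
double mean_time_per_packet_us(unsigned int packets, double duration_us)
{
    if (packets == 0) {
        return 0.0;
    }
    return duration_us / (double)packets;
}
```

With the numbers above, `burst_rate_mbps(8, 572, 870.0)` comes out at roughly 42 Mbit/s, well below the 216Mbit/s link speed the PC reports, and `mean_time_per_packet_us(8, 870.0)` gives about 109us per packet.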

And here is a copy of the wifi_info output after having been running for a few minutes:

wifi_info

[MEM] After do cmd, available heap 142240

WIFI wlan0 Status: Running

[rltk_wlan_statistic] tx stat: tx_packets=48, tx_dropped=0, tx_bytes=3832
[rltk_wlan_statistic] rx stat: rx_packets=36227, rx_dropped=3, rx_bytes=20763252, rx_overflow=0
[rltk_wlan_statistic] rx stat: rx_reorder_drop_cnt=0, rx_reorder_timeout_cnt=2
[rltk_wlan_statistic] min_free_heap_size=140000, current heap free size=142240
[rltk_wlan_statistic] max_skbbuf_used_num=4, skbbuf_used_num=0
[rltk_wlan_statistic] max_skbdata_used_num=4, skbdata_used_num=0
[rltk_wlan_statistic] max_timer_used_num=52

WIFI wlan0 Setting:

  MODE => STATION
  SSID => artnet
  CHANNEL => 36
  SECURITY => AES
  PASSWORD =>

Interface (wlan0)

    MAC => 94:c9:60:1d:80:8e
    IP  => 192.168.100.44
    GW  => 192.168.100.1

Last Link Error

    No Error

Task List:
interacti X 4 810 13
Tmr Svc R 1 426 7
art_udp_t R 1 918 5
IDLE R 0 472 6
TCP_IP B 9 734 8
art_spi_t B 1 938 4
LOGUART_T B 5 974 2
log_servi B 5 1227 1
rtw_inter B 6 198 11
rtw_recv_ B 5 827 9
rtw_xmit_ B 5 208 10
cmd_threa B 6 759 12

Memory Usage

Min Free Heap Size: 140000
Cur Free Heap Size: 142240

Hi, we have tested the following example with the BW16 and did not encounter any packet loss during transmission:
https://www.amebaiot.com/en/amebad-arduino-udp-receive-delay/

It may be worth giving this example a try to rule out some other possibilities.

I will test it, but my losses were on reception, not on transmission. Thank you for your work doing the testing, and for your reply.

I notice that the UDP sender in this example injects 5ms delay between each packet. I’m quite sure that if I did that, I would not lose packets. However, the packet source for our project is a lighting control suite which does not provide for such ‘pacing’.

However what I will do is use this example but modify the sender program to send 8 packets “nose to tail” - I bet I will only receive about three of them at the receiver (but I’ll be pleased if I’m wrong)! I will post back later this morning.

I modified the Arduino example so that the tester sends only 100 packets, and altered the receiver to report every 100 packets coming in. I modified the transmission length to match my use case (572 byte packets).
Even with the usleep delay commented out in the sender, I can confirm that I reliably receive 100 packets when 100 are sent. Packet trace at the sender end tells me the packets are being sent in approx 10ms, so less than 100us between packets - comparable with my required use case.

So now I need to work out why recvfrom() under the standard SDK and under FreeRTOS seems to lose packets whilst the Arduino does not.

Do you have any ideas on how I can further diagnose this?

So far as I can see both the Arduino udp.read() and the sockets recv() follow almost exactly the same sequence through to lwip_recvfrom() and then netconn_recv() and then netconn_recv_data(), and then we are into the mailbox protocol between the network processor and the application processor.
File comparisons of app_lib.c and sockets.c between the two source trees do not reveal any significant differences.

This leads me to the following questions:

  1. Given that my SDK application makes use of an SPI peripheral, is it possible this use is clashing with the interface between the network processor and the km4? I am using the instance defined as MBED_SPI1.
  2. Could there be any differences in the lower levels of the firmware between the two environments?
  3. Could there be differences in the memory allocated to the lower levels of the stack in terms of pre-allocated buffers? I’m still looking for settings like ‘maximum UDP sockets’ or ‘maximum receive buffers’. I’ll keep digging.

I have now modified both the Arduino example and my project’s UDP handler so that they pulse a GPIO on receipt of a packet. I have commented out ALL processing of the packets (and my use of the SPI peripheral) to try and get a fair comparison.
The lighting controller sends eight packets in quick succession.
The scope trace for the Arduino looks like this (blue trace):


Reasonably tidy clusters of pulses, on average about 40ms apart - just what I’d expect.
Zooming in on one of the clusters we get this:

Reasonably evenly spaced pulses roughly 200us apart - that’s fine.
Now we move to the FreeRTOS-based system. Here, the pulses are pretty scattered, and I was unable to capture the blocks of pulses as before:

And when we zoom in, we can see that there are hugely different gaps between the pulses:

That gap between the second and fourth pulse is about 1.8ms, and this burst only showed 4 packets coming in - the other 4 were lost.

These tests were carried out with the console logging service switched off and interactive commands disabled, and I had commented out my SPI task (which is why the top trace shows nothing except noise). I’m not sure what else would be running.

A couple of weeks back, I thought I’d be able to get tighter control over the timing using the ‘proper’ SDK and FreeRTOS, but right now, it is looking as if I should have stayed with Arduino.

Perhaps I’m missing some basics in setting up FreeRTOS?

Could it be due to task priorities in FreeRTOS?

In Arduino, the main loop is set at the highest priority, and by extension, so are all the LwIP functions it calls when polling for UDP data.
In the SDK, generally no tasks are set at the highest priority.

All the SDK functions can also be called when using the Arduino IDE; you just need to include the right header files.

I have just two tasks.
My UDP receiver task sits on a blocking recv() so I am assuming that this will constitute a ‘yielded’ condition. My assumption (which may be wrong) is that incoming UDP packets would be detected by FreeRTOS, and then cause my task to return from recv().

My SPI transmitter task sits waiting on a FreeRTOS queue event, which should also mean it is ‘yielded’, unless triggered either on a timer or on an SPI transmit complete event or on an event queued from the UDP task.

Your email prompted me to dig into the source to find out the meaning of the various numbers in the task report. I’ve worked out that the third field is ‘priority’, and I see that less important things (like the log service) are at higher priority than my two tasks. I had put mine only just above the idle task, thinking that I should leave room for things like the IP stack to have more priority. But if any of the other higher priority tasks have been written “badly” - e.g. running round polling something - it would explain the behaviour.
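
As an aid to reading the task report, here is a small host-side sketch that parses lines in the format shown earlier and counts how many tasks outrank a given one (in FreeRTOS a larger number means a higher priority). Only the meaning of the third field comes from this thread; treating the last two numbers as the remaining stack and the task number is an assumption, and `parse_task_line` and `count_higher_priority` are made-up helper names.

```c
#include <ctype.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

struct task_entry {
    char name[16];
    char state;               /* X, R, B ... as printed in the report */
    unsigned int priority;    /* third field of the report */
    unsigned int stack_free;  /* assumption: remaining stack */
    unsigned int number;      /* assumption: task number */
};

/* Trim trailing whitespace, then return a pointer to the last field (or NULL). */
static char *last_field(char *s)
{
    size_t n = strlen(s);

    while (n > 0 && isspace((unsigned char)s[n - 1])) {
        s[--n] = '\0';
    }
    if (n == 0) {
        return NULL;
    }
    while (n > 0 && !isspace((unsigned char)s[n - 1])) {
        n--;
    }
    return s + n;
}

/*
 * Parse one report line such as "Tmr Svc R 1 426 7". Fields are taken from
 * the right because task names may contain spaces. Returns 1 on success.
 */
int parse_task_line(const char *line, struct task_entry *out)
{
    char buf[64];
    unsigned int nums[3];     /* priority, stack_free, number */
    char *field;
    char *name;

    if (strlen(line) >= sizeof(buf)) {
        return 0;
    }
    strcpy(buf, line);

    for (int i = 2; i >= 0; i--) {
        char *end;
        field = last_field(buf);
        if (field == NULL) {
            return 0;
        }
        nums[i] = (unsigned int)strtoul(field, &end, 10);
        if (end == field || *end != '\0') {
            return 0;
        }
        *field = '\0';        /* cut the parsed field off the buffer */
    }

    field = last_field(buf);
    if (field == NULL || strlen(field) != 1) {
        return 0;
    }
    out->state = field[0];
    *field = '\0';

    if (last_field(buf) == NULL) {   /* also trims whitespace after the name */
        return 0;
    }
    name = buf;
    while (isspace((unsigned char)*name)) {
        name++;
    }
    snprintf(out->name, sizeof(out->name), "%s", name);
    out->priority = nums[0];
    out->stack_free = nums[1];
    out->number = nums[2];
    return 1;
}

/* Number of tasks with priority strictly above that of `name`, or -1 if absent. */
int count_higher_priority(const struct task_entry *tasks, size_t n, const char *name)
{
    const struct task_entry *self = NULL;
    int higher = 0;

    for (size_t i = 0; i < n; i++) {
        if (strcmp(tasks[i].name, name) == 0) {
            self = &tasks[i];
        }
    }
    if (self == NULL) {
        return -1;
    }
    for (size_t i = 0; i < n; i++) {
        if (tasks[i].priority > self->priority) {
            higher++;
        }
    }
    return higher;
}
```

Fed the twelve lines of the task list posted earlier, this reports eight tasks with a higher priority than `art_udp_t`.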

I will experiment with priorities and report back - thank you for that useful hint.

I’m pleased to say you hit the nail on the head.
I’ve adjusted the priorities of my two tasks as per the following screenshot:

In essence, I’d blindly followed the example for ‘xTaskCreate()’ which sets the task priority very low.
I don’t know what some of the tasks in the list do, but I’m surprised any of them were doing enough to mean that my tasks - even though low priority - would not get scheduled in a timely manner.

I should now do the research to try and work out which of the tasks was getting in the way, but not tonight!

I’m really grateful to you for helping me spot this, and delighted that my project now works smoothly. Thank you.

Honestly, it was a blind guess, but I am glad to see that it worked for you.