AmebaD + RTL8720DN SSL socket data receive problem

Hi,
I use AmebaD SDK where I enabled SSL sockets and control them via AT commands. I use external HTTP library on my host to download some file from HTTP/HTTPS server. When I download it via TCP socket, everything goes well. When I do it through SSL socket (requests and download flow from host point of view is the same - I use the same AT commands) file is downloaded to about 97% and then ATPR command returns that there is no more data.

When downloading files from the same server through browser, or my own python script everything works perfectly. I also tested different files with different sizes and no matter the file size if it’s 300kB or 500kB the download stops at about 97%.

I made some tests regarding different ATPR recv sizes and here’s what I discovered:

			dl_test.txt - Content Length: 457800 B, ATPR buffer size: 256 B
			dl_test: No more data! Remaining length: 15583, bytes received: 442217, missing: 3,40%.
			
			dl_test.txt - Content Length: 457800 B, ATPR buffer size: 2048 B
			dl_test: No more data! Remaining length: 13791, bytes received: 444009, missing: 3,01%.
			
			dl_test.txt - Content Length: 457800 B, ATPR buffer size: 3072 B
			dl_test: No more data! Remaining length: 12767, bytes received: 445033, missing: 2,78%.
			
			dl_test.txt - Content Length: 457800 B, ATPR buffer size: 3996 B
			dl_test: No more data! Remaining length: 11843, bytes received: 445957, missing: 2,58%.

Seems like download is failing at the same point each time, remaining length difference between each cases is the same as buffer size difference (15583-13791 == |256 - 2048| etc.) File contains only text data (repeating “DownloadTest1234\r\n” through whole file).

For SSL support I use MBEDTLS.

Any hints on what might be the reason or what should I check or test?

Hi @kilimanjaro please share your device info and software info first, so we can better assist you, for example,

  • Board model
  • SDK version and where it’s from

@xidameng
I use BW16 module and release branch (or master as it is named now) that was recommended to me here RTL8720DN/BW16 firmware boot problem. Atcmd version is v2.2.1 and SDK version is v3.5.

I made some modifications to this SDK including:

  • Support for UART ATcmds and SSL via ATcmds (defines and pin changes only)

  • Support for SSL client example (defines only) and provided my own certificate

  • Extended LOG_SERVICE_BUFLEN to 4200 and STACKSIZE to 4800 of log_service to support larger buffers

  • Changes in ATSO command to support HTTPS OTA update

That’s everything that was changed and the only place I touched SSL is providing my own cert for MQTT connection which shouldn’t affect download process.

  • Extended LOG_SERVICE_BUFLEN to 4200 and STACKSIZE to 4800 of log_service to support larger buffers
  • is LOG_SERVICE_BUFLEN that which is defined in platform_opts.h?
  • In which file is the stacksize of log_service changed?
  • can you provide more details on the commands you run in the testing process? from my testing, ATPR only receives data once, up to the buffer limit. Are you using the auto receive mode ATPK for downloading the file?
  • Yes, it’s in platform_opts.h
  • \component\common\api\at_cmd\log_service.c line 467
  • I run ATPC to open SSL socket, then ATPT for http requests and ATPR for polling for data. I don’t use ATPK at all.
    When I download files this way through TCP socket everything works perfectly.

Unfortunately I can’t use ATPK as I have multi-threaded host with MQTT connection on one thread and the other thread running HTTP download both using Realtek sockets. As specified in AN0075 document about ATPK: received data will return to host without any information in the head so I woudln’t be able to distinguish which connection returned data.

@kilimanjaro

It looks like the error is caused by some MbedTLS behaviour. I am testing using ATPC to open a SSL socket, and using ATPK to receive a large data file sent from the server side, and I am seeing that it stops before the file contents are completely transferred. I guess this is the same thing you are seeing.

I am not sure what is happening, I plan to enable TLS debug, then capture and decrypt the TLS traffic in wireshark to see the details.

Seems like the same thing, looking forward to hearing from you about tests results.

@kilimanjaro

After some testing, this is what I have found:

  • A file of size 1501 bytes will be received entirely, using either ATPR or ATPK
  • A file exceeding 1501 bytes will also be received entirely, but the data exceeding 1501 bytes seem to be stuck in a buffer somewhere, until another transmission occurs from the server. After this transmission, using ATPR once will first clear out the remaining data from the previous file, using ATPR a second time will retrieve the data from the new transmission. If the data sent in the new transmission also exceeds 1501 bytes, the process repeats.

The value of 1501 may be specific only to my build, but I think the overall behavior should be the same as yours. My understanding is that repeated ATPR calls should clear out the buffers, so this is definitely wrong and should be investigated.

As a stop-gap measure, perhaps you could try sending the server a dummy request after the download, just to get the server to send a short response, and see if this gets you the missing data.

Thanks, I’ll test that workaround. Will you investigate further to find the root cause of this problem?

yes, the data is getting to through to the TCP layer. It is getting stuck somewhere in the TLS layer.

@wyy
I tested your workaround and it works, thanks for this suggestion. Any updates on TLS layer investigation?

@kilimanjaro

thanks for letting me know it works, that would confirm that we are seeing the same issue, and that a solution would probably work for both of us.

I am trying to enable LwIP debugging, to compare the debug messages between the TCP layer and TLS layer to narrow down where the data is stuck at.

@wyy any updates, are you still investigating?

Since I last looked at it, from the LwIP debug messages, it seemed like the IP layer also experienced the same phenomenon, i.e. I had no idea where the data is getting stuck at, since LwIP also reported no missing data.
Unfortunately, that is at the limit of the source code that is accessible to me. I have forwarded the issue to internal developers for investigation, but any progress on the issue is now out of my control. I must apologize that I would be probably unable to provide any detailed updates.

Ok, thanks for passing the subject further on. Where will I be able to find information about the prepared solution when it is available?

I will post updates as I receive them

1 Like

I have very similar issue but just at the beginning of downloading a file. Also, I have a workaround for it but no idea why it works!. Here is what I do:

  1. Connecting to AP
  2. Opening SSL socket
  3. Sending GET request
  4. Receiving data (using ATPR)
    The problem is that only FIRST attempt of ATPR returns some data. Every next ATPR command returns 0 length. The file is large enough so it should return more data.

The workaround is weird. Just adding zero (\0) at the end of GET request string fixes the problem with receiving data. But why?

Below 2 examples with my logs.
1st - the NOT working version:

rtl_uart: >>> INIT
rtl_uart: Baud rate: 38400
rtl_uart: <<< AT COMMAND READY

rtl_uart: >>> ATPN=“SSID”,“PASS”
rtl_uart: <<< [ATPN] OK
at_cmd: [ATPN] SUCCESS

rtl_uart: >>> ATPC=2,dweuiotsa1.blob.core.windows.net,443
rtl_uart: <<< [ATPC] OK
[ATPC] con_id=1
at_cmd: [ATPC] SUCCESS Socket ID: 1

rtl_uart: >>> ATPT=141,1 [47,45,54,20,2F,77,69,66,69,2D,66,69,72,6D,77,61,72,65,2D,74,65,73,74,2F,74,65,73,74,70,6F,62,69,65,72,61,6E,69,61,2E,74,78,74,20,48,54,54,50,2F,31,2E,31,0D,0A,48,6F,73,74,3A,20,64,77,65,75,69,6F,74,73,61,31,2E,62,6C,6F,62,2E,63,6F,72,65,2E,77,69,6E,64,6F,77,73,2E,6E,65,74,0D,0A,41,63,63,65,70,74,3A,20,74,65,78,74,2F,78,6D,6C,2C,20,61,70,70,6C,69,63,61,74,69,6F,6E,2F,78,68,74,6D,6C,2B,78,6D,6C,2C,20,2A,2F,2A,0D,0A,0D,0A]=141
rtl_uart: <<< [ATPT] OK,1
at_cmd: [ATPT] SUCCESS

rtl_uart: >>> ATPR=1,64
rtl_uart: <<< [ATPR] OK,64,1 [48,54,54,50,2F,31,2E,31,20,32,30,30,20,4F,4B,0D,0A,43,6F,6E,74,65,6E,74,2D,4C,65,6E,67,74,68,3A,20,36,36,35,38,0D,0A,43,6F,6E,74,65,6E,74,2D,54,79,70,65,3A,20,61,70,70,6C,69,63,61,74,69,6F,6E]=64
at_cmd: [ATPR] SUCCESS Socket ID: 1, Data: HTTP/1.1 200 OK
Content-Length: 6658
Content-Type: application

rtl_uart: >>> ATPR=1,64
rtl_uart: <<< [ATPR] OK,0,1 =0
at_cmd: [ATPR] SUCCESS Socket ID: 1, No data

rtl_uart: >>> ATPR=1,64
rtl_uart: <<< [ATPR] OK,0,1 =0
at_cmd: [ATPR] SUCCESS Socket ID: 1, No data

rtl_uart: >>> ATPR=1,64
rtl_uart: <<< [ATPR] OK,0,1 =0
at_cmd: [ATPR] SUCCESS Socket ID: 1, No data

rtl_uart: >>> ATPD=1
rtl_uart: <<< [ATPD] OK
at_cmd: [ATPD] SUCCESS

2nd - the Working version (with \0 at the end of GET request):
rtl_uart: >>> INIT
rtl_uart: Baud rate: 38400
rtl_uart: <<< AT COMMAND READY

rtl_uart: >>> ATPN=“SSID”,“PASS”
rtl_uart: <<< [ATPN] OK
at_cmd: [ATPN] SUCCESS

rtl_uart: >>> ATPC=2,dweuiotsa1.blob.core.windows.net,443
rtl_uart: <<< [ATPC] OK
[ATPC] con_id=1
at_cmd: [ATPC] SUCCESS Socket ID: 1

rtl_uart: >>> ATPT=142,1 [47,45,54,20,2F,77,69,66,69,2D,66,69,72,6D,77,61,72,65,2D,74,65,73,74,2F,74,65,73,74,70,6F,62,69,65,72,61,6E,69,61,2E,74,78,74,20,48,54,54,50,2F,31,2E,31,0D,0A,48,6F,73,74,3A,20,64,77,65,75,69,6F,74,73,61,31,2E,62,6C,6F,62,2E,63,6F,72,65,2E,77,69,6E,64,6F,77,73,2E,6E,65,74,0D,0A,41,63,63,65,70,74,3A,20,74,65,78,74,2F,78,6D,6C,2C,20,61,70,70,6C,69,63,61,74,69,6F,6E,2F,78,68,74,6D,6C,2B,78,6D,6C,2C,20,2A,2F,2A,0D,0A,0D,0A,00]=142
rtl_uart: <<< [ATPT] OK,1
at_cmd: [ATPT] SUCCESS

rtl_uart: >>> ATPR=1,64
rtl_uart: <<< [ATPR] OK,64,1 [48,54,54,50,2F,31,2E,31,20,32,30,30,20,4F,4B,0D,0A,43,6F,6E,74,65,6E,74,2D,4C,65,6E,67,74,68,3A,20,36,36,35,38,0D,0A,43,6F,6E,74,65,6E,74,2D,54,79,70,65,3A,20,61,70,70,6C,69,63,61,74,69,6F,6E]=64
at_cmd: [ATPR] SUCCESS Socket ID: 1, Data: HTTP/1.1 200 OK
Content-Length: 6658
Content-Type: application

rtl_uart: >>> ATPR=1,64
rtl_uart: <<< [ATPR] OK,64,1 [2F,6F,63,74,65,74,2D,73,74,72,65,61,6D,0D,0A,43,6F,6E,74,65,6E,74,2D,4D,44,35,3A,20,75,30,45,4E,32,71,51,74,61,6D,31,41,62,76,44,47,55,47,58,78,71,41,3D,3D,0D,0A,4C,61,73,74,2D,4D,6F,64,69,66]=64
at_cmd: [ATPR] SUCCESS Socket ID: 1, Data: /octet-stream
Content-MD5: u0EN2qQtam1AbvDGUGXxqA==
Last-Modif

rtl_uart: >>> ATPR=1,64
rtl_uart: <<< [ATPR] OK,64,1 [69,65,64,3A,20,54,75,65,2C,20,32,34,20,4D,61,79,20,32,30,32,32,20,31,37,3A,31,39,3A,35,39,20,47,4D,54,0D,0A,45,54,61,67,3A,20,30,78,38,44,41,33,44,41,39,41,31,33,45,38,39,31,43,0D,0A,53,65,72]=64
at_cmd: [ATPR] SUCCESS Socket ID: 1, Data: ied: Tue, 24 May 2022 17:19:59 GMT
ETag: 0x8DA3DA9A13E891C
Ser

rtl_uart: >>> ATPR=1,64
rtl_uart: <<< [ATPR] OK,64,1 [76,65,72,3A,20,57,69,6E,64,6F,77,73,2D,41,7A,75,72,65,2D,42,6C,6F,62,2F,31,2E,30,20,4D,69,63,72,6F,73,6F,66,74,2D,48,54,54,50,41,50,49,2F,32,2E,30,0D,0A,78,2D,6D,73,2D,72,65,71,75,65,73,74,2D]=64
at_cmd: [ATPR] SUCCESS Socket ID: 1, Data: ver: Windows-Azure-Blob/1.0 Microsoft-HTTPAPI/2.0
x-ms-request-

rtl_uart: >>> ATPR=1,64
rtl_uart: <<< [ATPR] OK,64,1 [69,64,3A,20,66,64,64,66,31,63,39,32,2D,64,30,31,65,2D,30,30,34,34,2D,36,35,64,65,2D,39,35,62,63,66,30,30,30,30,30,30,30,0D,0A,78,2D,6D,73,2D,76,65,72,73,69,6F,6E,3A,20,32,30,30,39,2D,30,39,2D]=64
at_cmd: [ATPR] SUCCESS Socket ID: 1, Data: id: fddf1c92-d01e-0044-65de-95bcf0000000
x-ms-version: 2009-09-

rtl_uart: >>> ATPR=1,64
rtl_uart: <<< [ATPR] OK,64,1 [31,39,0D,0A,78,2D,6D,73,2D,6C,65,61,73,65,2D,73,74,61,74,75,73,3A,20,75,6E,6C,6F,63,6B,65,64,0D,0A,78,2D,6D,73,2D,62,6C,6F,62,2D,74,79,70,65,3A,20,42,6C,6F,63,6B,42,6C,6F,62,0D,0A,44,61,74,65]=64
at_cmd: [ATPR] SUCCESS Socket ID: 1, Data: 19
x-ms-lease-status: unlocked
x-ms-blob-type: BlockBlob
Date

rtl_uart: >>> ATPR=1,64
rtl_uart: <<< [ATPR] OK,64,1 [3A,20,54,75,65,2C,20,31,32,20,4A,75,6C,20,32,30,32,32,20,31,30,3A,35,39,3A,33,34,20,47,4D,54,0D,0A,0D,0A,54,65,73,74,50,6F,62,69,65,72,61,6E,69,61,31,32,33,34,0D,0A,54,65,73,74,50,6F,62,69,65]=64
at_cmd: [ATPR] SUCCESS Socket ID: 1, Data: : Tue, 12 Jul 2022 10:59:34 GMT

TestPobierania1234
TestPobie

rtl_uart: >>> ATPR=1,64
rtl_uart: <<< [ATPR] OK,64,1 [72,61,6E,69,61,31,32,33,34,0D,0A,54,65,73,74,50,6F,62,69,65,72,61,6E,69,61,31,32,33,34,0D,0A,54,65,73,74,50,6F,62,69,65,72,61,6E,69,61,31,32,33,34,0D,0A,54,65,73,74,50,6F,62,69,65,72,61,6E,69]=64
at_cmd: [ATPR] SUCCESS Socket ID: 1, Data: rania1234
TestPobierania1234
TestPobierania1234
TestPobierani

rtl_uart: >>> ATPD=1
rtl_uart: <<< [ATPD] OK
at_cmd: [ATPD] SUCCESS

Few more hints:

  • No matter what is the requested length in the ATPR command - only the first attempt work
  • I did custom change in the Ameba SDK file atcmd_lwip.c. Instead ETH_MAX_MTU I use 'packet_size".
    I changet this:
    error_no = atcmd_lwip_receive_data(curnode, rx_buffer, ETH_MAX_MTU, &recv_size, udp_clientaddr, &udp_clientport);
    Into this:
    error_no = atcmd_lwip_receive_data(curnode, rx_buffer, packet_size, &recv_size, udp_clientaddr, &udp_clientport);
  • I use SSL socket

Any ideas?

Apologies for the delay, this modification should fix the issue.

  1. Open the file atcmd_lwip.c in your local SDK copy
  2. Find the function atcmd_lwip_receive_data located around line 5266
  3. Replace the original code show here:
	if(!((ret > 0)&&(FD_ISSET(curnode->sockfd, &readfds))))
	{
		//AT_DBG_MSG(AT_FLAG_LWIP, AT_DBG_ALWAYS, 
		//	"[ATPR] No receive event for con_id %d", curnode->con_id);
		goto exit;
	}

with this replacement:

	if(!((ret > 0)&&(FD_ISSET(curnode->sockfd, &readfds))))
	{
		if(curnode->protocol == NODE_MODE_SSL)
		{
			if(mbedtls_ssl_get_bytes_avail((mbedtls_ssl_context *)curnode->context) == 0)
			{
				//AT_DBG_MSG(AT_FLAG_LWIP, AT_DBG_ALWAYS, 
				//	"[ATPR] No receive event for con_id %d", curnode->con_id);
				goto exit;
			}
		}
		else{
			//AT_DBG_MSG(AT_FLAG_LWIP, AT_DBG_ALWAYS, 
			//	"[ATPR] No receive event for con_id %d", curnode->con_id);
			goto exit;
		}
	}
1 Like