MicroPython with Computer Vision on RTL8722DM_MINI
Hi! If you have not watched the demo video yet, feel free to check out the video above.
Computer vision has been around for quite a while, but running a computer vision application is usually too demanding for a microcontroller (MCU): it requires a lot of computation and burns a lot of power. It is therefore better to offload the CV-related computation to a more capable machine, such as a PC or a cloud server, and let the MCU control sensors/actuators in real time.
This little project has two main building blocks:
- Video capture and running of computer vision algorithm → PC
- Running MicroPython for wireless data transmission and LED control → RTL8722DM_MINI
Let’s go through them one by one.
1. Video capture and Computer Vision Algorithm
To achieve hand gesture recognition, I chose the well-known OpenCV and MediaPipe libraries: they are open-source projects available on GitHub, and both have Python bindings, meaning I can write all my logic in Python for quick prototyping.
The code I use is mainly from a YouTuber named Murtaza, who provides his code for free if you register an account on his website. As I have not obtained his permission to release the code to the public, I suggest you watch his videos, especially the ones on Hand Tracking and Gesture Control, to see his code and understand for yourself how hand gestures are captured and interpreted.
However, I added and edited quite a bit of the original Python script so that it only runs the tasks I need and also sends the data to my RTL8722DM_MINI via a TCP socket (server). Below is the flow chart for all the logic running on my PC.
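At the heart of this flow is a simple linear mapping from the thumb-to-index distance (roughly 50 to 300 pixels on screen) to a 0 to 1 brightness value. As a standalone sketch (the function name is mine, not from the original script):

```python
import numpy as np

def distance_to_brightness(length, min_bri=0.0, max_bri=1.0):
    # np.interp clamps outside the [50, 300] range, so a very close
    # pinch maps to min_bri and a wide-open hand maps to max_bri
    return float(np.around(np.interp(length, [50, 300], [min_bri, max_bri]), 2))

print(distance_to_brightness(50))    # 0.0
print(distance_to_brightness(175))   # 0.5
print(distance_to_brightness(300))   # 1.0
```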
Here is the complete Python code:
# Computer vision related code partially adopted from https://www.youtube.com/c/MurtazasWorkshopRoboticsandAI/featured
# Credit goes to him for creating the module and examples
import cv2
import time
import numpy as np
import math
import socket
import mediapipe as mp

########## Module #############
class handDetector():
    def __init__(self, mode=False, maxHands=2, detectionCon=0.5, trackCon=0.5):
        self.mode = mode
        self.maxHands = maxHands
        self.detectionCon = detectionCon
        self.trackCon = trackCon
        self.mpHands = mp.solutions.hands
        self.hands = self.mpHands.Hands(self.mode, self.maxHands,
                                        self.detectionCon, self.trackCon)
        self.mpDraw = mp.solutions.drawing_utils

    def findHands(self, img, draw=True):
        imgRGB = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
        self.results = self.hands.process(imgRGB)
        if self.results.multi_hand_landmarks:
            for handLms in self.results.multi_hand_landmarks:
                if draw:
                    self.mpDraw.draw_landmarks(img, handLms,
                                               self.mpHands.HAND_CONNECTIONS)
        return img

    def findPosition(self, img, handNo=0, draw=True):
        lmList = []
        if self.results.multi_hand_landmarks:
            myHand = self.results.multi_hand_landmarks[handNo]
            for id, lm in enumerate(myHand.landmark):
                h, w, c = img.shape
                cx, cy = int(lm.x * w), int(lm.y * h)
                lmList.append([id, cx, cy])
                if draw:
                    cv2.circle(img, (cx, cy), 15, (255, 0, 255), cv2.FILLED)
        return lmList

############## Variables ##################
wCam, hCam = 640, 480
pTime = 0
minBri = 0
maxBri = 1
briArd = 0

############## Declaration ##################
cap = cv2.VideoCapture(0)  # default camera is 0; if you have another cam, set this to 1
cap.set(3, wCam)
cap.set(4, hCam)
detector = handDetector(detectionCon=0.7)

########## Step 1 ###########
# Start a TCP server and bind to port 12345
# Use ipconfig to check the IP address of your PC
s = socket.socket()
print("Socket successfully created")
port = 12345
s.bind(('', port))
print("Socket bound to %s" % (port))
s.listen(5)
print("Socket is listening")
c, addr = s.accept()
print('Got connection from', addr)

######### Step 2 ###############
# Image capture and processing using MediaPipe and OpenCV
while True:
    success, img = cap.read()
    img = detector.findHands(img)
    lmList = detector.findPosition(img, draw=False)
    if len(lmList) != 0:
        x1, y1 = lmList[4][1], lmList[4][2]   # thumb tip
        x2, y2 = lmList[8][1], lmList[8][2]   # index finger tip
        cx, cy = (x1 + x2) // 2, (y1 + y2) // 2
        cv2.circle(img, (x1, y1), 15, (255, 0, 255), cv2.FILLED)
        cv2.circle(img, (x2, y2), 15, (255, 0, 255), cv2.FILLED)
        cv2.line(img, (x1, y1), (x2, y2), (255, 0, 255), 3)
        cv2.circle(img, (cx, cy), 15, (255, 0, 255), cv2.FILLED)
        length = math.hypot(x2 - x1, y2 - y1)
        # Hand range 50 - 300 pixels, mapped to minBri - maxBri
        brightness = np.interp(length, [50, 300], [minBri, maxBri])
        briArd = np.around(brightness, 2)
        if length < 50:
            cv2.circle(img, (cx, cy), 15, (0, 255, 0), cv2.FILLED)
    # Print FPS
    cTime = time.time()
    fps = 1 / (cTime - pTime)
    pTime = cTime
    cv2.putText(img, f'FPS: {int(fps)}', (40, 50), cv2.FONT_HERSHEY_COMPLEX,
                1, (255, 0, 0), 3)
    # Display image
    cv2.imshow("Img", img)
    cv2.waitKey(1)
    # Send the distance-derived brightness wirelessly to the IoT device
    c.sendall(str(briArd).encode())
    print("send data success")
    print(briArd, str(briArd))
# c.close()
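Before involving the board at all, the TCP link can be exercised entirely on the PC. The sketch below (loopback address and port are assumptions for local testing only) plays both roles: a minimal stand-in for the server above, and a plain Python client standing in for the RTL8722DM_MINI:

```python
import socket
import threading

PORT = 12345  # same port as the server script above
ready = threading.Event()

def fake_server():
    # Minimal stand-in for the PC-side server: accept one client
    # and send one brightness value, mimicking c.sendall(str(briArd))
    s = socket.socket()
    s.bind(('127.0.0.1', PORT))
    s.listen(1)
    ready.set()               # signal that the server is accepting
    conn, _ = s.accept()
    conn.sendall(str(0.75).encode())
    conn.close()
    s.close()

t = threading.Thread(target=fake_server)
t.start()
ready.wait()

# This part plays the role of the RTL8722DM_MINI client
c = socket.socket()
c.connect(('127.0.0.1', PORT))
data = c.recv(512)
print(float(data.decode()))   # 0.75
c.close()
t.join()
```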
2. Running MicroPython for wireless data transmission and LED control
The reason I chose MicroPython is that I don't have to switch to another language: MicroPython is a lean implementation of the Python 3 interpreter designed for microcontrollers, and the RTL8722DM_MINI supports it.
With MicroPython you can control all the peripherals on the microcontroller. In the case of the RTL8722DM_MINI, only 13 lines of Python code are needed to:
- Connect to Wi-Fi
- Start a TCP client socket
- Control LED brightness via PWM
Here is the MicroPython code:
import socket
from wireless import WLAN
from machine import PWM

wifi = WLAN(mode=WLAN.STA)
wifi.connect(ssid="yourNetwork", pswd="password")  # change the ssid and pswd to yours

c = socket.SOCK()
# Make sure to check the server IP address and update it in the next line
c.connect("192.168.0.106", 12345)
p = PWM(pin="PA_23")

while True:
    data = c.recv(512)
    f_brightness = float(data[:3])
    print(f_brightness)
    p.write(f_brightness)
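One caveat worth knowing: TCP is a byte stream, so several str(briArd) payloads can arrive glued together in a single recv(), and the data[:3] slice truncates two-decimal values such as 0.25 down to 0.2. A fixed-width framing scheme would sidestep both issues. The sketch below is my own suggestion, not part of the original project; for brightness values in [0, 1], '%.2f' always yields exactly 4 characters:

```python
# Sketch: fixed-width (4-byte) framing for brightness values in [0, 1]

def encode_brightness(value):
    # Sender side (PC): always format to 4 characters, e.g. 0.5 -> b'0.50'
    return ('%.2f' % value).encode()

def decode_latest(data):
    # Receiver side (board): split the stream into complete 4-byte
    # frames and keep only the most recent value
    frames = [data[i:i + 4] for i in range(0, len(data) - len(data) % 4, 4)]
    return float(frames[-1].decode()) if frames else None

buf = encode_brightness(0.25) + encode_brightness(0.5)
print(decode_latest(buf))   # 0.5
```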
Conclusion
With Wi-Fi and other wireless communication options, an IoT-enabled device like the RTL8722DM_MINI can really be an integral part of an AIoT project. With the Python language, product prototyping can be even faster and smoother.
The RTL8722DM_MINI still has many other interesting features I haven't tried out, such as the audio codec, ultra-low-power mode, and BLE 5.0, which give it a lot of potential to become a powerful multimedia-capable IoT node that can be deployed for a long time.