Real-Time Computer Vision

This project develops a real-time semantic segmentation system for drone video streams. Using deep learning models, the system processes live footage to identify and classify objects within the scene. Post-processing techniques improve segmentation accuracy, and a user-friendly interface lets the operator adjust parameters on the fly. Designed for rapid inference, the application integrates with market-dominant drones for live environmental analysis, analytics, and more.

Follow this project here: github.com/kylegraupe

What problem does this solve?

This application enables computer vision on DJI drones that are not supported by the DJI SDK. For the list of supported SDKs and their associated drones, see DJI's developer documentation. The drone used to develop this application is the DJI Mini 4 Pro, the latest release in the sub-250g class of consumer drones, which is not supported by the DJI SDK.

V5.0: Implemented custom multithreaded buffering to keep the stream in near real-time.

- Increased frame rate

- Lowered and stabilized latency

(No post-processing was applied to the segmentation mask for this test flight. Network: iPhone 16 hotspot.)

[Released 9.15.24, Test Flight 9.13.24]

V1.0 - V4.0: Application teaser trailer.

[Released 9.10.24]

Real-Time Semantic Segmentation on DJI Drone via RTMP Server

This application receives a video stream from a DJI drone via an RTMP server and performs image processing and semantic segmentation on it.
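
To illustrate the ingest step, the sketch below shows one way to pull raw frames from an RTMP server by piping the stream through ffmpeg. It is not necessarily how this application reads the stream: the use of ffmpeg, the function names, the example URL, and the frame dimensions are all assumptions made for the example.

```python
import subprocess


def build_ffmpeg_command(rtmp_url):
    """Return an ffmpeg command that decodes a stream to raw BGR frames on stdout."""
    return [
        "ffmpeg",
        "-loglevel", "error",
        "-i", rtmp_url,        # hypothetical URL, e.g. rtmp://127.0.0.1/live/drone
        "-an",                 # ignore audio
        "-f", "rawvideo",      # emit undecorated frame bytes
        "-pix_fmt", "bgr24",   # 3 bytes per pixel
        "pipe:1",              # write to stdout
    ]


def open_stream(rtmp_url):
    """Start ffmpeg and return the process; frames arrive on process.stdout."""
    return subprocess.Popen(build_ffmpeg_command(rtmp_url), stdout=subprocess.PIPE)


def read_frames(stream, width, height):
    """Yield complete raw frames (bytes) from a binary stream.

    Stops at the first short read, which signals end of stream.
    """
    frame_size = width * height * 3
    while True:
        data = stream.read(frame_size)
        if len(data) < frame_size:
            return
        yield data
```

With an RTMP server running, `read_frames(open_stream(url).stdout, width, height)` would yield one `bytes` object per frame, provided `width` and `height` match the actual resolution of the stream. Each frame could then be handed to the segmentation model.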

Context

Many industries and applications increasingly need real-time, high-quality video streaming. DJI is the market-dominant supplier of consumer and industrial drones, so an application for real-time computer vision on DJI drones like the Mini 4 Pro helps harness the full potential of these imaging systems. This application provides immediate AI analysis to both consumers and professionals without requiring more costly alternatives or the DJI SDK, while offering comparable control over the video feed and frames.

Features

Real-Time Semantic Segmentation: Perform live semantic segmentation on drone footage.
Custom Model Integration: Integrate custom U-Net models for segmentation tasks.
Post-Processing: Apply advanced post-processing techniques to improve segmentation accuracy.
GUI Integration: A user-friendly graphical interface for controlling and visualizing the segmentation process.
Custom Buffer: Custom implementation of an RTMP stream buffer to mitigate latency and keep the stream in near real-time.
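
The idea behind a latency-limiting buffer can be sketched as follows. This is a minimal illustration, not the buffer used in this application: the class name `LatestFrameBuffer`, the `max_frames` parameter, and the drop-oldest policy are assumptions chosen for the example. A reader thread pushes frames as they arrive, and when the consumer (for example, model inference) falls behind, the oldest frames are discarded so the displayed stream stays close to live.

```python
import threading
from collections import deque


class LatestFrameBuffer:
    """Thread-safe bounded buffer that drops the oldest frame when full."""

    def __init__(self, max_frames=2):
        self._frames = deque(maxlen=max_frames)
        self._cond = threading.Condition()
        self.dropped = 0  # number of frames discarded because the consumer lagged

    def put(self, frame):
        with self._cond:
            if len(self._frames) == self._frames.maxlen:
                self.dropped += 1  # appending to a full deque evicts the oldest item
            self._frames.append(frame)
            self._cond.notify()

    def get(self, timeout=None):
        """Return the oldest buffered frame, or None if none arrives in time."""
        with self._cond:
            if not self._cond.wait_for(lambda: len(self._frames) > 0, timeout=timeout):
                return None
            return self._frames.popleft()


def demo():
    """Simulate a slow consumer: 100 frames arrive before any is read."""
    buf = LatestFrameBuffer(max_frames=2)
    producer = threading.Thread(target=lambda: [buf.put(i) for i in range(100)])
    producer.start()
    producer.join()
    return buf.get(), buf.get(), buf.dropped
```

In `demo()`, only the two most recent frames survive and the rest are counted as dropped. A small `max_frames` trades completeness for latency: skipped frames are never segmented, but the frames that are shown are recent.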

References

Model training was conducted in a Kaggle Jupyter Notebook environment:
- https://www.kaggle.com/code/kylegraupe/model-training-dji-real-time-semantic-segmentation
- The model training notebook is also included in the repository: 'model_training/model-training-dji-real-time-semantic-seg-v1.ipynb'