AIoT-based prototype for Remote Monitoring System


About Client

The client is one of the largest cash logistics companies in India and among the top five in APAC. They enable commerce by connecting businesses, banks and people with money. They automate ATM and currency management in India, and their networks and support services ensure money is readily available across all states. They provide services at each stage of the cash cycle in India, from currency chests to ATMs, vaults, stores and wallets: cashiering services for top retail chains, picking up cash from thousands of merchants and banking it, and installing and managing Intelligent ATMs, Cash Deposit Machines and Recyclers. They are a pioneer in helping change banking in India.

Problem Statement

The client wanted to build an end-to-end system to monitor key activities in their managed ATMs across different locations. Currently, the client uses a third-party product and would like to build a home-grown replacement from scratch. As part of this prototype, we are helping the client build a computer-vision-based monitoring system covering helmet detection, face cover detection, loitering, crowd count and camera tampering. They have already identified a hardware system (Raspberry Pi 4B based) for this purpose, and the new application needs to be integrated on the same hardware.

There is also a need to create a training pipeline in the cloud and to optimize the models to run on the edge device (RPi 4B based).

Oneture's Role

Oneture's data scientists and engineers are assisting the client and the AWS prototyping team in several high-level discussions, fostering closer collaboration to identify a track of work on automated surveillance that leverages computer vision deep learning models, and building the training and inference pipelines.

Broadly, we are assisting with: feasibility study, scoping and architecture; model evaluation and optimisation; collecting training images for the given task; labelling the images as per the model's needs; training the models; and running inference and reviewing model accuracy.

Upon a successful prototyping engagement, the client and Oneture will take the outputs, convert them into a production deployment, and continue to maintain and enhance it.

Solution

There are three major components: a) an inference pipeline involving the edge device and the cloud, b) a training pipeline in the cloud, and c) device provisioning and management.

The overall flow of the solution is as follows.

Step 1 : A lightweight model runs on the edge device to detect humans and count the number of people present, along with object classification to determine camera tampering
Step 2 : If the edge device detects humans in the frame, it starts sending every nth frame per second to the cloud. The cloud runs inference for helmet and face cover detection, and tracking to determine the duration of the stay
Step 3 : Based on the results of the above steps, the streamed frames are assembled into a video and shared with the control centre
Step 4 : Any low-confidence detection is re-routed for human validation, and the validated images are used to re-train the models
Step 5 : OTA and model updates are handled by AWS IoT Greengrass
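
The gating and routing logic in Steps 1, 2 and 4 can be illustrated with a minimal Python sketch. It is not the prototype's actual code: the names frames_to_upload, dwell_seconds and route_detections, the Detection record, and the 0.6 review threshold are all hypothetical, and "every nth frame" is interpreted here as every nth frame within a continuous stretch of frames that contain people. The per-frame people counts and tracker IDs are assumed to come from the edge and cloud models, which are not shown.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Iterator, List, Tuple


@dataclass
class Detection:
    label: str         # e.g. "helmet" or "face_cover" (illustrative labels)
    confidence: float  # model score in [0, 1]


def frames_to_upload(person_counts: Iterable[int], n: int) -> Iterator[int]:
    """Yield indices of frames the edge device would send to the cloud.

    person_counts[i] is the number of people the edge model found in frame i.
    While people are present, every n-th frame of that stretch is sent;
    the counter resets as soon as a frame is empty.
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    streak = 0  # consecutive frames with at least one person
    for index, count in enumerate(person_counts):
        if count == 0:
            streak = 0
            continue
        if streak % n == 0:
            yield index
        streak += 1


def dwell_seconds(observations: Iterable[Tuple[int, float]]) -> Dict[int, float]:
    """Map each track ID to (last seen - first seen) in seconds.

    observations are (track_id, timestamp_seconds) pairs from a tracker.
    """
    first: Dict[int, float] = {}
    last: Dict[int, float] = {}
    for track_id, ts in observations:
        first[track_id] = min(ts, first.get(track_id, ts))
        last[track_id] = max(ts, last.get(track_id, ts))
    return {track_id: last[track_id] - first[track_id] for track_id in first}


def route_detections(
    detections: Iterable[Detection], review_threshold: float = 0.6
) -> Tuple[List[Detection], List[Detection]]:
    """Split detections into (accepted, needs_human_review).

    The threshold is an assumed value; a real system would tune it.
    """
    accepted: List[Detection] = []
    needs_review: List[Detection] = []
    for det in detections:
        if det.confidence >= review_threshold:
            accepted.append(det)
        else:
            needs_review.append(det)
    return accepted, needs_review
```

In this sketch, a loitering alert would be raised when a value returned by dwell_seconds exceeds a configured limit, and the items in the needs_review list would be the images sent for human validation and later re-training.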

Technologies

At the edge : Raspberry Pi 4B based device, HD camera, Internet connectivity, mount(s), AWS IoT Greengrass v2 core, pre-trained optimized lightweight models, edge inference pipeline - stream manager (for results and image data)
At the cloud : AWS IoT Core for edge device control, use-case detection logic with deep learning computer vision models, Lambda functions, API layer with AWS API Gateway or AWS AppSync
Models and optimisation techniques : YOLOv7 / YOLOv8, SSD with MobileNet-v2, multi-object tracking with FairMOT, FastMOT or DeepSORT, classification with MobileNet-v2, SageMaker Neo, TFLite, NCNN
ML framework : Training - PyTorch; Inference - ONNX, NCNN (C++ based libraries), TensorFlow Lite based models

Values Delivered

The expected outcome of this prototyping engagement is to demonstrate an automated approach to surveillance, using computer vision deep learning models to monitor key activities in ATMs across different locations.