Human activity recognition

General Description

The Human Activity Recognition component detects and classifies human activities in video data. It uses deep learning video architectures such as SlowFast networks to recognize activities from video streams. The pipeline has two stages: Detectron2 first detects persons, then action recognition networks identify the specific actions. The component supports multiple dataset formats, including InHARD and AVA-Kinetics, and provides tools for data processing, model training, and visualization of results. It supports both training on custom datasets and inference with pre-trained models.
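
The two-stage flow described above (person detection, then action classification) can be sketched in plain Python. This is only an illustration of the control flow, not the component's code: `recognize_activities`, `fake_detector`, `fake_classifier`, and the label "assembling" are hypothetical names, and running detection once on the clip's middle frame is an assumption borrowed from key-frame style datasets, not something this document states.

```python
from typing import Callable, Dict, List, Sequence, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2) in pixels


def recognize_activities(
    clip: Sequence[object],
    detect_persons: Callable[[object], List[Box]],
    classify_action: Callable[[Sequence[object], Box], str],
) -> List[Dict[str, object]]:
    """Detect persons on the clip's middle frame, then label each one."""
    if not clip:
        return []
    # Assumption: detection runs once per clip, on the centre frame.
    key_frame = clip[len(clip) // 2]
    results = []
    for box in detect_persons(key_frame):
        # The action model receives the whole clip plus the person's box.
        results.append({"box": box, "action": classify_action(clip, box)})
    return results


# Hypothetical stand-ins so the sketch runs without any model weights.
def fake_detector(frame: object) -> List[Box]:
    return [(10.0, 20.0, 110.0, 220.0)]


def fake_classifier(clip: Sequence[object], box: Box) -> str:
    return "assembling"


print(recognize_activities(["f0", "f1", "f2"], fake_detector, fake_classifier))
```

In a real setup the two callables would wrap a Detectron2 predictor and a SlowFast model; passing them in as arguments keeps the stages independently replaceable.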

Resource | Link
Source code | https://gitc.piap.lukasiewicz.gov.pl/ai-prism/wp4/ai-based-perception-modules/human-activity-recognition
Demo Video |

Contact

The following table lists contact information for the main developers responsible for the component:

Name | Email | Organisation
Dorin Clisu | dorin.clisu@nttdata.com | NTT Data Romania

License

Proprietary.

Technical Foundations

Integrated and Open Source Components

Overview

This component integrates several open-source libraries and frameworks for human activity recognition. PyTorch and torchvision provide the deep learning foundation, and PyTorchVideo supplies video models, including SlowFast networks. Detectron2 handles person detection, and OpenCV supports video processing. The component uses Lightning to standardize training, scikit-learn for data splitting, and pandas for data manipulation. Click provides the command-line interfaces.
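
The overview mentions data splitting without showing what it involves. As a rough illustration, the sketch below performs a seeded hold-out split using only the standard library. It is a stand-in for the idea behind scikit-learn's `train_test_split` (with `test_size` and `random_state`), not the component's actual splitting code; the function name, the 20% validation fraction, and the clip file names are hypothetical.

```python
import random
from typing import List, Sequence, Tuple


def split_train_val(
    items: Sequence[str], val_fraction: float = 0.2, seed: int = 0
) -> Tuple[List[str], List[str]]:
    """Shuffle items with a fixed seed and hold out a validation share."""
    if not 0.0 < val_fraction < 1.0:
        raise ValueError("val_fraction must be strictly between 0 and 1")
    shuffled = list(items)
    # A dedicated Random instance keeps the split reproducible and
    # leaves the global random state untouched.
    random.Random(seed).shuffle(shuffled)
    n_val = max(1, round(len(shuffled) * val_fraction))
    return shuffled[n_val:], shuffled[:n_val]


# Hypothetical clip identifiers.
clips = [f"clip_{i:03d}.mp4" for i in range(10)]
train, val = split_train_val(clips, val_fraction=0.2, seed=42)
print(len(train), len(val))
```

Fixing the seed makes the split repeatable across runs, which matters when comparing models trained at different times.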

Pre-existing Components

PyTorch

Source

PyTorch is an open source machine learning framework. https://pytorch.org/

Description

PyTorch is a deep learning framework that provides a seamless path from research prototyping to production deployment, with GPU acceleration and automatic differentiation.

Modifications

None.

Purpose in AI-PRISM

PyTorch serves as the core deep learning framework for implementing and training the activity recognition models.

License

PyTorch is released under a BSD-style license; see the LICENSE file in its repository. https://github.com/pytorch/pytorch/blob/master/LICENSE

PyTorchVideo

Source

PyTorchVideo is a deep learning library for video understanding research. https://pytorchvideo.org/

Description

PyTorchVideo provides implementations of state-of-the-art video models, datasets and common transforms for video understanding research.

Modifications

The component builds upon PyTorchVideo's SlowFast networks with custom transformations and dataset adaptations for activity recognition.

Purpose in AI-PRISM

PyTorchVideo provides SlowFast and other video models, which are fine-tuned for the project's activity recognition tasks.
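
SlowFast networks take two views of the same clip: a fast pathway with all sampled frames and a slow pathway with a temporally subsampled set. The sketch below shows that frame selection on plain Python lists. It is an illustration under assumptions, not the component's transform: `pack_pathways` is a hypothetical name, and `alpha = 4` (the slow pathway keeps a quarter of the frames) is an assumed value that the component may configure differently.

```python
from typing import List, Sequence, Tuple, TypeVar

Frame = TypeVar("Frame")


def pack_pathways(
    frames: Sequence[Frame], alpha: int = 4
) -> Tuple[List[Frame], List[Frame]]:
    """Return (slow, fast) frame lists; slow keeps len(frames) // alpha frames."""
    total = len(frames)
    n_slow = total // alpha
    if n_slow < 1:
        raise ValueError("clip is too short for the chosen alpha")
    if n_slow == 1:
        slow_idx = [0]
    else:
        # Evenly spaced indices from the first to the last frame,
        # rounded down to whole frame positions.
        slow_idx = [(i * (total - 1)) // (n_slow - 1) for i in range(n_slow)]
    return [frames[i] for i in slow_idx], list(frames)


slow, fast = pack_pathways(list(range(32)), alpha=4)
print(slow)       # [0, 4, 8, 13, 17, 22, 26, 31]
print(len(fast))  # 32
```

With real video data the same index selection would be applied along the time dimension of a frame tensor rather than to a list.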

License

PyTorchVideo is licensed under the Apache License 2.0. https://github.com/facebookresearch/pytorchvideo/blob/main/LICENSE

Detectron2

Source

Detectron2 is Facebook AI Research's next-generation platform for object detection and segmentation. https://github.com/facebookresearch/detectron2

Description

Detectron2 is a platform that implements state-of-the-art object detection algorithms including Faster R-CNN and Mask R-CNN.

Modifications

The component integrates Detectron2 specifically for person detection as a preprocessing step for activity recognition.

Purpose in AI-PRISM

Detectron2 is used to detect persons in video frames, providing the regions of interest for subsequent activity recognition.
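
Turning raw detections into person regions of interest usually means keeping only the person class above a confidence threshold. The following sketch shows that filtering on plain dictionaries. The dictionary layout, the function name, and the 0.75 threshold are assumptions made for illustration; a Detectron2 predictor returns its own `Instances` structure, and class index 0 meaning "person" holds for COCO-style label maps but should be checked against the model actually used.

```python
from typing import Dict, List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2) in pixels

PERSON_CLASS_ID = 0  # assumption: COCO-style label map where 0 is "person"


def select_person_boxes(detections: List[Dict], min_score: float = 0.75) -> List[Box]:
    """Keep confident person detections, highest score first."""
    keep = [
        d for d in detections
        if d["class_id"] == PERSON_CLASS_ID and d["score"] >= min_score
    ]
    keep.sort(key=lambda d: d["score"], reverse=True)
    return [tuple(d["box"]) for d in keep]


# Hypothetical detections for one frame.
detections = [
    {"class_id": 0, "score": 0.80, "box": (410.0, 25.0, 520.0, 390.0)},
    {"class_id": 0, "score": 0.40, "box": (300.0, 50.0, 380.0, 290.0)},
    {"class_id": 56, "score": 0.99, "box": (200.0, 210.0, 330.0, 420.0)},
    {"class_id": 0, "score": 0.95, "box": (12.0, 30.0, 140.0, 400.0)},
]
print(select_person_boxes(detections))
# [(12.0, 30.0, 140.0, 400.0), (410.0, 25.0, 520.0, 390.0)]
```

The resulting boxes are what the activity recognition stage would receive as regions of interest.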

License

Detectron2 is licensed under the Apache License 2.0. https://github.com/facebookresearch/detectron2/blob/main/LICENSE

Lightning

Source

Lightning is a lightweight PyTorch wrapper for high-performance AI research. https://lightning.ai/

Description

Lightning is a framework that organizes PyTorch code to remove boilerplate, making research code more readable and reproducible.

Modifications

None.

Purpose in AI-PRISM

Lightning provides a standardized training interface for the SlowFast models, handling device management, logging, and other training functions.

License

Lightning is licensed under the Apache License 2.0. https://github.com/Lightning-AI/lightning/blob/master/LICENSE

How to install

Every AI-PRISM component is installed using the Cluster management service. During the installation process, the user needs to configure a set of high-level parameters.

How to use