Visual Threat Analytics
Executes parallel YOLOv11 inference streams for real-time weapon (knife) classification and skeletal pose estimation, maintaining frame rates suitable for live video feeds.
Temporal Sequence Processing
Feeds a 150-frame temporal buffer of normalized keypoint velocities into a Bidirectional LSTM network to classify suspicious physical gestures and stabbing movements.
Automated Alert Logging
Triggers instant logging upon violent activity detection, extracting facial crops from clean video frames and archiving logs for forensic review.
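As a rough illustration of the logging step, the sketch below appends one JSON record per alert to a log file. The function name `append_alert`, the record fields, and the JSON Lines layout are all assumptions made for this example, not the project's actual log format.

```python
import json
from datetime import datetime, timezone


def append_alert(log_path, label, confidence, frame_index, face_crop_path=None):
    """Append one alert record to a JSON Lines file and return the record.

    All field names here are hypothetical; adapt them to the real log schema.
    """
    record = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "label": label,                       # e.g. "Danger"
        "confidence": round(float(confidence), 4),
        "frame_index": int(frame_index),
        "face_crop": face_crop_path,          # path to the saved crop, if any
    }
    # Append mode keeps earlier alerts intact for later forensic review.
    with open(log_path, "a", encoding="utf-8") as handle:
        handle.write(json.dumps(record) + "\n")
    return record
```

One record per line keeps the file easy to tail during operation and to parse line by line afterwards.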
Detection Pipeline Overview
Four integrated steps to analyze, filter, and classify threats in under 30 milliseconds.
Visual Ingestion
Captures live camera feeds or local security video streams at 30 FPS.
Parallel YOLOv11
Runs object detection for weapons and tracks 17-joint skeletal human poses.
Torso Normalization
Centers and scales skeleton coordinates relative to shoulders for distance invariance.
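The normalization step can be sketched as follows. This is a minimal example, assuming the 17 joints follow the COCO keypoint order (left shoulder at index 5, right shoulder at index 6) and that shoulder width is the scale reference; the function name `normalize_skeleton` is hypothetical.

```python
import math

# Assumption: COCO keypoint order, where index 5 is the left shoulder
# and index 6 is the right shoulder.
LEFT_SHOULDER, RIGHT_SHOULDER = 5, 6


def normalize_skeleton(keypoints, eps=1e-6):
    """Center a skeleton on the shoulder midpoint and scale by shoulder width.

    keypoints: list of 17 (x, y) pixel coordinates.
    Returns a list of 17 (x, y) tuples expressed in shoulder-width units.
    """
    lx, ly = keypoints[LEFT_SHOULDER]
    rx, ry = keypoints[RIGHT_SHOULDER]
    cx, cy = (lx + rx) / 2.0, (ly + ry) / 2.0
    # eps guards against a degenerate pose where both shoulders coincide.
    scale = max(math.hypot(lx - rx, ly - ry), eps)
    return [((x - cx) / scale, (y - cy) / scale) for x, y in keypoints]
```

Because every coordinate is divided by the shoulder width, the same pose seen close to the camera and far from it maps to the same normalized skeleton.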
LSTM Sequence HAR
Classifies gestures over a 150-frame queue to flag active stabbing motions.
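The 150-frame queue can be illustrated with a fixed-length buffer of per-frame keypoint velocities. The sketch below covers only the buffering, not the LSTM itself; the class name `KeypointVelocityBuffer` and the choice of frame-to-frame differences as the velocity are assumptions for this example.

```python
from collections import deque

WINDOW = 150  # frames, matching the queue length described above


class KeypointVelocityBuffer:
    """Sliding window of frame-to-frame keypoint velocities (hypothetical helper)."""

    def __init__(self, window=WINDOW):
        # deque(maxlen=...) drops the oldest entry automatically when full.
        self.frames = deque(maxlen=window)
        self._previous = None

    def push(self, keypoints):
        """Add one frame of normalized (x, y) keypoints."""
        if self._previous is not None:
            velocity = []
            for (x, y), (px, py) in zip(keypoints, self._previous):
                velocity.extend((x - px, y - py))
            self.frames.append(velocity)
        self._previous = list(keypoints)

    def ready(self):
        """True once the window holds a full sequence for the classifier."""
        return len(self.frames) == self.frames.maxlen

    def sequence(self):
        """Return the buffered velocities, oldest first."""
        return list(self.frames)
```

With 17 joints, each buffered frame is a 34-value vector, so a full window would form a 150 x 34 sequence. At 30 FPS, 150 frames correspond to 5 seconds of motion.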
Real-time Surveillance Monitor
The system features a lightweight Tkinter dashboard designed for operators. It overlays active tracking lines, weapon bounding boxes, and threat classifications (Safe vs. Danger) directly on the video feed.
- Dynamic Frame Skipping: Adjusts the processing load in real time by setting the frame-skip interval from 1 to 5 frames.
- Automated Forensic Captures: Extracts and crops suspects' faces as soon as an alert is raised.
- Alert Logging: Appends detailed telemetry logs for security review.
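Frame skipping of this kind can be sketched in a few lines. The example assumes that a setting of N means "run inference on every Nth frame" and that values outside 1 to 5 are clamped; both helper names are hypothetical.

```python
def clamp_skip(value, low=1, high=5):
    """Keep the operator's skip setting inside the supported 1 to 5 range."""
    return max(low, min(high, int(value)))


def frames_to_process(total_frames, skip):
    """Return the indices of frames that would be sent to inference.

    Assumption: a skip of 1 processes every frame, a skip of 5 processes
    every fifth frame, starting from frame 0.
    """
    skip = clamp_skip(skip)
    return [index for index in range(total_frames) if index % skip == 0]
```

For instance, `frames_to_process(10, 3)` selects frames 0, 3, 6, and 9, cutting the inference load to roughly a third.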

Research Foundation
"By separating visual features into parallel streams—object classification for weapons (YOLOv11 Knife) and skeletal keypoint estimation (YOLOv11-Pose)—the system isolates physical postures. These pose trajectories are normalized against perspective scaling and tracked across consecutive frames. The temporal sequences are then processed by a Bidirectional LSTM network."
Read Thesis Abstract
Thesis Paper Documentation
Author: Filippo Notari • Advisor: Prof. Francesco Santini • Università degli Studi di Perugia
The development of this real-time detection pipeline is backed by a structured academic thesis exploring human activity recognition (HAR), computer vision optimizations, and temporal modeling.
The documentation covers all sections of the thesis: research methodology, model comparisons (CNN-LSTM vs. 3D-CNNs), training hyperparameters, and experimental results (AUC and flicker rates).
Explore Thesis & Download PDF
Deploy the Surveillance Pipeline
Explore the installation guides, prerequisites, and code directories to start tracking and classifying stabbing motions.