Date of Award

Spring 2023

Document Type

Thesis

Degree Name

Master of Science (MS)

Department

Electrical and Computer Engineering

First Advisor

Yaz, Edwin

Second Advisor

Schneider, Susan

Third Advisor

Medeiros, Henry

Abstract

Many communities have installed surveillance cameras in an effort to deter and respond to violence. Due to the difficulty of constantly monitoring such camera feeds, these systems are rarely used to provide real-time information. To enable rapid alerts and information for first responders, this thesis develops a proof-of-concept system capable of automatically detecting violence from video footage. This system is developed by fine-tuning a convolutional neural network that has previously demonstrated success on general action recognition tasks. This thesis explores two new techniques to improve the accuracy of the fine-tuned model. The first is a data augmentation technique that generates aspect ratio and scale distortions without cropping input frames. The second technique aims to improve the effectiveness of existing hyper-parameter tuning algorithms by reducing the size of the hyper-parameter search space partway through the tuning process. After extensive evaluation using a benchmark violence detection dataset, however, these methods cannot be shown to improve final model accuracy. The final transfer learning and hyper-parameter tuning experiments of this thesis remove these new techniques, and the model achieves reasonably competitive accuracy on three violence detection benchmarks. While convolutional neural networks can achieve high accuracy on laboratory datasets, their complexity makes it difficult for community members to trust them. To further explain the final violence detection model, this work extends the Grad-CAM model interpretation technique to three dimensions and uses it to analyze model inferences. After a review of bias measurement and mitigation research for other applications of neural networks, this thesis also conducts an analysis of skin tone distributions within the data. Such an analysis has not previously been performed for violence detection data, and this thesis identifies a clear need for more skin-tone balanced data in violence detection research. This work concludes by suggesting possible avenues to improve model efficiency, interpretability, and fairness in violence detection research.
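
To illustrate the three-dimensional Grad-CAM extension mentioned in the abstract, the following is a minimal sketch and not the thesis's implementation. It assumes the feature maps of a chosen 3D convolutional layer, and the gradients of the class score with respect to them, have already been extracted as arrays of shape (channels, time, height, width). The function name grad_cam_3d and the toy inputs are hypothetical. As in standard Grad-CAM, each channel weight is the average of its gradients; here the average also runs over the time axis, so the resulting heatmap has one value per frame and spatial position.

```python
import numpy as np


def grad_cam_3d(activations, gradients):
    """Return a (T, H, W) heatmap from (C, T, H, W) activations and gradients.

    Hypothetical helper for illustration; obtaining the two inputs from a real
    network (e.g. via framework hooks) is outside the scope of this sketch.
    """
    activations = np.asarray(activations, dtype=float)
    gradients = np.asarray(gradients, dtype=float)
    if activations.ndim != 4 or activations.shape != gradients.shape:
        raise ValueError("expected two arrays of identical shape (C, T, H, W)")

    # One weight per channel: mean gradient over time and both spatial axes.
    weights = gradients.mean(axis=(1, 2, 3))  # shape (C,)

    # Weighted sum of feature maps over channels, then ReLU to keep only
    # evidence that increases the class score.
    cam = np.maximum(np.tensordot(weights, activations, axes=1), 0.0)

    # Scale to [0, 1] for display; leave an all-zero map untouched.
    peak = cam.max()
    return cam / peak if peak > 0 else cam


# Toy input (assumed values): channel 0 supports the class, channel 1 opposes it.
demo_acts = np.zeros((2, 2, 2, 2))
demo_acts[0, 1, 0, 0] = 3.0
demo_acts[1, 0, 1, 1] = 5.0
demo_grads = np.ones_like(demo_acts)
demo_grads[1] = -1.0

demo_cam = grad_cam_3d(demo_acts, demo_grads)
```

In this toy case only the location activated by the positively weighted channel survives the ReLU, which is the behavior one would inspect when overlaying such a heatmap on video frames.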

Included in

Engineering Commons
