Date of Award
Spring 2023
Document Type
Thesis
Degree Name
Master of Science (MS)
Department
Electrical and Computer Engineering
First Advisor
Yaz, Edwin
Second Advisor
Schneider, Susan
Third Advisor
Medeiros, Henry
Abstract
Many communities have installed surveillance cameras in an effort to deter and respond to violence. Due to the difficulty of constantly monitoring such camera feeds, these systems are rarely used to provide real-time information. To enable rapid alerts and information for first responders, this thesis develops a proof-of-concept system capable of automatically detecting violence from video footage. This system is developed by fine-tuning a convolutional neural network that has previously demonstrated success on general action recognition tasks. This thesis explores two new techniques to improve the accuracy of the fine-tuned model. The first is a data augmentation technique that generates aspect ratio and scale distortions without cropping input frames. The second technique aims to improve the effectiveness of existing hyper-parameter tuning algorithms by reducing the size of the hyper-parameter search space partway through the tuning process. After extensive evaluation using a benchmark violence detection dataset, however, these methods cannot be shown to improve final model accuracy. The final transfer learning and hyper-parameter tuning experiments of this thesis remove these new techniques, and the model achieves reasonably competitive accuracy on three violence detection benchmarks.
While convolutional neural networks can achieve high accuracy on laboratory datasets, their complexity makes it difficult for community members to trust them. To further explain the final violence detection model, this work extends the Grad-CAM model interpretation technique to three dimensions and uses it to analyze model inferences. After a review of bias measurement and mitigation research for other applications of neural networks, this thesis also conducts an analysis of skin tone distributions within the data. Such an analysis has not previously been performed for violence detection data, and this thesis identifies a clear need for more skin-tone balanced data in violence detection research. This work concludes by suggesting possible avenues to improve model efficiency, interpretability, and fairness in violence detection research.