Date of Award

Summer 2022

Document Type


Degree Name

Doctor of Philosophy (PhD)


Mathematics, Statistics and Computer Science


Computational Mathematical and Statistical Sciences

First Advisor

Ahamed, Sheikh Iqbal

Second Advisor

Bansal, Naveen

Third Advisor

Maadooliat, Mehdi


Temporal sentiment labels are used in various multimedia studies. They are useful for numerous classification and detection tasks such as video tagging, segmentation, and labeling. However, generating a large-scale sentiment dataset through manual labeling is usually expensive and challenging. Some recent studies have explored the possibility of using online Time-Sync Comments (TSCs) as the primary source of their sentiment maps. Although this approach has shown positive results, existing TSC datasets are limited in scale and content categories, and guidelines for generating such data within a constrained budget have yet to be developed and discussed. This dissertation addresses these issues by leveraging existing live comments from a popular video distribution platform, YouTube, as a primary time-synchronized data source and by exploring efficient strategies for generating TSCs within a constrained budget. An automatic data mining system was first developed and deployed across multiple platforms. Then, long-period experiments were conducted to test the efficiency of the framework. Additionally, two large-scale TSC datasets were created with the proposed framework and analyzed for their characteristics. Finally, the outcomes were tested against the original temporal Automatic Speech Recognition (ASR) sentiment labeling to validate their accuracy. The experiments show the potential of automatically generating temporal sentiment datasets through the proposed mapping system. This project also provides valuable tools for future multimedia research.

Included in

Mathematics Commons