Date of Award
Doctor of Philosophy (PhD)
Electrical and Computer Engineering
In this dissertation, we explore the advantages and limitations of applying sequential Monte Carlo methods to visual tracking, a challenging computer vision problem. We propose six visual tracking models, each of which integrates a particle filter, a deep convolutional neural network (CNN), and a correlation filter. In our first model, we generate an image patch corresponding to each particle and use a CNN to extract features from that patch. A correlation filter then computes the correlation response maps for these features, which are used to determine the particle weights and estimate the state of the target. We then introduce a particle filter that extends the target state by incorporating its size information. This model also uses a new adaptive correlation filtering approach that generates multiple target models to account for potential model update errors. We build upon that strategy to devise an adaptive particle filter that decreases the number of particles in simple frames, in which there are no challenging scenarios and the target model closely reflects the current appearance of the target. This strategy allows us to reduce the computational cost of the particle filter without negatively impacting its performance. This tracker also improves the likelihood model by generating multiple target models using varying model update rates based on the high-likelihood particles. We also propose a novel likelihood particle filter for CNN-correlation visual trackers. Our method uses correlation response maps to estimate likelihood distributions and employs these likelihoods as proposal densities to sample particles. Additionally, our particle filter searches for multiple modes in the likelihood distribution using a Gaussian mixture model.
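As a rough illustration of how correlation responses can determine particle weights and a target state estimate, the following minimal sketch normalizes one peak response per particle and takes the weighted mean of the particle states. It is not the dissertation's implementation: the function names, the use of a single peak value per particle, and the weighted-mean estimator are all assumptions made for illustration.

```python
def weights_from_responses(peak_responses):
    """Normalize non-negative correlation peaks into particle weights.

    Assumption: each particle is summarized by the peak value of its
    correlation response map. Falls back to uniform weights if all peaks
    are zero.
    """
    total = sum(peak_responses)
    n = len(peak_responses)
    if total <= 0:
        return [1.0 / n] * n
    return [r / total for r in peak_responses]


def estimate_state(particles, weights):
    """Weighted mean of particle states (each state is a tuple of floats)."""
    dim = len(particles[0])
    return tuple(
        sum(w * p[d] for p, w in zip(particles, weights)) for d in range(dim)
    )


# Two hypothetical particles with (x, y) states and their peak responses.
particles = [(0.0, 0.0), (4.0, 8.0)]
weights = weights_from_responses([1.0, 3.0])
state = estimate_state(particles, weights)
```

A tracker that also models target size, as in the second model described above, could extend each state tuple with width and height without changing these two functions.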
We further introduce an iterative particle filter that repeatedly decreases the distance between the particles and the peaks of their correlation maps, leaving a small number of more accurate particles at the end of the iterations. K-means clustering is then applied to the remaining particles to determine the number of clusters, which is used in the evaluation step to find the target state. Our approach ensures consistent support for the posterior distribution. Thus, we do not need to perform resampling at every video frame, which improves the utilization of prior distribution information. Finally, we introduce a novel framework that calculates the confidence score of the tracking algorithm at each video frame based on the correlation response maps of the particles. Our framework applies different model update rules according to the calculated confidence score, reducing tracking failures caused by model drift. The benefits of each of the proposed techniques are demonstrated through experiments using publicly available benchmark datasets.
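The confidence-driven model update described above can be sketched as follows. This is only an illustrative sketch under stated assumptions: the confidence measure (the highest peak over all particles' response maps), the thresholds `low` and `high`, the base rate, and the linear-interpolation update are hypothetical choices, not the rules used in the dissertation.

```python
def confidence_score(response_maps):
    """Hypothetical confidence: highest peak over all particles' 2-D maps."""
    return max(max(max(row) for row in m) for m in response_maps)


def update_rate(confidence, low=0.2, high=0.5, base_rate=0.01):
    """Pick a model update rate from the confidence (thresholds are assumed)."""
    if confidence < low:
        return 0.0               # low confidence: freeze the model to limit drift
    if confidence < high:
        return base_rate * 0.5   # medium confidence: update cautiously
    return base_rate             # high confidence: normal update


def update_model(model, observation, rate):
    """Blend the current target model with the new observation."""
    return [(1.0 - rate) * m + rate * o for m, o in zip(model, observation)]


# Two hypothetical response maps, one per particle.
maps = [[[0.1, 0.3], [0.2, 0.0]], [[0.6, 0.1]]]
rate = update_rate(confidence_score(maps))
new_model = update_model([1.0, 2.0], [3.0, 4.0], rate)
```

Setting the rate to zero at low confidence keeps an unreliable frame, for example one with occlusion, from contaminating the target model.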