Date of Award
Fall 2003
Document Type
Thesis - Restricted
Degree Name
Master of Science (MS)
Department
Mathematics, Statistics and Computer Science
First Advisor
Struble, Craig A.
Second Advisor
Merrill, Stephen J.
Third Advisor
Chen, Chin-Fu
Abstract
The Human Genome Project has reached completion, and many genes and expressed sequence tags have been identified. However, the function, expression, and regulation of more than 80% of the genes have yet to be explored. Such exploration is best done systematically. The genome, representing the complete blueprint of the organism, is the natural bounded system in which to conduct this exploration. DNA microarrays provide a natural means for exploring the genome in a way that is both systematic and comprehensive. The function of a gene can be explored by determining its pattern of expression. The set of genes expressed in a cell determines what the cell is made of and which biochemical and regulatory systems are operative. As we learn to infer the biological consequences of specific features of gene expression patterns, we can use microarrays to see a comprehensive, dynamic molecular picture of the living cell. Underlying microarray experiments is the notion that analyzing the response of a system to a given perturbation can shed light on the mechanism of signaling or of the biological response to the perturbation, or both, at the gene expression level. The complexity of microarray data poses new challenges for data mining in identifying and validating patterns that are biologically relevant. Singular value decomposition (SVD) is one approach for analyzing gene expression data. We use an integrative approach and investigate the claim that SVD elucidates patterns representing biological processes by annotating these patterns with biological process terms contained in the Gene Ontology (GO) database, a dynamic, controlled vocabulary that can be applied to all eukaryotes. We present a procedure that uses statistical measures to classify genes involved in distinct regulatory biological processes that are statistically significant and biologically interpretable from a systems perspective.
Our approach paves the way toward understanding regulatory and other complex biological processes from the molecular level to the systems level.
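
As a rough illustration of the kind of analysis the abstract describes, the sketch below applies SVD to a small gene expression matrix and lists the genes that load most strongly on the leading pattern. It is not the thesis's procedure, which this record does not detail. The function names `svd_patterns` and `top_genes`, the row-centering step, and the synthetic data are all assumptions made for illustration, and the GO annotation step is omitted.

```python
import numpy as np

def svd_patterns(expression):
    """Decompose a genes x arrays matrix into singular patterns (illustrative helper)."""
    # Center each gene (row) so patterns describe variation around its mean.
    centered = expression - expression.mean(axis=1, keepdims=True)
    # u: gene loadings, s: singular values, vt: patterns across arrays.
    u, s, vt = np.linalg.svd(centered, full_matrices=False)
    # Fraction of the total squared signal captured by each pattern.
    explained = s**2 / np.sum(s**2)
    return u, s, vt, explained

def top_genes(u, component, k):
    """Indices of the k genes with the largest absolute loading on one component."""
    return np.argsort(-np.abs(u[:, component]))[:k]

# Synthetic data (hypothetical, not from the thesis): 50 genes, 8 arrays.
rng = np.random.default_rng(0)
data = rng.normal(scale=0.1, size=(50, 8))
signal = np.sin(np.linspace(0, 2 * np.pi, 8))
data[:10] += 3.0 * signal  # the first 10 genes share one strong pattern

u, s, vt, explained = svd_patterns(data)
leading = top_genes(u, component=0, k=10)
```

In a workflow like the one the abstract outlines, the gene set in `leading` would then be compared against GO biological process annotations to check whether the pattern corresponds to a known process.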
Recommended Citation
Li, Peigang, "Identifying Biological Processes in Microarray Data Using Singular Value Decomposition and the Gene Ontology Database" (2003). Master's Theses (1922-2009) Access restricted to Marquette Campus. 2127.
https://epublications.marquette.edu/theses/2127