Date of Award

Summer 2006

Degree Type

Thesis - Restricted

Degree Name

Master of Science (MS)



First Advisor

Struble, Craig A.

Second Advisor

Sugg, Sonia L.

Third Advisor

Madiraju, Praveen


Each cell in a body contains thousands of genes encoded in DNA. At any given time, a cell expresses only a subset of these genes as RNA transcripts. Gene expression is a measure of the genes transcribed into RNA. Gene expression can be investigated under various experimental conditions, such as different tissues (e.g., normal vs. diseased tissue), developmental stages (e.g., early vs. late development), drug responses (e.g., with vs. without drug treatment), or disease states (e.g., breast cancer vs. prostate cancer). Studies of gene expression therefore address a variety of biological questions. Our research investigates an interactive tool for analyzing data from gene expression studies. Several technologies are used to measure gene expression; among them, microarray technology is widely used. A microarray, also called a gene chip or DNA chip, is a glass slide or nylon membrane on which genes are spotted in arrays. Each spot contains tens of millions of DNA molecules. During a microarray experiment, genes are labeled with fluorescent dye and the array is scanned as an image. By analyzing this image, researchers can measure the expression levels of thousands of genes simultaneously. Microarray experiments provide large amounts of data for biological researchers. To obtain significant information from these large data sets, an appropriate method or tool is needed. Several methods and tools exist for microarray data analysis, such as classification systems and clustering methods. In contrast with those methods, online analytical processing (OLAP) allows users to explore data interactively. Thus, we focus on building an OLAP tool for gene expression analysis. OLAP techniques are well suited to this task because they support multidimensional views of the data and multiple hierarchies within each dimension. Moreover, they support three main operations: roll up, drill down, and pivot. 
The combination of these three operations allows users to move between summarized and detailed data on any dimension and presents the results as a two-dimensional chart. We built a database by utilizing the hierarchies of gene products provided by the Gene Ontology (GO) Consortium. The data sets used in this project came from microarray experiments in the breast cancer and prostate cancer categories of the Stanford Microarray Database (SMD). The project is built on the Mondrian OLAP server, which is written in Java, with data stored in a local relational database (PostgreSQL) and the application deployed on a local Tomcat web server. The resulting system suggests that OLAP technology provides an interactive and fast way to analyze gene expression.
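
The three OLAP operations named in the abstract can be illustrated with a minimal Python sketch. It is not the thesis implementation and does not use Mondrian; the gene names, expression values, the simplified two-level GO hierarchy, and the helper names `aggregate` and `pivot` are all hypothetical, chosen only to show how rolling up, drilling down, and pivoting change the view of the same fact table.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical fact rows (not from SMD):
# (gene, GO term, parent GO category, condition, expression level)
FACTS = [
    ("geneA", "kinase activity", "molecular function", "breast", 2.0),
    ("geneB", "kinase activity", "molecular function", "breast", 4.0),
    ("geneC", "DNA binding", "molecular function", "breast", 1.0),
    ("geneA", "kinase activity", "molecular function", "prostate", 1.0),
    ("geneB", "kinase activity", "molecular function", "prostate", 3.0),
    ("geneC", "DNA binding", "molecular function", "prostate", 5.0),
]

# Position of each dimension level within a fact row.
COLUMNS = {"gene": 0, "go_term": 1, "go_category": 2, "condition": 3}


def aggregate(facts, row_dim, col_dim):
    """Return the mean expression for each (row_dim, col_dim) cell."""
    cells = defaultdict(list)
    for fact in facts:
        key = (fact[COLUMNS[row_dim]], fact[COLUMNS[col_dim]])
        cells[key].append(fact[4])
    return {key: mean(values) for key, values in cells.items()}


def pivot(table):
    """Swap the row and column axes of a two-dimensional table."""
    return {(col, row): value for (row, col), value in table.items()}


# Roll up: summarize at the coarse GO category level.
rolled_up = aggregate(FACTS, "go_category", "condition")

# Drill down: view the same data at the finer GO term level.
drilled_down = aggregate(FACTS, "go_term", "condition")

# Pivot: put conditions on the rows and GO terms on the columns.
pivoted = pivot(drilled_down)

for (row, col), value in sorted(pivoted.items()):
    print(f"{row:10s} {col:18s} {value:.2f}")
```

In an OLAP server these views would typically be produced by queries against a predefined cube rather than by in-memory aggregation; the sketch only shows the effect of each operation on a small table.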