Date of Award

Summer 2005

Degree Type

Thesis - Restricted

Degree Name

Master of Science (MS)

Department

Mathematics, Statistics and Computer Science

First Advisor

Olivier, Michael

Second Advisor

Marth, Gabor T.

Third Advisor

Struble, Craig A.

Abstract

Several classes of haplotype block partitioning algorithms exist. The ideal dataset for assessing their performance would come from comprehensively re-sequencing entire sets of chromosomes from a large population of unrelated individuals, but such a dataset is prohibitively expensive to collect. As an alternative, we tested the performance of block algorithms on a large number of chromosomes with high marker density generated from coalescent simulations. Block partitions resulting from diversity-based, LD-based, and information-theoretic algorithms differed in the number and size of blocks and in the percentage of sequence contained in blocks under a variety of marker spacing and allele frequency conditions. Only a handful of partitions contained matching block boundaries, but the partitions still overlapped substantially in the regions inferred to be in blocks. A single, gold-standard block definition is difficult to achieve, but subjecting data sets to a variety of block partitioning algorithms and determining the intersecting block regions may be the best way to find genomic regions of high LD and reduced haplotype diversity.
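
The idea of intersecting block regions across algorithms can be illustrated with a minimal sketch. It is not the thesis's implementation: it assumes each algorithm reports its blocks as non-overlapping (start, end) pairs of marker positions with inclusive ends, and the names `intersect_two` and `consensus_blocks` and the example coordinates are hypothetical.

```python
from typing import List, Tuple

# Assumption: a block is an inclusive (start, end) pair of marker positions.
Block = Tuple[int, int]


def intersect_two(a: List[Block], b: List[Block]) -> List[Block]:
    """Intersect two sorted lists of non-overlapping blocks."""
    result: List[Block] = []
    i = j = 0
    while i < len(a) and j < len(b):
        start = max(a[i][0], b[j][0])
        end = min(a[i][1], b[j][1])
        if start <= end:
            # The two blocks share the region [start, end].
            result.append((start, end))
        # Advance whichever block ends first.
        if a[i][1] < b[j][1]:
            i += 1
        else:
            j += 1
    return result


def consensus_blocks(partitions: List[List[Block]]) -> List[Block]:
    """Regions placed inside a block by every partition supplied."""
    if not partitions:
        return []
    common = sorted(partitions[0])
    for partition in partitions[1:]:
        common = intersect_two(common, sorted(partition))
    return common


# Made-up block calls from three hypothetical algorithms.
diversity_blocks = [(0, 10), (15, 30)]
ld_blocks = [(5, 20), (25, 40)]
info_blocks = [(0, 8), (16, 28)]

print(consensus_blocks([diversity_blocks, ld_blocks, info_blocks]))
```

In this made-up example no two partitions share a block boundary, yet all three agree on parts of the sequence, which mirrors the pattern of mismatched boundaries with large overlaps described in the abstract.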
