Multiple-imputation approaches to haplotypic analysis of population-based data with applications to cardiovascular disease

Pamela McCaskie

Research output: Thesis (Doctoral Thesis)


[Truncated abstract] This thesis investigates novel methods for the genetic association analysis of haplotype data in samples of unrelated individuals, and applies these methods to the analysis of coronary heart disease and related phenotypes. Determining the inheritance pattern of genetic variants in studies of unrelated individuals can be problematic because family members of the studied individuals are often not available. For the analysis of individual genetic loci, no problem arises because the unit of interest is the observed genotype. When the unit of interest is the linear combination of alleles along one chromosome, inherited together in a haplotype, it is not always possible to determine the inheritance pattern with certainty, and statistical methods to infer these patterns must therefore be adopted. Due to genotypic heterozygosity, multiple possible haplotype configurations can often resolve an individual's genotype measures at multiple loci. When haplotypes are not known but are inferred statistically, an element of uncertainty is thus inherent which, if not dealt with appropriately, can result in unreliable estimates of effect sizes in an association setting. The core aim of the research described in this thesis was to develop and implement a general method for haplotype-based association analysis using multiple imputation to appropriately deal with uncertainty in haplotype assignment. Regression-based approaches to association analysis provide flexible methods to investigate the influence of a covariate on a response variable, adjusting for the effects of other variables including interaction terms... These methods are then applied to models accommodating binary, quantitative, longitudinal and survival data. The performance of the multiple imputation method implemented was assessed using simulated data under a range of haplotypic effect sizes and genetic inheritance patterns. 
The multiple imputation approach performed better, on average, than ignoring haplotypic uncertainty, and provided estimates that in most cases were similar to those observed when haplotypes were known. The haplotype association methods developed in this thesis were used to investigate the genetic epidemiology of cardiovascular disease, utilising data for the cholesteryl ester transfer protein (CETP) gene, the hepatic lipase (LIPC) gene and the 15-lipoxygenase (ALOX15) gene on a total of 6,487 individuals from three Western Australian studies. Results of these analyses suggested that single nucleotide polymorphisms (SNPs) and haplotypes in the CETP gene were associated with increased plasma high-density lipoprotein cholesterol (HDL-C). SNPs in the LIPC gene were also associated with increased HDL-C, and haplotypes in the ALOX15 gene were associated with risk of carotid plaque among individuals with premature CHD. The research presented in this thesis is both novel and important as it provides methods for the analysis of haplotypic associations with a range of response types, while incorporating information about the haplotype uncertainty inherent in population-based studies. These methods are shown to perform well for a range of simulated and real data situations, and have been written into a statistical analysis package that has been freely released to the research community.
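
The two ideas at the centre of the abstract, haplotype ambiguity and multiple imputation, can be illustrated with a minimal Python sketch. It is not the thesis's implementation or its released package. It assumes biallelic SNPs coded as minor-allele counts (0, 1 or 2), and it assumes that per-imputation estimates are pooled with the standard Rubin's rules; the abstract does not state which pooling rule was used. The function names `compatible_haplotype_pairs` and `pool_rubin` are hypothetical.

```python
from itertools import product
from statistics import mean, variance


def compatible_haplotype_pairs(genotype):
    """List the unordered haplotype pairs consistent with an unphased genotype.

    `genotype` is a sequence of minor-allele counts (0, 1 or 2), one per SNP.
    This coding is an assumption made for illustration only.
    """
    per_site = []
    for count in genotype:
        if count == 0:
            per_site.append([(0, 0)])          # both chromosomes carry allele 0
        elif count == 2:
            per_site.append([(1, 1)])          # both chromosomes carry allele 1
        elif count == 1:
            per_site.append([(0, 1), (1, 0)])  # heterozygous: phase is unknown
        else:
            raise ValueError("allele counts must be 0, 1 or 2")

    pairs = set()
    for combo in product(*per_site):
        first = tuple(a for a, _ in combo)
        second = tuple(b for _, b in combo)
        pairs.add(tuple(sorted((first, second))))  # ignore chromosome order
    return sorted(pairs)


def pool_rubin(estimates, variances):
    """Pool per-imputation estimates with Rubin's rules (assumed pooling rule).

    Returns the pooled estimate and its total variance, which combines the
    average within-imputation variance with the between-imputation variance.
    """
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need at least two imputations with matching lengths")
    pooled = mean(estimates)
    within = mean(variances)
    between = variance(estimates)  # sample variance, denominator m - 1
    total = within + (1 + 1 / m) * between
    return pooled, total


if __name__ == "__main__":
    # Heterozygous at both SNPs: two haplotype configurations fit the data.
    print(compatible_haplotype_pairs((1, 1)))
    # Illustrative numbers only, not results from the thesis.
    print(pool_rubin([1.0, 2.0, 3.0], [0.5, 0.5, 0.5]))
```

In this sketch an individual heterozygous at k SNPs (k ≥ 1) has 2^(k-1) compatible haplotype pairs, which is the ambiguity the abstract describes. The between-imputation term in `pool_rubin` is what carries that ambiguity into the standard error of the final estimate, rather than treating a single inferred haplotype assignment as if it were observed.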
Original language: English
Qualification: Doctor of Philosophy
Publication status: Unpublished - 2008

