Coalescent modeling for genetic mapping in population-based samples

Body

Dr. Laura Kubatko, Department of Statistics
Rank at time of award: Associate Professor

Abstract

Objectives: Genetic epidemiology is the study of inherited causes of disease in families and in populations, with the goal of localizing genes associated with disease susceptibility. Recent technological advances that generate high-throughput genetic data have revolutionized the field by providing an abundance of data that can be used to map disease-causing mutations. This abundance has in turn created a variety of statistical challenges in developing computationally feasible methods for analyzing such data appropriately.
 
The statistical techniques commonly used in genetic epidemiology can be broadly classified into two types according to the sampling design. Family-based samples, in which each sampled family contains at least one affected individual, are generally analyzed using linkage methods. Population-based samples, composed of either affected (case) individuals and unaffected controls, or cases alone, are generally analyzed using association methods, which seek to identify chromosomal locations at which a particular variant (allele) is found more frequently in case individuals than in controls. The overall objective of this proposal is to combine the expertise of several faculty on The Ohio State University campus to develop new statistical methodology for performing genetic epidemiological studies in population-based samples.
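
As a purely illustrative sketch of the basic association idea described above (and not the coalescent-based methodology this proposal aims to develop), the following Python function compares allele counts between cases and controls at a single location using a standard Pearson chi-square test on a 2x2 table. The function name, its arguments, and the example counts are hypothetical and chosen only for illustration.

```python
import math


def allelic_chi_square(case_alt, case_ref, control_alt, control_ref):
    """Pearson chi-square test (1 degree of freedom) on a 2x2 allele-count table.

    Rows are cases and controls; columns are the variant (alt) and
    reference (ref) alleles. All names here are illustrative assumptions.
    """
    table = [[case_alt, case_ref], [control_alt, control_ref]]
    row_totals = [sum(row) for row in table]
    col_totals = [case_alt + control_alt, case_ref + control_ref]
    total = sum(row_totals)
    if 0 in row_totals or 0 in col_totals:
        raise ValueError("every row and column needs at least one observation")

    # Sum (observed - expected)^2 / expected over the four cells.
    stat = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / total
            stat += (table[i][j] - expected) ** 2 / expected

    # Upper-tail probability of a chi-square variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value


# Hypothetical counts: the variant allele is more common among cases.
stat, p_value = allelic_chi_square(60, 40, 40, 60)
print(f"chi-square = {stat:.3f}, p = {p_value:.4f}")
```

A single-location test of this kind treats sampled individuals as unrelated; accounting for population-level relationships among individuals is the motivation for the coalescent-based modeling pursued in this work.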
 
Two of the co-PIs, Dr. Veronica Vieland and her collaborator Dr. Christopher Bartlett, are housed at the Battelle Center for Mathematical Medicine at Nationwide Children's Hospital with faculty appointments in the Department of Pediatrics at OSU. The third co-PI, Dr. Laura Kubatko, is housed in the Department of Statistics on OSU's main campus. Dr. Vieland has expertise in the development of methodology for genetic mapping, with application to understanding the role that genetic factors play in diseases such as autism, cancer and schizophrenia. Dr. Kubatko has expertise in the development of statistical methods for studying relatedness in population samples under the coalescent model. The co-PIs have met on a couple of occasions over the last year to discuss potential overlap in research interests, but the physical separation of OSU's main campus from Nationwide Children's Hospital has been a significant barrier to the firm establishment of collaborative work. Funding from this initiative will be used to support Dr. Kubatko's presence at the Battelle Center for Mathematical Medicine for a substantial period during the upcoming summer to allow for the necessary level of interaction between the co-PIs.

Significance to Population and Health Research: The goal of this collaboration (described more fully below) is to provide new statistical techniques for the analysis of population-based samples of individuals affected with a particular disease under study. The result will be the development of more powerful methods for the identification of putative genetic regions at which disease-causing mutations might be found. These regions can then be studied with the goal of understanding (and in the long term, treating) the mechanisms that cause the disease. Our methods will be tested using two sets of samples. The first comes from Dr. Bartlett and his colleagues, who are collecting cases and families with autism or other language impairments. These samples will be collected from central Ohio over the next 4 years, with several clinical collection grants pending. The second set of data comes from the database core within the Battelle Center for Mathematical Medicine, which manages the genetic data for the international Autism Genome Project that will ultimately include 3,000-4,000 cases with very dense genetic data across the entire human genome. Use of this test data set will not only allow testing of our developed methodology, but may also result in increased understanding of the genetic basis of autism in Ohio's children.
 

Ending Synopsis/Findings

This funding was used to partially support Statistics Graduate Student Lori Hoffman as a Graduate Research Assistant, and to establish a research collaboration with Dr. Veronica Vieland at the Battelle Center for Mathematical Medicine at Nationwide Children’s Hospital. Dr. Hoffman completed her Ph.D. dissertation, titled “Disease Gene Mapping Under the Coalescent Model”, in Spring 2010. Her work developed new methodology for inferring the genomic location of disease-causing gene variants from large-scale genome-wide association data, using a statistical modeling framework that accounts for population-level relationships among individuals. She found that her method led to improved power in the detection and localization of genes underlying discrete traits (e.g., presence or absence of a disease) when populations were structured. The method showed similar power to existing methods in other cases. Statistics Graduate Student Katie Thompson built on this work, extending the method to continuous traits (e.g., blood pressure) and incorporating models for the covariance between genes and the environment. Although Dr. Thompson’s work was not formally supported by this award, the supported work of Dr. Hoffman was of fundamental importance to its subsequent development. Dr. Thompson published two papers on her work, listed below.

 

Publications resulting from this seed grant:

Thompson, K., C. Linnen, and L. Kubatko. 2016. Tree-based quantitative trait mapping in the presence of external covariates. Statistical Applications in Genetics and Molecular Biology 15(6): 473-490.
 
Thompson, K.L., and L. Kubatko. 2013. Using ancestral information to detect and localize quantitative trait loci in genome-wide association studies. BMC Bioinformatics 14: 200.