Probabilistic Modeling in Population Genetics and Genomics
Sneha Shadija
Undergraduate Researcher
Genetics Major (College of Arts & Sciences)
Olivia Morelli
Undergraduate Researcher
Finance Major (Kelley School of Business)
Wai-Tong (Louis) Fan
Faculty Mentor
Wai-Tong (Louis) Fan (College of Arts & Sciences)
Project Description
Probabilistic models are powerful mathematical tools for analyzing modern genetic and genomic data, and they are rooted in the long history of population genetics. The Kingman coalescent, for instance, arises as a universal (and thus important) object that describes the genetic ancestry of a sample of DNA sequences. Mathematically, the Kingman coalescent is the scaling limit of the genealogy of a sample under the Wright-Fisher model, the Moran model, and many other population models. Such scaling-limit results are analogous to the central limit theorem, with the Kingman coalescent playing the role of the Gaussian distribution.
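To make the Kingman coalescent concrete, here is a minimal Python sketch (not the project's own code). It assumes the standard description in coalescent time units: while k lineages remain, the waiting time until the next pair merges is exponential with rate k(k-1)/2. The function names, sample size, and number of replicates are illustrative choices.

```python
import random


def kingman_coalescent_times(n, rng):
    """Waiting times between successive mergers for a sample of n lineages.

    While k lineages remain, the time to the next merger is exponential
    with rate k*(k-1)/2 (time measured in coalescent units).
    """
    times = []
    for k in range(n, 1, -1):
        rate = k * (k - 1) / 2
        times.append(rng.expovariate(rate))
    return times


def expected_tmrca(n):
    """Theoretical mean time to the most recent common ancestor: 2*(1 - 1/n)."""
    return 2.0 * (1.0 - 1.0 / n)


# Illustrative parameters (assumptions, not taken from the project).
rng = random.Random(0)
n = 10
reps = 20000

# Monte Carlo estimate of the mean tree height (sum of all waiting times).
mean_tmrca = sum(sum(kingman_coalescent_times(n, rng)) for _ in range(reps)) / reps

print("simulated mean TMRCA:", round(mean_tmrca, 3))
print("theoretical mean TMRCA:", expected_tmrca(n))
```

The simulated mean tree height should land close to the theoretical value 2(1 - 1/n), which gives a quick sanity check on the simulation.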
Technology or Computational Component
This project involves mathematical modeling, computer simulation, and statistical analysis of important questions in population genetics and genomics. We will start from classical examples (such as the Wright-Fisher model and the Moran model), simulate them on a computer, and gradually build more realistic models to test important scientific hypotheses. Students will run computer simulations of various Markov chains to visualize and illustrate the system under study. They will also learn how to ask questions and formulate conjectures, which are important skills in research.
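As an illustration of the kind of Markov chain simulation described above, the following Python sketch simulates a neutral, haploid, two-allele Wright-Fisher model with a fixed population size. These modeling choices, the function names, and all parameter values are assumptions made for the example and are not the project's actual code. Each generation, every offspring copies the type of a parent chosen uniformly at random, so the allele count in the next generation is binomial.

```python
import random


def wright_fisher_step(count, pop_size, rng):
    """One generation: each of pop_size offspring picks a parent uniformly at random.

    The new allele count is Binomial(pop_size, count / pop_size).
    """
    p = count / pop_size
    return sum(1 for _ in range(pop_size) if rng.random() < p)


def simulate_wright_fisher(initial_count, pop_size, max_generations, rng):
    """Return the allele-count trajectory until loss, fixation, or max_generations."""
    trajectory = [initial_count]
    count = initial_count
    for _ in range(max_generations):
        if count == 0 or count == pop_size:
            break  # absorbing states: the allele is lost or fixed
        count = wright_fisher_step(count, pop_size, rng)
        trajectory.append(count)
    return trajectory


# Illustrative parameters (assumptions, not taken from the project).
rng = random.Random(1)
pop_size = 50
initial_count = 10
reps = 2000

# Estimate the fixation probability by repeated simulation.
fixed = 0
for _ in range(reps):
    traj = simulate_wright_fisher(initial_count, pop_size, 10_000, rng)
    if traj[-1] == pop_size:
        fixed += 1
fixation_fraction = fixed / reps

print("simulated fixation fraction:", fixation_fraction)
print("initial allele frequency:", initial_count / pop_size)
```

Under neutrality the allele frequency is a martingale, so the fixation probability equals the initial frequency. Comparing the two printed numbers is one simple way to turn a simulation into a checkable conjecture.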