BIO
I have lived in China, South Africa, Canada, and the US, where I completed my undergraduate studies at Cornell University in 2018. I majored in computer science and computational biology with a focus on machine learning and statistical genetics. At Cornell, I worked with Dr. Alon Keinan and Dr. Kaixiong Ye on a human population genetics study to elucidate the evolution of human dietary adaptation in Europe and India. I first became interested in RNA biology through an internship at Regeneron Pharmaceuticals, where I worked with machine learning tools to analyze B cell class switching and T cell receptor diversity in single-cell data.
I started graduate school at Penn in 2018 and joined the Barash Lab in 2019. I am broadly interested in computational and statistical modeling of RNA and in developing bioinformatics tools that will benefit the RNA community at large. I am currently working on non-parametric Bayesian approaches for modeling heterogeneous cancer data, and I plan to eventually develop statistical methods to improve RNA experimental design. I am also interested in developing machine learning methods to facilitate single-cell RNA analysis.
Topics
• Statistical Machine Learning
• Non-parametric and Variational Bayes
• Probabilistic Graphical Models
• Computational Modeling of RNA
• Single Cell Biology
Publications
Ye, K., Gao, F., Wang, D., Bar-Yosef, O., & Keinan, A. (2017). Dietary adaptation of FADS genes in Europe varied across time and geography. Nature Ecology & Evolution, 1, 167.
Bai, Y., Wang, D., Li, W., Huang, Y., Ye, X., Waite, J., ... & Skokos, D. (2018). Evaluation of the capacities of mouse TCR profiling from short read RNA-seq data. PLoS ONE, 13(11).
Bai, Y., Wang, D., & Fury, W. (2018). PHLAT: Inference of High-Resolution HLA Types from RNA and Whole Exome Sequencing. Methods in molecular biology (Clifton, N.J.), 1802, 193–201.