I am building a machine learning model to identify genomic mutations in non-coding regions, which could have severe effects in various rare diseases. Even though non-coding regions make up 98% of the genome, our understanding of non-coding mutations remains very limited. I am using a data science approach to tackle this problem, in the hope that it will accelerate our understanding of the genetic causes of rare diseases.

What is the hardest part about the work you do?
The sheer size of the data. Genomic data sets are large, and we need to be really careful to analyze them efficiently.

Qingbo playing soccer

Which teacher made the biggest impact on your life?
It’s not exactly a teacher, but I’ll share my favorite quote from a statistics textbook: “All models are wrong, but some are useful.” It applies not only to research, but to life in general.

What are some of your talents or interests outside the lab?
I enjoy playing soccer – I manage a soccer team called BosBaka. Let me know if you are interested in joining! I also enjoy poker – it is not simply gambling. It is an exercise in statistics!

What are your hopes for the future?
I hope to see a world where the causal variants for all monogenic rare diseases have been identified, and where we can find ways to cure those diseases. I hope to make a non-zero contribution to our fight against rare diseases.

Profile photo by Anna Olivella