Investigator, Howard Hughes Medical Institute
Because of rapid advances in genome sequencing technology, we can finally see the source code for life: the complete genomic DNA sequences that specify development, regulation, and function of organisms. We still don’t really understand how to read this trove of encoded information, and we don’t really understand in any satisfying detail how it evolved.
We are interested in deciphering the evolutionary history of life by comparative analysis of genome sequences. Our research currently focuses on the development of computational methods for RNA, protein, and DNA sequence analysis. We use probabilistic modeling approaches to build statistical models of interesting biological features, in order to identify those features accurately in large-scale genome analyses.
We currently work in three areas. One is RNA structure and function. We are developing methods for RNA homology detection, RNA structure prediction, and identification of conserved RNA structures by comparative genome sequence analysis (including both noncoding RNA genes and cis-regulatory RNA structures). A second area is remote protein homology detection, implemented in our HMMER software package. Finally, we are collaborating with neuroscientists on the analysis of neural cell-type-specific genomic data, aiming to learn more about the evolution and developmental specification of neuronal cell types.