Dr. Xin Gao is an associate professor of computer science at King Abdullah University of Science and Technology (KAUST), Saudi Arabia. He is also a PI in the Computational Bioscience Research Center at KAUST and an adjunct faculty member at the David R. Cheriton School of Computer Science at the University of Waterloo, Canada. Prior to joining KAUST, he was a Lane Fellow at Carnegie Mellon University, U.S. He earned his bachelor's degree in Computer Science in 2004 from Tsinghua University, China, and his Ph.D. in Computer Science in 2009 from the University of Waterloo, Canada.
Dr. Gao’s research lies at the intersection of computer science and biology/biomedicine. His interests include building computational models, developing machine learning techniques, and designing efficient and effective algorithms, with applications to key open problems ranging from sequence analysis, 3D structure determination, and function annotation to systems biology, biomedicine, and healthcare. He has co-authored more than 170 research articles in leading journals and conferences in the fields of bioinformatics and machine learning.
A common problem in bioinformatics is how to extract informative features from biomolecular sequences, such as DNA and proteins, so that the classification or regression models they feed achieve high accuracy. Traditionally, features were handcrafted based on expert knowledge. Advances in data mining and machine learning have enabled systematic, automatic feature extraction. In this talk, I will give a brief overview of successful feature extraction methods in bioinformatics, including string kernels and deep learning. I will then introduce our work that overcomes certain bottlenecks of these methods. I will show one classification application, polyadenylation motif prediction, and one regression application, transcription factor-DNA binding affinity prediction.