Evolution of protein structure function
[Abstract] Proteins are the product of millions of evolutionary experiments carried out over a billion years. These experiments are constrained by protein structure and function. Mining genomic data is a natural way to uncover these constraints and address long-standing biological challenges.
In this dissertation, we first addressed two specific biological questions by reconstructing evolutionary pathways. We simulated cyclin-dependent kinase 2 (CDK2) and its experimentally validated cyclin-independent ancestor to understand the atomistic basis of cyclin dependence in a kinase family. A set of conformational differences, and the residues responsible for them, were identified as the origin of the different activation processes of CDK2 and its ancestor. The second question was how the "wonder drug of the century", Imatinib (Gleevec), achieves high selectivity for a particular kinase involved in chronic myelogenous leukemia. We investigated this question by simulating two modern kinases, a strong binder (Abl) and a weak binder (Src), together with five of their common ancestors to reconstruct the evolutionary pathway between them.
In a second effort to use evolutionary information, we combined large-scale genomic sequence data with machine learning techniques to address one of the biggest challenges facing molecular dynamics (MD) simulations. MD simulations have been shown to be valuable tools for studying proteins at atomic resolution. In practice, however, running these simulations is computationally expensive because of the long timescales associated with biologically relevant conformational changes. Here, we developed a reinforcement learning-based sampling algorithm that enhances MD simulation by prioritizing the most important reaction coordinates at each stage. Testing the proposed algorithm on multiple case studies showed a significant improvement over other unbiased sampling techniques. Furthermore, we showed that the distances between evolutionarily coupled residues can serve as a natural and effective set of reaction coordinates for reinforcement learning-based adaptive sampling.
Finally, we went one step further and applied transfer learning to a genomics-based model to predict the effects of mutations more efficiently. We evaluated the effectiveness of the proposed transfer learning techniques in three different cases.
[Keywords] Cancer, Kinase, Imatinib, Molecular Dynamics Simulations, Computational Biology, Reinforcement Learning, Transfer Learning, Evolutionary Coupling, Evolution, Proteomics