Applications of Deep Learning on Protein Mutations
Research Mentor(s)
Hutchinson, Brian
Description
Recent developments in deep learning have enabled new approaches to important prediction problems in biology. In particular, models can approximate experimental laboratory investigations at a scale that would otherwise be prohibitive in time and cost. In this work, we report on three research threads that adapt deep learning methods to applications involving proteins. Our longest-running thread focuses on the use of rigidity analysis to assess protein stability, most recently using multiclass classification to predict the stability change caused by a mutation in a protein, with explicit modeling of experimental uncertainty. This work was recently published at BICOB 2018. The second thread involves accelerating or approximating an exhaustive analysis of in-silico protein mutations. While an exhaustive analysis of pairwise mutations is possible using parallel computing, it is infeasible for higher-order mutations. We are using low-rank matrix factorization techniques to approximate the exhaustive results with dramatically less computation. Our newest thread involves training variational autoencoders on protein sequences to learn a fixed-size latent representation of proteins, which can then be leveraged for a variety of applications (e.g., optimizing protein properties). We are also analyzing the biological significance of our model's errors when translating from the latent space back into a sequence of amino acids.
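As a rough illustration of the second thread, the sketch below shows one way a low-rank factorization could recover a full table of pairwise-mutation scores from only a subset of computed entries. The description does not say which factorization technique the project uses, so everything here is an assumption: the alternating least squares method, the function names `als_complete` and `demo`, the synthetic score matrix, and the choice to observe half of the entries are all hypothetical.

```python
import numpy as np


def als_complete(observed, mask, rank=3, n_iters=50, reg=1e-3, seed=0):
    """Fill in a partially observed matrix with a rank-`rank` model U @ V.T.

    Hypothetical helper, not the project's method: alternating least
    squares fitted only to entries where `mask` is True.
    """
    rng = np.random.default_rng(seed)
    n_rows, n_cols = observed.shape
    U = rng.normal(scale=0.1, size=(n_rows, rank))
    V = rng.normal(scale=0.1, size=(n_cols, rank))
    ridge = reg * np.eye(rank)  # small ridge term keeps each solve well posed
    for _ in range(n_iters):
        # Update each row factor using only that row's observed columns.
        for i in range(n_rows):
            cols = mask[i]
            Vi = V[cols]
            U[i] = np.linalg.solve(Vi.T @ Vi + ridge, Vi.T @ observed[i, cols])
        # Update each column factor using only that column's observed rows.
        for j in range(n_cols):
            rows = mask[:, j]
            Uj = U[rows]
            V[j] = np.linalg.solve(Uj.T @ Uj + ridge, Uj.T @ observed[rows, j])
    return U @ V.T


def demo(seed=0):
    """Return the relative error on entries that were never 'computed'."""
    rng = np.random.default_rng(seed)
    n, true_rank = 60, 3
    # Synthetic stand-in for an exhaustive pairwise-mutation score table
    # (assumed to be exactly low rank, which real data may not be).
    full = rng.normal(size=(n, true_rank)) @ rng.normal(size=(true_rank, n))
    mask = rng.random((n, n)) < 0.5  # pretend only ~half the pairs were analyzed
    approx = als_complete(np.where(mask, full, 0.0), mask, rank=true_rank)
    held_out = ~mask
    return np.linalg.norm((approx - full)[held_out]) / np.linalg.norm(full[held_out])


if __name__ == "__main__":
    print(f"relative error on held-out entries: {demo():.4f}")
```

The saving in such a scheme would come from running the expensive analysis only on the masked-in entries. How well it works on real mutation data depends on how close the true score table is to low rank, which this synthetic example simply assumes.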
Document Type
Event
Start Date
17-5-2018 12:00 AM
End Date
17-5-2018 12:00 AM
Department
Computer Science
Genre/Form
student projects, posters
Subjects – Topical (LCSH)
Proteins--Structure; Proteins--Biotechnology; Machine Learning
Type
Image
Rights
Copying of this document in whole or in part is allowable only for scholarly purposes. It is understood, however, that any copying or publication of this document for commercial purposes, or for financial gain, shall not be allowed without the author’s written permission.
Language
English
Format
application/pdf