Applications of Deep Learning on Protein Mutations
Presentation Type
Poster
Abstract
Recent developments in deep learning have enabled new approaches to important prediction problems in biology. In particular, models can approximate experimental laboratory investigations at a scale that would otherwise be prohibitive in time and cost. In this work, we report on three research threads adapting deep learning methods for applications involving proteins. Our longest-running thread focuses on the use of rigidity analysis to assess protein stability, most recently using multiclass classification to predict the stability change caused by a mutation in a protein, with explicit modeling of experimental uncertainty. This work was recently published at BICOB 2018. The second thread involves accelerating or approximating an exhaustive analysis of in-silico protein mutations. While an exhaustive analysis of pairwise mutations is possible using parallel computing, it is infeasible for higher-order mutations. We are using low-rank matrix factorization techniques to approximate the exhaustive results with dramatically less computation. Our newest thread involves training variational autoencoders on protein sequences to learn a fixed-size latent representation of proteins, which can then be leveraged for a variety of applications (e.g., optimizing protein properties). We are also analyzing the biological significance of our model's errors when translating from the latent space back into a sequence of amino acids.
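The low-rank idea in the second thread can be illustrated with a small sketch. This is not the authors' method or data: it assumes a hypothetical matrix of pairwise mutation scores that is approximately low rank, computes only a random 30% of its entries, and fills in the rest with a generic alternating least squares factorization. The function names (`als_complete`, `demo`), the matrix size, the rank, and the synthetic scores are all assumptions made for illustration.

```python
import numpy as np


def als_complete(M, mask, rank, n_iters=50, reg=1e-3, seed=0):
    """Fit M ~ U @ V.T using only entries where mask is True.

    Generic alternating least squares; a stand-in for whatever
    factorization the poster actually uses (an assumption).
    """
    rng = np.random.default_rng(seed)
    n, m = M.shape
    U = rng.normal(scale=0.1, size=(n, rank))
    V = rng.normal(scale=0.1, size=(m, rank))
    ridge = reg * np.eye(rank)  # small ridge term keeps each solve well posed
    for _ in range(n_iters):
        # Update each row factor from the observed entries in that row.
        for i in range(n):
            cols = mask[i]
            Vo = V[cols]
            U[i] = np.linalg.solve(Vo.T @ Vo + ridge, Vo.T @ M[i, cols])
        # Update each column factor from the observed entries in that column.
        for j in range(m):
            rows = mask[:, j]
            Uo = U[rows]
            V[j] = np.linalg.solve(Uo.T @ Uo + ridge, Uo.T @ M[rows, j])
    return U @ V.T


def demo(n=60, rank=3, observed_fraction=0.3, seed=1):
    """Return the relative error on entries that were never computed."""
    rng = np.random.default_rng(seed)
    # Synthetic stand-in for an exhaustive pairwise score matrix (hypothetical).
    A = rng.normal(size=(n, rank))
    B = rng.normal(size=(n, rank))
    truth = A @ B.T
    # Pretend only a random subset of mutation pairs was actually analyzed.
    mask = rng.random((n, n)) < observed_fraction
    estimate = als_complete(truth, mask, rank)
    hidden = ~mask
    return np.linalg.norm((estimate - truth)[hidden]) / np.linalg.norm(truth[hidden])


if __name__ == "__main__":
    print(f"relative error on unobserved pairs: {demo():.4f}")
```

In this toy setting the computational saving comes from evaluating only the masked subset of pairs. Whether real mutation data is close enough to low rank for this to work is an empirical question the sketch does not address.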
Start Date
10-5-2018 12:00 PM
End Date
10-5-2018 2:00 PM
Genre/Form
posters
Subjects - Topical (LCSH)
Proteins--Analysis; Mutation (Biology)
Type
Event
Format
application/pdf
Language
English