Applications of Deep Learning on Protein Mutations

Research Mentor(s)

Hutchinson, Brian

Description

Recent developments in deep learning have enabled new approaches to important prediction problems in biology. In particular, models can approximate experimental laboratory investigations at a scale that would otherwise be prohibitive in time and cost. In this work, we report on three research threads that adapt deep learning methods to applications involving proteins. Our longest-running thread focuses on the use of rigidity analysis to assess protein stability, most recently using multiclass classification to predict the stability change caused by a mutation in a protein, with explicit modeling of experimental uncertainty. This work was recently published at BICOB 2018. The second thread involves accelerating or approximating an exhaustive analysis of in silico protein mutations. While an exhaustive analysis of pairwise mutations is possible using parallel computing, it is infeasible for higher-order mutations. We are using low-rank matrix factorization techniques to approximate the exhaustive results with dramatically less computation. Our newest thread involves training variational autoencoders on protein sequences to learn a fixed-size latent representation of proteins, which can then be leveraged for a variety of applications (e.g., optimizing protein properties). We are also analyzing the biological significance of our model's errors when translating from the latent space back into a sequence of amino acids.
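
To illustrate the idea behind the second thread, the following is a minimal sketch of low-rank matrix factorization used to fill in a score matrix from a subset of its entries. It is not the method used in this work: the alternating least squares routine, the function names (als_complete, demo), the rank, the regularization strength, and the synthetic data are all assumptions chosen for illustration. The matrix stands in for a hypothetical table whose rows and columns index single mutations and whose entries score the corresponding pair.

```python
import numpy as np


def als_complete(observed, mask, rank=2, n_iters=100, reg=1e-3):
    """Fit observed ~ U @ V.T using only the entries where mask is True.

    Hypothetical helper: alternating least squares with a small ridge term.
    """
    n_rows, n_cols = observed.shape
    rng = np.random.default_rng(0)
    U = rng.normal(size=(n_rows, rank))
    V = rng.normal(size=(n_cols, rank))
    ridge = reg * np.eye(rank)
    for _ in range(n_iters):
        # Update each row factor from the columns observed in that row.
        for i in range(n_rows):
            Vc = V[mask[i]]
            U[i] = np.linalg.solve(Vc.T @ Vc + ridge, Vc.T @ observed[i, mask[i]])
        # Update each column factor from the rows observed in that column.
        for j in range(n_cols):
            Ur = U[mask[:, j]]
            V[j] = np.linalg.solve(Ur.T @ Ur + ridge, Ur.T @ observed[mask[:, j], j])
    return U @ V.T


def demo():
    """Recover a synthetic rank-2 matrix from about half of its entries."""
    rng = np.random.default_rng(1)
    n = 40
    # Synthetic stand-in for an exhaustive pairwise score table (not real data).
    truth = rng.normal(size=(n, 2)) @ rng.normal(size=(2, n))
    mask = rng.random((n, n)) < 0.5  # entries we pretend were actually computed
    approx = als_complete(np.where(mask, truth, 0.0), mask, rank=2)
    hidden = ~mask
    # Relative error measured only on entries that were never observed.
    return np.linalg.norm((approx - truth)[hidden]) / np.linalg.norm(truth[hidden])


if __name__ == "__main__":
    print(f"relative error on unobserved entries: {demo():.3f}")
```

In this sketch only the masked-in entries would need to be computed exhaustively; the factorization estimates the rest. How well that works depends on how close the real score matrix is to low rank, which the synthetic example simply assumes.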

Document Type

Event

Start Date

17-5-2018 12:00 AM

End Date

17-5-2018 12:00 AM

Department

Computer Science

Genre/Form

student projects, posters

Subjects – Topical (LCSH)

Proteins--Structure; Proteins--Biotechnology; Machine Learning

Type

Image

Rights

Copying of this document in whole or in part is allowable only for scholarly purposes. It is understood, however, that any copying or publication of this document for commercial purposes, or for financial gain, shall not be allowed without the author’s written permission.

Language

English

Format

application/pdf

