Significances of Bioengineering & Biosciences

Predicting Protein Transmembrane Regions by Using LSTM Model

Submission: November 09, 2017; Published: February 23, 2018

DOI: 10.31031/SBB.2018.01.000510

ISSN 2637-8078
Volume 1 Issue 2

Abstract

Predicting transmembrane regions in proteins using machine learning methods is a classical bioinformatics problem. In this paper, we propose a novel approach to this problem using the Long Short-Term Memory (LSTM) model, a recurrent neural network. The model is trained on a set of already characterized proteins to capture the relationships between adjacent amino acids, and then uses this information to predict whether an amino acid in a new protein is a transmembrane residue. With accuracy up to 92.56%, our experiments show better results than other advanced approaches. Our second contribution is an analysis of four common, easy-to-extract, and effective amino acid features used in many machine learning approaches: propensity, hydrophobicity, positive charge, and identity. We implemented our model with combinations of these four features to investigate the effect of each feature on our system’s performance. The experimental results show that our method is as good as other state-of-the-art methods and is therefore trustworthy for predicting transmembrane regions in proteins whose structure has not yet been explored. Our analysis of the four features also identifies efficient combinations of them for solving the problem. We hope this information will help later research in the field choose a useful set of features.
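
To make the four per-residue features concrete, the following is a minimal sketch of how each amino acid might be turned into a feature vector before being passed to a sequence model such as an LSTM. It is an illustration under stated assumptions, not the implementation used in the paper: the Kyte-Doolittle hydropathy scale, the treatment of only Lys and Arg as positively charged, the frequency-ratio definition of propensity, the "M" label for transmembrane residues, and all function names are choices made here for the example.

```python
from collections import Counter

# Kyte-Doolittle hydropathy values (assumed scale; the abstract does not name one).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}
AMINO_ACIDS = sorted(KYTE_DOOLITTLE)  # fixed order for the one-hot identity feature
POSITIVE = {"K", "R"}                 # assumption: His is not counted as charged


def propensity_table(sequences, labels, tm_label="M"):
    """Hypothetical propensity: frequency of a residue inside transmembrane
    regions divided by its overall frequency in the labeled training set."""
    tm_counts, all_counts = Counter(), Counter()
    for seq, lab in zip(sequences, labels):
        for aa, state in zip(seq, lab):
            all_counts[aa] += 1
            if state == tm_label:
                tm_counts[aa] += 1
    n_tm, n_all = sum(tm_counts.values()), sum(all_counts.values())
    if n_tm == 0:
        return {aa: 0.0 for aa in all_counts}
    return {
        aa: (tm_counts[aa] / n_tm) / (all_counts[aa] / n_all)
        for aa in all_counts
    }


def encode_residue(aa, propensity):
    """Return [propensity, hydrophobicity, positive charge] + 20-dim one-hot."""
    one_hot = [1.0 if aa == ref else 0.0 for ref in AMINO_ACIDS]
    return [
        propensity.get(aa, 0.0),        # unseen residues default to 0.0
        KYTE_DOOLITTLE.get(aa, 0.0),    # unknown symbols (e.g. "X") default to 0.0
        1.0 if aa in POSITIVE else 0.0,
    ] + one_hot


def encode_sequence(seq, propensity):
    """Encode a protein as a list of per-residue feature vectors."""
    return [encode_residue(aa, propensity) for aa in seq]
```

Studying feature combinations, as the abstract describes, would then amount to keeping or dropping the corresponding slices of each vector (for example, using only the first three entries to omit the identity feature).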