Analysis of Amino Acid Sequence Characteristics of Type I Cluster of Differentiation (CD) Proteins Using Multivariate Statistics to Determine Their Functional Class
Authors
K.C., Pramir
Issue Date
2017-12
Type
Thesis
Language
en_US
Keywords
Proteins, Bioinformatics, Biometry, Protein Function Prediction, Cluster of differentiation
Abstract
Cluster of Differentiation (CD) proteins are found in the cell membranes of leukocytes. They are important because they serve as cell surface markers for many immune cells and can be used as therapeutic and diagnostic targets. Biophysical methods such as X-ray crystallography and nuclear magnetic resonance (NMR) are commonly used to determine the function of proteins by generating their three-dimensional structures. However, these experimental methods do not work well for determining the function of membrane proteins because of the proteins' high flexibility and instability, their partially hydrophobic surface, and the need for highly specific detergents to extract them from phospholipid membranes. To address this problem, we devised a theoretical approach in which type I CD proteins are classified into two functional groups (enzyme and non-enzyme) using physicochemical parameters derived from the primary sequence of each CD protein. Principal component analysis (PCA) was used to analyze 126 parameters of 244 type I CD proteins. Type I CD proteins with enzymatic activity and those without it formed two distinct clusters on the score plot, and the separation of those clusters was found to be statistically significant. Cytoplasmic amino acid count was found to be the most important variable for separating enzymes from non-enzymes. The continuous probability densities of CD proteins with and without enzymatic activity were then approximated by kernel density estimation (KDE) of cytoplasmic amino acid count. This is the first time this method has been used to determine the functional classes of type I CD proteins, and it appears quite promising. In the future, this statistical approach could be very useful for determining the functional class of newly discovered or poorly characterized type I CD proteins.
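The two statistical steps named in the abstract, PCA followed by KDE of cytoplasmic amino acid count, can be illustrated with a minimal sketch. This is not the thesis's implementation: the data are synthetic, only three made-up features are used instead of 126 parameters, and the direction and size of the class difference, the bandwidth rule, and all function names (`make_protein`, `first_principal_component`, `gaussian_kde`, `more_likely_class`) are assumptions chosen for illustration. Comparing the two densities to suggest a class is one possible use of the KDE result, not a procedure stated in the abstract.

```python
import math
import random
import statistics

random.seed(0)  # reproducible SYNTHETIC data (not taken from the thesis)

def make_protein(cyto_mean, cyto_sd):
    """Hypothetical feature row: [cytoplasmic count, total length, noise]."""
    cyto = max(0.0, random.gauss(cyto_mean, cyto_sd))
    extracellular = max(50.0, random.gauss(400, 100))
    return [cyto, cyto + extracellular, random.gauss(0, 1)]

# Assumed class difference, invented for this sketch only.
enzymes = [make_protein(300, 60) for _ in range(40)]
non_enzymes = [make_protein(40, 20) for _ in range(40)]

def standardize(rows):
    """Scale every column to zero mean and unit sample standard deviation."""
    cols = list(zip(*rows))
    means = [statistics.fmean(c) for c in cols]
    sds = [statistics.stdev(c) for c in cols]
    return [[(v - m) / s for v, m, s in zip(row, means, sds)] for row in rows]

def first_principal_component(rows, iterations=200):
    """Return (loadings, scores) for PC1 using power iteration."""
    z = standardize(rows)
    n, p = len(z), len(z[0])
    # Covariance of standardized data, i.e. the correlation matrix.
    cov = [[sum(r[i] * r[j] for r in z) / (n - 1) for j in range(p)]
           for i in range(p)]
    v = [1.0] * p
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    scores = [sum(a * b for a, b in zip(r, v)) for r in z]
    return v, scores

def gaussian_kde(samples, bandwidth=None):
    """Return a density function built from Gaussian kernels on `samples`."""
    n = len(samples)
    if bandwidth is None:
        # Normal-reference rule of thumb; an assumption, not the thesis's choice.
        bandwidth = 1.06 * statistics.stdev(samples) * n ** (-1 / 5)
    scale = 1.0 / (n * bandwidth * math.sqrt(2 * math.pi))

    def density(x):
        return scale * sum(
            math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples
        )

    return density

# Step 1: PCA score separation between the two synthetic classes.
loadings, scores = first_principal_component(enzymes + non_enzymes)
enzyme_scores = scores[:len(enzymes)]
non_enzyme_scores = scores[len(enzymes):]

# Step 2: one KDE per class on the cytoplasmic count (column 0).
enzyme_density = gaussian_kde([p[0] for p in enzymes])
non_enzyme_density = gaussian_kde([p[0] for p in non_enzymes])

def more_likely_class(cyto_count):
    """Pick the class whose estimated density is higher at `cyto_count`."""
    if enzyme_density(cyto_count) > non_enzyme_density(cyto_count):
        return "enzyme"
    return "non-enzyme"

print("PC1 loadings:", [round(x, 2) for x in loadings])
print("Mean PC1 score (enzyme, non-enzyme):",
      round(statistics.fmean(enzyme_scores), 2),
      round(statistics.fmean(non_enzyme_scores), 2))
print("Class suggested for 320 cytoplasmic residues:", more_likely_class(320))
```

With real data, each row would hold the sequence-derived parameters of one type I CD protein, and the loadings would indicate which parameters contribute most to the separation seen on the score plot.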
Citation
K.C., Pramir. "Analysis of Amino Acid Sequence Characteristics of Type I Cluster of Differentiation (CD) Proteins Using Multivariate Statistics to Determine Their Functional Class," Master's thesis, Valdosta State University, 2017. http://hdl.handle.net/10428/2968.
