North Carolina Central University


A new approach to protein fold recognition based on Delaunay tessellation of protein structure.

Publication Type  Journal Article
Author  Zheng W, Cho SJ, Vaisman II, Tropsha A
Year of Publication  1997
Secondary Title  Pac Symp Biocomput
Pagination  486--497
Type of Work  article
Publication Language  eng
Key Words  Amino Acid Sequence
Abstract  

We propose new algorithms for sequence-structure compatibility (fold recognition) searches in multi-dimensional sequence-structure space. Individual amino acid residues in protein structures are represented by their Cα atoms; thus each protein is described as a collection of points in three-dimensional space. Delaunay tessellation of a protein generates an aggregate of space-filling, irregular tetrahedra, or Delaunay simplices. Statistical analysis of the quadruplet residue compositions of all Delaunay simplices in a representative dataset of protein structures leads to a novel four-body contact residue potential, expressed as a log-likelihood factor q. The q factors are calculated for the native 20-letter amino acid alphabet and for several reduced alphabets. Two sequence-structure compatibility functions are computed as (i) the sum of q factors for all Delaunay simplices in a given protein, or (ii) 3D-1D Delaunay tessellation profiles, where the profile value of an individual residue is the sum of q factors for all simplices that share that residue as a vertex. Both threading functions have been implemented in structure-recognizes-sequence and sequence-recognizes-structure protocols for protein fold recognition. We find that both the profile-based and the total-score-based threading functions can distinguish the native fold from incorrect folds for a given sequence, and the native sequence from non-native sequences for a given fold.
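The two compatibility functions described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The abstract does not state the formula for q, so the sketch assumes one common log-likelihood form: q = log10(f / p), where f is the observed frequency of a quadruplet composition among all simplices and p is the frequency expected from the individual residue frequencies, multiplied by the number of distinct orderings of the quadruplet. The function names (`multiplicity`, `q_factors`, `total_score`, `profile`) and the `unseen` default are hypothetical. Simplices are taken as given 4-tuples of residue indices; they could be produced from Cα coordinates with, for example, `scipy.spatial.Delaunay(coords).simplices`.

```python
import math
from collections import Counter

def multiplicity(quad):
    """Distinct orderings of a 4-residue multiset: 4! / prod(t_n!)."""
    m = math.factorial(4)
    for t in Counter(quad).values():
        m //= math.factorial(t)
    return m

def composition(seq, simplex):
    """Order-independent residue composition of one simplex."""
    return tuple(sorted(seq[i] for i in simplex))

def q_factors(training):
    """Assumed form: q = log10(observed / expected quadruplet frequency).

    training: list of (sequence, simplices) pairs, where each simplex
    is a 4-tuple of residue indices into the sequence.
    """
    residue_counts = Counter()
    quad_counts = Counter()
    for seq, simplices in training:
        residue_counts.update(seq)
        for s in simplices:
            quad_counts[composition(seq, s)] += 1
    n_res = sum(residue_counts.values())
    n_quads = sum(quad_counts.values())
    q = {}
    for quad, n in quad_counts.items():
        expected = multiplicity(quad)
        for r in quad:
            expected *= residue_counts[r] / n_res
        q[quad] = math.log10((n / n_quads) / expected)
    return q

def total_score(seq, simplices, q, unseen=0.0):
    """Function (i): sum of q over all simplices of the structure."""
    return sum(q.get(composition(seq, s), unseen) for s in simplices)

def profile(seq, simplices, q, unseen=0.0):
    """Function (ii): per-residue sum of q over simplices sharing that vertex."""
    prof = [0.0] * len(seq)
    for s in simplices:
        value = q.get(composition(seq, s), unseen)
        for i in s:
            prof[i] += value
    return prof
```

Because every simplex has four vertices, the profile values of a structure sum to four times its total score, which makes a convenient consistency check. Compositions never seen in the training set receive the `unseen` value here; how such cases should be scored is a design choice that the abstract does not address.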

Citation Key  252