In this paper we propose an algorithm, the Incremental Structural Mountain Clustering Method (ISMCM), for finding a library of building blocks from which the 3-D structures of proteins and peptides can be reconstructed. The building blocks are short structural motifs, identified from an estimate of the local "density" of 3-D fragments that is computed using a measure of structural similarity. The structural similarity of a pair of fragments is computed after their best-molecular-fit alignment. The algorithm is tested on two well-known benchmark data sets. Following the protocols used by other researchers, for the first data set we reconstruct a set of 71 test peptides (up to the first 60 residues), whereas for the second data set we reconstruct all 143 test peptides. ISMCM reconstructs the test peptides successfully in terms of both global-fit and local-fit root-mean-square (RMS) error. The low local-fit RMS errors suggest that the building blocks extracted by ISMCM are good quantizers that represent nearby fragments quite accurately. To further assess the quality of the building blocks we use two alternative graphical methods. We also use Shannon's entropy to measure the structural similarity of the fragments within each cluster found by our algorithm. This is important because building blocks that represent clusters of structurally similar fragments will be very effective in reconstruction. The entropic analysis reveals an interesting fact: the secondary structure of the central residue of the fragments in a cluster is the most strongly conserved (minimum entropy) over the cluster, which might indicate that the central residue of a structural motif plays a dominant role in local folding.
- building blocks
- incremental structural mountain clustering
- protein structure
- structural mountain clustering
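
The entropic analysis mentioned in the abstract can be illustrated with a minimal sketch. It assumes that each fragment in a cluster is annotated with one secondary-structure label per residue, and it computes Shannon's entropy of those labels at every fragment position. The three-letter alphabet (H, E, C), the base-2 logarithm, the function name `positional_entropy`, and the toy cluster are all assumptions made for illustration; they are not taken from the paper.

```python
import math
from collections import Counter


def positional_entropy(cluster):
    """Return the Shannon entropy (in bits) of the labels at each position.

    `cluster` is a list of equal-length strings, one per fragment, where
    each character is a secondary-structure label (assumed alphabet: H, E, C).
    """
    length = len(cluster[0])
    if any(len(fragment) != length for fragment in cluster):
        raise ValueError("all fragments in a cluster must have the same length")

    total = len(cluster)
    entropies = []
    for pos in range(length):
        # Count how often each label occurs at this position.
        counts = Counter(fragment[pos] for fragment in cluster)
        # H = sum over labels of p * log2(1 / p); zero when one label dominates fully.
        h = sum((c / total) * math.log2(total / c) for c in counts.values())
        entropies.append(h)
    return entropies


# Hypothetical cluster of four 5-residue fragments (illustrative data only).
toy_cluster = ["CHHHC", "EHHHC", "CHHEE", "CCHHH"]
entropies = positional_entropy(toy_cluster)

# Index of the position whose labels are most strongly conserved.
most_conserved = min(range(len(entropies)), key=entropies.__getitem__)
```

In this toy cluster the minimum entropy falls on the central position (index 2), which mirrors the kind of observation the abstract reports. Real clusters would need labels from a secondary-structure assignment step that this sketch does not cover.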