Residues that are crucial to protein function or structure are usually evolutionarily conserved. To identify the important residues in a protein, sequence conservation is estimated, and current methods rely on an unbiased collection of homologous sequences. Surprisingly, our previous studies have shown that sequence conservation is closely correlated with the weighted contact number (WCN), a measure of the packing density of a residue's structural environment that is calculated from the C-alpha positions of a protein structure alone. Moreover, studies have shown that sequence conservation is correlated with environment-related structural properties calculated from different protein substructures, such as a protein's all atoms, backbone atoms, side-chain atoms, or side-chain centroids. To determine whether the C-alpha atomic positions are adequate to capture the relationship between residue environment and sequence conservation, we compared C-alpha atoms with the other substructures in their contributions to sequence conservation. Our results show that C-alpha positions are substantially equivalent to the other substructures in calculations of various measures of residue environment. As a result, the overlapping contributions between C-alpha atoms and the other substructures are high, yielding similar structure-conservation relationships. Taking the WCN as an example, the average overlapping contribution to sequence conservation between the C-alpha and all-atom substructures is 87%. These results indicate that the C-alpha atoms of a protein structure alone can reflect sequence conservation at the residue level.
Liu, J-W., Lin, J-J., Cheng, C. W., Lin, Y. F., Hwang, J. K., & Huang, T-T. (2017). On the relationship between residue structural environment and sequence conservation in proteins. Proteins: Structure, Function and Genetics, 1713-1723. https://doi.org/10.1002/prot.25329
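To make the idea of a C-alpha-based packing measure concrete, the following is a minimal sketch of a weighted contact number calculation. It assumes the inverse-square-distance form often used for the WCN, in which each residue's value is the sum of 1/r² over all other residues; the abstract does not state the formula, so this form is an assumption, as are the function name `weighted_contact_number` and the toy coordinates. It is an illustration, not the authors' implementation.

```python
from typing import List, Sequence, Tuple

Coord = Tuple[float, float, float]


def weighted_contact_number(coords: Sequence[Coord]) -> List[float]:
    """Return one value per residue: the sum over all other residues of 1 / r_ij^2.

    Assumption: the inverse-square-distance definition of the WCN.
    `coords` holds one (x, y, z) position per residue, e.g. its C-alpha atom.
    """
    n = len(coords)
    wcn = [0.0] * n
    for i in range(n):
        for j in range(i + 1, n):
            # Squared Euclidean distance between residues i and j.
            d2 = sum((a - b) ** 2 for a, b in zip(coords[i], coords[j]))
            if d2 == 0.0:
                raise ValueError(f"residues {i} and {j} share the same position")
            weight = 1.0 / d2
            # The pair contributes equally to both residues.
            wcn[i] += weight
            wcn[j] += weight
    return wcn


# Hypothetical toy input: three residues on a line, 3.8 angstroms apart.
ca_coords = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
wcn = weighted_contact_number(ca_coords)

for index, value in enumerate(wcn):
    print(f"residue {index}: WCN = {value:.4f}")
```

In this toy chain the middle residue has the largest value because it has two close neighbours, which is the sense in which the measure reflects packing density. Passing coordinates from a different substructure (for example side-chain centroids) in place of the C-alpha positions would give the corresponding variant of the measure under the same assumed formula.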