Proteomic Code

Analysis of crystallized protein structures suggests that globular proteins are organized as consecutively connected units of 25–35 residues. These units are closed loops, that is, returns of the polypeptide chain trajectory to close contact with itself. This universal feature, apparently of polymer-statistical nature, is the basis for a fundamentally new view of globular proteins as loop fold structures. The same unit size has been detected by positional autocorrelation analysis in protein sequences translated from complete prokaryotic genomes, which strongly indicates an evolutionary connection between the units. The units are further characterized by prototype sequences that match their numerous derivatives in the translated genomes. Matches to the five strongest prokaryotic prototypes and to three prototypes from C. elegans are identified in the sequences of crystallized proteins, and their structures are analyzed. In the majority of cases the corresponding segments of the polypeptide chains form closed loops, although the evolutionary fate of each prototype element is shown to be rather diverse. The loop ends can then be separated by sequence-wise distant segments and stabilized by spatial interactions in the context of the overall globular structure. The units belong to a presumably limited spectrum of sequence prototypes, the full repertoire of which would constitute a proteomic code.
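The positional autocorrelation mentioned above can be illustrated with a minimal sketch. It is not the authors' procedure: the function name `positional_autocorrelation`, the choice of counting identical residues at a fixed separation, and the normalization by a random-pairing expectation are all assumptions made here for illustration. A preferred unit size would show up as a peak in the profile at the corresponding separation.

```python
from collections import Counter


def positional_autocorrelation(seq, max_lag):
    """Observed/expected ratio of identical-residue pairs at each lag.

    Hypothetical helper for illustration only; the statistic is an assumption,
    not the method used in the original analysis.
    """
    n = len(seq)
    counts = Counter(seq)
    # Chance that two independently drawn residues are identical,
    # given the composition of this sequence.
    p_same = sum((c / n) ** 2 for c in counts.values())
    profile = {}
    for lag in range(1, max_lag + 1):
        pairs = n - lag
        if pairs <= 0:
            break
        matches = sum(1 for i in range(pairs) if seq[i] == seq[i + lag])
        profile[lag] = matches / (pairs * p_same)
    return profile


# Synthetic sequence built from a made-up 25-residue unit repeated 8 times.
unit = "ACDEFGHIKLMNPQRSTVWYACDEF"
profile = positional_autocorrelation(unit * 8, max_lag=30)
peak_lag = max(profile, key=profile.get)
print(peak_lag, round(profile[peak_lag], 2))
```

On this artificial input the strongest signal falls at the length of the repeated unit. Real translated genomes are far noisier, so a practical analysis would pool many sequences and compare the profile against shuffled controls.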


Last Updated on: Nov 25, 2024
