Protein Structure Prediction Bioinformatics Pdf
Constituent amino acids can be analyzed to predict the secondary, tertiary and quaternary structure of a protein.

Protein structure prediction is the inference of the three-dimensional structure of a protein from its amino acid sequence, that is, the prediction of its folding and of its secondary and tertiary structure from its primary structure. Structure prediction is fundamentally different from the inverse problem of protein design. It is one of the most important goals pursued by bioinformatics and theoretical chemistry, and it matters in medicine (for example, in drug design) and in biotechnology (for example, in the design of new enzymes). Every two years, the performance of current methods is assessed in the CASP experiment (Critical Assessment of Techniques for Protein Structure Prediction). A continuous evaluation of protein structure prediction web servers is carried out by the community project CAMEO3D.

Protein structure and terminology

Proteins are chains of amino acids joined by peptide bonds. Many conformations of this chain are possible because the chain can rotate about each Cα atom. These conformational changes are responsible for the differences in the three-dimensional structure of proteins. Each amino acid in the chain is polar: it has separated positively and negatively charged regions, with a free carbonyl group that can act as a hydrogen bond acceptor and an NH group that can act as a hydrogen bond donor. These groups can therefore interact within the protein structure. The 20 amino acids can be classified according to the chemistry of their side chains, which also plays an important structural role.
Glycine occupies a special position: it has the smallest side chain, a single hydrogen atom, and can therefore increase local flexibility in the protein structure. Cysteine, on the other hand, can react with another cysteine residue and form a cross-link that stabilizes the whole structure.

The protein structure can be considered as a sequence of secondary structure elements, such as α helices and β sheets, which together make up the overall three-dimensional configuration of the protein chain. In these secondary structures, regular patterns of hydrogen bonds (H bonds) form between neighboring amino acids, and the amino acids have similar Φ and Ψ angles. The formation of these structures neutralizes the polar groups on each amino acid. The secondary structures are tightly packed in the protein core, in a hydrophobic environment. Each amino acid side group has a limited volume to occupy and a limited number of possible interactions with other nearby side chains, a situation that must be taken into account in molecular modeling and alignments.

(Figure: bond angles Φ and Ψ.)

α helix

The α helix is the most common type of secondary structure in proteins. It has 3.6 amino acids per turn, with an H bond formed between every fourth residue. The average length is about 10 amino acids (3 turns), but it ranges from 5 to 40 amino acids (1.5 to 11 turns). The alignment of the H bonds creates a dipole moment for the helix, with a partial positive charge at the amino end of the helix. Because this region has free backbone NH groups, it interacts with negatively charged groups such as phosphates. The most common location of α helices is at the surface of protein cores, where they provide an interface with the aqueous environment. The inward-facing side of the helix tends to have hydrophobic amino acids and the outward-facing side hydrophilic amino acids. Thus roughly every third or fourth amino acid along the chain tends to be hydrophobic, a pattern that can be detected quite readily.
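The periodicity described above can be turned into a small detection sketch. With 3.6 residues per turn, successive residues sit 360 / 3.6 = 100 degrees apart around the helix axis, so a helix with one hydrophobic face gives a large vector sum when each hydrophobic residue is placed at its wheel angle. The Python below is only an illustration under stated assumptions: the binary hydrophobic residue set, the normalisation and the function names are hypothetical choices, not a published scale or an established predictor.

```python
import math

# Assumption: a simple binary hydrophobic set, not a published hydrophobicity scale.
HYDROPHOBIC = set("AVLIMFWC")

# 360 degrees / 3.6 residues per turn = 100 degrees between successive residues.
DEGREES_PER_RESIDUE = 100.0


def wheel_angles(n):
    """Angular position (degrees, in [0, 360)) of each of n residues on a helical wheel."""
    return [(i * DEGREES_PER_RESIDUE) % 360.0 for i in range(n)]


def hydrophobic_moment(segment, step_deg=DEGREES_PER_RESIDUE):
    """Length of the vector sum of hydrophobic residues, divided by segment length.

    Values near 0 mean the hydrophobic residues are spread evenly around the
    axis (or absent); larger values mean they cluster on one face.
    """
    if not segment:
        return 0.0
    x = y = 0.0
    for i, aa in enumerate(segment.upper()):
        if aa in HYDROPHOBIC:
            angle = math.radians(i * step_deg)
            x += math.cos(angle)
            y += math.sin(angle)
    return math.hypot(x, y) / len(segment)
```

In this sketch, a segment such as `LKKLLKKLLKKL`, whose leucines fall at positions i, i+3, i+4, i+7 and so on, scores well above a uniformly hydrophobic segment, whose residues cancel out around the wheel. A real analysis would normally use a graded hydrophobicity scale and a sliding window over the whole sequence.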
In the leucine zipper motif, a repeating pattern of leucines on the facing sides of two adjacent helices is highly predictive of the motif. A helical wheel plot can be used to show this repeated pattern. Other α helices, buried in the protein core or in cellular membranes, have a higher and more regular distribution of hydrophobic amino acids, which makes them highly predictable. Helices exposed on the surface have a lower proportion of hydrophobic amino acids. Amino acid content can also be predictive of an α-helical region. Regions richer in alanine (A), glutamic acid (E), leucine (L) and methionine (M) and poorer in proline (P), glycine (G), tyrosine (Y) and serine (S) tend to form an α helix. Proline destabilizes or breaks an α helix, but it can be present in longer helices, where it forms a bend.

(Figure: an α helix with hydrogen bonds shown as yellow dots.)

β sheet

β sheets are formed by H bonds between an average of 5 to 10 consecutive amino acids in one portion of the chain and another 5 to 10 farther down the chain. The interacting regions may be adjacent, with a short loop between them, or far apart, with other structures between them. Every chain may run in the same direction to form a parallel sheet, every other chain may run in the reverse chemical direction to form an antiparallel sheet, or the chains may be both parallel and antiparallel to form a mixed sheet. The pattern of H bonding differs between the parallel and antiparallel configurations. Each amino acid in the interior strands of the sheet forms two H bonds with neighboring amino acids, whereas each amino acid on the outer strands forms only one bond, with an interior strand. Looking across the sheet at right angles to the strands, the more distant strands are rotated slightly counterclockwise, which gives the sheet a twist. The Cα atoms alternate above and below the sheet in a pleated structure, and the R side groups of the amino acids alternate above and below the pleats. The Φ and Ψ angles of the amino acids in sheets vary considerably within one region of the Ramachandran plot.
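Because side chains alternate above and below the sheet, a strand with one face packed against a hydrophobic core can show hydrophobic and polar residues at alternating positions. The sketch below scores that period-2 pattern in a sliding window. It rests on assumptions that are not in the text above: the binary hydrophobic set, the window width and both function names are hypothetical, and many real strands (for example fully buried ones) do not alternate at all, so this is an illustration of the idea and not a working predictor.

```python
# Assumption: a simple binary hydrophobic set, chosen for illustration only.
HYDROPHOBIC = set("AVLIMFWC")


def alternation_score(segment):
    """Fraction of adjacent residue pairs whose hydrophobic class differs.

    1.0 means strict hydrophobic/polar alternation (period 2, as expected when
    side chains point to opposite faces of a sheet); 0.0 means no alternation.
    """
    classes = [aa in HYDROPHOBIC for aa in segment.upper()]
    if len(classes) < 2:
        return 0.0
    changes = sum(a != b for a, b in zip(classes, classes[1:]))
    return changes / (len(classes) - 1)


def best_window(sequence, width=6):
    """Return (start, score) of the best-scoring window; ties go to the earliest start."""
    if len(sequence) < width:
        return 0, alternation_score(sequence)
    best_start, best_score = 0, -1.0
    for start in range(len(sequence) - width + 1):
        score = alternation_score(sequence[start:start + width])
        if score > best_score:  # strict '>' keeps the earliest window on ties
            best_start, best_score = start, score
    return best_start, best_score
```

A window that scores close to 1.0 is only a candidate for a strand with one exposed and one buried face. As the following text notes, sheet prediction generally benefits from the variation seen in multiple sequence alignments, which this single-sequence sketch ignores.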
It is more difficult to predict the location of β sheets than of α helices. The situation improves somewhat when the amino acid variation in multiple sequence alignments is taken into account.

Loops

Loops are regions of a protein chain that are (1) between α helices and β sheets, (2) of various lengths and three-dimensional configurations, and (3) on the surface of the structure. Hairpin loops, which represent a complete turn in the polypeptide chain joining two β strands, may be as short as two amino acids. Loops interact with the surrounding aqueous environment and with other proteins. Because amino acids in loops are not constrained by space and environment as amino acids in the core region are, and do not affect the arrangement of secondary structures in the core, more substitutions, insertions and deletions may occur in them. In a sequence alignment, the presence of these features may therefore indicate a loop. The positions of introns in genomic DNA sometimes correspond to the locations of loops in the encoded protein. Loops also tend to have charged and polar amino acids and are frequently a component of active sites. A detailed examination of loop structures has shown that they fall into distinct families.

Coils

A region of secondary structure that is not an α helix, a β sheet or a recognizable turn is commonly referred to as a coil.

Protein classification

Proteins may be classified according to both structural and sequence similarity. For structural classification, the sizes and spatial arrangements of the secondary structures described above are compared in known three-dimensional structures. Classification based on sequence similarity was historically the first to be used. Initially, similarity was assessed from alignments of whole sequences. Later, proteins were classified on the basis of the occurrence of conserved amino acid patterns.
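The two sequence-based approaches just mentioned can be illustrated with a short Python sketch: one function reports percent identity over the aligned columns of two sequences, and the other reduces a set of aligned sequences to a column-wise conserved pattern. The function names, the gap character and the wildcard symbol are assumptions made for this example, the input must already be aligned, and the sequences in the usage note are invented toy data. This is not the method of any particular classification database.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of aligned (gap-free) columns in which both sequences match."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # skip columns where either sequence has a gap
        compared += 1
        if a == b:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


def conserved_pattern(alignment, gap="-", wildcard="x"):
    """Column-wise pattern: the residue where all sequences agree, else a wildcard."""
    if not alignment:
        return ""
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("aligned sequences must have equal length")
    pattern = []
    for column in zip(*alignment):
        residues = set(column)
        if len(residues) == 1 and gap not in residues:
            pattern.append(column[0])  # fully conserved column
        else:
            pattern.append(wildcard)   # variable column or column with a gap
    return "".join(pattern)
```

For the toy input `["CAGHC", "CTGHC", "CSGKC"]`, `conserved_pattern` returns `CxGxC`, and `percent_identity("ACDE-G", "ACEEFG")` returns `80.0`. Gap-containing columns are treated as non-conserved here, which fits the earlier observation that insertions and deletions tend to accumulate in loops.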
Databases that classify proteins by one or more of these schemes are available. When considering protein classification schemes, it is important to keep several observations in mind. First, two entirely different protein sequences from different evolutionary origins may fold into a similar structure. Conversely, the sequence of an ancient gene for a given structure may have diverged considerably in different species while maintaining the same basic structural features. Recognizing any remaining sequence similarity in such cases can be very difficult. Second, two proteins that share a significant degree of sequence similarity, either with each other or with a third sequence, also share an evolutionary origin and should share some structural features as well. However, gene duplication and genetic rearrangements during evolution may give rise to new gene copies, which can then evolve into proteins with a new function and structure.

Terms used for classifying protein structures and sequences

The more commonly used terms for evolutionary and structural relationships among proteins are listed below. Many additional terms are used for the various kinds of structural features found in proteins. Descriptions of these terms can be found on the CATH website, on the Structural Classification of Proteins (SCOP) website, and in a Glaxo Wellcome tutorial on the Swiss bioinformatics Expasy website.

Active site: a localized combination of amino acid side groups within the tertiary (three-dimensional) or quaternary (protein subunit) structure that can interact with a chemically specific substrate and that provides the protein with biological activity. Proteins with very different amino acid sequences may fold into a structure that produces the same active site.

Architecture: the relative orientations of secondary structures in a three-dimensional structure, without regard to whether they share a similar loop structure.
Fold (topology): a type of architecture that also has a conserved loop structure.

Blocks: a conserved amino acid sequence pattern in a family of proteins.