International Journal of Molecular Sciences

Article
Structural Interface Forms and Their Involvement in Stabilization of Multidomain Proteins or Protein Complexes

Jacek Dygut 1, Barbara Kalinowska 2, Mateusz Banach 3, Monika Piwowar 3, Leszek Konieczny 4 and Irena Roterman 3,*

1 Department of Rehabilitation, Hospital in Przemyśl, Monte Cassino 18, 37-700 Przemyśl, Poland; [email protected]
2 Faculty of Physics, Astronomy and Applied Computer Science, Jagiellonian University, Łojasiewicza 11, 30-348 Krakow, Poland; [email protected]
3 Department of Bioinformatics and Telemedicine, Jagiellonian University–Medical College, Łazarza 16, 31-530 Krakow, Poland; [email protected] (M.B.); [email protected] (M.P.)
4 Chair of Medical Biochemistry, Jagiellonian University–Medical College, Kopernika 7, 31-034 Krakow, Poland; [email protected]
* Correspondence: [email protected]; Tel./Fax: +48-12-619-96-93

Academic Editor: Christo Z. Christov
Received: 14 August 2016; Accepted: 11 October 2016; Published: 18 October 2016

Abstract: The presented analysis concerns the inter-domain and inter-protein interface in protein complexes. We propose extending the traditional understanding of the domain as a function of local compactness with an additional criterion which refers to the presence of a well-defined hydrophobic core. Interface areas in selected homodimers vary with respect to their contribution to shared as well as individual (domain-specific) hydrophobic cores. The basic definition of a protein domain, i.e., a structural unit characterized by tighter packing than its immediate environment, is extended in order to acknowledge the role of a structured hydrophobic core, which includes the interface area. The hydrophobic properties of interfaces vary depending on the status of interacting domains. In this context we can distinguish: (1) shared hydrophobic cores (spanning the whole dimer); (2) individual hydrophobic cores present in each monomer irrespective of whether the dimer contains a shared core. Analysis of interfaces in dystrophin and utrophin indicates the presence of an additional quasi-domain with a prominent hydrophobic core, consisting of fragments contributed by both monomers. In addition, we have attempted to determine the relationship between the type of interface (as categorized above) and the biological function of each complex. This analysis is entirely based on the fuzzy oil drop model.

Keywords: hydrophobic core; homodimers; utrophin; dystrophin; domain; hydrophobicity; interface

Int. J. Mol. Sci. 2016, 17, 1741; doi:10.3390/ijms17101741 www.mdpi.com/journal/ijms

1. Introduction

The traditional definition of a domain refers to the local compactness criterion [1–4]. Accordingly, the domain is regarded as an independent entity, which may exist in separation from the rest of the polypeptide chain [5,6]. This notion is supported by the fact that domains often evolve independently of one another. Residues responsible for mediating the protein’s biological function may be restricted to a single domain or spread across multiple domains. Two databases, SCOP [7,8] and CATH [9], group proteins with respect to their sequential and structural similarity and introduce suitable classifications for both purposes. The sequential similarity factor highlights evolutionary changes which produce homologues. From an evolutionary point of view, changes in the protein’s function may call for individual structural rearrangements in each domain. Given that domains often form functional units, their structural stability should be viewed as a critical factor. Domain-specific structural changes are often discussed in the context of amyloidogenesis, where partial unfolding of domain fragments (e.g., individual loops) is seen as a possible cause of amyloid plaque formation [10]. According to the domain swapping hypothesis, a multimolecular aggregate emerges as a result of uncoiling of a domain fragment, leaving behind a pocket which accommodates the uncoiled fragment of another molecule [11]. Improperly folded proteins (or individual domains) lead to pathological changes commonly referred to as misfolding diseases [12].
Voronoi tessellation is a specialized model used in the study of inter-domain interfaces [13–15]. Such interfaces are recognized as highly specific to their interaction partners and binding orientation. The relative positions of interface residues are generally well conserved within the same type of interface, even between remote homologues [16]. The relation between interface structure and hydrophobicity density distribution is particularly relevant in the context of amyloidogenesis [17]. Selective cross-saturation effects have been applied in docking method analysis [18]. A thorough discussion of protein complexation and interface conformation would not be possible without referring to the worldwide discussion forum offered by the CAPRI (Critical Assessment of PRedicted Interactions) project, originally launched in 2001 [19,20]. The goal of CAPRI is to predict the structure of protein complexes on the basis of the structural properties of individual monomers.
After more than 10 years of development it seems that rigid body docking yields better results than the alternative flexible docking method, which calls for substantial conformational changes in target monomers. Successive editions of CAPRI introduce new challenges, for instance assessment of the expected stability of protein complexes. Expectations related to the prediction of structural properties of complexes include determining their biological role or, when this proves impossible, identifying relations between the model and the biological activity of a specific complex [21]. It appears that chemical complementarity and conformational similarity are the main forces driving the formation of protein complexes. A cursory review of techniques employed in the CAPRI challenge indicates one common characteristic: they all tend to focus on pairwise (atom-atom) interactions. Analysis of the protein-protein interface is usually restricted to the zone of contact, with the geometry of solids (monomers) often used as the sole criterion of structural compatibility. In our work, we propose an approach which regards complexation as an effect of global conformational characteristics, as captured by the fuzzy oil drop (FOD) model, in each structural unit (domain or protein) as well as in multi-domain complexes and protein dimers (note that we restrict our analysis to homodimers). The presence of a shared hydrophobic core is seen as the primary factor responsible for quaternary structural stabilization. The paper presents a set of homodimers in which the status of the inter-domain (or inter-chain) interface varies. Some dimers exhibit a shared hydrophobic core while in others no such core is present. In some cases we can attribute this phenomenon to the specific biological function of a given homodimer. Special attention is given to two proteins: dystrophin [22–25] and utrophin [26–33].
All homodimers presented in this work have been analyzed with the use of the fuzzy oil drop model [34], an extension of the original oil drop model [35], which predicted internalization of hydrophobic residues along with exposure of hydrophilic residues on the surface. This qualitative concept has been extended with quantitative analysis by introducing a mathematical model in which the “expected” hydrophobicity density distribution is represented by a Gaussian defined over a 3D ellipsoid and peaking at the geometric center of the molecule. This “idealized” distribution of hydrophobicity density is then compared with the actual (observed) distribution, which depends on the placement and properties of each residue in the protein. Such comparison enables us to identify locally discordant areas. The accuracy of the fuzzy oil drop model is confirmed by the study of protein families in which biological activity requires close correspondence between the observed and theoretical distributions (e.g., antifreeze and downhill proteins) [36,37]. Locally significant deviations from the idealized model indicate possible areas of interaction [38], with ligand binding pockets and protein complexation sites traceable to local hydrophobicity deficiencies and excesses, respectively [39–41]. The innovation of the fuzzy oil drop approach rests in its ability to identify “hydrophobically ordered” domains in the protein body, even when these domains are actually formed by fragments contributed by separate chains. Such emergent hydrophobic cores may play a key role in ensuring proper biological activity of protein complexes, particularly when contraction and stretching come into play [42]. The presence of a hydrophobic core appears to be a significant factor in quaternary stabilization of protein complexes [43].
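The idealized distribution described above can be illustrated with a short Python sketch. It is not the authors' implementation. It assumes one representative point per residue (the hypothetical list `coords`), a molecule already oriented along its principal axes, and Gaussian widths equal to one third of the largest extent along each axis (a three-sigma convention adopted here purely for illustration). The function name `theoretical_distribution` is likewise hypothetical.

```python
import math

def theoretical_distribution(coords, sigmas=None):
    """Idealized (T) hydrophobicity: a 3D Gaussian peaking at the geometric center.

    coords : list of (x, y, z) points, one per residue (assumed input format).
    sigmas : optional (sx, sy, sz); by default one third of the largest
             extent along each axis (illustrative assumption).
    Returns a list of values normalized to sum to 1.
    """
    n = len(coords)
    # Geometric center of all points
    center = [sum(p[k] for p in coords) / n for k in range(3)]
    centered = [[p[k] - center[k] for k in range(3)] for p in coords]
    if sigmas is None:
        sigmas = [max(abs(p[k]) for p in centered) / 3.0 for k in range(3)]
    # Unnormalized Gaussian value at each point
    raw = [
        math.exp(-0.5 * sum((p[k] / sigmas[k]) ** 2 for k in range(3)))
        for p in centered
    ]
    total = sum(raw)
    return [v / total for v in raw]

# Toy example: one central point surrounded by six peripheral points
toy = [(0, 0, 0), (3, 0, 0), (-3, 0, 0), (0, 3, 0), (0, -3, 0), (0, 0, 3), (0, 0, -3)]
t = theoretical_distribution(toy)
```

In this toy arrangement the central point receives the highest expected hydrophobicity and the six peripheral points share equal, much lower values, mirroring the "hydrophobic center, hydrophilic surface" picture.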
Analyses of homodimers carried out to date on the basis of the fuzzy oil drop model characterize protein-protein interfaces as areas exhibiting higher-than-expected hydrophobicity, a result of local exposure of hydrophobic residues on the protein surface [39–41]. A detailed study of homodimers suggests that both the dimer and its constituent monomers may vary with respect to their adherence to the fuzzy oil drop model (FOD model). A well-ordered hydrophobic core may be present in the dimer as well as in each monomer separately. In the former case the core is shared by both interacting monomers, contributing to structural stabilization of the complex. In light of the above, we can divide homodimers into four distinct groups: (1) hydrophobic core present in each participating monomer as well as in the dimer; (2) hydrophobic core present in the dimer but not in either monomer; (3) hydrophobic core present in both monomers but not in the dimer; and (4) hydrophobic core present in one participating monomer as well as in the dimer. The main point of this paper is the recognition of a distinct interface which manifests itself as a quasi-domain comprising fragments contributed by both monomers and possessing its own hydrophobic core. Such quasi-domains have been found in dystrophin and utrophin.
The analysis presented in this work acknowledges the FOD status of homodimers (as explained above), focusing on representative cases for each group. The presented homodimers have been selected in such a way as to cover the entire spectrum of relationships between individual units (domains, chains) (Table 1):
(1) Hydrophobic core (according to the FOD criterion) present in each participating monomer as well as in the dimer. This case is represented by the histone B dimer from Methanothermus fervidus (a hyperthermophilic organism), 1BFM [44].
(2) Hydrophobic core present (according to the FOD model) in the dimer but not in either monomer, represented by protein 1Y7Q [45]. 1Y7Q is a mammalian SCAN domain dimer which is a domain-swapped homologue of the HIV capsid C-terminal domain. The structure of 1Y7Q is interesting because it adopts a fold almost identical to that of the retroviral capsid C-terminal domain (CTD) but uses an entirely different dimerization interface, produced by swapping the MHR-like element between monomers.
(3) Hydrophobic core present in both monomers but not in the dimer; the metal-sensing transcriptional repressor from Staphylococcus aureus in the Zn2-form (1R1V) is representative of this group [46].
(4) Hydrophobic core present in one participating monomer as well as in the dimer. This category is represented by 2CD0, the human lambda-6 light chain dimer, a variable domain from the lambda-6 type immunoglobulin light chain [47]. This dimer may raise doubts because of the asymmetry observed in the homodimer (C2 symmetry is usually expected in such cases) and because of the high value of Rfree = 0.351. The protein occurs under pathological conditions (over-expression of the immunoglobulin light chain) and sometimes undergoes amyloid transformation [47]. Although associated with pathological conditions, the observed structure is experimentally documented and may additionally encode features which enable amyloid transformation [48].
The main contribution of this paper is recognition of a distinct interface, which manifests itself as an independent domain, comprising fragments contributed by both monomers and possessing its own well-defined hydrophobic core. Such interfaces are found in dystrophin and utrophin.

Table 1. Complexes subject to analysis in the presented work. Columns indicate the status of the hydrophobic core in each monomer while rows correspond to the status of the shared (dimeric) core. Numbers in parentheses represent the number of monomers which exhibit accordance with the idealized hydrophobic core model. The rightmost column lists two proteins which contain an emergent quasi-domain, comprised of fragments contributed by each monomer and possessing its own hydrophobic core.

                               Monomer hydrophobic core        Additional
Dimer status                   Present          Absent         interface domain
Hydrophobic core present       1BFM [44]        1Y7Q [45]
Hydrophobic core absent        1R1V (2) [46]    2CD0 (1) [47]
Additional interface domain                                    1DXX [22], 1QAG [26]

Dystrophin is one of the largest proteins found in humans. It is expressed in various tissues, including skeletal muscle, heart, nerves and liver, where it forms an important component of the cellular membrane. Dystrophin is a cytoplasmic protein with an elongated shape. It links the cytoskeleton of the muscle cell with the extracellular matrix across the plasma membrane [48]. Dystrophin is coded for by the DMD gene, located at Xp21.2-p21.1 and consisting of 79 exons [23,24]. Mutations in DMD result in synthesis of an aberrant protein which, in turn, causes progressive muscular dystrophy. In its native form, dystrophin is a rodlike molecule, approximately 150 nm long, located beneath the cellular membrane. The encoded protein consists of 3685 amino acid residues (NP_03997.1) forming four domains which mediate interaction with cytoskeletal proteins (actin filaments) on the N-terminal side, and with membrane glycoproteins on the C-terminal side (dystrophin-associated proteins (DAPs) and dystrophin-associated glycoproteins (DAGs), respectively). Our analysis is restricted to the N-terminal actin-binding domain of human dystrophin, which contains the CH1 and CH2 domains present in each of the two chains that make up the homodimer. The structure of dystrophin (PDB: 1DXX) is listed as consisting of four chains paired up to form two subunits with identical symmetry (AB + CD). Our analysis focuses on the AB dimer, which is identical to the CD dimer. Utrophin [25] is another large molecule found in the cellular membrane and expressed with particular intensity during embryonic development, as well as in regenerating muscle tissue. Utrophin (1QAG) is a cytoskeletal protein homologous to dystrophin. Its degree of sequential similarity reaches 85% in the actin-binding domains. Utrophin is responsible for mediating the activity of muscle cells, to which it contributes by stabilizing plasma membranes. It is also responsible for clustering of the acetylcholine receptor.
Utrophin expression is greatly upregulated in patients suffering from Duchenne muscular dystrophy [24]. The PDB structure (1QAG [25]) comprises two chains arranged in a similar way to the AB complex of dystrophin. This fragment is also responsible for binding to actin. Utrophin is generally undetectable in healthy adults. When expressed, it can be found at the neuromuscular synapse and myotendinous junctions. In all other locations it is replaced by dystrophin, which it structurally resembles. Utrophin is coded for by the UTRN gene, consisting of 74 exons and located at 6q24 [27]. The encoded protein consists of 3433 residues (NP_009055.2). Its N-terminal fragment interacts with actin while the C-terminal fragment is responsible for binding to AChR in the neuromuscular synapse [28,29]. Both utrophin and dystrophin interact with actin in a similar way. They both affect the mobility of actin, specifically its bending and twisting capabilities [30]. Functional differences between the two proteins relate mostly to their site of interaction with actin. Dystrophin attaches to actin in two low-affinity areas while utrophin makes use of a higher-affinity fragment (a continuous actin-binding domain). As a result, the torsional rigidity of actin is more significantly impaired by dystrophin than by utrophin. It is believed that the interaction between actin and dystrophin (or utrophin) promotes stretching and twisting, determining the stability of actin filaments. According to some studies, either dystrophin or utrophin may have formed via partial gene duplication [31]. In biological nomenclature both proteins are referred to as paralogues, implying that they are homologous [32].

The set of proteins under consideration is presented in Table 1 with a brief description of the status of the hydrophobic core in each example.

The Fuzzy Oil Drop Model: An Introduction

The fuzzy oil drop model is based on the oil drop model introduced by Kauzmann, which assumes that hydrophobic residues migrate to the center of the protein body while hydrophilic residues remain exposed on its surface [39–41]. The fuzzy oil drop modification uses a 3D Gauss function to express the continuous decrease of hydrophobicity with distance from the center of the molecule (maximum hydrophobicity) toward the surface, where hydrophobicity approaches zero. Comparing this idealized distribution with the hydrophobicity observed in the real molecule (resulting from inter-residual hydrophobic interactions) reveals differences which may be global or local in character. These differences can be quantified using Kullback–Leibler entropy [49]. Global discordance, expressed by RD > 0.5 for the complete protein (or domain), is interpreted as the lack of an ordered hydrophobic core in the given molecule. Local discordance is likewise expressed by RD > 0.5, calculated for a selected fragment of the profiles (T: expected; O: observed). In particular, fragments of the polypeptide chain which represent well-defined secondary structure can be taken as units for which RD is calculated in order to characterize their status with respect to the ordered hydrophobic core. It is also possible to calculate RD for the polypeptide chain with selected residues eliminated from the calculation, which reveals the role of those residues in the construction of the hydrophobic core. In particular, residues engaged in protein-protein interaction can be eliminated in this way. Their status depends on the internal construction of the domain (molecule) and on an external factor, namely the second protein molecule. Residues located in the inter-protein interface are the object of the analysis presented here.
The aim of this analysis is to define the role of residues responsible for the construction and stability of dimers. To highlight the status of a particular chain, domain or chain fragment, RD values above 0.5 (indicating discordance with the theoretical distribution) are given in bold. The fuzzy oil drop model is described in detail in a recently published paper [50], which is why the description presented here is limited to the main points of the procedure. Identification of secondary structural folds and the composition of protein domains follows the CATH and PDBSum classifications. Inter-domain/inter-chain contacts have been identified on the basis of the PDBSum distance criterion [51].
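The RD criterion can be sketched in Python as follows. The text above states only that Kullback–Leibler entropy is used and that RD > 0.5 signals discordance. The normalization used here, RD = D(O|T) / (D(O|T) + D(O|R)) with R a uniform reference distribution, follows the wider fuzzy oil drop literature and should be read as an assumption, not as a formula quoted from this section. The function names and the `skip` parameter (indices of residues eliminated from the calculation, e.g., interface residues) are hypothetical.

```python
import math

def kl_divergence(p, q):
    """Kullback-Leibler divergence D(p || q) in bits; terms with p_i == 0 contribute 0."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def relative_distance(observed, theoretical, skip=()):
    """RD of the observed (O) profile versus the theoretical (T) one.

    Assumed definition: RD = D(O|T) / (D(O|T) + D(O|R)), where R is uniform.
    RD < 0.5 means O lies closer to T than to R (ordered hydrophobic core).
    Residues listed in `skip` are removed and both profiles renormalized.
    Undefined (division by zero) if O coincides with both T and R.
    """
    skipped = set(skip)
    keep = [i for i in range(len(observed)) if i not in skipped]
    o_sum = sum(observed[i] for i in keep)
    t_sum = sum(theoretical[i] for i in keep)
    o = [observed[i] / o_sum for i in keep]
    t = [theoretical[i] / t_sum for i in keep]
    r = [1.0 / len(keep)] * len(keep)  # uniform reference: no core at all
    o_t = kl_divergence(o, t)
    o_r = kl_divergence(o, r)
    return o_t / (o_t + o_r)

# Toy profiles: a perfectly accordant case, and the same chain with residue 0 removed
rd_full = relative_distance([0.5, 0.3, 0.2], [0.5, 0.3, 0.2])
rd_no_first = relative_distance([0.5, 0.3, 0.2], [0.2, 0.3, 0.5], skip=[0])
```

Calling the function once for a complete chain and once with interface residues passed in `skip` corresponds to the comparison between the "complete" and "No P-P" rows reported in the tables of the Results section.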

2. Results

2.1. Hydrophobic Core Present in the Dimer and in both Monomers

This category is represented by the histone B dimer from Methanothermus fervidus (a hyperthermophilic organism), labeled 1BFM [44]. Analysis of RD values listed in Table 2 indicates the presence of a shared dimeric core as well as individual monomeric cores. The dimeric core is more prominent than the monomeric cores. Analysis of individual helical folds reveals the destabilizing effect of the 21–51 fragment upon monomeric cores; in the dimer, by contrast, this fragment appears to reinforce the shared core. The helical fold at 56–68 appears to play a significant role in shaping the monomeric cores as well as the shared dimeric core. Elimination of the fragments which participate in the inter-chain interface distorts the shared core (RD increases from 0.383 to 0.390). This suggests that the interface zone contributes to the common hydrophobic core and that its elimination destabilizes the complex, distorting the alignment between the observed and ideal distributions (Figure 1).

Table 2. RD coefficients describing the status of the hydrophobic core in the dimer and in each monomer of 1BFM. The table also lists the status (with respect to the monomer and the dimer) of individual helical folds. “No P-P” indicates parts of the monomers and the dimer following elimination of residues which make up the P-P interface.

                   RD
1BFM               Individual Monomer      Monomer in Dimer
                   A          B            A          B
Monomer            0.441      0.412        0.379      0.386
Helix
  4–16             0.429      0.355        0.360      0.379
  21–51            0.565      0.514        0.480      0.457
  56–68            0.260      0.201        0.315      0.300
No P-P             0.452      0.365        0.390 (dimer)
Complete Dimer                             0.383

Figure 1. Schematic depiction of a system in which the dimer as well as its individual monomers contain well-defined hydrophobic cores. Elimination (black bar on the right profile) of residues involved in complexation distorts the dimeric core. Blue and pink lines: distributions in monomers; gray line: distribution in dimer. (A) Status of the individual monomers (blue and pink) versus the resultant distribution in the dimer (gray); (B) removal of the interface (black bar) does not change the status of the dimer: the distribution remains accordant with the expected one.

The presented protein comes from Methanothermus fervidus, an archaeon that grows optimally at 83 °C, making it a hyperthermophilic organism. It is, therefore, a very interesting study subject. Our experience indicates that well-ordered hydrophobic cores are particularly common in thermophilic proteins (publication currently in preparation). This is consistent with experimental studies which point to an increased role of hydrophobic interactions under high temperature conditions [52]. A highly simplified, one-dimensional visualization of the presented case is shown in Figure 1. The figure reveals how two structures with accordant distributions merge to produce a coherent Gaussian which, by itself, also remains consistent with theoretical expectations. Elimination of interface residues in this particular case lowers the accordance of the dimer, as well as of its individual monomeric units.

2.2. Hydrophobic Core Present in the Dimer but Absent in both Monomers

This category is represented by 1Y7Q (Table 1). Fuzzy oil drop analysis indicates that the dimer possesses a shared hydrophobic core (RD = 0.421) while individual monomers lack such a core (RD = 0.553 and 0.514, respectively; Table 3). Parameters describing the status of each monomer and of the dimer itself suggest that individual secondary folds tend to participate in the shared core rather than in individual (monomeric) cores. The interface zone plays a significant role in shaping the shared hydrophobic core: elimination of its residues (row labeled “No P-P”) increases the value of RD, which implies substantial involvement in construction of the shared core. We may conclude that this portion of the chain is the main component of the shared core and should be characterized by high local hydrophobicity. Reductions in RD calculated for individual secondary folds support this conclusion, suggesting that those fragments are aligned with the shared core.
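The four-way grouping used in this paper can be expressed as a small decision function over RD values. The sketch below is illustrative only: the function name `classify_homodimer` is hypothetical, and the sole criterion applied is the RD < 0.5 threshold. It is exercised with the whole-chain RD values reported for 1BFM (Table 2) and for 1Y7Q (quoted in the text above).

```python
def classify_homodimer(rd_monomers, rd_dimer, threshold=0.5):
    """Assign a homodimer to one of the four groups defined in the Introduction.

    rd_monomers : RD values of the two monomers treated as individual units.
    rd_dimer    : RD value of the complete dimer.
    A hydrophobic core is regarded as present when RD < threshold.
    Returns the group number (1-4), or None for combinations not covered
    by the four groups (e.g., no core in the dimer or in either monomer).
    """
    monomer_cores = sum(1 for rd in rd_monomers if rd < threshold)
    dimer_core = rd_dimer < threshold
    if dimer_core and monomer_cores == 2:
        return 1  # core in each monomer and in the dimer
    if dimer_core and monomer_cores == 0:
        return 2  # core in the dimer only
    if not dimer_core and monomer_cores == 2:
        return 3  # core in both monomers but not in the dimer
    if dimer_core and monomer_cores == 1:
        return 4  # core in one monomer and in the dimer
    return None

print(classify_homodimer((0.441, 0.412), 0.383))  # 1BFM -> 1
print(classify_homodimer((0.553, 0.514), 0.421))  # 1Y7Q -> 2
```

A hard threshold is a simplification: values close to 0.5 (such as 0.514 for chain B of 1Y7Q) sit near the boundary, so the group assignment is best read together with the fragment-level RD values.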


Table 3. RD coefficients describing the status of the hydrophobic core in the dimer and in each monomer of 1Y7Q. The table also lists the status (with respect to the monomer and the dimer) of individual helical folds. “No P-P” indicates parts of the monomers and the dimer following elimination of residues involved in protein-protein interactions.

1Y7Q               Individual Monomer      Monomer in Dimer
                   A          B            A          B
Monomer            0.553      0.514        0.410      0.430
Helix
  43–51            0.571      0.484        0.522      0.521
  60–76            0.563      0.558        0.312      0.330
  81–98            0.488      0.430        0.452      0.454
  100–108          0.313      0.361        0.353      0.361
  113–126          0.426      0.381        0.372      0.393
No P-P             0.511      0.478        0.471 (dimer)
Complete Dimer                             0.421

Comparing RD values for both monomers and for the dimer reveals symmetry in the shared core, while individual assessment of each monomeric core points to certain differences: it seems that each monomer adapts to the shared core in a slightly different manner. We can conclude that most residues contribute to the shared core rather than to individual cores. From the point of view of a single monomer, involvement of its residues in the shared (dimeric) core introduces deformations in the individual (monomeric) core.
The role of individual helical folds varies. Three helices (four in the B chain) follow the expected monomeric distribution. When considering the dimer as a whole, four folds remain accordant with theoretical predictions. Of particular note is the helix at 60–76, which diverges from the idealized distribution in each monomer while exerting a strong stabilizing influence on the dimer (low RD values). This phenomenon can be explained by pointing out that the helix forms part of the interface zone and therefore plays a crucial part in quaternary stabilization.
From a functional point of view, the complex contains a zinc finger-associated SCAN domain, which is a homologue of the HIV-1 capsid protein. Hydrophobic core analysis may therefore provide a new model for protein-protein interactions underpinning viral particle formation [53].
This highly simplified, one-dimensional presentation of the presented case (Figure 2) reveals how two structures with divergent distributions produce a coherent Gaussian in close correspondence with theoretical predictions. The effect of elimination of interface residues is also depicted. It seems that in this case the interface plays a significant role in shaping the common (dimeric) core.

Figure 2. Schematic depiction of a system in which individual monomers lack hydrophobic cores but the dimer as a whole contains a well-defined core. Elimination of residues involved in complexation distorts the dimeric core while bolstering the monomeric cores. The black frame (right-hand schema) symbolizes the interface, which has been eliminated from the RD calculation to characterize its influence on the status of the dimer, as well as on the status of each monomer. Blue and pink lines: distributions in monomers; gray line: distribution in dimer. (A) Status of the individual monomers (blue and pink) versus the resultant distribution in the dimer (gray); (B) removal of the interface (black bar) does not change the status of the dimer: the distribution remains accordant with the expected one.

The charts shown in Figure 3 visualize the degree of accordance/discordance between theoretical and observed distributions for individual monomers and for the dimer. Poor accordance of monomeric distributions is particularly evident in the N-terminal portion of the chain, where the observed density is lower than expected (this relation reverses at around 50–70 aa). A different result is obtained when both monomers are regarded as components of a dimer.

Figure 3. Hydrophobicity density distributions (theoretical: blue line; observed: red line) in 1Y7Q. (A) Monomer chain A and B respectively; (B) chain A and B in dimer (common ellipsoid for both chains).

2.3. Hydrophobic Core Absent in the Dimer but Present in both Monomers

This category is represented by the metal-sensing transcriptional repressor from Staphylococcus aureus in the Zn2-form (1R1V).

The dimer lacks a shared core while each monomer retains a well-defined core. The status of β-sheets both in the dimer and in each monomer separately indicates good accordance with the theoretical model. We can therefore conclude that β-sheets contribute to structural stability of both the dimer and each monomer.

Most secondary folds in 1R1V appear to stabilize their parent monomer rather than the protein-protein complex. This is particularly true of the helixes at 25–38 and 41–50 (Table 4).

Some other folds (such as the β-fragment at 68–74) play an equal role in the stabilization of monomeric units and the dimer. Elimination of interface residues improves the dimer's accordance, suggesting that, at least in the case of 1R1V, protein complexation is mediated by forces other than hydrophobic interactions. Further analysis indicates that strong electrostatic forces are at play since the interface is composed almost entirely of polar residues.

The helixes at 52–66 and 84–100 do not align with the hydrophobic core; instead, they are involved in binding ligands (Zn2+ ions), which is critical for the protein's biological function. The labile nature of both helixes is related to the conformational changes required when interacting with DNA strands [46].

Figure 4 presents a model in which each monomer conforms to the idealized Gaussian while the dimer as a whole diverges from theoretical predictions. Elimination of interface residues improves the accordance of the dimer.
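The "elimination of interface residues" comparison used throughout this section can be outlined in code. The following is only a minimal sketch. It assumes the usual fuzzy oil drop definition of the relative distance, RD = D(O|T) / (D(O|T) + D(O|R)), where D is the Kullback-Leibler divergence, O the observed profile, T the theoretical (Gaussian) profile and R a uniform reference; this definition is not restated in the present section. The function names and the toy one-dimensional profile are hypothetical and do not come from any PDB entry.

```python
import math

def normalize(values):
    """Scale non-negative values so that they sum to 1."""
    total = sum(values)
    return [v / total for v in values]

def kl_divergence(p, q):
    """D_KL(p || q) in bits; terms with p_i == 0 contribute nothing."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def relative_distance(observed, theoretical, skip=()):
    """RD = D(O|T) / (D(O|T) + D(O|R)) with R uniform (assumed definition).

    skip: indices of residues left out of the calculation, e.g. interface
    residues in a "No P-P" style comparison. The remaining values are
    renormalized before the divergences are computed.
    """
    skipped = set(skip)
    keep = [i for i in range(len(observed)) if i not in skipped]
    o = normalize([observed[i] for i in keep])
    t = normalize([theoretical[i] for i in keep])
    r = [1.0 / len(keep)] * len(keep)
    o_t = kl_divergence(o, t)
    o_r = kl_divergence(o, r)
    return o_t / (o_t + o_r)

# Toy profile (hypothetical): a bell-shaped theoretical distribution and an
# observed one that matches it except for three polar "interface" positions.
theoretical = [math.exp(-((i - 5) ** 2) / 8.0) for i in range(11)]
observed = list(theoretical)
observed[4] = observed[5] = observed[6] = 0.05

rd_full = relative_distance(observed, theoretical)
rd_no_pp = relative_distance(observed, theoretical, skip=(4, 5, 6))
```

In this toy case `rd_no_pp` is lower than `rd_full`, mirroring the qualitative pattern described for 1R1V, where removing the interface improves the accordance of the dimer.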


Table 4. RD coefficients describing the status of the hydrophobic core in the dimer and in each monomer of 1R1V. Values given in bold are fragments of status expressed by RD > 0.5. H and B denote the secondary form of the particular fragment, helical and β-structural respectively. Position "No P-P" expresses the status of dimer with residues in interface eliminated from calculation.

                     RD
1R1V                 Individual Monomer      Monomer in Dimer
                     A         B             A         B
Monomer              0.476     0.477         0.639     0.638
9–24 H               0.595     0.496         0.695     0.716
25–38 H              0.478     0.496         0.789     0.780
41–50 H              0.232     0.212         0.537     0.510
52–66 H              0.544     0.543         0.510     0.504
68–74 B              0.140     0.148         0.093     0.093
77–83 B              0.542     0.575         0.548     0.591
84–100 H             0.538     0.534         0.528     0.536
β-sheet              0.304     0.354         0.302     0.356
No P-P               0.467     0.465              0.477
Complete Dimer                                    0.638
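The reading of Table 4 given in the text can be checked mechanically. The sketch below transcribes the fragment rows of the table and applies the RD > 0.5 criterion from the caption to find fragments that are accordant within their parent monomer but discordant within the dimer. The dictionary layout and the helper name are illustrative choices, not part of the original analysis.

```python
THRESHOLD = 0.5  # RD above this value is treated as discordant (see Table 4 caption)

# fragment: (individual A, individual B, in-dimer A, in-dimer B), from Table 4
TABLE_4 = {
    "9-24 H":   (0.595, 0.496, 0.695, 0.716),
    "25-38 H":  (0.478, 0.496, 0.789, 0.780),
    "41-50 H":  (0.232, 0.212, 0.537, 0.510),
    "52-66 H":  (0.544, 0.543, 0.510, 0.504),
    "68-74 B":  (0.140, 0.148, 0.093, 0.093),
    "77-83 B":  (0.542, 0.575, 0.548, 0.591),
    "84-100 H": (0.538, 0.534, 0.528, 0.536),
    "beta-sheet": (0.304, 0.354, 0.302, 0.356),
}

def monomer_only_stabilizers(table, chain):
    """Fragments with RD <= 0.5 in the individual monomer but RD > 0.5 in the dimer."""
    offset = {"A": 0, "B": 1}[chain]
    return [
        name
        for name, values in table.items()
        if values[offset] <= THRESHOLD and values[offset + 2] > THRESHOLD
    ]

chain_a = monomer_only_stabilizers(TABLE_4, "A")
chain_b = monomer_only_stabilizers(TABLE_4, "B")
```

For chain A this selects the helixes at 25–38 and 41–50, the two fragments singled out in the text; for chain B the helix at 9–24 (0.496 versus 0.716) also meets the criterion.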

Figure 4. Schematic depiction of a system in which the dimer lacks a clear hydrophobic core while each monomer retains a well-defined core. Elimination of residues involved in complexation (right-hand image) leads to better accordance with the model when considering the dimer as a whole. Blue and pink lines: distributions in monomers; gray line: distribution in dimer. (A) Status of the individual monomers (blue and pink) versus the resultant distribution in dimer (gray); (B) removal of the interface (black bar) changes the status of the dimer: the distribution appears accordant with the expected one.

2.4. Hydrophobic Core Absent in the Dimer but Present in a Single Monomer

This category is represented by 2CD0: human lambda-6 light chain dimer (Bence-Jones protein), a variable domain from the lambda-6 type immunoglobulin light chain [47].

The protein is a homodimer consisting of two V immunoglobulin domains. Despite sharing a β-sandwich structure, both domains differ with respect to their FOD status. The dimer itself diverges from the model as does one of its monomeric subunits. Nevertheless, individual secondary folds appear to play a similar role in the stabilization of each monomer; for example, the upper β-sheet (called the upper core) is a good match for theoretical predictions in the dimer as well as in both monomers, while the lower β-sheet (lower core) significantly diverges from the idealized distribution in each case. 2CD0 also contains a disulfide bond which appears to further stabilize its tertiary conformation. The loop bounded by SS bond-forming Cys residues conforms to the idealized distribution with good accuracy, suggesting a measure of coaction between both stabilizing factors (disulfides and hydrophobic interactions). It seems that the formation of the SS bond is enabled by hydrophobic forces which guide Cys residues towards the center of the protein body and into close proximity to each other. The resulting bonds do not, however, contribute to the formation of a shared (dimeric) hydrophobic core [54].


Of note is the variable status of β-folds: fragments 8–13, 19–26, 34–38, 45–48 and 69–76. Each of these folds contributes to the stability of either its parent monomer or the dimer.

Interface residues do not play an important role in structural stabilization of 2CD0 since eliminating them does not appreciably change the RD value.

Under natural conditions, the V domain of the immunoglobulin light chain undergoes complexation with the corresponding domain of the heavy chain. Pathological conditions lead to formation of an excessive number of light chain dimers. The asymmetric nature of the homodimer (Table 5) may be a consequence of "replacing" the heavy chain with another copy of the light chain, producing a complex in which each monomer adopts a different conformation. The role of the interface is schematically presented in Figure 5. Elimination of P-P residues further stabilizes the dimer but does not produce a shared core as defined by the fuzzy oil drop model (RD remains > 0.5).

Table 5. RD coefficients describing the status of the hydrophobic core in the dimer and in each monomer of 2CD0. The table also lists the status (with respect to the monomer and the dimer) of individual folds: helixes (H), β-folds (B), upper core, lower core fragments bounded by Cys residues forming SS bonds, as well as fragments following elimination of residues involved in protein-protein interaction.

                        RD
2CD0                    Individual Monomer      Monomer in Dimer
                        A         B             A         B
                        0.526     0.477         0.719     0.704
Secondary
3–5 B                   0.428     0.400         0.150     0.221
8–13 B                  0.163     0.292         0.600     0.611
19–26 B                 0.535     0.547         0.361     0.411
27–31 H                 0.603     0.566         0.678     0.689
34–38 B                 0.175     0.137         0.561     0.582
45–48 B                 0.119     0.131         0.771     0.753
61–66 B                 0.104     0.100         0.211     0.275
69–76 B                 0.437     0.341         0.531     0.506
79–83 H                 0.571     0.576         0.670     0.657
84–92 B                 0.489     0.294         0.416     0.360
95–99 B                 0.370     0.176         0.363     0.344
102–107 B               0.747     0.634         0.866     0.822
β-sheet-Upper core      0.359     0.321         0.376     0.398
β-sheet-Lower core      0.515     0.537         0.793     0.757
SS-bond loop            0.487     0.453         0.681     0.695
No P-P                  0.516     0.446              0.686
Complete Dimer                                       0.711
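The categories distinguished in Sections 2.3 and 2.4 follow directly from three numbers: the RD of each individual monomer and the RD of the complete dimer. The sketch below is a hypothetical helper, not the authors' software, that applies the RD = 0.5 threshold to the whole-chain values reported in Tables 4 and 5; the category labels are paraphrases of the section headings.

```python
THRESHOLD = 0.5  # RD below this value is read as "hydrophobic core present"

def has_core(rd):
    """True when the RD value indicates accordance with the idealized distribution."""
    return rd < THRESHOLD

def classify_dimer(rd_monomer_a, rd_monomer_b, rd_dimer):
    """Name the interface category from monomer and dimer RD values."""
    monomer_cores = sum(has_core(rd) for rd in (rd_monomer_a, rd_monomer_b))
    if has_core(rd_dimer):
        if monomer_cores == 0:
            return "core in dimer only"
        return "core in dimer and in %d monomer(s)" % monomer_cores
    return {
        0: "no core in dimer or monomers",
        1: "core in a single monomer, absent in dimer",
        2: "core in both monomers, absent in dimer",
    }[monomer_cores]

# Whole-chain RD values from Table 4 (1R1V) and Table 5 (2CD0).
category_1r1v = classify_dimer(0.476, 0.477, 0.638)
category_2cd0 = classify_dimer(0.526, 0.477, 0.711)
```

With these inputs 1R1V falls into the "core in both monomers" category and 2CD0 into the "single monomer" category, in line with the section headings.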

Figure 5. Schematic depiction of a system in which only one monomer contains a prominent hydrophobic core. Elimination of residues involved in complexation (right-hand image) leads to better accordance with the model when considering the dimer as a whole. Blue and pink lines: distributions in monomers; gray line: distribution in dimer. (A) Status of the individual monomers (blue and pink) versus the resultant distribution in dimer (gray); (B) the removal of the interface (black bar) does not change the status of the dimer: the distribution is still discordant with the expected one.


The V domain dimer also plays an important role in amyloidogenesis research, given that products of the major human V kappa and V lambda gene families have been identified in AL deposits. One particular subgroup (lambda-6) has been found to be preferentially associated with this disease. Comparative analysis of the status of individual β-folds in immunoglobulin-like domains indicates that, despite a high degree of structural similarity, they differ with respect to their adherence to the fuzzy oil drop model [55].

2.5. Complexes Stabilized through Formation of an Additional Shared Domain in the Interface Zone

An entirely unexpected form of interface is observed in the 1DXX and 1QAG homodimers. Here, fragments contributed by each monomer assemble to form a quasi-domain which is highly accordant with the idealized hydrophobicity distribution model and contributes to the protein's biological activity. Note that both 1DXX and 1QAG are components of a cross-membrane link between the actin-based cytoskeleton of the muscle cell and the extracellular matrix; they must therefore adapt to external stimuli (e.g., stretching) and revert to their original conformation when such stimuli are not present.

2.6. Dystrophin (1DXX)

Figure 6 presents hydrophobicity density distribution profiles for complete domains CH1 and CH2 of dystrophin. Results (RD parameters) for the actin-binding N-terminal fragment of human dystrophin are listed in Table 6.

Figure 6. Theoretical (blue line) and observed (red line) hydrophobicity density distribution profiles in dystrophin: (A) CH1 (complete domain); (B) CH2 (complete domain). Green lines: positions of disease-related mutations; yellow area: fragment interacting with actin.

The "Modified" tag indicates domain structures identified on the basis of the FOD-based definition of a domain. "Modified" means deprived of the fragment of chain identified as participating in the interface-domain. The leftmost column also lists mutations which cause Duchenne's muscular dystrophy. The most detrimental mutations are underlined. Question marks indicate that a given mutation does not belong to the analyzed fragment, but is found in its immediate neighborhood.


Table 6. RD values calculated for the ABCD complex, for AB and CD subcomplexes, for a single chain and for individual domains (CH1, CH2) of dystrophin (1DXX), as well as for selected secondary structural folds. “FOD DOMAIN” represents fragments contributed by two chains (A,B) which form a coherent domain with its own FOD-compliant hydrophobic core.

Dystrophin 1DXX                    Chain Fragment     RD
Complex–Tetramer                   ABCD               0.815
Complex–Dimer                      AB/CD              0.754/0.752
Chain                              A/C                0.839/0.758
Chain                              B/D                0.761/0.762
Domain 1
CH1                                9–121              0.458
CH1 Modified                       9–118              0.462
Helix K18N                         13–32              0.424
Actin binding                      17–26              0.341
Loop                               33–46              0.466
Helix L54R                         47–59              0.596
Loop                               60–68              0.562
Helix                              69–87              0.536
Loop                               88–94              0.212
Helix-Actin binding                95–101             0.299
Helix-Actin binding                103–119            0.343
Domain 2
CH2                                122–246            0.510
CH2 Modified                       135–241            0.336
                                   134–246            0.432
Helix                              135–149            0.309
Loop                               150–159            0.215
Helix D165V                        160–164            0.358
Helix D165V, A168D, L172H          166–177            0.321
Helix                              178–181            0.299
Helix                              182–188            0.195
Helix                              191–206            0.455
Loop                               207–213            0.492
Helix                              214–219            0.134
Loop                               220–223            0.865
Helix Y231N                        224–237            0.325
β                                  242–246            0.224
FOD Domain
A (Helix 119–134) + A (β 242–246) + B (Helix 119–134) + B (β 242–246)     0.239
C (Helix 119–134) + C (β 242–246) + D (Helix 119–134) + D (β 242–246)     0.245

In addition to secondary folds the table also lists residues involved in interaction with actin. The helix at 119–134 is referred to as "Helix X" in [22].

The four-chain complex (ABCD), taken as a whole, does not appear to include a hydrophobic core in the sense of the fuzzy oil drop model. Similar conditions are encountered in both sub-complexes (AB and CD), as well as in individual monomers (A, B, C and D). This strong discordance is caused by the presence of a cavity in the central part of the complex as well as in individual chains.

The CH1 domain is highly accordant, unlike CH2, which does not contain a prominent core. The status of individual β-folds can be assessed on the basis of their RD coefficients, listed in Table 6. It appears that each domain, regardless of its overall status, includes locally discordant fragments. In CH1 the discordant fragment at 47–87 consists of two helixes joined by a twist. In CH2 the twist at 220–223 diverges from the model, likely causing the entire domain to slightly exceed the RD = 0.5 threshold (Figure 7).

Two fragments of CH1, which have been experimentally confirmed to mediate actin binding, retain accordance with the fuzzy oil drop model. It therefore seems likely that the act of complexation requires structural rearrangement of the entire CH1 domain as, contrary to FOD predictions, no hydrophobic residues are exposed on the surface in the protein's native form.

Visual inspection reveals fragments with high (e.g., the first 20 N-terminal residues and the 100–122 C-terminal segment in CH1) and low (e.g., 40–60 in CH1 and 120–140 in CH2) accordance between both distributions. In-depth comparative analysis is provided in Table 6.

Visual inspection of the structure of the AB complex (Figure 8) suggests a different division into domains than the one proposed in SCOP and PDBSum. In particular, an additional quasi-domain can be found sandwiched between CH1 and CH2. This domain is made up of fragments contributed by both chains. Modification of the set of primary domains does not appreciably alter the RD parameter of CH1; however, it significantly improves the accordance of CH2 (Figure 8, Table 6). The quasi-domain, composed of fragments contributed by CH1 and CH2 in the AB interface zone, is also highly accordant, as suggested by T and O distribution profiles (Figure 9) and confirmed by low values of RD (Table 6; refer to the "FOD domain" item). These values are 0.239 and 0.245 for the A/B and C/D complexes respectively.

Figure 7. Theoretical (blue line) and observed (red line) hydrophobicity density distribution profiles. (A) CH1 (modified domain); (B) CH2 (modified domain). Green lines: positions of disease-related mutations. Yellow area: fragment interacting with actin.

Figure 8. Structure of dystrophin: chain A (turquoise) with two fragments of the B chain (dark blue). The red fragments of the A chain interact with the corresponding black fragments of the B chain to produce the FOD domain.


Figure 9. Hydrophobicity profile for the interface-domain in dystrophin comprising fragments of the A chain (left) (119–134 and 242–246) and analogous fragments of the B chain (right). Blue line: expected distribution; red line: observed distribution. The chart reveals good accordance between both profiles. Residues 119–134 form helixes, while residues 242–246 form β-folds.

As a conclusion, we propose a new type of domain consisting of fragments which belong to separate chains (Figure 10).

Figure 10. FOD domain structure distinguished by red ellipses. The color pattern is identical to the one used in Figure 8.

In Table 6 we also present mutations believed to cause Duchenne muscular dystrophy (DMD). The underlined items correspond to mutations which produce the most severe symptoms. Experimental studies indicate that all six mutant proteins are significantly more prone to thermal denaturation and aggregation [48]. It would be interesting to visualize the structure of mutant domains. Note that the most severe mutations affect a domain regarded as stable (under FOD criteria), although L54R is located in a fragment which exhibits high RD. It seems that any disruption in the complex interplay between stable and unstable fragments may severely impair the function of dystrophin, which (in its native form) needs to accommodate the stresses produced by stretching the membrane-actin link but nevertheless remain capable of reverting to its original conformation.

2.7. Utrophin

Much like dystrophin, the utrophin chain, when considered as a standalone unit, diverges from the idealized hydrophobicity density distribution model. The chain itself resembles dystrophin in that it consists of two distinct domains; however, unlike in dystrophin, both domains conform to the FOD model.

In domain 1 (31–130 aa), the helical fragments at 63–75 and 85–103, as well as the adjoining loops (48–62 and 76–84), diverge from the model, whereas in domain 2 only the helix at 206–223 remains divergent. The inter-domain interface of utrophin differs from the one in dystrophin; only helical fragments are present and their spatial conformation is different. Nevertheless, calculations based on the FOD model indicate that the interface exhibits a hydrophobicity density distribution which is remarkably similar to its dystrophin counterpart and also in good agreement with the model (although dystrophin conforms to the model with greater accuracy).

In general, the constituent domains of both proteins share similar FOD characteristics. Figure 11 illustrates the relations between their corresponding RD values. The correlation coefficient calculated for all analogous fragments is 0.715 (0.855 when excluding the single outlier, i.e., the loop at 235–238 in 1QAG and the corresponding loop at 220–223 in 1DXX; this loop is exposed on the surface and likely plays a role in interaction with the protein's environment). The modified domain 2 (deprived of its N-terminal fragment) exhibits higher RD, but the value is still below 0.5. This suggests that the helix at 151–164 serves as a stabilizing connector for domain 2 as well as for the FOD quasi-domain.
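The effect of a single outlier on a correlation coefficient, as reported above for the exposed loop, can be illustrated with a minimal Python sketch. The per-fragment RD values for dystrophin are given in Table 6 and are not repeated in this section, so the numbers below are synthetic placeholders rather than measured RD values. The function names are hypothetical, and the Pearson coefficient is an assumption, since the text does not state which correlation measure was used.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


def pearson_without(xs, ys, skip_index):
    """Pearson correlation after dropping the pair at skip_index."""
    keep = [i for i in range(len(xs)) if i != skip_index]
    return pearson([xs[i] for i in keep], [ys[i] for i in keep])


# Synthetic placeholder values (NOT taken from Table 6 or Table 7).
# The last pair plays the role of the exposed-loop outlier.
toy_rd_a = [0.20, 0.30, 0.40, 0.50, 0.60]
toy_rd_b = [0.25, 0.35, 0.45, 0.55, 0.10]

r_all = pearson(toy_rd_a, toy_rd_b)
r_trimmed = pearson_without(toy_rd_a, toy_rd_b, skip_index=4)
print(f"all pairs: {r_all:.3f}; outlier removed: {r_trimmed:.3f}")
```

With real data, the two input lists would hold the RD values of corresponding fragments in the two proteins, in matching order.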


Figure 11. Relation between the status of individual secondary folds in dystrophin and in utrophin. The blue circle marks an outlier which corresponds to an exposed loop. Red: helices, gray: loops.

Much like dystrophin, utrophin also contains a quasi-domain comprised of fragments contributed by individual chains (two helices per chain). Figure 12 depicts these fragments (black fragment in chain A and red fragment in chain B). The resulting structure is hydrophobically stable, with an RD value of 0.453 (Table 7), indicating the presence of a well-defined hydrophobic core which stabilizes the complex.

Table 7. RD values calculated for the AB complex, for chains A and B and for individual domains of utrophin (1QAG), as well as for selected secondary structural folds. “FOD Domain” represents fragments contributed by two chains (A, B) which form a coherent quasi-domain with its own FOD-compliant hydrophobic core.

Utrophin 1QAG      Fragment                    RD
Complex            AB                          0.781
Chain              A/B                         0.738/0.738
Domain 1           31–130                      0.493
Helix              31–47                       0.312
Loop               48–62                       0.573
Helix              63–75                       0.673
Loop               76–84                       0.560
Helix              85–103                      0.526
Loop               104–110                     0.106
Helix              111–118                     0.294
Helix              119–130                     0.312
Domain 2           149–251                     0.390
Domain Modified    167–251                     0.427
Helix              151–164                     0.316
Loop               165–175                     0.190
Helix              176–180                     0.450
Helix              182–192                     0.243
Helix              193–197                     0.224
Helix              198–205                     0.281
Helix              206–223                     0.634
Loop               224–228                     0.368
Helix              229–234                     0.206
Loop               235–238                     0.450
Helix              239–251                     0.294
FOD Domain         A(134–166) + B(134–166)     0.453
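The fragment-level reading of Table 7 can be reproduced with a short Python sketch. It assumes that RD values of 0.5 or more are read as discordance with the FOD model, which is the threshold implied by the remark that the modified domain 2 remains "below 0.5". The variable and function names are illustrative only, and this is not the authors' software.

```python
# RD values of utrophin (1QAG) secondary-structure fragments, copied from Table 7.
TABLE7_FRAGMENTS = [
    ("Helix", "31-47", 0.312),
    ("Loop", "48-62", 0.573),
    ("Helix", "63-75", 0.673),
    ("Loop", "76-84", 0.560),
    ("Helix", "85-103", 0.526),
    ("Loop", "104-110", 0.106),
    ("Helix", "111-118", 0.294),
    ("Helix", "119-130", 0.312),
    ("Helix", "151-164", 0.316),
    ("Loop", "165-175", 0.190),
    ("Helix", "176-180", 0.450),
    ("Helix", "182-192", 0.243),
    ("Helix", "193-197", 0.224),
    ("Helix", "198-205", 0.281),
    ("Helix", "206-223", 0.634),
    ("Loop", "224-228", 0.368),
    ("Helix", "229-234", 0.206),
    ("Loop", "235-238", 0.450),
    ("Helix", "239-251", 0.294),
]

# Assumption: RD >= 0.5 is interpreted as discordant with the FOD model.
RD_THRESHOLD = 0.5


def discordant(fragments, threshold=RD_THRESHOLD):
    """Return the fragments whose RD meets or exceeds the threshold."""
    return [(kind, span, rd) for kind, span, rd in fragments if rd >= threshold]


for kind, span, rd in discordant(TABLE7_FRAGMENTS):
    print(f"{kind} {span}: RD = {rd:.3f}")
```

Under this threshold the sketch lists the loops at 48–62 and 76–84 and the helices at 63–75, 85–103 and 206–223, which are the fragments named as divergent in the text on utrophin.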

Figure 12. Utrophin homodimer (chain A: dark blue; chain B: teal). Fragments contributed to the domain interface by chains A and B are marked in black and red, respectively.

The chart depicted in Figure 13 includes fragments contributed by chains A and B, revealing a highly coherent structure with its own hydrophobic core. This phenomenon suggests that the specific interplay of both chains exerts a stabilizing influence on the complex as a whole.

Figure 13. Utrophin: theoretical (blue line) and observed (red line) hydrophobicity density profiles. The similarity between both distributions is evident.

Figure 14 schematically depicts the presence of an additional quasi-domain in the interface zone. In this case, the quasi-domain is composed of inter-domain connectors (131–148) and the N-terminal fragment of domain 2 (149–166).

Figure 14. Schematic representation of the interface domain formed by both chains in the interface area. Blue and red lines: individual chains; green line: the additional domain generated by fragments of individual chains.

3. Discussion

The fuzzy oil drop model, which describes the hydrophobic core status in monomeric units (domains) and larger complexes (dimers), may assist researchers in the study of mechanisms responsible for quaternary structural stabilization. A dimeric complex may form around a shared hydrophobic core with a coherent distribution of hydrophobicity throughout the protein body. Alternatively, each monomer may retain its own hydrophobic core while interacting with its partner. An interesting phenomenon is observed in the case of dystrophin and utrophin: these proteins generate an independent quasi-domain in the interface zone, consisting of parts of both monomers and exhibiting very high accordance with the theoretical model. This solution appears to be correlated with the biological properties of both proteins: both dystrophin and utrophin link actin filaments (N-terminal domains) with membrane glycoproteins (C-terminal sites) and must withstand substantial external forces, which produce structural deformations. Under these conditions structural stabilization by way of non-bonding interactions may prove insufficient, and a shared quasi-domain is needed to provide additional stability.

This domain is not listed in protein domain databases since it is not entirely contained in either monomeric chain. It is also not recognized by CATH, which specifically focuses on individual polypeptide chains. Similarly, the C-terminal β-fold is not subject to formal classification as it lacks a suitable β-“partner” in the opposite chain (cf. secondary conformation scheme in PDBsum). In the case of utrophin and dystrophin the presented structural solution appears to be a consequence of these proteins’ biological properties: specifically, their susceptibility to stretching forces.

Comparative analysis of the dystrophin and utrophin conformations presented in [22] underscores the differing orientation (twisting angle) of domains CH1 and CH2, as well as differences in inter-domain interaction potential. The authors suggest that the dystrophin dimer is the more stable of the two, and note that the utrophin dimer lacks a β-“latch” produced by the C-terminal fragment. From the point of view of the interface domain, the secondary conformation of individual folds is of relatively little importance; instead, the domain relies on the presence of a prominent hydrophobic core. Mutations which cause Duchenne or Becker muscular dystrophy do not seem to affect the analyzed interface domains; hence both diseases are likely unrelated to structural changes in this area [22].

A similar relation between structure and function can be observed in the immunoglobulin-like domain of titin (1TIT) [50,55]. This protein, which is expressed in muscle tissue, must adapt to external forces while retaining shape memory. Much like other proteins discussed in this paper, titin exhibits good accordance with the theoretical hydrophobicity density distribution model, with domain-specific RD values as low as 0.32. It seems that the presence of a well-ordered hydrophobic core is a critical factor in ensuring that the protein reverts to its original conformation in the absence of external forces (note, however, that other immunoglobulin-like domains which also follow the “two-layer sandwich” blueprint do not conform to the model with such high accuracy).

In summary, a high concentration of hydrophobicity density at the center of the domain exerts a stabilizing influence on the domain as a whole. This observation is in agreement with the well-known relationship between hydrophobic interactions and tertiary structural stabilization.

The authors of [56] report that many digitally generated protein-protein interfaces with highly favorable computed binding energies could not be observed in practice. They conclude that there may be important physical chemistry missing in the energy calculations. The algorithm applied in [52,56] can be summarized as follows:

i. Pre-compute a set of high-affinity amino acid residue interactions with the target surface;
ii. Redesign natural protein scaffolds to incorporate a selection of these amino acids;
iii. Design the remainder of the interface to enhance binding affinity.

We attempt to approach the presented issue by performing a holistic analysis of protein conformations on the basis of the fuzzy oil drop model. Some other publications, which deal with changes triggered by point mutations in the structure of protein-protein interfaces, also highlight the need for improved energy functions [52].

Examples of structures characterized by variable FOD status can be found in the immunoglobulin-like domain family [55]. Despite their overall structural and sequence similarity, the conformance of individual folds which make up the β-sandwich varies from protein to protein.

This work attempts to underscore the influence of global (i.e., protein-wide) hydrophobicity density distribution upon the protein’s biological properties, particularly its ability to form complexes. It should be noted that earlier methods which focus entirely on pairwise atom-atom interactions (including interactions between protein atoms and the surrounding water environment) only acknowledge the effects of immediate “interaction partners”, while the structure of interfaces is critically dependent on the overall distribution of hydrophobicity density throughout the protein body, as well as on local deviations from the idealized model. The interface does not form in separation from other parts of the protein; rather, the entire molecule must participate in this process, folding in such a way as to ensure that the contact area remains capable of interacting with the intended ligand. This process relies on correct distribution of hydrophobicity density, including local discordances. Point mutations, not necessarily located close to the interface zone, may disrupt the protein’s hydrophobic core and lead to structural deformations which may render the protein less susceptible to, or indeed entirely incapable of, complexation. In light of this phenomenon, existing in silico folding models, which acknowledge the presence of ligands or external proteins, should undergo further evolution [57].

4. Materials and Methods

4.1. Data

All structures were taken from the Protein Data Bank (PDB) [58]. The identification of interface residues was performed according to PDBsum standards [51] using the distance between interacting atoms.
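A distance-based selection of interface residues can be sketched as follows. This is only an illustration of the general idea: the 4.0 Å cutoff, the data layout (residue number mapped to a list of atom coordinates) and the function name are assumptions made for the example, and the exact PDBsum criteria [51] are not reproduced here.

```python
import math
from typing import Dict, List, Set, Tuple

Atom = Tuple[float, float, float]   # x, y, z in angstroms
Chain = Dict[int, List[Atom]]       # residue number -> atom coordinates


def interface_residues(chain_a: Chain, chain_b: Chain,
                       cutoff: float = 4.0) -> Tuple[Set[int], Set[int]]:
    """Return residues of each chain having an atom within `cutoff` of the other chain.

    The default cutoff of 4.0 angstroms is a placeholder, not a PDBsum setting.
    """
    in_a: Set[int] = set()
    in_b: Set[int] = set()
    for res_a, atoms_a in chain_a.items():
        for res_b, atoms_b in chain_b.items():
            # A residue pair is in contact if any atom pair is close enough.
            if any(math.dist(p, q) <= cutoff for p in atoms_a for q in atoms_b):
                in_a.add(res_a)
                in_b.add(res_b)
    return in_a, in_b


# Toy coordinates, invented for the example (not taken from any PDB entry).
toy_a: Chain = {1: [(0.0, 0.0, 0.0)], 2: [(10.0, 0.0, 0.0)]}
toy_b: Chain = {1: [(0.0, 0.0, 3.0)], 2: [(30.0, 0.0, 0.0)]}
print(interface_residues(toy_a, toy_b))
```

The all-pairs loop is quadratic in the number of atoms; for full-size complexes a spatial grid or k-d tree would normally replace it.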

4.2. Programs

The 3D presentation of protein structures was generated using PyMOL [59]. Calculations of RD values (the complete set of FOD calculations) were performed using our own software.
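For orientation, the RD calculation can be sketched in a few lines of Python. The sketch assumes the definition commonly given in the fuzzy oil drop literature: RD = O|T / (O|T + O|R), where O|T and O|R are Kullback-Leibler divergences [49] of the observed profile (O) from the theoretical profile (T) and from a uniform reference profile (R), respectively. It takes precomputed per-residue profiles as input, does not derive them from coordinates, and is not the authors' in-house software; all names are hypothetical.

```python
import math


def normalize(values):
    """Scale a non-negative profile so that it sums to 1."""
    total = sum(values)
    if total <= 0:
        raise ValueError("profile must have a positive sum")
    return [v / total for v in values]


def kl_divergence(p, q):
    """Kullback-Leibler divergence D(p || q) in bits; zero-probability terms of p are skipped."""
    result = 0.0
    for p_i, q_i in zip(p, q):
        if p_i > 0:
            if q_i <= 0:
                raise ValueError("q must be positive wherever p is positive")
            result += p_i * math.log2(p_i / q_i)
    return result


def relative_distance(observed, theoretical):
    """RD = O|T / (O|T + O|R), with R a uniform profile (assumed definition)."""
    if len(observed) != len(theoretical):
        raise ValueError("profiles must have the same length")
    o = normalize(observed)
    t = normalize(theoretical)
    r = [1.0 / len(o)] * len(o)
    o_t = kl_divergence(o, t)
    o_r = kl_divergence(o, r)
    if o_t + o_r == 0:
        raise ValueError("RD is undefined when O equals both T and R")
    return o_t / (o_t + o_r)


# Toy profiles, invented for the example.
print(relative_distance([1, 2, 3, 2, 1], [1, 3, 5, 3, 1]))
```

Under this definition an observed profile identical to the theoretical one gives RD = 0, a perfectly uniform observed profile gives RD = 1, and values below 0.5 indicate that the observed distribution is closer to the theoretical one than to the uniform one.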

Acknowledgments: The authors would like to express their thanks to Piotr Nowakowski and Anna Śmietańska for valuable suggestions and editorial work. This research was supported by the Jagiellonian University Medical College grant No. K/ZDS/006363.

Author Contributions: Irena Roterman and Leszek Konieczny conceived and designed the experiments; Mateusz Banach performed the experiments; Irena Roterman analyzed the data; Barbara Kalinowska, Jacek Dygut and Monika Piwowar contributed reagents/materials/analysis tools; Irena Roterman wrote the paper. All authors have read and approved the final manuscript.

Conflicts of Interest: The authors declare no conflict of interest.

References

1. Levinthal, C. Are there pathways for protein folding? J. Chem. Phys. 1968, 65, 44–45.
2. Dill, K.A. Polymer principles and protein folding. Protein Sci. 1999, 8, 1166–1180.
3. Leopold, P.E.; Montal, M.; Onuchic, J.N. Protein folding funnels: A kinetic approach to the sequence-structure relationship. Proc. Natl. Acad. Sci. USA 1992, 89, 8721–8725.
4. Dill, K.A.; Chan, H.S. From Levinthal to pathways to funnels. Nat. Struct. Biol. 1997, 4, 10–19.
5. George, R.A.; Heringa, J. SnapDRAGON: A method to delineate protein structural domains from sequence data. J. Mol. Biol. 2002, 316, 839–851.
6. George, R.A.; Lin, K.; Heringa, J. Scooby-domain: Prediction of globular domains in protein sequence. Nucleic Acids Res. 2005, 33, W160–W163.
7. Murzin, A.G.; Brenner, S.E.; Hubbard, T.; Chothia, C. A structural classification of proteins database for the investigation of sequences and structures. J. Mol. Biol. 1995, 247, 536–540.

8. Andreeva, A.; Howorth, D.; Chandonia, J.-M.; Brenner, S.E.; Hubbard, T.J.P.; Chothia, C.; Murzin, A.G. Data growth and its impact on the SCOP database: New developments. Nucleic Acids Res. 2008, 36, D419–D425.
9. Sillitoe, I.; Cuff, A.L.; Dessailly, B.H.; Dawson, N.L.; Furnham, N.; Lee, D.; Lees, J.G.; Lewis, T.E.; Studer, R.A.; Rentzsch, R.; et al. New functional families (FunFams) in CATH to improve the mapping of conserved functional sites to 3D structures. Nucleic Acids Res. 2013, 41.
10. Van der Wel, P.C. Domain swapping and amyloid fibril conformation. Prion 2012, 6, 211–216.
11. Fonseca-Ornelas, L.; Eisbach, S.E.; Paulat, M.; Giller, K.; Fernández, C.O.; Outeiro, T.F.; Becker, S.; Zweckstetter, M. Small molecule-mediated stabilization of vesicle-associated helical α-synuclein inhibits pathogenic misfolding and aggregation. Nat. Commun. 2014, 5, 5857.
12. Lin, L.X.; Bo, X.Y.; Tan, Y.Z.; Sun, F.X.; Song, M.; Zhao, J.; Ma, Z.H.; Li, M.; Zheng, K.J.; Xu, S.M. Feasibility of β-sheet breaker peptide-H102 treatment for Alzheimer's disease based on β-amyloid hypothesis. PLoS ONE 2014, 9, e112052.
13. Rooklin, D.; Wang, C.; Katiqbak, J.; Arora, P.S.; Zhang, Y. α-Space: Fragment-centric topographical mapping to target protein-protein interaction interfaces. J. Chem. Inf. Model. 2015, 55, 1585–1599.
14. Voloshin, V.P.; Kim, A.V.; Medvedev, N.N.; Winter, R.; Geiger, A. Calculation of the volumetric characteristics of biomacromolecules in solution by the Voronoi-Delaunay techniques. Biophys. Chem. 2014, 192, 1–9.
15. Fern, J.T.; Keffer, D.J.; Steele, W.V. Measuring coexisting densities from a two-phase molecular dynamics simulation by Voronoi tessellations. J. Phys. Chem. B 2007, 111, 3469–3475.
16. Kim, W.K.; Ison, J.C. Survey of the geometric association of domain-domain interfaces. Proteins 2005, 61, 1075–1088.
17. Nikolic, A.; Baud, S.; Rauscher, S.; Pomes, R. Molecular mechanism of β-sheet self-organization at water-hydrophobic interfaces. Proteins 2011, 79, 1–22.
18. Kanamori, E.; Igarashi, S.; Osawa, M.; Fukunishi, Y.; Shimada, I.; Nakamura, H. Structure determination of a protein assembly by amino acid selective cross-saturation. Proteins 2011, 79, 179–190.
19. Janin, J.; Henrick, K.; Moult, J.; Eyck, L.T.; Sternberg, M.J.E.; Vajda, S.; Vakser, I.; Wodak, S.J. CAPRI: A Critical Assessment of PRedicted Interactions. Proteins 2003, 52, 2–9.
20. Janin, J. Protein-protein docking tested in blind predictions: The CAPRI experiment. Mol. Biosyst. 2010, 6, 2351–2362.
21. Wodak, S.J.; Méndez, R. Prediction of protein-protein interactions: The CAPRI experiment, its evaluation and implications. Curr. Opin. Struct. Biol. 2004, 14, 242–249.
22. Norwood, F.L.; Sutherland-Smith, A.J.; Keep, N.H.; Kendrick-Jones, J. The structure of the N-terminal actin-binding domain of human dystrophin and how mutations in this domain may cause Duchenne or Becker muscular dystrophy. Structure 2000, 8, 481–491.
23. Koenig, M.; Monaco, A.P.; Kunkel, L.M. The complete sequence of dystrophin predicts a rod-shaped cytoskeletal protein. Cell 1988, 53, 219–228.
24. Tennyson, C.N.; Klamut, H.J.; Worton, R.G. The human dystrophin gene requires 16 hours to be transcribed and is cotranscriptionally spliced. Nat. Genet. 1995, 9, 184–190.
25. Worton, R.G. Dystrophin: The long and short of it. J. Clin. Investig. 1994, 93, 4.
26. Keep, N.H.; Winder, S.J.; Moores, C.A.; Walke, S.; Norwood, F.L.; Kendrick-Jones, J. Crystal structure of the actin-binding region of utrophin reveals a head-to-tail dimer. Structure 1999, 7, 1539–1546.
27. Blake, D.J.; Schofield, J.N.; Zuellig, R.A.; Gorecki, D.C.; Phelps, S.R.; Barnard, E.A.; Edwards, Y.H.; Davies, K.E. G-utrophin, the autosomal homologue of dystrophin Dp116, is expressed in sensory ganglia and brain. Proc. Natl. Acad. Sci. USA 1995, 92, 3697–3701.
28. Matsumura, K.; Ervasti, J.M.; Ohlendieck, K.; Kahl, S.D.; Campbell, K.P. Association of dystrophin-related protein with dystrophin-associated proteins in mdx mouse muscle. Nature 1992, 360, 588–591.
29. Guo, W.-X.A.; Nichol, M.; Merlie, J.P. Cloning and expression of full length mouse utrophin: The differential association of utrophin and dystrophin with AChR clusters. FEBS Lett. 1996, 398, 259–264.

30. Prochniewicz, E.; Henderson, D.; Ervasti, J.M.; Thomas, D.D. Dystrophin and utrophin have distinct effects on the structural dynamics of actin. Proc. Natl. Acad. Sci. USA 2009, 106, 7822–7827.
31. Pearce, M.; Blake, D.J.; Tinsley, J.M.; Byth, B.C.; Campbell, L.; Monaco, A.P.; Davies, K.E. The utrophin and dystrophin genes share similarities in genomic structure. Hum. Mol. Genet. 1993, 2, 1765–1772.
32. Koonin, E.V. Orthologs, paralogs, and evolutionary genomics. Annu. Rev. Genet. 2005, 39, 309–338.
33. Henderson, D.M.; Lee, A.; Ervasti, J.M. Disease-causing missense mutations in actin binding domain 1 of dystrophin induce thermodynamic instability and protein aggregation. Proc. Natl. Acad. Sci. USA 2010, 107, 9632–9637.
34. Konieczny, L.; Bryliński, M.; Roterman, I. Gauss-function-based model of hydrophobicity density in proteins. In Silico Biol. 2006, 6, 15–22.
35. Kauzmann, W. Some factors in the interpretation of protein denaturation. Adv. Protein Chem. 1959, 14, 1–63.
36. Banach, M.; Prymula, K.; Jurkowski, W.; Konieczny, L.; Roterman, I. Fuzzy oil drop model to interpret the structure of antifreeze proteins and their mutants. J. Mol. Model. 2012, 18, 229–237.
37. Roterman, I.; Konieczny, L.; Jurkowski, W.; Prymula, K.; Banach, M. Two-intermediate model to characterize the structure of fast-folding proteins. J. Theor. Biol. 2011, 283, 60–70.
38. Prymula, K.; Jadczyk, T.; Roterman, I. Catalytic residues in hydrolases: Analysis of methods designed for ligand-binding site in proteins. J. Comput. Aided Mol. Des. 2011, 25, 117–133.
39. Banach, M.; Konieczny, L.; Roterman, I. Ligand-binding-site recognition. In Protein Folding in Silico; Roterman-Konieczna, I., Ed.; Woodhead Publishing (Currently Elsevier): Oxford, UK; Cambridge, UK; Philadelphia, PA, USA; New Delhi, India, 2012; pp. 78–94.
40. Banach, M.; Konieczny, L.; Roterman, I. Can the structure of hydrophobic core determine the complexation site? In Identification of Ligand Binding Site and Protein-Protein Interaction Area; Roterman-Konieczna, I., Ed.; Springer: Dordrecht, The Netherlands; Heidelberg, Germany; New York, NY, USA; London, UK, 2013; pp. 41–54.
41. Banach, M.; Konieczny, L.; Roterman, I. Use of the “fuzzy oil drop” model to identify the complexation area in protein homodimers. In Protein Folding in Silico; Roterman-Konieczna, I., Ed.; Woodhead Publishing (Currently Elsevier): Oxford, UK; Cambridge, UK; Philadelphia, PA, USA; New Delhi, India, 2012; pp. 95–122.
42. Munson, M.; Balasubramanian, S.; Fleming, K.G.; Nagi, A.D.; O’Brien, R.; Sturtevant, J.M.; Regan, L. What makes a protein a protein? Hydrophobic core designs that specify stability and structural properties. Protein Sci. 1996, 5, 1584–1593.
43. Nyarko, A.; Cochrun, L.; Norwood, S.; Pursifull, N.; Voth, A.; Barbar, E. Ionization of His 55 at the dimer interface of dynein light-chain LC8 is coupled to dimer dissociation. Biochemistry 2005, 44, 14248–14225.
44. Starich, M.R.; Sandman, K.; Reeve, J.N.; Summers, M.F. NMR structure of HMfB from the hyperthermophile, Methanothermus fervidus, confirms that this archaeal protein is a histone. J. Mol. Biol. 1996, 255, 187–203.
45. Ivanov, D.; Stone, J.R.; Maki, J.L.; Collins, T.; Wagner, G. Mammalian SCAN domain dimer is a domain-swapped homolog of the HIV capsid C-terminal domain. Mol. Cell 2005, 17, 137–143.
46. Pozharski, E.; Hewagama, A.; Shanafelt, A.; Petsko, G.; Ringe, D. PDB ID 1R1V. Available online: https://www.rcsb.org (accessed on 29 July 2009).
47. Pokkuluri, P.R.; Solomon, A.; Weiss, D.T.; Stevens, F.J.; Schiffer, M. Tertiary structure of human lambda 6 light chains. Amyloid 1999, 6, 165–171.
48. Hoffman, E.P.; Brown, R.H.; Kunkel, L.M. Dystrophin: The protein product of the Duchenne muscular dystrophy locus. Cell 1987, 53, 219–228.
49. Kullback, S.; Leibler, R.A. On information and sufficiency. Ann. Math. Stat. 1951, 22, 79–86.
50. Roterman, I.; Banach, M.; Kalinowska, B.; Konieczny, L. Influence of the aqueous environment on protein structure: A plausible hypothesis concerning the mechanism of amyloidogenesis. Entropy 2016, 18, 351.

51. De Beer, T.A.P.; Berka, K.; Thornton, J.M.; Laskowski, R.A. PDBsum additions. Nucleic Acids Res. 2014, 42, D292–D296.
52. Moretti, R.; Fleishman, S.J.; Agius, R.; Torchala, M.; Bates, P.A.; Kastritis, P.L.; Rodrigues, J.P.; Trellet, M.; Bonvin, A.M.; Cui, M.; et al. Community-wide evaluation of methods for predicting the effect of mutations on protein-protein interactions. Proteins 2013, 81, 1980–1987.
53. Khoury, G.A.; Liwo, A.; Khatib, F.; Zhou, H.; Chopra, G.; Bacardit, J.; Bortot, L.O.; Faccioli, R.A.; Deng, X.; He, Y.; et al. Foldit Players. We Fold: A competition for protein structure prediction. Proteins 2014, 82, 1850–1868.
54. Banach, M.; Kalinowska, B.; Konieczny, L.; Roterman, I. Role of Disulfide Bonds in Stabilizing the Conformation of Selected Enzymes: An Approach Based on Divergence Entropy Applied to the Structure of Hydrophobic Core in Proteins. Entropy 2016, 18, 67.
55. Banach, M.; Konieczny, L.; Roterman, I. The fuzzy oil drop model, based on hydrophobicity density distribution, generalizes the influence of water environment on protein structure and function. J. Theor. Biol. 2014, 359, 6–17.
56. Fleishman, S.J.; Whitehead, T.A.; Strauch, E.M.; Corn, J.E.; Qin, S.; Zhou, H.X.; Mitchell, J.C.; Demerdash, O.N.; Takeda-Shitaka, M.; Terashi, G.; et al. Community-wide assessment of protein-interface modeling suggests improvements to design methodology. J. Mol. Biol. 2011, 414, 289–302.
57. Konieczny, L.; Roterman, I. Conclusions. In Protein Folding in Silico; Roterman-Konieczna, I., Ed.; Woodhead Publishing (Currently Elsevier): Oxford, UK; Cambridge, UK; Philadelphia, PA, USA; New Delhi, India, 2012; pp. 191–202.
58. Berman, H.M.; Westbrook, J.; Feng, Z.; Gilliland, G.; Bhat, T.N.; Weissig, H.; Shindyalov, I.N.; Bourne, P.E. The protein data bank. Nucleic Acids Res. 2000, 28, 235–242.
59. The PyMOL Molecular Graphics System, Version 1.7 Schrödinger, LLC. Available online: https://www.pymol.org (accessed on 1 May 2015).

© 2016 by the authors; licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC-BY) license (http://creativecommons.org/licenses/by/4.0/).