Structure Validation

Crystal → Data → Phases → Structure

Remember

• The electron density is related to the intensity.
• The model reflects the fit of the sequence into the electron density.
• The model is refined against the electron density to give the “best” fit.

But how do we know the model is correct?

Evaluating the model

Every model should be treated as a hypothesis. The first model may contain many errors.

Model evaluation is a necessary step to minimize possible errors.

The final model will be the result of an iterative process of fitting and refinement.

A good model will have most of the features listed below:

• It makes chemical sense and satisfies all that is known about the macromolecule.
• It fits the electron density for the whole molecule, including side chains.
• It has few clashes (contacts closer than the sum of the van der Waals radii).
• Missing density is much better than extra density.

Refining the structure

Refinement is used to improve the fit of the model to the observed electron density:

Fcalc => Fobs

Parameters refined: x, y, z and B for all atoms. B is the temperature factor, which models the isotropic vibration of the atom about the point (x, y, z).

Target function that is minimized:

$$Q = \sum_{hkl} w(h,k,l)\,\bigl(|F_{obs}(h,k,l)| - |F_{calc}(h,k,l)|\bigr)^2$$

$\partial Q/\partial u_j = 0$, where the $u_j$ are all the atomic parameters.

Each atom has 4 parameters, and each parameter requires a minimum of 3 observations (reflections) for a free-atom least-squares refinement, so a protein of N atoms requires 12N observations.

Stereochemical restrained refinement
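As a quick check of the counting argument (4 parameters per atom, at least 3 observations per parameter), the bookkeeping can be sketched in Python. This is only an illustration: the function names are hypothetical, the figure of 9 atoms per residue is the average assumed in these notes, and the atom and reflection counts for 1AD3 and 1OQC are the ones quoted in the PDB headers later in these notes (protein + heterogen + solvent atoms).

```python
# Sketch of the observation-to-parameter bookkeeping for free-atom refinement.
# Function names are hypothetical; the numbers are taken from these notes.

PARAMS_PER_ATOM = 4   # x, y, z and one isotropic B
OBS_PER_PARAM = 3     # minimum observations per parameter quoted in the notes


def observations_needed(n_atoms):
    """Minimum number of unique reflections for free-atom refinement (12N)."""
    return n_atoms * PARAMS_PER_ATOM * OBS_PER_PARAM


def obs_to_param_ratio(n_reflections, n_atoms):
    """Observed reflections divided by the number of refined parameters."""
    return n_reflections / (n_atoms * PARAMS_PER_ATOM)


# A 100-residue protein, assuming an average of 9 non-hydrogen atoms per residue.
print(observations_needed(100 * 9))

# Ratios for two entries discussed later, counting all non-hydrogen atoms.
print(round(obs_to_param_ratio(22435, 6964 + 88 + 284), 2))   # 1AD3, 2.6 A
print(round(obs_to_param_ratio(41788, 3728 + 212 + 623), 2))  # 1OQC, 1.8 A
```

Both ratios come out below 3, which is the situation that stereochemical restraints are meant to address.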

• Free-atom refinement needs 12 observations (unique reflections) per atom.
• Amino acids have 4 (Gly) to 14 (Trp) non-hydrogen atoms; assume the average amino acid has 9 atoms.
• A 100-residue protein will therefore need more than 10,000 reflections.
• The amount of protein data is limited by diffraction quality, unit cell and space group.
• For crystals diffracting to 2 Å resolution or worse, the observation-to-parameter ratio is considerably less than 3.
• The remedy is to include stereochemical restraints (bond lengths, bond angles, aromatic ring planarity, etc.), in which each observed value is compared to its ideal value.

Validating the structure

Importance of structural validation: crystallographic refinement, although standardized, is still subjective in terms of:

• Quality of the initial phases from MAD/MIR, SAD/SIR or the search model chosen for molecular replacement
• Which refinement program/technique is used
• How temperature factors were modeled (isotropic/anisotropic)
• Extent of solvent substructure
• Whether alternative conformations were used

Important: if you do not validate your structure, someone else will!

Sources of error

1. In the worst case, the model may be completely wrong!

Recent examples: asparaginase/glutaminase, photoactive yellow protein (McRee et al. PNAS, 1989, 86, 6533-6537 vs. Borgstahl, G.E, et al. Biochemistry, 1995, 34, 6278-6287).

2. In other cases, secondary structure elements may be mostly correct but incorrectly connected. Also, register errors (frame shifts) may occasionally go undetected and uncorrected (although this is very rare with high-resolution structures).

3. The most common type of model-building error is locally incorrect main-chain or side-chain conformations. These errors typically occur at low resolution with poor phases. In addition, multiple conformations may not be well resolved even at relatively high resolution (~2 Å), which makes the side-chain density hard to interpret.

Errors may come from bad sequence information (deletions/insertions between the actual sequence and what is known), bad molecular replacement, inappropriate temperature factor models, etc.

…from Kleywegt, G.J., Validation of protein crystal structures, Acta Cryst., 2000, D56, 249-265

Quality indicators

Quality indicators may be:
(i) Derived from the (real-space) coordinates and B-factors only (that is, indicators that may be computed from the PDB structure alone)
(ii) Derived from the coordinates and B-factors together with the crystallographic data.

Quality indicators also differ in what they measure:
(i) How well the refinement program has succeeded in imposing restraints (i.e. deviations from ideal geometry as implemented in the code used)
(ii) How well the refined model fits the experimental data that were used to build it (e.g. conventional R factor)
(iii) Aspects of the model that are orthogonal to the refinement process (e.g. free R values, patterns of non-bonded interactions, the distribution of conformational torsion angles as used in the Ramachandran plot)

Global methods assess the quality of the model as a whole (e.g. conventional and free R factors). Local methods provide information on the correctness of the model at the level of a residue or group of residues.

Properties of a good quality indicator

A good quality indicator is:

1. testable on real-space coordinates and/or crystallographic data
2. strongly correlated with structure quality (good structure -> good indicator)
3. as independent (orthogonal) as possible from the refinement process
4. automatable and not too time-consuming computationally

Examples of a good quality indicator: Ramachandran plot (outliers)

From Kleywegt, G.J. and Jones, T.A., Homo Crystallographicus - Quo Vadis?, Structure, 10, 465-472
From EDS server, PDB: 1QYS

Some model quality indicators

Non-bonded contacts/Goodness-of-Fit
- VdW interactions
- Shape complementarity
- Hydrogen bonding

Electron Density Fitting
1. R-factor, free R factor
2. B-values

Basic Geometry
- Bond lengths and angles
- Planarity: peptide planes; rings in side chains (His, Phe, Tyr, Trp)

Dihedral Angles
- Φ,Ψ (Ramachandran plot), Ω
- Χ angles for side chains (rotamer library)
- Χ3 for disulfides
- Other dihedral angles (Cα, C, N, Cα+1)

Chirality and Secondary Structure
- L-amino acids
- Right-handed α-helices, twist in β-strands
- Preferred packing angles between strands/helices

Adapted from Morris et al., Stereochemical Quality of Protein Structure Coordinates, Proteins, 1992, 12, 345-364

The crystallographic R-factor

The R-factor gives the agreement between Fobs and Fcalc

$$R_{cryst} = \frac{\sum_{hkl} |F_{obs}(hkl) - k F_{calc}(hkl)|}{\sum_{hkl} |F_{obs}(hkl)|}$$

where Fobs and Fcalc are structure-factor amplitudes and k is a scale factor.

The R-factor => 0 as the model becomes more correct.

The R-factor is generally in the range of 15% to 25%, mainly due to:
• disordered residues
• a poor bulk solvent model
• isotropic treatment of thermal parameters
• errors in the measured intensities

Generally the R-factor decreases as data resolution increases.

The free R-factor

An R-factor free of model bias

The Rfree also gives the agreement between Fobs and Fcalc

$$R_{free} = \frac{\sum_{h'k'l'} |F_{obs}(h'k'l') - k F_{calc}(h'k'l')|}{\sum_{h'k'l'} |F_{obs}(h'k'l')|}$$

The set of observations (h'k'l') is excluded from the refinement.

Rfree is used to limit overfitting of the data: adding parameters reduces Rcryst, and Rfree should be reduced as well.

If Rfree goes up, you are overfitting the data.

Rfree is generally larger than Rcryst

Rfree for a good model is less than 30%
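The two R-factors differ only in which reflections enter the sums. A minimal Python sketch, assuming the amplitudes are already available as plain lists and that a boolean flag marks the test-set reflections (the function names, the flag list and the toy numbers are illustrative, not the interface of any refinement program):

```python
def r_factor(f_obs, f_calc, scale=1.0):
    """sum |Fobs - k*Fcalc| / sum |Fobs| over the reflections supplied."""
    numerator = sum(abs(fo - scale * fc) for fo, fc in zip(f_obs, f_calc))
    denominator = sum(abs(fo) for fo in f_obs)
    return numerator / denominator


def r_work_and_free(f_obs, f_calc, is_free, scale=1.0):
    """Split reflections by the free flag and return (R_work, R_free)."""
    rows = list(zip(f_obs, f_calc, is_free))
    work = [(fo, fc) for fo, fc, free in rows if not free]
    test = [(fo, fc) for fo, fc, free in rows if free]
    r_work = r_factor(*zip(*work), scale=scale)
    r_free = r_factor(*zip(*test), scale=scale)
    return r_work, r_free


# Toy amplitudes: the last reflection is set aside as the test set.
f_obs = [100.0, 50.0, 80.0, 20.0]
f_calc = [90.0, 55.0, 80.0, 25.0]
flags = [False, False, False, True]
print(r_work_and_free(f_obs, f_calc, flags))
```

In a real refinement the test set is a random few percent of the data (8% and 5% in the PDB entries discussed later), selected before refinement starts and never used in the target function.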

The difference between R and Rfree is smaller for higher resolution and well-refined structures.

Bond lengths and angles

• Bond lengths and angles are determined with high accuracy from small-molecule crystals and very high resolution protein structures (cf. for instance Engh & Huber, ref. below).
• Average values are known, as well as standard deviations (which can be translated into force constants for refinement). These are among the most important parameters that are restrained during refinement:

$$E = \dots + \sum \frac{1}{\sigma_b}(d - d_0)^2 + \sum \frac{1}{\sigma_\Theta}(\Theta - \Theta_0)^2$$

Thus measurement of local deviations from ideality is one of the easiest validation tests that can be done on a structure. Although bond lengths and angles are explicitly optimized during refinement, a significant deviation from ideality (i.e. typically more than several σ) is often the signature of a significant problem in the model.

For a well refined structure:
R.M.S.D. bond lengths < 0.1 Å
R.M.S.D. bond angles < 1.5°
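The R.M.S.D. from ideal bond lengths is simple to compute from coordinates. The sketch below uses the backbone atoms of ARG 1 from the 1NXB ATOM records shown later in these notes. The ideal values are the main-chain bond lengths commonly quoted from Engh & Huber; they are entered here from memory as an assumption and should be checked against the paper. The function name is hypothetical.

```python
import math

# Ideal main-chain bond lengths in Angstrom (assumed values, commonly quoted
# from Engh & Huber 1991; verify against the paper before relying on them).
IDEAL_BOND = {("N", "CA"): 1.458, ("CA", "C"): 1.525, ("C", "O"): 1.231}

# Backbone atoms of ARG 1, copied from the 1NXB ATOM records in these notes.
ARG1 = {
    "N": (38.091, 6.218, 23.243),
    "CA": (37.811, 7.243, 22.293),
    "C": (37.327, 7.041, 20.877),
    "O": (37.849, 7.212, 19.826),
}


def bond_rmsd(atoms, ideal=IDEAL_BOND):
    """Root-mean-square deviation of the listed bond lengths from ideal."""
    squared = []
    for (name_a, name_b), d0 in ideal.items():
        d = math.dist(atoms[name_a], atoms[name_b])
        squared.append((d - d0) ** 2)
    return math.sqrt(sum(squared) / len(squared))


print(round(bond_rmsd(ARG1), 3))
```

A validation program does the same over every bonded pair in the model and also reports the individual outliers, which are usually more informative than the overall figure.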

Engh, R.A. and Huber, R., Accurate bond and angle parameters for X-Ray protein structure refinement, Acta Cryst., 1991, A47, 392-400

Dihedral Angles

Cα-Cα-Cα angles coupled with Cα-Cα-Cα-Cα dihedral angles can be used as an additional quality indicator. Distributions computed from databases (1343 structures, resolution < 2.0 Å in Kleywegt et al.) show preferred and disallowed (i.e. not occupied in this dataset of high-resolution structures) regions, in a manner analogous to the Ramachandran plot. Implemented in the program MOLEMAN2.

3° by 3° binning and counting define the core (more than 100 in a bin), allowed (between 50 and 100) and generously allowed (between 10 and 50) regions. Fewer than ten defines the disallowed region.

Figure: Cα-Cα-Cα-Cα dihedral statistical analysis for the incorrect (a) and correct (b) model of DDP. Note the clear outliers in the left plot. (Plot label: glycines.)

Kleywegt G.J., Validation of Protein Models from Cα coordinates alone, JMB, 1997, 273, 371-376
Oldfield, T.J. and Hubbard, R.E., Analysis of Cα geometry in protein structures, Proteins, 18, 324-337

Cis - trans conformations

Polypeptides generally adopt conformations where Ω is close to 180°, the so-called trans conformation, because it is sterically favorable to have the successive Cα atoms on opposite “sides” of the planar peptide bond. However, cis peptide bonds (Ω close to 0°) also occur, especially preceding prolines, where they are far more common than for other residues. The graph shows the distribution of omega angles derived by the Thornton group, as used in PROCHECK. However, some large deviations in Ω angles in high resolution structures have been reported (Merritt et al.), so this indicator should be considered with care.
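Ω is the dihedral angle Cα(i)-C(i)-N(i+1)-Cα(i+1), so it can be computed directly from four atom positions. The following Python sketch uses a standard dihedral formula; the helper names and the 30° tolerance used to label a bond cis or trans are illustrative assumptions, not the PROCHECK criteria. The example coordinates are the ARG 1 / ILE 2 atoms from the 1NXB ATOM records shown later in these notes.

```python
import math


def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dihedral(p0, p1, p2, p3):
    """Signed dihedral angle p0-p1-p2-p3 in degrees, in the range [-180, 180]."""
    b0 = _sub(p0, p1)
    b1 = _sub(p2, p1)
    b2 = _sub(p3, p2)
    norm = math.sqrt(_dot(b1, b1))
    b1 = tuple(c / norm for c in b1)
    # Components of b0 and b2 perpendicular to the central bond.
    v = _sub(b0, tuple(_dot(b0, b1) * c for c in b1))
    w = _sub(b2, tuple(_dot(b2, b1) * c for c in b1))
    return math.degrees(math.atan2(_dot(_cross(b1, v), w), _dot(v, w)))


def classify_omega(omega, tolerance=30.0):
    """Label a peptide bond; the tolerance is an arbitrary illustrative choice."""
    if abs(omega) <= tolerance:
        return "cis"
    if abs(omega) >= 180.0 - tolerance:
        return "trans"
    return "distorted"


# Omega between ARG 1 and ILE 2 of 1NXB: CA(1), C(1), N(2), CA(2).
omega = dihedral((37.811, 7.243, 22.293), (37.327, 7.041, 20.877),
                 (35.991, 6.697, 21.121), (35.142, 6.317, 19.987))
print(round(omega, 1), classify_omega(omega))
```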

Morris, A.L. et al., Stereochemical Quality of Protein Structure Coordinates, Proteins, 1992, 12, 345-364
Merritt, E., et al., The 1.25 Å Resolution Refinement of the Cholera Toxin B-pentamer: Evidence of Peptide Backbone Strain at the Receptor-binding Site, JMB, 1998, 282, 1043-1059

Ramachandran plots

Ramachandran plot
Most favored          88.7%
Additional allowed    11.0%
Generously allowed     0%
Disallowed             0.3%

G-factors
Dihedral angles:
  Phi-psi        -0.21
  Chi1-chi2      -0.14
  Chi1 only      -0.02
  Chi3 & chi4     0.49
  Omega          -0.57
  Average         0.14
Main-chain:
  bond lengths    0.63
  bond angles     0.38
  Average         0.49

OVERALL AVERAGE   0.28

Understanding the Ramachandran plot

Ramachandran et al. drew the simple scatterplot of Φ,Ψ for a set of proteins. Because of clashes between backbone atoms (N, Cα, C, O) and the Cβ or other parts of the side chain, only a small part of the Φ,Ψ plot is actually populated.

Figure legend: most favored region, allowed region, generously allowed region, disallowed region (axes: Φ horizontal, Ψ vertical).

The good thing is that the “Ramachandran constraints” are not included in refinement programs, which makes the Ramachandran statistics an orthogonal indicator. In addition, there is an excellent empirical correlation between the quality of structural models (as measured by the resolution) and their compliance with the “Ramachandran constraints” (cf. the plot under “Properties of a good quality indicator”).

Where do the allowed regions come from?

PROCHECK, WHATCHECK and some other tools:

Database of 463 structures and 121,870 residues. Exclude Pro and Gly from the dataset.
1. Discretize the Φ,Ψ plot into 10° by 10° bins, count the residues falling in each bin, and define regions based on the number of residues in each bin:
2. CORE region if more than 100 residues in the bin
3. ALLOWED region if > 7 residues in the bin
4. GENEROUSLY ALLOWED: extend all the allowed regions by 20°.
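The binning procedure above can be sketched in Python as follows. The thresholds (more than 100 and more than 7 residues per bin) are the ones listed above; reading "extend by 20°" as "any bin within two 10° bins of a core or allowed bin" is an assumption of this sketch, as are the function names and the toy data. It is not the PROCHECK code.

```python
import math
from collections import Counter

BIN = 10                 # bin width in degrees
NBINS = 360 // BIN       # bins per axis


def bin_index(angle):
    """Map an angle in degrees to a bin index in [0, NBINS)."""
    return int(math.floor((angle + 180.0) / BIN)) % NBINS


def build_regions(phi_psi, core_min=100, allowed_min=7, extend_bins=2):
    """Classify populated bins, then pad them with a 'generous' border."""
    counts = Counter((bin_index(phi), bin_index(psi)) for phi, psi in phi_psi)
    regions = {}
    for key, n in counts.items():
        if n > core_min:
            regions[key] = "core"
        elif n > allowed_min:
            regions[key] = "allowed"
    # Extend every core/allowed bin by extend_bins bins, wrapping at +/-180.
    for i, j in list(regions):
        for di in range(-extend_bins, extend_bins + 1):
            for dj in range(-extend_bins, extend_bins + 1):
                key = ((i + di) % NBINS, (j + dj) % NBINS)
                regions.setdefault(key, "generous")
    return regions


def classify(regions, phi, psi):
    """Region of a single residue; anything unlisted is disallowed."""
    return regions.get((bin_index(phi), bin_index(psi)), "disallowed")


# Toy reference set: a helical cluster, a small beta cluster, one stray point.
reference = [(-60.0, -45.0)] * 150 + [(-120.0, 130.0)] * 10 + [(60.0, 60.0)]
regions = build_regions(reference)
for angles in [(-60.0, -45.0), (-120.0, 130.0), (-75.0, -45.0), (60.0, 60.0)]:
    print(angles, classify(regions, *angles))
```

Validating a new structure then amounts to calling `classify` for each non-Gly, non-Pro residue and reporting the percentage in each region.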

Various improvements have been proposed (e.g. MolProbity):

1. Database of 500 structures, all at resolution < 1.7 Å, considering only residues with no clashes and main-chain atom B-factors < 40
2. No binning; use of a smoothed distribution:

$$\rho(\varphi,\psi) = \sum_i \sigma_i(\varphi,\psi)$$

where σi(φ,ψ) is a fast-decaying distribution centered at each empirical point.

3. Proline and glycine are treated separately, as are pre-Pro residues (the Pro Cδ can clash and prevents NH H-bonding). The distributions for all cases lead to favored (98% of the distribution) and favored + less favored (99.95%) regions.
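The smoothed distribution in item 2 can be illustrated with a small Python sketch. A Gaussian is used here as the fast-decaying function purely for illustration; the actual kernel and its width in MolProbity are not given in these notes, so both are assumptions, as are the function names and the toy data points.

```python
import math


def angular_diff(a, b):
    """Smallest signed difference a - b in degrees, wrapped to [-180, 180)."""
    return (a - b + 180.0) % 360.0 - 180.0


def density(phi, psi, points, width=10.0):
    """Average of Gaussian bumps centred on each empirical (phi, psi) point.

    The Gaussian kernel and the 10 degree width are assumptions of this
    sketch, chosen only to show a smooth, fast-decaying contribution.
    """
    total = 0.0
    for p_phi, p_psi in points:
        d2 = angular_diff(phi, p_phi) ** 2 + angular_diff(psi, p_psi) ** 2
        total += math.exp(-d2 / (2.0 * width ** 2))
    return total / len(points)


# Toy empirical points; a real reference set has tens of thousands of residues.
points = [(-60.0, -45.0), (-65.0, -40.0), (-120.0, 130.0)]
print(density(-62.0, -42.0, points))   # near the helical cluster
print(density(60.0, 60.0, points))     # far from every point
```

Contour levels enclosing 98% and 99.95% of the reference data would then be drawn on this smooth surface instead of on a histogram.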

Morris, A.L. et al., Stereochemical Quality of Protein Structure Coordinates, Proteins, 1992, 12, 345-364
Lovell S.C., et al., Structure Validation by Cα Geometry, Proteins, 2003, 50, 437-450

Other Ramachandran-like plots

Lovell S.C., et al., Structure Validation by Cα Geometry, Proteins, 2003, 50, 437-450

Ramachandran Outliers

• Whatever the method used to derive the distribution, the validation software detects outliers in a given structure, i.e. residues that have Φ,Ψ angles outside the defined regions.

• Usually outliers are bad. However, there are some cases in known structures where outliers are known to be correct.

• Typical examples include active-site loops, where the strain introduced by the non-canonical Φ,Ψ dihedral angles is compensated by other favorable interactions (hydrogen bonds).

When outliers are present, the things to do are thus:

• check the environment of the outlier residue: is there an explanation for this residue being an outlier?
• look at other quality indicators (particularly other dihedral angles and Cβ deviations) for that particular residue, and check their quality.
• look at the neighboring residues: an error in a neighboring residue may affect the Φ,Ψ values of the residue marked as an outlier.

Cβ distribution

Distribution of Cβ deviations for all residues

Distribution of Cβ deviations for branched-Cβ residues with eclipsed X1

Rotamer expectations and counts for valine (from Dunbrack rot indep. libraries):

Count (%)    Chi1
  7.4         65.5
 73.9        175.9
 18.7        -61.7

Cβ deviations

Lovell et al.'s idea: correlated, very small deviations may indicate some problem in fitting the electron density (ED) at the Cα level. If the model is not accurate locally, the refinement will put strain on bond lengths and angles to fit the ED better.

Idea: from model backbone coordinates, compute ideal Cβ position and compare to model

Figure: the deviation from ideality is the distance between the observed Cβ and the ideal Cβ constructed from the N, Cα and C positions.
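The idea can be sketched in Python. The ideal Cβ is placed here with a widely used empirical formula that builds a virtual Cβ from the N, Cα and C positions; the three coefficients are an assumption of this sketch and not necessarily the construction used by Lovell et al. The function names are hypothetical, and the example coordinates are those of ARG 1 in the 1NXB ATOM records shown later in these notes.

```python
import math


def ideal_cb(n, ca, c):
    """Place a virtual C-beta from backbone atoms (empirical coefficients)."""
    b = [ca[i] - n[i] for i in range(3)]        # N -> CA
    cv = [c[i] - ca[i] for i in range(3)]       # CA -> C
    a = [b[1] * cv[2] - b[2] * cv[1],           # normal to the N-CA-C plane
         b[2] * cv[0] - b[0] * cv[2],
         b[0] * cv[1] - b[1] * cv[0]]
    return tuple(-0.58273431 * a[i] + 0.56802827 * b[i]
                 - 0.54067466 * cv[i] + ca[i] for i in range(3))


def cb_deviation(n, ca, c, cb_observed):
    """Distance in Angstrom between the modelled and the ideal C-beta."""
    return math.dist(ideal_cb(n, ca, c), cb_observed)


# ARG 1 of 1NXB: N, CA, C and CB as listed in the ATOM records.
deviation = cb_deviation((38.091, 6.218, 23.243), (37.811, 7.243, 22.293),
                         (37.327, 7.041, 20.877), (37.923, 8.573, 22.859))
print(round(deviation, 2))
```

With these coordinates the sketch reports a deviation of more than 1 Å for ARG 1, which is in line with the very poor main-chain G-factors listed for 1NXB later in these notes.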

Lovell S.C., et al., Structure Validation by Cα Geometry, Proteins, 2003, 50, 437-450

Side chain rotamer distribution

1. Choose a dataset in which you are confident (i.e. you trust the side-chain conformations in the structures and residues that you consider). Different choices (among other parameters) have led to different rotamer libraries.
- Dunbrack et al.: 518 chains from the PDB, resolution better than 2.0 Å, < 50% identity
- Lovell et al.: 240 structures, resolution < 1.7 Å, < 50% identity, plus individual selection of the residues used (low clash score, B-factor cutoff of 40).

2. Compute the distribution of χ angles for each type of side chain. If a backbone-dependent rotamer library is needed, the distribution of χ angles can also be binned according to the (Φ,Ψ) angles.

3. Find the different modes (average values) from the distributions
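Steps 2 and 3 can be illustrated for χ1 with a short Python sketch. It assigns each χ1 value to one of three 120° wells and reports the population and circular mean of each well. The fixed well boundaries, the labels and the toy angles are assumptions made for illustration; published rotamer libraries use more careful statistics (e.g. the Bayesian analysis of Dunbrack and Cohen).

```python
import math


def circular_mean(angles_deg):
    """Mean of angles in degrees that respects the wrap-around at +/-180."""
    s = sum(math.sin(math.radians(a)) for a in angles_deg)
    c = sum(math.cos(math.radians(a)) for a in angles_deg)
    return math.degrees(math.atan2(s, c))


def chi1_modes(chi1_values):
    """Return {well: (percentage, mean chi1)} for the populated wells."""
    wells = {"g+": [], "t": [], "g-": []}
    for chi in chi1_values:
        wrapped = chi % 360.0
        if wrapped < 120.0:
            wells["g+"].append(chi)      # around +60 degrees
        elif wrapped < 240.0:
            wells["t"].append(chi)       # around 180 degrees
        else:
            wells["g-"].append(chi)      # around -60 degrees
    total = len(chi1_values)
    return {name: (100.0 * len(vals) / total, circular_mean(vals))
            for name, vals in wells.items() if vals}


# Toy chi1 sample in degrees.
sample = [170.0, -175.0, 180.0, 60.0, 65.0, -60.0]
for name, (percent, mean) in chi1_modes(sample).items():
    print(name, round(percent, 1), round(mean, 1))
```

The output has the same shape as the valine table given earlier in these notes: one row per rotamer with its population and its mean χ1.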

Dunbrack, R. and Cohen, F.E., Bayesian statistical analysis of protein side-chain rotamer preferences, Protein Science, 1997, 6, 1661-1681

Preferred side chain rotamers

Due to VdW repulsion between groups of atoms in most side chains, side chains tend to adopt well-defined conformations clustered around a small number of discrete values (modes) of their χ angles.

Different side chains have different numbers of χ angles, from χ1 up to χ4 for the longest side chains.

E.g. staggered methyls are usually the preferred conformations (orbital overlap).

Validation software

WHATIF / WHATCHECK : http://swift.cmbi.kun.nl/WIWWWI/ (web interface)
- anomalous bond lengths and angles
- planarity of planar groups
- chirality checks
- Ramachandran, omega checks
- packing estimation
- proline pucker

PROCHECK : http://www.biochem.ucl.ac.uk/~roman/procheck/procheck.html (web interface)
- anomalous bond lengths and angles
- planarity of planar groups
- chirality checks
- Ramachandran, omega checks
- clash checks
- disulfide bond distance and χ3 angle
- side-chain rotamer (χ1 only)
- H-bond quality (but no network optimization)

MolProbity : http://kinemage.biochem.duke.edu/molprobity/
- Cβ deviations
- Ramachandran
- clash checks (contact dots)
- H-bond network optimization + Asn/Gln/His flip optimization

EDS (Electron Density Server): http://fsrv1.bmc.uu.se/eds/

Properties of a well refined model

• Agrees with known chemistry!!!
• Fits the electron density
  - R between 15% and 25%
  - Rfree between 20% and 30%
• Has good stereochemistry
  - RMSD bond lengths less than 0.1 Å
  - RMSD bond angles less than 1.5°
  - Few outliers in the Ramachandran plot
• Reasonable thermal parameters
  - B factors between 5 and 30
  - main-chain B’s smaller than side-chain B’s
• Few clashes
  - Zero close contacts (including packing)
• Solvent
  - Fewer than 1 solvent molecule per residue
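The numerical items in this checklist are easy to automate. The Python sketch below applies them to the statistics quoted in the 1AD3 header later in these notes. The dictionary keys and the function name are hypothetical, and treating the R and Rfree ranges as upper limits only is an assumption of this sketch. The residue count of 892 assumes residues 2 to 447 are modelled in each of the two chains.

```python
def check_model(stats):
    """Return the checklist items that the reported statistics fail."""
    problems = []
    if stats["r_work"] > 0.25:
        problems.append("R above 25%")
    if stats["r_free"] > 0.30:
        problems.append("Rfree above 30%")
    if stats["rmsd_bonds"] >= 0.1:
        problems.append("RMSD bond lengths not below 0.1 A")
    if stats["rmsd_angles"] >= 1.5:
        problems.append("RMSD bond angles not below 1.5 degrees")
    if not 5.0 <= stats["mean_b"] <= 30.0:
        problems.append("mean B outside 5-30")
    if stats["n_solvent"] >= stats["n_residues"]:
        problems.append("one or more solvent molecules per residue")
    return problems


# Values from the 1AD3 REMARK 3 records reproduced later in these notes.
stats_1ad3 = {
    "r_work": 0.177,
    "r_free": 0.279,
    "rmsd_bonds": 0.008,
    "rmsd_angles": 1.457,
    "mean_b": 18.4,
    "n_solvent": 284,
    "n_residues": 2 * 446,
}
print(check_model(stats_1ad3))
```

Such a check only covers the global numbers. The local indicators (Ramachandran outliers, clashes, fit to density) still have to be inspected residue by residue.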

Important: if you do not validate your structure, someone else will!

MolProbity assisted refinement

Structure validation in all stages of refinement
• Produce a chemically reasonable model as early as possible
• Quickly identify hot spots
• Speed up the refinement process
• Produce high quality structures

Figure: MolProbity diagnosis and repair. Left: red, bad contacts. Right: green, good contacts plus H-bond.

An all-atom contact analysis tool.
Contains the most up-to-date Ramachandran and side chain rotamer criteria.

Arendall, et al., Journal of Structural and Functional Genomics, 6: 1-11 (2005).
kinemage.biochem.duke.edu/molprobity/index-king.html

Combining refinement & validation

Conventional refinement protocol

Crude “original” model → Manual adjustments (Graphics) → Refinement programs → Validation

SECSG protocol - MolProbity assisted refinement

Crude “original” model → Validation → Manual adjustments (Graphics) → Refinement programs

Arendall, et al., Journal of Structural and Functional Genomics, 6: 1-11 (2005).

MolProbity versus traditional refinement

Figure: two plots, Clash Score and Ramachandran Outliers. Legend for both plots: PDB Sample, SECSG Conventional, SECSG MolProbity, Current Set.

MolProbity assisted refinement ensures that structures produced are of the highest quality.

Arendall, et al., Journal of Structural and Functional Genomics, 6: 1-11 (2005).

How to interpret a Protein Data Bank entry

Three cases

1NXB 1980: high resolution, diffractometer data, sealed tube source

1AD3 1996: moderate resolution, image plate data, rotating anode source

1OQC 2003: moderate resolution, CCD data, synchrotron data

Date of Entry and Reference 1NXB

HEADER NEUROTOXIN (POST-SYNAPTIC) 08-AUG-80 1NXB
COMPND NEUROTOXIN $B (PROBABLY IDENTICAL TO ERABUTOXIN $B)
SOURCE SEA SNAKE (LATICAUDA SEMIFASCIATA) FROM PHILIPPINES SEA,
SOURCE 2 BROAD-BANDED BLUE VENOMOUS SEA SNAKE, PROBABLY IDENTICAL
SOURCE 3 TO SEA SNAKE FROM SEA OF JAPAN
AUTHOR D.TSERNOGLOU,G.A.PETSKO
REVDAT 4 22-OCT-84 1NXBC 1 SEQRES
REVDAT 3 30-SEP-83 1NXBB 1 REVDAT
REVDAT 2 13-JUN-83 1NXBA 1 REMARK
REVDAT 1 27-JAN-81 1NXB 0
REMARK 1
REMARK 1 REFERENCE 1
REMARK 1 AUTH D.TSERNOGLOU,G.A.PETSKO,R.A.HUDSON
REMARK 1 TITL STRUCTURE AND FUNCTION OF SNAKE VENOM CURARIMIMETIC
REMARK 1 TITL 2 NEUROTOXINS
REMARK 1 REF MOL.PHARMACOL. V. 14 710 1978
REMARK 1 REFN ASTM MOPMA3 US ISSN 0026-895X 197
REMARK 1 REFERENCE 2
REMARK 1 AUTH D.TSERNOGLOU,G.A.PETSKO,J.E.MC*QUEEN *JUNIOR,
REMARK 1 AUTH 2 J.HERMANS
REMARK 1 TITL MOLECULAR GRAPHICS. APPLICATION TO THE STRUCTURE
REMARK 1 TITL 2 DETERMINATION OF A SNAKE VENOM NEUROTOXIN
REMARK 1 REF SCIENCE V. 197 1378 1977
REMARK 1 REFN ASTM SCIEAS US ISSN 0036-8075 038

Resolution and R-Factor 1NXB

REMARK 2
REMARK 2 RESOLUTION. 1.38 ANGSTROMS.
REMARK 3
REMARK 3 REFINEMENT. RESTRAINED LEAST-SQUARES REFINEMENT BY THE
REMARK 3 METHOD OF HENDRICKSON AND KONNERT.
REMARK 4
REMARK 4 THE R-FACTOR FOR THE CURRENT COORDINATE SET IS 0.24.
REMARK 4 REFINEMENT IS CONTINUING AND SOME MODIFICATION TO THE
REMARK 4 CURRENT COORDINATE SET IS TO BE EXPECTED. ALL DATA TO
REMARK 4 1.38 ANGSTROMS WERE COLLECTED ON ONE CRYSTAL. THE
REMARK 4 STRUCTURE WAS DETERMINED WITH 1.5 MG OF PROTEIN. THE
REMARK 4 WATER POSITIONS AND THERMAL PARAMETERS HAVE NOT BEEN
REMARK 4 REFINED AND ARE SUBJECT TO CHANGE WITHOUT NOTICE. THE
REMARK 4 SULFATE IONS HAVE BEEN IDENTIFIED BY PH AND SALT DIFFERENCE
REMARK 4 MAPS.
REMARK 5
REMARK 5 THE PROTEIN IS PROBABLY IDENTICAL TO ERABUTOXIN FROM
REMARK 5 JAPANESE SEA SNAKES. THIS POINT IS DISCUSSED IN REFERENCES
REMARK 5 3 AND 4 ABOVE.
REMARK 6
REMARK 6 RESIDUE NUMBERING IS SEQUENTIAL. IN PUBLISHED PAPERS A
REMARK 6 GENERAL HOMOLOGY SEQUENCE NUMBERING IS USED OFTEN INSTEAD
REMARK 6 OF SEQUENTIAL NUMBERING OF RESIDUES. SEE REFERENCE 1 FOR
REMARK 6 DETAILS.

Sequence and Unit Cell 1NXB

SEQRES 1 62 ARG ILE CYS PHE ASN GLN HIS SER SER GLN PRO GLN THR
SEQRES 2 62 THR LYS THR CYS SER PRO GLY GLU SER SER CYS TYR HIS
SEQRES 3 62 LYS GLN TRP SER ASP PHE ARG GLY THR ILE ILE GLU ARG
SEQRES 4 62 GLY CYS GLY CYS PRO THR VAL LYS PRO GLY ILE LYS LEU
SEQRES 5 62 SER CYS CYS GLU SER GLU VAL CYS ASN ASN
SHEET 1 A 5 THR 13 CYS 17 0
SHEET 2 A 5 ILE 2 HIS 7 -1 N CYS 3 O LYS 15
SHEET 3 A 5 GLY 34 GLY 40 -1 O ILE 37 N SER 8
SHEET 4 A 5 TYR 25 ASP 31 -1 N TYR 25 O GLY 40
SHEET 5 A 5 LYS 51 CYS 54 -1 N SER 53 O HIS 26
TURN 1 B12 HIS 7 GLN 10
TURN 2 B14 SER 18 GLU 21 ORIG SEQ PROBLY IN ERROR HERE
TURN 3 B34 ASP 31 GLY 34 TIP OF *TOXIC LOOP*
TURN 4 B5X LYS 47 ILE 50
TURN 5 B5Y SER 57 CYS 60 SMALL LOOP AT END OF CHAIN
SSBOND 1 CYS 3 CYS 24
SSBOND 2 CYS 17 CYS 41
SSBOND 3 CYS 43 CYS 54
SSBOND 4 CYS 55 CYS 60
SITE 1 AMS 3 ASP 31 PHE 32 ARG 33
SITE 2 ETS 1 TRP 29
SITE 3 ELS 2 LYS 27 LYS 47
CRYST1 49.900 46.600 21.300 90.00 90.00 90.00 P 21 21 21 4

Ramachandran plot 1NXB

Ramachandran plot
Most favored          62.7%
Additional allowed    27.5%
Generously allowed     7.8%
Disallowed             2.0%

G-factors
Dihedral angles:
  Phi-psi         -1.33**
  Chi1-chi2       -1.45**
  Chi1 only       -0.60*
  Chi3 & chi4      0.19
  Omega           -1.49**
  Average         -1.13**
Main-chain:
  bond lengths    -6.79**
  bond angles    -21.47**
  Average        -15.31**

OVERALL AVERAGE   -6.10**

Atoms Records 1NXB

ATOM 1 N ARG 1 38.091 6.218 23.243 1.00 16.74
ATOM 2 CA ARG 1 37.811 7.243 22.293 1.00 14.94
ATOM 3 C ARG 1 37.327 7.041 20.877 1.00 10.12
ATOM 4 O ARG 1 37.849 7.212 19.826 1.00 15.48
ATOM 5 CB ARG 1 37.923 8.573 22.859 1.00 15.31
ATOM 6 CG ARG 1 37.448 9.591 21.915 1.00 11.37
ATOM 7 CD ARG 1 35.910 9.792 21.877 1.00 10.78
ATOM 8 NE ARG 1 35.110 9.817 23.125 1.00 9.29
ATOM 9 CZ ARG 1 35.627 10.423 24.218 1.00 10.45
ATOM 10 NH1 ARG 1 35.890 11.704 24.113 1.00 14.25
ATOM 11 NH2 ARG 1 35.723 9.671 25.425 1.00 14.91
ATOM 12 N ILE 2 35.991 6.697 21.121 1.00 12.86
ATOM 13 CA ILE 2 35.142 6.317 19.987 1.00 13.00
ATOM 14 C ILE 2 34.052 7.243 19.543 1.00 11.70
ATOM 15 O ILE 2 33.278 7.324 20.451 1.00 14.10
ATOM 16 CB ILE 2 34.897 4.846 20.146 1.00 13.89
ATOM 17 CG1 ILE 2 36.334 4.094 20.148 1.00 14.80
ATOM 18 CG2 ILE 2 34.240 4.200 18.927 1.00 10.78
ATOM 19 CD1 ILE 2 35.837 2.651 19.897 1.00 14.90
ATOM 20 N CYS 3 34.054 7.676 18.314 1.00 9.79
ATOM 21 CA CYS 3 32.759 8.379 17.896 1.00 12.83
ATOM 22 C CYS 3 31.856 7.671 16.912 1.00 9.20
ATOM 23 O CYS 3 32.532 6.862 16.137 1.00 9.42
ATOM 24 CB CYS 3 32.877 9.803 17.587 1.00 8.19
ATOM 25 SG CYS 3 34.159 10.505 18.764 1.00 8.75

Date of Entry, Compound & Source 1AD3

HEADER OXIDOREDUCTASE 25-JUN-96 1AD3
TITLE CLASS 3 ALDEHYDE DEHYDROGENASE COMPLEX WITH
TITLE 2 NICOTINAMIDE-ADENINE-DINUCLEOTIDE
COMPND MOL_ID: 1;
COMPND 2 MOLECULE: ALDEHYDE DEHYDROGENASE (CLASS 3);
COMPND 3 CHAIN: A, B;
COMPND 4 SYNONYM: ALDH;
COMPND 5 EC: 1.2.1.5;
COMPND 6 ENGINEERED: YES;
COMPND 7 BIOLOGICAL_UNIT: DIMER
SOURCE MOL_ID: 1;
SOURCE 2 ORGANISM_SCIENTIFIC: RATTUS NORVEGICUS;
SOURCE 3 ORGANISM_COMMON: RAT;
SOURCE 4 ORGAN: LIVER;
SOURCE 5 EXPRESSION_SYSTEM: ESCHERICHIA COLI;
SOURCE 6 EXPRESSION_SYSTEM_STRAIN: BH101;
SOURCE 7 EXPRESSION_SYSTEM_PLASMID: PTA1DH
KEYWDS NADP, OXIDOREDUCTASE, AROMATIC ALDEHYDE
EXPDTA X-RAY DIFFRACTION
AUTHOR Z.-J.LIU,J.ROSE,B.C.WANG
REVDAT 1 07-JUL-97 1AD3 0

Resolution and R-Factor 1AD3

REMARK 2
REMARK 2 RESOLUTION. 2.6 ANGSTROMS.
REMARK 3
REMARK 3 REFINEMENT.
REMARK 3 PROGRAM : X-PLOR 3.1
REMARK 3 AUTHORS : BRUNGER
REMARK 3
REMARK 3 DATA USED IN REFINEMENT.
REMARK 3 RESOLUTION RANGE HIGH (ANGSTROMS) : 2.6
REMARK 3 RESOLUTION RANGE LOW (ANGSTROMS) : 8.0
REMARK 3 DATA CUTOFF (SIGMA(F)) : 2.
REMARK 3 DATA CUTOFF HIGH (ABS(F)) : NULL
REMARK 3 DATA CUTOFF LOW (ABS(F)) : NULL
REMARK 3 COMPLETENESS (WORKING+TEST) (%) : 80.00
REMARK 3 NUMBER OF REFLECTIONS : 22435
REMARK 3
REMARK 3 FIT TO DATA USED IN REFINEMENT.
REMARK 3 CROSS-VALIDATION METHOD : THROUGHOUT
REMARK 3 FREE R VALUE TEST SET SELECTION : RANDOM
REMARK 3 R VALUE (WORKING SET) : 0.177
REMARK 3 FREE R VALUE : 0.279
REMARK 3 FREE R VALUE TEST SET SIZE (%) : 8.
REMARK 3 FREE R VALUE TEST SET COUNT : 1914
REMARK 3 ESTIMATED ERROR OF FREE R VALUE : 0.006

Stereochemistry R.M.S.D.s 1AD3

REMARK 3 NUMBER OF NON-HYDROGEN ATOMS USED IN REFINEMENT.
REMARK 3 PROTEIN ATOMS : 6964
REMARK 3 NUCLEIC ACID ATOMS : 0
REMARK 3 HETEROGEN ATOMS : 88
REMARK 3 SOLVENT ATOMS : 284
REMARK 3
REMARK 3 B VALUES.
REMARK 3 FROM WILSON PLOT (A**2) : 23.624
REMARK 3 MEAN B VALUE (OVERALL, A**2) : 18.4
REMARK 3
REMARK 3 ESTIMATED COORDINATE ERROR.
REMARK 3 ESD FROM LUZZATI PLOT (A) : 0.35
REMARK 3 ESD FROM SIGMAA (A) : NULL
REMARK 3 LOW RESOLUTION CUTOFF (A) : NULL
REMARK 3
REMARK 3 CROSS-VALIDATED ESTIMATED COORDINATE ERROR.
REMARK 3 ESD FROM C-V LUZZATI PLOT (A) : NULL
REMARK 3 ESD FROM C-V SIGMAA (A) : NULL
REMARK 3
REMARK 3 RMS DEVIATIONS FROM IDEAL VALUES.
REMARK 3 BOND LENGTHS (A) : 0.008
REMARK 3 BOND ANGLES (DEGREES) : 1.457
REMARK 3 DIHEDRAL ANGLES (DEGREES) : 23.50
REMARK 3 IMPROPER ANGLES (DEGREES) : 1.404

Data Collection Parameters 1AD3

REMARK 200
REMARK 200 EXPERIMENTAL DETAILS
REMARK 200 EXPERIMENT TYPE : X-RAY DIFFRACTION
REMARK 200 DATE OF DATA COLLECTION : JUN-1994
REMARK 200 TEMPERATURE (KELVIN) : 289
REMARK 200 PH : 6.2
REMARK 200 NUMBER OF CRYSTALS USED : 1
REMARK 200
REMARK 200 SYNCHROTRON (Y/N) : N
REMARK 200 RADIATION SOURCE : X-RAY GENERATOR
REMARK 200 BEAMLINE : NULL
REMARK 200 X-RAY GENERATOR MODEL : RIGAKU RU200
REMARK 200 MONOCHROMATIC OR LAUE (M/L) : M
REMARK 200 WAVELENGTH OR RANGE (A) : 1.5418
REMARK 200 MONOCHROMATOR : NI FILTER
REMARK 200 OPTICS : SUPPER MIRRORS (SMALL)
REMARK 200
REMARK 200 DETECTOR TYPE : X100
REMARK 200 DETECTOR MANUFACTURER : SIEMENS
REMARK 200 INTENSITY-INTEGRATION SOFTWARE : XENGEN 2.1
REMARK 200 DATA SCALING SOFTWARE : XENGEN 2.1

Data Processing Parameters 1AD3

REMARK 200
REMARK 200 NUMBER OF UNIQUE REFLECTIONS : 25656
REMARK 200 RESOLUTION RANGE HIGH (A) : 2.5
REMARK 200 RESOLUTION RANGE LOW (A) : INFINITY
REMARK 200 REJECTION CRITERIA (SIGMA(I)) : -3.
REMARK 200
REMARK 200 OVERALL.
REMARK 200 COMPLETENESS FOR RANGE (%) : 80.00
REMARK 200 DATA REDUNDANCY : 2.3
REMARK 200 R MERGE (I) : 0.050
REMARK 200 R SYM (I) : NULL
REMARK 200 <I/SIGMA(I)> FOR THE DATA SET : 17.444
REMARK 200
REMARK 200 IN THE HIGHEST RESOLUTION SHELL.
REMARK 200 HIGHEST RESOLUTION SHELL, RANGE HIGH (A) : 2.6
REMARK 200 HIGHEST RESOLUTION SHELL, RANGE LOW (A) : 2.7
REMARK 200 COMPLETENESS FOR SHELL (%) : 50.
REMARK 200 DATA REDUNDANCY IN SHELL : 2.3
REMARK 200 R MERGE FOR SHELL (I) : 0.183
REMARK 200 R SYM FOR SHELL (I) : NULL
REMARK 200 <I/SIGMA(I)> FOR SHELL : 2.29

Non-Crystallographic Symmetry 1AD3

REMARK 295
REMARK 295 NON-CRYSTALLOGRAPHIC SYMMETRY
REMARK 295 THE TRANSFORMATIONS PRESENTED ON THE MTRIX RECORDS BELOW
REMARK 295 DESCRIBE NON-CRYSTALLOGRAPHIC RELATIONSHIPS AMONG ATOMS
REMARK 295 IN THIS ENTRY. APPLYING THE APPROPRIATE MTRIX
REMARK 295 TRANSFORMATION TO THE RESIDUES LISTED FIRST WILL YIELD
REMARK 295 APPROXIMATE COORDINATES FOR THE RESIDUES LISTED SECOND.
REMARK 295 CHAIN IDENTIFIERS GIVEN AS "?" REFER TO CHAINS FOR WHICH
REMARK 295 ATOMS ARE NOT FOUND IN THIS ENTRY.
REMARK 295
REMARK 295 APPLIED TO TRANSFORMED TO
REMARK 295 TRANSFORM CHAIN RESIDUES CHAIN RESIDUES RMSD
REMARK 295 SSS
REMARK 295 M 1 A 2 .. 447 B 2 .. 447 0.189
///
MTRIX1 1 -0.189500 0.065400 -0.979700 85.58990 1
MTRIX2 1 0.063900 -0.994800 -0.078700 101.20960 1
MTRIX3 1 -0.979800 -0.077600 0.184300 77.52180 1

Sequence 1AD3

REMARK 999
REMARK 999 SEQUENCE
REMARK 999 1AD3 A SWS P11883 1 - 1 NOT IN ATOMS LIST
REMARK 999 1AD3 A SWS P11883 448 - 452 NOT IN ATOMS LIST
REMARK 999 1AD3 B SWS P11883 1 - 1 NOT IN ATOMS LIST
REMARK 999 1AD3 B SWS P11883 448 - 452 NOT IN ATOMS LIST
REMARK 999
REMARK 999 RESIDUE 1 OF CHAINS A AND B IS MET IN THE CRYSTALLIZED
REMARK 999 SAMPLE. THIS IS IN CONFLICT WITH SER 1 IN THE SEQUENCE
REMARK 999 REPORTED IN THE SWISSPROT ENTRY. THE CONFLICT IS A
REMARK 999 CLONING ARTIFACT.
DBREF 1AD3 A 2 447 SWS P11883 DHAP_RAT 2 447
DBREF 1AD3 B 2 447 SWS P11883 DHAP_RAT 2 447
SEQRES 1 A 452 MET SER ILE SER ASP THR VAL LYS ARG ALA ARG GLU ALA
SEQRES 2 A 452 PHE ASN SER GLY LYS THR ARG SER LEU GLN PHE ARG ILE
SEQRES 3 A 452 GLN GLN LEU GLU ALA LEU GLN ARG MET ILE ASN GLU ASN
SEQRES 4 A 452 LEU LYS SER ILE SER GLY ALA LEU ALA SER ASP LEU GLY
SEQRES 5 A 452 LYS ASN GLU TRP THR SER TYR TYR GLU GLU VAL ALA HIS
SEQRES 6 A 452 VAL LEU GLU GLU LEU ASP THR THR ILE LYS GLU LEU PRO
SEQRES 7 A 452 ASP TRP ALA GLU ASP GLU PRO VAL ALA LYS THR ARG GLN
SEQRES 8 A 452 THR GLN GLN ASP ASP LEU TYR ILE HIS SER GLU PRO LEU
SEQRES 9 A 452 GLY VAL VAL LEU VAL ILE GLY ALA TRP ASN TYR PRO PHE
SEQRES 10 A 452 ASN LEU THR ILE GLN PRO MET VAL GLY ALA VAL ALA ALA
///

Heterogens and Unit Cell 1AD3

HET NAD A 600 44
HET NAD B 600 44
HETNAM NAD NICOTINAMIDE-ADENINE-DINUCLEOTIDE
FORMUL 3 NAD 2(C21 H27 N7 O14 P2)
FORMUL 4 HOH *276(H2 O1)
///
CISPEP 1 PRO A 443 PRO A 444 0 2.03
CISPEP 2 PRO B 443 PRO B 444 0 2.01
SITE 1 CTA 1 CYS A 216
SITE 1 CTB 1 CYS B 216
CRYST1 64.950 170.950 47.160 90.00 110.25 90.00 P 1 21 1 4

Sequence and Unit Cell 1AD3

ATOM 54 N LYS A 8 24.906 55.805 26.376 1.00 14.99
ATOM 55 CA LYS A 8 24.308 57.152 26.368 1.00 17.54 C
ATOM 56 C LYS A 8 22.816 57.087 26.654 1.00 15.91 C
ATOM 57 O LYS A 8 22.311 57.807 27.515 1.00 24.32 O
ATOM 58 CB LYS A 8 24.500 57.844 25.015 1.00 30.49 C
ATOM 59 CG LYS A 8 25.887 58.393 24.760 1.00 41.49 C
ATOM 60 CD LYS A 8 26.020 58.875 23.319 1.00 51.02 C
ATOM 61 CE LYS A 8 27.485 59.012 22.908 1.00 54.08 C
ATOM 62 NZ LYS A 8 27.644 59.154 21.428 1.00 58.73 N
ATOM 63 H LYS A 8 25.371 55.490 25.572 1.00 15.00 H
ATOM 64 1HZ LYS A 8 27.144 59.999 21.085 1.00 15.00 H
ATOM 65 2HZ LYS A 8 28.655 59.235 21.203 1.00 15.00 H
ATOM 66 3HZ LYS A 8 27.256 58.309 20.959 1.00 15.00 H
ATOM 67 N ARG A 9 22.121 56.224 25.913 1.00 22.35 N
ATOM 68 CA ARG A 9 20.679 56.028 26.062 1.00 21.66 C
ATOM 69 C ARG A 9 20.304 55.785 27.528 1.00 22.14 C
ATOM 70 O ARG A 9 19.353 56.384 28.041 1.00 20.12 O
ATOM 71 CB ARG A 9 20.230 54.830 25.211 1.00 22.29 C
ATOM 72 CG ARG A 9 18.721 54.644 25.090 1.00 25.06 C
ATOM 73 CD ARG A 9 18.369 53.335 24.384 1.00 34.73 C
ATOM 74 NE ARG A 9 19.082 53.189 23.115 1.00 48.57 N
ATOM 75 CZ ARG A 9 19.929 52.196 22.838 1.00 58.41 C
ATOM 76 NH1 ARG A 9 20.175 51.247 23.735 1.00 56.00 N
ATOM 77 NH2 ARG A 9 20.562 52.171 21.672 1.00 59.67 N

Ramachandran plot 1AD3

Ramachandran plot
Most favored          88.7%
Additional allowed    11.0%
Generously allowed     0%
Disallowed             0.3%

G-factors
Dihedral angles:
  Phi-psi        -0.21
  Chi1-chi2      -0.14
  Chi1 only      -0.02
  Chi3 & chi4     0.49
  Omega          -0.57
  Average         0.14
Main-chain:
  bond lengths    0.63
  bond angles     0.38
  Average         0.49

OVERALL AVERAGE   0.28

Date of Entry, Compound & Source 1OQC

HEADER OXIDOREDUCTASE 07-MAR-03 1OQC
TITLE THE CRYSTAL STRUCTURE OF AUGMENTER OF LIVER REGENERATION: A
TITLE 2 MAMMALIAN FAD DEPENDENT SULFHYDRYL OXIDASE
COMPND MOL_ID: 1;
COMPND 2 MOLECULE: AUGMENTER OF LIVER REGENERATION;
COMPND 3 CHAIN: A, B, C, D;
COMPND 4 SYNONYM: ALR;
COMPND 5 ENGINEERED: YES
SOURCE MOL_ID: 1;
SOURCE 2 ORGANISM_SCIENTIFIC: RATTUS NORVEGICUS;
SOURCE 3 ORGANISM_COMMON: RAT;
SOURCE 4 GENE: ALR;
SOURCE 5 EXPRESSION_SYSTEM: ESCHERICHIA COLI;
SOURCE 6 EXPRESSION_SYSTEM_COMMON: BACTERIA;
SOURCE 7 EXPRESSION_SYSTEM_STRAIN: BL21;
SOURCE 8 EXPRESSION_SYSTEM_VECTOR_TYPE: PLASMID;
SOURCE 9 EXPRESSION_SYSTEM_PLASMID: PET16
KEYWDS SULFHYDRYL OXIDASE, LIVER REGENERATION, ALR, HELIX-TURN-
KEYWDS 2 HELIX
EXPDTA X-RAY DIFFRACTION
AUTHOR J.P.ROSE,C.-K.WU,B.-C.WANG
REVDAT 1 15-APR-03 1OQC 0

Resolution and R-Factor 1OQC

REMARK 2
REMARK 2 RESOLUTION. 1.80 ANGSTROMS.
REMARK 3
REMARK 3 REFINEMENT.
REMARK 3 PROGRAM : CNS 1.0
REMARK 3 AUTHORS : BRUNGER,ADAMS,CLORE,DELANO,GROS,GROSSE-
REMARK 3
REMARK 3 REFINEMENT TARGET : ENGH & HUBER
REMARK 3
REMARK 3 DATA USED IN REFINEMENT.
REMARK 3 RESOLUTION RANGE HIGH (ANGSTROMS) : 1.80
REMARK 3 RESOLUTION RANGE LOW (ANGSTROMS) : 27.65
REMARK 3 DATA CUTOFF (SIGMA(F)) : 0.000
REMARK 3 OUTLIER CUTOFF HIGH (RMS(ABS(F))) : 1754209.16000
REMARK 3 COMPLETENESS (WORKING+TEST) (%) : 88.1
REMARK 3 NUMBER OF REFLECTIONS : 41788
REMARK 3
REMARK 3 FIT TO DATA USED IN REFINEMENT.
REMARK 3 CROSS-VALIDATION METHOD : THROUGHOUT
REMARK 3 FREE R VALUE TEST SET SELECTION : RANDOM
REMARK 3 R VALUE (WORKING SET) : 0.202
REMARK 3 FREE R VALUE : 0.240
REMARK 3 FREE R VALUE TEST SET SIZE (%) : 5.000
REMARK 3 FREE R VALUE TEST SET COUNT : 2100
REMARK 3 ESTIMATED ERROR OF FREE R VALUE : 0.005

Stereochemistry R.M.S.D.s 1OQC

REMARK 3 NUMBER OF NON-HYDROGEN ATOMS USED IN REFINEMENT.
REMARK 3 PROTEIN ATOMS : 3728
REMARK 3 NUCLEIC ACID ATOMS : 0
REMARK 3 HETEROGEN ATOMS : 212
REMARK 3 SOLVENT ATOMS : 623
REMARK 3
REMARK 3 B VALUES.
REMARK 3 FROM WILSON PLOT (A**2) : 10.70
REMARK 3 MEAN B VALUE (OVERALL, A**2) : 12.10
REMARK 3
REMARK 3 ESTIMATED COORDINATE ERROR.
REMARK 3 ESD FROM LUZZATI PLOT (A) : 0.19
REMARK 3 ESD FROM SIGMAA (A) : 0.08
REMARK 3 LOW RESOLUTION CUTOFF (A) : 6.00
REMARK 3
REMARK 3 CROSS-VALIDATED ESTIMATED COORDINATE ERROR.
REMARK 3 ESD FROM C-V LUZZATI PLOT (A) : 0.24
REMARK 3 ESD FROM C-V SIGMAA (A) : 0.14
REMARK 3
REMARK 3 RMS DEVIATIONS FROM IDEAL VALUES.
REMARK 3 BOND LENGTHS (A) : 0.006
REMARK 3 BOND ANGLES (DEGREES) : 1.50
REMARK 3 DIHEDRAL ANGLES (DEGREES) : 18.00
REMARK 3 IMPROPER ANGLES (DEGREES) : 1.32

Bulk Solvent & Libraries 1OQC
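The 1OQC refinement records quoted so far (reflection count, atom counts, R values) are enough to revisit the observation-to-parameter argument from the refinement slides. The short Python sketch below is only an illustration: it assumes every non-hydrogen atom carries four refined parameters (x, y, z and an isotropic B), that occupancies are not refined, and that the reported reflection count is the number of observations. The function and variable names are invented for this example.

```python
# Numbers transcribed from the 1OQC REMARK 3 records.
reflections = 41788        # NUMBER OF REFLECTIONS
protein_atoms = 3728       # PROTEIN ATOMS
heterogen_atoms = 212      # HETEROGEN ATOMS
solvent_atoms = 623        # SOLVENT ATOMS
r_work = 0.202             # R VALUE (WORKING SET)
r_free = 0.240             # FREE R VALUE

# Assumption: x, y, z and one isotropic B per atom, nothing else refined.
PARAMS_PER_ATOM = 4


def observation_to_parameter_ratio(n_reflections, n_atoms,
                                   params_per_atom=PARAMS_PER_ATOM):
    """Return observations per refined parameter."""
    return n_reflections / (n_atoms * params_per_atom)


total_atoms = protein_atoms + heterogen_atoms + solvent_atoms
ratio = observation_to_parameter_ratio(reflections, total_atoms)
r_gap = r_free - r_work

print(f"non-hydrogen atoms      : {total_atoms}")
print(f"refined parameters      : {total_atoms * PARAMS_PER_ATOM}")
print(f"observations/parameter  : {ratio:.2f}")
print(f"R-free minus R-work     : {r_gap:.3f}")
```

Under these assumptions the ratio comes out at about 2.3, below the minimum of 3 quoted for free-atom least squares, which is why stereochemical restraints are needed even at 1.8 Å. The ratio would be slightly lower still if the test-set reflections are excluded from the observations.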

REMARK 3 BULK SOLVENT MODELING.
REMARK 3 METHOD USED : FLAT MODEL
REMARK 3 KSOL : 0.38
REMARK 3 BSOL : 44.19
REMARK 3
REMARK 3 NCS MODEL : NULL
REMARK 3
REMARK 3 NCS RESTRAINTS. RMS SIGMA/WEIGHT
REMARK 3 GROUP 1 POSITIONAL (A) : NULL ; NULL
REMARK 3 GROUP 1 B-FACTOR (A**2) : NULL ; NULL
REMARK 3
REMARK 3 PARAMETER FILE 1 : PROTEIN_REP.PARAM
REMARK 3 PARAMETER FILE 2 : WATER.PARAM
REMARK 3 PARAMETER FILE 3 : FAD.PAR
REMARK 3 PARAMETER FILE 4 : NULL
REMARK 3 TOPOLOGY FILE 1 : PROTEIN.TOP
REMARK 3 TOPOLOGY FILE 2 : WATER.TOP
REMARK 3 TOPOLOGY FILE 3 : FAD.TOP
REMARK 3 TOPOLOGY FILE 4 : NULL

Data Collection Parameters 1OQC

REMARK 200
REMARK 200 EXPERIMENTAL DETAILS
REMARK 200 EXPERIMENT TYPE : X-RAY DIFFRACTION
REMARK 200 DATE OF DATA COLLECTION : 21-SEP-1998; 25-JUN-1998
REMARK 200 TEMPERATURE (KELVIN) : 120; 120
REMARK 200 PH : 6.50
REMARK 200 NUMBER OF CRYSTALS USED : 2
REMARK 200
REMARK 200 SYNCHROTRON (Y/N) : Y; Y
REMARK 200 RADIATION SOURCE : NSLS ; APS
REMARK 200 BEAMLINE : X12C; 17ID
REMARK 200 X-RAY GENERATOR MODEL : NULL
REMARK 200 MONOCHROMATIC OR LAUE (M/L) : M; M
REMARK 200 WAVELENGTH OR RANGE (A) : 0.95, 0.979 0.978; 0.979
REMARK 200 MONOCHROMATOR : GRAPHITE CRYSTAL; GRAPHITE
REMARK 200 CRYSTAL
REMARK 200 OPTICS : NULL
REMARK 200
REMARK 200 DETECTOR TYPE : CCD; CCD
REMARK 200 DETECTOR MANUFACTURER : CUSTOM-MADE; SIEMENS 2X2
REMARK 200 MOSAIC

Data Processing Parameters 1OQC

REMARK 200 INTENSITY-INTEGRATION SOFTWARE : DENZO;SAINT
REMARK 200 DATA SCALING SOFTWARE : SCALEPACK;SAINT
REMARK 200
REMARK 200 NUMBER OF UNIQUE REFLECTIONS : 41788
REMARK 200 RESOLUTION RANGE HIGH (A) : 1.600
REMARK 200 RESOLUTION RANGE LOW (A) : 27.650
REMARK 200 REJECTION CRITERIA (SIGMA(I)) : NULL
REMARK 200
REMARK 200 OVERALL.
REMARK 200 COMPLETENESS FOR RANGE (%) : 76.6
REMARK 200 DATA REDUNDANCY : 4.08
REMARK 200 R MERGE (I) : NULL
REMARK 200 R SYM (I) : 0.07800
REMARK 200 <I/SIGMA(I)> FOR THE DATA SET : NULL
REMARK 200
REMARK 200 IN THE HIGHEST RESOLUTION SHELL.
REMARK 200 HIGHEST RESOLUTION SHELL, RANGE HIGH (A) : 1.60
REMARK 200 HIGHEST RESOLUTION SHELL, RANGE LOW (A) : 1.70
REMARK 200 COMPLETENESS FOR SHELL (%) : 40.6
REMARK 200 DATA REDUNDANCY IN SHELL : 4.02
REMARK 200 R MERGE FOR SHELL (I) : NULL
REMARK 200 R SYM FOR SHELL (I) : 0.178
REMARK 200 <I/SIGMA(I)> FOR SHELL : 2.880

Missing Residues or Atoms 1OQC

REMARK 465
REMARK 465 MISSING RESIDUES
REMARK 465 THE FOLLOWING RESIDUES WERE NOT LOCATED IN THE
REMARK 465 EXPERIMENT. (M=MODEL NUMBER; RES=RESIDUE NAME; C=CHAIN
REMARK 465 IDENTIFIER; SSSEQ=SEQUENCE NUMBER; I=INSERTION CODE.)
REMARK 465
REMARK 465 M RES C SSSEQI
REMARK 465 MET A 1
REMARK 465 ARG A 2
REMARK 465 THR A 3
///
REMARK 470
REMARK 470 MISSING ATOM
REMARK 470 THE FOLLOWING RESIDUES HAVE MISSING ATOMS(M=MODEL NUMBER;
REMARK 470 RES=RESIDUE NAME; C=CHAIN IDENTIFIER; SSEQ=SEQUENCE NUMBER;
REMARK 470 I=INSERTION CODE):
REMARK 470 M RES CSSEQI ATOMS
REMARK 470 LYS A 120 CG CD CE NZ
REMARK 470 SER A 123 OG
REMARK 470 GLU C 13 CB CG CD OE1 OE2
REMARK 470 GLU D 13 CB CG CD OE1 OE2

Geometry and Stereochemistry 1OQC

REMARK 500
REMARK 500 GEOMETRY AND STEREOCHEMISTRY
REMARK 500 SUBTOPIC: COVALENT BOND LENGTHS
REMARK 500
REMARK 500 THE STEREOCHEMICAL PARAMETERS OF THE FOLLOWING RESIDUES
REMARK 500 HAVE VALUES WHICH DEVIATE FROM EXPECTED VALUES BY MORE
REMARK 500 THAN 6*RMSD (M=MODEL NUMBER; RES=RESIDUE NAME; C=CHAIN
REMARK 500 IDENTIFIER; SSEQ=SEQUENCE NUMBER; I=INSERTION CODE).
REMARK 500
REMARK 500 STANDARD TABLE:
REMARK 500 FORMAT: (10X,I3,1X,2(A3,1X,A1,I4,A1,1X,A4,3X),F6.3)
REMARK 500
REMARK 500 EXPECTED VALUES: ENGH AND HUBER, 1991
REMARK 500
REMARK 500 M RES CSSEQI ATM1 RES CSSEQI ATM2 DEVIATION
REMARK 500 MET A 49 CG MET A 49 SD 0.039
REMARK 500 MET A 49 SD MET A 49 CE -0.047
REMARK 500 MET B 40 SD MET B 40 CE -0.040

Secondary Structure 1OQC

HELIX 1 1 ASP A 18 TYR A 36 1 19
HELIX 2 2 THR A 42 TYR A 60 1 19
HELIX 3 3 CYS A 62 SER A 76 1 15
HELIX 4 4 THR A 82 LEU A 101 1 20
HELIX 5 5 ASP A 107 SER A 109 5 3
HELIX 6 6 ARG A 110 ARG A 116 1 7
///
SSBOND 1 CYS A 15 CYS C 124
SSBOND 2 CYS A 62 CYS A 65
SSBOND 3 CYS A 91 CYS A 108
SSBOND 4 CYS A 124 CYS C 15
///
SITE 1 CTA 3 CYS A 62 CYS A 65 FAD A 1

Ramachandran plot 1OQC
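The SSBOND records for 1OQC are regular enough to read programmatically. As a rough sketch, the Python below splits the four records shown on the slide on whitespace and separates disulfides within one chain from those joining two chains. The `Disulfide` type and `parse_ssbond` helper are made up for this example. Real PDB files use fixed columns, so whitespace splitting is an assumption that holds for these lines but may fail where fields run together.

```python
from typing import NamedTuple


class Disulfide(NamedTuple):
    """One SSBOND record reduced to chain IDs and residue numbers."""
    chain1: str
    seq1: int
    chain2: str
    seq2: int

    @property
    def interchain(self) -> bool:
        # True when the two cysteines belong to different chains.
        return self.chain1 != self.chain2


def parse_ssbond(line: str) -> Disulfide:
    """Parse a whitespace-separated SSBOND line as shown on the slide."""
    fields = line.split()
    if len(fields) < 8 or fields[0] != "SSBOND":
        raise ValueError(f"not an SSBOND record: {line!r}")
    # Layout assumed: SSBOND serial CYS chain seq CYS chain seq
    _, _serial, _res1, chain1, seq1, _res2, chain2, seq2 = fields[:8]
    return Disulfide(chain1, int(seq1), chain2, int(seq2))


# Records copied from the 1OQC slide.
records = [
    "SSBOND 1 CYS A 15 CYS C 124",
    "SSBOND 2 CYS A 62 CYS A 65",
    "SSBOND 3 CYS A 91 CYS A 108",
    "SSBOND 4 CYS A 124 CYS C 15",
]

bonds = [parse_ssbond(r) for r in records]
interchain = [b for b in bonds if b.interchain]
intrachain = [b for b in bonds if not b.interchain]

for b in interchain:
    print(f"inter-chain: {b.chain1}{b.seq1} - {b.chain2}{b.seq2}")
for b in intrachain:
    print(f"intra-chain: {b.chain1}{b.seq1} - {b.chain1}{b.seq2}")
```

For the records listed, two disulfides link chains A and C (Cys 15 to Cys 124 in both directions) and two lie within chain A, one of them the Cys 62 / Cys 65 pair that also appears in the SITE record.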

Ramachandran plot
  Most favored        93.9%
  Additional allowed   6.1%
  Generously allowed   0%
  Disallowed           0.0%

G-factors
  Dihedral angles:
    Phi-psi        0.38
    Chi1-chi2      0.21
    Chi1 only     -0.42
    Chi3 & chi4    0.52
    Omega         -0.61
    Average        0.43
  Main-chain:
    bond lengths   0.68
    bond angles    0.48
    Average        0.57

OVERALL AVERAGE 0.49
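
The two Ramachandran and G-factor summaries in this section can be screened against simple rules of thumb. The Python sketch below is illustrative only. The thresholds are assumptions taken from guidelines commonly quoted for PROCHECK-style output and are not stated on these slides: more than 90% of residues in the most favored regions for a good model, G-factors below -0.5 flagged as unusual, and below -1.0 as highly unusual. The labels and function names are invented for the example.

```python
# Assumed rule-of-thumb thresholds (not taken from the slides).
MOST_FAVORED_MIN = 90.0    # percent of residues in most favored regions
G_UNUSUAL = -0.5
G_HIGHLY_UNUSUAL = -1.0

# Values transcribed from the two summaries in this section.
models = {
    "preceding entry": {
        "most_favored": 88.7,
        "g_factors": {
            "phi-psi": -0.21, "chi1-chi2": -0.14, "chi1 only": -0.02,
            "chi3 & chi4": 0.49, "omega": -0.57,
            "bond lengths": 0.63, "bond angles": 0.38,
        },
    },
    "1OQC": {
        "most_favored": 93.9,
        "g_factors": {
            "phi-psi": 0.38, "chi1-chi2": 0.21, "chi1 only": -0.42,
            "chi3 & chi4": 0.52, "omega": -0.61,
            "bond lengths": 0.68, "bond angles": 0.48,
        },
    },
}


def meets_ramachandran_target(most_favored_percent):
    """True when the most favored fraction exceeds the assumed 90% target."""
    return most_favored_percent > MOST_FAVORED_MIN


def flag_g_factors(g_factors):
    """Return only the G-factors that fall below the assumed thresholds."""
    flags = {}
    for name, value in g_factors.items():
        if value < G_HIGHLY_UNUSUAL:
            flags[name] = "highly unusual"
        elif value < G_UNUSUAL:
            flags[name] = "unusual"
    return flags


for label, summary in models.items():
    ok = meets_ramachandran_target(summary["most_favored"])
    flags = flag_g_factors(summary["g_factors"])
    print(f"{label}: most favored {summary['most_favored']}% "
          f"({'meets' if ok else 'below'} target), flagged: {flags or 'none'}")
```

With these assumed cut-offs, only the omega G-factor is flagged in either summary, and 1OQC clears the 90% most favored target while the preceding entry falls just short of it.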