
Introductory biophysics A. Y. 2016-17

5. X-ray crystallography and its applications to the structural problems of biology

Edoardo Milotti
Dipartimento di Fisica, Università di Trieste

The interatomic distance in a metallic crystal can be roughly estimated as follows.

Take, e.g., iron

• density: 7.874 g/cm³
• atomic weight: 56
• molar volume: $V_M \approx 7.1$ cm³/mole

then the interatomic distance is roughly

$$ d \approx \sqrt[3]{\frac{V_M}{N_A}} \approx 0.23\ \text{nm} $$
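The estimate can be checked numerically. The following is a minimal sketch that assumes each atom occupies a small cube of volume $V_M/N_A$; the variable names are illustrative only.

```python
# Rough interatomic distance in iron, treating each atom as occupying a small cube.
AVOGADRO = 6.022e23      # atoms per mole
density = 7.874          # g/cm^3 (iron)
atomic_weight = 56.0     # g/mol (iron, rounded)

molar_volume = atomic_weight / density        # cm^3/mol
volume_per_atom = molar_volume / AVOGADRO     # cm^3 per atom
d_cm = volume_per_atom ** (1.0 / 3.0)         # cube edge in cm
d_nm = d_cm * 1e7                             # 1 cm = 1e7 nm

print(f"molar volume = {molar_volume:.2f} cm^3/mol")
print(f"d            = {d_nm:.3f} nm")
```

The result is a few tenths of a nanometre, i.e. of the order of $10^{-8}$ cm.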

The atomic lattice can be used as a sort of diffraction grating for short-wavelength radiation, with a wavelength at least a thousand times shorter than that of visible light, which is in the range 400-750 nm.

Since

$$ E_\gamma = \frac{hc}{\lambda} \approx \frac{2\cdot10^{-25}\ \text{J m}}{\lambda} \approx \frac{1.24\ \text{eV}\,\mu\text{m}}{\lambda} $$

1 nm radiation corresponds to about 1 keV photon energy.
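As a quick check of the numbers, here is a short sketch of the conversion from wavelength to photon energy. The function name `photon_energy_ev` and the sample wavelengths are illustrative choices.

```python
H = 6.626e-34    # Planck constant, J s
C = 2.998e8      # speed of light, m/s
EV = 1.602e-19   # joules per electronvolt

def photon_energy_ev(wavelength_m):
    """Photon energy in eV for a wavelength given in metres (E = h c / lambda)."""
    return H * C / wavelength_m / EV

for label, lam in [("visible, 500 nm", 500e-9), ("1 nm", 1e-9), ("0.1 nm", 1e-10)]:
    print(f"{label:>16}: {photon_energy_ev(lam):10.1f} eV")
```

Radiation with a wavelength comparable to interatomic distances (a few tenths of a nanometre) therefore has photon energies of several keV.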

Max von Laue

Born: 9 October 1879, Pfaffendorf, Germany

Died: 23 April 1960, Berlin, West Germany

Nobel Prize in Physics in 1914 "for his discovery of the diffraction of X-rays by crystals"

Initially he studied with Röntgen, and then with Max Planck in Berlin, where he habilitated in 1906 with a thesis "Über die Interferenzerscheinungen an planparallelen Platten".

Then he worked at the Institute of Theoretical Physics, headed by Arnold Sommerfeld, until he was appointed full professor at the University of Frankfurt in 1914, when he also received his Nobel Prize.

Arnold Sommerfeld (1868-1951)

... Four of Sommerfeld's doctoral students, [...], [...], [...], and Hans Bethe went on to win Nobel Prizes, while others, most notably, Walter Heitler, Rudolf Peierls, Karl Bechert, Hermann Brück, [...], Eugene Feenberg, Herbert Fröhlich, [...], Ernst Guillemin, Helmut Hönl, Ludwig Hopf, [...], [...], [...], Karl Meissner, Rudolf Seeliger, Ernst C. Stückelberg, Heinrich Welker, [...], Alfred Landé, and Léon Brillouin became famous in their own right. Three of Sommerfeld's postgraduate students, Linus Pauling, Isidor I. Rabi and Max von Laue, won Nobel Prizes, and ten others, William Allis, Edward Condon, Carl Eckart, Edwin C. Kemble, William V. Houston, [...], Walther Kossel, Philip M. Morse, Howard Robertson, and Wojciech Rubinowicz went on to become famous in their own right. Walter Rogowski, an undergraduate student of Sommerfeld at RWTH Aachen, also went on to become famous in his own right.

Max Born believed Sommerfeld's abilities included the "discovery and development of talents." Albert Einstein told Sommerfeld: "What I especially admire about you is that you have, as it were, pounded out of the soil such a large number of young talents." Sommerfeld's style as a professor and institute director did not put distance between him and his colleagues and students. He invited collaboration from them, and their ideas often influenced his own views in [...]. He entertained them in his home and met with them in cafes before and after seminars and colloquia. Sommerfeld owned an alpine ski hut to which students were often invited for discussions of physics as demanding as the sport. ...

(from https://en.wikipedia.org/wiki/Arnold_Sommerfeld)

... Such was the state of affairs as, one evening in February 1912, P. P. Ewald came to visit me. (...) he was faced at that time with certain difficulties and came to me with a request for advice.

Now it was not, however, possible for me to assist him at that time. But during the conversation I was suddenly struck by the obvious question of the behaviour of waves which are short by comparison with the lattice-constants of the space lattice. And it was at that point that my intuition for [...] suddenly gave me the answer: lattice spectra would have to ensue.

The fact that the lattice constant in crystals is of an order of $10^{-8}$ cm was sufficiently known from the analogy with other interatomic distances in solid and liquid substances, and, in addition, this could easily be argued from the density, molecular weight and the mass of the hydrogen atom which, just at that time, had been particularly well determined.

The order of X-ray wavelengths was estimated by Wien and Sommerfeld at $10^{-9}$ cm. Thus the ratio of wavelengths and lattice constants was extremely favourable if X-rays were to be transmitted through a crystal. I immediately told Ewald that I anticipated the occurrence of interference phenomena with X-rays. ...

(from von Laue’s Nobel Lecture)

Von Laue's experimental layout

Image taken in 1912 of an X-ray interference of a zinc blende crystal. Zinc blende (ZnS, sphalerite) was one of the first crystals investigated by Laue, Friedrich and Knipping. (Photograph: Deutsches Museum)

In a single sweep, this result:

1. showed that solids are ordered at the molecular level
2. settled all questions on the nature of X-rays

“Dear Mr. Laue! I cordially salute you on your marvelous success. Your experiment counts among the most glorious that Physics has seen so far.”

Albert Einstein

(on a postcard to von Laue, dated June 1912)

Sir William Henry Bragg

Born: 2 July 1862, Wigton, United Kingdom

Died: 12 March 1942, London, United Kingdom

Nobel Prize in Physics in 1915 "for their services in the analysis of crystal structure by means of X-rays"

William Lawrence Bragg

Born: 31 March 1890, Adelaide, Australia

Died: 1 July 1971, Ipswich, United Kingdom

Nobel Prize in Physics in 1915 "for their services in the analysis of crystal structure by means of X-rays"

NaCl crystals

Sphalerite, Dolomite, Chalcopyrite. Locality: Joplin Field, Tri-State District, Jasper County, Missouri, USA (http://en.wikipedia.org/wiki/Sphalerite)

Crystal structure

In order to proceed, and explain the contributions by von Laue and the Braggs, we must describe order in crystals.

[Figure: a crystal shown as a lattice with a basis]

[Figure: two-dimensional lattice with the vectors a1 and a2]

$\mathbf{a}_1$ and $\mathbf{a}_2$ are the primitive lattice vectors, and the translation vectors

$$ \mathbf{T} = u_1\mathbf{a}_1 + u_2\mathbf{a}_2 \quad (u_1, u_2\ \text{integers}) $$

generate the whole lattice.

[Figure: the same lattice with a different pair of vectors a1 and a2]

This pair of $\mathbf{a}_1$ and $\mathbf{a}_2$ is not primitive because the two vectors do not generate the whole lattice.
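Whether a pair of lattice vectors is primitive can be tested with a determinant. The sketch below assumes the candidate pair is written as integer combinations of a known primitive pair; it then generates the whole lattice only if the coefficient matrix has determinant ±1. The function name is an illustrative choice.

```python
def generates_whole_lattice(m):
    """m = ((m11, m12), (m21, m22)) holds the integer coefficients that express
    a candidate pair (c1, c2) in terms of a known primitive pair (a1, a2):
        c1 = m11*a1 + m12*a2,   c2 = m21*a1 + m22*a2.
    The candidate pair is primitive if and only if |det m| == 1."""
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return abs(det) == 1

print(generates_whole_lattice(((1, 0), (0, 1))))   # a1, a2 themselves
print(generates_whole_lattice(((1, 1), (0, 1))))   # a1 + a2, a2: another primitive pair
print(generates_whole_lattice(((2, 0), (0, 1))))   # 2*a1, a2: skips half of the lattice points
```

Equivalently, a non-primitive pair spans a cell whose area is an integer multiple (larger than one) of the primitive cell area.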

[Figure: two-dimensional lattice with primitive vectors a1 and a2]

The primitive vectors also define the crystal axes. The associated parallelogram is the primitive cell.

[Figure: three-dimensional lattice with primitive vectors a1, a2 and a3]

In three dimensions the primitive vectors also define the crystal axes. The associated parallelepiped is the primitive cell.

Volume of primitive cell:

$$ V_{3D} = |\mathbf{a}_1 \cdot \mathbf{a}_2 \times \mathbf{a}_3| $$
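The triple product is easy to evaluate numerically. The sketch below uses standard textbook choices of primitive vectors for the three cubic lattices (an assumption at this point of the notes) and a cube edge a = 1 in arbitrary units.

```python
def cross(u, v):
    """Cross product of two 3-vectors given as tuples."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def dot(u, v):
    """Scalar product of two 3-vectors."""
    return sum(x * y for x, y in zip(u, v))

def cell_volume(a1, a2, a3):
    """Volume of the primitive cell, |a1 . (a2 x a3)|."""
    return abs(dot(a1, cross(a2, a3)))

a = 1.0  # cube edge, arbitrary units
sc = ((a, 0, 0), (0, a, 0), (0, 0, a))
bcc = ((-a / 2, a / 2, a / 2), (a / 2, -a / 2, a / 2), (a / 2, a / 2, -a / 2))
fcc = ((0, a / 2, a / 2), (a / 2, 0, a / 2), (a / 2, a / 2, 0))

for name, vectors in [("sc", sc), ("bcc", bcc), ("fcc", fcc)]:
    print(f"{name:>3}: V = {cell_volume(*vectors):.3f} a^3")
```

The primitive cells of the bcc and fcc lattices have one half and one quarter of the cube volume, because the conventional cube contains two and four lattice points, respectively.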

There are many types of lattices. They are systematically classified by discrete space groups.

The common nomenclature is that of the Bravais lattices.

Examples

[Figure: triclinic, rhombohedral and simple cubic lattices]

The cubic lattices

Simple cubic (sc), body-centered cubic (bcc), face-centered cubic (fcc)

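Reciprocal lattice vectors

$$ \mathbf{b}_1 = 2\pi\,\frac{\mathbf{a}_2 \times \mathbf{a}_3}{\mathbf{a}_1 \cdot \mathbf{a}_2 \times \mathbf{a}_3};\quad \mathbf{b}_2 = 2\pi\,\frac{\mathbf{a}_3 \times \mathbf{a}_1}{\mathbf{a}_2 \cdot \mathbf{a}_3 \times \mathbf{a}_1};\quad \mathbf{b}_3 = 2\pi\,\frac{\mathbf{a}_1 \times \mathbf{a}_2}{\mathbf{a}_3 \cdot \mathbf{a}_1 \times \mathbf{a}_2} $$

These vectors have the property

$$ \mathbf{a}_i \cdot \mathbf{b}_j = 2\pi\delta_{ij} $$

and they define the reciprocal lattice by means of the translation vectors

$$ \mathbf{G} = v_1\mathbf{b}_1 + v_2\mathbf{b}_2 + v_3\mathbf{b}_3 \quad (v_i\ \text{integers}) $$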
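The definition translates directly into a few lines of code. This is a minimal sketch: the helper names and the skewed example cell are arbitrary illustrative choices, used only to check the property $\mathbf{a}_i \cdot \mathbf{b}_j = 2\pi\delta_{ij}$ on a cell that is not orthogonal.

```python
import math

def cross(u, v):
    """Cross product of two 3-vectors given as tuples."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def dot(u, v):
    """Scalar product of two 3-vectors."""
    return sum(x * y for x, y in zip(u, v))

def reciprocal(a1, a2, a3):
    """Reciprocal lattice vectors b1, b2, b3 of the primitive vectors a1, a2, a3."""
    volume = dot(a1, cross(a2, a3))   # the same triple product appears in all three denominators
    scale = 2 * math.pi / volume
    b1 = tuple(scale * c for c in cross(a2, a3))
    b2 = tuple(scale * c for c in cross(a3, a1))
    b3 = tuple(scale * c for c in cross(a1, a2))
    return b1, b2, b3

# An arbitrary non-orthogonal cell (illustrative numbers, arbitrary units)
a_vecs = ((1.0, 0.0, 0.0), (0.3, 1.2, 0.0), (0.1, 0.4, 0.9))
b_vecs = reciprocal(*a_vecs)

# Each row lists a_i . b_j / (2 pi) for j = 1, 2, 3: the rows of the identity matrix
for a_i in a_vecs:
    print([round(dot(a_i, b_j) / (2 * math.pi), 12) for b_j in b_vecs])
```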

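Example: reciprocal lattice to a simple cubic (sc) lattice

$$ \mathbf{a}_1 = \begin{pmatrix} a \\ 0 \\ 0 \end{pmatrix};\quad \mathbf{a}_2 = \begin{pmatrix} 0 \\ a \\ 0 \end{pmatrix};\quad \mathbf{a}_3 = \begin{pmatrix} 0 \\ 0 \\ a \end{pmatrix} $$

$$ \mathbf{b}_1 = 2\pi\begin{pmatrix} 1/a \\ 0 \\ 0 \end{pmatrix};\quad \mathbf{b}_2 = 2\pi\begin{pmatrix} 0 \\ 1/a \\ 0 \end{pmatrix};\quad \mathbf{b}_3 = 2\pi\begin{pmatrix} 0 \\ 0 \\ 1/a \end{pmatrix} $$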

The reciprocal lattice is again sc; the sc lattice is self-dual.

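Example: body-centered cubic lattice (bcc)

[Figure: primitive vectors of the bcc lattice, from C. Kittel, "Introduction to Solid State Physics", 8th ed. (Wiley, 2005)]

$$ \mathbf{a}_1 = \frac{a}{2}(-\hat{\mathbf{x}} + \hat{\mathbf{y}} + \hat{\mathbf{z}});\quad \mathbf{a}_2 = \frac{a}{2}(\hat{\mathbf{x}} - \hat{\mathbf{y}} + \hat{\mathbf{z}});\quad \mathbf{a}_3 = \frac{a}{2}(\hat{\mathbf{x}} + \hat{\mathbf{y}} - \hat{\mathbf{z}}) $$

$$ \mathbf{a}_2 \times \mathbf{a}_3 = \frac{a^2}{2}(\hat{\mathbf{y}} + \hat{\mathbf{z}});\quad \mathbf{a}_3 \times \mathbf{a}_1 = \frac{a^2}{2}(\hat{\mathbf{x}} + \hat{\mathbf{z}});\quad \mathbf{a}_1 \times \mathbf{a}_2 = \frac{a^2}{2}(\hat{\mathbf{x}} + \hat{\mathbf{y}}) $$

$$ \mathbf{a}_1 \cdot (\mathbf{a}_2 \times \mathbf{a}_3) = \frac{a^3}{2} $$

$$ \mathbf{b}_1 = \frac{2\pi}{a}(\hat{\mathbf{y}} + \hat{\mathbf{z}});\quad \mathbf{b}_2 = \frac{2\pi}{a}(\hat{\mathbf{x}} + \hat{\mathbf{z}});\quad \mathbf{b}_3 = \frac{2\pi}{a}(\hat{\mathbf{x}} + \hat{\mathbf{y}}) $$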

The reciprocal lattice vectors of the bcc lattice correspond to the lattice vectors of the fcc lattice.
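The bcc result can be verified numerically. The sketch below (cube edge a = 1, helper names illustrative) computes the reciprocal vectors and also shows that an integer combination of them points along a cube axis with length 4π/a, which is the edge of the conventional cube of the resulting fcc lattice.

```python
import math

def cross(u, v):
    """Cross product of two 3-vectors given as tuples."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def dot(u, v):
    """Scalar product of two 3-vectors."""
    return sum(x * y for x, y in zip(u, v))

a = 1.0  # cube edge of the bcc lattice, arbitrary units
a1 = (-a / 2, a / 2, a / 2)
a2 = (a / 2, -a / 2, a / 2)
a3 = (a / 2, a / 2, -a / 2)

volume = dot(a1, cross(a2, a3))   # a^3 / 2
b1 = tuple(2 * math.pi * c / volume for c in cross(a2, a3))
b2 = tuple(2 * math.pi * c / volume for c in cross(a3, a1))
b3 = tuple(2 * math.pi * c / volume for c in cross(a1, a2))

unit = 2 * math.pi / a
for name, b in [("b1", b1), ("b2", b2), ("b3", b3)]:
    # components in units of 2*pi/a: (0,1,1), (1,0,1), (1,1,0)
    print(name, [round(c / unit, 12) for c in b])

# b1 + b2 - b3 lies along z and has length 4*pi/a
along_z = tuple(p + q - r for p, q, r in zip(b1, b2, b3))
print("b1 + b2 - b3 =", [round(c / unit, 12) for c in along_z], "x 2*pi/a")
```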

Diffraction of waves by crystals

[Figure: rays a and b reflected at angle θ (measured from the crystal surface) by parallel crystal planes separated by a distance d]

path difference between rays a and b: $2d\sin\theta$

constructive interference condition: $2d\sin\theta = n\lambda$ (Bragg law)

Remarks:

• the scattering cross-section is small, thus X-rays penetrate the crystal and are scattered by different planes

• the scattering cross-section is small, thus the X-ray beam is not significantly attenuated by previous crystal planes

• X-rays are scattered by electrons, and they are scattered more where the electron density is higher

• the Bragg law implies that

$$ \frac{n\lambda}{2d} = \sin\theta \le 1 \;\Rightarrow\; n \le \frac{2d}{\lambda} $$
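A small sketch of how the Bragg law limits the observable orders. The numerical values (d = 0.282 nm and λ = 0.154 nm) are illustrative assumptions, chosen to be typical of an ionic crystal and of a laboratory X-ray tube; the function name is hypothetical.

```python
import math

def bragg_angles(d, wavelength):
    """Return a list of (n, theta in degrees) for all orders allowed by n <= 2d/lambda.
    d and wavelength must be given in the same units."""
    n_max = math.floor(2 * d / wavelength)
    angles = []
    for n in range(1, n_max + 1):
        s = min(1.0, n * wavelength / (2 * d))   # guard against rounding slightly above 1
        angles.append((n, math.degrees(math.asin(s))))
    return angles

orders = bragg_angles(d=0.282, wavelength=0.154)   # both in nm
for n, theta in orders:
    print(f"n = {n}: theta = {theta:5.2f} degrees")
```

With these values only three orders are possible, since 2d/λ is between 3 and 4.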

Crystalline solids

In crystalline solids, the electron density is periodic with respect to the discrete translations that define the crystal lattice

$$ n(\mathbf{r}) = n(\mathbf{r} + \mathbf{T});\quad \mathbf{T} = u_1\mathbf{a}_1 + u_2\mathbf{a}_2 + u_3\mathbf{a}_3;\quad u_i \in \mathbb{Z} $$

The electron density can then be expressed as a Fourier series

$$ n(\mathbf{r}) = \sum_{\mathbf{G}} n_{\mathbf{G}} \exp(i\mathbf{G}\cdot\mathbf{r}) $$

We still have to find the vectors G that lead to the correct definition of the Fourier series.

Now we note that

$$ n(\mathbf{r}) = \sum_{\mathbf{G}} n_{\mathbf{G}} \exp(i\mathbf{G}\cdot\mathbf{r}) = n(\mathbf{r} + \mathbf{T}) = \sum_{\mathbf{G}} n_{\mathbf{G}} \exp(i\mathbf{G}\cdot\mathbf{r} + i\mathbf{G}\cdot\mathbf{T}) $$

and we see that periodicity works if

$$ \mathbf{G}\cdot\mathbf{T} = 2\pi m \quad (m\ \text{integer}) $$

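The condition $\mathbf{G}\cdot\mathbf{T} = 2\pi m$ is satisfied by the reciprocal lattice translations.

Indeed, since $\mathbf{a}_i \cdot \mathbf{b}_j = 2\pi\delta_{ij}$, with

$$ \mathbf{G} = v_1\mathbf{b}_1 + v_2\mathbf{b}_2 + v_3\mathbf{b}_3 $$

$$ \mathbf{T} = u_1\mathbf{a}_1 + u_2\mathbf{a}_2 + u_3\mathbf{a}_3 $$

we find

$$ \mathbf{G}\cdot\mathbf{T} = (v_1\mathbf{b}_1 + v_2\mathbf{b}_2 + v_3\mathbf{b}_3)\cdot(u_1\mathbf{a}_1 + u_2\mathbf{a}_2 + u_3\mathbf{a}_3) = 2\pi(v_1u_1 + v_2u_2 + v_3u_3) $$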

It is also easy to see that

$$ n_{\mathbf{G}} = \frac{1}{V_C}\int_{\text{cell}} n(\mathbf{r}) \exp(-i\mathbf{G}\cdot\mathbf{r})\,dV $$

where $V_C$ is the volume of a cell of the crystal (prove it as homework!).
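The formula can be tried out numerically before proving it. The sketch below works in one dimension, where the cell is the interval [0, a) and G = 2πv/a; the test density is a made-up example whose Fourier coefficients are known by inspection (n_0 = 2, n_{±1} = 0.5, n_{±2} = ∓0.25i).

```python
import cmath
import math

a = 1.0    # lattice constant, arbitrary units
M = 2000   # number of integration points over one cell

def density(x):
    """A made-up periodic 'electron density' with period a."""
    return 2.0 + math.cos(2 * math.pi * x / a) + 0.5 * math.sin(4 * math.pi * x / a)

def fourier_coefficient(v):
    """n_G = (1/a) * integral over one cell of n(x) exp(-i G x), with G = 2 pi v / a.
    The integral is approximated by a rectangle rule with M points."""
    G = 2 * math.pi * v / a
    dx = a / M
    total = sum(density(k * dx) * cmath.exp(-1j * G * k * dx) for k in range(M))
    return total * dx / a

for v in range(-2, 3):
    n_G = fourier_coefficient(v)
    print(f"v = {v:+d}: n_G = {n_G.real:+.4f} {n_G.imag:+.4f}i")
```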

X-ray diffraction

[Figure from C. Kittel, "Introduction to Solid State Physics", 8th ed. (Wiley, 2005)]

incoming wave

[Figure: incoming wave vector k and position vector r; the angle between them is π/2 − φ]

path length: $r\sin\varphi$
optical path length: $kr\sin\varphi$
scalar product: $\mathbf{k}\cdot\mathbf{r} = kr\cos\left(\frac{\pi}{2} - \varphi\right) = kr\sin\varphi$

diffracted wave

[Figure: diffracted wave vector k′ and position vector r; the angle between them is π/2 + φ′]

path length: $r\sin\varphi'$
optical path length: $k'r\sin\varphi'$
scalar product: $\mathbf{k}'\cdot\mathbf{r} = k'r\cos\left(\frac{\pi}{2} + \varphi'\right) = -k'r\sin\varphi'$

total optical path length difference

$$ kr\sin\varphi + k'r\sin\varphi' = \mathbf{k}\cdot\mathbf{r} - \mathbf{k}'\cdot\mathbf{r} = -(\mathbf{k}' - \mathbf{k})\cdot\mathbf{r} = -\Delta\mathbf{k}\cdot\mathbf{r} $$

scattering vector

$$ \Delta\mathbf{k} = \mathbf{k}' - \mathbf{k};\quad \mathbf{k}' = \mathbf{k} + \Delta\mathbf{k} $$

The amplitude scattered in direction k′ is proportional to

$$ dF = n(\mathbf{r}) \exp(-i\Delta\mathbf{k}\cdot\mathbf{r})\,dV $$

The total scattered amplitude is proportional to the Fourier transform of the charge distribution

$$ \begin{aligned} F &= \int_V n(\mathbf{r}) \exp(-i\Delta\mathbf{k}\cdot\mathbf{r})\,dV \\ &= \int_V \left[\sum_{\mathbf{G}} n_{\mathbf{G}} \exp(i\mathbf{G}\cdot\mathbf{r})\right] \exp(-i\Delta\mathbf{k}\cdot\mathbf{r})\,dV \\ &= \sum_{\mathbf{G}} \int_V n_{\mathbf{G}} \exp\left[i(\mathbf{G} - \Delta\mathbf{k})\cdot\mathbf{r}\right] dV \end{aligned} $$

This scattered amplitude reaches a local maximum for each given G when

$$ \Delta\mathbf{k} = \mathbf{G} \quad \text{(vector equation)} $$

otherwise it is negligibly small.

Taking into account energy conservation, so that k = k′, we find

$$ \mathbf{k}' = \mathbf{G} + \mathbf{k} \;\Rightarrow\; k^2 = k'^2 = (\mathbf{G} + \mathbf{k})^2 = G^2 + 2\mathbf{G}\cdot\mathbf{k} + k^2 \;\Rightarrow\; G^2 + 2\mathbf{G}\cdot\mathbf{k} = 0 $$

or, equivalently (with the remark that it holds for −G as well as for G)

$$ G^2 = 2\mathbf{G}\cdot\mathbf{k} \quad \text{(scalar equation)} $$

which is a restatement of Bragg's law $2d\sin\theta = n\lambda$, and thus we expect peaks at these values of k.

(see Kittel for further details).
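The connection with Bragg's law can be sketched in a few lines. The sketch assumes, as in Kittel, that G is perpendicular to a family of lattice planes with spacing d, so that $|\mathbf{G}| = 2\pi n/d$ for some integer n, and that θ is the angle between k and the planes, so that the angle between k and G is π/2 − θ. With $k = 2\pi/\lambda$:

```latex
\begin{align}
G^2 &= 2\,\mathbf{G}\cdot\mathbf{k}
     = 2Gk\cos\left(\frac{\pi}{2} - \theta\right)
     = 2Gk\sin\theta \\
\Rightarrow\quad G &= 2k\sin\theta \\
\Rightarrow\quad \frac{2\pi n}{d} &= 2\,\frac{2\pi}{\lambda}\,\sin\theta \\
\Rightarrow\quad 2d\sin\theta &= n\lambda
\end{align}
```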

$\Delta\mathbf{k} = \mathbf{G}$ implies that the following equations must also hold

$$ \mathbf{a}_1\cdot\Delta\mathbf{k} = 2\pi v_1;\quad \mathbf{a}_2\cdot\Delta\mathbf{k} = 2\pi v_2;\quad \mathbf{a}_3\cdot\Delta\mathbf{k} = 2\pi v_3 \quad \text{(Laue equations)} $$

each equation defines a cone in k space;

diffraction occurs only in those directions where the cones intersect;

this is a severely limiting condition, realized only with systematic sweeping or, occasionally, by chance

The powder method

In this method the cone axis is defined by the incoming beam

[Figure: diffraction from a single crystal and from a polycrystalline powder, from D. K. Chakrabarty, "Solid State Chemistry" (New Age International, 2010)]

When the local maximum condition is satisfied, i.e.,

$$ \Delta\mathbf{k} = \mathbf{G} $$

the scattered amplitude becomes

$$ F = \int_V n(\mathbf{r}) \exp(-i\Delta\mathbf{k}\cdot\mathbf{r})\,dV = \int_V n(\mathbf{r}) \exp(-i\mathbf{G}\cdot\mathbf{r})\,dV $$

When this is summed over the N cells of a crystal, we find the amplitude of the corresponding diffraction maximum

$$ F_{\mathbf{G}} = N\int_{\text{cell}} n(\mathbf{r}) \exp(-i\mathbf{G}\cdot\mathbf{r})\,dV = N S_{\mathbf{G}} $$

where

$$ S_{\mathbf{G}} = \int_{\text{cell}} n(\mathbf{r}) \exp(-i\mathbf{G}\cdot\mathbf{r})\,dV $$

is the structure factor.

Now introduce the electron density of each atom in the cell (j-th atom)

$$ n(\mathbf{r}) = \sum_j n_j(\mathbf{r} - \mathbf{r}_j) $$

then

$$ \begin{aligned} S_{\mathbf{G}} &= \int_{\text{cell}} n(\mathbf{r}) \exp(-i\mathbf{G}\cdot\mathbf{r})\,dV = \sum_j \int_{\text{cell}} n_j(\mathbf{r} - \mathbf{r}_j) \exp(-i\mathbf{G}\cdot\mathbf{r})\,dV \\ &= \sum_j \exp(-i\mathbf{G}\cdot\mathbf{r}_j) \int_{\text{cell}} n_j(\mathbf{r} - \mathbf{r}_j) \exp\left[-i\mathbf{G}\cdot(\mathbf{r} - \mathbf{r}_j)\right] dV \\ &= \sum_j \exp(-i\mathbf{G}\cdot\mathbf{r}_j) \int_{\text{cell}} n_j(\mathbf{s}) \exp(-i\mathbf{G}\cdot\mathbf{s})\,dV \\ &= \sum_j f_j \exp(-i\mathbf{G}\cdot\mathbf{r}_j) \end{aligned} $$

where

$$ f_j = \int_{\text{cell}} n_j(\mathbf{s}) \exp(-i\mathbf{G}\cdot\mathbf{s})\,dV $$

is the atomic form factor.
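As an illustration of the structure factor, the sketch below evaluates $S_{\mathbf{G}} = \sum_j f_j \exp(-i\mathbf{G}\cdot\mathbf{r}_j)$ for a bcc crystal described as a simple cubic lattice with a two-atom basis (a standard textbook example, see Kittel). It assumes identical atoms with a constant form factor f = 1, which is a simplification, and atomic positions given in fractional coordinates, so that $\mathbf{G}\cdot\mathbf{r}_j = 2\pi(v_1x_j + v_2y_j + v_3z_j)$.

```python
import cmath
import math

def structure_factor(v, basis):
    """S_G = sum_j f_j exp(-i G.r_j) for G = v1 b1 + v2 b2 + v3 b3.
    basis is a list of (f_j, (x_j, y_j, z_j)) with positions in fractional
    coordinates of a1, a2, a3, so that G.r_j = 2 pi (v1 x_j + v2 y_j + v3 z_j)."""
    total = 0j
    for f_j, x_j in basis:
        phase = 2 * math.pi * sum(vi * xi for vi, xi in zip(v, x_j))
        total += f_j * cmath.exp(-1j * phase)
    return total

f = 1.0  # assumed constant form factor, identical atoms
bcc_basis = [(f, (0.0, 0.0, 0.0)), (f, (0.5, 0.5, 0.5))]

for v in [(1, 0, 0), (1, 1, 0), (1, 1, 1), (2, 0, 0)]:
    print(v, round(abs(structure_factor(v, bcc_basis)), 6))
```

The amplitude vanishes whenever v1 + v2 + v3 is odd: the corresponding diffraction peaks of the simple cubic lattice are missing in the bcc crystal.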

The data inversion problem

Since the amplitude is a Fourier sum

$$ F_{\mathbf{G}} = N S_{\mathbf{G}} = N\sum_j f_j \exp(-i\mathbf{G}\cdot\mathbf{r}_j) = N\sum_j f_j \exp(-iv_x x_j - iv_y y_j - iv_z z_j) $$

we can find, at least in principle, the electron density and the atomic coordinates by Fourier inversion.

However, the problem is more difficult, because we do not measure the amplitude, but the intensity

$$ I \propto |F_{\mathbf{G}}|^2 $$

and therefore phase is lost (this is the phase problem).
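A toy example shows what the loss of phase means in practice. The sketch below uses a hypothetical one-dimensional "crystal" of three point atoms (weights and fractional positions are made up) and its mirror image: the two arrangements are different, yet all their intensities coincide, and only the phases distinguish them.

```python
import cmath
import math

def amplitude(v, atoms):
    """F(v) = sum_j f_j exp(-2 pi i v x_j) for point atoms at fractional positions x_j."""
    return sum(f * cmath.exp(-2j * math.pi * v * x) for f, x in atoms)

# (weight, fractional position): made-up values
crystal_a = [(1.0, 0.10), (2.0, 0.35), (1.5, 0.60)]
# mirror image of crystal_a, folded back into the cell [0, 1)
crystal_b = [(f, (-x) % 1.0) for f, x in crystal_a]

for v in range(1, 5):
    Fa, Fb = amplitude(v, crystal_a), amplitude(v, crystal_b)
    print(f"v = {v}: |Fa|^2 = {abs(Fa)**2:7.4f}  |Fb|^2 = {abs(Fb)**2:7.4f}  "
          f"phase a = {cmath.phase(Fa):+.3f}  phase b = {cmath.phase(Fb):+.3f}")
```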

Atomic positions are also unknown. Thus there are two related problems

• phase problem
• atomic positions

Several methods exist that rely on initial trial solutions with successive refinements.

The Patterson function

We found earlier that the scattered amplitude is a Fourier transform

$$ F(\mathbf{G}) = \int_V n(\mathbf{r}) \exp(-i\mathbf{G}\cdot\mathbf{r})\,dV $$

so that the electron density can be retrieved from Fourier inversion

$$ n(\mathbf{r}) = \frac{1}{V}\sum_{\mathbf{G}} F(\mathbf{G}) \exp(i\mathbf{G}\cdot\mathbf{r}) $$

However we cannot measure F directly, but only the scattered intensity (no phase!), which is proportional to $|F|^2$.

The Patterson function P was introduced by Arthur Lindo Patterson in 1935, and it is essentially the inverse Fourier transform of the scattered intensity

$$ P(\mathbf{u}) = \frac{1}{V}\sum_{\mathbf{G}} |F(\mathbf{G})|^2 \exp(i\mathbf{G}\cdot\mathbf{u}) $$

which is somewhat similar to

$$ n(\mathbf{r}) = \frac{1}{V}\sum_{\mathbf{G}} F(\mathbf{G}) \exp(i\mathbf{G}\cdot\mathbf{r}) $$

Now denote the Fourier transform with the symbol $\mathcal{T}$ and the inverse transform with $\mathcal{T}^{-1}$, then

$$ P(\mathbf{u}) = \mathcal{T}^{-1}\left(|F|^2\right) = \mathcal{T}^{-1}(F^* \cdot F) = \mathcal{T}^{-1}(F^*) \otimes \mathcal{T}^{-1}(F) \quad \text{(apply convolution theorem)} $$

Since

$$ n(\mathbf{r}) = \mathcal{T}^{-1}(F);\quad F^*(\mathbf{G}) = F(-\mathbf{G}) $$

we find

$$ P(\mathbf{u}) = n(\mathbf{r}) \otimes n(-\mathbf{r}) = \int_V n(\mathbf{r})\,n(\mathbf{u} + \mathbf{r})\,d\mathbf{r} $$

this shows that the Patterson function is the convolution of the electron density with the mirror image of the density

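Another way of looking at the Patterson function

$$ \begin{aligned} P(\mathbf{u}) &= \frac{1}{V}\sum_{\mathbf{G}} |F(\mathbf{G})|^2 \exp(i\mathbf{G}\cdot\mathbf{u}) \\ &= \frac{1}{V}\sum_{\mathbf{G}} \left[\int_V n(\mathbf{r}') \exp(i\mathbf{G}\cdot\mathbf{r}')\,dV'\right]\left[\int_V n(\mathbf{r}'') \exp(-i\mathbf{G}\cdot\mathbf{r}'')\,dV''\right] \exp(i\mathbf{G}\cdot\mathbf{u}) \\ &= \frac{1}{V}\sum_{\mathbf{G}} \iint_V n(\mathbf{r}')\,n(\mathbf{r}'') \exp\left[i\mathbf{G}\cdot(\mathbf{r}' - \mathbf{r}'' + \mathbf{u})\right] dV'\,dV'' \end{aligned} $$

the Patterson function has peaks where the parenthesis $(\mathbf{r}' - \mathbf{r}'' + \mathbf{u})$ vanishes and where the densities are highest

Peaks occur at $\mathbf{u} = \mathbf{r}_i - \mathbf{r}_j$, i.e., the Patterson function is a function of the distances between atoms.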
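The two views of the Patterson function can be compared on a toy model. The sketch below is a one-dimensional, discretized example with made-up point atoms on a periodic grid of N points: it computes P(u) from the intensities alone and checks that it coincides with the autocorrelation of the density, with peaks at the interatomic separations.

```python
import cmath
import math

N = 24                                   # grid points in one (periodic) cell
atoms = {2: 1.0, 7: 2.0, 15: 1.0}        # grid index -> weight (made-up values)
n = [atoms.get(i, 0.0) for i in range(N)]

# Scattered amplitude F(v) = sum_r n(r) exp(-2 pi i v r / N); only |F|^2 is measurable
F = [sum(n[r] * cmath.exp(-2j * math.pi * v * r / N) for r in range(N)) for v in range(N)]
intensity = [abs(Fv) ** 2 for Fv in F]

# Patterson function: inverse transform of the intensities (no phases needed)
P = [sum(I * cmath.exp(2j * math.pi * v * u / N) for v, I in enumerate(intensity)).real / N
     for u in range(N)]

# Direct autocorrelation of the density: sum_r n(r) n(r + u)
auto = [sum(n[r] * n[(r + u) % N] for r in range(N)) for u in range(N)]

peaks = [u for u in range(N) if P[u] > 0.5]
print("Patterson peaks at u =", peaks)
print("peak heights        =", [round(P[u], 6) for u in peaks])
```

The peaks fall at u = 0 and at all differences between atomic positions (5, 8, 13 and their negatives modulo N), each weighted by the product of the two atomic weights.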

[Page excerpt (p. 74) from Lattman & Loll, "Protein crystallography: a concise guide", Johns Hopkins Univ. Press, 2008]

Figure 5.1. Building up the Patterson function from the atomic positions. (a) One unit cell of a simple two-dimensional crystal containing four atoms. (b) The same unit cell shown in (a), but showing the vectors connecting the four atoms. (c) The interatomic vectors from (b), shown emanating from a common origin. This is how the vector peaks appear in the Patterson function. (d) Multiple unit cells of the Patterson function. Peaks corresponding to interatomic vectors are shown as dots. The vectors are drawn for only the central unit cell, but all the unit cells are identical.

If we use F as the coefficient in a Fourier series we obtain the electron density, ρ(x). ρ(x) has peaks corresponding to the atomic positions, $x_j$. By analogy, we might expect that using $F^2$ as the coefficient in a Fourier series will give us a function, P(u), that has peaks corresponding to the interatomic vectors, $(x_j - x_k)$. This is indeed the case and is easily demonstrated. Combining equations (5.1) and (5.2) we obtain

$$ P(\mathbf{u}) = \frac{1}{V}\sum_{\mathbf{h}} \left[\sum_j \sum_k f_j f_k \exp(2\pi i\,\mathbf{h}\cdot[\mathbf{x}_j - \mathbf{x}_k])\right] \exp(-2\pi i\,\mathbf{h}\cdot\mathbf{u}) $$

Combining terms leads to the following:

$$ P(\mathbf{u}) = \frac{1}{V}\sum_{\mathbf{h}} \sum_j \sum_k f_j f_k \exp(-2\pi i\,\mathbf{h}\cdot[\mathbf{u} - (\mathbf{x}_j - \mathbf{x}_k)]) $$

[Page excerpt (p. 77) from Lattman & Loll, "Protein crystallography: a concise guide", Johns Hopkins Univ. Press, 2008]

Figure 5.3. An example of how atomic positions can be inferred from the Patterson function when heavy atoms are present. (a) A simple organic compound, iodobenzene, containing a single heavy atom. The iodine atom is dark gray and is placed at the origin of an arbitrary coordinate system. (b) The Patterson function calculated from this molecule. Peaks in the Patterson function are shown as dots. Vectors between the iodine atom and carbon atoms appear as dark peaks, while carbon-carbon vectors are shown as light peaks. There is also a peak at the origin, as is always the case for the Patterson function. Note how the dark peaks reveal both the structure of the molecule and its mirror image.

[...]ing a moderate number of light atoms plus a few heavy atoms, the heavy atom-heavy atom peaks can frequently be identified in the Patterson function, allowing the heavy atom positions to be inferred. Unfortunately, in large molecules like proteins, the Patterson functions are so complex that even heavy atom peaks become lost. However, as we have seen in Chapter 4, the positions of the heavy atoms must be known before we can calculate phases in the MIR experiment. How do we find them? In this case, we use what are called difference Patterson functions to accentuate the heavy atom peaks and allow their identification. Suppose we are searching for the positions of the heavy atoms in a heavy atom derivative of a protein. The heavy atoms are bound to the protein and occupy specific sites in the crystal lattice. Now imagine that we could magically erase all the protein atoms from the crystal, without changing the positions of the heavy atoms. The diffraction pattern of this imaginary heavy atom-only crystal would correspond to the structure factors $F_H(h)$. The Patterson function calculated from these data would have coefficients $F_H^2(h)$ and would be ideally suited for determining the positions of the heavy atoms. Of course, we can't erase the protein atoms, and we can't measure $F_H$, but we can approximate $F_H$. As mentioned in Chapter 4, the isomorphous difference $\Delta F_{iso} = F_{PH} - F_P$ can be used as [...]

[Page excerpt (p. 79) from Lattman & Loll, "Protein crystallography: a concise guide", Johns Hopkins Univ. Press, 2008]

Figure 5.4. Isomorphous difference Patterson map for a mercury derivative of a protein. The crystals in this example contain hemoglobin from the annelid Glycera dibranchiata. This two-dimensional contour plot shows the v = 0 section of the full three-dimensional Patterson function (the u-w plane corresponds to the x-z plane in the real unit cell). The section plotted extends over the full unit cell in w (across) and halfway along the unit cell in u (down). The origin is at the upper left, and the large peak there represents vectors between atoms and themselves. The peak about halfway down on the left, lying at approximately u = 0.45 and w = 0.04, represents a vector between two symmetry-related mercury atoms. It occurs at the position 2x, 2z, where x and z are the mercury atom coordinates in the unit cell. The refined coordinates for the mercury atom are x = 0.225 and z = 0.007. Reproduced from the dissertation The Structure of Glycera Hemoglobin by E. A. Padlan.

[...] we may choose the correct arrangement of heavy atoms, or we may choose its inverse. If the incorrect arrangement of heavy atoms is chosen, it will give rise to an inverse image: a protein containing D-amino acids and left-handed α-helices. Changing the sign of every phase angle will invert the handedness of the image, and most programs contain a switch to do this. (Crystallography is capable of determining the absolute handedness of molecules through the use of anomalous scattering, but we won't describe how in this book. Note that Linus Pauling, Robert Corey, and Herman Branson, in their landmark 1951 paper describing the structure of the α-helix, did not attempt to assign absolute handedness and actually chose arbitrarily to draw the helix as left-handed! Later that same year Johannes Bijvoet published his account of how the absolute configuration of chiral molecules can be determined by using anomalous scattering.) The example we just examined offers no information about the y coordinate of the heavy atom. In this case, we are at liberty to set the y coordinate equal to [...]

[Page excerpt (p. 14) from Lattman & Loll, "Protein crystallography: a concise guide", Johns Hopkins Univ. Press, 2008]

Figure 1.8. Illustration of how a crystal is built up by symmetric repetition of simple elements. (a) The asymmetric unit is the smallest entity that is necessary to build up the entire crystal. In this example, the asymmetric unit corresponds to a single molecule. (b) Identical copies of this molecule are generated by the space group symmetry operations. In the example shown, each of the four molecules in the unit cell is related to the other three by twofold (180°) rotations about one of three symmetry axes. The three rotational symmetry axes are parallel to the unit cell edges. This type of packing arrangement is known as 222 symmetry. These four molecules comprise the contents of the unit cell, which is shown in (c). The unit cell is a box that encloses the various symmetry-related copies of the asymmetric unit. The edges of the unit cell are defined by three vectors, a, b, and c. Finally, as shown in (d), multiple copies of the unit cells are stacked together to form the crystal, much as bricks are stacked to form a wall. Each unit cell is related to all of its neighbors by a pure translation that constitutes an integer number of steps in a, b, and c. Kindly provided by Alexander McPherson.

[...] means that molecules in the crystal are superimposed on copies of themselves when reflected through a particular plane or point. Mirror planes and inversions change the hand of objects and can therefore not be present in protein crystals, since the amino acids comprising proteins are chiral. Finally, translations can be combined with rotations or mirror planes to give screw axes or glide planes, respectively.

Glucose Isomerase Crystals

Glucose isomerase (GI) catalyzes the reversible isomerization of D-glucose and D-xylose to D-fructose and D-xylulose, respectively. In the industry, glucose isomerase is used almost exclusively in the conversion of starches to sugars. GI may be the most important of all industrial enzymes of the future.

Lysozyme Crystals

The lysozymes are enzymes that damage bacterial cell walls by catalyzing hydrolysis of 1,4-beta-linkages between N-acetylmuramic acid and N-acetyl-D-glucosamine residues in a peptidoglycan and between N-acetyl-D-glucosamine residues in chitodextrins. Lysozyme is abundant in a number of secretions, such as tears, saliva, human milk, and mucus.

Excelsin Crystals

Excelsin is a globular protein found in some plants. It is an insect poison and prevents insect feeding.

A gallery of proteins, from Lattman & Loll, "Protein crystallography: a concise guide", Johns Hopkins Univ. Press, 2008 (p. 13)

Figure 1.7. Gallery of pictures illustrating a variety of protein crystals. Note the wide variety of shapes. Note also the variation in the perfection of the external appearance of the crystals, which is not necessarily related to perfection in internal order. The names of the proteins appear in the individual frames. Kindly provided by Alexander McPherson.

[...] inversions, and translations. In a crystal possessing rotational symmetry, every molecule in the crystal is superimposed on an identical copy of itself when rotated by a specific angle (for example, 180°) about a particular axis. Allowed rotational symmetries are twofold (180°), threefold (120°), fourfold (90°), and sixfold (60°). Note that fivefold symmetry is not allowed in crystals, nor is sevenfold symmetry or higher. When we say they are not allowed, we mean that it is physically impossible to build up a repeating three-dimensional array that is based on fivefold or sevenfold symmetry. Mirror symmetry or inversion symmetry [...]

Structure of hen egg-white lysozyme

... follow the link http://www.rcs

DATA DEPOSITION AND ANNOTATION

[Infographic; recoverable text:]

• Structural data are deposited to the PDB archive by scientists residing on every inhabited continent
• In 2015, the wwPDB curated 10,956 deposited structures. The PDB archive is on track to receive ~12,000 new structures in 2016
• Data are expertly curated by wwPDB annotators before release in the public PDB archive
• OneDep System, Data Curation Modules: Chemical Component Processing; Sequence Processing; Manual and Automated Annotation; Validation
• The PDB archive hosted a total of 534,339,871 PDB data downloads in 2015
• [Photo: Ligand Validation Workshop Participants]

http://www.rcs

2013: 8738 structures
2014: 8873 structures
2015: 8673 structures
2016: 10016 structures
2017 (3): 3139 structures
http://www.rcs

2013: 118 structures
2014: 526 structures
2015: 715 structures
2016: 928 structures
2017 (3): 121 structures
http://www.rcs

Homework:

1. Access the website http://www.ks.uiuc.edu and download VMD, the molecular dynamics visualizer (look for documentation, quick start, instructions, etc. on the same site)

2. Go to http://www.rcsb.org and download PDB entry 1fbb (bacteriorhodopsin)

3. Run VMD, load file 1fbb, and find a way to display a stereo view of the molecule in the ball-and-stick representation

4. Print the result and bask in the light of your success
