Algorithmic Framework for X-Ray Nanocrystallographic Reconstruction in the Presence of the Indexing Ambiguity

Jeffrey J. Donatelli (a, b) and James A. Sethian (a, b, 1)

(a) Department of Mathematics and (b) Lawrence Berkeley National Laboratory, University of California, Berkeley, CA 94720

Contributed by James A. Sethian, November 21, 2013 (sent for review September 23, 2013)

PNAS | January 14, 2014 | vol. 111 | no. 2 | 593–598 | www.pnas.org/cgi/doi/10.1073/pnas.1321790111

X-ray nanocrystallography allows the structure of a macromolecule to be determined from a large ensemble of nanocrystals. However, several parameters, including crystal sizes, orientations, and incident photon flux densities, are initially unknown and images are highly corrupted with noise. Autoindexing techniques, commonly used in conventional crystallography, can determine orientations using Bragg peak patterns, but only up to crystal lattice symmetry. This limitation results in an ambiguity in the orientations, known as the indexing ambiguity, when the diffraction pattern displays less symmetry than the lattice and leads to data that appear twinned if left unresolved. Furthermore, missing phase information must be recovered to determine the imaged object's structure. We present an algorithmic framework to determine crystal size, incident photon flux density, and orientation in the presence of the indexing ambiguity. We show that phase information can be computed from nanocrystallographic diffraction using an iterative phasing algorithm, without extra experimental requirements, atomicity assumptions, or knowledge of similar structures required by current phasing methods. The feasibility of this approach is tested on simulated data with parameters and noise levels common in current experiments.

Significance

X-ray nanocrystallography is a powerful imaging technique which is able to determine the atomic structure of a macromolecule from a large ensemble of nanocrystals. Determining structure from this ensemble is challenging because the images are noisy, and individual crystal sizes, orientations, and incident photon flux densities are unknown. Additionally, lattice symmetries may lead to orientation ambiguities. Here, we show how to determine crystal size, incident photon flux density, and crystal orientation from noisy data. We also demonstrate that these data can be used to perform reconstruction without extra experimental requirements, atomicity assumptions, or knowledge of similar structures.

Author contributions: J.J.D. and J.A.S. designed research, performed research, analyzed data, and wrote the paper.

The authors declare no conflict of interest.

(1) To whom correspondence should be addressed. E-mail: [email protected].

This article contains supporting information online at www.pnas.org/lookup/suppl/doi:10.1073/pnas.1321790111/-/DCSupplemental.

Although conventional X-ray crystallography has been used extensively to determine atomic structure, it is limited to objects that can be formed into large crystal samples (>10 μm). An appealing alternative, made possible by recent advances in light source technology, is X-ray nanocrystallography, which is able to image structures resistant to large crystallization, such as membrane proteins, by substituting a large ensemble of easier-to-build nanocrystals, typically <1 μm, often delivered to the beam via a liquid jet (1–6) (Fig. 1). However, the beam power required to retrieve sufficient information destroys the crystal, hence ultrafast pulses (≤70 fs) are required to collect data before damage effects alter the signal.

Using nanocrystals introduces several challenges. Due to the small crystal size, Bragg peaks are smeared out, and there is noticeable signal between peaks. Typically, only partial peak reflections are measured, resulting in reduced intensities. Variations in crystal size and incident photon flux density, unknown orientations, shot noise, and background signal from the liquid and detector add additional uncertainty to the data.

If crystal orientations were known, noise and variation in the peak measurements could be averaged out, and the data could be inverted to retrieve the object's electron density. Although autoindexing techniques can be used to determine crystal orientation up to lattice symmetry from the location of a sufficient number of Bragg peaks, they typically face difficulties in the presence of partial and non-Bragg reflections common in nanocrystal diffraction images. Furthermore, these techniques only narrow down orientation to a list of possibilities when the diffraction pattern has less symmetry than the lattice, leading to an ambiguity in the image orientation, known as the indexing ambiguity. Current methods of processing the diffraction data are largely based on averaging out the data variance over several images (1–8). However, if the data are processed without resolving the indexing ambiguity then they will appear to be perfectly twinned, i.e., averaged over multiple orientations. Although there has been some success in determining structure from perfectly twinned data, reconstruction is often infeasible without a good initial atomic model of the structure.

We present an algorithmic framework for X-ray nanocrystallographic reconstruction which is based on directly reducing data variance and resolving the indexing ambiguity. First, we design an autoindexing technique that uses both Bragg and non-Bragg data to compute precise orientations, up to lattice symmetry. Crystal sizes are then determined by performing a Fourier analysis around Bragg peak neighborhoods from a finely sampled low-angle image, such as from a rear detector (Fig. 1). Next, we model structure factor magnitudes for each reciprocal lattice point with a multimodal Gaussian distribution, using a multistage expectation maximization algorithm which simultaneously scales and models the data. These multimodal models are used to build a weighted graph which models the structure factor magnitude concurrency. We formulate the solution to the indexing ambiguity problem as finding the maximum edge weight clique in this graph, which can be solved efficiently via a greedy approach. Finally, we demonstrate the feasibility of solving the phase problem using iterative phase retrieval. Whereas several of the presented methods rely on the use of nanocrystals, we note that the scaling–multimodal analysis and indexing ambiguity resolution steps can also be applied to larger crystals (1–10 μm).

Formulation

In X-ray crystallography, diffraction patterns are collected from a periodic crystal made up of the target object. The 3D crystal lattice structure may be described by its Bravais lattice characteristic $(h_1, h_2, h_3)$, $h_j \in \mathbb{R}^3$, and its associated infinite lattice $L = \{\sum_{j=1}^{3} n_j h_j : n_1, n_2, n_3 \in \mathbb{Z}\}$. We define the lattice rotational symmetry group to be the set of rotation operators which preserve the lattice structure.

Each lattice has both a Dirac comb $\Delta_L(x) = \sum_{y \in L} \delta(x - y)$, which is a sum of Dirac delta functions supported on the lattice points, and a dual $\hat{L}$, known as the reciprocal lattice, given by the support of the Dirac comb's Fourier transform $\hat{\Delta}_L$. The Bravais vectors of the reciprocal lattice are given by $[\hat{h}_1, \hat{h}_2, \hat{h}_3] = [h_1, h_2, h_3]^{-T}$.

In practice, a crystal lattice consists of only a finite part of its associated infinite lattice. In this case, the associated Dirac comb's Fourier transform, known as the shape transform $S : \mathbb{R}^3 \to \mathbb{C}$, is no longer a sum of delta functions, but is instead a smeared-out version of $\hat{\Delta}_L$. To simplify the discussion, we will be assuming that the finite crystal lattice can be described as a box with $N_j$ unit cells in the direction of $h_j$. Here, the squared norm of its associated shape transform is given by

$$|S(q)|^2 = \prod_{j=1}^{3} \frac{\sin^2(\pi N_j h_j \cdot q)}{\sin^2(\pi h_j \cdot q)}. \qquad [1]$$

For nanocrystallography, small crystal sizes spread the shape transform, and measurements close to, but not directly at, a Bragg peak, known as partial reflections, have a decreased measured intensity. Additionally, the signal at pixels corresponding to lines in between adjacent reciprocal lattice points is often noticeable in nanocrystal diffraction images.

The goal of X-ray nanocrystallography is to determine the unit cell electron density $\rho$ from a large ensemble of diffraction images, which vary in orientation, incident photon flux density, and crystal size and are corrupted with noise. Here we will focus on the case when the diffraction pattern displays less symmetry than the lattice, which leads to the indexing ambiguity.

Fig. 1. Liquid jet (blue) delivers nanocrystal samples to the X-ray beam (red). Wide- and small-angle diffraction data are collected using front and rear detectors.

Fig. 2. Example of a low-angle image. The profile of the shape transform around the Bragg peaks (circled) can be used to determine the crystal sizes.

Autoindexing

Commonly used autoindexing methods, e.g., refs. 9, 10, can accurately determine a lattice's unit cell, given by the Bravais vectors in some reference configuration, using a large ensemble of images. However, the orientation information that these methods compute might not be accurate enough to be used in the evaluation of the shape transform, especially in the presence of non-Bragg spots and low Bragg peak counts. Starting from this unit cell information, we devise an algorithm which uses both Bragg and non-Bragg data to generate precise orientations, up to lattice symmetry.
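To make the box-crystal shape transform of Eq. 1 concrete, the following is a minimal NumPy sketch, not the authors' implementation. It computes the reciprocal Bravais vectors as the inverse transpose of the Bravais matrix and evaluates the squared shape transform at arbitrary points q. The function names, the cubic cell of edge 50 (arbitrary length units), the 5 x 5 x 5 crystal size, and the tolerance `eps` are all illustrative assumptions.

```python
import numpy as np

def reciprocal_vectors(H):
    """H holds the Bravais vectors h_1, h_2, h_3 as columns.
    Returns H^{-T}, whose columns are the reciprocal Bravais vectors."""
    return np.linalg.inv(H).T

def shape_transform_sq(q, H, N, eps=1e-12):
    """Evaluate Eq. 1: prod_j sin^2(pi N_j h_j.q) / sin^2(pi h_j.q).

    q   : array of shape (..., 3), points in reciprocal space
    H   : 3x3 array with Bravais vectors as columns
    N   : number of unit cells along each Bravais direction
    eps : hypothetical threshold for treating h_j.q as an integer
    """
    q = np.asarray(q, dtype=float)
    N = np.asarray(N, dtype=float)
    x = q @ H                                  # x[..., j] = h_j . q
    num = np.sin(np.pi * N * x) ** 2
    den = np.sin(np.pi * x) ** 2
    # Where h_j.q is an integer the ratio is 0/0; its limit is N_j^2.
    near_peak = den < eps
    ratio = np.where(near_peak, N ** 2, num / np.where(near_peak, 1.0, den))
    return ratio.prod(axis=-1)

# Illustrative values only: a cubic cell and a 5 x 5 x 5 nanocrystal.
H = 50.0 * np.eye(3)
N = (5, 5, 5)
H_hat = reciprocal_vectors(H)

at_peak = shape_transform_sq(H_hat[:, 0], H, N)        # exactly on a Bragg peak
partial = shape_transform_sq(1.1 * H_hat[:, 0], H, N)  # slightly off the peak
print(at_peak, partial)
```

On a reciprocal lattice point every factor reaches its limiting value $N_j^2$, so the peak height grows with crystal size, while a point slightly off the peak gives a smaller value. This is the reduced intensity of a partial reflection described above.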
