bioRxiv preprint doi: https://doi.org/10.1101/2021.07.13.452116; this version posted July 13, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
1 Structural insight into Marburg virus nucleoprotein-RNA complex formation
2
3 Yoko Fujita-Fujiharu1,2,3, Yukihiko Sugita1,2,4, Yuki Takamatsu1,#, Kazuya Houri1,2,3, Manabu
4 Igarashi5, Yukiko Muramoto1,2,3, Masahiro Nakano1,2,3, Yugo Tsunoda1, Stephan Becker6,7,
5 Takeshi Noda1,2,3*
6
7 1 Laboratory of Ultrastructural Virology, Institute for Frontier Life and Medical Sciences,
8 Kyoto University, 53 Shogoin Kawahara-cho, Sakyo-ku, Kyoto 606-8507, Japan
9 2 Laboratory of Ultrastructural Virology, Graduate School of Biostudies, Kyoto University, 53
10 Shogoin Kawahara-cho, Sakyo-ku, Kyoto 606-8507, Japan
11 3 CREST, Japan Science and Technology Agency, 4-1-8 Honcho, Kawaguchi, Saitama 332-
12 0012, Japan
13 4 Hakubi Center for Advanced Research, Kyoto University, Kyoto, Japan.
14 5 Division of Epidemiology, International Institute for Zoonosis Control, Hokkaido University,
15 Sapporo 001-0020, Japan
16 6 Institute of Virology, University of Marburg, 35043 Marburg, Germany.
17 7 German Center for Infection Research (DZIF), Marburg-Gießen-Langen Site, University of
18 Marburg, 35043 Marburg, Germany.
19
20 # present address: Department of Virology I, National Institute of Infectious Diseases, Gakuen
21 4-7-1, Musashimurayama-city, Tokyo 208-0011, Japan
22 *correspondence to: [email protected]
23
24 bioRxiv preprint doi: https://doi.org/10.1101/2021.07.13.452116; this version posted July 13, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
25 Abstract
26 The nucleoprotein (NP) of Marburg virus (MARV), a close relative of Ebola virus (EBOV),
27 encapsidates the single-stranded, negative-sense viral genomic RNA (vRNA) to form the
28 helical NP-RNA complex. The NP-RNA complex serves as a scaffold for the assembly of the
29 nucleocapsid that is responsible for viral RNA synthesis. Although appropriate interactions
30 among NPs and RNA are required for the formation of nucleocapsid, the structural basis of the
31 helical assembly remains largely elusive. Here, we show the structure of the MARV NP-RNA
32 complex determined using cryo-electron microscopy at a resolution of 3.1 Å. The structures of
33 the asymmetric unit, a complex of an NP and six RNA nucleotides, was very similar to that of
34 EBOV, suggesting that both viruses share common mechanisms for the nucleocapsid formation.
35 Structure-based mutational analysis of both MARV and EBOV NPs identified key residues for
36 the viral RNA synthesis as well as the helical assembly. Importantly, most of the residues
37 identified were conserved in both viruses. These findings provide a structural basis for
38 understanding the nucleocapsid formation and contribute to the development of novel antivirals
39 against MARV and EBOV.
40 bioRxiv preprint doi: https://doi.org/10.1101/2021.07.13.452116; this version posted July 13, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
41 Introduction
42 Marburg virus (MARV), a close relative of the Ebola virus (EBOV), belongs to the
43 family Filoviridae and possesses a 19.1 kb non-segmented, single-stranded, negative-sense
44 RNA genome (vRNA)1. Similar to EBOV, it causes severe haemorrhagic fever in humans and
45 non-human primates with a high mortality rate, which continuously poses a threat to public
46 health. However, no vaccine or antivirals against MARV diseases have yet been licensed. The
47 nucleoprotein (NP) of MARV consisting of 695 amino acids encapsidates the vRNA to form a
48 left-handed helical complex2. The helical NP-RNA complex acts as a scaffold for the formation
49 of a nucleocapsid, which is responsible for the transcription and replication of the vRNA3,4.
50 Since the NP-RNA complex is the structural unit that constitutes the helical core structure of
51 the nucleocapsid, for which appropriate interactions between NP molecules and RNA
52 nucleotides are required, solving the NP-RNA unit structure is essential for understanding the
53 mechanisms of the nucleocapsid formation.
54 The core structure of RNA-free monomeric MARV NP, spanning residues 21 to 373,
55 features a typical bilobed fold consisting of N-terminal and C-terminal lobes, as determined
56 using X-ray crystallography at 2.9 Å resolution5, which is similar to that of the EBOV NP core
57 structure6. These two lobes are connected by a flexible hinge region and are considered to
58 clamp the RNA strand between the two lobes by electrostatic interactions. The overall three-
59 dimensional architecture of the NP-RNA complex, in which the cellular RNA was coated with
60 truncated NP comprising its N-terminal 390 residues, was visualized using cryo-electron
61 tomography (cryo-ET), showing that the helical NP-RNA complex constitutes the innermost
62 layer of the MARV nucleocapsid7. However, since the resolution was limited to the scale of a
63 few nanometres, the structural basis for the assembly of NP and RNA into the nucleocapsid
64 core largely remains unknown. bioRxiv preprint doi: https://doi.org/10.1101/2021.07.13.452116; this version posted July 13, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
65 Using single-particle cryo-electron microscopy (cryo-EM), we recently determined the
66 structure of EBOV NP-RNA complex at 3.6 Å resolution8. Here, we employed the same
67 technique to determine the cryo-EM structure of the MARV NP-RNA complex. In addition,
68 we aimed to identify key residues important for NP-RNA and NP-NP interactions and viral
69 RNA synthesis using structure-based mutagenesis.
70 bioRxiv preprint doi: https://doi.org/10.1101/2021.07.13.452116; this version posted July 13, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
71 Results
72 Overall structure of the MARV NP-RNA complex and its structural units
73 Similar to EBOV NP, MARV NP possesses the structural domains: an N-terminal arm,
74 an NP core composed of N- and C-terminal lobes, a disordered linker, and a C-terminal tail
75 (Fig. 1a). To determine the structure of MARV NP associated with RNA, we expressed a C-
76 terminally truncated NP which encompasses residues 1 to 395 containing the N-terminal arm
77 and the NP core domains in mammalian cells to constitute the helical complex7. Purified
78 complexes were imaged using cryo-EM (Extended Data Fig. 1a), and the structure was
79 determined by single-particle analysis at 3.1 Å resolution (Fig. 1b, Extended Data Fig. 1b-d).
80 The constituted complex shows a left-handed double helical structure (Fig. 1b, Extended Data
81 Fig. 1b), in which respective strands have similar dimensions: each strand contains single-
82 strand RNA, and 30.50 NP subunits per turn and a 128.99 Å helical pitch (rise = 4.23 Å, rotation
83 = 11.8052) with the outer and inner diameters of approximately 330 Å and 255 Å, respectively.
84 In the cryo-EM map, the density of the nucleotide bases and most of the side chains are clearly
85 visible, which enabled us to build an atomic model unambiguously showing six RNA
86 nucleotides enclosed by the N- and C-terminal lobes of NP (Fig. 1c-e). The respective
87 asymmetric NP subunits, NP-a and NP-b, show very similar conformations as shown by a
88 0.592-Å backbone RMSD (Root Mean Square Deviation) (Fig. 1c, Extended Data Fig. 1e).
89 The models of these asymmetric subunits also fit into the NP region within a low resolution
90 cryo-ET reconstruction of the MARV nucleocapsid20 (Extended Data Fig. 1f). Helix α16
91 confines the RNA strand into a cleft between the two lobes (Fig. 1d). The N-terminal arm
92 (residues 1 to 19) protrudes sideway from the N-terminal lobe and has a short 310-helix η1 (Fig.
93 1d), which is associated with a neighbouring NP molecule within the helical complex. Overall,
94 the MARV NP-RNA complex shares common structural features to that of the EBOV NP-
95 RNA unit8 (Extended Data Fig. 2), although the MARV NP-RNA complex reconstituted here bioRxiv preprint doi: https://doi.org/10.1101/2021.07.13.452116; this version posted July 13, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
96 is a double helix unlike the EBOV NP-RNA complex. Expression of the full-length wild-type
97 MARV NP also showed the formation of a double helical structure (Extended Data Fig. 3).
98
99 Amino acid residues of MARV NP involved in interactions with RNA
100 Comparison of our atomic model of the RNA-bound MARV NP with an RNA-free
101 monomeric MARV NP structure5 provides a possible mechanism of RNA encapsidation (Fig.
102 2a). A structural alignment of our RNA-bound and the previously reported RNA-free NP5
103 molecules exhibits a 0.885-Å backbone RMSD, suggesting no drastic conformational changes
104 in the overall structure. Local conformational changes are observed mainly in the C-terminal
105 lobe probably to clamp the RNA strand. In particular, short 310-helix η6, which is disordered
106 in the RNA-free state, appears underneath the RNA strand possibly to make hydrogen bonds
107 with the RNA (Fig. 2a, b). The C-terminal helix α15 is shifted outward of the helical complex
108 so that the C-terminal helices α16 on the NP strands can be aligned like a zipper to capture the
109 RNA in between the two lobes (Fig. 2a). Indeed, the electrostatic surface potential of the NP
110 shows that the α15 and α16 helices confine the RNA in the positively charged cleft between
111 the N- and C-terminal lobes (Extended Data Fig. 4).
112 In the positively charged cleft between the N- and C-terminal lobes, six RNA
113 nucleotides are encapsidated with a ‘3-bases-inward, 3-bases-outward’ configuration (Fig. 2c).
114 Several polar residues are present within hydrogen-bonding distance of the RNA backbone and
115 likely contact with the RNA bases within the cleft (Fig. 2c). In particular, the interactions
116 between NP and RNA are predominantly electrostatic due to positively charged residues such
117 as K142, K153, R156, and H292 with their side chains pointing to the phosphate backbone of
118 the RNA (Fig. 2c, d). In contrast, the side chain of K230, which is reported not to be essential
119 for the RNA binding5, is oriented toward the RNA base (Fig. 2c, d). To further understand the
120 mechanism of RNA binding, molecular dynamics (MD) simulation and binding free energy bioRxiv preprint doi: https://doi.org/10.1101/2021.07.13.452116; this version posted July 13, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
121 calculations were performed. The binding free energy of the six RNA bases (all of them
122 modelled as uracil) within a single NP molecule (Extended Data Table 1) indicated that the
123 total contribution of the RNA backbone in the NP-binding energy was larger than that of the
124 RNA base, and the sixth uracil (U6) exhibited a relatively high binding energy, which was
125 partially attributed to its interaction with the helix α6 of the adjacent NP. Also, the RNA
126 backbone-specific interactions with the NP are consistent with a sequence-independent manner
127 of the viral RNA encapsidation, similar to those of other negative-strand RNA viruses8–16.
128 Multiple protein sequence alignment of filovirus NPs shows that most of the polar residues in
129 the RNA-binding cleft are highly conserved among the family Filoviridae (Extended Data Fig.
130 5), suggesting that those residues play important roles in the RNA binding and/or the vRNA
131 synthesis.
132
133 Amino acid residues involved in MARV NP-NP interactions
134 The N-terminal arm is known to be essential for NP oligomerization of
135 filoviruses5,6,8,17–20. However, the structural basis for the MARV NP-NP interactions remains
136 elusive. Our atomic model shows that the N-terminal arm of any arbitrary NP molecule (NPn)
137 is associated with an adjacent NPn+1 inside the helical complex (Fig. 3a). Similar to EBOV NP,
138 the N-terminal arm of MARV NPn occupies a hydrophobic pocket located on the C-terminal
139 lobe of NPn+1 via L6 and L9 (Fig. 3b). Because the hydrophobic pocket of MARV NP is
140 reportedly bound by N-terminal region of VP35, which blocks the NP oligomerization and
141 RNA encapsidation5,19, the interaction between the MARV N-terminal arm and the
142 hydrophobic pocket is considered to be essential for the formation of the helical complex along
143 with RNA. Different from the structures of an RNA-free form of the NP bound to VP35
19,21 144 peptide , H4 in the N-terminal arm of NPn is tucked between K239 on the 240 loop and the
145 hydrophobic pocket of NPn+1 (Fig. 3b). A loop 202-208 on the N-terminal lobe of NPn+1 bioRxiv preprint doi: https://doi.org/10.1101/2021.07.13.452116; this version posted July 13, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
146 slightly moves downward to make a new groove, which allows the N-terminal arm of NPn to
147 enter this space (Fig. 3c). This local conformational change is probably caused by electrostatic
148 interactions between D208 and D211 on NPn+1 and R19 on the N-terminal arm of NPn (Fig.
149 3c).
150 In addition to the N-terminal arm and the N-terminal lobe, C-terminal helices of the NP
151 are likely involved in interactions with a neighbouring NP. In our model, helix α15 of NPn is
152 associated with α16 of the adjacent NPn+1, which is mediated by hydrophobic interactions
153 between a hydrophobic patch composed of L249, L325, and I336 on NPn and a hydrophobic
154 region composed of I357, F361, and I368 on NPn+1 (Fig. 3d). In addition, R339 on α15 of NPn
155 points towards an acidic amino acid-rich region on the α16 of adjacent NPn+1, suggesting that
156 electrostatic interactions also occur between the α15 and α16 helices. This interaction might be
157 more important for NP oligomerization than just for stably clamping the RNA strand within
158 the helical complex as suggested previously5. Importantly, the residues described above, H4,
159 L6, L9, R19, R339, and hydrophobic residues on helices α15 and α16, are highly conserved
160 among the family Filoviridae (Extended Data Fig. 5), suggesting that interactions via the N-
161 terminal arm and the C-terminal helices are important for NP oligomerization of filoviruses.
162
163 Identification of amino acid residues important for filovirus RNA synthesis
164 Having identified the potential interactions among MARV NPs and RNA as described
165 above, we performed structure-based mutational analysis on MARV NP and examined the
166 impact on the helical NP-RNA complex formation by negative staining EM. In parallel, we
167 also assessed the corresponding residues of EBOV NP to understand the structural conservation
168 among filoviruses. The NP mutants tested are listed in Extended Data Table 2. Among the
169 residues potentially involved in MARV NP-NP interactions, the H4A, L9E, R339A mutants
170 formed similar helical NP-RNA complexes to wild-type MARV NP (1-395) (Fig. 4a). bioRxiv preprint doi: https://doi.org/10.1101/2021.07.13.452116; this version posted July 13, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
171 Similarly, the corresponding EBOV NP mutants, H22A, A27E, and Y357A formed helical NP-
172 RNA complexes, which were morphologically indistinguishable from that of wild-type EBOV
173 NP (1-450) (Fig. 4a). The MARV L6E mutant rarely showed helical complex formation, and
174 when it formed, the helical complexes were significantly shorter than those of the wild-type
175 (Extended Data Fig. 6), while the corresponding EBOV I24E mutant formed considerably
176 loose helices, suggesting the involvement of the interaction between N-terminal arm and the
177 hydrophobic pocket for appropriate NP oligomerization. Both MARV R19A mutant and its
178 corresponding EBOV mutant, R37A, formed straighter and longer helical complexes,
179 suggesting that interactions between the N-terminal arm and helix α10 affect rigidity of the
180 helical complex (Extended Data Fig. 6). Among the residues potentially responsible for RNA
181 binding, MARV R156A, K230A, and H292A, and the corresponding residues of EBOV NP
182 R174A, K248A, and H310A formed helical NP-RNA complexes. In contrast, MARV K142A
183 and K153A as well as corresponding K160A and K171A in EBOV NP were not able to form
184 regular helical complexes, suggesting their importance for the RNA binding property and for
185 appropriate NP oligomerization.
186 Finally, to determine whether the amino acid residues that are responsible for
187 appropriate helical NP-RNA complex formation are required for exerting nucleocapsid
188 functions, we conducted minigenome assay using full-length MARV and EBOV NP mutants
189 and assessed the impact of respective mutations on the transcription and replication activity.
190 All of the NP mutants, which showed no morphological changes in the helical complexes (Fig.
191 4a), showed comparable levels of the transcription and replication activity to wild-type NP (Fig.
192 4b). In contrast, MARV L6E and R19A and the corresponding I24E, R37A of EBOV NP,
193 which are involved in lateral NP-NP interaction, showed a significant reduction in their activity.
194 In addition, MARV NP K142A and K153A and the corresponding K160A and K171A of
195 EBOV NP, which are involved in the RNA binding, lost their activity to less than 1% of the bioRxiv preprint doi: https://doi.org/10.1101/2021.07.13.452116; this version posted July 13, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
196 wild-type. These results demonstrated that the amino acid residues, which are involved in
197 appropriate helical NP-RNA complex formation, are essential for functional nucleocapsid
198 formation.
199 bioRxiv preprint doi: https://doi.org/10.1101/2021.07.13.452116; this version posted July 13, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
200 Discussion
201 MARV NP is an RNA-binding protein and a major component of the nucleocapsid.
202 Here, we determined the structure of MARV NP-RNA complex at 3.1 Å resolution using
203 single-particle cryo-EM. Despite substantial difference in the amino acid sequence, MARV NP
204 shared common structural features with EBOV NP. Each MARV NP encapsidated six
205 nucleotides in between the N- and C-terminal lobes in a sequence-independent manner. The N-
206 terminal arm is responsible for interaction with an adjacent NP. We also identified that amino
207 acid residues of MARV and EBOV NPs critical for functional nucleocapsid formation as well
208 as helical NP-RNA complex formation were mostly conserved in filoviruses.
209 Different from EBOV NP (1-450)-RNA complex, the MARV NP (1-395)-RNA
210 complex purified from NP-expressing cells reproducibly constituted double-helical structures
211 (Extended Data Fig. 7), although we used the same purification procedures for both8. Since the
212 MARV nucleocapsid and its core structure is thought to be a single helix as shown by cryo-
213 ET7,20, the double-helical structure reconstituted in this study may not faithfully recapitulate
214 the structure in the MARV nucleocapsid core. Nevertheless, the asymmetric subunits, NP-a
215 and NP-b monomers, showed the same domain organization and structural features with NP
216 molecules on the cryo-ET reconstruction of MARV20 (Extended Data Fig. 1f). Furthermore,
217 introduction of mutations into potential interaction regions among MARV NPs and RNA
218 revealed changes in the NP-RNA helical formation and the nucleocapsid function (Fig. 4).
219 These results indicate that not only the atomic coordinates of NP and RNA but also the
220 interaction modes between lateral NPs in the helix and between NP and RNA nucleotides
221 determined in this study reflect those in authentic MARV and in the virus-infected cells. Our
222 structure also reveals flexible and yet stable structural features of the MARV NP to potentially
223 form the metastable assemblies with RNA. bioRxiv preprint doi: https://doi.org/10.1101/2021.07.13.452116; this version posted July 13, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
224 It is believed that, in many RNA viruses, viral nucleoproteins bind to viral genomic
225 RNA in a sequence-independent manner8–16 which was also observed in our study (side chains
226 of positively charged residues, K142, K153, R156, and H292, in the RNA-binding cleft pointed
227 to the phosphate backbone of the RNA, but not to the RNA bases). The MD simulation
228 followed by binding free energy analysis revealed that the energy contribution of RNA
229 backbone to NP-binding was higher than that of the RNA bases. These results highlighted the
230 importance of the interactions between RNA backbone and NP for RNA binding, which
231 probably assist in the incorporation of viral RNA in a sequence-independent manner to form
232 the helical NP-RNA complex.
233 Structure-based mutational analysis identified amino acid residues which are critical
234 for helical complex formation and nucleocapsid function of EBOV and MARV. Regarding the
235 RNA recognition, MARV K142 and K153 (corresponding to EBOV K160 and K171,
236 respectively) likely play more important roles in functional nucleocapsid formation, compared
237 to K230 and H292 in the C-terminal lobe. Interestingly, paramyxovirus and pneumovirus
238 nucleoproteins, which encapsidate vRNA in the positively charged cleft composed between the
239 N- and C-terminal lobes, contain more basic amino acid residues on the RNA-binding cleft of
240 N-terminal lobe than that of C-terminal lobe9,12,22,23 (Extended Data Fig. 8). These findings
241 indicate that robust association via N-terminal lobe and subsequent interaction of C-terminal
242 lobe with RNA nucleotides would be important for nucleocapsid assembly and vRNA
243 transcription and replication.
244 In our MARV NP-RNA complex structure, the N-terminal arm fits into the hydrophobic
245 pocket of the adjacent NP (Fig. 3b, Extended Data Fig. 9), where the hydrophobic interaction
246 plays a key role in tethering adjacent two NP molecules. Importantly, the amino acids
247 comprising this hydrophobic pocket, such as F223, V229, L233, L237, V244, L266, L269,
248 A270, G273, A276, P277, and F278, are conserved among the family Filoviridae (Extended bioRxiv preprint doi: https://doi.org/10.1101/2021.07.13.452116; this version posted July 13, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
249 Data Fig. 5). The hydrophobic pocket is also occupied by N-terminal region of MARV
250 VP3519,21 (Extended Data Fig. 9). Once VP35 binds to the hydrophobic pocket, the NP
251 molecule switches to an “open-state” structure and loses its ability to encapsidate vRNAs. Thus,
252 MARV NP would oligomerize by release of VP35 N-terminus from the hydrophobic pocket
253 and NP binding to vRNA in turn, as is the case with EBOV6,8,18. Although the N-terminal arm
254 of MARV is 18-amino acids shorter than that of the Ebola and Cueva viruses (Extended Data
255 Fig. 5), the competitive interactions with the hydrophobic pocket between the N-terminal arm
256 and the VP35 N-terminus is probably a common NP oligomerization mechanism in the family
257 Filoviridae6,8,18.
258 In conclusion, we determined a high-resolution structure of MARV NP-RNA complex
259 using cryo-EM and identified critical residues for functional nucleocapsid formation. The
260 results advance our understanding of the mechanisms of the nucleocapsid formation and
261 contribute to the development of antivirals broadly effective for filoviruses. bioRxiv preprint doi: https://doi.org/10.1101/2021.07.13.452116; this version posted July 13, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
262 Material & Methods
263 Cells
264 Human embryonic kidney 293 Freestyle (HEK293F) cells were maintained in Expi293
265 Expression Medium (Gibco, Waltham, MA, USA), at 37 °C in an 8% CO2 atmosphere. Human
266 embryonic kidney (293T) cells were maintained in the Dulbecco’s Modified Eagle Medium
267 (D6046, Merck, Darmstadt, Germany) supplemented with 10% fetal bovine serum (FB-1365,
268 Biosera, France).
269
270 Plasmids
271 All MARV (Musoki strain) plasmids encoding wild-type proteins (pCAGGS-L, pCAGGS-
272 VP30, pCAGGS-VP35, pCAGGS-NP), T7-driven MARV minigenome encoding Renilla
273 luciferase (p3M-5M-Luc), and T7 DNA-dependent RNA polymerase (pCAGGS-T7), are
274 described elsewhere 24. The pCAGGS-NP (1-395) that expresses residues 1-395 of MARV NP
275 was constructed from pCAGGS-NP by PCR using primers with digestion enzyme sites. Then,
276 the PCR product was cloned into digested pCAGGS vector with a Kozak sequence present
277 upstream of the initiation codon. The NP as well as NP (1-395) mutant constructs were
278 generated by PCR-based site-directed mutagenesis. The primers used in this study are listed in
279 Extended Data Table 3. All constructs were sequenced to conform that unwanted mutations
280 were not present.
281
282 Expression and purification of NP-RNA complex
283 For MARV, HEK293F cells grown in 10 ml of medium (3.0 x 106 cells/ml) were transfected
284 with 6 µg of pCAGGS-NP (1-395) using Polyethyleneimine MAX (Polysciences, Warrington,
285 PA, USA). Three days post-transfection, the cells were collected and lysed with 0.1 % Nonidet
286 P-40 substitute (Wako, Osaka, Japan) in Tris-HCl buffer (10 mM Tris-HCl (pH 8.0), 150 mM bioRxiv preprint doi: https://doi.org/10.1101/2021.07.13.452116; this version posted July 13, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
287 NaCl, 1 mM EDTA) including a cOmplete Protease Inhibitor (Roche, Base, Switzerland) and
288 10 mM Ribonucleoside-Vanadyl Complex (NEB, Ipswich, MA, USA) on ice. The lysate was
289 centrifuged at 10,000 g at 4 ˚C for 10 min to remove insoluble substances, and the supernatant
290 was subjected to a discontinuous CsCl gradient ultracentrifugation at 246,100 g at 4 ˚C for 2 h.
291 Then, fractions containing the NP-RNA complex were collected and ultracentrifuged at
292 246,100 g at 4 ˚C for 15 min. The pellet was suspended in Tris-HCl buffer.
293 For EBOV, HEK293T cells grown in 10 ml of medium (3.0 x 105 cells/ml) were transfected
294 with 10 µg of pCAGGS-NP (1-450) using TransIT-293 Reagent (Takara, Shiga, Japan).
295 Three days post-transfection, NP-RNA complexes were purified as described above and
296 suspended in Tris-HCl buffer.
297
298 Negative staining EM
299 Five microliters of sample was applied to a glow-discharged copper grid coated with carbon
300 and negatively stained with 2% uranyl acetate. Images were obtained using transmission
301 electron microscopy (HT-7700, Hitachi High Technology, Tokyo, Japan) operating at 80 kV
302 with an XR81-B CCD camera. The length distribution was measured for about sixty helices
303 by using ImageJ software25,26.
304
305 Cryo-EM specimen preparation
306 The 2.5 µl sample of purified NP-RNA complex solution was applied onto glow-discharged
307 200-mesh R1.2/1.3 Cu grids (Quantifoil, Jena, Germany). The grids were blotted and rapidly
308 frozen in liquid ethane using a Vitrobot Mark IV (Thermo Fisher Scientific, Waltham, MA,
309 USA).
310
311 Cryo-EM data collection bioRxiv preprint doi: https://doi.org/10.1101/2021.07.13.452116; this version posted July 13, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
312 Images were acquired on a Titan Krios cryo-EM (Thermo Fisher Scientific, Waltham, MA,
313 USA) at 300 kV equipped with a Cs corrector (CEOS, GmbH), which was installed in the
314 Institute for Protein Research, Osaka University. Data were acquired automatically with EPU
315 software as movies on a Falcon 3EC detector (Thermo Fisher Scientific, Waltham, MA, USA).
316 The detailed imaging condition is described in Extended Data Table 4.
317
318 Image processing
319 We used the software RELION3.127 for the processing steps. The movie frames were motion-
320 corrected using relion_motioncorrection with 5 × 5 patches. Next, GCTF28 software was
321 utilized for contrast transfer function (CTF) estimation of all micrographs. A total of 2,469
322 micrographs were subjected to subsequent helical image processing. Helices were manually
323 traced. A total of 30,668 segments were extracted and two rounds of 2D classification was
324 performed. Classes, in which the detailed structure of helices was confirmed, were selected for
325 further analysis. The 3D initial model, a featureless cylinder, was made by the
326 relion_helix_toolbox command. Then, the featureless cylinder was refined via Refine3D and
327 used as the initial model for the following 3D classification. The 3D classification was
328 performed with local symmetry search around expected helical parameters from 2D class
329 images (twist from -12.5 degrees to -11 degrees, rise from 4 Å to 4.5 Å). A total of 23,545
330 segments from the best resolved 3D class were subjected to 3D auto-refinement (helical
331 parameter: twist for -11.8052 degrees, rise for 4.23 Å). Finally, we ran the CTF refinement job
332 and the Bayesian polishing job to improve the resolution and the final map was calculated from
333 3D auto-refinement. The resulting EM map has a nominal resolution of 3.1 Å estimated from
334 the Fourier Shell Correlation (FSC) and was normalized with PyMOL. Local resolution was
335 estimated with RELION and the locally sharpened map was used for the following atomic
336 model building. The detailed image processing conditions are listed in Extended Data Table 4. bioRxiv preprint doi: https://doi.org/10.1101/2021.07.13.452116; this version posted July 13, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
337
338 Atomic model building and refinement
339 Atomic modelling of NP (1-395) was performed by template-based and de novo structure
340 modelling. An atomic model of EBOV NP (1-450)-RNA complex (PDB-ID: 5Z9W)8 served
341 as the initial template. First, we performed a rigid body fitting of the atomic model of EBOV
342 NP (1-450) into the locally sharpened EM map of MARV NP (1-395)-RNA complex using the
343 software UCSF Chimera29. Next, the software COOT30 was used to replace each amino acid of
344 EBOV NPs with the amino acid sequence of MARV NPs; for amino acid residues from 389 to
345 391, which have no homologous sites in the atomic model of EBOV NPs, the models were
346 constructed de novo. The atomic model was refined with phenix.real_space_refine31. To
347 precisely refine the atomic models located in the intermolecular region, surrounding 8
348 molecules of the atomic models were fitted to our map and refined with constraints on the
349 protein secondary structure estimated by phenix_secondary_structure_restraints. The model
350 validity was assessed by the software Phenix31 and MolProbity32. The detailed results of the
351 atomic model construction and its evaluation are shown in Extended Data Table 4. The
352 software UCSF Chimera X33 was used for displaying the EM map and the atomic models.
353
354 Minigenome assay
355 For MARV, each well of HEK 293T cells grown in a 12-well plate were transfected with the
356 plasmids expressing MARV nucleocapsid components (400 ng of pCAGGS-L, 40 ng of
357 pCAGGS-VP30, 40 ng of pCAGGS-VP35, 200 ng of pCAGGS-NP, and 400 ng of the
358 minigenome p3M-5M-Luc), 200 ng of T7 polymerase (pCAGGS-T7), and 20 ng of firefly
359 luciferase (pGL) by using 4 µl of TransIT 293 (Takara, Shiga, Japan) and 140 µl of OPTI-
360 MEM (Thermo Fisher Scientific, Waltham, MA, USA). At 48 h post-transfection, the cells
361 were lysed using lysis buffer (pjk, Kleinblittersdorf, Germany), and the Renilla and firefly bioRxiv preprint doi: https://doi.org/10.1101/2021.07.13.452116; this version posted July 13, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
362 luciferase activities were measured using GloMax®-Multi+ Detection System (Promega,
363 Madison, WI, USA) with an Renilla-Juice kit and a FireFly-Juice kit (pjk, Kleinblittersdorf,
364 Germany) according to the manufacturer’s protocol. The Renilla luciferase activity was
365 normalized to the firefly luciferase activity.
366 For EBOV, HEK293T cells were grown in a 24-well plate and each well was transfected with
367 the plasmids expressing EBOV nucleocapsid components (1,000 ng of pCAGGS-L, 75 ng of
368 pCAGGS-VP30, 100 ng of pCAGGS-VP35, 100 ng of pCAGGS-NP, and 200 ng of the
369 minigenome p3E-5E-Luc), 200 ng of T7 polymerase (pCAGGS-T7), and 10 ng of Renilla
370 luciferase (pTK r.luc) by using 4 µl of TransIT 293 (Takara, Shiga, Japan) and 100 µl of OPTI-
371 MEM (Thermo Fisher Scientific, Waltham, MA, USA). At 48 h post-transfection, the cells
372 were lysed using Passive Lysis buffer (Promega Madison, WI, USA), and the firefly and
373 Renilla luciferase activities were measured using GloMax®-Multi+ Detection System
374 (Promega Madison, WI, USA) with a Luciferase Assay Reagent II and a Stop & Glo® Reagent
375 (Promega Madison, WI, USA) according to the manufacturer’s protocol. The firefly luciferase
376 activity was normalized to the Renilla luciferase activity.
377
378 Molecular dynamics simulation and binding free energy analysis
379 MD simulation was performed for three consecutive units of MARV NP-RNA complex (Fig.
380 3a), each consisting of an NP and six RNA nucleotides, using the Amber 18 package34. The
381 AMBER ff14SB35 and RNA.OL336,37 force fields were used for the simulation. The protonation
382 states of the ionizable residues were assigned a pH of 7.0 using the PDB2PQR web server38.
383 The system was solvated in a truncated octahedral box of TIP3P water molecules with a
384 distance of at least 10 Å around the proteins, and was then energy-minimized in four steps:
385 first, only hydrogen atoms; second, water and ions; third, the side chains; and finally, all atoms.
386 After energy minimization, the system was gradually heated from 0 to 310 K over 300 ps with bioRxiv preprint doi: https://doi.org/10.1101/2021.07.13.452116; this version posted July 13, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
387 harmonic restraints (with a force constant of 1.0 kcal/mol·Å2). Two additional rounds of MD
388 (50 ps each at 310K) were performed with decreasing the restraint weight from 0.5 to 0.1
389 kcal/mol·Å2. Next, 500 ns of unrestrained production run was performed in the NPT ensemble
390 at a pressure of 1.0 atm and a temperature of 300 K. All bond lengths including hydrogen atoms
391 were constrained using the SHAKE algorithm39, and a time step of 2 fs was used. A cut-off
392 radius of 10 Å was set for the nonbonded interactions. Long-range electrostatic interactions
393 were treated using the particle mesh Ewald (PME) method40. The stability of the trajectory was
394 assessed by monitoring the RMSD of the backbone Cα atoms from the initial structure of MD
395 simulation. After confirming that RMSD in the system reached equilibrium within 200 ns, the
396 trajectory extracted from the last 300 ns (i.e., 200–500 ns) was used for subsequent per-residue
397 energy analysis. Binding free energies were calculated using the script of the molecular
398 mechanics-generalized Born surface area (MM-GBSA) method in AmberTools18. The
399 conformational entropy was not considered due to the high computational cost and low
400 prediction accuracy. The salt concentration was set to 0.20 M.
401
402 Statistical analysis and reproducibility
403 Statistically significant differences in minigenome assay were evaluated using ANOVA-
404 Dunnett’s test to correct for multiple hypothesis testing with RStudio software. Statistically
405 significant differences in the length of MARV NP variants were evaluated by unpaired Student
406 t-test using RStudio software. Data are presented as the mean ± s.d. A P-value < 0.01 was
407 considered statistically significant. The data from minigenome assay and negative staining
408 analysis of MARV and EBOV NP mutants were repeated in three independent experiments.
409 Data availability bioRxiv preprint doi: https://doi.org/10.1101/2021.07.13.452116; this version posted July 13, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
410 The cryo-EM maps generated in this study have been deposited into the Electron Microscopy
411 Data Bank with accession numbers EMDB: EMD-XXXXX. The atomic coordinates reported
412 in this paper have been deposited in the Protein Data Bank with accession code PDB-ID:
413 XXXX. Raw movies have been deposited in the Electron Microscopy Public Image Archive
414 with accession codes EMPIAR-XXXXX. All other data are available from the corresponding
415 author upon request.
416 bioRxiv preprint doi: https://doi.org/10.1101/2021.07.13.452116; this version posted July 13, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
417 Reference
418 1. Kiley, M. P. et al. Filoviridae: a Taxonomic Home for Marburg and Ebola Viruses ?
419 Intervirology 18, 24–32 (1982).
420 2. Mavrakis, M., Kolesnikova, L., Schoehn, G., Becker, S. & Ruigrok, R. W. H. Morphology of
421 Marburg Virus NP–RNA. Virology 296, 300–307 (2002).
422 3. Ruigrok, R. W., Crépin, T. & Kolakofsky, D. Nucleoproteins and nucleocapsids of negative-
423 strand RNA viruses. Curr. Opin. Microbiol. 14, 504–510 (2011).
424 4. Mühlberger, E., Lötfering, B., Klenk, H.-D. & Becker, S. Three of the Four Nucleocapsid
425 Proteins of Marburg Virus, NP, VP35, and L, Are Sufficient To Mediate Replication and
426 Transcription of Marburg Virus-Specific Monocistronic Minigenomes. J. Virol. 72, 8756–8764
427 (1998).
428 5. Liu, B. et al. Structural Insight into Nucleoprotein Conformation Change Chaperoned by
429 VP35 Peptide in Marburg Virus. J. Virol. 91, (2017).
430 6. Kirchdoerfer, R. N., Abelson, D. M., Li, S., Wood, M. R. & Saphire, E. O. Assembly of the
431 Ebola Virus Nucleoprotein from a Chaperoned VP35 Complex. Cell Rep. 12, 140–149 (2015).
432 7. Bharat, T. A. M. et al. Cryo-Electron Tomography of Marburg Virus Particles and Their
433 Morphogenesis within Infected Cells. PLOS Biol. 9, e1001196 (2011).
434 8. Sugita, Y., Matsunami, H., Kawaoka, Y., Noda, T. & Wolf, M. Cryo-EM structure of the
435 Ebola virus nucleoprotein–RNA complex at 3.6 Å resolution. Nature 563, 137–140 (2018).
436 9. Gutsche, I. et al. Near-atomic cryo-EM structure of the helical measles virus nucleocapsid.
437 Science 348, 704–707 (2015).
438 10. Green, T. J., Zhang, X., Wertz, G. W. & Luo, M. Structure of the Vesicular Stomatitis Virus
439 Nucleoprotein-RNA Complex. Science 313, 357–360 (2006).
440 11. Albertini, A. A. V. et al. Crystal Structure of the Rabies Virus Nucleoprotein-RNA Complex.
441 Science 313, 360–363 (2006).
442 12. Tawar, R. G. et al. Crystal Structure of a Nucleocapsid-Like Nucleoprotein-RNA Complex of
443 Respiratory Syncytial Virus. Science 326, 1279–1283 (2009). bioRxiv preprint doi: https://doi.org/10.1101/2021.07.13.452116; this version posted July 13, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
444 13. Raymond, D. D., Piper, M. E., Gerrard, S. R., Skiniotis, G. & Smith, J. L. Phleboviruses
445 encapsidate their genomes by sequestering RNA bases. Proc. Natl. Acad. Sci. 109, 19208–19213
446 (2012).
447 14. Ariza, A. et al. Nucleocapsid protein structures from orthobunyaviruses reveal insight into
448 ribonucleoprotein architecture and RNA polymerization. Nucleic Acids Res. 41, 5912–5926 (2013).
449 15. Niu, F. et al. Structure of the Leanyer orthobunyavirus nucleoprotein-RNA complex reveals
450 unique architecture for RNA encapsidation. Proc. Natl. Acad. Sci. 110, 9054–9059 (2013).
451 16. Hastie, K. M. et al. Crystal structure of the Lassa virus nucleoprotein–RNA complex reveals a
452 gating mechanism for RNA binding. Proc. Natl. Acad. Sci. 108, 19365–19370 (2011).
453 17. Noda, T., Hagiwara, K., Sagara, H. & Kawaoka, Y. Characterization of the Ebola virus
454 nucleoprotein–RNA complex. J. Gen. Virol. 91, 1478–1483 (2010).
455 18. Leung, D. W. et al. An intrinsically disordered peptide from Ebola virus VP35 controls viral
456 RNA synthesis by modulating nucleoprotein-RNA interactions. Cell Rep. 11, 376–389 (2015).
457 19. Zhu, T. et al. Crystal Structure of the Marburg Virus Nucleoprotein Core Domain
458 Chaperoned by a VP35 Peptide Reveals a Conserved Drug Target for Filovirus. J. Virol. 91, (2017).
459 20. Wan, W. et al. Structure and assembly of the Ebola virus nucleocapsid. Nature 551, 394–397
460 (2017).
461 21. Liu, B. et al. Structural Insight into Nucleoprotein Conformation Change Chaperoned by
462 VP35 Peptide in Marburg Virus. J. Virol. 91, (2017).
463 22. Renner, M. et al. Nucleocapsid assembly in pneumoviruses is regulated by conformational
464 switching of the N protein. eLife 5, e12627 (2016).
465 23. Alayyoubi, M., Leser, G. P., Kors, C. A. & Lamb, R. A. Structure of the paramyxovirus
466 parainfluenza virus 5 nucleoprotein–RNA complex. Proc. Natl. Acad. Sci. U. S. A. 112, E1792–E1799
467 (2015).
468 24. Wenigenrath, J., Kolesnikova, L., Hoenen, T., Mittler, E. & Becker, S. Establishment and
469 application of an infectious virus-like particle system for Marburg virus. J. Gen. Virol. 91, 1325–1334
470 (2010). bioRxiv preprint doi: https://doi.org/10.1101/2021.07.13.452116; this version posted July 13, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
471 25. Rueden, C. T. et al. ImageJ2: ImageJ for the next generation of scientific image data. BMC
472 Bioinformatics 18, 529 (2017).
473 26. Fiji: an open-source platform for biological-image analysis | Nature Methods.
474 https://www.nature.com/articles/nmeth.2019.
475 27. Zivanov, J. et al. New tools for automated high-resolution cryo-EM structure determination in
476 RELION-3. eLife 7, e42166 (2018).
477 28. Zhang, K. Gctf: Real-time CTF determination and correction. J. Struct. Biol. 193, 1–12
478 (2016).
479 29. Pettersen, E. F. et al. UCSF Chimera—A visualization system for exploratory research and
480 analysis. J. Comput. Chem. 25, 1605–1612 (2004).
481 30. Emsley, P. & Cowtan, K. Coot: model-building tools for molecular graphics. Acta
482 Crystallogr. D Biol. Crystallogr. 60, 2126–2132 (2004).
483 31. Adams, P. D. et al. PHENIX : a comprehensive Python-based system for macromolecular
484 structure solution. Acta Crystallogr. D Biol. Crystallogr. 66, 213–221 (2010).
485 32. Chen, V. B. et al. MolProbity : all-atom structure validation for macromolecular
486 crystallography. Acta Crystallogr. D Biol. Crystallogr. 66, 12–21 (2010).
487 33. Goddard, T. D. et al. UCSF ChimeraX: Meeting modern challenges in visualization and
488 analysis. Protein Sci. 27, 14–25 (2018).
489 34. Case, D. A. et al. AMBER 2018; 2018. Univ. Calif. San Franc. (2018).
490 35. Maier, J. A. et al. ff14SB: improving the accuracy of protein side chain and backbone
491 parameters from ff99SB. J. Chem. Theory Comput. 11, 3696–3713 (2015).
492 36. Cornell, R. of the. Nucleic Acids Force Field Based on Reference Quantum Chemical
493 Calculations of Glycosidic Torsion Profiles. J Chem Theory Comput 7, 2886–2902 (2011).
494 37. Banás, P. et al. Performance of molecular mechanics force fields for RNA simulations:
495 stability of UUCG and GNRA hairpins. J. Chem. Theory Comput. 6, 3836–3849 (2010).
496 38. Dolinsky, T. J. et al. PDB2PQR: expanding and upgrading automated preparation of
497 biomolecular structures for molecular simulations. Nucleic Acids Res. 35, W522–W525 (2007). bioRxiv preprint doi: https://doi.org/10.1101/2021.07.13.452116; this version posted July 13, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
498 39. Ryckaert, J.-P., Ciccotti, G. & Berendsen, H. J. Numerical integration of the cartesian
499 equations of motion of a system with constraints: molecular dynamics of n-alkanes. J. Comput. Phys.
500 23, 327–341 (1977).
501 40. Darden, T., York, D. & Pedersen, L. Particle mesh Ewald: An N⋅ log (N) method for Ewald
502 sums in large systems. J. Chem. Phys. 98, 10089–10092 (1993).
503 41. Smith, N. et al. DelPhi web server v2: incorporating atomic-style geometrical figures into the
504 computational protocol. Bioinforma. Oxf. Engl. 28, 1655–1657 (2012).
505 42. Sarkar, S. et al. DelPhi Web Server: A comprehensive online suite for electrostatic
506 calculations of biological macromolecules and their complexes. Commun. Comput. Phys. 13, 269–284
507 (2013).
508 43. Larkin, M. A. et al. Clustal W and Clustal X version 2.0. Bioinforma. Oxf. Engl. 23, 2947–
509 2948 (2007).
510 44. Robert, X. & Gouet, P. Deciphering key features in protein structures with the new ENDscript
511 server. Nucleic Acids Res. 42, W320–W324 (2014).
512
513 bioRxiv preprint doi: https://doi.org/10.1101/2021.07.13.452116; this version posted July 13, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
514 Acknowledgments
515 We thank Y. Kawaoka for providing plasmids expressing EBOV viral proteins and
516 minigenome, and A. Takada for anti-MARV NP antiserum. This work was supported by a
517 Grant-in-Aid for JSPS Fellows (21J12207) (to Y.F.), a Research Grant from the Kazato
518 Research Encouragement Prize, MEXT Grant-in-Aid for Leading Initiative for Excellent
519 Young Researchers, JSPS KAKENHI Grant Number 21K07052 (to Y.S.), JSPS KAKENHI
520 Grant Number 19K16666, an AMED Research Program for Infectious Diseases Research and
521 Infrastructure (Interdisciplinary Cutting-edge Research, 20wm0325023j0001,
522 21wm0325023j0002) (to Y.T.), JSPS KAKENHI Grant Number 20H03140 (to M.I.), JSPS
523 KAKENHI Grant Numbers 20H03494, 19K22529, the JSPS Core-to-Core Program A, the
524 Advanced Research Networks, MEXT Grant-in-Aid for Scientific Research on Innovative
525 Area (19H04831), an AMED Research Program on Emerging and Re-emerging Infectious
526 Disease grants (19fk0108113, 20fk0108270h0001; 21wm0325023j0002), the JST Core
527 Research for Evolutional Science and Technology (JPMJCR20HA), the Grant for the Joint
528 Usage / Research Center on Tropical Disease, Institute of Tropical Medicine, and Nagasaki
529 University, the Daiichi Sankyo Foundation of Life Science, the Uehara Memorial Foundation
530 (to T.N.), the Grant for Joint Research Project of the Institute of Medical Science, University
531 of Tokyo, the Joint Usage/Research Center program of Institute for Frontier Life and Medical
532 Sciences Kyoto University, and the Takeda Science Foundation (to Y.S and T.N).
533 This study was also supported by the Platform Project for Supporting Drug Discovery and Life
534 Science Research (Basis for Supporting Innovative Drug Discovery and Life Science Research
535 (BINDS)) from AMED under Grant Number JP19am0101072 (support number 1632).
536
537 Author contributions bioRxiv preprint doi: https://doi.org/10.1101/2021.07.13.452116; this version posted July 13, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
538 Y.F., Y.S., and T.N. designed the study; Y.F., Y.S., Y.T., K.H., I.M., Y.M., M.N., and Y.T.
539 performed experiments; Y.F., Y.S., S. B., and T.N. wrote the manuscript with input from all
540 co-authors. All authors read and approved the final manuscript.
541
542 Competing interest declaration
543 The authors declare no competing interests.
544 bioRxiv preprint doi: https://doi.org/10.1101/2021.07.13.452116; this version posted July 13, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
Fig. 1: Overall cryo-electron microscopy (EM) structure and atomic model of Marburg
virus nucleoprotein (MARV NP) (1-395). bioRxiv preprint doi: https://doi.org/10.1101/2021.07.13.452116; this version posted July 13, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
(a) Structural layout of MARV NP sequence. The full-length NP consists of 695 amino acids
and is divided into an N-terminal arm, an NP core consisting of N- and C-terminal lobes, a
disordered region, and a C-terminal tail.
(b) The iso-electron potential surface map of an NP-RNA complex reconstruction, calculated
from 23,545 segments (contoured at 3σ above average). The RNA strand is highlighted in red.
An NP-a molecule, which is in a dark gray strand, is highlighted in light green. An NP-b
molecule in a light gray strand is highlighted in yellow.
(c) The isolated NP-RNA complex unit (3σ) and (d) our atomic model with the EM map (3σ)
in a ribbon representation, coloured the same as in Figure 1a.
(e) Isolated EM map (6σ) of RNA superimposed with the atomic model. RNA bases are
modelled as uracil.
bioRxiv preprint doi: https://doi.org/10.1101/2021.07.13.452116; this version posted July 13, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
Fig. 2: Nucleoprotein (NP)-RNA interactions.
(a) Overall structure of an RNA-bound NP molecule (PDB-ID: XXXX, coloured the same as
Figure 1a, from this study), which is superimposed with RNA-free monomeric NP (PDB-ID:
5F5M, gray)
(b) Close-up view of the RNA binding site in a cylinder representation of α-helices and β-
sheets. A secondary structure η6 appears underneath the RNA nucleotides. bioRxiv preprint doi: https://doi.org/10.1101/2021.07.13.452116; this version posted July 13, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
(c) Close-up view of the RNA binding site in a ribbon representation of α-helices and β-sheets,
coloured the same as Figure 1a.
(d) Schematic diagram of RNA recognition by positively charged residues of Marburg virus
(MARV) NP. The RNA bases, which face inward and outward of the helix, are coloured pink
and cyan, respectively.
bioRxiv preprint doi: https://doi.org/10.1101/2021.07.13.452116; this version posted July 13, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
Fig. 3: Nucleoprotein (NP)-NP interactions.
(a) Three adjacent NP molecules viewed from inside the helix. Black boxes show respective
close-up figures.
(b) NP-NP interactions between the N-terminal arm of NPn (green) and the hydrophobic pocket
of an adjacent NPn+1 (white, hydrophobic residues are shown in orange).
(c) NP-NP interaction between the N-terminal lobe of NPn (green) and an adjacent N-terminal
lobe of NPn+1 (white). Basic residues are coloured in blue, and acidic residues are coloured in
red.
(d) NP-NP interaction between two C-terminal lobe helices in NP-a. Hydrophobic residues are
shown in orange. Basic residues are coloured in blue, and acidic residues are coloured in red. bioRxiv preprint doi: https://doi.org/10.1101/2021.07.13.452116; this version posted July 13, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
Fig. 4: Site-directed mutagenesis analysis.
(a) A gallery of purified, negatively stained, helical complexes composed of C-terminally
truncated Marburg virus (MARV) and Ebola virus (EBOV) nucleoprotein (NP) mutant-RNA
complexes and the wild-type (WT) complex. Scale bars, 50 nm. bioRxiv preprint doi: https://doi.org/10.1101/2021.07.13.452116; this version posted July 13, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
(b) Transcription and replication activities of MARV and EBOV NP mutants and the wild-type
(WT), evaluated by minigenome assay. The experiments were performed in triplicates (n = 3).
The statistical significance was tested by ANOVA-Dunnett’s test to correct for multiple
hypothesis testing.
bioRxiv preprint doi: https://doi.org/10.1101/2021.07.13.452116; this version posted July 13, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
Extended Data Table 1. The calculated MM-GBSA binding free energies (kcal mol−1)
of RNA (Uracil) backbone and base
URACIL Total Backbone BASE The order of numbering is as shown in Fig. 1e 1 −5.187 −3.876 −1.311
2 −3.003 −2.743 −0.26
3 −9.744 −3.957 −5.787
4 −5.038 −3.472 −1.565
5 −9.094 −8.775 −0.318
6 −15.876 −13.984 −1.892
all −47.942 −36.807 −11.133
bioRxiv preprint doi: https://doi.org/10.1101/2021.07.13.452116; this version posted July 13, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
Extended Data Table 2. Target residues and site-specific substitutions
Mutant Location Contact region
MARV EBOV
H4A H22A N-terminal arm C-terminal loop of adjacent NP
L6E I24E N-terminal arm C-terminal hydrophobic pocket of
adjacent NP
L9E A27E N-terminal arm C-terminal hydrophobic pocket of
adjacent NP
R19A R37A N-terminal arm N-terminal lobe of adjacent NP
R339A Y357A C-terminal lobe α-helix C-terminal lobe α-helix of adjacent
NP
K142A K160A N-terminal lobe side of RNA
RNA cleft
K153A K171A N-terminal lobe side of RNA
RNA cleft
R156A R174A N-terminal lobe side of RNA
RNA cleft
K230A K248A C-terminal lobe side of RNA
RNA cleft
H292A H310A C-terminal lobe side of RNA
RNA cleft
bioRxiv preprint doi: https://doi.org/10.1101/2021.07.13.452116; this version posted July 13, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
Extended Data Table 3. Primers used in this study
Name Forward Oligo sequence (5´ to 3´) Purpose
or
Reverse
MARV H4A Forward GATTTAGCCAGTTTGTTGGAG mutation
TTGGGTAC introduction
MARV H4A Reverse CAAACTGGCTAAATCCATGGT mutation
GGCGGCG introduction
MaV_NP_L6E_F Forward CACAGTGAATTGGAGTTGGGT mutation
ACAAAACC introduction
MaV_NP_L6E_R Reverse CTCCAATTCACTGTGTAAATC mutation
CATGGTGG introduction
MaV_NP_L9E_F Forward TTGGAGGAGGGTACAAAACC mutation
CACTGCCCC introduction
MaV_NP_L9E_R Reverse TGTACCCTCCTCCAACAAACT mutation
GTGTAAATC introduction
MaV_NP_R19A_F Forward GCCCCTCATGTCGCAAATAAG mutation
AAAGTG introduction
MaV_NP_R19A_R Reverse CTTATTTGCGACATGAGGGGC mutation
AGTGGGTTTTG introduction
MaV_NP_R339A_F Forward CAAAGGGCACATGAACATCA mutation
GGAAATTC introduction
MaV_NP_R339A_R Reverse TTCATGTGCCCTTTGTAGTTTT mutation
ACTTCCG introduction bioRxiv preprint doi: https://doi.org/10.1101/2021.07.13.452116; this version posted July 13, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
MaV_NP_K142A_F Forward CTCCCAGCACTTGTCGTCGGA mutation
GACCGAGC introduction
MaV_NP_K142A_R Reverse GACAAGTGCTGGGAGGAAAA mutation
GACTGCAAAATG introduction
MaV_NP_K153A_F Forward ATCGAAGCGGCTTTAAGACA mutation
AGTAACAGTG introduction
MaV_NP_K153A_R Reverse TAAAGCCGCTTCGATACTAGC mutation
TCGGTCTC introduction
MaV_NP_R156A_F Forward GCTTTAGCACAAGTAACAGTG mutation
CATCAAG introduction
MaV_NP_R156A_R Reverse TACTTGTGCTAAAGCCTTTTC mutation
GATACTAG introduction
MaV_NP_K230A_F Forward ATCGTGGCAACAGTTCTCGAG mutation
TTCATCTTG introduction
MaV_NP_K230A_R Reverse AACTGTTGCCACGATAAGAA mutation
GTCCTGAG introduction
MaV_NP_H292A_F Forward CTCGAAGCAGGACTCTATCCT mutation
CAGCTTTC introduction
MaV_NP_H292A_R Reverse GAGTCCTGCTTCGAGGTTGTT mutation
AATCCC introduction
MaV-F-Kozac- Forward CACACAGAATTCGCCGCCACC insert PCR
EcoRI ATGGATTTACACAGTTTGTTG
GAGTTG bioRxiv preprint doi: https://doi.org/10.1101/2021.07.13.452116; this version posted July 13, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
MaV-F-Kozac- Forward CACACAGAATTCGCCGCCACC insert PCR
EcoRI_H4A ATGGATTTAGCCAGTTTGTTG
GAGTTG
MaV-F-Kozac- Forward CACACAGAATTCGCCGCCACC insert PCR
EcoRI_L6E ATGGATTTACACAGTGAATTG
GAGTTG
MaV-F-Kozac- Forward CACACAGAATTCGCCGCCACC insert PCR
EcoRI-L9E ATGGATTTACACAGTTTGTTG
GAGGAG
MaV-R395-stop- Reverse CACACAGCTAGCTCAAATATT insert PCR
NheI GTTTTCAATTTCTGCAGCG
MaV_NP_full_stop_ Reverse CACACAGCTAGCCTACAAGTT insert PCR
NheI_R CATAGCAACATGTCTCCTTTC
MaV NP seq Forward CACATACCCTAATCATTGGC sequence
EboNPR174A Forward GTTCAAGCACAAATTCAAGTA mutation
CATGCAGAGCAAGGACTGA introduction
Reverse AATTTGTGCTTGAACCTTCTC mutation
AAGGCAAGCCTTTTCTCCT introduction
EboNPK160A Forward CTTCCGGCATTGGTAGTAGGA mutation
GAAAAGGCTTGCCTTGAGA introduction
Reverse TACCAATGCCGGAAGGAATA mutation
GACTTGCAAAGGAGAGAAAC introduction
TG bioRxiv preprint doi: https://doi.org/10.1101/2021.07.13.452116; this version posted July 13, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
EboNPK171A Forward CTTGAGGCAGTTCAAAGGCA mutation
AATTCAAGTACATGCAGAGC introduction
AAGGACTGATAC
Reverse TTGAACTGCCTCAAGGCAAGC mutation
CTTTTCTCCTACTACCAA introduction
EboNPI24E Forward CACAAGGAGTTGACAGCAGG mutation
TCTGTCCGTTCAACAGGGGA introduction
Reverse TGTCAACTCCTTGTGGTAATC mutation
CATGTCAGATTCAGTGAGA introduction
EboNPR37A Forward ATTGTTGCACAAAGAGTCATC mutation
CCAGTGTATCAAGTAAACA introduction
Reverse TCTTTGTGCAACAATCCCCTG mutation
TTGAACGGACAGA introduction
EboNPH22A Forward GATTACGCAAAGATCTTGACA mutation
GCAGGTCTGTC introduction
Reverse GATCTTTGCGTAATCCATGTC mutation
AGATTCAGTGAGACTCG introduction
EboNPA27E Forward TTGACAGAGGGTCTGTCCGTT mutation
CAACAGGGGAT introduction
Reverse CAGACCCTCTGTCAAGATCTT mutation
GTGGTAATCCATGTCAGA introduction
EboNPY357A Forward CCAACAAGCAGCAGAGTCTC mutation
GCGAACTTGACCATCTTG introduction
Reverse ACTCTGCTGCTTGTTGGAGTT mutation
GCTTCTCAGCCTCAGT introduction bioRxiv preprint doi: https://doi.org/10.1101/2021.07.13.452116; this version posted July 13, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
EboNPK248A Forward ATTGTCGCAACAGTACTTGAT mutation
CATATCCTACAAAAGACAGA introduction
ACGAGG
Reverse TACTGTTGCGACAATCAATAA mutation
GCCTGAAAAACGAGCTTGAG introduction
C
EboNPH310A Forward CTTGAGGCAGGTCTTTTCCCT mutation
CAACTATCGGCAATTGC introduction
Reverse AAGACCTGCCTCAAGATTATT mutation
TACTCCAGAAAGGTTCAAAA introduction
GTCGGGCGA
EcoR1EboNPF Forward CACACAGAATTCGCCGCCACC insert PCR
ATGGATTCTCGTCCTCAGAAA
ATCTGGATGG
Nhe1EboNPR Reverse CACACAGCTAGCTCAAGCGTC insert PCR
ATCGTCGTCCTTGTAGTCCT
EboNP425F Forward CAACTGAAGCTAATGCCGGTC sequence
A
bioRxiv preprint doi: https://doi.org/10.1101/2021.07.13.452116; this version posted July 13, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
Extended Data Table 4. Cryo-EM data collection, refinement and validation statistics
MARV NP-RNA
complex
(EMDB-XXXXX)
(PDB-ID: XXXX)
Data collection and
processing
Magnification 59,000
Voltage (kV) 300
Electron exposure (e–/Å2) 30
Defocus range (μm) -0.5 ~ -2.5
Pixel size (Å) 1.13
Symmetry imposed C1 helical
Helical rise (Å) 4.23
Helical rotation (°) 11.8052
Initial particle images (no.) 30,668
Final particle images (no.) 23,545
Map resolution (Å) 3.1
FSC threshold 0.143 bioRxiv preprint doi: https://doi.org/10.1101/2021.07.13.452116; this version posted July 13, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
Map resolution range (Å) 2.98 ~ 4.35
Refinement
Initial model used (PDB-ID) 5Z9W
Model resolution (Å) 3.1
FSC threshold 0.143
Map sharpening B factor (Å2) -100
Model composition
Non-hydrogen atoms 6450
Protein residues 788
Nucleotide 12
B factors (Å2) min/max/mean
Protein 6.92/83.43/26.05
Nucleotide 13.55/25.68/20.47
R.m.s. deviations
Bond lengths (Å) 0.009
Bond angles (°) 1.124
Validation
MolProbity score 1.52
Clashscore 4.98
Poor rotamers (%) 0.00 bioRxiv preprint doi: https://doi.org/10.1101/2021.07.13.452116; this version posted July 13, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
Ramachandran plot
Favored (%) 96.17
Allowed (%) 3.83
Disallowed (%) 0
bioRxiv preprint doi: https://doi.org/10.1101/2021.07.13.452116; this version posted July 13, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
Extended Data Fig. 1: Cryo-electron microscopy (EM) image and single particle
analysis of Marburg virus (MARV) nucleoprotein (NP)-RNA complex.
(a) A representative cryo-electron micrograph of purified MARV NP-RNA complex.
(b) 2D class averages of MARV NP-RNA complex, selected for the following 3D
classification.
(c) Local resolution map of the asymmetric subunits, NP-a and NP-b, in a MARV NP-RNA
complex. bioRxiv preprint doi: https://doi.org/10.1101/2021.07.13.452116; this version posted July 13, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
(d) Fourier Shell Correlation (FSC) curves from independently refined datasets. Red curve
for overall map resolution of 3.1 angstrom (FSC = 0.143), which is validated by the cross-
FSC between cryo-EM map and model-generated map (black, 3.4 angstrom at FSC = 0.5).
(e) Overall structure of MARV RNA-bound NP-a structure (green) superimposed with NP-
b structure (yellow).
(f) RNA-bound NP-a structure (coloured in light green) fitted into authentic MARV
nucleocapsid EM map (EMD-3875, 2.5σ).
bioRxiv preprint doi: https://doi.org/10.1101/2021.07.13.452116; this version posted July 13, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
Extended Data Fig. 2: Comparison between Ebola virus (EBOV) and Marburg virus
(MARV) nucleoprotein (NP)-RNA complex unit structures.
Overall structure of MARV NP-RNA complex (PDB-ID: XXXX, coloured as shown in Fig.
1a, from this study) is superimposed with EBOV NP-RNA complex (PDB-ID: 5Z9W, gray).
bioRxiv preprint doi: https://doi.org/10.1101/2021.07.13.452116; this version posted July 13, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
Extended Data Fig. 3: Marburg virus (MARV) full length nucleoprotein (NP)-RNA
complexes.
An image of the negatively stained MARV full length NP-RNA complexes purified from
Expi 293F cells. Red arrow indicates a double-helical structure.
bioRxiv preprint doi: https://doi.org/10.1101/2021.07.13.452116; this version posted July 13, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
Extended Data Fig. 4: Electrostatic surface of Marburg virus (MARV) nucleoprotein
(NP)-a.
Electrostatic surface potential was calculated with the atomic model of our NP-a monomer
using Delphi web-server41,42. Scale ranging from -5 (red) to +5 kcal mol-1 e-1 (blue).
bioRxiv preprint doi: https://doi.org/10.1101/2021.07.13.452116; this version posted July 13, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
Extended Data Fig. 5: Secondary structure assignment of Marburg virus (MARV)
nucleoprotein (NP)-a aligned with filovirus NPs.
Secondary structure elements were annotated by phenix.secondary_structure_restraints31.
The amino acid sequences of filovirus NP (MARV: Marburg marburgvirus, ZEBOV: Zaire
ebolavirus, SUDV: Sudan ebolavirus, RESTV: Reston ebolavirus, BDBV: Bundibugyo
ebolavirus, TAFV: Taï Forest ebolavirus, BOMV: Bombali ebolavirus, LLOV: Lloviu bioRxiv preprint doi: https://doi.org/10.1101/2021.07.13.452116; this version posted July 13, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
cuevavirus) were aligned with ClustalW2.143 and visualized with ESPript 3.044. The
numbering of the alignments is assigned based on the MARV NP sequence. Conserved
residues are highlighted in red and similar residues are coloured in red.
bioRxiv preprint doi: https://doi.org/10.1101/2021.07.13.452116; this version posted July 13, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
Extended Data Fig. 6: Impact of L6E and R19A mutations on the length of Marburg
virus (MARV) nucleoproteins (NPs).
Length of MARV C-terminal truncated L6E and R19A mutant NPs and the C-terminal
truncated wild-type (WT), evaluated by negative staining. About sixty helices were examined
and the statistical significance was tested by student’s t-test. (**** : p < 0.0001, ** : p < 0.01)
bioRxiv preprint doi: https://doi.org/10.1101/2021.07.13.452116; this version posted July 13, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
Extended Data Fig. 7: Marburg virus (MARV) nucleoprotein (NP)-RNA complexes
purified from HEK 293T cells.
(a) An image of the negatively stained C-terminal truncated MARV NP (1-395)-RNA
complexes purified from HEK 293T cells. Red arrow indicates a double-helical structure.
(b) A cryo-electron micrography (EM) image of the C-terminal truncated MARV NP (1-
395)-RNA complexes purified from HEK 293T cells. Red arrow indicates a double-helical
structure.
bioRxiv preprint doi: https://doi.org/10.1101/2021.07.13.452116; this version posted July 13, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
Extended Data Fig. 8: List of amino acids located near RNA strand and electrostatic
surfaces on nucleoproteins of the family Paramyxoviridae.
Amino acids located within 3.5 Å distance from RNA strand are listed. Basic residues are
highlighted in blue. None of the basic amino acids in the C-terminal lobe were oriented
toward the phosphate group of the RNA (i.e., unlikely to be responsible for interaction with
RNA). Electrostatic surfaces were calculated from the previously reported atomic models
using Delphi web-server41,42. Scale ranging from -5 (red) to +5 kcal mol-1 e-1 (blue).
bioRxiv preprint doi: https://doi.org/10.1101/2021.07.13.452116; this version posted July 13, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
Extended Data Fig. 9: Common binding pocket in VP35 peptide and the N-terminal
arm of nucleoprotein (NP).
(a) Comparison of the Marburg virus (MARV) nucleoprotein (NP) structures around the
binding pocket of VP35 peptide and the NP N-terminal arm. The VP35-bound RNA-free
monomeric state of the NP are coloured in blue and purple (PDB-ID: 5F5O), and orange and
red (PDB-ID: 5XSQ). The RNA-bound oligomeric state was coloured in gray and yellow
(PDB-ID: XXXX in this study).
(b) Molecular lipophilicity potential map of our structure coloured in blue (hydrophilic) to
orange (hydrophobic). The binding pocket of VP35 peptide and the N-terminal arm is
hydrophobic.
(c) Close-up view of the N-terminal arm of its adjacent NP (shown in yellow, PDB-ID:
XXXX in this study).
(d) Close-up view of the VP35 peptide (shown in purple, PDB-ID: 5F5O) aligned relative to
the whole NP structure.