bioRxiv preprint doi: https://doi.org/10.1101/2021.07.13.452116; this version posted July 13, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

1 Structural insight into Marburg virus nucleoprotein-RNA complex formation

2

3 Yoko Fujita-Fujiharu1,2,3, Yukihiko Sugita1,2,4, Yuki Takamatsu1,#, Kazuya Houri1,2,3, Manabu

4 Igarashi5, Yukiko Muramoto1,2,3, Masahiro Nakano1,2,3, Yugo Tsunoda1, Stephan Becker6,7,

5 Takeshi Noda1,2,3*

6

7 1 Laboratory of Ultrastructural Virology, Institute for Frontier Life and Medical Sciences,

8 Kyoto University, 53 Shogoin Kawahara-cho, Sakyo-ku, Kyoto 606-8507, Japan

9 2 Laboratory of Ultrastructural Virology, Graduate School of Biostudies, Kyoto University, 53

10 Shogoin Kawahara-cho, Sakyo-ku, Kyoto 606-8507, Japan

11 3 CREST, Japan Science and Technology Agency, 4-1-8 Honcho, Kawaguchi, Saitama 332-

12 0012, Japan

13 4 Hakubi Center for Advanced Research, Kyoto University, Kyoto, Japan.

14 5 Division of Epidemiology, International Institute for Zoonosis Control, Hokkaido University,

15 Sapporo 001-0020, Japan

16 6 Institute of Virology, University of Marburg, 35043 Marburg, Germany.

17 7 German Center for Infection Research (DZIF), Marburg-Gießen-Langen Site, University of

18 Marburg, 35043 Marburg, Germany.

19

20 # present address: Department of Virology I, National Institute of Infectious Diseases, Gakuen

21 4-7-1, Musashimurayama-city, Tokyo 208-0011, Japan

22 *correspondence to: [email protected]

23

24 bioRxiv preprint doi: https://doi.org/10.1101/2021.07.13.452116; this version posted July 13, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

25 Abstract

26 The nucleoprotein (NP) of Marburg virus (MARV), a close relative of Ebola virus (EBOV),

27 encapsidates the single-stranded, negative-sense viral genomic RNA (vRNA) to form the

28 helical NP-RNA complex. The NP-RNA complex serves as a scaffold for the assembly of the

29 nucleocapsid that is responsible for viral RNA synthesis. Although appropriate interactions

30 among NPs and RNA are required for the formation of nucleocapsid, the structural basis of the

31 helical assembly remains largely elusive. Here, we show the structure of the MARV NP-RNA

32 complex determined using cryo-electron microscopy at a resolution of 3.1 Å. The structures of

33 the asymmetric unit, a complex of an NP and six RNA nucleotides, was very similar to that of

34 EBOV, suggesting that both viruses share common mechanisms for the nucleocapsid formation.

35 Structure-based mutational analysis of both MARV and EBOV NPs identified key residues for

36 the viral RNA synthesis as well as the helical assembly. Importantly, most of the residues

37 identified were conserved in both viruses. These findings provide a structural basis for

38 understanding the nucleocapsid formation and contribute to the development of novel antivirals

39 against MARV and EBOV.

40 bioRxiv preprint doi: https://doi.org/10.1101/2021.07.13.452116; this version posted July 13, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

41 Introduction

42 Marburg virus (MARV), a close relative of the Ebola virus (EBOV), belongs to the

43 family Filoviridae and possesses a 19.1 kb non-segmented, single-stranded, negative-sense

44 RNA genome (vRNA)1. Similar to EBOV, it causes severe haemorrhagic fever in humans and

45 non-human primates with a high mortality rate, which continuously poses a threat to public

46 health. However, no vaccine or antivirals against MARV diseases have yet been licensed. The

47 nucleoprotein (NP) of MARV consisting of 695 amino acids encapsidates the vRNA to form a

48 left-handed helical complex2. The helical NP-RNA complex acts as a scaffold for the formation

49 of a nucleocapsid, which is responsible for the transcription and replication of the vRNA3,4.

50 Since the NP-RNA complex is the structural unit that constitutes the helical core structure of

51 the nucleocapsid, for which appropriate interactions between NP molecules and RNA

52 nucleotides are required, solving the NP-RNA unit structure is essential for understanding the

53 mechanisms of the nucleocapsid formation.

54 The core structure of RNA-free monomeric MARV NP, spanning residues 21 to 373,

55 features a typical bilobed fold consisting of N-terminal and C-terminal lobes, as determined

56 using X-ray crystallography at 2.9 Å resolution5, which is similar to that of the EBOV NP core

57 structure6. These two lobes are connected by a flexible hinge region and are considered to

58 clamp the RNA strand between the two lobes by electrostatic interactions. The overall three-

59 dimensional architecture of the NP-RNA complex, in which the cellular RNA was coated with

60 truncated NP comprising its N-terminal 390 residues, was visualized using cryo-electron

61 tomography (cryo-ET), showing that the helical NP-RNA complex constitutes the innermost

62 layer of the MARV nucleocapsid7. However, since the resolution was limited to the scale of a

63 few nanometres, the structural basis for the assembly of NP and RNA into the nucleocapsid

64 core largely remains unknown. bioRxiv preprint doi: https://doi.org/10.1101/2021.07.13.452116; this version posted July 13, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

65 Using single-particle cryo-electron microscopy (cryo-EM), we recently determined the

66 structure of EBOV NP-RNA complex at 3.6 Å resolution8. Here, we employed the same

67 technique to determine the cryo-EM structure of the MARV NP-RNA complex. In addition,

68 we aimed to identify key residues important for NP-RNA and NP-NP interactions and viral

69 RNA synthesis using structure-based mutagenesis.

70 bioRxiv preprint doi: https://doi.org/10.1101/2021.07.13.452116; this version posted July 13, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

71 Results

72 Overall structure of the MARV NP-RNA complex and its structural units

73 Similar to EBOV NP, MARV NP possesses the structural domains: an N-terminal arm,

74 an NP core composed of N- and C-terminal lobes, a disordered linker, and a C-terminal tail

75 (Fig. 1a). To determine the structure of MARV NP associated with RNA, we expressed a C-

76 terminally truncated NP which encompasses residues 1 to 395 containing the N-terminal arm

77 and the NP core domains in mammalian cells to constitute the helical complex7. Purified

78 complexes were imaged using cryo-EM (Extended Data Fig. 1a), and the structure was

79 determined by single-particle analysis at 3.1 Å resolution (Fig. 1b, Extended Data Fig. 1b-d).

80 The constituted complex shows a left-handed double helical structure (Fig. 1b, Extended Data

81 Fig. 1b), in which respective strands have similar dimensions: each strand contains single-

82 strand RNA, and 30.50 NP subunits per and a 128.99 Å helical pitch (rise = 4.23 Å, rotation

83 = 11.8052) with the outer and inner diameters of approximately 330 Å and 255 Å, respectively.

84 In the cryo-EM map, the density of the nucleotide bases and most of the side chains are clearly

85 visible, which enabled us to build an atomic model unambiguously showing six RNA

86 nucleotides enclosed by the N- and C-terminal lobes of NP (Fig. 1c-e). The respective

87 asymmetric NP subunits, NP-a and NP-b, show very similar conformations as shown by a

88 0.592-Å backbone RMSD (Root Mean Square Deviation) (Fig. 1c, Extended Data Fig. 1e).

89 The models of these asymmetric subunits also fit into the NP region within a low resolution

90 cryo-ET reconstruction of the MARV nucleocapsid20 (Extended Data Fig. 1f). α16

91 confines the RNA strand into a cleft between the two lobes (Fig. 1d). The N-terminal arm

92 (residues 1 to 19) protrudes sideway from the N-terminal lobe and has a short 310-helix η1 (Fig.

93 1d), which is associated with a neighbouring NP molecule within the helical complex. Overall,

94 the MARV NP-RNA complex shares common structural features to that of the EBOV NP-

95 RNA unit8 (Extended Data Fig. 2), although the MARV NP-RNA complex reconstituted here bioRxiv preprint doi: https://doi.org/10.1101/2021.07.13.452116; this version posted July 13, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

96 is a double helix unlike the EBOV NP-RNA complex. Expression of the full-length wild-type

97 MARV NP also showed the formation of a double helical structure (Extended Data Fig. 3).

98

99 residues of MARV NP involved in interactions with RNA

100 Comparison of our atomic model of the RNA-bound MARV NP with an RNA-free

101 monomeric MARV NP structure5 provides a possible mechanism of RNA encapsidation (Fig.

102 2a). A structural alignment of our RNA-bound and the previously reported RNA-free NP5

103 molecules exhibits a 0.885-Å backbone RMSD, suggesting no drastic conformational changes

104 in the overall structure. Local conformational changes are observed mainly in the C-terminal

105 lobe probably to clamp the RNA strand. In particular, short 310-helix η6, which is disordered

106 in the RNA-free state, appears underneath the RNA strand possibly to make hydrogen bonds

107 with the RNA (Fig. 2a, b). The C-terminal helix α15 is shifted outward of the helical complex

108 so that the C-terminal helices α16 on the NP strands can be aligned like a zipper to capture the

109 RNA in between the two lobes (Fig. 2a). Indeed, the electrostatic surface potential of the NP

110 shows that the α15 and α16 helices confine the RNA in the positively charged cleft between

111 the N- and C-terminal lobes (Extended Data Fig. 4).

112 In the positively charged cleft between the N- and C-terminal lobes, six RNA

113 nucleotides are encapsidated with a ‘3-bases-inward, 3-bases-outward’ configuration (Fig. 2c).

114 Several polar residues are present within hydrogen-bonding distance of the RNA backbone and

115 likely contact with the RNA bases within the cleft (Fig. 2c). In particular, the interactions

116 between NP and RNA are predominantly electrostatic due to positively charged residues such

117 as K142, K153, R156, and H292 with their side chains pointing to the phosphate backbone of

118 the RNA (Fig. 2c, d). In contrast, the side chain of K230, which is reported not to be essential

119 for the RNA binding5, is oriented toward the RNA base (Fig. 2c, d). To further understand the

120 mechanism of RNA binding, molecular dynamics (MD) simulation and binding free energy bioRxiv preprint doi: https://doi.org/10.1101/2021.07.13.452116; this version posted July 13, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

121 calculations were performed. The binding free energy of the six RNA bases (all of them

122 modelled as uracil) within a single NP molecule (Extended Data Table 1) indicated that the

123 total contribution of the RNA backbone in the NP-binding energy was larger than that of the

124 RNA base, and the sixth uracil (U6) exhibited a relatively high binding energy, which was

125 partially attributed to its interaction with the helix α6 of the adjacent NP. Also, the RNA

126 backbone-specific interactions with the NP are consistent with a sequence-independent manner

127 of the viral RNA encapsidation, similar to those of other negative-strand RNA viruses8–16.

128 Multiple sequence alignment of filovirus NPs shows that most of the polar residues in

129 the RNA-binding cleft are highly conserved among the family Filoviridae (Extended Data Fig.

130 5), suggesting that those residues play important roles in the RNA binding and/or the vRNA

131 synthesis.

132

133 Amino acid residues involved in MARV NP-NP interactions

134 The N-terminal arm is known to be essential for NP oligomerization of

135 filoviruses5,6,8,17–20. However, the structural basis for the MARV NP-NP interactions remains

136 elusive. Our atomic model shows that the N-terminal arm of any arbitrary NP molecule (NPn)

137 is associated with an adjacent NPn+1 inside the helical complex (Fig. 3a). Similar to EBOV NP,

138 the N-terminal arm of MARV NPn occupies a hydrophobic pocket located on the C-terminal

139 lobe of NPn+1 via L6 and L9 (Fig. 3b). Because the hydrophobic pocket of MARV NP is

140 reportedly bound by N-terminal region of VP35, which blocks the NP oligomerization and

141 RNA encapsidation5,19, the interaction between the MARV N-terminal arm and the

142 hydrophobic pocket is considered to be essential for the formation of the helical complex along

143 with RNA. Different from the structures of an RNA-free form of the NP bound to VP35

19,21 144 peptide , H4 in the N-terminal arm of NPn is tucked between K239 on the 240 loop and the

145 hydrophobic pocket of NPn+1 (Fig. 3b). A loop 202-208 on the N-terminal lobe of NPn+1 bioRxiv preprint doi: https://doi.org/10.1101/2021.07.13.452116; this version posted July 13, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

146 slightly moves downward to make a new groove, which allows the N-terminal arm of NPn to

147 enter this space (Fig. 3c). This local conformational change is probably caused by electrostatic

148 interactions between D208 and D211 on NPn+1 and R19 on the N-terminal arm of NPn (Fig.

149 3c).

150 In addition to the N-terminal arm and the N-terminal lobe, C-terminal helices of the NP

151 are likely involved in interactions with a neighbouring NP. In our model, helix α15 of NPn is

152 associated with α16 of the adjacent NPn+1, which is mediated by hydrophobic interactions

153 between a hydrophobic patch composed of L249, L325, and I336 on NPn and a hydrophobic

154 region composed of I357, F361, and I368 on NPn+1 (Fig. 3d). In addition, R339 on α15 of NPn

155 points towards an acidic amino acid-rich region on the α16 of adjacent NPn+1, suggesting that

156 electrostatic interactions also occur between the α15 and α16 helices. This interaction might be

157 more important for NP oligomerization than just for stably clamping the RNA strand within

158 the helical complex as suggested previously5. Importantly, the residues described above, H4,

159 L6, L9, R19, R339, and hydrophobic residues on helices α15 and α16, are highly conserved

160 among the family Filoviridae (Extended Data Fig. 5), suggesting that interactions via the N-

161 terminal arm and the C-terminal helices are important for NP oligomerization of filoviruses.

162

163 Identification of amino acid residues important for filovirus RNA synthesis

164 Having identified the potential interactions among MARV NPs and RNA as described

165 above, we performed structure-based mutational analysis on MARV NP and examined the

166 impact on the helical NP-RNA complex formation by negative staining EM. In parallel, we

167 also assessed the corresponding residues of EBOV NP to understand the structural conservation

168 among filoviruses. The NP mutants tested are listed in Extended Data Table 2. Among the

169 residues potentially involved in MARV NP-NP interactions, the H4A, L9E, R339A mutants

170 formed similar helical NP-RNA complexes to wild-type MARV NP (1-395) (Fig. 4a). bioRxiv preprint doi: https://doi.org/10.1101/2021.07.13.452116; this version posted July 13, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

171 Similarly, the corresponding EBOV NP mutants, H22A, A27E, and Y357A formed helical NP-

172 RNA complexes, which were morphologically indistinguishable from that of wild-type EBOV

173 NP (1-450) (Fig. 4a). The MARV L6E mutant rarely showed helical complex formation, and

174 when it formed, the helical complexes were significantly shorter than those of the wild-type

175 (Extended Data Fig. 6), while the corresponding EBOV I24E mutant formed considerably

176 loose helices, suggesting the involvement of the interaction between N-terminal arm and the

177 hydrophobic pocket for appropriate NP oligomerization. Both MARV R19A mutant and its

178 corresponding EBOV mutant, R37A, formed straighter and longer helical complexes,

179 suggesting that interactions between the N-terminal arm and helix α10 affect rigidity of the

180 helical complex (Extended Data Fig. 6). Among the residues potentially responsible for RNA

181 binding, MARV R156A, K230A, and H292A, and the corresponding residues of EBOV NP

182 R174A, K248A, and H310A formed helical NP-RNA complexes. In contrast, MARV K142A

183 and K153A as well as corresponding K160A and K171A in EBOV NP were not able to form

184 regular helical complexes, suggesting their importance for the RNA binding property and for

185 appropriate NP oligomerization.

186 Finally, to determine whether the amino acid residues that are responsible for

187 appropriate helical NP-RNA complex formation are required for exerting nucleocapsid

188 functions, we conducted minigenome assay using full-length MARV and EBOV NP mutants

189 and assessed the impact of respective mutations on the transcription and replication activity.

190 All of the NP mutants, which showed no morphological changes in the helical complexes (Fig.

191 4a), showed comparable levels of the transcription and replication activity to wild-type NP (Fig.

192 4b). In contrast, MARV L6E and R19A and the corresponding I24E, R37A of EBOV NP,

193 which are involved in lateral NP-NP interaction, showed a significant reduction in their activity.

194 In addition, MARV NP K142A and K153A and the corresponding K160A and K171A of

195 EBOV NP, which are involved in the RNA binding, lost their activity to less than 1% of the bioRxiv preprint doi: https://doi.org/10.1101/2021.07.13.452116; this version posted July 13, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

196 wild-type. These results demonstrated that the amino acid residues, which are involved in

197 appropriate helical NP-RNA complex formation, are essential for functional nucleocapsid

198 formation.

199 bioRxiv preprint doi: https://doi.org/10.1101/2021.07.13.452116; this version posted July 13, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

200 Discussion

201 MARV NP is an RNA-binding protein and a major component of the nucleocapsid.

202 Here, we determined the structure of MARV NP-RNA complex at 3.1 Å resolution using

203 single-particle cryo-EM. Despite substantial difference in the amino acid sequence, MARV NP

204 shared common structural features with EBOV NP. Each MARV NP encapsidated six

205 nucleotides in between the N- and C-terminal lobes in a sequence-independent manner. The N-

206 terminal arm is responsible for interaction with an adjacent NP. We also identified that amino

207 acid residues of MARV and EBOV NPs critical for functional nucleocapsid formation as well

208 as helical NP-RNA complex formation were mostly conserved in filoviruses.

209 Different from EBOV NP (1-450)-RNA complex, the MARV NP (1-395)-RNA

210 complex purified from NP-expressing cells reproducibly constituted double-helical structures

211 (Extended Data Fig. 7), although we used the same purification procedures for both8. Since the

212 MARV nucleocapsid and its core structure is thought to be a single helix as shown by cryo-

213 ET7,20, the double-helical structure reconstituted in this study may not faithfully recapitulate

214 the structure in the MARV nucleocapsid core. Nevertheless, the asymmetric subunits, NP-a

215 and NP-b monomers, showed the same domain organization and structural features with NP

216 molecules on the cryo-ET reconstruction of MARV20 (Extended Data Fig. 1f). Furthermore,

217 introduction of mutations into potential interaction regions among MARV NPs and RNA

218 revealed changes in the NP-RNA helical formation and the nucleocapsid function (Fig. 4).

219 These results indicate that not only the atomic coordinates of NP and RNA but also the

220 interaction modes between lateral NPs in the helix and between NP and RNA nucleotides

221 determined in this study reflect those in authentic MARV and in the virus-infected cells. Our

222 structure also reveals flexible and yet stable structural features of the MARV NP to potentially

223 form the metastable assemblies with RNA. bioRxiv preprint doi: https://doi.org/10.1101/2021.07.13.452116; this version posted July 13, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

224 It is believed that, in many RNA viruses, viral nucleoproteins bind to viral genomic

225 RNA in a sequence-independent manner8–16 which was also observed in our study (side chains

226 of positively charged residues, K142, K153, R156, and H292, in the RNA-binding cleft pointed

227 to the phosphate backbone of the RNA, but not to the RNA bases). The MD simulation

228 followed by binding free energy analysis revealed that the energy contribution of RNA

229 backbone to NP-binding was higher than that of the RNA bases. These results highlighted the

230 importance of the interactions between RNA backbone and NP for RNA binding, which

231 probably assist in the incorporation of viral RNA in a sequence-independent manner to form

232 the helical NP-RNA complex.

233 Structure-based mutational analysis identified amino acid residues which are critical

234 for helical complex formation and nucleocapsid function of EBOV and MARV. Regarding the

235 RNA recognition, MARV K142 and K153 (corresponding to EBOV K160 and K171,

236 respectively) likely play more important roles in functional nucleocapsid formation, compared

237 to K230 and H292 in the C-terminal lobe. Interestingly, paramyxovirus and pneumovirus

238 nucleoproteins, which encapsidate vRNA in the positively charged cleft composed between the

239 N- and C-terminal lobes, contain more basic amino acid residues on the RNA-binding cleft of

240 N-terminal lobe than that of C-terminal lobe9,12,22,23 (Extended Data Fig. 8). These findings

241 indicate that robust association via N-terminal lobe and subsequent interaction of C-terminal

242 lobe with RNA nucleotides would be important for nucleocapsid assembly and vRNA

243 transcription and replication.

244 In our MARV NP-RNA complex structure, the N-terminal arm fits into the hydrophobic

245 pocket of the adjacent NP (Fig. 3b, Extended Data Fig. 9), where the hydrophobic interaction

246 plays a key role in tethering adjacent two NP molecules. Importantly, the amino acids

247 comprising this hydrophobic pocket, such as F223, V229, L233, L237, V244, L266, L269,

248 A270, G273, A276, P277, and F278, are conserved among the family Filoviridae (Extended bioRxiv preprint doi: https://doi.org/10.1101/2021.07.13.452116; this version posted July 13, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

249 Data Fig. 5). The hydrophobic pocket is also occupied by N-terminal region of MARV

250 VP3519,21 (Extended Data Fig. 9). Once VP35 binds to the hydrophobic pocket, the NP

251 molecule switches to an “open-state” structure and loses its ability to encapsidate vRNAs. Thus,

252 MARV NP would oligomerize by release of VP35 N-terminus from the hydrophobic pocket

253 and NP binding to vRNA in turn, as is the case with EBOV6,8,18. Although the N-terminal arm

254 of MARV is 18-amino acids shorter than that of the Ebola and Cueva viruses (Extended Data

255 Fig. 5), the competitive interactions with the hydrophobic pocket between the N-terminal arm

256 and the VP35 N-terminus is probably a common NP oligomerization mechanism in the family

257 Filoviridae6,8,18.

258 In conclusion, we determined a high-resolution structure of MARV NP-RNA complex

259 using cryo-EM and identified critical residues for functional nucleocapsid formation. The

260 results advance our understanding of the mechanisms of the nucleocapsid formation and

261 contribute to the development of antivirals broadly effective for filoviruses. bioRxiv preprint doi: https://doi.org/10.1101/2021.07.13.452116; this version posted July 13, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

262 Material & Methods

263 Cells

264 Human embryonic kidney 293 Freestyle (HEK293F) cells were maintained in Expi293

265 Expression Medium (Gibco, Waltham, MA, USA), at 37 °C in an 8% CO2 atmosphere. Human

266 embryonic kidney (293T) cells were maintained in the Dulbecco’s Modified Eagle Medium

267 (D6046, Merck, Darmstadt, Germany) supplemented with 10% fetal bovine serum (FB-1365,

268 Biosera, France).

269

270 Plasmids

271 All MARV (Musoki strain) plasmids encoding wild-type (pCAGGS-L, pCAGGS-

272 VP30, pCAGGS-VP35, pCAGGS-NP), T7-driven MARV minigenome encoding Renilla

273 luciferase (p3M-5M-Luc), and T7 DNA-dependent RNA polymerase (pCAGGS-T7), are

274 described elsewhere 24. The pCAGGS-NP (1-395) that expresses residues 1-395 of MARV NP

275 was constructed from pCAGGS-NP by PCR using primers with digestion enzyme sites. Then,

276 the PCR product was cloned into digested pCAGGS vector with a Kozak sequence present

277 upstream of the initiation codon. The NP as well as NP (1-395) mutant constructs were

278 generated by PCR-based site-directed mutagenesis. The primers used in this study are listed in

279 Extended Data Table 3. All constructs were sequenced to conform that unwanted mutations

280 were not present.

281

282 Expression and purification of NP-RNA complex

283 For MARV, HEK293F cells grown in 10 ml of medium (3.0 x 106 cells/ml) were transfected

284 with 6 µg of pCAGGS-NP (1-395) using Polyethyleneimine MAX (Polysciences, Warrington,

285 PA, USA). Three days post-transfection, the cells were collected and lysed with 0.1 % Nonidet

286 P-40 substitute (Wako, Osaka, Japan) in Tris-HCl buffer (10 mM Tris-HCl (pH 8.0), 150 mM bioRxiv preprint doi: https://doi.org/10.1101/2021.07.13.452116; this version posted July 13, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

287 NaCl, 1 mM EDTA) including a cOmplete Protease Inhibitor (Roche, Base, Switzerland) and

288 10 mM Ribonucleoside-Vanadyl Complex (NEB, Ipswich, MA, USA) on ice. The lysate was

289 centrifuged at 10,000 g at 4 ˚C for 10 min to remove insoluble substances, and the supernatant

290 was subjected to a discontinuous CsCl gradient ultracentrifugation at 246,100 g at 4 ˚C for 2 h.

291 Then, fractions containing the NP-RNA complex were collected and ultracentrifuged at

292 246,100 g at 4 ˚C for 15 min. The pellet was suspended in Tris-HCl buffer.

293 For EBOV, HEK293T cells grown in 10 ml of medium (3.0 x 105 cells/ml) were transfected

294 with 10 µg of pCAGGS-NP (1-450) using TransIT-293 Reagent (Takara, Shiga, Japan).

295 Three days post-transfection, NP-RNA complexes were purified as described above and

296 suspended in Tris-HCl buffer.

297

298 Negative staining EM

299 Five microliters of sample was applied to a glow-discharged copper grid coated with carbon

300 and negatively stained with 2% uranyl acetate. Images were obtained using transmission

301 electron microscopy (HT-7700, Hitachi High Technology, Tokyo, Japan) operating at 80 kV

302 with an XR81-B CCD camera. The length distribution was measured for about sixty helices

303 by using ImageJ software25,26.

304

305 Cryo-EM specimen preparation

306 The 2.5 µl sample of purified NP-RNA complex solution was applied onto glow-discharged

307 200-mesh R1.2/1.3 Cu grids (Quantifoil, Jena, Germany). The grids were blotted and rapidly

308 frozen in liquid ethane using a Vitrobot Mark IV (Thermo Fisher Scientific, Waltham, MA,

309 USA).

310

311 Cryo-EM data collection bioRxiv preprint doi: https://doi.org/10.1101/2021.07.13.452116; this version posted July 13, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

312 Images were acquired on a Titan Krios cryo-EM (Thermo Fisher Scientific, Waltham, MA,

313 USA) at 300 kV equipped with a Cs corrector (CEOS, GmbH), which was installed in the

314 Institute for Protein Research, Osaka University. Data were acquired automatically with EPU

315 software as movies on a Falcon 3EC detector (Thermo Fisher Scientific, Waltham, MA, USA).

316 The detailed imaging condition is described in Extended Data Table 4.

317

318 Image processing

319 We used the software RELION3.127 for the processing steps. The movie frames were motion-

320 corrected using relion_motioncorrection with 5 × 5 patches. Next, GCTF28 software was

321 utilized for contrast transfer function (CTF) estimation of all micrographs. A total of 2,469

322 micrographs were subjected to subsequent helical image processing. Helices were manually

323 traced. A total of 30,668 segments were extracted and two rounds of 2D classification was

324 performed. Classes, in which the detailed structure of helices was confirmed, were selected for

325 further analysis. The 3D initial model, a featureless cylinder, was made by the

326 relion_helix_toolbox command. Then, the featureless cylinder was refined via Refine3D and

327 used as the initial model for the following 3D classification. The 3D classification was

328 performed with local symmetry search around expected helical parameters from 2D class

329 images (twist from -12.5 degrees to -11 degrees, rise from 4 Å to 4.5 Å). A total of 23,545

330 segments from the best resolved 3D class were subjected to 3D auto-refinement (helical

331 parameter: twist for -11.8052 degrees, rise for 4.23 Å). Finally, we ran the CTF refinement job

332 and the Bayesian polishing job to improve the resolution and the final map was calculated from

333 3D auto-refinement. The resulting EM map has a nominal resolution of 3.1 Å estimated from

334 the Fourier Shell Correlation (FSC) and was normalized with PyMOL. Local resolution was

335 estimated with RELION and the locally sharpened map was used for the following atomic

336 model building. The detailed image processing conditions are listed in Extended Data Table 4. bioRxiv preprint doi: https://doi.org/10.1101/2021.07.13.452116; this version posted July 13, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

337

338 Atomic model building and refinement

339 Atomic modelling of NP (1-395) was performed by template-based and de novo structure

340 modelling. An atomic model of EBOV NP (1-450)-RNA complex (PDB-ID: 5Z9W)8 served

341 as the initial template. First, we performed a rigid body fitting of the atomic model of EBOV

342 NP (1-450) into the locally sharpened EM map of MARV NP (1-395)-RNA complex using the

343 software UCSF Chimera29. Next, the software COOT30 was used to replace each amino acid of

344 EBOV NPs with the amino acid sequence of MARV NPs; for amino acid residues from 389 to

345 391, which have no homologous sites in the atomic model of EBOV NPs, the models were

346 constructed de novo. The atomic model was refined with phenix.real_space_refine31. To

347 precisely refine the atomic models located in the intermolecular region, surrounding 8

348 molecules of the atomic models were fitted to our map and refined with constraints on the

349 protein secondary structure estimated by phenix_secondary_structure_restraints. The model

350 validity was assessed by the software Phenix31 and MolProbity32. The detailed results of the

351 atomic model construction and its evaluation are shown in Extended Data Table 4. The

352 software UCSF Chimera X33 was used for displaying the EM map and the atomic models.

353

354 Minigenome assay

355 For MARV, each well of HEK 293T cells grown in a 12-well plate were transfected with the

356 plasmids expressing MARV nucleocapsid components (400 ng of pCAGGS-L, 40 ng of

357 pCAGGS-VP30, 40 ng of pCAGGS-VP35, 200 ng of pCAGGS-NP, and 400 ng of the

358 minigenome p3M-5M-Luc), 200 ng of T7 polymerase (pCAGGS-T7), and 20 ng of firefly

359 luciferase (pGL) by using 4 µl of TransIT 293 (Takara, Shiga, Japan) and 140 µl of OPTI-

360 MEM (Thermo Fisher Scientific, Waltham, MA, USA). At 48 h post-transfection, the cells

361 were lysed using lysis buffer (pjk, Kleinblittersdorf, Germany), and the Renilla and firefly bioRxiv preprint doi: https://doi.org/10.1101/2021.07.13.452116; this version posted July 13, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

362 luciferase activities were measured using GloMax®-Multi+ Detection System (Promega,

363 Madison, WI, USA) with an Renilla-Juice kit and a FireFly-Juice kit (pjk, Kleinblittersdorf,

364 Germany) according to the manufacturer’s protocol. The Renilla luciferase activity was

365 normalized to the firefly luciferase activity.

366 For EBOV, HEK293T cells were grown in a 24-well plate and each well was transfected with

367 the plasmids expressing EBOV nucleocapsid components (1,000 ng of pCAGGS-L, 75 ng of

368 pCAGGS-VP30, 100 ng of pCAGGS-VP35, 100 ng of pCAGGS-NP, and 200 ng of the

369 minigenome p3E-5E-Luc), 200 ng of T7 polymerase (pCAGGS-T7), and 10 ng of Renilla

370 luciferase (pTK r.luc) by using 4 µl of TransIT 293 (Takara, Shiga, Japan) and 100 µl of OPTI-

371 MEM (Thermo Fisher Scientific, Waltham, MA, USA). At 48 h post-transfection, the cells

372 were lysed using Passive Lysis buffer (Promega Madison, WI, USA), and the firefly and

373 Renilla luciferase activities were measured using GloMax®-Multi+ Detection System

374 (Promega Madison, WI, USA) with a Luciferase Assay Reagent II and a Stop & Glo® Reagent

375 (Promega Madison, WI, USA) according to the manufacturer’s protocol. The firefly luciferase

376 activity was normalized to the Renilla luciferase activity.

377

378 Molecular dynamics simulation and binding free energy analysis

379 MD simulation was performed for three consecutive units of MARV NP-RNA complex (Fig.

380 3a), each consisting of an NP and six RNA nucleotides, using the Amber 18 package34. The

381 AMBER ff14SB35 and RNA.OL336,37 force fields were used for the simulation. The protonation

382 states of the ionizable residues were assigned a pH of 7.0 using the PDB2PQR web server38.

383 The system was solvated in a truncated octahedral box of TIP3P water molecules with a

384 distance of at least 10 Å around the proteins, and was then energy-minimized in four steps:

385 first, only hydrogen ; second, water and ions; third, the side chains; and finally, all atoms.

386 After energy minimization, the system was gradually heated from 0 to 310 K over 300 ps with bioRxiv preprint doi: https://doi.org/10.1101/2021.07.13.452116; this version posted July 13, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

387 harmonic restraints (with a force constant of 1.0 kcal/mol·Å2). Two additional rounds of MD

388 (50 ps each at 310K) were performed with decreasing the restraint weight from 0.5 to 0.1

389 kcal/mol·Å2. Next, 500 ns of unrestrained production run was performed in the NPT ensemble

390 at a pressure of 1.0 atm and a temperature of 300 K. All bond lengths including hydrogen atoms

391 were constrained using the SHAKE algorithm39, and a time step of 2 fs was used. A cut-off

392 radius of 10 Å was set for the nonbonded interactions. Long-range electrostatic interactions

393 were treated using the particle mesh Ewald (PME) method40. The stability of the trajectory was

394 assessed by monitoring the RMSD of the backbone Cα atoms from the initial structure of MD

395 simulation. After confirming that RMSD in the system reached equilibrium within 200 ns, the

396 trajectory extracted from the last 300 ns (i.e., 200–500 ns) was used for subsequent per-residue

397 energy analysis. Binding free energies were calculated using the script of the molecular

398 mechanics-generalized Born surface area (MM-GBSA) method in AmberTools18. The

399 conformational entropy was not considered due to the high computational cost and low

400 prediction accuracy. The salt concentration was set to 0.20 M.

401

402 Statistical analysis and reproducibility

403 Statistically significant differences in minigenome assay were evaluated using ANOVA-

404 Dunnett’s test to correct for multiple hypothesis testing with RStudio software. Statistically

405 significant differences in the length of MARV NP variants were evaluated by unpaired Student

406 t-test using RStudio software. Data are presented as the mean ± s.d. A P-value < 0.01 was

407 considered statistically significant. The data from minigenome assay and negative staining

408 analysis of MARV and EBOV NP mutants were repeated in three independent experiments.

409 Data availability bioRxiv preprint doi: https://doi.org/10.1101/2021.07.13.452116; this version posted July 13, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

410 The cryo-EM maps generated in this study have been deposited into the Electron Microscopy

411 Data Bank with accession numbers EMDB: EMD-XXXXX. The atomic coordinates reported

412 in this paper have been deposited in the Protein Data Bank with accession code PDB-ID:

413 XXXX. Raw movies have been deposited in the Electron Microscopy Public Image Archive

414 with accession codes EMPIAR-XXXXX. All other data are available from the corresponding

415 author upon request.

416 bioRxiv preprint doi: https://doi.org/10.1101/2021.07.13.452116; this version posted July 13, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

417 Reference

418 1. Kiley, M. P. et al. Filoviridae: a Taxonomic Home for Marburg and Ebola Viruses ?

419 Intervirology 18, 24–32 (1982).

420 2. Mavrakis, M., Kolesnikova, L., Schoehn, G., Becker, S. & Ruigrok, R. W. H. Morphology of

421 Marburg Virus NP–RNA. Virology 296, 300–307 (2002).

422 3. Ruigrok, R. W., Crépin, T. & Kolakofsky, D. Nucleoproteins and nucleocapsids of negative-

423 strand RNA viruses. Curr. Opin. Microbiol. 14, 504–510 (2011).

424 4. Mühlberger, E., Lötfering, B., Klenk, H.-D. & Becker, S. Three of the Four Nucleocapsid

425 Proteins of Marburg Virus, NP, VP35, and L, Are Sufficient To Mediate Replication and

426 Transcription of Marburg Virus-Specific Monocistronic Minigenomes. J. Virol. 72, 8756–8764

427 (1998).

428 5. Liu, B. et al. Structural Insight into Nucleoprotein Conformation Change Chaperoned by

429 VP35 Peptide in Marburg Virus. J. Virol. 91, (2017).

430 6. Kirchdoerfer, R. N., Abelson, D. M., Li, S., Wood, M. R. & Saphire, E. O. Assembly of the

431 Ebola Virus Nucleoprotein from a Chaperoned VP35 Complex. Cell Rep. 12, 140–149 (2015).

432 7. Bharat, T. A. M. et al. Cryo-Electron Tomography of Marburg Virus Particles and Their

433 Morphogenesis within Infected Cells. PLOS Biol. 9, e1001196 (2011).

434 8. Sugita, Y., Matsunami, H., Kawaoka, Y., Noda, T. & Wolf, M. Cryo-EM structure of the

435 Ebola virus nucleoprotein–RNA complex at 3.6 Å resolution. 563, 137–140 (2018).

436 9. Gutsche, I. et al. Near-atomic cryo-EM structure of the helical measles virus nucleocapsid.

437 Science 348, 704–707 (2015).

438 10. Green, T. J., Zhang, X., Wertz, G. W. & Luo, M. Structure of the Vesicular Stomatitis Virus

439 Nucleoprotein-RNA Complex. Science 313, 357–360 (2006).

440 11. Albertini, A. A. V. et al. Crystal Structure of the Rabies Virus Nucleoprotein-RNA Complex.

441 Science 313, 360–363 (2006).

442 12. Tawar, R. G. et al. Crystal Structure of a Nucleocapsid-Like Nucleoprotein-RNA Complex of

443 Respiratory Syncytial Virus. Science 326, 1279–1283 (2009). bioRxiv preprint doi: https://doi.org/10.1101/2021.07.13.452116; this version posted July 13, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

444 13. Raymond, D. D., Piper, M. E., Gerrard, S. R., Skiniotis, G. & Smith, J. L. Phleboviruses

445 encapsidate their genomes by sequestering RNA bases. Proc. Natl. Acad. Sci. 109, 19208–19213

446 (2012).

447 14. Ariza, A. et al. Nucleocapsid protein structures from orthobunyaviruses reveal insight into

448 ribonucleoprotein architecture and RNA polymerization. Nucleic Acids Res. 41, 5912–5926 (2013).

449 15. Niu, F. et al. Structure of the Leanyer orthobunyavirus nucleoprotein-RNA complex reveals

450 unique architecture for RNA encapsidation. Proc. Natl. Acad. Sci. 110, 9054–9059 (2013).

451 16. Hastie, K. M. et al. Crystal structure of the Lassa virus nucleoprotein–RNA complex reveals a

452 gating mechanism for RNA binding. Proc. Natl. Acad. Sci. 108, 19365–19370 (2011).

453 17. Noda, T., Hagiwara, K., Sagara, H. & Kawaoka, Y. Characterization of the Ebola virus

454 nucleoprotein–RNA complex. J. Gen. Virol. 91, 1478–1483 (2010).

455 18. Leung, D. W. et al. An intrinsically disordered peptide from Ebola virus VP35 controls viral

456 RNA synthesis by modulating nucleoprotein-RNA interactions. Cell Rep. 11, 376–389 (2015).

457 19. Zhu, T. et al. Crystal Structure of the Marburg Virus Nucleoprotein Core Domain

458 Chaperoned by a VP35 Peptide Reveals a Conserved Drug Target for Filovirus. J. Virol. 91, (2017).

459 20. Wan, W. et al. Structure and assembly of the Ebola virus nucleocapsid. Nature 551, 394–397

460 (2017).

461 21. Liu, B. et al. Structural Insight into Nucleoprotein Conformation Change Chaperoned by

462 VP35 Peptide in Marburg Virus. J. Virol. 91, (2017).

463 22. Renner, M. et al. Nucleocapsid assembly in pneumoviruses is regulated by conformational

464 switching of the N protein. eLife 5, e12627 (2016).

465 23. Alayyoubi, M., Leser, G. P., Kors, C. A. & Lamb, R. A. Structure of the paramyxovirus

466 parainfluenza virus 5 nucleoprotein–RNA complex. Proc. Natl. Acad. Sci. U. S. A. 112, E1792–E1799

467 (2015).

468 24. Wenigenrath, J., Kolesnikova, L., Hoenen, T., Mittler, E. & Becker, S. Establishment and

469 application of an infectious virus-like particle system for Marburg virus. J. Gen. Virol. 91, 1325–1334

470 (2010). bioRxiv preprint doi: https://doi.org/10.1101/2021.07.13.452116; this version posted July 13, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

471 25. Rueden, C. T. et al. ImageJ2: ImageJ for the next generation of scientific image data. BMC

472 Bioinformatics 18, 529 (2017).

473 26. Fiji: an open-source platform for biological-image analysis | Nature Methods.

474 https://www.nature.com/articles/nmeth.2019.

475 27. Zivanov, J. et al. New tools for automated high-resolution cryo-EM structure determination in

476 RELION-3. eLife 7, e42166 (2018).

477 28. Zhang, K. Gctf: Real-time CTF determination and correction. J. Struct. Biol. 193, 1–12

478 (2016).

479 29. Pettersen, E. F. et al. UCSF Chimera—A visualization system for exploratory research and

480 analysis. J. Comput. Chem. 25, 1605–1612 (2004).

481 30. Emsley, P. & Cowtan, K. Coot: model-building tools for molecular graphics. Acta

482 Crystallogr. D Biol. Crystallogr. 60, 2126–2132 (2004).

483 31. Adams, P. D. et al. PHENIX : a comprehensive Python-based system for macromolecular

484 structure solution. Acta Crystallogr. D Biol. Crystallogr. 66, 213–221 (2010).

485 32. Chen, V. B. et al. MolProbity : all- structure validation for macromolecular

486 crystallography. Acta Crystallogr. D Biol. Crystallogr. 66, 12–21 (2010).

487 33. Goddard, T. D. et al. UCSF ChimeraX: Meeting modern challenges in visualization and

488 analysis. Protein Sci. 27, 14–25 (2018).

489 34. Case, D. A. et al. AMBER 2018; 2018. Univ. Calif. San Franc. (2018).

490 35. Maier, J. A. et al. ff14SB: improving the accuracy of protein side chain and backbone

491 parameters from ff99SB. J. Chem. Theory Comput. 11, 3696–3713 (2015).

492 36. Cornell, R. of the. Nucleic Acids Force Field Based on Reference Quantum Chemical

493 Calculations of Glycosidic Torsion Profiles. J Chem Theory Comput 7, 2886–2902 (2011).

494 37. Banás, P. et al. Performance of molecular mechanics force fields for RNA simulations:

495 stability of UUCG and GNRA hairpins. J. Chem. Theory Comput. 6, 3836–3849 (2010).

496 38. Dolinsky, T. J. et al. PDB2PQR: expanding and upgrading automated preparation of

497 biomolecular structures for molecular simulations. Nucleic Acids Res. 35, W522–W525 (2007). bioRxiv preprint doi: https://doi.org/10.1101/2021.07.13.452116; this version posted July 13, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

498 39. Ryckaert, J.-P., Ciccotti, G. & Berendsen, H. J. Numerical integration of the cartesian

499 equations of motion of a system with constraints: molecular dynamics of n-alkanes. J. Comput. Phys.

500 23, 327–341 (1977).

501 40. Darden, T., York, D. & Pedersen, L. Particle mesh Ewald: An N⋅ log (N) method for Ewald

502 sums in large systems. J. Chem. Phys. 98, 10089–10092 (1993).

503 41. Smith, N. et al. DelPhi web server v2: incorporating atomic-style geometrical figures into the

504 computational protocol. Bioinforma. Oxf. Engl. 28, 1655–1657 (2012).

505 42. Sarkar, S. et al. DelPhi Web Server: A comprehensive online suite for electrostatic

506 calculations of biological macromolecules and their complexes. Commun. Comput. Phys. 13, 269–284

507 (2013).

508 43. Larkin, M. A. et al. Clustal W and Clustal X version 2.0. Bioinforma. Oxf. Engl. 23, 2947–

509 2948 (2007).

510 44. Robert, X. & Gouet, P. Deciphering key features in protein structures with the new ENDscript

511 server. Nucleic Acids Res. 42, W320–W324 (2014).

512

513 bioRxiv preprint doi: https://doi.org/10.1101/2021.07.13.452116; this version posted July 13, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

514 Acknowledgments

515 We thank Y. Kawaoka for providing plasmids expressing EBOV viral proteins and

516 minigenome, and A. Takada for anti-MARV NP antiserum. This work was supported by a

517 Grant-in-Aid for JSPS Fellows (21J12207) (to Y.F.), a Research Grant from the Kazato

518 Research Encouragement Prize, MEXT Grant-in-Aid for Leading Initiative for Excellent

519 Young Researchers, JSPS KAKENHI Grant Number 21K07052 (to Y.S.), JSPS KAKENHI

520 Grant Number 19K16666, an AMED Research Program for Infectious Diseases Research and

521 Infrastructure (Interdisciplinary Cutting-edge Research, 20wm0325023j0001,

522 21wm0325023j0002) (to Y.T.), JSPS KAKENHI Grant Number 20H03140 (to M.I.), JSPS

523 KAKENHI Grant Numbers 20H03494, 19K22529, the JSPS Core-to-Core Program A, the

524 Advanced Research Networks, MEXT Grant-in-Aid for Scientific Research on Innovative

525 Area (19H04831), an AMED Research Program on Emerging and Re-emerging Infectious

526 Disease grants (19fk0108113, 20fk0108270h0001; 21wm0325023j0002), the JST Core

527 Research for Evolutional Science and Technology (JPMJCR20HA), the Grant for the Joint

528 Usage / Research Center on Tropical Disease, Institute of Tropical Medicine, and Nagasaki

529 University, the Daiichi Sankyo Foundation of Life Science, the Uehara Memorial Foundation

530 (to T.N.), the Grant for Joint Research Project of the Institute of Medical Science, University

531 of Tokyo, the Joint Usage/Research Center program of Institute for Frontier Life and Medical

532 Sciences Kyoto University, and the Takeda Science Foundation (to Y.S and T.N).

533 This study was also supported by the Platform Project for Supporting Drug Discovery and Life

534 Science Research (Basis for Supporting Innovative Drug Discovery and Life Science Research

535 (BINDS)) from AMED under Grant Number JP19am0101072 (support number 1632).

536

537 Author contributions bioRxiv preprint doi: https://doi.org/10.1101/2021.07.13.452116; this version posted July 13, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

538 Y.F., Y.S., and T.N. designed the study; Y.F., Y.S., Y.T., K.H., I.M., Y.M., M.N., and Y.T.

539 performed experiments; Y.F., Y.S., S. B., and T.N. wrote the manuscript with input from all

540 co-authors. All authors read and approved the final manuscript.

541

542 Competing interest declaration

543 The authors declare no competing interests.

544 bioRxiv preprint doi: https://doi.org/10.1101/2021.07.13.452116; this version posted July 13, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

Fig. 1: Overall cryo-electron microscopy (EM) structure and atomic model of Marburg

virus nucleoprotein (MARV NP) (1-395). bioRxiv preprint doi: https://doi.org/10.1101/2021.07.13.452116; this version posted July 13, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

(a) Structural layout of MARV NP sequence. The full-length NP consists of 695 amino acids

and is divided into an N-terminal arm, an NP core consisting of N- and C-terminal lobes, a

disordered region, and a C-terminal tail.

(b) The iso-electron potential surface map of an NP-RNA complex reconstruction, calculated

from 23,545 segments (contoured at 3σ above average). The RNA strand is highlighted in red.

An NP-a molecule, which is in a dark gray strand, is highlighted in light green. An NP-b

molecule in a light gray strand is highlighted in yellow.

(c) The isolated NP-RNA complex unit (3σ) and (d) our atomic model with the EM map (3σ)

in a ribbon representation, coloured the same as in Figure 1a.

(e) Isolated EM map (6σ) of RNA superimposed with the atomic model. RNA bases are

modelled as uracil.

bioRxiv preprint doi: https://doi.org/10.1101/2021.07.13.452116; this version posted July 13, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

Fig. 2: Nucleoprotein (NP)-RNA interactions.

(a) Overall structure of an RNA-bound NP molecule (PDB-ID: XXXX, coloured the same as

Figure 1a, from this study), which is superimposed with RNA-free monomeric NP (PDB-ID:

5F5M, gray)

(b) Close-up view of the RNA binding site in a cylinder representation of α-helices and β-

sheets. A secondary structure η6 appears underneath the RNA nucleotides. bioRxiv preprint doi: https://doi.org/10.1101/2021.07.13.452116; this version posted July 13, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

(c) Close-up view of the RNA binding site in a ribbon representation of α-helices and β-sheets,

coloured the same as Figure 1a.

(d) Schematic diagram of RNA recognition by positively charged residues of Marburg virus

(MARV) NP. The RNA bases, which face inward and outward of the helix, are coloured pink

and cyan, respectively.

bioRxiv preprint doi: https://doi.org/10.1101/2021.07.13.452116; this version posted July 13, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

Fig. 3: Nucleoprotein (NP)-NP interactions.

(a) Three adjacent NP molecules viewed from inside the helix. Black boxes show respective

close-up figures.

(b) NP-NP interactions between the N-terminal arm of NPn (green) and the hydrophobic pocket

of an adjacent NPn+1 (white, hydrophobic residues are shown in orange).

(c) NP-NP interaction between the N-terminal lobe of NPn (green) and an adjacent N-terminal

lobe of NPn+1 (white). Basic residues are coloured in blue, and acidic residues are coloured in

red.

(d) NP-NP interaction between two C-terminal lobe helices in NP-a. Hydrophobic residues are

shown in orange. Basic residues are coloured in blue, and acidic residues are coloured in red. bioRxiv preprint doi: https://doi.org/10.1101/2021.07.13.452116; this version posted July 13, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

Fig. 4: Site-directed mutagenesis analysis.

(a) A gallery of purified, negatively stained, helical complexes composed of C-terminally

truncated Marburg virus (MARV) and Ebola virus (EBOV) nucleoprotein (NP) mutant-RNA

complexes and the wild-type (WT) complex. Scale bars, 50 nm. bioRxiv preprint doi: https://doi.org/10.1101/2021.07.13.452116; this version posted July 13, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

(b) Transcription and replication activities of MARV and EBOV NP mutants and the wild-type

(WT), evaluated by minigenome assay. The experiments were performed in triplicates (n = 3).

The statistical significance was tested by ANOVA-Dunnett’s test to correct for multiple

hypothesis testing.

bioRxiv preprint doi: https://doi.org/10.1101/2021.07.13.452116; this version posted July 13, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

Extended Data Table 1. The calculated MM-GBSA binding free energies (kcal mol−1)

of RNA (Uracil) backbone and base

URACIL Total Backbone BASE The order of numbering is as shown in Fig. 1e 1 −5.187 −3.876 −1.311

2 −3.003 −2.743 −0.26

3 −9.744 −3.957 −5.787

4 −5.038 −3.472 −1.565

5 −9.094 −8.775 −0.318

6 −15.876 −13.984 −1.892

all −47.942 −36.807 −11.133

bioRxiv preprint doi: https://doi.org/10.1101/2021.07.13.452116; this version posted July 13, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

Extended Data Table 2. Target residues and site-specific substitutions

Mutant Location Contact region

MARV EBOV

H4A H22A N-terminal arm C-terminal loop of adjacent NP

L6E I24E N-terminal arm C-terminal hydrophobic pocket of

adjacent NP

L9E A27E N-terminal arm C-terminal hydrophobic pocket of

adjacent NP

R19A R37A N-terminal arm N-terminal lobe of adjacent NP

R339A Y357A C-terminal lobe α-helix C-terminal lobe α-helix of adjacent

NP

K142A K160A N-terminal lobe side of RNA

RNA cleft

K153A K171A N-terminal lobe side of RNA

RNA cleft

R156A R174A N-terminal lobe side of RNA

RNA cleft

K230A K248A C-terminal lobe side of RNA

RNA cleft

H292A H310A C-terminal lobe side of RNA

RNA cleft

bioRxiv preprint doi: https://doi.org/10.1101/2021.07.13.452116; this version posted July 13, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

Extended Data Table 3. Primers used in this study

Name Forward Oligo sequence (5´ to 3´) Purpose

or

Reverse

MARV H4A Forward GATTTAGCCAGTTTGTTGGAG mutation

TTGGGTAC introduction

MARV H4A Reverse CAAACTGGCTAAATCCATGGT mutation

GGCGGCG introduction

MaV_NP_L6E_F Forward CACAGTGAATTGGAGTTGGGT mutation

ACAAAACC introduction

MaV_NP_L6E_R Reverse CTCCAATTCACTGTGTAAATC mutation

CATGGTGG introduction

MaV_NP_L9E_F Forward TTGGAGGAGGGTACAAAACC mutation

CACTGCCCC introduction

MaV_NP_L9E_R Reverse TGTACCCTCCTCCAACAAACT mutation

GTGTAAATC introduction

MaV_NP_R19A_F Forward GCCCCTCATGTCGCAAATAAG mutation

AAAGTG introduction

MaV_NP_R19A_R Reverse CTTATTTGCGACATGAGGGGC mutation

AGTGGGTTTTG introduction

MaV_NP_R339A_F Forward CAAAGGGCACATGAACATCA mutation

GGAAATTC introduction

MaV_NP_R339A_R Reverse TTCATGTGCCCTTTGTAGTTTT mutation

ACTTCCG introduction bioRxiv preprint doi: https://doi.org/10.1101/2021.07.13.452116; this version posted July 13, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

MaV_NP_K142A_F Forward CTCCCAGCACTTGTCGTCGGA mutation

GACCGAGC introduction

MaV_NP_K142A_R Reverse GACAAGTGCTGGGAGGAAAA mutation

GACTGCAAAATG introduction

MaV_NP_K153A_F Forward ATCGAAGCGGCTTTAAGACA mutation

AGTAACAGTG introduction

MaV_NP_K153A_R Reverse TAAAGCCGCTTCGATACTAGC mutation

TCGGTCTC introduction

MaV_NP_R156A_F Forward GCTTTAGCACAAGTAACAGTG mutation

CATCAAG introduction

MaV_NP_R156A_R Reverse TACTTGTGCTAAAGCCTTTTC mutation

GATACTAG introduction

MaV_NP_K230A_F Forward ATCGTGGCAACAGTTCTCGAG mutation

TTCATCTTG introduction

MaV_NP_K230A_R Reverse AACTGTTGCCACGATAAGAA mutation

GTCCTGAG introduction

MaV_NP_H292A_F Forward CTCGAAGCAGGACTCTATCCT mutation

CAGCTTTC introduction

MaV_NP_H292A_R Reverse GAGTCCTGCTTCGAGGTTGTT mutation

AATCCC introduction

MaV-F-Kozac- Forward CACACAGAATTCGCCGCCACC insert PCR

EcoRI ATGGATTTACACAGTTTGTTG

GAGTTG bioRxiv preprint doi: https://doi.org/10.1101/2021.07.13.452116; this version posted July 13, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

MaV-F-Kozac- Forward CACACAGAATTCGCCGCCACC insert PCR

EcoRI_H4A ATGGATTTAGCCAGTTTGTTG

GAGTTG

MaV-F-Kozac- Forward CACACAGAATTCGCCGCCACC insert PCR

EcoRI_L6E ATGGATTTACACAGTGAATTG

GAGTTG

MaV-F-Kozac- Forward CACACAGAATTCGCCGCCACC insert PCR

EcoRI-L9E ATGGATTTACACAGTTTGTTG

GAGGAG

MaV-R395-stop- Reverse CACACAGCTAGCTCAAATATT insert PCR

NheI GTTTTCAATTTCTGCAGCG

MaV_NP_full_stop_ Reverse CACACAGCTAGCCTACAAGTT insert PCR

NheI_R CATAGCAACATGTCTCCTTTC

MaV NP seq Forward CACATACCCTAATCATTGGC sequence

EboNPR174A Forward GTTCAAGCACAAATTCAAGTA mutation

CATGCAGAGCAAGGACTGA introduction

Reverse AATTTGTGCTTGAACCTTCTC mutation

AAGGCAAGCCTTTTCTCCT introduction

EboNPK160A Forward CTTCCGGCATTGGTAGTAGGA mutation

GAAAAGGCTTGCCTTGAGA introduction

Reverse TACCAATGCCGGAAGGAATA mutation

GACTTGCAAAGGAGAGAAAC introduction

TG bioRxiv preprint doi: https://doi.org/10.1101/2021.07.13.452116; this version posted July 13, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

EboNPK171A Forward CTTGAGGCAGTTCAAAGGCA mutation

AATTCAAGTACATGCAGAGC introduction

AAGGACTGATAC

Reverse TTGAACTGCCTCAAGGCAAGC mutation

CTTTTCTCCTACTACCAA introduction

EboNPI24E Forward CACAAGGAGTTGACAGCAGG mutation

TCTGTCCGTTCAACAGGGGA introduction

Reverse TGTCAACTCCTTGTGGTAATC mutation

CATGTCAGATTCAGTGAGA introduction

EboNPR37A Forward ATTGTTGCACAAAGAGTCATC mutation

CCAGTGTATCAAGTAAACA introduction

Reverse TCTTTGTGCAACAATCCCCTG mutation

TTGAACGGACAGA introduction

EboNPH22A Forward GATTACGCAAAGATCTTGACA mutation

GCAGGTCTGTC introduction

Reverse GATCTTTGCGTAATCCATGTC mutation

AGATTCAGTGAGACTCG introduction

EboNPA27E Forward TTGACAGAGGGTCTGTCCGTT mutation

CAACAGGGGAT introduction

Reverse CAGACCCTCTGTCAAGATCTT mutation

GTGGTAATCCATGTCAGA introduction

EboNPY357A Forward CCAACAAGCAGCAGAGTCTC mutation

GCGAACTTGACCATCTTG introduction

Reverse ACTCTGCTGCTTGTTGGAGTT mutation

GCTTCTCAGCCTCAGT introduction bioRxiv preprint doi: https://doi.org/10.1101/2021.07.13.452116; this version posted July 13, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

EboNPK248A Forward ATTGTCGCAACAGTACTTGAT mutation

CATATCCTACAAAAGACAGA introduction

ACGAGG

Reverse TACTGTTGCGACAATCAATAA mutation

GCCTGAAAAACGAGCTTGAG introduction

C

EboNPH310A Forward CTTGAGGCAGGTCTTTTCCCT mutation

CAACTATCGGCAATTGC introduction

Reverse AAGACCTGCCTCAAGATTATT mutation

TACTCCAGAAAGGTTCAAAA introduction

GTCGGGCGA

EcoR1EboNPF Forward CACACAGAATTCGCCGCCACC insert PCR

ATGGATTCTCGTCCTCAGAAA

ATCTGGATGG

Nhe1EboNPR Reverse CACACAGCTAGCTCAAGCGTC insert PCR

ATCGTCGTCCTTGTAGTCCT

EboNP425F Forward CAACTGAAGCTAATGCCGGTC sequence

A

bioRxiv preprint doi: https://doi.org/10.1101/2021.07.13.452116; this version posted July 13, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

Extended Data Table 4. Cryo-EM data collection, refinement and validation statistics

MARV NP-RNA

complex

(EMDB-XXXXX)

(PDB-ID: XXXX)

Data collection and

processing

Magnification 59,000

Voltage (kV) 300

Electron exposure (e–/Å2) 30

Defocus range (μm) -0.5 ~ -2.5

Pixel size (Å) 1.13

Symmetry imposed C1 helical

Helical rise (Å) 4.23

Helical rotation (°) 11.8052

Initial particle images (no.) 30,668

Final particle images (no.) 23,545

Map resolution (Å) 3.1

FSC threshold 0.143 bioRxiv preprint doi: https://doi.org/10.1101/2021.07.13.452116; this version posted July 13, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

Map resolution range (Å) 2.98 ~ 4.35

Refinement

Initial model used (PDB-ID) 5Z9W

Model resolution (Å) 3.1

FSC threshold 0.143

Map sharpening B factor (Å2) -100

Model composition

Non-hydrogen atoms 6450

Protein residues 788

Nucleotide 12

B factors (Å2) min/max/mean

Protein 6.92/83.43/26.05

Nucleotide 13.55/25.68/20.47

R.m.s. deviations

Bond lengths (Å) 0.009

Bond angles (°) 1.124

Validation

MolProbity score 1.52

Clashscore 4.98

Poor rotamers (%) 0.00 bioRxiv preprint doi: https://doi.org/10.1101/2021.07.13.452116; this version posted July 13, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

Ramachandran plot

Favored (%) 96.17

Allowed (%) 3.83

Disallowed (%) 0

bioRxiv preprint doi: https://doi.org/10.1101/2021.07.13.452116; this version posted July 13, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

Extended Data Fig. 1: Cryo-electron microscopy (EM) image and single particle

analysis of Marburg virus (MARV) nucleoprotein (NP)-RNA complex.

(a) A representative cryo-electron micrograph of purified MARV NP-RNA complex.

(b) 2D class averages of MARV NP-RNA complex, selected for the following 3D

classification.

(c) Local resolution map of the asymmetric subunits, NP-a and NP-b, in a MARV NP-RNA

complex. bioRxiv preprint doi: https://doi.org/10.1101/2021.07.13.452116; this version posted July 13, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

(d) Fourier Shell Correlation (FSC) from independently refined datasets. Red

for overall map resolution of 3.1 angstrom (FSC = 0.143), which is validated by the cross-

FSC between cryo-EM map and model-generated map (black, 3.4 angstrom at FSC = 0.5).

(e) Overall structure of MARV RNA-bound NP-a structure (green) superimposed with NP-

b structure (yellow).

(f) RNA-bound NP-a structure (coloured in light green) fitted into authentic MARV

nucleocapsid EM map (EMD-3875, 2.5σ).

bioRxiv preprint doi: https://doi.org/10.1101/2021.07.13.452116; this version posted July 13, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

Extended Data Fig. 2: Comparison between Ebola virus (EBOV) and Marburg virus

(MARV) nucleoprotein (NP)-RNA complex unit structures.

Overall structure of MARV NP-RNA complex (PDB-ID: XXXX, coloured as shown in Fig.

1a, from this study) is superimposed with EBOV NP-RNA complex (PDB-ID: 5Z9W, gray).

bioRxiv preprint doi: https://doi.org/10.1101/2021.07.13.452116; this version posted July 13, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

Extended Data Fig. 3: Marburg virus (MARV) full length nucleoprotein (NP)-RNA

complexes.

An image of the negatively stained MARV full length NP-RNA complexes purified from

Expi 293F cells. Red arrow indicates a double-helical structure.

bioRxiv preprint doi: https://doi.org/10.1101/2021.07.13.452116; this version posted July 13, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

Extended Data Fig. 4: Electrostatic surface of Marburg virus (MARV) nucleoprotein

(NP)-a.

Electrostatic surface potential was calculated with the atomic model of our NP-a monomer

using Delphi web-server41,42. Scale ranging from -5 (red) to +5 kcal mol-1 e-1 (blue).

bioRxiv preprint doi: https://doi.org/10.1101/2021.07.13.452116; this version posted July 13, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

Extended Data Fig. 5: Secondary structure assignment of Marburg virus (MARV)

nucleoprotein (NP)-a aligned with filovirus NPs.

Secondary structure elements were annotated by phenix.secondary_structure_restraints31.

The amino acid sequences of filovirus NP (MARV: Marburg marburgvirus, ZEBOV: Zaire

ebolavirus, SUDV: Sudan ebolavirus, RESTV: Reston ebolavirus, BDBV: Bundibugyo

ebolavirus, TAFV: Taï Forest ebolavirus, BOMV: Bombali ebolavirus, LLOV: Lloviu bioRxiv preprint doi: https://doi.org/10.1101/2021.07.13.452116; this version posted July 13, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

cuevavirus) were aligned with ClustalW2.143 and visualized with ESPript 3.044. The

numbering of the alignments is assigned based on the MARV NP sequence. Conserved

residues are highlighted in red and similar residues are coloured in red.

bioRxiv preprint doi: https://doi.org/10.1101/2021.07.13.452116; this version posted July 13, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

Extended Data Fig. 6: Impact of L6E and R19A mutations on the length of Marburg

virus (MARV) nucleoproteins (NPs).

Length of MARV C-terminal truncated L6E and R19A mutant NPs and the C-terminal

truncated wild-type (WT), evaluated by negative staining. About sixty helices were examined

and the statistical significance was tested by student’s t-test. (**** : p < 0.0001, ** : p < 0.01)

bioRxiv preprint doi: https://doi.org/10.1101/2021.07.13.452116; this version posted July 13, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

Extended Data Fig. 7: Marburg virus (MARV) nucleoprotein (NP)-RNA complexes

purified from HEK 293T cells.

(a) An image of the negatively stained C-terminal truncated MARV NP (1-395)-RNA

complexes purified from HEK 293T cells. Red arrow indicates a double-helical structure.

(b) A cryo-electron micrography (EM) image of the C-terminal truncated MARV NP (1-

395)-RNA complexes purified from HEK 293T cells. Red arrow indicates a double-helical

structure.

bioRxiv preprint doi: https://doi.org/10.1101/2021.07.13.452116; this version posted July 13, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

Extended Data Fig. 8: List of amino acids located near RNA strand and electrostatic

surfaces on nucleoproteins of the family Paramyxoviridae.

Amino acids located within 3.5 Å distance from RNA strand are listed. Basic residues are

highlighted in blue. None of the basic amino acids in the C-terminal lobe were oriented

toward the phosphate group of the RNA (i.e., unlikely to be responsible for interaction with

RNA). Electrostatic surfaces were calculated from the previously reported atomic models

using Delphi web-server41,42. Scale ranging from -5 (red) to +5 kcal mol-1 e-1 (blue).

bioRxiv preprint doi: https://doi.org/10.1101/2021.07.13.452116; this version posted July 13, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

Extended Data Fig. 9: Common binding pocket in VP35 peptide and the N-terminal

arm of nucleoprotein (NP).

(a) Comparison of the Marburg virus (MARV) nucleoprotein (NP) structures around the

binding pocket of VP35 peptide and the NP N-terminal arm. The VP35-bound RNA-free

monomeric state of the NP are coloured in blue and purple (PDB-ID: 5F5O), and orange and

red (PDB-ID: 5XSQ). The RNA-bound oligomeric state was coloured in gray and yellow

(PDB-ID: XXXX in this study).

(b) Molecular lipophilicity potential map of our structure coloured in blue (hydrophilic) to

orange (hydrophobic). The binding pocket of VP35 peptide and the N-terminal arm is

hydrophobic.

(c) Close-up view of the N-terminal arm of its adjacent NP (shown in yellow, PDB-ID:

XXXX in this study).

(d) Close-up view of the VP35 peptide (shown in purple, PDB-ID: 5F5O) aligned relative to

the whole NP structure.