Gene regulation mediated by ancient retroviral elements in the human genome

Felix Broecker 1,2, Martin Vingron 2, Hans Lehrach 2, Karin Moelling 1,2

1 Institute of Medical Microbiology, University of Zurich. 2 Max Planck Institute for Molecular Genetics, Berlin.

About 50% of the human genome is composed of retrovirus-related sequences, the retroelements (REs), which are grouped into short and long interspersed elements (SINEs, LINEs) and human endogenous retroviruses (HERVs) [1]. The recent ENCODE project demonstrated that many REs attract transcription factors and are transcribed by host RNA polymerases [2]. Despite the vast number of ~3,000,000 REs per haploid genome, a functional effect on host gene regulation has been attributed to only a few representative REs. Here we characterized a HERV family designated HERV-K(HML-10) at the genome-wide level and identified 66 HERV-K(HML-10)-related sequences [3]. The infectious progenitor retrovirus invaded the primate genome ~35 million years ago. Today, HERV-K(HML-10) is present either as full-length proviruses flanked by two long terminal repeats (LTRs), which act as retroviral promoters, or as solitary LTRs resulting from homologous recombination. These elements show a distinct non-random genomic distribution, and those located within gene introns are preferentially in antisense orientation, indicating purifying selection and possible functional relevance. We demonstrate that about half of the LTRs have remained active promoters until today. These LTRs may exert regulatory functions on neighboring or encompassing genes by providing regulatory antisense transcripts. One of these LTR-dependent regulatory transcripts originates from an active LTR within an intron of the death-associated protein 3 (DAP3) gene, which is involved in apoptosis. Knockdown of the LTR-dependent antisense transcript in human cells caused an increase in DAP3 mRNA expression levels and thereby promoted cell death. We conclude that retrovirus-derived sequences in the human genome can influence cellular morphology and function and may serve as drivers of evolution.
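The claim of a non-random distribution and orientation bias rests on comparing observed counts with an expectation, and Figure 1 names a chi-square test for this purpose. The following is a minimal sketch of such a two-category goodness-of-fit test, not the authors' analysis. The function name `chi_square_two_categories`, the 50% null expectation for orientation, and the example counts are all assumptions made for illustration; the counts are placeholders and are not data from this poster.

```python
import math


def chi_square_two_categories(observed_a, observed_b, expected_fraction_a):
    """Chi-square goodness-of-fit test for two categories (1 degree of freedom).

    observed_a, observed_b: observed counts (e.g. antisense vs. sense elements).
    expected_fraction_a: fraction expected in category a under the null hypothesis.
    Returns (chi2 statistic, p-value).
    """
    total = observed_a + observed_b
    expected_a = total * expected_fraction_a
    expected_b = total - expected_a
    if expected_a <= 0 or expected_b <= 0:
        raise ValueError("expected counts must be positive in both categories")

    chi2 = ((observed_a - expected_a) ** 2 / expected_a
            + (observed_b - expected_b) ** 2 / expected_b)
    # For 1 degree of freedom the upper tail of the chi-square distribution
    # equals erfc(sqrt(chi2 / 2)), so no external statistics library is needed.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Placeholder counts (NOT from the poster): 30 antisense vs. 10 sense elements,
# tested against the assumed null expectation of equal orientation frequencies.
chi2, p_value = chi_square_two_categories(30, 10, 0.5)
print(f"chi2 = {chi2:.2f}, p = {p_value:.4g}")
```

With very small expected counts, an exact binomial test would be the more cautious choice than the chi-square approximation.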

HERV-K(HML-10) elements in the human genome

[Figure 1 panels: (a) provirus (~6400 bp; labels 5'LTR, gag, pol, env, A/T-rich, 3'LTR) and solitary LTR (~550 bp) formed by homologous recombination; (b) integration of HML-10 on a timeline from 100 to 0 mya with human, chimpanzee, gorilla, orangutan, rhesus macaque, baboon, squirrel monkey, marmoset, mouse lemur, bushbaby and mouse; (c) tree of HML-1 to HML-10 with MMTV, JSRV and MPMV; (d) chromosomes 1 to 22, X and Y with proviruses and solitary LTRs marked; (e) intronic elements (%), intronic versus intergenic.]

Figure 1 | HML-10 is a 35-million-year-old betaretrovirus frequently integrated within gene introns. (a) Structure of HML-10 elements. (b) Time of integration. (c) Phylogeny of betaretroviruses based on pol sequences. (d) Distribution of HML-10 elements in the human genome. (e) Percentages of intronic elements. *P ≤ 0.05, **P ≤ 0.01, chi-square test. LTR, long terminal repeat; mya, million years ago; JSRV, Jaagsiekte Sheep Retrovirus; MPMV, Mason-Pfizer Monkey Virus; MMTV, Mouse Mammary Tumor Virus.

Expression of intronic HERV-K(HML-10) proviruses

[Figure 2 panels: (a) RefSeq gene models of C4 (Chr.6p21.32, 32057813-32078435), DAP3 (Chr.1q22, 153925506-153957424) and PKIB (Chr.6q22.31, 122834761-123089217) with the intronic HML-10 proviruses; (b) fLuc activity in 293T (%) and fLuc activity in HepG2 (%), with pGL3 and SV40 as reference constructs; (c) fLuc activity in HepG2 (%) at IFN-γ doses of 0, 1, 10, 100 and 1000 U/ml; (d) alignment of the 5'LTRs and 3'LTRs of the C4, DAP3 and PKIB proviruses around the IFN-γ-activated site, reference row C4_5'LTR: TATGGGACAATAAGTTGTGGAAAGCCACAAGAGGCCT; (e) DAP3 locus with the TSS and the primers LTRfor1, LTRfor2 and LTRrev; (f) expression in HepG2 and HeLa cells, primer pairs LTRfor1+LTRrev and LTRfor2+LTRrev.]

Figure 2 | LTRs of HML-10 proviruses located in introns of three genes exert distinctive activities. (a) Intron/exon structure of three genes with antisense-oriented intronic HML-10 proviruses. (b) LTR promoter activities assessed by luciferase reporter assay. (c) Effect of IFN-γ on LTR promoter activities. (d) Identification of a conserved IFN-γ-activated site within a highly conserved region of LTRs. (e) Locations of the predicted transcription start site (TSS) and primers used for qRT-PCR in the 5'LTR of the DAP3 HML-10 provirus. (f) Expression of DAP3 and LTR-primed regulatory antisense RNA assessed by qRT-PCR. fLuc, firefly luciferase gene; SV40, Simian Virus 40 promoter.
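The IFN-γ-activated site in Figure 2d can be illustrated with a simple motif scan. The sketch below searches the C4 5'LTR reference sequence printed in the alignment for the loose IFN-γ-activated site (GAS) core consensus TTN5AA. Whether the authors used this consensus is not stated on the poster, so the pattern, the name `find_gas_like_sites`, and the choice of a regular-expression scan are assumptions for illustration only.

```python
import re

# Conserved LTR region of the C4 5'LTR, copied from the alignment in Figure 2d.
C4_5LTR_REGION = "TATGGGACAATAAGTTGTGGAAAGCCACAAGAGGCCT"

# Assumption: loose GAS core consensus TTN5AA (TT, any five bases, AA).
# The lookahead makes the scan report overlapping candidates as well.
GAS_PATTERN = re.compile(r"(?=(TT[ACGT]{5}AA))")


def find_gas_like_sites(sequence):
    """Return (0-based offset, matched 9-mer) for every TTN5AA candidate."""
    seq = sequence.upper()
    return [(match.start(), match.group(1)) for match in GAS_PATTERN.finditer(seq)]


sites = find_gas_like_sites(C4_5LTR_REGION)
for offset, site in sites:
    print(f"candidate GAS-like site {site} at offset {offset}")
```

On the sequence above, the scan reports a single candidate, TTGTGGAAA, at 0-based offset 14. Because the reverse complement of TTN5AA is again TTN5AA, scanning one strand is sufficient for this pattern. A hit of this kind only marks a candidate; it does not by itself show that the site is functional.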

Regulation of cell death by HML-10-primed DAP3-as

[Figure 3 panels: (a) pathway diagram with death receptor ligand, death receptor, FADD, DAP3, Casp8 and Casp3, leading to cell survival or cell death; (b) model of DAP3-as and DAP3 transcription and the influence of IFN-γ.]

Figure 3 | Proposed function of HML-10 LTR-primed regulatory RNA on DAP3 expression and cell death. (a) The DAP3 protein serves as an adapter protein linking death receptor signaling to cell death. DAP3-as, originating within the 5'LTR of the HML-10 provirus within the DAP3 gene, may regulate DAP3 expression and, consequently, cell death. (b) HML-10 LTR-primed DAP3-as may regulate DAP3 expression by interfering with DAP3 mRNA expression through formation of dsRNA or collision of RNA polymerase transcription machineries. IFN-γ interferes with DAP3-as expression, as we have shown. This promotes DAP3 expression and, consequently, cell death. Of note, DAP3 is an IFN-γ-inducible gene.

Knockdown of LTR-primed regulatory antisense RNA

[Figure 4 panels: (a) DAP3 locus (Chr.1q22, 153925506-153957424; RefSeq) with the HML-10 provirus, the TSS and antisense oligonucleotides 1 to 4; (b) DAP3 mRNA expression in HeLa cells after 24 h, oligonucleotides at 25 nM and 50 nM versus mock; (c) dead cells (%) and cell viability (%) in unstimulated, IFN-γ-treated and TNF-α-treated HeLa cells, mock (50 nM) versus oligonucleotide 2 (50 nM); (d) microscopy of HeLa cells, mock (50 nM) versus oligonucleotide 2 (50 nM).]

Figure 4 | Sequence-specific knockdown of the HML-10 LTR-primed DAP3 antisense RNA (DAP3-as) promotes DAP3 expression and cell death. (a) Locations of the TSS and antisense oligonucleotides. (b) Effect of DAP3-as knockdown on DAP3 mRNA expression assessed by qRT-PCR. (c) Effect of DAP3-as knockdown on cell viability (left: percentages of cells positive for trypan blue staining; right: relative cell viability assessed by MTT assay). (d) Representative microscopy image of HeLa cells. Knockdown of DAP3-as induces a cell morphology characteristic of apoptotic cells (shrunken cells, membrane blebs, cell decomposition).

Conclusions

- HERV-K(HML-10) is a ~35-million-year-old endogenous betaretrovirus
- HML-10 integration occurred preferentially within or near genes
- Antisense orientation to encompassing genes was evolutionarily favored
- Three intronic proviruses show distinctive promoter activities in their LTRs
- LTR promoter activity is negatively regulated by IFN-γ
- An LTR-primed regulatory antisense RNA regulates DAP3 expression
- LTR-mediated regulation of DAP3 expression modulates cell death
- First description of a HERV-mediated regulatory function directly affecting cellular morphology, with implications for phenotype modification

References

[1] Lander, E.S., Linton, L.M., Birren, B., et al. Initial sequencing and analysis of the human genome. Nature 409, 860-921 (2001).
[2] ENCODE Project Consortium, Bernstein, B.E., Birney, E., et al. An integrated encyclopedia of DNA elements in the human genome. Nature 489, 57-74 (2012).
[3] Broecker, F., Horton, R., Heinrich, J., Lehrach, H., Moelling, K. Novel regulation of apoptosis by a HERV-specific antisense transcript. Nature Communications (under ...).