MASARYK UNIVERSITY FACULTY OF SCIENCE

The role of the C-terminal domain of RNA polymerase II in transcriptionally regulated process of genomic instability

Ph.D. Dissertation

Květa Pilařová

Supervisor: Mgr. Dalibor Blažek, Ph.D.

Department of biochemistry

Brno 2020

Bibliografický záznam

Autor: Mgr. Květa Pilařová Přírodovědecká fakulta, Masarykova univerzita Ústav biochemie

Název práce: Úloha C-terminální domény RNA polymerasy II v transkripčně regulovaném procesu genomové nestability

Studijní program: Biochemie

Vedoucí práce: Mgr. Dalibor Blažek, Ph.D.

Akademický rok: 2019/2020

Počet stran: 132

Klíčová slova: CDK12, CDK13, kinázová aktivita, analog-senzitivní kináza, C-terminální doména RNAPII, transkripce, genová exprese, přechod mezi G1/S, genomová nestabilita, nádorový biomarker

Bibliographic entry

Author: Mgr. Květa Pilařová Faculty of Science, Masaryk University Department of biochemistry

Title of thesis: The role of the C-terminal domain of RNA polymerase II in transcriptionally regulated process of genomic instability

Degree programme: Biochemistry

Supervisor: Mgr. Dalibor Blažek, Ph.D.

Academic year: 2019/2020

Number of pages: 132

Keywords: CDK12, CDK13, kinase activity, analogue-sensitive kinase, C-terminal domain of RNAPII, transcription, gene expression, G1/S progression, genome instability, tumour biomarker

Abstrakt

Transkripce protein-kódujících genů je řízena v eukaryotických buňkách RNA polymerázou II (RNAPII). Cyklin-dependentní kinázy 12 a 13 (CDK12 a CDK13) se řadí do skupiny transkripčních CDKs, které asociují s RNAPII při elongaci a fosforylují její C-terminální doménu (CTD). Obě kinázy působí také na expresi genů. Abnormální funkce těchto proteinů jsou u lidských buněk spojovány s různými typy onemocnění a v posledním desetiletí se proto staly předmětem studia výzkumů v oblasti medicíny. Jakým způsobem se CDK12 a CDK13 konkrétně podílí na regulaci transkripce a fosforylačním statusu CTD však není příliš známo. Substrátová specifita a struktura lidského komplexu CDK12 s cyklinem K (CycK) byla definována in vitro (Bösken et al., 2014). V první studii této dizertační práce jsme provedli strukturní a funkční analýzu proteinů, které tvoří lidský komplex CDK13/CycK. V rámci této studie jsme také charakterizovali společné a specifické funkce obou komplexů vázajících CycK. Zjistili jsme, že in vitro kinázová aktivita na CTD RNAPII je u obou z nich stejná. Velmi odlišné jsou však skupiny genů a související procesy, které jednotlivé kinázy regulují. Výsledky této studie ukázaly, že proteiny CDK12 a CDK13 mají specifické funkce a mohly by pomoci objasnit mechanismus, jakým jejich aberace přispívají ke vzniku konkrétních onemocnění. U CDK12 bylo prokázáno, že deplece tohoto proteinu ovlivňuje fosforylační status CTD a reguluje expresi specifických skupin genů, které zahrnují mimo jiné DNA-reparační geny (Bartkowiak et al., 2010; Blazek et al., 2011; Cheng et al., 2012; Liang et al., 2015; Tien et al., 2017; Dubbury et al., 2018). Nadále však chybí komplexnější analýza funkce CDK12 v buňkách včetně určení mechanismu, kterým tato kináza reguluje své cílové geny. Za použití buněčné linie exprimující analog-senzitivní formu CDK12 jsme v přiložené studii #2 provedli celogenomové analýzy po krátkodobé inhibici buněk kompetitivním analogem ATP. 
Zjistili jsme, že kinázová aktivita CDK12 je potřebná pro optimální transkripci genů kódujících základní DNA-replikační proteiny, a tudíž pro G1/S přechod v rámci buněčného cyklu. Inhibice CDK12 vyvolává snížení procesivity RNAPII, což způsobuje předčasnou terminaci některých genů, které se vyznačují především dlouhou délkou a vysokým podílem polyadenylačních signálů, včetně DNA reparačních a replikačních genů. Podařilo se nám tak ukázat nový způsob, jakým CDK12 propojuje regulaci transkripce a průběh buněčného cyklu. Ukazuje se, že CDK12 může být účinným prostředkem a prediktivním biomarkem protinádorové léčby. U pacientů s nádory vaječníků a prostaty s genetickou inaktivací CDK12 byl detekován specifický typ genomové nestability charakteristický výskytem fokálních tandemových duplikací (Popova et al., 2016; Wu et al., 2018). V přiloženém článku #3 jsme shrnuli současné poznatky o regulaci genové exprese proteinem CDK12. Posuzovali jsme také, jakým způsobem změna exprese CDK12-regulovaných genů ovlivňuje proces buněčného cyklu a možné příčiny vedoucí ke genomové nestabilitě u nádorů s mutací CDK12.

Abstract

In eukaryotic cells, transcription of protein-coding genes is directed by RNA polymerase II (RNAPII). Cyclin-dependent kinases 12 and 13 (CDK12 and CDK13, respectively) belong to a group of transcriptional CDKs that associate with elongating RNAPII, phosphorylate its C-terminal domain (CTD) and affect gene expression. Their aberrations in human cells are associated with various diseases and the kinases have become attractive targets for medical research in the last decade. However, the specific contribution of these kinases to the regulation of transcription and CTD phosphorylation is still poorly understood. The structure and substrate specificity of the human CDK12/Cyclin K (CycK) complex have been determined in vitro (Bösken et al., 2014). In the first study included in this work, we performed functional analyses of the human CDK13/CycK complex and assessed the common and specific functions of both CycK-bound complexes. We demonstrated that their CTD kinase activities in vitro are the same. However, they regulate expression of markedly different sets of genes involved in dissimilar biological processes. This points to different functions of the proteins and may help to understand the mechanism by which their aberrations contribute to the onset of specific diseases. CDK12 depletion was shown to affect bulk CTD phosphorylation and regulate expression of a specific subset of genes, including DNA repair genes (Bartkowiak et al., 2010; Blazek et al., 2011; Cheng et al., 2012; Liang et al., 2015; Tien et al., 2017; Dubbury et al., 2018). A comprehensive insight into CDK12's cellular functions and the mechanism by which CDK12 regulates expression of its target genes remain to be determined. In our study (publication #2), we performed genome-wide analyses using short-term inhibition with a competitive ATP analogue in a CDK12 analogue-sensitive cell line. We found that CDK12 kinase activity is required for optimal transcription of core DNA replication genes and thus for G1/S progression. 
CDK12 inhibition causes an RNAPII processivity defect that leads to premature termination of predominantly long, polyadenylation-site-rich genes, including DNA replication and repair genes. Thus, we provide evidence that CDK12 represents a novel link between regulation of transcription and cell cycle progression. CDK12 has emerged as a promising anti-cancer target, and CDK12 aberrations found in different types of cancers have the potential to be used as biomarkers for therapeutic intervention. CDK12 inactivation in prostate and ovarian tumours is associated with a unique genome instability phenotype characterized by the formation of focal tandem duplications (Popova et al., 2016; Wu et al., 2018). In a review attached at the end of this work, we summarized the mechanisms that CDK12 utilizes for the regulation of gene expression and discussed how the perturbation of CDK12-sensitive genes contributes to the disruption of cell cycle progression and the onset of genome instability that is observed in CDK12-mutated tumours.

Acknowledgements

My entire PhD studies including the work on this thesis were full of enriching experiences. There were many successes, as well as difficult challenges. I could never reach the end of this journey without the guidance from many teachers and scientists at the Masaryk University and help of other people. They did not hesitate to share their valuable experience, gave me a lot of their free time and provided important support. First and foremost, I would like to thank my supervisor, Dalibor, for giving me the opportunity to work on very interesting projects in a friendly, inspiring and professional environment of his lab. He showed me how much enthusiasm scientific research can involve and how to stay persistent and focused on the right scientific tracks. I am also very grateful that he was always trying to foster my critical and scientific thinking. I am deeply thankful to everyone in our lab. For countless meetings, collective lunches and coffee breaks during which they shared ideas and constructive criticism as well as for the good times outside of the lab. For their constant willingness to listen to my problems of any kind and helping me to overcome them. My special thanks go to Pavla who was always available for giving me advice in the lab and has also become a very good friend of mine. Thank you, Pavla, for all your help with my experiments and for all the life advice. I am grateful to all the people I met during my work, especially at CEITEC, and to all my collaborators from the labs of Matthias Geyer, Caroline Friedel, Lumir Krejci and Kamil Paruch. None of this would have been possible without the unconditional love and support of my family. I would like to thank my parents who have always supported me in whatever I decided to do. Thanks to them I have learnt to believe in myself. To my sisters, for the long calls and many visits in Brno full of empowering talks. I thank all of them for always showing interest in my work. 
Finally, I would like to thank my boyfriend Vojta for believing that my studies are going to finish one day. For showing me how to stay positive in any situation and for keeping me company when my schedule required working at nights.

Originální publikace a vymezení podílu autora disertační práce

Práce vychází ze 3 rukopisů autora disertace – Květa Pilařová (Pilarova K, KP).

Publikace #1 Structural and Functional Analysis of the Cdk13/Cyclin K Complex. Greifenberg AK, Hönig D, Pilarova K, Düster R, Bartholomeeusen K, Bösken CA, Anand K, Blazek D, Geyer M. Cell Rep. 2016 Jan; 14(2):320-31.

KP se podílela na přípravě designu funkčních analýz této studie. Samostatně realizovala experimenty provedené na tkáňových kulturách, konkrétně in vitro kinázové reakce s použitím purifikovaných a flag-značených proteinů, western-blotovou analýzu exprese proteinů po aplikaci flavopiridolu nebo po depleci proteinů CDK12/CDK13 pomocí siRNA a také validaci výsledků analýzy exprese genů na DNA čipech pomocí RT-qPCR.

Publikace #2 CDK12 controls G1/S progression by regulating RNAPII processivity at core DNA replication genes. Chirackal Manavalan AP, Pilarova K, Kluge M, Bartholomeeusen K, Rajecky M, Oppelt J, Khirsariya P, Paruch K, Krejci L, Friedel CC, Blazek D. EMBO Rep. 2019 Sep; 20(9):e47592.

KP se účastnila designu experimentů a přípravy textu této studie. Samostatně provedla ChIP-qPCR a western-blotovou analýzu DNA replikačních faktorů vázaných na chromatin po inhibici CDK12. Dále se podílela na experimentech zkoumajících roli kinázové aktivity CDK12 při transkripci cílových genů, konkrétně přípravou vzorků pro sekvenování jaderné RNA. KP také realizovala experimenty spojené s deplecí CycK, a to pro posouzení role tohoto proteinu při expresi CDK12-regulovaných genů.

Publikace #3 CDK12: cellular functions and therapeutic potential of versatile player in cancer. Pilarova K, Herudek J, Blazek D. NAR Cancer 2020 March; 2(1):zcaa003.

KP napsala draft publikace a pod vedením školitele a s pomocí spolupracovníků upravila text do finální podoby.

Original publications and specification of author's contribution

This PhD dissertation is based on 3 of the author's (Květa Pilařová - Pilarova K, KP) publications.

Publication #1 Structural and Functional Analysis of the Cdk13/Cyclin K Complex. Greifenberg AK, Hönig D, Pilarova K, Düster R, Bartholomeeusen K, Bösken CA, Anand K, Blazek D, Geyer M. Cell Rep. 2016 Jan; 14(2):320-31.

KP participated in the design of the functional analyses involved in this study. She performed the experiments with cell culture. Specifically, she performed the in vitro kinase assays with purified and flag-tagged proteins, protein expression analyses upon flavopiridol inhibition and CDK12/CDK13 siRNA-mediated knockdowns with western blotting and validation of the expression microarray data using RT-qPCR.

Publication #2 CDK12 controls G1/S progression by regulating RNAPII processivity at core DNA replication genes. Chirackal Manavalan AP, Pilarova K, Kluge M, Bartholomeeusen K, Rajecky M, Oppelt J, Khirsariya P, Paruch K, Krejci L, Friedel CC, Blazek D. EMBO Rep. 2019 Sep; 20(9):e47592.

KP participated in design of experiments and manuscript preparation. She analyzed binding of DNA replication factors to chromatin upon CDK12 inhibition using western blotting and ChIP-qPCR. Next, she participated in the experiments exploring the role of CDK12 kinase activity during transcription of its target genes. Specifically, she performed the sample preparation for nuclear RNA-sequencing. KP also performed the experiments related to CycK depletion to assess its impact on the expression of CDK12-regulated genes.

Publication #3 CDK12: cellular functions and therapeutic potential of versatile player in cancer. Pilarova K, Herudek J, Blazek D. NAR Cancer 2020 March; 2(1):zcaa003.

KP wrote a draft of the publication and prepared the final version with guidance from the supervisor and help from other colleagues.

Abbreviations

A alanine

AR androgen receptor

ATM ataxia telangiectasia mutated

ATR ataxia telangiectasia and Rad3-related kinase

BRCA1 breast cancer type 1 susceptibility protein

BRCA2 breast cancer type 2 susceptibility protein

BRD4 bromodomain containing 4

c-Abl Abelson tyrosine kinase

CAK CDK activating kinase

Cdc2 cell division cycle protein 2

CDC2L5 cdc2-like kinase 5

CDK(s) cyclin-dependent kinase(s)

CHDFIDD congenital heart defect, facial dysmorphism and intellectual developmental disorder

ChIP chromatin immunoprecipitation

CrkRS cdc2-related kinase with an arginine/serine-rich domain

CTD C-terminal domain

CycK cyclin K

DDR DNA damage response

DRB 5,6-dichloro-1-β-D-ribofuranosylbenzimidazole

DSIF DRB-sensitivity-inducing factor

DYRK1A dual specificity tyrosine phosphorylation regulated kinase 1A

EJC exon-junction complex

ESCs embryonic stem cells

FTDs focal tandem duplications

TDs tandem duplications

HGSOC high-grade serous ovarian carcinoma

HIV-1 human immunodeficiency virus type 1

hnRNPs heterogeneous nuclear ribonucleoproteins

HR homologous recombination

IPA intronic polyadenylation

mCRPC metastatic castration-resistant prostate cancer

MAP/ERK mitogen-activated protein/extracellular signal-regulated kinase

MAT1 ménage-a-trois protein

NELF negative elongation factor

PAF1 Pol II-associated factor

pA polyadenylation

PAS polyadenylation signal

PAXT polyA exosome targeting

PIN1 peptidyl-prolyl cis/trans isomerase, NIMA-interacting 1

PPIases peptidyl-prolyl isomerases

PR proline-rich

P-TEFb positive transcription elongation factor b

qPCR quantitative real-time polymerase chain reaction

RARS-T refractory anaemia with ringed sideroblasts associated with marked thrombocytosis

RNAPII RNA polymerase II

RPAP2 RNA polymerase II associated protein 2

RS arginine-serine

Ser2P serine 2 phosphorylation

TFIIH transcription factor IIH

Thr4P threonine 4 phosphorylation

TNBC triple-negative breast cancer

SDS-PAGE sodium dodecyl sulphate-polyacrylamide gel electrophoresis

Ser5P serine 5 phosphorylation

Ser7P serine 7 phosphorylation

snRNAs small nuclear RNAs

snoRNAs small nucleolar RNAs

SR serine-rich

SRSFs SR splicing factors

Table of contents

1. Introduction

1.1. CTD-dependent regulation of transcription

1.1.1. Posttranslational modifications of the RNAPII CTD and their role in regulation of transcription

1.1.2. Transcriptional CDKs and other CTD kinases

1.1.3. Prolyl isomerase PIN1

1.2. CDK12 and CDK13

1.2.1. Identification of new members of CDK family (CDK12 and CDK13)

1.2.2. Structure and domain composition

1.2.3. Expression, localization and role in development

1.2.4. CTD kinase activity

1.2.5. Role in mRNA processing

1.2.6. Impact of CDK12 or CDK13 depletion on gene expression

1.2.7. CDK12 and CDK13 aberrations in diseases

1.3. CDK12-dependent gene expression

1.3.1. Inhibition of transcription elongation by THZ531 and SR-4835

1.3.2. Cells with endogenous analogue-sensitive (AS) CDK12 alleles

1.4. Regulation of genome stability by CDK12

1.4.1. CDK12-dependent genome stability in cancer cell lines

1.4.2. Unique genome instability phenotype in CDK12-inactivated tumours

1.4.3. Therapeutic potential of CDK12

2. Aims of the study

3. List of used methods

4. Linking publication

5. Discussion

6. Conclusions

7. References

8. Publications

1. Introduction

RNA polymerase II (RNAPII) is an enzyme that directs transcription of all protein-coding genes and several non-coding RNA species (Wong et al., 2012; Eick and Geyer, 2013). Posttranslational modifications of the C-terminal domain (CTD) of RNAPII play a key role in regulation of the transcriptional cycle, which consists of several phases: initiation, promoter-proximal release, elongation and termination (Fuda et al., 2009; Proudfoot, 2016; Adelman and Lis, 2012; Eick and Geyer, 2013). The CTD modifications are also important for coupling of transcription with co-transcriptional processes (Bentley, 2014; Hsin and Manley, 2012). To date, several transcriptional cyclin-dependent kinases (CDKs) have been described to phosphorylate the CTD of RNAPII and play a substantial role in regulation of transcription (Egloff et al., 2012; Jeronimo et al., 2013). CDK12 and CDK13 each form a heterodimer with Cyclin K (CycK) (Bartkowiak et al., 2010; Blazek et al., 2011) and are among the latest members of this group of CTD kinases. The precise mechanism by which they influence transcription remains relatively unclear. In general, transcriptional CDKs control expression of various groups of genes and have an impact on different cellular processes including cell growth, proliferation and DNA repair (Blazek et al., 2011; Liang et al., 2015; Zhang et al., 2016; Olson et al., 2019). As critical regulators of gene expression affecting these processes, the transcription-associated CDKs have emerged as promising targets and biomarkers for cancer therapy (Lui et al., 2018; Li et al., 2019; Insco et al., 2019; Chou et al., 2020).

1.1. CTD-dependent regulation of transcription

1.1.1. Posttranslational modifications of the RNAPII CTD and their role in regulation of transcription

RNAPII consists of several subunits. The largest subunit, Rpb1, contains a C-terminal extension (also termed CTD) which is composed of tandem repeats of a heptapeptide with the consensus sequence Tyr1-Ser2-Pro3-Thr4-Ser5-Pro6-Ser7. The length of the CTD increases with organism complexity and reaches 52 repeats in mammals, with several non-consensus repeats in the distal part (Liu et al., 2010). The amino acids of the heptapeptide are subject to various modifications that are tightly regulated during the transcription cycle. Cis/trans isomerisation and phosphorylation of the CTD are especially important for productive transcription and the recruitment of RNA biogenesis and chromatin-remodelling factors (Phatnani and Greenleaf, 2006; Hanes, 2014). The coordination between transcription and RNA processing is further facilitated by the structural plasticity of the CTD and its convenient location next to the pre-mRNA exit channel of RNAPII (Cramer et al., 2001; Andrecka et al., 2008; Drogat and Hermand, 2012). Therefore, the CTD is considered a dynamic “binding platform” enabling the coupling of transcription with co-transcriptional processes (Hsin and Manley, 2012; Eick and Geyer, 2013; Bentley, 2014). The CTD is initially unphosphorylated when RNAPII associates with transcription initiation complexes. Subsequent phosphorylations at individual serines (Ser2, Ser5, and Ser7) are stage-specific (Fig. 1) and mediate the temporal recruitment of appropriate RNA processing factors. Phosphorylation at Ser5 (Ser5P) is characteristic of initiating RNAPII. It is required for establishing the pausing and is recognized by 5´ capping enzymes or the MLL1/MLL2 methyltransferases. By contrast, Ser2 phosphorylation (Ser2P) marks productive elongation and culminates at the 3´ end of transcripts, where it is recognized by transcription termination or cleavage and polyadenylation factors. Ser7 phosphorylation (Ser7P) is most prominent at the beginning of genes but, in contrast to Ser5P, remains elevated throughout the rest of the transcription cycle. The significance of Ser7P is less well understood, although some specific functions have been described. It is recognized by the CTD phosphatase RPAP2 (RNA polymerase II associated protein 2) and is required for recruitment of the Integrator complex, which plays a role in snRNA 3´ end processing (Jeronimo et al., 2013; Zaborowska et al., 2016; Harlen and Churchman, 2017). Tyr1 and Thr4 are also phosphorylated, but the functions of these modifications are much less well characterized.
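As an illustration of the heptad organisation described above, the following minimal Python sketch splits a CTD-like sequence into consecutive seven-residue windows and reports which positions deviate from the consensus Tyr1-Ser2-Pro3-Thr4-Ser5-Pro6-Ser7 (YSPTSPS in one-letter code). The function name `scan_heptads` and the toy sequence are assumptions made for this example only; the toy sequence is not the real Rpb1 CTD, and a real analysis would need to handle repeats that are not in a perfect seven-residue register.

```python
CONSENSUS = "YSPTSPS"  # Tyr1-Ser2-Pro3-Thr4-Ser5-Pro6-Ser7 in one-letter code


def scan_heptads(ctd):
    """Split a CTD-like sequence into consecutive 7-residue windows.

    Returns a list of (heptad, mismatches) tuples, where mismatches holds the
    1-based positions that differ from the consensus. Trailing residues that
    do not fill a complete heptad are ignored.
    """
    repeats = []
    usable_length = len(ctd) - len(ctd) % 7
    for start in range(0, usable_length, 7):
        heptad = ctd[start:start + 7]
        mismatches = [
            position + 1
            for position, (found, expected) in enumerate(zip(heptad, CONSENSUS))
            if found != expected
        ]
        repeats.append((heptad, mismatches))
    return repeats


# Hypothetical toy sequence: two consensus repeats followed by one
# non-consensus repeat carrying a substitution at position 7.
toy_ctd = "YSPTSPS" + "YSPTSPS" + "YSPTSPK"
result = scan_heptads(toy_ctd)
for heptad, mismatches in result:
    label = "consensus" if not mismatches else f"non-consensus at {mismatches}"
    print(heptad, label)
```

The fixed-register split is deliberately simple; it only shows how consensus and non-consensus repeats can be told apart by position.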

Figure 1: Average chromatin immunoprecipitation coupled with sequencing (ChIP-seq) profiles of phosphorylated residues of the RNAPII CTD across protein-coding genes in human cells (TSS – transcription start site, PAS – polyadenylation signal) (adapted from Harlen and Churchman, 2017).


Several other CTD modifications have been identified to date. These are O-glycosylation occurring at Thr4, Ser5 and Ser7 (Ranuncolo et al., 2012), and methylation or acetylation of some residues within the non-consensus repeats (Sims et al., 2011; Voss et al., 2015). Additionally, two peptidyl-prolyl bonds (Ser2-Pro3 and Ser5-Pro6) undergo cis/trans isomerization. More detailed information, including a list of individual CTD modifications and their functions, can be found in several reviews (Eick and Geyer, 2013; Jeronimo et al., 2013; Corden, 2013; Jeronimo et al., 2016). Overall, there is a very complex interplay between the CTD modifications during transcription. They are necessary not only for active transcription but also for the recruitment of various CTD-binding and RNA processing factors.

1.1.2. Transcriptional CDKs and other CTD kinases

Proteins in the CDK family are serine/threonine kinases that can be functionally divided into two subgroups. One group consists of kinases that are connected to cell cycle regulation (cell cycle-related CDKs) and the other of transcription-related CDKs that phosphorylate the CTD of RNAPII. Six CDKs have been identified to be involved in CTD phosphorylation along with their cyclin partners – CDK7/Cyclin H, CDK8/Cyclin C, CDK9/Cyclin T, CDK11/Cyclin L, CDK12/Cyclin K and CDK13/Cyclin K (Grünberg and Hahn, 2013; Allen and Taatjes, 2015; Peterlin and Price, 2006; Pak et al., 2015; Bartkowiak et al., 2010). How they coordinate dynamic CTD phosphorylation during the transcription cycle is only beginning to be understood.

Regulation of the kinase activity of CDKs is a two-step process. In the cyclin-free monomeric form, the CDK catalytic cleft is closed by the activation loop (termed T-loop), preventing enzymatic activity. After the cyclin is bound, the catalytic cleft is reoriented and a conserved threonine within the T-loop is phosphorylated by CDK activating kinase (CAK), which results in activation of the kinase (Malumbres et al., 2014).

CDK7 regulates transcription initiation. This kinase, along with its associated subunit cyclin H, is a component of the CAK complex (CDK7, cyclin H and MAT1), which is part of the general transcription factor TFIIH. CDK7 is considered to be responsible for most Ser5P, and it is suggested that this event primes its subsequent activity at Ser7. This is consistent with the presence of TFIIH within the preinitiation complex at gene promoters and with the occurrence of Ser5P and Ser7P marks at the beginning of the transcriptional cycle (Fig. 1) (Eick and Geyer, 2013; Jeronimo et al., 2013). CDK7 also plays a role in establishing RNAPII promoter-proximal pausing (Zaborowska et al., 2016; Ebmeier et al., 2017). Importantly, CDK7 is also a CDK-activating kinase that phosphorylates the activating threonine in the T-loop of CDKs involved in cell cycle regulation and of CDK9 (Fisher, 2005; Larochelle et al., 2012).


CDK9 is required for the release of RNAPII from promoter-proximal pausing into productive elongation (Marshall et al., 1996; Adelman and Lis, 2012). The complex of CDK9 with cyclin T is called positive transcription elongation factor b (P-TEFb). Despite its originally proposed function of mediating the majority of Ser2P, recent studies showed that it preferentially phosphorylates Ser5 and additionally Ser7 (Czudnochowski et al., 2012; Eick and Geyer, 2013). Consistently, in vivo live imaging analyses showed co-occupancy of CDK9 with the Ser5P modification (Ghamari et al., 2013). RNAPII pauses 25-60 bp downstream of the transcription start site at most protein-coding genes. This is demonstrated by a prominent peak of RNAPII at this gene region, which is accompanied by co-occupancy of CDK9 and Ser5P (Adelman and Lis, 2012; Ghamari et al., 2013). CDK9 kinase activity directs the release of RNAPII from promoter-proximal pausing by phosphorylation of the negative elongation factor (NELF) and of SPT5 within DSIF (DRB-sensitivity-inducing factor). NELF is consequently released from the transcriptional machinery and SPT5 functions as a positive elongation factor upon RNAPII promoter escape (Zhou et al., 2012; Ghamari et al., 2013; Bowman and Kelly, 2014). Thus, CDK9 regulates transcription by a dual mechanism: by phosphorylating the CTD of RNAPII and, in addition, by phosphorylating elongation-associated factors. Interestingly, since CDK7 is the activating kinase of CDK9, its kinase activity contributes to the release of paused RNAPII (Larochelle et al., 2012) and to establishing pausing at the same time.

In yeast, there are two elongation-associated CTD kinases (Bur1 and Ctk1). In human cells, CDK9 is the closest ortholog of Bur1, while there are two orthologs of Ctk1, CDK12 and CDK13 (Bartkowiak et al., 2010). Accordingly, CDK12 and CDK13 were suggested to mediate primarily Ser2P; however, their specific contributions to CTD phosphorylation in vivo are poorly understood. Recent studies indicate that the roles of CDK12 and CDK13 in regulation of CTD phosphorylation and possibly other processes may differ from those of Ctk1, which mediates the majority of Ser2P in yeast (Bowman and Kelly, 2014; Srivastava et al., 2019).

CDK8 is a component of the Mediator complex, a broadly required transcriptional coactivator which allows communication between transcription factors and RNAPII (Taatjes, 2010). CDK8 seems to be both a negative and a positive regulator of transcription (Nemet et al., 2013). It is required for efficient recruitment of the transcription elongation regulators P-TEFb and BRD4 to promoters (Donner et al., 2010). At the same time, it also negatively regulates TFIIH (Akoulitchev et al., 2000). Whether the mechanisms by which CDK8 contributes to regulation of transcription involve CTD phosphorylation remains uncertain. The CDK8 submodule exhibits distinct CTD kinase activity in vivo and in vitro. It was suggested that this can be caused by additional factors required for CTD phosphorylation in vivo (Nemet et al., 2013). Finally, CDK11 has recently been added to the transcriptional CDKs with CTD kinase activity on Ser2 (Pak et al., 2015; Gajduskova et al., in press).


Various other proteins extend the list of CTD kinases. Proto-oncogene kinase c-Abl and Polo-like kinase 3 are responsible for the phosphorylation of the remaining residues, Tyr1 and Thr4, respectively (Eick and Geyer, 2013; Jeronimo et al., 2013). The atypical kinase BRD4 (Deviah et al., 2012), DYRK1A (Di Vona et al., 2015) and some members of the MAP/ERK kinase family (ERK1/ERK2) (Eick and Geyer, 2013) have been described to mediate serine phosphorylation in vitro, yet their functional significance is not clear.

1.1.3. Prolyl isomerase PIN1

Peptidyl-prolyl isomerases (PPIases) mediate the cis/trans isomerization of the peptide bond within their substrates. PIN1 (Ess1 in yeast) affects a variety of pathways including transcription (Hanes, 2014). The protein comprises an N-terminal WW domain and a C-terminal PPIase catalytic domain that are tethered by a flexible linker (Verdecia et al., 2000; Matena et al., 2018). PIN1 mediates the isomerization of phosphorylated Ser/Thr-Pro motifs, and the CTD of RNAPII is its primary substrate in vivo (Yaffe et al., 1997; Zhang et al., 2012). The crystal structure of PIN1 bound to a phosphorylated CTD heptapeptide revealed that the majority of the contacts are mediated by the WW domain (Verdecia et al., 2000). The RNAPII CTD contains two peptidyl-prolyl bonds (Ser2-Pro3 and Ser5-Pro6) that can be targeted by PIN1 for isomerization. Since some CTD-binding proteins show cis/trans stereoselectivity (for instance the yeast 3´ end processing factors Nrd1 and Pcf11) (Singh et al., 2009), the isomerization status of CTD prolines and PIN1 activity can influence the recruitment of various proteins involved in both regulation of transcription and RNA processing. The CTD phosphatase SSU72 prefers Ser5P in combination with Pro6 in a cis-configuration (Eick and Geyer, 2013), indicating a possible role of PIN1 in the modulation of CTD phosphorylation. Indeed, yeast Ess1 has been shown to promote the activity of CTD phosphatases and dephosphorylation of the CTD at Ser5 and Ser7 (Hanes, 2014). However, analogous functional analyses in human cells are lacking. In contrast to yeast Ess1, PIN1 stimulated CTD phosphorylation in one study (Xu et al., 2013), suggesting it could have an opposite role in human cells. In summary, PIN1 regulates the structure and function of the RNAPII CTD, including its phosphorylation status, and thereby contributes to CTD-dependent regulation of transcription.


1.2. CDK12 and CDK13

1.2.1. Identification of new members of CDK family (CDK12 and CDK13)

CDK12 and CDK13 are evolutionarily conserved proteins. They belong to a group of transcription-associated CDKs that phosphorylate the CTD of RNAPII. Human CDK12 was first isolated and characterized in cDNA screens for kinases related to Cdc2 (a major cell cycle kinase in fission yeast) (Ko et al., 2001). The protein was originally named CrkRS (Cdc2-related kinase with an arginine/serine-rich (RS) domain) and, after the finding that it associates with L-type cyclins, was renamed CDK12 (Chen et al., 2006). However, the Greenleaf lab later demonstrated that CDK12 forms a complex with CycK as an elongation-associated CTD kinase in Drosophila (Bartkowiak et al., 2010), which was followed by the same finding in human cells (Blazek et al., 2011). CDK13 (originally named CDC2L5) was identified in a cDNA screen for Cdc2-related kinases, similarly to CDK12 (Marques et al., 2000). The same group also pointed out that CrkRS and CDC2L5 are highly related and suggested that they belong to a new subfamily within the Cdc2-like protein kinase family (Marques et al., 2000). CDK13 is indeed a second ortholog of yeast Ctk1 and is believed to be a close paralog of CDK12 in humans. Although both CDK12 and CDK13 interact with CycK, they form two independent complexes (CDK12/CycK and CDK13/CycK) in cells (Blazek et al., 2011). In comparison to CDK12, CDK13 is less studied and its biological functions remain elusive.

1.2.2. Structure and domain composition

Both CDK12 and CDK13 have a unique structure that is different from canonical CDKs. They share 43% overall amino acid (aa) sequence identity and a high sequence identity between the kinase domains (93%) (Kohoutek and Blazek, 2012). The largely conserved central kinase domains, consisting of about 300 aa, are extended by intrinsically disordered N- and C-terminal arms. The full-length proteins are unusually large, with molecular weights of ~160 kDa and ~170 kDa (1490 aa and 1512 aa), respectively (Fig. 2). Next to the low-complexity regions, there are several functional domains located at the extended termini. The most distinct are RS-rich motifs in the N-termini and proline-rich (PR) motifs located in both the N- and C-terminal regions. CDK13 additionally contains alanine (A)-rich and serine-rich (SR) motifs (Marques et al., 2000; Ko et al., 2001; Kohoutek and Blazek, 2012) (Fig. 2). The occurrence of RS domains is an attribute of the pre-mRNA processing factors of the SR family. These proteins often act in multi-protein complexes such as the spliceosome. The RS domains of splicing factors associate with each other and are important for the formation of protein-protein interactions required for pre-mRNA splicing (Zhong et al., 2009). PR motifs contain consensus binding sites for SH3 and WW domains. Proteins containing SH3 domains are often involved in signalling pathways regulating different processes (Mayer, 2001). Thus, the presence of the RS and PR motifs indicates that both CDK12 and CDK13 can be involved in a variety of protein-protein interactions with far-reaching impacts on cellular processes.

Figure 2: Domain composition of CDK12 and CDK13 (adapted from Pilarova et al., 2020).

The crystal structure of the CDK12/CycK complex revealed a C-terminal extension of the canonical kinase domain composed of a DCHEL motif (aa 1038-1042) and a polybasic cluster (aa 1045-1051) that interact with both the kinase domain and bound ATP (Bösken et al., 2014). The extension with the polybasic cluster is typical of CTD kinases implicated in transcription elongation (Bösken et al., 2014). Its flexibility directs ATP binding (Bösken et al., 2014; Dixon-Clarke et al., 2015) and is important for the catalytic activity of the kinase (Bösken et al., 2014; Ekumi et al., 2015). High sequence and structural similarity within ATP-binding pockets makes developing ATP-competitive inhibitors of CDKs a challenge. Therefore, the first developed CDK inhibitors were relatively non-specific and may be referred to as ‘pan-CDK’ inhibitors (Asghar et al., 2015). Flavopiridol is an ATP-competitive kinase inhibitor considered to be specific for CDKs active in transcriptional elongation (Chao et al., 2000; Baumli et al., 2008). This compound indeed inhibits CDK12 kinase activity, although with 10-fold weaker potency than in the case of CDK9 (Bösken et al., 2014). Therefore, the effects detected in flavopiridol-treated cells are attributed mostly to inactivated CDK9. Johnson and colleagues have demonstrated that CDK9 does not share the DCHEL motif in the C-terminal extension, which results in the different specificity to flavopiridol (Johnson et al., 2016). Due to the absence of selective CDK12 and CDK13 inhibitors, it is difficult to determine the specific CDK12 and CDK13 catalytic functions. Importantly, the finding that the C-terminal extension associates with the ATP substrate has created the basis for site-directed development of specific covalent inhibitors (Bösken et al., 2014).

1.2.3. Expression, localization and role in development

CDK12 and CDK13 are encoded at regions 17q12 and 7p14.1, respectively and are ubiquitously expressed in human tissues (Marques et al., 2000; Ko et al., 2001). In agreement with the presence of RS motifs they are localized in nuclear speckles (Ko et al., 2001; Chen et al., 2006; Ghamari et al., 2013; Even et al., 2006), the foci for storage and exchange of RNA processing factors (Spector and Lamond, 2011). In addition, CDK13 is a component of the perinucleolar compartment (Even et al., 2016). CDK12, CDK13 and CycK are all essential for the proper development of an embryo. Their depletions lead to embryonic lethality in mouse models (Blazek et al., 2011; Juan et al., 2016; Novakova et al., 2019). However, the molecular mechanism leading to the lethality in CDK12-depleted cells differs from the one in CDK13. In vitro cultured CDK12 -/- blastocysts have an increased and fail to undergo outgrowth of inner cell mass (Juan et al., 2016), whereas the lethality in CDK13-deficient animals is a result of heart failure (Novakova et al., 2019). CDK12 and CDK13 are consistently expressed in the brain at all embryonic stages and play a role during neuronal development, specifically in the regulation of axonal elongation and migration of late-arising cortical neurons (Chen et al., 2014; Chen et al., 2017). CycK along with CDK12 and CDK13 are also highly expressed in murine embryonic stem cells (ESCs) (Dai et al., 2012). Their depletion led to cell differentiation of the ESCs and reversely, when ESCs were induced to differentiate in vitro, their protein levels were gradually decreased. Thus, CDK12/CycK and CDK13/CycK complexes are specifically required to maintain a self-renewal in ESCs (Dai et al., 2012). In this case, the underlying molecular mechanism is not clear. It is tempting to speculate that CDK12 and/or CDK13 could directly affect transcription or phosphorylation of the factors involved in this process and thus regulate their activity. 
Importantly, the expression profiles of differentiation marker genes were different upon CDK12 or CDK13 depletion indicating that the functions of the kinases are not redundant (Dai et al., 2012).

1.2.4. CTD kinase activity

Both kinases (CDK12 and CDK13) demonstrated CTD kinase activity in vitro (Bartkowiak et al., 2010; Liang et al., 2015). They phosphorylate Ser2, which is consistent with the function of their yeast homolog Ctk1 (Blazek et al., 2011; Cheng et al., 2012; Bösken et al., 2014; Liang et al., 2015). However, distinct in vitro kinase assays showed that CDK12 and CDK13 can additionally phosphorylate Ser5, and CDK12 also Ser7 (Bösken et al., 2014; Liang et al., 2015). Bartkowiak and colleagues used a set of synthetic CTD peptides containing mutations that either eliminate or resemble a potential phosphorylation (serine to alanine and serine to glutamate, respectively) to assess the substrate specificity of CDK12. Although with different efficiency, CDK12 phosphorylated all CTD constructs, including the one with a substitution of Ser2 to alanine, suggesting that the kinase is also able to phosphorylate other serine residues (Bartkowiak et al., 2015a). These experiments led to the conclusion that CDK12 is a promiscuous CTD kinase in vitro (Bösken et al., 2014; Bartkowiak et al., 2015a). Longer-term (days) CDK12 depletion results in a modest decrease (Bartkowiak et al., 2010; Blazek et al., 2011; Cheng et al., 2012) or little change (Liang et al., 2015) in bulk Ser2P. In contrast, CDK13 has not been shown to be a CTD kinase in mammalian cells in vivo and its specificity is less characterized. Since the phosphorylation status of the CTD changes during the transcriptional cycle, the determination of the preferred phospho-CTD substrate can also indicate a role of the kinase in this process. Among different synthetic phospho-CTD peptides, CDK12 showed the highest preference for the construct pre-phosphorylated at Ser7 (Bösken et al., 2014; Bartkowiak et al., 2015a). Comparative analyses of CDK9 and CDK12 CTD kinase activities led to the suggestion that CDK12 functions downstream of CDK9 (Bösken et al., 2014; Ghamari et al., 2013). This is consistent with a model proposed by Yu and colleagues in which CDK9 is a critical factor for recruitment of PAF1 (Pol II-associated factor) to the genes, and this factor is subsequently required for recruitment of CDK12 (Yu et al., 2015). Again, no analogous experiments have been performed using CDK13 kinase.
Thus, its kinase activity and substrate specificity towards CTD remain unknown. Co-factors and especially CAK can additionally modulate the kinase activities. Although the mass spectrometry analyses confirmed the phosphorylation of CDK12 at conserved threonine upon co-expression with CAK from different species (Bösken et al., 2014; Dixon-Clarke et al., 2015), it is still not clear how the activities of both CDK12 and CDK13 are regulated in vivo. Another important unanswered question is how the N- and C-terminal parts of the proteins potentially contribute to these processes.
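The substitution strategy described above for the synthetic CTD peptides can be sketched in a few lines of Python. This is only an illustration: it builds CTD-like sequences from the consensus heptad repeat (Tyr1-Ser2-Pro3-Thr4-Ser5-Pro6-Ser7) in which one serine position is replaced by alanine (eliminating a potential phosphorylation) or by glutamate (resembling it). The function names and the number of repeats are hypothetical and do not reproduce the actual peptide constructs used in the cited studies.

```python
# Consensus heptad repeat of the RNAPII CTD: Tyr1-Ser2-Pro3-Thr4-Ser5-Pro6-Ser7
CONSENSUS_HEPTAD = "YSPTSPS"


def substitute(heptad, position, residue):
    """Replace the serine at a 1-based heptad position (2, 5 or 7)."""
    if heptad[position - 1] != "S":
        raise ValueError(f"position {position} is not a serine")
    return heptad[:position - 1] + residue + heptad[position:]


def ctd_peptide(repeats, position=None, residue=None):
    """Build a peptide of identical heptads, optionally with one substitution.

    'repeats' is a hypothetical parameter; real constructs may differ.
    """
    if (position is None) != (residue is None):
        raise ValueError("give both position and residue, or neither")
    unit = CONSENSUS_HEPTAD
    if position is not None:
        unit = substitute(unit, position, residue)
    return unit * repeats


if __name__ == "__main__":
    print(ctd_peptide(3))          # unmodified consensus peptide
    print(ctd_peptide(3, 2, "A"))  # Ser2 -> Ala, cannot be phosphorylated
    print(ctd_peptide(3, 7, "E"))  # Ser7 -> Glu, resembles phosphorylation
```

A peptide carrying the Ser2-to-alanine substitution still contains Ser5 and Ser7 in every repeat, which is why residual phosphorylation of such a construct points to activity towards other serine residues.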

1.2.5. Role in mRNA processing

CDK12 and CDK13 may affect mRNA processing indirectly through regulation of the phospho-state of the RNAPII CTD, since it serves as a binding platform for a variety of RNA processing factors. Moreover, the localisation of CDK12 and CDK13 in nuclear speckles, which is dependent on the presence of the N-terminal sequences containing RS domains, indicates a role of the proteins in mRNA processing (Ko et al., 2001; Even et al., 2006). Proteomic analyses of CDK12- and CDK13-associated complexes revealed an enrichment for numerous RNA-processing factors including spliceosome factors (for instance SR splicing factors (SRSFs)), 3´ end processing factors, hnRNPs and components of the exon-junction complex (EJC) (Eifler et al., 2015; Bartkowiak et al., 2015a; Liang et al., 2015; Tien et al., 2017). Eifler and colleagues demonstrated that the interactions with SRSFs are mediated by the RS domain in CDK12 (Eifler et al., 2015). Although the functional significance of these interactions remains to be explored, these data support the idea that CDK12 and CDK13 can affect mRNA processing by a dual mechanism: 1) indirectly by phospho-CTD-dependent recruitment of splicing and 3´ end processing components and 2) directly through physical interactions with RNA processing factors. CDK12 was shown to regulate splicing of the E1a reporter gene, SRSF1 and tissue-specific Neurexin IV genes (Chen et al., 2006; Liang et al., 2015; Rodrigues et al., 2012), and also 3´ end processing of c-MYC and c-FOS genes (Eifler et al., 2015; Davidson et al., 2014). Another study demonstrated that CDK12 modulates alternative last exon splicing, a specialized subtype of alternative mRNA splicing (Tien et al., 2017). However, the depletion of CDK12 coupled with a splicing-sensitive microarray or RNA-sequencing (RNA-seq) did not show any global splicing defects (Blazek et al., 2011; Tien et al., 2017; Dubbury et al., 2018). CDK13 was shown to affect constitutive and alternative splicing via the RS domain in vivo (Even et al., 2006) and pre-mRNA splicing of HIV-1 (Berro et al., 2008).

1.2.6. Impact of CDK12 or CDK13 depletion on gene expression

The role of CDK12 in the regulation of gene expression is better characterized than the role of CDK13. Long-term depletion of CDK12 leads to downregulation of only a subset of genes. The kinase controls expression of long genes, mainly functioning in DNA repair, cell cycle, DNA replication and RNA processing and splicing (Blazek et al., 2011; Liang et al., 2015; Tien et al., 2017; Dubbury et al., 2018). Importantly, CDK12 contributes to maintenance of genome stability via regulation of transcription of the DNA repair genes (Blazek et al., 2011). Thus, aberrant CDK12 function may impact several crucial cellular processes. Long DNA-repair genes functioning in the homologous recombination (HR) pathway belong to the most sensitive group of CDK12-regulated genes. The inducible depletion of CDK12 from mouse ESCs revealed a global increase of intronic polyadenylation (IPA) events, which results in the production of shortened transcripts (Dubbury et al., 2018). HR genes globally contain more IPA sites than other expressed genes. Their higher occurrence and cumulative effect could explain the enhanced sensitivity to CDK12. However, this model does not answer the question of why HR genes are more sensitive than other expressed genes with the same number of IPAs. It indicates the presence of additional factors contributing to CDK12´s target selectivity. While we were preparing our manuscript (Greifenberg et al., 2016), Liang and colleagues demonstrated that loss of CDK12 and CDK13 preferentially affects expression of DNA damage response genes and snoRNA genes, respectively. CDK13 was also required for expression of genes functioning in energy metabolism in the mitochondrion. The loss of expression of RNA processing factors was, on the other hand, a shared characteristic of both CDK12 and CDK13 depletions (Liang et al., 2015). This observation led to the suggestion that CDK12 and CDK13 could affect transcription partially by regulating the expression of factors involved in RNA biogenesis (Liang et al., 2015). Taken together, both kinases control expression of subsets of genes involved in various cellular processes (Blazek et al., 2011; Liang et al., 2015; Tien et al., 2017). They represent gene-specific CDKs with a predominantly positive effect on gene expression, since the majority of the affected genes are downregulated upon their depletion. This is markedly different from the function of CDK9, which regulates RNAPII promoter-proximal pausing and transcription of most genes (Chao and Price, 2001; Gressel et al., 2017).

1.2.7. CDK12 and CDK13 aberrations in diseases

Aberrations in CDK12 and CDK13 result in various diseases. Since CDK12 and CDK13 differ mainly at their N- and C-termini, these regions are likely to be important for their specific functions. Genomic alterations in CDK12 were documented in about 30 tumour types with an incidence of up to 15% of sequenced cases, with molecular consequences best studied in ovarian, breast and prostate cancers (Lui et al., 2018). CDK12 can act as both a tumour suppressor and an oncogene, and the functional outcome of CDK12 aberrations is case- and context-dependent (Tien et al., 2017; Lui et al., 2018; Paculova and Kohoutek 2017). In contrast, mutations in CDK13 have been connected with a congenital heart defect, facial dysmorphism and intellectual developmental disorder (CHDFIDD) (Sifrim et al., 2016; Hamilton et al., 2018; Bostwick et al., 2017). Increased CDK13 levels were found in patients with RARS-T (refractory anaemia with ringed sideroblasts associated with marked thrombocytosis), which is a haematological malignancy with features of both a myeloproliferative and a myelodysplastic disorder (Malcovati et al., 2009). A recent study also demonstrated a role of CDK13 in melanomas (Insco et al., 2019). In conclusion, both CDK12 and CDK13 are currently attractive targets for biomedical research.

1.3. CDK12-dependent gene expression

The finding that CDK12 maintains genome stability via transcriptional regulation of key DNA repair genes (Blazek et al., 2011) sparked research interest in its cellular functions. Genetic depletion of CDK12 indicates that the full-length protein is required for transcription of a subset of genes. However, it remains unclear how CDK12 kinase activity regulates this process. The recent availability of selective dual CDK12/CDK13 inhibitors (THZ531 and SR-4835) and the use of novel experimental tools (analogue-sensitive CDKs) have contributed to our knowledge of CDK12 catalytic functions. They also helped to elucidate its role during transcription elongation.

1.3.1. Inhibition of transcription elongation by THZ531 and SR-4835

THZ531 (Fig.3) is a covalent inhibitor of CDK12 and CDK13 that irreversibly targets remote cysteine residues 1039 and 1017 within the C-terminal extensions of CDK12 and CDK13, respectively (Zhang et al., 2016). It is a highly selective and potent inhibitor of both kinases (determined IC50 values: 158 nM and 69 nM for CDK12 and CDK13, respectively, in comparison to 10.5 µM and 8.5 µM for CDK9 and CDK7, respectively) (Zhang et al., 2016). The novel compound SR-4835 functions as a competitive inhibitor. It belongs to the N9 heteroaromatic purine analogues and exhibits identical selectivity and similar potency to THZ531 (Quereda et al., 2019) (Fig.3).
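The selectivity window implied by these IC50 values can be made explicit with simple arithmetic. The short Python sketch below, offered only as an illustration, converts the reported values to a common unit (nM) and computes the fold difference between each off-target kinase (CDK9, CDK7) and each target kinase (CDK12, CDK13). The dictionary and function names are hypothetical; only the IC50 values are taken from the text (Zhang et al., 2016).

```python
# IC50 values for THZ531 as quoted in the text (Zhang et al., 2016), in nM.
IC50_NM = {
    "CDK12": 158.0,
    "CDK13": 69.0,
    "CDK9": 10.5e3,  # 10.5 µM
    "CDK7": 8.5e3,   # 8.5 µM
}


def fold_selectivity(target, off_target, ic50=IC50_NM):
    """Return how many times higher the off-target IC50 is than the target IC50."""
    return ic50[off_target] / ic50[target]


if __name__ == "__main__":
    for target in ("CDK12", "CDK13"):
        for off_target in ("CDK9", "CDK7"):
            ratio = fold_selectivity(target, off_target)
            print(f"{target} vs {off_target}: {ratio:.0f}-fold")
```

With these values, THZ531 is roughly 54- to 66-fold more potent against CDK12, and roughly 123- to 152-fold more potent against CDK13, than against CDK7 and CDK9.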

Figure 3: The structures of the potent dual CDK12/CDK13 inhibitors THZ531 and SR-4835 (adapted from Zhang et al., 2016; Quereda et al., 2019).

Analyses of gene expression profiles of CDK12/CDK13-inhibited cells did not show any global transcriptional defects (Zhang et al., 2016; Iniguez et al., 2018; Krajewska et al., 2019; Quereda et al., 2019). Only subsets of genes were affected, similarly to long-term CDK12 or CDK13 depletion (Blazek et al., 2011; Liang et al., 2015; Tien et al., 2017; Dubbury et al., 2018). At high concentrations of THZ531, leading to complete CDK12 and CDK13 inhibition (>200 nM), the most sensitive genes belong to the group of transcription factor genes associated with super-enhancers (Zhang et al., 2016). Importantly, DNA repair genes are already sensitive to low THZ531 and SR-4835 doses (<100 nM) consistently in different cancer cell lines (Zhang et al., 2016; Iniguez et al., 2018; Krajewska et al., 2019; Quereda et al., 2019). This suggests a link between the DNA-repair pathway and transcription that is specific to CDK12, as other CDKs examined in previous research were not shown to affect expression of DNA repair genes (Liang et al., 2015; Zhang et al., 2016). Studies using THZ531 have also provided the first insights into the mechanism that CDK12 utilizes for regulation of gene expression. RNA-seq coupled with ChIP-seq analyses revealed a dose-dependent loss of RNAPII and its Ser2P form near the 3´ ends of THZ531-responsive genes, which indicates a loss of RNAPII processivity across these genes (Zhang et al., 2016). Another study demonstrated that inhibition by THZ531 results in gene length-dependent elongation defects. THZ531 induced premature cleavage and polyadenylation of long genes (>45 kb), leading to a shortening of the transcripts and loss of their expression (Krajewska et al., 2019). The effects on premature termination were suggested to be CDK12-specific, since only CDK12- (and not CDK13-) depleted cells displayed this phenotype.
Apart from long gene length, a high number of cryptic polyadenylation (pA) sites, lower GC content and a lower ratio of U1 snRNA binding to pA sites were identified as additional determinants of genes sensitive to CDK12 inhibition. These features, particularly the increased number of cryptic pA sites, are especially prominent among DNA damage response genes. In agreement, treatment with SR-4835 in breast cancer cells led to the common suppression of DNA repair genes and promoted cleavage of the transcripts at intragenic pA sites (Quereda et al., 2019). CDK12 is present at promoters and gene bodies of protein-coding genes (Zhang et al., 2016). Its occupancy largely overlaps with that of RNAPII, supporting the idea that CDK12 can travel with RNAPII (Davidson et al., 2014). Little is known about its recruitment to transcriptionally active sites. The PAF1C-mediated recruitment (Yu et al., 2015) supports a role of CDK12 as a general transcription factor, but does not explain its gene-specific role in human cells. Hoshii and colleagues demonstrated that SETD1A, a histone methyltransferase, recruits CycK to the promoters of DNA damage response genes (Hoshii et al., 2018). However, the functional relevance of the SETD1A-mediated recruitment of CDK12 to gene promoters remains to be determined.

In summary, the use of selective dual CDK12/CDK13 inhibitors leads to inhibition of transcription elongation of a subset of genes. Downregulation of genes involved in the DNA damage response pathway seems to be a specific effect of deficient CDK12 catalytic activity. Treatment with THZ531 and SR-4835 causes premature termination of transcription at the sensitive genes. Importantly, the anti-proliferative effect of these inhibitors observed in different cancer cell lines (Zhang et al., 2016; Iniguez et al., 2018; Krajewska et al., 2019; Quereda et al., 2019) suggests that CDK12 is a viable target for cancer therapy.

1.3.2. Cells with endogenous analogue-sensitive (AS) CDK12 alleles

As described above, it is challenging to develop highly selective inhibitors of the CDKs due to the high similarity between their ATP binding sites. The use of analogue-sensitive (AS) CDKs provides an alternative approach for investigating their catalytic functions (Blethrow et al., 2004). It allows specific, direct and short-term inhibition of the modified kinase. Analogue-sensitive kinases are created by mutating a large phenylalanine residue (termed the “gatekeeper”) in close proximity to the enzyme´s active centre to a much smaller glycine (Fig.4). Only the mutated enzyme is then able to accept bulky adenine analogues (e.g. 1-NM-PP1 or 3-MB-PP1) in its active centre, which leads to inhibition of its kinase activity (Blethrow et al., 2004). Bartkowiak and Greenleaf identified phenylalanine 813 in CDK12 as a suitable “gatekeeper” residue for creating the AS CDK12 (Bartkowiak et al., 2015a) (Fig.4). They engineered a mammalian CDK12 heterozygote cell line with one allele containing a functional copy of the AS CDK12 variant and one containing a premature stop codon. Inhibition of CDK12 in this cell line led to perturbations in Ser2 and Ser5 CTD phosphorylation, a decreased level of BRCA1 mRNA and proliferation defects (Bartkowiak et al., 2015b).

Figure 4: Schema depicting the mechanism of preparation of AS CDK12. Phenylalanine (F) as the “gatekeeper” residue and glycine (G) are indicated in red (adapted from Manavalan et al., 2019).

1.4. Regulation of genome stability by CDK12

1.4.1. CDK12-dependent genome stability in cancer cell lines

Genes that are most sensitive to CDK12 inhibition or depletion (such as BRCA1, BRCA2, ATM, ATR, RAD51 and Fanconi anaemia) function in the HR repair pathway (Blazek et al., 2011; Bajrami et al., 2014; Ekumi et al., 2015; Juan et al., 2016; Zhang et al., 2016; Tien et al., 2017; Dubbury et al., 2018). The finding that CDK12 controls the optimal expression of genes involved in cellular processes whose disruption triggers DNA damage suggests a function of the kinase in maintenance of genome stability. Indeed, alterations in CDK12 generate endogenous DNA damage, genome instability and a HR-deficient phenotype in cancer cell lines (Blazek et al., 2011; Bajrami et al., 2014; Joshi et al., 2014; Ekumi et al., 2015). CDK12 has impact on cellular proliferation and cell cycle progression (Bartkowiak et al., 2015b; Zhang et al., 2016; Lei et al., 2018; Dubbury et al., 2018), which may further affect the genome stability. Depletion of CycK and CDK12 was shown to prevent the assembly of pre-replicative complex in via diminished CycK-dependent phosphorylation of cyclin E1 (Lei et al., 2018). This suggested a novel role of CDK12/CycK complex in mediating crosstalk between DNA replication and gene transcription. A recent study has also implicated CDK12 in translational regulation of a subset of mRNAs. Its targets involved checkpoint kinase 1 (a key regulator of G1/S and G2/M progression) and components and regulators of spindle assembly checkpoint (Choi et al., 2019a). Consequently, disruption of the CDK12-dependent translation network caused mitotic defects including chromosome misalignments or spindle pole detachment (Choi et al., 2019a). Depletion of CycK also decreased the expression of leading to the onset of Aurora B kinase-dependent mitotic catastrophe (Schecher et al., 2017). Thus, it can be concluded that aberrant function of CDK12/CycK complex results in severe mitotic defects which may significantly contribute to CDK12- dependent genome instability.

1.4.2. Unique genome instability phenotype in CDK12-inactivated tumours

Recurrent mutations in CDK12 have been documented in various cancers including high-grade serous ovarian cancer (HGSOC) (Cancer Genome Atlas Research Network, 2011), triple-negative breast cancer (TNBC) (Fusco et al., 2016) and metastatic castration-resistant prostate cancer (mCRPC) (Robinson et al., 2015). Analysis of ovarian cancer tumours found that CDK12 mutations are predominantly homozygous. Loss of the wild-type allele occurred in the majority of the CDK12-mutated cases, which suggests a tumour suppressive role of the kinase (Carter 2012). CDK12 mutations also tend to be mutually exclusive with mutations in other HR genes (Bajrami et al., 2014), indicating that they are a primary cause of the HR-deficient phenotype. Indeed, CDK12 mutations found in HGSOC disrupt its kinase activity, prevent formation of the CDK12/CycK complex and disturb the HR repair capability of ovarian cancer cell lines (Joshi et al., 2014; Ekumi et al., 2015). Analyses of ovarian and prostate tumours with biallelic CDK12 inactivation revealed a specific genome instability phenotype characterized by a high number of gains over the genome (Popova et al., 2016; Wu et al., 2018; Quigley et al., 2018). The unusual size (greater than 10%) of increased genomic content was due to hundreds of focal tandem duplications (FTDs) with a bi-modal size distribution (∼0.3–0.5 Mb and ∼2.5–3.0 Mb) and occurrence especially in gene-rich regions (Wu et al., 2018) (Fig.5).

Figure 5: The schema demonstrates a form of the focal tandem duplications with indicated sizes of duplicated segments in ovarian and prostate cancers carrying CDK12 inactivation.

Tandem duplications (TDs) are a regular structural rearrangement in ovarian cancer genomes and have been associated with mutations in TP53 and BRCA1 genes (Menghi et al., 2016). However, FTDs observed in CDK12-inactivated tumours are distinct from duplications in BRCA1-deficient, Cyclin E1-amplified and other HR/DNA repair-deficient tumours (Popova et al., 2016; Wu et al., 2018; Menghi et al., 2018). In contrast to depletion or inhibition of CDK12 in cancer cell lines, CDK12-inactivated ovarian tumours are not associated with characteristic genomic signatures of HR-deficiency (Vanderstichele et al., 2017). The tumour gene expression profiles are distinct from those of HR-deficient tumours, and the decreased expression of HR genes including BRCA1/2 is not observed in ovarian and prostate CDK12-inactivated tumours (Ekumi et al., 2015; Popova et al., 2016; Wu et al., 2018; Menghi et al., 2018). Thus, CDK12-inactivated ovarian and prostate tumours have a unique genome instability phenotype (termed the CDK12 TD+ phenotype) characterized by large-span FTDs (Popova et al., 2016; Wu et al., 2018). The mechanism leading to the genesis of such large FTDs is not clear. However, it cannot be attributed only to defects in the HR repair pathway. Since expression of some CDK12-dependent HR genes (such as BRCA1) is not decreased in the tumours, there must be some compensatory mechanisms for their expression. Finally, very recent analyses have revealed the presence of FTDs in many other cancers with a low incidence (<2%) of CDK12 inactivation (Sokol et al., 2019), which suggests that FTDs also occur in cancers other than ovarian and prostate cancers.

1.4.3. Therapeutic potential of CDK12

Therapeutic potential of CDK12 is linked to its function in transcription regulation. Mutations disrupting HR-mediated repair sensitize cells to poly-(ADP-ribose) polymerase inhibitors (PARPi), a phenotype known as “BRCAness” (Lord and Ashworth, 2016 and 2017). Mechanistically, PARPi-treated cells with disrupted HR pathway have to rely on the other DNA damage response pathways such as non-homologous end joining that are more error-prone, which may result in the cellular lethality (Patel, 2011). Clinical use of these inhibitors has been approved in ovarian and breast cancers with germ-line BRCA1/2 mutations (Drean et al., 2016). The finding that CDK12 is a critical transcriptional regulator of HR repair genes suggested its possible therapeutic use as a biomarker or inhibition target in anti-tumour treatment. Indeed, CDK12 was identified as another determinant of PARPi sensitivity in genome-wide screen (Bajrami et al., 2014) and silencing of CDK12 significantly increased the sensitivity of ovarian cancer cells to PARPi and platinum-based drugs (Bajrami et al., 2014; Joshi et al., 2014). CDK12 inhibition also showed a synergic effect with PARPi and various DNA-damaging agents in sensitization of TNBC cells and Ewing sarcoma cells (Johnson et al., 2016; Iniguez et al., 2018; Quereda et al., 2019). Another study proposed that mCRPC patients carrying biallelic CDK12 inactivation could benefit from immune checkpoint inhibition. This is due to CDK12-induced high neoantigen load and T-cell infiltration (Wu et al., 2018), which are suggested determinants of sensitivity to this type of therapy (Le et al., 2017). Mismatch repair deficient tumours associated with large proportion of mutant neoantigens have been shown to be highly sensitive to the therapy (Le et al., 2017). 
There are several ongoing clinical trials conducted on patients with CDK12-inactivated tumours evaluating their response to immune checkpoint inhibition therapy or treatment with PARPi (ClinicalTrials.gov, online). In addition, CDK12 was suggested to have an oncogenic function, which is also linked to its role in the regulation of transcription. According to the concept of transcriptional addiction, tumours driven by oncogenes such as MYC and EWS/FLI (Ewing´s sarcoma fusion protein) are highly dependent on transcriptional programmes converging on RNAPII and on DNA damage response gene expression to facilitate rapid replication (Yang et al., 2000; O’Connor 2015; Bradner et al., 2017). Thus, CDK12, as a transcriptional coactivator of gene expression and regulator of DNA replication and repair genes, can improve the fitness of cancer cells (O’Connor 2015). In agreement, the overexpression of c-MYC depends on CDK12. Several studies have also demonstrated the potential anti-tumour effect of CDK12 inhibition in MYC-dependent cancers (Toyoshima et al., 2012; Delehouze et al., 2014; Zeng et al., 2018) and Ewing sarcoma cells harbouring an EWS/FLI fusion (Iniguez et al., 2018). The CDK12 gene is located within the smallest region of the HER2 amplicon and is often co-amplified with HER2 in breast tumours (Sircoulomb et al., 2010; Mertins et al., 2016). Amplification of CDK12 is associated with increased expression and phosphorylation of the protein, which suggests a high activity of CDK12 in this type of tumour (Mertins et al., 2016; Naidoo et al., 2018). Other analyses showed that CDK12 overexpression correlates with high tumour grade and worse clinical outcomes including poor survival and disease recurrence (Capra et al., 2006; Naidoo et al., 2018; Choi et al., 2019b). CDK12 has also been demonstrated to promote tumour initiation of cancer stem cells and induce anti-HER2 therapy resistance in breast cancer (Choi et al., 2019b). Thus, several pieces of evidence indicate that CDK12 has a driving oncogenic function in HER2- and CDK12-amplified tumours. Altogether, CDK12 activity has been reported to affect sensitivity of cells in different tumour backgrounds. These findings have led to a growing interest in CDK12 as a potential biomarker and inhibition target in anti-tumour treatments.

2. Aims of the study

The main aim of the study was to investigate cellular functions of CDK12 and CDK13, especially in relation to regulation of gene expression and genome stability via phosphorylation of the CTD of RNAPII.

The specific aims of this study were:

– To characterize CDK13 kinase activity towards RNAPII CTD in vitro and in vivo.
– To explore the possible role of PIN1 in modulation of CDK12 and CDK13 kinase activities.
– To define CDK12- and CDK13-regulated genes.
– To investigate a role of CDK12 catalytic activity in the regulation of transcription and other cellular processes using AS CDK12 HCT116 cell line.
– To summarize mechanisms that CDK12 utilizes for the regulation of gene expression and discuss how perturbations in CDK12-regulated gene expression contribute to the onset of genome instability observed in CDK12-inactivated tumours.

3. List of used methods

Cell culture, siRNA-mediated knockdown, plasmid DNA transfection, cell synchronization (serum starvation and thymidine-nocodazole)
Standard molecular biology techniques – isolation of RNA, reverse transcription, qPCR, SDS-PAGE, immunoblotting, immunoprecipitation
Nuclear fractionation
Flow cytometry
In vitro kinase assay
Nuclear total RNA-seq
Chromatin immunoprecipitation

4. Linking publications

Publication #1 characterizes structural and functional properties of CDK13/CycK complex. We determined a crystal structure and substrate specificity of the complex. Furthermore, we analyzed gene expression changes upon siRNA-mediated CDK12 and CDK13 knockdowns.

In publication #2, a chemical genetic approach is used to specifically and acutely inhibit endogenous CDK12 to explore its direct and catalytic functions in different cellular processes. The study demonstrates a tight interplay between CDK12 kinase activity, expression of DNA replication genes, cell cycle progression, and genome stability. We also elucidate the mechanism which causes the decreased expression of CDK12-regulated genes.

Publication #3 summarizes the cellular functions of CDK12 with emphasis on its transcription-related functions. We consider the different roles of CDK12 in the maintenance of genome stability in cell lines and tumours. We also discuss the therapeutic potential of CDK12 as a predictive biomarker and as an inhibition target in anti-tumour treatments.

5. Discussion

Human CDK12 and CDK13 form two separate complexes with CycK (Blazek et al., 2011). In collaboration with Geyer's lab, we determined the structure of the CDK13/CycK complex and performed comparative functional analyses of CDK12 and CDK13 in relation to phosphorylation of the RNAPII CTD and regulation of gene expression (publication #1). Due to the absence of specific CDK12 inhibitors, we inhibited AS CDK12 with an ATP analogue to gain comprehensive insight into its target genes and the cellular processes it affects (publication #2).

The structure of the CDK13/CycK complex, solved by Geyer's lab, shows that CDK13 exhibits the typical kinase fold consisting of N-terminal and C-terminal lobes. Analogous to CDK12, CDK13 contains a C-terminal kinase extension helix (aa 999-1032) (Fig. 6) composed of a polybasic cluster and a DCHEL motif interacting with bound ATP (Bösken et al., 2014; publication #1). Since a similar C-terminal extension was also found in CDK9 (Baumli et al., 2012), this conformation appears to be a common molecular feature of CTD kinases implicated in transcription elongation. Importantly, knowledge of the C-terminal extension has created a unique opportunity for the development of specific inhibitors (publication #1; Zhang et al., 2016).

Figure 6: Domain composition of CDK13 with depicted C-terminal extension (Pilarova et al., 2020).

CDK12 and CDK13 phosphorylate both Ser5 and Ser2 of the RNAPII CTD identically in vitro (publication #1). In agreement, previous studies have demonstrated the ability of the kinases to phosphorylate these residues (Blazek et al., 2011; Cheng et al., 2012; Bösken et al., 2014; Liang et al., 2015). CDK13 displayed the highest activity on the CTD substrate pre-phosphorylated at Ser7, whereas the non-phosphorylated CTD peptide was poorly recognized (publication #1), which is again reminiscent of CDK12 substrate specificity (Bösken et al., 2014). These findings suggest that both CDK12 and CDK13 are promiscuous CTD kinases in vitro and that they function during elongation, once the CTD has already been phosphorylated by another kinase.

Protein kinases can be specific for the proline isomeric state (Lu et al., 2002). Since PIN1 was shown to stimulate CTD phosphorylation (Xu et al., 2013), we tested the possibility that PIN1 affects the CTD phosphorylation mediated by CDK12, CDK13 and also CDK9. However, the presence of PIN1 did not cause any significant changes in the preference of the kinases for CTD phosphorylation, which was also observed in another study (publication #1; Bartkowiak et al., 2015a). Thus, we did not find any evidence that PIN1 modulates the activity of CTD kinases in vitro. At the same time, we cannot rule out that additional factors affect the enzymatic activities of these proteins in vivo.

Flavopiridol is widely used as an inhibitor with some specificity for the transcription elongation kinase CDK9 (Chao et al., 2000; Baumli et al., 2008). We tested the potency of flavopiridol towards CDK13 and found that this compound is a rather poor inhibitor of CDK13, possibly due to steric hindrance by histidine residue 1018 of the DCHEL motif within the C-terminal extension (publication #1). Thus, the presence of the DCHEL motif in the CDK13 and CDK12 C-terminal extensions may result in decreased specificity of flavopiridol for these kinases in comparison to CDK9 (publication #1; Bösken et al., 2014). Additionally, we found that flavopiridol also inhibits another transcription-associated kinase, CDK7 (publication #1), although with less potency than CDK9. Given that CDK7 functions as a CAK for other CDKs, this indicates a broad impact of flavopiridol on different cellular processes including transcription. In agreement, flavopiridol strongly diminishes the phosphorylation levels of all three serine residues within the CTD, in contrast to the very small changes caused by depletion of either CDK12 or CDK13 (publication #1).
Longer-term CDK12 and CDK13 depletion did not cause significant changes in bulk CTD phosphorylation, only at most a modest decrease in bulk Ser2P, especially upon depletion of CDK12 (publication #1; Bartkowiak et al., 2010; Blazek et al., 2011; Cheng et al., 2012; Liang et al., 2015). Short-term inhibition of the kinases with THZ531 led to a dose-dependent decrease in Ser2P and Thr4P, but only at higher doses (>200 nM) (Zhang et al., 2016; Krajewska et al., 2019). We did not observe any substantial changes in CTD phosphorylation upon selective inhibition of AS CDK12 by the ATP analogue, only a slight decrease in Ser5P and Ser7P and, surprisingly, a slight accumulation of Ser2P (publication #2), which is consistent with another study using AS CDK12 (Bartkowiak et al., 2015b). This discrepancy could be explained by the dual specificity of THZ531 (and possible additional off-targets at higher concentrations of the inhibitor; Zhang et al., 2016) and/or by a different mechanism of inhibitor action (i.e. competitive inhibition by the ATP analogue in contrast to complete kinase shut-off by covalent THZ531). Given that CDK12 and CDK13 are gene-specific kinases affecting only a subset of genes (publication #1; publication #2; Blazek et al., 2011; Liang et al., 2015; Zhang et al., 2016; Tien et al., 2017; Dubbury et al., 2018; Krajewska et al., 2019; Quereda et al., 2019), only slight changes in bulk CTD phosphorylation can be expected in total cell lysates. Additionally, a possible partial mutual redundancy between individual CTD kinases can complicate the interpretation of the data. In conclusion, the results suggest that inhibition of CDK12 and CDK13 results in subtle changes in CTD phosphorylation. The precise contribution of individual kinases to the phosphorylation of specific residues remains controversial, and more suitable methods will be required to answer this question, for example mass spectrometric analyses (Schüller et al., 2016) coupled with the use of highly selective inhibitors.

To identify genes specifically sensitive to CDK12 inhibition, we performed nuclear RNA-seq on AS CDK12 cells inhibited by the ATP analogue 3-MB-PP1 (publication #2). Consistent with studies using THZ531, CDK12 kinase activity was found to be a key regulator of DNA repair and replication genes (publication #2; Zhang et al., 2016; Krajewska et al., 2019). In contrast, CDK13 was important for the expression of snoRNA genes and of genes involved mainly in growth signalling pathways and in mitochondrial energy metabolism (publication #1; Liang et al., 2015). Thus, we showed that despite the high aa sequence similarity between their kinase domains (93%), CDK12 and CDK13 regulate the expression of markedly different sets of genes involved in dissimilar biological processes (publication #1; publication #2). The non-redundant roles of the proteins are also indicated by other studies demonstrating their different roles in embryogenesis, development and differentiation (Dai et al., 2012; Juan et al., 2016; Novakova et al., 2019).

To elucidate the mechanism by which CDK12 regulates its target genes, we performed ChIP-seq of total and modified RNAPII upon inhibition of AS CDK12, coupled with RNA-seq (publication #2).
We and others determined that inhibition of CDK12 leads to a shortening of transcripts, which is caused by premature termination on the regulated genes as a result of a decreased elongation rate and a defect in RNAPII processivity (Fig. 7) (publication #2; Zhang et al., 2016; Krajewska et al., 2019). This was accompanied by shifts of RNAPII-Ser2P ChIP-seq peaks from the 3′ ends to the bodies of individual CDK12-sensitive genes, approximately to the positions where transcription was lost and premature termination occurred (Fig. 7) (publication #2). Of note, our observations were reminiscent of those from cells inhibited with low concentrations (50 nM) of THZ531 (Zhang et al., 2016). Analyses of CDK12-sensitive genes revealed greater gene length, a high number of cryptic pA sites, lower GC content and a lower ratio of U1 snRNA binding to pA sites as the main determinants of gene sensitivity to CDK12 inhibition (Fig. 7) (publication #2; Krajewska et al., 2019).

Figure 7: Transcriptional defect detected in CDK12-inhibited cells. The CTD of RNAPII is illustrated as a light blue oval, with the circled numbers two, five and seven representing the individual serine residues. The aberrant increase of the RNAPII Ser2P signal is depicted by the yellow circle with the number two (adapted from Pilarova et al., 2020).

The suppression of premature cleavage and polyadenylation activity has been connected to U1 snRNP (Berg et al., 2012). U1 snRNP was also recently shown to be important for sustaining long-distance transcription elongation of large genes with cell cycle progression and developmental functions, including DNA repair genes (Oh et al., 2017). Defective U1 snRNP induced a shift to the usage of cryptic pA sites, resulting in shorter mRNA isoforms, similar to what was observed for CDK12. Therefore, we proposed that further research on U1 snRNP will likely provide additional mechanistic understanding of CDK12 functions (publication #3).

A recent study showed that CDK13-mutant human melanomas accumulate prematurely terminated RNAs due to disruption of the CDK13-dependent recruitment of the PAXT (polyA exosome targeting) complex, leading to stabilization of the prematurely terminated transcripts and their translation into truncated proteins (Insco et al., 2019). In our study, we did not observe any changes in mRNA stability upon CDK12 inhibition (publication #2). Therefore, the mechanisms that CDK12 and CDK13 utilize for the regulation of gene expression seem to differ, just like their target genes.

THZ531 and the ATP analogue inhibited cellular proliferation in various cell lines and in AS CDK12 cells, respectively (Zhang et al., 2016; Bartkowiak et al., 2015b). A mechanism proposed in a recent study suggested a role of the CDK12/CycK complex in pre-replication complex assembly via CDK12-dependent phosphorylation of cyclin E1, independently of CDK12-regulated transcription (Lei et al., 2018). In publication #2 we provided evidence that CDK12 is required for pre-replication complex assembly via the regulation of DNA replication genes. Therefore, it also functions upstream of pre-replication complex assembly. We demonstrated that CDK12-dependent RNAPII processivity is a rate-limiting factor for optimal G1/S progression. Importantly, the G1/S progression defect was independent of secondary activation of the DNA damage cell cycle checkpoint. Thus, CDK12 represents a novel link between the regulation of transcription and cell cycle progression.

As a key regulator of many DNA repair genes, CDK12 affects the ability of cells to respond to DNA damage and thus contributes to the maintenance of genome stability (publication #2; Blazek et al., 2011; Bajrami et al., 2014; Ekumi et al., 2015; Zhang et al., 2016; Tien et al., 2017). Consistently, we observed an increased number of chromosomal aberrations in AS CDK12-inhibited cells (publication #2), which underlines the fundamental role of CDK12 kinase activity in the maintenance of genome stability. Disruption of DNA replication and cell cycle progression leads to replication stress, which is another source of genome instability (Gaillard et al., 2015). Since inhibition of AS CDK12 downregulated the expression of many genes related to DNA replication and cell cycle progression, we proposed that both DNA repair and replication defects substantially contribute to the pleiotropic effect on genome instability in CDK12-inactivated cells and tumours (Fig. 8) (publication #2; publication #3; Popova et al., 2016; Wu et al., 2018).

Figure 8: Overview of defects and events caused by deregulation of CDK12-dependent genes that contribute to pleiotropic effects on genome instability in CDK12-inactivated cells and tumours (adapted from Pilarova et al., 2020).

CDK12-inactivated tumours are associated with a unique genome instability phenotype characterized by FTDs of unusually large size (Popova et al., 2016; Wu et al., 2018; Menghi et al., 2018) that are distinct from the duplications in BRCA1-deficient tumours. Our findings may have implications for understanding the origin of these FTDs: the results suggest that the onset of replication stress together with deficient HR-mediated fork restart could lead to their genesis (publication #2; publication #3; Branzei and Szakal, 2017). Interestingly, larger TDs were found to be involved in the duplication of oncogenes. The generation of large-span TDs in CDK12-inactivated tumours can, therefore, lead to a secondary deregulation of gene expression, including duplications of oncogenic drivers such as AR (androgen receptor) and MYC enhancers (Wu et al., 2018; Menghi et al., 2018; Viswanathan et al., 2018). FTDs in mCRPC with CDK12 loss resulted in highly recurrent gains at loci of genes involved in the cell cycle and DNA replication (Wu et al., 2018). These events could be potential mechanisms leading to compensatory expression of HR genes in these tumours.

Publication #2 can also help uncover the cellular and genetic background that determines sensitivity to CDK12 inhibition. Preliminary results of the TRITON2 clinical trial suggest that mCRPC patients with CDK12 mutations do not respond to PARPi, in contrast to BRCA1-deficient patients (Luo and Antonarakis, 2019). In agreement, recent studies showed that CDK12-aberrant mCRPC tumours are transcriptionally, genetically and phenotypically distinct (Wu et al., 2018; Reimers et al., 2019). This points to the need for a more elaborate stratification of patients with mutations in DDR genes, including CDK12, and for the application of alternative treatments, such as immune checkpoint inhibition, for certain groups of patients with CDK12 aberrations.
The significance of CDK12 as a possible predictive biomarker can also be seen in HER2-amplified breast cancer. In this subtype of breast cancer, CDK12 is often co-amplified with the HER2 oncogene. CDK12 amplification was suggested to have an oncogenic function, and its inhibition showed an anti-proliferative effect (Naidoo et al., 2018; Choi et al., 2019b). However, in a subset (∼14%) of HER2-positive breast cancers, the HER2 amplicon breakpoint converges on CDK12, disrupting its expression and leading to sensitivity to PARPi (Natrajan et al., 2014; Naidoo et al., 2018). This demonstrates that the functional outcome of CDK12 alterations in tumours is case- and context-dependent. Further discoveries regarding the cellular functions of CDK12 are likely to lead to other important clinical implications.

6. Conclusions

In this work, we have provided evidence that CDK12 and CDK13 have different functions in human cells. They regulate the expression of markedly different sets of genes involved in dissimilar biological processes, although their kinase activities towards the CTD of RNAPII are the same in vitro. We showed that the kinase activity of CDK12 affects G1/S progression by regulating RNAPII processivity on DNA replication genes. Finally, we have proposed several roles of CDK12 in the regulation of genome stability and discussed their possible implications for anti-cancer therapy. The major outcomes of this study can be summarized as follows:

– The crystal structure of the human CDK13/CycK complex was determined.
– CDK13 phosphorylates Ser2 and Ser5 of the RNAPII CTD in vitro.
– PIN1 does not change the phosphorylation specificity of CDK12 and CDK13 in vitro.
– CDK13 regulates genes involved in growth signalling pathways.
– CDK12 kinase activity is required for the expression of a subset of long, PAS-rich genes, particularly those involved in DNA replication and the DNA damage response.
– CDK12 does not globally control RNAPII Ser2P on transcription units.
– Inhibition of AS CDK12 causes premature termination and transcript shortening of the regulated genes.
– Inhibition of AS CDK12 leads to accumulation of RNAPII-Ser2P ChIP-seq peaks in the bodies of individual CDK12-sensitive genes, approximately at the positions where transcription is lost and where premature termination occurs.
– CDK12-dependent RNAPII processivity on core DNA replication genes is a rate-limiting factor for G1/S progression.
– Aberrant CDK12-regulated gene expression has a pleiotropic role in the onset of genome instability in cell lines and tumours.

7. References

Adelman, K. and Lis, J.T. (2012) Promoter-proximal pausing of RNA polymerase II: emerging roles in metazoans. Nature reviews. Genetics, 13, 720-731.

Akoulitchev, S., Chuikov, S., Reinberg, D. (2000) TFIIH is negatively regulated by cdk8-containing mediator complexes. Nature, 407, 102-6.

Allen, B.L. and Taatjes, D.J. (2015) The Mediator complex: a central integrator of transcription. Nat. Rev. Mol. Cell Biol., 16, 155-166.

Andrecka, J., Lewis, R., Brückner, F., Lehmann, E., Cramer, P., Michaelis, J. (2008) Single-molecule tracking of mRNA exiting from RNA polymerase II. Proc Natl Acad Sci U S A, 105, 135-40.

Asghar, U., Witkiewicz, A.K., Turner, N.C., Knudsen, E.S. (2015) The history and future of targeting cyclin-dependent kinases in cancer therapy. Nat Rev Drug Discov., 14, 130-46.

Bajrami, I., Frankum, J.R., Konde, A., Miller, R.E., Rehman, F.L., Brough, R., Campbell, J., Sims, D., Rafiq, R., Hooper, S. et al. (2014) Genome-wide profiling of genetic synthetic lethality identifies CDK12 as a novel determinant of PARP1/2 inhibitor sensitivity. Cancer Res., 74, 287–297.

Bartkowiak, B., Liu, P., Phatnani, H.P., Fuda, N.J., Cooper, J.J., Price, D.H., Adelman, K., Lis, J.T. and Greenleaf, A.L. (2010) CDK12 is a transcription elongation-associated CTD kinase, the metazoan ortholog of yeast Ctk1. Genes & development, 24, 2303-2316.

Bartkowiak, B. and Greenleaf, A.L. (2015a) Expression, purification, and identification of associated proteins of the full-length hCDK12/CyclinK complex. J. Biol. Chem., 290, 1786–1795.

Bartkowiak, B., Yan, C. and Greenleaf, A.L. (2015b) Engineering an analog-sensitive CDK12 cell line using CRISPR/Cas. Biochim. Biophys. Acta, 1849, 1179–1187.

Baumli, S., Lolli, G., Lowe, E.D., Troiani, S., Rusconi, L., Bullock, A.N., Debreczeni, J.E., Knapp, S. and Johnson, L.N. (2008) The structure of P-TEFb (CDK9/cyclin T1), its complex with flavopiridol and regulation by phosphorylation. EMBO J, 27, 1907-1918.

Bentley, D.L. (2014) Coupling mRNA processing with transcription in time and space. Nature reviews. Genetics, 15, 163-175.

Berg, M.G., Singh, L.N., Younis, I., Liu, Q., Pinto, A.M., Kaida, D., Zhang, Z., Cho, S., Sherrill-Mix, S., Wan, L., Dreyfuss, G. (2012) U1 snRNP determines mRNA length and regulates isoform expression. Cell, 150, 53-64.

Berro, R., Pedati, C., Kehn-Hall, K., Wu, W., Klase, Z., Even, Y., Genevière, A.M., Ammosova, T., Nekhai, S., Kashanchi, F. (2008) CDK13, a new potential human immunodeficiency virus type 1 inhibitory factor regulating viral mRNA splicing. J Virol., 82, 7155-66.

Blazek, D., Kohoutek, J., Bartholomeeusen, K., Johansen, E., Hulinkova, P., Luo, Z., Cimermancic, P., Ule, J. and Peterlin, B.M. (2011) The Cyclin K/Cdk12 complex maintains genomic stability via regulation of expression of DNA damage response genes. Genes & development, 25, 2158-2172.

Blethrow, J., Zhang, C., Shokat, K.M. and Weiss, E.L. (2004) Design and use of analog-sensitive protein kinases. Curr Protoc Mol Biol, chapter 18, unit 18.11.

Bösken, C.A., Farnung, L., Hintermair, C., Merzel Schachter, M., Vogel-Bachmayr, K., Blazek, D., Anand, K., Fisher, R.P., Eick, D. and Geyer, M. (2014) The structure and substrate specificity of human Cdk12/Cyclin K. Nature communications, 5, 3505.

Bostwick, B.L., McLean, S., Posey, J.E., Streff, H.E., Gripp, K.W., Blesson, A., Powell-Hamilton, N., Tusi, J., Stevenson, D.A., Farrelly, E. et al. (2017) Phenotypic and molecular characterisation of CDK13-related congenital heart defects, dysmorphic facial features and intellectual developmental disorders. Genome Med, 9, 73.

Bowman, E.A. and Kelly, W.G. (2014) RNA Polymerase II transcription elongation and Pol II CTD Ser2 phosphorylation. Nucleus, 5, 224-36.

Bradner, J.E., Hnisz, D. and Young, R.A. (2017) Transcriptional addiction in cancer. Cell, 168, 629–643.

Branzei, D. and Szakal, B. (2017) Building up and breaking down: mechanisms controlling recombination during replication. Crit. Rev. Biochem. Mol. Biol., 52, 381–394.

Cancer Genome Atlas Research Network (2011) Integrated genomic analyses of ovarian carcinoma. Nature, 474, 609–615.

Capra, M., Nuciforo, P.G., Confalonieri, S., Quarto, M., Bianchi, M., Nebuloni, M., Boldorini, R., Pallotti, F., Viale, G., Gishizky, M.L. et al. (2006) Frequent alterations in the expression of serine/threonine kinases in human cancers. Cancer Res., 66, 8147–8154.

Chao, S.H., Fujinaga, K., Marion, J.E., Taube, R., Sausville, E.A., Senderowicz, A.M., Peterlin, B.M., Price, D.H. (2000) Flavopiridol inhibits P-TEFb and blocks HIV-1 replication. J Biol Chem., 275, 28345-8.

Chao, S.H. and Price, D.H. (2001) Flavopiridol inactivates P-TEFb and blocks most RNA polymerase II transcription in vivo. J Biol Chem., 276, 31793-9.

Chen, H.H., Wang, Y.C. and Fann, M.J. (2006) Identification and characterization of the CDK12/cyclin L1 complex involved in alternative splicing regulation. Mol Cell Biol, 26, 2736–2745.

Chen, H.R., Lin, G.T., Huang, C.K., Fann, M.J. (2014) Cdk12 and Cdk13 regulate axonal elongation through a common signaling pathway that modulates Cdk5 expression. Exp Neurol., 261, 10-21.

Chen, H.R., Juan, H.C., Wong, Y.H., Tsai, J.W., Fann, M.J. (2017) CDK12 regulates neurogenesis and late-arising neuronal migration in the developing cerebral cortex. Cereb Cortex., 27, 2289-2302.

Cheng, S.W., Kuzyk, M.A., Moradian, A., Ichu, T.A., Chang, V.C., Tien, J.F., Vollett, S.E., Griffith, M., Marra, M.A. and Morin, G.B. (2012) Interaction of cyclin-dependent kinase 12/CrkRS with cyclin K1 is required for the phosphorylation of the C-terminal domain of RNA polymerase II. Mol. Cell. Biol., 32, 4691–4704.

Choi, S.H., Martinez, T.F., Kim, S., Donaldson, C., Shokhirev, M.N., Saghatelian, A. and Jones, K.A. (2019a) CDK12 phosphorylates 4E-BP1 to enable mTORC1-dependent translation and mitotic genome stability. Genes Dev., 33, 418–435.

Choi, H.J., Jin, S., Cho, H., Won, H.Y., An, H.W., Jeong, G.Y., Park, Y.U., Kim, H.Y., Park, M.K., Son, T. et al. (2019b) CDK12 drives breast tumor initiation and trastuzumab resistance via WNT and IRS1–ErbB–PI3K signaling. EMBO Rep., 20, e48058.

Chou, J., Quigley, D.A., Robinson, T.M., Feng, F.Y., Ashworth, A. (2020) Transcription-associated cyclin-dependent kinases as targets and biomarkers for cancer therapy. Cancer Discov, 10, 351-370.

Corden, J.L. (2013) RNA polymerase II C-terminal domain: tethering transcription to transcript and template. Chem Rev, 113, 8423-8455.

Cramer, P., Bushnell, D.A., Kornberg, R.D. (2001) Structural basis of transcription: RNA polymerase II at 2.8 angstrom resolution. Science, 292, 1863–76.

Czudnochowski, N., Bösken, C.A., Geyer, M. (2012) Serine-7 but not serine-5 phosphorylation primes RNA polymerase II CTD for P-TEFb recognition. Nat Commun., 3, 842.

Dai, Q., Lei, T., Zhao, C., Zhong, J., Tang, Y.Z., Chen, B., Yang, J., Li, C., Wang, S., Song, X. et al. (2012) Cyclin K-containing kinase complexes maintain self-renewal in murine embryonic stem cells. J. Biol. Chem., 287, 25344–25352.

Davidson, L., Muniz, L. and West, S. (2014) 3′ end formation of pre-mRNA and phosphorylation of Ser2 on the RNA polymerase II CTD are reciprocally coupled in human cells. Genes Dev., 28, 342–356.

Delehouze, C., Godl, K., Loaec, N., Bruyere, C., Desban, N., Oumata, N., Galons, H., Roumeliotis, T.I., Giannopoulou, E.G., Grenet, J. et al. (2014) CDK/CK1 inhibitors roscovitine and CR8 downregulate amplified MYCN in neuroblastoma cells. Oncogene, 33, 5675–5687.

Devaiah, B.N., Lewis, B.A., Cherman, N., Hewitt, M.C., Albrecht, B.K., Robey, P.G., Ozato, K., Sims, R.J. 3rd, Singer, D.S. (2012) BRD4 is an atypical kinase that phosphorylates serine2 of the RNA polymerase II carboxy-terminal domain. Proc Natl Acad Sci U S A., 109, 6927-32.

Di Vona, C., Bezdan, D., Islam, A.B., Salichs, E., López-Bigas, N., Ossowski, S., de la Luna, S. (2015) Chromatin-wide profiling of DYRK1A reveals a role as a gene-specific RNA polymerase II CTD kinase. Mol Cell, 57, 506-20.

Dixon-Clarke, S.E., Elkins, J.M., Cheng, S.W., Morin, G.B. and Bullock, A.N. (2015) Structures of the CDK12/CycK complex with AMP-PNP reveal a flexible C-terminal kinase extension important for ATP binding. Sci. Rep., 5, 17122.

Donner, A.J., Ebmeier, C.C., Taatjes, D.J., Espinosa, J.M. (2010) CDK8 is a positive regulator of transcriptional elongation within the serum response network. Nat Struct Mol Biol., 17, 194-201.

Drean, A., Lord, C.J. and Ashworth, A. (2016) PARP inhibitor combination therapy. Crit. Rev. Oncol. Hematol., 108, 73–85.

Drogat, J. and Hermand, D. (2012) Gene-specific requirement of RNA polymerase II CTD phosphorylation. Mol. Microbiol., 84, 995–1004.

Dubbury, S.J., Boutz, P.L. and Sharp, P.A. (2018) CDK12 regulates DNA repair genes by suppressing intronic polyadenylation. Nature, 564, 141-145.

Ebmeier, C.C., Erickson, B., Allen, B.L., Allen, M.A., Kim, H., Fong, N., Jacobsen, J.R., Liang, K.W., Shilatifard, A., Dowell, R.D. et al. (2017) Human TFIIH kinase CDK7 regulates transcription-associated chromatin modifications. Cell Rep., 20, 1173–1186.

Egloff, S., Dienstbier, M. and Murphy, S. (2012) Updating the RNA polymerase CTD code: adding gene-specific layers. Trends Genet, 28, 333-41.

Eick, D. and Geyer, M. (2013) The RNA polymerase II carboxy-terminal domain (CTD) code. Chem Rev, 113, 8456-8490.

Eifler, T.T., Shao, W., Bartholomeeusen, K., Fujinaga, K., Jager, S., Johnson, J.R., Luo, Z., Krogan, N.J. and Peterlin, B.M. (2015) Cyclin-dependent kinase 12 increases 3′ end processing of growth factor induced c-FOS transcripts. Mol. Cell. Biol., 35, 468–478.

Ekumi, K.M., Paculova, H., Lenasi, T., Pospichalova, V., Bosken, C.A., Rybarikova, J., Bryja, V., Geyer, M., Blazek, D. and Barboric, M. (2015) Ovarian carcinoma CDK12 mutations misregulate expression of DNA repair genes via deficient formation and function of the Cdk12/CycK complex. Nucleic acids research, 43, 2575-2589.

Even, Y., Durieux, S., Escande, M.L., Lozano, J.C., Peaucellier, G., Weil, D., Genevière, A.M. (2006) CDC2L5, a Cdk-Like kinase with RS domain, interacts with the ASF/SF2-associated protein p32 and affects splicing in vivo. J Cell Biochem., 99, 890-904.

Even, Y., Escande, M.L., Fayet, C., Genevière, A.M. (2016) CDK13, a kinase involved in pre-mRNA splicing, is a component of the perinucleolar compartment. PLoS One, 11, e0149184.

Fisher, R.P. (2005) Secrets of a double agent: CDK7 in cell-cycle control and transcription. J. Cell Sci., 118, 5171–5180.

Fuda, N.J., Ardehali, M.B. and Lis, J.T. (2009) Defining mechanisms that regulate RNA polymerase II transcription in vivo. Nature, 461, 186-192.

Fusco, N., Geyer, F.C., De Filippo, M.R., Martelotto, L.G., Ng, C.K., Piscuoglio, S., Guerini-Rocco, E., Schultheis, A.M., Fuhrmann, L., Wang, L. et al. (2016) Genetic events in the progression of adenoid cystic carcinoma of the breast to high-grade triple-negative breast cancer. Mod Pathol., 29, 1292-1305.

Gaillard, H., Garcia-Muse, T. and Aguilera, A. (2015) Replication stress and cancer. Nat. Rev. Cancer, 15, 276–289.

Ghamari, A., van de Corput, M.P., Thongjuea, S., van Cappellen, W.A., van Ijcken, W., van Haren, J., Soler, E., Eick, D., Lenhard, B. and Grosveld, F.G. (2013) In vivo live imaging of RNA polymerase II transcription factories in primary cells. Genes Dev., 27, 767–777.

Gressel, S., Schwalb, B., Decker, T.M., Qin, W., Leonhardt, H., Eick, D., Cramer, P. (2017) CDK9-dependent RNA Polymerase II Pausing Controls Transcription Initiation. Elife, 6, e29736.

Grünberg, S. and Hahn, S. (2013) Structural insights into transcription initiation by RNA polymerase II. Trends Biochem Sci, 38, 603-11.

Hamilton, M.J., Caswell, R.C., Canham, N., Cole, T., Firth, H.V., Foulds, N., Heimdal, K., Hobson, E., Houge, G., Joss, S. et al. (2018) Heterozygous mutations affecting the protein kinase domain of CDK13 cause a syndromic form of developmental delay and intellectual disability. J Med Genet, 55, 28–38.

Hanes, S.D. (2014) The Ess1 prolyl isomerase: traffic cop of the RNA polymerase II transcription cycle. Biochim Biophys Acta., 1839, 316-33.

Harlen, K.M. and Churchman, L.S. (2017) The code and beyond: transcription regulation by the RNA polymerase II carboxy-terminal domain. Nat Rev Mol Cell Biol, 18, 263-273.

Hsin, J.P. and Manley, J.L. (2012) The RNA polymerase II CTD coordinates transcription and RNA processing. Genes & development, 26, 2119-2137.

Insco, M.L., Abraham, B.J., Dubbury, S.J., Dust, S., Wu, C., Chen, K.Y., Liu, D., Ludwig, C.G., Bellaousov, S., Fabo, T. et al. (2019) CDK13 Mutations Drive Melanoma via Accumulation of Prematurely Terminated Transcripts. bioRxiv doi: https://doi.org/10.1101/824193.

Iniguez, A.B., Stolte, B., Wang, E.J., Conway, A.S., Alexe, G., Dharia, N.V., Kwiatkowski, N., Zhang, T.H., Abraham, B.J., Mora, J. et al. (2018) EWS/FLI confers tumor cell synthetic lethality to CDK12 inhibition in Ewing sarcoma. Cancer Cell, 33, 202–216.

Jeronimo, C., Bataille, A.R. and Robert, F. (2013) The writers, readers, and functions of the RNA Polymerase II C-terminal domain code. Chem Rev, 113, 8491-552.

Jeronimo, C., Collin, P., Robert, F. (2016) The RNA Polymerase II CTD: The Increasing Complexity of a Low-Complexity Protein Domain. J Mol Biol, 428, 2607-2622.

Johnson, S.F., Cruz, C., Greifenberg, A.K., Dust, S., Stover, D.G., Chi, D., Primack, B., Cao, S., Bernhardy, A.J., Coulson, R. et al. (2016) CDK12 Inhibition Reverses De Novo and Acquired PARP Inhibitor Resistance in BRCA Wild-Type and Mutated Models of Triple-Negative Breast Cancer. Cell reports, 17, 2367-2381.

Joshi, P.M., Sutor, S.L., Huntoon, C.J. and Karnitz, L.M. (2014) Ovarian cancer-associated mutations disable catalytic activity of CDK12, a kinase that promotes homologous recombination repair and resistance to cisplatin and poly(ADP-ribose) polymerase inhibitors. J. Biol. Chem., 289, 9247–9253.

Juan, H.C., Lin, Y., Chen, H.R. and Fann, M.J. (2016) Cdk12 is essential for embryonic development and the maintenance of genomic stability. Cell Death Differ., 23, 1038–1048.

Ko, T.K., Kelly, E., Pines, J. (2001) CrkRS: a novel conserved Cdc2-related protein kinase that colocalises with SC35 speckles. J Cell Sci, 114, 2591–2603.

Kohoutek, J. and Blazek, D. (2012) Cyclin K goes with Cdk12 and Cdk13. Cell division, 7, 12.

Krajewska, M., Dries, R., Grassetti, A.V., Dust, S., Gao, Y., Huang, H., Sharma, B., Day, D.S., Kwiatkowski, N., Pomaville, M. et al. (2019) CDK12 loss in cancer cells affects DNA damage response genes through premature cleavage and polyadenylation. Nat. Commun., 10, 1757.

Larochelle, S., Amat, R., Glover-Cutter, K., Sansó, M., Zhang, C., Allen, J.J., Shokat, K.M., Bentley, D.L., Fisher, R.P. (2012) Cyclin-dependent kinase control of the initiation-to-elongation switch of RNA polymerase II. Nat Struct Mol Biol., 19, 1108-15.

Lei, T., Zhang, P., Zhang, X., Xiao, X., Zhang, J., Qiu, T., Dai, Q., Zhang, Y., Min, L., Li, Q. et al. (2018) Cyclin K regulates prereplicative complex assembly to promote mammalian cell proliferation. Nat. Commun., 9, 1876.

Li, B.B., Wang, B., Zhu, C.M., Tang, D., Pang, J., Zhao, J., Sun, C.H., Qiu, M.J., Qian, Z.R. (2019) Cyclin-dependent kinase 7 inhibitor THZ1 in cancer therapy. Chronic Dis Transl Med, 5, 155-169.

Liang, K., Gao, X., Gilmore, J.M., Florens, L., Washburn, M.P., Smith, E. and Shilatifard, A. (2015) Characterization of human cyclin-dependent kinase 12 (CDK12) and CDK13 complexes in C-terminal domain phosphorylation, gene transcription, and RNA processing. Molecular and cellular biology, 35, 928-938.

Liu, P., Kenney, J.M., Stiller, J.W., Greenleaf, A.L. (2010) Genetic organization, length conservation, and evolution of RNA polymerase II carboxyl-terminal domain. Mol Biol Evol, 27, 2628–2641.

Lord, C.J. and Ashworth, A. (2016) BRCAness revisited. Nat. Rev. Cancer, 16, 110–120.

Lord, C.J. and Ashworth, A. (2017) PARP inhibitors: synthetic lethality in the clinic. Science, 355, 1152–1158.

Lui, G.Y.L., Grandori, C. and Kemp, C.J. (2018) CDK12: an emerging therapeutic target for cancer. J Clin Pathol, 71, 957-962.

Luo, J. and Antonarakis, E.S. (2019) PARP inhibition––not all gene mutations are created equal. Nat. Rev. Urol., 16, 4–6.

Malcovati, L., Della Porta, M.G., Pietra, D., Boveri, E., Pellagatti, A., Gallì, A., Travaglino, E., Brisci, A., Rumi, E., Passamonti, F. et al. (2009) Molecular and clinical features of refractory anemia with ringed sideroblasts associated with marked thrombocytosis. Blood, 114, 3538-45.

Malumbres, M. (2014) Cyclin-dependent kinases. Genome Biol, 15, 122.

Marques, F., Moreau, J.L., Peaucellier, G., Lozano, J.C., Schatt, P., Picard, A., Callebaut, I., Perret, E. and Geneviere, A.M. (2000) A new subfamily of high molecular mass CDC2-related kinases with PITAI/VRE motifs. Biochem Biophys Res Commun, 279, 832–837.

Marshall, N.F., Peng, J., Xie, Z., Price, D.H. (1996) Control of RNA polymerase II elongation potential by a novel carboxyl-terminal domain kinase. J Biol Chem., 271, 27176-83.

Matena, A., Rehic, E., Hönig, D., Kamba, B., Bayer, P. (2018) Structure and function of the human parvulins Pin1 and Par14/17. Biol. Chem., 399, 101–125.

Mayer, B.J. (2001) SH3 domains: complexity in moderation. J Cell Sci, 114, 1253-63.

Menghi, F., Inaki, K., Woo, X., Kumar, P.A., Grzeda, K.R., Malhotra, A., Yadav, V., Kim, H., Marquez, E.J., Ucar, D. et al. (2016) The tandem duplicator phenotype as a distinct genomic configuration in cancer. Proc. Natl. Acad. Sci. U.S.A., 113, E2373–E2382.

Menghi, F., Barthel, F.P., Yadav, V., Tang, M., Ji, B., Tang, Z., Carter, G.W., Ruan, Y., Scully, R., Verhaak, R.G.W. et al. (2018) The tandem duplicator phenotype is a prevalent genome-wide cancer configuration driven by distinct gene mutations. Cancer Cell, 34, 197–210.

Mertins, P., Mani, D.R., Ruggles, K.V., Gillette, M.A., Clauser, K.R., Wang, P., Wang, X., Qiao, J.W., Cao, S., Petralia, F. et al. (2016) Proteogenomics connects somatic mutations to signalling in breast cancer. Nature, 534, 55–62.

Naidoo, K., Wai, P.T., Maguire, S.L., Daley, F., Haider, S., Kriplani, D., Campbell, J., Mirza, H., Grigoriadis, A., Tutt, A. et al. (2018) Evaluation of CDK12 protein expression as a potential novel biomarker for DNA damage response-targeted therapies in breast cancer. Mol. Cancer Ther., 17, 306–315.

Natrajan, R., Wilkerson, P.M., Marchio, C., Piscuoglio, S., Ng, C.K., Wai, P., Lambros, M.B., Samartzis, E.P., Dedes, K.J., Frankum, J. et al. (2014) Characterization of the genomic features and expressed fusion genes in micropapillary carcinomas of the breast. J. Pathol., 232, 553–565.

Nemet, J., Jelicic, B., Rubelj, I., Sopta, M. (2013) The two faces of Cdk8, a positive/negative regulator of transcription. Biochimie., 97, 22-7.

Nováková, M., Hampl, M., Vrábel, D., Procházka, J., Petrezselyová, S., Procházková, M., Sedláček, R., Kavková, M., Zikmund, T., Kaiser, J. et al. (2019) Mouse Model of Congenital Heart Defects, Dysmorphic Facial Features and Intellectual Developmental Disorders as a Result of Non-functional CDK13. Front Cell Dev Biol., 7, 155.

O’Connor, M.J. (2015) Targeting the DNA damage response in cancer. Mol. Cell, 60, 547–560.

Oh, J.M., Di, C., Venters, C.C., Guo, J.N., Arai, C., So, B.R., Pinto, A.M., Zhang, Z.X., Wan, L.L., Younis, I. et al. (2017) U1 snRNP telescripting regulates a size-function-stratified . Nat. Struct. Mol. Biol., 24, 993–999.

Olson, C.M., Liang, Y., Leggett, A., Park, W.D., Li, L., Mills, C.E., Elsarrag, S.Z., Ficarro, S.B., Zhang, T., Düster, R. et al. (2019) Development of a selective CDK7 covalent inhibitor reveals predominant cell-cycle phenotype. Cell Chem Biol, 26, 792-803.

Paculova, H. and Kohoutek, J. (2017) The emerging roles of CDK12 in tumorigenesis. Cell division, 12, 7.

Pak, V., Eifler, T.T., Jäger, S., Krogan, N.J., Fujinaga, K., Peterlin, B.M. (2015) CDK11 in TREX/THOC Regulates HIV mRNA 3' End Processing. Cell Host Microbe, 18, 560-70.

Patel, A.G., Sarkaria, J.N., Kaufmann, S.H. (2011) Nonhomologous end joining drives poly(ADP-ribose) polymerase (PARP) inhibitor lethality in homologous recombination-deficient cells. Proc Natl Acad Sci U S A, 108, 3406-11.

Peterlin, B.M. and Price, D.H. (2006) Controlling the elongation phase of transcription with P-TEFb. Mol Cell, 23, 297-305.

Phatnani, H.P. and Greenleaf, A.L. (2006) Phosphorylation and functions of the RNA polymerase II CTD. Genes Dev., 20, 2922–2936.

Popova, T., Manie, E., Boeva, V., Battistella, A., Goundiam, O., Smith, N.K., Mueller, C.R., Raynal, V., Mariani, O., Sastre-Garau, X. et al. (2016) Ovarian cancers harboring inactivating mutations in CDK12 display a distinct genomic instability pattern characterized by large tandem duplications. Cancer Res., 76, 1882–1891.

Proudfoot, N.J. (2016) Transcriptional termination in mammals: Stopping the RNA polymerase II juggernaut. Science, 352, aad9926.

Quereda, V., Bayle, S., Vena, F., Frydman, S.M., Monastyrskyi, A., Roush, W.R. and Duckett, D.R. (2019) Therapeutic targeting of CDK12/CDK13 in triple-negative breast cancer. Cancer Cell, 36, 545.e7–558.e7.

Ranuncolo, S.M., Ghosh, S., Hanover, J.A., Hart, G.W., Lewis, B.A. (2012) Evidence of the involvement of O-GlcNAc-modified human RNA polymerase II CTD in transcription in vitro and in vivo. J Biol Chem., 287, 23549-61.

Robinson, D., Van Allen, E.M., Wu, Y.M., Schultz, N., Lonigro, R.J., Mosquera, J.M., Montgomery, B., Taplin, M.E., Pritchard, C.C., Attard, G. et al. (2015) Integrative Clinical Genomics of Advanced Prostate Cancer. Cell, 162, 454.

Rodrigues, F., Thuma, L. and Klambt, C. (2012) The regulation of glial-specific splicing of Neurexin IV requires HOW and Cdk12 activity. Development, 139, 1765–1776.

Schecher, S., Walter, B., Falkenstein, M., Macher-Goeppinger, S., Stenzel, P., Krumpelmann, K., Hadaschik, B., Perner, S., Kristiansen, G., Duensing, S. et al. (2017) Cyclin K dependent regulation of Aurora B affects apoptosis and proliferation by induction of mitotic catastrophe in prostate cancer. Int. J. Cancer, 141, 1643–1653.

Schüller, R., Forne, I., Straub, T., Schreieck, A., Texier, Y., Shah, N., Decker, T.M., Cramer, P., Imhof, A. and Eick, D. (2016) Heptad-specific phosphorylation of RNA polymerase II CTD. Mol. Cell, 61, 305–314.

Sifrim, A., Hitz, M.P., Wilsdon, A., Breckpot, J., Turki, S.H., Thienpont, B., McRae, J., Fitzgerald, T.W., Singh, T., Swaminathan, G.J. et al. (2016) Distinct genetic architectures for syndromic and nonsyndromic congenital heart defects identified by exome sequencing. Nat Genet, 48, 1060–1065.

Sims, R.J., Rojas, L.A., Beck, D.B., Bonasio, R., Schüller, R., Drury, W.J., Eick, D., Reinberg, D. (2011) The C-terminal domain of RNA polymerase II is modified by site-specific methylation. Science, 332, 99-103.

Singh, N., Ma, Z., Gemmill, T., Wu, X., Defiglio, H., Rossettini, A., Rabeler, C., Beane, O., Morse, R.H., Palumbo, M.J. et al. (2009) The Ess1 prolyl isomerase is required for transcription termination of small noncoding RNAs via the Nrd1 pathway. Mol Cell., 36, 255-66.

Sircoulomb, F., Bekhouche, I., Finetti, P., Adélaïde, J., Ben Hamida, A., Bonansea, J., Raynaud, S., Innocenti, C., Charafe-Jauffret, E., Tarpin, C. et al. (2010) Genome profiling of ERBB2-amplified breast cancers. BMC Cancer, 10, 539.

Spector, D.L. and Lamond, A.I. (2011) Nuclear speckles. Cold Spring Harb. Perspect. Biol., 3, a000646.

Taatjes, D.J. (2010) The human mediator complex: a versatile, genome-wide regulator of transcription. Trends Biochem. Sci., 35, 315–322.

Tan-Wong, S.M., Zaugg, J.B., Camblong, J., Xu, Z., Zhang, D.W., Mischo, H.E., Ansari, A.Z., Luscombe, N.M., Steinmetz, L.M., Proudfoot, N.J. (2012) Gene loops enhance transcriptional directionality. Science, 338, 671-5.

Tien, J.F., Mazloomian, A., Cheng, S.G., Hughes, C.S., Chow, C.C.T., Canapi, L.T., Oloumi, A., Trigo-Gonzalez, G., Bashashati, A., Xu, J. et al. (2017) CDK12 regulates alternative last exon mRNA splicing and promotes breast cancer cell invasion. Nucleic acids research, 45, 6698-6716.

Toyoshima, M., Howie, H.L., Imakura, M., Walsh, R.M., Annis, J.E., Chang, A.N., Frazier, J., Chau, B.N., Loboda, A., Linsley, P.S. et al. (2012) Functional genomics identifies therapeutic targets for MYC-driven cancer. Proc. Natl. Acad. Sci. U.S.A., 109, 9545–9550.

Vanderstichele, A., Busschaert, P., Olbrecht, S., Lambrechts, D., Vergote, I. (2017) Genomic signatures as predictive biomarkers of homologous recombination deficiency in ovarian cancer. Eur J Cancer, 86, 5-14.

Verdecia, M. A., Bowman, M. E., Lu, K. P., Hunter, T., Noel, J. P. (2000) Structural basis for phosphoserine-proline recognition by group IV WW domains. Nat. Struct. Biol., 7, 639–643.

Viswanathan, S.R., Ha, G., Hoff, A.M., Wala, J.A., Carrot-Zhang, J., Whelan, C.W., Haradhvala, N.J., Freeman, S.S., Reed, S.C., Rhoades, J. et al. (2018) Structural alterations driving castration-resistant prostate cancer revealed by linked-read genome sequencing. Cell, 174, 433–447.

Voss, K., Forné, I., Descostes, N., Hintermair, C., Schüller, R., Maqbool, M.A., Heidemann, M., Flatley, A., Imhof, A., Gut, M. (2015) Site-specific methylation and acetylation of lysine residues in the C-terminal domain (CTD) of RNA polymerase II. Transcription, 6, 91-101.

Wu, Y.M., Cieslik, M., Lonigro, R.J., Vats, P., Reimers, M.A., Cao, X., Ning, Y., Wang, L., Kunju, L.P., de Sarkar, N. et al. (2018) Inactivation of CDK12 delineates a distinct immunogenic class of advanced prostate cancer. Cell, 173, 1770–1782.

Xu, Y.X., Hirose, Y., Zhou, X.Z., Lu, K.P., Manley, J.L. (2003) Pin1 modulates the structure and function of human RNA polymerase II. Genes Dev., 17, 2765-76.

Yaffe, M.B., Schutkowski, M., Shen, M., Zhou, X.Z., Stukenberg, P.T., Rahfeld, J.U., Xu, J., Kuang, J., Kirschner, M.W., Fischer, G. et al. (1997) Sequence-Specific and Phosphorylation-Dependent Proline Isomerization: A Potential Mitotic Regulatory Mechanism. Science, 278, 1957-60.

Yang, L., Chansky, H.A., Hickstein, D.D. (2000) EWS.Fli-1 fusion protein interacts with hyper-phosphorylated RNA polymerase II and interferes with serine-arginine protein-mediated RNA splicing. J Biol Chem, 275, 37612–8.

Yu, M., Yang, W., Ni, T., Tang, Z., Nakadai, T., Zhu, J. and Roeder, R.G. (2015) RNA polymerase II associated factor 1 regulates the release and phosphorylation of paused RNA polymerase II. Science, 350, 1383–1386.

Zaborowska, J., Egloff, S. and Murphy, S. (2016) The pol II CTD: new twists in the tail. Nat. Struct. Mol. Biol., 23, 771–777.

Zeng, M., Kwiatkowski, N.P., Zhang, T., Nabet, B., Xu, M., Liang, Y., Quan, C., Wang, J., Hao, M., Palakurthi, S. et al. (2018) Targeting MYC dependency in ovarian cancer through inhibition of CDK7 and CDK12/13. Elife, 7, e39030.

Zhang, M., Wang, X.J., Chen, X., Bowman, M.E., Luo, Y., Noel, J.P., Ellington, A.D., Etzkorn, F.A., Zhang, Y. (2012) Structural and kinetic analysis of prolyl-isomerization/phosphorylation cross-talk in the CTD code. ACS Chem Biol., 7, 1462-70.

Zhang, T., Kwiatkowski, N., Olson, C.M., Dixon-Clarke, S.E., Abraham, B.J., Greifenberg, A.K., Ficarro, S.B., Elkins, J.M., Liang, Y., Hannett, N.M. et al. (2016) Covalent targeting of remote cysteine residues to develop CDK12 and CDK13 inhibitors. Nat Chem Biol, 12, 876-884.

Zhong, X.Y., Wang, P., Han, J., Rosenfeld, M.G., Fu, X.D. (2009) SR proteins in vertical integration of gene expression from transcription to RNA processing to translation. Mol Cell, 35, 1-10.

Zhou, Q., Li, T., Price, D.H. (2012) RNA polymerase II elongation control. Annu Rev Biochem., 81, 119-43.

Electronic sources:

ClinicalTrials.gov [online database, ref. 11.3. 2020], U.S. National library of medicine, available at: < https://clinicaltrials.gov/ >.

8. Publications

Note: In the case of publication #2, only the main and expanded view figures are included in the thesis. The remaining supplementary figures are available online in Appendix: embopress.org/doi/full/10.15252/embr.201847592

Article

Structural and Functional Analysis of the Cdk13/Cyclin K Complex

Graphical Abstract

Authors
Ann Katrin Greifenberg, Dana Hönig, Kveta Pilarova, ..., Kanchan Anand, Dalibor Blazek, Matthias Geyer

Correspondence [email protected]

In Brief
Cyclin-dependent kinases regulate transcription through phosphorylation of RNA polymerase II. Greifenberg et al. report the structure of the human Cdk13/Cyclin K complex. Cdk13 contains a C-terminal extension helix that is a specific feature of transcription elongation kinases.

Highlights
• The crystal structure of the human Cdk13/Cyclin K complex
• Cdk13 phosphorylates Ser5 and Ser2 of the RNA polymerase II CTD
• The isomerase Pin1 does not change the phosphorylation specificity of Cdk13
• Cdk13 regulates genes involved in growth signaling pathways

Accession Numbers
5EFQ

Greifenberg et al., 2016, Cell Reports 14, 320–331
January 12, 2016 ©2016 The Authors
http://dx.doi.org/10.1016/j.celrep.2015.12.025

Structural and Functional Analysis of the Cdk13/Cyclin K Complex

Ann Katrin Greifenberg,1,2 Dana Hönig,2,3 Kveta Pilarova,4 Robert Düster,1,2 Koen Bartholomeeusen,4 Christian A. Bösken,2,3 Kanchan Anand,1,5 Dalibor Blazek,4 and Matthias Geyer1,2,3,*
1Institute of Innate Immunity, Department of Structural Immunology, University of Bonn, Sigmund-Freud-Strasse 25, 53127 Bonn, Germany
2Center of Advanced European Studies and Research, Group Physical Biochemistry, Ludwig-Erhard-Allee 2, 53175 Bonn, Germany
3Max Planck Institute of Molecular Physiology, Department of Physical Biochemistry, Otto-Hahn-Strasse 11, 44227 Dortmund, Germany
4Central European Institute of Technology (CEITEC), Masaryk University, 62500 Brno, Czech Republic
5Present address: EMBL Heidelberg, Structural and Computational Biology Programme, Meyerhofstrasse 1, 69117 Heidelberg, Germany
*Correspondence: [email protected]
http://dx.doi.org/10.1016/j.celrep.2015.12.025
This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).

SUMMARY

Cyclin-dependent kinases regulate the cell cycle and transcription in higher eukaryotes. We have determined the crystal structure of the transcription kinase Cdk13 and its Cyclin K subunit at 2.0 Å resolution. Cdk13 contains a C-terminal extension helix composed of a polybasic cluster and a DCHEL motif that interacts with the bound ATP. Cdk13/CycK phosphorylates both Ser5 and Ser2 of the RNA polymerase II C-terminal domain (CTD) with a preference for Ser7 pre-phosphorylations at a C-terminal position. The peptidyl-prolyl isomerase Pin1 does not change the phosphorylation specificities of Cdk9, Cdk12, and Cdk13 but interacts with the phosphorylated CTD through its WW domain. Using recombinant proteins, we find that flavopiridol inhibits Cdk7 more potently than it does Cdk13. Gene expression changes after knockdown of Cdk13 or Cdk12 are markedly different, with enrichment of growth signaling pathways for Cdk13-dependent genes. Together, our results provide insights into the structure, function, and activity of human Cdk13/CycK.

INTRODUCTION

Higher organisms have evolved a unique structure in the RNA polymerase II (RNAPII) to coordinate transcription and the attendant processing of pre-mRNA molecules. The largest subunit Rpb1 of RNAPII contains a C-terminal extension, termed the C-terminal domain, or CTD, which is composed of multiple repeats of the hepta-sequence Y1S2P3T4S5P6S7. The number of repeats varies between 26 in yeast and 52 in humans and is thought to correlate with the genomic complexity of the organism (Buratowski 2009; Corden 2013; Eick and Geyer, 2013). In addition, alterations from the consensus sequence occur in higher eukaryotes with a change in position 7 from serine to lysine prevailing in the distal part of the human CTD. The repetitive structure of the CTD is thought to act as a binding scaffold for transcription associated factors. The five hydroxyl group containing residues Tyr1, Ser2, Thr4, Ser5, and Ser7 are all susceptible to post-translational modifications (PTMs) combined with a cis/trans isomerization of the two intermediate prolines, Pro3 and Pro6. The reversible modifications by phosphorylation or O-linked glycosylation as well as acetylation and methylation of basic residues open a plethora of possible combinations of PTMs, often described as RNAPII CTD code.

Phosphorylation of the three serine residues within the CTD is tightly linked to the phases of RNAPII-mediated transcription (Adelman and Lis, 2012). At first, the unphosphorylated polymerase is recruited into the pre-initiation complex at open chromatin structures. In a simplified model, phosphorylation of Ser7 (pSer7) starts the transcription cycle until an additional pause step stalls the polymerase about 50 nucleotides downstream of the transcription start site (TSS). To overcome this promoter proximal pausing, Ser5 is phosphorylated, potentially in conjunction with Ser7, leading to the robust elongation of transcripts. On a molecular level, Ser7 phosphorylation primes the CTD for further modifications (Czudnochowski et al., 2012; St Amour et al., 2012), but, whereas Ser7 phosphorylation levels stay high during the transcription process, the pSer5 signal decreases steadily toward the poly-adenylation site by the action of phosphatases. Ser2 phosphorylation instead increases toward the transcription termination site (TTS), consistent with the recruitment of 3′ RNA-processing factors by pSer2 marks (Davidson et al., 2014). The phosphorylation signals are removed by phosphatases during the termination process, giving way for a new transcription cycle of the polymerase.

Cyclin-dependent kinases (CDKs) play major roles in the regulation of the cell cycle and transcription. Five mammalian CDKs have been described to date as transcription regulating kinases together with their corresponding cyclin subunits: Cdk7/Cyclin H as components of the general transcription factor TFIIH (Grünberg and Hahn, 2013), Cdk8/Cyclin C as components of the Mediator kinase module (Allen and Taatjes, 2015), Cdk9/Cyclin T1 or T2 that constitute the active form of the positive transcription elongation factor (P-TEFb) (Peterlin and Price, 2006), and Cdk12/Cyclin K and Cdk13/Cyclin K as the latest members of RNAPII CTD kinases (Bartkowiak et al., 2010; Blazek et al., 2011). A precise assignment of the various CDKs to the different

serine phosphorylations within the transcription cycle is not yet clear, as some promiscuity in the specificity of the kinases persists. However, Cdk9 is commonly described as a transcription elongation kinase that is located at the TSS in line with its function to overcome the promoter proximal arrest (Ghamari et al., 2013). Cdk7 in contrast initiates transcription, both by phosphorylating Ser7 of the CTD and by its ability to phosphorylate the activating threonine in the T-loop of CDKs (Larochelle et al., 2012).

Human Cdk12 and Cdk13 are large proteins with molecular weights of 164 and 165 kDa, respectively. The proteins share 43% sequence identity and harbor a central kinase domain. In contrast to CDKs 7, 8, and 9, human kinases Cdk12 and Cdk13 contain expanded regions of serine-arginine (SR) motifs in their N-terminal regions. These regions span residues 130–380 in Cdk12 and 200–435 in Cdk13, respectively, linking the kinases to the SR protein family involved in RNA processing and pre-mRNA splicing (Zhou and Fu, 2013; Ghosh and Adams, 2011). Accordingly, Cdk13 was demonstrated to interact with the splicing factor SRSF1 and to regulate alternative splicing of HIV (Berro et al., 2008). Cdk12 in turn was shown to regulate DNA-damage response genes, and somatic gene mutations of the kinase were found in multiple cancers such as high-grade serous ovarian carcinoma (The Cancer Genome Atlas Research Network, 2011; Ekumi et al., 2015; Joshi et al., 2014). However, the exact roles of Cdk13 and Cdk12 in these processes are yet to be explored.

In an effort to understand the molecular basis of Cdk13 function, we determined the crystal structure of the kinase/cyclin domains of human Cdk13/CycK at 2.0 Å resolution. Cdk13 contains a C-terminal extension helix following the canonical kinase domain that interacts with the bound ATP substrate. Flavopiridol is a poor inhibitor of Cdk13 activity but inhibits Cdk7 more potently, explaining its potency in downregulating transcription associated processes. The cis/trans peptidyl-prolyl isomerase Pin1 does not change the phosphorylation specificity of Cdk13/CycK for Ser5 and Ser2 sites, while pre-incubation of a CTD substrate with Cdk9 showed a small increase in Cdk13 activity. Finally, we find that Cdk13 and Cdk12 regulate different sets of genes, with Cdk13 activity mostly involved in growth signaling pathways.

RESULTS

Structure of the Cdk13/Cyclin K Complex

Human Cdk13 is an unusually large kinase that harbors a central kinase domain flanked by an N-terminal SR region and a C-terminal region of low complexity (Figure 1A). Its corresponding cyclin partner CycK contains a cyclin-box domain followed by a wide-stretched proline-rich region with 42% proline content over 220 amino acids. For structure determination, the kinase and cyclin box domains of human Cdk13/CycK were expressed in baculovirus infected insect cells co-transfected with the CDK activating kinase CAK1 from S. cerevisiae. The complex was purified to homogeneity by affinity chromatography and gel filtration analysis (Figure S1). Crystals were grown in the presence of ADP, aluminum fluoride, and the substrate peptide P-pS-YSPTSP-pS-YSPT by the hanging drop diffusion technique (see Experimental Procedures). The crystallographic phases were solved by molecular replacement with the coordinates of Cdk12/CycK as a search model and the structure was determined to a resolution of 2.0 Å. The protein complex was refined to an Rwork of 19.5% and Rfree of 24.7% with excellent stereochemistry (Table S1). Two Cdk13/CycK heterodimers form the asymmetrical unit cell of the protein crystal. No crystallographic density was observed for the substrate peptide but only for the AlF3 transition state mimic and two magnesium ions (Figure 1B).

Cdk13 exhibits the typical kinase fold consisting of an N-terminal lobe (aa 695–794) and a C-terminal lobe (aa 795–998) that share 46.9% sequence identity to Cdk9 and 41.4% sequence identity to Cdk2, respectively. The orientation of the αC helix, also known as the 746PITAIRE helix (Figures 1B and 2), is in the kinase active conformation, mediating contacts to the cyclin subunit. The 855DFG motif at the start of the activation segment and the activation segment itself adopt conformations that allow access of the substrate to the catalytic site. Following the DFG motif is a leucine, L858, indicative for the preference of a SP motif rather than a TP motif for phosphorylation (Chen et al., 2014a).

Similar to Cdk12, Cdk13 contains a C-terminal extension segment (aa 999–1032) following the canonical kinase domain that associates to the kinase domain (Figure 1C). A histidine and a glutamate residue within this stretch, H1018 and E1019, form water-mediated interactions to the ribose of the bound nucleotide (Figure 1D). The interactions of this central HE motif to the kinase domain are complemented by flanking residues D1016 and L1020 that weakly interact with the N- and C-terminal lobes of the kinase domain. The 1016DCHEL sequence is succeeded by the polybasic region 1023KKRRRQK, which is resolved in the second chain of Cdk13 up to M1031, though without displaying electron density for the side chains of the positively charged amino acids (Figure S2). Its conformation reveals that the HE motif and the polybasic cluster form a continuous helix adjacent to the kinase active center.

Cdk/Cyclin Complexes of Transcription Elongation Kinases Adopt an Open Conformation

The orientation of the cyclin subunit with respect to the kinase domain is rotated in Cdk13/CycK by 21° toward an open conformation compared to the structure of Cdk2/CycA (Figure 3A). This leads to exposure of the kinase active center as the first cyclin-box domain of CycK is bent apart from the kinase T-loop element. A similar arrangement has been observed for the Cdk9/CycT1 complex and the Cdk12/CycK complex with rotations of 26° and 24°, respectively, compared to Cdk2/CycA (Figure 3B). One explanation for the twist of the two subunits with respect to each other is that the phospho-threonine in the T-loop segment of the kinase is coordinated only by two of the three canonical arginine residues (Figure 2). Phospho-T871 within the activation segment is clearly visible in the electron density map and forms salt bridges with R836 and R860 (Figure 3C). The third canonical arginine residue instead, R751 of the PITAIRE motif, is about 8.1 Å apart from the phosphate group. It forms an intermolecular salt bridge with the second glutamate of the 105KVEE motif in CycK, which is conserved in cyclins C, H, K, and T (Figure S3). These four cyclins are the corresponding subunits to transcription regulating kinases

Figure 1. Structure of the Human Cdk13/CycK Complex

(A) Domain architectures of human Cdk13 (UniProt: Q14004) and human Cyclin K (O75909). Cdk13 contains an extended N-terminal serine-arginine (SR)-rich region that spans 250 amino acids, followed by the central kinase domain. Human CycK consists of an N-terminal cyclin box domain followed by a region of low complexity of more than 300 amino acids, containing more than 35% prolines.

(B) Overall structure assembly of Cdk13 (green) and CycK (beige). The ADP·AlF3 nucleotide and the phospho-threonine residue in the T loop segment of the kinase are highlighted.

(C) Coordination of the C-terminal extension in Cdk13 following the canonical kinase domain. Residues 1001–1024 are shown as cartoon with the final 2Fo–Fc electron density displayed at 1σ. The structure of the Cdk13 kinase domain is shown as surface representation.

(D) Close up of the interactions of the C-terminal extension helix with the kinase domain and the bound nucleotide. H1018 and E1019 of the DCHEL motif in Cdk13 form water-mediated contacts to ADP complemented by interactions of D1016 with residues of the N- and C-terminal kinase lobes.
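The residue ranges quoted in the article text for Cdk13 can be collected in a small lookup table, which makes it easy to check in which region a given residue lies. The following is only a minimal sketch: the names `CDK13_REGIONS` and `region_of` are hypothetical, and the boundaries are simply those stated in the text (SR region 200–435, N-terminal lobe 695–794, C-terminal lobe 795–998, C-terminal extension 999–1032).

```python
# Region boundaries for human Cdk13 as quoted in the article text.
# The table name and function name are illustrative assumptions.
CDK13_REGIONS = [
    ("SR-rich region", 200, 435),
    ("kinase N-terminal lobe", 695, 794),
    ("kinase C-terminal lobe", 795, 998),
    ("C-terminal extension", 999, 1032),
]


def region_of(residue):
    """Return the name of the annotated region containing a residue number."""
    for name, first, last in CDK13_REGIONS:
        if first <= residue <= last:
            return name
    return "outside the annotated regions"


# Residues discussed in the text: R751 (PITAIRE helix), pT871 (T-loop),
# H1018 (HE motif of the extension helix).
for label, number in [("R751", 751), ("T871", 871), ("H1018", 1018)]:
    print(label, "->", region_of(number))
```

Under these boundaries, R751 falls in the N-terminal lobe, T871 in the C-terminal lobe, and H1018 in the C-terminal extension, in line with the structural description.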

Cdk8, 7, 12 and 13, and 9. A similar arrangement to Cdk13/CycK is found in Cdk12/CycK, where the arginine residue of the PITAIRE motif solely forms intermolecular interactions with the cyclin subunit (Bösken et al., 2014). The arginine of the PITALRE sequence in Cdk9/CycT1 instead contacts both the phospho-threonine of the T-loop and the second glutamate of the KVEE motif (Schulze-Gahmen et al., 2013). For comparison, the three canonical arginines R50, R126, and R150 of Cdk2 all form tight salt bridges with the phospho-threonine pT160 (Russo et al., 1996), with R50 of the PSTAIRE helix interacting additionally with the first glutamate of the KFEE motif in Cyclin A (Figure 3D).

The orientation of the two subunits relative to each other is reflected by the size of the buried surface area of the Cdk/cyclin complexes. Whereas the buried surface area of Cdk2/CycA involves both kinase lobes and cyclin boxes and encompasses 3,286 Å² (PDB: 1JST) when counting both subunits, the Cdk13/CycK interface of 2,002 Å² involves only the N-terminal lobe of the kinase and the first cyclin box repeat. Likewise, Cdk12/CycK covers a buried surface area of 2,161 Å² (PDB: 4NST) and Cdk9/CycT1 only 1,819 Å² (PDB: 3BLQ). The smaller interfaces thus correlate with the open conformation of the Cdk/cyclin subunits, which appears as a common feature of transcription elongation kinases. A detailed view on the molecular interface between Cdk13 and CycK is shown in Figure S4.

Cdk13 Is a Promiscuous Kinase for RNAPII CTD Phosphorylations

We tested the activity of the Cdk13/CycK complex both for its substrate recognition preference as well as for the phosphorylation specificity of the substrate. A series of synthetic CTD peptides was analyzed, each containing three hepta repeats followed by a polyethylene linker and two arginines. Besides the consensus sequence, peptides uniformly phosphorylated at positions Tyr1, Ser2, Thr4, Ser5, and Ser7 were used as well as a Lys7 variant (Itzen et al., 2014). Cdk13/CycK showed the highest activity on the pre-phosphorylated pSer7-CTD substrate, whereas the non-phosphorylated consensus CTD peptide was poorly recognized (Figure 4A). pThr4-, pSer2-, and pTyr1-CTD peptides were all phosphorylated to a rather low

Figure 2. Sequence Alignment of Transcriptional CDKs

Sequence alignment of the kinase domains of human Cdk proteins involved in the regulation of transcription by RNA polymerase II. Secondary structure elements are indicated for Cdk13 as determined here. The HE motif and the polybasic cluster in the C-terminal extension helix αK present in the transcription elongation regulating kinases Cdk9 (UniProt: P50750), Cdk12 (UniProt: Q9NYV4), and Cdk13 are highlighted. The presence of these motifs correlates with a glycine residue in helix αD (G800) that provides space for the assembly of the extension helix with the bound nucleotide. Accession numbers for Cdk8 and Cdk7 are as follows: UniProt: P49336 and P50613. Residues conserved in all CDKs are boxed red, and those that are similar are colored red. The sequence alignment was prepared with MultAlin; the secondary structure alignment was prepared with ESPript.

extent. The pSer5-CTD template and the K7-CTD peptide instead were not recognized for phosphorylations.

In a following experiment, single Ser2, Ser5, or Ser7 phosphorylation marks were placed in the N- or C-terminal repeat of the CTD peptide, giving insights into the direction and recognition specificity of the kinase. Only the peptide pre-phosphorylated at the C-terminal Ser7 position (pS7-C) was decently phosphorylated by Cdk13/CycK, albeit to less than 50% of the uniformly pSer7-CTD substrate control (Figure 4B). The five other single pre-phosphorylated peptides were phosphorylated to the same extent as the consensus CTD. These data suggest that the phosphorylation mark set by Cdk13/CycK is preferentially attached N-terminal to an existing pSer7 site.

To determine the site of phosphorylation set by Cdk13/CycK, we performed western blot analysis using monoclonal antibodies generated against pSer2, pSer5, and pSer7 marks. In addition, the activity of native full-length proteins was compared with recombinant CDK proteins as domains outside the canonical kinase fold could contribute to the phosphorylation specificity. Flag-tagged full-length Cdk13 and Cdk12 with their corresponding CycK subunit and Cdk9 proteins were expressed in HCT116 cells as well as the kinase dead (KD) variants Cdk13 (D855N), Cdk12 (D877N), and Cdk9 (D167N), respectively. Incubation of GST-CTD with anti-flag immuno-precipitated Cdk13 revealed equally strong signals for Ser2 and Ser5 phosphorylation, whereas no Ser7 phosphorylation was detected (Figure 4C). A similar result was obtained for recombinant Cdk13/CycK showing the same phosphorylation specificity as the native proteins. Likewise, native and recombinant Cdk12/CycK complexes exhibited a similar specificity for Ser2 and Ser5 phosphorylations, but again no Ser7 marks were set. In contrast, flag-tagged immuno-precipitated full-length Cdk9 strongly phosphorylated

Figure 3. Transcription-Regulating Cdk/Cyclin Complexes Adopt an Open Conformation
(A) Superimposition of the two kinase subunits of the Cdk13/CycK complex (colored green/wheat, 5EFQ) and the Cdk2/CycA complex (gray/red, 1JST) reveals that the cyclin subunit of the Cdk13/CycK complex is twisted out toward an open conformation of the kinase. The orientation of helix H1 of the two cyclins is displayed by an arrow and the angle between the two helices given.
(B) Detailed view on the orientation of the first canonical helix H1 in CycT1 (left) and CycK (right) based on a superimposition of the kinase subunits in the Cdk/cyclin complexes Cdk9/CycT1 with Cdk2/CycA (3BLQ, 1JST) and Cdk12/CycK with Cdk2/CycA (4NST, 1JST). The cyclin subunits are twisted out by 26° and 24° relative to the kinase domains, suggesting that this assembly is a common feature of transcription elongation kinases.
(C) Close up of the phospho-threonine pT871 coordination in the Cdk13 T-loop segment. The final 2Fo–Fc electron density is displayed at 1σ (left). Ionic interactions are formed between the two canonical CDK arginines R836 and R860 and the phosphate group, whereas R751 of the PITAIRE sequence in helix C mediates electrostatic contacts with E108 of the 105KVEE motif in CycK but is not directly contacting pT871.
(D) Salt bridge formation between the canonical arginines R50, R126, and R150, and the phosphate group of pT160 is a hallmark of CDK activation. This conformation arranges the cyclin subunit in a tight domain assembly.

Ser5 of the CTD and a little Ser7, but no Ser2 sites, similarly as observed before (Czudnochowski et al., 2012). The empty vector control and the kinase dead variants instead showed no phosphorylations of the CTD on Ser2 and Ser7 and severely reduced Ser5 phosphorylation intensity. Input controls by western blot analyses using anti-flag and anti-CycK antibodies confirm the integrity of the Cdk/cyclin complexes and display the amounts of kinases used in the in vitro kinase assays (Figure 4D). The contribution of the C-terminal extension helix to the kinase activity and specificity was probed by mutagenesis of the HE motif and the polybasic cluster (Figure S5). Both motifs were mutated to alanines, yet while the stability of the kinase was impaired for the HE mutant, the preferences for Ser5 phosphorylations of the full-length CTD remained unchanged.

Flavopiridol Inhibits Cdk7 More Potently Than It Does Cdk13
The small molecular compound flavopiridol is a widely used ATP competitive inhibitor of transcription elongation kinases (Baumli et al., 2008; Wang and Fischer, 2008). We tested its ability to inhibit Cdk13/CycK using recombinant proteins and the substrate pS7-CTD. To a concentration of 0.2 µM Cdk13/CycK and 1 mM ATP, flavopiridol was added in a concentration range from 3.1 nM to 310 µM. From a plot of the kinase activity versus the concentration of flavopiridol an in vitro IC50 value of 16.1 µM was determined, rendering flavopiridol a rather poor inhibitor of Cdk13 (Figure 5A). Intriguingly, structural analysis using the crystal structure of Cdk9 bound to flavopiridol (Baumli et al., 2008) superimposed with the structure of Cdk13 revealed a steric hindrance of flavopiridol with H1018 of the Cdk13 C-terminal extension helix (Figure 5B). The simultaneous occupancy of this site by flavopiridol and the HE motif of the extension helix is therefore precluded.

To further analyze the effect of flavopiridol in the inhibition of transcription processes, we tested its efficacy in the inhibition of Cdk7 kinase activity using recombinant protein. To a concentration of 0.2 µM Cdk7/CycH/MAT1, flavopiridol was added in increasing amounts from 1 nM to 100 µM concentration, using 1 mM ATP and 100 µM GST-CTD[9]KKK substrates. From a sigmoidal fit, an IC50 value of 2.3 µM was determined for the inhibition of Cdk7 by flavopiridol (Figure 5C). Flavopiridol thus inhibits Cdk7 more potently than it does Cdk13. This effect is remarkable given that Cdk7 is supposed to be the CDK activating kinase in mammals by its ability to phosphorylate the kinases' T-loops (Larochelle et al., 2012).

The high efficacy of flavopiridol as transcription elongation inhibiting compound was further analyzed in cells. Using targeted small interfering RNAs (siRNAs), the expression levels of endogenous Cdk12 and Cdk13, respectively, were markedly reduced in HCT116 cells (Figure 5D). Yet, compared to the use of a randomized control siRNA, the changes in serine phosphorylation are very small. The use of flavopiridol, however, strongly diminished the phosphorylation levels of all three serines, indicating its broad inhibition of transcription processes.
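The comparison of the two IC50 values can be illustrated with a simple dose-response model. The sketch below is not the fitting procedure used in the study, which is only described as a sigmoidal fit. It assumes a standard Hill-type inhibition curve with a Hill slope of 1 (an assumption, not a reported value), and the function name `fraction_activity` is hypothetical. Concentrations only need to be in the same unit as the IC50 they are compared with.

```python
def fraction_activity(conc: float, ic50: float, hill: float = 1.0) -> float:
    """Remaining kinase activity (between 0 and 1) at inhibitor concentration `conc`.

    Assumed model: activity = 1 / (1 + (conc / ic50) ** hill).
    `conc` and `ic50` must be given in the same concentration unit.
    The Hill slope of 1 is an illustrative assumption.
    """
    if conc < 0 or ic50 <= 0:
        raise ValueError("conc must be >= 0 and ic50 must be > 0")
    return 1.0 / (1.0 + (conc / ic50) ** hill)


# IC50 values as reported for Figure 5A and 5C (same unit for both).
IC50_CDK13 = 16.1
IC50_CDK7 = 2.3

# Ratio of the two IC50 values: how much more potent the inhibition of Cdk7 is.
potency_ratio = IC50_CDK13 / IC50_CDK7

for conc in (0.5, 2.3, 16.1):
    a13 = fraction_activity(conc, IC50_CDK13)
    a7 = fraction_activity(conc, IC50_CDK7)
    print(f"conc={conc:5.1f}  Cdk13 activity={a13:.2f}  Cdk7 activity={a7:.2f}")
print(f"IC50 ratio Cdk13/Cdk7 = {potency_ratio:.1f}")
```

By construction, the model returns half-maximal activity when the inhibitor concentration equals the IC50, so at any given concentration the kinase with the lower IC50 (here Cdk7) retains less activity.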

Figure 4. Substrate Preferences of Cdk13/CycK Phosphorylation
(A) Activity of Cdk13/CycK for a panel of CTD substrate peptides. Three hepta repeats with either no or continuous phosphorylation marks were provided. Cdk13 showed the highest activity for a peptide containing Ser7 pre-phosphorylations. Minor activities were detected for Thr4 and Ser2 pre-phosphorylated peptides and the consensus CTD, while no activity was seen on pSer5 and K7 peptides. A cartoon of the peptides used is shown on the right.
(B) Preferences of Cdk13 phosphorylations. Serine 2, 5, and 7 phosphorylations were set either in the N- or C-terminal repeat of a triple hepta-repeat substrate. Overall, the activity of Cdk13 toward these peptides is weak and only slightly above the consensus CTD. The substrate with the C-terminal pS7 mark gained the highest recognition preference. Data in (A) and (B) are reported as the mean ± SD of three independent experiments.
(C) In vitro kinase assays of Cdk13, Cdk12, and Cdk9 using GST-tagged full-length human CTD as substrate. Flag-tagged full-length kinases were immunoprecipitated from HCT116 cells and compared to recombinant Cdk13 (673–1039)/CycK (1–267) or Cdk12 (696–1082)/CycK (1–267). After incubation, GST-CTD was subjected to western blot analysis using monoclonal antibodies (mAbs) specific for pSer2 (3E10), pSer5 (3E8), and pSer7 (4E12). EV, empty vector control; KD, kinase dead mutants. Exposure times are indicated on the left.
(D) Display of the protein input used in the in vitro kinase assays. Equal amounts of full-length proteins were used as shown by western blots with anti-flag (upper panel) and anti-CycK (middle panel) antibodies. Cdk13 purifications are free of Cdk9 contaminants as shown by western blot analysis with anti-Cdk9 antibody (lower panel).

Pin1 Does Not Change the Phosphorylation Preferences of Transcription Kinases
The CTD contains two serine-proline motifs within each consensus hepta repeat at positions 2/3 and 5/6, respectively. A characteristic feature of the prolyl-peptide bond is a slow rate of cis/trans isomerization (Schiene-Fischer et al., 2013). The 18-kDa Pin1 protein is a human peptidyl-prolyl cis/trans isomerase (PPIase) that binds to and isomerizes phosphorylated S/T-P (pSer/pThr-Pro) motifs (Hanes, 2014). It contains an N-terminal WW domain binding to pS and pT sites followed by the PPIase domain (Figure 6A). To analyze whether the presence of Pin1 affects the CTD phosphorylation mediated by transcription kinases Cdk13, Cdk12, or Cdk9, we performed in vitro kinase assays using 300 ng GST-CTD, 100 ng Pin1, and immuno-precipitated full-length kinases Cdk13-flag/CycK, Cdk12-flag/CycK, or Cdk9-flag/CycT1. For comparison, 100 ng of recombinant protein kinases Cdk13/CycK or Cdk12/CycK was used.

Using antibodies against either pSer2 or pSer5 marks, we find that the presence of Pin1 does not change the preferences of the full-length kinases for phosphorylating Ser5 (Figure 6B). The phosphorylation efficacy of the recombinant proteins Cdk13 and Cdk12, however, was markedly reduced upon addition of Pin1. This effect can be attributed to the presence of the WW

Figure 5. Flavopiridol Inhibits Cdk7 More Potently than Cdk13
(A) Concentration series of flavopiridol for the inhibition of Cdk13/CycK at 0.2 µM kinase concentration. The IC50 value was determined to 16.1 ± 2.0 µM against Cdk13 using pS7-CTD as a substrate.
(B) Model of the proposed position of flavopiridol in the ATP pocket of Cdk13. Flavopiridol clashes with the imidazole ring of H1018 in the C-terminal extension helix of Cdk13. The model is based on a superimposition of the Cdk9–flavopiridol structure (3BLR) with the structure of Cdk13 determined here.
(C) Concentration series of flavopiridol for the inhibition of Cdk7/CycH/MAT1 at 0.2 µM kinase concentration. The IC50 value was determined to 2.3 ± 0.4 µM against Cdk7, using a CTD template with nine hepta repeats. Data in (A) and (C) are reported as the mean ± SD from three independent experiments.
(D) Administration of flavopiridol leads to a decrease in RNA pol II serine phosphorylations. Flavopiridol was applied at 500 nM concentration to HCT116 cells. Depletion of Cdk12 and Cdk13 by siRNA, respectively, led to an approximate 2-fold decrease of Ser2 and Ser5 phosphorylations but did not affect Ser7 phosphorylation. Flavopiridol instead diminished all serine phosphorylations of the RNA pol II CTD, indicating its broad potency in the regulation of transcription kinases.

domain in Pin1 as deletion of the WW domain in the Pin1ΔWW protein variant restores the full kinase activity (Figure 6C). At the high molecular concentrations used in this experiment, the WW domain of Pin1 could mask the CTD for its recognition by the kinases. In a following experiment series, we tested whether pre-incubation of the CTD with Cdk9/CycT1 could change the ability of Cdk13 or Cdk12 to phosphorylate the CTD. Such effect could be a priming of the substrate for further modifications to determine, e.g., the succession of phosphorylation events in the transcription cycle. GST-CTD was pre-incubated for 20 min with immuno-precipitated Cdk9/CycT1 or a kinase dead variant of Cdk9. Pin1 was added optionally for 15 min before recombinant kinases Cdk13/CycK or Cdk12/CycK were subjected for another 40 min to the reaction. Again, a robust Ser5 phosphorylation was seen for Cdk9 in the pre-incubation time together with some Ser7 phosphorylation (Figure 6D, lanes 1 and 2). Addition of Cdk13/CycK or Cdk12/CycK led to additional Ser2 phosphorylation signals (lanes 3–6). Yet, compared to the Cdk9 kinase dead control assay (lanes 7–12), the intensity of the Ser2 phosphorylation bands was only 2- to 3-fold stronger. We therefore conclude that a small increase in Ser2 phosphorylation by transcription kinases Cdk12 and Cdk13 can be observed upon a priming step of the CTD with Cdk9. Addition of Pin1, however, did not much influence the kinase phosphorylation specificities observed in these experiments.

Gene Expression Changes after Knockdown of Cdk13 or Cdk12 Are Markedly Different
Given that the structures of the kinase domains of Cdk13 and Cdk12 are rather similar and that both kinases phosphorylate the CTD of RNAPII alike, we analyzed whether they also regulate the expression of a similar set of genes. To address this question, RNA expression profiling of HCT116 cells depleted of Cdk13 or Cdk12 was performed (Figure 7). Cells were transfected with control, Cdk13, or Cdk12 siRNAs in triplicate, and RNA was isolated after 72 hr for expression analyses. Knockdown efficacy was assessed by western blotting, confirming that Cdk13 and Cdk12 were comparably depleted (Figure 7A). siRNA-mediated knockdown of both kinases resulted in change of expression of hundreds of genes (>1.4-fold with p < 0.05) (Figure 7B). Namely, depletion of Cdk13 diminished expression of 250 genes and increased expression of 242 genes, while knockdown of Cdk12 downregulated 726 and upregulated 435 genes. For a list of all the down- and upregulated genes, see Table S2.

To probe the gene expression changes after Cdk12 knockdown, we selected three downregulated and two upregulated genes from the expression microarray data for further analysis. Cdk12 or Cdk13 were depleted by siRNA in HCT116 cells as shown by western blot analysis (Figure S6A). The qRT-PCR experiment confirmed the same expression pattern for Cdk12 depletion as observed in the microarray experiment, whereas no significant change was seen upon Cdk13 depletion (Figure S6B). Likewise, efficient depletion of Cdk13 from cells (Figure S7A) led to decrease of mRNA levels in three out of four selected genes found downregulated in the microarray (Figure S7B, upper panel), while two out of the three selected genes found upregulated in the microarray assay were modestly upregulated in qRT-PCR (Figure S7B, lower panel). To validate the dependence of the randomly selected transcripts on Cdk13 activity a rescue experiment was performed. Cells were depleted of

Figure 6. Pin1 Does Not Change the Phosphorylation Preferences of Cdk13/CycK
(A) The human peptidyl-prolyl isomerase Pin1 consists of an N-terminal WW domain followed by the isomerase domain. Full-length Pin1 (1–163) and a variant missing the N-terminal WW domain (Pin1ΔWW, 50–163) were generated.
(B) Phosphorylation preferences of transcription kinases Cdk13, Cdk12, and Cdk9, comparing flag-tagged immunoprecipitated full-length Cdk/cyclin proteins with recombinant protein complexes. Pin1 was optionally added in all kinase reactions. The protein input using an anti-CycK antibody or an anti-flag antibody is shown in the lower panels.
(C) The decrease of phosphorylation efficacy of recombinant Cdk13/CycK and Cdk12/CycK depends on the WW domain in Pin1. Whereas full Pin1 reduced the phosphorylation profiles of the recombinant kinases, deletion of the WW domain abrogated this ability. Such effect was not seen for flag-tagged Cdk9.
(D) Pre-incubation of the CTD with Cdk9/CycT1 leads to slightly increased Ser2 phosphorylations. A 20-min pre-incubation period of the CTD with Cdk9/CycT1 shows the expected Ser5/Ser7 phosphorylations (lanes 1 and 2). Administration of Cdk13/CycK or Cdk12/CycK leads to additional pSer2 marks (lanes 3–6). The same treatment using a kinase dead Cdk9 mutant shows significantly lower Ser2 phosphorylation levels (lanes 7–12). Addition of Pin1 only slightly reduced the phosphorylation signals of Cdk13 and Cdk12.

Cdk13 by a single siRNA and after 30 hr expression of Cdk13 was either rescued or not by transfection of a Cdk13 siRNA-insensitive plasmid. The efficacy of Cdk13 knockdown and rescue is shown by western blotting (Figure S7C). qRT-PCR results confirmed downregulation of all selected mRNAs (Figure S7D), and rescue of Cdk13 expression resulted in increased expression of three out of four mRNAs.

To determine whether similar sets of genes and biological processes were affected upon the knockdown of Cdk13 and Cdk12, we analyzed the overlap between down- or upregulated genes for both kinases and performed gene ontology (GO) enrichment analyses of affected genes using the DAVID software. Depletion of Cdk13 and Cdk12 led to downregulation of largely different sets of genes with common downregulation of only 62 genes (Figure 7C, see Venn diagram on the left). Consistent with published work (Blazek et al., 2011; Bartkowiak and Greenleaf, 2015; Liang et al., 2015), GO analysis of genes downregulated upon Cdk12 knockdown indicated those genes to be enriched in processes related to DNA replication and repair (Figure 7C, upper bar graphs). In contrast, the GO analysis of Cdk13-dependent genes showed enrichment of functions connected to various extracellular and growth signaling pathways (Figure 7C, middle bar graphs). Genes with lowered expression upon both Cdk12 and Cdk13 knockdowns are directing basic biological processes such as regulation of phosphatase activity and glycerolipid metabolism (Figure 7C, lower bar graphs). The analyses of upregulated genes are consistent with the finding that mostly different sets of genes and biological processes are regulated by both kinases (Figure 7D). Only 49 genes were commonly upregulated when either Cdk12 or Cdk13 was depleted; otherwise, different genes increased their expression (Figure 7D). Concomitantly, the upregulated genes for both kinases were enriched in different biological processes with the exception of their role in neuron development (Figure 7D, see bar graphs on the right) in line with findings of a recent study (Chen et al., 2014b). Together, the expression array and bioinformatics analyses show that Cdk12 and Cdk13 regulate expression of a markedly different set of genes that regulate dissimilar biological processes in human cells.

DISCUSSION

Cdk13 contains a C-terminal extension helix following the kinase domain that interacts with the ribose of the bound ATP through water-mediated contacts. A similar mode of interaction has been found for Cdk12 and Cdk9 (Bösken et al., 2014; Baumli et al., 2012), suggesting that this conformation is a molecular feature of transcription elongation regulating kinases. A polybasic cluster of six positively charged residues (KKRRRQK) and an aromatic residue followed by a glutamic acid (HE) constitute the conserved motifs of the extension helix in Cdk13. The association of the basic patch in close proximity to the ATP binding site could facilitate an electrostatic association to the highly negatively charged RNAPII CTD and potentially also define a register for the association of the phosphate groups within the structure of the CTD (Figure S2). The extension segment also creates a unique means for the specific interaction with potential inhibitors. Superimposition of the structure of Cdk13·ADP with

Figure 7. Knockdown of Cdk12 or Cdk13 Exhibits Different Gene Expression Profiles
(A) Knockdown efficacy of Cdk12 and Cdk13. Anti-Cdk12 and -Cdk13 antibodies show the expression levels of these kinases together with an actin antibody as loading control.
(B) Genes that were differentially regulated after siRNA knockdown of Cdk12 (siCdk12) or Cdk13 (siCdk13) in HCT116 cells were determined by expression microarray. Relative average expression levels of genes (x axis, log2) are plotted against their fold change (knockdown/control, y axis, log2). Significantly downregulated genes (>1.4-fold, p < 0.05) or upregulated genes (>1.4-fold, p < 0.05) are indicated for Cdk12 knockdown in the left graph (downregulated, green; upregulated, red) and for Cdk13 knockdown in the right graph (downregulated, blue; upregulated, yellow).
(C) The overlap in altered genes between Cdk12 and Cdk13 knockdown was examined (Venn diagram, left panel). Biological processes in which altered genes are involved were determined by GO enrichment analysis (DAVID, EASE threshold 0.05). The ten most significantly represented biological processes are summarized in bar graphs for Cdk12 knockdown (upper panel, green), for Cdk13 knockdown (middle panel, blue), or for genes that were similarly downregulated (lower panel, purple). The x axis indicates the –log10 p value.
(D) The same analysis was performed for genes that were significantly upregulated (>1.4-fold, p < 0.05) after siRNA-mediated knockdown of Cdk12 (red) or Cdk13 (yellow). The overlap of altered genes for each knockdown was determined (Venn diagram, left). The most significantly represented biological processes are shown for Cdk12 knockdown (red bars), for Cdk13 knockdown (yellow bars), or genes that were similarly upregulated after Cdk12 and Cdk13 knockdown (orange bar).
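The selection criteria quoted in this legend (more than 1.4-fold change with p < 0.05) and the overlap shown in the Venn diagrams can be written as a small filter. The following is an illustrative sketch only, not the analysis pipeline used for the microarray data. It assumes fold change is given as a linear knockdown/control ratio, so that "downregulated more than 1.4-fold" means a ratio below 1/1.4. The gene names and numbers are made-up toy values, and `split_regulated` is a hypothetical helper.

```python
from typing import Dict, Set, Tuple

FOLD_CUTOFF = 1.4  # fold-change threshold quoted in the figure legend
P_CUTOFF = 0.05    # significance threshold quoted in the figure legend

# gene -> (fold change as knockdown/control ratio, p value)
Results = Dict[str, Tuple[float, float]]


def split_regulated(results: Results) -> Tuple[Set[str], Set[str]]:
    """Return (downregulated, upregulated) gene sets under the assumed thresholds."""
    down: Set[str] = set()
    up: Set[str] = set()
    for gene, (fold, p_value) in results.items():
        if p_value >= P_CUTOFF:
            continue  # not significant, ignore regardless of fold change
        if fold > FOLD_CUTOFF:
            up.add(gene)
        elif fold < 1.0 / FOLD_CUTOFF:
            down.add(gene)
    return down, up


# Toy values for illustration only (hypothetical genes, not data from the study).
si_cdk12: Results = {
    "GENE_A": (0.50, 0.01),
    "GENE_B": (0.60, 0.20),
    "GENE_C": (2.00, 0.03),
    "GENE_D": (0.65, 0.04),
}
si_cdk13: Results = {
    "GENE_A": (0.55, 0.02),
    "GENE_B": (0.40, 0.01),
    "GENE_C": (1.10, 0.01),
    "GENE_D": (0.95, 0.50),
}

down12, up12 = split_regulated(si_cdk12)
down13, up13 = split_regulated(si_cdk13)

# Set intersection gives the genes in the shared region of the Venn diagram.
common_down = down12 & down13
print("commonly downregulated:", sorted(common_down))
```

Set intersection keeps the overlap step explicit: the shared region of each Venn diagram is the intersection of the two per-knockdown gene sets.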

Cdk9·flavopiridol shows a steric clash of H1018 of the Cdk13 HE motif with a benzol ring of flavopiridol (Figure 5). A compound designed to mediate direct hydrogen bond interactions with the HE residues of Cdk13 might instead exhibit a higher binding affinity to this kinase and a higher specificity of inhibition. Of note, the crystal structure of Cdk9 in complex with flavopiridol (Baumli et al., 2008) (PDB: 3BLR) was determined with a C-terminally truncated protein variant, missing the FE motif in Cdk9 and the polybasic cluster.

The coordination of the phosphorylated threonine in the kinase T-loop emerges as another specific feature of transcription regulating CDKs. The formation of three salt bridges between arginine residues from sequentially distant loop sections of the kinase to the phosphorylated threonine of the T-loop is a hallmark of CDK activation (Johnson and Lewis, 2001). These salt bridges lead to stabilization of the loop sections, mostly the T-loop, a twist of the DFGxxR motif, the positioning of the catalytic aspartate from the HRD motif, and a re-orientation of the C-helix that contains the PITAIRE motif (Figures 1 and 2). Intriguingly, the third canonical salt bridge between the R from the PITAIRE motif and the pT is not formed in either Cdk13 or Cdk12 (Figure 3). Instead, a salt bridge is formed between the arginine and the second glutamic acid of the KVEE motif, giving room for the association of substrate residues.

The regulation of Cdk13 activity is not well understood today as the expression levels of the corresponding cyclin subunit, CycK, are rather stable. This is similar to other cyclins controlling transcriptional CDKs, e.g., CycT1, but different from cell-cycle-regulating cyclins such as CycA. The Cdk13 kinase is unusually large with 690 residues preceding the kinase domain and 510 residues following the kinase domain. Regulation steps could possibly be mediated by cellular localization or by intra-molecular interactions, e.g., via the N-terminal SR elements. Despite being involved in constitutive and alternative splicing events, SR proteins are thought to attach to newly made pre-mRNA to prevent the pre-mRNA from binding to the coding DNA strand to increase genome stabilization (Zhou and Fu, 2013).

The peptidyl-prolyl isomerase Pin1 does not change the phosphorylation preferences of the transcription elongation kinases Cdk9, Cdk12, and Cdk13 (Figure 6). Using recombinant proteins or flag-tag immuno-precipitated full-length proteins with the corresponding cyclin subunits, we could not see any shift toward increased Ser2 phosphorylation levels of the CTD. Likewise, the activity of the kinases is not changed upon the presence of Pin1. This result is similar to the recently published observation that the Pin1 ortholog Ess1 from yeast does not stimulate the activity of Cdk12 (Bartkowiak and Greenleaf, 2015). Whereas writing of the CTD phosphorylations is not influenced by the isomerase, dephosphorylation by the CTD phosphatase Fcp1 was shown to be modulated by Pin1 (Xu et al., 2003). Hyper-phosphorylation of the CTD thus correlates with the inhibition of Fcp1 by Pin1, whose activity is high in early stages of the transcription cycle (Xu and Manley, 2007). Yet, pre-incubation of the CTD with Cdk9/CycT1 resulting in a strong Ser5 phosphorylation signal led to a small increase in pSer2 marks after exposure to either Cdk12 or Cdk13.

Despite the fact that Cdk13 and Cdk12 have similar kinase domains and share the same regulating cyclin, both kinases regulate expression of a largely different set of genes and biological processes. Consistent with our expression analyses are results of a recent study showing that Cdk12 knockout leads to an early post-implantation embryonal lethality in mice pointing to a very specific and non-redundant role of Cdk13 and Cdk12 in development perhaps by the regulation of expression of different gene sets (Chen et al., 2014b). Notably, the same study showed that Cdk13 and Cdk12 regulate also expression of a common gene, Cdk5, to direct axonal growth. The gene expression profiles are also consistent with a prominent role of Cdk12 in the expression of the c-FOS proto-oncogene induced by the epidermal growth factor (Eifler et al., 2015). Further evidence pointing to different function of both kinases was provided by knockdown of Cdk13 and Cdk12 in embryonic stem cells, which resulted in differentiation via regulation of expression of different sets of genes (Dai et al., 2012).

Cyclin-dependent kinases have evolved as important target proteins for the treatment of multiple diseases, including cancer and leukemia (Malumbres and Barbacid, 2009). Despite the Cdk/cyclin pairs driving the cell cycle (interphase regulating Cdk2, Cdk4, and Cdk6, and mitotic Cdk5), transcriptional CDKs 7, 8, 9, 12, and 13 are now increasingly recognized as important factors of tumor oncogenesis. Discovery of a covalent Cdk7 inhibitor targeting a cysteine residue outside the canonical kinase domain provided an unanticipated means of achieving selectivity (Kwiatkowski et al., 2014). Inhibition of Cdk9 activity has been identified as another means to counteract cancer cell progression, mixed lineage leukemia, and HIV (Wang and Fischer, 2008). With the crystal structure of Cdk13/CycK determined, we provide the structural basis for targeted drug discovery against this human kinase.

EXPERIMENTAL PROCEDURES

Plasmids and Proteins
Cdk13/CycK proteins were expressed in baculo-virus-infected Sf9 cells and purified as described in the Supplemental Experimental Procedures.

Crystallization and Structure Determination
For crystallization, the purified Cdk13/CycK complex was mixed at 85 µM concentration with ADP, AlF3, MgCl2 and substrate peptide P-pS-YSPTSP-pS-YSPT in molar ratios of 1:8:32:64:8 and incubated on ice for 30 min. Initial crystals were obtained using the hanging drop vapor diffusion technique at 293 K. The crystal structure was determined and refined as described in the Supplemental Experimental Procedures and Table S1.

RNA Polymerase II CTD Substrate Peptides
For kinase activity analyses, various CTD polypeptides were purchased from Biosyntan or synthesized in house with 95% purity (high-performance liquid chromatography [HPLC] grade). The peptide used for crystallization (ac-P-pS-YSPTSP-pS-YSPT-amid) contained two phosphorylated serine residues at heptad position 7. For quantitative analysis in electrospray ionization mass spectrometry (ESI-MS) experiments or radioactive filter-binding assays, CTD substrate peptides were marked at the C terminus with a double arginine motif separated by a polyethylene glycol spacer. The arginines were set for improved transfer rates in quantitative filter binding assays and for improved ionization properties in ESI-MS analyses.

In Vitro Kinase Assays
Radioactive kinase reactions (typically 35 µl) were carried out with recombinant, highly purified proteins using a standard protocol. In short, Cdk13/CycK (0.2–0.5 µM) was pre-incubated with CTD substrates (peptides 100 µM; GST-CTD full length [f.l.] 10 µM) for 5–10 min at room temperature

in kinase buffer (150 mM HEPES [pH 7.6], 34 mM KCl, 7 mM MgCl2, 2.5 mM dithiothreitol, 5 mM β-glycerol phosphate, 1× PhosSTOP [Roche]). Cold ATP (to a final concentration of 1–2 mM) and 3 µCi [32P]-γ-ATP (Perkin-Elmer) were added, and the reaction mixture was incubated up to 30 min at 30°C at 350 rpm. Reactions were stopped by adding EDTA to a final concentration of 50 mM. For reactions using GST-CTD, aliquots of 11 µl each were spotted onto P81 Whatman paper squares. For substrate peptides, Optitran BA-S85 reinforced membrane was used. Paper squares were washed three times for 5 min with 0.75% (v/v) phosphoric acid, with at least 5 ml washing solution per paper. Radioactivity was counted in a Beckman Scintillation Counter (Beckman Coulter) for 1 min. Measurements were performed in triplicate and are represented as mean with SD.

Kinase Assays with Flag-Tagged Proteins
Plasmids of full-length flag-tagged proteins, immuno-precipitations from cell lysate, and kinase assays are described in the Supplemental Experimental Procedures.

Short Interference RNA Transfections and Flavopiridol Treatment
HCT116 cells were plated at 20% confluency in 6-well plates in culture medium without antibiotics. After 24 hr cultivation, transfections using Cdk12, Cdk13 or control siRNA (Santa Cruz Biotechnology) were performed or cells were just mock transfected. The transfections were done in a total volume of 2.5 ml containing 2.5 µl of 10 µM siRNA and 5 µl of Lipofectamine RNAiMax (Invitrogen) according to the manufacturer's instructions. Transfection mix was removed 3 hr later, and cells were grown for another 69 hr in fresh media. Mock-transfected cells were treated with flavopiridol (500 nM final concentration) 2 hr prior to cell harvest. Then, cells were lysed in buffer containing 20 mM HEPES/KOH (pH 7.9), 15% glycerol, 0.2% NP-40, 150 mM KCl, 1 mM DTT, 0.2 mM EDTA, and protease inhibitor (Sigma), and the CTD phosphorylations were analyzed by western blotting with phospho-specific antibodies.

Western Blots
For western blot analysis, kinase assays were run with 0.2 µM Cdk13/CycK, 0.2 µM Cdk12/CycK, 5 µM Pin1, 5 µM Pin1ΔWW, 10 µM GST-CTD f.l., and 2 mM ATP in kinase buffer. After incubation with Cdk13/CycK, GST-CTD proteins were subjected to SDS-PAGE on a 12% gel before transfer to nitrocellulose (GE Healthcare). Monoclonal antibodies specific for RNAPII CTD phosphorylation sites pSer2 (3E10), pSer5 (3E8), and pSer7 (4E12) were used as described previously (Bösken et al., 2014). Membranes were stained with chicken anti-rat immunoglobulin G (IgG) horseradish peroxidase (HRP) secondary antibodies, and antibody recognition was revealed by enhanced chemiluminescence.

Expression Microarrays
For expression microarrays HCT116 cells were transfected in triplicate with 1 µl of 10 µM CTRL, Cdk12, and Cdk13 siRNAs as described above. After 72 hr, cells were harvested and RNA isolated with miRNeasy kit (QIAGEN) according to the manufacturer's instructions. A total of nine samples was used: three from the siRNA CTRL, three from siRNA Cdk12, and three from siRNA Cdk13 knockdowns. Human Gene 1.0ST microarrays from Affymetrix were used. Details on sample preparation and data analyses are provided in the Supplemental Information.

ACCESSION NUMBERS

Structure coordinates and diffraction data of the Cdk13/CycK complex were deposited in the Protein Data Bank (http://www.pdb.org) under accession code PDB: 5EFQ.

SUPPLEMENTAL INFORMATION

Supplemental Information includes Supplemental Experimental Procedures, seven figures, and two tables and can be found with this article online at http://dx.doi.org/10.1016/j.celrep.2015.12.025.

AUTHOR CONTRIBUTIONS

A.K.G. expressed and purified the proteins and performed the biochemical activity assays. D.H. crystallized the proteins with the help of C.A.B. and determined the structure together with C.A.B. and K.A. Kinase activity measurements with flag-tagged proteins and gene ontology studies were performed by K.P. and K.B. under the supervision of D.B. Experiments with Pin1 were performed by R.D. M.G. designed the study and wrote the manuscript with support of D.B. All authors discussed the results and commented on the manuscript.

ACKNOWLEDGMENTS

We thank Karin Vogel-Bachmayr and Sascha Gentz for excellent technical assistance, Ingrid Vetter for help with crystal data evaluation, the beamline staff at the SLS Villigen, Switzerland, for help with data collection, and Pavla Gajdusková for cloning flag-tagged expression constructs. Microarray profiling was performed at the Gladstone Genomics Core and statistical analysis of the microarray data was conducted by Alex Williams at the Gladstone Bioinformatics Core. M.G. is a member of the DFG excellence cluster ImmunoSensation. D.B. is supported by a project "CEITEC – Central European Institute of Technology" (CZ.1.05/1.1.00/02.0068) and a GACR grant (14-09979S). This work was supported by a grant from the Deutsche Forschungsgemeinschaft to M.G. (GE 976/9-1).

Received: July 7, 2015
Revised: October 29, 2015
Accepted: November 30, 2015
Published: December 31, 2015

REFERENCES

Adelman, K., and Lis, J.T. (2012). Promoter-proximal pausing of RNA polymerase II: emerging roles in metazoans. Nat. Rev. Genet. 13, 720–731.

Allen, B.L., and Taatjes, D.J. (2015). The Mediator complex: a central integrator of transcription. Nat. Rev. Mol. Cell Biol. 16, 155–166.

Bartkowiak, B., and Greenleaf, A.L. (2015). Expression, purification, and identification of associated proteins of the full-length hCDK12/CyclinK complex. J. Biol. Chem. 290, 1786–1795.

Bartkowiak, B., Liu, P., Phatnani, H.P., Fuda, N.J., Cooper, J.J., Price, D.H., Adelman, K., Lis, J.T., and Greenleaf, A.L. (2010). CDK12 is a transcription elongation-associated CTD kinase, the metazoan ortholog of yeast Ctk1. Genes Dev. 24, 2303–2316.

Baumli, S., Lolli, G., Lowe, E.D., Troiani, S., Rusconi, L., Bullock, A.N., Debreczeni, J.E., Knapp, S., and Johnson, L.N. (2008). The structure of P-TEFb (CDK9/cyclin T1), its complex with flavopiridol and regulation by phosphorylation. EMBO J. 27, 1907–1918.

Baumli, S., Hole, A.J., Wang, L.Z., Noble, M.E., and Endicott, J.A. (2012). The CDK9 tail determines the reaction pathway of positive transcription elongation factor b. Structure 20, 1788–1795.

Berro, R., Pedati, C., Kehn-Hall, K., Wu, W., Klase, Z., Even, Y., Genevière, A.M., Ammosova, T., Nekhai, S., and Kashanchi, F. (2008). CDK13, a new potential human immunodeficiency virus type 1 inhibitory factor regulating viral mRNA splicing. J. Virol. 82, 7155–7166.

Blazek, D., Kohoutek, J., Bartholomeeusen, K., Johansen, E., Hulinkova, P., Luo, Z., Cimermancic, P., Ule, J., and Peterlin, B.M. (2011). The Cyclin K/Cdk12 complex maintains genomic stability via regulation of expression of DNA damage response genes. Genes Dev. 25, 2158–2172.

Bösken, C.A., Farnung, L., Hintermair, C., Merzel Schachter, M., Vogel-Bachmayr, K., Blazek, D., Anand, K., Fisher, R.P., Eick, D., and Geyer, M. (2014). The structure and substrate specificity of human Cdk12/Cyclin K. Nat. Commun. 5, 3505.

Buratowski, S. (2009). Progression through the RNA polymerase II CTD cycle. Mol. Cell 36, 541–546.

The Cancer Genome Atlas Research Network (2011). Integrated genomic analyses of ovarian carcinoma. Nature 474, 609–615.
Chen, C., Ha, B.H., Thévenin, A.F., Lou, H.J., Zhang, R., Yip, K.Y., Peterson, J.R., Gerstein, M., Kim, P.M., Filippakopoulos, P., et al. (2014a). Identification of a major determinant for serine-threonine kinase phosphoacceptor specificity. Mol. Cell 53, 140–147.
Chen, H.R., Lin, G.T., Huang, C.K., and Fann, M.J. (2014b). Cdk12 and Cdk13 regulate axonal elongation through a common signaling pathway that modulates Cdk5 expression. Exp. Neurol. 261, 10–21.
Corden, J.L. (2013). RNA polymerase II C-terminal domain: Tethering transcription to transcript and template. Chem. Rev. 113, 8423–8455.
Czudnochowski, N., Bösken, C.A., and Geyer, M. (2012). Serine-7 but not serine-5 phosphorylation primes RNA polymerase II CTD for P-TEFb recognition. Nat. Commun. 3, 842.
Dai, Q., Lei, T., Zhao, C., Zhong, J., Tang, Y.Z., Chen, B., Yang, J., Li, C., Wang, S., Song, X., et al. (2012). Cyclin K-containing kinase complexes maintain self-renewal in murine embryonic stem cells. J. Biol. Chem. 287, 25344–25352.
Davidson, L., Muniz, L., and West, S. (2014). 3′ end formation of pre-mRNA and phosphorylation of Ser2 on the RNA polymerase II CTD are reciprocally coupled in human cells. Genes Dev. 28, 342–356.
Eick, D., and Geyer, M. (2013). The RNA polymerase II carboxy-terminal domain (CTD) code. Chem. Rev. 113, 8456–8490.
Eifler, T.T., Shao, W., Bartholomeeusen, K., Fujinaga, K., Jäger, S., Johnson, J.R., Luo, Z., Krogan, N.J., and Peterlin, B.M. (2015). Cyclin-dependent kinase 12 increases 3′ end processing of growth factor-induced c-FOS transcripts. Mol. Cell. Biol. 35, 468–478.
Ekumi, K.M., Paculova, H., Lenasi, T., Pospichalova, V., Bösken, C.A., Rybarikova, J., Bryja, V., Geyer, M., Blazek, D., and Barboric, M. (2015). Ovarian carcinoma CDK12 mutations misregulate expression of DNA repair genes via deficient formation and function of the Cdk12/CycK complex. Nucleic Acids Res. 43, 2575–2589.
Ghamari, A., van de Corput, M.P., Thongjuea, S., van Cappellen, W.A., van Ijcken, W., van Haren, J., Soler, E., Eick, D., Lenhard, B., and Grosveld, F.G. (2013). In vivo live imaging of RNA polymerase II transcription factories in primary cells. Genes Dev. 27, 767–777.
Ghosh, G., and Adams, J.A. (2011). Phosphorylation mechanism and structure of serine-arginine protein kinases. FEBS J. 278, 587–597.
Grünberg, S., and Hahn, S. (2013). Structural insights into transcription initiation by RNA polymerase II. Trends Biochem. Sci. 38, 603–611.
Hanes, S.D. (2014). The Ess1 prolyl isomerase: traffic cop of the RNA polymerase II transcription cycle. Biochim. Biophys. Acta 1839, 316–333.
Itzen, F., Greifenberg, A.K., Bösken, C.A., and Geyer, M. (2014). Brd4 activates P-TEFb for RNA polymerase II CTD phosphorylation. Nucleic Acids Res. 42, 7577–7590.
Johnson, L.N., and Lewis, R.J. (2001). Structural basis for control by phosphorylation. Chem. Rev. 101, 2209–2242.
Joshi, P.M., Sutor, S.L., Huntoon, C.J., and Karnitz, L.M. (2014). Ovarian cancer-associated mutations disable catalytic activity of CDK12, a kinase that promotes homologous recombination repair and resistance to cisplatin and poly(ADP-ribose) polymerase inhibitors. J. Biol. Chem. 289, 9247–9253.
Kwiatkowski, N., Zhang, T., Rahl, P.B., Abraham, B.J., Reddy, J., Ficarro, S.B., Dastur, A., Amzallag, A., Ramaswamy, S., Tesar, B., et al. (2014). Targeting transcription regulation in cancer with a covalent CDK7 inhibitor. Nature 511, 616–620.
Larochelle, S., Amat, R., Glover-Cutter, K., Sansó, M., Zhang, C., Allen, J.J., Shokat, K.M., Bentley, D.L., and Fisher, R.P. (2012). Cyclin-dependent kinase control of the initiation-to-elongation switch of RNA polymerase II. Nat. Struct. Mol. Biol. 19, 1108–1115.
Liang, K., Gao, X., Gilmore, J.M., Florens, L., Washburn, M.P., Smith, E., and Shilatifard, A. (2015). Characterization of human cyclin-dependent kinase 12 (CDK12) and CDK13 complexes in C-terminal domain phosphorylation, gene transcription, and RNA processing. Mol. Cell. Biol. 35, 928–938.
Malumbres, M., and Barbacid, M. (2009). Cell cycle, CDKs and cancer: a changing paradigm. Nat. Rev. Cancer 9, 153–166.
Peterlin, B.M., and Price, D.H. (2006). Controlling the elongation phase of transcription with P-TEFb. Mol. Cell 23, 297–305.
Russo, A.A., Jeffrey, P.D., and Pavletich, N.P. (1996). Structural basis of cyclin-dependent kinase activation by phosphorylation. Nat. Struct. Biol. 3, 696–700.
Schiene-Fischer, C., Aumüller, T., and Fischer, G. (2013). Peptide bond cis/trans isomerases: a biocatalysis perspective of conformational dynamics in proteins. Top. Curr. Chem. 328, 35–67.
Schulze-Gahmen, U., Upton, H., Birnberg, A., Bao, K., Chou, S., Krogan, N.J., Zhou, Q., and Alber, T. (2013). The AFF4 scaffold binds human P-TEFb adjacent to HIV Tat. eLife 2, e00327.
St Amour, C.V., Sansó, M., Bösken, C.A., Lee, K.M., Larochelle, S., Zhang, C., Shokat, K.M., Geyer, M., and Fisher, R.P. (2012). Separate domains of fission yeast Cdk9 (P-TEFb) are required for capping enzyme recruitment and primed (Ser7-phosphorylated) Rpb1 carboxyl-terminal domain substrate recognition. Mol. Cell. Biol. 32, 2372–2383.
Wang, S., and Fischer, P.M. (2008). Cyclin-dependent kinase 9: a key transcriptional regulator and potential drug target in oncology, virology and cardiology. Trends Pharmacol. Sci. 29, 302–313.
Xu, Y.X., and Manley, J.L. (2007). Pin1 modulates RNA polymerase II activity during the transcription cycle. Genes Dev. 21, 2950–2962.
Xu, Y.X., Hirose, Y., Zhou, X.Z., Lu, K.P., and Manley, J.L. (2003). Pin1 modulates the structure and function of human RNA polymerase II. Genes Dev. 17, 2765–2776.
Zhou, Z., and Fu, X.D. (2013). Regulation of splicing by SR proteins and SR protein-specific kinases. Chromosoma 122, 191–207.

Cell Reports 14, 320–331, January 12, 2016 ©2016 The Authors

Cell Reports
Supplemental Information

Structural and Functional Analysis of the Cdk13/Cyclin K Complex

Ann Katrin Greifenberg, Dana Hönig, Kveta Pilarova, Robert Düster, Koen Bartholomeeusen, Christian A. Bösken, Kanchan Anand, Dalibor Blazek, and Matthias Geyer

Figure S1. Generation of the Cdk13/CycK complex and analysis of its substrate phosphorylation specificity, related to Figure 1
(A) Coomassie-stained SDS-PAGE display of the human Cdk13/CycK complex used for crystallization and activity determination.
(B) ESI mass spectrometry analysis of purified Cdk13/CycK. The molecular mass of human Cdk13 (694-1039) was calculated to 40,382.8 Da. The measured mass of 40,463 Da indicated the modification of the protein with a single phosphorylation group (+80 Da). The molecular mass of human CycK (1-267) was measured as calculated.
(C) Western blot analysis of Cdk13/CycK-mediated CTD phosphorylation. A human GST-CTD protein containing all 52 hepta-repeats was used as kinase substrate. The CTD phosphorylation was followed in time course experiments over 8 hrs. Phospho-serine-directed antibodies reveal a preference of Cdk13/CycK for Ser5 phosphorylation of the CTD, with a minor fraction of Ser7 phosphorylation and a faint band indicating Ser2 phosphorylation.
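The mass comparison in panel B can be checked with simple arithmetic. The sketch below is illustrative only and is not part of the original analysis: the function name is hypothetical, and the value of 79.98 Da for one added HPO3 group is an assumption (the caption rounds the shift to +80 Da).

```python
# Illustrative sketch: estimate how many phosphate groups explain the
# difference between a measured and a calculated protein mass.
PHOSPHO_SHIFT_DA = 79.98  # assumed average mass added per phosphorylation (HPO3)


def estimate_phospho_groups(calculated_da: float, measured_da: float) -> int:
    """Return the nearest whole number of phosphorylations for the mass shift."""
    shift = measured_da - calculated_da
    return round(shift / PHOSPHO_SHIFT_DA)


# Masses for Cdk13 (694-1039) as quoted in the Figure S1B caption.
n_groups = estimate_phospho_groups(40382.8, 40463.0)
print(n_groups)
```

With the caption's values the shift is about 80.2 Da, which is closest to a single phosphorylation.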


Figure S2. Structural features of Cdk13/CycK, related to Figure 1
(A) Display of the C-terminal extension helix. Cartoon diagram of Cdk13 from the second Cdk13/CycK complex of the asymmetric unit (PDB accession code 5EFQ; chain C). Residues of the DCHEL motif (starting at residue 1016) and the polybasic cluster KKRRRQK (starting at residue 1023) form an extended helix. Amino acid side chains are shown as they were built from the electron density map.
(B) Residues of the C-terminal extension helix up to M1031 are shown in stick representation with the final 2Fo–Fc electron density displayed at 1σ. The structure of the Cdk13 kinase domain from chain C is shown as surface representation. The position of glycine 800 that provides space for the association of the extension helix is marked.
(C) Electrostatic surface display of Cdk13/CycK (chains A/B). The polybasic cluster at the C-terminal extension helix of Cdk13 is not contained in this display. The position of the phosphorylated T871 residue in the T-loop of the kinase is indicated. Electrostatic surface charges are shown from −5 kBT (red) to +5 kBT (blue).


Figure S3. Alignment of Cyclin sequences involved in transcription regulation, related to Figure 2
Sequence alignment of human Cyclin proteins involved in the regulation of transcription. Secondary structure elements are indicated for CycK as derived from the complex with Cdk13. The first cyclin box comprises helices H1 to H5 and the second cyclin box comprises helices H1' to H5', with an N-terminal helix HN preceding the first cyclin box. Residues conserved in all Cyclin proteins are boxed red and those that are similar are colored red. Specific Cdk/Cyclin pairs in the transcription cycle are Cdk13/CycK, Cdk12/CycK, Cdk9/CycT1, Cdk9/CycT2, Cdk8/CycC and Cdk7/CycH. The UniProt accession numbers of the sequences displayed are O75909 (Cyclin K, human), O60563 (Cyclin T1, human), O60583 (Cyclin T2, human), P24863 (Cyclin C, human) and P51946 (Cyclin H, human). The sequence alignment was prepared with MultAlin and the secondary structure alignment was prepared with ESPript.


Figure S4. Close-up of the Cdk13/CycK interaction, related to Figure 3
(A) Residues on helix H3 of CycK and the following loop section (F101 to K112) interact with helix C in the N-terminal lobe of Cdk13.
(B) A second interaction site for Cdk13 binding is formed by residues on helix H5 and the adjacent loop (M141 to Q156) of CycK.
(C) Residues at the N-terminus of the Cdk13 kinase domain (D696, K699) interact with the N-terminus of CycK and helix H5. I770 and T772 of β-strand 4 contribute to the binding by their interaction with helix H5 of CycK. The entire interaction site of Cdk13 to CycK is formed by residues located only in the N-terminal lobe (695-794) of the kinase domain.


Figure S5. Characterization of the HE motif and the polybasic cluster in the extension helix of Cdk13, related to Figure 4 (A) Display of the ‘HE to alanine’ mutant (HE-Ala) and the ‘polybasic cluster to alanine’ mutant (KR-Ala) in human Cdk13. The secondary structure elements are indicated above the sequence. (B,C) ESI mass spectrometry analyses of purified Cdk13/CycK mutants HE-Ala and KR-Ala. The calculated masses and the measured masses are indicated, suggesting the modification of each kinase with a single phosphorylation group (+80 Da), respectively. (D,E) Western blot analysis of mutant Cdk13/CycK mediated CTD phosphorylation. A human GST-CTD protein containing all 52 hepta-repeats was used as kinase substrate. The CTD-phosphorylation was followed in time course experiments over 8 hrs. No changes of the phosphorylation preference for Ser5 of the CTD were observed in the two mutant proteins of Cdk13.


Figure S6. Analysis of the Cdk12 expression microarray data by RT-qPCR, related to Figure 7 (A) Kinases were knocked down in HCT116 cells by the indicated siRNAs and protein levels were analyzed by Western blotting with antibodies indicated on the side of the panels. (B,C) The two graphs depict examples for relative mRNA levels of randomly selected genes that were found to be either down-regulated (B) or up-regulated (C) in the expression microarray assay after Cdk12 knockdown. The mRNA levels were measured by RT-qPCR. The blue, red and green bars indicate the levels of mRNA of indicated genes in HCT116 cells treated with control, Cdk12 and Cdk13(#1) siRNA, respectively. Data are normalized to HPRT mRNA levels and results represent data from two biological replicates.


Figure S7. Analysis of the Cdk13 expression microarray data by RT-qPCR, related to Figure 7
(A) Levels of proteins indicated on the right were determined by Western blotting using whole cell lysates of HCT116 cells treated with the control or Cdk13(#1) siRNA.
(B) Graphs depict examples of the randomly selected genes that were found either down-regulated (upper graph) or up-regulated (lower graph) in the expression microarray after Cdk13 knockdown. mRNA levels were measured by RT-qPCR. The blue and red bars indicate the levels of mRNA of indicated genes in cells treated with control and Cdk13(#1) siRNA, respectively. Data are normalized to HPRT mRNA levels and results represent data from two biological replicates.
(C) Levels of proteins were determined by Western blotting using whole cell lysates of HCT116 cells treated with the control or Cdk13(#2) siRNA or with Cdk13(#2) siRNA reconstituted by overexpression of a plasmid carrying siRNA-insensitive Cdk13.
(D) Rescue of mRNA expression levels of the Cdk13-dependent genes in HCT116 cells treated with control (blue bar) or Cdk13(#2) siRNA (red bar) or with Cdk13(#2) siRNA reconstituted by overexpression of an siRNA-insensitive Cdk13 plasmid (green bar). Overexpression of the Cdk13 siRNA-insensitive plasmid increased the amount of Cdk13 mRNA 8-fold over Cdk13 mRNA levels in cells treated with the control siRNA. Data are normalized to HPRT mRNA levels and results represent data from two biological replicates.

Table S1. Data collection and refinement statistics, related to Figure 1

Data collection (native)
Beam line: SLS X10SA
Wavelength [Å]: 0.9998
Space group: P1
Unit cell a, b, c [Å]: 50.78, 81.82, 93.14
Unit cell α, β, γ [°]: 74.08, 84.26, 76.89
Resolution [Å]*: 2.0
Unique reflections: 90,134
Reflections used: 89,772
Rmeas: 0.048 (0.913)
Mean I/σI: 13.0 (1.70)
Completeness [%]: 96.0 (94.9)

Refinement
Model contents: A: Cdk13 (695-1025, ∆776-784); B: CycK (21-262); C: Cdk13' (697-1031, ∆776-784); D: CycK' (21-259)
Number of atoms in protein: 8955
Number of water molecules: 289
Number of ligand atoms: 66
Solvent content [%]: 47.0
Rwork / Rfree: 0.195 / 0.247
R.m.s. deviations, bonds [Å]: 0.008
R.m.s. deviations, angles [°]: 1.12
Average B-factor, protein [Å²]: 64.6
Average B-factor, water [Å²]: 65.2
Average B-factor, ligands [Å²]: 61.6
Ramachandran plot: most favored 97.21%, allowed 2.79%
PDB accession code: 5EFQ

* Values in parentheses correspond to the highest resolution shell (2.05–2.00 Å).
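As a quick plausibility check on the unit cell parameters in Table S1, the cell volume can be computed with the standard formula for a triclinic cell. This is a minimal sketch that is not part of the original supplement; the function name is hypothetical and only the six cell parameters are taken from the table.

```python
import math


def triclinic_cell_volume(a, b, c, alpha_deg, beta_deg, gamma_deg):
    """Volume of a general (triclinic) unit cell from edge lengths and angles.

    V = a*b*c*sqrt(1 - cos^2(alpha) - cos^2(beta) - cos^2(gamma)
                   + 2*cos(alpha)*cos(beta)*cos(gamma))
    """
    ca, cb, cg = (math.cos(math.radians(x)) for x in (alpha_deg, beta_deg, gamma_deg))
    return a * b * c * math.sqrt(1 - ca**2 - cb**2 - cg**2 + 2 * ca * cb * cg)


# Cell parameters from Table S1 (space group P1), lengths in angstroms.
volume = triclinic_cell_volume(50.78, 81.82, 93.14, 74.08, 84.26, 76.89)
print(f"{volume:.0f} cubic angstroms")
```

For a cell with all angles at 90° the expression reduces to a·b·c, which is a convenient sanity check for the function.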

Table S2. List of all down- and up-regulated genes upon Cdk13 and Cdk12 knockdown, related to Figure 7 The table contains the genes differentially expressed in HCT116 cells with siRNA-mediated Cdk13 and Cdk12 knockdown. Genes were considered to be differentially expressed when p value was less than 0.05 and the average fold-change in expression was at least 1.4.

SUPPLEMENTAL EXPERIMENTAL METHODS

Plasmids and proteins
Expression plasmids of human Cdk13 (UniProt accession number Q14004) were cloned from a synthetic gene encompassing residues 673-1057 of Cdk13 that was codon optimized for expression in Trichoplusia ni cells (GeneArt, Regensburg). Human CycK (UniProt accession number O75909), residues 1-267, was cloned from a codon-optimized gene as described (Bösken et al., 2014). Expression of the Cdk13/CycK protein complex was carried out in insect cells using the MultiBacTurbo system (Bieniossek et al., 2012). Both subunits were fused with an N-terminal GST affinity tag followed by a tobacco etch virus (TEV) protease cleavage site and cloned into the pACEBac1 and pIDK expression vectors, respectively (ATG:biosynthetics). Cdk13 was expressed with domain boundaries 673-1039 or 694-1039. Full-length CAK1 (P43568) from S. cerevisiae was cloned into the pIDC donor vector without an affinity tag. All plasmids were confirmed by DNA sequencing prior to expression. Vectors were fused by in vitro Cre recombination and applied to Tn7-dependent integration into the baculoviral genome of DH10 MultiBacTurbo cells (ATG:biosynthetics). Recombinant bacmid DNA was isolated and then used to transfect Sf9 insect cells. Liquid culture of Sf9 cells was maintained at 27°C in Sf-900 III SFM medium (Invitrogen) shaking at 100 r.p.m. Initial recombinant baculoviruses were amplified in Sf9 cells and used for expression by infecting cells at a density of 1.5 × 10⁶ cells/ml by addition of 2% (v/v) of virus stock V2 (multiplicity of infection (MOI) > 1). After 72 to 96 h, cells were harvested by centrifugation, washed in PBS, and pellets were stored at -80°C.

For large-scale purification of Cdk13/CycK constructs, cells were re-suspended in lysis buffer (50 mM Hepes pH 7.6, 500 mM NaCl, 10% glycerol and 1 mM DTT) and disrupted by sonication. The lysate was cleared by centrifugation in a Beckman Optima L-80 XP Ultracentrifuge with a Ti45 rotor (45,000 r.p.m. for 45 min at 4°C) and applied to GST Trap FF columns (GE Healthcare) equilibrated with lysis buffer using an Äkta Prime chromatography system (GE Healthcare). Following extensive washes with 10 column volumes (CV) of lysis buffer and 5 CV of wash buffer (50 mM Hepes pH 7.6, 1000 mM NaCl, 10% glycerol and 1 mM DTT), the protein was eluted in elution buffer (50 mM Hepes pH 7.6, 500 mM NaCl, 10% glycerol, 1 mM DTT and 10 mM glutathione). The GST tag was cleaved by adding TEV protease in a 1/20 ratio and incubating for 20 hours at 4°C. The protein solution was concentrated and loaded on a preparative HiLoad 16/60 Superdex 200 prep grade gel filtration column (GE Healthcare) equilibrated in gel filtration buffer (20 mM Hepes pH 8.2, 400 mM NaCl, 5% glycerol and 2 mM TCEP). Fractions of the main peak containing pure Cdk13/CycK complex as determined by SDS-PAGE were pooled and concentrated. The protein was aliquoted, snap frozen in liquid nitrogen and stored at -80°C. Cdk9/CycT1 as well as GST-CTD proteins containing the full human CTD hepta-repeat region (aa 1587-1970) of the Rpb1 subunit were prepared essentially as described (Czudnochowski et al., 2012).

An expression plasmid of human Pin1 containing an N-terminal hexahistidine tag was purchased from Addgene (plasmid #40773). A Pin1 variant lacking the N-terminal WW domain, termed Pin1∆WW (50-163), was cloned by PCR with restriction sites NcoI/EcoRI and ligated into a pGEX-4T1 vector modified with a TEV protease cleavage site. Plasmids were transformed into BL21(DE3) cells and expressed and purified as described (Ranganathan et al., 1997).

Crystallization and structure determination
For crystallization, the purified Cdk13/CycK complex was mixed at 85 µM concentration with ADP, AlF3, MgCl2 and substrate peptide P-pS-YSPTSP-pS-YSPT in molar ratios of 1:8:32:64:8 and incubated on ice for 30 min. Initial crystals were obtained using the hanging drop vapor diffusion technique at 293 K by mixing 1 µl protein solution with 1 µl of the reservoir solution containing 0.1 M Bis-Tris, pH 6.5, 25.5% PEG 3350 and 0.35 M MgCl2. Crystals grew as clusters that showed high mosaicity when tested on the diffractometer. Micro-seeding ('Beads for Seeds' from Jena Bioscience) was used to obtain large single crystals. The seed stock was prepared by transferring an entire drop with crystals to a micro-centrifuge tube containing glass beads and 100 µl of stabilization solution (0.1 M Bis-Tris, pH 6.5, 25.5% PEG 3350 and 0.35 M MgCl2). The seed stock was vortexed for 2 min and serial dilutions were prepared. Crystallization drops were set up by mixing protein sample and seed stock dilutions in a 1:1 ratio and crystals were grown using the hanging drop vapor diffusion technique at 293 K. The best crystals grew within two weeks to a size of approximately 200 × 30 × 30 µm³ using a 10⁻³ dilution of the seed stock. For cryo-protection, crystals were transferred to a solution that contained the stabilizing agents with additional 0.4 mM substrate peptide and 15% ethylene glycol. After 5-10 s soaking, crystals were flash-frozen in liquid nitrogen.

Diffraction data were collected at the Swiss Light Source Villigen at 0.9998 Å wavelength and 100 K temperature using the PILATUS 6M detector (oscillation width per frame: 0.25°; 1440 and 4x 240 frames collected). The XDS package (Kabsch, 2010) was used to process, integrate, and scale the data. The phase problem was solved by molecular replacement using the program PHASER (McCoy et al., 2007) and the coordinates of Cdk12/CycK (4NST; Bösken et al., 2014) as search model. The model was refined by alternate cycles of refinement using REFMAC5 (Murshudov et al., 2011) and PHENIX (Adams et al., 2010). Manual rebuilding was done using the graphical program COOT (Emsley and Cowtan, 2004). Protein interfaces and accessible surface areas were calculated with the program ePISA (http://www.ebi.ac.uk/pdbe/). Molecular diagrams were drawn using PyMOL (http://www.pymol.org/).
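The molar ratios given for the crystallization mixture translate into component concentrations as sketched below. This is a worked illustration, not part of the original protocol; it assumes that the ratio 1:8:32:64:8 is listed in the same order as the components (complex, ADP, AlF3, MgCl2, substrate peptide) and refers to final concentrations relative to the 85 µM complex.

```python
# Worked example under the assumption stated above: the ratio order follows
# the order in which the components are named in the text.
COMPLEX_CONCENTRATION_UM = 85.0

MOLAR_RATIOS = {
    "Cdk13/CycK": 1,
    "ADP": 8,
    "AlF3": 32,
    "MgCl2": 64,
    "substrate peptide": 8,
}

# Scale every ratio by the concentration of the protein complex.
concentrations_um = {
    name: COMPLEX_CONCENTRATION_UM * ratio for name, ratio in MOLAR_RATIOS.items()
}

for name, conc in concentrations_um.items():
    print(f"{name}: {conc:.0f} uM")
```

Under this reading the nucleotide and the peptide are each present at 680 µM and MgCl2 at 5.44 mM.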

Mass spectrometry analyses
Peptide and protein masses were determined by liquid chromatography-electrospray ionization-mass spectrometry using an Agilent 1100 chromatography system and an LCQ Advantage MAX (Finnigan) mass spectrometer operating in positive ion mode. Proteins were applied onto a Vydac RP-C4 column (Grace) at 20% buffer B (CH3CN with 0.08% trifluoroacetic acid) in buffer A (H2O plus 0.1% TFA) and eluted with a gradient from 20-80% buffer B at a flow rate of 1 ml/min. Peptide samples were loaded onto the column at 5% buffer B and eluted with a gradient from 5-80% buffer B. Data evaluation was performed with the Xcalibur, MagTran and Bioworks software packages. For ESI-MS detection of quantitative phosphorylation numbers, kinase reactions were performed with only 50 mM Hepes buffer, as reduction of buffer reagent led to improved signal quality. The kinase reaction was stopped by adding excess of EDTA before the solution was subjected to ESI-MS analysis.

Kinase assays with Flag-tagged proteins
Plasmids pcDNA3.1 3xFlag Cdk12, pcDNA3.1 3xFlag Cdk13 and pcDNA3.1 3xFlag Cdk9 used for expression of full-length flag-tagged proteins as well as the pcDNA3.1 3xFlag empty vector were described previously (Blazek et al., 2011). One 15 cm plate of HCT116 cells was transfected with 15 µg of plasmid using PEI reagent. After 48 hours, cells were harvested and lysed in lysis buffer (20 mM Hepes/KOH pH 7.9, 15% glycerol, 0.2% NP-40, 300 mM KCl, 1 mM DTT, 0.2 mM EDTA, and protease inhibitor (Sigma)). Flag-tagged proteins were immuno-precipitated from the lysate using 20 µl of flag-agarose (Sigma). The immuno-precipitates were washed three times with 1 ml of the lysis buffer containing 500 mM KCl followed by washing with 1 ml of a detergent-free buffer (20 mM Hepes/KOH pH 7.9, 150 mM KCl, 1 mM DTT, 15% glycerol). The flag-tagged proteins were eluted from the flag-agarose with 40 µl of flag peptide dissolved in 20 mM Hepes pH 7.9, 150 mM KCl, 1 mM DTT. In vitro kinase assays were performed in 20 mM Hepes/KOH pH 7.9, 5 mM MgCl2, 2 mM DTT and 1 mM ATP with either 12 µl of flag eluate or 50 ng of recombinant Cdk12/CycK, Cdk13/CycK, Pin1, Pin1∆WW and with 300 ng of GST-tagged human full-length CTD as a substrate. A total of 60 µl of kinase reaction was incubated at 30°C for one hour and the reaction was terminated by adding 60 µl of 2xSDS sample buffer. In Figure 6D, the GST-CTD was first pre-phosphorylated with 5 µl of eluted flag-Cdk9 or flag-Cdk9KD for 15 min at 30°C and then 50 ng of recombinant CycK/Cdk12, CycK/Cdk13 or Pin1 was added for another 45 min. The 60 µl reaction was terminated by adding 60 µl of 2xSDS sample buffer.

Detailed expression microarray information and array analysis procedure
Purified RNA was analyzed for quality using chip-based capillary electrophoresis (Bioanalyzer, Agilent Inc.) and quantity and purity were determined with a NanoDrop spectrometer. RNA (~5-25 ng) was amplified into cDNA using the NuGEN Pico V2, based on Ribo-SPIA technology, followed by fragmentation and labeling using the NuGEN Encore Biotin Module. The labeled cDNA was hybridized to Human GeneChip Gene 1.0 ST microarrays (Affymetrix, Santa Clara, CA). Microarrays were washed, stained, and scanned according to the protocol described in GeneChip Expression Analysis Technical Manual 702232 Rev. 3. Affymetrix AGCC 3.2.2 was used for data processing.

Microarrays were normalized for array-specific effects using Affymetrix's "Robust Multi-Array" (RMA) normalization. Normalized array values were reported on a log2 scale. For statistical analyses, we removed all array probe sets where no experimental groups had average log2 intensities greater than 3.0. Linear models were fitted for each gene using the Bioconductor "limma" package (Gentleman et al., 2004; Smyth 2004). Moderated t-statistics, fold-change and the associated P-values were calculated for each gene. To account for the fact that thousands of genes were tested, we reported false discovery rate (FDR)-adjusted values, calculated using the Benjamini-Hochberg method (Benjamini and Hochberg, 1995). The genes were considered to be differentially expressed when the p value was less than 0.05 and the average fold-change in expression was at least 1.4. GO enrichment analyses with differentially expressed genes were performed using DAVID software and an EASE threshold of 0.05.
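The selection criteria above can be sketched in a few lines. This is a simplified illustration and not the authors' pipeline, which used the Bioconductor limma package in R: the function names are hypothetical, the Benjamini-Hochberg adjustment is re-implemented in plain Python, and it is an assumption that the fold-change cutoff of 1.4 applies in both directions on the log2 scale. Whether the 0.05 cutoff was applied to raw or adjusted p values is not stated in the text, so the caller decides which value to pass.

```python
import math


def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p value down so adjusted values stay monotonic.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def is_differentially_expressed(p_value, log2_fold_change,
                                p_cutoff=0.05, min_fold_change=1.4):
    """Apply the p < 0.05 and fold-change >= 1.4 criteria (either direction)."""
    return (p_value < p_cutoff
            and abs(log2_fold_change) >= math.log2(min_fold_change))


# Hypothetical p values for four genes, for illustration only.
adjusted = benjamini_hochberg([0.01, 0.04, 0.03, 0.5])
print(adjusted)
```

Working on the log2 scale matches the way the normalized array values are reported, and a fold-change of 1.4 corresponds to roughly 0.49 log2 units.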

Primers used for RT-qPCR
CDK12-F: CCTGGAGATGATGACATGGATAG
CDK12-R: GAGGAAGGTCTGTGAGTAAGTG
FLOT1-F: CCATGGTGGTCTCCGGTA
FLOT1-R: GCAATGTGGGCAATCTCAG
SLC38A9-F: CCCAAACATAGGAGGGATCA
SLC38A9-R: GAATACAAAGGCCAGTCCACA
BRCA1-F: TGAGGCATCAGTCTGAAAGCC
BRCA1-R: AAAATGTCACTCTGAGAGGAT
HPRT-F: CCAGACAAGTTTGTTGTAGGATATGCCCTTGAC
HPRT-R: ACTCCAGATGTTTCCAAACTCAACTTGAACTCTC
CT45A3-F: CGTATCAGAAAAGGCAGAGG
CT45A3-R: CTGGGTGGAATAGCATGTC
SCG2-F: TGACTTAGACCATCCAGACC
SCG2-R: GCACTCTCCATCCCTAAAAG
PLD1-F: ACCGGGTATATGTCGTGATAC
PLD1-R: TCTGCTTTTAACTGTCCAAGG
SEMA3C-F: TCAGACTTTCAATCGCACAC
SEMA3C-R: CGTCCTTTTCCAGATTCACAC
CGRRF1-F: CGCTTTATGAATACTCGCCG
CGRRF1-R: GAAAACTCTTGTGCTGAACTG
VWA5A-F: GCTGGACACAAGTTTGATCG
VWA5A-R: GGATAGAAACTCACCATTGCAG
MOB3A-F: GGCAGGATGAGCATAAGTTC
MOB3A-R: GCAGGAAGTTCTTGGGAAAC
SREBF1-F: ACAGCAACCAGAAACTCAAG
SREBF1-R: CCTCCACCTCAGTCTTCAC
DBF4B-F: CCTTTTACTTGGATCTGCCTG
DBF4B-R: CTCTGCCTTTACTTCTCTGC
CDK13-F: ACTTGCAGACTTTGGACTTG
CDK13-R: CCACAGCTCCATACATCAATG

siRNAs
The following siRNAs from Santa Cruz Biotechnology (each containing a mix of three different siRNAs) were used throughout this study: CTRL siRNA (sc37007), Cdk12 siRNA (sc44343), and Cdk13 #1 siRNA (sc72835). Cdk13 #2 siRNA (SIHKO317) from Sigma-Aldrich (containing a single siRNA) was used for the Cdk13 overexpression rescue experiments only.

Plasmids for the overexpression rescue of the Cdk13 RNAi
Cdk13 siRNA-insensitive plasmid: PCR-amplified Cdk13 cDNA was cloned into HindIII and XhoI restriction sites in the pcDNA5 FRT/TO/3xFLAG plasmid (Invitrogen), rendering a 5'-3xFlag-Cdk13 fusion gene. The cloned Cdk13 cDNA was made insensitive to Cdk13 #2 siRNA (SIHKO317) from Sigma-Aldrich by site-directed mutagenesis of the following nucleotides in the Cdk13 sequence: T2283C, G2286A, T2289C.
CycK plasmid: pCMV6 vector carrying human CycK cDNA with Myc-Flag tags was originally obtained from OriGene. The Flag tag was mutated by site-directed mutagenesis to an Xpress tag.

Confirmation of expression microarray results: RNAi, overexpression rescue experiment, RT-qPCR and Western blotting
HCT116 cells at 30-50% confluency were transfected with 10 pmol of control, Cdk12 or Cdk13 #1 siRNA using Lipofectamine RNAiMax reagent (Life Technologies) as described in materials and methods. At 72 h post-transfection, cells were harvested and used for the Western blot analyses of protein knockdown efficiency and for the isolation of total RNA with the miRNeasy Kit (Qiagen) following the manufacturer's instructions. For the Cdk13 overexpression rescue experiments, the cells were transfected either with control or Cdk13 #2 siRNA using Lipofectamine RNAiMax reagent. After 30 h, cells were split and seeded into 6-well plates, and the Cdk13 #2 siRNA-treated cells were transfected with either mock or 1.5 μg (0.75 μg + 0.75 μg) of Cdk13 siRNA-insensitive and CycK plasmids using PEI reagent. After another 48 h, cells were harvested and used for the Western blot analyses of Cdk13 expression levels and for the isolation of total RNA with the miRNeasy Kit (Qiagen). Reverse transcription was performed with SuperScript II reverse transcriptase (Invitrogen) according to the manufacturer's instructions. The reaction was run in a final volume of 20 μl using 200 ng of random hexamers (Invitrogen) and 1 μg of total RNA. For qPCR, 0.125 μl of cDNA was mixed with 12.5 μl of 2X SYBR Green JumpStart Mix (Sigma-Aldrich S4438) and 0.25 μM forward and reverse primer (for sequences of the primers, see Supplementary Methods) in a final volume of 25 μl. Amplifications were run on the Aria (Stratagene) using the following cycle conditions: 94°C for 2 min (1 cycle); 95°C for 30 s, 55°C for 30 s, 72°C for 30 s (45 cycles). HPRT expression levels were used for normalization and values were plotted as mean ± standard deviation. Levels of the analyzed proteins in control, Cdk12 and Cdk13 siRNA-treated HCT116 cells were determined by Western blotting using the following antibodies: anti-Fus (sc-47712) from Santa Cruz Biotechnology, rabbit anti-Cdk12 serum (produced in house) and rabbit anti-Cdk13 serum (produced in house).
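The HPRT normalization of the RT-qPCR data described above could, for example, follow the common 2^(-ΔΔCt) scheme. The text does not state which calculation was used, so the sketch below is an assumption for illustration: the function name is hypothetical, all Ct values are invented, and a primer efficiency of 100% is assumed.

```python
from statistics import mean, stdev


def relative_expression(ct_target, ct_hprt, ct_target_ctrl, ct_hprt_ctrl):
    """Relative mRNA level versus control siRNA by the 2^(-ddCt) method.

    Assumes 100% amplification efficiency for target and reference primers.
    """
    delta_sample = ct_target - ct_hprt              # normalize knockdown sample to HPRT
    delta_control = ct_target_ctrl - ct_hprt_ctrl   # normalize control sample to HPRT
    return 2 ** -(delta_sample - delta_control)


# Invented Ct values for two biological replicates (illustration only):
# (target in knockdown, HPRT in knockdown, target in control, HPRT in control)
replicates = [
    (26.0, 20.0, 24.0, 20.0),
    (25.0, 20.0, 24.0, 20.0),
]
levels = [relative_expression(*cts) for cts in replicates]

# Summarize as mean and standard deviation, as in the plotted data.
summary = (mean(levels), stdev(levels))
print(levels, summary)
```

In this invented example the target gene appears at 25% and 50% of the control level in the two replicates.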

SUPPLEMENTAL REFERENCES

Adams, P.D., Afonine, P.V., Bunkóczi, G., Chen, V.B., Davis, I.W., Echols, N., Headd, J.J., Hung, L.W., Kapral, G.J., Grosse-Kunstleve, R.W., McCoy, A.J., Moriarty, N.W., Oeffner, R., Read, R.J., Richardson, D.C., Richardson, J.S., Terwilliger, T.C., and Zwart, P.H. (2010). PHENIX: a comprehensive Python-based system for macromolecular structure solution. Acta Crystallogr. D Biol. Crystallogr. 66, 213–221.
Benjamini, Y., and Hochberg, Y. (1995). Controlling the false discovery rate: A practical and powerful approach to multiple testing. J. Royal Statistical Soc., Series B 57, 289–300.
Bieniossek, C., Imasaki, T., Takagi, Y., and Berger, I. (2012). MultiBac: expanding the research toolbox for multiprotein complexes. Trends Biochem. Sci. 37, 49–57.
Emsley, P., and Cowtan, K. (2004). Coot: model-building tools for molecular graphics. Acta Crystallogr. D Biol. Crystallogr. 60, 2126–2132.
Gentleman, R.C., Carey, V.J., Bates, D.M., Bolstad, B., Dettling, M., Dudoit, S., Ellis, B., Gautier, L., Ge, Y., Gentry, J., Hornik, K., Hothorn, T., Huber, W., Iacus, S., Irizarry, R., Leisch, F., Li, C., Maechler, M., Rossini, A.J., Sawitzki, G., Smith, C., Smyth, G., Tierney, L., Yang, J.Y., and Zhang, J. (2004). Bioconductor: Open software development for computational biology and bioinformatics. Genome Biol. 5, R80.
Kabsch, W. (2010). XDS. Acta Crystallogr. D Biol. Crystallogr. 66, 125–132.
McCoy, A.J., Grosse-Kunstleve, R.W., Adams, P.D., Winn, M.D., Storoni, L.C., and Read, R.J. (2007). Phaser crystallographic software. J. Appl. Crystallogr. 40, 658–674.
Murshudov, G.N., Skubák, P., Lebedev, A.A., Pannu, N.S., Steiner, R.A., Nicholls, R.A., Winn, M.D., Long, F., and Vagin, A.A. (2011). REFMAC5 for the refinement of macromolecular crystal structures. Acta Crystallogr. D Biol. Crystallogr. 67, 355–367.
Ranganathan, R., Lu, K.P., Hunter, T., and Noel, J.P. (1997). Structural and functional analysis of the mitotic rotamase Pin1 suggests substrate recognition is phosphorylation dependent. Cell 89, 875–886.
Smyth, G.K. (2004). Linear models and empirical Bayes methods for assessing differential expression in microarray experiments (Limma). Statistical Applications in Genetics and Molecular Biology 3, No. 1, Article 3.

Article

CDK12 controls G1/S progression by regulating RNAPII processivity at core DNA replication genes

Anil Paul Chirackal Manavalan1, Kveta Pilarova1, Michael Kluge2, Koen Bartholomeeusen1,†, Michal Rajecky1, Jan Oppelt1, Prashant Khirsariya3,4, Kamil Paruch3,4, Lumir Krejci4,5,6, Caroline C Friedel2 & Dalibor Blazek1,*

Abstract

CDK12 is a kinase associated with elongating RNA polymerase II (RNAPII) and is frequently mutated in cancer. CDK12 depletion reduces the expression of homologous recombination (HR) DNA repair genes, but comprehensive insight into its target genes and cellular processes is lacking. We use a chemical genetic approach to inhibit analog-sensitive CDK12, and find that CDK12 kinase activity is required for transcription of core DNA replication genes and thus for G1/S progression. RNA-seq and ChIP-seq reveal that CDK12 inhibition triggers an RNAPII processivity defect characterized by a loss of mapped reads from 3′ ends of predominantly long, poly(A)-signal-rich genes. CDK12 inhibition does not globally reduce levels of RNAPII-Ser2 phosphorylation. However, individual CDK12-dependent genes show a shift of P-Ser2 peaks into the gene body approximately to the positions where RNAPII occupancy and transcription were lost. Thus, CDK12 catalytic activity represents a novel link between regulation of transcription and cell cycle progression. We propose that DNA replication and HR DNA repair defects as a consequence of CDK12 inactivation underlie the genome instability phenotype observed in many cancers.

Keywords CDK12; G1/S; CTD Ser2 phosphorylation; premature termination and polyadenylation; tandem duplications
Subject Categories Cell Cycle; Chromatin, Transcription, & Genomics
DOI 10.15252/embr.201847592 | Received 15 December 2018 | Revised 9 June 2019 | Accepted 24 June 2019 | Published online 25 July 2019
EMBO Reports (2019) 20:e47592

Introduction

Transcription of protein-coding genes is mediated by RNA polymerase II (RNAPII) and represents an important regulatory step of many cellular processes. RNAPII directs gene transcription in several phases, including initiation, elongation, and termination [1–3]. The C-terminal domain (CTD) of RNAPII contains repeats of the heptapeptide YSPTSPS, and phosphorylation of the individual serines within these repeats is necessary for individual steps of the transcription cycle [4,5]. Phosphorylation of RNAPII Ser2 is a hallmark of transcription elongation, whereas phosphorylation of Ser5 correlates with initiating RNAPII [1,6]. Various kinases have been implicated in CTD phosphorylation [7–10], and the kinase CDK12 is thought to phosphorylate predominantly Ser2 [11–18]. These findings were based on the use of phospho-CTD specific antibodies combined with various experimental approaches including in vitro kinase assays, long-term siRNA-mediated depletion of CDK12 from cells or application of the CDK12 inhibitor THZ531. However, each of these experiments has caveats with respect to the physiological relevance. The specific impact of a short-term CDK12-selective inhibition on CTD phosphorylation and genome-wide transcription in cells remains an important question to be addressed.

CDK12 and cyclin K (CCNK) are RNAPII- and transcription elongation-associated proteins [11,12,19]. CDK12 and its homolog CDK13 (containing a virtually identical kinase domain) associate with CCNK to form two functionally distinct complexes CCNK/CDK12 and CCNK/CDK13 [11,12,16,20]. Transcription of several core homologous recombination (HR) DNA repair genes, including BRCA1, FANCD2, FANCI, and ATR, is CDK12-dependent [11,16,21–23]. In agreement, treatment with low concentrations of THZ531 resulted in down-regulation of a subset of DNA repair pathway genes. Higher concentrations led to a much wider transcriptional defect [17]. Mechanistically, it has been suggested that CCNK is recruited to the promoters of DNA damage response genes such as FANCD2 [24]. Other studies using siRNA-mediated CDK12 depletion showed diminished 3′ end processing of C-MYC and C-FOS genes [18,25]. Roles for CDK12 in other co-transcriptionally regulated processes such as alternative or last exon splicing have also been

1 Central European Institute of Technology (CEITEC), Masaryk University, Brno, Czech Republic
2 Institut für Informatik, Ludwig-Maximilians-Universität München, München, Germany
3 Department of Chemistry, CZ Openscreen, Faculty of Science, Masaryk University, Brno, Czech Republic
4 Center of Biomolecular and Cellular Engineering, International Clinical Research Center, St. Anne’s University Hospital, Brno, Czech Republic
5 Department of Biology, Masaryk University, Brno, Czech Republic
6 National Centre for Biomolecular Research, Masaryk University, Brno, Czech Republic
*Corresponding author. Tel: +420 730 588 450; E-mail: [email protected]
†Present address: Department of Biomedical Sciences, Institute of Tropical Medicine, Antwerp, Belgium

© 2019 The Authors. Published under the terms of the CC BY 4.0 license. EMBO reports 20:e47592 | 2019

reported [26–28]. Nevertheless, comprehensive insights into CDK12 target genes and how CDK12 kinase activity regulates their transcription are lacking.

CDK12 is frequently mutated in cancer. Inactivation of CDK12 kinase activity was recently associated with unique genome instability phenotypes in ovarian, breast, and prostate cancers [29–31]. They consist of large (up to 2–10 Mb in size) tandem duplications, which are completely different from other genome alteration patterns, including those observed in BRCA1- and other HR-inactivated tumors. Furthermore, they are characterized by an increased sensitivity to cisplatin and thus represent a potential biomarker for treatment response [29–33]. Although inactivation of CDK12 kinase activity clearly leads to HR defects and sensitivity to PARP inhibitors in cells [21,34–37], the discovery of the CDK12 inactivation-specific tandem duplication phenotype indicated a distinct function of CDK12 in maintenance of genome stability. The size and distribution of the tandem duplications suggested that DNA replication stress-mediated defect(s) are a possible driving force for their formation [30,31].

Proper transcriptional regulation is essential for all metabolic processes including cell cycle progression [38]. Transition between G1 and S phase is essential for orderly DNA replication and cellular division, and its deregulation leads to tumorigenesis [39]. G1/S progression is transcriptionally controlled by the well-characterized E2F/RB pathway. E2F factors activate transcription of several hundred genes involved in regulation of DNA replication, S phase progression, and also DNA repair by binding to their promoters [40]. Expression of many DNA replication genes (including CDC6, CDT1, TOPBP1, MCM10, CDC45, ORC1, CDC7, CCNE1/2), like many other E2F-dependent genes, is highly deregulated in various cancers [41–44]. However, it is not known whether or how their transcription is controlled downstream of the E2F pathway, for instance during elongation.

To answer the above questions, we used a chemical genetic approach to specifically and acutely inhibit endogenous CDK12 kinase activity. CDK12 inhibition led to a G1/S cell cycle progression defect caused by a deficient RNAPII processivity on a subset of core DNA replication genes. Loss of RNAPII occupancy and transcription from gene 3′ ends coincided with a shift of the broad peaks of RNAPII phosphorylated at Ser2 from gene 3′ ends into the gene body. Our results show that CDK12-regulated RNAPII processivity of core DNA replication genes is a key rate-limiting step of DNA replication and cell cycle progression and shed light on the mechanism of genomic instability associated with frequent aberrations of CDK12 kinase activity reported in many cancers.

Results

Preparation and characterization of AS CDK12 HCT116 cell line

The role of the CDK12 catalytic activity in the regulation of transcription and other cellular processes is poorly characterized. Most of the previous studies of CDK12 involved long-term depletion, which is prone to indirect and compensatory effects [11,12,14,23]. The recent discovery of the covalent CDK12 inhibitor THZ531 made it possible to study CDK12 kinase activity; however, THZ531 also inhibits its functionally specialized homolog CDK13 and transcriptionally related JNK kinases [17].

To overcome these limitations and determine the consequences of specific inhibition of CDK12, we modified both endogenous alleles of CDK12 in the HCT116 cell line to express an analog-sensitive (AS) version that is rapidly and specifically inhibited by the ATP analog 3-MB-PP1 [45] (Fig 1A). This chemical genetic approach has been used to study other kinases [9,46,47] and was also attempted for CDK12 by engineering HeLa cells carrying a single copy of AS CDK12 (with the other CDK12 allele deleted) [48].

We applied CRISPR-Cas technology to mutate the gatekeeper phenylalanine (F) 813 to glycine (G) in both CDK12 alleles in HCT116 cells (Figs 1A and EV1A). The single-strand oligo donor used as a template for CRISPR-Cas editing introduced a silent GTA>GTT mutation to prevent alternative splicing [48], and a TTT>GGG mutation that implemented the desired F813G amino acid change and created a novel BslI restriction site used for screening (Fig EV1A). We validated our intact homozygous AS CDK12 HCT116 cell line by several approaches, including allele-specific PCR (Fig EV1B), BslI screening (Fig 1B; for expected restriction patterns see Fig EV1A), and Sanger sequencing (Fig 1C and Appendix Fig S1A and B). Immunoprecipitation (IP) of CDK12 from the WT and AS CDK12 HCT116 cells followed by Western blotting showed that equal amounts of CCNK associated with CDK12, and that comparable levels of CDK12 were expressed in both cell lines, confirming the functionality of the AS variant (Fig EV1C). To

Figure 1. Preparation and characterization of AS CDK12 HCT116 cell line.
A Scheme depicting preparation of AS CDK12 HCT116 cell line. Gatekeeper phenylalanine (F) and glycine (G) are indicated in red, and adjacent amino acids in CDK12 active site are shown in black letters (left). ATP and ATP analog 3-MB-PP1 are shown as black objects in wild-type (WT) and AS CDK12 (blue ovals), respectively (right).
B Genotyping of AS and WT CDK12 clones. Ethidium bromide-stained agarose gel visualizing PCR products from genomic DNA of AS (AS-PCR) and WT (WT-PCR) CDK12 HCT116 cells and their digest with BslI enzyme (indicated as AS-BslI and WT-BslI). Primer positions and BslI restriction sites are depicted in Fig EV1A. Numbers on the left and right indicate DNA marker and DNA fragment sizes, respectively.
C Detailed insight into sequencing of genomic DNA from WT and AS CDK12 HCT116 cell lines. The genomic region in WT and AS CDK12 subjected to genome editing is shown in red rectangle; gatekeeper amino acids F and G are in red. The full ~ 500 kb sequence surrounding the edited genomic region is in the Appendix Fig S1A and B.
D Effect of CDK12 inhibition on phosphorylation of the CTD of RNAPII. Western blot analyses of protein levels by the indicated antibodies in AS CDK12 HCT116 cells treated with 5 μM 3-MB-PP1 for indicated times. Long and short exp. = long (4–14 min) and short (10–60 s) exposures, respectively. FUS and tubulin are loading controls. A representative image from three replicates is shown.
E, F Inhibition of CDK12 in AS CDK12 HCT116 cells results in down-regulation of CDK12-dependent HR genes. Graph shows RT–qPCR analysis of relative levels of mRNAs of described genes in AS CDK12 HCT116 (E) and WT CDK12 HCT116 (F) cells treated for indicated times with 3-MB-PP1. mRNA levels were normalized to HPRT1 mRNA expression and the mRNA levels of untreated control (CTRL) cells were set to 1. n = 3 replicates, error bars indicate standard error of the mean (SEM).
Source data are available online for this figure.
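The two codon edits named for the AS CDK12 allele (a silent GTA>GTT change and the TTT>GGG change encoding F813G) can be checked against the standard genetic code. The sketch below is only an illustration: the lookup table is deliberately restricted to the four codons mentioned in the text, and the function name `classify_change` is a hypothetical helper, not part of the authors' workflow.

```python
# Minimal codon lookup covering only the four codons named in the text
# (standard genetic code): TTT = Phe (F), GGG = Gly (G), GTA/GTT = Val (V).
CODON_TO_AA = {"TTT": "F", "GGG": "G", "GTA": "V", "GTT": "V"}


def classify_change(ref_codon, alt_codon):
    """Return (reference aa, altered aa, 'silent' or 'missense')."""
    ref_aa = CODON_TO_AA[ref_codon]
    alt_aa = CODON_TO_AA[alt_codon]
    kind = "silent" if ref_aa == alt_aa else "missense"
    return ref_aa, alt_aa, kind


if __name__ == "__main__":
    # Hypothetical driver: report both edits introduced by the donor oligo.
    for ref, alt in [("GTA", "GTT"), ("TTT", "GGG")]:
        print(ref, ">", alt, classify_change(ref, alt))
```

Under the standard code, the first edit leaves valine unchanged and the second replaces phenylalanine with glycine, which agrees with the description of a silent mutation plus the F813G gatekeeper substitution.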


investigate the putative role of CDK12 as a RNAPII CTD kinase, we treated AS CDK12 cells with 3-MB-PP1 or control vehicle for 1, 2, 3, and 6 h and monitored changes in CTD phosphorylation by probing Western blots with phospho-specific antibodies (Figs 1D and EV1D). However, we did not observe any substantial changes in the global levels of phosphorylated Ser2 or Ser5 compared to untreated cells. Only short exposures of Western blots revealed a subtle, but noticeable trend toward accumulation of P-Ser2 after 3 h and P-Ser5



at 6 h and a slight decrease of P-Ser5 at 1–3 h, respectively, consistent with previous observations in AS CDK12 HeLa cells [48]. Surprisingly, P-Ser7 levels were noticeably diminished starting with 1-h treatment but started recovering at 6 h. To functionally characterize AS CDK12 HCT116 cells, we treated them with 3-MB-PP1 for 1, 3, 5, and 24 h and monitored the expression of DNA repair genes that were previously shown to be regulated by CDK12 (BRCA1, BRCA2, ATR, and FANCI). We observed rapid down-regulation of all four CDK12-dependent genes (Fig 1E). Importantly, similarly treated WT HCT116 cells showed no down-regulation of these genes (Fig 1F), and RNA-seq of WT HCT116 cells treated with 3-MB-PP1 showed differential expression of only six protein-coding genes compared to the control (data not shown), confirming the absence of off-target effects of the ATP analog on other transcription-related kinases.

In summary, these results demonstrated the generation of a fully functional, homozygous AS CDK12 HCT116 cell line.

CDK12 kinase activity is essential for optimal G1/S progression independently of DNA damage cell cycle checkpoint

In our previous work, we noted that long-term CDK12 depletion leads to an accumulation of cells in G2/M phase, consistent with diminished transcription of CDK12-dependent DNA repair genes and activation of a DNA damage cell cycle checkpoint [11,49]. To determine whether CDK12 kinase activity directly regulates cell cycle progression, we arrested AS CDK12 HCT116 cells at G0/G1 by serum withdrawal for 72 h, released them into serum-containing media in the presence or absence of 3-MB-PP1, and harvested cells for flow cytometry analyses every 6 h after the release (Fig 2A).

In the absence of the inhibitor, the cells entered S phase in ~ 12 h, reached G2/M phase in ~ 18 h, and completed the full cell cycle in ~ 20 h (Fig 2B and C). In contrast, in the presence of 3-MB-PP1, cells started to enter S phase at 18 h, indicating a delay in G1/S progression by 6–9 h (Fig 2B and C). WT HCT116 cells treated with 3-MB-PP1 showed no defect in cell cycle progression excluding unspecific inhibition of other kinases (Fig EV2A). Importantly, serum-synchronized WT HCT116 cells treated with the CDK12 inhibitor THZ531 (Fig EV2B), as well as AS CDK12 HeLa [48] or AS CDK12 HCT116 cells synchronized by thymidine–nocodazole and inhibited by 3-MB-PP1 also demonstrated the G1/S progression delay (Fig EV2C and data not shown). Thus, the function of CDK12 in optimal G1/S progression appears to be general, rather than cell type- or treatment-specific.

The protein levels of numerous cell cycle regulators fluctuate during cell cycle progression according to their function in a specific phase [38]. To examine whether CDK12 levels change during cell cycle progression, we arrested AS CDK12 HCT116 cells by serum starvation, released them, and analyzed CDK12 proteins by Western blotting (Fig 2D). Strikingly, CDK12 levels were highest during early G0/G1 phase, started to diminish in G1/S transition, reached lowest levels in late S phase, and started to slightly recover in G2/M (Fig 2D). Similar trends, however much less distinct, were observed for CDK13 and CCNK. We verified cell cycle synchronization and individual phases of the cell cycle by the expression of CCNE1 in G1/S and accumulation of CCNA2 in G2/M phases (Fig 2D) and by the flow cytometry DNA content profiles (Fig 2B).

To define when CDK12 kinase activity is needed for early cell cycle progression, serum-synchronized AS CDK12 HCT116 cells were released into serum-containing medium and 3-MB-PP1 was added at various times post-release, ranging from 0 to 12 h. Cell cycle progression was measured by flow cytometry at 16 h post-release (Fig 2E). Whereas treatments at 9 and 12 h had a weak or no effect on the G1/S transition, treatments within 6 h post-release delayed the transition, suggesting that CDK12 kinase activity is needed at very early G1 phase (Fig 2F). Similar results were obtained by flow cytometry analyses of BrdU-labeled cells (Fig 2G). As an additional approach, we released cells in the presence and absence of 3-MB-PP1 and washed away 3-MB-PP1 after 2, 3, 4, and 5 h (Fig EV2D). When the inhibitor was washed away between 2 and 5 h, the cells were able to progress to S phase comparably to untreated cells (Fig EV2E), indicating the requirement of CDK12 kinase activity in very early G1 phase for optimal G1/S progression.

As long-term CDK12 depletion causes down-regulation of DNA repair genes resulting in endogenous DNA damage [11,23], we asked whether the observed G1/S delay upon CDK12 inhibition was

Figure 2. CDK12 kinase activity is essential for optimal G1/S progression independently of DNA damage cell cycle checkpoint.
A Experimental outline. AS CDK12 HCT116 cells were arrested by serum starvation for 72 h and released into the serum-containing medium with or without 3-MB-PP1. DNA content was analyzed by flow cytometry at indicated time points after the release.
B CDK12 kinase activity is needed for G1/S progression in cells arrested by serum starvation. Flow cytometry profiles of control (−3-MB-PP1) or inhibitor (+3-MB-PP1) treated cells from the experiment depicted in Fig 2A. The red arrow points to the onset of the G1/S progression defect in 3-MB-PP1-treated cells. To better visualize the G1/S delay in the presence of the inhibitor, the 24-h time point is also shown. n = 3 replicates; representative result is shown.
C Quantification of cells (%) in individual cell cycle phases based on flow cytometry profiles of the representative replicate in Fig 2B.
D CDK12 protein levels peak in the G0/G1 phase of the cell cycle. Western blots show levels of proteins at indicated time points after the release of serum-starved AS CDK12 HCT116 cells. Corresponding cell cycle phases are depicted above time points. A representative Western blot from three replicates is shown.
E Experimental outline. AS CDK12 HCT116 cells were arrested by serum starvation for 72 h and released into the serum-containing medium. 3-MB-PP1 was either added or not at indicated time points after the release. Propidium iodide- or BrdU-stained DNA content was measured by flow cytometry at 16 h after the release. Note that for the BrdU staining the 3-MB-PP1 was added only at the time of the release (0 h) and 3, 4, 5, and 6 h after the release.
F, G Inhibition of CDK12 in early G1 perturbs normal cell cycle progression. Quantification of cells (%) in cell cycle phases from flow cytometry profiles of propidium iodide (F)- and BrdU (G)-labeled cells upon addition of 3-MB-PP1 at indicated time points after serum addition in the experiment depicted in Fig 2E. CTRL in Fig 2G = control sample without 3-MB-PP1. n = 3 replicates, representative result is shown.
H Short-term CDK12 inhibition does not activate DNA damage checkpoints. Western blot analyses of phosphorylation of depicted DNA damage response markers upon inhibition of CDK12 for indicated times. CPT corresponds to 5 μM camptothecin. A representative Western blot from three replicates is shown. FUS is a loading control.
Source data are available online for this figure.


Figure 2. [Image panels A–H not reproduced. Recoverable axis labels: DNA content (PI); BrdU-FITC; hours post serum addition; % of cells in a cell cycle phase.]


due to secondary activation of DNA damage cell cycle checkpoints [50]. However, the levels of phosphorylated P-ATM and P-P53, markers of an activated DNA damage pathway, increased in cells only after 48-h inhibition of CDK12 (Fig 2H), coincident with onset of endogenous DNA damage upon long-term CDK12 depletion [11]. These data suggest that the delay in G1/S progression is independent of secondary activation of DNA damage pathways.

CDK12 catalytic activity controls expression of core DNA replication genes

CDK12 is associated with the transcription of specific genes, particularly DNA repair genes [11,22,23]. We hypothesized that CDK12 catalytic activity is also needed for the expression of genes regulating G1/S progression. To test this hypothesis, we synchronized AS CDK12 HCT116 cells by serum starvation, released them into serum-containing media with or without 3-MB-PP1, and isolated RNA after 5 h (n = 3 independent replicates). We then performed 3′ end RNA-seq with poly(A)-selected RNA. CDK12 inhibition resulted in the significant differential expression of 2,102 genes (log2 fold-change < −1 or > 1, P < 0.01), including 611 up-regulated and 1,491 down-regulated genes (Fig 3A and Dataset EV1).

Gene Ontology (GO) enrichment analysis of the down-regulated genes identified high enrichment not only of DNA repair mechanisms (Fig 3B, FDR q-value ≤ 0.05), but also of DNA replication and cell cycle processes (Fig 3B, in red frame). Comparable processes were found to be associated with down-regulation using gene set enrichment analysis (GSEA) [51] (Fig EV3A, in red frames). Manual inspection of the corresponding processes revealed reduced expression of most genes involved in the activation and formation of replication origin recognition complexes and pre-replication complexes (Figs 3C and EV3B). Assembly of these complexes and their activation in early G1 phase are essential for DNA replication and cell cycle progression [52]. Using RT–qPCR, we confirmed that several of these DNA replication genes were down-regulated upon CDK12 inhibition in early G1 phase (Fig 3D). In contrast, mRNA expression of control non-regulated genes but also genes inducible during G1 phase did not change significantly (Fig EV3C). These data indicate that CDK12 inhibition specifically disrupts the expression of its target genes, rather than general transcription, and suggest that CDK12 regulates DNA replication and cell cycle progression by controlling the expression of a subset of genes.

To determine whether the decrease in the transcript levels upon CDK12 inhibition is a result of decreased mRNA stability, we performed transcription inhibition using actinomycin D (ActD; Fig EV3D). Comparison of the degradation rates after transcription shut-off on select DNA repair and replication transcripts in cells either treated or not with 3-MB-PP1 revealed no difference in the relative mRNA stability (Fig EV3D). We therefore conclude that CDK12 inhibition does not influence mRNA half-lives of its target genes.

To elucidate whether the CDK12-dependent decrease in transcript levels of the DNA replication genes corresponds to lower protein levels during G1/S phase, we serum synchronized cells and released them in the absence or presence of 3-MB-PP1 and evaluated lysates after 3, 6, 9, 12, and 15 h. The tested proteins were selected based on antibody availability and their involvement in the formation and activation of origin recognition and pre-replication complexes [52]. We found that the levels of TOPBP1, CDC6, CDT1, MTBP, and CCNE2 proteins were reduced after 6 h of CDK12 inhibition compared to untreated controls, and CDC7 and ORC2 were reduced after 9 and 12 h inhibition, respectively (Figs 3E and EV3E). In contrast, the levels of ORC3, CCNE1, and GINS4 were not significantly affected (Figs 3E and EV3E). Of note, depletion of CDK12 regulatory subunit CCNK in asynchronous cells also resulted in decrease of mRNA and protein levels of the DNA replication genes (Fig EV3F and G).

Assembly of origin recognition and pre-replication complexes on the chromatin in early G1 phase and pre-replication complex activation in G1/S phase (Fig 3C) are prerequisite for the start of DNA replication [39,52]. To examine whether the reduced expression of DNA replication factors upon inhibition of CDK12 affects their loading to and association with chromatin in early cell cycle phases, we isolated the cellular chromatin fraction [53]. Cells were synchronized by serum starvation, released into media with or without 3-MB-PP1, and harvested every 3 h for 24 h, and chromatin-bound ORC6, CDC6, and CDT1 were followed by Western blotting. Indeed,

Figure 3. CDK12 catalytic activity controls expression of core DNA replication genes.
A CDK12 inhibition results in differential expression of a subset of genes. Comparison of log2 fold-changes versus log2 mean expression in 3′ end RNA-seq data shows differentially regulated genes after inhibition of CDK12. Down- (log2 fold-change < −1) and up-regulated (log2 fold-change > 1) genes are shown in blue and red, respectively.
B CDK12 inhibition down-regulates DNA damage- and cell cycle-related genes. GO analysis using the Gorilla webserver of enriched cellular functions in 1,491 genes down-regulated (log2 fold-change < −1.0; P < 0.01) in 3′ end RNA-seq data upon CDK12 inhibition. Functions related to DNA replication and cell cycle are marked by the red rectangle.
C Outline of formation and activation of DNA replication complexes in G1/S phase. Origin recognition, pre-replication, and pre-initiation complexes are depicted; genes dependent on CDK12 kinase activity (log2 fold-change < −0.85; P < 0.01) are shown in red.
D Validation of RNA-seq for select DNA replication genes by RT–qPCR. Graph shows relative levels of mRNAs of described genes in serum arrested and released (0 h G0/G1) AS CDK12 HCT116 cells either treated (3-MB-PP1) or not (CTRL) with the inhibitor for indicated times after the release. mRNA levels were normalized to B2M mRNA expression, and mRNA levels for each gene at the time of release (0 h) were set as 1. n = 3 replicates, error bars indicate SEM.
E Protein levels of core DNA replication factors are dependent on the CDK12 kinase activity. Western blot analyses of protein expression by the depicted antibodies in serum synchronized and released (0 h) cells either treated or not with 3-MB-PP1 for the indicated times after the release. FUS is a loading control. A representative Western blot of three replicates is shown.
F CDK12 inhibition affects loading of CDC6 and CDT1 DNA replication factors to chromatin. Western blotting analyses of chromatin association of the indicated DNA replication factors in serum synchronized and released AS CDK12 HCT116 cells treated or not with 3-MB-PP1 for the indicated times. Histone H2A serves as a loading control of chromatin fractions. A = asynchronous cells, 0 h = time of release. A representative Western blot of three replicates is shown.
Source data are available online for this figure.
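The fold-change and P-value cutoffs used to call differentially expressed genes can be written as a small classification rule. The sketch below assumes that genes with log2 fold-change below −1 or above 1 and P < 0.01 are called down- or up-regulated, respectively. The gene names and values in `EXAMPLE` are hypothetical placeholders, not data from the study, and `classify` and `summarize` are illustrative helpers rather than the analysis pipeline actually used.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GeneResult:
    gene: str
    log2_fold_change: float
    p_value: float


def classify(result, lfc_cutoff=1.0, p_cutoff=0.01):
    """Return 'up', 'down', or 'unchanged' for one gene."""
    if result.p_value >= p_cutoff:
        return "unchanged"  # not significant, whatever the fold-change
    if result.log2_fold_change > lfc_cutoff:
        return "up"
    if result.log2_fold_change < -lfc_cutoff:
        return "down"
    return "unchanged"  # significant, but the change is too small


def summarize(results):
    """Count genes per class."""
    counts = {"up": 0, "down": 0, "unchanged": 0}
    for r in results:
        counts[classify(r)] += 1
    return counts


# Hypothetical placeholder values, for illustration only.
EXAMPLE = [
    GeneResult("GENE_A", -1.8, 0.0004),  # strong, significant decrease
    GeneResult("GENE_B", 1.3, 0.002),    # significant increase
    GeneResult("GENE_C", -0.4, 0.0001),  # significant but below cutoff
    GeneResult("GENE_D", -2.1, 0.2),     # large change, not significant
]

if __name__ == "__main__":
    print(summarize(EXAMPLE))
```

With this rule, the reported totals are internally consistent: 611 up-regulated plus 1,491 down-regulated genes gives the 2,102 differentially expressed genes stated in the text.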




we found that CDK12 inhibition diminished and delayed the loading of CDC6 and CDT1 proteins onto chromatin relative to control (Fig 3F, compare points 6–15 h post-release in the presence or absence of 3-MB-PP1).

Altogether, our results show that CDK12 catalytic activity is required for the expression of several crucial DNA replication genes including CDC6, CDT1, and TOPBP1. CDK12 inhibition diminishes levels of these proteins, disrupting their loading on chromatin and formation of pre-replication complexes, which delays G1/S progression (Fig 2B).

A tight interplay between CDK12 kinase activity, expression of DNA replication genes, cell cycle progression, and genome stability

To further clarify the interplay between CDK12 kinase activity, DNA replication gene expression, and cell cycle progression, we performed an inhibitor wash off experiment (Fig 4A). We employed RT–qPCR and Western blotting to monitor the expression of DNA replication genes, and flow cytometry to monitor cell cycle progression. Consistent with our observations so far, CDK12 inhibition induced a strong decrease in mRNA (Fig 4B) and protein levels (Fig 4C) of DNA replication genes, and delayed S phase entry (Fig 4D). Notably, washing off the inhibitor at various times between 1 and 5 h after the release led to progressive rescue of mRNA (Fig 4B), protein expression (Fig 4C), and a gradual normalization of cell cycle progression (Fig 4D). In agreement, the inhibitor wash off after 1 h of treatment restored the chromatin association of CDC6 and CDT1 compared to the no-wash controls (Fig 4E). Altogether, these experiments revealed a tight interplay between CDK12 catalytic activity, DNA replication factors expression, and their chromatin loading and G1/S progression.

Considering this critical role for CDK12 kinase activity in G1/S progression, we asked if longer-term CDK12 inhibition affects replication of asynchronous cellular populations. Treatment of AS CDK12 HCT116 cells with 3-MB-PP1 for 24 h followed by flow cytometry analyses of BrdU-labeled cells revealed a 15% decrease of S phase stage replicating cells in comparison to the untreated control (Fig 4F). Cellular replication was affected much more strongly after 48 h of 3-MB-PP1 treatment resulting in a 35% decrease in the number of replicating cells and a 34% accumulation of G1 cells compared to the control (Fig 4F).

As disruption of every CDK12-dependent process described so far (DNA replication, cell cycle progression, DNA damage repair) is predicted to trigger DNA damage and genome instability [54], we asked whether inhibition of CDK12 would lead to increased chromosomal abnormalities. Therefore, we treated AS CDK12 HCT116 cells with 3-MB-PP1 for 24 and 48 h and performed a chromosomal aberration assay (Fig 4G and H). CDK12 inhibition led to a 3- to 4-fold increase in the number of chromosomal aberrations (e.g., gaps, chromosomal exchanges, DNA breaks, and single/bi-chromatid breakage (frag/difrag)) when compared to cells with normal CDK12 kinase activity. The increase was comparable to cells treated with hydroxyurea (Fig 4H). This result is consistent with fundamental roles of CDK12 kinase activity in maintenance of genome stability.

Altogether, these findings support the existence of a tight functional link between CDK12 catalytic activity, the regulation of genes involved in DNA replication and of cell cycle progression, and consequent DNA damage/genome instability in cells.

Inhibition of CDK12 leads to diminished RNAPII processivity on down-regulated genes

Next, we aimed to determine what transcriptional mechanism(s) affects expression of CDK12-dependent genes. It is well established that transcription of many DNA replication, cell cycle, and DNA repair genes is specifically regulated by the E2F/RB pathway. Since many CDK12-dependent DNA replication and DNA repair genes are dependent on E2F transcription factors [11,40], we examined CDK12-dependent recruitment of E2F1 and E2F3 to the promoters of DNA replication genes by ChIP-qPCR. However, we did not observe any significant change between CDK12-inhibited cells and controls (Fig EV4A). E2Fs are needed for recruitment of RNAPII to its target genes and their activation. However, CDK12 inhibition did not affect recruitment of RNAPII to the promoters of E2F-dependent genes (Fig EV4B; see below for RNAPII ChIP-seq and RNA-seq

Figure 4. A tight interplay between CDK12 kinase activity, expression of DNA replication genes, cell cycle progression, and genome stability.
A Experimental outline. AS CDK12 HCT116 cells were arrested by serum starvation for 72 h and released into the serum-containing medium with (+) or without (−) 3-MB-PP1. 3-MB-PP1 was washed away and replaced with fresh medium at indicated times after the release and samples were subjected to RT–qPCR, Western blotting, and flow cytometry analyses at 7, 12, and 15 h after the release, respectively. Note that the shown wash away time points (2, 3, 4, 5 h) are valid for RT–qPCR only; for Western blotting and flow cytometry, 1, 2, 3, 5 h and 1, 3, 5, 7 h wash away time points were applied, respectively. All experiments were performed in at least three replicates.
B–D Removal of CDK12 inhibitor in early G1/S rescues replication gene expression and cell cycle progression. RT–qPCR (B), Western blotting (C), and flow cytometry analyses (D) of replication gene mRNA, protein levels, and cell cycle progression, respectively. RT–qPCR, Western blotting, and flow cytometry analyses were performed 7, 12, and 15 h post-release, respectively. CTRL = control samples without the 3-MB-PP1. In (B), n = 3 and error bars indicate SEM. In (C, D) representative images from three biological replicates are shown.
E Rescued loading of CDC6 and CDT1 on chromatin after removal of CDK12 inhibitor. Western blot analyses of chromatin fractions of serum-starved AS CDK12 HCT116 cells treated with 3-MB-PP1 for 6 or 9 h or with the inhibitor washed off after 1 h of treatment. CTRL corresponds to cells not treated with the inhibitor at the time of the serum addition. All cells were harvested either 6 or 9 h after the serum addition. Histone H2A serves as a loading control of chromatin fractions, and studied DNA replication factors are indicated. A representative image of three replicates is shown.
F Inhibition of CDK12 kinase activity in cycling cells leads to decreased numbers of actively replicating cells. Asynchronous AS CDK12 HCT116 cells were grown for 24 and 48 h in the presence or absence of 3-MB-PP1, and replicating BrdU-stained cells were quantified by FACS analyses. CTRL = control samples without the 3-MB-PP1. A representative image of three replicates is shown.
G, H Prolonged CDK12 inhibition causes chromosomal aberrations in cells. Specific chromosomal aberrations in cells treated with 3-MB-PP1 (24 or 48 h), 4 mM hydroxyurea (5 h), or control solvent (CTRL) were identified by microscopy. A representative image from three biological replicates is shown (G). Total numbers of chromosomal aberrations per hundred cells of the representative replicate in (G) are quantified (H).
Source data are available online for this figure.


[Figure 4, panels A–H. Recoverable labels: (B) relative mRNA levels of CDC6, ORC2, TOPBP1, MCM10, and CCNE2; (D, F) BrdU-FITC versus DNA content (PI) with G1, S, and G2M gates; (E) chromatin fractions harvested 6 h and 9 h after release; (H) number and types of aberrations (gap, breaks, frag/difrag, exchange) per 100 cells counted.]


experiments, respectively). Thus, these data suggest that CDK12 acts downstream of the E2F/RB pathway.

CDK12 has been implicated in the transcription of a subset of genes via phosphorylation of RNAPII, particularly on Ser2 and Ser5 in the CTD [11–13,16,17]. To uncover a role for CDK12 kinase activity in transcription of genes on a genome-wide level during early G1 phase, we performed ChIP-seq using antibodies for RNAPII, P-Ser2, and P-Ser5, coupled with nuclear RNA-seq (n = 3 replicates each). In contrast to 3′ end RNA-seq, nuclear RNA-seq allowed analyzing changes in RNA processing and splicing and also measuring non-polyadenylated RNAs. We synchronized AS CDK12 HCT116 cells by serum starvation for 72 h, released them into serum-containing media with or without 3-MB-PP1, and collected samples at 4.5 h post-release for ChIP-seq and nuclear RNA-seq.

Nuclear RNA-seq revealed significant differential expression of 1,617 genes (log2 fold-change < −1 or > 1, P < 0.01), including 1,277 genes with diminished and 340 genes with increased expression (Fig EV5A and Dataset EV2), consistent with our observation that only a subset of genes are regulated by CDK12 kinase activity. Log2 fold-changes were highly correlated between 3′ end RNA-seq and nuclear RNA-seq (Spearman rank correlation ρ = 0.78, Fig EV5A), and we observed significant overlap between differentially expressed genes in both experiments (Figs 5A and EV5B).

To determine whether this differential expression is due to a transcriptional defect caused by CDK12 inhibition, we analyzed the distribution of RNAPII, P-Ser2, and P-Ser5 ChIP-seq reads from −3 kb of the transcription start site (TSS) to +3 kb of the transcription termination site (TTS). Genes were divided into three groups according to their differential expression after CDK12 inhibition in the nuclear RNA-seq data: up-regulated (log2 fold-change > 1, P < 0.01), down-regulated (log2 fold-change < −1, P < 0.01), and non-regulated (−0.1 < log2 fold-change < 0.1, P > 0.01).

Metagene plots display the expected profile of RNAPII occupancy for all three groups with a peak of paused RNAPII at the promoter (Fig 5B). Strikingly, CDK12 inhibition reduced the relative RNAPII occupancy at the 3′ ends of down-regulated genes (Fig 5B). More strongly down-regulated genes had a tendency toward a higher reduction in 3′ end occupancy (Appendix Fig S2). Little or no occupancy difference was observed for non-regulated and up-regulated genes, respectively (Fig 5B). This phenotype is consistent with an RNAPII elongation/processivity defect at down-regulated genes.

P-Ser5 signal peaked at promoters, consistent with a role in initiating RNAPII [6], and we found that P-Ser5 occupancy was reduced significantly at 3′ ends of down-regulated genes and a little at non-regulated genes when CDK12 was inhibited (Fig EV5C). However, P-Ser5 occupancy normalized to RNAPII showed no or very little change across the three groups of genes after CDK12 inhibition (Appendix Fig S3), providing evidence that observed changes in P-Ser5 signal are only due to changes in RNAPII occupancy.

In control cells, P-Ser2 occupancy was most pronounced on gene bodies with highest enrichment at 3′ ends (Fig 5C), consistent with its role in elongation and 3′ end processing [6,55,56]. Importantly, in response to CDK12 inhibition, down-regulated genes showed a very strong shift of P-Ser2 occupancy into the gene body and toward the TSS (Fig 5C). The shift toward the gene body was most pronounced in strongly down-regulated genes (Appendix Fig S4). To exclude that the shift in P-Ser2 occupancy was only a consequence of the change in overall RNAPII levels, we also normalized P-Ser2 occupancy profiles to RNAPII levels (Appendix Fig S5). This showed a small but highly significant increase of normalized P-Ser2 occupancy in the gene body and a reduction at gene 3′ ends for down-regulated genes and to a lesser degree for non-regulated genes (Appendix Fig S5).

SPT6 binds RNAPII via the CTD linker and stimulates transcription elongation [57–59]. To investigate whether SPT6 and RNAPII association is dependent on CDK12 kinase activity and to correlate the observed changes in RNAPII occupancies with occupancies of this well-characterized elongation factor, we performed SPT6 ChIP-seq (n = 3 replicates, Fig EV5D). Metagene plots show the expected profile of SPT6 binding with a peak at the promoter and an increase at 3′ ends of genes, which resembles RNAPII profiles (Fig EV5D). CDK12 inhibition reduced relative SPT6 occupancy at the 3′ ends of down-regulated genes. Little or no occupancy difference was observed at non-regulated and up-regulated genes, respectively (Fig EV5D). However, SPT6 occupancy normalized to RNAPII showed little change for all three gene groups (Appendix Fig S6), indicating that SPT6 travels together with RNAPII on genes and SPT6-RNAPII association is independent of CDK12 kinase activity. In agreement, immunoprecipitation of SPT6 from cells showed no change in the interaction with RNAPII when CDK12 was inhibited (Fig EV5E).

The genome-wide trends in RNAPII, P-Ser2, P-Ser5, and SPT6 occupancies in down-regulated genes were clearly visible at selected CDK12-dependent genes (Fig 5D and E, and Appendix Fig S7A) including DNA replication genes (Appendix Fig S7B and C). Here,

Figure 5. Inhibition of CDK12 leads to diminished RNAPII processivity on down-regulated genes.
A Inhibition of CDK12 affects the expression of similar subsets of genes in nuclear and 3′ end RNA-seq data. The Venn diagrams represent the overlap between genes significantly (P < 0.01) up- (log2 fold-change > 1) or down-regulated (log2 fold-change < −1) in nuclear and 3′ end RNA-seq data.
B, C Genes down-regulated in nuclear RNA-seq after CDK12 inhibition have diminished relative occupancy of RNAPII at their 3′ ends and higher relative occupancy of P-Ser2 in their gene bodies. Metagene analyses of RNAPII (B) and P-Ser2 (C) ChIP-seq data (see Materials and Methods). Each transcript was divided into two parts with fixed length (transcription start site (TSS) −3 kb to +1.5 kb and transcription termination site (TTS) −1.5 kb to +3 kb) and a central part with variable length corresponding to the rest of gene body (shown in %). Each part was binned into a fixed number of bins (90/180/90), and average coverage for each bin was calculated for each transcript in each sample. The curve for each transcript was normalized to a sum of one and then averaged first across genes and second across samples. Dotted lines indicate TSS, 1,500 nucleotides downstream of TSS, 1,500 nucleotides upstream of TTS, and TTS. The color track at the bottom of each subfigure indicates the significance of paired Wilcoxon tests comparing the normalized transcript coverages for each bin between untreated (CTRL) cells and cells treated with 3-MB-PP1. P-values are adjusted for multiple testing with the Bonferroni method within each subfigure; color code: red = adjusted P-value ≤ 10⁻¹⁵, orange = adjusted P-value ≤ 10⁻¹⁰, yellow = adjusted P-value ≤ 10⁻³.
D, E Examples of genes whose transcription processivity and expression is dependent on the CDK12 kinase activity. Nuclear RNA-seq data on the respective strand and RNAPII, P-Ser2, P-Ser5, and SPT6 ChIP-seq data for MED13 (D) and UBE3C (E) genes from cells either treated (red) or not (blue, CTRL) with 3-MB-PP1 were visualized with Gviz. Read counts were normalized to the total number of mapped reads per sample and averaged between replicates. Blue and red boxes below the RNA-seq data indicate the 90% distance (see Fig 7D and E and corresponding text) in control and CDK12-inhibited samples, respectively.
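The binning and normalization described in the legend for panels B and C can be illustrated with a short Python sketch. This is not the authors' pipeline: the function names (`bin_means`, `metagene_profile`, `average_profiles`) are hypothetical, and the sketch assumes that per-base coverage is already available for each transcript, oriented 5′ to 3′ and spanning 3 kb upstream of the TSS to 3 kb downstream of the TTS.

```python
from typing import List, Sequence


def bin_means(values: Sequence[float], n_bins: int) -> List[float]:
    """Mean of `values` in n_bins contiguous bins of near-equal width."""
    n = len(values)
    if n < n_bins:
        raise ValueError("region is shorter than the number of bins")
    means = []
    for i in range(n_bins):
        start, end = i * n // n_bins, (i + 1) * n // n_bins
        means.append(sum(values[start:end]) / (end - start))
    return means


def metagene_profile(coverage: Sequence[float],
                     upstream: int = 3000,   # bases outside the gene (assumed layout)
                     inside: int = 1500,     # bases inside the gene at each end
                     bins: Sequence[int] = (90, 180, 90)) -> List[float]:
    """Bin one transcript into TSS part, gene body, TTS part; normalize to sum 1."""
    fixed = upstream + inside
    if len(coverage) < 2 * fixed + bins[1]:
        raise ValueError("transcript too short for this binning scheme")
    parts = (coverage[:fixed],
             coverage[fixed:len(coverage) - fixed],
             coverage[len(coverage) - fixed:])
    profile: List[float] = []
    for part, n_bins in zip(parts, bins):
        profile.extend(bin_means(part, n_bins))
    total = sum(profile)
    # A transcript without any coverage is returned as all zeros.
    return [v / total for v in profile] if total > 0 else profile


def average_profiles(profiles: Sequence[Sequence[float]]) -> List[float]:
    """Element-wise mean, usable first across genes and then across samples."""
    n = len(profiles)
    return [sum(column) / n for column in zip(*profiles)]
```

Because every transcript curve is scaled to a sum of one before averaging, highly expressed genes do not dominate the metagene plot; the plotted quantity is relative occupancy along the gene, not absolute signal.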


the RNAPII, P-Ser2, P-Ser5, and SPT6 signals ended within the gene body upon CDK12 inhibition rather than after the gene 3′ end. Strikingly, nuclear RNA-seq showed that CDK12 inhibition also led to an earlier termination of transcription of these genes at roughly the genomic location in the gene body where RNAPII occupancy was lost and the broad 3′ end peak of P-Ser2 signal appeared upon CDK12 inhibition. This suggests that the apparent down-regulation of the corresponding genes in both the 3′ end and nuclear RNA-seq data upon CDK12 inhibition actually represents a shortening of transcripts as a consequence of an RNAPII processivity defect.

Transcript shortening upon inhibition of CDK12

As differential gene expression analysis is based on all reads mapped to exonic regions of a gene, it cannot distinguish shortening of transcripts, which results in fewer reads on only some exons, from overall lower transcription levels, which result in fewer reads on all exons. To address this issue, we analyzed differential exon usage on the nuclear RNA-seq data using DEXSeq, a method to identify relative changes in exon usage [60]. CDK12 inhibition resulted in significant down-regulation of at least one exon for 2,110 genes and significant up-regulation of at least one exon for 1,550 genes (log2 fold-change < 0 or > 0, respectively, P < 0.01). A comparison to differentially expressed genes included in the differential exon usage analysis [2,089 down-regulated, 1,822 up-regulated (log2 fold-change < 0 or > 0, respectively, P < 0.01)] in nuclear RNA-seq showed an overlap of 924 genes (44% of down-regulated genes) that were both significantly down-regulated in expression and had significantly down-regulated exons (Fig 6A). In contrast, only 123 up-regulated genes (7%) had at least one exon significantly up-regulated. Furthermore, 1,156 genes had both up- and down-regulated exons, i.e., 75% of genes with at least one up-regulated exon and 55% of genes with at least one down-regulated exon. This can be explained by a relative decrease in the use of some exons resulting in a relative increase in the use of other exons of the same gene. Notably, the majority of these genes (59%) were also down-regulated, whereas only 7% were up-regulated.
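The percentages quoted in this paragraph follow directly from the gene counts given in the text. A minimal Python check, using only the numbers stated above (the helper name `percent` is purely illustrative):

```python
def percent(part: int, whole: int) -> float:
    """Share of `part` in `whole`, in percent."""
    return 100.0 * part / whole


# Counts taken from the text (nuclear RNA-seq, genes included in the DEXSeq analysis).
down_regulated_genes = 2089
up_regulated_genes = 1822
genes_with_down_exon = 2110
genes_with_up_exon = 1550
genes_with_both = 1156

shares = {
    "down-regulated genes with a down-regulated exon": percent(924, down_regulated_genes),
    "up-regulated genes with an up-regulated exon": percent(123, up_regulated_genes),
    "genes with an up-regulated exon that also have a down-regulated exon":
        percent(genes_with_both, genes_with_up_exon),
    "genes with a down-regulated exon that also have an up-regulated exon":
        percent(genes_with_both, genes_with_down_exon),
}

for label, value in shares.items():
    print(f"{label}: {value:.1f}%")
```

Rounded to whole numbers, these shares give the 44%, 7%, 75%, and 55% reported in the paragraph.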

Figure 6. CDK12 inhibition results in transcript shortening of a subset of genes.
A Overlap between down-regulated genes and genes with differential exon usage upon CDK12 inhibition. Venn diagram shows the overlap between significantly differentially expressed genes (identified by DESeq2) and genes with differential exon usage (identified by DEXSeq) in nuclear RNA-seq data (log2 fold-change < 0 or > 0, P < 0.01, restricted to genes included in the DEXSeq analysis).
B Differentially used exons are enriched at gene 3′ ends. Graph shows the distribution of the relative genomic position of the exon on the gene (relative exon position: 0 = at gene 5′ end, 1 = at gene 3′ end) of differentially used exons (log2 fold-change < 0 or > 0, P < 0.01).
C For down-regulated genes with differentially used exons, exons close to the 5′ end and 3′ end tend to be up- and down-regulated, respectively. Box plots show the log2 fold-change in exon usage after CDK12 inhibition determined by DEXSeq. Exons were grouped into deciles according to their relative exon position. n = 3 replicates. The boxes indicate the range between the 25th and 75th percentile (=interquartile range (IQR)) around the median (thick horizontal line) of the distribution. The whiskers (=short horizontal lines at ends of dashed vertical line) extend to the data points at most 1.5 × IQR from the box. Data points outside this range are shown as circles.
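The relative exon position used in panels B and C, and the grouping of exons into position deciles, can be sketched as below. This is an illustration under stated assumptions rather than the authors' code: the function names are hypothetical, the exon is represented by its midpoint, and coordinates are genomic, so the scale is flipped for genes on the minus strand.

```python
def relative_exon_position(exon_start: int, exon_end: int,
                           gene_start: int, gene_end: int,
                           strand: str) -> float:
    """Position of an exon on its gene: 0 = gene 5' end, 1 = gene 3' end."""
    gene_length = gene_end - gene_start
    if gene_length <= 0:
        raise ValueError("gene_end must be greater than gene_start")
    midpoint = (exon_start + exon_end) / 2   # assumption: exon midpoint
    rel = (midpoint - gene_start) / gene_length
    if strand == "-":
        rel = 1.0 - rel                      # 5' end is at the higher coordinate
    return min(max(rel, 0.0), 1.0)


def position_decile(rel: float) -> int:
    """Decile 1..10 of a relative position in [0, 1]; 1.0 falls into decile 10."""
    return min(int(rel * 10), 9) + 1
```

For example, an exon spanning 100–200 on a plus-strand gene spanning 0–1,000 lies in the second decile, whereas the same coordinates on a minus-strand gene place it in the ninth.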


To investigate whether differential exon usage of genes reflects shortening of transcripts, we determined the relative exon position of differentially used exons within genes. We found that differentially used exons are highly enriched at the 3′ end of genes with a slight accumulation also toward gene 5′ ends (Fig 6B). Moreover, the relative position of either down- or up-regulated exons showed exclusive accumulation at the gene 3′ end and 5′ end, respectively (Appendix Fig S8A–C). Down-regulated genes with at least one significantly differentially used exon (1,151 genes) showed a clear trend, with exons up-regulated at the 5′ end and down-regulated at the 3′ end (Fig 6C). This indicates that these genes are down-regulated because transcripts tend to get shorter in the absence of CDK12 catalytic activity. Notably, down-regulated genes without significantly differentially used exons (45% of down-regulated genes) showed a similar but less pronounced trend (Appendix Fig S8D). In summary, our findings reveal that the observed down-regulation of genes upon CDK12 inhibition generally results from transcript shortening.

When correlating differential exon usage to the ChIP-seq data, we found that genes with down- or up-regulated exons (most of the latter also had down-regulated exons) showed reduced RNAPII occupancy at the 3′ end (Appendix Fig S8E) as well as a relative shift of P-Ser2 normalized to RNAPII from the gene 3′ end into the gene body (Appendix Fig S8F). Altogether, our results suggest that inhibition of CDK12 kinase activity causes a shift of P-Ser2 from gene 3′ ends to gene bodies and diminished RNAPII processivity, consequently leading to shorter transcripts of CDK12-dependent genes.

Since P-Ser2 is important for recruitment of splicing factors to the RNAPII CTD [2,61,62], we investigated whether significantly regulated exons in genes not down-regulated might be reflective of alterations in splicing rather than shortening of transcripts. However, the distribution of exon usage changes relative to the position of the exon again showed a trend similar to down-regulated genes with a tendency for down-regulated exons near the gene 3′ ends (Appendix Fig S8G). In this case, strong down-regulation of exons was only observed very close to gene 3′ ends, suggesting that these genes are only slightly affected by the RNAPII processivity defect (Appendix Fig S8G).

CDK12 kinase activity is required for optimal transcription of long, poly(A)-signal-rich genes

We previously showed that long-term depletion of CDK12 leads to diminished expression of mostly longer genes [11]. To determine whether short-term inhibition of CDK12 kinase predominantly affects RNAPII processivity at longer genes, we sorted genes into deciles based on their length and evaluated the fraction of exons that are differentially used in each gene. We found that longer genes tended to have a larger fraction of differentially used exons (Fig 7A). Similar results were obtained when only the fractions of down-regulated or up-regulated exons were plotted (Appendix Fig S9A and B). This is consistent with the overlap between genes with up- and down-regulated exons, and the scenario that relative down-regulation of some exons leads to relative up-regulation of other exons in the same gene. Accordingly, genes with at least one exon down- or up-regulated tended to be longer than genes with no differentially used exon, but there was no significant difference in gene length between the two groups (Fig 7B). Down-regulated genes also tended to be longer than non-regulated and up-regulated genes (Fig 7C), consistent with the hypothesis that optimal RNAPII processivity and RNA expression in longer genes requires CDK12 catalytic activity. This conclusion is also supported by metagene plots for genes grouped according to gene length, which showed stronger changes for longer genes in RNAPII, P-Ser2, and P-Ser5 ChIP-seq occupancies after CDK12 inhibition (Appendix Figs S10–S12).

To verify that CDK12 catalytic activity controls the processivity of RNAPII predominantly at long genes, we calculated the distance

Figure 7. CDK12 kinase activity is required for optimal transcription of long, poly(A)-signal-rich genes.
A Longer genes tend to have a larger fraction of differentially used exons. Box plot shows the fraction of exons significantly differentially used for 9,026 expressed genes grouped into deciles based on the genomic length (including exons and introns) of their longest transcripts. n = 3 replicates. See legend in Fig 6C for the boxplot description.
B Genes with differentially used exons tend to be longer. Box plots show length of genes with no differentially used exons, or at least one exon differentially up-regulated (DEXSeq log2 fold-change ≥ 0, P < 0.01) or down-regulated (log2 fold-change ≤ 0, P < 0.01). P-value from a two-sided Wilcoxon rank sum test comparing median lengths between genes with either up- or down-regulated exons is indicated on top. n = 3 replicates. See legend in Fig 6C for the boxplot description.
C Down-regulated genes tend to be longer than not-regulated genes, while up-regulated genes show little difference. Box plots show length of genes with no differential expression (−0.1 < log2 fold-change < 0.1, P > 0.01), up-regulated (log2 fold-change ≥ 0, P < 0.01), or down-regulated (log2 fold-change ≤ 0, P < 0.01) as determined by DESeq2. P-values from two-sided Wilcoxon rank sum tests comparing median lengths for up- and down-regulated genes, respectively, to non-regulated genes are indicated on top. n = 3 replicates. See legend in Fig 6C for the boxplot description.
D RNAPII processivity is affected not close to but at some distance from the TSS after CDK12 inhibition. The graphs compare the relative distance from the TSS where 10, 50 and 90% of read coverage is identified (=x% distance) in control (x-axis) against CDK12-inhibited (y-axis) cells.
E Transcripts of longer genes are more often impacted by shortening and lose a larger proportion of their length in comparison with shorter genes. The plot shows on the x-axis the relative change in the 90% distance (relative Δ90% distance = (90% distance in control − 90% distance in CDK12-inhibited cells)/gene length) and on the y-axis the percentage of genes showing a Δ90% distance equal to or greater than the value on the x-axis. Positive and negative relative Δ90% distances on the x-axis indicate a shortening or extension of transcripts, respectively, after CDK12 inhibition. Genes were divided into quintiles according to gene length, and curves for quintiles are shown separately. Dotted and dashed horizontal lines indicate the percentage of genes in each quintile with a transcript shortening of at least 10 and 20%, respectively.
F Shortening of transcripts is evidenced by down-regulated poly(A) sites (PAS) in the 3′ end RNA-seq data and accompanied by up-regulated upstream PAS for the majority of genes. The plot shows the fraction of genes with shortened (relative Δ90% distance ≥ 0.2), extended (absolute Δ90% distance < −50 bp), or unaffected transcripts (|absolute Δ90% distance| ≤ 25 bp) with down-, up-, and non-regulated PAS according to the 3′ end RNA-seq data. For genes with shortened transcripts and down-regulated PAS in a 3′ UTR, the percentage of genes with upstream up-regulated PAS is indicated on the right. In case of multiple identified PAS, the order of preference was as indicated in the legend from top to bottom.
G DNA replication and repair genes are longer than other protein-coding genes. Box plots show the length for the indicated groups of genes (according to GO annotations). Median gene lengths for each GO category were compared against all other protein-coding genes using a one-sided Wilcoxon rank sum test (P-values provided in figure, n.s.: P > 0.001). See legend in Fig 6C for the boxplot description.
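The x% distance and the relative Δ90% distance used in panels D and E can be made concrete with a small Python sketch. It is an illustration under stated assumptions, not the published implementation: the function names are hypothetical, coverage is taken as a per-base list running from the TSS to the annotated gene end, and the x% distance is interpreted as the first position at which the cumulative coverage reaches x% of the total.

```python
from typing import Sequence


def x_percent_distance(coverage: Sequence[float], percent: int) -> int:
    """Distance from the TSS (in bases) at which `percent`% of coverage is reached."""
    if not 0 < percent <= 100:
        raise ValueError("percent must be in (0, 100]")
    total = sum(coverage)
    if total <= 0:
        raise ValueError("gene has no read coverage")
    running = 0.0
    for position, depth in enumerate(coverage, start=1):
        running += depth
        # Cross-multiplied comparison avoids rounding issues from percent / 100.
        if running * 100 >= percent * total:
            return position
    return len(coverage)


def relative_delta90(control: Sequence[float], inhibited: Sequence[float]) -> float:
    """(90% distance in control - 90% distance after inhibition) / gene length.

    Positive values indicate transcript shortening after CDK12 inhibition.
    """
    if len(control) != len(inhibited):
        raise ValueError("both coverage vectors must cover the same gene")
    gene_length = len(control)
    delta = x_percent_distance(control, 90) - x_percent_distance(inhibited, 90)
    return delta / gene_length
```

In a toy gene of 100 bases with uniform coverage in control cells and coverage only over the first 60 bases after inhibition, the 90% distance drops from 90 to 54 bases, so the relative Δ90% distance is 0.36 and the gene would count as shortened under the ≥ 0.2 criterion of panel F.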


to the TSS at which a certain percentage of read coverage (10, 50 or 90%) was observed for each gene in the nuclear RNA-seq data (denoted as the x% distance). When comparing control and inhibited samples, we observed little difference for the 10% distance, indicating that CDK12 inhibition does not substantially affect transcription close to the TSS (Fig 7D). In contrast, we observed a significant reduction for the 50 and 90% distances, consistent with transcripts getting shorter due to the processivity defect caused by CDK12 inhibition (Fig 7D). To find out how many genes are affected by the transcript shortening defect and to which degree transcripts were shortened, we evaluated the percentage of genes with a certain change in their 90% distance after CDK12 inhibition relative to their length (denoted relative Δ90% distance, Fig 7E and Dataset EV3). The 90% distance was used as a proxy for transcript ends, as these were mostly not clearly defined after CDK12 inhibition because the RNA-seq signal tapered off over some range. Division of genes into quintiles based on their length showed that the longest genes (86–2,058 kb) are massively affected by transcript shortening when compared to short ones (1–23 kb; Fig 7E). For instance, almost 50% of the longest genes are shortened by at least 10%, while < 5% of short genes are affected to this extent (Fig 7E). Notably, the longest genes lose a higher proportion of their transcript length: 26% of these genes are shortened in transcription by at least 20%, whereas such shortening occurs rather exceptionally (< 1%) in shorter genes (Fig 7E). Metagene analyses of ChIP-seq data demonstrated that genes with shortened transcripts (relative Δ90% distance ≥ 0.2) have reduced RNAPII occupancies at their 3′ ends and show a strong shift of the P-Ser2 signal to gene bodies (Appendix Figs S13 and S14).

Next, we asked whether shortening of transcripts might also be influenced by sequence-specific properties, in particular the presence of canonical poly(A) signal sequences (AATAAA, ATTAAA). Since gene length and the abundance of poly(A) signal sequences are highly correlated (Spearman rank correlation ρ = 0.94, Appendix Fig S15A), we grouped genes according to the number of canonical poly(A) signals divided by gene length (denoted as poly(A) signal content) and then evaluated changes in the 10, 50, or 90% distance after CDK12 inhibition for each group (Appendix Fig S15B). Interestingly, we observed a correlation to the poly(A) signal content for the changes in the 90% distance, and to a lesser degree for changes in the 50% distance, with genes with a higher poly(A) signal content showing a stronger shortening of transcripts. This suggests that the presence of poly(A) signals may contribute to the shortening of transcripts and possibly explains why longer genes are more affected by the processivity defect, as they contain a larger number of poly(A) signals. Since our 3′ end RNA-seq data provide information on polyadenylated transcript ends, we used these data to identify down-regulated poly(A) sites (PAS) as well as upstream PAS with increased usage after CDK12 inhibition (Fig 7F, see Materials and Methods). For 60% of genes with shortened transcripts, we found at least one down-regulated PAS in an annotated 3′ UTR in the 3′ end RNA-seq data. Furthermore, 55% of these genes exhibited at least one up-regulated upstream PAS and 15% exhibited multiple up-regulated upstream PAS. Notably, in the majority of cases these upstream PAS were not found in annotated 3′ UTRs but in other exons or introns. Recently, it was reported that CDK12 suppresses intronic polyadenylation sites [63]. While our data show up-regulation of intronic PAS, considering the much larger number of potential intronic PAS compared to exonic/UTR PAS, no particular enrichment of intronic PAS was observed among upstream up-regulated PAS.

Considering the enrichment of DNA replication and repair genes as well as cell cycle genes among CDK12-dependent genes, we investigated whether genes in these groups tended to be longer than other protein-coding genes and thus more affected by the processivity defect. We found that these groups of genes tended to be longer than average protein-coding genes (Fig 7G), though the differences in median gene length were small and statistically significant only for DNA replication and DNA repair. Notably, however, down-regulated genes in each group tended to be even longer, whereas the remaining genes in each group tended to be closer to the median gene length of the other protein-coding genes.

In summary, our results show that CDK12 catalytic activity is essential for optimal RNAPII processivity at longer genes, including many DNA replication and DNA repair genes.

CDK12 inhibition decreases transcription elongation rates in bodies of genes with a RNAPII processivity defect

Since CDK12 is a regulator of transcription elongation [11,12,17], we wanted to determine whether genes with a CDK12-dependent processivity defect showed reduced elongation rates. To address this question, we measured elongation rates by RT–qPCR as the onset of a pre-mRNA expression "wave" at two different positions along the gene determined by primers at corresponding intron–exon junctions [64,65]. Initially, cells are treated with the pan-kinase inhibitor 5,6-dichlorobenzimidazole 1-β-D-ribofuranoside (DRB) to switch off the transcription cycle and synchronize RNAPII at gene promoters [64]. The inhibitor wash off releases RNAPII into gene bodies, and pre-mRNA is synthesized at a relatively uniform elongation rate of 3–5 kb per minute along individual genes [64,66]. RNA samples are taken every 3–8 min after the wash off, and the change in elongation rate is determined by monitoring the onset of pre-mRNA synthesis at specific locations in the gene defined by primer positions [64]. To assess the role of CDK12 kinase activity on elongation rates, we selected three CDK12-dependent (TOPBP1, MCM10, UBE3C) and two CDK12-independent (ARID1A, SETD3) genes and compared their pre-mRNA synthesis in AS CDK12 HCT116 cells either treated or not with 3-MB-PP1 after the DRB wash off (see Fig 8A for the experimental setup). For each gene, we designed two primer sets within its gene body, one at its 5′ end and another close to its center. For the CDK12-dependent genes, the second set of primers always preceded the region where the loss/decrease of RNAPII processivity became apparent in the RNAPII ChIP-seq and RNA-seq signals (Fig 5E, and Appendix Fig S7B and C). DRB wash off in control samples resulted in an onset of pre-mRNA synthesis at expected time points (based on the location of primers) and was consistent with an expected elongation rate between 3 and 5 kb per minute along the gene body (Fig 8B). In CDK12-inhibited samples, we found a delay in the onset of pre-mRNA synthesis in all the locations tested. Surprisingly, synthesis of pre-mRNA of all investigated genes was already delayed at 5′ ends by a similar time window of approximately 3–6 min (Fig 8B, compare time of upswing of blue and brown curves). This indicates that CDK12 kinase activity may play a role in an optimal release of promoter-paused RNAPII on those genes. Importantly, in the middle of gene bodies of the CDK12-independent genes the delay in pre-mRNA synthesis was comparable to the one observed at their

ª 2019 The Authors EMBO reports 20:e47592 | 2019 15 of 29 EMBO reports Anil Paul Chirackal Manavalan et al



Figure 8. CDK12 inhibition decreases transcription elongation rates in bodies of genes with RNAPII processivity defect.

A Experimental outline for measurement of transcription elongation rates. AS CDK12 HCT116 cells were treated with DRB for 3.5 h to synchronize RNAPII at gene promoters. The cells were either pretreated (+) or not (−) with 3-MB-PP1 0.5 h prior to DRB wash off. After DRB wash off (0 h), fresh medium either supplemented (+) or not (−) with 3-MB-PP1 was added and samples were taken at indicated time points for analyses of pre-mRNA expression by RT–qPCR.

B Transcription elongation rate decreases in bodies of CDK12-dependent but not CDK12-independent genes after CDK12 inhibition. Graphs show relative levels of pre-mRNAs of described genes in AS CDK12 HCT116 cells either treated with 3-MB-PP1 or not (CTRL) for indicated times after DRB wash off. Pre-mRNA levels were normalized to the samples not treated with DRB (Unt) for which the value was set as 1. n = 3 independent experiments, error bars correspond to SEM. Positions of primers (designed to span exon–intron junctions) and their distance from the transcription start site in kb are indicated in the gene structures shown above the graphs.

C Proposed model. Schema shows groups of genes whose RNAPII processivity is particularly sensitive to CDK12 catalytic activity and cellular functions that are especially dependent on optimal expression of these genes. The situation in cells with normal and aberrant CDK12 kinase activity is depicted. CDK12 (green oval) phosphorylates (P) unknown substrate(s) (orange oval), possibly including the CTD (blue line), which results in optimal elongation and processivity (blue arrow) of RNAPII (blue oval) for CDK12-sensitive genes. Full length, functional mRNAs are synthesized (upper panel). Inhibition of CDK12 leads to hyperphosphorylation (capital P) of Ser2 (S2) in bodies of CDK12-sensitive genes, which is associated with slower elongation and premature termination. Shorter, aberrant mRNAs are made (lower panel). mRNAs are depicted as black lines.
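As a rough illustration of how an elongation rate could be read off a DRB wash-off time course such as the one summarized in Fig 8B, the sketch below takes the first sampled time point at which the relative pre-mRNA level at a primer position crosses a threshold, and divides the distance between two primer positions by the difference of their onset times. The threshold of 0.5, the primer positions, and every time-course value are hypothetical and are not taken from the study. Because samples are taken only every few minutes, any estimate of this kind is limited by the sampling interval.

```python
from typing import Optional, Sequence


def onset_time(times_min: Sequence[float],
               levels: Sequence[float],
               threshold: float = 0.5) -> Optional[float]:
    """Return the first sampled time (min) at which the relative
    pre-mRNA level reaches the threshold, or None if it never does.
    The threshold is an arbitrary, hypothetical choice."""
    for t, level in zip(times_min, levels):
        if level >= threshold:
            return t
    return None


def elongation_rate_kb_per_min(pos_a_kb: float, onset_a_min: float,
                               pos_b_kb: float, onset_b_min: float) -> float:
    """Estimate the elongation rate between two primer positions as
    distance (kb) divided by the difference in onset times (min)."""
    dt = onset_b_min - onset_a_min
    if dt <= 0:
        raise ValueError("downstream onset must be later than upstream onset")
    return (pos_b_kb - pos_a_kb) / dt


# Hypothetical time course after DRB wash off (minutes).
times = [0, 5, 10, 15, 20, 25]
# Hypothetical relative pre-mRNA levels at a 5' primer (2 kb from the TSS)
# and at a mid-gene primer (42 kb from the TSS).
levels_5prime = [0.05, 0.60, 0.90, 1.00, 1.00, 1.00]
levels_middle = [0.02, 0.03, 0.10, 0.70, 0.95, 1.00]

onset_5prime = onset_time(times, levels_5prime)   # 5 min
onset_middle = onset_time(times, levels_middle)   # 15 min
rate = elongation_rate_kb_per_min(2.0, onset_5prime, 42.0, onset_middle)
print(f"estimated elongation rate: {rate:.1f} kb/min")  # 4.0 kb/min
```

With these made-up numbers the estimate falls inside the 3 to 5 kb per minute range quoted for control samples. A longer mid-gene delay relative to the 5′ primer would lower the estimate accordingly.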

5′ ends (3–6 min), indicating that elongation rates do not change considerably on their gene bodies. This was in contrast to the CDK12-dependent genes where the delay in pre-mRNA synthesis in the middle of the genes was much longer (at least 9 min; Fig 8B, compare time of upswing of black and red curves). This indicates that RNAPII elongation slows down in bodies of these genes when the CDK12 kinase is inhibited, which likely contributes to or accompanies the observed RNAPII processivity defect.

Although these experiments were performed only on a limited number of genes, they suggest that the CDK12-dependent RNAPII processivity defect is accompanied by slower elongation rates at gene bodies of the affected genes.

Discussion

Using rapid and specific inhibition of CDK12 kinase activity in AS CDK12 cells, we uncovered a crucial role for CDK12 catalytic activity in G1/S progression. CDK12 activity is required for optimal expression of core DNA replication genes and timely formation of the pre-replication complex on chromatin. Our genome-wide studies of total and modified RNAPII suggest that CDK12 kinase does not globally control P-Ser2 levels on transcription units; however, it is crucial for RNAPII processivity on a subset of long and poly(A)-signal-rich genes, particularly those involved in DNA replication and DNA damage response. We further demonstrate that CDK12-dependent RNAPII processivity is a rate-limiting factor for optimal G1/S progression and cellular proliferation.

The general requirement of CDK12 kinase activity for optimal G1/S progression in human cells is corroborated by our finding that CDK12 expression peaks in early G1 phase (Fig 2) resembling regulation of classical cell cycle-related cyclins [39]. This could not be accounted for by activation of the DNA damage checkpoint, as its signaling occurs later than 24 h post-inhibition, after the cell cycle defect. In parallel, CDK12 kinase activity directs transcription of crucial HR repair genes including BRCA1, BRCA2, ATM, and Fanconi anemia genes (Fig 8C) that are also essential for dealing with replication stress by protecting and/or restarting stalled replication forks [67]. As deregulation of DNA replication and cell cycle progression leads to replication stress and genome instability [39,68,69], these findings combined with a well-established role of CDK12 in the HR DNA repair pathway have important clinical implications, as discussed below.

Recent findings show that many cancers with disrupted CDK12 catalytic activity have a unique, CDK12-inactivation-specific genome instability phenotype: tandem duplications [29–33]. There are several possible scenarios for their genesis; nevertheless, we favor the concept that they arise due to disrupted expression of both core DNA replication and HR genes upon inhibition of CDK12. This leads to an onset of replication stress that, as a consequence of inefficient HR-mediated fork restart, results in use of alternative repair mechanism (Fig 8C). These defects thus correspond to the onset of HR-independent genome instability resulting in the distinct tandem duplication genome rearrangements pattern observed in tumors with inactivated CDK12. They likely have catastrophic consequences for cell survival; however, in some cells they are occasionally compensated by a pro-growth event leading to tumorigenesis with distinct tandem duplications (Fig 8C). The outcomes of early stages of CDK12 inactivation were mimicked in AS CDK12 HCT116 cells documenting a progressive accumulation of various chromosomal defects over several rounds of replication accompanied by a gradual decrease of cellular proliferation. Notably, the recently discovered role of CDK12 in translation of many mRNAs that encode subunits of mitotic and centromere complexes contributes to these defects and adds yet another layer of complexity into the essential function of CDK12 in the maintenance of genome stability [70].

During the course of our research, two studies suggested a connection between CCNK/CDK12 and S phase cell cycle progression: CDK12 deficiency was found to be synthetically lethal in combination with inhibition of S phase checkpoint kinase CHK1 [71], further supporting our findings as activation of the checkpoint will give the cell time to repair DNA damage caused by replication stress. In another study, knockdown of CCNK was shown to lead to G1/S cell cycle arrest [72]. The proposed mechanism suggested interference with pre-replication complex assembly caused by CDK12-mediated CCNE1 phosphorylation (directly or indirectly) [72]. Our results demonstrate that CDK12 also functions upstream of the pre-replication complex assembly, as CDK12 inhibition (and also CCNK depletion, see Fig EV3F and G) in the same cell line (HCT116) strongly down-regulates mRNA and protein levels of pre-replication complex subunits, including CDC6, CDT1, TOPBP1, and MTBP. It will be important to determine whether CDK12 can directly phosphorylate CCNE1 and regulate CCNE1/CDK2 activity in early stages of replication as suggested [72]. In particular, alterations in CCNE1 also lead to the onset of a distinct tandem duplication phenotype [32].


Mechanistically, CDK12 inhibition did not affect global transcription and P-Ser2 levels, but led to a loss of RNAPII processivity accompanied by transcript shortening of a subset of genes, consistent with defective transcriptional elongation. Individual CDK12-dependent genes showed a shift of P-Ser2 peaks toward gene 5′ ends approximately to the positions where RNAPII occupancy and transcription was lost, i.e., to new 3′ ends of shortened transcripts. Notably, our findings resemble inhibition of CDK12 by very low (50 nM) concentrations of THZ531, when only a subset of genes, including DNA repair genes, was down-regulated without an appreciable decrease of P-Ser2 levels [17]. In contrast, we did not find wider transcriptional defects and parallel loss of Ser2-phosphorylated RNAPII as observed with higher (≥ 200 nM) THZ531 concentrations [17]. This difference might be potentially explained by a residual kinase activity in the presence of competitive 3-MB-PP1 in contrast to a complete kinase shut-off with higher concentrations of covalent THZ531 or alternatively by off-target effects of higher concentrations of THZ531.

Overall, our data indicate a role of human CDK12 that is different from that of CDK12 homologs in Saccharomyces cerevisiae and Drosophila, where the kinase is responsible for global P-Ser2 phosphorylation and regulation of elongation [12,73]. One possible explanation might be the presence of CDK13 and BRD4, redundant P-Ser2 kinases, in humans [12,20,74]. In Schizosaccharomyces pombe, short (5 min) inhibition of AS Lsk1, a non-essential CDK12 homolog, decreased Ser2 phosphorylation, but had only a subtle effect on RNAPII distribution and transcription [75]. Although we cannot completely rule out that very short (in minutes) CDK12 inhibition globally affects transcription in human cells, this seems unlikely, since bulk P-Ser2 and P-Ser5 levels in cells are either not affected or only subtly (Figs 1D and EV1D) [48]. Notably, bulk phosphorylation of Ser7, the modification implicated in expression of small nuclear RNAs (snRNAs) [76], was decreased after CDK12 inhibition (Figs 1D and EV1D). In any case, our experiments using 4.5-h inhibition identified the subset of genes whose transcription is crucially dependent on CDK12 catalytic activity. Notably, we did not find any evidence that inhibition of CDK12 affects alternative last exon splicing, as observed in breast cancer cell lines upon CDK12 depletion [28]. Thus it seems likely that this function of CDK12 is independent of its kinase activity.

Inspection of individual genes sensitive to CDK12 inhibition revealed a relative accumulation of RNAPII hyperphosphorylated on Ser2 on the gene body rather than at gene 3′ ends, predominantly at a longer distance from the TSS together with a sudden loss of RNAPII occupancy and transcription from a gene at approximately the same position. Although we cannot determine the order and consequence of events, we speculate that disrupted or slow elongation results in a compensatory increase of phosphorylation on Ser2 by an unknown kinase (in bulk, the time-dependent accumulation of P-Ser2 and also, to some extent P-Ser5, is visible in Figs 1D and EV1D). Alternatively, inactivation of a P-Ser2 phosphatase or its disabled recruitment, perhaps via CDK12-mediated changes in Ser7 phosphorylation, could be involved. In either scenario, the aberrant accumulation of P-Ser2 in gene bodies of long genes might represent a signal for triggering premature termination or polyadenylation (Fig 8C). We found that long genes, genes with higher numbers of canonical poly(A) signals, and subsets of DNA replication and DNA damage response genes are most reliant on CDK12 catalytic activity. Although CDK12-dependent genes are on average longer than other human genes, we believe that there must be yet another mechanistic/signaling basis for their dependence on the kinase. Given the catastrophic phenotypic effects of aberrant CDK12-mediated processivity, identification of the corresponding CDK12 substrate(s) will be of high importance.

During revision of this study, it was revealed that inducible depletion of full length CDK12 leads to enhanced usage of intronic PAS resulting in down-regulation of a subset of genes, particularly HR genes [63]. This was explained by a shortening of transcripts due to a higher occurrence of intronic PAS in these genes and their higher sensitivity to CDK12 loss. We also found that CDK12 inhibition results in transcript shortening for a subset of genes with a higher frequency of poly(A) signals. Nevertheless, we did not conclusively identify enriched intronic PAS usage compared to exonic/UTR PAS in our datasets when CDK12 was inhibited (Fig 7F). Perhaps mere inhibition of CDK12 by 3-MB-PP1 is not sufficient to trigger preferential use of intronic PAS although slower elongation and premature termination still occur on CDK12-sensitive genes. Alternatively, some of the numerous experimental differences between the studies can account for the difference.

We conclude that CDK12-dependent RNAPII processivity is a rate-limiting factor for optimal transcription of DNA replication genes and G1/S progression, which provides a novel link between regulation of transcription, cell cycle progression, and genome stability. Overall, our study has important implications for understanding the CDK12 cellular function, origins of CDK12-specific genome instability phenotype, and in longer term for the development of CDK12-specific cancer therapy.

Materials and Methods

Cell synchronization and cell cycle analysis

WT or AS CDK12 HCT116 cells were synchronized by serum starvation (for G0/G1 block) and AS CDK12 HeLa cells by thymidine–nocodazole (for mitotic block). For serum starvation, cells were plated at 50–60% confluency onto 60-mm dishes containing starvation medium (0.1% FBS containing DMEM) for 72 h and then released into medium containing 15% FBS. For mitotic block, the cells were plated at 60–70% confluency onto 60-mm dishes, and after incubation with 2 mM thymidine (Sigma, T1895) for 24 h, the cells were washed twice with PBS and released into fresh media for 3 h. This was followed by 100 ng/ml nocodazole (Sigma, M1404) block for 10 h. Then, the cells were washed twice with PBS and released into fresh media containing 10% FBS. Synchronously progressing cells were collected at appropriate time points depending on the type of experiment. During the time of release (0 h), cells were treated with either DMSO (CTRL) or 5 µM ATP analog 3-MB-PP1 inhibitor (Merck, 529582) for the indicated times. Cell cycle profile was measured by flow cytometry based on the DNA content of cells using propidium iodide (PI) (Sigma, P4170) staining. For the PI staining, trypsinized cells were washed twice with PBS, fixed with ice-cold 70% (v/v) ethanol, and incubated at −20°C for 2 h. After washing twice with ice-cold PBS, cells were resuspended in Vindal buffer (10 mM Tris–Cl, pH = 8, 1 mM NaCl, and 0.1% Triton X-100) containing freshly added PI (50 µg/ml) and RNase A (200 µg/ml;


Qiagen, 19101) and incubated for 20 min at room temperature before measurement by BD FACSVerse (BD Bioscience). Cell cycle distribution was analyzed by FLOWING version 2.1 software.

Rescue or washout assay

Serum-starved AS CDK12 HCT116 cells were released by serum addition (with DMEM containing 15% FBS; 0 h) and treated with 5 µM 3-MB-PP1 for the indicated time points. Medium containing inhibitor was subsequently removed, cells were washed carefully three times with warm PBS, and fresh medium (DMEM containing 15% FBS) was added. Cells were collected at appropriate time point for flow cytometry (0 and 15 h), immunoblotting (12 h), nuclear fractionation (6 and 9 h), or RT–qPCR (7 h).

Nuclear fractionation

AS CDK12 HCT116 cells were seeded onto 150-mm dishes and synchronized by serum starvation as described. After release into 15% fetal bovine serum (FBS) containing medium with either DMSO (CTRL) or 5 µM 3-MB-PP1 dissolved in DMSO, the cells were grown for various time points and then harvested. Cell pellets were washed twice in PBS, and small aliquots were taken away for flow cytometry analyses. Remaining cell pellets were quickly frozen in dry ice and stored at −80°C. After collecting all the time points, the samples were further processed together.

Briefly, each cell pellet was lysed in 500 µl of cytoplasmic lysis buffer on ice for 5 min [10 mM Tris–Cl pH = 8.0, 0.32 M sucrose, 3 mM CaCl2, 2 mM MgCl2, 0.1 mM EDTA, 1 mM DTT, 0.5% Triton X-100, and Protease inhibitor cocktail (Sigma, P8340)] and spun at 500 g/5 min/4°C. The supernatant containing cytoplasmic fraction was discarded, and the pellets were washed once in 500 µl of the cytoplasmic lysis buffer and once in 500 µl of the same buffer without detergent to remove any residual cytoplasmic proteins. Remaining nuclear extracts were resuspended in 80 µl of EDTA-EGTA buffer [3 mM EDTA, 0.2 mM EGTA, 1 mM DTT, and Protease inhibitor cocktail (Sigma, P8340)] and left on ice for 30 min, then spun at 10,000 g/5 min/4°C, and supernatant was discarded. Remaining pellets containing chromatin-bound proteins (insoluble nuclear fraction) were washed once in 300 µl EDTA-EGTA buffer and after a spin at 1,700 g/10 min/4°C lysed in 40 µl of RIPA buffer [50 mM Tris–Cl pH = 8, 5 mM EDTA, 150 mM NaCl, 1% NP-40, 0.5% sodium deoxycholate, 0.1% SDS, 1 mM MgCl2, Protease inhibitor cocktail (Sigma, P8340), and benzonase nuclease (Sigma, E1014)] for 30 min at 37°C. After addition of SDS sample buffer, the samples were sonicated 10 × 3 s (amplitude 0.20) (QSonica Q55), spun at 13,000 g for 1 min, and boiled at 95°C for 3 min. Protein levels in insoluble nuclear fractions were analyzed by Western blotting.

Cell lines and chemicals

HCT116 human colon carcinoma cells (ATCC) and HeLa human cervical carcinoma cells (gift from Dr. A.L. Greenleaf, Duke University Medical Center, USA [48]) were maintained in Dulbecco’s modified Eagle’s medium (DMEM) containing high glucose supplemented with L-glutamine, sodium pyruvate (Sigma, D6429), and 5% FBS (Sigma, F7524) at 37°C and 5% CO2. All the chemicals were purchased from Sigma, unless specified otherwise.

Generation of AS CDK12 HCT116 cells by genome editing

To create AS CDK12 HCT116 cell line, both alleles of CDK12 were targeted using CRISPR/Cas9 system as previously described [48,77]. Guide RNA (20-nt) targeting exon 6 of CDK12 was designed with appropriate PAM motif (5′-NGG) as close to the F813 codon as possible. Sequences of single-guide RNA (sgRNA) used were the following: CDK12-sgRNA-1: ATA CTC AAA TAC AAG GTA AAA GG; CDK12-sgRNA-2: GGT CCA TAT ACT CAA ATA CAA GG. The efficiency of gRNA/Cas9 targeting and activity was validated by sequencing with the following primers: CDK12-Seq 1-fwd: TAG GAC TTG AGG CAT TGT TAT TTC, CDK12-Seq 1-rev: TTA GAA CAC TTA ATA TCC CGA TGA. HCT116 cells expressing Cas9 and CDK12 targeting sgRNA were transfected with a 166-nt-long homologous repair template that introduced desired genome changes. The homologous repair template contains: TTT to GGG mutation which results in F813G, adjacent silent change (A to T) to generate a novel BslI restriction site to facilitate downstream validation, and a silent mutation GTA to GTT to prevent alternative splicing. Following selection, individual colonies were isolated by low density plating and expanded, and PCR genotyped using specific forward PCR primers for either WT (CDK12-PCR 1-WT-fwd: GGT GCC TTT TAC CTT GTA TTT GA) or AS (CDK12-PCR 1-AS-fwd: GTG CCT TTT ACC TTG TTG GGG AG) sequences and with the reverse primer (CDK12-PCR 1-rev: GGA GCA GGT ATG TTT CTC CCA; Fig EV1B). Positive clones were further validated by PCR of genomic DNA using the following primers: CDK12-PCR 2-fwd (GCT CCG TTG TTT ATT ATT AGG AAG G) and CDK12-PCR 2-rev (TCA CTA AAT AGT GTG TGA ATA CTG C) followed by digestion using BslI (Thermo Fisher Scientific, FD1204) (Fig 1B). Digested products were separated by agarose gel electrophoresis and the pattern of digestion confirmed homozygous AS CDK12 clones. Initial PCR screening was followed by Sanger sequencing with the following primers to confirm the presence of the desired mutation (Fig 1C): CDK12-PCR 3-fwd: CCC CCA TGA AGA GGT GAG TAG and CDK12-PCR 3-rev: GGA GCA GGT ATG TTT CTC CCA, and CDK12-Seq 2-fwd: GCT CCG TTG TTT ATT ATT AGG AAG G and CDK12-Seq 2-rev: TCA CTA AAT AGT GTG TGA ATA CTG C. Immunoprecipitation of CDK12 followed by Western blotting with cyclin K (CCNK) from both WT and AS CDK12 HCT116 cells was performed to check the presence of intact CDK12/CCNK complex.

BrdU incorporation assay

To differentiate between replicating and non-replicating cells based on the staining of newly synthesized DNA, BrdU (5-Bromo-2′-deoxyuridine) incorporation assay was performed as described in [78]. Briefly, BrdU (Sigma, B9205) was added to the cell culture medium at a final concentration of 10 µM and incubated for 30 min. After BrdU incorporation, cells were harvested and washed twice with 1% BSA/PBS before fixing in 70% (v/v) ethanol at −20°C for 2 h. Ethanol fixed cells were denatured with 2 N HCl containing 0.5% Triton X-100 for 30 min to yield single-stranded DNA molecules. Cells were resuspended in 0.1 M Na2B4O7·10 H2O (pH = 8.5) to neutralize acid before resuspending in 1% BSA/PBS/0.5% Tween-20. Cells were then incubated with 0.5 µg anti-BrdU FITC (clone B44, BD Bioscience, 347583) for 45 min, washed twice with 1% BSA/PBS, and stained with propidium iodide (5 µg/ml) before


measurement by BD FACSVerse (BD Bioscience). Data were analyzed by FlowJo version 10 software.

Immunoblotting

For the isolation of total cellular proteins, AS CDK12 HCT116 cells (from either 100-mm or 150-mm dishes) were lysed in protein lysis buffer [20 mM HEPES-KOH, pH = 7.9, 15% glycerol, 150 mM KCl, 1 mM EDTA, 0.2% NP-40 (Sigma, 18896), 1 mM DTT, 0.5% v/v Protease inhibitor cocktail (Sigma, P8340)], sonicated, and centrifuged (10,000 g, 10 min, 4°C). Cellular protein concentrations were quantified using the bicinchoninic acid (BCA) protein assay. Equal amounts of proteins were loaded onto appropriate percentage of either Tris-glycine or Tris-acetate gels, and proteins were resolved by SDS–PAGE using appropriate running buffer under denaturing conditions (120 V for 90 min). For immunoblotting, proteins were electrophoretically transferred (100 V for 1 h) to 0.45-µm nitrocellulose membranes (Sigma, GE10600008). After blocking either with 5% nonfat dry milk or bovine serum albumin (BSA) in TBS-T buffer for 90 min at room temperature, the membranes were probed using antibodies raised against the indicated proteins overnight at 4°C (see Table 1 for the complete list of antibodies used in this study). Either FUS or α-tubulin was used as loading control. Membranes were washed and subsequently incubated with appropriate HRP (horseradish peroxidase)-conjugated secondary antibody (GE Healthcare, NA931V, NA934V or Santa Cruz, sc-2032) for 1 h at room temperature. Immunoreactive bands were detected on either Amersham Hyperfilm ECL or UltraCruz Autoradiography Film (Santa Cruz, sc-201697) using enhanced chemiluminescence reagent (Western Blotting Luminol Reagent, Santa Cruz, sc-2048).

Immunoprecipitation

WT or AS CDK12 HCT116 cells (150-mm dish per IP) were harvested in ice-cold PBS, lysed in protein lysis buffer [20 mM HEPES-KOH, pH 7.9, 15% glycerol, 150 mM KCl, 1 mM EDTA, 0.2% NP-40 (Sigma, 18896), 1 mM DTT, 0.5% v/v protease inhibitor cocktail (Sigma, P8340)], sonicated, and cleared by centrifugation (10,000 g, 10 min, 4°C). Cellular protein concentrations were quantified using the bicinchoninic acid (BCA) protein assay. For CDK12 IP, lysate was incubated with 2 µg of anti-CDK12 antibody (Santa Cruz, sc-81834) for 2 h at 4°C, followed by incubation with pre-washed protein G sepharose beads (GE Healthcare, 17-0618-01; 20 µl per IP) for another 2 h at 4°C. Immunoprecipitates were washed three times with 1 ml protein lysis buffer, eluted from the beads with 40 µl of 3× Laemmli sample buffer, and then boiled for 4 min at 95°C. Immunoprecipitated proteins were resolved by SDS–PAGE, followed by Western blotting, and probed for the indicated proteins.

SPT6 immunoprecipitation

AS CDK12 HCT116 cells (150-mm dish per IP) were treated for 4 h with either DMSO (CTRL) or 5 µM 3-MB-PP1 dissolved in DMSO. Cells were harvested in ice-cold PBS, and pellets were equalized in size, lysed in protein lysis buffer [20 mM HEPES-KOH, pH 7.4, 100 mM KCl, 0.5% Triton X-100 (Sigma, 18896), 1 mM DTT, protease inhibitor cocktail (Sigma, P8340)], sonicated, and centrifuged (10,000 g, 10 min, 4°C). 20 µl of protein G Dynabeads (Thermo Fisher Scientific, 10009D) per IP was washed three times in protein lysis buffer and incubated 4 h at 4°C with 1 µg of anti-SPT6 antibody (Novus, NB100-2582) per IP or without antibody as a control. Beads were washed three times with 1 ml of protein lysis buffer and incubated with lysates overnight at 4°C. Immunoprecipitates were washed three times with 1 ml of protein lysis buffer; 30 µl of 3× Laemmli buffer was added and then boiled for 3 min at 95°C. The immunoprecipitates were resolved by SDS–PAGE, and Western blots were probed with SPT6 and RNAPII antibodies.

Chromosomal aberration assay by metaphase spreads

Chromosomal aberration assay was performed as described previously [79] with AS CDK12 HCT116 cells treated with and without 5 µM 3-MB-PP1 for 24 and 48 h, and with 4 mM hydroxyurea (Sigma, H8627) for 5 h as a positive control. Briefly, at the end of the treatment the cells (from 25-cm2 flasks) were incubated with 0.1 µg/ml KaryoMax colcemid (Thermo Fisher Scientific, 15212012) for 90 min to arrest the cells in metaphase and allow chromosome spreading. Cells were swollen by treatment with hypotonic KCl (0.075 M) for 12 min at 37°C and fixed with methanol:glacial acetic acid (3:1). Cells were carefully dropped onto a microscopic slide, stained with 5% Giemsa, and air-dried. Slides were mounted with Richard-Allan Scientific Cytoseal 60 (Thermo Fisher Scientific, 8310-16) and analyzed with an Olympus BX60 microscope at 1,000× magnification.

siRNA-mediated knockdown

AS CDK12 HCT116 cells were plated at 30% confluency 7–11 h before transfection. siRNA was transfected at a final concentration of 10 nM using Lipofectamine RNAiMax (Thermo Fisher Scientific, 13778-150) according to the manufacturer’s instruction. Briefly, to transfect one well in a 6-well plate we mixed together 2.5 µl of siRNA (10 µM stock solution) diluted in 250 µl of Opti-MEM (Thermo Fisher Scientific, 31985-070) with 5 µl of Lipofectamine diluted in 250 µl of Opti-MEM. After 15 min, the mixture was added dropwise into the cultured cells containing 2.5 ml of media. If larger plates were used for transfections, the amount of reagents was scaled up proportionally. Control samples were transfected with non-targeting control siRNA-A (Santa Cruz, sc-37007). The levels of proteins after depletion were analyzed by Western blotting with appropriate antibodies. The list of siRNAs used in this study is specified in Table 2.

Reverse transcription qPCR

Total RNA was isolated by TRIzol reagent (Thermo Fisher Scientific, 15596026) according to the manufacturer’s protocol. 1 µg of total RNA was treated with 1 µl of DNase (Sigma, AMPD1) and reverse transcribed using 200 U SuperScript II RT (Thermo Fisher Scientific, 18064-014) with random hexamers (IDT, 51-01-18-01). Quantitative gene expression analysis was performed on AriaMx Real-Time PCR System (Agilent) using SYBR Green. In general, each reaction (final volume 11 µl) contained 5.5 µl SYBR Green JumpStart Taq ReadyMix (Sigma, S4438), 200 nM of each primer (primer sequences used in this study are specified in Table 3), 0.28 µl H2O, and 5 µl diluted cDNA template, with the following PCR cycling conditions: 95°C for 2 min followed by 45 cycles of


Table 1. Antibodies used for ChIP, IP, and Western blotting.
Target protein | Clone | Cat. no. | ChIP | IP | WB | Source/Reference
CCNK | G-11 | sc-376371 | – | 3 µg | 1:500 | Santa Cruz
CDC6 | 180.2 | sc-9964 | – | – | 1:200 | Santa Cruz
MTBP | B-5 | sc-137201 | – | – | 1:600 | Santa Cruz
CDT1 | F-6 | sc-365305 | – | – | 1:300 | Santa Cruz
FUS | 4H11 | sc-47711 | – | – | 1:10,000 | Santa Cruz
Histone 2A (H2A) |  | ab18255 | – | – | 1:10,000 | Abcam
ORC6 | 3A4 | sc-32735 | – | – | 1:3,000 | Santa Cruz
E2F1 |  | A300-766A | 5 µg | – | – | Bethyl
E2F3 | PG30 | sc-56665 | 5 µg | – | – | Santa Cruz
CDK12 | U1-4th immune | – | – | – | 1:3,000 | In-house made
CDK12 | R-12 | sc-81834 | – | 2 µg | 1:500 | Santa Cruz
Cyclin A2 |  | 4656 | – | – | 1:1,000 | Cell Signaling
Cyclin E2 |  | 4132 | – | – | 1:1,000 | Cell Signaling
Cyclin E1 |  | 4129 | – | – | 1:1,000 | Cell Signaling
RNAPII | N-20 | sc-899x | 2 µg | – | 1:1,000 | Santa Cruz
Phospho-RNAPII (Ser2) | 3E10 | 61083 | 3 µg | – | 1:6,000 | Active Motif
Phospho-RNAPII (Ser5) | 3E8 | 61085 | 3 µg | – | 1:8,000 | Active Motif
α-Tubulin | B-7 | sc-5286 | – | – | 1:200 | Santa Cruz
ATM |  | 2873 | – | – | 1:300 | Cell Signaling
Phospho-ATM (Ser1981) | EP1890Y | ab81292 | – | – | 1:1,000 | Abcam
p53 | D0-1 | – | – | – | 1:10 | In-house made
Phospho-p53 (Ser15) |  | 9284 | – | – | 1:800 | Cell Signaling
TOPBP1 | B-7 | sc-271043 | – | – | 1:250 | Santa Cruz
CDC7 | SPM171 | sc-56275 | – | – | 1:600 | Santa Cruz
ORC2 | 3G6 | sc-32734 | – | – | 1:1,500 | Santa Cruz
ORC3 | 1D6 | sc-23888 | – | – | 1:1,500 | Santa Cruz
GINS4 (SLD5) | D-7 | sc-398784 | – | – | 1:400 | Santa Cruz
MCM3 | E-8 | sc-390480 | – | – | 1:200 | Santa Cruz
CDK13 | N-term. | – | – | – | 1:3,000 | In-house made
SPT6 |  | NB100-2582 | 3.5 µg | 1 µg | 1:4,000 | Novus Biologicals
RNAPII |  | NBP2-32080 |  |  | 1:2,000 | Novus Biologicals
Phospho-RNAPII (Ser7) | 4E12 |  | – | – | 1:1,000 | Chromotek
RNAPII (Rpb7) | C-20 | sc-398213 | – | – | 1:100 | Santa Cruz
Sheep anti-mouse IgG-HRP |  | NA931V | – | – | 1:3,000 | GE Healthcare Life Sciences
Donkey anti-rabbit IgG-HRP |  | NA934V | – | – | 1:3,000 | GE Healthcare Life Sciences
Goat anti-rat IgG-HRP |  | sc-2032 | – | – | 1:3,000 | Santa Cruz
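The RT–qPCR analyses in these Methods report relative gene expression by the comparative CT (2^(−ΔΔCT)) method with HPRT1 or B2M as normalizer. A minimal sketch of that calculation is given here, assuming that technical triplicates are averaged before subtraction and that amplification efficiency is 100% (one doubling per cycle). The function name and all CT values are hypothetical and are not data from the study.

```python
from statistics import mean
from typing import Sequence


def relative_expression(target_treated: Sequence[float],
                        ref_treated: Sequence[float],
                        target_control: Sequence[float],
                        ref_control: Sequence[float]) -> float:
    """Fold change of a target gene (treated vs control) by the
    comparative CT method, normalized to a reference gene.
    Each argument is a list of CT values from technical replicates."""
    # Delta CT: target minus reference gene, within each condition.
    d_ct_treated = mean(target_treated) - mean(ref_treated)
    d_ct_control = mean(target_control) - mean(ref_control)
    # Delta-delta CT: treated minus control.
    dd_ct = d_ct_treated - d_ct_control
    # Assumes perfect doubling per cycle (hypothetical simplification).
    return 2.0 ** (-dd_ct)


# Hypothetical CT triplicates: a target gene and a normalizer such as HPRT1.
fold = relative_expression(
    target_treated=[25.5, 26.0, 26.5],
    ref_treated=[19.5, 20.0, 20.5],
    target_control=[23.5, 24.0, 24.5],
    ref_control=[19.5, 20.0, 20.5],
)
print(f"relative expression: {fold:.2f}")  # 0.25
```

In this made-up case the target crosses threshold two cycles later in treated cells while the normalizer is unchanged, which corresponds to a fourfold reduction.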

Table 2. siRNAs used in this study.
Gene | Cat. no. | Source
siCTRL A | sc-37007 | Santa Cruz
siCCNK | sc-37600 | Santa Cruz

denaturation at 95°C for 15 s, annealing at 55°C for 30 s, and extension at 72°C for 30 s. All reactions were performed in triplicates for each biological replicate, and melting curve analyses were routinely performed to monitor the specificity of the PCR product. The relative gene expression was determined using comparative CT method (2^(−ΔΔCT) method) with either HPRT1 or B2M as normalizer.

Analysis of mRNA stability

To assess relative stability of select DNA damage and replication transcripts, AS CDK12 HCT116 cells were treated with 1 µg/ml actinomycin D (Sigma, A9415) to block transcription in the presence or absence of 5 µM 3-MB-PP1. Cells were harvested at various time points (0 to 5 h) after actinomycin D treatment


Table 3. Primers used in this study.
Name | Sequence (5′–3′) | Method used | Reference
CCNK (ex8-ex10) F | AACAGCCCAAGAAACCCTC | RT–qPCR | This study
CCNK (ex8-ex10) R | CAACGGTGGATGAGTGGTC | RT–qPCR | This study
MTBP (ex10-ex11) F | GGATTGACAAACAGTACCAAACAG | RT–qPCR | This study
MTBP (ex10-ex11) R | GTTGGGAGGTGGAATCAGTATG | RT–qPCR | This study
CCNE2 (ex3-ex4-ex5) F | AAGAGGAAAACTACCCAGGATG | RT–qPCR | This study
CCNE2 (ex3-ex4-ex5) R | ATAATGCAAGGACTGATCCCC | RT–qPCR | This study
CDC6 (+1,860) F | AGAACATGCTCTGAAAGATAAAGC | RT–qPCR | This study
CDC6 (+1,922) F | GGTGTAAGAGAAGAATTTAAGGCAA | RT–qPCR | This study
TOPBP1 (ex24-ex25) F | GCTTCATCGCTCCTACCTTG | RT–qPCR | This study
TOPBP1 (ex24-ex25) R | AGTGCTAGTCTTCGTTGCTG | RT–qPCR | This study
MCM10 (ex18-ex19-ex20) F | ACTCCCGAACAAGCACTG | RT–qPCR | This study
MCM10 (ex18-ex19-ex20) R | GTCTTTTCCTTTAGCATTCCGTC | RT–qPCR | This study
ORC2 (ex10-ex11-ex12) F | GAGAGCTAAACTGGATCAGCA | RT–qPCR | This study
ORC2 (ex10-ex11-ex12) R | GCACAATGTTGAACCCAAGG | RT–qPCR | This study
CDT1 (ex9-ex10) F | AGCGTCTTTGTGTCCGAAC | RT–qPCR | This study
CDT1 (ex9-ex10) R | AGGTGCTTCTCCATTTCCC | RT–qPCR | This study
ORC3 (ex4-ex5) F | GGGCGGTCAAATAAAACTCAG | RT–qPCR | This study
ORC3 (ex4-ex5) F | GCCTCTGTTAGACTTCCGAATG | RT–qPCR | This study
C-MYC (+1,855) F | CAC AAA CTT GAA CAG CTA CGG | RT–qPCR | This study
C-MYC (+1,941) R | GGT GAT TGC TCA GGA CAT TTC | RT–qPCR | This study
BRCA1 (+5,718) F | AGATGTGTGAGGCACCTGT | RT–qPCR | This study
BRCA1 (+5,777) R | GTCCAGCTCCTGGCACT | RT–qPCR | This study
BRCA2 (ex18-ex19) F | TTCATGGAGCAGAACTGGTG | RT–qPCR | This study
BRCA2 (ex18-ex19) R | AGGAAAAGGTCTAGGGTCAGG | RT–qPCR | This study
FANCI (ex7-ex8) F | TGTAATCCAACTCACCTCCATG | RT–qPCR | This study
FANCI (ex7-ex8) R | GAGAACCAGAAGCTGATAGACC | RT–qPCR | This study
ATR (ex34-ex35) F | CGCTGAACTGTACGTGGAAA | RT–qPCR | This study
ATR (ex34-ex35) R | CAATAAGTGCCTGGTGAACATC | RT–qPCR | This study
Exo1 (+799) F | CCTCGTGGCTCCCTATGAAG | RT–qPCR | This study
Exo1 (+872) R | AGGAGATCCGAGTCCTCTGTAA | RT–qPCR | This study
CDK6 (ex2/ex3) F | TGGAGACCTTCGAGCACC | RT–qPCR | This study
CDK6 (ex2/ex3) R | CACTCCAGGCTCTGGAACTT | RT–qPCR | This study
CCND3 (ex2/ex3) F | TACACCGACCACGCTGTCT | RT–qPCR | This study
CCND3 (ex2/ex3) R | GAAGGCCAGGAAATCATGTG | RT–qPCR | This study
CDKN1B (ex1/ex2) F | CGGCTAACTCTGAGGACAC | RT–qPCR | This study
CDKN1B (ex1/ex2) R | TGTTCTGTTGGCTCTTTTGT | RT–qPCR | This study
CDKN2A (ex2/ex3) F | GAAGGTCCCTCAGACATCCCC | RT–qPCR | This study
CDKN2A (ex2/ex3) R | CCCTGTAGGACCTTCGGTGAC | RT–qPCR | This study
E2F1 (ex5/ex6) F | CAGAGCAGATGGTTATGGTG | RT–qPCR | This study
E2F1 (ex5/ex6) R | GGCACAGGAAAACATCGATC | RT–qPCR | This study
HPRT1 (ex5/ex6) F | ACACTGGCAAAACAATGCAG | RT–qPCR | This study
HPRT1 (ex5/ex6) R | ACTTCGTGGGGTCCTTTTC | RT–qPCR | This study
B2M (ex1/ex2) F | GCATTCCTGAAGCTGACAG | RT–qPCR | This study
B2M (ex1/ex2) R | GCTGGATGACGTGAGTAAAC | RT–qPCR | This study

22 of 29 EMBO reports 20:e47592 | 2019 © 2019 The Authors Anil Paul Chirackal Manavalan et al EMBO reports

Table 3 (continued)

Name | Sequence (5′–3′) | Method used | Reference
GAPDH (ex1/ex3) F | GCTCTCTGCTCCTCCTGTTC | RT–qPCR | This study
GAPDH (ex1/ex3) R | ACGACCAAATCCGTTGACTC | RT–qPCR | This study
CDC6 (PR) F | GGCTGTAACTCTTCCACTGGATTG | ChIP-qPCR | This study
CDC6 (PR) R | CCCGGCCTCGATTCTGATT | ChIP-qPCR | This study
CDC6 (IR) F | AGGTTCCAATATGCATGCTAAGTA | ChIP-qPCR | This study
CDC6 (IR) R | GCCCTTAATAACCTGAAATGGTAATG | ChIP-qPCR | This study
CCNE2 (PR) F | CTACGCGCAGCAACTCCT | ChIP-qPCR | This study
CCNE2 (PR) R | CTGTCCGGAGGTGTCAGTCT | ChIP-qPCR | This study
CCNE2 (IR) F | GACTCCATGACTTCATCCTC | ChIP-qPCR | This study
CCNE2 (IR) R | TGTGACCAGCTGTGATTC | ChIP-qPCR | This study
BRCA1 (PR) F | TATTCTGAGAGGCTGCTGCTTAGCG | ChIP-qPCR | [11]
BRCA1 (PR) R | GGGCCCAGTTATCTGAGAAACCC | ChIP-qPCR | [11]
BRCA1 (IR) F | CCA AAG CCA CCT TTC TGT TCC CAT | ChIP-qPCR | [11]
BRCA1 (IR) R | TCC TGT AAG ACC CTT TGC CTG ACA | ChIP-qPCR | [11]
TOPBP1 (PR) F | GCTCCAACGAGGTAAGTGAG | ChIP-qPCR | This study
TOPBP1 (PR) R | GAAGGCCACAGAAGGCAT | ChIP-qPCR | This study
TOPBP1 (IR) F | CTGGCTCCACATCTCTTCTTC | ChIP-qPCR | This study
TOPBP1 (IR) R | TGGCTCTGCTTAATGCTACTAC | ChIP-qPCR | This study
MCM10 (PR) F | GGCGCCAGACACTCTATTT | ChIP-qPCR | This study
MCM10 (PR) R | GTCATTGGACGCCCTCTTT | ChIP-qPCR | This study
MCM10 (IR) F | CGTGCCTTTCTTAATCAGCATC | ChIP-qPCR | This study
MCM10 (IR) R | GTGCACTGAAGTAGGAGACATAG | ChIP-qPCR | This study
CDC45 (PR) F | TGAATGGCAGAGCGCTAAT | ChIP-qPCR | This study
CDC45 (PR) R | CCAGGGATCACCAACCAATAG | ChIP-qPCR | This study
CDC45 (IR) F | ACTCTGAGCCTGCATTCTTG | ChIP-qPCR | This study
CDC45 (IR) R | AGAAATGTCTGGGCCACATC | ChIP-qPCR | This study
RRM2 (PR) F | GGCATGGCACAGCCAAT | ChIP-qPCR | This study
RRM2 (PR) R | CTCACTCCAGCAGCCTTTAAATC | ChIP-qPCR | This study
RRM2 (IR) F | GGTGGGTGAACACTAGGAATC | ChIP-qPCR | This study
RRM2 (IR) R | AAGGTCGCACAGCACAA | ChIP-qPCR | This study
TOPBP1_9 kb_F | GCATTTCAAGCACCTGAAGATTTA | RT–qPCR | This study
TOPBP1_9 kb_R | AGTCAGGCTAGGAAATGCTAATG | RT–qPCR | This study
TOPBP1_33 kb_F | CCCATCTTGCTTCTCTCTCTCT | RT–qPCR | This study
TOPBP1_33 kb_R | GGCTGCAAGTGCATCCTATAC | RT–qPCR | This study
MCM10_10 kb_F | AAATAGGGTCCTCCCTGCTC | RT–qPCR | This study
MCM10_10 kb_R | GGTGGTCTTCATCCAACTTATCC | RT–qPCR | This study
MCM10_27 kb_F | GTGTCTGCTCACTGCTGTTT | RT–qPCR | This study
MCM10_27 kb_R | TCTTGTACTGAGCCTGGACAT | RT–qPCR | This study
UBE3C_24 kb_F | TTTCTCTGTTTGGGTGTAGGAG | RT–qPCR | This study
UBE3C_24 kb_R | ACCTCTCTCTTTCTTCTTTCTTCC | RT–qPCR | This study
UBE3C_63 kb_F | CACGGATGATCACAGGGTATG | RT–qPCR | This study
UBE3C_63 kb_R | AGCCCAGTATAAACAGGACTTAAA | RT–qPCR | This study
SETD3_17 kb_F | CAAATCCTCTTTCTTGTGCAGAC | RT–qPCR | This study
SETD3_17 kb_R | CGGACTGCTGCATTCTGTAA | RT–qPCR | This study
SETD3_67 kb_F | GCTTCATTTGGCTCTTGTTAGG | RT–qPCR | This study


Table 3 (continued)

Name | Sequence (5′–3′) | Method used | Reference
SETD3_67 kb_R | TGAGGATGGGTCTGGGAA | RT–qPCR | This study
ARID1A_33 kb_F | GGTTATATATTCAGTGGCCAGAGG | RT–qPCR | This study
ARID1A_33 kb_R | CATTGGACTGGATGGCTACAA | RT–qPCR | This study
ARID1A_77 kb_F | CCTGGGTCAAAGGGTAGATTA | RT–qPCR | This study
ARID1A_77 kb_R | CTGAGGACATGAAGGGATCA | RT–qPCR | This study
CDK12-PCR 1-WT-fwd | GGT GCC TTT TAC CTT GTA TTT GA | PCR | This study
CDK12-PCR 1-AS-fwd | GTG CCT TTT ACC TTG TTG GGG AG | PCR | This study
CDK12-PCR 1-rev | GGA GCA GGT ATG TTT CTC CCA | PCR | This study
CDK12-PCR 2-fwd | GCT CCG TTG TTT ATT ATT AGG AAG G | PCR | This study
CDK12-PCR 2-rev | TCA CTA AAT AGT GTG TGA ATA CTG C | PCR | This study
CDK12-PCR 3-fwd | CCC CCA TGA AGA GGT GAG TAG | PCR | This study
CDK12-PCR 3-rev | GGA GCA GGT ATG TTT CTC CCA | PCR | This study
CDK12-Seq 1-fwd | TAG GAC TTG AGG CAT TGT TAT TTC | Sequencing | This study
CDK12-Seq 1-rev | TTA GAA CAC TTA ATA TCC CGA TGA | Sequencing | This study
CDK12-Seq 2-fwd | GCT CCG TTG TTT ATT ATT AGG AAG G | Sequencing | This study
CDK12-Seq 2-rev | TCA CTA AAT AGT GTG TGA ATA CTG C | Sequencing | This study
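The RT–qPCR primer pairs in Table 3 include HPRT1, which the methods name as the normalizer for the comparative CT (2^−ΔΔCT) method. The following is a minimal sketch of that calculation, not the authors' own analysis code. The function name, argument names and Ct values are hypothetical and chosen only for illustration, and the sketch assumes roughly 100% amplification efficiency (one doubling per cycle).

```python
def relative_expression(ct_target_treated, ct_ref_treated,
                        ct_target_control, ct_ref_control):
    """Comparative CT (2^-ddCT) estimate of relative expression.

    Each argument is a mean Ct value. "ref" is the normalizer gene
    (HPRT1 in the methods); "control" is the DMSO-treated sample.
    Assumes one doubling of product per PCR cycle.
    """
    # dCT: target Ct minus normalizer Ct, within each condition
    dct_treated = ct_target_treated - ct_ref_treated
    dct_control = ct_target_control - ct_ref_control
    # ddCT: treated dCT minus control dCT
    ddct = dct_treated - dct_control
    # Fold change of treated relative to control
    return 2.0 ** (-ddct)


# Hypothetical Ct values, for illustration only
fold_change = relative_expression(26.0, 20.0, 24.0, 20.0)
print(fold_change)
```

With these invented values the target crosses threshold two cycles later in the treated sample while the normalizer is unchanged, so the estimate is a fourfold reduction (0.25).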

by addition of TRIzol reagent. RNA was extracted and relative mRNA levels were analyzed by reverse transcription qPCR (RT–qPCR) as described above, with HPRT1 as normalization control. Primers spanning exon–exon boundaries were used to assess the percentage of mRNA remaining after the inhibition of transcription. The primers are listed in Table 3.

Analysis of elongation rate

Elongation rate experiments on select genes were carried out as described [64]. Briefly, AS CDK12 HCT116 cells were grown overnight on 60-mm dishes to 70–80% confluency and treated with 100 µM DRB (Sigma, D1916) for 3.5 h to synchronize the transcription cycle at the promoter-proximal paused stage. Thirty minutes before DRB removal, the cells were pretreated with either 5 µM 3-MB-PP1 or DMSO (CTRL). After DRB removal, the cells were washed twice with PBS and released into fresh medium containing either 5 µM 3-MB-PP1 or DMSO (CTRL) for transcription restart. The cells were then directly lysed in TRIzol reagent at appropriate time points. 2 µg of total RNA was treated with DNase and reverse transcribed using 200 U SuperScript II RT with random hexamers. Pre-mRNA levels were measured by RT–qPCR using SYBR Green on the AriaMx Real-Time PCR System, as described above. The relative pre-mRNA expression was determined using the comparative CT method (2^−ΔΔCT method) with HPRT1 as normalizer. Primers spanning exon–intron junctions of select genes were designed using the IDT software PrimerQuest (IDT). The primers are listed in Table 3.

3′ end (PolyA-selected) RNA sequencing

AS CDK12 HCT116 cells were plated onto 60-mm dishes and synchronized by serum starvation as described. At the time of release (0 h) into DMEM containing 15% FBS, cells were treated either with DMSO (CTRL) or 5 µM 3-MB-PP1 for 5 h. Total RNA was isolated from three biological replicates by TRIzol reagent (Thermo Fisher Scientific, 15596026) and purified by RNA QiAamp Spin Column (QIAGEN, 52304), according to the manufacturer’s guidelines. RNA quality was assessed by TapeStation 2200 (Agilent Technologies), and only samples with a RIN value ≥ 9 were used for library preparation. PolyA-selected libraries were made from 200 ng of total RNA input using the QuantSeq 3′ mRNA-Seq Library Prep Kit FWD for Illumina (Lexogen, 015.24) and external multiplexing barcodes for Illumina (i7 index primers 7001-7096; Lexogen, 044.96) with 12× PCR cycles for library amplification, according to the manufacturer’s instructions. The fragment size and quality of the libraries were assessed by fragment analyzer (Advanced Analytical Technologies), and the libraries were sequenced with 50 bp single-end reads on a single lane of an Illumina HiSeq 2500 (VBCF Vienna).

Nuclear total RNA-seq

AS CDK12 HCT116 cells were plated onto 150-mm dishes and synchronized by serum starvation for 72 h. Cells were released by adding medium containing 15% FBS with either DMSO (CTRL) or 5 µM 3-MB-PP1 diluted in DMSO. The cells were washed twice with ice-cold PBS 4.5 h after the release, scraped, pelleted at 500 g for 3 min, and lysed in 150 µl of cytoplasmic lysis buffer [10 mM Tris–Cl pH 8, 0.32 M sucrose, 3 mM CaCl2, 2 mM MgCl2, 0.1 mM EDTA, 1 mM DTT, 0.5% Triton X-100, 40 U/ml RNase inhibitor (Roche, 3335402001), and Protease inhibitor cocktail (Sigma, P8340)] for 5 min. Cytoplasmic RNA present in the supernatant was removed by centrifugation (500 g for 3 min). The nuclear pellet was washed with 90 µl of cytoplasmic lysis buffer, and the supernatant was completely removed after centrifugation (500 g for 3 min). Nuclear RNA was isolated from the remaining nuclear pellet using Tri-Reagent (MRC, #TR118). 1 µg of RNA was treated with 1 µl of DNase (Sigma,


AMPD1). 250 ng of nuclear RNA was used for library preparation after removing ribosomal RNA with the NEBNext rRNA Depletion Kit (NEB, E6310S). Sequencing libraries were prepared using the NEBNext Ultra II Directional RNA Library Prep Kit for Illumina (NEB, E7760) and NEBNext Multiplex Oligos for Illumina (NEB, E7500S and E7335S) and sequenced with 50 bp single-end reads on an Illumina HiSeq 2500 (VBCF Vienna, Austria).

Chromatin immunoprecipitation (ChIP-qPCR)

ChIP was performed with the antibodies indicated in Table 1. Briefly, 20 µl of protein G Dynabeads (Thermo Fisher Scientific, 10009D) per immunoprecipitation was washed three times with RIPA buffer (50 mM Tris–Cl, pH 8, 150 mM NaCl, 5 mM EDTA, 1% NP-40, 0.5% sodium deoxycholate, 0.1% SDS, supplemented with protease inhibitors, Sigma, P8340), and pre-blocked with 0.2 mg/ml BSA (Thermo Fisher Scientific, AM2616) and 0.2 mg/ml salmon sperm DNA (Thermo Fisher Scientific, 15632-011) for 4 h. After pre-blocking, the beads were washed three times with RIPA buffer and then incubated with the specific antibody for at least 4 h at 4°C.

AS CDK12 HCT116 cells were plated onto 150-mm dishes and synchronized by serum starvation as described. The cells were released and incubated with medium containing 15% FBS supplemented with either DMSO or 5 µM 3-MB-PP1 inhibitor diluted in DMSO for 4.5 h. Cells were crosslinked with 1% formaldehyde for 10 min; the reaction was quenched with glycine (final concentration 125 mM) for 5 min. Cells were washed twice with ice-cold PBS, scraped, and pelleted. Each 20-µl packed cell pellet was lysed in 600 µl of RIPA buffer and sonicated 20 × 7 s (amplitude 0.85) using a 5/64 probe (QSonica Q55A). Clarified extracts (13,000 g for 10 min) were precleared with protein G Dynabeads (Thermo Fisher Scientific, 10009D) rotating for 2–4 h at 4°C and then incubated overnight with antibody pre-bound to the protein G Dynabeads. We used 1 ml of clarified extract to immunoprecipitate E2F1 or E2F3 proteins. 5% of the clarified extract was saved and used as input DNA. The next day, beads were washed sequentially with low salt buffer (20 mM Tris–Cl, pH 8, 150 mM NaCl, 2 mM EDTA, 1% Triton X-100, 0.1% SDS), high salt buffer (20 mM Tris–Cl, pH 8, 500 mM NaCl, 2 mM EDTA, 1% Triton X-100, 0.1% SDS), LiCl buffer (20 mM Tris–Cl, pH 8, 250 mM LiCl, 2 mM EDTA, 1% NP-40, 1% sodium deoxycholate), and twice with TE buffer (10 mM Tris–Cl, pH 8, 1 mM EDTA). Bound complexes were eluted with 500 µl of elution buffer (1% SDS and 0.1 M NaHCO3). To reverse formaldehyde crosslinks, both immunoprecipitated and input DNA were incubated at 65°C for at least 4 h with NaCl at a final concentration of 0.2 M and subsequently treated with proteinase K at 42°C for 2 h (10 µg/ml, Sigma P5568) with 2 µl of GlycoBlue added (Thermo Fisher Scientific, AM9516). After phenol:chloroform extraction (Sigma, P3803), both immunoprecipitated and input DNA were dissolved in 200 µl of water, and 5 µl of DNA served as template for each qPCR reaction. Enrichment of specific gene sequences was measured by qPCR (Agilent AriaMx Real-time PCR System) using SYBR Green JumpStart TaqReadyMix (Sigma, S4438) with the following parameters: 95°C for 2 min followed by 45 cycles of denaturation at 95°C for 15 s, annealing at 55°C for 30 s, and extension at 72°C for 30 s. ChIP enrichment of a specific target was always determined based on amplification efficiency and Ct value, and calculated relative to the amount of input material. All primer sequences used in this study are specified in Table 3. qPCR was performed in triplicate for each biological replicate, and error bars represent the standard error of the mean of three biological replicates.

ChIP sequencing

ChIP was performed with RNAPII, P-Ser2, P-Ser5, and SPT6 antibodies as described above. AS CDK12 HCT116 cells were plated onto 150-mm dishes and synchronized by serum starvation as mentioned above. At the time of release (0 h) into DMEM containing 15% FBS, the cells were treated either with DMSO (CTRL) or 5 µM 3-MB-PP1 for 4.5 h. For each ChIP sequencing (ChIP-seq) experiment (three biological replicates were processed for each antibody), we performed three technical replicates, and from each replicate, the immunoprecipitated DNA was dissolved in 20 µl H2O and pooled together. DNA concentration was measured by Qubit fluorometer (Thermo Fisher Scientific), and 4 ng (3.5 ng for SPT6) of immunoprecipitated DNA was used for library preparation. ChIP-seq libraries were generated using the KAPA Biosystems Hyper Prep Kit (KK8502) with KAPA Pure Beads (KK8001) and NEBNext Multiplex Oligos for Illumina (Index Primers Set 1 and Set 2; NEB, E7335S, E7500S) with 13× (15× for SPT6) PCR cycles for library amplification, as per the manufacturer’s instructions. Libraries were run on the fragment analyzer (Advanced Analytical Technologies) to check the quality and were sequenced with 50 bp single-end reads on two lanes of an Illumina HiSeq 2500 (VBCF Vienna).

RNA-seq and ChIP-seq analysis

Quality check of RNA-seq reads was performed using fastQC (available online at: http://www.bioinformatics.babraham.ac.uk/projects/fastqc). RNA-seq reads were mapped against the human genome (hg38) and human rRNA sequences using ContextMap version 2.7.9 [80] (using BWA [81] as short read aligner and default parameters). The number of read counts per gene and exon was determined from the mapped RNA-seq reads in a strand-specific manner using featureCounts [82] and gene annotations from GENCODE version 27. Differential gene expression analysis was performed using DESeq2 [83]. Differential exon usage was determined using DEXSeq [60]. P-values were adjusted for multiple testing using the method by Benjamini and Hochberg [84], and genes and exons with an adjusted P-value ≤ 0.01 were considered significantly differentially expressed and used, respectively. Functional enrichment analysis of differentially expressed genes for Gene Ontology terms was performed with the GOrilla webserver [85]. In addition, gene set enrichment analysis (GSEA) [51] based on log2 fold-changes of all genes was performed. Analysis workflows were implemented and run using the Watchdog workflow management system [86].

Regulated poly(A) sites (PAS) were identified from 3′ end RNA-seq data in the following way: First, occurrences of polyadenylation signal sequences as defined by [87] as well as occurrences of at least 10 consecutive As (to exclude internal poly(A) priming) were identified in the genome on both strands. Second, windows around the poly(A) signal sequences (300 bp upstream of signal to 50 bp downstream of signal to include the actual PAS) and oligo-As (350 bp upstream of oligo-A until end of the 10 As) were defined. All overlapping poly(A) signal windows and oligo-A windows were merged and poly(A) signal windows overlapping with an oligo-A


window were removed. Third, read counts were determined for the remaining poly(A) signal windows using featureCounts in each 3′ end RNA-seq sample, and differential gene expression analysis was performed using DESeq2 as described above.

ChIP-seq reads were aligned to the human genome (hg38) using BWA [81]. Reads with an alignment score < 20 were discarded. Read coverage per genome position was calculated using the bedtools genomecov tool [88]. ChIP-seq and RNA-seq read coverage was visualized using Gviz [89]. For this purpose, read counts were normalized to the total number of mapped reads and averaged between replicates. Creation of other figures and statistical analysis of RNA-seq and ChIP-seq data were performed in R [90].

X% distance (i.e., 10, 50 and 90% distance) for ChIP-seq and nuclear RNA-seq data was calculated as the minimum distance in bp from the transcription start site (TSS) at which X% of the total read coverage of the gene was obtained. Absolute ΔX% distance was defined as the X% distance in control minus the X% distance in inhibitor-treated cells. Relative ΔX% distance was defined as absolute ΔX% distance divided by gene length.

Metagene analysis

The metagene analysis of read coverage distribution in ChIP-seq data was restricted to high-confidence transcripts of protein-coding genes annotated in GENCODE version 27. Transcripts shorter than 3,180 bp were excluded. For each gene, we selected the transcript with the most read counts in the RNAPII ChIP-seq samples (normalized to library size) in the ±3 kb regions around the transcription start site (TSS) and transcription termination site (TTS). For each gene, the regions −3 kb to +1.5 kb of the TSS and −1.5 kb to +3 kb of the TTS were divided into 50 bp bins (180 bins in total) and the remainder of the gene body (+1.5 kb of TSS to −1.5 kb of TTS) into 180 bins of variable length in order to compare genes with different lengths. For each bin, the average coverage per genome position was then calculated and normalized to the total sum of average coverages per bin such that the sum of all bins was 1. Finally, metagene plots were created by averaging results for corresponding bins across all genes considered. To determine statistical significance of differences between inhibitor and control, paired Wilcoxon signed rank tests were performed for each bin, comparing normalized coverage values for each gene for this bin with and without the inhibitor. P-values were adjusted for multiple testing with the Bonferroni method across all bins within each subfigure and are color-coded in the bottom track of each subfigure: red = adj. P-value ≤ 10^−15; orange = adj. P-value ≤ 10^−10; yellow = adj. P-value ≤ 10^−3.

Statistical analysis

All experiments were performed in at least three biological replicates. Results are reported as means ± standard error of the mean (SEM) unless stated otherwise. All graphics and statistics (except for RNA-seq and ChIP-seq) were generated using Microsoft Excel.

Data availability

All RNA-seq and ChIP-seq data have been submitted to the Gene Expression Omnibus (GEO) and are available under the accession GSE120072. A UCSC genome browser session showing the mapped RNA-seq and ChIP-seq data is available at: https://genome.ucsc.edu/s/CFriedel/CDK12.

Expanded View for this article is available online.

Acknowledgements

We thank all members of the Blazek laboratory for discussions throughout the project and helpful comments on the article. We also wish to thank Tomas Loja for help with flow cytometry, Kamila Reblova for help with ChIP-seq data visualization, Stjepan Uldrijan for P53 and Dasa Bohaciakova for phospho-P53 antibodies, and VBCF Vienna for sequencing. Core Facility Bioinformatics of CEITEC Masaryk University is gratefully acknowledged for obtaining the scientific data presented in this paper. Computational resources were provided by the CESNET LM2015042 and the CERIT Scientific Cloud LM2015085, provided under the programme “Projects of Large Research, Development, and Innovations Infrastructures”. The work was supported by the following grants: the project CZ-OPENSCREEN: National Infrastructure for Chemical Biology (identification code: LM2015063), the project no. LQ1605 from the National Program of Sustainability II (MEYS CR) to K.P.; the Czech Science Foundation (“17-13692S”), the CEITEC [Project “CEITEC-Central-European Institute of Technology” (CZ.1.05/1.1.00/02.0068)], the Grant agency of Masaryk university (MUNI/E/0514/2019) to D.B.; European Regional Development Fund, Project “MSCAfellow@MUNI” (CZ.02.2.69/0.0/0.0/17_050/0008496) to A.M.; the Deutsche Forschungsgemeinschaft [FR2938/7-1 and CRC 1123 (Z2)] to C.C.F; and Czech Science Foundation (17-17720S), Wellcome Trust Collaborative Grant (206292/E/17/Z), and National Program of Sustainability II (MEYS CR, project no. LQ1605) to L.K.

Author contributions

APCM, KPi, and MR performed experiments. MK performed bioinformatics analyses under supervision of CCF and with some input from JO and DB. DB conceived the study, acquired funding, and wrote the article with support of CCF, APCM, and KPi. KB and LK contributed to design of experiments, and PK synthesized THZ531 under supervision of KPa. All authors discussed the design of experiments, analyzed the data, and commented on the article.

Conflict of interest

The authors declare that they have no conflict of interest.

References

1. Fuda NJ, Ardehali MB, Lis JT (2009) Defining mechanisms that regulate RNA polymerase II transcription in vivo. Nature 461: 186–192
2. Harlen KM, Churchman LS (2017) The code and beyond: transcription regulation by the RNA polymerase II carboxy-terminal domain. Nat Rev Mol Cell Biol 18: 263–273
3. Adelman K, Lis JT (2012) Promoter-proximal pausing of RNA polymerase II: emerging roles in metazoans. Nat Rev Genet 13: 720–731
4. Buratowski S (2003) The CTD code. Nat Struct Biol 10: 679–680
5. Egloff S, Murphy S (2008) Cracking the RNA polymerase II CTD code. Trends Genet 24: 280–288
6. Eick D, Geyer M (2013) The RNA polymerase II carboxy-terminal domain (CTD) code. Chem Rev 113: 8456–8490
7. Zaborowska J, Egloff S, Murphy S (2016) The pol II CTD: new twists in the tail. Nat Struct Mol Biol 23: 771–777


8. Peterlin BM, Price DH (2006) Controlling the elongation phase of transcription with P-TEFb. Mol Cell 23: 297–305
9. Larochelle S, Amat R, Glover-Cutter K, Sanso M, Zhang C, Allen JJ, Shokat KM, Bentley DL, Fisher RP (2012) Cyclin-dependent kinase control of the initiation-to-elongation switch of RNA polymerase II. Nat Struct Mol Biol 19: 1108–1115
10. Drogat J, Hermand D (2012) Gene-specific requirement of RNA polymerase II CTD phosphorylation. Mol Microbiol 84: 995–1004
11. Blazek D, Kohoutek J, Bartholomeeusen K, Johansen E, Hulinkova P, Luo Z, Cimermancic P, Ule J, Peterlin BM (2011) The Cyclin K/Cdk12 complex maintains genomic stability via regulation of expression of DNA damage response genes. Genes Dev 25: 2158–2172
12. Bartkowiak B, Liu P, Phatnani HP, Fuda NJ, Cooper JJ, Price DH, Adelman K, Lis JT, Greenleaf AL (2010) CDK12 is a transcription elongation-associated CTD kinase, the metazoan ortholog of yeast Ctk1. Genes Dev 24: 2303–2316
13. Bosken CA, Farnung L, Hintermair C, Merzel Schachter M, Vogel-Bachmayr K, Blazek D, Anand K, Fisher RP, Eick D, Geyer M (2014) The structure and substrate specificity of human Cdk12/Cyclin K. Nat Commun 5: 3505
14. Cheng SW, Kuzyk MA, Moradian A, Ichu TA, Chang VC, Tien JF, Vollett SE, Griffith M, Marra MA, Morin GB (2012) Interaction of cyclin-dependent kinase 12/CrkRS with cyclin K1 is required for the phosphorylation of the C-terminal domain of RNA polymerase II. Mol Cell Biol 32: 4691–4704
15. Yu M, Yang W, Ni T, Tang Z, Nakadai T, Zhu J, Roeder RG (2015) RNA polymerase II-associated factor 1 regulates the release and phosphorylation of paused RNA polymerase II. Science 350: 1383–1386
16. Greifenberg AK, Honig D, Pilarova K, Duster R, Bartholomeeusen K, Bosken CA, Anand K, Blazek D, Geyer M (2016) Structural and functional analysis of the Cdk13/Cyclin K complex. Cell Rep 14: 320–331
17. Zhang T, Kwiatkowski N, Olson CM, Dixon-Clarke SE, Abraham BJ, Greifenberg AK, Ficarro SB, Elkins JM, Liang Y, Hannett NM et al (2016) Covalent targeting of remote cysteine residues to develop CDK12 and CDK13 inhibitors. Nat Chem Biol 12: 876–884
18. Davidson L, Muniz L, West S (2014) 3′ end formation of pre-mRNA and phosphorylation of Ser2 on the RNA polymerase II CTD are reciprocally coupled in human cells. Genes Dev 28: 342–356
19. Edwards MC, Wong C, Elledge SJ (1998) Human cyclin K, a novel RNA polymerase II-associated cyclin possessing both carboxy-terminal domain kinase and Cdk-activating kinase activity. Mol Cell Biol 18: 4291–4300
20. Kohoutek J, Blazek D (2012) Cyclin K goes with Cdk12 and Cdk13. Cell Div 7: 12
21. Ekumi KM, Paculova H, Lenasi T, Pospichalova V, Bosken CA, Rybarikova J, Bryja V, Geyer M, Blazek D, Barboric M (2015) Ovarian carcinoma CDK12 mutations misregulate expression of DNA repair genes via deficient formation and function of the Cdk12/CycK complex. Nucleic Acids Res 43: 2575–2589
22. Juan HC, Lin Y, Chen HR, Fann MJ (2016) Cdk12 is essential for embryonic development and the maintenance of genomic stability. Cell Death Differ 23: 1038–1048
23. Liang K, Gao X, Gilmore JM, Florens L, Washburn MP, Smith E, Shilatifard A (2015) Characterization of human cyclin-dependent kinase 12 (CDK12) and CDK13 complexes in C-terminal domain phosphorylation, gene transcription, and RNA processing. Mol Cell Biol 35: 928–938
24. Hoshii T, Cifani P, Feng Z, Huang CH, Koche R, Chen CW, Delaney CD, Lowe SW, Kentsis A, Armstrong SA (2018) A non-catalytic function of SETD1A regulates cyclin K and the DNA damage response. Cell 172: 1007–1021 e17
25. Eifler TT, Shao W, Bartholomeeusen K, Fujinaga K, Jager S, Johnson J, Luo Z, Krogan N, Peterlin BM (2014) CDK12 increases 3′ end processing of growth factor-induced c-FOS transcripts. Mol Cell Biol 35: 468–478
26. Ko TK, Kelly E, Pines J (2001) CrkRS: a novel conserved Cdc2-related protein kinase that colocalises with SC35 speckles. J Cell Sci 114: 2591–2603
27. Chen HH, Wang YC, Fann MJ (2006) Identification and characterization of the CDK12/cyclin L1 complex involved in alternative splicing regulation. Mol Cell Biol 26: 2736–2745
28. Tien JF, Mazloomian A, Cheng SG, Hughes CS, Chow CCT, Canapi LT, Oloumi A, Trigo-Gonzalez G, Bashashati A, Xu J et al (2017) CDK12 regulates alternative last exon mRNA splicing and promotes breast cancer cell invasion. Nucleic Acids Res 45: 6698–6716
29. Popova T, Manie E, Boeva V, Battistella A, Goundiam O, Smith NK, Mueller CR, Raynal V, Mariani O, Sastre-Garau X et al (2016) Ovarian cancers harboring inactivating mutations in CDK12 display a distinct genomic instability pattern characterized by large tandem duplications. Can Res 76: 1882–1891
30. Wu YM, Cieslik M, Lonigro RJ, Vats P, Reimers MA, Cao X, Ning Y, Wang L, Kunju LP, de Sarkar N et al (2018) Inactivation of CDK12 delineates a distinct immunogenic class of advanced prostate cancer. Cell 173: 1770–1782 e14
31. Menghi F, Barthel FP, Yadav V, Tang M, Ji B, Tang Z, Carter GW, Ruan Y, Scully R, Verhaak RGW et al (2018) The tandem duplicator phenotype is a prevalent genome-wide cancer configuration driven by distinct gene mutations. Cancer Cell 34: 197–210 e5
32. Rao M, Powers S (2018) Tandem duplications may supply the missing genetic alterations in many triple-negative breast and gynecological cancers. Cancer Cell 34: 179–180
33. Menghi F, Inaki K, Woo X, Kumar PA, Grzeda KR, Malhotra A, Yadav V, Kim H, Marquez EJ, Ucar D et al (2016) The tandem duplicator phenotype as a distinct genomic configuration in cancer. Proc Natl Acad Sci USA 113: E2373–E2382
34. Johnson SF, Cruz C, Greifenberg AK, Dust S, Stover DG, Chi D, Primack B, Cao S, Bernhardy AJ, Coulson R et al (2016) CDK12 inhibition reverses de novo and acquired PARP inhibitor resistance in BRCA wild-type and mutated models of triple-negative breast cancer. Cell Rep 17: 2367–2381
35. Joshi PM, Sutor SL, Huntoon CJ, Karnitz LM (2014) Ovarian cancer-associated mutations disable catalytic activity of CDK12, a kinase that promotes homologous recombination repair and resistance to cisplatin and poly(ADP-ribose) polymerase inhibitors. J Biol Chem 289: 9247–9253
36. Bajrami I, Frankum JR, Konde A, Miller RE, Rehman FL, Brough R, Campbell J, Sims D, Rafiq R, Hooper S et al (2014) Genome-wide profiling of genetic synthetic lethality identifies CDK12 as a novel determinant of PARP1/2 inhibitor sensitivity. Can Res 74: 287–297
37. Iniguez AB, Stolte B, Wang EJ, Conway AS, Alexe G, Dharia NV, Kwiatkowski N, Zhang T, Abraham BJ, Mora J et al (2018) EWS/FLI confers tumor cell synthetic lethality to CDK12 inhibition in ewing sarcoma. Cancer Cell 33: 202–216 e6
38. Malumbres M, Barbacid M (2009) Cell cycle, CDKs and cancer: a changing paradigm. Nat Rev Cancer 9: 153–166
39. Bertoli C, Skotheim JM, de Bruin RA (2013) Control of cell cycle transcription during G1 and S phases. Nat Rev Mol Cell Biol 14: 518–528


40. Bracken AP, Ciro M, Cocito A, Helin K (2004) E2F target genes: unraveling the biology. Trends Biochem Sci 29: 409–417
41. Tatsumi Y, Sugimoto N, Yugawa T, Narisawa-Saito M, Kiyono T, Fujita M (2006) Deregulation of Cdt1 induces chromosomal damage without rereplication and leads to chromosomal instability. J Cell Sci 119: 3128–3140
42. Liontos M, Koutsami M, Sideridou M, Evangelou K, Kletsas D, Levy B, Kotsinas A, Nahum O, Zoumpourlis V, Kouloukoussa M et al (2007) Deregulated overexpression of hCdt1 and hCdc6 promotes malignant behavior. Can Res 67: 10899–10909
43. Tsantoulis PK, Gorgoulis VG (2005) Involvement of E2F transcription factor family in cancer. Eur J Cancer 41: 2403–2414
44. Baxley RM, Bielinsky AK (2017) Mcm10: a dynamic scaffold at eukaryotic replication forks. Genes (Basel) 8: E73
45. Lopez MS, Kliegman JI, Shokat KM (2014) The logic and design of analog-sensitive kinases and their small molecule inhibitors. Methods Enzymol 548: 189–213
46. Larochelle S, Batliner J, Gamble MJ, Barboza NM, Kraybill BC, Blethrow JD, Shokat KM, Fisher RP (2006) Dichotomous but stringent substrate selection by the dual-function Cdk7 complex revealed by chemical genetics. Nat Struct Mol Biol 13: 55–62
47. Galbraith MD, Andrysik Z, Pandey A, Hoh M, Bonner EA, Hill AA, Sullivan KD, Espinosa JM (2017) CDK8 kinase activity promotes glycolysis. Cell Rep 21: 1495–1506
48. Bartkowiak B, Yan C, Greenleaf AL (2015) Engineering an analog-sensitive CDK12 cell line using CRISPR/Cas. Biochem Biophys Acta 1849: 1179–1187
49. Blazek D (2012) The cyclin K/Cdk12 complex: an emerging new player in the maintenance of genome stability. Cell Cycle 11: 1049–1050
50. Shiloh Y, Ziv Y (2013) The ATM protein kinase: regulating the cellular response to genotoxic stress, and more. Nat Rev Mol Cell Biol 14: 197–210
51. Subramanian A, Tamayo P, Mootha VK, Mukherjee S, Ebert BL, Gillette MA, Paulovich A, Pomeroy SL, Golub TR, Lander ES et al (2005) Gene set enrichment analysis: a knowledge-based approach for interpreting genome-wide expression profiles. Proc Natl Acad Sci USA 102: 15545–15550
52. Fragkos M, Ganier O, Coulombe P, Mechali M (2015) DNA replication origin activation in space and time. Nat Rev Mol Cell Biol 16: 360–374
53. Beltran M, Yates CM, Skalska L, Dawson M, Reis FP, Viiri K, Fisher CL, Sibley CR, Foster BM, Bartke T et al (2016) The interaction of PRC2 with RNA or chromatin is mutually antagonistic. Genome Res 26: 896–907
54. Jackson SP, Bartek J (2009) The DNA-damage response in human biology and disease. Nature 461: 1071–1078
55. Rahl PB, Lin CY, Seila AC, Flynn RA, McCuine S, Burge CB, Sharp PA, Young RA (2010) c-Myc regulates transcriptional pause release. Cell 141: 432–445
56. Sanso M, Fisher RP (2013) Pause, play, repeat: CDKs push RNAP II’s buttons. Transcription 4: 146–152
57. Ardehali MB, Yao J, Adelman K, Fuda NJ, Petesch SJ, Webb WW, Lis JT (2009) Spt6 enhances the elongation rate of RNA polymerase II in vivo. EMBO J 28: 1067–1077
58. Vos SM, Farnung L, Boehning M, Wigge C, Linden A, Urlaub H, Cramer P (2018) Structure of activated transcription complex Pol II-DSIF-PAF-SPT6. Nature 560: 607–612
59. Yoh SM, Cho H, Pickle L, Evans RM, Jones KA (2007) The Spt6 SH2 domain binds Ser2-P RNAPII to direct Iws1-dependent mRNA splicing and export. Genes Dev 21: 160–174
60. Anders S, Reyes A, Huber W (2012) Detecting differential usage of exons from RNA-seq data. Genome Res 22: 2008–2017
61. Gu B, Eick D, Bensaude O (2013) CTD serine-2 plays a critical role in splicing and termination factor recruitment to RNA polymerase II in vivo. Nucleic Acids Res 41: 1591–1603
62. Herzel L, Ottoz DSM, Alpert T, Neugebauer KM (2017) Splicing and transcription touch base: co-transcriptional spliceosome assembly and function. Nat Rev Mol Cell Biol 18: 637–650
63. Dubbury SJ, Boutz PL, Sharp PA (2018) CDK12 regulates DNA repair genes by suppressing intronic polyadenylation. Nature 564: 141–145
64. Singh J, Padgett RA (2009) Rates of in situ transcription and splicing in large human genes. Nat Struct Mol Biol 16: 1128–1133
65. Fitz J, Neumann T, Pavri R (2018) Regulation of RNA polymerase II processivity by Spt5 is restricted to a narrow window during elongation. EMBO J 37: e97965
66. Jonkers I, Kwak H, Lis JT (2014) Genome-wide dynamics of Pol II elongation and its interplay with promoter proximal pausing, chromatin, and exons. Elife 3: e02407
67. Branzei D, Szakal B (2017) Building up and breaking down: mechanisms controlling recombination during replication. Crit Rev Biochem Mol Biol 52: 381–394
68. Gaillard H, Garcia-Muse T, Aguilera A (2015) Replication stress and cancer. Nat Rev Cancer 15: 276–289
69. Zeman MK, Cimprich KA (2014) Causes and consequences of replication stress. Nat Cell Biol 16: 2–9
70. Choi SH, Martinez TF, Kim S, Donaldson C, Shokhirev MN, Saghatelian A, Jones KA (2019) CDK12 phosphorylates 4E-BP1 to enable mTORC1-dependent translation and mitotic genome stability. Genes Dev 33: 418–435
71. Paculova H, Kramara J, Simeckova S, Fedr R, Soucek K, Hylse O, Paruch K, Svoboda M, Mistrik M, Kohoutek J (2017) BRCA1 or CDK12 loss sensitizes cells to CHK1 inhibitors. Tumour Biol 39: 1010428317727479
72. Lei T, Zhang P, Zhang X, Xiao X, Zhang J, Qiu T, Dai Q, Zhang Y, Min L, Li Q et al (2018) Cyclin K regulates prereplicative complex assembly to promote mammalian cell proliferation. Nat Commun 9: 1876
73. Qiu H, Hu C, Hinnebusch AG (2009) Phosphorylation of the Pol II CTD by KIN28 enhances BUR1/BUR2 recruitment and Ser2 CTD phosphorylation near promoters. Mol Cell 33: 752–762
74. Devaiah BN, Lewis BA, Cherman N, Hewitt MC, Albrecht BK, Robey PG, Ozato K, Sims III RJ, Singer DS (2012) BRD4 is an atypical kinase that phosphorylates serine2 of the RNA polymerase II carboxy-terminal domain. Proc Natl Acad Sci USA 109: 6927–6932
75. Booth GT, Parua PK, Sanso M, Fisher RP, Lis JT (2018) Cdk9 regulates a promoter-proximal checkpoint to modulate RNA polymerase II elongation rate in fission yeast. Nat Commun 9: 543
76. Egloff S, O’Reilly D, Chapman RD, Taylor A, Tanzhaus K, Pitts L, Eick D, Murphy S (2007) Serine-7 of the RNA polymerase II CTD is specifically required for snRNA gene expression. Science 318: 1777–1779
77. Ran FA, Hsu PD, Wright J, Agarwala V, Scott DA, Zhang F (2013) Genome engineering using the CRISPR-Cas9 system. Nat Protoc 8: 2281–2308
78. Gratzner HG (1982) Monoclonal antibody to 5-bromo- and 5-iododeoxyuridine: a new reagent for detection of DNA replication. Science 218: 474–475
79. Schlacher K, Christ N, Siaud N, Egashira A, Wu H, Jasin M (2011) Double-strand break repair-independent role for BRCA2 in

28 of 29 EMBO reports 20:e47592 | 2019 ª 2019 The Authors Anil Paul Chirackal Manavalan et al EMBO reports

blocking stalled replication fork degradation by MRE11. Cell 145: 86. Kluge M, Friedel CC (2018) Watchdog – a workflow management system 529 – 542 for the distributed analysis of large-scale experimental data. BMC Bioin- 80. Bonfert T, Kirner E, Csaba G, Zimmer R, Friedel CC (2015) ContextMap 2: formatics 19: 97 fast and accurate context-based RNA-seq mapping. BMC Bioinformatics 87. Gruber AR, Martin G, Keller W, Zavolan M (2014) Means to an end: 16: 122 mechanisms of alternative polyadenylation of messenger RNA precur- 81. Li H, Durbin R (2009) Fast and accurate short read alignment with sors. Wiley Interdiscip Rev RNA 5: 183 – 196 Burrows-Wheeler transform. Bioinformatics 25: 1754 – 1760 88. Quinlan AR, Hall IM (2010) BEDTools: a flexible suite of utilities for 82. Liao Y, Smyth GK, Shi W (2014) featureCounts: an efficient general comparing genomic features. Bioinformatics 26: 841 – 842 purpose program for assigning sequence reads to genomic features. 89. Hahne F, Ivanek R (2016) Visualizing genomic data using gviz and Bioinformatics 30: 923 – 930 bioconductor. Methods Mol Biol 1418: 335 – 351 83. Love MI, Huber W, Anders S (2014) Moderated estimation of fold change 90. R Core Team (2016) R: a language and environment for statistical and dispersion for RNA-seq data with DESeq2. Genome Biol 15: 550 computing. Vienna, Austria: R Foundation for Statistical Computing 84. Benjamini Y, Hochberg Y (1995) Controlling the false discovery rate – a practical and powerful approach to multiple testing. J R Stat Soc Series License: This is an open access article under the B Stat Methodol 57: 289 – 300 terms of the Creative Commons Attribution 4.0 85. Eden E, Navon R, Steinfeld I, Lipson D, Yakhini Z (2009) GOrilla: a tool License, which permits use, distribution and reproduc- for discovery and visualization of enriched GO terms in ranked gene tion in any medium, provided the original work is lists. BMC Bioinformatics 10: 48 properly cited.


Expanded View Figures

Figure EV1. Preparation and characterization of AS CDK12 HCT116 cell line.
A Depiction of CDK12 locus, genome editing, and genotyping strategy. Schema of CDK12 locus, with exon numbers shown above the CDK12 gene depiction (top). Primers used for genotyping PCR surrounding exon 6 of CDK12 gene are shown as horizontal arrows, PCR product is depicted as full horizontal line, and BslI restriction sites are indicated by vertical arrows. BslI restriction site created by genome editing is shown in green. Size (bp) of genotyping PCR product and BslI restriction fragments are indicated (middle). DNA subjected to genome editing and corresponding protein sequences in exon 6 of CDK12 genes are shown; the underlined DNA sequence in WT CDK12 allele underwent genome editing to create silent mutation preventing alternative splicing (nucleotide in blue), BslI restriction site, and to convert F813 to G813 (nucleotides in red) in AS CDK12. Engineered G813 in AS CDK12 is indicated in red (bottom).
B Characterization of AS CDK12 clone by an AS primer-specific PCR. Exon 6 in CDK12 gene is shown as a black box. Edited DNA in the AS CDK12 is marked by a red vertical line in the exon 6. Genotyping primers specific for WT (black arrows) and AS CDK12 (red arrow) are shown, and genotyping PCR product is depicted by a dashed line with size (in bp) indicated above (top). Ethidium bromide-stained agarose gel visualizing 352 bp PCR product from PCR mixture using either WT- (left) or AS-specific (right) forward primer (bottom).
C CCNK/CDK12 complex shows comparable properties in the AS and WT CDK12 HCT116 cell lines. Western blot analysis of protein levels (input) and association [determined by immunoprecipitation (IP)] of CCNK and CDK12 in the indicated cell lines. No Ab corresponds to a control immunoprecipitation without antibody. A representative image of three replicates is shown.
D Quantification of individual P-Ser modifications in the CTD of RNAPII after CDK12 inhibition. Amounts of individual proteins and CTD modifications presented in Fig 1D and in another two biological replicates from short film exposures were quantified by ImageJ software. All protein levels were normalized to a corresponding tubulin loading control, and samples without treatment in each time point (CTRL) were considered as 1; n = 3 biological replicates and error bars are standard error of the mean (SEM).
Source data are available online for this figure.
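The normalization described for panel D (band intensity divided by the tubulin loading control, then expressed relative to the untreated control at the same time point, with SEM over three replicates) can be sketched as follows. This is only an illustration of the arithmetic, not the authors' ImageJ workflow; the function names and all intensity values are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def normalize_to_control(signal, tubulin, control_signal, control_tubulin):
    """Tubulin-normalized band intensity relative to the untreated control (control = 1)."""
    return (signal / tubulin) / (control_signal / control_tubulin)


def sem(values):
    """Standard error of the mean: sample standard deviation divided by sqrt(n)."""
    return stdev(values) / sqrt(len(values))


# Hypothetical intensities for one time point, three biological replicates:
# (P-Ser2 treated, tubulin treated, P-Ser2 control, tubulin control)
replicates = [
    (80.0, 100.0, 100.0, 100.0),
    (90.0, 120.0, 110.0, 110.0),
    (70.0, 100.0, 100.0, 125.0),
]

relative = [normalize_to_control(*r) for r in replicates]
print("mean relative level:", round(mean(relative), 3))
print("SEM:", round(sem(relative), 3))
```

Setting each control to 1 per time point, as the legend states, means every treated sample is compared only with the untreated sample collected at the same time.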


Figure EV2. CDK12 kinase activity is essential for optimal G1/S progression.
A 3-MB-PP1 does not affect cell cycle progression in WT HCT116 cells. The experiment was performed as shown in Fig 2A. n = 3; representative result is shown.
B THZ531 causes G1/S progression defect in WT HCT116 cells arrested by serum starvation. Flow cytometry profiles of control (−THZ531) or 350 nM THZ531 (+THZ531)-treated cells from the experiment outlined in Fig 2A. Red arrow points to the onset of the G1/S progression defect in THZ531-treated cells. n = 3 replicates; representative result is shown.
C CDK12 inhibition delays G1/S progression in thymidine/nocodazole-arrested AS CDK12 HeLa cells. Flow cytometry profiles of control (−3-MB-PP1) or 3-MB-PP1 (+3-MB-PP1)-treated cells from the experiment shown in Fig 2A. Red arrow points to the onset of the G1/S progression defect in 3-MB-PP1-treated cells. n = 3 replicates; representative result is shown.
D Experimental outline. AS CDK12 HCT116 cells were arrested by serum starvation for 72 h and released into the serum-containing medium with (+) or without (−) 3-MB-PP1. 3-MB-PP1 was washed away and replaced with fresh medium at indicated times after the release, and all samples were subjected to flow cytometry analyses at 15 h after the release.
E G1/S progression delay can be rescued by removal of CDK12 inhibitor at early G1 phase. Flow cytometry profiles of propidium iodide-labeled cells from the experiment depicted in Fig EV2D. CTRL = control samples without the 3-MB-PP1. n = 3 replicates; representative result is shown.


Figure EV3. CDK12 catalytic activity controls expression of core DNA replication genes.
A CDK12 inhibition down-regulates DNA replication-related genes. GSEA analysis based on log2 fold-changes in 3′ end RNA-seq data upon CDK12 inhibition. Normalized enrichment scores (NES) are shown for significant GO terms (FDR q-val < 0.05) with negative NES, i.e., associated with down-regulation. Functions related to DNA replication are marked by the red rectangles.
B Expression of crucial DNA replication genes is dependent on the CDK12 kinase activity. Comparison of log2 fold-changes versus log2 mean expression in 3′ end RNA-seq data depicts down-regulated DNA replication genes (−0.85 > log2 fold-change, P < 0.01) after 5-h CDK12 inhibition.
C Validation of 3′ end RNA-seq for select non-regulated genes by RT–qPCR. See Fig 3D for legend. n = 3 replicates, error bars represent SEM.
D Inhibition of CDK12 kinase does not affect mRNA degradation of select DNA repair and replication transcripts. AS CDK12 HCT116 cells were treated with ActD (1 μg/ml) either in the presence (red line) or absence (CTRL) (blue line) of 3-MB-PP1. Total mRNA was isolated at indicated time points, and levels of indicated mRNAs normalized to HPRT1 were measured by RT–qPCR. Graphs present mRNA levels relative to untreated cells (time 0 h set to 1). n = 3 independent experiments, error bars are SEM.
E Expression of core DNA replication proteins is dependent on the CDK12 kinase activity. See legend in Fig 3E.
F, G CCNK depletion diminishes mRNA and protein expression of DNA replication genes. RT–qPCR of mRNA levels (F) and Western blot of protein levels (G) in AS CDK12 HCT116 cells treated with control (CTRL) or CCNK siRNAs for 36 h. mRNA levels were normalized to GAPDH mRNA expression. n = 3 replicates for RT–qPCR (F), error bars indicate SEM. In (G), a representative experiment from three replicates is shown.
Source data are available online for this figure.
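The selection of down-regulated genes in panel B amounts to a two-threshold filter. The sketch below assumes the cutoff is a log2 fold-change below −0.85 together with P < 0.01 (the negative sign is an assumption, since the extracted legend does not show it); the gene names and values are placeholders, not data from the study.

```python
def select_down_regulated(rows, lfc_cutoff=-0.85, p_cutoff=0.01):
    """Return names of genes with log2 fold-change below lfc_cutoff and P below p_cutoff."""
    return [name for name, log2fc, p in rows if log2fc < lfc_cutoff and p < p_cutoff]


# Placeholder table: (gene, log2 fold-change, P value)
rows = [
    ("GENE_A", -1.40, 0.0004),  # passes both thresholds
    ("GENE_B", -0.90, 0.0300),  # fold-change passes, P does not
    ("GENE_C", -0.50, 0.0010),  # P passes, fold-change does not
    ("GENE_D", 0.70, 0.0020),   # up-regulated
    ("GENE_E", -0.86, 0.0090),  # passes both thresholds
]

down = select_down_regulated(rows)
print(down)  # ['GENE_A', 'GENE_E']
```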


Figure EV4. CDK12 directs expression of replication and DNA damage response genes downstream of the E2F/RB pathway.
A CDK12 directs expression of DNA replication genes downstream of the E2F/RB pathway. Graphs present ChIP-qPCR data for E2F1 and E2F3 in AS CDK12 HCT116 cells either treated or not with 3-MB-PP1 for 4 h. qPCR primers were designed at promoters of indicated genes. n = 3 replicates; error bars represent SEM. Ir is intergenic region; noAb corresponds to no antibody immunoprecipitation control.
B CDK12 inhibition does not lead to differential recruitment of RNAPII to E2F target genes. The plots show log2 fold-changes of RNAPII occupancy on promoters of E2F target genes (y-axis) plotted against corresponding log2 fold-changes in mRNA expression from nuclear RNA-seq (x-axis). Promoter occupancy was quantified as read counts in the 3 kb regions around the transcription start site (TSS). For each gene, we selected the transcript with the most read counts in the RNAPII ChIP-seq samples (normalized to library size) in the 3 kb regions around the TSS and transcription termination site (TTS). Corresponding RNAPII ChIP-seq and nuclear RNA-seq experiments are presented in Fig 5A and B. E2F target genes were obtained from Bracken et al [40]; rho = Spearman rank correlation coefficient.
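The comparison in panel B rests on two computations: a log2 fold-change of promoter read counts and a Spearman rank correlation against expression changes. The following is a minimal, self-contained sketch of both, not the pipeline used in the study. The pseudocount of 1 is an assumption added to avoid division by zero, the counts are assumed to be already normalized to library size, and all numbers are made up.

```python
from math import log2, sqrt


def log2_fold_change(treated, control, pseudocount=1.0):
    """log2 ratio of treated to control counts; the pseudocount is an assumption."""
    return log2((treated + pseudocount) / (control + pseudocount))


def ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def spearman_rho(x, y):
    """Spearman rank correlation: Pearson correlation computed on the ranks."""
    rx, ry = ranks(x), ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sqrt(sum((a - mx) ** 2 for a in rx))
    sy = sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (sx * sy)


# Made-up promoter counts (control, treated) and expression log2 fold-changes
promoter_counts = [(120, 115), (300, 310), (80, 95), (45, 40), (210, 190)]
expression_lfc = [-1.2, -0.4, -0.9, -1.5, -0.2]

occupancy_lfc = [log2_fold_change(t, c) for c, t in promoter_counts]
print("rho =", round(spearman_rho(expression_lfc, occupancy_lfc), 3))
```

A rho close to zero in such a plot would indicate that changes in promoter occupancy do not track changes in expression, which is the kind of conclusion the panel title draws.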


Figure EV5. Inhibition of CDK12 leads to diminished RNAPII processivity on down-regulated genes.
A High correlation between gene expression changes in nuclear and 3′ end RNA-seq data. Graph compares log2 fold-changes in nuclear and 3′ end RNA-seq data determined with DESeq2. rho = Spearman rank correlation coefficient.
B Inhibition of CDK12 affects the expression of similar subsets of genes in nuclear and 3′ end RNA-seq data. See Fig 5A for legend. Venn diagrams are shown for significantly down-regulated (log2 fold-change < 0, P ≤ 0.01) and up-regulated (log2 fold-change > 0, P ≤ 0.01) genes.
C P-Ser5 occupancy shows shifts after CDK12 inhibition. Metagene analysis of P-Ser5 ChIP-seq data as described in Fig 5B and C.
D SPT6 shows diminished relative occupancy at 3′ ends of down-regulated genes upon CDK12 inhibition. Metagene analysis of SPT6 ChIP-seq data as described in Fig 5B and C.
E CDK12 inhibition does not affect SPT6/RNAPII association in cells. Western blot analyses of SPT6 and RNAPII interaction after 4-h treatment with the 3-MB-PP1 in AS CDK12 HCT116 cells. Representative image from three replicates is shown.
Source data are available online for this figure.


Published online 3 March 2020
NAR Cancer, 2020, Vol. 2, No. 1
doi: 10.1093/narcan/zcaa003

CDK12: cellular functions and therapeutic potential of versatile player in cancer

Kveta Pilarova, Jan Herudek and Dalibor Blazek*

Central European Institute of Technology (CEITEC), Masaryk University, 62500 Brno, Czech Republic

Downloaded from https://academic.oup.com/narcancer/article-abstract/2/1/zcaa003/5775299 by Masarykova Univerzita user on 23 April 2020

Received January 15, 2020; Revised February 14, 2020; Editorial Decision February 14, 2020; Accepted February 20, 2020

ABSTRACT

Cyclin-dependent kinase 12 (CDK12) phosphorylates the C-terminal domain of RNA polymerase II and is needed for the optimal transcription elongation and translation of a subset of human protein-coding genes. The kinase has a pleiotropic effect on the maintenance of genome stability, and its inactivation in prostate and ovarian tumours results in focal tandem duplications, a CDK12-unique genome instability phenotype. CDK12 aberrations were found in many other malignancies and have the potential to be used as biomarkers for therapeutic intervention. Moreover, the inhibition of CDK12 emerges as a promising strategy for treatment in several types of cancers. In this review, we summarize mechanisms that CDK12 utilizes for the regulation of gene expression and discuss how the perturbation of CDK12-sensitive genes contributes to the disruption of cell cycle progression and the onset of genome instability. Furthermore, we describe tumour-suppressive and oncogenic functions of CDK12 and its potential as a biomarker and inhibition target in anti-tumour treatments.

INTRODUCTION

Cyclin-dependent kinase 12 (CDK12) was discovered as a candidate transcription and splicing machinery component by the Jonathon Pines lab in 2001 (1). Its heterodimer partner, cyclin K (CycK), was identified in a screen of proteins that can rescue the G1 progression defect and was associated with a strong kinase activity towards the C-terminal domain (CTD) of RNA polymerase II (RNAPII) (2). Nevertheless, it took until 2010 for the Arno Greenleaf lab to show that CycK and CDK12 are part of one complex functioning as an elongation-associated CTD kinase in Drosophila (3). This finding, together with discoveries that CDK12 is among few recurrently mutated genes in ovarian carcinoma (4) and regulates genome stability via regulating the transcription of key DNA repair genes (5), sparked research interest in the cellular functions of CDK12.

Human CDK12 (also CRKRS, CRK7 or CRKR) is a 1490-amino-acid-long, ∼160-kDa protein consisting of a centrally located kinase domain and intrinsically disordered regions with N-terminal arginine/serine-rich (RS) and central and C-terminal proline-rich (PR) motifs (1,6) (Figure 1). The kinase domain contains a C-terminal kinase extension typical for CTD kinases involved in the regulation of transcription elongation (7). Its flexibility directs ATP binding (7–9) and is important for the catalytic activity of the kinase (7,10). CDK12 exerts its kinase activity only when associated with CycK, as documented by structural studies and similar changes in gene expression after depletion of the proteins (5,7). CycK is a ∼70-kDa protein consisting of two classical cyclin boxes mediating CDK12 association and a C-terminal extension rich in PR motifs with unknown functions (3,5,6,11,12) (Figure 1). Notably, CycK in higher metazoans also associates with CDK13, a kinase functionally distinct from CDK12 (5,8,13) despite the 93% sequence homology of their kinase domains (6). The functional differences are likely attributed to their N- and C-terminal extensions that are unusual for CDKs and whose sequences are largely different between both kinases (6) (Figure 1). CDK12 and CycK are ubiquitously expressed in human tissues (1,2) and their null mice die at an early stage of development (5,14), indicating an essential role of the proteins in the adult as well as during development.

CDK12 AS A CTD KINASE AND REGULATOR OF HUMAN GENE TRANSCRIPTION

RNAPII directs the transcription of protein-coding genes in a process consisting of initiation, promoter-proximal pausing, elongation and termination (15–18). It contains an unstructured CTD with 52 repeats of the evolutionarily conserved heptapeptide YSPTSPS, where individual serines (Ser2, Ser5, Ser7), threonine 4 (Thr4) and tyrosine 1 (Tyr1) are phosphorylated (19–21). The CTD modifications are necessary not only for the regulation of the transcription cycle, but also for the coupling of transcription with co-transcriptional processes such as splicing and 3′ end formation and processing (22–25). The phosphorylation of Ser5 (P-Ser5) marks RNAPII initiation and is required for establishing pausing (20,21). The phosphorylation of Ser2

*To whom correspondence should be addressed. Tel: +420 730 588 450; Email: [email protected]

© The Author(s) 2020. Published by Oxford University Press on behalf of NAR Cancer. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.

[Figure 1 schematic. CDK12 (amino acids 1–1490): four RS motifs, PRM, kinase domain with C-terminal extension, PRM. CDK13 (amino acids 1–1512): PRM, A, four RS motifs, A, kinase domain with C-terminal extension, PRM, SR. CycK (amino acids 1–580): cyclin box 1, cyclin box 2, PRM. Intrinsically disordered regions are marked above each scheme.]

Figure 1. Domain composition and structure of human CDK12, CDK13 and CycK proteins. RS, PRM, SR and A correspond to arginine/serine-, proline-, serine- and alanine-rich domains, respectively. The kinase domains and their C-terminal extensions in CDK12 and CDK13 are indicated in dark and light blue, respectively. Numbers below the depictions indicate the amino acid position for a given domain. The regions of the proteins that are predicted to be intrinsically disordered are marked above their schemes.

(P-Ser2) is associated with active elongation and is needed for coupling transcription to mRNA processing. The transcription and CTD phosphorylation are regulated by several CDKs, but relatively little is known about the specific contributions of CDK12, although recent studies using novel experimental tools and genome-wide approaches have started to uncover some of its secrets.

Role of human CDK12 in the phosphorylation of CTD

The role of human CDK12 in the phosphorylation of CTD remains controversial, as various experimental approaches provide different outcomes. This is further complicated by the well-known limitations of the use of phospho-CTD (P-CTD) antibodies (26), and by what seem to be different roles of human CDK12 and its homologues [Ctk1 in Saccharomyces cerevisiae (27,28), Lsk1 in Schizosaccharomyces pombe (29,30), CDK12 in Caenorhabditis elegans (31) and Drosophila (3)] in CTD phosphorylation. While the homologues are responsible for the majority of Ser2 modification in their species, their human counterpart is not, perhaps due to a partial mutual redundancy with other candidate P-Ser2 CTD kinases and elongation factors [CDK9 (32,33), CDK13 (8), CDK11 (Gajduskova et al., in revision) and BRD4 (34)] on at least a subset of genes. This may complicate experimental conclusions in human cells.

Earlier studies in human cells indicated that longer term (days) CDK12 depletion leads to a modest decrease (3,5,11) or little change (13) in bulk P-Ser2. Distinct in vitro kinase assays showed that CDK12 can phosphorylate either Ser2 and Ser5 (5,7,8,11) or Ser5 and Ser7 (7). These experiments led to the conclusion that CDK12 is a promiscuous CTD kinase in vitro (7,35). Interestingly, the pre-phosphorylation of Ser7 further enhanced CDK12 kinase activity (7), while incubation with prolyl isomerase did not significantly affect it (8,35). The depletion of CDK12 coupled with ChIP-qPCR showed a decreased occupancy of P-Ser2 RNAPII at the 3′ ends of c-MYC and c-FOS genes (36,37).

Recent studies have used short (hours) CDK12-selective inhibition in cells (38–40) and P-CTD levels were measured with phospho-specific antibodies in total cell lysate. Covalent CDK12/CDK13 inhibitor THZ531 revealed either no changes in the bulk CTD phosphorylation with low doses (<100 nM) (38,40) or a decrease in P-Ser2 (38,40) and P-Thr4 (40) with higher doses (>200 nM). Inhibition with a competitive ATP analogue in analogue-sensitive (AS) CDK12 cell lines showed no strong changes in the bulk CTD phosphorylation, a slight decrease in P-Ser5 and especially P-Ser7, and a surprisingly slight accumulation of P-Ser2 (39,41). The differences in these findings can be explained either by off-targets of higher concentrations of THZ531 [CDK13, transcription-related JNK kinases (38) and perhaps other elongation kinases] or by a residual kinase activity in the presence of a competitive ATP analogue. Treatment with low doses (<100 nM) of novel competitive CDK12/CDK13 inhibitor SR-4835 results in a slight decrease in bulk Ser2 CTD phosphorylation (42). Since the low doses of THZ531, SR-4835 and inhibition of AS CDK12 kinase result in downregulation of a common subset of genes (DNA replication and repair genes) (38,39,42), it seems likely that the inhibition of human CDK12 causes relatively subtle changes in the CTD phosphorylation that are, however, critical for the optimal transcription of this subset of genes. Future experiments coupling short and CDK12-specific inhibition with mass spectrometric analyses of the CTD phosphorylation will likely provide more definitive answers to the conundrum of which residues and repeats in the CTD are modified by CDK12 in human cells (43,44).

How do CDK12 perturbations affect gene expression?

In most studies, siRNA-mediated CDK12 depletion, the inhibition of AS CDK12 or low concentrations of THZ531 only led to expression changes in a subset of human genes (hundreds to thousands) rather than affecting global transcription (5,8,38–40,45–48). The optimal expression of long genes, mainly groups of DNA repair, replication and cell cycle genes, is particularly dependent on CDK12 (5,13,39,40,45,46). Notably, the depletion of CDK12 in three breast cancer cell lines led to differential expression of distinct genes; however, these cellular processes were confirmed to be commonly modulated by CDK12 in all three cell lines (46). CDK12 in Drosophila transcriptome has a gene-specific rather than global role, in which its targets include NRF2-dependent genes mediating oxidative stress response (49). CDK12 also counteracts the heterochromatin enrichment in the Drosophila X chromosome by blocking HP1 protein binding, predominantly affecting the expression of long neuronal genes (50).

How does CDK12 regulate the transcription of its target genes?

Recent papers using different experimental models and novel tools have provided the first insights (Figure 2). Zhang et al. found CDK12 at promoters and gene bodies of protein-coding genes, largely overlapping with RNAPII occupancies (38), strongly supporting the idea that CDK12 travels with RNAPII and is an elongation-associated kinase (3). Inhibition with THZ531 led to a dose-dependent loss of RNAPII and its P-Ser2-modified forms from the 3′ ends of CDK12-sensitive genes, indicating an RNAPII elongation deficiency or termination defect (38). This was markedly different from an early elongation defect that blocks RNAPII release into gene bodies and is caused by CDK9 inhibition (32,51). It is worth noting that DNA repair genes were extremely sensitive to low THZ531 doses (<100 nM) at which bulk P-Ser2 levels were unaffected (38). The inducible depletion of CDK12 from mouse embryonic stem cells led to a global enhanced usage of intronic polyadenylation sites, resulting in the downregulation of full-length mRNA (protein) isoforms in favour of shorter ones. Since homologous recombination (HR) DNA repair genes carry more intronic polyadenylation sites than other expressed genes, their cumulative usage in the absence of CDK12 gave a rationale for their unusual sensitivity to CDK12 loss (45). The inhibition of CDK12 by THZ531 or its AS form exhibited a gene-length-dependent elongation defect leading to a shortening of transcripts by premature termination in genes with a high number of cryptic polyadenylation sites, especially in DNA repair and core DNA replication genes (39,40). Lower GC content and a lower ratio of U1 snRNA binding to polyadenylation sites were identified as additional determinants of genes sensitive to CDK12 inhibition (40). In agreement, U1 snRNP was implicated in the regulation of transcription elongation and in preventing the premature termination of long genes enriched in DNA repair and cell cycle progression functions (52). Furthermore, SNRNP70, a U1 snRNP interacting factor, is phosphorylated in a THZ531-sensitive manner (40). Hence, further research of U1 snRNP will likely provide additional mechanistic understanding of CDK12 functions. The inhibition of AS CDK12 led to accumulation and shifts of RNAPII–P-Ser2 ChIP-seq peaks from 3′ ends to the bodies of individual CDK12-sensitive genes, approximately to the positions where transcription was lost and where premature termination occurred (39). This coincided with slower elongation rates in gene bodies (39) (Figure 2). Since there is known to be a reciprocal relationship between high P-Ser2 signal and efficient 3′ end processing (21,36), it will be important to determine the relationship between premature 3′ end processing induced by the CDK12 perturbations and the factors involved (including the relevant P-Ser2 kinase). A growing list of elongation modulators and factors regulating premature termination including PCF11 and SCAF4/8 will be a good starting point for these investigations (53,54). Of note, CDK12 inhibition impacted neither the recruitment of SPT6, a key elongation factor (39), nor noticeably the distribution of H3K36me3, an elongation-associated histone mark (Chirackal Manavalan, Kluge, Friedel and Blazek, unpublished data).

In summary, it appears that CDK12 prevents premature termination in the bodies of its target genes by facilitating their optimal elongation.

How does CDK12 find its target genes?

Recent studies have provided the first suggestions to solve this conundrum (Figure 2). Yu et al. suggested that CDK12 is recruited by the PAF1C complex (55) (Figure 2), a critical regulator of pause release and elongation (56), to phosphorylate Ser2 in gene bodies (55). This model is consistent with the separate roles of yeast Ser2 kinases Bur1 and Ctk1 (CDK9 and CDK12 homologues), in mediating this modification at promoters and gene bodies, respectively (25,28). The PAF1C-mediated recruitment supports a role of CDK12 as a general transcription factor, but does not explain its gene-specific role in human cells. A recent study showed that PAF1C depletion leads to a gene-length-independent elongation defect together with an accumulation of P-Ser2 (and P-Ser5)-modified RNAPII in a 20–30-kb window downstream from transcription start sites (TSS) (57). This is reminiscent of the elongation defect in SPT5-depleted cells (58,59) and seems to be different from the accumulation of P-Ser2 predominantly at more distal regions of long CDK12-sensitive genes (39).

Hoshii et al. demonstrated that SETD1A, a histone methylase, recruits CycK to the promoters of DNA damage response genes (Figure 2) and regulates their expression in the S phase (60). This finding is consistent with CDK12 promoter occupancy (38) and its ability to regulate the transcription of DNA replication genes and G1/S progression (39). However, the functional relevance of the SETD1A-mediated recruitment of CDK12 to gene promoters for optimal elongation in the bodies of its target genes remains to be determined. Perhaps, this recruitment might also regulate the optimal CDK12-dependent release of promoter-

[Figure 2 schematic. Top panel, active CDK12: initiation, promoter-proximal pausing, productive elongation and termination at the canonical pA site; SETD1A, PAF1C and CycK/CDK12 are shown with RNAPII and its CTD (positions 2, 5 and 7); full-length mRNA is produced. Bottom panel, inactive CDK12: initiation, promoter-proximal pausing, slower elongation and premature termination at an internal pA site in long genes, pA-rich genes and genes with a low U1/pA ratio; shorter mRNA is produced.]

Figure 2. CDK12 kinase stimulates optimal transcription elongation and prevents premature termination of predominantly long, polyadenylation-site-rich genes. Working model: (top) the active CycK/CDK12 complex is recruited to the promoters (TSS) of genes by SETD1A and PAF1C proteins. CDK12 phosphorylates (thin black arrows) the CTD (light blue oval with two, five and seven in circles) of RNAPII, which results in productive elongation (thick black arrow) in gene bodies, the accumulation of P-Ser2 (2 in yellow circle) at the end of genes and optimal termination at canonical polyadenylation sites. Full-length mRNAs are produced. (Bottom) Upon CDK12 inhibition, the CTD is differentially phosphorylated (for simplicity not shown), which results in slower elongation (dotted black arrow) in the gene bodies of predominantly long, polyadenylation-site-rich genes with a low ratio of U1 snRNA binding to polyadenylation sites (U1/pA). P-Ser2 levels accumulate (2 in yellow circle) in the bodies of these genes approximately in the positions where RNAPII prematurely terminates (often at positions of internal cryptic polyadenylation sites). Shorter mRNAs are produced. pA = polyadenylation site.

paused RNAPII, as was observed on a few sample genes (39). Indeed, the depletion of SETD1A led to RNAPII accumulation at the selected target gene promoters (60).

Numerous studies found an association of CDK12 with mRNA processing components, including spliceosome factors, exon-junction complex (EJC) and other RNA-binding proteins (13,35,37,46). The functional significance of these protein interactions remains to be explored; however, the depletion of eIF4A3, an EJC component and CDK12 interactor, prevented the recruitment of CDK12 to the c-FOS gene and its 3′ end processing (61). Interestingly, an earlier study using a reporter system implied the ability of CycK to activate transcription via RNA (62). Whether nascent RNA (directly or indirectly) contributes to the recruitment of CycK/CDK12 to its target genes is an important question that remains to be answered.

CDK12 IN mRNA PROCESSING (SPLICING, 3′ END PROCESSING)

The N-terminal 414 amino acids containing RS motifs are required for the localization of CDK12 into nuclear speckles (1,5,63,64), a nuclear subdomain enriched with splicing and 3′ end processing factors (65), indicating a role of CDK12 in mRNA processing (1,5,64). This observation was later substantiated by the identification of numerous core spliceosomal and SR proteins, hnRNPs, 3′ end processing factors, components of EJC and other RNA-binding proteins as CDK12 interacting factors (13,35,37,46). Some of the factors were also identified as candidate CDK12 substrates (40). Thus, CDK12 was suggested to link mRNA processing via a dual mechanism, partly by direct association of the elongating kinase with mRNA processing factors (1,13,35), and partly by indirect P-CTD-mediated (CDK12-dependent) recruitment of splicing and 3′ end processing components (66,67). Studies using reporters or individual genes indeed documented that CDK12 regulates the splicing of an E1a reporter minigene (63), serine–arginine splicing factor 1 (SRSF1) (13), glial-specific neurexin IV genes (68) and 3′ end processing of c-MYC and c-FOS (36,37). However, the depletion of full-length CDK12 coupled with a splicing-sensitive microarray or RNA-seq did not show any global splicing defects (5,45,46). Another study found a role of CDK12 in the modulation of alternative last exon (ALE) splicing, a specialized subtype of alternative mRNA splicing (46). The defect in ALE splicing appeared to be gene and cell type specific, affecting hundreds of longer, exon-rich genes with impacted proximal last exons containing multiple polyadenylation motifs (46). The mechanistic role of full-length CDK12 in the specificity of ALE splicing, which seems to be independent of CDK12-driven RNAPII processivity defects, remains to be determined (46).

The inhibition of CDK12 also did not produce any global splicing deregulation (38–40); however, an increase in the splicing efficiency of predominantly long genes was noted (40). This was determined to be an indirect consequence of an elongation defect and shortening of transcripts (39,40). Hence, it seems possible to conclude that CDK12 kinase activity (towards the CTD or other substrates) does not have a major direct role in the regulation of global pre-mRNA splicing. Notably, Ctk1 was also not reported to regulate splicing in S. cerevisiae (69). Although it lacks the RS domain, it binds the only three SR proteins present in yeast (35,70). Ctk1 depletion strongly reduces P-Ser2 and affects the co-transcriptional recruitment of the 3′ end processing factors without exhibiting any transcriptional defects (71–73). Likewise, inhibition of the fission yeast non-essential Lsk1 led to a marked decrease in P-Ser2 levels without a significant effect on transcription (29). Only a slight impact on elongating RNAPII was noted just past the 3′ end cleavage polyadenylation termination region, likely a consequence of inefficient recruitment of 3′ processing machinery (29).

CDK12 IN TRANSLATION

Apart from its role in mRNA synthesis, CDK12 also regulates the translation of a subset (hundreds) of mRNAs (74) (Figure 3). In collaboration with mTORC1, CDK12 phosphorylates the translation repressor 4E-BP1 and controls the binding of translation initiation factor eIF4G, which facilitates the efficient translation of key subunits of centrosome, centromere and kinetochore complexes and also of CHK1, a key regulator of G1/S and G2/M progression (74). Hence, the role of CDK12 in translation in metazoans seems to be more specialized than its yeast Ctk1 homologue, which directs global translation during initiation steps as well as during elongation (75,76).

CDK12 IN REGULATION OF CELL CYCLE PROGRESSION AND CELLULAR PROLIFERATION

The ability of CycK to restore a cell cycle progression deficiency by substituting missing G1 cyclins in S. cerevisiae provided the very first hint that a CycK-associated kinase(s) can regulate cell cycle progression (2). In agreement with a role of CycK complexes in the regulation of cell cycle progression were findings that CycK is highly expressed in fast-growing stem cells (12) and little in non-proliferative tissues (77). Moreover, THZ531 and ATP analogue 3-MB-PP1 inhibited cellular proliferation in various cell lines and engineered AS CDK12 cells, respectively (38,41). A later study showed the arrest of synchronized cells in the G1 phase upon CycK depletion (78). The proposed mechanism suggested a direct role of the CDK12-dependent phosphorylation of cyclin E1 in the pre-replication complex assembly (independently of CDK12-regulated transcription) (78). Inhibition of the CDK12 in synchronized cells carrying AS CDK12 alleles showed that the kinase activity is required for optimal G1/S progression (39). The inhibition affected the transcription elongation of many origin recognition and pre-replication complexes genes, including CDC6, CDT1, TOPBP1 and MTBP, resulting in their diminished protein levels, disrupted loading on chromatin, aberrant formation of pre-replication complexes and delay in G1/S progression (39) (Figure 4). These gene groups are regulated by E2F transcription factors (79), but their recruitment to the promoters is not affected by CDK12 inhibition (39). This points to a rate-limiting role of the transcription elongation in the regulation of many DNA replication and repair genes. Importantly, the short CDK12 inhibition also provided strong evidence for the G1/S progression defect being

Figure 3 diagram labels. Active CDK12 (left): CDK12/CycK, mTORC1, phosphorylation, P-4E-BP1, eIF4G, eIF4E, mRNA, translation ON. Inactive CDK12 (right): CDK12/CycK, mTORC1, 4E-BP1, eIF4G, eIF4E, mRNA, translation OFF. Both panels refer to a subset of mRNAs.

Figure 3. CDK12 regulates optimal translation of subset of mRNAs. (Left) CDK12 in collaboration with mTORC1 kinase phosphorylates the translational repressor 4E-BP1, which leads to its release from 5′ cap (black oval)-bound eIF4E. Subsequent recruitment of the eIF4G translation initiation complex to the eIF4E on the subset of mRNAs results in their efficient translation. (Right) CDK12 depletion results in a diminished phosphorylation of 4E-BP1, which stays bound to eIF4E and blocks the recruitment of eIF4G, which prevents the translation of the CDK12-specific subset of mRNAs.

independent of the secondary activation of DNA damage pathways, which occurs well after the cell cycle progression defect (39).

CycK or CDK12 knockdown also leads to an accumulation of cells in G2/M, which was initially interpreted to be a secondary effect of the activation of the DNA damage cell cycle checkpoint (80) after the diminished expression of CDK12-dependent DNA repair genes (5). However, a recent study has shown that CycK depletion also decreases the expression of Aurora B kinase, a key mitotic regulator, and leads to the onset of Aurora B-dependent mitotic catastrophe and G2/M arrest (81). As the translation of many other core mitotic regulatory proteins is critically dependent on CDK12, it can be concluded that aberrant catalytic activity of CDK12 results in defects in multiple steps of mitosis and a severe mitotic defect (74) (Figure 4).

In summary, recent studies provided strong support for several fundamental roles of CycK/CDK12 in the regulation of the cell cycle and cellular proliferation.

THE REGULATION OF GENOME STABILITY BY CDK12 AND FOCAL TANDEM DUPLICATIONS

As CDK12 directs several cellular processes whose disruption triggers DNA damage, it likely plays a pleiotropic and cellular context-dependent role in the regulation of genome stability (Figure 4). CDK12 regulates the transcription of long DNA repair genes (such as BRCA1, BRCA2, ATR, ATM, Fanconi anaemia), predominantly those involved in the HR repair pathway (5,10,14,38,39,46,47,82). Thus, alterations in CDK12 generate a non-functional HR pathway, endogenous DNA damage, genome instability and sensitivity to DNA damage agents (5,10,82,83). The depletion or inhibition of CDK12 also results in defects in DNA replication and G1/S progression (39,78), likely leading to replication stress that is another source of genome instability (84,85). Furthermore, disruption of CDK12-dependent translation network of components and regulators of spindle assembly checkpoint (74) also contributes to CDK12-dependent genome instability (86,87).

Analyses of ovarian and prostate tumours with biallelic inactivation of CDK12 revealed a unique genome instability phenotype characterized by copy-number gains (47,88–92). The focal tandem duplications have a bimodal distribution of ∼0.3–0.5- and ∼2.5–3.0-Mb-long duplicated segments throughout the genome, especially in gene-rich regions (88) (Figure 4). Importantly, the focal tandem duplications are distinct from duplications in BRCA1-deficient, cyclin E1-amplified and other HR/DNA repair-deficient tumours (47,88,89,91,93). The size of the duplicated segments corresponds to the length of replication domains, which is consistent with their origin in defective DNA re-replication in the S phase (47,88,94). In contrast to the depletion or inhibition of CDK12 in various cancer cell lines, the decreased expression of BRCA1/2 and some other HR factors was not found in ovarian and prostate CDK12-inactivated tumour samples (10,47,88,89). Consistently, the tumour gene expression profiles were distinct from HR-deficient tumours (47,88). Thus, the molecular mechanism of the genesis of focal tandem duplications cannot be solely caused by the aberrant expression of CDK12-dependent HR/DNA repair genes. There are numerous possible scenarios. Given that the specific inhibition of CDK12 kinase activity in the cancer cell line leads to the aberrant expression of HR and

Figure 4 diagram labels. Transcriptional defect, premature termination (CDK12/CycK, RNAPII, CTD, PAF1C, slower elongation, internal pA): DNA repair genes (HR repair genes), non-functional HR, HR defect; DNA replication genes (origin recognition and pre-replication complexes genes), G1/S progression defect, replication stress. Translational defect (CDK12/CycK, 4E-BP1, eIF4G, eIF4E, mRNA, translation OFF): subset of mRNAs (CHK1, subunits of centrosome, centromere and kinetochore complexes mRNAs), mitotic defect, chromosome misalignment and segregation defect. Pleiotropic effects on genome instability in cell lines and tumours. Focal tandem duplications (0.3-0.5 Mb, 2.5-3.0 Mb) in ovarian and prostate cancer.

Figure 4. Aberrant CDK12-regulated gene expression has pleiotropic role in onset of genome instability. The schema depicts multiple ways in which non-functional CDK12 causes genome instability. (Left) The premature transcription termination of many HR and DNA replication genes results in an inactive HR DNA repair pathway and G1/S progression defects. The subsequent onset of HR defects and replication stress contributes to genome instability in cells. In ovarian and prostate cancers with inactive CDK12, a parallel onset of HR defects and replication stress likely leads to the onset of a unique CDK12-specific genome instability phenotype, focal tandem duplications. (Right) Inactive CDK12 does not allow the release of 4E-BP1 from eIF4E, which leads to suboptimal translation of a subset of mRNAs, mainly CHK1 and several subunits of centrosome, centromere and kinetochore complexes. The subsequent mitotic defect is characterized by chromosome misalignment and segregation defects, which also contribute to cellular genome instability.

also many origin recognition and pre-replication complexes genes (39), it is possible that the onset of replication stress and deficient HR-mediated fork restart could lead to their genesis (39,95) (Figure 4). At the same time, a compensatory mechanism for the expression of CDK12-dependent HR genes (particularly BRCA1/2) in the tumour background must exist, and might even be essential for tumour onset and/or survival. Notably, CDK12 mutant prostate cancer tumours exhibit synthetic dependence on recurrent gains in several genes involved in the regulation of the cell cycle and DNA replication such as MCM7, CCND1 or RAD9A (88).

In summary, CDK12 inactivation in cell lines leads to the aberrant expression of many genes crucial in various pathways and processes essential for the maintenance of genome stability. However, CDK12-inactivated ovarian and prostate tumours present a unique genome instability phenotype, pointing to a unique deregulation of genome stability (likely by aberrant DNA replication-associated HR-dependent repair) leading to tumorigenesis. Interestingly, very recent analyses have revealed the presence of focal tandem duplications in many other cancers with a low incidence (<2%) of CDK12 inactivation (96), suggesting their occurrence in addition to ovarian and prostate cancers.

CDK12 ABERRATIONS IN TUMOURS

Genomic alterations in CDK12 were documented in ∼30 tumour types with an incidence of up to 15% of sequenced cases (97) with molecular consequences best studied in ovarian, breast and prostate cancers. CDK12 aberrations in tumours include mutations, deletions, amplifications, rearrangements and overexpression. The aberrations emerge as biomarkers for patient stratification and have the potential to guide therapeutic interventions. CDK12 is both a tumour suppressor and oncogene, and the functional outcomes of CDK12 aberrations are case and context dependent (46,97,98).

CDK12 as a tumour suppressor

The tumour-suppressive role of CDK12 is linked to its ability to maintain genome stability via regulating the transcription of DNA repair genes (5,10,14,38,39,46,47,82,99). The roles of CDK12 in the optimal transcription and translation of DNA replication genes (39,78) and mitotic regulators (74), respectively, significantly contribute to the tumour-suppressive function of CDK12. The inactivation of CDK12 (mutations and deletions in the kinase domain) results in the loss of catalytic activity and tumour-suppressive function of the kinase. The mutations of CDK12 in high-grade serous ovarian cancer (HGSOC) are mostly homozygous, indicating that they are driver mutations of a tumour suppressor (100).

CDK12 as an oncogene

Evidence is emerging that CDK12 participates in multiple oncogenic pathways and signalling. In line with the concept of transcriptional addiction (101), CDK12 as a transcriptional regulator of DNA damage and replication genes can improve the fitness of cancer cells, as they are often substantially dependent on the DNA repair system (102). Moreover, the overexpression of c-MYC, a central oncogene driving many tumours (103,104) and super-enhancer-associated transcription factor genes, including RUNX1 and MYB (38), depends on CDK12. CDK12 is co-located close to the tyrosine kinase receptor HER2 (also ERBB2 or EGFR2) at locus Ch17q12 (105) and is often co-amplified with this oncogene in HER2-positive (amplified) breast cancer (105,106). The resulting increased expression of CDK12 mRNA and protein accompanied by increased phosphorylation of CDK12 was suggested to drive the oncogenic activities of CDK12 in this type of breast cancer (106–109). Indeed, a recent study showed a CDK12-dependent transcriptional upregulation of IRS1 and WNT ligands leading to the activation of oncogenic ERBB–PI3K–AKT and WNT signalling pathways in HER2-positive breast cancer (110). It is worth noting that HER2-positive breast cancers are mutually exclusive with breast tumours carrying mutations in BRCA1, which is consistent with the incompatibility of overexpressed CDK12 during the genesis of HR-defective tumours (106). Another oncogenic property of overexpressed CDK12 leads to the downregulation of the DNAJB6-L protein (via ALE splicing of its mRNA), which promotes the cell migration and invasiveness of HER2-positive breast cancer cells (46). The CDK12-specific focal tandem duplications with a high number of duplications and fusions can lead to the differential expression of oncogenic drivers such as CCND1 or AR (androgen receptor) or c-MYC enhancers as a secondary genetic event of CDK12 inactivation (88,89,111).

THERAPEUTIC POTENTIAL OF CDK12

The potential of CDK12 as a major therapeutic anti-cancer target is gradually being discovered. Increasing knowledge of CDK12 aberrations in tumours, the roles of CDK12 in various cellular processes and the recent availability of CDK12 inhibitors have contributed to this exciting development. Numerous studies have started to reveal the cellular and genetic background that determines sensitivity to CDK12 inhibition (Table 1). This includes defects in the HR pathway, MYC overexpression, HER2 amplification and the expression of EWS/FLI fusion protein. Moreover, functional studies suggest that CDK12-mediated cell cycle and metabolic vulnerabilities and the CDK12-induced neoantigens load might be promising candidates/biomarkers for targeted CDK12-specific cancer therapy (Table 1).

Tumours with defects in the HR pathway: PARP inhibitors and platinum-based chemotherapies

Tumours sensitive to PARP inhibitors (PARPi) have a non-functional HR pathway due to aberrations in either BRCA1/2 or other components of the HR pathway (112–114). CDK12 was found to be one of the determinants of PARPi sensitivity in a genome-wide screen (82), consistent with its crucial role in the transcription of many HR genes. About 50% of HGSOC and triple-negative breast cancer (TNBC) are defective in the HR pathway, with CDK12 being among several recurrently mutated DNA damage response genes (4,115). CDK12 mutations tend to

Table 1. Overview of therapeutic sensitivity of tumours/tissues with indicated phenotype and CDK12 status

Tissue/tumour | CDK12 status | Related phenotype | Drug sensitivity | References
Breast (TNBC) | WT | HR-proficient | CDK12 inhibitors (dinaciclib, SR-4835) + PARP inhibitors (olaparib) or DNA-damaging drugs (platinum-based, doxorubicin, irinotecan) | (42,116)
Ovarian (HGSOC) and breast | Inactivating mutations; shRNA/siRNA depletion of WT | HR deficiency | PARP inhibitors, DNA-damaging drugs | (10,82,83)
Breast (HER2-positive) | Genomic disruption (out-of-frame rearrangements due to breakpoint of HER2 amplicon) | HR deficiency | PARP inhibitors | (108,109)
Metastatic osteosarcoma | WT | HR-proficient | CDK12 inhibitors (dinaciclib, THZ531) | (117)
Foreskin fibroblasts | siRNA depletion of WT | c-MYC overexpression | | (103)
Neuroblastoma and ovarian | WT | N-MYC/c-MYC amplification | CDK inhibitors (roscovitine, CR8, THZ1) | (104,124)
Breast (trastuzumab-resistant/sensitive HER2-positive) | Amplified | HER2 amplification | CDK12 inhibitor (dinaciclib, THZ531) | (110)
Hepatocellular carcinoma (sorafenib-treated) | WT | EGFR/HER3 and PI3K/AKT activation | CDK12 inhibitor (THZ531) | (126)
Ewing sarcoma | WT | EWS/FLI expression | CDK12 inhibitor (THZ531) + PARP inhibitors (olaparib) | (128)
Cancer cell lines (colon, ovarian) | siRNA depletion of WT | Replication stress | CHK1 inhibitors | (129)
Prostate (mCRPC) | Biallelic loss-of-function mutations | Focal tandem duplications, high neoantigen load | Immune checkpoint (PD-1) inhibitors | (88,94,132)
Ovarian (HGSOC) | siRNA depletion of WT | BRCA1 deficiency (HR deficiency-independent metabolic reprogramming) | Metabolic inhibitors | (135)

WT = wild-type CDK12.

be mutually exclusive with mutations in other HR genes (82), indicating they are a primary cause of the HR-deficient phenotype. Indeed, CDK12-inactivating aberrations found in HGSOC failed to support HR repair (10) and sensitized the cells to PARPi and platinum-based drugs (82,83). In a subset (∼14%) of HER2-positive breast cancer, the HER2 amplicon breakpoint converges on CDK12, disrupting its expression and leading to sensitivity to PARPi (108,109). Dinaciclib, a potent CDK12 inhibitor, reversed PARPi resistance in models of TNBC with mutated BRCA1/2, pointing to a possibility of the combinatorial use of CDK12 and PARP inhibitors (116). Likewise, treatment with SR-4835 in HR-competent TNBC led to the common suppression of DNA repair genes and synergistic promotion of sensitivity to PARPi and to various DNA-damaging agents, including cisplatinum, doxorubicin and the topoisomerase inhibitor irinotecan (42). In metastatic osteosarcoma (OS), a cancer with a high degree of genome instability, treatment with several CDK12 inhibitors led to the sensitivity of OS cell lines and a decrease in metastatic cell outgrowth in the lungs, perhaps partly due to the defective expression of DNA damage response genes (117).

Overall, since multiple PARPi have been approved for clinical use in breast and ovarian cancers with BRCA1/2 mutations (118), CDK12 mutations could be used as other biomarkers for their application. Moreover, CDK12 inhibitors have the potential to be used to reverse resistance to PARPi in tumours with residual HR activity and to enhance the efficiency of existing DNA-damaging drugs, including widely used platinum-based chemotherapies.

About 7% of metastatic castration-resistant prostate cancers (mCRPCs) carry aberrations in CDK12 and ∼24% have a non-functional HR pathway (88,119,120). This suggests sensitivity to PARPi in a considerable proportion of mCRPCs, including those with CDK12 aberrations. However, when compared to other HR-deficient mCRPCs, the CDK12-aberrant tumours were found to be transcriptionally, genetically and phenotypically different (88) and displayed more aggressive clinical behaviour (including higher Gleason scores at presentation, a shorter time to metastasis and CRPC) (121). This indicates that they represent a molecularly distinct subtype of mCRPC with potential for a different and/or more intensive therapeutic approach (88,121). Indeed, preliminary results of a clinical trial with PARPi rucaparib in mCRPC patients with HR gene mutations revealed that none of the patients with CDK12 alterations exhibited a response to the treatment, in contrast to patients with BRCA1/2 mutations (122). This suggests that the response of cancers with CDK12 aberrations to PARPi may be tumour type and context specific. It also points to the need for a more elaborate stratification of patients with HR-deficient tumours and the application of alternative treatments, such as immunotherapy as discussed later, for certain groups of patients with CDK12 aberrations.

Tumours with MYC overexpression

The MYC family of transcription factors is deregulated in >50% of human cancers. In normal cells, their expression is tightly regulated; in tumours, they are often amplified and overexpressed, and MYC-driven transcription programmes are central drivers of the disease. MYC proteins are considered directly ‘undruggable’ (123), but their overexpression is dependent on other regulators, including various transcription elongation factors that are druggable and provide a therapeutic opportunity (97,101,103). Indeed, c-MYC overexpression in fibroblasts is synthetically lethal with CDK12 depletion (103). The application of inhibitors that target multiple CDKs, including CDK12, led to the downregulation of N-MYC and c-MYC and their transcription programmes in MYC-driven neuroblastoma and ovarian cell lines, respectively (104,124). Additionally, CDK12 cooperates with c-MYC to promote the mTORC1-dependent translation of mRNAs of several oncogenic factors (74). In summary, the inhibition of CDK12 seems to be a promising tool for the treatment of various MYC-dependent cancers.

Tumours with HER2 amplification

HER2-positive breast cancers are treated with anti-HER2 monoclonal antibodies such as trastuzumab (Herceptin), but >50% of patients develop resistance (125). A recent study has suggested that the resistance is induced via the CDK12-mediated overexpression of several WNT ligands and components of the ERBB–PI3K–AKT pathway including IRS1, which leads to the activation of the pro-growth signalling cascades (110). Consistently, the pharmacological inhibition of CDK12 exhibited anti-proliferative effects in trastuzumab-resistant but also trastuzumab-sensitive cells, suggesting that CDK12 inhibition might serve as a replacement therapy for trastuzumab in breast cancers with amplified HER2 and CDK12 (110). Interestingly, adaptive response to sorafenib, an anti-hepatocellular carcinoma drug, is caused by the aberrant activation of EGFR/HER3 receptors and the PI3K/AKT pathway, and was reversed by THZ531 treatment (126). This suggests an exciting possibility of targeting these oncogenic pathways by CDK12 inhibition in various types of tumours.

Tumours expressing EWS/FLI fusion protein

EWS/FLI fusion protein is a transforming transcriptional activator in Ewing sarcoma and currently difficult to target (127). CDK12 inhibition in Ewing sarcoma is synthetically lethal with EWS/FLI expression and leads to the downregulation of DNA repair genes (128). In this genetic and cellular context, THZ531 also exhibited a synergistic effect with PARPi and various DNA-damaging drugs (128). Thus, the interference with CDK12's roles as transcriptional co-activator and master regulator of DNA damage response genes suggests a promising translational potential for the treatment of this disease.

CHK1 and inhibitors of cell cycle checkpoints

Recent findings that CDK12 (directly or indirectly) regulates various steps of cell cycle progression (39,78,81) provide fresh avenues to identify and exploit other synthetically lethal interactions of CDK12 in various genetic and cellular contexts. Notably, cells with depleted CDK12 were shown to be more reliant on the kinase activity of CHK1 (129). Further research in this field is warranted and will likely provide more translation opportunities for the treatment of various malignancies.

Immunotherapy-immune checkpoint inhibitors

Immune checkpoint blockage with PD-1 inhibitors is an emerging strategy for the treatment of cancer. A subgroup of mismatch repair (MMR)-deficient tumours with high neoantigen load and T-cell infiltration is highly sensitive to the therapy (130,131). Remarkably, 7% of mCRPC carrying a biallelic inactivation of CDK12 and focal tandem duplications have increased gene fusion, fusion-induced neoantigen load and T-cell infiltration, suggesting that this new subgroup of mCRPC could also benefit from the treatment (88). This was also indicated by a small pilot clinical study (88) and recently confirmed by a much larger clinical study (132). As the biallelic loss of CDK12 is common in other types of tumours (133,134) and in some of them is associated with focal tandem duplications (96), the CDK12 aberrations have a potential to be used, next to MMR deficiency, as another biomarker of response to the therapy (94).

Inhibition of energy metabolism

A recent study has shown that reducing BRCA1 expression, either directly or via CDK12 depletion, led to HR deficiency-independent metabolic reprogramming and an increased sensitivity of BRCA1-deficient HGSOC cells to metabolic inhibitors (135). This finding raises the question of whether tumours with CDK12 aberrations have metabolic vulnerabilities that could be targeted by metabolism-modulating drugs (136).

CLINICAL TRIALS

There are ongoing clinical trials on patients with CDK12-aberrant tumours. Selected examples are shown in Table 2.

SUMMARY AND FUTURE DIRECTIONS

Over the last few years, our knowledge of CDK12 has expanded dramatically and its medical potential is being realized. Nevertheless, many questions about its roles and functions in the cell remain open.

Mechanistically, it will be important to determine which other proteins mediate CDK12 recruitment to genes, and which residues and repeats in the CTD of RNAPII and other cellular substrates are phosphorylated by CDK12. This knowledge will be critical to deciphering the precise mechanism of CDK12-dependent transcription elongation and for elucidating why some genes are more dependent

Table 2. List of sample clinical trials on patients with CDK12-aberrant tumours

Tumour type | Condition | Intervention/treatment | Identifier/study abbreviation
mCRPC | Loss of CDK12 function | Checkpoint inhibitor immunotherapy (nivolumab: anti-PD-1; ipilimumab: anti-CTLA-4) | NCT03570619, IMPACT
mCRPC | MMR deficiency or biallelic inactivation of CDK12 | Checkpoint inhibitor immunotherapy (pembrolizumab: anti-PD-1) | NCT04104893, CHOMP
mCRPC | HR deficiency | PARPi (rucaparib) | NCT02952534, TRITON2
Renal cell carcinoma | Inactivating mutations in CDK12 (or other DNA repair genes) | PARPi (olaparib) | NCT03786796, ORCHID
mCRPC | Mutations in non-canonical DNA repair genes including CDK12 | PARPi (olaparib) | NCT03012321
mCRPC | Positive for DNA repair gene defects or CDK12 biallelic inactivation | PARPi (niraparib) combined with checkpoint inhibitor immunotherapy (cetrelimab: anti-PD-1) | NCT03431350, QUEST

Source: ClinicalTrials.gov (10 February 2020).

on CDK12 than others (not all long, polyadenylation-site-rich genes are dependent on CDK12 and vice versa). Another important question is whether the inhibition of premature termination by CDK12 is a regulated or merely a passive process. Given the occupancy of CDK12 on gene promoters, its potential role in promoter-proximal pause release and re-initiation remains to be determined. Broader questions of high interest include determining how exactly CDK12-directed gene expression is integrated into the regulation of DNA replication and cell cycle progression as well as determining the mechanism of genesis of focal tandem duplications.

Answers to these questions will move us forward in understanding the cellular functions of CDK12. They will likely create fresh avenues towards clinical applications, such as finding new synthetic lethal interactions and its use as a biomarker for treatments of various malignancies.

ACKNOWLEDGEMENTS

This project has received funding from the European Union’s Horizon 2020 research and innovation programme under the Marie Skłodowska-Curie actions and it is co-financed by the South Moravian Region under grant agreement No. 665860. This report reflects only the author’s view and the EU is not responsible for any use that may be made of the information it contains. We apologize to many colleagues whose work is not cited here due to space limitations.

FUNDING

Czech Science Foundation [17-13692S to D.B.]; CEITEC [CZ.1.05/1.1.00/02.0068 to D.B.]; South Moravian Region [665860 to J.H.].

Conflict of interest statement. None declared.

REFERENCES

1. Ko,T.K., Kelly,E. and Pines,J. (2001) CrkRS: a novel conserved Cdc2-related protein kinase that colocalises with SC35 speckles. J. Cell Sci., 114, 2591–2603.

2. Edwards,M.C., Wong,C. and Elledge,S.J. (1998) Human cyclin K, a novel RNA polymerase II-associated cyclin possessing both carboxy-terminal domain kinase and Cdk-activating kinase activity. Mol. Cell. Biol., 18, 4291–4300.

3. Bartkowiak,B., Liu,P., Phatnani,H.P., Fuda,N.J., Cooper,J.J., Price,D.H., Adelman,K., Lis,J.T. and Greenleaf,A.L. (2010) CDK12 is a transcription elongation-associated CTD kinase, the metazoan ortholog of yeast Ctk1. Genes Dev., 24, 2303–2316.

4. Cancer Genome Atlas Research Network (2011) Integrated genomic analyses of ovarian carcinoma. Nature, 474, 609–615.

5. Blazek,D., Kohoutek,J., Bartholomeeusen,K., Johansen,E., Hulinkova,P., Luo,Z., Cimermancic,P., Ule,J. and Peterlin,B.M. (2011) The Cyclin K/Cdk12 complex maintains genomic stability via regulation of expression of DNA damage response genes. Genes Dev., 25, 2158–2172.

6. Kohoutek,J. and Blazek,D. (2012) Cyclin K goes with Cdk12 and Cdk13. Cell Div., 7, 12.

7. Bosken,C.A., Farnung,L., Hintermair,C., Merzel Schachter,M., Vogel-Bachmayr,K., Blazek,D., Anand,K., Fisher,R.P., Eick,D. and Geyer,M. (2014) The structure and substrate specificity of human Cdk12/Cyclin K. Nat. Commun., 5, 3505.

8. Greifenberg,A.K., Honig,D., Pilarova,K., Duster,R., Bartholomeeusen,K., Bosken,C.A., Anand,K., Blazek,D. and Geyer,M. (2016) Structural and functional analysis of the Cdk13/Cyclin K complex. Cell Rep., 14, 320–331.

9. Dixon-Clarke,S.E., Elkins,J.M., Cheng,S.W., Morin,G.B. and Bullock,A.N. (2015) Structures of the CDK12/CycK complex with AMP-PNP reveal a flexible C-terminal kinase extension important for ATP binding. Sci. Rep., 5, 17122.

10. Ekumi,K.M., Paculova,H., Lenasi,T., Pospichalova,V., Bosken,C.A., Rybarikova,J., Bryja,V., Geyer,M., Blazek,D. and Barboric,M. (2015) Ovarian carcinoma CDK12 mutations misregulate expression of DNA repair genes via deficient formation and function of the Cdk12/CycK complex. Nucleic Acids Res., 43, 2575–2589.

11. Cheng,S.W., Kuzyk,M.A., Moradian,A., Ichu,T.A., Chang,V.C., Tien,J.F., Vollett,S.E., Griffith,M., Marra,M.A. and Morin,G.B. (2012) Interaction of cyclin-dependent kinase 12/CrkRS with cyclin K1 is required for the phosphorylation of the C-terminal domain of RNA polymerase II. Mol. Cell. Biol., 32, 4691–4704.

12. Dai,Q., Lei,T., Zhao,C., Zhong,J., Tang,Y.Z., Chen,B., Yang,J., Li,C., Wang,S., Song,X. et al. (2012) Cyclin K-containing kinase complexes maintain self-renewal in murine embryonic stem cells. J. Biol. Chem., 287, 25344–25352.

13. Liang,K., Gao,X., Gilmore,J.M., Florens,L., Washburn,M.P., Smith,E. and Shilatifard,A. (2015) Characterization of human cyclin-dependent kinase 12 (CDK12) and CDK13 complexes in C-terminal domain phosphorylation, gene transcription, and RNA processing. Mol. Cell. Biol., 35, 928–938.

14. Juan,H.C., Lin,Y., Chen,H.R. and Fann,M.J. (2015) Cdk12 is essential for embryonic development and the maintenance of genomic stability. Cell Death Differ., 23, 1038–1048.

15. Fuda,N.J., Ardehali,M.B. and Lis,J.T. (2009) Defining mechanisms cysteine residues to develop CDK12 and CDK13 inhibitors. Nat. that regulate RNA polymerase II transcription in vivo. Nature, 461, Chem. Biol., 12, 876–884. 186–192. 39. Chirackal Manavalan,A.P., Pilarova,K., Kluge,M., 16. Proudfoot,N.J. (2016) Transcriptional termination in mammals: Bartholomeeusen,K., Rajecky,M., Oppelt,J., Khirsariya,P., stopping the RNA polymerase II juggernaut. Science, 352, aad9926. Paruch,K., Krejci,L., Friedel,C.C. et al. (2019) CDK12 controls 17. Adelman,K. and Lis,J.T. (2012) Promoter-proximal pausing of RNA G1/S progression by regulating RNAPII processivity at core DNA polymerase II: emerging roles in metazoans. Nat. Rev. Genet., 13, replication genes. EMBO Rep., 20, e47592. 720–731. 40. Krajewska,M., Dries,R., Grassetti,A.V., Dust,S., Gao,Y.,

18. Core,L. and Adelman,K. (2019) Promoter-proximal pausing of Huang,H., Sharma,B., Day,D.S., Kwiatkowski,N., Pomaville,M. Downloaded from https://academic.oup.com/narcancer/article-abstract/2/1/zcaa003/5775299 by Masarykova Univerzita user on 23 April 2020 RNA polymerase II: a nexus of gene regulation. Genes Dev., 33, et al. (2019) CDK12 loss in cancer cells affects DNA damage 960–982. response genes through premature cleavage and polyadenylation. 19. Harlen,K.M. and Churchman,L.S. (2017) The code and beyond: Nat. Commun., 10, 1757. transcription regulation by the RNA polymerase II 41. Bartkowiak,B., Yan,C. and Greenleaf,A.L. (2015) Engineering an carboxy-terminal domain. Nat. Rev. Mol. Cell Biol., 18, 263–273. analog-sensitive CDK12 cell line using CRISPR/Cas. Biochim. 20. Eick,D. and Geyer,M. (2013) The RNA polymerase II Biophys. Acta, 1849, 1179–1187. carboxy-terminal domain (CTD) code. Chem. Rev., 113, 8456–8490. 42. Quereda,V., Bayle,S., Vena,F., Frydman,S.M., Monastyrskyi,A., 21. Zaborowska,J., Egloff,S. and Murphy,S. (2016) The pol II CTD: Roush,W.R. and Duckett,D.R. (2019) Therapeutic targeting of new twists in the tail. Nat. Struct. Mol. Biol., 23, 771–777. CDK12/CDK13 in triple-negative breast cancer. Cancer Cell, 36, 22. Herzel,L., Ottoz,D.S.M., Alpert,T. and Neugebauer,K.M. (2017) 545–558. Splicing and transcription touch base: co-transcriptional 43. Schuller,R., Forne,I., Straub,T., Schreieck,A., Texier,Y., Shah,N., spliceosome assembly and function. Nat. Rev. Mol. Cell Biol., 18, Decker,T.M., Cramer,P., Imhof,A. and Eick,D. (2016) 637–650. Heptad-specific phosphorylation of RNA polymerase II CTD. Mol. 23. Bentley,D.L. (2014) Coupling mRNA processing with transcription Cell, 61, 305–314. in time and space. Nat. Rev. Genet., 15, 163–175. 44. Suh,H., Ficarro,S.B., Kang,U.B., Chun,Y., Marto,J.A. and 24. Hsin,J.P. and Manley,J.L. (2012) The RNA polymerase II CTD Buratowski,S. 
(2016) Direct analysis of phosphorylation sites on the coordinates transcription and RNA processing. Genes Dev., 26, Rpb1 C-terminal domain of RNA polymerase II. Mol. Cell, 61, 2119–2137. 297–304. 25. Buratowski,S. (2003) The CTD code. Nat. Struct. Biol., 10, 679–680. 45. Dubbury,S.J., Boutz,P.L. and Sharp,P.A. (2018) CDK12 regulates 26. Chapman,R.D., Heidemann,M., Albert,T.K., Mailhammer,R., DNA repair genes by suppressing intronic polyadenylation. Nature, Flatley,A., Meisterernst,M., Kremmer,E. and Eick,D. (2007) 564, 141–145. Transcribing RNA polymerase II is phosphorylated at CTD residue 46. Tien,J.F., Mazloomian,A., Cheng,S.G., Hughes,C.S., Chow,C.C.T., serine-7. Science, 318, 1780–1782. Canapi,L.T., Oloumi,A., Trigo-Gonzalez,G., Bashashati,A., Xu,J. 27. Buratowski,S. (2009) Progression through the RNA polymerase II et al. (2017) CDK12 regulates alternative last exon mRNA splicing CTD cycle. Mol. Cell, 36, 541–546. and promotes breast cancer cell invasion. Nucleic Acids Res., 45, 28. Qiu,H., Hu,C. and Hinnebusch,A.G. (2009) Phosphorylation of the 6698–6716. Pol II CTD by KIN28 enhances BUR1/BUR2 recruitment and Ser2 47. Popova,T., Manie,E., Boeva,V., Battistella,A., Goundiam,O., CTD phosphorylation near promoters. Mol. Cell, 33, 752–762. Smith,N.K., Mueller,C.R., Raynal,V., Mariani,O., Sastre-Garau,X. 29. Booth,G.T., Parua,P.K., Sanso,M., Fisher,R.P. and Lis,J.T. (2018) et al. (2016) Ovarian cancers harboring inactivating mutations in Cdk9 regulates a promoter-proximal checkpoint to modulate RNA CDK12 display a distinct genomic instability pattern characterized polymerase II elongation rate in fission yeast. Nat. Commun., 9, 543. by large tandem duplications. Cancer Res., 76, 1882–1891. 30. Coudreuse,D., van Bakel,H., Dewez,M., Soutourina,J., Parnell,T., 48. Blazek,D. (2016) Transcriptional kinases: caught by a sticky drug. Vandenhaute,J., Cairns,B., Werner,M. and Hermand,D. (2010) A Nat. Chem. Biol., 12, 765–766. 
gene-specific requirement of RNA polymerase II CTD 49. Li,X., Chatterjee,N., Spirohn,K., Boutros,M. and Bohmann,D. phosphorylation for sexual differentiation in S. pombe. Curr. Biol., (2016) Cdk12 is a gene-selective RNA polymerase II kinase that 20, 1053–1064. regulates a subset of the transcriptome, including Nrf2 target genes. 31. Bowman,E.A., Bowman,C.R., Ahn,J.H. and Kelly,W.G. (2013) Sci. Rep., 6, 21455. Phosphorylation of RNA polymerase II is independent of P-TEFb 50. Pan,L., Xie,W., Li,K.L., Yang,Z., Xu,J., Zhang,W., Liu,L.P., in the C. elegans germline. Development, 140, 3703–3713. Ren,X., He,Z., Wu,J. et al. (2015) Heterochromatin remodeling by 32. Peterlin,B.M. and Price,D.H. (2006) Controlling the elongation CDK12 contributes to learning in Drosophila. Proc. Natl. Acad. Sci. phase of transcription with P-TEFb. Mol. Cell, 23, 297–305. U.S.A., 112, 13988–13993. 33. Ebmeier,C.C., Erickson,B., Allen,B.L., Allen,M.A., Kim,H., 51. Gressel,S., Schwalb,B., Decker,T.M., Qin,W., Leonhardt,H., Eick,D. Fong,N., Jacobsen,J.R., Liang,K.W., Shilatifard,A., Dowell,R.D. and Cramer,P. (2017) CDK9-dependent RNA polymerase II et al. (2017) Human TFIIH kinase CDK7 regulates pausing controls transcription initiation. Elife, 6, e29736. transcription-associated chromatin modifications. Cell Rep., 20, 52. Oh,J.M., Di,C., Venters,C.C., Guo,J.N., Arai,C., So,B.R., 1173–1186. Pinto,A.M., Zhang,Z.X., Wan,L.L., Younis,I. et al. (2017) U1 34. Devaiah,B.N., Lewis,B.A., Cherman,N., Hewitt,M.C., snRNP telescripting regulates a size-function-stratified human Albrecht,B.K., Robey,P.G., Ozato,K., Sims,R.J. and Singer,D.S. genome. Nat. Struct. Mol. Biol., 24, 993–999. (2012) BRD4 is an atypical kinase that phosphorylates serine2 of the 53. Kamieniarz-Gdula,K., Gdula,M.R., Panser,K., Nojima,T., RNA polymerase II carboxy-terminal domain. Proc. Natl. Acad. Monks,J., Wisniewski,J.R., Riepsaame,J., Brockdorff,N., Pauli,A. Sci. U.S.A., 109, 6927–6932. and Proudfoot,N.J. 
(2019) Selective roles of vertebrate PCF11 in 35. Bartkowiak,B. and Greenleaf,A.L. (2015) Expression, purification, premature and full-length transcript termination. Mol. Cell, 74, and identification of associated proteins of the full-length 158–172. hCDK12/CyclinK complex. J. Biol. Chem., 290, 1786–1795. 54. Gregersen,L.H., Mitter,R., Ugalde,A.P., Nojima,T., Proudfoot,N.J., 36. Davidson,L., Muniz,L. and West,S. (2014) 3 end formation of Agami,R., Stewart,A. and Svejstrup,J.Q. (2019) SCAF4 and pre-mRNA and phosphorylation of Ser2 on the RNA polymerase II SCAF8, mRNA anti-terminator proteins. Cell, 177, 1797–1813. CTD are reciprocally coupled in human cells. Genes Dev., 28, 55. Yu,M., Yang,W., Ni,T., Tang,Z., Nakadai,T., Zhu,J. and 342–356. Roeder,R.G. (2015) RNA polymerase II-associated factor 1 37. Eifler,T.T., Shao,W., Bartholomeeusen,K., Fujinaga,K., Jager,S., regulates the release and phosphorylation of paused RNA Johnson,J.R., Luo,Z., Krogan,N.J. and Peterlin,B.M. (2015) polymerase II. Science, 350, 1383–1386. Cyclin-dependent kinase 12 increases 3 end processing of growth 56. Van Oss,S.B., Cucinotta,C.E. and Arndt,K.M. (2017) Emerging factor-induced c-FOS transcripts. Mol. Cell. Biol., 35, 468–478. insights into the roles of the Paf1 complex in gene regulation. Trends 38. Zhang,T., Kwiatkowski,N., Olson,C.M., Dixon-Clarke,S.E., Biochem. Sci., 42, 788–798. Abraham,B.J., Greifenberg,A.K., Ficarro,S.B., Elkins,J.M., 57. Hou,L., Wang,Y., Liu,Y., Zhang,N., Shamovsky,I., Nudler,E., Liang,Y., Hannett,N.M. et al. (2016) Covalent targeting of remote Tian,B. and Dynlacht,B.D. (2019) Paf1C regulates RNA polymerase NAR Cancer, 2020, Vol. 2, No. 1 13

II progression by modulating elongation rate. Proc. Natl. Acad. Sci. 78. Lei,T., Zhang,P., Zhang,X., Xiao,X., Zhang,J., Qiu,T., Dai,Q., U.S.A., 116, 14583–14592. Zhang,Y., Min,L., Li,Q. et al. (2018) Cyclin K regulates 58. Fitz,J., Neumann,T. and Pavri,R. (2018) Regulation of RNA prereplicative complex assembly to promote mammalian cell polymerase II processivity by Spt5 is restricted to a narrow window proliferation. Nat. Commun., 9, 1876. during elongation. EMBO J., 37, e97965. 79. Bracken,A.P., Ciro,M., Cocito,A. and Helin,K. (2004) E2F target 59. Shetty,A., Kallgren,S.P., Demel,C., Maier,K.C., Spatt,D., genes: unraveling the biology. Trends Biochem. Sci., 29, 409–417. Alver,B.H., Cramer,P., Park,P.J. and Winston,F. (2017) Spt5 plays 80. Lobrich,M. and Jeggo,P.A. (2007) The impact of a negligent G2/M vital roles in the control of sense and antisense transcription checkpoint on genomic instability and cancer induction. Nat. Rev.

elongation. Mol. Cell, 66, 77–88. Cancer, 7, 861–869. Downloaded from https://academic.oup.com/narcancer/article-abstract/2/1/zcaa003/5775299 by Masarykova Univerzita user on 23 April 2020 60. Hoshii,T., Cifani,P., Feng,Z., Huang,C.H., Koche,R., Chen,C.W., 81. Schecher,S., Walter,B., Falkenstein,M., Macher-Goeppinger,S., Delaney,C.D., Lowe,S.W., Kentsis,A. and Armstrong,S.A. (2018) A Stenzel,P., Krumpelmann,K., Hadaschik,B., Perner,S., non-catalytic function of SETD1A regulates cyclin K and the DNA Kristiansen,G., Duensing,S. et al. (2017) Cyclin K dependent damage response. Cell, 172, 1007–1021. regulation of Aurora B affects apoptosis and proliferation by 61. Eifler,T.T., Shao,W., Bartholomeeusen,K., Fujinaga,K., Jager,S., induction of mitotic catastrophe in prostate cancer. Int. J. Cancer, Johnson,J., Luo,Z., Krogan,N. and Peterlin,B.M. (2014) CDK12 141, 1643–1653. increases 3 end processing of growth factor-induced c-FOS 82. Bajrami,I., Frankum,J.R., Konde,A., Miller,R.E., Rehman,F.L., transcripts. Mol. Cell. Biol., 35, 468–478 Brough,R., Campbell,J., Sims,D., Rafiq,R., Hooper,S. et al. (2014) 62. Lin,X., Taube,R., Fujinaga,K. and Peterlin,B.M. (2002) P-TEFb Genome-wide profiling of genetic synthetic lethality identifies containing cyclin K and Cdk9 can activate transcription via RNA. J. CDK12 as a novel determinant of PARP1/2 inhibitor sensitivity. Biol. Chem., 277, 16873–16878. Cancer Res., 74, 287–297. 63. Chen,H.H., Wang,Y.C. and Fann,M.J. (2006) Identification and 83. Joshi,P.M., Sutor,S.L., Huntoon,C.J. and Karnitz,L.M. (2014) characterization of the CDK12/cyclin L1 complex involved in Ovarian cancer-associated mutations disable catalytic activity of alternative splicing regulation. Mol. Cell. Biol., 26, 2736–2745. CDK12, a kinase that promotes homologous recombination repair 64. 
Ghamari,A., van de Corput,M.P., Thongjuea,S., van and resistance to cisplatin and poly(ADP-ribose) polymerase Cappellen,W.A., van Ijcken,W., van Haren,J., Soler,E., Eick,D., inhibitors. J. Biol. Chem., 289, 9247–9253. Lenhard,B. and Grosveld,F.G. (2013) In vivo live imaging of RNA 84. Gaillard,H., Garcia-Muse,T. and Aguilera,A. (2015) Replication polymerase II transcription factories in primary cells. Genes Dev., stress and cancer. Nat. Rev. Cancer, 15, 276–289. 27, 767–777. 85. Zeman,M.K. and Cimprich,K.A. (2014) Causes and consequences 65. Spector,D.L. and Lamond,A.I. (2011) Nuclear speckles. Cold of replication stress. Nat. Cell Biol., 16, 2–9. Spring Harb. Perspect. Biol., 3, a000646. 86. Lawrence,K.S., Chau,T. and Engebrecht,J. (2015) DNA damage 66. Gu,B., Eick,D. and Bensaude,O. (2013) CTD serine-2 plays a critical response and spindle assembly checkpoint function throughout the role in splicing and termination factor recruitment to RNA cell cycle to ensure genomic integrity. PLoS Genet., 11, e1005150. polymerase II in vivo. Nucleic Acids Res., 41, 1591–1603. 87. Janssen,A., van der Burg,M., Szuhai,K., Kops,G.J. and 67. David,C.J., Boyne,A.R., Millhouse,S.R. and Manley,J.L. (2011) The Medema,R.H. (2011) Chromosome segregation errors as a cause of RNA polymerase II C-terminal domain promotes splicing activation DNA damage and structural chromosome aberrations. Science, 333, through recruitment of a U2AF65–Prp19 complex. Genes Dev., 25, 1895–1898. 972–983. 88. Wu,Y.M., Cieslik,M., Lonigro,R.J., Vats,P., Reimers,M.A., Cao,X., 68. Rodrigues,F., Thuma,L. and Klambt,C. (2012) The regulation of Ning,Y., Wang,L., Kunju,L.P., de Sarkar,N. et al. (2018) glial-specific splicing of Neurexin IV requires HOW and Cdk12 Inactivation of CDK12 delineates a distinct immunogenic class of activity. Development, 139, 1765–1776. advanced prostate cancer. Cell, 173, 1770–1782. 69. Drogat,J. and Hermand,D. (2012) Gene-specific requirement of 89. 
Menghi,F., Barthel,F.P., Yadav,V., Tang,M., Ji,B., Tang,Z., RNA polymerase II CTD phosphorylation. Mol. Microbiol., 84, Carter,G.W., Ruan,Y., Scully,R., Verhaak,R.G.W. et al. (2018) The 995–1004. tandem duplicator phenotype is a prevalent genome-wide cancer 70. Hurt,E., Luo,M.J., Rother,S., Reed,R. and Strasser,K. (2004) configuration driven by distinct gene mutations. Cancer Cell, 34, Cotranscriptional recruitment of the serine–arginine-rich (SR)-like 197–210. proteins Gbp2 and Hrb1 to nascent mRNA via the TREX complex. 90. Quigley,D.A., Dang,H.X., Zhao,S.G., Lloyd,P., Aggarwal,R., Proc. Natl. Acad. Sci. U.S.A., 101, 1858–1862. Alumkal,J.J., Foye,A., Kothari,V., Perry,M.D., Bailey,A.M. et al. 71. Cho,E.J., Kobor,M.S., Kim,M., Greenblatt,J. and Buratowski,S. (2018) Genomic hallmarks and structural variation in metastatic (2001) Opposing effects of Ctk1 kinase and Fcp1 phosphatase at Ser prostate cancer. Cell, 175, 889. 2 of the RNA polymerase II C-terminal domain. Genes Dev., 15, 91. Rao,M. and Powers,S. (2018) Tandem duplications may supply the 3319–3329. missing genetic alterations in many triple-negative breast and 72. Ahn,S.H., Kim,M. and Buratowski,S. (2004) Phosphorylation of gynecological cancers. Cancer Cell, 34, 179–180. serine 2 within the RNA polymerase II C-terminal domain couples 92. Menghi,F., Inaki,K., Woo,X., Kumar,P.A., Grzeda,K.R., transcription and 3 end processing. Mol. Cell, 13, 67–76. Malhotra,A., Yadav,V., Kim,H., Marquez,E.J., Ucar,D. et al. (2016) 73. Kim,H., Erickson,B., Luo,W., Seward,D., Graber,J.H., The tandem duplicator phenotype as a distinct genomic Pollock,D.D., Megee,P.C. and Bentley,D.L. (2010) Gene-specific configuration in cancer. Proc. Natl. Acad. Sci. U.S.A., 113, RNA polymerase II phosphorylation and the CTD code. Nat. E2373–E2382. Struct. Mol. Biol., 17, 1279–1286. 93. van Dessel,L.F., van Riet,J., Smits,M., Zhu,Y., Hamberg,P., van der 74. 
Choi,S.H., Martinez,T.F., Kim,S., Donaldson,C., Shokhirev,M.N., Heijden,M.S., Bergman,A.M., van Oort,I.M., de Wit,R., Voest,E.E. Saghatelian,A. and Jones,K.A. (2019) CDK12 phosphorylates et al. (2019) The genomic landscape of metastatic 4E-BP1 to enable mTORC1-dependent translation and mitotic castration-resistant prostate cancers reveals multiple distinct genome stability. Genes Dev., 33, 418–435. genotypes with potential clinical impact. Nat. Commun., 10, 5251. 75. Rother,S. and Strasser,K. (2007) The RNA polymerase II CTD 94. Antonarakis,E.S. (2018) Cyclin-dependent kinase 12, immunity, and kinase Ctk1 functions in translation elongation. Genes Dev., 21, prostate cancer. N. Engl. J. Med., 379, 1087–1089. 1409–1421. 95. Branzei,D. and Szakal,B. (2017) Building up and breaking down: 76. Coordes,B., Brunger,K.M., Burger,K., Soufi,B., Horenk,J., Eick,D., mechanisms controlling recombination during replication. Crit. Rev. Olsen,J.V. and Strasser,K. (2015) Ctk1 function is necessary for full Biochem. Mol. Biol., 52, 381–394. translation initiation activity in Saccharomyces cerevisiae. Eukaryot. 96. Sokol,E.S., Pavlick,D., Frampton,G.M., Ross,J.S., Miller,V.A., Cell, 14, 86–95. Ali,S.M., Lotan,T.L., Pardoll,D.M., Chung,J.H. and 77. Xiang,X., Deng,L., Zhang,J., Zhang,X., Lei,T., Luan,G., Yang,C., Antonarakis,E.S. (2019) Pan-cancer analysis of CDK12 Xiao,Z.X., Li,Q. and Li,Q. (2014) A distinct expression pattern of loss-of-function alterations and their association with the focal cyclin K in mammalian testes suggests a functional role in tandem-duplicator phenotype. Oncologist, 24, 1526–1533. spermatogenesis. PLoS One, 9, e101539. 97. Lui,G.Y.L., Grandori,C. and Kemp,C.J. (2018) CDK12: an emerging therapeutic target for cancer. J. Clin. Pathol., 71, 957–962. 14 NAR Cancer, 2020, Vol. 2, No. 1

98. Paculova,H. and Kohoutek,J. (2017) The emerging roles of CDK12 118. Drean,A., Lord,C.J. and Ashworth,A. (2016) PARP inhibitor in tumorigenesis. Cell Div., 12,7. combination therapy. Crit. Rev. Oncol. Hematol., 108, 73–85. 99. Blazek,D. (2012) The cyclin K/Cdk12 complex: an emerging new 119. Chung,J.H., Dewal,N., Sokol,E., Mathew,P., Whitehead,R., player in the maintenance of genome stability. Cell Cycle, 11, Millis,S.Z., Frampton,G.M., Bratslavsky,G., Pal,S.K., Lee,R.J. et al. 1049–1050. (2019) Prospective comprehensive genomic profiling of primary and 100. Carter,S.L., Cibulskis,K., Helman,E., McKenna,A., Shen,H., metastatic prostate tumors. JCO Precis. Oncol., 3, 1–23. Zack,T., Laird,P.W., Onofrio,R.C., Winckler,W., Weir,B.A. et al. 120. Mateo,J., Seed,G., Bertan,C., Rescigno,P., Dolling,D., Figueiredo,I., (2012) Absolute quantification of somatic DNA alterations in Miranda,S., Nava Rodrigues,D., Gurel,B., Clarke,M. et al. (2019)

human cancer. Nat. Biotechnol., 30, 413–421. Genomics of lethal prostate cancer at diagnosis and castration Downloaded from https://academic.oup.com/narcancer/article-abstract/2/1/zcaa003/5775299 by Masarykova Univerzita user on 23 April 2020 101. Bradner,J.E., Hnisz,D. and Young,R.A. (2017) Transcriptional resistance. J. Clin. Invest., 130, 1558–8238. addiction in cancer. Cell, 168, 629–643. 121. Reimers,M.A., Yip,S.M., Zhang,L., Cieslik,M., Dhawan,M., 102. O’Connor,M.J. (2015) Targeting the DNA damage response in Montgomery,B., Wyatt,A.W., Chi,K.N., Small,E.J., cancer. Mol. Cell, 60, 547–560. Chinnaiyan,A.M. et al. (2019) Clinical outcomes in 103. Toyoshima,M., Howie,H.L., Imakura,M., Walsh,R.M., Annis,J.E., cyclin-dependent kinase 12 mutant advanced prostate cancer. Eur. Chang,A.N., Frazier,J., Chau,B.N., Loboda,A., Linsley,P.S. et al. Urol., 77, 333–341. (2012) Functional genomics identifies therapeutic targets for 122. Luo,J. and Antonarakis,E.S. (2019) PARP inhibition––not all gene MYC-driven cancer. Proc. Natl. Acad. Sci. U.S.A., 109, 9545–9550. mutations are created equal. Nat. Rev. Urol., 16, 4–6. 104. Delehouze,C., Godl,K., Loaec,N., Bruyere,C., Desban,N., 123. Chen,H., Liu,H.D. and Qing,G.L. (2018) Targeting oncogenic Myc Oumata,N., Galons,H., Roumeliotis,T.I., Giannopoulou,E.G., as a strategy for cancer treatment. Signal Transduct. Target. Ther., 3, Grenet,J. et al. (2014) CDK/CK1 inhibitors roscovitine and CR8 5. downregulate amplified MYCN in neuroblastoma cells. Oncogene, 124. Zeng,M., Kwiatkowski,N.P., Zhang,T., Nabet,B., Xu,M., Liang,Y., 33, 5675–5687. Quan,C., Wang,J., Hao,M., Palakurthi,S. et al. (2018) Targeting 105. Sircoulomb,F., Bekhouche,I., Finetti,P., Adelaide,J., Ben MYC dependency in ovarian cancer through inhibition of CDK7 Hamida,A., Bonansea,J., Raynaud,S., Innocenti,C., and CDK12/13. Elife, 7, e39030. Charafe-Jauffret,E., Tarpin,C. et al. (2010) Genome profiling of 125. Spector,N.L. and Blackwell,K.L. 
(2009) Understanding the ERBB2-amplified breast cancers. BMC Cancer, 10, 539. mechanisms behind trastuzumab therapy for human epidermal 106. Mertins,P., Mani,D.R., Ruggles,K.V., Gillette,M.A., Clauser,K.R., growth factor receptor 2-positive breast cancer. J. Clin. Oncol., 27, Wang,P., Wang,X., Qiao,J.W., Cao,S., Petralia,F. et al. (2016) 5838–5847. Proteogenomics connects somatic mutations to signalling in breast 126. Wang,C., Wang,H., Lieftink,C., du Chatinier,A., Gao,D., Jin,G., cancer. Nature, 534, 55–62. Jin,H., Beijersbergen,R.L., Qin,W. and Bernards,R. (2019) CDK12 107. Capra,M., Nuciforo,P.G., Confalonieri,S., Quarto,M., Bianchi,M., inhibition mediates DNA damage and is synergistic with sorafenib Nebuloni,M., Boldorini,R., Pallotti,F., Viale,G., Gishizky,M.L. treatment in hepatocellular carcinoma. Gut, et al. (2006) Frequent alterations in the expression of doi:10.1136/gutjnl-2019-318506. serine/threonine kinases in human cancers. Cancer Res., 66, 127. Pishas,K.I. and Lessnick,S.L. (2016) Recent advances in targeted 8147–8154. therapy for Ewing sarcoma [version 1; peer review: 2 approved]. 108. Naidoo,K., Wai,P.T., Maguire,S.L., Daley,F., Haider,S., F1000Res., 5, 2077. Kriplani,D., Campbell,J., Mirza,H., Grigoriadis,A., Tutt,A. et al. 128. Iniguez,A.B., Stolte,B., Wang,E.J., Conway,A.S., Alexe,G., (2018) Evaluation of CDK12 protein expression as a potential novel Dharia,N.V., Kwiatkowski,N., Zhang,T.H., Abraham,B.J., Mora,J. biomarker for DNA damage response-targeted therapies in breast et al. (2018) EWS/FLI confers tumor cell synthetic lethality to cancer. Mol. Cancer Ther., 17, 306–315. CDK12 inhibition in Ewing sarcoma. Cancer Cell, 33, 202–216. 109. Natrajan,R., Wilkerson,P.M., Marchio,C., Piscuoglio,S., Ng,C.K., 129. Paculova,H., Kramara,J., Simeckova,S., Fedr,R., Soucek,K., Wai,P., Lambros,M.B., Samartzis,E.P., Dedes,K.J., Frankum,J. et al. Hylse,O., Paruch,K., Svoboda,M., Mistrik,M. and Kohoutek,J. 
(2014) Characterization of the genomic features and expressed (2017) BRCA1 or CDK12 loss sensitizes cells to CHK1 inhibitors. fusion genes in micropapillary carcinomas of the breast. J. Pathol., Tumour Biol., 39, 1–11. 232, 553–565. 130. Le,D.T., Durham,J.N., Smith,K.N., Wang,H., Bartlett,B.R., 110. Choi,H.J., Jin,S., Cho,H., Won,H.Y., An,H.W., Jeong,G.Y., Aulakh,L.K., Lu,S., Kemberling,H., Wilt,C., Luber,B.S. et al. (2017) Park,Y.U., Kim,H.Y., Park,M.K., Son,T. et al. (2019) CDK12 drives Mismatch repair deficiency predicts response of solid tumors to breast tumor initiation and trastuzumab resistance via WNT and PD-1 blockade. Science, 357, 409–413. IRS1–ErbB–PI3K signaling. EMBO Rep., 20, e48058. 131. Le,D.T., Uram,J.N., Wang,H., Bartlett,B.R., Kemberling,H., 111. Viswanathan,S.R., Ha,G., Hoff,A.M., Wala,J.A., Carrot-Zhang,J., Eyring,A.D., Skora,A.D., Luber,B.S., Azad,N.S., Laheru,D. et al. Whelan,C.W., Haradhvala,N.J., Freeman,S.S., Reed,S.C., (2015) PD-1 blockade in tumors with mismatch-repair deficiency. N. Rhoades,J. et al. (2018) Structural alterations driving Engl. J. Med., 372, 2509–2520. castration-resistant prostate cancer revealed by linked-read genome 132. Antonarakis,E.S., Velho,P.I., Agarwal,N., Santos,V.S., sequencing. Cell, 174, 433–447. Maughan,B.L., Pili,R., Adra,N., Sternberg,C.N., 112. Byrum,A.K., Vindigni,A. and Mosammaparast,N. (2019) Defining Vlachostergios,P.J., Tagawa,S.T. et al. (2019) CDK12-altered and modulating ‘BRCAness’. Trends Cell Biol., 29, 740–751. prostate cancer: clinical features and therapeutic outcomes to 113. Lord,C.J. and Ashworth,A. (2016) BRCAness revisited. Nat. Rev. standard systemic therapies, PARP inhibitors, and PD1 inhibitors. Cancer, 16, 110–120. Ann. Oncol., 30, 326–355. 114. Lord,C.J. and Ashworth,A. (2017) PARP inhibitors: synthetic 133. Marshall,C.H., Imada,E.L., Tang,Z., Marchionni,L. and lethality in the clinic. Science, 355, 1152–1158. Antonarakis,E.S. (2019) CDK12 inactivation across solid tumors: 115. 
Staaf,J., Glodzik,D., Bosch,A., Vallon-Christersson,J., an actionable genetic subtype. Oncoscience, 6, 312–316. Reutersward,C., Hakkinen,J., Degasperi,A., Amarante,T.D., 134. Zehir,A., Benayed,R., Shah,R.H., Syed,A., Middha,S., Kim,H.R., Saal,L.H., Hegardt,C. et al. (2019) Whole-genome sequencing of Srinivasan,P., Gao,J., Chakravarty,D., Devlin,S.M. et al. (2017) triple-negative breast cancers in a population-based clinical study. Mutational landscape of metastatic cancer revealed from prospective Nat. Med., 25, 1526–1533. clinical sequencing of 10,000 patients. Nat. Med., 23, 703–713. 116. Johnson,S.F., Cruz,C., Greifenberg,A.K., Dust,S., Stover,D.G., 135. Kanakkanthara,A., Kurmi,K., Ekstrom,T.L., Hou,X., Chi,D., Primack,B., Cao,S., Bernhardy,A.J., Coulson,R. et al. Purfeerst,E.R., Heinzen,E.P., Correia,C., Huntoon,C.J., O’Brien,D., (2016) CDK12 inhibition reverses de novo and acquired PARP Wahner Hendrickson,A.E. et al. (2019) BRCA1 deficiency inhibitor resistance in BRCA wild-type and mutated models of upregulates NNMT, which reprograms metabolism and sensitizes triple-negative breast cancer. Cell Rep., 17, 2367–2381. ovarian cancer cells to mitochondrial metabolic targeting agents. 117. Bayles,I., Krajewska,M., Pontius,W.D., Saiakhova,A., Morrow,J.J., Cancer Res., 79, 5920–5929. Bartels,C., Lu,J., Faber,Z.J., Fedorov,Y., Hong,E.S. et al. (2019) Ex 136. Vander Heiden,M.G. (2011) Targeting cancer metabolism: a vivo screen identifies CDK12 as a metastatic vulnerability in therapeutic window opens. Nat. Rev. Drug Discov., 10, 671–684. osteosarcoma. J. Clin. Invest., 129, 4377–4392.