Genomics 83 (2004) 153–167 www.elsevier.com/locate/ygeno

Analysis of a high-throughput yeast two-hybrid system and its use to predict the function of intracellular proteins encoded within the human MHC class III region

Ben Lehner,1 Jennifer I. Semple,1 Stephanie E. Brown, Damian Counsell, R. Duncan Campbell,2 and Christopher M. Sanderson*,2

Functional Genomics Group, MRC Rosalind Franklin Centre for Genomics Research,3 Hinxton, Cambridge CB10 1SB, United Kingdom
Received 8 April 2003; accepted 15 July 2003

Abstract

High-throughput (HTP) protein-interaction assays, such as the yeast two-hybrid (Y2H) system, are enormously useful in predicting the functions of novel gene products. HTP-Y2H screens typically do not include all of the reconfirmation and specificity tests used in small-scale studies, but the effects of omitting these steps have not been assessed. We performed HTP-Y2H screens that included all standard controls, using the predicted intracellular proteins expressed from the human MHC class III region, a region of the genome associated with many autoimmune diseases. The 91 novel interactions identified provide insight into the potential functions of many MHC genes, including C6orf47, LSM2, NELF-E (RDBP), DOM3Z, STK19, PBX2, RNF5, UAP56 (BAT1), ATP6G2, LST1/f, BAT2, Scythe (BAT3), CSNK2B, BAT5, and CLIC1. Surprisingly, our results predict that one-third of the proteins may have a role in mRNA processing, which suggests clustering of functionally related genes within the . Most importantly, our analysis shows that omitting standard controls in HTP-Y2H screens could significantly compromise data quality. © 2003 Elsevier Inc. All rights reserved.

Keywords: Protein–protein interaction; Yeast two-hybrid; Human MHC class III region

* Corresponding author. E-mail addresses: [email protected] (C.M. Sanderson), [email protected] (R.D. Campbell).
1 Both authors contributed equally toward the publication of this manuscript.
2 Inquiries can be addressed to either of these authors.
3 Formerly the MRC UK HGMP Resource Centre.

The yeast two-hybrid (Y2H) system has been used to analyze large numbers of protein–protein interactions in yeast [1–3], bacteria [4], viruses [5], and Caenorhabditis elegans [6]. In addition, an adaptation of the Gal4 two-hybrid assay was used to identify 145 interactions occurring between 3500 mouse proteins in mammalian cells [7]. The small degree of overlap [1] between theoretically comparable data sets from some of the larger studies [1,2] has inevitably led to questions about the general suitability of the Y2H technology as a strategy for building protein interaction maps. In particular, there has been speculation as to whether coprecipitation methods may be more useful. In a recent report, von Mering et al. [8] compared the accuracy, coverage, and biases of the large-scale Y2H and complex isolation studies. Importantly, von Mering's data show that the accuracy of the raw (unfiltered) Y2H data is in fact comparable to that obtained by the TAP complex isolation approach [9] (38.1 and 40.5%, respectively, for baits from a reference set of known interactions). Moreover, compared to purification methods, Y2H data appear less biased toward highly expressed proteins, proteins of particular subcellular locations, or phylogenetically conserved proteins [8]. As such, the emerging consensus is that the two approaches are complementary and that by combining data from both approaches it will be possible to increase the accuracy and coverage of predicted protein interaction networks.

Given the undoubted contribution that the Y2H approach has made to the identification of novel protein–protein interactions, both in the small-scale study of human proteins and in larger scale studies of model organisms, it is clear that the Y2H technique should now be used on a larger scale to explore human protein interaction networks. However, although some important pioneering large-scale Y2H studies have been performed [1–3,6], high-throughput (HTP) Y2H screening is still in its infancy, and as such, there is still

a need for improvement and debate about the way HTP screens are performed. In particular, large-scale analysis of human protein–protein interaction networks presents strategic problems that were not encountered in previous large-scale Y2H studies. It is therefore highly likely that different experimental strategies and HTP adaptations will need to be applied when analyzing the human proteome.

To date several different strategies have been used to adapt Y2H screens into HTP formats. For smaller numbers of genes (~200), a robotically assisted array strategy has been used [2]. This approach provides a valuable tool for proteomic analysis, not least because it enables nonspecific interactions to be identified with relative ease by repetitive screening. However, due to the potential size of the human proteome and the lack of a complete set of human open reading frames (ORFs), it is not possible to apply this methodology directly to large-scale human protein interaction studies. Consequently, the screening of high-complexity cDNA libraries remains the best way of identifying novel interactions between human proteins.

As library screening is an inherently labor-intensive procedure, a range of adaptations has been used to increase throughput. These include the use of non-sequence-verified baits [6], the pooling of baits [1,10], the analysis of only small numbers of interaction partners [2], and the omission of secondary procedures, which are commonly used to test the specificity of putative protein–protein interactions in small-scale Y2H screens. These procedures include the reconfirmation of interactions in fresh yeast and the use of nonspecific baits to assess prey specificity [11].

As an alternative to experimental specificity checks some large-scale studies have implemented data-filtering methods to reduce the false-positive rate [1]. This involves reporting only those preys that are isolated three or more times in a single screen. While this may be a valid strategy for assessing the reliability of interactions isolated from genomic fragment libraries or libraries of pooled prey clones, in which each gene should be present in approximately equal abundance [1,4], the same criteria cannot be applied when nonnormalized cDNA libraries are being screened. In this case the number of clones isolated per gene will depend far more on the relative level of expression than on the "trueness" of the observed interaction. Also, it is possible that multiple positive colonies may arise from a single mutated yeast or vector (D. Markie, personal communication). Therefore, obtaining multiple isolates of a single clone, even in repeat screens, does not prove that the interaction is a true positive. Equally, singletons may well be true positives, and excluding them may increase the false-negative rate of the screen.

Due to the lack of primary data from many published Y2H studies, or quantitative studies that address the consequences of omitting established experimental specificity checks, it is difficult to rationally assess which HTP strategy should be employed to perform larger scale Y2H studies of the human proteome.

To address these issues we have performed a medium-scale pilot study using a HTP strategy that incorporates all classical secondary specificity checks adapted into a HTP format. For this study we chose to analyze a set of intracellular proteins encoded within the human MHC class III region [12]. This region (0.8 Mb) of chromosome 6p21.3 is the most gene-dense domain of the human genome and has genetic associations with a range of human diseases [13], including insulin-dependent diabetes mellitus [14,15], rheumatoid arthritis [16], multiple sclerosis [17], ankylosing spondylitis [18], and IgA deficiency [19]. In addition to its medical significance, the MHC class III region provides an ideal set of functionally diverse proteins that can be used to assess the consequence of performing different classical specificity tests. Importantly, some proteins included in this study (NELF-E, CSNK2B, ATP6G2, LSM2, UAP56) have well-defined functions and known protein interaction partners. Inclusion of these proteins allows us to assess the potential of the HTP Y2H approach to predict gene function accurately and provides important internal controls for the efficiency and fidelity of secondary specificity tests.

Despite convincing genetic evidence showing the importance of the class III region in health and disease, neither the genes responsible nor the underlying mechanism of pathology is known for many of the associated conditions. Meaningful interpretation of this genetic evidence would be greatly enhanced if the molecular functions of each gene were understood. By using a high-stringency Y2H assay to identify interaction partners we hoped to be able to predict the functions for many of the currently uncharacterized gene products with increased confidence. In addition, by cataloguing the potential function of a collection of genes from a single genomic region extending over 800 kb it was possible to assess whether clustering of functionally related genes has occurred in this important region of the human genome.

0888-7543/$ - see front matter © 2003 Elsevier Inc. All rights reserved. doi:10.1016/S0888-7543(03)00235-0

Results

In this pilot study we have used a stringent HTP Y2H assay to analyze the protein–protein interaction profiles of intracellular proteins encoded by genes in the human MHC class III region on chromosome 6p21.3. In this region there are 60 expressed genes, approximately half of which encode intracellular proteins. For the purpose of this study we chose not to study genes encoding secreted or surface proteins, as they are less likely to perform well in the Y2H assay. Members of the HSP70 family were also excluded because they are known to bind nonspecifically to hydrophobic peptides [20]. A list of the remaining 27 genes along with a summary of available information regarding their domain structures, functions, and known interaction partners is presented in Table 1. For each of these genes the full-length ORF was amplified by PCR (ATG-end minus stop codon) before being cloned by recombination into the Gateway-converted, Gal4 DNA-binding domain (BD) vector

Table 1 Summary of the yeast two-hybrid screens Bait proteina Domain structure Known function Known Primary Reconfirmed interactions coloniesb interactions G18 (C6orf9) HOX DNA binding — — 350, 400 Prom.c PBX2 (screened on 60 mM 3AT) RING finger, Hox cofactor Hox proteins, 700, 240 15 transmembrane Meis proteins RNF5 TPR E3 — 36, 14 5 DIR1 (FKBPL) FK-binding — — >1000, 1000 Prom. ATF6h (CREBL1) [1–392] bZIP ER-stress signaling — Self-activates ND ATF6h (CREBL1) [83–392] bZIP ER-stress signaling — 350, 700 Prom. STK19 — Kinase — 100, 25 6 DOM3Z — — — 25, 30 3 SKI2W (SKIV2L) DEAD-box helicase — — Self-activates ND SKI2W [709–1246] — — — 36, 14 0 NELF-E (RDBP) RRM, poly(RD) Transcription elongation NELF complex 150, 60 3 NG36/G9A Ankyrin, SET Histone methyltransferase — Self-activates ND NG36/G9A [21–210] — — — 2, 2 0 LSM2 (C6orf28) Sm U6 assembly Lsm complexes 77, 150 7 VARS2 tRNA synthetase — — >1000, >1000 Prom. MSH5 Msh Meiosis Msh4 36, 0 0 CLIC1 — Intracellular chloride channel AKAP350 6, 25 2 DDAH2 Me-R binding Me-R degradation — >1000, >1000 Prom. BAT5 a/h hydrolase, Enzyme? — 51, 131 15 transmembrane CSNK2B CK2h Regulatory subunit of Many 76, 46 8 / kinase BAT4 Ankyrin repeats, — — Self activates ND G patch domain BAT4 [1–241] Ankyrin repeats — — >1000, >1000 Prom. BAT4 [238–570] G patch domain — — >1000 Prom. C6orf47 (G4) — — — 36, 2 1 Scythe (BAT3) [1–535] UBL domain Apoptosis, protein folding — >1000 Prom. and degradation Scythe (BAT3) [521–1132] BAG domain Ibid. Hsp70 >1000 Prom. Scythe (BAT3) [1–96] UBL domain Ibid. — 29 3 BAT2 [1–773] JmjC domain — — 213, 147 19 BAT2 [756–1408] — — — >1000, >1000 1 BAT2 [1391–2157] — — — >1000 Prom. AIF-1 (AIF1) EF hand — — 3, 4 0 AIF-2 (IRT-1)d Leucine zipper — — 66, 34 0 LST1/f (LST1) — — — 55, 12 2 IKBL (NFKBIL1) Ankyrin repeats, NLS Localized in nuclear speckles — >1000 Prom. IKBL [1–137] Ankyrin repeats, NLS — >500 Prom. 
ATP6G2 (ATP6V1G2) Subunit of vacuolar H+ translocation across ATP6E >500, >500 1 ATPase organellar membranes UAP56 (BAT1) DEAD-box helicase mRNA splicing and export U2AF65, Aly, 109, 128 12 TREX complex MCCD1 Predicted ORF — — >1000, >1000 Prom.
a The numbers enclosed in parentheses refer to the amino acid numbers at the start and the end of the protein fragment. The names enclosed in parentheses refer to official gene names, when available, or alternative gene names.
b The numbers of primary colonies from replicate screens are shown separated by a comma.
c Prom., promiscuous interactor.
d AIF-2 is a splice variant of the AIF gene.

pGBDU-G and sequenced. For comparison, some baits were also cloned into pGBDU-C vectors by conventional methods. No significant differences in interaction profiles were observed with either vector. The relatively long ORFs of BAT2 and Scythe were divided into two or three fragments, each of which was screened separately (Table 1).

Five of the baits assembled (BAT4, NG36/G9a, SKI2W, ATF6β, and PBX2) were found to autoactivate in PJ69-4A (MATa) yeast (Table 1). To circumvent this problem, screens were performed using partial fragments of these proteins or in the presence of 3-aminotriazole (3-AT), at a concentration that suppressed autoactivation.

In total, 38 baits were individually screened in duplicate (Table 1) by mating with a K562 cDNA activation domain library transformed into PJ69-4A (MATa) yeast. Each screen tested at least 10^7 diploids to obtain a threefold coverage of the library. An interaction was considered to be a "true positive" only if it fulfilled the following criteria:

(1) It activated all three reporter genes: ADE2, HIS3, and lacZ. These three reporter genes are under the control of three different Gal4 promoters to control for nonspecific binding to a particular Gal4 sequence [21].
(2) Regeneration of the prey vector in fresh yeast followed by mating with the original bait produced growth on Ade dropout medium (Figs. 1C and 1D). This test eliminates primary clones that arise due to mutations in the host cell (D. Markie, personal communication).
(3) Regeneration of the prey vector in fresh yeast followed by mating with two irrelevant baits (LSM2 and ATP6G2) did not produce growth on Ade dropout medium (Figs. 1E and 1F). This test for interaction specificity eliminates prey clones that are promiscuous or have a potential to autoactivate.
(4) The 5′ sequence read matched the protein coding region of a gene in a BLAST [22] search. BLAST matches to expressed sequence tags (ESTs) that have no defined ORF were also included as positives. However, we excluded from our list

Fig. 1. Reconfirmation and specificity testing: Each prey isolated in a screen was considered a true positive only if it reconfirmed when retested with the relevant bait in fresh yeast, but not with two irrelevant baits. (A) The total numbers of colonies scoring positive for each test are compared to the number of prey PCR products processed and the number of preys activating the lacZ reporter (X-gal blue). These data were derived from 32 screens that used nonpromiscuous baits and in which all colonies were retested. The number of primary colonies picked and the number of true-positive colonies for each of the 32 screens is shown in (B). Examples of reconfirmation and specificity testing are shown in (C–F). Prey inserts were transformed into PJ69-4A MATa yeast together with linearized pGAD-T7 in a 96-well format and grown on Leu dropout plates. The regenerated preys were replica plated onto YPAD plates covered with a lawn of bait culture for mating and the diploids were then replica plated onto medium lacking Leu, Ura, and Ade to select for positive interactions. The preys were mated either with the original bait with which they had been isolated from the library, (C) BAT5 or (D) G18, to test for reproducibility of the interaction, or with an irrelevant bait, (E and F) ATP6G2, to test for specificity of the interaction. Each cluster of spots represents a different prey from either the BAT5 or the G18 library screens arrayed in an 8 × 12 96-"well" format (only half of the array shown). Individual spots within a cluster represent colonies arising from individual yeast transformants containing a gap-repaired plasmid that, after mating with yeast expressing the specific or nonspecific bait proteins as described, formed a diploid capable of growing on medium lacking Ade. Prey colonies that grew only when mated with the specific bait and not with the nonspecific bait (e.g., position A1 in C and E) indicate true-positive interactions. Preys that did not grow when mated with either specific or nonspecific baits (e.g., position B3 in C and E) indicate false-positive interactions that arose due to a yeast mutation in the colony picked from the screen plates. Prey colonies that grew when mated with both specific and nonspecific baits (e.g., position B2 in D and F) indicate false-positive interactions arising from promiscuously interacting prey proteins.

of positives matches to 3′ UTRs and a few rare matches to genomic DNA with no associated gene prediction.

The selection of LSM2 and ATP6G2 for use as nonspecific baits in the reconfirmation of interactions was based on two criteria: (1) Each of these proteins gave a highly specific interaction profile when used as a bait to screen the K562 library. (2) Both proteins have well-defined functions, which enables the plausibility of other interactions to be rationally assessed.

Analysis of the Y2H data

Of the 38 bait proteins screened in this study, 6 full-length proteins (G18, DIR1, VARS2, DDAH2, IKBL, and MCCD1) and 8 protein fragments [ATF6β (83–392), BAT4 (1–241), BAT4 (238–570), Scythe (1–535), Scythe (521–1132), BAT2 (756–1408), BAT2 (1391–2157), IKBL (1–137)] all selected hundreds or thousands of prey clones (Table 1). Prey inserts from a minimum of 47 diploids from each of these screens were regenerated by gap repair in fresh yeast, before being mated with either original or irrelevant baits. In each case, most of the prey clones reconfirmed with the specific bait, but not with either of the nonspecific baits. Sequence analysis of prey inserts revealed a diverse range of proteins in each screen, implying that these baits either exhibit broad partner specificity or are misfolded when expressed as Gal4 AD fusion proteins. Repeat screening using His selection and high 3-AT concentrations did not reduce prey diversity. Significantly, although screens performed with the ATP6G2 and BAT2 (756–1408) baits also gave thousands of positive colonies, all of the colonies from each screen corresponded to a single, bait-specific prey (Table 1).

None of the preys isolated by NG36/G9a (21–210), AIF-1, AIF-2, and MSH5 baits were found to be true positives according to the four criteria described above. For the remaining 16 baits [PBX2, RNF5, STK19, DOM3Z, NELF-E, LSM2, CLIC1, BAT5, CSNK2B, C6orf47, Scythe (1–96), BAT2 (1–773), BAT2 (756–1408), LST1/f, ATP6G2, and UAP56], we detected between 1 and 19 reconfirmed specific interactions in each screen. To assess the consequence of implementing classical specificity checks on a representative set of functionally diverse human genes, we analyzed the total number of prey clones that were excluded at each stage of our screening procedure (Fig. 1A).

Of the total number of clones that were positive for double selection on Ade/His dropout medium, ~66% were also positive for activation of the third reporter (lacZ) but only ~57% reconfirmed when the prey vector was regenerated in fresh yeast and mated with yeast containing the original bait (Fig. 1A). Significantly, ~20% of the preys gave positive results with one or more nonspecific bait proteins. Overall ~36% of primary interactions satisfy all standard criteria for specificity. Data presented in Fig. 1B show a comparison between the number of primary interactions identified in 32 different screens and the number of interactions in each screen that pass all specificity checks. These results show that all screens include some prey clones that fail secondary specificity checks. Also, while the number of nonspecific clones identified is clearly screen specific, in several cases none of the observed primary interactions were found to be true positives. These data clearly demonstrate the need for the inclusion of specificity checks in all Y2H screens irrespective of project size.

In total we identified 103 true-positive interactions that passed all specificity checks (Table 2), of which 91 were novel. These interactions have been submitted to the BIND database (www.bind.ca) [23]. Of the 12 previously reported interactions, 6 are known to be direct protein–protein interactions, 3 protein pairs are known to be part of a common protein complex, and 3 are known to interact functionally (Table 2). By studying only those interactions that pass all criteria for specificity we hoped to gain a more accurate insight into the potential function of novel genes encoded in the human MHC class III region. A detailed description of the interactions observed in this screen and their potential functional relevance is given below.

Bait proteins implicated in mRNA processing

Four of the proteins in the MHC class III region (LSM2, UAP56, DOM3Z, and SKIV2L) are orthologues of yeast proteins involved in mRNA processing (LSM2, UAP56, Rai1p, and SKI2). In addition, another five proteins (BAT2, STK19, CLIC1, PBX2, and BAT5) were found to interact with proteins previously implicated in RNA processing. Hence, at least one-third of the intracellular proteins encoded in the class III region are likely to have a role in mRNA processing.

LSM2
LSM2 is known to form part of a doughnut-shaped heteromeric complex of like-Sm (LSM) proteins associated with the U6 snRNP [24]. The analogous yeast complexes are associated with the U6 snRNP (LSM2–8 complex) and with mRNA decapping factors (LSM1–7 complex) [25]. By analogy to the known interactions of the core snRNP Sm complex [26], LSM2 is predicted to interact with LSM3 and either LSM1 or LSM8 depending upon the complex in question [25]. In our study, the vast majority of prey proteins isolated by LSM2 corresponded to LSM3 and LSM8, which supports the existing models. Although no LSM1 clones were isolated from our library screens, this interaction was observed by direct pair-wise mating (data not shown). LSM2 also isolated LSM7 and the related Sm D2 and Sm E proteins. The LSM7 interaction may reflect a weak interaction with LSM2 or bridging effects mediated by components of the yeast LSM complex [27]. In our study, two other novel LSM interactions were observed, namely the HSP40-family member chaperone DNAJ2 and the novel zinc finger protein PR-17, suggesting that these proteins may also have a role in RNA .
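The four true-positive criteria used in these screens (activation of all three reporters, reconfirmation with the original bait in fresh yeast, no growth with the irrelevant baits, and a coding-region or EST match for the 5′ sequence read) amount to a sequential decision rule. The following is a minimal Python sketch of that rule, not the authors' software: the `PreyResult` record, its field names, and the BLAST-hit labels are hypothetical and were introduced here only for illustration.

```python
from dataclasses import dataclass

# Hypothetical labels for the best BLAST hit of the 5' sequence read.
# Coding-region and EST matches count as positives; 3' UTR matches and
# genomic DNA with no gene prediction are excluded, as stated in the text.
ACCEPTED_HITS = {"coding", "est"}


@dataclass
class PreyResult:
    """Outcome of the tests for one prey clone (field names are illustrative)."""
    ade_his_growth: bool              # primary selection: ADE2 and HIS3 reporters
    lacz_positive: bool               # third reporter (X-gal blue)
    reconfirms_with_bait: bool        # regrown in fresh yeast, mated with original bait
    grows_with_irrelevant_bait: bool  # growth with either nonspecific bait
    blast_hit: str                    # "coding", "est", "3utr", "genomic", or "none"


def classify(prey: PreyResult) -> str:
    """Apply criteria (1)-(4) in order and return the first reason for rejection."""
    if not (prey.ade_his_growth and prey.lacz_positive):
        return "rejected: not all three reporters activated"
    if not prey.reconfirms_with_bait:
        return "rejected: not reconfirmed (possible host-cell mutation)"
    if prey.grows_with_irrelevant_bait:
        return "rejected: nonspecific prey (promiscuous or autoactivating)"
    if prey.blast_hit not in ACCEPTED_HITS:
        return "rejected: no coding-region or EST match"
    return "true positive"


if __name__ == "__main__":
    examples = {
        "specific prey": PreyResult(True, True, True, False, "coding"),
        "host mutation": PreyResult(True, True, False, False, "coding"),
        "promiscuous prey": PreyResult(True, True, True, True, "coding"),
        "3' UTR match": PreyResult(True, True, True, False, "3utr"),
    }
    for name, prey in examples.items():
        print(f"{name}: {classify(prey)}")
```

The order of the checks mirrors the numbered criteria, so the returned reason corresponds to the first test a clone fails; the rejection categories follow the false-positive classes described in the legend to Fig. 1.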

Table 2
Reconfirmed protein–protein interactions
Bait (a) | Prey accession No. | Prey name | Prey domains/function | Novel interaction? (b) | No. of ISTs (c)
PBX2 | NM_003640 | IKBKAP | Transcription elongator subunit | N | 2
PBX2 | NM_001487 | GCN5L1 | HAT | N | 2
PBX2 | NM_022743 | FLJ21080 | SET, histone–Me transferase | N | 1
PBX2 | NM_017740 | ZDHHC7 | DHHC zinc finger | N | 1
PBX2 | BC009263 | MGC:16385 | Helix-turn-helix | N | 1
PBX2 | NM_006337 | MCRS1 | Nucleolar protein | N | 3
PBX2 | XM_042108 | KIAA0052 | Helicase, exosome component | N | 1
PBX2 | NM_016553 | Nup62 | Nucleoporin 62kD | N | 2
PBX2 | NM_006833 | COP9 subunit 6 | - | N | 5
PBX2 | NM_017451 | BAIAP2 | Insulin receptor signaling | N | 1
PBX2 | NM_012424 | RPS6KC1 | Kinase | N | 1
PBX2 | AL579387 | CS0DH004YA13 | - | N | 1
PBX2 | NM_000968 | RPL4 | Ribosome subunit | N | 2
PBX2 | NM_032251 | DKFZp434G0920 | - | N | 1
PBX2 | NM_007234 | Dynactin3 | Binds dynein | N | 1
RNF5 | NM_003341 | E2E 1 | Ubiquitin-conjugating enzyme | F | 2
RNF5 | NM_006357 | E2E 3 | Ubiquitin-conjugating enzyme | N | 5
RNF5 | NM_003339 | E2D 2 | Ubiquitin-conjugating enzyme | F | 1
RNF5 | NM_003340 | E2D 3 | Ubiquitin-conjugating enzyme | F | 1
RNF5 | NM_013241 | FHOD1 | Formin homology 2 domain | N | 2
STK19 | NM_006231 | POLE | DNA polymerase subunit | N | 1
STK19 | NM_002696 | POLR2G | RNA polymerase subunit | N | 1
STK19 | NM_005850 | SF3b su4 | Splicing factor | N | 1
STK19 | M97191 | Sp3 | Zinc finger transcription factor for T cell receptor | N | 1
STK19 | NM_004955 | SLC29A1 | Solute carrier family 29 (nucleoside transporter) | N | 1
STK19 | NM_006115 | PRAME | Preferentially expressed antigen in melanoma | N | 3
DOM3Z | NM_004475 | Flotillin 2 | Epidermal surface antigen | N | 2
DOM3Z | AI580336 | IMAGE:2161292 | - | N | 1
DOM3Z | XM_072118 | LOC128127 | - | N | 1
NELF-E | NM_015456 | COBRA1 | Chromatin remodeling | N | 94
NELF-E | NM_006311 | NCOR1 | Transcription corepressor subunit | N | 1
NELF-E | XM_114002 | NY-REN-24 | Similar to cactin | N | 15
LSM2 | NM_016200 | Lsm8 | Lsm-complex subunit | C | 88
LSM2 | NM_014463 | Lsm3 | Lsm-complex subunit | C | 75
LSM2 | NM_016199 | Lsm7 | Lsm-complex subunit | C | 1
LSM2 | NM_003094 | SnmpE | Sm-complex subunit | N | 5
LSM2 | NM_004597 | SnmpD2 | Sm-complex subunit | N | 1
LSM2 | NM_024741 | PR-zinc finger 17 | Zinc finger | N | 3
LSM2 | NM_001539 | DNAJ-like 2 | Hsp40 family chaperone | N | 2
CLIC1 | NM_014462 | LSM1 | mRNA decapping and degradation | N | 2
CLIC1 | BQ277691 | IMAGE:5804540 | - | N | 12
BAT5 | NM_003641 | IFITM1 | Interferon-induced transmembrane protein 1 (9-27) | N | 20
BAT5 | NM_002136 | hnRNP A1 | Heterogeneous nuclear ribonucleoprotein A1 | N | 10
BAT5 | NM_006913 | RNF5 | E3 ubiquitin ligase | N | 5
BAT5 | XM_046160 | SPP | Signal peptide peptidase, catalyzes intramembrane proteolysis of signal peptides after cleavage from a preprotein | N | 2
BAT5 | NM_032635 | NIFIE14 | Seven transmembrane domain protein | N | 2
BAT5 | NM_032125 | DKFZp564D0478 | - | N | 2
BAT5 | NM_022365 | DNAJL1 | Hsp40 chaperone family | N | 1
BAT5 | NM_022036 | GPRC5C | G protein-coupled receptor | N | 1
BAT5 | NM_013440 | PILRbeta | Paired immunoglobulin-like receptor β | N | 1
BAT5 | NM_004890 | SPAG7 | Sperm associated antigen 7 | N | 1
BAT5 | NM_002967 | SAFB | Scaffold attachment factor B | N | 1
BAT5 | NM_001689 | ATP5G3 | Subunit of mitochondrial ATP synthase H+ transporter | N | 1
BAT5 | X62997 | NAD3 | Mitochondrial NADH dehydrogenase | N | 3
BAT5 | X62996 | NAD4 | Mitochondrial NADH dehydrogenase | N | 1
BAT5 | X62996 | COX II | Mitochondrial cytochrome oxidase II | N | 8
CSNK2B | NM_001896 | CSNK2A2 | CK2 α′ catalytic subunit | D | 15
CSNK2B | NM_002350 | LYN | Src related tyrosine kinase | N | 7
CSNK2B | NM_001895 | CSNK2A1 | CK2 α, catalytic subunit | D | 3
CSNK2B | NM_001320 | CSNK2B | CK2 β, regulatory subunit | D | 3
CSNK2B | NM_000969 | RPL5 | Ribosomal protein L5 | D | 3
CSNK2B | NM_006349 | CG1I | Putative cyclin G1 interacting protein | N | 1
CSNK2B | NM_003925 | MBD4 | Methyl-CpG binding domain | N | 1
CSNK2B | XM_012804 | GIOT-2 | Gonadotropin inducible transcription repressor-2 | N | 1
C6orf47 | NM_000142 | FGFR3 | Fibroblast growth factor receptor 3 | N | 1
Scythe [1–96] | NM_003021 | SGT1 | TPR domain, binds C terminus of Hsp70 | N | 3
Scythe [1–96] | NM_032907 | MGC14421 | UBL and UBA domain | N | 3
Scythe [1–96] | NM_017876 | FLJ20552 | RING domain | N | 1
BAT2 [1–773] | NM_002136 | hnRNP A1 | Heterogeneous nuclear ribonucleoprotein A1 | N | 59
BAT2 [1–773] | NM_001568 | EIF3S6 | Eukaryotic translation initiation factor 3 subunit 6 | N | 22
BAT2 [1–773] | NM_007204 | DDX20, Gemin3 | Component of SMN complex, RNP assembly | N | 8
BAT2 [1–773] | NM_000508 | FGA | Fibrinogen A α polypeptide | N | 8
BAT2 [1–773] | NM_004199 | P4HA2 | Catalyzes the formation of 4-hydroxyproline in collagens | N | 7
BAT2 [1–773] | NM_001535 | HRMT1L1 | HMT1 hnRNP methyltransferase-like 1, methylates histones and hnRNPs | N | 5
BAT2 [1–773] | NM_006715 | MAN2C1 | Mannosidase α class 2C member 1 | N | 4
BAT2 [1–773] | NM_013291 | CPSF1 | Cleavage and polyadenylation specific factor 1, recognizes the AAUAAA polyadenylation site | N | 3
BAT2 [1–773] | NM_006839 | IMMT | Inner mitochondrial membrane protein (mitofilin) | N | 3
BAT2 [1–773] | NM_016602 | GPR2 | G protein-coupled receptor 2 | N | 2
BAT2 [1–773] | NM_014610 | G2AN α | Glucosidase II α subunit | N | 2
BAT2 [1–773] | NM_007107 | SSR3 | Signal sequence receptor γ | N | 2
BAT2 [1–773] | NM_006531 | TG737 | Tetratricopeptide repeats (TPR) | N | 2
BAT2 [1–773] | NM_006341 | MAD2L2 | Inhibitor of anaphase promoting complex (APC), subunit of DNA polymerase ζ involved in DNA damage postreplication repair | N | 2
BAT2 [1–773] | NM_005968 | hnRNP M | Heterogeneous nuclear ribonucleoprotein M | N | 2
BAT2 [1–773] | NM_014847 | KIAA0144 | UBA domain | N | 1
BAT2 [1–773] | NM_005051 | QARS | Glutaminyl-tRNA synthetase | N | 1
BAT2 [1–773] | NM_002788 | PSMA3 | 20S proteasome subunit | N | 1
BAT2 [1–773] | NM_002086 | GRB2 | Growth factor receptor-bound protein 2 | N | 1
BAT2 [756–1408] | NM_001212 | C1QBP | Inhibitor of SF2/ASF splicing factor, mitochondrial protein, C1Q binding protein | N | 79
LST1/f | NM_017769 | KIAA1333 | PHD and RING finger and HECTc domains, E3 ubiquitin ligase | N | 2
LST1/f | BC019848 | NY-REN-24 | Similar to cactin | N | 2
ATP6G2 | NM_001696 | ATP6E | ATPase vacuolar proton pump subunit E | D | 95
UAP56 | NM_005804 | DDX39 | RNA helicase | N | 74
UAP56 | NM_004640 | UAP56 | RNA helicase, mRNA splicing and export | N | 29
UAP56 | NM_032364 | CIP29 | Cytokine induced protein 29 kDa (CIP29), SAP DNA binding motif, involved in cell cycle progression | N | 4
UAP56 | XM_040002 | DKFZP547E1010 | - | N | 2
UAP56 | NM_022740 | HIPK2 | Homeodomain-interacting protein kinase 2 | N | 1
UAP56 | NM_007222 | ZHX1 | Zinc fingers and homeoboxes 1 | N | 1
UAP56 | NM_005033 | PMSCL1 | Polymyositis/scleroderma autoantigen 1, component of the exosome complex | N | 1
UAP56 | NM_004671 | PIASX-BETA | Protein inhibitor of activated STAT X | N | 1
UAP56 | NM_003211 | TDG | Thymine-DNA glycosylase | N | 1
UAP56 | NM_001019 | RPS15A | Ribosomal protein S15a | N | 1
UAP56 | NM_005782 | ALY | mRNA export, transcriptional coactivator | D | 1
UAP56 | NM_138394 | MGC:15775 | Three RNA recognition motifs | N | 1
(a) The numbers enclosed in parentheses refer to the amino acid numbers at the start and the end of the protein fragment.
(b) N, novel interaction; D, known direct interaction; F, known functional interaction; C, part of the same complex.
(c) IST, interaction sequence tag. Indicates the number of clones isolated for each interacting protein [6].
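As a worked check of the totals quoted in the text, and as an illustration of the "three or more isolates" data filter discussed in the introduction, the IST counts in Table 2 can be tallied directly. The Python sketch below is an illustration only and is not an analysis reported by the authors: the per-bait lists were transcribed by hand from the final column of Table 2, and the variable and function names are hypothetical.

```python
# IST counts for each reconfirmed prey, transcribed from the last column of
# Table 2 (one list per bait, in table order). Names here are illustrative.
IST_COUNTS = {
    "PBX2": [2, 2, 1, 1, 1, 3, 1, 2, 5, 1, 1, 1, 2, 1, 1],
    "RNF5": [2, 5, 1, 1, 2],
    "STK19": [1, 1, 1, 1, 1, 3],
    "DOM3Z": [2, 1, 1],
    "NELF-E": [94, 1, 15],
    "LSM2": [88, 75, 1, 5, 1, 3, 2],
    "CLIC1": [2, 12],
    "BAT5": [20, 10, 5, 2, 2, 2, 1, 1, 1, 1, 1, 1, 3, 1, 8],
    "CSNK2B": [15, 7, 3, 3, 3, 1, 1, 1],
    "C6orf47": [1],
    "Scythe [1-96]": [3, 3, 1],
    "BAT2 [1-773]": [59, 22, 8, 8, 7, 5, 4, 3, 3, 2, 2, 2, 2, 2, 2, 1, 1, 1, 1],
    "BAT2 [756-1408]": [79],
    "LST1/f": [2, 2],
    "ATP6G2": [95],
    "UAP56": [74, 29, 4, 2, 1, 1, 1, 1, 1, 1, 1, 1],
}

# Previously reported interactions by code (Table 2, footnote b):
# D = known direct, C = same complex, F = known functional.
KNOWN = {"D": 6, "C": 3, "F": 3}


def summarize(ist_counts, min_isolates=3):
    """Return (total interactions, interactions isolated >= min_isolates times)."""
    all_counts = [n for counts in ist_counts.values() for n in counts]
    kept = sum(1 for n in all_counts if n >= min_isolates)
    return len(all_counts), kept


total, kept = summarize(IST_COUNTS)
known = sum(KNOWN.values())
novel = total - known

print(f"reconfirmed interactions: {total}")
print(f"previously reported: {known}, novel: {novel}")
print(f"isolated three or more times: {kept} ({total - kept} would be filtered out)")
```

With the counts as transcribed, the tally gives 103 reconfirmed interactions (12 previously reported, 91 novel), of which 37 were isolated three or more times. Under these assumptions a "three or more isolates" filter would discard 66 interactions that passed every experimental specificity check.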

DOM3Z
DOM3Z is homologous to the yeast protein Rai1p, which interacts with, and stabilizes, the yeast nuclear Rat1p 5′–3′ exoribonuclease [28]. DOM3Z was seen to interact with a cell surface antigen and two novel proteins of unknown function, but not the probable human Rat1p orthologue XRN2. Therefore, it remains to be seen whether DOM3Z is the true functional orthologue of Rai1p.

UAP56 (BAT1)

UAP56 is a DEAD-box RNA helicase that was identified by virtue of its ability to bind U2AF65 as a component of the 3′ splice site complex [29]. Subsequently, Luo et al. [30] demonstrated that UAP56 interacts with Aly, leading to the recruitment of the mRNA export machinery. Our Y2H screen detected the interaction with Aly. However, the most frequently detected UAP56 interaction partner was either UAP56 itself or a close paralogue, DDX39 (91% amino acid sequence identity), suggesting that these proteins may form homo- or heterocomplexes in vivo. We also identified an interaction between UAP56 and an uncharacterized protein encoded by NM_138394. This protein contains three putative RNA recognition motifs (RRMs) and is similar to hnRNP L and therefore may well be involved in RNA processing or export.

Three other UAP56 interaction partners, thymine DNA glycosylase, PMSCL1 (PMSCL75/Rrp45), and PIASxβ, are not involved in mRNA splicing or export. However, all three have also been identified in a Y2H screen with p73α by virtue of their ability to bind the ubiquitin-like protein SUMO-1 [31]. As amino acids 48–55 of UAP56 conform to a consensus SUMO-1 modification site it seems probable that human UAP56 may be modified by the yeast SUMO-1 homologue (Smt3p), resulting in a positive Y2H interaction (Table 2).

BAT2

As yet, little is known about BAT2. Despite being very large (2157 amino acids) this protein has similarity to only one known functional domain, the JmjC domain. This domain is found in many proteins that also contain DNA or chromatin binding domains [32]. BAT2 was divided into three fragments corresponding to amino acids 1–773, 756–1408, or 1391–2157. Many of the interactions detected with the N terminus of BAT2 (amino acids 1–773) included proteins involved in mRNA processing: hnRNP A1 and hnRNP M are both components of the spliceosome [33,34], DDX20 (Gemin3) is a component of the SMN complex [35] that is required for assembly of a variety of RNPs including spliceosomal snRNPs, the cleavage- and polyadenylation-specific factor 1 (CPSF1) binds to the AAUAAA polyadenylation consensus site in pre-mRNA [36], and HRMT1L1 may methylate hnRNPs, as demonstrated for other members of the HRMT family of arginine methyltransferases [37]. Another interaction partner of BAT2 (1–773), GRB2, in addition to its well-known role in growth receptor signal transduction, has been shown to be expressed in the nucleus and to interact with an hnRNP protein [38]. Both GRB2 and HRMT1L1 contain an SH3 domain, which may interact with the proline-rich regions in BAT2. The C-terminal fragment of BAT2 interacted promiscuously with many different proteins, whereas all of the colonies obtained with the middle fragment of BAT2 corresponded to a single interacting protein, C1QBP (also known as p32), which interacts with the ASF/SF2 splicing factor [39,40]. Both hnRNP A1, an interacting partner of the N-terminal fragment of BAT2, and C1QBP can inhibit the function of ASF/SF2 [34,40]. Taken together, these data suggest that BAT2 could also play a role in the regulation of pre-mRNA splicing.

PBX2

In addition to its potential role in transcriptional regulation (see below) PBX2 was also found to interact with the DEAD-box helicase KIAA0052, which has been isolated as a component of the exosome, a complex of 3′–5′ exoribonucleases involved in noncoding RNA processing and RNA degradation. The exosome is known to be associated with actively transcribing and processing RNAs and so may be recruited by PBX2 [41].

Bait proteins implicated in transcriptional regulation

STK19

STK19 is a nuclear protein that demonstrates serine/threonine kinase activity in vitro [42]. STK19 interacted with six proteins, four of which are known to be nuclear proteins involved in DNA replication or transcription, which may be either substrates or cofactors for this kinase.

PBX2

PBX2 is one of at least 5 human TALE (three-amino-acid loop extension) HOX proteins related to the pre-B cell acute lymphocytic leukemia fusion product participant PBX1 [43]. PBX2, when fused to the Gal4 BD, behaved as a transcriptional activator, suggesting it contains a transcription activation domain. This self-activation was relatively weak and we were able to screen PBX2 against the Y2H library on plates lacking Ade and His, but containing 60 mM 3-AT. We found that PBX2 interacted with 15 proteins. Interestingly, 5 of these are either known to regulate transcription or would be predicted to do so from their domain structures. GCN5L1 is a putative histone acetylase that, if recruited by PBX2, would facilitate transcription activation. PBX2 also bound the IKBKAP protein, a component of the elongator complex, which is associated with actively transcribing RNA polymerase II [44]. This is the first suggestion that a DNA-binding protein might recruit the elongator complex. A further PBX2 interactor, cDNA FLJ21080, contains a SET domain. SET domains are known to function as histone-methyltransferases, which remodel chromatin structure. Two novel proteins containing DNA-binding motifs (MGC16385 and ZDHHC7) also interacted with PBX2 and may represent PBX2 DNA-binding partners.

NELF-E (negative elongation factor subunit E or RDBP)

NELF-E contains a C-terminal functional RRM and a central sequence consisting of 24 arginine–aspartate or arginine–glutamate dipeptide repeats [45]. RDBP was identified as the smallest of five subunits of the NELF complex [45]. The NELF complex was shown to cooperate
with the DSIF (DRB-sensitive factor) complex to repress transcription elongation by RNA polymerase II in an in vitro system. To date, only one other component, WHSC2, of the NELF complex has been identified [46]. In our screens NELF-E interacted with three proteins: COBRA1, NCOR1, and the NY-REN-24 antigen. COBRA1 (cofactor of BRCA1) binds the BRCA1 protein and, when tethered to DNA, triggers large-scale chromatin remodeling [47]. The predicted molecular weight of COBRA1 suggests that it may be the ~61-kDa component of the NELF complex [45]. NCOR1 recruits a corepressor complex, which contains histone deacetylase proteins and can repress transcription by chromatin modification [48]. The in vitro system used to demonstrate the repressive activity of NELF contains chromatin-free DNA. Our results suggest that in vivo the NELF complex may also repress transcription elongation by affecting chromatin structure. NELF-E also interacted with a novel protein (NY-REN-24) similar to Drosophila cactin, a protein involved in IκB signaling [49]. This protein contains several RD repeats that may mediate the interaction.

Bait proteins implicated in protein ubiquitination

RNF5

RNF5 is a RING zinc-finger domain protein, predicted to contain two C-terminal transmembrane anchors [50]. Multiple RING-finger proteins have demonstrated E3 ubiquitin-ligase activity, being able to transfer ubiquitin from E2 proteins to substrate proteins, thereby marking the substrates for destruction by the proteasome (reviewed by Joazeiro [51]). E3 proteins interact with both E2 proteins and the substrate proteins. Using the nonhydrophobic (putative cytoplasmic) RING domain of RNF5 as a bait, we isolated 5 different proteins in our Y2H screen, including 4 E2 proteins and the FHOD1 (formin homology 2 domain containing 1) protein. Interestingly all 4 E2 proteins (E2E1, E2E2, E2D2, E2D3) are from the same UBC4/5 subfamily of the >40 E2 proteins in the human genome [52], which not only implies a highly specific E2–E3–RING interaction, but also suggests functional specificity among the E2 subfamilies. RNF5 was recently demonstrated to display E3 ubiquitin-ligase activity in vitro, when coupled to 2 UBC4/5 family E2 proteins, but not 4 other E2 proteins [53], supporting our interaction results.

Scythe (BAT3)

Scythe is a homologue of a protein isolated from Xenopus cell extracts by virtue of its interaction with the Drosophila apoptotic regulator Reaper [54]. Scythe shows some sequence similarity to BAG-1, a member of the BAG family of HSP70 cochaperones. Like BAG-1 [55], Scythe contains ubiquitin-like and BAG domains, and it can interact with HSP70 proteins and regulate chaperone-mediated protein folding [56]. BAG-1 also interacts with CHIP, an E3 ubiquitin ligase [57], and together they regulate chaperone substrates, such as hormone receptors and other signaling molecules, by ubiquitination and targeting to the protein degradation pathway [55]. The two halves of the Scythe protein interacted promiscuously. However, a bait encoding only the N-terminal, ubiquitin-like domain (amino acids 1–96) interacted with three different proteins: MGC14421, FLJ20552, and SGT1 (Table 2). Both MGC14421 and FLJ20552 are uncharacterized proteins, but they contain domains that are frequently found in proteins involved in protein degradation: MGC14421 contains ubiquitin-like and ubiquitin-associated domains and FLJ20552 contains a RING-finger domain [58]. SGT1 bears a number of similarities to CHIP in that it contains a tetratricopeptide repeat domain, it has been shown to interact with HSP70 [59], and it is part of the SCF E3 ubiquitin-ligase complex [60]. Therefore, SGT1 and the SCF complex may cooperate with Scythe to couple protein folding and degradation in a manner analogous to the cooperation between CHIP and BAG-1. These data also suggest that different BAG proteins may cooperate with specific E3 ubiquitin ligases to determine the fate of a variety of chaperone-regulated complexes.

Bait proteins implicated in cell signaling and other processes

G18

The G18 protein is of unknown function, but contains three GoLoCo motifs. In our screens G18 interacted with a diverse range of prey proteins, suggesting either that its physiological function requires it to interact with a broad range of proteins or that it may be misfolded. One commonly isolated prey was the Gα-protein GAI2. Interestingly, GoLoCo motifs in several proteins have been demonstrated to interact with the α subunits of heterotrimeric G proteins [61], suggesting that this may represent a physiological interaction.

C6orf47

The novel C6orf47 protein has no significant similarity to any protein other than the mouse orthologue. C6orf47 was found to interact only with a region of the intracellular tail of the fibroblast growth factor receptor 3 (FGFR3), which is mutated in achondroplasia and thanatophoric dysplasia [62]. The observed Y2H interaction predicts that C6orf47 may have a role in FGFR3 signaling.

ATP6G2

The ATP6G2 gene encodes the G subunit of the vacuolar ATPase H+ pump (V-ATPase) present in the membranes of various intracellular compartments such as endosomes, lysosomes, and secretory vesicles [63]. According to the current model of V-ATPase, subunit G is thought to interact with subunit E [63,64]. Our Y2H screen with subunit G identified hundreds of positive colonies. Analysis of more than a hundred of these colonies showed that all the prey clones encoded the V-ATPase subunit E.
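The ATP6G2 result (every sequenced colony encoded subunit E) can be given a rough quantitative reading. The Python sketch below is an illustration added here, not part of the original analysis: it computes the upper confidence bound on the fraction of colonies that could encode some other prey when none was seen among the colonies sequenced. The function name `zero_event_upper_bound` is hypothetical, the sample size of 100 is an assumption taken from the phrase "more than a hundred", and the calculation assumes the sequenced colonies were an independent random sample of the positives, which the text does not state.

```python
def zero_event_upper_bound(n: int, confidence: float = 0.95) -> float:
    """Upper bound on an event's true frequency when it was observed
    0 times in n independent draws (exact binomial bound)."""
    if n <= 0:
        raise ValueError("n must be a positive number of draws")
    # Solve (1 - p)**n = 1 - confidence for p.
    return 1.0 - (1.0 - confidence) ** (1.0 / n)


# Assumption: exactly 100 colonies sequenced (the text says "more than a hundred").
SEQUENCED = 100

bound = zero_event_upper_bound(SEQUENCED)
rule_of_three = 3 / SEQUENCED  # common approximation to the same 95% bound

print(f"95% upper bound on non-subunit-E colonies: {bound:.4f}")
print(f"rule-of-three approximation:               {rule_of_three:.4f}")
```

Under these assumptions, at most a few percent of the positive colonies would be expected to encode anything other than subunit E; a larger sequenced sample would tighten the bound.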

CSNK2B

CSNK2B is the β subunit of casein kinase 2 (CK2), a ubiquitous serine/threonine kinase composed of two regulatory β subunits and two catalytic α or α′ subunits [65]. In the CSNK2B screens we detected interactions of the β subunit with itself and with the other two subunits, CSNK2A1 (α) and CSNK2A2 (α′). We also identified an interaction with the tyrosine kinase lyn. CK2 and lyn have been shown in some cases to phosphorylate the same proteins [66] and it is possible that they could be part of a larger regulatory complex. CK2 is required for cell cycle progression from G1 to S phase [67]. Therefore, the interaction we detected between CSNK2B and a putative cyclin G1-interacting protein (CG1I) could also be of interest.

LST1

The LST1 gene expresses many differentially spliced transcripts that could code for several different proteins varying considerably in their predicted ORFs [68]. One splice variant of LST1 contains a transmembrane segment and when overexpressed it induces the formation of filopodia and microspikes at the cell surface [69]. An alternative splice variant, LST1/f, that lacks the transmembrane domain does not induce the formation of filopodia nor does it have a dominant negative effect. This latter splice variant is the one analyzed in this study. We identified only two specific interaction partners for LST1/f: KIAA1333, a putative E3 ligase (contains a RING finger and a HECTc domain), and an EST (NY-REN-24) that might encode a protein similar to Drosophila Cactin, which is involved in IκB signaling [70]. The interactions we detected for LST1/f suggest that this protein may form a complex with human Cactin and a ubiquitin ligase (KIAA1333), which may regulate the human equivalents of Cactus by the ubiquitination/proteasome pathway. Although a previous study has shown that ubiquitination of vertebrate IκBα can be carried out by the SCF complex [71], this putative novel LST1/f–Cactin–KIAA1333 complex may regulate IκBα in response to different signals, or it may regulate one of the other members of the vertebrate IκB family.

Discussion

Data from Y2H studies provide a valuable insight into the possible function of novel proteins and the complexity of biochemical networks. However, for many years it has also been apparent that Y2H assays generate false-positive results. Therefore, if we are to exploit the full potential of the Y2H system in the postgenomic era it is imperative that we perform HTP screens in such a way as to maintain optimal fidelity and specificity, thereby generating data that have a greater probability of driving future research in the correct direction. Before embarking on larger scale studies of the human proteome it was necessary to establish how the incorporation of different specificity criteria may affect the putative interaction profiles of a functionally diverse collection of human proteins. For this reason we chose to perform a pilot study, using a representative selection of intracellular genes from the MHC class III region of the human genome. In this study we used a stringent HTP Y2H system to predict the function of novel intracellular proteins encoded within the human MHC class III region. By operating an HTP Y2H screen that incorporates all of the classically defined specificity controls we were able to rationally assess the merit of maintaining each of these procedures in future larger scale Y2H studies of the human proteome.

Previous large-scale Y2H screens adopted one of two strategies to improve data quality. First, only those interactions detected in multiple independent screens were reported [2]. Second, only interactions detected more than three times in a single screen were reported [1,10]. While these may have been valid strategies for those specific screens, it would be inappropriate to adopt similar approaches in HTP studies of human protein–protein interactions, which involve the screening of nonnormalized high-complexity cDNA libraries. In this case the number of clones corresponding to a specific prey that are detected in each screen will vary according to the relative representation of each gene within the library. Under these conditions, excluding singletons would discriminate against genes expressed at low levels and would actually increase the false-negative rate. For example, in our screens CSNK2B, which is known to form homodimers, isolated itself in only one of two replica screens and, as such, would have been eliminated by the first strategy. Equally, UAP56 isolated only a single clone of its known interaction partner ALY and, therefore, would have been eliminated by the second strategy.

Furthermore, multiple isolates of a single clone could easily arise from yeast host mutations that occur during growth of the bait and/or the library cultures, a possibility that can be excluded only by retesting in fresh yeast. Because we repeated each of our bait screens and retested each positive interaction, it was possible for us to include singletons with confidence and assess the rationale of implementing these approaches in the HTP human studies.

In small-scale Y2H studies it is considered standard practice to test each isolated prey plasmid against the bait in fresh yeast and also to test the specificity of the interaction against irrelevant baits. However, a similar level of stringency has not been applied in all large-scale Y2H studies [1,2]. Our HTP Y2H assay allows all classical specificity checks to be performed rapidly in parallel because all steps are performed in an arrayed format, which is amenable to automation.

Analysis of the data from this study shows that the incorporation of secondary specificity tests does significantly reduce the number of protein–protein interactions observed in each screen. For example, in one screen the UAP56 protein isolated 113 prey colonies, representing nine different interactions. However, only six of these interactions reconfirmed when retested, including one that was eliminated due to lack of prey specificity. Overall, less than 37% of primary positive interactions passed all established criteria for specificity (Fig. 1). As the MHC class III region encodes a diverse range of proteins, containing many different structural and functional domains, it is reasonable to assume that the trends observed in this study will also be reflected in larger studies. The efficacy of the specificity checks used in this study is shown by the fact that several false-positive clones (e.g., MGC4549, ferritin light-chain polypeptide) that turned up in multiple unrelated screens were eliminated by this strategy.

While the use of specificity checks undoubtedly reduces the number of false-positive clones reported, it could be argued that they also increase the false-negative rate. For this reason we deliberately sequenced all of the positive colonies picked from each screen, including those that passed all the specificity checks and those that did not. Examination of the resulting data shows that all of the known interaction partners identified in our screens passed all the subsequent specificity checks. Therefore, although a proportion of known interactions are missing from our data, as is the case with all Y2H data, it appears that the inclusion of standard experimental specificity tests did not increase the false-negative rate of the screen. These observations lend credence to the assumption that data derived from screens that incorporate a full range of specificity checks may be more accurate and of greater utility to future research.

Considering only those interactions that passed all specificity checks, we identified 103 protein interactions, of which 91 are novel. Significantly, these data have enabled us to postulate putative functions for many of the uncharacterized intracellular proteins encoded within the human MHC class III region. In turn, these data will aid the prognostic interpretation of genetic mutations occurring in this important region of the human genome and provide a novel insight into biochemical networks that mediate cell signaling, protein ubiquitination, and mRNA processing events.

It has recently been reported that highly expressed genes [72] and widely expressed genes [73] are not randomly distributed along human chromosome arms, but form gene clusters. Interestingly, our data also show functional interactions occurring between different class III region gene products. For example, 9 of the 27 proteins analyzed in this study either are orthologues of yeast proteins involved in mRNA processing or interact with proteins with known roles in mRNA processing. Based on a genome-wide estimate that 3% of human proteins have a role in RNA metabolism [74], and assuming genes are ordered randomly in the human genome, it would be expected that <2 of the ~60 class III-encoded proteins would be involved in RNA metabolism (and even fewer in mRNA processing). This apparent clustering of mRNA-processing genes supports the long-suspected hypothesis that the MHC encodes clusters of genes that are both functionally and evolutionarily related, thereby supporting the concept of a higher order organization of functionally related genes in eukaryotic genomes [75].

Clearly further research is required to verify the physiological relevance of the observations reported in this study. However, this exercise has provided many intriguing experimental leads and enabled us to assess rationally the way in which future larger scale studies of the human proteome should be performed.

The power of the Y2H assay lies in its ability to predict possible protein–protein interactions. While Y2H data are never definitive, they do provide important clues, which in turn promote the formulation of new hypotheses and guide future research. Therefore, to reduce the risk of generating misleading data it is important that standard specificity checks are incorporated into all Y2H studies irrespective of size.

Materials and methods

Bait construction

Primers were designed to amplify every predicted intracellular MHC class III region ORF using the RefSeq mRNA sequences or the ORF best supported by available ESTs. Two strategies were used for Y2H bait constructions. The ORFs of BAT2, LST1/f, C6orf47, MSH5, RDBP, SKI2W, DOM3Z, and STK19 were cloned using the Gateway recombination cloning system into the pENTR201 entry vector and then subcloned into the Gateway-compatible bait vector pGBDU-G, which was constructed by cloning the Gateway reading frame cassette B into pGBDU-C1 [21], according to the manufacturer’s instructions. The ORFs of NG36/G9a, ATF6β, RNF5, PBX2, G18, CLIC1, CSNK2B, AIF-1, AIF-2, MCCD1, ATP6G2, UAP56, Scythe, BAT4, and BAT5 were cloned by conventional methods directly into the pGBDU-C series of Y2H bait vectors using restriction enzyme sites incorporated into the PCR primers. The ORFs for DDAH2, LSM2, and DIR1 were cloned using both strategies as controls for any effects of the Gateway cassette sequences in Y2H screening. All ORFs were amplified from either tissue cDNA or an SLB-1 cDNA Matchmaker library (Clontech). All bait constructs were fully sequenced and only clones corresponding to a major mRNA found in GenBank were used for Y2H screening. The Scythe (1–96) bait was cloned by gap repair into the pGBD-B vector (D. Markie) and 11 clones were pooled, giving a high probability of obtaining a PCR-error-free clone without the need for sequence verification [6].

Library screening and reporter assays

Yeast strain PJ69-4A (MATa trp1-901 leu2-3,112 ura3-52, his3-200 gal4Δ gal80Δ LYS2::GAL1-HIS3 GAL2-ADE2 met2::GAL7-lacZ [21]) was transformed by electroporation [76] with the bait plasmids, which carry the URA3 marker.
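The clustering argument in the Discussion rests on simple arithmetic: 3% of roughly 60 class III-encoded proteins gives an expected count of about 1.8, set against the 9 of 27 analyzed proteins linked to mRNA processing. The Python sketch below reproduces that expectation and, as an added illustration that is not part of the original analysis, attaches a binomial tail probability to the observed count. The function name `binomial_tail` and the constant names are hypothetical labels introduced here. The model assumes the 27 baits behave like independent draws at the genome-wide 3% rate, and it ignores the fact that the study also counts proteins that merely interact with mRNA-processing factors, a broader criterion than the RNA-metabolism annotation used in [74].

```python
from math import comb


def binomial_tail(n: int, k: int, p: float) -> float:
    """Return P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


GENOME_FRACTION = 0.03  # genome-wide share of proteins in RNA metabolism [74]
REGION_PROTEINS = 60    # approximate number of class III-encoded proteins
ANALYZED = 27           # proteins analyzed in this study
OBSERVED = 9            # analyzed proteins linked to mRNA processing

# Expected counts under random gene order.
expected_region = GENOME_FRACTION * REGION_PROTEINS
expected_analyzed = GENOME_FRACTION * ANALYZED

# Chance of seeing at least OBSERVED hits among ANALYZED proteins (illustrative only).
p_tail = binomial_tail(ANALYZED, OBSERVED, GENOME_FRACTION)

print(f"expected in region:    {expected_region:.2f}")
print(f"expected among baits:  {expected_analyzed:.2f}")
print(f"P(X >= {OBSERVED} of {ANALYZED}):    {p_tail:.2e}")
```

Under these assumptions the expected count stays below 2, as stated in the text, and the tail probability is very small. It should be read as a rough indication only, because the inclusion criteria for the observed and expected counts differ.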

Bait strains were tested for self-activation of the reporter genes on synthetic dropout (SD)–Ura/Ade and SD–Ura/His. Self-activating baits were tested for activation on SD–Ura/Ade/His containing 0.1 to 100 mM 3-AT, a competitive inhibitor of the His3 protein, to increase the stringency of selection. Several baits self-activated in the presence of 100 mM 3-AT and so could not be screened. PBX2 did not self-activate on medium concentrations of 3-AT and was screened on plates containing 40 and 60 mM 3-AT.

All baits were screened against a K562 erythroleukemia cDNA Matchmaker library (Clontech), which contains the LEU2 marker and was amplified and transformed into a mating-type-switched, MATα derivative of the PJ69-4A yeast strain. This library represents 3.5 × 10^6 independent clones with an estimated cDNA insert size range from 0.4 to 3.8 kb, with an average insert size of 1.9 kb.

Bait yeast (~10^9 cfu) were individually mated to the prey yeast library (~6 × 10^8 cfu) using either a liquid (Clontech) or a filter [77] mating procedure as previously described. Each method typically gave a mating efficiency of >2%, thereby enabling an approximately fourfold coverage of the library. Mating mixtures were grown on 16 × 150-mm SD–Ura/Leu/Ade plates for up to 2 weeks for selection of clones expressing the ADE2 reporter. Colonies were picked in a 96-grid format onto SD–Ura/Leu/Ade plates such that all subsequent handling could be performed using a 96-pin replicator. Activation of the lacZ reporter was assayed by growing the yeast on filter paper on YPAD plates overnight, lysing the cells in liquid nitrogen, and then incubating the filters at 37°C for 1 h on filters presoaked in 6 ml Z buffer (100 mM Na2HPO4, 40 mM NaH2PO4, 10 mM KCl, 1 mM MgSO4, pH 7), 100 µl 4% X-gal, and 11 µl β-mercaptoethanol. Colonies were scored as positive for activation of lacZ if the coloration was greater than that of yeast containing the bait vector alone. In approximately half of the screens we did not score for activation of the HIS3 gene because in our experience >99% of colonies that activate ADE2 also activate HIS3, as previously reported [21].

Identification of interacting proteins

Prey inserts were amplified directly from yeast using vector-specific primers in a 96-well format. Yeast colonies were picked into 20 mM NaOH and lysed for 20 min at room temperature. A 1.5-µl aliquot of lysis mix was added to a standard 25-µl PCR containing vector-specific primers. The PCR was heated for 5 min at 95°C followed by 35 cycles of 1 min denaturation at 95°C, 1 min annealing at 58°C, and 3.5 min extension at 72°C. Four microliters of each PCR was run on a 0.8% agarose gel to estimate the insert size. Some inserts are refractory to a single round of PCR analysis and were amplified using a second round of PCR using a nested forward primer. To identify the prey inserts, 3–5 µl of each PCR product was sequenced in a 20-µl reaction using BigDyeDT-2 chemistry (Applied Biosystems) and the reactions were analyzed on ABI-377 sequencers.

Reconfirmation and specificity testing

To retest each interaction in fresh yeast, each prey PCR product was cloned using gap-repair recombination cloning [11] into the pGAD-T7 vector (Clontech). Gap-repair recombination reactions were performed using a 96-well format. A 1-µl aliquot of PCR product and 8 µl of competent MATα PJ69-4A yeast cell mix containing restriction-cut vector (for 96 reactions: 555 µl 50% PEG, 83 µl 1 M LiOAc, 100 µl 2 mg/ml heat-denatured salmon sperm DNA, and 20 ng plasmid) were incubated at 30°C for 30 min, 42°C for 25 min, and 30°C for 5 min and plated directly onto SD–Leu plates for selection of the repaired plasmids. After growth at 30°C for 3 days, these yeast containing prey vector were mated by replica plating onto a lawn of MATa yeast containing the relevant bait, two different irrelevant baits, or an empty bait vector and then incubated for >5 h on YPAD plates. Diploids were selected on SD–Ura/Leu prior to testing for reporter activation by replicating onto SD–Ura/Leu/Ade.

Data analysis

Prey sequences were searched against locally held versions of the Homo sapiens Unigene and the EMBLminus databases using an automated BLAST [22] algorithm. Custom-built Perl modules and scripts were used to prefilter and format the raw BLAST output and determine whether the 5′ sequence read overlapped with the protein coding region of a gene. Only matches to known protein-coding regions and to ESTs that have no defined ORF were included as positives. Matches to 3′ UTRs and genomic DNA with no associated gene prediction were excluded from our list of positives. Some singleton, unspliced ESTs that did not correspond to any gene predictions were also excluded, as they probably represent genomic contamination of cDNA libraries. Although it is possible that the excluded BLAST hits also correspond to genuine coding regions that have not yet been identified, these would be difficult to interpret at this time. Information was recorded for each prey regarding the PCR fragment size, sequence identity, reporter activation, and reconfirmation and specificity results.

Acknowledgments

We thank David Markie for useful discussions and valuable suggestions regarding the development of HTP Y2H screens. The pGBDU-C series of vectors and the PJ69-4A MATa and MATα yeast strains were kindly provided by Philip James (Department of Biomolecular Chemistry, University of Wisconsin, Madison, WI 53706-1532, USA).

The pGBD-B vector was obtained from David Markie (Molecular Genetics Laboratory, Pathology Department, Dunedin School of Medicine, Dunedin, New Zealand).

References

[1] T. Ito, T. Chiba, R. Ozawa, M. Yoshida, M. Hattori, Y. Sakaki, A comprehensive two-hybrid analysis to explore the yeast protein interactome, Proc. Natl. Acad. Sci. USA 98 (2001) 4569–4574.
[2] P. Uetz, L. Giot, G. Cagney, T.A. Mansfield, R.S. Judson, J.R. Knight, D. Lockshon, V. Narayan, M. Srinivasan, P. Pochart, A. Qureshi-Emili, Y. Li, B. Godwin, D. Conover, T. Kalbfleisch, G. Vijayadamodar, M. Yang, M. Johnston, S. Fields, J.M. Rothberg, A comprehensive analysis of protein–protein interactions in Saccharomyces cerevisiae, Nature 403 (2000) 623–627.
[3] M. Fromont-Racine, A.E. Mayes, A. Brunet-Simon, J.C. Rain, A. Colley, I. Dix, L. Decourty, N. Joly, F. Ricard, J.D. Beggs, P. Legrain, Genome-wide protein interaction screens reveal functional networks involving Sm-like proteins, Yeast 17 (2000) 95–110.
[4] J.C. Rain, L. Selig, H. De Reuse, V. Battaglia, C. Reverdy, S. Simon, G. Lenzen, F. Petel, J. Wojcik, V. Schachter, Y. Chemama, A. Labigne, P. Legrain, The protein–protein interaction map of Helicobacter pylori, Nature 409 (2001) 211–215.
[5] M. Flajolet, G. Rotondo, L. Daviet, F. Bergametti, G. Inchauspe, P. Tiollais, C. Transy, P. Legrain, A genomic approach of the hepatitis C virus generates a protein interaction map, Gene 242 (2000) 369–379.
[6] A.J. Walhout, R. Sordella, X. Lu, J.L. Hartley, G.F. Temple, M.A. Brasch, N. Thierry-Mieg, M. Vidal, Protein interaction mapping in C. elegans using proteins involved in vulval development, Science 287 (2000) 116–122.
[7] H. Suzuki, Y. Fukunishi, I. Kagawa, R. Saito, H. Oda, T. Endo, S. Kondo, H. Bono, Y. Okazaki, Y. Hayashizaki, Protein–protein interaction panel using mouse full-length cDNAs, Genome Res. 11 (2001) 1758–1765.
[8] C. von Mering, R. Krause, B. Snel, M. Cornell, S.G. Oliver, S. Fields, P. Bork, Comparative assessment of large-scale data sets of protein–protein interactions, Nature 417 (2002) 399–403.
[9] A.C. Gavin, M. Bosche, R. Krause, P. Grandi, M. Marzioch, A. Bauer, J. Schultz, J.M. Rick, A.M. Michon, C.M. Cruciat, M. Remor, C. Hofert, M. Schelder, M. Brajenovic, H. Ruffner, A. Merino, K. Klein, M. Hudak, D. Dickson, T. Rudi, V. Gnau, A. Bauch, S. Bastuck, B. Huhse, C. Leutwein, M.A. Heurtier, R.R. Copley, A. Edelmann, E. Querfurth, V. Rybin, G. Drewes, M. Raida, T. Bouwmeester, P. Bork, B. Seraphin, B. Kuster, G. Neubauer, G. Superti-Furga, Functional organization of the yeast proteome by systematic analysis of protein complexes, Nature 415 (2002) 141–147.
[10] T. Ito, K. Tashiro, S. Muta, R. Ozawa, T. Chiba, M. Nishizawa, K. Yamamoto, S. Kuhara, Y. Sakaki, Toward a protein–protein interaction map of the budding yeast: a comprehensive system to examine two-hybrid interactions in all possible combinations between the yeast proteins, Proc. Natl. Acad. Sci. USA 97 (2000) 1143–1147.
[11] R. Petermann, B.M. Mossier, D.N. Aryee, H. Kovar, A recombination based method to rapidly assess specificity of two-hybrid clones in yeast, Nucleic Acids Res. 26 (1998) 2252–2253.
[12] MHC Sequencing Consortium, Complete sequence and gene map of a human major histocompatibility complex. The MHC Sequencing Consortium, Nature 401 (1999) 921–923.
[13] C.M. Milner, R.D. Campbell, J. Trowsdale, Molecular genetics of the human major histocompatibility complex, in: W.A. Lechler R (Ed.), HLA in Health and Disease, Academic Press, London, 2000, pp. 35–50.
[14] P. Hanifi Moghaddam, P. de Knijf, B.O. Roep, B. Van der Auwera, A. Naipal, F. Gorus, F. Schuit, M.J. Giphart, Genetic structure of IDDM1: two separate regions in the major histocompatibility complex contribute to susceptibility or protection. Belgian Diabetes Registry, Diabetes 47 (1998) 263–269.
[15] M. Herr, F. Dudbridge, P. Zavattari, F. Cucca, C. Guja, R. March, R.D. Campbell, A.H. Barnett, S.C. Bain, J.A. Todd, B.P. Koeleman, Evaluation of fine mapping strategies for a multifactorial disease locus: systematic linkage and association analysis of IDDM1 in the HLA region on chromosome 6p21, Hum. Mol. Genet. 9 (2000) 1291–1301.
[16] E. Zanelli, G. Jones, M. Pascual, P. Eerligh, A.R. van der Slik, A.H. Zwinderman, W. Verduyn, G.M. Schreuder, E. Roovers, F.C. Breedveld, R.R. de Vries, J. Martin, M.J. Giphart, The telomeric part of the HLA region predisposes to rheumatoid arthritis independently of the class II loci, Hum. Immunol. 62 (2001) 75–84.
[17] M.G. Marrosu, R. Murru, M.R. Murru, G. Costa, P. Zavattari, M. Whalen, E. Cocco, C. Mancosu, L. Schirru, E. Solla, E. Fadda, C. Melis, I. Porru, M. Rolesu, F. Cucca, Dissection of the HLA association with multiple sclerosis in the founder isolated population of Sardinia, Hum. Mol. Genet. 10 (2001) 2907–2916.
[18] M.A. Brown, K.D. Pile, L.G. Kennedy, D. Campbell, L. Andrew, R. March, J.L. Shatford, D.E. Weeks, A. Calin, B.P. Wordsworth, A genome-wide screen for susceptibility loci in ankylosing spondylitis, Arthritis Rheum. 41 (1998) 588–595.
[19] V.B. Matthews, C.S. Witt, M.A. French, H.K. Machulla, E.G. De la Concha, K.Y. Cheong, P. Vigil, P.N. Hollingsworth, K.J. Warr, F.T. Christiansen, P. Price, Central MHC genes affect IgA levels in the human: reciprocal effects in IgA deficiency and IgA nephropathy, Hum. Immunol. 63 (2002) 424–433.
[20] A. Gragerov, L. Zeng, X. Zhao, W. Burkholder, M.E. Gottesman, Specificity of DnaK–peptide binding, J. Mol. Biol. 235 (1994) 848–854.
[21] P. James, J. Halladay, E.A. Craig, Genomic libraries and a host strain designed for highly efficient two-hybrid selection in yeast, Genetics 144 (1996) 1425–1436.
[22] S.F. Altschul, W. Gish, W. Miller, E.W. Myers, D.J. Lipman, Basic local alignment search tool, J. Mol. Biol. 215 (1990) 403–410.
[23] G.D. Bader, D. Betel, C.W. Hogue, BIND: the Biomolecular Interaction Network Database, Nucleic Acids Res. 31 (2003) 248–250.
[24] T. Achsel, H. Stark, R. Luhrmann, The Sm domain is an ancient RNA-binding motif with oligo(U) specificity, Proc. Natl. Acad. Sci. USA 98 (2001) 3685–3689.
[25] W. He, R. Parker, Functions of Lsm proteins in mRNA degradation and splicing, Curr. Opin. Cell Biol. 12 (2000) 346–350.
[26] C. Kambach, S. Walke, R. Young, J.M. Avis, E. de la Fortelle, V.A. Raker, R. Luhrmann, J. Li, K. Nagai, Crystal structures of two Sm protein complexes and their implications for the assembly of the spliceosomal snRNPs, Cell 96 (1999) 375–387.
[27] A.E. Mayes, L. Verdone, P. Legrain, J.D. Beggs, Characterization of Sm-like proteins in yeast and their association with U6 snRNA, EMBO J. 18 (1999) 4321–4331.
[28] Y. Xue, X. Bai, I. Lee, G. Kallstrom, J. Ho, J. Brown, A. Stevens, A.W. Johnson, Saccharomyces cerevisiae RAI1 (YGL246c) is homologous to human DOM3Z and encodes a protein that binds the nuclear exoribonuclease Rat1p, Mol. Cell. Biol. 20 (2000) 4006–4015.
[29] J. Fleckner, M. Zhang, J. Valcarcel, M.R. Green, U2AF65 recruits a novel human DEAD box protein required for the U2 snRNP-branchpoint interaction, Genes Dev. 11 (1997) 1864–1872.
[30] M.L. Luo, Z. Zhou, K. Magni, C. Christoforides, J. Rappsilber, M. Mann, R. Reed, Pre-mRNA splicing and mRNA export linked by direct interactions between UAP56 and Aly, Nature 413 (2001) 644–647.
[31] A. Minty, X. Dumont, M. Kaghad, D. Caput, Covalent modification of p73alpha by SUMO-1. Two-hybrid screening with p73 identifies novel SUMO-1-interacting proteins and a SUMO-1 interaction motif, J. Biol. Chem. 275 (2000) 36316–36323.
[32] P.M. Clissold, C.P. Ponting, JmjC: cupin metalloenzyme-like domains in jumonji, hairless and phospholipase A2beta, Trends Biochem. Sci. 26 (2001) 7–9.

[33] P. Kafasla, M. Patrinou-Georgoula, J.D. Lewis, A. Guialis, Association of the 72/74-kDa proteins, members of the heterogeneous nuclear ribonucleoprotein M group, with the pre-mRNA at early stages of spliceosome assembly, Biochem. J. 363 (2002) 793–799.
[34] A. Mayeda, A.R. Krainer, Regulation of alternative pre-mRNA splicing by hnRNP A1 and splicing factor SF2, Cell 68 (1992) 365–375.
[35] B. Charroux, L. Pellizzoni, R.A. Perkinson, A. Shevchenko, M. Mann, G. Dreyfuss, Gemin3: a novel DEAD box protein that interacts with SMN, the spinal muscular atrophy gene product, and is a component of gems, J. Cell Biol. 147 (1999) 1181–1194.
[36] S. Bienroth, W. Keller, E. Wahle, Assembly of a processive messenger RNA polyadenylation complex, EMBO J. 12 (1993) 585–594.
[37] M.F. Henry, P.A. Silver, A novel methyltransferase (Hmt1p) modifies poly(A)+-RNA-binding proteins, Mol. Cell. Biol. 16 (1996) 3668–3678.
[38] F. Romero, F. Ramos-Morales, A. Dominguez, R.M. Rios, F. Schweighoffer, B. Tocque, J.A. Pintor-Toro, S. Fischer, M. Tortolero, Grb2 and its apoptotic isoform Grb3-3 associate with heterogeneous nuclear ribonucleoprotein C, and these interactions are modulated by poly(U) RNA, J. Biol. Chem. 273 (1998) 7776–7781.
[39] B. Ghebrehiwet, B.L. Lim, R. Kumar, X. Feng, E.I. Peerschke, gC1q-R/p33, a member of a new class of multifunctional and multicompartmental cellular proteins, is involved in inflammation and infection, Immunol. Rev. 180 (2001) 65–77.
[40] S.K. Petersen-Mahrt, C. Estmer, C. Ohrmalm, D.A. Matthews, W.C. Russell, G. Akusjarvi, The splicing factor-associated protein, p32, regulates RNA splicing by inhibiting ASF/SF2 RNA binding and phosphorylation, EMBO J. 18 (1999) 1014–1024.
[41] E.D. Andrulis, J. Werner, A. Nazarian, H. Erdjument-Bromage, P. Tempst, J.T. Lis, The RNA processing exosome is linked to elongating RNA polymerase II in Drosophila, Nature 420 (2002) 837–841.
[42] N. Gomez-Escobar, C.F. Chou, W.W. Lin, S.L. Hsieh, R.D. Campbell, The G11 gene located in the major histocompatibility complex encodes a novel nuclear serine/threonine protein kinase, J. Biol. Chem. 273 (1998) 30954–30960.
[43] R.S. Mann, S.K. Chan, Extra specificity from extradenticle: the partnership between HOX and PBX/EXD homeodomain proteins, Trends Genet. 12 (1996) 258–262.
[44] N.A. Hawkes, G. Otero, G.S. Winkler, N. Marshall, M.E. Dahmus, D. Krappmann, C. Scheidereit, C.L. Thomas, G. Schiavo, H. Erdjument-Bromage, P. Tempst, J.Q. Svejstrup, Purification and characterization of the human elongator complex, J. Biol. Chem. 277 (2002) 3047–3052.
[45] Y. Yamaguchi, T. Takagi, T. Wada, K. Yano, A. Furuya, S. Sugimoto, J. Hasegawa, H. Handa, NELF, a multisubunit complex containing RD, cooperates with DSIF to repress RNA polymerase II elongation, Cell 97 (1999) 41–51.
[46] Y. Yamaguchi, J. Filipovska, K. Yano, A. Furuya, N. Inukai, T. Narita, T. Wada, S. Sugimoto, M.M. Konarska, H. Handa, Stimulation of RNA polymerase II elongation by hepatitis delta antigen, Science 293 (2001) 124–127.
[47] Q. Ye, Y.F. Hu, H. Zhong, A.C. Nye, A.S. Belmont, R. Li, BRCA1-induced large-scale chromatin unfolding and allele-specific effects of cancer-predisposing mutations, J. Cell Biol. 155 (2001) 911–921.
[48] K. Jepsen, M.G. Rosenfeld, Biological roles and mechanistic actions of co-repressor complexes, J. Cell Sci. 115 (2002) 689–698.
[49] P. Lin, L.H. Huang, R. Steward, Cactin, a conserved protein that interacts with the Drosophila IkappaB protein cactus and modulates its function, Mech. Dev. 94 (2000) 57–65.
[50] N. Matsuda, A. Nakano, RMA1, an Arabidopsis thaliana gene whose cDNA suppresses the yeast sec15 mutation, encodes a novel protein with a RING finger motif and a membrane anchor, Plant Cell Physiol. 39 (1998) 545–554.
[51] C.A. Joazeiro, A.M. Weissman, RING finger proteins: mediators of ubiquitin ligase activity, Cell 102 (2000) 549–552.
[52] D. Jones, E. Crowe, T.A. Stevens, E.P. Candido, Functional and phylogenetic analysis of the ubiquitylation system in Caenorhabditis elegans: ubiquitin-conjugating enzymes, ubiquitin-activating enzymes, and ubiquitin-like proteins, Genome Biol. 3 (2002).
[53] N. Matsuda, T. Suzuki, K. Tanaka, A. Nakano, Rma1, a novel type of RING finger protein conserved from Arabidopsis to human, is a membrane-bound ubiquitin ligase, J. Cell Sci. 114 (2001) 1949–1957.
[54] K. Thress, W. Henzel, W. Shillinglaw, S. Kornbluth, Scythe: a novel reaper-binding apoptotic regulator, EMBO J. 17 (1998) 6135–6143.
[55] J. Luders, J. Demand, J. Hohfeld, The ubiquitin-related BAG-1 provides a link between the molecular chaperones Hsc70/Hsp70 and the proteasome, J. Biol. Chem. 275 (2000) 4613–4617.
[56] K. Thress, J. Song, R.I. Morimoto, S. Kornbluth, Reversible inhibition of Hsp70 chaperone function by Scythe and Reaper, EMBO J. 20 (2001) 1033–1041.
[57] J. Demand, S. Alberti, C. Patterson, J. Hohfeld, Cooperation of a ubiquitin domain protein and an E3 ubiquitin ligase during chaperone/proteasome coupling, Curr. Biol. 11 (2001) 1569–1577.
[58] A. Buchberger, From UBA to UBX: new words in the ubiquitin vocabulary, Trends Cell Biol. 12 (2002) 216–221.
[59] S. Tobaben, P. Thakur, R. Fernandez-Chacon, T.C. Sudhof, J. Rettig, B. Stahl, A trimeric protein complex functions as a synaptic chaperone machine, Neuron 31 (2001) 987–999.
[60] K. Kitagawa, D. Skowyra, S.J. Elledge, J.W. Harper, P. Hieter, SGT1 encodes an essential component of the yeast kinetochore assembly pathway and a novel subunit of the SCF ubiquitin ligase complex, Mol. Cell 4 (1999) 21–33.
[61] D.P. Siderovski, M. Diverse-Pierluissi, L. De Vries, The GoLoco motif: a Galphai/o binding motif and potential guanine-nucleotide exchange factor, Trends Biochem. Sci. 24 (1999) 340–341.
[62] R. Shiang, L.M. Thompson, Y.Z. Zhu, D.M. Church, T.J. Fielder, M. Bocian, S.T. Winokur, J.J. Wasmuth, Mutations in the transmembrane domain of FGFR3 cause the most common genetic form of dwarfism, achondroplasia, Cell 78 (1994) 335–342.
[63] T. Nishi, M. Forgac, The vacuolar (H+)-ATPases—nature’s most versatile proton pumps, Nat. Rev. Mol. Cell. Biol. 3 (2002) 94–103.
[64] T. Xu, E. Vasilyeva, M. Forgac, Subunit interactions in the clathrin-coated vesicle vacuolar (H(+))-ATPase complex, J. Biol. Chem. 274 (1999) 28909–28915.
[65] B. Guerra, O.G. Issinger, Protein kinase CK2 and its role in cellular proliferation, development and pathology, Electrophoresis 20 (1999) 391–408.
[66] R.K. Ganju, R.G. Shpektor, D.G. Brenner, M.A. Shipp, CD10/neutral endopeptidase 24.11 is phosphorylated by casein kinase II and coassociates with other phosphoproteins including the lyn src-related kinase, Blood 88 (1996) 4159–4165.
[67] R. Pepperkok, P. Lorenz, W. Ansorge, W. Pyerin, Casein kinase II is required for transition of G0/G1, early G1, and G1/S phases of the cell cycle, J. Biol. Chem. 269 (1994) 6986–6991.
[68] M.J. Neville, R.D. Campbell, Alternative splicing of the LST-1 gene located in the major histocompatibility complex on human chromosome 6, DNA Seq. 8 (1997) 155–160.
[69] A. Raghunathan, R. Sivakamasundari, J. Wolenski, R. Poddar, S.M. Weissman, Functional analysis of B144/LST1: a gene in the tumor necrosis factor cluster that induces formation of long filopodia in eukaryotic cells, Exp. Cell Res. 268 (2001) 230–244.
[70] S. Govind, Control of development and immunity by rel transcription factors in Drosophila, Oncogene 18 (1999) 6875–6887.
[71] K. Tanaka, T. Kawakami, K. Tateishi, H. Yashiroda, T. Chiba, Control of IkappaBalpha proteolysis by the ubiquitin–proteasome pathway, Biochimie 83 (2001) 351–356.
[72] H. Caron, B. van Schaik, M. van der Mee, F. Baas, G. Riggins, P. van Sluis, M.C. Hermus, R. van Asperen, K. Boon, P.A. Voute, S. Heisterkamp, A. van Kampen, R. Versteeg, The human transcriptome map: clustering of highly expressed genes in chromosomal domains, Science 291 (2001) 1289–1292.
[73] M.J. Lercher, A.O. Urrutia, L.D. Hurst, Clustering of housekeeping genes provides a unified model of gene order in the human genome, Nat. Genet. 31 (2002) 180–183.
[74] V. Anantharaman, E.V. Koonin, L. Aravind, Comparative genomics and evolution of proteins involved in RNA metabolism, Nucleic Acids Res. 30 (2002) 1427–1464.
[75] J. Trowsdale, The gentle art of gene arrangement: the meaning of gene clusters, Genome Biol. 3 (2002).
[76] M. Grey, M. Brendel, A ten-minute protocol for transforming Saccharomyces cerevisiae by electroporation, Curr. Genet. 22 (1992) 335–336.
[77] M. Fromont-Racine, J.C. Rain, P. Legrain, Toward a functional analysis of the yeast genome through exhaustive two-hybrid screens, Nat. Genet. 16 (1997) 277–282.