Median Absolute Deviation to Improve Hit Selection for Genome-Scale RNAi Screens

NAMJIN CHUNG,1 XIAOHUA DOUGLAS ZHANG,2 ANTHONY KREAMER,1 LOUIS LOCCO,1 PEI-FEN KUAN,2,3 STEVEN BARTZ,4 PETER S. LINSLEY,4 MARC FERRER,1 and BERTA STRULOVICI1

High-throughput screening (HTS) of large-scale RNA interference (RNAi) libraries has become an increasingly popular method of functional genomics in recent years. Cell-based assays used for RNAi screening often produce small dynamic ranges and significant variability because of the combination of cellular heterogeneity, transfection efficiency, and the intrinsic nature of the genes being targeted. These properties make reliable hit selection in the RNAi screen a difficult task. The use of robust methods based on median and median absolute deviation (MAD) has been suggested to improve hit selection in such cases, but mean and standard deviation (SD)–based methods are still predominantly used in many RNAi HTS. In an experimental approach to compare these 2 methods, a genome-scale small interfering RNA (siRNA) screen was performed, in which the identification of novel targets increasing the therapeutic index of the chemotherapeutic agent mitomycin C (MMC) was sought. MAD values were resistant to the presence of outliers, and the hits selected by the MAD-based method included all the hits that would be selected by the SD-based method as well as a significant number of additional hits. When retested in triplicate, a similar percentage of these siRNAs were shown to genuinely sensitize cells to MMC compared with the hits shared between SD- and MAD-based methods. Confirmed hits were enriched with the genes involved in the DNA damage response and cell cycle regulation, validating the overall hit selection strategy. Finally, computer simulations showed the superiority and generality of the MAD-based method in various RNAi HTS data models. In conclusion, the authors demonstrate that the MAD-based hit selection method rescued physiologically relevant false negatives that would have been missed in the SD-based method, and they believe it to be the desirable 1st-choice hit selection method for RNAi screen results. (Journal of Biomolecular Screening 2008:149-158)

Key words: RNAi, RNA interference, siRNA, high-throughput screen, functional genomics, data analysis, MAD, median absolute deviation, hit selection

1Department of Automated Biotechnology, Merck Research Laboratories, North Wales, Pennsylvania.
2Department of Biometrics Research, Merck Research Laboratories, West Point, Pennsylvania.
3Department of , University of Wisconsin, Madison.
4Department of Biology, Rosetta Inpharmatics, a wholly owned subsidiary of Merck & Co., Inc., Seattle, Washington.

Received Jun 1, 2007, and in revised form Oct 6, 2007. Accepted for publication Oct 26, 2007.

Journal of Biomolecular Screening 13(2); 2008. DOI: 10.1177/1087057107312035. © 2008 Society for Biomolecular Sciences, www.sbsonline.org

INTRODUCTION

RNA INTERFERENCE (RNAi) refers to posttranscriptional gene silencing that involves the endonucleolytic cleavage and subsequent degradation of a specific mRNA transcript by homologous double-stranded RNA.1 Discovery of small interfering RNA (siRNA) and its utility in mammalian cells have enabled both academia and industry to adopt RNAi technology as one of the most popular investigative tools ever for drug target identification and validation.2 With the advances in genome sequencing, large-scale siRNA or short hairpin (shRNA) libraries have been built and screened to identify novel therapeutic targets.3-11

A genome-scale RNAi library of about 20,000 genes can be screened in, depending on configuration, 200 to 250 microplates with 96 wells or 50 to 80 plates with 384 wells. RNAi screens are, in principle, cell-based high-throughput screens (HTS) that involve siRNA (or shRNA) transduction. It is siRNA transfection, however, that makes an RNAi screen markedly different from and much more complicated than a conventional HTS. siRNA transfection can be, even with automation, a slow process and requires lengthy incubation times before the effects of silencing are observed and can be measured.12 Considering that RNAi assays typically last 48 to 96 h and often entail screening under multiple conditions (e.g., with or without DNA damage), the entire process of a genome-scale RNAi screen can take up to several weeks of full operations. Therefore, RNAi screens are expensive, time-consuming, and resource intensive. For these reasons, many genome-scale RNAi screens have been conducted in a nonreplicate manner, followed by the retest of a relatively small number (100s to 1000s) of preliminary hits in replicate.13-15 As a result, confidence levels in preliminary hits selected from the primary screen are generally low. In addition, compared with in vitro biochemical assays, the assays used in an RNAi screen typically measure cellular phenotypes with narrower dynamic ranges and greater variability due to cellular heterogeneity. Moreover, variable siRNA transfection efficiency from plate to plate increases data variability. Taken together, it is not uncommon to observe a 20% to 30% coefficient of variation for control siRNAs in an RNAi screen.15 Under these circumstances, it is critical to reduce the number of false-negative hits rather than false-positive hits from the nonreplicate, initial screen because false positives can be eliminated during the validation process whereas false negatives represent missed opportunities.

A simplistic hit selection method in HTS is to use an arbitrary numerical cutoff such as 30% inhibition or 50% activation. Another widely popular method is to use the z score, which is defined by the difference between individual well data and sample mean (or median) divided by standard deviation (SD).16 Therefore, a z score of 3 translates into the distance of 3 × SD extended from sample mean (or median), and the wells with an absolute z score of 3 or greater represent an extreme 0.27% subset in a normal distribution. One way to reduce false negatives is to lower the stringency for hit selection threshold (e.g., 2 × SD instead of 3 × SD) and select more preliminary hits.13,14 However, this inevitably increases the number of false positives as well, thus increasing the cost of retesting the siRNAs that would eventually turn out to be negative.16

Recently, there have been more sophisticated approaches to HTS hit selection to improve overall confirmation rates.16-24 One way is to use median absolute deviation (MAD) rather than popular SD. MAD can be defined as follows:

$$\mathrm{MAD} = 1.4826 \times \mathrm{median}\left(\left|x_{ij} - \mathrm{median}(x)\right|\right),$$

where $x$ indicates all the values in the sample wells of a plate and $x_{ij}$ indicates the sample well at the row $i$ and the column $j$. The constant 1.4826 is used to make MAD comparable to SD when data distribute normally.25 In HTS assays, SD is often inflated by the presence of a few strong outliers (potential hits or detection errors), which could increase the number of false negatives. In contrast, MAD is more robust to outliers, and MAD-based methods are expected to obtain fewer false negatives. Using data in real RNAi HTS experiments, we have demonstrated that MAD-based methods (e.g., median ± 3 × MAD) identify more hits than SD-based methods (e.g., mean ± 3 × SD).22 However, although previous studies have analyzed existing HTS data to develop and compare new hit selection methods, there have been no experimental investigations specifically designed to show whether the additional hits selected using MAD-based methods are mostly true or false positives. In this study, we addressed this important question in a prospective approach, in which we ran a genome-scale siRNA screen, selected 2 different types of hits based on either the SD- or MAD-based method, and retested them in triplicate to compare their performances.

For this purpose, we used an siRNA screen that measured cell viability under different mitomycin C (MMC) conditions to identify chemosensitizing genes. MMC is a common chemotherapeutic agent for cancer patients. It mediates the formation of covalent bonds between DNA strands, causing cell cycle arrest during the S phase and, when left unrepaired, apoptosis. Most cancer cells proliferate in an uncontrolled manner and, as a result, become particularly susceptible to apoptotic cell death by MMC. However, like many other chemotherapeutic agents such as cisplatin, camptothecin, and doxorubicin, MMC is also cytotoxic to normal cells such as epithelial cells lining the digestive tract, hematopoietic cells, and other actively proliferating cells. Therefore, the ultimate goal of the screen is to identify therapeutic targets that can be used together with a lower dose of MMC in combination therapy.

MATERIALS AND METHODS

Cell lines and chemicals

HeLa cells were purchased from the American Type Culture Collection (Rockville, MD) and maintained in Dulbecco’s Minimal Essential Media (DMEM) supplemented with 10% fetal bovine serum (FBS) and penicillin/streptomycin (Invitrogen, Carlsbad, CA). MMC was obtained from Calbiochem (San Diego, CA).

siRNA design and siRNA library composition

siRNA sequences were designed with an algorithm developed to increase efficiency of the siRNAs for silencing while minimizing their off-target effects.26 siRNAs were manufactured by Sigma-Proligo (Boulder, CO). The custom siRNA library is composed of 22,108 unique siRNA pools, in which each pool consists of an equimolar mixture of 3 siRNAs targeting different sequences of the same mRNA transcript. The siRNA library includes the druggable genome,27 membrane proteins, enzymes, pathways of therapeutic interest, and RefSeq genes (releases 6-8, http://ncbi.nih.gov/RefSeq/).

siRNA transfection

HeLa cells were transfected with siRNAs using the Optifect transfection reagent (Invitrogen). Briefly, HeLa cells (440 cells/40 μl/well in 384-well microplates) were grown for 24 h in DMEM supplemented with 10% FBS and penicillin/streptomycin in a SelecT automated cell culture system (Hertfordshire, UK) before transfection with the siRNA library described above. For screening, 3 siRNAs targeting the same gene were pooled at equal molarity (final concentration of each siRNA = 17 nM; additive siRNA concentration = 50 nM). siRNA transfection was performed with
5 μl of diluted Optifect (1:40 in Opti-MEM) and siRNA as described previously.12 Four hours after transfection, cells were treated with 5 μl of media containing 10× MMC or media only.

Cell viability assays

Cell viability was determined using alamarBlue reagent (BioSource International, Camarillo, CA) 72 h after transfection. Media from 384-well plates were removed and replaced with 25 μl/well growth media containing 10% (vol/vol) alamarBlue and 10 mM HEPES buffer, pH 7.4. The plates were then incubated at 37 °C, 5% CO2, and 95% relative humidity for 2 h before measuring fluorescence (544 nm excitation, 590 nm emission) using a Tecan Ultra Evolution multilabel plate reader.

Automated screening

Screening was carried out on a Thermo CRS robotic system equipped with multiple pipetting stations, plate washers, and plate readers, using the above-described protocol interpreted by POLARA software (Thermo Scientific, Waltham, MA). Cells were plated offline by the SelecT system as described above and moved to an incubator on the robot (37 °C, 5% CO2, and relative humidity >95%). A Beckman Biomek FX was used for the preparation of intermediate plates and subsequent liquid transfer, and 3 Multidrops were used for dispensing indicated concentrations of MMC solutions. Media removal was performed with an Embla 384 Cell Washer, followed by the addition of alamarBlue reagent mixture. Fluorescence intensity was measured using a Tecan Ultra Evolution reader.

Simulation study

Data for eighty 384-well plates were simulated from various models to represent the variations similar to what we have observed in real RNAi HTS screening campaigns. Briefly, for each simulation, the following steps were taken. For each plate $j$, we drew 1) a number of true hits $n_{Hj}$ from Uniform{1, 2, . . . , 30} with an average of 15 out of 304 sample wells being true hits; 2) the observations $X_{ij}$ (in log ratios) for the true hits from a distribution $g_j(X)$ for $i = 1, \ldots, n_{Hj}$; 3) the observations $X_{ij}$ (in log ratios) for the nonhits from another distribution $f_j(X)$ for $i = 1, \ldots, 304 - n_{Hj}$.

Various distributions for the true hits $g_j(X)$ were considered to obtain scenarios in which the hits are relatively well separated from the nonhits (moderate data), some hits are indistinguishable from the nonhits (noisy data), and the data are skewed because the distribution of truly sensitizing and desensitizing hits is asymmetrical (skewed data). Moderate, noisy, and skewed data are all based on normal distributions. In these scenarios, the observations for the nonhits were generated from $N(\mu_{NH}, \sigma_{NH}^2)$. Because hits are usually identified as outliers among sample wells, it is reasonable to assume that they have a shifted mean relative to the nonhits. Therefore, the observations for the true hits were generated from $N(\mu_{NH} + k_1, \sigma_H^2)$ for the drug-sensitizing effect and $N(\mu_{NH} - k_2, \sigma_H^2)$ for the desensitizing effect. The parameters $\mu_{NH}$, $\sigma_{NH}^2$, and $\sigma_H^2$ were chosen such that the simulated data resembled observed data from the RNAi HTS experiment. For example, considering the effect of gene network, $\mu_{NH}$ was generated from Uniform(–0.5, 0.5) instead of being exactly zero, and $\sigma_{NH}$ was generated from Uniform(0.5, 1) for all 3 scenarios. The parameters for hits were tuned such that the signal-to-noise ratios, $k_1$ and $k_2$, were small for noisy data and high for moderate data. In the case of skewed data, we chose different values for $k_1$ and $k_2$ such that the true sensitizing and desensitizing hits were asymmetrical. Specifically, the parameters for hits were generated as follows: $k_1 = k_2 = 3$ and $\sigma_H \sim$ Uniform(0.25, 0.5) for moderate data, $k_1 = k_2 = 2$ and $\sigma_H \sim$ Uniform(0.5, 1) for noisy data, and $k_1 = 1.5$, $k_2 = 3$ and $\sigma_H \sim$ Uniform(0.5, 1) for skewed data.

To evaluate the robustness of the MAD-based method when the underlying normality assumption is violated, the 4th scenario (i.e., Gamma data) was simulated based on gamma distributions instead of normal distributions. The shape $k$ and scale $\theta$ of the gamma distribution were determined by solving $\mu_{NH} = k\theta$ and $\sigma_{NH}^2 = k\theta^2$ for nonhits and $\mu_{NH} - 4 = k\theta$ and $\sigma_{NH}^2 = k\theta^2$ for hits, where $\mu_{NH} \sim$ Uniform(0, 0.5) and $\sigma_{NH} \sim$ Uniform(0.5, 1). For either nonhits or hits, half of the data were generated from the gamma distribution described above and then centered at zero; another half of the data were generated from the same, zero-centered gamma distribution but with the opposite signs.

Consistent with the rest of the present study, we directed our simulation study to focus on identifying siRNAs with sensitizing effects (i.e., strong negative values of log ratios). The performances of MAD- and SD-based methods were compared in terms of empirical false-positive rates (FPRs) and empirical false-negative rates (FNRs) at various $\alpha$ levels (targeted FPR or type I errors in a 1-sided test), in which each $\alpha$ corresponds to $k = \Phi^{-1}(1 - \alpha)$ in $k$ × MAD-based or $k$ × SD-based methods. A sample well was declared a hit if the observed value was less than mean – $k$SD or median – $k$MAD. Each scenario was repeated 500 times to measure the consistency and variability of the methods.

RESULTS

Genome-scale RNAi HTS to identify chemosensitizing targets for MMC

A cell-based assay in a 384-well microplate format was designed to compare the viability of HeLa cervical carcinoma cells under 3 different conditions of 0, 20, or 60 nM MMC treatments for 72 h after siRNA transfection. A genome-scale siRNA library of 22,108 unique siRNAs targeting 17,509 human genes was screened with this assay to identify the siRNAs chemosensitizing MMC.12 In this assay, luciferase (Luc) siRNA, the negative control, and BARD1 siRNA, the positive control,28 were placed in
duplicate on 2 separate columns: one (column 2) without MMC treatment and another (column 23) with the same concentration of MMC as with the rest of the plate (Fig. 1). This arrangement allowed us to measure the degree of cytotoxicity conferred by MMC within the plate. The relative viability of Luc siRNA-transfected cells treated with 20 and 60 nM MMC was approximately 80% and 60%, respectively, compared with the cells transfected with the same siRNA but without MMC treatment. Data analysis showed that assay results from the 20-nM MMC treatment were not robust enough to produce reliable hits and subsequently were not pursued. Instead, we focused our efforts on identifying MMC-chemosensitizing siRNAs from the comparison between 0 and 60 nM MMC conditions.

FIG. 1. Screening plate map. A representative 384-well microplate is shown to illustrate the locations of control and sample small interfering RNAs (siRNAs) and mitomycin C (MMC) treatment. A library of 80 plates (full or partial) containing 22,108 siRNA pools was screened under 60-nM MMC (+MMC) or control media (–MMC) conditions. Luc siRNA was used as a nontargeting, negative control and BARD1 as a positive control. PLK1 siRNA is cytotoxic to HeLa cells and was used as a transfection control, located both in control columns and internal to the siRNA library. Mock indicates wells with transfection reagent only, without siRNA. Background wells contain media only, without cells. Column 2 was always treated with control media, regardless of how the rest of the plates were treated. This column was used to estimate the degree of sensitization by MMC. Cells were transfected with 50 nM siRNA.
[Plate map legend: Sample siRNA; Negative control; Positive control; Transfection control; Mock; Background (no cells); Not used.]

Viability is the measurement of overall cell health, a total sum of positive and negative effects from cellular metabolic and proliferative activities. In contrast to many HTS assays measuring exogenous, amplified signals such as Luc gene reporters and the processing of overexpressed proteins, cell viability represents an unamplified, endogenous signal with a relatively small window of activity. This was also true in our assay, in which more than 95% of all assay results fell between 0% and 200%, whether the Luc or sample siRNA median was set to 100% (Fig. 2). When the assay window is so small, combined with cellular heterogeneity in response to DNA damage, small aberrations in signal detection can result in large data variations. These variations became more exaggerated when the data were normalized to the median of Luc siRNA (n = 8/plate) than to the median of plate sample (n = 304/plate for full plates), most likely because of the difference in sampling size. Because data can be more variable upward than downward, the severity of data variation was especially apparent when the upper adjacent values (UAVs), excluding outliers, of the plate sample populations were compared. The standard deviation of the UAVs was 72% for the Luc siRNA-normalized sample populations, compared with 22% for the sample-normalized sample populations (Fig. 2B). This observation led us to use the normalization based on plate sample for the analyses hereafter.

To represent the degree of chemosensitization by MMC, the viability for each individual well was first calculated relative to the plate sample median, which was reset to 100% regardless of MMC treatment. Then, the log value (with the base of 2) was calculated for the ratio of the viability with MMC over that without MMC for the same siRNA. This renders a 2-fold decrease in viability by MMC to the value of –1, a 2-fold increase to +1, and so on. Excluding outliers, the median log ratios of the UAV and lower adjacent value (LAV) for the plate samples were +1.10 and –1.05, respectively. The log ratios of BARD1 siRNA, the positive control for MMC sensitization used in the screen, were moderately above or below the LAV of the plate sample, and the median was –0.80 (Fig. 3). This meant that BARD1 siRNA performed close to sample outliers, a desirable behavior for a positive control.

Experimental validation of the MAD-based hit selection method in RNAi HTS

To select preliminary hits, log ratios were plotted along the plate run order (Fig. 4A). A common practice in HTS hit selection is to use 3 × SD values from the sample median, which is equivalent to a z score of 3. However, close examination of many HTS plates reveals that a few strong outliers could significantly increase SD values such that many wells that could be considered as hits in other plates are not selected as hits in the plates with such strong outliers. These represent potential false negatives. To reduce such false negatives, we considered using MAD, instead of SD, for hit selection. Following our analysis, the number of hits selected by the method employing 3 × SD was 61. All these hits would be selected as hits by the method employing 3 × MAD as well. Moreover, the 3 × MAD method would select an additional 51 siRNAs as hits, with the total count at 112 hits (Fig. 4B). The
51 hits selected by 3 × MAD only showed moderately weaker sensitization than the 61 hits common to both 3 × SD and 3 × MAD methods (median log ratios at –1.33 vs. –1.67; Fig. 4C).

FIG. 2. Comparison of data normalization methods. Background-subtracted data were normalized to the median of either luciferase (Luc) small interfering RNAs (siRNAs; negative control, n = 8/plate) or sample siRNAs in the same plate (n = 304 for the full plate), which was set to 100%. (A) The entire sample (blue) and Luc (light blue) siRNAs were plotted along the plate run order. (B) Statistical representation of the screening results. The upper blue lines indicate the upper adjacent values (UAVs) of the sample population data, which exclude outliers, and the lower blue lines represent the lower adjacent values (LAVs). The median of the plate sample is shown in black lines and Luc siRNA by the red lines. Only the plates screened under –mitomycin C conditions are shown here for the purpose of illustration.

FIG. 3. Performance of BARD1-positive control small interfering RNA (siRNA). Log values (with the base of 2) of the ratio of the viability under +mitomycin C (MMC) over the viability under –MMC conditions were plotted along the plate run order. The upper and lower blue lines indicate the upper adjacent values (UAVs) and lower adjacent values (LAVs) of the plate sample siRNAs, respectively. The medians of BARD1 siRNAs are shown with the red lines.

The 112 preliminary hits were retested in triplicate in the same assay format as in the initial primary screen. Because the 112 hits represented a biased, enriched population, the viability results were normalized to the median of Luc siRNA, and log ratio values were calculated in the same manner explained above for the primary screen. In the confirmation assay, p-values were also calculated for individual siRNAs (n = 3 for each siRNA) and used to draw a volcano plot (Fig. 5A). Using the volcano plot, we determined the final hits to have p-values less than 0.05 and the log ratios less than the median (–0.400) of the medians of Luc and BARD1 siRNAs (–0.001 and –0.792, respectively). These cutoffs returned 30 final hits, 18 from the group common to both 3 × SD and 3 × MAD-based selections and 12 from the selections based on 3 × MAD only (Fig. 5B). The latter 12 hits represent those that could have been missed from the initial hit selection if only the 3 × SD method was used. When the viability profiles of these 12 siRNAs were carefully examined, they showed similar degrees of sensitization with MMC as compared with the 18 siRNAs commonly selected by both the 3 × SD and 3 × MAD methods (Fig. 5C).

If these 30 siRNAs, representing 27 unique genes, were properly selected to show chemosensitization toward MMC, these genes would be enriched in relevant physiological pathways. To validate this idea, these genes were analyzed for pathway enrichment in GO Biological Process terms. From the list of the pathways, those with more than 3 genes and an expectation value less than 0.05 are shown in Figure 6. It is clear that the genes targeted by MMC-sensitizing siRNAs are highly enriched in DNA damage response or cell cycle regulation.
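The plate-wise hit selection compared in this section can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' analysis code: the function names (`scaled_mad`, `log2_ratio`, `select_hits`) and the toy plate are hypothetical, both cutoffs are measured from the plate sample median as in Fig. 4A, and only the sensitizing (negative) direction is scored.

```python
import math
import statistics

MAD_SCALE = 1.4826  # makes MAD comparable to SD for normally distributed data


def scaled_mad(values):
    """MAD = 1.4826 * median(|x - median(x)|) over the sample wells of a plate."""
    center = statistics.median(values)
    return MAD_SCALE * statistics.median([abs(v - center) for v in values])


def log2_ratio(viability_plus_mmc, viability_minus_mmc):
    """log2 of viability with MMC over viability without MMC.

    Both inputs are assumed to be already normalized to their plate sample median.
    """
    return math.log2(viability_plus_mmc / viability_minus_mmc)


def select_hits(log_ratios, k=3.0, method="mad"):
    """Return indices of wells at or below (plate median - k * spread)."""
    center = statistics.median(log_ratios)
    if method == "mad":
        spread = scaled_mad(log_ratios)
    elif method == "sd":
        spread = statistics.stdev(log_ratios)
    else:
        raise ValueError("method must be 'mad' or 'sd'")
    cutoff = center - k * spread
    return [i for i, v in enumerate(log_ratios) if v <= cutoff]


# Hypothetical toy plate: 28 unremarkable wells, one moderate sensitizer (-1.2),
# and two strong outliers (-4.0, -4.5) that inflate the SD.
plate = [-0.3, -0.2, -0.1, 0.0, 0.1, 0.2, 0.3] * 4 + [-1.2, -4.0, -4.5]

mad_hits = select_hits(plate, method="mad")  # [28, 29, 30]
sd_hits = select_hits(plate, method="sd")    # [29, 30]
print("MAD-based hits:", mad_hits)
print("SD-based hits: ", sd_hits)
```

On this toy plate, the two strong outliers inflate the SD to about 1.09, so the 3 × SD cutoff falls near –3.3 and the moderate well at –1.2 is missed. The scaled MAD stays near 0.30, which puts the 3 × MAD cutoff near –0.89 and retains that well as a preliminary hit.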

FIG. 4. Comparison of hit selection methods. (A) The screening results were represented by the log ratios of the viability under +mitomycin C (MMC) over the viability under –MMC. The small interfering RNAs (siRNAs) showing greater than or equal to 3 × the median absolute deviation (MAD; top panel) or standard deviation (SD; bottom panel) from the plate sample median were selected as preliminary hits (marked in solid red circles). (B) Venn diagram of MAD- and SD-based hit selections. One hundred twelve (112) hits derived from the MAD-based method include all 61 hits derived from the SD-based method as well as 51 hits unique to the MAD-based method. (C) Box-and-whisker plots for MAD and SD shared hits, MAD-only hits, and no hits. Blue arrowheads indicate the median and black arrowheads the mean. The median values for MAD and SD shared hits, MAD-only hits, and no hits were –1.67, –1.33, and 0.01, respectively.

FIG. 5. Confirmation screen results. The 112 preliminary hits were retested in triplicate under the same conditions as in the primary screen. (A) A volcano plot was derived from log2 ratios of viability and –log10 p-values (Student t-test). The confirmation cutoff was determined at the equal distance from negative (solid black circles) and positive (solid blue circles) control small interfering RNAs (siRNAs). The sample siRNAs are indicated in open gray circles and the 30 confirmed hits in solid red circles. Two hits with very low p-values (between 10^–13 and 10^–25) are not seen in this illustration. (B) Eighteen of the 61 shared hits (29.5%) and 12 of 51 median absolute deviation (MAD)–only hits (23.5%) were confirmed to show significant sensitization toward mitomycin C (MMC). (C) Viability profiles of the confirmed hits, tested under 0-, 20-, and 60-nM MMC concentrations and normalized to the median of Luc siRNA under each condition as 100%.

FIG. 6. Enrichment of the confirmed hits in the pathways related to DNA damage response and cell cycle regulation. The 30 confirmed hits were analyzed for pathway enrichment using the GO database (Biological Process). Pathways with expectation values (e-values) smaller than 0.05 and with the gene set overlap count of 3 or greater were shown.
[GO Biological Process terms plotted against –log10 (expectation): response to DNA damage stimulus; DNA repair; response to endogenous stimulus; response to radiation; DNA damage response, signal transduction; cell cycle checkpoint; cell cycle; double-strand break repair.]

Simulation study for comparing the MAD-based method and SD-based method

In this report, we have provided experimental evidence that the MAD-based hit selection method is resistant to outliers and rescues potential false negatives from the SD-based method. To assess whether such advantages are the pervasive features of the MAD-based method, we simulated various RNAi HTS results modeled after real HTS data and performed a statistical study comparing MAD- and SD-based hit selection methods (Fig. 7). Representatives of the simulated data are plotted along the plate run order for the different HTS scenarios (Fig. 7A). The observations for the true hits were generated from 3 different normal distributions with different shifted means (moderate, noisy, and skewed) as well as from a gamma distribution (gamma) for the cases in which the normality assumption is violated. The observations for the nonhits were generated from the normal distribution in all scenarios. FNRs were derived from 500 simulations in each scenario at some common target error control (Fig. 7B). Under normality, the common 3 × MAD- or 3 × SD-based method corresponds to the target error of 0.00135, whereas the 2 × MAD- or 2 × SD-based method corresponds to 0.023 in 1-sided tests. In all 4 scenarios, the FNRs for MAD-based methods were significantly lower than for SD-based methods, even when the underlying data were noisy. In cases in which the true hits were relatively well separated from the nonhits, the 3 × MAD-based method was far superior to the 3 × SD-based method. The higher FNRs for the SD-based method were compensated for by the lower FPRs, a well-known tradeoff between type 1 (theoretical FPR) and type 2 (1 – power) errors. Despite having a lower FPR, the empirical FPR for the SD-based method still deviated from the diagonal line (Fig. 7C). A method that targets the type 1 error perfectly would lie on the diagonal line. In all 4 scenarios, the empirical FPR for the MAD-based method (pink line) was very close to the target FPR. However, the SD-based method (blue line) had a tendency to underestimate the FPR at lower target error and overestimate it at higher target error. We also calculated the area under the receiver-operator characteristics curves (AUROCs). Both methods gave similar AUROCs. The simulation results illustrate that the SD-based method is very sensitive to outliers (i.e., the existence of true hits) and tends to miss true hits, resulting in a higher proportion of false negatives. The MAD-based method is robust and able to reduce FNRs in most cases, which is highly desirable in large-scale, nonreplicate screens. In addition, as shown in the simulation, the empirical FPR for the MAD-based method is very close to the target FPR, giving a more reliable set of hits at a given target error control.

DISCUSSION

We have performed a genome-scale RNAi screen to identify chemosensitizing drug targets for MMC using cell viability as the assay readout. Dynamic ranges for viability assays are usually small, as was the case with our study, in which more than 95% of all assay results fell within 0% to 200% whether the Luc siRNA or sample median was used (Fig. 2). In addition, inherent cellular heterogeneity associated with siRNA transfection and the response to MMC confounded the small dynamic range with rather large data variation. With this kind of assay result from a nonreplicate HTS, it is challenging to select reproducible hits. Therefore, it is desirable to relax hit selection criteria and increase the number of preliminary hits to a certain degree because a missed hit will never be retested in follow-up investigations. A clear task here is how to increase the number of preliminary hits without excessively increasing the number of false positives, which became our goal in data analysis and hit selection.

Having this task in mind, when we compared data normalization methods between the one normalizing to the median of Luc siRNAs and the other to the median of plate sample population, it became clear that the degree of data variations was too severe with Luc siRNA-based normalization (Fig. 2). These variations were most likely due to experimental artifacts associated with the small number of Luc siRNAs (n = 8/plate). Furthermore, some of the same siRNA plates resulted in high overall viability in one screen and low overall viability in another, indicating that these plates were not particularly biased or enriched to favor assay results in one direction or the other (N. Chung, S. Bartz, unpublished results). Therefore, it was a natural choice to normalize the data to the median of plate sample populations. As a result, the data variations clearly became smaller, as demonstrated by a more than 3-fold reduction in UAV values.

The 2nd measure to reduce false negatives was to employ a hit selection method based on the multiples of MAD instead of SD. In HTS, there is a good chance that a few outliers are present in any given plate, whether they represent technical outliers or true hits. In a sense, the purpose of HTS hit selection is to identify outliers. When outliers are present, the SD of the plate sample population is inflated by those outliers, and the hit cutoff becomes more stringent for other samples in the affected plate. On the other hand, MAD is largely based on sample ranks and thus less affected by the presence of a few strong outliers, as shown in our simulation study. Therefore, MAD and SD values are close to each other when the data are nearly normally distributed without outliers and distant from each other when outliers are present. To further


FIG. 7. Simulation study for comparing median absolute deviation (MAD)– and standard deviation (SD)–based hit selection methods. (A) Simulated RNA interference (RNAi) high-throughput screening (HTS) data are plotted by plate run order. The hits are derived from normal distribution in moderate, noisy, and skewed data as well as from gamma distribution in gamma data. True hits are indicated by solid pink and gray circles and nonhits by solid green circles. Cutoff values (1-sided) for 3 × MAD- and 3 × SD-based hit selections are indicated by red and blue dashes, respectively. (B) False-negative rates at various target errors (type 1 error control) are illustrated in box plots. MAD- and SD-based methods are represented by pink and blue box plots, respectively. (C) Empirical/realized false-positive rates (FPRs) versus target/theoretical FPRs. MAD- and SD-based methods are represented by pink and blue lines, respectively.
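The kind of comparison summarized in Fig. 7 can be illustrated with a minimal Python sketch. It is not the authors' simulation code. The plate size (384 wells), the hit fraction (5%), the viability distributions (nonhits near 100%, hits near 40%), and the 1.4826 consistency factor applied to the MAD are all assumptions chosen for illustration. The function `target_error` reproduces the 1-sided normal tail probabilities quoted in the text (about 0.00135 at 3 × and about 0.023 at 2 ×), and `simulate` estimates false-negative and false-positive rates for cutoffs placed k × MAD or k × SD below the plate median.

```python
import math
import random
import statistics

# Assumed consistency factor: the scaled MAD estimates the SD for normal data.
MAD_SCALE = 1.4826


def target_error(k):
    """One-sided normal tail probability beyond k standard deviations."""
    return 0.5 * math.erfc(k / math.sqrt(2.0))


def lower_cutoffs(values, k=3.0):
    """Return (MAD-based, SD-based) cutoffs placed k units below the plate median."""
    med = statistics.median(values)
    mad = MAD_SCALE * statistics.median(abs(v - med) for v in values)
    sd = statistics.stdev(values)
    return med - k * mad, med - k * sd


def simulate(n_plates=200, wells=384, hit_fraction=0.05, k=3.0, seed=1):
    """Estimate (FNR, FPR) per method over simulated plates (hypothetical settings)."""
    rng = random.Random(seed)
    n_hits = round(wells * hit_fraction)
    counts = {"mad": [0, 0], "sd": [0, 0]}  # [missed true hits, nonhits called as hits]
    for _ in range(n_plates):
        # Hypothetical % viability: sensitizing hits are shifted downward.
        hits = [rng.gauss(40.0, 10.0) for _ in range(n_hits)]
        nonhits = [rng.gauss(100.0, 15.0) for _ in range(wells - n_hits)]
        mad_cut, sd_cut = lower_cutoffs(hits + nonhits, k)
        for name, cut in (("mad", mad_cut), ("sd", sd_cut)):
            counts[name][0] += sum(v >= cut for v in hits)    # false negatives
            counts[name][1] += sum(v < cut for v in nonhits)  # false positives
    total_hits = n_plates * n_hits
    total_nonhits = n_plates * (wells - n_hits)
    return {name: (fn / total_hits, fp / total_nonhits)
            for name, (fn, fp) in counts.items()}


if __name__ == "__main__":
    print("target error at 3x and 2x:", target_error(3), target_error(2))
    for name, (fnr, fpr) in simulate().items():
        print(name.upper(), "FNR:", fnr, "FPR:", fpr)
```

Under these assumed settings the true hits inflate the plate SD, so the SD-based cutoff sits farther from the median than the MAD-based one and misses more true hits, at the price of a somewhat higher false-positive rate for the MAD-based cutoff.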

appreciate the benefits of MAD-based hit selection, we compared it to the SD-based (in other words, z-score) method by extending the cutoff line 3 times either the SD or MAD from the plate sample median in the siRNA screen. The MAD-based method returned 51 more hits in addition to the 61 hits shared by the 2 methods. These 51 additional hits were weaker than the 61 shared hits in sensitizing MMC, but if some strong sensitizers from the shared hits were removed from the comparison, both types of hits showed a degree of MMC sensitization very similar to each other (Figs. 4C and 5C). Overall, 12 of 51 MAD-only hits (23.5%) were validated to have significant sensitizing effects, compared with 18 of 61 shared hits (29.5%). Combined, these 30 siRNAs showed a strong enrichment in DNA damage repair and cell cycle regulation (Fig. 6), with the pathways highly relevant to MMC sensitization, which validates the overall quality of hit selection.

In this report, we have shown that RNAi screens measuring endogenous signals produce data with large variability, due to both measurement and biology, and then demonstrated that such noisy data can be properly analyzed to select hits using the MAD-based method that rescued potential false negatives that would have been declared as nonhits by the popular SD-based method. We believe that the MAD-based method is a desirable hit selection method for HTS, in which outliers commonly exist.

ACKNOWLEDGMENT

We are grateful to the members of Automated Biotechnology, Biometrics, and Rosetta Inpharmatics for their support and encouragement.

REFERENCES

1. Fire A, Xu S, Montgomery MK, Kostas SA, Driver SE, Mello CC: Potent and specific genetic interference by double-stranded RNA in Caenorhabditis elegans. Nature 1998;391:806-811.
2. Elbashir SM, Harborth J, Lendeckel W, Yalcin A, Weber K, Tuschl T: Duplexes of 21-nucleotide RNAs mediate RNA interference in cultured mammalian cells. Nature 2001;411:494-498.
3. Paddison PJ, Silva JM, Conklin DS, Schlabach M, Li M, Aruleba S, et al: A resource for large-scale RNA-interference-based screens in mammals. Nature 2004;428:427-431.
4. Hannon GJ, Rossi JJ: Unlocking the potential of the human genome with RNA interference. Nature 2004;431:371-378.
5. Xin H, Bernal A, Amato FA, Pinhasov A, Kauffman J, Brenneman DE, et al: High-throughput siRNA-based functional target validation. J Biomol Screen 2004;9:286-293.
6. Poulin G, Nandakumar R, Ahringer J: Genome-wide RNAi screens in Caenorhabditis elegans: impact on cancer research. Oncogene 2004;23:8340-8345.
7. Kim JK, Gabel HW, Kamath RS, Tewari M, Pasquinelli A, Rual JF, et al: Functional genomic analysis of RNA interference in C. elegans. Science 2005;308:1164-1167.
8. Bernards R, Brummelkamp TR, Beijersbergen RL: shRNA libraries and their use in cancer genetics. Nat Methods 2006;3:701-706.
9. Perrimon N, Mathey-Prevot B: Applications of high-throughput RNA interference screens to problems in cell and developmental biology. Genetics 2007;175:7-16.
10. Goshima G, Wollman R, Goodwin SS, Zhang N, Scholey JM, Vale RD, et al: Genes required for mitotic spindle assembly in Drosophila S2 cells. Science 2007;316:417-421.
11. Whitehurst AW, Bodemann BO, Cardenas J, Ferguson D, Girard L, Peyton M, et al: Synthetic lethal screen identification of chemosensitizer loci in cancer cells. Nature 2007;446:815-819.
12. Chung N, Locco L, Huff K, Bartz S, Linsley PS, Ferrer M, et al: An efficient and fully automated high throughput transfection method for genome-scale siRNA screens. Unpublished manuscript.
13. Bartz SR, Zhang Z, Burchard J, Imakura M, Martin M, Palmieri A, et al: Small interfering RNA screens reveal enhanced cisplatin cytotoxicity in tumor cells having both BRCA network and TP53 disruptions. Mol Cell Biol 2006;26:9377-9386.
14. Majercak J, Ray WJ, Espeseth A, Simon A, Shi XP, Wolffe C, et al: LRRTM3 promotes processing of amyloid-precursor protein by BACE1 and is a positional candidate gene for late-onset Alzheimer’s disease. Proc Natl Acad Sci USA 2006;103:17967-17972.
15. Stone DJ, Marine S, Majercak J, Ray WJ, Espeseth A, Simon A, et al: High-throughput screening by RNA interference: control of two distinct types of . Cell Cycle 2007;6:898-901.
16. Malo N, Hanley JA, Cerquozzi S, Pelletier J, Nadon R: Statistical practice in high-throughput screening data analysis. Nat Biotechnol 2006;24:167-175.
17. Zhang JH, Chung TD, Oldenburg KR: A simple statistical parameter for use in evaluation and validation of high throughput screening assays. J Biomol Screen 1999;4:67-73.
18. Zhang JH, Chung TD, Oldenburg KR: Confirmation of primary active substances from high throughput screening of chemical and biological populations: a statistical approach and practical considerations. J Comb Chem 2000;2:258-265.
19. Brideau C, Gunter B, Pikounis B, Liaw A: Improved statistical methods for hit selection in high-throughput screening. J Biomol Screen 2003;8:634-647.
20. Gunter B, Brideau C, Pikounis B, Liaw A: Statistical and graphical methods for quality control determination of high-throughput screening data. J Biomol Screen 2003;8:624-633.
21. Zhang JH, Wu X, Sills MA: Probing the primary screening efficiency by multiple replicate testing: a quantitative analysis of hit confirmation and false screening results of a biochemical assay. J Biomol Screen 2005;10:695-704.
22. Zhang XD, Yang XC, Chung N, Gates A, Stec E, Kunapuli P, et al: Robust statistical methods for hit selection in RNA interference high-throughput screening experiments. Pharmacogenomics 2006;7:299-309.


23. Zhang XD, Ferrer M, Espeseth AS, Marine SD, Stec EM, Crackower MA, et al: The use of strictly standardized mean difference for hit selection in primary RNA interference high-throughput screening experiments. J Biomol Screen 2007;12:497-509.
24. Zhang XD: A new method with flexible and balanced controls of false negatives and false positives for hit selection in RNA interference high throughput screening assays. J Biomol Screen 2007;12:497-509.
25. Tukey JW: Exploratory Data Analysis. Reading, MA: Addison-Wesley, 1977.
26. Jackson AL, Bartz SR, Schelter J, Kobayashi SV, Burchard J, Mao M, et al: Expression profiling reveals off-target gene regulation by RNAi. Nat Biotechnol 2003;21:635-637.
27. Hopkins AL, Groom CR: The druggable genome. Nat Rev Drug Discov 2002;1:727-730.
28. Choudhury AD, Xu H, Modi AP, Zhang W, Ludwig T, Baer R: Hyperphosphorylation of the BARD1 tumor suppressor in mitotic cells. J Biol Chem 2005;280:24669-24679.

Address correspondence to:
Namjin Chung
Department of Applied Genomics
Bristol-Myers Squibb Company
PO Box 5400, Mail Stop 3-1.22
Princeton, NJ 08543-5400
Email: [email protected]

Marc Ferrer
Merck Research Laboratories, Merck & Co., Inc.
502 Louise Lane
North Wales, PA 19454
Email: [email protected]
