A panoramic view of gene expression in the human kidney

Danielle Chabardès-Garonne*†, Arnaud Méjean‡, Jean-Christophe Aude*, Lydie Cheval†, Antonio Di Stefano†, Marie-Claude Gaillard*, Martine Imbert-Teboul†, Monika Wittner†, Chanth Balian‡, Véronique Anthouard§, Catherine Robert§, Béatrice Ségurens§, Patrick Wincker§, Jean Weissenbach§, Alain Doucet†, and Jean-Marc Elalouf*†¶

*Département de Biologie Joliot-Curie, Service de Biochimie et de Génétique Moléculaire, Commissariat à l’Energie Atomique Saclay, 91191 Gif-sur-Yvette Cedex, France; †Centre National de la Recherche Scientifique Unité de Recherche Associée 1859, Commissariat à l’Energie Atomique Saclay, 91191 Gif-sur-Yvette Cedex, France; ‡Service d’Urologie, Hôpital Necker, 149 Rue de Sèvres, 75015 Paris, France; and §Génoscope, Centre National de Séquençage, 2 Rue Gaston Crémieux, 91006 Evry Cedex, France

Edited by Bert Vogelstein, The Sidney Kimmel Comprehensive Cancer Center at Johns Hopkins, Baltimore, MD, and approved September 10, 2003 (received for review July 22, 2003)

To gain a molecular understanding of kidney functions, we established a high-resolution map of gene expression patterns in the human kidney. The glomerulus and seven different nephron segments were isolated by microdissection from fresh tissue specimens, and their transcriptome was characterized by using the serial analysis of gene expression (SAGE) method. More than 400,000 mRNA SAGE tags were sequenced, making it possible to detect in each structure transcripts present at 18 copies per cell with a 95% confidence level. Expression of genes responsible for nephron transport and permeability properties was evidenced through transcripts for 119 solute carriers, 84 channels, 43 ion-transport ATPases, and 12 claudins. Searching for differences between the transcriptomes, we found 998 transcripts greatly varying in abundance from one nephron portion to another. Clustering analysis of these transcripts evidenced different extents of similarity between the nephron portions. Approximately 75% of the differentially distributed transcripts corresponded to cDNAs of known or unknown function that are accurately mapped in the human genome. This systematic large-scale analysis of individual structures of a complex human tissue reveals sets of genes underlying the function of well-defined nephron portions. It also provides quantitative expression data for a variety of genes mutated in hereditary diseases and helps in sorting candidate genes for renal diseases that affect specific portions of the human nephron.

The human kidney consists of one million nephrons functioning in parallel to ensure efficient body fluid homeostasis. This vital function rests on sequential blood filtration by the glomerulus and specific transport processes accomplished by the successive nephron segments. Because of this axial functional segmentation of the nephron, studies carried out at the whole-kidney level cannot define sites and mechanisms of physiological processes. Physiological and biochemical methods that allow study of well-delineated nephron portions, including human ones (1, 2), were set up long ago. More recently, molecular methods completed these approaches through the cloning and characterization of the expression pattern along the nephron of a number of genes essential for a variety of kidney functions. Besides providing decisive progress for elucidating the mechanisms of physiological processes, the genetic and molecular strategies led to the identification of genes mutated in inherited kidney or kidney-dependent diseases. Strikingly, several such genes are expressed in discrete nephron portions (3).

Despite this progress, a remaining challenge is obtaining a complete overview of the genes expressed in the different nephron portions, especially in humans. With the availability of the human genome sequence (4, 5), the first level of genetic complexity has now been deciphered. The second level of complexity, usually referred to as the transcriptome (6), can be studied by using a variety of techniques. Large-scale EST (7) and cDNA (8) sequencing projects provide a framework for transcriptome analysis, but they are not quantitative enough to make possible the accurate comparison of gene expression patterns in different cell populations. In addition, when carried out at the level of single nephron portions, they provided only a partial picture through the analysis of ≈1,000 transcripts (9). Hybridization to arrayed cDNAs or oligonucleotides offers the opportunity of studying more transcripts and comparing their expression levels in different tissue samples and has already enabled the analysis of thousands of mRNAs in large human kidney zones (10) and cultured renal cells (11) but has not permitted up to now a systematic survey of the different human nephron portions. The method of serial analysis of gene expression (SAGE) (6) is potentially the most exhaustive one for characterizing transcriptomes, because it measures the expression of both known and unknown genes. SAGE relies on sequencing short diagnostic 10-bp tags recovered in a cDNA library proportionally to their abundance in the native tissue sample. Although SAGE initially required large amounts of tissue, a microassay compatible with the analysis of microdissected nephron segments has been set up. It was previously used to analyze 15,000 mRNA tags retrieved from two nephron portions of the mouse kidney (12). We now report on the analysis of >400,000 mRNA tags isolated from most portions of the human nephron.

Methods

Kidney Microdissection. The Necker Hospital ethical committee approved our study, and we obtained informed consent from all patients. The nine donors [seven males and two females; age (years): 59 ± 10 (SD)] were all devoid of AIDS and hepatitis B and C viral infection and had undergone surgery for removal of kidney tumors. After nephrectomy, a healthy kidney fragment was perfused via a branch of the renal artery with EuroCollins (Fresenius Kabi, Sèvres, France), immersed in ice-cold EuroCollins, and transferred to Saclay within 30 min. On arrival in the laboratory, the kidney fragment was perfused with 20 ml of Hanks’ modified microdissection solution (13) supplemented with 0.24% wt/vol collagenase (Serva) and 0.05% lissamine green (2). The perfused zone of the kidney, as judged by the presence of the dye, was excised, cut into small pyramids, and incubated 45–60 min at 35°C in gassed microdissection solution containing 0.12% wt/vol collagenase. The pyramids were then thoroughly rinsed and transferred into Petri dishes for microdissection, carried out at ice-cold temperature under stereomicroscopic observation by using anatomical and morphological criteria (2). Once isolated, the structures were transferred by

This paper was submitted directly (Track II) to the PNAS office.

Abbreviations: SAGE, serial analysis of gene expression; DCT, distal convoluted tubule.

Data deposition: SAGE data for the libraries described here are available at GEO (www.ncbi.nlm.nih.gov/geo) (accession nos. GSM10419 and GSM10423–GSM10429).

¶To whom correspondence should be addressed. E-mail: [email protected].

© 2003 by The National Academy of Sciences of the USA
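The detection limit quoted in the abstract can be checked with a short calculation. The sketch below assumes that tags are sampled independently, so that the chance of seeing a transcript at least once among N tags is 1 − (1 − p)^N, and it uses the figures given in Results (about 50,000 tags per library and a transcript making up 0.006% of the mRNA population). The conversion to copies per cell assumes roughly 300,000 mRNA molecules per cell, which is simply the value implied by equating 0.006% with 18 copies; the function names are illustrative and not taken from the paper.

```python
def detection_probability(fraction: float, n_tags: int) -> float:
    """Probability of observing a transcript at least once among n_tags,
    assuming each tag is drawn independently and `fraction` is the share
    of the mRNA population contributed by that transcript."""
    return 1.0 - (1.0 - fraction) ** n_tags


def copies_per_cell(fraction: float, mrna_per_cell: int = 300_000) -> float:
    """Convert an abundance fraction to copies per cell.
    The default of 300,000 mRNAs per cell is an assumption inferred from
    the paper's own equivalence (0.006% <-> 18 copies)."""
    return fraction * mrna_per_cell


if __name__ == "__main__":
    p = detection_probability(0.00006, 50_000)   # 0.006% abundance, 50,000 tags
    print(f"{p:.3f}")                            # 0.950
    print(round(copies_per_cell(0.00006)))       # 18
```

With these assumptions, 50,000 tags give a 95% chance of detecting such a transcript, in line with the stated confidence level.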

13710–13715 | PNAS | November 11, 2003 | vol. 100 | no. 23 www.pnas.org/cgi/doi/10.1073/pnas.2234604100

pipetting into another Petri dish and dragged individually for counting, evaluating tubular length, and removing any residual debris.

Generation of SAGE Libraries. Pools of identical structures from each kidney were transferred into 400 µl of microdissection solution and centrifuged for 5 min at 1,000 × g. The cell pellet was dissolved in 100 µl of lysis-binding buffer (Dynal, Oslo) containing 20 µg of glycogen (Roche, Basel), and stored at −80°C. SAGE libraries were generated by using 1,000 glomeruli or 300–600 mm of tubular segments (10^5 to 2 × 10^5 cells) isolated from three to nine kidneys. Libraries were obtained as described (12, 14) by using Sau3A I as anchoring enzyme.

DNA Sequences Analysis. Génoscope (Evry, France) performed plasmid minipreparations and automatic DNA sequencing. Sequence files were analyzed with the help of SADELAB (CEA Saclay, France), a web-based integrated platform dedicated to the management of SAGE projects. Ditags consisting of nucleotides that all displayed a PHRED score >16 or <10 were automatically accepted or rejected, respectively. Ditags of intermediate quality were individually checked and eventually accepted after corrections for erroneous base calling. With single-pass sequencing, a 1% error rate is routinely obtained, which translates to a SAGE tag error rate of 9.56% (1 − 0.99^10) (15). Because sequencing errors are essentially random, they do not substantially affect tag abundance but likely inflate the number of different transcripts detected. Tag extraction, counting, and library comparisons were performed by using SADELAB. Tags for linker-derived sequences were discarded, and those originating from duplicate ditags were counted only once (6).

SAGE Data Analysis. Mitochondrial SAGE tags were inferred from the sequence of Anderson et al. (16) and from the database for human mitochondrial genome polymorphisms (www.genpat.uu.se/mtDB). For all other tags, identification and chromosomal mapping were initiated on UniGene clusters by using SAGEmap (www.ncbi.nlm.nih.gov/SAGE) and carried out until March 31, 2003. By using our SAGE protocol, each polyadenylated transcript is expected to be detected through a single tag adjacent to the most 3′ Sau3A I site of the corresponding cDNA. The reliability of the identification procedure thus critically depends on the possibility of assessing that canonical tags are obtained. Therefore, UniGene clusters referred to as reliable matches in SAGEmap were all explored for the location and correct orientation of the tag, as well as the presence of a polyadenylation signal and/or poly(A) tail in the most 3′ sequence. Reliably matched sequences belonging to different clusters were aligned to assess the accuracy of the clustering process, allowing in a few instances resolution of two clusters in a single one. When a tag reliably matched sequences belonging to two unrelated clusters, both entries were recorded. When more than two reliable matches were obtained, the ambiguous tag to gene mapping was referred to as multiple matches (see Results). When no entry or no reliable entry was obtained in SAGEmap, additional resources were used. First, we considered the consensus sequence of the TIGR database (www.tigr.org), which in some instances allowed extending the cDNA sequence up to the poly(A) tail. When this procedure was inefficient, a BLAST search (www.ncbi.nlm.nih.gov/BLAST) was performed on human ESTs recorded in GenBank, and the retrieved sequences were analyzed as described above to check for matching accuracy. Such reliable entries were recorded as EST matches or attributed a gene name when mapping information on the human genome indicated overlapping with a known gene. Tags matching ESTs in the reverse orientation or ESTs that could not be ascertained as 3′ sequences, as well as tags matching only the human genome, were considered not reliably matched. Tags without reliable match were discarded from the comparative analysis when they did not exceed by a factor of five the expected abundance due to a sequencing error in a more abundant tag. As reported (12, 15), the expression of several genes was detected through more than one tag in a pattern consistent with alternative mRNA splicing or polyadenylation. In the absence of systematic information on the different transcripts generated from each human gene, the different tags for a single gene were numbered alphabetically according to their abundance in the libraries of the present study.

Gene to Tag Mapping. For analyzing the expression of genes belonging to the same family (e.g., solute carriers, channels, or claudins), the web site of the Human Genome Organisation nomenclature committee (www.gene.ucl.ac.uk/nomenclature) was used to search for all registered members. Then, mRNA sequences recorded in the corresponding UniGene clusters (www.ncbi.nlm.nih.gov/UniGene) were analyzed, and the most 3′ tag was validated by its presence either in the reviewed RefSeq record or in sequences of the UniGene or TIGR clusters. When present in our SAGE nephron database, the tag was further controlled for correct gene identification and possible match to several UniGene clusters by using SAGEmap. The same kind of analysis was performed to analyze the expression of disease genes selected from the Online Mendelian Inheritance in Man database (www.ncbi.nlm.nih.gov/Omim).

Quantitative RT-PCR. Pools of glomeruli or isolated segments were transferred onto a concave bacteriological glass slide and photographed for counting or tubular length measurement, respectively (13). Total RNAs were extracted as described (13), and cDNA synthesis was primed by using oligo(dT)12–18. Duplicate aliquots corresponding to cDNA amounts generated from half a glomerulus or 0.5 mm of tubular length were analyzed by real-time quantitative PCR. Amplification was performed in an ABI Prism 7000 SDS by using SYBR Green PCR master mix, according to recommendations of the manufacturer (Applera, Foster City, CA), and 300 nM each primer. Aliquots of the same cDNA sample were used to study the expression of the different targets analyzed. The amplification rate of each target, evaluated from experiments carried out on whole-kidney cDNAs, was used to calculate expression differences from one tissue sample to another. Peptidylprolyl isomerase A (PPIA), which was found to be nearly similarly expressed in all structures [tag counts for 50,000 tags: glomerulus, 8; proximal convoluted tubule, 12; proximal straight tubule, 9; medullary thick ascending limb of Henle’s loop, 17; cortical thick ascending limb of Henle’s loop, 13], was used for normalization. The primers used for quantitative RT-PCR are available from the authors on request.

Results

Healthy parts of human kidneys were obtained from donors undergoing tumorectomy. We isolated by microdissection eight nephron portions (Fig. 1), and these native tissue samples were processed for the generation of SAGE libraries.

We sequenced ≈50,000 tags from each library (Table 3, which is published as supporting information on the PNAS web site). With 50,000 mRNA tags, a 95% confidence level is obtained for detecting transcripts present at 0.006% of the total RNA mass (i.e., ≥18 copies per cell) (14). The relevance of the libraries is supported by the expression pattern of several genes. For example, known markers for glomerular epithelial cells (PODXL) (17), proximal tubules (AQP1) (18), thick ascending limbs (UMOD) (19), distal convoluted tubules (SLC12A3) (20), and collecting ducts (AQP3) (20, 21) were evidenced in the appropriate libraries and in most cases undetected in all others (Fig. 2). Uromodulin expression was detected in the distal convoluted tubule (DCT) but reached a considerably lower level than in either the cortical [cortical thick ascending limb of

Henle’s loop (CTAL)] or the medullary (medullary thick ascending limb of Henle’s loop) portion of the thick ascending limb of Henle’s loop. This expression pattern is consistent with the observation that the initial DCT portion consists of CTAL-like cells (22). On the other hand, the absence of AQP3 tags in the DCT library demonstrates that DCTs were microdissected free of connecting tubule pieces (20).

Comparative analysis of libraries was performed to identify sets of genes that support the specific functions of the successive nephron portions. Stringent criteria (P < 0.01 and ≥7-fold difference) were used for assessing statistically significant differences. Indeed, Monte-Carlo simulations indicate that the 0.01 level of confidence requires a ≥7-fold difference for low-abundance tags (15). To further delineate specific expression patterns despite functional kinships between adjacent nephron portions, differences obtained with at least three libraries were considered. The comparative analysis revealed differential abundances for 998 tags (Fig. 3, and Table 4, which is published as supporting information on the PNAS web site). As shown in Fig. 3, they do not equally partition among the structures. Clustering analysis disclosed various extents of similarity between nephron portions. Proximal convoluted tubule and proximal straight tubule emerge as the closest related structures, being both enriched for 85 of the differentially distributed tags. The two collecting duct portions (cortical collecting duct and outer medullary collecting duct) also share a number of specific tags (n = 31). Strikingly, the cortical thick ascending limb of Henle’s loop, medullary thick ascending limb of Henle’s loop, and DCT are very close to each other. These three structures are more closely related to the collecting duct than to the proximal tubule. Finally, the more distinctive pattern is obtained for the glomerulus, which connects to the clustered tubular structures rather than to a particular nephron segment. The specific gene-expression signature of the glomerulus is consistent with its special cell constitution (22) and is highlighted further by the number of tags detected only in the glomerulus library [n = 34, whereas nephron segments display only one to two tags specific for a single structure (Table 4)]. The paucity of mitochondrial transcripts, which goes along with the absence of active ion

Fig. 1. A microdissected human nephron. Blue boxes indicate the eight structures analyzed in the present study. Solid and dotted arcs indicate the kidney surface and limits of kidney zones, respectively. Glom, glomerulus; PCT, proximal convoluted tubule; PST, proximal straight tubule; DTL, descending thin limb; ATL, ascending thin limb; MTAL, medullary thick ascending limb of Henle’s loop; CTAL, cortical thick ascending limb of Henle’s loop; CNT, connecting tubule; CCD, cortical collecting duct; OMCD, outer medullary collecting duct; and IMCD, inner medullary collecting duct; DCT, distal convoluted tubule.

Fig. 2. Expression pattern of markers for different nephron portions. Tag abundance indicates mRNA tag counts in libraries normalized to 50,000 tags. PODXL, podocalyxin-like; AQP1, aquaporin 1; UMOD, uromodulin; AQP3, aquaporin 3; SLC12A3, thiazide-sensitive Na-Cl cotransporter.

Fig. 3. Overview of mRNA tag differential distribution. The number of mRNA tags displaying a significant differential distribution (P < 0.01, and ≥7-fold difference as compared with three libraries) is indicated for each library. The sum of all differences (n = 998) corresponds to 773 unique tags because of overlapping between libraries. The y axis indicates the number of differentially distributed tags common to clustered structures.
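The fold-change part of the selection rule stated in the Fig. 3 legend can be illustrated as follows. This is only a sketch: it leaves out the P < 0.01 test, which the authors derive from Monte-Carlo simulations (15), and it assumes that a zero count in the comparison library is replaced by 1 so that a ratio can be formed. The function name and the zero-handling floor are hypothetical; the counts are the NPHS2 and ACTN4 rows of Table 1.

```python
LIBRARIES = ["Glom", "PCT", "PST", "MTAL", "CTAL", "DCT", "CCD", "OMCD"]


def enriched_libraries(counts, min_fold=7.0, min_libraries=3, zero_floor=1.0):
    """Return the libraries in which a tag is at least `min_fold` times more
    abundant than in at least `min_libraries` other libraries.

    `zero_floor` (an assumption, not from the paper) replaces zero counts in
    the comparison library so that the fold change is defined.
    The statistical test used in the study is deliberately not reproduced.
    """
    hits = []
    for i, count in enumerate(counts):
        n_lower = sum(
            1
            for j, other in enumerate(counts)
            if j != i and count >= min_fold * max(other, zero_floor)
        )
        if n_lower >= min_libraries:
            hits.append(LIBRARIES[i])
    return hits


if __name__ == "__main__":
    nphs2 = [68, 0, 0, 0, 0, 0, 0, 0]   # Table 1, tag CCTCACTGAA
    actn4 = [21, 1, 2, 2, 5, 2, 4, 7]   # Table 1, tag ATGGCGGGGC
    print(enriched_libraries(nphs2))    # ['Glom']
    print(enriched_libraries(actn4))    # ['Glom']
```

Under this simplified rule a tag can qualify in one library while still being detectable elsewhere, as the ACTN4 row shows.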

Table 1. Examples of disease genes differentially expressed along the human nephron

Disease (gene) | Mapping | Tag | mRNA tag abundance (Glom PCT PST MTAL CTAL DCT CCD OMCD)
Nephrotic syndrome steroid-resistant (NPHS2, Podocin) | 1q25-q31 | CCTCACTGAA | 68 0 0 0 0 0 0 0
Glomerulosclerosis, focal segmental, 1 (ACTN4) | 19q13 | ATGGCGGGGC | 21 1 2 2 5 2 4 7
Wilms tumor, type 1 (WT1) | 11p13 | TTACAAGATA | 18 0 0 0 0 0 0 0
Hereditary fructose intolerance (ALDOB)* | 9q21.3-q22.2 | AAATTTCACA | 0 288 273 3 6 26 2 7
 | | GTGGTGGGAA | 019461 1 80 1
Renal tubular acidosis, proximal (SLC4A4) | 4q21 | AACATGGTGG | 0 29 10 2 3 1 10 1
Hypophosphatemia (SLC34A1, NPT2) | 5q35 | AGCATTGAGA | 123120 0 30 0
Fructosuria (KHK) | 2p23.3-p23.2 | CGGGTGTCCG | 019121 0 00 0
Dihydropyrimidinuria (DPYS) | 8q22 | TTCATTTTAA | 0163 0 0 30 0
Alkaptonuria (HGD) | 3q21-q23 | GCCAAGTACC | 0 2 14 0 0 1 1 1
Bartter syndrome type 1 (SLC12A1)* | 15q15-q21.1 | TGAGCAATCA | 0 0 1 222 149 29 1 2
 | | TCAATAAATG | 0 0 0 7 4 3 0 0
Bartter syndrome type 3 (CLCNKB) | 1p36 | CTGGTGGGCA | 083514242206
Hypomagnesemia, primary (CLDN16) | 3q29 | ATTGTTCTAT | 0 0 0 6 14 3 0 1
Hypomagnesemia (FXYD2, ATP1G1) | 11q23 | TTCGCTGGAC | 0 50 44 115 83 156 7 7
Diabetes insipidus, nephrogenic (AQP2)* | 12q12-q13 | ACACACACCA | 0 1 33 2 3 3 157 156
 | | GGACCCCTGG | 0 0 4 0 0 0 34 39
Apparent mineralocorticoid excess (HSD11B2) | 16q22 | CCCCAAGTGT | 0 0 4 5 11 31 45 21
Renal tubular acidosis-osteopetrosis syndrome (CA2) | 8q22 | TACCTTGGTG | 04310142016
Liddle syndrome (SCNN1G) | 16p12 | TTCCCACTTC | 0 1 1 0 0 3 8 4

Values indicate mRNA tag abundance for libraries normalized to 50,000 total tags. The gene symbol from the Human Genome Organisation nomenclature, indicated in parentheses, is followed where applicable by a common alternate symbol. See Fig. 1 legend for definitions of abbreviations.
*This gene was detected through more than one tag.
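The normalization mentioned in the table footnote is a simple rescaling of raw counts to a common library size. The sketch below is an illustration rather than the SADELAB implementation: the function name, the raw counts, and the library size are made up for the example, and rounding to the nearest integer is an assumption.

```python
def normalize_tag_counts(raw_counts, library_size, target=50_000):
    """Rescale raw tag counts so that the library totals `target` tags.

    raw_counts   -- mapping of tag sequence (or label) to raw count
    library_size -- total number of tags sequenced in that library
    Rounding to whole tags is an assumption made for readability.
    """
    return {
        tag: round(count * target / library_size)
        for tag, count in raw_counts.items()
    }


if __name__ == "__main__":
    # Hypothetical library of 55,000 tags with two hypothetical tags.
    raw = {"tag_a": 110, "tag_b": 11}
    print(normalize_tag_counts(raw, library_size=55_000))
    # {'tag_a': 100, 'tag_b': 10}
```

Rescaling in this way is what allows counts from libraries of slightly different depth to be read side by side in one row.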

transports, also sets the glomerulus apart from the tubular segments (Fig. 5, which is published as supporting information on the PNAS web site).

The differentially distributed tags consist of 773 unique tags. The majority of them were reliably matched to a single characterized gene (58.7%) or to anonymous cDNAs and ESTs (13.8%) (Table 5, which is published as supporting information on the PNAS web site). Because the data were obtained from a human tissue, it is of special interest to look for the expression of disease genes. We found 75 genes mutated in inherited human diseases to be differentially expressed along the nephron. Examples of such genes are displayed in Table 1, and the complete list is available online (Table 6a, which is published as supporting information on the PNAS web site). Although a number of them are not currently related to kidney diseases, we believe it important to deliver all the information that may help in refining phenotypic analysis. For genes mutated in kidney or kidney-dependent diseases, the expression patterns display variable degrees of tissue specificity that are consistent with the syndromes. For example, NPHS2, which encodes the glomerular podocin and is mutated in a steroid-resistant nephrotic syndrome (23), is indeed expressed only in the glomerulus. By contrast, the Na-K-2Cl cotransporter (SLC12A1), the chloride channel CLCNKB, and hydroxysteroid (11-β) dehydrogenase 2 (HSD11B2), which are all mutated in blood pressure disorders resting on salt reabsorption in the distal nephron (24–26), display significant predominant expression in several structures.

Table 1 also shows that genes detected through more than one tag were repeatedly encountered. For ALDOB, the minor tag and the predominant one are relevant for a short and a long transcript, terminating at a proximal and a distal polyadenylation site, respectively (27). For SLC12A1, such detailed information is not available from the literature, but a similar mechanism may exist because the minor tag, which displays a consensus polyadenylation signal (AATAAA), lies upstream of the major one. Turning to AQP2, the published cDNA sequence corresponds to a 1.6-kb mRNA, but a more abundant transcript was detected at 4.2 kb by Northern hybridization (28). The human genome sequence now makes it feasible to conclude that the Northern and SAGE data are both consistent with alternative transcripts terminating at either the proximal or the distal polyadenylation site of the last AQP2 exon. Additional alternative transcripts may also exist, because four other SAGE tags match the AQP2 sequence (Table 4).

Identifying genes preferentially expressed in well-delineated structures is potentially useful to progress toward the characterization of diseases resting on a specifically located gene expression (29). Examples of renal diseases lacking a molecular characterization are type 2A pseudohypoaldosteronism (PHA2A) and IgA nephropathy. PHA2A is a syndrome of hypertension with hyperkalemia. As reviewed by Lifton et al. (3), all elucidated Mendelian forms of blood pressure disturbance concern genes that control renal salt reabsorption, and several of them are chiefly expressed in the distal nephron. PHA2A has been mapped to 1q31-q42 (30), which contains ≈400 genes, but analyzing this region for genes preferentially expressed in the distal nephron narrowed the search to four candidates (Table 2). The same kind of analysis was carried out for IgA nephropathy, a common form of end-stage renal disease with proliferation of the glomerular mesangium. IgA nephropathy has been linked to 6q22-q23 (31). The syndrome outcome points to the glomerulus as a relevant target, from which we tentatively sorted three candidates.

To check for representative tag sampling and correct gene identification, we further analyzed by quantitative RT-PCR a set of transcripts displaying different expression patterns. We studied three candidates for renal diseases and validated all of them (Fig. 4a). Fig. 4 also shows that the specific distribution of transcripts known only through anonymous cDNAs was corroborated by RT-PCR.

The information gathered in this study goes far beyond the identification of differentially expressed genes. The human nephron gene expression database includes >90,000 tags, among which 10,705 were detected at least five times (Table 7, which is published as supporting information on the PNAS web site). As human gene and cDNA sequences are deciphered with increas-

Table 2. Candidate genes for pseudohypoaldosteronism type 2A (PHA2A) and IgA nephropathy

Disease and candidate genes | Tag | mRNA tag abundance (Glom PCT PST MTAL CTAL DCT CCD OMCD)
PHA2A (1q31-q42)
Hypothetical protein DKFZp761N1114 | CAACTTTTTT | 000 145 3 0 1
BTG family member 2 (BTG2)* | CCTTTGAGAG | 5 1 1 6 2 5 51 39
E74-like factor 3 (ELF3)* | TATTTTTTCT | 0 6 1 3 4 1 13 24
Ladinin 1 (LAD1) | TGATAAACTC | 0 0 1 0 2 0 0 7
IgA nephropathy (6q22-q23)
Connective tissue growth factor (CTGF)* | AGTTTTTTCA | 239 2 0 30 4 12 5 9
Transcription factor 21, podocyte-expressed (TCF21) | ATAGGATAGC | 20 0 0 0 0 0 0 1
Gap junction protein alpha1 43kD (GJA1) | ATGTGTTCTG | 8 0 1 0 0 1 0 2

Values indicate mRNA tag abundance for libraries normalized to 50,000 total tags. See Fig. 1 legend for definitions of abbreviations.
*This gene was detected through more than one tag. Only the most abundant one is indicated.
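One simple way to see why the IgA nephropathy candidates stand out is to compute, for each tag, the share of its normalized counts that falls in the structure of interest. The sketch below does this with the glomerulus as the target and the counts listed in Table 2. This summary statistic is an illustration and not the selection procedure used in the study, which relied on the P < 0.01 and ≥7-fold criteria; the function name is hypothetical.

```python
LIBRARIES = ("Glom", "PCT", "PST", "MTAL", "CTAL", "DCT", "CCD", "OMCD")


def share_in_targets(counts, targets):
    """Fraction of a tag's total normalized count found in `targets`.

    counts  -- eight counts in the order given by LIBRARIES
    targets -- set of library names considered relevant for the disease
    Returns 0.0 for a tag that was never observed.
    """
    total = sum(counts)
    if total == 0:
        return 0.0
    in_targets = sum(c for lib, c in zip(LIBRARIES, counts) if lib in targets)
    return in_targets / total


if __name__ == "__main__":
    # Counts copied from Table 2 (libraries normalized to 50,000 tags).
    iga_candidates = {
        "CTGF": (239, 2, 0, 30, 4, 12, 5, 9),
        "TCF21": (20, 0, 0, 0, 0, 0, 0, 1),
        "GJA1": (8, 0, 1, 0, 0, 1, 0, 2),
    }
    for gene, counts in iga_candidates.items():
        print(gene, f"{share_in_targets(counts, {'Glom'}):.2f}")
    # CTGF 0.79
    # TCF21 0.95
    # GJA1 0.67
```

All three tags have most of their counts in the glomerulus library, which matches the reasoning given for choosing glomerular candidates.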

ing accuracy, this database offers the opportunity of updated identification of human kidney mRNAs by linking each tag to the SAGEmap resource (32). It is also anticipated to serve as a standard for future comparisons, including comparisons across species, as well as for the comprehensive analysis of gene families. As an illustration of this latter possibility, we screened the human nephron database for the expression of genes that confer tissue transport and permeability properties (Table 8, which is published as supporting information on the PNAS web site). We detected tags for 258 such genes (119 solute carriers, 84 water or ion channels, 43 ion-transport ATPases, and 12 claudins). Genes from these families that are mutated in hereditary diseases are displayed in Table 6 a or b, according to the specificity of their expression pattern.

Discussion

The feasibility of analyzing gene expression patterns in well-delineated nephron portions was previously demonstrated by studies providing 1,000–15,000 ESTs or SAGE tags from microdissected tubular segments (9, 12). By sequencing >400,000 SAGE tags, we increase by several orders of magnitude the accuracy of gene expression profiling in the mammalian nephron. Moreover, our study was carried out on fresh human tissue pieces, which makes it possible to directly assess the compendium of genes expressed in native human nephron portions.

Several lines of evidence support the reliability of the data. First, known markers for specific nephron portions predominated in the corresponding libraries; second, when gene expression was evidenced through a tag that did not match the original cDNA sequence (e.g., AQP2, SLC12A1, or CLDN16), we repeatedly observed that this sequence either corresponded to a short transcript variant or was incomplete; and third, we were able to confirm by RT-PCR the SAGE data for mRNAs that had not been studied previously along the nephron.

We document huge differences from one nephron portion to another for several hundred transcripts. Although substantial, this number is expected to be a minimal estimate, because we used stringent criteria for defining differences. The majority (75%) of the differentially distributed tags were reliably matched to cDNAs of known or unknown function that are accurately mapped in the human genome. The mapping information provides a series of beacons to survey genomic regions held to contain gene(s) for renal diseases that affect specific nephron

Fig. 4. Quantitative RT-PCR analysis of mRNA distribution along the nephron. RNAs were extracted from the five indicated structures, obtained in sufficient amounts to perform both SAGE and RT-PCR validations. RT-PCR data are displayed relative to the structure where expression is maximal. The value below each column indicates the result of the SAGE analysis and corresponds to tag counts for 50,000 tags. (a) Expression of candidate genes for IgA nephropathy (CTGF, GJA1) or PHA2A (DKFZp761N1114). (b) Selected examples of genes predicted by SAGE to be predominantly expressed in the glomerulus (SPARC, DKFZp564B076), the proximal tubule (BC001573, BC004360), or the thick ascending limb (FLJ31166).
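The way relative values of this kind are commonly obtained can be sketched as below. The Methods state that a measured amplification rate per target was used and that PPIA served for normalization, but they do not give the formula, so the efficiency-corrected calculation here is an assumption. The threshold-cycle (Ct) values and the amplification rates are invented for illustration, and the function name is hypothetical.

```python
def relative_expression(ct_target, ct_reference,
                        rate_target=2.0, rate_reference=2.0):
    """Return target expression per structure, normalized to a reference
    transcript and scaled so that the highest structure equals 1.0.

    ct_target, ct_reference -- mappings of structure name to threshold cycle
    rate_*                  -- amplification rate per cycle (2.0 = perfect
                               doubling); assumed values, not from the paper
    """
    ratios = {}
    for structure, ct in ct_target.items():
        target_amount = rate_target ** (-ct)
        reference_amount = rate_reference ** (-ct_reference[structure])
        ratios[structure] = target_amount / reference_amount
    top = max(ratios.values())
    return {structure: ratio / top for structure, ratio in ratios.items()}


if __name__ == "__main__":
    # Invented Ct values for three structures; the reference is flat.
    ct_target = {"Glom": 24.0, "PCT": 29.0, "MTAL": 27.0}
    ct_reference = {"Glom": 22.0, "PCT": 22.0, "MTAL": 22.0}
    print(relative_expression(ct_target, ct_reference))
    # {'Glom': 1.0, 'PCT': 0.03125, 'MTAL': 0.125}
```

Scaling to the maximum mirrors the presentation described in the legend, where each transcript is shown relative to the structure in which it is most abundant.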

portions. Linkage studies usually highlight chromosomal domains containing hundreds of candidate genes, a number that can be substantially reduced by adding transcriptome data to the screening procedure. This integrated strategy was explored to progress toward the molecular analysis of PHA2A and IgA nephropathy, but it could also be used for other renal diseases.

Clustering analysis revealed kinships between nephron portions that largely agree with those drawn from morphological and physiological studies. The nephron segments indeed partitioned into three groups consisting of a proximal, a thick ascending limb-DCT, and a collecting duct cluster. However, the observation that the number of similarities between mated structures varies greatly from one cluster to another was rather unexpected. The number of tags predominantly expressed in both the proximal convoluted tubule and straight tubule is especially high. The difference with other clusters turns out to be robust, because it stands when we consider either the absolute (n = 85) or relative number of tags (45–60%) shared between the two proximal segments. On the other hand, clusters of tubular structures are much more related to each other than to the glomerulus, a notion consistent with the distinct functions of the glomerular and tubular portions of the nephron.

For obvious reasons, segmental analysis of nephron function is most often carried out on laboratory animals. However, as outlined previously (2, 20), caution is required when extrapolating to human beings observations made on the kidneys of animals. Comparison of the present results to those obtained in the mouse kidney (12, 33) confirms that marked differences indeed are present between species. For example, the mRNA levels for AQP2 and AQP3 are markedly different in the mouse outer medullary collecting duct (OMCD), reaching a 30:1 ratio (33), whereas in the human OMCD we found similar mRNA abundances for both aquaporins. It was the purpose of the present study to help characterize gene expression patterns in the human kidney. With a freely accessible database for most nephron segments, a number of queries pertinent to human kidney physiology can now be addressed without inferences from model systems.

We thank O. Gontcharevskaia for dissecting complete human nephrons (Fig. 1); P. Lesavre and D. Chauveau for help in starting this project and for advice about renal diseases; G. Deschênes for fruitful discussions; N. Caudy, S. Jounier, and H. Moysan for technical assistance; I. Bordelais for colony picking; and all those from Génoscope who contributed to DNA sequencing. This work was supported by Commissariat à l’Energie Atomique (CEA) grants to the Département de Biologie Joliot-Curie, and CEA and Centre National de la Recherche Scientifique grants to Unité de Recherche Associée 1859.

1. Abramow, M. & Dratwa, M. (1974) Nature 250, 492–493.
2. Chabardès, D., Gagnan-Brunette, M., Imbert-Teboul, M., Gontcharevskaia, O., Montégut, M., Clique, A. & Morel, F. (1980) J. Clin. Invest. 65, 439–448.
3. Lifton, R. P., Gharavi, A. G. & Geller, D. S. (2001) Cell 104, 545–556.
4. International Human Genome Sequencing Consortium. (2001) Nature 409, 860–921.
5. Venter, J. C., Adams, M. D., Myers, E. W., Li, P. W., Mural, R. J., Sutton, G. G., Smith, H. O., Yandell, M., Evans, C. A., Holt, R. A., et al. (2001) Science 291, 1304–1351.
6. Velculescu, V. E., Zhang, L., Zhou, W., Vogelstein, J., Basrai, M. A., Bassett, D. E., Jr., Hieter, P., Vogelstein, B. & Kinzler, K. W. (1997) Cell 88, 243–251.
7. Adams, M. D., Kerlavage, A. R., Fleischmann, R. D., Fuldner, R. A., Bult, C. J., Lee, N. H., Kirkness, E. F., Weinstock, K. G., Gocayne, J. D., White, O., et al. (1995) Nature 377, 3–174.
8. Strausberg, R. L., Feingold, E. A., Grouse, L. H., Derge, J. G., Klausner, R. D., Collins, F. S., Wagner, L., Shenmen, C. M., Schuler, G. D., Altschul, S. F., et al. (2002) Proc. Natl. Acad. Sci. USA 99, 16899–16903.
9. Takenaka, M., Imai, E., Kaneko, T., Ito, T., Moriyama, T., Yamauchi, A., Hori, M., Kawamoto, S. & Okubo, K. (1998) Kidney Int. 53, 562–572.
10. Yano, N., Endoh, M., Fadden, K., Yamashita, H., Kane, A., Sakai, H. & Rifai, A. (2000) Kidney Int. 57, 1452–1459.
11. Hishikawa, K., Oemar, B. S. & Nakaki, T. (2001) J. Biol. Chem. 276, 16797–16803.
12. Virlon, B., Cheval, L., Buhler, J.-M., Billon, E., Doucet, A. & Elalouf, J.-M. (1999) Proc. Natl. Acad. Sci. USA 96, 15286–15291.
13. Chabardès, D., Firsov, D., Aarab, L., Clabecq, A., Bellanger, A.-C., Siaume-Perez, S. & Elalouf, J.-M. (1996) J. Biol. Chem. 271, 19264–19271.
14. Cheval, L., Virlon, B. & Elalouf, J.-M. (2000) in Functional Genomics, eds. Hunt, S. P. & Livesey, J. P. (Oxford Univ. Press, Oxford), pp. 139–163.
15. Zhang, L., Zhou, W., Velculescu, V. E., Kern, S. E., Hruban, R. H., Hamilton, S. R., Vogelstein, B. & Kinzler, K. W. (1997) Science 276, 1268–1272.
16. Anderson, S., Bankier, A. T., Barrell, B. G., de Bruijn, M. H. L., Coulson, A. R., Drouin, J., Eperon, I. C., Nierlich, D. P., Roe, B. A., Sanger, F., et al. (1981) Nature 290, 457–465.
17. Kershaw, D. B., Beck, S. G., Wharram, B. L., Wiggins, J. E., Goyal, M., Thomas, P. E. & Wiggins, R. C. (1997) J. Biol. Chem. 272, 15708–15714.
18. Denker, B. M., Smith, B. L., Kuhajda, F. P. & Agre, P. (1988) J. Biol. Chem. 263, 15634–15642.
19. Hession, C., Decker, J. M., Sherblom, A. P., Kumar, S., Yue, C. C., Mattaliano, R. J., Tizard, R., Kawashima, E., Schmeissner, U., Heletky, S., et al. (1987) Science 237, 1479–1484.
20. Biner, H. L., Arpin-Bott, M. P., Loffing, J., Wang, X., Knepper, M., Hebert, S. C. & Kaissling, B. (2002) J. Am. Soc. Nephrol. 13, 836–847.
21. Ishibashi, K., Sasaki, S., Fushimi, K., Uchida, S., Kuwahara, M., Saito, H., Furukawa, T., Nakajima, K., Yamaguchi, Y., Gojobori, T., et al. (1994) Proc. Natl. Acad. Sci. USA 91, 6269–6273.
22. Kriz, W. & Kaissling, B. (2000) in The Kidney, Physiology and Pathophysiology, eds. Seldin, D. W. & Giebisch, G. (Lippincott Williams & Wilkins, Philadelphia), 3rd Ed., Vol. 1, pp. 587–654.
23. Boute, N., Gribouval, O., Roselli, S., Benessy, F., Lee, H., Fuchshuber, A., Dahan, K., Gubler, M. C., Niaudet, P. & Antignac, C. (2000) Nat. Genet. 24, 349–354.
24. Simon, D. B., Karet, F. E., Hamdan, J. M., DiPietro, A., Sanjad, S. A. & Lifton, R. P. (1996) Nat. Genet. 13, 183–188.
25. Simon, D. B., Bindra, R. S., Mansfield, T. A., Nelson-Williams, C., Mendonca, E., Stone, R., Schurman, S., Nayir, A., Alpay, H., Bakkaloglu, A., et al. (1997) Nat. Genet. 17, 171–178.
26. Mune, T., Rogerson, F. M., Nikkila, H., Agarwal, A. K. & White, P. C. (1995) Nat. Genet. 10, 394–399.
27. Sakakibara, M., Mukai, T., Yatsuki, H. & Hori, K. (1985) Nucleic Acids Res. 13, 5055–5069.
28. Sasaki, S., Fushimi, K., Saito, H., Saito, F., Uchida, S., Ishibashi, K., Kuwahara, M., Ikeuchi, T., Inui, K.-I., Nakajima, K., et al. (1994) J. Clin. Invest. 93, 1250–1256.
29. Zwaenepoel, I., Mustapha, M., Leibovici, M., Verpy, E., Goodyear, R., Liu, X. Z., Nouaille, S., Nance, W. E., Kanaan, M., Avraham, K. B., et al. (2002) Proc. Natl. Acad. Sci. USA 99, 6240–6245.
30. Mansfield, T. A., Simon, D. B., Farfel, Z., Bia, M., Tucci, J. R., Lebel, M., Gutkin, M., Vialettes, B., Christofilis, M. A., Kauppinen-Makelin, R., et al. (1997) Nat. Genet. 16, 202–205.
31. Gharavi, A. G., Yan, Y., Scolari, F., Schena, F. P., Frasca, G. M., Ghiggeri, G. M., Cooper, K., Amoroso, A., Viola, B. F., Battini, G., et al. (2000) Nat. Genet. 26, 354–357.
32. Lash, A. E., Tolstoshev, C. M., Wagner, L., Schuler, G. D., Strausberg, R. L., Riggins, G. J. & Altschul, S. F. (2000) Genome Res. 10, 1051–1060.
33. Elalouf, J.-M., Aude, J.-C., Billon, E., Cheval, L., Doucet, A. & Virlon, B. (2002) Exp. Nephrol. 10, 75–81.

Chabardès-Garonne et al. PNAS | November 11, 2003 | vol. 100 | no. 23 | 13715