Cloning of Large Regions of Mammalian Genomes: The Molecular Genetics of X-Linked Hypophosphatémie Rickets

By

Fiona M. Francis

a thesis submitted for the degree of Doctor of Philosophy in the University of London

Department of Genetics, University College London, London.

Genome Analysis, Imperial Cancer Research Fund, Lincoln’s Inn Fields, London. ProQuest Number: 10105624

All rights reserved

INFORMATION TO ALL USERS The quality of this reproduction is dependent upon the quality of the copy submitted.

In the unlikely event that the author did not send a complete manuscript and there are missing pages, these will be noted. Also, if material had to be removed, a note will indicate the deletion. uest.

ProQuest 10105624

Published by ProQuest LLC(2016). Copyright of the Dissertation is held by the Author.

All rights reserved. This work is protected against unauthorized copying under Title 17, United States Code. Microform Edition © ProQuest LLC.

ProQuest LLC 789 East Eisenhower Parkway P.O. Box 1346 Ann Arbor, Ml 48106-1346 This thesis is dedicated to my dear family and friends Acknowledgements

In this thesis I would like to acknowledge the following people: firstly, 1 especially thank Hans Lehrach for giving me the opportunity to come to his lab, and perform this thesis work. 1 thank him for his guidance, and optimism. 1 am extremely thankful to Tony Monaco for the support he gave me throughout the X chromosome project, and Marie-Laure Yaspo, Roger Cox, Gill Bates, Mark Ross, Ann Walker, Sarah Baxendale, and Leo Schalkwyk for all their valuable advice in many aspects of this thesis work. 1 thank Frances Benham, who helped me greatly in the radiation hybrid project, and additionally, 1 thank all the Reference Library participants who used PI library filters and therefore contributed to the characterization of the libraries. 1 thank Gunther Zehetner and the Reference Library System technicians for their provision of the many library filters and clones 1 requested. My special thanks go to Hugues Roest Crollius for his considerable patience, stoic encouragment, and ceaseless support throughout the writing of this thesis. ABSTRACT

This thesis describes two main projects, first the development of an alternative method to construct large insert libraries in a PI vector, and this project led to the construction and use of PI libraries made from the genomes of Schizosaccharomyces pombe, mouse and human, and second, the progress in a positional cloning project for the X-linked hypophosphatémie rickets gene. The PI cloning system allows the propagation of maximally 95kb fragments of genomic DNA and represents an intermediate level of resolution between yeast artificial chromosomes (YACs) and clones. A protocol for PI library construction was developed, which involves the use of pulsed field gel electrophoresis for size selection of insert DNA. A PI library of 16-fold coverage was constructed from the fission yeast S. pombe, which is of interest biologically since it is likely to have many functions that are conserved in higher eukaryotes. A human PI library representing a 1-fold genome coverage (47000 clones) and a C57BL/6 mouse PI library representing a 3-fold genome coverage (120000) were also prepared. Each of these libraries has been arrayed in microtitre dishes, gridded on Nylon membranes and screened by hybridization. In parallel to this work, two Y AC contigs were constructed in the human Xp22 region, which, in the early stages of this project, was poorly characterized compared to other regions of the X chromosome. Radiation hybrids were used to isolate cosmid and YAC clones giving rise to a 2Mb Y AC contig spanning the ZFX-POLA region. Single copy probes were used to isolate YACs from the main area of interest, the X-linked dominant hypophosphatémie rickets region and a 1.5Mb YAC contig was assembled. Further characterization of the latter region by isolation of cosmid, PI, and Pl- derived artificial chromosome clones, and the analysis of gene transcripts derived from exon-trapping and cDNA enrichment experiments are described. TABLE OF CONTENTS

Title page ...... 1 Acknowledgements ...... 3 Abstract...... 4 Table of Contents ...... 5

Chapter One: Introduction 1.1 Molecular Analysis of Mammalian Genomes ...... 8 1.1.1 Introduction ...... 8 1.1.2 Genetic Linkage M aps ...... 9 1.1.3 Use of Radiation Hybrids ...... 10 1.1.4 Clone Libraries and Physical Maps ...... 10 Cosmid Libraries ...... 11 YAC Libraries...... 12 PI Libraries...... 13 Other Intermediate-Capacity Cloning Systems ...... 14 Jumping and Linking Libraries ...... 15 1.1.5 Positional Cloning ...... 15 1.1.6 Transcript Identification From Genomic DNA ...... 17 1.1.7 Genome-scale approaches to gene identification ...... 21 1.2. X chrom osom e ...... 22 1.2.1 X-linked hypophosphatémie rickets ...... 24

Chapter Two: Materials And Methods 2.1 Reagents ...... 27 2.1.1 General reagents ...... 27 2.1.2 Enzymes (and Enzyme Buffers) ...... 28 2.1.3 Other Reagents and Kits ...... 29 2.2 General Solutions and M edia ...... 29 2.3 Experimental Procedures ...... 32 2.3.1 PI Cloning ...... 32 2.3.1.1 Preparation of genomic DNA in agarose blocks ...... 32 2.3.1.2 Partial digestion of DNA in agarose blocks ...... 32 2.3.1.3 CIP Treatment of Partially Digested DNA ...... 33 2.3.1.4 Initial size selection by PFGE ...... 33 2.3.1.5 Vector arm preparation ...... 33 2.3.1.6 Ligation ...... 34 2.3.1.7 Second size selection by PFGE ...... 34 2.3.1.8 Agarase reaction ...... 34 2.3.1.9 Concentration of size selected DNA ...... 35 2.3.1.10 Storage of Size Selected D N A ...... 35 2.3.1.11 Packaging and recovery of recombinant clones ...... 35 2.3.1.12 Packaging extracts ...... 35 2.3.1.13 Head-tail extract preparation ...... 36 2.3.1.14 Pacase extract preparation ...... 36 2.3.1.15 Packaging reaction ...... 37 2.3.1.16 Preparation of plating cells ...... 37

5 2.3.1.17 Phage adsorption to host cells and infection ...... 38 2.3.1.18 Preparation of high density library filters ...... 38 2.3.1.19 Screening of PI library filters ...... 38 2.3.1.20 PI DNA Preparation ...... 39 2.3.1.21 Sizing of PI Clones ...... 39 2.3.2 Construction of YAC Contigs in Human Xp22 ...... 39 2.3.2.1 Radiation hybrid cell lines ...... 39 23.2.2 Alu-PCR amplification of radiation hybrid cell lin e s...... 40 2.3.2.3 Hybridization of Alu-PCR products to cosmid filters...... 40 2.3.2.4 Selection of predicted to m ap to Xp21.3- p22.1 ...... 41 2.3.2.5 Cosmid DNA preparation ...... 41 2.3.2.6 Other hybridization probes ...... 41 2.3.2.V Other Cosmid Probes ...... 42 2.3.2.8 YAC Clone Isolation ...... 42 2.5.2.9 YAC DNA Preparations ...... 43 2.3.2.10 PCR from YACs ...... 43 2.3.2.11 Probe Content Mapping of YACs ...... 43 2.3.2.12 PCR Screening of YACs ...... 44 2.3.2.13 YAC Walking Strategies in the HYP region ...... 45 2.3.2.14 Use of HYP YACs to isolate cosmid, PI and PAC clones ...... 45 2.3.2.15 PPG Analysis ...... 46 2.3.2.16 Fluorescent in situ hybridisation (FISH) ...... 46 2.3.3 Characterization of the HYP Region ...... 47 2.3.3.1 Riboprobe Generation from cosmid, PI, and PAG clones ...... 47 2.3.3.2 Exon-trapping ...... 48 2.3.3.3 cDNA selection (performed by B. Korn, Heidelberg) ...... 49 2.3.3.4 Hybridization of cDNAs onto digested cosmid/Pl/PAC clones ...... 50 2.3.3.5 Sequence Analysis of Exons and cDNAs ...... 50

Chapter Three: PI Cloning 3.1 Introduction ...... 51 3.2 Methods of Library Construction ...... 53 3.2.1 Optimisation of partial digest conditions ...... 53 3.2.2 Size selection ...... 56 3.2.3 Further concentration of DN A ...... 56 3.2.4 Efficiency of producing clones ...... 58 3.2.5 PI library filters ...... 58 3.2.6 Clone characterization ...... 61 3.3 Discussion ...... 61

Chapter Four: YAC Clone Isolation Using Radiation Hybrids 4.1 Introduction ...... 65 4.2 Construction of YAC contig encompassing ZFX and POLA ...... 67

6 4.2.1 Hybridization of Alu-PCR product pools to cosmid filters ...... 67 4.2.2 Assembling YAC contigs ...... 67 4.2.3 PFG analysis, and mapping cosmids in the YAC contig ...... 69 4.3 Discussion ...... 73

Chapter Five: YAC Clone Isolation In The HYP Region 5. 1 Introduction ...... 76 5.2 Isolation of YACs in Xp22 ...... 77 5.2.1 Summary of clones identified from Xp22 (excluding ZFX- POLA region) ...... 77 5.3 Isolation of YACs in the HYP region ...... 79 5.3.1 YAC Library Screening with Proximal Probes ...... 79 5.3.2 YAC Clones Identified at the DXS365 locus ...... 79 5.3.3 Link-up of Proximal and Distal YACs ...... 79 5.3.4 Co-identification of YACs Using Irradiation Hybrid DNA ...... 83 5.3.5 PCR screening of HYP YACs ...... 83 5.3.6 PFG Analysis of Minimal Span YACs...... 87 5.4 Discussion ...... 89

Chapter Six: Further Characterization Of The HYP Region 6.1 Introduction ...... 92 6.2 A Detailed Analysis of The HYP Region...... 93 6.2.1 Isolation of DXS1683 microsatellite marker from 104A0563 cosmid ...... 93 6.2.3 Identification of cosmid, PI and PAC clones ...... 94 6.2.4 Assembly of cosm id/Pl/PA C clones into a contig ...... 95 6.2.5 Production of an enriched cDNA library ...... 95 6.2.6 A new microsatellite developed from PI clone, 10220 ...... 100 6.2.7 Exon-trapping experiments ...... 100 6.2.8 Sequence analysis ...... 104 6.3 Discussion ...... 104

Chapter Seven: Final Discussion And Conclusions 7.1 Clone Libraries ...... 108 7.2 PI Cloning System ...... 108 7.3 The Hum an Xp22 Region ...... 110 References...... 116 CHAPTER ONE INTRODUCTION

1.1 Molecular Analysis of Mammalian Genomes

1.1.1 Introduction

The hum an genome consists of 50000 to 100000 genes which account for approximately 3% of the total genomic DNA. The intronic and intergenic regions are likely to contain much additional information including control and regulatory sequences. The Human Genome Project is a vast international collaboration which has the aim of mapping and sequencing the complete human genome (latest five-year plan presented by (Collins and Galas, 1993)). This will provide a resource for the elucidation and understanding of human diseases, developmental processes and those involved in maintenance of the organism. Early objectives of this project are to produce a high density of genetic markers spaced evenly throughout the genome, and physical maps which show the locations of expressed genes. Physical maps should allow immediate access to any region of the genome and consist therefore of collections of overlapping cloned genomic fragments. Analysis of the human genome is being performed in parallel to that of the mouse genome, and the mouse has been described as the 'pivotal' model organism for the Human Genome Project (European Collaborative Group, 1994). A large number of chromosomal regions exhibit conserved gene content and gene order. These have enabled the identification (and production) of mouse models for human genetic diseases which provide a valuable resource for investigating the underlying pathogenic mechanisms. In some cases mapping of a gene in the mouse has led to the identification of a human disease gene, in other cases cloning of a human disease gene has led to the identification of a candidate gene for a mouse disease. Human-mouse comparative mapping may also help us understand the mechanisms of chromosome evolution. Several other biologically-interesting species have been chosen as model organisms in the Human Genome Project. These include Escherichia coli, Saccharomyces cerevisiae, Caenorhabditis elegans and Drosophila melanogaster. Pioneeering physical mapping studies involving the C. elegans genome have been carried out (Coulson et al., 1986) (Coulson et al., 1988). This organism has a genome size of approximately 80Mb which is nearly 40 times smaller than the human genome. In these studies methods were assessed for estimating and identifying overlaps between cloned genomic fragments. The fission yeast Schizosaccharomyces pombe with a genome size of only 14Mb, has all the

8 Chapter One characteristics of a eukaryotic genome without the repetitive elements. This organism has also been used therefore as a test-case for large-scale genome mapping strategies (Maier et al., 1992) (Hoheisel et al., 1993). The C. elegans and S. pombe physical maps have both enabled easy access to a large number of interesting loci. Another organism, the pufferfish Fugu rubripes. (Brenner et al., 1993) will prove interesting to study because it has a particularly compact genome, which is about seven times smaller than the human genome. This reduction in size is apparently in the intronic regions, and since Fugu has a complement of genes similar to humans, this organism will be particularly useful for the facilitated study of genes.

1.1.2 Genetic Linkage Maps

Maps consisting of genetic markers are essential for the molecular analysis of inherited disorders. DNA-based genetic markers can be defined as having a known position in the genome, and the ability to detect variations in DNA sequence. This latter property allows a marker to be assessed for linkage to a disease gene, by studying its segregation pattern in affected families. The denser the genetic map the easier it becomes to precisely localize a disease gene within the genome. Additionally genetic markers are extremely valuable when used as a basis for creating physical maps. Genome-wide human linkage maps have been produced using a set of CEPH reference families which allow investigators to pool linkage results created using different markers on a common set of families. These early maps (Donis- Keller et al., 1987) consisted almost entirely of restriction fragment length polymorphisms (RFLPs, Botstein et al., 1980), however these particular markers have a low heterozygosity and are unevenly spaced throughout the genome. Second-generation linkage maps were produced using short tandem repeat markers or microsatellites (Litt and Luty, 1989, Weber and May, 1989) which are highly informative markers which can be analysed using polymerase chain reaction (PCR) techniques. The latest human linkage map produced in an international collaborative effort consists of 5840 loci of which 3617 are microsatellites and 427 are genes (Murray et al., 1994). This map fulfills one of the first objectives of the Human Genome Project by creating a comprehensive, high- density genetic map. In the mouse, creating genetic maps is facilitated by the ability to produce crosses with defined parents and large numbers of progeny. Interspecific backcrosses (produced by interbreeding wild mouse strains with laboratory strains) increase the number of DNA variants which can be detected. Extensive Chapter One mouse-mapping efforts to-date include the production of a genetic map containing over 4000 microsatellite markers (Dietrich et al., 1994) which was produced using 46 backcross progeny. An additional resource, offering a 10 fold increase in marker resolution over existing maps is The European Collaborative Interspecific Backcross (EUCIB European Collaborative Group, 1994) which consists of 982 backcross progeny.

1.1.3 Use of Radiation Hybrids

Just as somatic cell hybrids have had a major impact in genome analysis by providing the possibility of positioning genes on specific chromosomes, so radiation hybrids (RHs) have made an equally large contribution to genetic and physical mapping projects. RHs are produced by subjecting donor cells (usually diploid human, or single human chromosomes on a hamster background) to lethal doses of irradiation, and then recovering the fragmented chromosomes by fusion to a recipient rodent cell line (Goss and Harris, 1975), Assaying marker content (STSs or hybridization probes) in a panel of radiation hybrid lines can be used as a method for ordering DNA markers (radiation hybrid mapping, (Cox et al., 1990)), offering a complementary approach to classical genetic linkage. Hybrids are analyzed for the presence or absence of specific markers, and statistical methods are applied to estimate the frequency of breakages between them, hence an order can be determined. For long-range mapping purposes, relatively low doses of irradiation are applied in an attempt to retain large fragments. In recent years, human-rodent hybrids containing single human chromosomes have been the favoured DNA source for mapping purposes. However, it may now be feasible to create high-resolution whole genome radiation hybrid maps (Walter et al., 1994) by reverting to the original protocols of irradiating diploid human cells instead.

1.1.4 Clone Libraries and Physical Maps

Clone libraries are produced by the breakage of chromosomes into smaller fragments, which are cloned and individually propagated in E. coli or yeast cells. The reconstruction of the order of these fragments results in a clone-based physical map. Pulsed field gel electrophoresis (PFGE, (Schwartz and Cantor, 1984)) provides a second method for producing physical maps although these are of uncloned DNA. They are produced by DNA digestion with rare cutter restriction enzymes, separation of the fragments by PFGE, and hybridization of DNA markers to Southern blots. Rare cutter restriction enzymes have GC-rich

10 Chapter One recognition sequences, which are often clustered marking the 5' ends of genes (Bird, 1986). PFGE maps can therefore indicate the positions of genes, and they are generally produced in parallel to the assembly of clone maps. Contigs of clones are principally constructed in two ways: by hybridization approaches in which case clones are assayed for probe content, or by sequence tagged site (STS) mapping (Olson et al., 1989). In the latter case PCRs are used to assay for short sequences within the clones using specific primer sets. STS screening is carried out on pools of clones and has the advantage that marker information can be easily transferred between laboratories in the form of primer sequences. Efficient hybridization screening approaches require arrayed libraries (one clone per microtitre dish well), which have several advantages over libraries existing only as pools of clones (reviewed by Shepherd et al, 1994). Briefly, an arrayed library is a renewable resource that has a defined clone representation and position, and is amenable to long-range mapping procedures. Clones can be gridded in an ordered pattern on Nylon membranes, and hybridization performed with a DNA probe (therefore requiring no prior knowledge of probe sequence). Screening by hybridization is an effective means of analyzing large numbers of clones in parallel, and has the advantage of allowing the use of relatively complex probes, which can be pools of clones from other libraries. In this way related clones from different libraries can be identified in one screening experiment. Shared resources can be advantageous in mammalian genome analysis (eg CEPH families, EUCIB) since data from individual experiments can be pooled and analyzed more efficiently. This is the basis of the Reference Library Database System (RLS) (Zehetner and Lehrach, 1994), in which clone libraries are distributed to other laboratories in the form of filter membranes.

Cosmid Libraries

The cosmid cloning system has a maximum cloning capacity of 40kb, and has the advantage of an efficient packaging system for the production of clones. Often 10^-10^ clones can be generated from Ipg of insert DNA, which makes the production of libraries from flow-sorted chromosomes a possibility. The cloned DNA is relatively easily accessible which makes this a useful system from which to derive new markers and hence aid the integration of physical and genetic maps. Contig building with this system however, has its limitations, such as the large number of clones which need to be processed, the difficulty in recognizing small overlaps between clones, and the relatively small size of the initial contigs. One further disadvantage is that certain genomic regions appear to be unclonable

11 Chapter One or unstable in cosmid clones, which is probably due to their maintenance in E, coli at a relatively high copy number. Nevertheless, cosmid clones have a widespread use in genome analysis, and are usually selected in the intermediate phase of mapping projects as a substrate from which to isolate coding sequences.

YAC Libraries

The yeast artificial chromosome (YAC) system (Burke et al., 1987) has a large cloning capacity ranging from several hundred kilobases to over a megabase of DNA. Because of their large size, it is possible to use YACs to map large chromosomal regions. YACs, therefore provide the stepping stone required for integration of genetic linkage maps with higher resolution maps constructed in cosmid clones. Another advantage of this system is that libraries appear to have better overall genomic representation than prokaryotic-based cloning systems (Coulson et al., 1988). Despite these advantages there are still limitations, including the low efficiency of cloning (usually only 1000 clones are recovered per pg of insert DNA), the presence of chimeric clones in the libraries, and the difficulty in isolating large amounts of cloned insert DNA from yeast transformants. After a YAC contig has been constructed, it is generally used to isolate smaller clones propagated in E. coli, which are easier to manipulate, and less susceptible to shearing. YAC maps for chromosomes 21 and Y have already been assembled. Chromosome 21 is the smallest human chromosome, and so represents an ideal test-case for long-range mapping strategies. YACs specific for 21 q have been isolated from whole-genome YAC libraries using STSs (Chumakov et al., 1992). These were derived from a variety of sources, some preexisting genetic markers already localized on the chromosome and additional anonymous markers produced from a flow-sorted chromosome preparation. The STS approach was similarly used for the Y chromosome (Foote et al., 1992), in this case the assembly of YAC contigs was aided by the prior ordering of STSs using a series of of Y chromosome deletions (Vollrath et al., 1992), and a YAC library which was enriched in Y sequences. The first generation whole-genome YAC map has also been produced (Bellanne-Chantelot et al., 1992), by comparison of fingerprints of random YACs, which were produced by hybridization of a LINE-1 probe to YAC digests. A more detailed YAC map of every individual human chromosome no doubt will soon appear.

12 Chapter One

PI Libraries

The bacteriophage PI cloning system (Sternberg, 1990) has been developed more recently than either the cosmid or YAC systems and for this reason, protocols for constructing libraries and handling clones are less well established. Using this system, it is possible to clone pieces of DNA which are at least twice the size of a cosmid insert and maximally -95kb. PI clones, therefore have the capacity for accommodating relatively large genes, in an E.coli based system. Additionally clones are maintained at a low copy number within the host bacteria, and therefore some inserts which are unstable in the cosmid cloning system may be maintained stably in PI clones. One disadvantage related to this, is the lowered yields of DNA from PI clones compared to cosmids. There have been difficulties associated with the construction of PI libraries experienced by many users of this system. One problem arises because the system allows the packaging of very small inserts, and a stringent size selection step is therefore required to select inserts of 80-100kb. With an inefficient packaging system and a narrow size range of inserts which can be packaged, clones are subsequently produced at a relatively low efficiency (compared to cosmids) of approximately IxlO^/pg size selected insert DNA. A PI cloning kit is commercially available (DuPont Merck), but the number of clones required to generate libraries from complex genomes is well beyond the current limit of one cloning kit. A number of libraries have however, now been produced from human, mouse and rat genomes ((Pierce et al., 1992b) (Shepherd et al., 1994) (Sternberg et al., 1994) (Southard-Smith et al., 1994) and this thesis work-(Francis et al., 1994c)), and from model organisms such as Drosophila (Smoller et al., 1991) and S. pombe (this thesis work-(Hoheisel et al., 1993)). The whole-genome clone mapping project involving S. pombe (Maier et al., 1992) (Hoheisel et al., 1993) evaluated the effectiveness of cosmid, PI and YAC cloning procedures. A PI library was used to speed up the gap closure process in the construction of the cosmid map, primarily because of their larger insert size and also because PI clones were found to be more successful than cosmids in spanning rDNA repeat sequences. The S. pombe work was carried out using an arrayed PI library which was screened using hybridization. In this thesis work, in fitting with the aims of the Reference Library System, arrayed human and mouse PI libraries were additionally constructed which were made generally accessible within the RLS. It has therefore been possible to assess the usefulness of PI clones derived from mammalian genomes in a wide-range of mapping applications.

13 Chapter One

Other Intermediate-Capacity Cloning Systems

Several other cloning systems have been developed, designed to accomodate pieces of DNA which are larger than cosmids or Pis, but also propagated in E. coli for ease of manipulation. In one of these systems, PI cloning has been adapted such that recombinant DNA is packaged into bacteriophage T4 heads which have a higher capacity for DNA compared to the PI heads (Rao et al., 1992). This system has led to the generation of clones with 122kb size inserts, however, it requires elaborate packaging and site-specific recombination systems and for this reason has been hardly used. Another system takes advantage of the low copy number E. coli F factor, from which cloning vectors have been constructed (O'Connor et al., 1989) (Hosoda et al., 1990) (Leonardo and Sedivy, 1990) (Shizuya et al., 1992). Replication of the F factor in E. coli is strictly controlled, thus reducing the potential for recombination between cloned DNA fragments. Shizuya et. al. described the bacterial artificial chromosome (BAG) system and were able to clone and stably maintain SOOkb human DNA fragments, in an F factor based vector. The cloning procedure involves PFGE for size selection, and ligation to vector to create circular products which are then electroporated into DHIOB (Gibco BRL) competent cells. These electrocompetent cells permit high efficiencies of electroporation with large . Disadvantages of this system include lack of positive selection for inserts, and low DNA recovery from the clones. The BAG vector has however, several desirable features including T7/SP6 promoters suitable for riboprobe generation, two cloning sites, several rare-cutter restriction enzyme sites and a cosN site to enable facilitated restriction mapping by partial digestion (Rackwitz et al., 1985). The presence of the cosN site has also enabled the BAG vector to be used for cloning -40kb inserts using the cosmid packaging and infection system, and hence offers increased insert stability over conventional multicopy cosmid vectors ('' - (Kim et al., 1992)). More recently a system has been described which combines several of the attractive features of the PI and BAG cloning systems, and this has been named the PAG (PI-derived artificial chromosome) system (loannou et al., 1994). The vector, pGYPAG-1 retains most of the properties of the PI including the positive selection system, and the two replicons ( and lytic), however, analogous to BAG cloning, circular recombinant DNA is electroporated into DHIOB cells, eliminating the requirement for packaging extracts and in vivo site-specific recombination systems. Hence phage headful constraints on insert size are also eliminated, and average insert sizes of 130- 150kb have been attained. The PAG system looks the most promising of the more

14 Chapter One recently developed cloning systems and an arrayed human PAC library suitable for screening by hybridization has been produced (P. de Jong, unpublished data).

Jumping and Linking Libraries

Before the advent of YAC and PI cloning, jumping and linking libraries were devised (reviewed by (Poustka and Lehrach, 1986) in an attempt to derive information from longer fragments of DNA than those represented in conventional cosmid and phage cloning systems. Jumping libraries were created by either partially or completely digesting DNA (with the aim of producing large fragments), ligating fragments end to end and rédigés ting with a frequent cutter enzyme. These small fragments consisting of the two ends of the original long fragment were then cloned into a lambda vector. Chromosome jumping used in combination with PFGE mapping was therefore designed to accelerate the process of conventional chromosome walking. Its success for this application was illustrated in the cloning of the cystic fibrosis gene (Rommens et al., 1989), although latterly, its use has been largely superseded by YAC and PI clones. Linking libraries are constructed using a similar strategy to jumping libraries: fragments are created by partial digestion with a frequent cutter enzyme, 10-20kb fragments are isolated and circularized, and then digested with rare-cutter enzymes such as Notl or Eagl before cloning. Linking clones therefore are likely to contain putative CpG islands which make them useful as probes to identify transcribed sequences and for cross-species comparisons.

1.1.5 Positional Cloning

Positional cloning is a strategy for identifying disease genes with unknown biological function (Collins, 1992). Application of this approach relies upon initially determining the chromosomal localization of the responsible gene, and narrowing down this region to the smallest possible interval by genetic linkage studies. This is followed by the assembly of a clone contig and isolation of the transcripts present in the area, with the aim of identifying a gene which contains a mutation in individuals who have the disease. After the correct disease gene has been cloned, a biochemical analysis is often necessary to understand the encoded protein and its function. This approach is in direct contrast to functional mapping where a gene is cloned through knowledge of its protein product or its function. Table 1.1 contains a list of human diseases for which genes implicated in the disease have been isolated by the positional

15 Chapter One cloning approach. It can be seen from the table that an impressive number of genes have been cloned by this approach over the last two or three years.

Table 1.1 Genes implicated in disease identified by positional cloning (updated from (Ballabio, 1993)) Disease Year Ref

Chronic granulomatous disease X 1986 (Royer-Pokora et al., 1986) Duchenne muscular dystrophy X 1986 (Monaco et al., 1986) Retinoblastoma 1986 (Friend et al., 1986) Cystic fibrosis 1989 (Rommens et al., 1989), (Riordan et al, 1989), (Kerem et a l, 1989) Wilms tumour 1990 (Rose et al, 1990) Neurofibromatosis type I 1990 (Wallace et al, 1990), (Cawthom et al. 1990) Testis determining factor 1990 (Sinclair et al, 1990) Choroideremia X 1990 (Cremers et al., 1990) Fragile X X 1991 (Verkerk et a l, 1991) (Fu et a l, 1991) Familial polyposis coli 1991 (Kinzler et a l, 1991), (Groden et al, 1991) Kallmann syndrome X 1991 (Franco et al, 1991), (Legouis et al., 1991) Aniridia 1991 (Ton et a l, 1991) Myotonic dystrophy 1992 (Harley et al, 1992), (Buxton et al., 1992), (Aslanidis et al., 1992), (Brook et al, 1992), (F uetal, 1992), (Mahadevan et a l, 1992) Lowe syndrome X 1992 (Attree et al, 1992) Ewings sarcoma 1992 (Delattre et al., 1992) Norrie syndrome X 1992 (Berger et a l, 1992), (Chen et al, 1992) Menkes disease X 1993 (Vulpe et al, 1993), (Chelly et al, 1993), (Mercer et al., 1993) X-linked agammaglobulinemia X 1993 (Vetrie et al, 1993) Glycerol kinase deficiency X 1993 (Walker et al, 1993) Adrenoleukodystrophy X 1993 (Mosser et al, 1993) Huntington's disease 1993 (HD Consortium, 1993) Neurofibromatosis 1993 (Trofatter et al, 1993), (Rouleau et al, 1993) Fragile site FRAXE X 1993 (Knight et a l, 1993) Spinocerebellar ataxia type 1 1993 (Orr et a l, 1993) Tuberous sclerosis 1993 (TS Consortium, 1993) DiCeorge syndrome 1993 (Halford et al, 1993a; Halford et al, 1993b) Polycystic Kidney Disease 1994 (PKD Consortium, 1994) McLeod syndrome X 1994 (Ho et a l, 1994) Aarskog-Scott syndrome X 1994 (Pasteris et a l, 1994) Diastrophic dysplasia 1994 (Hastbacka et a l, 1994) Campomelic dysplasia 1994 (Foster et a l, 1994) Dent's disease X 1994 (Fisher et a l, 1994) Breast cancer BRCAl 1994 (Miki et al., 1994), (Futreal et al., 1994b) X-linked adrenal hypoplasia X 1994 (Muscatelli et a l, 1994),(Zanaria et a l. 1994) Spinal muscular atrophy 1995 (Lefebvre et al, 1995), (Roy et a l, 1995)

16 Chapter One

Almost half of those cloned to date are X-linked (discussed in a later section). In most diseases (both X-linked and autosomal) the positional cloning process has been aided by the presence of large deletions, or translocation events in the patient chromosomes. In these cases it has been possible to define a critical region, in which to search for gene transcripts that are deleted, or disrupted by the cytological abnormalities.In some cases large deletions result in the loss of several adjacent disease genes ('contiguous deletion syndromes', reviewed by (Ballabio, 1991)), causing complex phenotypes to be observed. Xp21 is one such region where patients with deletions display different combinations of Duchenne muscular dystrophy, chronic granulomatous disease, McLeod syndrome, glycerol kinase deficiency, retinitis pigmentosa 3 and adrenal hypoplasia congenita. Similarly, on chromosome llplS patients can display Wilms tumour, aniridia, genitourinary dysplasia and mental retardation (WAGR syndrome). In the absence of gross cytological aberrations, it is an advantage to have access to large pedigrees in the hope of reducing the candidate region to a reasonable size by genetic linkage analysis. This will only be a possibility for relatively common disorders, and for this reason the era of positional cloning projects involving simple Mendelian diseases may draw to a close. What is now a greater challenge is the identification of genes involved in common complex traits such as susceptibities to heart disease, hypertension, and diabetes. In these cases classic Mendelian inheritance is not observed. Polygenic traits, for example, due to the presence of mutations in multiple genes, are being intensively studied with the help of animal models (Ghosh et al., 1993). These studies rely upon improved linkage maps and availability of informative markers (such as microsatellites) to search the genome for susceptibility loci. It is hoped that pathophysiological mechanisms and perhaps some causative genes will be conserved aiding the understanding of the disease in humans. It is clear that the Human Genome Initiative which aims to establish a complete catalogue of mapped genes, will play an essential role in locating genes involved in complex traits, in addition to those responsible for rare diseases and those not characterized by a clearly defined phenotype.

1.1.6 Transcript Identification From Genomic DNA

In both positional cloning projects and strategies aimed at producing whole chromosome transcriptional maps, an essential aspect involves the methods used to identify coding sequences from the cloned genomic DNA. One classical method for gene identification has taken advantage of the fact that many genes show evolutionary conservation. Therefore identification of cross-

17 Chapter One hybridizing sequences in other species by hybridization of genomic fragments to zoo blots has led to the successful of cloning of disease genes (Monaco et al., 1986) (Rommens et al., 1989). A second widely used method involves the identification of CpG islands within the cloned DNA. These islands are often associated with the 5' ends of genes and are identified by rare-cutter restriction enzymes which have GC-rich recognition sequences. In both the above methods interesting genomic fragments can be used to screen cDNA libraries, or can be directly sequenced to identify putative coding sequences. In an alternative strategy whole cosmids or even YACs can be hybridized directly to cDNA library filters, however despite some successes with this method (Wallace et al., 1990) (Elvin et al., 1990; Geraghty et al., 1993), it is likely to be dependent upon the repetitive element content of the region being analyzed, and may result in the identification of a large number of false positives. More recently several new strategies have been devised, which are more applicable to the isolation of genes from large stretches of genomic DNA. Exon- trapping (Buckler et al., 1991) is a rapid, efficient, and sensitive method for isolating exons by cloning genomic DNA into a vector containing functional 5’ and 3' splice sites flanking the cloning site (see Fig 1.1). DNA is transfected into mammalian COS-7 cells where an SV40 early promoter present in the vector drives high levels of transcription. If an exon is present within the cloned genomic fragment then splicing will occur removing intergenic DNA such that the exon only remains in the vector. RNA is rescued from the COS-7 cells and exons are amplified by PCR. cDNA selection offers an alternative strategy (Lovett et al., 1991) (Parimoo et al., 1991) (Korn et al., 1992) and involves the immobilisation of biotinylated genomic template DNA, onto magnetic beads (or Nylon membranes) to act as a target for hybridization with cDNA libraries (see Fig 1.2). Non-specific cDNAs are removed, and region-specific cDNAs are eluted. Two rounds of enrichment are usually performed before finally eluting and cloning the selected cDNAs. None of the methods mentioned above are alone likely to result in a completely saturated transcriptional map. Chromosome 21, which is the smallest human chromosome has been used extensively as a template for gene identification strategies. Both YACs (Gardiner et al., 1994) and cosmids (Yaspo et al., 1994) have been used for cDNA enrichment, and in the latter case exon-amplification was used in parallel. Studies like these have given some indications of the problems often encountered and limitations associated with each of these techniques. cDNA selection is obviously dependent upon the expression of genes in the starting cDNA library, although in practice pools of different libraries can be used.

18 Chapter One

-> ÿ " ™ W e

' / ' 'v . , t

V

y;- ' Y ,< / -%!

. ' > ; » .V - X '. ' ^ <.y^< t F '

" ^Kw <- __ ; / x."' Æ 0 M

Fig 1.1 Exon-amplification - described in detail in the text

19 Chapter One

È

4

# 6 # W à ^ \

Fig 1.2 cDNA enrichment - described in detail in the text

20 Chapter One

A disadvantage of the cDNA selection technique is that it allows the selection of members of gene families, pseudogenes and cDNAs containing moderately repetitive elements, which are not contained within the starting DNA source. Exon-amplification has an advantage over other gene isolation techniques, in that cloned coding sequences are obtained irrespective of their tissue-specific patterns and levels of expression. It is apparent in this procedure however, that the more complex the starting DNA, the lower the recovery of exons. Additionally, using the classical exon-trapping system, only internal exons from a gene will be recovered, and so two-exon genes will be missed. This latter limitation may be alleviated by 3’ terminal exon trapping systems (Krizman and Berget, 1993) (Datson et al., 1994). Artefacts can also be produced in the exon- trapping system by the selection of cryptic splice sites leading to cloning of pieces of genomic DNA and repeats. It is evident that in both positional cloning projects, and whole chromosome transcriptional mapping projects, it is an advantage to use several gene-identification methods in parallel, the deficiencies in one system often being overcome by another.

1.1.7 Genome-scale approaches to gene identification

Alternative genome-scale approaches to gene identification operate in a reverse manner by attempting to derive some sequence information from anonymous cDNAs (and so in some cases highlight a possible function), and then to localize them within the genome. There are major projects devoted to the partial sequencing of random cDNAs, which generate expressed sequence tags (ESTs, (Adams et al., 1991)). One criticism of this approach is that abundant cDNAs will be sequenced multiple times, and a large number of cDNAs will need to be sequenced before rare cDNAs are identified. It may be possible to first characterize the cDNAs by fingerprinting with oligonucleotides (Meier-Ewert et al., 1994), and hence avoid redundant sequencing. Either way, large-scale efforts are then required to localize these cDNAs onto physical maps or somatic cell hybrid panels. It has been suggested that it will eventually be feasible to directly identify candidate disease genes, by virtue of their localisation in an appropriate region of the genome and their possession of a function consistent with the manifestations of the disease ('positional candidate' approach, (Ballabio, 1993)). Genomic sequencing of selected well-characterized regions is already in progress (eg the Huntington's disease gene region). With the availability of automated sequencers, and the improvement of computer programs which identify putative coding sequences (Uberbacher and Mural, 1991), this approach

21 Chapter One is now feasible as a complement to classical transcriptional mapping approaches. As a consequence of this approach, regulatory elements are likely to be revealed (either directly or through comparison of sequences conserved in different species) which may give insights into the control of gene expression, and the understanding of gene function. Additionally the overall genomic organization in a defined region is invaluable to the development of comparative maps.

1.2. X chromosome

The X chromosome is the most extensively studied of all human chromosomes (reviewed by (Mandel et al., 1992)). This is a result of the wide interest in X-linked diseases and other intriguing processes such as X chromosome inactivation. A large number of X-linked diseases have been identified, since males have only a single X, and recessive diseases have therefore been revealed. Chromosomal assignment of these diseases has been obvious because of their characteristic inheritance pattern. Historically, a need to understand these diseases stimulated the production of both a genetic linkage map based on RFLPs (which was the first of its kind, (Drayna and White, 1985)), and systematic approaches to physical mapping. Additionally, the clinical detection of many translocations between X and an autosome, and other structural abnormalities has provided valuable resources for interval mapping, and has greatly assisted in positional cloning projects (see Table 1.1). X inactivation is an interesting phenomenon which leads to the stable inactivation of one of the two X chromosomes in females hence 'equalizing' gene dosage. A cis-acting gene has been identified, XIST (X inactive-specific transcripts) which is proposed to be involved in this process. Fig 1.3 is a schematic diagram of the X chromosome, showing several of its most interesting features. As shown in Table 1.1 genes for two X-linked diseases, chronic granulomatous disease (Royer- Pokora et al., 1986) and Duchenne muscular dystrophy DMD (Monaco et al., 1986) were the first to be identified using positional cloning approaches. The gene involved in DMD, dystrophin, is the largest known gene and has 79 exons spanning 2.4Mb of genomic DNA and with introns as large as 200kb. Recent reports suggest that it takes 16 hours for this enormous gene to be transcribed from end to end (Tennyson et al., 1995). Another area of intense study has been the Xq27 to q28 region, which is one of the most gene dense portions of the chromosome. A large amount of interest was attached to this region since it was known to contain the gene responsible for Fragile X syndrome.

22 Chapter One

PSEUDOAUTOSOMAL REGION

11.23 11.22

21.3 HOMOLOGY TO Y 22.1 22.2 22.3 23 24 25

26

27 FRAGILE X SYNDROME

28 GENE RICH REGION

Fig 1.3 The human X chromosome. The pseudoautosomal region is a 2.5Mb area at the tip of the chromosome with homology to the tip of the Y chromosome. X- Y identity is maintained by frequent recombination during male meiosis. There is also a region in Xq21 which has homology to the Y chromosome. Several features of interest are: DMD, Duchenne muscular dystrophy, XIST, a gene expressed from the inactive X chromosomes only, and likely to play a role in X inactivation, and the Fragile X syndrome region. The asterisk denotes the region of interest in this project.

23 Chapter One

Fragile X is the most common form of mental retardation after Down's syndrome, and is associated with a fragile site at Xq27.3, which is observed in metaphase spreads. This cytogenetic abnormality was investigated at the molecular level leading to the isolation of a candidate gene which exhibits trinucleotide repeat expansions (Verkerk et al., 1991), (Fu et al., 1991). Since then different fragile sites have been analyzed in this region, also associated with mental retardation and the same mutation mechanism (Knight et al., 1993). The positional cloning of X-linked disease genes is now facilitated by the availability of high resolution genetic and physical maps which cover the majority of the chromosome. This has been aided by long-range mapping strategies aimed at the whole X chromosome (-160Mb) which have been undertaken by several groups. This involves the ordering of clones by a global approach producing a contig map, which is integrated with information from other mapping systems eg radiation fusion hybrids. The first-generation map in YAC clones is almost complete, resulting from the merging of global mapping strategies with high-resolution maps produced by individual groups working in disease-related regions. Many well-characterized regions are now being used for gene identification, and in some regions, principally Xq28, efforts are already being made to derive the genomic sequence.

1.2.1 X-linked hypophosphatémie rickets

Several disease genes are known to map to the human Xp22 region including X-linked hypophosphatémie rickets, X-linked juvenile retinoschisis, Coffin-Lowry syndrome, and keratosis follicularis spinulosa decalvans. This region has been has been extensively characterized in the course of this thesis project, with especial focus on hypophosphatémie ('vitamin D-resistant') rickets. This latter disorder, inherited in a dominant mode (HYP, (McKusick, 1992) MIM *^307800), is the most common form of familial rickets, with a prevalence of 1 in 20000. In the early part of this century it was recognized that vitamin D was essential for good health, and a deficiency led to the disorder of rickets, which was especially prevalent in those countries lacking sunlight. Prophylactic and therapeutic doses of vitamin D were administered and rickets was virtually eliminated. However, some inherited forms emerged which were resistant to this treatment. Subsequently, hypophosphatemia was recognized as a hallmark of the disease, and an X-linked dominant mode of inheritance could be established (Winters et al., 1958), although it is now known that some autosomal forms also exist. The most common clinical manifestation of rickets is short stature, which is thought to be due to the hypophosphatemia (Lobaugh et al., 1984), and this is

24 Hum an X \lo i lN C X

Y P ainne

STS Oic

H PDR II? 1 ('ybb Tim p Pic , DMD Svii-1 CYBB Iore a Aral SYNl.TIMP. PFC ARAF Hpn CF4

TFM . M NK EDA Gbpd.Cl-S.R.svp

XCE mm Dm J 3i Phk.Bpa PGK I sla T im GLA. PLP. IMDI T a.X cc

M o Pgk-1 till A gs

HPRT P ip ijp ) FIX 2i G C P. FVIII Pdh - H yp G 6PD Sts G\'

PPYK ASB Y Pairing

Fig 1.4 A comparative map of human and mouse X chromosomes. Chromosome regions known to contain homologous loci have the same shading. The HYP mutation (formerly called HPDRI) has been mapped to the Xp22 region of the human X chromosome. The mouse mutation (Hyp) causes similar abnormalities to the human disorder, and has been mapped to a homologous segment in the distal part of the mouse X chromosome. The Gy mutation is closely linked to the Hyp mutation (within 0.4 to 0.8 centimorgans), and there may also be a human disease caused by a mutation in the human Gy homologue (HPDRII). These data are summarised by Scriver and Tenenhouse (1990) Genet. Res. Camb. volume 56, pp 141-152. Chapter One associated with bone disease of variable severity. An increased urinary excretion is observed which is due to a decreased reabsorption of phosphate in the kidneys (Winters et al., 1958). Normal individuals with depleted phosphate experience an increase in the production of 1,25 dihydroxyvitamin D3 (l,25(OH)2D3)^ which is the m ost potent metabolite of vitamin D. HYP individuals exhibit a disordered vitamin D metabolism which is evident because their hypophosphatemia does not cause an increase in this metabolite. The biochemical mechanisms involved in the renal adaptive response, and its effect on vitamin D and bone metabolism have yet to be elucidated. (Eicher et al., 1976) have described an X-linked disorder in the mouse (Hyp) that exhibits very similar abnormalities. Its linkage to a region of the mouse X chromosome which is syntenic with human Xp22 (see Fig 1.4) make it a likely homologue of the human disorder. Interestingly, a second X-linked dominant mouse mutation Gy has been mapped to the same region and also exhibits a similar phenotype. However Gy mice show some additional abnormalities including circling behaviour, hyperactivity and deafness, and the Hyp and Gy mutations have been shown to be non-allelic (summarized by Scriver and Tenenhouse, 1990). The Hyp mouse model has been extensively studied in an attempt to elucidate the biochemical defects involved in the human disorder. Micropuncture studies of Hyp mice kidney have localized the renal phosphate leak to the proximal convoluted tubule (Cowgill et al., 1979), and a specific defect in sodium-dependent phosphate transport has been detected in the brush border membranes (Tenenhouse et al., 1978). Two sodium-phosphate cotransporters have been identified in the renal proximal tubule - a low affinity, high capacity transporter, and a high affinity, low capacity system. The latter has been demonstrated to have a half-normal Vmax in Hyp mice (Tenenhouse et al., 1989). Parathyroid hormone which might be expected to play a role in this phosphate defect, has been shown not to be the primary causative factor by studies involving parathyroidectomized Hyp mice (Cowgill et al., 1979). Vitamin D metabolism has also been extensively studied using the mouse model, and abnormal regulation of the enzyme which catalyzes the production of 1,25(0H)2D3 has been observed (Lobaugh and Drezner, 1983). Recent studies have also suggested that the renal catabolism of l,25(OH)2D3 is increased, - increased levels of mRNA and immunoreactive protein of the first enzyme in the catabolic pathway were detected in Hyp mice compared to normals (Roy et al., 1994).

25 Chapter One

It is still not clear whether the defects observed in this disorder affect the renal tubules per se or whether a circulating factor is involved. Support for a primary renal defect stems from studies by (Bell et al., 1988), who reported that primary cultures of renal epithelial cells from the Hyp mouse demonstrate a defect in phosphate transport and vitamin D metabolism. Parabiosis experiments (Meyer et al., 1989) however, and more recently cross-transplantation of kidneys between normal and Hyp mice (Nesbitt et al., 1992) suggest that the renal defect may be secondary to the effect of a humoral factor. Further support for this may be derived from a variant of the disease termed oncogenic hypophosphatémie osteomalacia (Hewison et al., 1992; Lobaugh et al., 1984), which can be difficult to distinguish from the inherited form. In this case it is thought that the tumour produces a circulating factor, which acts specifically on the proximal renal tubule, to produce similar abnormalities. An intrinsic osteoblast defect has also been suggested (Ecarot-Charrier et al., 1988) from bone transplantation studies in normal and Hyp mice. The factors which are involved in the renal adaptive mechanisms of phosphate homeostasis are poorly understood. It is unclear what role is played by vitamin D in these processes, and the cause of bone disease in hypophosphatémie individuals. The cloning of the HYP (or Hyp) gene is likely to help elucidate some of the biochemical pathways involved, and resolve the paradoxical results observed to-date. This thesis work describes the construction of clone maps encompassing the HYP region, and the ongoing positional cloning project which is being pursued in the search for the mutated gene.

26 CHAPTER TWO MATERIALS AND METHODS

2.1 Reagents

2.1.1 General reagents

Sigma Chemicals Co. Trizma hydrochloride Trizma base bovine serum albumin (BSA) dithiothreitol (DTT) phenylmethylsulphonylfluoride (PMSF) spermidine trihydrochloride spermine tetrahydrochloride yeast t-RNA salmon sperm DNA sorbitol isopropyl B-D-thiogalactopyranoside (IPTG) antibiotics markers for normal agarose gels amino acid supplements for yeast work B-mercaptoethanol polyvinylpyrrolidone human placental DNA for competition

BDH Laboratories general salts and chemicals proteinase K ethanol isopropyl alcohol phenol chloroform isoamyl alcohol N-lauroyl sar cosine

Difco bacto-tryptone bacto-peptone bacto-yeast extract gelatin bacto-agar yeast nitrogen base

27 Chapter Two

Amersham International pic. Hybond N+ membranes [32p-alpha] dATP (3000 mCi/mmol) [32p-alpha] dCTP (3000 mCi/mmol)

Pharmacia dNTPs for PCR dextran sulphate hexanucleotides for random prime labelling Ficoll

PMC SeaPlaque GIG low melting point agarose SeaKem GTG agarose Saccharomyces cerevisiae marker chromosomes lambda concatamer markers

Fluka Formamide

2.1.2 Enzymes (and Enzyme Buffers)

New England Biolabs general restriction enzymes T4 DNA ligase T4 polynucleotide kinase

Boehringer Mannheim Biochemicals Calf intestinal alkaline phosphatase SP6 RNA polymerase

Sigma Chemicals Co. agarase RNase DNase

Promega T7 RNA polymerase RNasin

Epicentre Technologies GELase

Nova Biolabs Novozyme

28 Chapter Two

Advanced Biotechnologies Ltd E. coli DNA polymerase large fragment (Klenow) Taq DNA polymerase

Gibco ERL Reverse transcriptase

2.1.3 Other Reagents and Kits

DuPont Merck PI cloning kit

Gihco ERL Human Cotl DNA CloneAmp UDG cloning kit Trizol lOObp ladder

Applied Eiosystems Taq DyeDeoxy Terminator Cycle Sequencing Kit

Qiagen Qiaquick PCR purification kit

Stratagene XLl Blue MRP" chemically competent cells

EIO 101 Geneclean kit

Boehringer Mannheim Eiochemicals Biotin 16-dUTP

Clontech Chromospin 400 columns

Dynal Dynabeads

2.2 General Solutions and Media (According to (Sambrook et al., 1989)) LE media bacto-tryptone 1% bacto-yeast extract 0.5% NaCl 1% (bacto-agar 1.5%) pH to 7.0 29 Chapter Two

2x YT media bacto-tryptone 1.6% bacto-yeast extract 1% NaCl 0.5% (bacto-agar 1.5%) pH to 7.0 lOXHMFM K2HPO4 360 mM KH2PO4 132 mM sodium citrate 17 mM M gS 04 4 mM (N H 4 )2 S 0 4 68 mM glycerol(w/v) 44%

YE media yeast extract 0.5% D-glucose 0.5% yPD media (non-selective) yeast extract 1% bacto-peptone 2% D-glucose 2% (bacto-agar 1.5% ampicillin 50|ig/ml

-II-T media (selective) D-glucose 2% (bacto-agar 2%) yeast nitrogen base 0.67% amino acid supplements (final cone IX)

20X amino acid supplements adenine 400|ig/ml arginine 400|ig/ml isoleucine 400|ig/ml histidine 400pg/ml leucine 1.2m g/m l lysine 400|ig/ml methionine 400|ig/ml phenylalanine Im g/m l valine 3m g/m l tyrosine 600|Lig/ml

Yeast nitrogen base 20X stock 13.4%

30 Chapter Two

SCE sorbitol l.OM sodium citrate pH5.8 O.IM EDTA lOmM DTT (added fresh) lOmM

IXTAE Tris-acetate pH8.0 40mM EDTA ImM

0.5XTBE Tris-borate pH8.0 45mM EDTA ImM

20XSSC NaCl 3M sodium citrate pH 7.0 0.3M lOOX Denhardt's Ficoll 2% BSA 2% Polyvinylpyrrolidone 2%

Dénaturant NaOH 0.5M NaCl 1.5M

Neutralisation Solution Tris-Cl pH7.0 IM NaCl 1.5M

IXTE Tris-Cl pH 8.0 lOmM EDTA ImM

Stripwash Solution TE 0.2X SDS 0.1%

Alkaline Lysis Solution I glucose 50mM Tris-Cl pH8.0 25mM EDTA lOmM

Alkaline Lysis Solution II NaOH 0.2N SDS 1%

31 Chapter Two

Alkaline Lysis Solution III potassium acetate 5M 60ml glacial acetic acid 11.5ml H 2O 28.5ml (This solution is 3M with respect to potassium and 5M with respect to acetate).

2.3 Experimental Procedures

2.3.1 PI Cloning

23.1.1 Preparation of genomic DNA in agarose blocks

Cells of the Schizosaccharomyces pombe strain 972h- were grown to saturation in YE medium, and DNA prepared in agarose (Schwartz and Cantor, 1984) at a final concentration of approximately 2|ig/block. High molecular weight DNA derived from human male lymphoblasts, or C57B1/ 6 mice spleen, was also prepared in agarose blocks (3x10^ cells/block) as described previously (Herrmann et al., 1987). Blocks were stored in lOmM Tris pH7.5, 50mM EDTA at 4°C, and washed in TE (lOmM Tris-HCl pH7.5, ImM EDTA) thoroughly before digesting (for example 3 x 30 min at room temperature with gentle agitation).

2.3.1.2 Partial digestion of DNA in agarose blocks

Reproducible partials were obtained as follows: blocks were washed extensively in TE prior to equilibration at 4°C in enzyme buffer (-)Mg2+ (modified from (O'Farrell et al., 1980)). This buffer contained 33mM Tris-acetate (pH7.9), 66mM potassium acetate, 0.5mM DTT (dithiothreitol), ImM EDTA and 2mM spermidine. To ensure complete penetration of enzyme into the blocks, and hence reproducible partial digestions, blocks were equilibrated at 4°C extensively (4 hours) in fresh buffer containing a suitable dilution of Mbol (BRL, 0.04 - 0.08U/mouse or human block, 0.004U/ pombe block). The reaction was initiated by the addition of Mg2+ to lOmM, and continued at 37°C for 4 hours. The digestion was terminated by the addition of EDTA to 20mM. Extent of digestion was estimated by PFGE of a sample of DNA, using a contour clamped homogeneous electric field (CHEF) apparatus (Biorad). DNA was electrophoresed in a 1% agarose gel, 0.5X TBE using lambda concatamer and Saccharomyces cerevisiae chromosome markers. Pulse times were ramped from 3 to 20 seconds, over a period of 20 hours at 180V.

32 Chapter Two

23.1.3 CIP Treatment of Partially Digested DNA

Blocks containing partially digested DNA were treated with CIP (calf intestinal alkaline phosphatase) at 0.055U/|ig DNA in IX dephosphorylation buffer (BMB). Incubation at 37°C was for 3 hours, followed by addition of EDTA to 20mM. Proteinase K was added to a final concentration of Im g/m l, and blocks were incubated at 50°C for 30 min. Blocks were washed in TE, pH 7.6 for 2 X 30 min at 50°C, then twice in TE with PMSF at 40pg/ml at the same temperature.

2.3.1.4 Initial size selection hy PFGE

In the construction of the S. pombe PI library, only one size selection was performed after ligation of partial digests to vector arms (Hoheisel et al., 1993). For the mouse and human libraries two size selections were performed. Blocks were washed in TE, prior to loading in a trough of a 1 % low melting point (LMP) gel, 0.5X TBE. Blocks were sliced lengthways and loaded evenly across the trough. The gel was electrophoresed in a Biorad CHEF apparatus for 16 hours using a fixed pulse time of 3 sec at 180V. These parameters compressed DNA of -lOOkb into the region of limiting mobiUty. Only marker lanes were stained, and then realigned with the remainder of the gel. This enabled excision of the limiting mobility DNA, without ethidium bromide intercalation and U.V. nicking. The remainder of the gel was stained to verify integrity and correct excision of material. Excised DNA was either stored in TE50 (lOmM Tris-HCl, pH7.6, 50mM EDTA), or ligated directly and therefore equilibrated in ligation mix.

2.3.1.5 Vector arm preparation

The positive selection PI vector (pAdlOsacBll) has been previously described (Pierce et al., 1992a). The vector was linearized with Seal (using manufacturer's recommended buffer, with final DNA concentration 0.17pg/pl and lOOU of enzyme for 4 hours at 37°C), then heated to 68°C for 10 min. After cooling to room temperature, DNA was treated with CIP at 0.2U/pg of DNA, for 30 min at 37°C. The dephosphorylation reaction was terminated by the addition of nitrilotriacetic acid to 15mM and incubation at 68°C for 25 min. The DNA was extracted with phenol-chloroform and precipitated with ethanol. The efficiency of the CIP treatment was assessed by ligating a small amount of DNA, and comparing it by gel to unligated DNA. Vector arms were generated by cleavage with BamUl in the cloning site (using manufacturer's recommended buffer, with 0.17|ig/|il DNA concentration and 140U of enzyme for 2 hours at 37°C) and the

33 Chapter Two

DNA was repurified with phenol-chloroform extractions and ethanol precipitation. The vector arms were resuspended at a concentration of Im g/m l, a religation control was performed and compared to unligated material on a gel. The integrity of the religated material was also assessed by packaging a small amount of DNA, and estimating the ratio of colonies produced on a kanamycin plate compared to a kanamycin and sucrose plate (if the presence of sucrose reduced the number of colonies due to religated vector arms 100-fold, then the sacB gene within the vector, an integral part of the positive selection system, was considered to be active).

23.1.6 Ligation

The agarose slices containing human DNA from the initial size selection gel, or S. pombe blocks containing partially digested material, were equilibrated in 30ml ligation buffer containing 50mM Tris-HCl (pH7.5), lOmM MgCl2, 30mM NaCl, and IX polyamines (0.3mM spermine, 0.75mM spermidine) for 4 x 30 min, transferred to eppendorf tubes and melted at 68°C for 15 min with a 8 fold molar excess of vector arms. The mouse DNA was treated in an identical manner although in the absence of polyamines. The DNA was allowed to equilibrate to 37°C, prior to addition of ligation buffer containing ImM ATP, ImM DTT and T4 DNA ligase (400U/pl) at 2.5-3U/|il total reaction volume. The mixture was stirred gently using a wide bore pipette tip, and incubated at 37°C for 60 min, followed by overnight incubation at room temperature. The reaction was terminated by the addition of EDTA to 20mM.

2.3.1.7 Second size selection hy PFGE

The ligated DNA was melted at 68°C for 15 min and loaded evenly into a trough of a 1 % LMP agarose gel and electrophoresed with a 4 sec pulse time at 180V for 16 hours. Stringency of size selection was established by varying the volume of ligation reaction loaded into the trough, and analyzing the excised DNA on a PPG. The DNA compressed into the limiting mobility zone was excised as described for the first size selection PEG.

2.3.1.8 Agarase reaction

The gel slices were equilibrated in 30ml of a buffer containing 40mM Tris- acetate (pH8.0), ImM EDTA, IX polyamines for 3 X 30 min at room temperature, and melted at 68°C for 15 min. Once again, in the case of the mouse DNA the

34 Chapter Two polyamines were omitted from the buffer. The DNA was cooled to 37°C and 5U agarase or 0.3U GELase were added per lOOpl of molten DNA using a wide bore tip. The reaction was incubated at 37°C for 3 hours. The efficiency of the reaction was verified by incubating the mixture on ice for 30 min and checking for resolidification. Size selected DNA was either packaged directly or concentrated by ethanol precipitation.

23.1.9 Concentration of size selected DNA

Using a wide bore tip, ammonium acetate (final concentration 2.5M) was gently mixed into the agarased solution, which was incubated on ice for 10 min and then spun at half speed in a microcentrifuge for 10 min. The supernatant was transferred to a fresh tube using a wide bore tip, and 3 volumes of ethanol were added and gently mixed to homogeneity. After incubation on dry ice for one hour, the DNA was centrifuged at maximum speed in a microcentrifuge for 30 min at 4°C. The supernatant was removed and the pellet was washed with 70% ethanol. After removing as much of the ethanol wash as possible TE was added to the DNA pellet, which was allowed to rehydrate at 4°C overnight.

2.3.1.10 Storage of Size Selected DNA

Ready-to-package DNA was stored by snap freezing the DNA in liquid nitrogen with storage at -70°C.

2.3.1.11 Packaging and recovery of recombinant clones

Packaging extracts can be purchased from DuPont Merck in a PI cloning kit, however for construction of PI libraries from complex genomes, a large number of packaging extracts are required far in excess of that supplied in one commercial kit. Most of the extracts used for the preparation of the S. pombe, human and mouse PI libraries were prepared in house. Preparation of packaging extracts and in vitro packaging were performed as described below. The packaging reaction is a two-step process involving a pacase cleavage of recombinant DNA prior to packaging into phage heads.

2.3.1.12 Packaging extracts

Head-tail extracts and pacase extracts were prepared according to (Pierce and Sternberg, 1992) by the heat induction of the following PI lysogens:

35 Chapter Two

NS3210 (head-tail): sup^recD' HsdM+ hsdR~ m crA'B' (PI r'm'cm cl.100 flml31) NS3208 (pacase): sup^recD' HsdM^ HsdR~ mcrA~B~ (PI rm"cm-2 cl.100 amlO.l)

2.3.1.13 Head-tail extract preparation

NS3210 was streaked out onto LB-agar containing 25|ig/ml chloramphenicol, and grown at 32°C overnight. A 5ml overnight culture in LB + 25|ig/ml chloramphenicol was prepared from a single colony and grown overnight at 32°C. 2.5ml of the overnight culture was inoculated into each of two prewarmed 250ml LB + 25|ig/ml chloramphenicol cultures. These were grown at 32°C to an O.D.^so = 0.3 (approximately 2-3 hours). Cells were pelleted at 4°C for 10 min at 7000rpm in Sorvall centrifuge in a precooled GSA rotor. Cell pellets were resuspended in a total of 2.5ml LB and diluted into 250ml LB which had been prewarmed to 42°C. The culture was grown for 45 min at 42°C with vigorous aeration. After 45 min, the culture was chilled in an ice-water bath, and sucrose was added to a final concentration of 10% (this helps maintain integrity of the cells before centrifugation). The culture was transferred to precooled centrifuge tubes and centrifuged in a GSA rotor at 7000rpm, 4°C for 8 min. The supernatant was discarded, and excess liquid drained from the pellet, which was kept on ice. Each pellet was resuspended in 500pl of cold 50mM Tris-HCl pH8.0, 10% sucrose, by gentle circular motions using a cut Gilson pipette tip (apperture diameter >2mm). Care was taken at this stage to avoid introducing air bubbles, and to act swiftly in order to prevent cell lysis (NS3210 is not a lysis deficient strain). The suspension was sonicated using a Kontes microultrasonic cell disrupter (2x5 sec pulses) in a Falcon 2059 tube on ice. Using a cut Gilson tip, 45|il of extract was aliquoted into chilled eppendorfs containing 4|il of lOmg/ml lysozyme in 50mM Tris-HCl pH 8.0,10% sucrose. Tubes were flicked once and dropped into liquid N 2, and stored at -70°C.

2.3.1.14 Pacase extract preparation

NS3208 was streaked out and grown at 32°C. A 5ml overnight culture in LB + 25pg/ml chloramphenicol was prepared with a single colony and grown at 32°C, as for the head-tail strain. Two prewarmed 250 ml LB + 25|ig/ml chloramphenicol cultures were each inoculated with 2.5 ml of the overnight culture and grown at 32®C until an O.D. ^50 = 0.5 was reached (approximately 4-5 hours). Cells were pelleted at 4°C for 10 min at 7000rpm in a precooled GSA rotor. Cell pellets were resuspended in a total of 2.5 ml of LB and diluted into 500

36 Chapter Two ml of LB which had been pre warmed to 42°C. The culture was grown 42°C for 15 min with shaking at 250rpm, and then incubated for a further 165 min at 38°C. The culture was cooled to 4®C and cells were pelleted at 7000rpm for 10 min. The supernatant was discarded and the pellet resuspended in 1ml of cold buffer containing 20mM Tris-HCl pH 8.0, ImM EDTA, 50mM NaCl and ImM PMSF. The cell suspension was sonicated for four 15 sec intervals, then centrifuged for 30 min at 17000rpm in a Sorvall SS34 rotor. Extracts (20|il aliquots) were stored at -70°C.

2.3.1.15 Packaging reaction

For a 15|il pacase cleavage reaction, the following components were added to 9 |li1 of size selected DNA:

1.5pl lOx pacase buffer, 1.5^1 ImM dNTP (each) mix l.O^il 25mM DTT l.Opl 50mMATP

(lOX pacase buffer: lOOmM Tris-HCl pH 8.0,500mM NaCl, lOOmM MgCl2).

The pacase extract was thawed just prior to use and kept on ice. Ipl was added to each reaction with a cut tip. The reaction was incubated at 30°C for 15 min, followed by the addition of 3.0|li1 of head-tail buffer (6mM Tris-HCl pH8.0, 15mM ATP, 16mM MgCl], 60mM spermidine, 30mM B-mercaptoethanol, 60mM putrescine) and 1.0|il 50mM ATP. The reaction was transferred (using a cut tip) to a tube containing the head-tail extract which was thawed just prior to use, and mixed gently by stirring. After incubation at 30°C for 5 min the contents of the tube were spun briefly in a microcentrifuge to eliminate air bubbles. Incubation was continued at 30°C for a further 15 min. 120|il of phage buffer (lOmM Tris- HCl, pH8.0, lOmM MgCb, 0.1% gelatin, lOpg/ml DNase I) was mixed in, and the tube was incubated at 37°C for 15 min. The reaction was spun briefly in a microfuge to pellet cell debris, and the supernatant transferred to a fresh tube. For storage of packaged material (at 4°C), 10|il of chloroform was added.

2.3.1.16 Preparation of plating cells

Two strains are available for recovering recombinant DNA after infection with phage lysate: NS3145 (Sternberg et al., 1990), and NS3529 (Pierce and Sternberg, 1992). In the construction of the S. pombe, human and mouse PI libraries, NS3145 was used. 37 Chapter Two

NS3145: recA~ W M + HsdR' mcrA~ mcrB" FVacH (lambda imm^'^ninb X\-cre) NS3529: recA~ mcrA+ AQisdR HsdM mcrB mrr) (lambda LPl) (lambda zmm'^34 xi-cre)

A 30 ml culture (in LB) of NS3145 was prepared by overnight incubation at 37°C with shaking at 250 rpm. Cells were pelleted at 4°C and resuspended in 15 ml sterile lOmM MgSO^, lOmM Tris-HCl pH7.6. These cells were viable for at least 1 week at 4°C. To prepare cells for phage infection 0.5ml were inoculated into 50ml LB. These were grown at 37°C with shaking at 250 rpm until OD^so = 0.3. After pelleting at 4°C and resuspending in 5ml cold LB + 5mM CaCl2, cells were kept on ice prior to use.

2.3.1.17 Phage adsorption to host cells and infection

lOOpl of plating cells were added to lOpl of packaged material (from which residual chloroform had been removed by evaporation) followed by incubation at 37°C for 5-10 min without shaking. To this, 1ml of LB was added followed by incubation at 37°C for 45 min. Cells were pelleted by a brief spin in microcentrifuge, and resuspended in small volume of L-broth. This was plated on LB-agar containing 45|ig/ml kanamycin and 5% sucrose, and grown at 37°C overnight. (Nunc Bioassay dishes (22cm x 22cm), were used for large-scale plating of libraries). Clones were picked either manually (S. pombe and human) or robotically (mouse) into 96-well microtitre dishes (Falcon) or 384-well dishes (Genetix, U.K.) respectively, containing 2x TY media, IxHMFM + 25|ig/ml kanamycin, and stored at -70°C (as described in (Nizetic et al., 1991)).

2.3.1.18 Preparation of high density library filters

Libraries were replicated into 384-well dishes, and duplicate copies were used to prepare filter membranes of robotically spotted clones for screening by hybridization. Clones were arrayed at densities of 20736 or 36864 clones per 22 x 22cm membrane.

2.3.1.19 Screening of PI library filters

Library filters were prehybridized at 65°C for 1 hour in a buffer containing 0.5M sodium phosphate (pH 7.2), 7% SDS, and ImM EDTA (Church and Gilbert, 1984). Y AC inserts were PFG purified and treated as described in

38 Chapter Two

(Baxendale et al., 1991). Probes were labelled by random hexamer priming (Feinberg and Vogelstein, 1983) and those likely to contain repetitive sequences were preannealed with 1.5mg/ml sheared human placental DNA for 2 hours at 65°C in 0.12M Na2HP04 (Litt and White, 1985) (Sealey et al., 1985). Hybridization was performed for 16 hours at 65°C. Filters were washed in 40mM sodium phosphate, 0.1% SDS, twice at room temperature, and then twice at 65°C for 20 min. When using whole cosmids as probes, two further washes at 65°C were performed. Exposure to X-ray film was for 1-3 days at -70°C, with a single intensifying screen.

2.3.1.20 PI DNA Preparation

PI plasmid DNA was prepared by a modified version of the alkaline lysis procedure (Sambrook et al., 1989). Briefly, single colonies were picked into 10ml cultures of LB media + 25|ig/ml kanamycin and grown overnight at 37°C. Cells were pelleted and resuspended in 300pl of alkaline lysis buffer I. Cells were transferred to eppendorf tubes, and 600pl of alkaline lysis buffer II was mixed in gently. 450|il of alkaline lysis buffer III was added and the tube was inverted several times. Cell debris was pelleted, and DNA recovered by isopropanol precipitation. In the preparation of the library, clones were sized as undigested DNA, after removal of soluble proteins by precipitation with ammonium acetate (final concentration 2.5M), and ethanol precipitation of the DNA.

2.3.1.21 Sizing o f PI Clones

Undigested plasmid DNA was electrophoresed on a 1% agarose PFG using a Biorad CHEF apparatus. Pulses were ramped from 4 to 20 sec for 20 hours at 180V. Several clones of known size were used as markers.

2.3.2 Construction of YAC Contigs in Human Xp22

2.3.2.1 Radiation hybrid cell lines

The high dose (20000 and 50000 rad) irradiation hybrid lines known to contain human Xp22 fragments, and details of the subcloning procedure used to segregate out non-Xp22 fragments have been described previously (Benham et al., 1989) (Benham and Rowe, 1992). After analysis for marker content, certain subclones were selected for Alu-PCR amplification as shown later in Fig 4.1.

39 Chapter Two

23,2.2 Alu-PCR amplification of radiation hybrid cell lines

The primer used for amplification was the 3'-consensus sequence AlIV (Cotter et al., 1990) and the inter Alu amplification from these subclones has been described previously (Benham and Rowe, 1992). Briefly, the PCR was carried out in a reaction buffer containing lOmM Tris-Cl, pH8.3, 50mM KCl, 1.5mM MgC^, 250|iM each dNTP, and hybrid DNA at 2|ig/ml. An initial dénaturation of 6 min at 94PC was followed by 30 cycles of 93°C x 1 min, 55°C x 1 min and 72°C x 4 min.

AlIV: 5-CAGAATTCGCGACAGAGCGAGACTCCGTCTC-3'

2.3.2.3 Hybridization of Alu-PCR products to cosmid filters

The pool of Alu-PCR products from each hybrid line was hybridized to duplicate sets of a flow sorted X chromosome cosmid library (ICRF cosmid library, (Nizetic et al., 1991)). The cosmid clones representing approximately 4 chromosome equivalents were arrayed at high density on two 22x22cm filters each containing 9216 clones (Lehrach et al., 1990). Filters were prehybridized for 16 hours with lOOpg/ml denatured, sheared human placental DNA and then hybridized for 16 hours with probe at a concentration of 1x10^ cpm/ml hybridization buffer. Qiagen columns were used to purify Alu-PCR products and -lOOng of DNA was labelled by random hexanucleotide priming (Feinberg and Vogelstein, 1983). Two different methods (giving comparable results) were used for competing the repetitive sequences from the Alu-PCR products, and different hybridization buffers were used in each case, (a) Preannealing of labelled probe with hum an DNA bound to a cellulose support (Hochgeschwender et al., 1989), (Monaco et al., 1991), and hybridization to filters at 42°C in 50% formamide, 4XSSC, 0.05M sodium phosphate pH7.2, O.OOIM EDTA, 10% dextran sulphate, 1.0% SDS, 50pg/ml denatured salmon sperm DNA and lOX Denhardt's solution, (b) Competition of probe with 1.5mg/ml sheared human placental DNA in 0.12M sodium phosphate pH 6.8 for 3 hours at 65°C (Litt and White, 1985) (Sealey et al., 1985). Hybridizations were performed at 65°C in 7% SDS, 0.5M sodium phosphate pH 7.2, O.OOIM EDTA and 1% BSA (Church and Gilbert, 1984). Filters were washed in 0.04M sodium phosphate pH 7.2, 0.1% SDS once at room temperature and then twice at 65°C for 30 min. Filters were exposed to X- ray film for 3-4 days at -70®C.

40 Chapter Two

2.3.2 A Selection of cosmids predicted to map to Xp21.3~p22.1

After a comparison of hybridization patterns resulting from each Alu-PCR pool screening, cosmids were selected which were either positive with 50K19E, 20K2B, and 20K2L pools, or positive with 50K19E and negative with 50K19C pools (see Fig. 4.1 and Chapter 4).

2.3.2.S Cosmid DNA preparation

Selected cosmids were picked and grown overnight in LB media containing 30pg/ml kanamycin. Cosmid DNA was prepared by the alkaline lysis procedure (Sambrook et al., 1989), and used for FISH analysis and probing YAC clone filters.

2.3.2.6 Other hybridization probes

Details of other probes used to screen cosmid and YAC library filters are summarised in the following table:

Table 2.1 Hybridization probes for YAC and cosmid clone isolation

Locus Vector Insert size Releasing Reference enzymes DXS43 pBR322 0.9kb Hindlll/EcoRl (Aldridge et al., 1984) DXS207 pAT153 3.5kb Hindlll (Ahrens and Kruse, 1986) DXS9 pAT153 6.1 kb EcoRl (Murray et al., 1982) DXS197 pUC19 4.0kb Hindlll (Starr and Wood, 1987) GLR pUClS 0.65kb EcoRl (Siddique et al., 1989) D274(5) Bluescript 0.75kb EcoRl (Rowe et al., 1993) DXS365 pUCS 2.3kb EcoRl (Browne et al., 1992) DXS257 pUCS 2.5kb Hindlll (Browne et al., 1992) DXCrc57 Bluescript 1.2kb EcoRl/Eagl (Brockdorff et al., 1990) DXCrcUO Bluescript 2.5kb EcoRl/Eagl (Brockdorff et al., 1990) Zfa Bluescript 2.1kb EcoRl (Page et al., 1990) POLA Bluescript O.BSkb Hindlll/Pstl (Wong et al., 1988) DXS41 pBR322 1.7kb Pstl (Aldridge et al., 1984) PDHAl Bluescribt 1.45kb EcoRl (Brown et al., 1989)

Abbreviations in the table are as follows: GLR - human glycine receptor, POLA - DNA polymerase alpha gene, PDHAl - El alpha subunit of pyruvate dehydrogenase gene. The human ZFX locus (zinc finger gene located on the short 41 Chapter Two arm of the X chromosome) was detected using Zfa - a mouse autosomal homologue which also detects Zfx, Zfa, Zfyl and Z/y2. DXCrc57 and DXCrcl40 are mouse linking clones. D274(5) was derived from DXS274.

2.3.2.7 Other Cosmid Probes

In addition to the cosmids identified in the radiation hybrid screen, cosmids identified with the probes described in Table 2.1 were often used as probes to screen YAC library filters (see Chapter 5). 104F0743 (DXS1688) was identified by hybridization of YAC inserts to the cosmid library filters, and 104H10174 (DXS1686) by hybridization of a YAC end probe (see Chapter 5).

2.32.8 YAC Clone Isolation

Cosmid miniprep DNA and all other probes were labelled by random hexanucleotide priming (Feinberg and Vogelstein, 1983). In the case of those cosmids identified in the radiation hybrid screen, pools were then created of up to six clones per pool, the remaining probes and cosmids were hybridized individually. Labelled pools or individual cosmids were competed with sheared human placental DNA as described for the screening of cosmid library filters, with pYAC4 vector DNA added to a concentration of 25pg/ml. Two YAC libraries were used for clone isolation: the ICRF human YAC library (Larin et al., 1991) constructed from a lymphoblastoid cell line GM1416B with a karyotype 48XXXX (Human Genetic Cell Repository, Camden, New Jersey), and a second library constructed at the Centre d'Etude du Polymorphisme Humaine (CEPH) from an EBV-transformed human male lymphoblastoid cell line (Albertsen et al., 1990). Both libraries were robotically arrayed on Hybond N+ membranes for hybridization screening purposes, at a density of 20736 clones per 22cm x 22cm membrane. Hybridization conditions to YAC library filters were according to (Ross et al., 1992) with slight modifications: prehybridisation was performed for 2 hours at 65®C in 0.5M sodium phosphate (pH 7.2), 7% SDS, 1% BSA, ImM EDTA, O.lmg/ml yeast tRNA and, if using highly repetitive probes, with the addition of 0.5mg/ml denatured sheared human placental DNA. Filters were hybridized overnight at 65°C in the same buffer omitting the placental DNA (Church and Gilbert, 1984). Filters were washed in 0.04M sodium phosphate, 0.1% SDS, twice at room temperature, and then once at 65°C for 20 min. Exposure to X-ray film was for 2-5 days at -70°C, with a single intensifying screen. Throughout the text ICRF YACs have names beginning 900., 901., and 905., the remaining YACs were derived from the CEPH YAC library.

42 Chapter Two

4 YACs containing ZFX: 900A0501, 900A0401, 900A0301, and 900A0201 were identified by A.P. Monaco from an earlier subset of the ICRF YAC library present on filter replicas produced using a 40000 multipin transfer device (described in (Larin et al., 1991)).

23.2.9 YAC DNA Preparations

Positive YACs were grown in selective media and DNA was prepared in agarose blocks as previously described (Larin and Lehrach, 1990). YAC miniprep DNA used for FISH analysis was prepared by inoculating a single colony into 10ml selective media, and growing at 30°C for 24 hours. Cells were pelleted prior to yeast cell wall digestion in 4mg/ml Novozyme. YAC DNA in solution was then prepared using an SDS/potassium acetate lysis procedure (Sherman et al., 1986).

2.3.2.10 PCR from YACs

PCR amplifications were performed either using YAC DNA in agarose blocks (1/100th of a block, washed extensively in Millipore filtered water), directly from YAC colonies (Coffey et al., 1992), or crude DNA prepared from YAC cultures. In the latter case, a single colony was inoculated into lOOpl selective media, and grown at 30®C for 48 hours. Cells were pelleted and the supernatant discarded. The cell pellet was washed with 50|il of SCE, cells were pelleted and resuspended in 50pl SCE + 4mg/ml Novozyme and lOmM DTT and incubated for 1 hour at 37°C. 60|xl 0.14N NaOH was added followed by incubation at room temperature for 10 min. 60|j1 1M Tris-HCl, pH8.0 was added, and the DNA stored at -20°C until use. O.lpl of the crude YAC DNA preparation was used in a 50|il PCR.

2.3.2.11 Prohe Content Mapping of YACs

Undigested YACs were electrophoresed in 1% Sea Kem agarose PFGs in 0.5X TBE, using a CHEF Biorad system. Standard electrophoresis conditions were 50 sec for 12 hours, 90 sec for 12 hours, and 120 sec for 12 hours, at 160V. Sizes of YACs were estimated by comparison to yeast chromosomal markers and hybridization of Southern blots with total human DNA. Hindlll digested YACs were electrophoresed on 0.8% Sea Kem conventional gels. Blots were screened sequentially with individual cosmids and single copy probes to establish markers contained within each YAC.

43 Chapter Two

2.3.2.12 PCR Screening of YACs

A PCR assay was used to verify the cosmid and YACs containing the DXS365 locus (shown later in Fig. 5.5). PCR primers (Browne et a l, 1992) were:

RX314-A 5’-TAAAAGAGATAAGAGGTCAA-3' RX314-B 5’>TGAACACCCATTCTAGGT-3'.

PCR reactions were performed in a volume of 50|il containing 500ng of each primer, 0.2mM dNTPs, 50mM KCl, lOmM Tris-Cl (pH 8.3), 1.5mM MgCl], 0.01% gelatine, 2U Taq polymerase and 1/100th of an agarose block containing YAC DNA or 50ng of genomic or cosmid DNA. A 'hot-start' PCR reaction was performed using Ampliwax PCR gems (Perkin Elmer Cetus) in a Perkin Elmer Cetus system 9600 thermocycler. The following conditions were used: 94°C x 2 min, then 30 cycles of 94°C x 1 min, 55°C x 2 min, 72°C x 1 min, with a final extension of 72°C x 6 min. For each of the three following loci, the final PCR reaction mix consisted of 0.2 mM of each dNTP, 75 mM Tris-Cl (pH 9.0), 20 mM (NH^)2 SO4, 0.01%(w/v) Tween, lOmM 2-mercaptoethanol, ImM MgCl]^ 500ng each primer, 1.25 - 2.5U of Taq polymerase, and 0.1 pi of the crude YAC DNA preparation or 50ng genomic DNA in a final volume of 50 pi. PCR amplification was performed in an LEP thermocycler, and began with a dénaturation step of 94°C x 2 min, followed by 35 cycles of 94®C x 1 min, annealing temperature (anneal, temp) as specified x 1 min, 72°C x 1 min, with a final extension of 72°C x 7 min. A 'hot-start' PCR reaction was performed in each case using Ampliwax PCR gems.

DXS1229: (AFM337wd5, (Biancalana et al., 1994)) product size -220bp, anneal, temp: 52^C. 5-TAGAATCAACATAAGGCCCA-3' 5-GGATACAATTAGCAGAATCAGACT-3'

DXS443: (RX324, (Browne et al., 1992)), product size -200bp, anneal, temp: 550C. 5-TTAGTACCTATCAGTCACTA-3' 5'-Ttgtt CAAGGGTCAACTG-3'

DXS3424: (D5, (Econs et al., 1994b)), product size -140bp, anneal, temp: 530C. 5'-TTTTAGAAACAAAAATTCCTTGG-3' 5-GCTGTGTTGTTCAAGGGTC-3'

44 Chapter Two

In the case of Alu-PCR amplifications, the reaction mix was as described above with 1.5mM MgCl], and reactions were performed in either a Perkin-Elmer Cetus system 9600 thermocycler, or in a PTC-100, MJ Research Inc., New Jersey. An initial dénaturation of 94°C x 5 min, was followed by 30 cycles of a two-step reaction (the annealing and DNA synthesis steps were performed at the same temperature): 94°C x 30 sec, 68°C x 2.5 min, with a final extension of 5 min. Primers were as follows (Cole et al., 1991):

ALE3: 5’-CCA(C/T)TGCACTCCAGCCTGGG-3', ALEl: 5 -GCCTCCCAAAGTGCTGGGATTACAG-3'.

In some cases only one primer was used in the reaction (ALE3), at the same concentration of lOng/pl.

2.3.2.13 YAC Walking Strategies in the HYP region

End probes were generated from several of the YACs using the vectorette m ethod (Riley et al., 1990). These ends were m apped back to Xp22.1 by hybridization using a panel of somatic cell hybrid lines described in Fig. 5.3. Vectorette end probes which were successfully mapped to Xp22 were prepared from Pstl digested 900H0439 (trp end, SOObp), Rsal digested 83B05 (trp end, 620bp), and BamHl digested 83B05 (ura end, 870bp). All remaining HYP YACs were typed by hybridization with the isolated end probes and the trp end of 83B05 was selected for rescreening against YAC library filters. On the distal side of the HYP disease gene candidate region a pool of Alu-PCR products derived from a non-chimeric YAC was used for rescreening of YAC library filters. Colony Alu-PCR was performed according to (Coffey et al., 1992), and the pool of Alu- PCR products was hybridized directly to YAC library filters using competition conditions as described above.

2.3.2.14 Use of HYP YACs to isolate cosmid, PI and PAC clones

Cosmids, Pis and PACs were identified underlying the YACs by screening PFG purified YAC inserts against gridded library filters (Baxendale et al., 1991). Libraries screened were as follows: the flow-sorted X chromosome cosmid library described above (4 chromosome equivalents, (Nizetic et al., 1991)), the Lawrence Livermore National Laboratory chromosome X cosmid library, LLOXNCOl, obtained from Pieter de Jong and constructed form flow sorted X chromosomes isolated from a hybrid cell line, the ICRF human PI library (1 x genome coverage, (Francis et al., 1994c)), the DuPont Merck DMPC - HFF 1 45 Chapter Two

pooled human PI library (3x genome coverage, (Shepherd et al., 1994)), and the human PAC library, obtained from Pieter de Jong (4 x genome coverage). Several cosmids were selected to verify the YAC overlaps by hybridization to YAC digest blots.

23.2.15 PFG Analysis

Probes detecting YAC arm fragments were derived from pBR322 - a 2.3kb Pvull/EcoRl fragment has homology to the left (Trp) YAC arm, and a 1.4kb Pvull/Sall fragment has homology to the right (URA3) arm. Digestions were performed according to manufacturers instructions. A selection of YACs from the HYP contig encompassing DXS365, DXS274 and DXS41 were analyzed by rare cutter enzyme digestions (900E01138, 900A0472, and 83B05), and also ICRF YACs 900G1063, 900G0732, 900F0127, and 900G0821 from the ZFX-POLA contig. These were digested to completion with 30U of Not I or Nru I in a 200jil reaction volume (Hamvas et al., 1994). YAC blocks were then electrophoresed in a 0.8% Sea Kem PFG in 0.5X TBE using a hexagonal PFG system (LKB) with a pulse time of either 25 sec or 40 sec for 20 hours at 200V. In addition, partial digestions were performed using the enzymes BssHII, Bag I, and Mlu I on the selected HYP YACs. lOU of enzyme were used in each case and digestion allowed to continue for 10 min (according to (Hamvas et al., 1994)). Samples were electrophoresed for 20 hours at 200V with a 40 sec pulse time. Probes (including total human DNA to detect all fragments and YAC arm probes to detect end fragments) were hybridized sequentially to the same filter lift and size estimations of fragments were based on comparison with yeast chromosomal markers and lambda concatamers. ICRF YACs 900A0501, 900A0301, 900A0201, and 900A0401 were digested with the following enzymes Not I, Sac II, Nar I, and Nael, in the laboratory of Dr. G. Rappold.

2.3.2.16 Fluorescent in situ hyhridisation (FISH) -performed at the Galton Laboratory, UCL, London, or Human Genetics Laboratory, ICRF, Oxford.

l|ig DNA (cosmid or YAC miniprep) was labelled with biotin 14-dUTP by nick translation (BioNick Labelling System, Gibco BRL). Metaphase chromosome preparations were made by standard procedures from human lymphocytes incorporated with bromodeoxyuridine (BUdR) during the late S-phase of the cell

46 Chapter Two cycle. The slides were stained with Hoescht 33258 (l|ig/ml) before being irradiated with U.V. light for 6 min and then ethanol dehydrated. The in situ hybridisation procedure was according to (Pinkel et al., 1988), with the following adaptations. Incubation in proteinase K was for a 7 min period at a concentration of 50|ig/ml prior to fixation. The chromosomal DNA was denatured in 70% formamide/2X SSC at 75®C for 5 min and then dehydrated in ice-cold ethanol. lOpl of hybridization mixture, consisting of 200ng of labelled probe DNA, 2|ig of unlabelled human Cot-1 DNA in 50% formamide, 10% dextran sulphate in 2X SSC was denatured at 75°C for 5 min and allowed to reanneal for 30-60 min at 37°C and then applied onto the slides. Hybridization proceeded in a moist chamber at 37°C overnight. After hybridization, slides were washed 3 times for 5 min each in 50% formamide/2X SSC at 42°C, 5 times for 2 min each in 2X SSC at 42°C and once in 4X SSC, 0.05% Tween at room temperature. Slides were preblocked with 5% non­ fat milk powder in 4X SSC for 20 min at room temperature before being immunochemically stained. The hybridized biotin-labelled probes were detected by incubating the slides for 20 min at room temperature successively in FITC avidin (5|ig/ml), biotinylated anti-avidin (5|ig/ml) and a second layer of FITC avidin (5pg/ml). After each antibody incubation, the slides were washed 3 times for 5 min each in 4X SSC, 0.05% Tween. The slides were washed in PBS twice for 5 min before being counterstained with a Ipg/ml propidium iodide and Ipg/ml DAPI mixture in an antifade solution (Vector labs, CA USA). A confocal laser scanning microscope was used to detect the fluorescence signals.

2.3.3 Characterization of the HYP Region

2.33.1 Rihoprohe Generation front cosmid, PI, and FAC clones

Riboprobes were generated according to (Sambrook et al., 1989) from cosmid, PI, and PAC clones each of which contained T7 and SP6 promoters. Briefly, minipreparations were performed as described above, and DNA was treated with RNase and proteinase K as suggested (Sambrook et al., 1989). Each miniprep was resuspended in 4|il H2O and 0.5|il was electrophoresed on a 0.5% gel, to estimate DNA concentration and integrity of DNA. For cosmids three quarters of a miniprep and for PI and PAC clones, one miniprep was used per riboprobe reaction. The reaction components were added in following order at room temperature:

47 Chapter Two

DNA + H2O 5.25^1 200mM DTT 0.75^1 lOmM rNTP mix 0.75|il lOX transcription buffer l.Sjil RNasin 40U /|il l.Opl 2m g/m l BSA 0.75|il alpha 32-P - rUTP (lOpCi/pl) 4.0pl SP6 or T7 DNA dependent RNA polymerase l.Opl

Reactions were incubated for 1 hour at 37^C (T7) or 40^C (SP6). The reactions were terminated by addition of EDTA to 20mM. In some cases SP6 reactions were performed using BMB buffer and enzyme according to manufacturers specifications.

23,3.2 Exon^trapping

XLl Blue: recA' (recAl, lac'endAl gyrA96 thi hsdRlV supEAA relAl {FproAB lacTl})

Exon-trapping was performed using the pSPL3 vector (Church et al., 1994), according to protocols supplied by Gibco BRL. Briefly, a YAC insert, cosmid clone and 2 PI clones were used as template DNA in 4 separate reactions. In each case the DNA was digested with BamHl and Bgtll and cloned into the BamHl site of the pSPL3 vector. Recombinant DNA was electroporated into XLl Blue cells, DNA was prepared from pSPL3 libraries by alkaline lysis, and transfected into COS-7 cells by electroporation. RNA was isolated using TRIzol reagent, and cDNA was synthesized using SA2 primer. A primary PCR was performed for 6 cycles, using SA2 and SD6 primers: 94°C x 45 sec, 60°C x 30 sec, 72^C X 2 min. Following this PCR products were digested with BstXl, and reamplified using nested primers dUSA4 and dUSD2, which contain dUMP residues required for UDG cloning into the pAMPl vector. PCR conditions were: 25 cycles of 94®C x 45 sec, ôO^C x 1 min, 72®C x 3 min. Cloned PCR products were used to chemically transform XLl-Blue cells (Sambrook et al., 1989), and cloned exons were picked into microtitre dishes. Exons were reamplified (tertiary PCR) using SD5 and SA5 primers, and labelled for hybridization by hot-PCR' in a 25|il reaction using standard PCR reaction components, with 1.6pM SD5 primer, 3^il of radioactive dCTP, Ijil of 1:40 dilution of PCR product, 0.48|iM cold dCTP, and 80|iM cold d(ATG) mix. PCR conditions were as described for the secondary PCR reaction with 10 extra cycles. All exon-trap amplifications were carried out using a PTC-100

48 Chapter Two

thermocycler. Exons were mapped back onto cosmid/Pl /PAC clones and hybridized onto cDNA library filters (described below).

Primers: SD6 5-TCT GAG TCA CCT GGA CAA CC-3' SA2 5-ATC TCA GTG GTA ITT G IG AGC-3’ dUSD2 5'-CUA CUA CUA CUA GTG AAC TGC ACT GTG ACA AGC TGC-3’ dUSA4 5’-CAU CAU CAU CAU CAC CTG AGG AGT GAA TTG GTC G-3' SD5 5-CCCTCGAGGTCGACCCAGC-3' SA5 5 -CTAGAACTAGTGGATCTCCAGG-3'

23.3.3 cDNA selection performed by B. Korn, Heidelberg

cDNA selection was performed using a mixture of three different human tissue cDNA sources: fetal brain, fetal liver, and adult muscle (Korn et al., 1992). Random-primed cDNA libraries were prepared from polyA+ , and amplified using Bluescript SK primer. Three pools of clones (cosmid, PI and PAC) encompassing a region of ~800kb were used as template DNAs. Pools were created by combining alkaline lysis minipreps from individual clones. Template DNA (l|ig) was biotinylated by nick translation, using bio tin -16-dUTP. lOOng biotinylated DNA was competed with Cot-1 DNA, (lOpg) and vector DNA (15pg), and then bound onto strep ta vidin-coated magnetic beads for 1 hour at room temperature. Amplified cDNAs (5|ig) were hybridized overnight at 65°C to the bound DNA, in a buffer containing 750mM NaCl, 50mM sodium phosphate (pH7.2), 5mM EDTA. After washing to O.IX SSC at 65°C, cDNAs were eluted in TE, size selected using Chromospin 400 columns, reamplified and subjected to a second round of enrichment using another aliquot (lOOng) of bound DNA. Eluted cDNAs were amplified using SK primers which were tailed with dUMP residues and re-size selected. An aliquot of the cDNAs were UDG cloned to pAMPlO, and used to transform Epicurian XLl Blue cells (MRF‘, Stratagene). 1152 cDNAs were picked per template pool, and arrayed in microtitre dishes. The total of 3452 cDNAs were gridded at high density onto 7x11cm membranes. Amplification of cDNA inserts were carried out using 5/86 and 3/86 pSPORT primers, using standard PCR reaction components, in a two-step PCR, (Cetus 9600 thermocycler: 30 cycles each of 94^C x 40 sec, 73®C x 3 min). cDNAs were labelled by random hexamer priming of PCR products and mapped back onto cosmid/Pl/PAC clones.

LSK 5 -GCCGCTCTAGAACTAGTGGATC-3' 5/86 5-GCACGCGTACGTAAGCTTGGATCCTCTAG-3' 3/86 5-CCGGTCCGGAATTCCCGGGT-3'

49 Chapter Two

2.33A Hybridization of cDNAs onto digested cosmid/Pl/PAC clones

An aliquot of size selected, enriched cDNAs (material prior to cloning into pAMPlO) was labelled by random hexamer priming using both radioactive dATP and dCTP, and hybridized onto EcoRI-digested cosmid/Pl/PAC clones. Positively hybridizing fragments were excised from LMP gels, and labelled for hybridization onto cDNA grids.

2.3.3.5 Sequence Analysis of Exons and cDNAs (performed in collaboration with DNA sequencing lab, ICRF, London)

Exons and cDNAs were PCR amplified from colonies using 3/86 and 5/86 primers. Amplified exons and cDNAs were purified through Qiaquick PCR purification columns and sequenced using an ABI 373A automated sequencer (Applied Biosystems, ABI). Cycle sequencing was performed using either SA5, SD5 or 3/86, 5/86 primers, and components from a Taq DyeDeoxy Terminator Cycle Sequencing Kit. Cycle sequencing conditions were as specified by ABI using a PTC-100 thermocycler. Sequences were analyzed and assembled into contigs using the Staden package (Dear and Staden, 1991; Staden, 1987) Non- redundant and protein database searches were carried out using the BLAST algorithm (Altschul et al., 1990).

50 CHAPTER THREE PI CLONING

3.1 Introduction

The upper size limit for PI clone inserts is determined by the head capacity of the PI bacteriophage from which the system is derived. The PI cloning vector (Fig 3.1) contains several phage derived sequences including a pac site, a plasmid replicon, a lytic replicon, and two loxP recombination sites. In order for DNA to be packaged into phage heads, a cleavage by pacase enzyme extracts is first necessary at the pac site. Packaged DNA is linear until the phage infects a specific E. coli host which expresses the Cre recombinase gene. Circularization is then possible by site-specific recombination between the two loxP sites. The plasmid replicon is responsible for maintaining the circularized recombinant clone at one copy per host cell chromosome, and it is thought that this low copy number may permit the stable propagation of certain sequences which are unstable in other systems. The lytic replicon is normally repressed but may be induced by isopropyl-B-D-thiogalactopyranoside (IPTG), to increase the copy number of the plasmid suitable for DNA preparation. Other features of the vector include a kanamycin resistance gene, enabling clone recovery and a sacB gene which is a vital part of the positive selection system (Pierce et al., 1992a). This latter gene is derived from Bacillus amyloliquefaciens, and encodes for an enzyme, levansucrase, which converts sucrose to levan which will accumulate in the periplasmic space of cells and result in cell death (Tang et al., 1990). This gene has been integrated in the PI cloning vector, just downstream from the cloning site. An E. coli promoter is situated upstream from the cloning site and expression of sacB is permitted in the PI host strain in the absence of a cloned insert. This therefore enables the positive selection of insert-containing clones on a media containing sucrose, and this is a major advantage of the PI cloning system. In original protocols describing the construction of PI libraries (Sternberg, 1990), (Sternberg et al., 1990), (Pierce et al., 1992b), (Smoller et al., 1991), high molecular weight DNA was manipulated in aqueous solution, and size fractionation was performed using sucrose gradients. The original Y AC cloning procedures utilized these same techniques (Burke et al., 1987), (Coulson et al., 1988), (Garza et al., 1989), however Y AC cloning by PFGE has now largely superceded the original methods (Anand et al., 1989), (McCormick et al., 1989), (Albertsen et al., 1990), (Larin et al., 1991), (Lee et al., 1992). 51 Chapter Three

lux P

Plasm id replicon Adenovirus / stutfer / fragment / . I I pAdlOsacfi 11

PB R322 o ri # kan^

Lytic replicon

95 kb DNA insert

sac B

(D Cut vector at Bam HI and Seal sites and ligate to insert DNA Sfi Bam Not i

k an

p a c toxP PI DNA Headful

Package the ligated DNA p a c - cleavage p roteins i V Tails H ead s

Phage

i I Injection, cyciization and amplification AV

^ /> t ,t recombination n «, u amplification u

Fig 3.1 The PI Cloning System - described in detail in the text

52 Chapter Three

Two additional cloning systems, the BAC (Shizuya et aL, 1992) and more recently PAC (loannou et aL, 1994) systems both use PFGE for size selection. Despite the smaller insert sizes (ideally 90-1 OOkb) of PI clones compared to YACs, BACs and PACs they are still susceptible to shearing in solution. A PFGE protocol for the preparation of PI libraries has been adapted from (Larin et al., 1991), and this is described in detail in this chapter and sum m arized schematically in Fig 3.2. Initially, a PI library was constructed from the genome of the fission yeast S. pombe. As described in Chapter One, this organism has a genome size of only 14Mb, and so a -16 x genome coverage library could be constructed with only 3456 PI clones. In the construction of this library, the necessity of size selection for elimination of small inserts, and the application of PFGE procedures in PI cloning were evaluated. Two larger (mammalian genome) libraries were constructed using this PFGE protocol: 47000 PI clones from a human male lymphoblastoid cell line, and 120000 PI clones from C57B1/6 female mice DNA. These clones have been arrayed in microtitre dishes, and robotically spotted at high density onto filter membranes for hybridization screening purposes. Sharing of library resources with the scientific community (in the form of library filters) and collation of experimental data that is generated, is the basis of the ICRF Reference Library System (RLS, (Lehrach et al., 1990), (Zehetner and Lehrach, 1994). The preliminary analysis of the human and mouse PI libraries has been provided by a combination of in-house screens, and the distribution of library filters to RLS participants.

3.2 Methods of Library Construction

3.2.1 Optim isation of partial digest conditions

Partial digests were produced from DNA contained within agarose blocks. DNA prepared in this way has a very high starting molecular weight. Partial digests were generated using Mbol enzyme and DNA was electrophoresed in a PFG. Digests were considered optimal if a highlighted region of ethidium bromide fluorescence was achieved between 100 and ISOkb, with a reduction of DNA from the limiting mobility region and without the production of appreciable underdigested material (see Fig 3.3). The optimization of the degree of digestion is a critical step since it has a major effect on overall cloning efficiency. For each PI library generated, a range of Mbol dilutions was initially tested on quarter blocks and the dilutions which appeared to give fragments of the correct size were selected for scaling up.

53 Chapter Three

HMW DNA pAdl OsacBII vector in agarose blocks Linearize with Seal Partialtigests (Mbol) CIP treat ends

CIP treit ends BamHI digest to produce vector arms: First size selection

by PFGE loxP

loxP \ / Ligation

in s e r t

Second size selection by PFGE

Agarase gel band containing DNA I Concentrate DNA by ethanol precipitation

Package DNA I Infection, and plating onto selective agar

Pick clones into microtiter dishes I Spot clones at high density onto Nylon membrane

Fig 3.2 PI cloning using PFGE. PI cloning scheme adapted from Pierce and Sternberg, 1992, summarising the steps involved in preparing PI libraries starting from DNA embedded in agarose, and using PFGE for size selection (described in detail in the text).

54 Chapter Three

Initial tests Scale-up

m 1 2 3 4

150kb lOOkb 50kb

m = lambda concatamer markers

Fig 3.3 Partial digests of genomic DNA. (A) Initial range of enzyme dilutions tested on DNA in agarose blocks {lanes 1-4): lane 1, uncut DNA; lane 2, dilution producing underdigested partial; lane 3 digestion producing approximately the desirable degree of digestion; and lane 4, dilution producing overdigested DNA. (B) Scale- up experiment - a finer range of enzyme dilutions {lanes 5-7), lane 5 showing a highlighted region of ethidium bromide fluorescence between 100 and 150kb, without appreciable underdigested material. Marker lanes (m) contain lambda concatamers. PFG conditions are as described in Chapter Two.

55 Chapter Three

Reproducible partial digests could be obtained by extensive equilibration of the DNA agarose blocks in reaction buffer containing enzyme, before initiating the reaction with magnesium ions. Partially digested DNA was treated with alkaline phosphatase to prevent coligation events between non-contiguous fragments. For this reason, PI vector arms were not treated with alkaline ^ phosphatase at the cloning site ends, and were added in excess to the ligation reaction.

3.2.2 Size selection

For the mammalian genome libraries, two size selections have been incorporated within this procedure, firstly to eliminate a large proportion of the small fragments prior to ligation, and secondly in a more stringent size selection after ligation to eliminate the remaining small fragments and remove religated vector arms (see Fig 3.4). Both size selection PFGs are performed using short pulse times which cause the DNA of interest to be compressed in the region of limiting mobility. DNA can therefore be excised from the PFG in a relatively concentrated form. Routinely, a sample of the excised DNA from the second size selection gel was re-electrophoresed on a PFG in order to assess the efficiency of size selection, and the DNA concentration. In the case of the S. pombe library only one size selection was performed (after ligation).

3.2.3 Further concentration of DNA

Size selected S. pombe DNA was packaged directly after an agarase step to digest the agarose. However in the construction of the mammalian genome libraries where large numbers of clones are required, an extra step was incorporated into the protocol to concentrate the DNA, and hence reduce the number of packaging reactions required. Several systems were assessed for concentrating and recovering linear DNA fragments of approximately lOOkb from agarose prior to packaging. These included electroelution directly from agarose, and using Qiagen columns, Centricon filters, or phenol chloroform extractions to purify DNA from an agarased solution. The problems of loss of DNA on membranes and in columns, and shearing of DNA by manipulations as a liquid were always encountered (data not shown). Direct ethanol precipitation of an agarased solution proved to be the simplest and most successful method tested.

56 Chapter Three

markers markers DNA sample

-lOOkb Excised band 48.5 kb 8.3 kb

Fig 3.4 Second size selection PFG. Ligated material electrophoresed on a 1% LMP PFG with 4 sec pulse time for 16 hours at 180V (Biorad CHEF apparatus). Markers are: 1. BRL high molecular weight (8.3-48.5kb) and 2. lambda concatamers. Marker lanes were excised, stained and aligned with the remainder of the gel for band excision. DNA to be cloned was excised from the limiting mobility prior to staining the remainder of the gel. Banding patterns across the gel are due to the vector arms.

57 Chapter Three

3.2.4 Efficiency of producing clones

The concentration of DNA in solution containing residual agarose residues cannot be determined by spectrophometric means. Therefore an accurate determination of the number of PI clones produced per |ig of size selected DNA, cannot be made. However, DNA concentrations were estimated from ethidium bromide stained PFGs, suggesting that clones were produced at an average efficiency of 1x10^ per pg of size selected DNA.

3.2.5 PI library filters

The human and S. pomhe libraries were arrayed manually into microtitre dishes, and the mouse library was arrayed using a robotic picking device. Initially, filter membranes containing S. pomhe PI clones were prepared. In this case clones imprinted on filters were grown overnight at room temperature on agar plates, followed by a six hour induction at 37°C on fresh plates containing ImM IPTG. In the preparation of the human PI library filters, clones were grown on filters overnight at 37°C, without induction, and relatively strong signals have been obtained by hybridization. The mouse PI filters have recently been prepared in a similar way and library filters are being distributed to the scientific community (Zehetner and Lehrach, 1994). A variety of probes of different complexities can be hybridized successfully to PI library filters (Fig 3.5). Single copy probes, whole cosmid clones, single and pooled cDNAs, and Y AC inserts have been used as hybridization probes, and relatively strong signal intensities have been obtained which are similar to those obtained on cosmid clone filters, and much greater than those obtained for Y AC clone library filters. The average lifespan of a PI library filter is 10 hybridizations, although with careful handling some filter membranes have been used 16 times. Human PI library filters were spotted in a 3x3 array with 20376 clones per 22x22cm membrane (Fig 3.5). Mouse and human PI clones are, however, now being spotted at higher densities (4x4 array, 36864 clones per membrane) although including internal duplications to aid identification of positives (see Fig 3.6).

58 Chapter Three

1

Fig 3.5

Autoradiographs of an ICRF human PI library filter hybridized with three different probes. Each filter carries an array of 20736 clones (-0.5 x genome coverage). These are present in 48x48 boxes each containing 9 PI clones, a) whole cosmid clone from the Huntington's disease gene region, b) anonymous cDNA clone detecting 7 strong positives. This cDNA clone may be a member of a gene family, or contain a moderatively repetitive sequence, c) PFG-purified Y AC insert (500kb). This latter filter membrane has been hybridized multiple times, and therefore signals are relatively weak.

59 oooo eoeo oooe oooo eooo oooo oooo deoo oooo oooo ooeo oooo ooeo eooo oooo oooo 4 oooo oeoo oooo OOOO ooo#5 oooo ooeo oooo oooo ooo# eooo oeoo oeoo oooo oooo oooe 8

Fig 3.6 Autoradiograph of a mouse PI library filter hybridized with the Sod2 gene from mouse chromosome 17. The filter membrane carries nearly 0.5 x genome coverage (-18432 clones spotted in duplicate), and each positive is represented as a double signal. The orientations of duplicating signals are depicted at the top of the figure.

6 0 Chapter Three

3.2.6 Clone characterization

During the production of the human PI library, DNA minipreparations from over 200 clones were electrophoresed on PFGs to establish average clone size. Initially a number of clones were sized by frequent cutter enzyme digestion, normal gel electrophoresis. Southern blotting and hybridization with total human DNA (data not shown). Several of these clones were subsequently electrophoresed undigested on a PFG, to establish a set of clones which were used as markers in future clone sizing experiments (Fig 3.7). The average insert size of the human and mouse PI clones is estimated to be approximately 75kb, whereas the S. pomhe clones have an average insert size of approximately 65kb.

3.3 Discussion

In this chapter, a new protocol is described for the production of PI clones, which uses PFGE for size selection. This protocol has been used to construct an S. pomhe library which contains 3456 clones (16 x genome coverage), a human library which contains 47000 clones (1.2 x coverage), and a C57BL/6 mouse library which contains 120000 clones (3 x coverage). As mentioned in Chapter One, many people have experienced difficulties in attempting to construct PI libraries, and for this reason there are not many available. Difficulties were encountered in the course of this project, for instance the mouse PI library took approximately 9 months to construct, and required the optimization of each step in the protocol before clones could be produced with a reasonable efficiency. However, construction of mammalian genome libraries was considered worth pursuing especially after the promising results obtained with the S. pomhe library (Hoheisel et al., 1993). Each of the libraries were characterized by hybridization screening. Filter membranes of the mammalian genome libraries were distributed to other laboratories within the RLS, to aid their characterization and so that the clones could be of use in mapping projects. One limitation of the human library is its size, which means that there is only approximately a 50% chance that a clone for any particular probe is present in the library. However, with the availability of a 3-4 X coverage human PAC library (Pieter de Jong, personal communication), the construction of an arrayed mouse PI library was considered a higher priority than increasing the size of the human library. The PI and PAC systems are likely to allow the cloning and propagation of similar sequences (since the latter is a derivative of the PI cloning system).

61 Chapter Three

marker 10 randomly PI clones selected PI clones

lOOkb

Fig 3.7

Sizing of undigested PI clones. Uncut plasmid DNA from 10 randomly selected human PI clones has been electrophoresed in a 1% normal (SeaKem, FMC) PFG (Biorad CHEF, 4-24 seconds, 20 hours at 180V). Marker clones are previously sized PI clones which range from 15-lOOkb. From left to right markers are 15kb, 30kb, 50kb, 65kb, 95kb, and lOOkb. In each case one quarter of total minipreparation DNA has been loaded per well.

62 Chapter Three

In the construction of the mammalian PI libraries two size selections have been used: prior to, and after ligation. In fact, double size selection may not be necessary and the S. pombe library was previously constructed using one size selection only (Hoheisel et al., 1993). In fact S. pomhe clones have a lower average insert size than the mouse and human PI clones, although this is also likely to be due to the partial digest which was selected to produce the library. Double size selections for the human and mouse libraries were performed in an attempt to increase the average insert sizes compared to the S. pombe clones, and also as a precautionary measure: initially to eliminate the possibility of coligation events between two small inserts which might then be packaged into one phage head, and secondly to prevent religated vector arms from reducing the packaging efficiency of viable recombinant molecules. Polyamines were incorporated into the cloning procedure for the S. pombe and human libraries as a safeguard against degradation of DNA in agarose melting steps (Larin et al., 1991). In fact, polyamines have also been found to assist efficient cloning of YACs in yeast transformations (McCormick et al., 1989). Problems have however, occasionally been experienced when the addition of poly amines has caused DNA precipitation. The C57BL/ 6 mouse PI library was constructed in the absence of poly amines. Results indicate, therefore, that in some instances polyamine addition does not appear to be crucial for the protection of pieces of DNA in melting steps, or to improve the efficiency of PI cloning. The necessity for polyamine addition is likely to be attributed to the levels of metal ion cofactors present in commercial agarose (Schultz and Dervan, 1983) and (Larin et al., 1991). A simple method for sizing PI clones has been described in this chapter, which can be performed from crude minipreparation DNA, and is therefore advantageous when analyzing large numbers of samples. Although there is a narrow window of resolution on the sizing PFG, this is a useful procedure for an initial estimate of insert size and stability. For more accurate estimates of insert size a Noil digestion can be performed, although generally this requires further cleaning up of miniprep DNA by phenol-chloroform extractions, which is undesirable for large sample numbers. A uniformity has been found in PI insert sizes, and this consistency is likely to reflect accuracy in PFG size selection, combined with the size constraint imposed by the phage head itself. The ease of DNA isolation from PI clones compared to YAC clones facilitates their use as a source of genomic DNA for the isolation of genetic markers. Average insert sizes of approximately 75kb may also mean in many cases that entire genes may be encompassed in one PI clone.

63 Chapter Three

In at least two cases PI clones have been identified which appear to delete sequences in growth. It is not clear at present what the proportion of unstable clones in the libraries is, and whether these are solely due to the genomic region from which they are derived, or are also a reflection of the recipient host cell line used to propagate the clones. NS3145 (Sternberg, 1990) which contains point mutations in a number of host restriction and modification systems, and the lacH gene on an F' plasmid, was used to propagate the libraries described here. This strain may be subject to reversions leading to the instability of some genomic DNA inserts. NS3529 (Pierce and Sternberg, 1992) is an alternative PI cloning host strain which has deletion mutations in several of the host systems including an additional system (mrr), and also contains the lacH gene on a lambda prophage. This strain may be preferable for propagation of clones. Library filters have been produced which contain PI clones spotted at densities of 20736 clones per 22x22cm membrane. For ease of handling libraries especially those containing in excess of 100000 clones, clones are now being directly arrayed in a more compact form (384-well dishes), and spotted on membranes at higher densities, however with internal duplications of each clone to help distinguish positives (18000 clones spotted in duplicate per membrane). In house human PI library screens have led to the identification of clones in the 2.2Mb Huntington's disease gene region in 4pl6.3 (Baxendale et al., 1993). Pools of probes from the ends of cosmid contigs identified 4 PI clones, two of which contained DNA sequences that had not been found in a chromosome 4 specific cosmid library (representing 5 chromosome equivalents). Similar results have been obtained in the MHC region on chromosome 6 (Sanseau et al., 1994), where one particular region initially resisted all cloning attempts in a cosmid vector. A PI clone was identified which spanned this gap. Cosmids were subsequently identified from a flow-sorted chromosome 6 cosmid library although these cosmids grew slowly. This region is known to contain many repeat sequences, which may explain the underrepresentation of this sequence cloned in the cosmid system. In the DiGeorges Syndrome region on chromosome 22, PI clones were identified in regions of YAC instability and where cosmid walking attempts had failed (Halford et al., 1993a). A detailed evaluation of the use of PI clones derived from the human library and mapping to the lp36 region has also been described (Lengauer et al., 1994). These latter studies involving a FISH analysis of PI clones, highlight one advantage of PI clones over YAC clones, since no chimeric PI clones were detected. The mouse PI library has been constructed relatively recently, although to-date over 54 confirmed positives have been identified (RLDB, and (Sutherland et al., submitted)).

64 Chapter Three

These examples highlight the utility of PI clones which may contain DNA sequences under-represented or unclonable in both cosmid and YAC systems, which emphasizes the effectiveness of using this system in parallel with other cloning systems.

65 Chapter Four

50K19E 50K19C 20K2B 20K2L

ES ES

■ ■ X p21J •P22.1 1 ■

11.23

\ \ / / Alu-PCR fKx»lsgenerated

Screen gridded X diromosome cosmid library sequentially with pools

Compare hybridisation patterns and select cosmids ▼ FISH mapping T Pool cosmids (Xp21,3-p22.1) and hybridise to gridded YAC libraries

Assemble YAC contigs

Fig 4.1 Schematic representation of the human fragment content present in radiation hybrid subclones as shown by marker analysis. Fragments present in hybrid cell line 50K19E, were additionally confirmed cytogenetically. Fragments in the region of interest are represented by black boxes, and striped boxes represent other fragments retained. Cosmids that were present when hybridized with Alu- PCR products from 50K19E and either absent from the 50K19C hybridization or present also in 20K2B and 20K2L hybridizations were also selected.

6 6 Chapter Four

This YAC contig encompasses two Xp loci, ZFX and POLA. ZFX has been previously localized to the Xp21.3-p22.1 boundary by (Page et al., 1990), and these studies also placed the DNA polymerase alpha gene (POLA, (Wang et al., 1985)) on the proximal side of ZFX. This gene order has also been more recently confirmed by deletion mapping studies performed by (Schaefer et al., 1993). (Hamvas et al., 1993) have shown this same gene order in the mouse, and preliminary studies were therefore carried out to compare this human YAC contig and associated physical map, with the equivalent region in mouse.

4.2 Construction of YAC contig encompassing ZFX and POLA

4.2.1 Hybridization of Alu-PCR product pools to cosmid filters

Alu-PCR product pools were generated by Dr. F. Benham from four of the radiation hybrid subclones. When the pools were hybridized sequentially to high density cosmid filter grids, an average of about 50 clones were identified per cell line. 13 cosmids were predicted to map to the Xp21.3-p22.1 region by assessing the hybridization patterns produced with the different lines. 3 of the 13 cosmids (ICRFcl04 : D0365, B0960, and C l0128) were positive with Alu-PCR pools from lines 50K19E, 20K2B, and 20K2L, which had Xp21.3 markers in common. The remaining 10 cosmids (ICRFcl04 : C09145, E12186, H096, A0527, A0734, A0717, A0563, G05151, H03131, and E0863) were positive in the 50K19E hybridization but negative in the 50K19C hybridization (Xp21.3-22.1 markers absent). None of these cosmids had been previously identified by any probe screening recorded in the RLDB, (Zehetner and Lehrach, 1994). 10 out of 13 cosmids were localized by FISH to Xp21.3-p22.1, two m apped to the centromere (G05151 and E0863), and H03131 mapped to Xq22. D-segment assignments have been obtained for the following cosmids: C09145 (DXS1246), B0960 (DXS1247), E12186 (DXS1248), C10128 (DXS1249), and H096 (DXS1250).

4.2.2 Assembling YAC contigs

The 10 cosmids localized to Xp21.3-22.1 were divided into two pools and used to screen high density YAC library filters: 15 YACs were identified. Analysis for cosmid content divided these YACs into two groups: 12 YACs fell into one contig containing 7 of the starting cosmids (Fig. 4.2), and the remaining 3 YACs (isolated by 3 of the cosmids) were localized elsewhere in Xp22.1 (see Chapter 5). 5 of the 12 YACs in the first contig had been previously recognised by POLA (A.P Monaco, personal communication).

67 TEL % 8 CEN

X 3 .

900G07|2 Minimal span YACs - PFG m a p p e d 9

Notl Notl ■o - i — I___ I (Nrul) Nrïn Nrtil Nnil YACs Identified ? Nrui Using Irradiation S’ Hybrid Approach

YACs positioned 900F06121 - # according to probe content 904B02255 -*

9(HH0p

Chimeric YAC CD 00 ZFX - c o n ta in in g YACs

N«rl I Sacll S*c11 Notl ' Notl

Nael Sacll

I 1------1------1------1 1 1 1 1------1------1------1------1------1 1 1 1------1------1------1------1-----1------1 1 1 1 Fig 4.2 0 1 2 Mb Schematic representation of YACs identified in the ZFX-POLA region. At the top of the figure, positions of probes, new cosmids and linking clones spanning the region are represented. The YACs are divided into three groups: uppermost there are 4 YACs representing the minimal span of YACs linking ZFX and POLA, which have been PFG mapped with Not I and Nru I. The bracketed Nru I site close to the ZFX locus is a site which is present in 900G1063, but absent from 900G0732. This may represent a microdeletion in the latter YAC. The horizontal dashed line in 900F0127 YAC represents a potential region of instability. The centre group of 8 YACs are mapped according to probe content, and chimeric YACs (as determined by FISH) are as indicated. YACs 904B02255, 904F031 24 and 904H031 93 have the CEPH names 255B02, 1 24F03 and 193H03 respectively. In the lower part of the figure there are 4 YACs which contain ZFX and have been mapped with Mar I, Sac II, Not I and Nae I. Chapter Four

This confirmed the position of this contig at Xp21.3-p22.1. The 12 YACs were subsequently screened with other probes from this region. The ZFX locus was detected in one of the YACs and this enabled the orientation of the contig with respect to the chromosome: the most telomeric YAC contained ZFX. Four further YACs containing ZFX had been previously isolated, and these were also included in the probe typing analysis. Table 4.1 lists the 16 YACs identified in the ZFX-POLA region including size estimations, probe content, and FISH analysis performed for mapping confirmation and chimerism detection. Two mouse linking clones DXCrc57 and DXCrcl40 (Brockdorff et aL, 1990) known to map between Zfx and Pola in the mouse (Hamvas et al., 1993) were also used to probe the YAC blots, in order to allow a comparison of the human YAC contig with the equivalent region of the mouse X chromosome. Each linking clone detected a specific subset of the hum an YACs. DXCrc57 identified 5 YACs from this contig, 3 of which also contained ZFX. DXCrcl40 identified 3 YACs from the contig, 2 of which were co-identified by POLA, and the third one was also recognised by DXCrc57 (see Fig.4.2 and Table 4.1).

4.2.3 PFG analysis, and m apping cosmids in the YAC contig

In order to establish a more precise placement of markers and cosmids across the region, a minimal selection of non-chimeric YACs spanning ZFX and POLA were analysed by PFGE of DNA digests with rare cutter restriction enzymes. The selected YACs extending from telomere to centromere in the order shown are 900G1063, 900G0732, 900F0127, and 900G0821. Results showing the fragments detected by various probes are summarised in Table 4.2, and the YAC contig m ap is presented in Fig. 4.2. As shown in Fig. 4.3, both ZFX and the most telomeric cosmid clone, C09145, detect a 320kb Not I fragment in 900G1063. This cosmid also identifies the left arm Not I fragment of this YAC, and hence appears to span a Not I site, identifiable in 900G0732 which is a YAC from the middle of the contig containing neither ZFX or POLA. A second cosmid, B0960, and DXCrc57 also identify the overlap region of these two YACs. Both mouse linking clones detected loci within a 270kb Nru I fragment in 900G0732, andDXCrcl40, also identified the left arm end of 900F0127, a POLA-containing YAC. POLA detects a 200kb Nru I fragment present in 900F0127, which is also detected by cosmid E12186. This clone appears to be situated on the ZFX side of POLA, as it is still contained within 900G0732. Both El 2186 and POLA detect the left arm fragments of 900G0821 which is a POLA-containing YAC extending centromerically.

69 Chapter Four

Probe C on ten t

X p 2 1 .3 A 0 7 3 4 , S ize -p 2 2 .1 C 1 0 1 2 8 YAC (kb) (FISH) ZFX C 0 9 1 4 5 DXCrc57 B0960 DXCfc140 E12186 POLA D 0 3 6 5 H 0 9 6

A 0 4 0 1 5 1 0 N ot chim . A 0 5 0 1 3 2 0 N ot chim . A 0 3 0 1 4 3 0 N ot chim . A 0 2 0 1 6 6 0 N ot chim . G 1 0 6 3 5 1 0 N ot chim . F 0 3 4 7 2 5 0 Chim., 7 p G 0 7 3 2 5 5 0 N ot chim . F 0 6 1 2 1 9 0 0 Chim., 2 q FOI 2 7 4 4 0 N ot chim . A 0 6 1 0 3 4 4 0 N ot chim . G 0 8 2 1 5 2 0 N ot chim . G 0 9 2 1 4 6 0 N ot * chim . D 0 1 6 5 > 1 6 0 0 Chim., 1 1 q . X cen 2 5 5 B 0 2 4 6 0 Chim., Xq 1 2 4 F 0 3 4 6 0 Chim., 4 q 1 9 3 H 0 3 9 0 0 N ot chim .

Table 4.1 Probe typing of YACs Identified in the ZFX-POLA Region, Including Sizes of YACs and Results of FISH Analysis for Mapping Confirmation and to Detect Chimerism.

70 PROBES : Human . Right arm ZFX L e ft arm C 0 9 1 4 5 B 0 9 6 0

Nr N o Un Nr N o Un Nr N o Un Nr N o Un Nr N o Un Nr N o Un

SIZE(KB) ‘a ? S’

Fig 4.3 PFG analysis of ICRF YAC 900G1063 with ordering of C09145 and B0960 cosmid clones. Nr, No, and Un represent Nru I, Not I, and undigested DNA respectively. ZFX and C09145 are present on the same 320kb Not I fragment, although C09145 extends across the Not I site to also detect the same fragments as B0960 and the left YAC arm probe. Weak fragments detected by the left YAC arm probe are caused by a minor contamination with the right YAC arm probe. The cosmid probes also hybridizes to lambda concatamer markers present to the right of the uncut DNA. Chapter Four

The remaining four cosmids isolated over this region (A0734, Cl 0128, D0365 and H096) map to the centromeric side of POLA, and detect the right arm Not I fragment (240kb) of 900G0821. H096 appears to be the most centromerically placed, detecting the right arm Nru I fragment (40kb) of this YAC.

Table 4.2 YAC Notl and Nrul Fragments (in kb) Detected by Probes across the ZFX-POLA Region

YAC

G1063 (BlOkb) G0732 (550kb) F0127 (440kb) G0821 (520kb)

Notl Nrul Notl Nrul Notl Nrul NofI Nrul Probe Human 320 210 310 270 440 200 280 360 150 130 240 180 (uncut) 100 240 120 40 80/90(2) 100 60 40 50 30 Left YAC arm 150 210 240 180 440 50 280 120 Right arm 40 80/90 310 100 440 30 240 40 ZFX 320 80/90 C09145 320 210 310 270 150 240 DXCrc57 150 210 310 270 B0960 150 210 310 270 DXCrcl40 150 210 310 270 440 50 E12186 310 100 440 200 280 120 POLA 440 200 280 120 A0734 240 360 C10128 240 360 D0395 240 360 H096 240 40

Certain anomalies were detected in the ZFX-POLA set of PFG mapped YACs: 900G1063 appears to contain 4 Nru I fragments, two of which have very similar sizes (80-90kb). This would position an extra Nru I site within this YAC which is absent from 900G0732. 900F0127 (represented in Fig. 4.2 by solid and broken lines) also appears to have a number of Nru I sites which are undetectable in 900G0821. All four YACs appear to be non-chimeric from FISH analyses. Probe typing using the cluster of three cosmids, (A0734, C10128, D0365) and H096 in all the POLA-containing YACs, together with consideration of their sizes suggest

72 Chapter Four that 900F0127 may contain sequences not represented in the other YACs from this region. The 4 ZFX YACs 900A0401, 900A0201, 900A0301, and 900A0501, were mapped using the enzymes Notl, Nael, Narl, and SacR. These results are also summarised in Fig. 4.2, although this work was performed elsewhere. In two of these YACs (A0301, A0201), a 320kb Notl fragment is detected by ZFX, allowing alignment of these YACs with 900G1063, and extension of the contig to a total of 2Mb. The upstream Notl site which gave rise to this fragment, is contained in a potential CpG island which contains Narl, Sacll, and Notl restriction sites.

4.3 Discussion

The human components of somatic cell hybrid lines can now be easily accessed by amplification of human-specific sequences between Alu repeats (Nelson et al., 1991). This facilitates the use of hybrid cell lines as a resource for isolating cloned DNA and new markers from a specific desired region (Monaco et al., 1991), (Muscatelli et al., 1992), (Zucman et al., 1992) and (Kumlien et al., 1994). This approach has been evaluated here using high dose radiation hybrid cell lines to target a poorly defined area of the X chromosome. As described in this chapter, an ordered set of cosmid and YAC clones covering 2Mb of the Xp21.3-p22.1 region has been established. The use of high dose radiation hybrids has both advantages and disavantages. The presence of relatively small fragments which do not appear to be subject to extensive rearrangement (Benham et al., 1989), (Cox et al., 1989) allows the detailed analysis of particular regions. The presence of multiple fragments within a cell line however, could confuse the results. This problem was overcome in the present study by using well characterized subcloned radiation hybrid cell lines in a sequential screening approach. This enabled the isolation of clones from the region of interest with a high degree of accuracy. FISH localization of the selected cosmids proved to be an important step in eliminating the few clones which may have been positively identified due to a high repetitive element content, or because of preferentially retained centromeric fragments in the hybrids (Benham et al., 1989), (Cox et al., 1989). It is possible to screen YAC colony filters directly with Alu-PCR products generated from irradiation hybrid cell lines (Monaco et al., 1991). However, YAC clone filters have a relatively poor signal-to-background ratio and also a limited lifespan, and are therefore not ideally suited for sequential screens with complex probes. At the beginning of the present study, gridded YAC clone filters were not yet available and cosmid filters were used instead. In fact, cosmid clones have a

73 Chapter Four higher signal intensity compared to background, and this may have helped reduce the number of false positives obtained. It is now possible however, to screen high density filter grids of Y AC Alu-PCR products (Ross et a l, unpublished) directly with Alu-PCR products from irradiation hybrids (Kumlein et al, unpublished). Due to the selective amplification of the same specific subset of sequences in both probe and target, and the high concentration of PCR product spotted on the filter, this strategy offers an alternative approach to the identification of Y AC clones with complex probes. The cosmids which were mapped to Xp21.3-p22.1 establish an evenly spaced set of markers across a previously undermapped region. These new probes provided multiple entry points with which to identify YACs from the gridded clone array, and probing with pools of up to 6 cosmid clones enabled the isolation of 15 YACs in two colony filter screens. Overlaps between contiguous YACs were easily established by screening YAC blots with the cosmids, and by this means the YACs were divided into two groups: a 12-Y AC con tig spanning 1.5Mb in the Xp21.3-p22.1 region, and 3 remaining YACs falling in a different region in Xp22.1. The 12-Y AC contig was shown to encompass two loci, ZFX and POLA, which allowed the positioning and orientation of the contig on the X chromosome. Using additional YACs derived from an earlier subset of the ICRF YAC library, the contig was extended to approximately 2Mb contained within 16 YACs. The YAC contig was constructed, therefore without the need for end- rescue and walking approaches. YAC end probes can be difficult to rescue often requiring the application of several different procedures. Further problems may arise if the probes produced are weakly hybridizing, or contain repetitive sequences. In this case, the use of cosmid clones as probes facilitated the contig assembly, and to note, this region has not been easily cloned by other groups using STS contig building approaches (Fifth X chromosome workshop). PFG analysis of several YACs within the ZFX-POLA contig has enabled ordering of new probes across the region. The cosmid clones detect new loci in Xp21.3-p22.1 which may prove useful for the generation of microsatellite markers. The cross-hybridization of mouse loci to sequences within the contig has been obtained by probing PFG blots with two mouse linking clones (Brockdorff et al., 1990). This has allowed a preliminary comparison of the human ZFX-POLA region to the equivalent region in the mouse genome, and identified regions which are highly likely to contain coding sequences. This data will contribute to the ZFX-POLA area of the human-mouse X chromosome comparative map. The order of: Z fx - DXCrc57 - DXCrcl40 - Pola which was recently established in the mouse genome (Hamvas et al., 1993), appears very similar in the homologous human region. The physical distance separating ZFX

74 Chapter Four and POLA can be estimated to be 700kb, while in the equivalent region in the mouse these two loci were estimated to be separated minimally by 600kb and maximally by 750kb. The signals obtained by cross-species hybridization of mouse linking clones to human YAC blots places the loci detected by DXCrc57 within 400kb centromeric to ZFX, and DXCrcl40 approximately 200kb telomeric to POLA. Again these localizations are similar to those found in the mouse genome. Hence these loci appear to be contained within an extensive conserved linkage group with a similar probe order in man and mouse. The PFG analysis of the YACs has allowed an examination of their fidelity relative to one another, and a potential problematic region in YAC 900F0127 was detected close to the POLA locus. Instability has been detected in other genomic regions (Bates et al., 1992) and unstable sequences in one YAC were not necessarily found to be unstable in another. This YAC could be re-analyzed by FISH analysis, or hybridization of end-probes to mapping panels to verify that that it contains contiguous DNA. Construction of a PFG map from genomic DNA, and a more detailed analysis of the remaining YACs in the contig would also provide further information regarding the stability of this region in YAC clones. The more extensive PFG analysis of 4 ZFX YACs has revealed the presence of one potential CpG island, and it is unclear whether this island is associated with the ZFX gene or upstream from this locus. (Hamvas et al., 1993) recently detected a CpG island in a similar position in the equivalent mouse region. Conserved GC-rich sequences were similarly observed in human genomic DNA close to both ZFX and ZFY loci (Verga and Erickson, 1989). Additional CpG islands may also exist at the region of hybridisation to the mouse linking clones. Pola is not associated with a CpG island in the mouse, and in these studies, POLA was not associated with either a Notl site or N rul site, which suggests that the human gene is similarly not associated with an island. In summary, PFG analysis has enabled the construction of a 2Mb YAC map in the Xp21.3-p22.1 region, which includes 7 new ordered cosmid markers, and identification of several potential CpG islands. A starting combination of Alu-PCR products from radiation hybrid cell lines, and high density arrays of cosmid and YAC clone libraries facilitated the assembly of this contig, which could now be used to isolate new transcribed sequences. However, the situation of this contig on the proximal boundary of Xp22 has made it less interesting in the search for the HYP gene, which has been mapped more distally on the chromosome. For this reason no further studies were carried out on the ZFX- POLA contig.

75 CHAPTER FIVE YAC CLONE ISOLATION IN THE HYP REGION

5.1 Introduction

The pathogenic mechanisms involved in HYP have been the subject of much controversy (reviewed by (Lobaugh et al., 1984)). Originally it was believed that a defect in the biochemical pathway leading to a reduced production of 'active' vitamin D, resulted in decreased renal tubular reabsorbtion of phosphate and bone disease. However, treatment of patients with solely vitamin D or its metabolites does not normalize the phosphate reabsorption. It is now more widely believed that the primary defect involves the renal phosphate transport mechanism. One argument against this hypothesis is that there is no correlation observed between the magnitude of the hypophosphatemia and the severity of the bone disease. It appears that neither the suggestion of a primary defect in the phosphate transport, nor in the vitamin D metabolism can individually account for all the abnormalities. In fact the most effective treatment for this disorder is a combination of both oral phosphorus and vitamin D, which implies defects in both these systems. It is also not clear whether a circulating factor is involved in this disorder, or whether the abnormality is intrinsic to kidney or bone. For these reasons we decided to adopt a positional cloning approach in order to search for the HYP disease gene. This is a powerful strategy for identifying disease genes with unknown biological function (Collins, 1992). The development of a clone contig map encompassing the critical region bridges the gap between genetic studies and disease gene identification. It was important to isolate YAC clones for this purpose, since they contain large inserts, simplifying and speeding up the coverage of extensive stretches of the genome. Ultimately, the YAC clones can be used to identify cosmid, PI, PAC or lambda clones which provide an important resource of easily manipulatable genomic DNA, suitable for generating markers to refine the genetic map. Genetic linkage analyses have previously localized the HYP disease gene to distal Xp (Read et al., 1986), (Machler et al., 1986) and a locus order Xpter- DXS43-HYP-DXS41 -Xcen was established (Thakker et al., 1987). DXS257 and DXS365 were established as telomeric flanking markers internal to DXS43 (Econs et al., 1992), although initially it was not possible to determine which of these was closest to the HYP locus. Since then, one recombination event established DXS365 as the closest telomeric flanking marker (Econs et al., 1993). On the 76 Chapter Five centromeric side of the disease gene candidate region, DXS274 was demonstrated to be the closest flanking marker (Econs et al., 1993), (Rowe et al., 1993). (Econs et al., 1993) estimated the genetic distance between DXS257/DXS365 and DXS274 to be approximately 3.5cM. This chapter describes the isolation of YACs using single copy probes from the Xp22 region. Throughout the course of this project, the genetic map in the HYP region has been simultaneously refined and updated, and as the HYP candidate region was reduced in size by the use of new markers, so the YAC contig spanning this region was assembled.

5.2 Isolation of YACs in Xp22

5.2.1 Summary of clones identified from Xp22 (excluding ZFX-POLA region)

Table 5.1 Cosmid, PI and YAC clones identified in Xp22

Probe Cosmids, Pis YACs

DXS41 lOODOllOl (RLDB) 311F02 100H07115 (RLDB) 173A01 100C02147 (RLDB) 100G0651 (RLDB) 100F0474 (RLDB) 100H05123 (RLDB) DXS274 104A0359 (RLDB) 900H0439 173A01 80C04 87D07 83B05 83B05 Rsa Trp 104H10174 87D07 135C11 900A01126 27C06 900A0472 90000959 900B0218 900B1239 DXS365 104D0286 900H0391 900E01138 900C03129 900E01138 900E01138 (Alu-PCR pool) 900C03129 900G1257 900A0472 900B1239 PDHAl 900G0885 900B01113 900G11101

77 Chapter Five

Probe Cosmids, Pis YACs

900G0885 900G0885 (Alu-PCR pool) 900B01113 900G11101 900A05136 900E03130 DXS1229 900A05136 (PCR screen) DXS443 900E03130 (PCR screen) DXS3424 900E03130 (PCR screen) DXS257 104B1237 900A0922 B0544L * 104C02121 900D05120 (end probe) 700D2487 900G1031 701G125 900G0630 900H1261 901E0312

DXS43 * 104E0920 (RLDB) 900B0544 104A0352 (RLDB) 900B1148 900D11134 900D1064 900C1064 900D0758 900E0640 900E1092 900B0388 900E1112 DXS197 * lOOBlOlll (RLDB) 900G1096 900B12119 900B0544 900E118 900E1092 900B03154 DXS207 * 100H0454 (RLDB) 244G07 GLR'" 104E03132 900F1020 104E0722 900A0230 104B1216 901A084 900D0287 905A1217 DXS9* 104F01173 (RLDB) 312E02 900E11113 900F127 900D0367 (RLDB) - Cosmids which had been isolated by other investigators within the RLS * - YACs identified with these markers were included in a contig constructed by Dr. T. Alitalo (Alitalo et al, 1995).

78 Chapter Five

5.3 Isolation of YACs in the HYP region

5.3.1 YAC Library Screening with Proximal Probes

A cosmid recognised by DXS274, 104A0359, was used as a hybridization probe to screen against high density filter grids of YAC libraries. 6 YACs were identified using this probe, ICRF YAC 900H0439, and CEPH YACs 83B05, 173A01, 80G04, 135C11, and 87D07. Screening with DXS41 against the library filters identified two CEPH YACs, 311F02, and 173A01 (also identified by DXS274).

5.3.2 YAC Clones Identified at the DXS365 locus

DXS365 (Browne et al., 1992) contains repetitive elements that make it unsuitable for direct hybridization to YAC library filters and Southern blots. This probe was therefore used to screen the ICRF flow-sorted chromosome X cosmid library (Nizetic et al., 1991) and one cosmid, 104D0286 appeared positive against a high background. Three ICRF YACs, 900E01138, 900H0391, and 900C03129, were identified with this cosmid, as shown in Fig 5.1.

5.3.3 Link-up of Proximal and Distal YACs

Probe content mapping established that 83B05 (660kb) as well as 173A01 (370kb) contain DXS274 and DXS41 (depicted in the YAC contig figure. Fig 5.2). Genetically, DXS274 has been mapped closer to HYP than DXS41 (Rowe et al., 1993) and therefore an attempt was made to walk from DXS274 towards DXS365, the distal flanking marker. Both end probes generated from 83B05 mapped to Xp22 (see Fig 5.3). These results suggest that this YAC is not chimeric, and this has been verified by FISH. The Trp end probe identified one DXS274 YAC, 80G04, which does not contain DXS41, and this end probe was therefore selected to rescreen YAC library filters. Initially a cosmid, 104H10174 was isolated, and this was used to identify 6 new YACs (ICRF 900A0472, 900D0959, 900B1239, 900B0218, 900A01126 and CEPH 27C06). These results are summarized schematically in the YAC contig map in Fig. 5.2.104H10174 also re-identified two DXS274 YACs, 87D07 and 13511C on the YAC colony filters. 87D07 is however, not identified by this probe on Southern blots containing YAC DNA, suggesting that it may have deleted some sequences during growth.

79 Chapter Five

marker (kb)

6.5

4.3

2.3 2.0

Fig 5.1 DXS365 cosmid, 104D0286, screened against Southern blot of Hmiilll-digested YACs. (Lanes 1-8) lane 1, 900H0391; lane 2, 900E01138; lane 3, 900C03129; lane 4, 900F03127; lane 5, 900G1257; lane 6, 900A0472; lane 7, 900D0959; lane 8, 900A01126. The marker used was lambda Hindlll. YACs 900H0391, 900E01138 and 900C03129 contain the DXS365 locus.

80 HYP

CBN TEL g

mb I BE Nr (B) (B) ? 900E01138 YACs mapped a according to PFG an a ly sis

904B0583

900H0391 900C03129 90081239 900G1257 900D09S9 900A01126 YACs positioned according to probe 90080218 content only

00 904011135 904D0787 90400480 900HO439 904A01173 904F02311

= Chimeric YAC = Deleted YAC

B, M, N, Nr, and E = SssH11, M/ul, N otl, Nrul, and fagi sites respectively

Fig 5.2 YACs identified in the HYP gene candidate region. At the top of the figure, positions of probes and cosmids are represented in a teiomeric to centromeric order. DXS274 detects an internal Nrul fragment in YAC 83B05, and there are two possible positions for the internal Nrul site, as shown by asterisks. At least 6 EagI fragments could be detected in 900A 0472 by hybridization with both HI 0174 and the right arm probe. A0563, a cosmid mapping to 900A 0472, detects relatively large complete digest fragments (W/ul - 280kb, SssHII - 320kb, and EagI - 200kb). Both cosmids F0743 and A 0717 are present in the overlap region between 900A 0472 and 900E011 38. Consideration of the M/ul sites places F0743 closer than A 0717 to the distal end of 900A 0472. Common BssHII fragments In both 900E01138 and 900A 0472 were detected with F0743 (-1 OOkb), and with A 0717 (~40kb). In 900E01138, D0286 strongly detects an 80kb SssHII fragment and a weaker partial digest fragment of ~160kb, and neither can be correlated with a situation at either end of the YAC. The SssHII sites at the ends leave a ~250kb central portion containing an unknown number of sites. It is likely that the 80kb SssHII fragment is situated internally within this region, depicted by bracketed SssHII sites. Chapter Five

123456789 10

1. XYYYY 2. XY 3. Y only 4. XXXX 5. X only 6. Xp22.3-Xqter 7. Xp21.2-Xqter a. Xq13-Xqter 9. Xpter-Xq21 10. X(delXq13-Xq21)

Fig 5.3 Hybridization of Trp end vectorette probe from 83B05, to somatic cell hybrid mapping panel. This mapping panel was comprised of EcoRI digested DNA from the following: the human cell line OXEN (49,XYYYY, (Bishop et al., 1983)); the hamster-human hybrid 853 (Y only, (Burk et al., 1985)), GM1416 (48,XXXX) from the Camden Cell Repository, New Jersey; the mouse-human hybrid MOG (X only, (Povey et al., 1980)); the mouse-human hybrid UCLA B2 (Xp22.3-Xqter, (Curry et al., 1984)); the human cell line HEMAG (Xp21.2-Xqter, (Boyd et al., 1988)), the mouse-human hybrid FIR5 (Xql3-Xqter, (Solomon et al., 1983)); the mouse-human hybrid HORL (Xpter-Xq21, (Goodfellow et al., 1980)); and TAGB8 which is a human male cell line containing a deletion of the region Xql3-Xq21.3 (Tabor et al., 1983). Bands detected indicate that this probe detects a locus between Xp21.3, and Xp22.3. This is a consistent result for the end of a non­ chimeric YAC derived from Xp22.1.

82 Chapter Five

On the distal side of the disease gene candidate region, YAC colony Alu- PCR (Coffey et ah, 1992) was used to amplify sequences from 900E01138, and the pool of products was hybridized directly to YAC library filters. 5 YACs were identified using this walking approach, and of these, 3 had already been identified by the 83B05 Trp end probe (900A0472, 900B1239, and 900D0959). As shown in Fig 5.2, these 3 YACs therefore span the gap between the distal and proximal YACs. The remaining two YACs identified were 900G1257, and 900C03129, (which contains DXS365). Cosmids underlying several of the YACs were identified, by hybridization of YAC inserts to cosmid library filters (Fig. 5.4). Several cosmids, including 104F0347, and 104H10174, were selected for use as probes to verify the overlaps on YAC digest blots.

5.3.4 Co-identification of YACs Using Irradiation Hybrid DNA

Two sets of YACs were identified using irradiation hybrid DNA, three mapped to the HYP region 900G1257, 900A0472, and 900D0959, and the remaining 12 mapped to the ZFX-POLA region (see Chapter 4, and (Francis et al., 1994a)). The three YACs from the HYP region were identified by cosmids 104A0717 and 104A0563, although their exact positions in Xp22 were originally unknown. The YAC walking strategies described above co-identified these YACs and hence localized them between DXS365 and DXS274. A num ber of other YACs from this contig have subsequently identified 104A0717 and 104A0563, from the cosmid library filters. Table 5.2 summarizes data obtained on the YAC clones isolated in the HYP region. All YACs have been mapped to Xp by FISH and those YACs which appear to be chimeric have been identified. Sizes of YACs have been estimated and the probe content for each YAC is described.

5.3.5 PCR screening of HYP YACs

As shown in Fig. 5.5 (A), a PCR assay was used to verify the cosmid and YACs containing the DXS365 locus. A -210bp band was amplified from human genomic DNA, cosmid DNA, and the three YACs 900C03129, 900F01138, 900H0391. A fourth YAC, 83B05, is negative in this assay and hence does not contain the DXS365 locus.

83 Chapter Five

Table 5,2 YACs Identified in the HYP Region

Probe Content

Size Xp22.1 YAC (kb) (RSH) DXS41 DXS274 H10174 A0563 A0717 F0743 DXS365

83B0Sa 660 Not chim. ** * A0472® 650 Not chim. * *** E01138® 560 Not chim. *** 311F02 370 Chim. * 173A01 370 Not chim. ** H0439 250 Chim. * (cotran) 80G04 420 Chim. * ♦ (cotran) 87D07 440 Not chim. * * 135C11 450 Not chim. * * 27C06 400 Not chim. ** B0218 660 Chim. * * A01126 >1000 Chim. ** * * D0959 480 Not chim. **** B1239 660 Not chim. * * ** G1257 290 Chim. ** C03129 570 Not chim. * H0391 680 Chim. * Note: Not chim., not chimeric; Chim., chimeric; and cotrans., cotransformant. The probe content of each YAC is as indicated; asterisks denote a positive hybridization, a Minimal span YACs.

Three additional loci, DXS443, DXS3424 (Econs et al., 1994b), and DXS1229 (Biancalana et al., 1994), known to be on the telomeric side of the HYP gene, initially could not be ordered with respect to DXS365 using HYP family resources. For this reason primer sets were used to ascertain whether they were present in the HYP region. None of these three markers amplified a band from the HYP YAC DNA. Several YACs in a small contig containing the PDHAl locus were however, found to contain these loci (Fig 5.5 (B) and (C) for DXS1229 and DXS443 respectively). This data has since been confirmed, either genetically in the case of DXS443 (Biancalana et al., 1994), or by YAC clone isolation (A. Hanauer, personal communication and Fifth X chromosome workshop report).

84 S

T T s r ? » 5s ■S ?

• \

r . w W-: r U- A »

•M'» i-

00 en

Fig 5.4 Autoradiographs of a cosmid library filter (containing -20000 clones) hybridized with two different YAC inserts. On the left hand side of the figure, hybridization with YAC 87D07 and on the right, 27C06. Several cosmids identified by both YAC inserts are indicated by arrows. Chapter Five

M = Marker phiX174/Haein M-1 23 4567-M

210

B /1353 J078 / 872 '^03 310 ^281,271 ■ .-234 194 -118 72

M12 5 a 7 b cdef gM M12*5 a 7 b cde fgM

Fig 5.5 PCR Screening of HYP YACs. (A) DXS365 PCR assay. (B) DXS1229 PCR assay. (C) DXS443 PCR assay. M, marker lane containing H aelll digested phiX 174; -, empty lane; lane 1, water control; lane 2, human genomic DNA; lane 3, cosmid 104D0286; lane 4, YAC 900C03129; lane 5, 900E01138; lane 6, 900H0391; lane 7, 83B05; lane a, 900A0472; lane b, 900E03130; lane c, 900G11101; lane d, 900B1113; lane e, 900G0885; lane f, 900A05136; lane g, 900A0922; lane *, DXS443 plasmid DNA. As shown in parts (B) and (C) of the figure, a positive control reaction (Alu-PCR) was performed to show that amplification from each of the YAC DNAs was possible. Thick gel bands are evident in the upper half of the gel photograph in (B) as a wide gel comb was used. 86 Chapter Five

5.3.6 PFG Analysis of M inimal Span YACs

Complete Notl and Nrul digests and partial digestions with Eagl, BssHII, and M lul were performed on YACs 83B05, 900A0472, and 900E01138 . Table 5.3 lists the Notl and N rul fragment sizes detected in the complete digest PFG analysis. Initially, it was assumed that DXS274 and DXS41 recognised the same Nrul fragment of approximately 200kb, placing these markers in close proximity. Partial mapping data and a consideration of the sum of total Nrul fragment sizes, suggest that these two markers in fact identify two N rul fragments of similar size, with DXS274 detecting an internal fragment. Fig 5.6 (A) shows Notl digested 900E01138. This YAC was isolated with the DXS365 cosmid, 104D0286, which recognises an internal fragment of approximately 250kb. Fig 5.6 (B) shows the results of probe hybridizations to partially digested YAC 83B05. DXS41 appears to be situated close to the right arm end of this YAC, while DXS274 detects some left arm end fragments.

Table 5.3 Notl and Nrul Fragments (in kb) Detected in the Minimal Span YACs by Probes Across the HYP Region.

YAC 83B05 (660kb) 900A0472 (650kb) 900E01138 (560kb)

Probe Enzyme: Noil Nrul Notl Nrul Noil Nrul Human 660 210 560 650 250 400 (uncut) (uncut) 200 90 150 160 160 130 90 20 Left YAC arm 660 160 90 650 130-150 160 Right YAC arm 660 210 560 650 150 400 DXS41 660 210 DXS274 660 200 H10174 660 160 560 650 F0743 560 650 150 400 A0717 560 650 150 400 DXS365 250 400 Note: 900A0472 and 83B05 are uncut with Nrul and Notl respectively

The completed YAC map presented in Fig. 5.2, shows that DXS274 can be placed minimally 180kb, and maximally 310kb from the left arm end of this YAC, as determined by a BssHII site and Eagl site. Therefore the physical distance separating DXS41 and DXS274 is between 150kb and 400kb.

87 Chapter Five Probes

Human Left YAC Right D0286 F0743 A arm arm N un N un N un N un N un Size (kb)

5 5 0 4 0 0 T

2 5 0

1 5 0

50

B PROBES Right YAC DXS41 DXS274 Left YAC HI 0174 Size arm arm (kb) EBMEBMEBMEBM EBM

650

500

400

300

200 m 100

50

10

Fig 5.6 PFG analysis of two YACs in the HYP contig. A N otl digested 9(X)E01138. N = N otl digested DNA, U = uncut DNA. Left and right YAC arm probes are derived from pBR322, as described in the Materials and Methods section. Two fragments of similar size (130-150kb) are detected by each YAC arm probe, a larger internal fragment (~250kb) is detected by the DXS365 cosmid, 104D0286, with a second internal ~20kb fragment detected in the hybridization with total human DNA. It is likely that the left arm end probe detects two fragments, one of ~130kb and a second partial digest fragment of ~150kb. The cosmid 104F0743 is localized in the overlap region between 900E01138 and 900A0472, and partial digest data indicates that it is situated close to the right arm end. B. Partial digest analysis of 83B05. E = Eagl, B = BssHII, M = M lul. DXS41 detects internal M lul and Eagl fragments which are close to the right arm end of this YAC. DXS41 and DXS274 both detect a 480kb BssHII fragment. DXS274 detects a 360kb M lul fragment and a 300kb Eagl fragment, detected also by the left YAC arm pBR322 probe, and the end cosmid 104H10174. This cosmid is deleted and it only faintly detects the left arm end BssHII fragment (~10kb), however it appears to span a BssHII site, as a second fragment is detected. Two BssHII fragments are similarly detected in YAC 900A0472 by this probe. 88 Chapter Five

104H10174 spans a BssHII site and hence two fragments are detected by this probe in 83B05 and 900A0472. This confirms the alignm ent of these two YACs, and the extent of overlap is estimated to be -ISOkb. DXS274 appears to be localized within a 160kb stretch proximal to this area. There are a large number of Eagl and BssHII sites in the region of overlap between 900A0472 and 900E01138 and difficulties were encountered determining exact sizes of fragments due to their limited resolution in the bottom third of the PFG. However in general, a common order of enzyme sites could be distinguished in this region. 104A0717 potentially detects a microdeletion close to the right arm end of 900E01138, since its hybridization to Hmf^lll-digested DNA shows missing fragments, compared to other YACs (data not shown). This cosmid also contains a BssHII site. As shown in Fig. 5.2, the common Notl site situated -150kb from the right arm end of 900E01138, and -'90kb from the left arm end of 900A0472 may possibly reside in a potential CpG island as BssHII, and Eagl fragments of a similar size are also detected. This data however remains speculative due to the limitations of the map, and lack of a genomic PFG map. The BssHII digestions were almost complete in this analysis, and consequently it was not possible to unequivocally determine the full number of sites in each YAC by consideration of the partial digest fragments. For this reason it has not been possible to define an exact position within 900E01138 for the internal BssHII fragment detected by the DXS365 cosmid,!04D0286. However, this probe also detects internal Notl, M lul and Eagl fragments, suggesting a central location for the DXS365 locus.

5.4 Discussion

A YAC contig has been constructed in Xp22.1 encompassing the X-linked dominant hypophosphatémie rickets gene critical region, and linking the markers DXS365, DXS274 and DXS41. Several different hybridization approaches were used to identify these YAC clones, including screening with all the available single copy probes from the region, generating YAC end probes, and using Alu-PCR pools from a non-chimeric YAC to screen directly against the YAC library filters. This latter approach proved to be a rapid method of chromosome walking, and with the availability of YAC Alu-PCR filters (discussed in Chapter 4), this approach is now being used on a large scale to generate YAC contigs down the entire length of the X chromosome (Ross et al, unpublished). A similar approach using radiation hybrid DNA as the template for Alu-PCR, co-localized several YACs from the HYP contig. Verification of

89 Chapter Five

YAC overlaps was achieved by probe typing of the YACs and hybridization to cosmid library filters (Baxendale et al., 1991), (Russo et al., 1993). A high clone density (17 YACs over -1.5Mb) makes it likely that the entire disease gene critical region is represented in this contig, unless a piece of DNA from this region is particularly unstable in cosmids or YACs. One region of potential instability was detected in a cosmid (104H10174) identified with the left arm end of 83B05. The size of this cosmid was found to be only -20kb, suggesting that some sequences have been deleted, and this was evident in the PFG analysis. Instability in YACs has been widely observed (for example, (Bates et al., 1992), (Palmieri et al., 1993)). In this study, a small deletion was identified in 87D07, in the region of the deleted cosmid. A second potential microdeletion was detected in YAC 900E01138 by hybridization of cosmid 104A0717 to Hindlll- digested YACs. Several other YACs from the contig however, appear to span both deleted regions detected. Therefore it appears that problems of this type may be overcome by obtaining YACs from more than one library, and analyzing as many YACs as possible from the region of instability. Three YACs (900E01138, 900A0472, and 83B05), representing a minimal span across the HYP region, were selected for more detailed analysis involving digestion with rare-cutter enzymes and PFGE. Initially complete digests with Notl and Nrul were performed, however a lack of well-spaced enzyme sites made it difficult to estimate the extent of YAC overlaps. Nevertheless, this data made it possible to orient the YACs relative to each other and to construct a crude map. Partial digestions using the enzymes Mlul, Eagl, and BssHII were performed in an attempt to refine this map. A large number of Eagl and BssHII sites were identified in the distal half of the contig, making it difficult to determine exact fragment sizes from the partial digest data. The PFG analysis did however, allow an estimate of the physical distances between DXS41, DXS274 and DXS365. DXS41 lies between 150kb and 400kb proximal to DXS274, and DXS365 lies between 800kb and 950kb distal to this locus. Verification of the PFG map could be obtained by digesting the remaining YACs from the contig, and electrophoresing digested DNA on several gels with different resolutions. This, in comparison with a genomic PFG map, would enable a more accurate determination of potential CpG islands in this region. In the interests of the HYP gene, the YAC contig described here with its associated PFG map, has provided a useful starting physical map in the search for the mutation causing the disease. The development of a new clone map is currently underway at the level of cosmid, PI, and PAC clones. Additionally, using a combination of family resources ((groups of Dr. M. Econs, and Dr. P. Rowe), the genetic map has been further resolved using markers derived from

90 Chapter Five the YAC, PI and cosmid clones. The next stages of this positional cloning project are described in detail in the following chapter.

91 CHAPTER SIX FURTHER CHARACTERIZATION OF THE HYP REGION

6.1 Introduction

Microsatellite variation in mammalian genomes has been recognised as a powerful tool for linkage mapping (Litt and Luty, 1989), (Weber and May, 1989), (Love et aL, 1990), giving rise to high resolution genetic maps (Weissenbach et al., 1992), (Dietrich et al., 1992; Dietrich et al., 1994). Producing genetic maps in mice is facilitated compared to humans, in the respect that it is possible to specify defined parents, and generate interspecific backcrosses, where scoring of DNA marker variants is enhanced by the evolutionary separation between the species. In this case, the majority of offspring will be informative for a particular marker, and by generating backcrosses of 1000 animals or more, single gene mutations can be localized to within a few hundred kilobases (European Collaborative Group, 1994). Obviously the same strategies cannot be applied in humans, although refining the localization of disease genes can be facilitated by the identification of a large number of multi-generation kindreds. With a refined genetic map and a YAC contig encompassing the region of a disease gene, the search for the mutation can be facilitated by generating a second-generation (E. coE-based) clone map. Cosmids are often used for this purpose, although they are notorious for containing deletions and rearrangements within their cloned DNA. This is likely to be a reflection of their relatively high copy number within the bacterial cell. There has been increasing interest in the last few years on intermediate-capacity cloning systems (eg PI, BAC and PAC), which are also propagated in E. coli, although with a restricted copy number. It is too early to tell which deficiencies, if any, may be associated with these newer systems, however, already PI clones have been isolated where problems have been encountered in isolating stable cosmids (reviewed by (Sternberg, 1994)). Cosmid libraries can, however, be constructed from single chromosomes, (this has not yet been attempted with PI, PAC and BAC systems), and hence non-contiguous parts of chimeric YACs will cease to confuse the analysis. Using a variety of different libraries in parallel is likely to overcome the limitations present in any one system and hence allows an evaluation of the effectiveness of each of the cloning procedures (Hoheisel et al., 1993). Isolation of expressed sequences from YACs, intermediate-sized clones, and cosmids can be performed using strategies such as cDNA selection ((Lovett et al., 1991), (Parimoo et al., 1991), (Korn et al., 1992) and exon-trapping (Buckler 92 Chapter Six et al., 1991). (Church et al., 1994) have recently described an improved exon-trap vector which reduces the number of artefacts which were commonly produced using the original vector system. These included selection of cryptic splice sites in genomic DNA giving rise to cloned genomic fragments, and cloning of vector- only splice products. This new vector has been tested using pools of approximately 10 cosmids (Church et al., 1994), with an average yield of 1.4 exons/cosmid, and this is about half the yield expected when individual cosmids are used. Interestingly, it is apparently possible to isolate exons from non processed pseudogenes using the exon-trapping system (Suzuki et al., 1994), and so results must be treated with caution. cDNA selection techniques will also give rise to artefacts especially when YAC DNAs are used as template. In this case a cocktail of competitor DNAs are required to eliminate the selection of repeat containing cDNAs, and those cross-hybridizing with yeast genes. Neither cDNA selection, nor exon-trapping are likely to yield a completed transcriptional map alone, therefore it is desirable to combine the two techniques in any gene-hunt. Sequencing of cDNA clones and exons, and searching the public databases for a similar sequence, is the quickest way of determining their possible function. Compositional analysis (in the cases where no significant similarity to sequences of known function can be identified), can also be used to identify biologically significant features within a sequence. This chapter describes the detailed characterization of the HYP gene candidate region, which has lead to the development of new microsatellite markers. These were screened against 20 large HYP kindreds (a pooling of resources from two groups of investigators) which improved the resolution of the genetic map considerably. A new contig was developed, composed of cosmid, PI, and PAC clones, and this DNA was used as a template for cDNA selection. Exon-trapping was performed on the periphery of the candidate region, and exons and cDNAs are being characterized by sequence analysis. This analysis of the gene transcripts in the HYP region is still in progress at the time of writing of this thesis.

6.2 A Detailed Analysis of The HYP Region

6.2.1 Isolation of DXS1683 microsatellite marker from 104A0563 cosmid

104A0563 was initially identified by screening Alu-PCR products generated from radiation hybrid cell lines onto cosmid library filters (as described in Chapter 4). This cosmid mapped to the middle of the HYP candidate region, and was used to generate a microsatellite marker, DXS1683 (Econs et al.,

93 Chapter Six

1994a). Two recombination events demonstrated that DXS1683 lies on the centromeric side of the HYP gene (Econs et al., 1994b). Since 104A0563 maps to the central portion of the middle YAC in the minimal span of three covering the region (see Chapter 5), one YAC, 83B05, was subsequently elim inated from further analysis.

6.2.3 Identification of cosmid, PI and PAC clones.

As described in Chapter 5.3.3, inserts from several of the YACs in the HYP contig were PFG-purified, labelled and used as probes against the ICRF X chromosome cosmid library. Several cosmids were then used as probes on Southern blots to confirm overlaps between YACs. When the HYP candidate region was reduced in size to an area encompassed by YACs 900E01138 and 900A0472, these two YAC inserts were then used to screen a variety of additional libraries. These were the ICRF human PI library, the DuPont Merck human PI library, a human PAC library (obtained from Pieter Dejong), and the Lawrence Livermore X chromosome cosmid library. This variety of libraries was chosen for several reasons. A relatively small number of cosmids were identified from the ICRF cosmid library, which may have been an indication that certain areas of the HYP region were unclonable in the cosmid system. For this reason, libraries constructed using the PI, or PAC cloning systems were also screened. Each of the PI libraries has limitations which make it doubtful that either would provide complete coverage of any region: the ICRF library has a relatively low genome coverage (-1-fold), and the DuPont Merck library although a 3-fold genome coverage, has been pooled with 12 clones in each microtitre dish well, which may have lead to a loss in the representation of some clones during growth. The human PAC library is a 4-fold genome coverage, and therefore a desirable one to screen. This library was not initially available, and hence was screened at a later date. The Lawrence Livermore cosmid library was screened for comparison to the ICRF library - this library has been constructed from flow-sorted material derived from a human- hamster cell line, and is therefore likely to have less autosomal contamination than the ICRF library which was constructed from flow-sorted material derived from a human cell line. In fact, the Lawrence Livermore clones from this region have as yet, remained less-well characterized compared to the other cosmid, PI, and PAC clones, although they will be essential if an attempt is made to reconstruct the contig based on cosmids alone. The ICRF cosmid library has been relatively well characterized compared to the other libraries. Consequently, a number of cosmids cross-hybridizing with

94 Chapter Six the YAC vector or yeast genome (and hence appearing in all hybridizations where unrelated YACs were used as probes) were eliminated as artefacts at an early stage of the analysis. For each of the libraries, potentially positive clones were isolated from the frozen stocks and rescreened by hybridization of the original probe to colony lifts. Positively rescreening clones were rearrayed into 96-well microtitre dishes ('repicked subset'). Fig 6.1 summarizes the clones which were identified with YAC inserts 900E01138, and 900A0472 from the ICRF X cosmid library, the ICRF human PI library, the DuPont Merck human PI library (DMPl) and the human PAC library.

6.2.4 Assembly of cosmid/Pl/FAC clones into a contig

Riboprobes were generated from selected clones using T7 or SP6 RNA polymerase. These end probes were hybridized back onto filters containing the subset of positive clones (shown for repicked cosmids in Fig 6.2). In this way overlapping clones were identified. Alu-PCR products from clones were in some cases also used as probes to identify overlapping clones. The assembled contig comprised of cosmid, PI and PAC clones is shown schematically in Fig 6.3, and data are summarised in Table 6.1. Generating SP6 ends was problematic and not always successful, although in most cases T7 ends were produced successfully. Probes were initially used for hybridization without competition, although in some cases 'repetitive' ends were rehybridized after preannealing with total human DNA. The PAC clone, 87121 appears to be ~250kb (data not shown) and has been FISH m apped to Xp22.1. It is not clear at present whether the SP6 end of this clone extends as far as 437F, 1857A, 447F, and 362B cosmid clones. Several of the riboprobes rescued in this region (177A T7, and 437F T7) were repetitive, however cDNAs which map back to this region (see later section) imply that 87121 extends at least as far as 437F. This PAC clone does not appear to contain the DXS1683 centromeric flanking marker (derived from cos635A). PI clone 83M appears to be identical to 83K and is likely to have arisen through well-to-well contamination in the microtitre dish.

6.2.5 Production of an enriched cDNA library

cDNAs were selected using cosmid, PI and PAC clone pools as template DNAs (pools included clones which had not yet been positioned on the map). These clones collectively spanned the HYP region encompassing the markers DXS365 and DXS1683.

95 TEL GEN 9 “O

DXS365 Overlap Region , DXS1683

1 1 1 1 1 1 E01138 1 1 1

1 A0472

Cosm id: 1 Cosm id: Cosmid: 12712G, 451G, 116F, 1494E, 882D, 519E, 1857A, 447F, 798C, 397D, 1 658H, 295G, 947E, 1005C, 177A, 635A, 20911G, 553B, 5012A, 9212A, 1944C, 8211F, 1 437F, 362B 56100, 441 IB, 19510A, 6611G, 17610C, 4210, to 8 6 2 D , 1 4 3 1 OH, 5 1 8H, 7 5 7 E , 5 5 A , 1 1 8 1 1 1 C CD 14111F, 2098E, 928B, 2082C, 21212B 1

ICRF PI : 1 ICRF PI: • ICRF PI: 2124J, 542N ' 1 0 2 2 0 1 4512N, 7051

1 1 1 1

D M P l: 1 DM PI : 1 D M P l: 164N, 1414P, 815M, 48G, 1022A, ' 81L, 22C, 1140, 1750, 1240, 1 31 5K 1 83M, 83K, 203B, 11M, 980, 187C, 28K 516E, 142D, 1810J, 312J, 1516P, 1 1 2 1 D 1 1 EACi lEAC; 1 EAC: 63K, 79H, 3220B, 3220N, 5017M, 11411B, ' 752C, 87121, 92210, 127A16, 16610, 1 1610M, 57241, 18917F, 2181E, 26110B 125E18, 14020, 1906L, 19320M, 307H3 1 1 6 7 1 4 8 1 1

1 Fig 6.1 : Cosmid, PI and RAC clones identified with YACs EOl 138 and A 0472 Chapter Six

Fig 6.2 Riboprobe hybridizations to repicked cosmid subset. (A) Riboprobe from cosmid 1 identifying itself and a new cosmid. (B) Riboprobe from a second cosmid identifying itself and the same new cosmid as in (A).

97 DXS365 DXS1683 B 0218 TEL

E O l138 276C I

A 0472 g-

COS635A COS421D

c o s i s m c PI 4 5 1 2N c o s i8 0 9 F COS20911G PI 203B COS5610D PI 83M/K co si 2 0 1 1 6 cosi 005C (£> COS36ZB PI 11M 00 cosSI 1C £2f6£11G COS447F COS295G ___ COS658H COS1857A co si 77A co sL l4 2 5 COS437F pac2181E

PI 1 0 2 2 0 p a c 5 7 2 ^ PI 7051

p a d 8 9 1 7F PI 28K

p ac87121 ^pac1£1^^ SP6 77

Fig 6.3 Cosmid, FI and PAC contig, assembled using end-probes from clones and Alu-PCR probes. YACs are represented as thick lines uppermost on this diagram. The PAC clone, 87121 is ~ 250kb, its T7 end has been rescued and hybridizes to clones in the centromeric side of this contig. Although the SP6 end has not been successfully rescued it is thought that this clone extends as far as cosmids 437F and 1857A, since several cDNA clones hybridize to these cosmids only, in addition to this PAC clone. This area of the map (encompassing cosmids 362B, 447F, 1857A, 437F) is still being confirmed, although the remainder of the map is supported by the probe data presented in Table 6.1. Cosmid 12011G (39kb) is. likely to be encompassed by 1005C (48kb), and these two cosmids show very similar digestion patterns. Similarly cosmid 20 9 1 1 G (41 kb) is likely to be contained within 56100 (45kb). Deleted cosmids are represented by broken lines, and in the case of 295G, the size of the gap between 1 77A and 611C is still unknown. Cosmids 1 77A and 63 SA, and PAC clone 87121 have been mapped by FISH to the Xp22 region. Chapter Six

riboprobes

PROBE POSITIVES PROBE POSITIVES 635A Alu Cos: 635A 1005C T7 Cos: 611(3 6611G 1005C 658H PI: 7051 PI: 4512N 10220 PAC: 203B PAC: 87121 83M/K 18917F 2181E 203B Alu Cos: 635A 705117 Cos: 611C PI: 10220 Pl/PAC ND 7051 PAC: ND 635A T7 Cos: 635A 20911G T7 Cos: 421D 6611(3 20911(3 658H 5610D PI: 4512N 18111C No Pis No PACs PAC: 87121 635A SP6 Cos: 635A 421D 17 Cos: 421D PI: 83K/83M (+181 li e 17) 20911(3 203B 18111(3 4512N 5610D PAC: 2181E 1809F PI: 7051 PAC: 87121 28KA1U No cos 10220 Alu Cos: 177A PI: 28K (+177A Alu) 295(3 PAC: 57241 Pl/PAC ND 2181E 1610M 658H T7 Cos: 658H 295G T7 Cos: 295(3 6611G 611(3 635A PI: 7051 PI: 83K/83M 10220 203B PAC: 18917F 4512N 87121 No PACs 87121T7 Cos: 6611(3 437F SP6 Cos: 437F 658H 1857A PI: 4512N? Pl/PAC ND PAC: 87121 7051Alu Cos: 12011(3 177AT7 Repetitive 1005C (+437FT7) 5610D 20911G? 611C? Pl/PAC ND loose SP6 Cos: 20911(3 177A SP6 Cos: 177A 1005C PI: 10220 5610D PAC: ND PI: 7051? PAC: 87121 NB: 7', denotes a faint result, and ND, results not determined.

Three different tissue sources were used to generate the random-primed cDNA libraries: fetal liver, fetal brain and adult muscle. 1152 cDNAs were picked into microtitre dishes for each of the template pools, and a total of 3456 cDNAs

99 Chapter Six were robotically spotted at high density onto 7x11cm filter membranes (minigrids). In order to identify cosmid, PI and PAC fragments likely to contain expressed sequences, a complex pool of the enriched cDNA clones was hybridized to a Southern blot of digested cosmid/Pl/PAC clones (see Fig 6.4). Positively hybridizing fragments were excised from a LMP gel, and used as probes to identify cDNAs within the library, by hybridization to the minigrids (fragments were competed with human Cotl DNA prior to hybridization). Total human DNA was also hybridized to the enriched cDNA library in order to identify those clones likely to contain repetitive elements -12% of clones were positive. In some cases, an order could be assigned to the cosmid fragments, according to the shared cDNAs which they identified (data not shown).

6.2.6 A new microsatellite developed from FI clone, 10220

A new microsatellite marker (Cap32, (Rowe et al., 1994)) was developed from PI clone, 10220 (see Fig 6.3) and this was used to screen against the HYP families. Two recombination events were detected implying that the disease gene lies proximal to this new microsatellite (however, it should be noted that in one case, the parents of the 'recombinant' individual were deceased and so their genotypes had to be inferred). The closest flanking markers to the disease gene are therefore Cap32 on the telomeric side and DXS1683 on the centromeric side. The Cap32 marker has been used to screen the cosmid, PI, PAC and Y AC clones in this area, and amplification products have been obtained from the YACs 900A0472 and 900E01138, and the parent PI clone 10220. The 900A0472 Y AC alone, therefore appears to contain the entire HYP gene candidate region. Interestingly, the large PAC clone, 87121, which appears to encompass the majority of this region, and cosmids apparently underlying the 10220 PI clone do not give amplification products with this marker. However, one of the cosmids in this region is clearly deleted, and there is also the possibilty that the PAC clone contains a small deletion in the vicinity of this locus. Alternatively, the PAC clone and cosmids may not extend as far distal as the Cap32 marker.

6.2.7 Exon-trapping experiments

Preliminary exon-trapping experiments have been performed, resulting in the isolation of exons from two areas of the HYP contig.

100 Chapter Six

Marker

1 23456789 10

23.1

Fig 6.4 Selected cDNAs hybridized to digested cosmid and PAC clones. The marker used was lambda Hindlll. Lane 1, 611C; lane 2 ,18917F; lane 3 , 1005C; lane 4, 5610D; lane 5 ,12011G; lane 6, 20911G; lane 7 ,1809F; lane 8, 421D; lane 9 ,18111C; and lane 10 is the large PAC clone, 87121. The differing signal intensities observed are likely to reflect the relative cDNA abundance in the complex PCR probe, and the length of the exons, although the PAC clone 18917F is a lower concentration on the gel. Fiybridization with total human DNA (data not shown) gave a substantially different pattern from that shown here.

1 0 1 Chapter Six

Initially an attempt was made to isolate exons using a YAC insert as template DNA. YAC clone, 900A0472 was selected for this purpose, and PFG- purified YAC insert DNA was digested with BamHl and Bglll enzymes and ligated into the pSPL3 vector. Ligated material was electroporated into XLl-Blue cells. To estimate the efficiency of using a YAC for exon trapping, a DNA pool was prepared from only 15 recombinant clones, for electroporation into COS-7 cells. Isolation of RNA, exon amplification, and cloning of the PCR product pool was performed as described in the Materials and Methods section. 30 cloned products were reamplified by PCR, and 5 different species (as judged by their different migration patterns on an agarose gel) were mapped back to the cosmid/Pl/PAC clones. Two potential exon products (exons 1 and 2) were isolated which identified specific clones from the contig, although these were both located just outside of the HYP candidate region (centromeric of DXS1683 marker). Gel-purified YACs and other complex sources of DNA can be used for exon-amplification (Church et al., 1994), (Nehls et al., 1994), however, exon recovery is reduced compared to using single cosmids as starting DNA source (Church et al., 1994). Additionally, the similarity between human and yeast splice sites, means that it is possible to trap yeast exons in addition to human ones (Nehls et al., 1994). For these reasons, and because the HYP gene candidate region was reduced to only approximately half of the 650kb 900A0472 YAC, no further exon-trapping experiments were performed from the YAC DNA. Instead two PI clones, 83M and IIM, and one cosmid, 177A were used in the next round of exon-trapping experiments. The procedure was as described for the YAC DNA although minipreparations of the Pis and cosmids were used as starting DNA, and -2000 pSPL3 recombinants (scraped from an agar plate) were used to prepare DNA for electroporation into COS-7 cells. PI clone IIM has been mapped to just outside the candidate region, and so the number of potential exon products isolated from this clone has thus far not been estimated. 5 different potential exon products were isolated from the 83M PI clone (exons 3 to 7, although exon 5 contains repetitive elements), and 3 from the 177A cosmid (exons 8 to 10). Exons (except exonS) were hybridized to the selected cDNA library (see Fig 6.5), and all exons except exon 8 identified cDNA clones. Exons 3- 10 were sequenced as described below.

102 Chapter Six

B

Fig 6.5 Hybridization of two different exons to the selected cDNA library. Exons were labelled by radioactive PCR using a single primer. Each cDNA grid contains 3456 clones, spotted without internal duplications.

103 Chapter Six

6.2.8 Sequence analysis

122 cDNAs have been identified from the selected cDNA library by hybridization with 8 exons and 29 cosmid/PAC fragments (a range of 1 to 16 cDNAs were identified per probe). An additional 7 fragments (from two regions of the contig) were hybridized, although no positive cDNAs were identified with these probes. The positive cDNAs are in the process of being mapped back to the colony cosmid/Pl/PAC filters, and X chromosome mapping panels. Exons and cDNAs were amplified by colony PCR using pSPORT primers 3/86 and 5/86. The average size of a cDNA insert was 700bp, and the average size of the exon products was 220bp. cDNAs were arranged in bins according to their localization within the contig, and cDNAs from each bin are being sequenced. PCR products were initially purified through columns, and one eighth of the purified product was used as template in an ABI DyeDeoxy Terminator cycle sequencing reaction. Exons were sequenced from one side only using SD5 primer, cDNAs were dual­ end sequenced using 3/86 and 5/86 primers. Sequences were transferred to a Sun SPARC station where vector sequences, and unreliable sequences at the 3' end of the read were clipped using a trace editor program (ted) from the Staden package. Overlapping sequences were identified using the sequence assembly program xbap (Staden package), and contigs were used in database searches using the BLAST program (blastn and blastx). Data generated to date from sequenced cDNAs are presented in Table 6.2.

6.3 Discussion

The HYP gene candidate region has been reduced to a remarkably small size during the course of this project. The resolution of the region to only approximately 300kb may be indicative of a highly recombinogenic region. The cosmid/Pl/PAC contig encompassing the new flanking markers for the disease, DXS1683 and Cap32, consists of 14 cosmids, 5 PI clones and 4 PAC clones. The 'weakest' area in this contig is spanned by a single PAC clone, although the possibility remains that clones from this region are present in the repicked subset, but have not yet been identified. Neither of the PI libraries used in this study gave complete coverage of the region, although the ICRF library was apparently as useful as the DuPont Merck library in the region analysed in detail (between Cap32 and DXS1683). However, as previously mentioned, the commercially available copy of this library was a pooled version, which consists of 12 clones per microtitre dish well. In the production of library filters for hybridization screening, it is necessary to make duplica copies of the library and

104 Chapter Six hence subject the clones to further cycles of growth. This may have led to the loss of some clones by competitive growth.

Table 6.2 Analysis of cDNAs cDNA Orig. -Size How comment 3' seq 5'seq blastn blastx probe (bp) mapped (bp) (bp)

C3A6 exon9 500 CPP& MP:- 297 278 none none MP 1 band o/lap o/lap (conserved hamster)

C2H20 exon9 400 CPP 259 266 none neutral o/lap o/lap endopeptidase P = 4.2 XlO-7

C3H22 87121 500 CPP& MP:-1 band 303 309 none none -31 MP o/lap o/lap P2P23 exonlO 750 CPP o/lap with 293 261 none none exonlO

C1I8 87121 700 CPP bad 5' seq 224 0 none none -42 (3) (3) C2J22 5610D 700 CPP 233 299 none none -16

C1J4 5610D 800 CPP& MP:-1 band 264 285 none none -16 MP C1J6 5610D 700 CPP& MP:-5 144 262 none none -16 MP bands C3P20 5610D 800 CPP 173 267 none none -16 Pa2H13 421D 650 CPP 176 307 none none -7a

PalM21 5610D 500 CPP 306 274 none none -17 o/lap o/lap

PalI3 5610D 1000 CPP 300 249 none none -15 Pa3Fl3 5610D 650 CPP 325 332 none none -15 o/lap o/lap PalK7 5610D 600 CPP 295 337 none none -15 o/lap o/lap

C1J14 exon2 450 CPP 182 236 X-STS none o/lap o/lap SWXD787 P=4.6X10-40

Orig. probe: the original probe used to identify the cDNAs, CPP and MP refer to the mapping back of the cDNA onto cosmid, PI and PAC clone grids, and X chromosome mapping panels respectively, o/lap indicates that sequencing from both ends of the cDNA has resulted in sequence overlap. In the case of cDNA P2P23, 5' and 3' sequences overlap with exonlO sequence forming a contig, and orienting the cDNA.

105 Chapter Six

Interestingly, the expected number of positives for a 3-fold genome coverage library, was obtained in the primary screen of the DMPC library, however only a proportion of these clones were confirmed as positives in the rescreen and identified with specific probes from the region. The human PAC library gave a good overall representation of the region, although even in PAC clones the coverage is at present, incomplete. Several probes from the region (eg 635A T7) have failed to identify clones from the repicked subset. This may, however, be a reflection of the complex probe (YAC insert) used on the library filters, which failed to detect all the possible clones. Hybridization of less complex probes to the library filters would enable a better evaluation of the library representation. The largest PAC clone identified was 250kb, and although DNA is relatively easy to prepare from this type of clone compared to YAC clones, this size of insert may not be desirable for certain applications, eg exon-amplification and high resolution restriction mapping. In this respect cosmids provide a more easily manipulatable source of DNA, than the intermediate-sized clone collections. There are widely-differing opinions, based on biochemical studies, as to the particular tissue source the HYP gene is most likely to be expressed in. As mentioned in the Chapter One, Ecarot Charrier et al., 1988 have demonstrated an osteoblast defect by transplantation of bone cells from Hyp mice into normal mice. In different studies however, primary cultures of renal epithelial cells from Hyp mice have also been shown to exhibit defects in phosphate transport and vitamin D metabolism (Bell et al., 1988). Considering these conflicting reports, it is not easy to decide in which cDNA library to search for the HYP gene. Brain cDNA libraries are known to express a large number of genes, also expressed in other tissues (Bande and Hahn, 1976). For this reason an enriched cDNA library was constructed for the HYP region using fetal brain, fetal liver and adult muscle cDNAs. Using a complementary strategy, preliminary exon-trapping experiments have been performed, which allow the isolation of exons irrespective of their tissue specificities. Probes which do not identify cDNA clones from the selected library will be screened against alternative cDNA libraries (eg bone). If these probes are cosmid fragments, then specific cosmids can be selected for exon amplification experiments. The selected cDNA library has a higher percentage of clones likely to contain repetitive elements, than one might expect (Moyzis et al., 1989), (Crampton et al., 1981). This may be due to an inefficient quenching of repeats in the template DNA used to select the clones, since there are areas in the contig (in the vicinity of the Cap32 marker) which appear to have a high repetitive element content. These repetitive cDNAs may also be due in part, to incomplete splicing

106 Chapter Six

events (and hence the presence of intergenic regions) occurring in the randomly primed cDNAs. Further characterization of the library, by sequencing of cDNAs and comparison to the same regions in genomic DNA will give a better estimation of this particular problem. Lower copy repeat-containing clones may be more difficult to detect, since these clones may apparently map back to the genomic clones. However, screening of cDNAs against mapping panels will almost certainly identify these artefacts. As yet the exons identified in this study have not been analyzed in sufficient detail to reveal any artefacts. The exon recovery, however was as expected from previous studies - 3 exon products were derived from an individual cosmid, and 5 derived from an 80kb PI clone. Table 6.2 summarises the cDNA sequence data obtained thus far. It can be seen that most sequences have detected no homology to other sequences in the public databases. cDNA C2H20 is the only clone sequenced so far with an interesting and significant match (P=4.2X10'^). Neutral endopeptidase (NEP) Cenkephalinase') is a type II integral membrane glycoprotein located in the plasma membrane of many tissues, including kidney brush border membranes (Devault et al., 1987). This enzyme cleaves peptide bonds on the amino side of the hydrophobic amino acids and inactivates a variety of peptide hormones. The homology between C2H20 and NEP is in the 5' region which apparently encodes for half of the transmembrane domain, and the first two exons of the extracellular domain. The human 5' sequences of NEP have been found to be homologous to those found in rat brain and rabbit kidney (D'Adamio et al., 1989), and indeed, similarly significant matches to those proteins were also identified in this study. The significance of these results have yet to be determined, especially since NEP itself has been previously localised to chromosome 3 (Barker et al., 1989). Further studies need to be carried out to examine this Xp22 gene and assess its involvement in the HYP disorder.

107 CHAPTER SEVEN FINAL DISCUSSION AND CONCLUSIONS

7.1 Clone Libraries

Clone libraries are essential for the molecular analysis of genomes, and the work described in this thesis has involved both the production and the use of such libraries. The first part of this thesis work involved the preparation of libraries constructed using the PI cloning system. This system has been developed more recently than either YAC or cosmid systems, and hence it was necessary to evaluate the efficiency with which clones could be produced, the ease of handling PI clones and the potential advantages and deficiencies of the system. The second part of this thesis work has involved the analysis of a poorly characterized region of the human X chromosome in an ongoing search for the HYP gene, and this project involved the use of a variety of genomic and cDNA libraries.

7.2 PI Cloning System

As discussed in Chapter 3, collections of intermediate-sized clones have emerged as useful complements to YAC and cosmid libraries in physical mapping projects. The PI cloning system is discussed in most detail in this thesis, however, the BAC system, and the PAC system also fall into this category. One disadvantage of the PI cloning system compared to the cosmid cloning system is the relative inefficiency with which clones can be produced. This fact is highlighted by the few mammalian PI libraries which have been constructed. Preparing clones by either the sucrose gradient fractionation method or the PFGE protocol described here, generally result in a low efficiency of clone recovery (~10^/|ig insert DNA). This is likely to be caused by two factors, firstly the need to produce insert fragments which are within the narrow size range, and secondly the inefficiency of the in vitro packaging reaction. This latter point has been addressed by (Sternberg et al., 1994), who have described a new lysis- defective PI lysogen for the production of head-tail extracts. This apparently allows the production of more concentrated extracts, and leads to a better clone recovery. Alternatively, the need for a packaging reaction has been eliminated in the PAC cloning system whereby recombinant molecules are electroporated directly into host cells. The major advantage of PAC cloning is that larger inserts

108 Chapter Seven

can be produced, however, even in this system a good clone recovery is highly dependent upon efficient electrocompetent cells. Regions of instability and deficiencies encountered in both YACs and cosmids can sometimes be overcome by the use of Pis, BACs and PACs, although deficiencies in these newer systems may also be revealed as the libraries are characterized further. Indeed, several PI clones have been identified which appear to delete sequences during growth (Sternberg, 1994), (Francis et al., 1994c). Nevertheless, many of the early uses of PI libraries have involved attempts to isolate DNA sequences which are not highly represented or are absent from other types of library. This has provided a rigorous test for library fidelity, and as described previously, PI clones have successfully been isolated in several problematic regions (Baxendale et al., 1993), (Halford et al., 1993a), (Casser et al., 1994), (Sternberg, 1994). Recently, there have been an increasing number of published reports describing the choice of PI clones over YAC or cosmid clones, for more general mapping purposes. For example, in FISH analysis where they have been used to determine chromosomal localizations of genes (Nicolaides et al., 1994), (Hewitt et al., 1994), and in clinical and tumour interphase cytogenetics where their potential use as diagnostic tools has been evaluated (Lengauer et al., 1994). PI clones have been used for direct selection of cDNAs (Futreal et al., 1994a) and this thesis work, and also for the detailed characterization of gene families (SouthardSmith et al., 1994). With their ability to encompass large genes, PI clones are also proving useful in functional mapping studies. This is illustrated by work performed by (McCormick et al., 1994) who have used a PI clone containing the apo-B gene to transfect cultured cells, and in the same studies have successfully produced transgenic mice by microinjection. Larger scale projects involving PI clones, for example mapping of the Schizosaccharomyces and Drosophila genomes (Hoheisel et al., 1993) and (Hartl et al., 1994) have similarly illustrated the effectiveness of the PI cloning system as a tool in genome analysis. Projects of this type have often employed hybridization screening approaches using complex probes, for example small pools of probes derived from the ends of cosmids (Hoheisel et al., 1993), and pools of probes derived from microdissected material (Lengauer et al., 1994). YAC clones (either PFG-purified inserts, or IRS-PCR products) and pools of cDNA clones can also be used to screen library filters, these approaches contributing to high resolution physical maps and transcriptional maps respectively. Estimates of chimerism in several different PI libraries are well below the estimates for YAC libraries: for example, less than 2% detected in the analysis of more than 3000 Drosophila PI clones (Hartl et al., 1994), less than 5% in more than 1000 human and mouse PI clones distributed through Genome Systems Inc. (Sternberg, 1994), and 0/46

109 Chapter Seven

chimeric clones were detected in one study using the human PI library described in this thesis (Lengauer et al., 1994).

7.3 The Human Xp22 Region

Radiation hybrids can be used for long range mapping purposes in an approach which uses inter-Alu PCR to amplify their human components. Alu- PCR products can be hybridized to clone libraries in an attempt to isolate the genomic DNA represented within the radiation hybrid fragments. This approach can be integrated with other genetic and physical mapping strategies (Monaco et al., 1991), (Zucman et al., 1992), (Kumlien et al., 1994). Hybrids produced using high doses of irradiation are useful for the detailed analysis of small regions, especially where few markers have previously been identified. In a different approach clone libraries can be constructed from hybrid cell lines in order to isolate new markers or cloned fragments from a region of interest (for example, in the cloning of the myotonic dystrophy gene (Brook et al., 1992)). One approach described in this thesis, for the characterization of the human Xp22 region has involved the use of Alu-PCR products from radiation hybrid cell lines hybridized to X chromosome cosmid library filters. The hybrid cell lines were produced by high-dose irradiation of a human-hamster hybrid donor which contained an X chromosome as its sole human component (Benham et al., 1989). These hybrid DNAs had been previously characterized by marker content, and contained multiple human fragments in each line. Two of the hybrid cell lines containing fragments from the Xp21.3-p22 region were single-cell cloned, that is allowing segregation of some of the fragments into different subclones. These subclonal lines were characterized by comparison of Alu-PCR profiles and mapping back of gel-purified Alu-PCR products to an X chromosome mapping panel (Benham and Rowe, 1992). In this way the number of fragments retained in a subclonal line was assessed. As described in Chapter 4, it was possible to isolate cosmid and YAC clones by hybridization of pools of Alu-PCR products onto gridded library filters. A YAC contig covering the ZFX- POLA region of Xp21.3-p22.1 was assembled, however, disappointingly only three YACs were obtained from the more distal Xp22 region using this approach, where the HYP disease gene has been mapped. The results obtained are likely to reflect both the original fragment content of the hybrid subclones, and the distribution and orientation of the Alu elements in Xp21.3-p22.1. The inter Alu-PCRs were performed using a primer specific to the 3' consensus Alu sequence (Cotter et al., 1990). In the template DNA, Alu repeats must be sufficiently close to one another, with sufficient homology to the

110 Chapter Seven primer (Tagle and Collins, 1992), (Parrish and Nelson, 1994), and in the appropriate orientation for successful inter-Alu amplification. It is known that the average spacing between Alu repeats is approximately 4 kb (Moyzis et al., 1989), although they are distributed between Alu-rich and Alu-poor domains. Two markers from the HYP region identified in several of the radiation hybrid subclones were DXS274 and DXS41 (see Fig 7.1), and later analysis by Alu-PCR of YACs containing these markers indicate that they are localized in an Alu-poor region. This may partly explain the paucity of clones identified in the vicinity of these loci using the Alu-PCR approach. Reamplification of the hybrid DNAs with a primer specific for the 5' region of the Alu consensus sequence (Monaco et al., 1991), (Cotter et al., 1990), or combining 3' and 5' primers, or using alternative repeat family primers could be performed in this region for a more exhaustive analysis. To summarize, the subclonal lines were originally shown to contain X chromosome markers in both regions in which YACs were finally isolated. At that time however, only a limited set of markers were available in the Xp22 region and it was not possible to assess the extent of the region spanned by the hybrid fragments (Benham and Rowe, 1992). From the studies described here, it appears that the hybrid subclonal line 50K19E, which was the primary one used for analysis, contains one fragment of at least 1.5Mb in the ZFX-POLA region, and a second fragment of at least 700kb in the HYP region. These fragment sizes are in fitting with the high doses of irradiation (50000 rads) applied to the donor cells (Walter and Goodfellow, 1993). A parallel approach used to identify YAC clones from the HYP region involved the hybridization of the majority of known markers from this region to YAC clone library filters. In fact a variety of probe types were used for this task including end probes from YACs, whole cosmids (containing Xp22 loci), and Alu-PCR probes. The best results were achieved using whole cosmids as probes, which gave reliable and relatively strong signal intensities. As described previously, the major area of interest was the HYP gene candidate region, although throughout the course of this project close collaborations have been maintained with groups working on other disorders in Xp22: especially Coffin Lowry Syndrome and X-linked juvenile retinoschisis. A 1.5Mb YAC contig was assembled which encompassed the HYP region, and although microdeletions were detected in two of the YACs, no major problems of instability were encountered.

Ill Chapter Seven

DXS365 DXS274

D X S4, TT Xp

M1C2 2 2 .3 DXS225

22.2 DXS43 Mb 22.1 DXS365 0 DXS274

ZFX 2 1 .3 POLA

21.2 DMD

21.1 ore ZFX POUV

1 1 .4

1 1.3 DXS7

1 1 .2 3

11.22

1 1.21 Mb 1 1.1 ^3

Fig 7.1 A diagram summarizing the YAC contigs which were developed over the human Xp22 region during the course of this project.

112 Chapter Seven

The YAC inserts were used to identify cosmid, PI and PAG clones by hybridization to high density filter grids. These experiments led to the confirmation of the overlaps between YACs: firstly by the hybridization patterns produced, and secondly by hybridizing selected cosmids onto Southern blots of YAC DNA. In an attempt to isolate a full coverage of intermediate-sized clones for a second generation map, a variety of different cosmid, PI and PAC libraries were screened. As an alternative approach new cosmid subhbraries could have been constructed for each of the YACs, and this approach can give rise to libraries with good coverage. However, problems of instability which are sometimes encountered with cosmid clones would not be overcome by this method (Sanseau et al., 1994), and unforeseen instabilities and rearrangements already present in the YAC clones would continue to cause a problem. The strategy described here, that is using a variety of different clone libraries, has the advantage that clones are derived from different genomic DNA sources. Additionally, a comparison of the different PI libraries available, and use of the new PAC library were interesting experiments to pursue, complementary with the work described in the first part of this thesis. The cosmid/Pl/PAC contig in the HYP region still needs verification, and it may be desirable to attempt to cover the entire region in cosmids alone. In parallel however, the cosmids. Pis and PAC clones are being used as a substrate for the isolation of gene transcripts. cDNAs and exons are themselves useful probes for detecting overlaps and determining order of the genomic clones. As described in Chapter 6, the search for the HYP gene is still ongoing. Exon amplification has been performed at the boundaries of the candidate region, and the whole region has been used to select for cDNA clones from three different cDNA libraries. One exon has been identified which fails to detect clones in the selected cDNA library, a result which could occur for several reasons. It is possible that this exon is derived from a gene which is expressed at an extremely low abundance or not at all in the tissues used to construct the cDNA libraries. Alternatively, it has been suggested that certain genes are not selected in the enrichment procedure, despite being expressed in the tissue sources used, because they contain short exons which fail to hybridize to the genomic template DNA (Lovett, 1994). The benefits of performing the complementary strategies of cDNA selection and exon amplification in parallel are emphasized by this result. The localization of the HYP mutation in Xp22.1, which is a light band with Giemsa staining (or R band), predicts that this region may be relatively rich in genes (Bernardi et al., 1985), (Gardiner et al., 1990), (Holmquist, 1992). However, the exhaustive analysis of this region has not yet been completed and so it is too

113 Chapter Seven early to allow a correlation of gene density with cytogenetic position. Unique cDNAs or exons (as determined by sequence analysis and localization) will be hybridized to Northern blots containing polyA+ RNA from a number of different tissues. These experiments will provide information on the number of genes in the region, and their patterns of expression. Each gene must then be assessed for potential involvement in the disorder. Indication of a likely candidate gene can sometimes be provided by sequence analyses which allow a function assignment (for example, in the positional cloning of the diastrophic dysplasia gene, (Hastbacka et al., 1994)). Several obvious candidate genes for HYP have already been eliminated by virtue of their autosomal location. These include a renal proximal tubular, brush border membrane sodium-phosphate cotransporter which has been localized to chromosome 5q35 (Kos et al., 1994), the first enzyme involved in the catabolism of 1,25(0H)2D3 which has been localized to chromosome 20ql3.1 (Labuda et al., 1992), and the enzyme which leads to the production of l,25(OH)2D3 which appears to be the basic defect in an autosomal form of rickets which has been linked to 12ql4 (Labuda et al., 1992). Additionally, mutations in the vitamin D receptor gene have been identified which are associated with vitamin D- dependent rickets type II, which is an autosomal disorder (Hughes et al., 1988). If genes from the region give no indication of involvement in the disease then each gene must be systematically analyzed for mutations in patients compared to normal individuals. cDNAs or genomic fragments can be hybridized to patient DNA panels in order to search for abnormalities: trinucleotide repeat expansions will give rise to larger fragments (eg. (Harley et al., 1992), (Orr et al., 1993), (Knight et al., 1993), gross rearrangem ents and deletions can be detected on PFGE blots (eg. (Trofatter et al., 1993)), and these can give rise to completely deleted fragments on frequent cutter blots (illustrated in Menkes disease by (Chelly et al., 1993)). Some point mutations may also be detectable by aberrant fragment patterns (Gitschier et al., 1985), (Vetrie et al., 1993). Additionally, analyses of patient RNA (by RT-PCR or hybridization to Northern blots) can provide information on diminished or aberrant expression of a gene (Mercer et al., 1993), (European PKD Consortium, 1994), (Hastbacka et al., 1994). Intron/exon boundaries within each gene can be identified through knowledge of the surrounding genomic sequence. Primers designed in intronic regions can be used to amplify exons from patient DNA for detection of point mutations by single-stranded conformational analysis (SSCP), a strategy which has been successfully applied for a number of disease genes (eg familial amyotrophic lateral sclerosis, (Rosen et al., 1993)). Splice-site mutations will be revealed by this method or alternatively by direct sequence analysis, as detected

114 Chapter Seven for example, in two McLeod Syndrome patients (Ho et al., 1994). Identification of mutations in unrelated patients compared to normal individuals is usually adequate evidence that a gene responsible for a particular disease has been isolated. Confirmation of this using an animal model is the ultimate proof. To conclude, the mutation causing HYP will almost certainly be identified by one of the strategies described above. It is often debated, that the time and effort involved in a positional cloning project might be better spent in more global mapping approaches. Ultimately, positional cloning may become as simple as a computer search, with the increased resolution and integration of genetic, physical and transcriptional maps.

115 REFERENCES

Adams, M. D., Kelley, J. M., Gocayne, J. D., Dubnick, M., Polymeropoulos, M. H., Xiao, H., Merril, C. R., Wu, A., Olde, B., Moreno, R. F., Kerlavage, A. R., Mccombie, W. R., and Venter, J. C. (1991). Complementary-DNA sequencing - expressed sequence tags and human genome project. Science 252,1651-1656.

Ahrens, P., and Kruse, T. A. (1986). An anonymous X-chromosomal clone identifying a frequent RFLP at Xp21-22 (HGM8 provisional no. DXS207). Nucl. Acid Res. 14, 7819.

Albertsen, H. M., Abderrahim, H., Gann, H. M., Dausset, J., Lepaslier, D., and Cohen, D. (1990). Construction and characterization of a yeast artificial chromosome library containing 7 haploid human genome equivalents. Proc. Natl Acad Sci USA 87, 4256-4260.

Aldridge, J., Kunkel, L., Bruns, G., Tantravahi, U., Lalande, M., Brewster, T., Moreau, E., Wilson, M., Bromley, W., Roderick, T., and Latt, S. A. (1984). A strategy to reveal high- frequency RFLPs along the human X chromosome. Am. J. Hum. Genet. 36,546-564.

Alitalo, T., Francis, F., Kere, J., Lehrach, H., Schlessinger, D., and Willared, H. F. (1995). A 6- Mb YAC contig in Xp22.1-p22.2 spanning the DXS69E, XE59, GLRA2, PIGA, G RPR, CALB3, and PHKA2 genes. Genomics 25,691-700.

Altschul, S. F., Gish, W., Miller, W., Myers, E. W., and Lipman, D. J. (1990). Basic local alignment search tool. J. Mol. Biol. 215, 403-410.

Anand, R., Villasante, A., and Tyler Smith, C. (1989). Construction of yeast artificial chromosome libraries with large inserts using fractionation by pulsed-field gel-electrophoresis. Nucl Acid Res 17,3425-3433.

Aslanidis, C., Jansen, G., Anemiya, C., Shutler, G., Mahadevan, M., Tsilfidis, C., Chen, C., Aleman, J., Wormskamp, N. G. M., Vooijs, M., Buxton, J., Johnson, K., Smeets, H. J. M., Lennon, G. G., Carrano, A. V., Korneluk, R. G., Wieringa, B., and de Jong, P. J. (1992). Cloning of the essential myotonic dystrophy region and mapping of the putative defect. Nature 355, 548-551.

Attree, O., Olivos, I. M., Okabe, I., Bailey, L. C., Nelson, D. L., Lewis, R. A., Mclnnes, R. R., and Nussbaum, R. L. (1992). The Lowe's oculocerebrorenal syndrome gene encodes a protein highly homologous to inositol polyphosphate-5-phosphatase. Nature 358, 239-242.

Ballabio, A. (1991). Contiguous deletion syndromes. Curr. Opin. Genet. Devel. 1, 25-29.

116 Ballabio, A. (1993). The rise and fall of positional cloning. Nat Genet 3, 277-279.

Bantle, J. A., and Hahn, W. E. (1976). Complexity and characterization of polyadenylated RNA in the mouse brain. Cell 8, 139-150.

Barker, P. E., Shipp, M. A., D'Adamio, L., Masteller, E. L., and Reinherz, E. L. (1989). The common acute leukemia antigen maps to chromosomal region 3 (q21-q27). J. Immunol. 142, 283- 287.

Bates, G. P., Valdes, J., Hummerich, H., Baxendale, S., Lepaslier, D. L., Monaco, A. P., Tagle, D., MacDonald, M. E., Altherr, M., and Ross, M. (1992). Characterization of a yeast artificial chromosome contig spanning the Huntingtons-Disease gene candidate region. Nat. Genet 1 ,180- 187.

Baxendale, S., Bates, G. P., MacDonald, M. E., Gusella, J. P., and Lehrach, H. (1991). The direct screening of cosmid libraries with YAC clones. Nucl Acid Res 1 9 ,6651.

Baxendale, S., MacDonald, M. E., Mott, R., Francis, P., Lin, C , Kirby, S. P., James, M., Zehetner, G., Hummerich, H., and Valdes, J. (1993). A cosmid contig and high-resolution restriction map of the 2 megabase region containing the Huntingtons-Disease gene. Nat Genet 4, 181.

Bell, C. L., Tenenhouse, H. S., and Scriver, C. R. (1988). Primary cultures of renal epithelial cells from X-linked hypophosphatémie (Hyp) mice express defects in phosphate transport and vitamin D metabolism. Am. J. Hum. Genet. 4 3 ,293-303.

Bellanne-Chantelot, C., Lacroix, B., Ougen, P., Billault, A., Beaufils, S., Bertrand, S., Georges, I., Glibert, P., Gros, L, and Lucotte, G. (1992). Mapping the whole human genome by fingerprinting yeast artificial chromosomes. Cell 70, 1059-1068.

Benham, P., Hart, K., Crolla, J., Bobrow, M., Prancavilla, M., and Goodfellow, P. N. (1989). A method for generating hybrids containing nonselected fragments of human-chromosomes. Genomics 4,509-517.

Benham, P., and Rowe, P. (1992). Use of Alu-PCR to characterize hybrids containing multiple fragments and to generate new Xp21.3-p22.2 markers. Genomics 1 2 ,368-376.

117 Berger, W., Meindl, A., Vandepol, T., Cremers, F., Ropers, H, H., Doerner, C., Monaco, A., Bergen, A., Lebo, R., Warburg, M., Zergollern, L., Lorenz, B., Gai, A., Bleekerwagemakers, E. M., and Meitinger, T. (1992). Isolation of a candidate gene for Norrie disease by positional cloning. Nat. Genet. 1,199-203.

Bernardi, G., Olofsson, B., Filipski, J., Zerial, M., Salinas, J., Cuny, G., Meunier-Rotival, M., and Rodier, F. (1985). The mosaic genome of warm-blooded vertebrates. Science 228, 953-958.

Biancalana, V., Trivier, E., Weber, C., Weissenbach, J., Rowe, P., ORiordan, J., Partington, M. W., Heyberger, S., Oudet, C , and Hanauer, A. (1994). Construction of a high-resolution linkage map for xp22.1-p22.2 and refinement of the genetic localization of the Coffin-Lowry syndrome gene. Genomics 22,617-625.

Bird, A. P. (1986). CpG-rich islands and the function of DNA méthylation. Nature 321, 209-213.

Bishop, C. E., Guellaen, G., Geldwerth, D., Voss, R., Fellous, M., and Weissenbach, J. (1983). Single-copy DNA-sequences specific for the human Y-chromosome. Nature 3 0 3 , 831-832.

Botstein, D., White, R. L., Skolnick, M. H., and Davis, R. W. (1980). Construction of a genetic linkage map in man using restriction fragment length polymorphisms. Am. J. Hum. Genet. 32, 314-331.

Boyd, Y., Cockburn, D., Holt, S., Munro, E., VanOmmen, G. J., Gillard, B., Affara, N., FergusonSmith, M., and Craig, I. (1988). Mapping of 12 translocation breakpoints in the Xp21 region with respect to the locus for Duchenne Muscular-Dystrophy. Cytogenet. Cell Genet 48, 28-34.

European Collaborative Group (1994). Towards high-resolution maps of the mouse and human genomes - a facility for ordering markers to 0.1 cM resolution. Hum. Mol. Gen. 3,621-627.

Brenner, S., Elgar, G., Sandford, R., Macrae, A., Venkatesh, B., and Aparicio, S. (1993). Characterization of the pufferfish (Fugu) genome as a compact model vertebrate genome. Nature 366, 265-268.

Brockdorff, N., Montague, M., Smith, S., and Rastan, S. (1990). Construction and analysis of linking libraries from the mouse X-chromosome. Genomics 7,573-578.

118 Brook, J. D., McCurrach, M. E., Harley, H. G., Buckler, A. J., Church, D., Aburatani, H., Hunter, K., Stanton, V. P., Thirion, J. P., Hudson, T., Sohn, R., Zemelman, B., Snell, R. G., Rundle, S. A., Crow, S., Davies, J., Shelbourne, P., Buxton, J., Jones, C., Juvonen, V., Johnson, K., Harper, P. S., Shaw, D. J., and Housman, D. E. (1992). Molecular-basis of myotonic-dystrophy - expansion of a trinucleotide (ctg) repeat at the 3' end of a transcript encoding a protein-kinase family member. Cell 68, 799-808.

Brown, R. M., Dahl, H.-H. M., and Brown, G. K. (1989). X-Chromosome localization of the functional gene for the El alpha subunit of the human pyruvate dehydrogenase complex. Genomics 4,174-181.

Browne, D., Barker, D., and Litt, M. (1992). Dinucleotide repeat polymorphisms at the DXS365, DXS443, and DXS451 loci. Hum. Mol. Genet. 1,213.

Buckler, A. J., Chang, D. D., Graw, S. L., Brook, J. D., Haber, D. A., Sharp, P. A., and Housman, D. E. (1991). Exon amplification - a strategy to isolate mammalian genes based on RNA splicing. Proc. Natl Acad Sci. USA 88, 4005-4009.

Burk, R. D., Ma, P., and Smith, K. D. (1985). Characterization and evolution of a single-copy sequence from the human Y-chromosome. Molec. Cell. Biol 5,576-581.

Burke, D. T., Carle, G. P., and Olson, M. V. (1987). Cloning of large segments of exogenous DNA into yeast by means of artificial chromosome vectors. Science 236, 806-812.

Buxton, J., Shelbourne, P., Davies, J., Jones, C., Van Tongeren, T., Aslanidis, C , de Jong, P., Jansen, G., Anvret, M., Riley, B., Williamson, R., and Johnson, K. (1992). Detection of an unstable fragment of DNA specific to individuals with myotonic dystrophy. Nature 355, 547- 548.

Cawthorn, R. M., Weiss, R., Xu, G., Viskochil, D., Culver, M., Stevens, J., Robertson, M., Dunn, D., Gesteland, R., P., O., and White, R. (1990). A major segment of the neurofibromatosis type 1 gene: cDNA sequence, genomic structure, and point mutations. Cell 62,193-201.

Chelly, J., Tumer, Z., Tonnesen, T., Petterson, A., IshikawaBrush, Y., Tommerup, N., Horn, N., and Monaco, A. P. (1993). Isolation of a candidate gene for menkes disease that encodes a potential heavy-metal binding-protein. Nat. Genet 3, 14-19.

119 Chen, Z.-Y., Hendriks, R. W., Jobling, M. A., Powell, J. P., Breakefield, X. O., Sims, K. B., and Craig, I. W. (1992). Isolation and characterization of a candidate gene for Norrie disease. Nat. Genet. 1, 204-208.

Chumakov, I., Rigault, P., Guillou, S., Ougen, P., Billaut, A., Guasconi, G., Gervy, P., Legall, I., Soularue, P., and Grinas, L. (1992). Continuum of overlapping clones spanning the entire human chromosome-21q. Nature 3 5 9 ,380-387.

Church, D. M., Stotler, C. J., Rutter, J. L., Murrell, J. R., Trofatter, J. A., and Buckler, A. J. (1994). Isolation of genes from complex sources of mammalian genomic DNA using exon amplification. Nat. Genet. 6, 98-105.

Church, G. M., and Gilbert, W. (1984). Genomic sequencing. Proc. Natl Acad Sci USA 8 1 ,1991- 1995.

Coffey, A. J., Roberts, R. G., Green, E. D., Cole, C. G., Butler, R., Anand, R., Giannelli, P., and Bentley, D. R. (1992). Construction of a 2.6-Mb contig in yeast artificial chromosomes spanning the human dystrophin gene using an STS-based approach. Genomics 12,474-484.

Cole, C. G., Goodfellow, P. N., Bobrow, M., and Bentley, D. R. (1991). Generation of novel sequence tagged sites (STSs) from discrete chromosomal regions using Alu-PCR. Genomics 10, 816-826.

Collins, P., and Galas, D. (1993). A new five year plan for the US Human Genome Project. Science 262, 43-46.

Collins, P. S. (1992). Positional cloning - lets not call it reverse anymore. Nat.Genet. 1, 3-6.

Consortium, Tuberous Sclerosis. (1993). Identification and characterization of the Tuberous Sclerosis gene on chromosome 16. Cell 75,1305-1315.

Consortium, PKD (1994). The Polycystic Kidney Disease 1 gene encodes a 14 kb transcript and lies within a duplicated region on chromosome 16. Cell 77, 881-894.

Cotter, P. E., Hampton, G. M., Nasipuri, S., Bodmer, W. P., and Young, B. D. (1990). Rapid isolation of human chromosome-specific DNA probes from a somatic-cell hybrid. Genomics 7, 257-263.

120 Coulson, A., Sulston, J., Brenner, S., and Kam, J. (1986). Toward a physical map of the genome of the nematode caenorhabditis-elegans. Proceedings Of The National Academy Of Sciences Of The United States Of America 83, 7821-7825.

Coulson, A., Waterston, R., Kiff, J., Sulston, J., and Kohara, Y. (1988). Genome linking with yeast artificial chromosomes. Nature 335, 184-186.

Cowgill, L. D., Goldfarb, S., Lau, K., Slatopolsky, E., and Agus, Z. S. (1979). Evidence for an intrinsic renal tubular defect in mice with genetic hypophosphatémie rickets. J. Clin. Invest. 63, 1203-1210.

Cox, D. R., Burmeister, M., Price, R., Kim, S., and Myers, R. M. (1990). Radiation hybrid mapping: a somatic cell genetic method for constructing high-resolution maps of mammalian chromosomes. Science 250, 245-250.

Cox, D. R., Pritchard, C. A., Uglum, E., Casher, D., Kobori, J., and Myers, R. M. (1989). Segregation of the Huntington Disease region of human-chromosome 4 in a somatic-cell hybrid. Genomics 4 ,397-407.

Crampton, J. M., Davies, K. E., and Knapp, T. P. (1981). The occurrence of families of repetitive sequences in a library of cloned DNA from human lymphocytes. Nucl. Acid Res. 9 ,3821-3824.

Cremers, F. P. M., van de Pol, D. J. R., van Kerkhoff, L. P. M., Wieringa, B., and Ropers, H.-H. (1990). Cloning of a gene that is rearranged in patients with choroideraemia. Nature 347, 674- 677.

Curry, C., Magenis, R. E., Brown, M., Lanman, J. T., Tsai, J., Olague, P., Goodfellow, P., Mohandas, T., Bergner, E. A., and Shapiro, L. J. (1984). Inherited chondrodysplasia punctata due to a deletion of the terminal short arm of an X-chromosome. New Engl J Med 3 1 1 , 1010-1015.

D'Adamio, L., Shipp, M. A., Masteller, E. L., and Reinherz, E. L. (1989). Organization of the gene encoding common acute lymphoblastic leukemia antigen (neutral endopeptidase 24.11):

Multiple miniexons and separate 5' untranslated regions. Proc. Natl. Acad Sci USA 86, 7103- 7107.

Datson, N. A., Duyk, G. M., Vanommen, G., and Dendunnen, J. T. (1994). Specific isolation of 3'- terminal exons of human genes by exon trapping. Nucl Acid Res 22,41484153.

121 Dear, S., and Staden, R. (1991). A sequence assembly and editing program for efficient management of large projects. Nucl. Acid Res. 1 9 ,3907-3911.

Delattre, O., Zucman, J., Plougastel, B., Desmaze, C., Melot, T., Peter, M., Kovar, H., Joubert, I., de Jong, P., Rouleau, G., Aurias, A., and Thomas, G. (1992). Gene fusion with an ETS DNA- binding domain caused by chromosome translocation in human tumours. Nature 3 5 9 ,162-165.

Devault, A., Lazure, C., Nault, C., Le Moual, H., Seidah, N. G., Chretien, M., Kahn, P., Powell, J., Mallet, J., Beaumont, A., Roques, B. P., Crine, P., and Boileau, G. (1987). Amino acid sequence of rabbit kidney neutral endopeptidase 24.11 (enkephalinase) deduced from a complementary DNA. EMBO J. 6,1317-1322.

Dietrich, W., Katz, H., Lincoln, S. E., Shin, H.-S., Friedman, J., Dracopoli, N. C., and Lander, E. S. (1992). A genetic map of the mouse suitable for typing intraspecific crosses. Genetics 131, 423-447.

Dietrich, W. P., Miller, J. C., Steen, R. G., Merchant, M., Damron, D., Nahf, R., Gross, A., Joyce, D. C., Wessel, M., Dredge, R., Marquis, A., and Stein, L. (1994). A genetic map of the mouse with 4,006 simple sequence length polymorphisms. Nat. Genet. 7, 220—225.

Donis-Keller, H., Green, P., Helms, C., Cartinhour, S., Weiffenbach, B., Stephens, K., Keith, T. P., Bowden, D. W., Smith, D. R., Lander, E. S., al, e., and Abrahamson, J. (1987). A genetic linkage map of the human genome. Cell 51, 319-337.

Drayna, D., and White, R. (1985). The genetic linkage map of the human X chromosome. Science 230, 753-758.

EcarotCharrier, B., Glorieux, P. H., Travers, R., Desbarats, M., Bouchard, P., and Hinek, A. (1988). Defective bone-formation by transplanted Hyp mouse bone-cells into normal mice. Endocrinology 123, 768-773.

Econs, M. J., Barker, D. P., Speer, M. C., PericakVance, M. A., Pain, P. R., and Drezner, M. K. (1992). Multilocus mapping of the X-linked hypophosphatémie rickets gene. J Clin Endocrin Metab75, 201-206.

Econs, M. J., Pain, P. R., Norman, M., Speer, M. C , PericakVance, M. A., Becker, P. A., Barker, D. P., Taylor, A., and Drezner, M. K. (1993). Planking markers define the X-linked hypophosphatémie rickets gene locus. J Bone Min Res 8 ,1149-1152.

122 Econs, M. J., Francis, F., Rowe, P., Speer, M. C., ORiordan, J., Lehrach, H., and Becker, P. A. (1994a). Dinucleotide repeat polymorphism at the DXS1683 locus. Hum Mol Gen 3,680.

Econs, M. J., Rowe, P., Francis, P., Barker, D. F., Speer, M. C., Norman, M., Pain, P. R., Weissenbach, J., Read, A., Davis, K. P., Becker, P. A., Lehrach, H., ORiordan, J., and Drezner, M. K. (1994b). Fine-structure mapping of the human X-linked hypophosphatémie rickets gene locus. J Clin Endocrin Metab 7 91351-1354. ,

Ficher, F. M., Southard, J. L., Scriver, C. R., and Glorieux, P. H. (1976). Hypophosphatemia; mouse model for human familial hypophosphatémie (vitamin D resistant) rickets. Proc. Natl. Acad. Sci USA 73, 4667-4671.

Flvin, P., Slynn, G., Black, D., Graham, A., Butler, R., Riley, J., Anand, R., and Markham, A. F. (1990). Isolation of cDNA clones using yeast artificial chromosome probes. Nucl Acid Res 18, 3913-3917.

Feinberg, A. P., and Vogelstein, B. (1983). A technique for radiolabeling DNA restriction endonuclease fragments to high specific activity. Anal. Biochem. 132, 6-13.

Fisher, S. P., Black, G., Lloyd, S. P., Hatchwell, P., Wrong, O., Thakker, R. V., and Craig, I. W. (1994). Isolation and partial characterization of a chloride channel gene which is expressed in kidney and is a candidate for Dents disease (an X-linked hereditary nephrolithiasis). Hum Mol Gen 3, 2053-2059.

Foote, S., Vollrath, D., Hilton, A., and Page, D. C. (1992). The human Y-chromosome - overlapping DNA clones spanning the euchromatic region. Science 2 5 860-66. ,

Foster, J. W., DominguezSteglich, M. A., Guioli, S., Kwok, C., Weller, P. A., Stevanovic, M., Weissenbach, J., Mansour, S., Young, I. D., Goodfellow, P. N., Brook, J. D., and Schafer, A. J. (1994). Campomelic dysplasia and autosomal sex reversal caused by mutations in an sry-related gene. Nature 372,525-530.

Francis, F., Benham, F., See, C. G., Fox, M., IshikawaBrush, Y., Monaco, A. P., Weiss, B., Rappold, G., Hamvas, R., and Lehrach, H. (1994a). Identification of YAC and cosmid clones encompassing the ZFX-POLA region using irradiation hybrid cell-lines. Genomics 20, 75-83.

Francis, F., Rowe, P., Peons, M. J., See, C. G., Benham, P., ORiordan, J., Drezner, M. K., Hamvas, R., and Lehrach, H. (1994b). A YAC contig spanning the hypophosphatémie rickets disease gene (HYP) candidate region. Genomics 21,229-237.

123 Francis, P., Zehetner, G., Hôglund, M., and Lehrach, H. (1994c). Construction and preliminary analysis of the ICRF human PI library. Genetic analysis: Techniques and Applications 1 1 ,148- 157.

Franco, B., Guioli, S., Pragliola, A., Incerti, B., Bardoni, B., Tonlorenzi, R., Carrozzo, R., Maestrini, E., Pieretti, M., Taillonmiller, P., Brown, C. J., Willard, H. F., Lawrence, C., Persico, M. G., Camerino, G., and Ballabio, A. (1991). A gene deleted in Kallmanns syndrome shares homology with neural cell-adhesion and axonal path-finding molecules. Nature 353, 529-536.

Friend, S. H., Bernards, R., Rogelj, S., Weinberg, R. A., Rapaport, J. M., Albert, D. M., and Dryja, T. P. (1986). A human DNA segment with properties of the gene that predisposes to retinoblastoma and osteosarcoma. Nature 323, 643-646.

Fu, Y.-H., Kuhl, D. P. A., Pizzuti, A., Pieretti, M., Sutcliffe, J. S., Richards, S., Verkerk, A. J. M. H., Holden, J. J. A., Fenwick, R. G., Warren, S. T., Oostra, B. A., Nelson, D. L., and Caskey, C. T. (1991). Variation of the CGG repeat at the fragile X site results in genetic instability: resolution of the Sherman paradox. Cell 67,1047-1058.

Fu, Y.-H., Pizzuti, A., Fenwick, R. G., King, J., Rajnarayan, S., Dunne, P. W., Dubel, J., Nasser, G. A., Ashizawa, T., de Jong, P. J., Wieringa, B., Korneluk, R., Perryman, M. B., Epstein, H. P., and Caskey, C. T. (1992). An unstable triplet repeat in a gene related to myotonic muscular dystrophy. Science 255,1256-1258.

Futreal, P. A., Cochran, C., Rosenthal, J., Miki, Y., Swenson, J., Hobbs, M., Bennett, L. M., Haugenstrano, A., Marks, J., Barrett, J. C., Tavtigian, S. V., Shattuckeidens, D., Kamb, A., Skolnick, M., and Wiseman, R. W. (1994a). Isolation of a diverged homeobox gene, m oxl, from the brcal region on 17q21 by solution hybrid capture. Hum Mol Gen 3,1359-1364.

Futreal, P. A., Liu, Q., Shattuck-Eidens, D., Cochran, C., Harshman, K., Tavtigan, S., Bennett, L. M., Haugen-Strano, A., Swensen, J., Miki, Y., al, e., and Wiseman, R. (1994b). BRCAl mutations in primary breast and ovarian carcinomas. Science 266,120122.

Gardiner, K., Aissani, B., and Bernardi, G. (1990). A compositional map of human chromosome 21. EMBOJ 9,1853-1858.

Gardiner, K., Xu, H., Bonds, W., Tassone, F., Parimoo, S., Sivakamasundari, R., Hisama, F., Rynditch, A., and Weissman, S. (1994). Towards a transcriptional map of human chromosome 21. In Identification of Transcribed Sequences, ed. Hochgeschwender, U. and Gardiner, K., Plenum Press, New York. Vol. pp.

124 Garza, D., Ajioka, J. W., Burke, D. T., and Hartl, D. L. (1989). Mapping the Drosophila genome witli yeast artificial chromosomes. Science 246, 641-646.

Casser, D. L., Sternberg, N. L., Pierce, J. C., GoldnerSauve, A., Feng, H., Haq, A. K., Spies, T., Hunt, C., Buetow, K. H., and Chaplin, D. D. (1994). PI and cosmid clones define the organization of 280 kb of the mouse H-2-complex containing the cps-1 and HSP70 loci. Immunogenetics 39,48-55.

Geraghty, M. T., Brody, L. C., Martin, L. S., Marble, M., Kearns, W., Pearson, P., Monaco, A. P., Lehrach, H., and Valle, D. (1993). The isolation of cDNAs from OATLl at X pll.2 using a 480- kb YAC Genomics 16,440-446.

Ghosh, S., Palmer, S. M., Rodrigues, N. R., Cordell, H. J., Hearne, C. M., Comall, R. J., Prins, J. B., McShane, P., Lathrop, G. M., Peterson, L. B., Wicker, L. S., and Todd, J. A. (1993). Polygenic control of autoimmune diabetes in nonobese diabetic mice. Nat Genet 4,404-409.

Gitschier, J., Wood, W. I., Tuddenham, E., Shuman, M. A., Goralka, T. M., Chen, E. Y., and Lawn, R. M. (1985). Detection and sequence of mutations in the factor-viii gene of hemophiliacs. Nature 315, 427-430.

Goodfellow, P., Banting, G., Levy, R., Povey, S., and McMichael, A. (1980). A human X-linked antigen defined by a monoclonal antibody, somatic cell genetics 6, 777-787.

Goss, S. J., and Harris, H. (1975). New method for mapping genes in human chromosomes. Nature 255, 680-684.

Groden, J., Thiliveris, A., Samowitz, W., Carlson, M., Gelbert, L., Albertsen, H., Joslyn, G., Stevens, J., Spirio, L., Robertson, M., al, e., and White, R. (1991). Identification and characterization of the familial adenomatous polyposis coli gene. Cell 66, 589-600.

Halford, S., Wadey, R., Roberts, C., Daw, S., Whiting, J. A., Odonnell, H., Dunham, I., Bentley, D., Lindsay, E., Baldini, A., Francis, F., Lehrach, H., Williamson, R., Wilson, D. I., Goodship, J., Cross, I., Burns, J., and Scambler, P. J. (1993a). Isolation of a putative transcriptional regulator from the region of 22qll deleted in Digeorge-syndrome, shprintzen syndrome and familial congenital heart-disease. Hum Mol Gen 2, 2099-2107.

125 Halford, S., Wilson, D. I., Daw, S. C. M., Roberts, C., Wadey, R., Kamath, S., Wickremasinghe, A., Burn, J., Goodship, J., and Mattei, M. G. (1993b). Isolation of a gene expressed during early embryogenesis from the region of 22qll commonly deleted in Digeorge- syndrome. Hum Mol Genet 2,1577-1582.

Hamvas, R., Francis, P., Cox, R. D., Nizetic, D., Goldsworthy, M. E., Brown, S., and Lehrach, H. R. (1994). Rapid restriction analysis of YAC clones. Nucleic Acids Research 22,1318-1319.

Hamvas, R., Larin, Z., Brockdorff, N., Rastan, S., Lehrach, H., Chartier, F. L., and Brown, S. (1993). YAC clone contigs surrounding the zfx and pola loci on the mouse X-chromosome. Genomics 17,52-58.

Harley, H. G., Brook, J. D., Rundle, S. A., Crow, S., Reardon, W., Buckler, A. J., Harper, P. S., Housman, D. E., and Shaw, D. J. (1992). Expansion of an unstable DNA region and phenotypic variation in myotonic-dystrophy. Nature 355, 545-546.

Hartl, D. L., Nurminsky, D. I., Jones, R. W., and Lozovskaya, E. R. (1994). Genome structure and evolution in Drosophila - applications of the framework PI map. Proc Natl Acad Sci USA91, 6824-6829.

Hastbacka, J., Delachapelle, A., Mahtani, M. M., Clines, G., Reevedaly, M. P., Daly, M., Hamilton, B. A., Kusumi, K., Trivedi, B., Weaver, A., Coloma, A., Lovett, M., Buckler, A., Kaitila, I., and Lander, E. S. (1994). The diastrophic dysplasia gene encodes a novel sulfate transporter - positional cloning by fine-structure linkage disequilibrium mapping. Cell 7 81073- , 1087.

Herrmann, B. G., Barlow, D. P., and Lehrach, H. (1987). A large inverted duplication allows homologous recombination between chromosomes heterozygous for the proximal t complex inversion. Cell 48, 813-825.

Hewison, M., Karmali, R., and ORiordan, J. (1992). Tumor induced osteomalacia. Clinical Endocrinology 37,382-384.

Hewitt, J. E., Lyle, R., Clark, L. N., Valleley, E. M., Wright, T. J., Wijmenga, C., VanDeutekom, J. C. T., Francis, P., Sharpe, P. T., and Hofker, M. (1994). Analysis of the tandem repeat locus D4Z4 associated with facioscapulohumeral muscular-dystrophy. Hum Mol Genet 3, 1287-1295.

126 Ho, M., Chelly, J., Carter, N., Danek, A., Crocker, P., and Monaco, A. P. (1994). Isolation of the gene for McLeod Syndrome that encodes a novel membrane transport protein. Cell 77,869-880.

Hochgeschwender, U., Sutcliffe, J. G., and Brennan, M. B. (1989). Construction and screening of a genomic library specific for mouse chromosome 16. Proc. Natl. Acad. Sci. USA 8 6 ,8482-8486.

Hoheisel, J. D., Maier, E., Mott, R., McCarthy, L., Grigoriev, A. V., Schalkwyk, L. C., Nizetic, D., Francis, P., and Lehrach, H. (1993). High-resolution cosmid and PI-maps spanning the 14- Mb genome of the fission yeast S-pombe. Cell 73,109-120.

Holmquist, G. P. (1992). Chromosome bands, their chromatin flavors, and their functional features. Am J Hum Gen 51,17-37.

Hosoda, P., Nishimura, S., Uchida, H., and Ohki, M. (1990). An P-factor based cloning system for large DNA fragments. Nucl Acid Res 18, 3863-3869.

Hughes, M. R., Malloy, P. J., Kieback, D. G., Kesterson, R. A., Pike, J. W., Feldman, D., and O'Malley, B. W. (1988). Point mutations in the human vitamin D receptor gene associated with hypocalcémie rickets. Science 2 4 2 ,1702-1705.

loannou, P. A., Amemiya, C. T., Games, J., Kroisel, P. M., Shizuya, H., Chen, C., Batzer, M. A., and Dejong, P. J. (1994). A new bacteriophage Pl-derived vector for the propagation of large human DNA fragments. Nat Genet 6, 84-89.

Kerem, B. S., Rommens, J. M., Buchanan, J. A., Markiewicz, D., Cox, T. K., Chakravarti, A., Buchwald, M., and Tsui, L. C. (1989). Identification of the cystic-fibrosis gene - genetic- analysis. Science 245, 1073-1080.

Kim, U. J., Shizuya, H., Dejong, P. J., Birren, B., and Simon, M. I. (1992). Stable propagation of cosmid sized human DNA inserts in an P-factor based vector. Nucl Acid Res 2 0 ,1083-1085.

Kinzler, K., Nilbert, M., Vogelstein, B., Bryan, T., Levy, D., Smith, K., Preisinger, A., Hamilton, S., Hedge, P., Markham, A., Carlson, M., Joslyn, G., Groden, J., White, R., Miki, Y., Miyoshi, Y., Nishisho, I., and Nakamura, Y. (1991). Identification of a gene located at chromosome 5q21 that is mutated in colorectal cancers. Science 2 5 1 ,1366-1370.

127 Knight, S. J. L., Flannery, A. V., Hirst, M. C., Campbell, L., Christodoulou, Z., Phelps, S. R., Pointon, J., MiddletonPrice, H. R., Barnicoat, A., and Pembrey, M. E. (1993). Trinucleotide repeat amplification and hypermethylation of a CpG island in FRAXE mental-retardation. Cell 74, 127-134.

Korn, B., Sedlacek, Z., Manca, A., Kioschis, P., Konecki, D., Lehrach, H., and Poustka, A. (1992). A strategy for the selection of transcribed sequences in the Xq28 region. Hum Mol Gen 1, 235-242.

Kos, C. H., Tihy, F., Econs, M. J., Murer, H., Lemieux, N., and Tenenhouse, H. S. (1994). Localization of a renal sodium-phosphate cotransporter gene to human-chromosome 5q35. Genomics 1 9 ,176-177.

Krizman, D. B., and Berget, S. M. (1993). Efficient selection of 3'-terminal exons from vertebrate dna. Nucl Acid Res 2 1 ,5198-5202.

Kumlien, J., Labella, T., Zehetner, G., Vatcheva, R., Nizetic, D., and Lehrach, H. (1994). Efficient identification and regional positioning of YAC and cosrnid clones to human chromosome-21 by radiation fusion hybrids. Mamm Gen 5,365-371.

Labuda, M., Lemieux, N., Tihy, F., and Glorieux, F. H. (1992). Cloning and mapping of human 25-Hydroxyvitamin D 24-hydroxylase. In American Society of Human Genetics Annual Meeting, abstract# 1565.

Larin, Z., and Lehrach, H. (1990). Yeast artificial chromosomes - an alternative approach to the molecular analysis of mouse developmental mutations. Genetical Research 56, 2-3.

Larin, Z., Monaco, A. P., and Lehrach, H. (1991). Yeast artificial chromosome libraries containing large inserts from mouse and human DNA. Proc Natl Acad Sci USA 8 84123-4127. ,

Lee, J. T., Murgia, A., Sosnoski, D. M., Olivos, I. M., and Nussbaum, R. L. (1992). Construction and characterization of a yeast artificial chromosome library for Xpter-Xq27.3 - a systematic determination of cocloning rate and X-chromosome representation. Genomics 12,526-533.

Lefebvre, S., Burglen, L., Reboullet, S., Clermont, O., Burlet, P., Viollet, L., Benichou, B., Cruaud, C., Millasseau, P., Zeviani, M., Le Paslier, D., Frezal, J., Cohen, D., Weissenbach, J., Munnich, A., and Melki, J. (1995). Identification and characterisation of a spinal muscular atrophy-determining gene. Cell 80, 155-165.

128 Legouis, R., Hardelin, J.-P., Levilliers, J., Claverie, J.-M., Compain, S., Wunderle, V., Millasseau, P., Le Paslier, D., Cohen, D., Caterina, D., Bougueleret, L., Delemarre-Van de Waal, H., Lutfalla, G., Weissenbach, J., and Petit, C. (1991). The candidate gene for the X- linked Kallmann syndrome encodes a protein related to adhesion molecules. Cell 67,423-435.

Lehrach, H., Drmanac, R., Hoheisel, J., Larin, Z., Lennon, G., Monaco, A. P., Nizetic, D., Zehetner, G., and Poustka, A. (1990). Hybridization fingerprinting in genome mapping and sequencing. Genome Analysis 1,39-81.

Lengauer, C., Henn, T., Onayango, P., Francis, P., Lehrach, H., and Weith, A. (1994). Large Scale Isolation of Human lp36 specific PI clones and their use for fluorescence-in situ hybridization. Genetic Analysis: Techniques and Applications 11,140-147.

Leonardo, E. D., and Sedivy, J. M. (1990). A new vector for cloning large eukaryotic DNA segments in Escherichia-coli. Bio Technology 8, 841-844.

Litt, M., and Luty, J. A. (1989). A hypervariable microsatellite revealed by in vitro amplification of a dinucleotide repeat within the cardiac muscle actin gene. Am J. Hum. Genet 44, 397-401.

Litt, M., and White, R. L. (1985). A highly polymorphic locus in human DNA revealed by cosmid-derived probes. Proc. Natl. Acad. Sci. USA 82, 6206-6210.

Lobaugh, B., Burch, W. M., and Drezner, M. K. (1984). Abnormalities of vitamin D metabolism and action in the vitamin D resistant rachitic and osteomalacic diseases. In Vitamin D, ed. Kumar, R., Martinus Nijhoff Publishing, Boston, The Hague, Dordrecht, Lancaster. Vol. pp. 665-720

Lobaugh, B., and Drezner, M. K. (1983). Abnormal regulation of renal 25-Hydroxyvitamin D-1- alpha yydroxylase activity in the X-linked hypophosphatémie mouse. J. Clin. Invest 71, 400- 403.

Love, J. M., Knight, A. M., McAleer, M., and Todd, J. A. (1990). Towards construction of a high resolution map of the mouse genome using PCR-analysed micro satellites. Nucl. Acid Res. 18, 4123-4130.

Lovett, M. (1994). Fishing for complements - finding genes by direct selection. Trends Genet 10, 352-357.

129 Lovett, M., Kere, J. H., and Hinton, L. M. (1991). Direct selection - a method for the isolation of cDNAs encoded by large genomic regions. Proc Natl Acad Sci USA 8 89628-9632. ,

Huntingtons Disease Collaborative Group (1993). A novel gene containing a trinucleotide repeat that is expanded and unstable on Huntingtons-Disease chromosomes. Cell 72,971-983.

Machler, M., Frey, D., Gal, A., Orth, U., Wienker, T. P., Fanconi, A., and Schmid, W. (1986). X- linked dominant hypophosphatemia is closely linked to DNA markers DXS41 and DXS43 at Xp22. Human Genetics 73, 271-275.

Mahadevan, M., Tsilfidis, C., Sabourin, L., Shutler, G., Amemiya, C., Jansen, G., Neville, C., Narang, M., Barcelo, J., Ohoy, K., Leblond, S., EarleMacdonald, J., Dejong, P. J., Wieringa, B., and Korneluk, R. G. (1992). Myotonic-dystrophy mutation - an unstable ctg repeat in the 3' untranslated region of the gene. Science 255,1253-1255.

Maier, E., Hoheisel, J. D., McCarthy, L., Mott, R., Grigoriev, A. V., Monaco, A. P., Larin, Z., and Lehrach, H. (1992). Complete coverage of the Schizosaccharomyces-pombe genome in yeast artificial chromosomes. Nat Genet 2, 273-277.

Mandel, J. L., Monaco, A. P., Nelson, D. L., Schlessinger, D., and Willard, H. (1992). Genome analysis and the human X-chromosome. Science 2 5 8103-109. ,

McCormick, M. K., Shero, J. H., Cheung, M. C., Kan, Y. W., Hieter, P. A., and Antonarakis, S. E. (1989). Construction of human chromosome-21-specific yeast artificial chromosomes. Proc Natl Acad Sci USA 86, 9991-9995.

McCormick, S. P. A., Linton, M. P., and Young, S. G. (1994). Expression of PI DNA in mammalian cells and transgenic Mice. GATA 1 1 ,158-164.

McKusick, V. A. (1992). In Mendelian Inheritance in Man, ed. The Johns Hopkins University Press, Baltimore and London. Vol. 2, pp.

Meier-Ewert, S., Rothe, J., Mott, R., and Lehrach, H. (1994). Establishing catalogues of expressed sequences by oligonucleotide fingerprinting of cDNA libraries. In Identification of Transcribed Sequences, ed. Hochgeschwender, U. and Gardiner, K., Plenum Press, New York. Vol. pp. 253-260

130 Mercer, J. F. B., Livingston, J., Hall, B., Paynter, J. A., Begy, C., Chandrasekharappa, S., Lockhart, P., Grimes, A., Bhave, M., and Siemieniak, D. (1993). Isolation of a partial candidate gene for Menkes disease by positional cloning. Nat Genet 3, 20-25.

Meyer, R. A., Meyer, M. H., and Gray, R. W. (1989). Parabiosis suggests a humoral factor is involved in X-linked hypophosphatemia in mice. J Bone Min Res 4, 493-500.

Miki, Y., Swensen, J., Shattuck-Eidens, D., Futreal, P. A., Harshman, K., Tavtigian, S., Liu, Q., Cochran, C., Bennett, L. M., Ding, W., al, e., and Skolnick, M. H. (1994). A strong candidate for the breast and ovarian cancer susceptibility gene BRCAl. Science 266, 66-71.

Monaco, A. P., Lam, V., Zehetner, G., Lennon, G. G., Douglas, G., Nizetic, D., Goodfellow, P. N., and Lehrach, H. (1991). Mapping irradiation hybrids to cosmid and yeast artificial chromosome libraries by direct hybridization of Alu-PCR products. Nucl Acid Res 19, 3315- 3318.

Monaco, A. P., Neve, R. L., Collettifeener, C., Bertelson, C. J., Kumit, D. M., and Kunkel, L. M. (1986). Isolation of candidate cDNAs for portions of the Duchenne Muscular-Dystrophy gene. Nature 323, 646-650.

Mosser, J., Douar, A. M., Sarde, C. O., Kioschis, P., Feil, R., Moser, H., Poustka, A. M., Mandel, J. L., and Aubourg, P. (1993). Putative X-linked adrenoleukodystrophy gene shares unexpected homology with ABC transporters. Nature 361, 726-730.

Moyzis, R. K., Torney, D. C., Meyne, J., Buckingham, J. M., Wu, J. R., Burks, C., Sirotkin, K. M., and Goad, W. B. (1989). The distribution of interspersed repetitive DNA-sequences in the human genome. Genomics 4,273-289.

Murray, J. C., Buetow, K. H., Weber, J. L., Ludwigsen, C., Scherpbier-Heddema, T., Manion, P., Quillen, J., Sheffield, V. C., Sunden, S., Duyk, G., Weissenbach, J., Gyapay, G., and et al (1994). A comprehensive human linkage map with centimorgan density. Science 265, 2049-2054.

Murray, J. M., Davies, K. E., Harper, P. S., Meredith, L., Mueller, C. R., and Williamson, R. (1982). Linkage relationship of a cloned DNA sequence on the short arm of the X chromosome to Duchenne Muscular Dystrophy. Nature 300, 69-71.

Muscatelli, P., Monaco, A. P., Goodfellow, P. N., Horscayla, M. C , Lehrach, H., and Pontes, M. (1992). Isolation of new probes from Xql2-]ql3 - an example of the screening of reference libraries with Alu-PCR products from radiation hybrids. Cytogenet Cell Genet 6 1 ,109-113.

131 Muscatelli, F., Strom, T. M., Walker, A. P., Zanaria, E., Recan, D., Meindl, A., Bardoni, B., Guioli, S., Zehetner, G., Rabl, W., Schwartz, H. P., Kaplan, J.-C., Camerino, G., Meitinger, T., and Monaco, A. P. (1994). Mutations in the DAX-1 gene give rise to both X-linked adrenal hypoplasia congenita and hypogonadotropic hypogonadism. Nature 372, 672-676.

Nehls, M., Pfeifer, D., and Boehm, T. (1994). Exon amplification from complete libraries of genomic DNA using a novel phage vector with automatic plasmid excision facility - application to the mouse neurofibromatosis-1 locus. Oncogene 9, 2169-2175.

Nelson, D. L., Ballabio, A., Victoria, M. P., Pieretti, M., Bies, R. D., Gibbs, R. A., Maley, J. A., Chinault, A. C., Webster, T. D., and Caskey, C. T. (1991). Alu-primed polymerase chain- reaction for regional assignment of 110 yeast artificial chromosome clones from the human X- chromosome - identification of clones associated with a disease locus. Proc Natl Acad Sci USA 88, 6157-6161.

Nesbitt, T., Coffman, T. M., Griffiths, R., and Drezner, M. K. (1992). Crosstransplantation of kidneys in normal and Hyp mice - evidence that the Hyp mouse phenotype is unrelated to an intrinsic renal defect. J Clin Invest89, 1453-1459.

Nicolaides, N. C., Papadopoulos, N., Liu, B., Wei, Y. P., Carter, K. C., Ruben, S. M., Rosen, C. A., Haseltine, W. A., Pleischmann, R. D., and Praser, C. M. (1994). Mutations of 2 pms homologs in hereditary nonpolyposis colon-cancer. Nature 371, 75-80.

Nizetic, D., Zehetner, C., Monaco, A. P., Cellen, L., Young, B. D., and Lehrach, H. (1991). Construction, arraying, and high-density screening of large insert libraries of human chromosome-X and chromosome-21 - their potential use as reference libraries. Proc Natl Acad Sci USA 88, 3233-3237.

O'Parrell, P. H., Kutter, E., and Nakanishi, M. (1980). A Restriction map of the bacteriophage

T4 genome. Molec. Gen. Genet. 179,421-435.

OConnor, M., Peifer, M., and Bender, W. (1989). Construction of large DNA segments in Escherichia-coli. Science 244, 1307-1312.

Olson, M., Hood, L., Cantor, C , and Botstein, D. (1989). A common language for physical mapping of the human genome. Science 245,1434-1435.

132 Orr, H. T., Chung, M. Y., Banfi, S., Kwiatkowski, T. J., Servadio, A., Beaudet, A. L., McCall, A. E., Duvick, L. A., Ranum, L. P. W., and Zoghbi, H. Y. (1993). Expansion of an unstable trinucleotide cag repeat in spinocerebellar ataxia type-1. Nature Genet 4, 221-226.

Page, D. C., Disteche, C. M., Simpson, E. M., Delachapelle, A., Andersson, M., Alitalo, T., Brown, L. G., Green, P., and Akots, G. (1990). Chromosomal localization of ZFX - a human gene that escapes X-inactivation - and its murine homologs. Genomics 7,37-46.

Palmieri, G., Romano, G., Casamassimi, A., Durso, M., Little, R. D., Abidi, F. E., Schlessinger, D., Lagerstrom, M., Malmgren, H., and Steenbondeson, M. L. (1993). 1.5-mb YAC contig in Xq28 formatted with sequence-tagged sites and including a region unstable in the clones. Genomics 16, 586-592.

Parimoo, S., Patanjali, S. R., Shukla, H., Chaplin, D. D., and Weissman, S. M. (1991). CDNA selection - efficient PCR approach for the selection of cDNAs encoded in large chromosomal DNA fragments. Proc Nat 1 Acad Sci USA 88, 9623-9627.

Parrish, J. E., and Nelson, D. L. (1994). Practical aspects of fingerprinting human DNA using Alu polymerase chain reaction. Meths. in Mol. Cell. Biol. 5, 71-77.

Pasteris, N. G., Cadle, A., Logie, L. J., Porteous, M., Schwartz, C. E., Stevenson, R. E., Glover, T. W., Wilroy, R. S., and Gorski, J. L. (1994). Isolation and characterization of the faciogenital dysplasia (aarskog-scott syndrome) gene - a putative rho/rac guanine-nucleotide exchange factor. Cell 79, 669-678.

Pierce, J. C., Sauer, B., and Sternberg, N. (1992a). A positive selection vector for cloning high- molecular-weight dna by the bacteriophage-Pl system - improved cloning efficacy. Proc Natl Acad Sci USA89, 2056-2060.

Pierce, J. C., Sternberg, N., and Sauer, B. (1992b). A mouse genomic library in the bacteriophage- Pl cloning system - organization and characterization. Mamm Genome 3,550-558.

Pierce, J. C., and Sternberg, N. L. (1992). Using bacteriophage PI system to clone high molecular weight genomic DNA. Methods Enzymol 549-74.

Pinkel, D., Landegent, J., Collins, C., Fuscoe, J., Segraves, R., Lucas, J., and Gray, J. (1988). Fluorescence insitu hybridization with human chromosome-specific libraries - detection of trisomy-21 and translocations of chromosome-4. Proc Nat 1 Acad Sci USA 85, 9138-9142.

133 Poustka, A., and Lehrach, H. (1986). Jumping libraries and linking libraries - the next generation of molecular tools in mammalian genetics. Trends In Genetics 2,174-179.

Povey, S., Jeremiah, S. J., Barker, R. P., Hopkinson, D. A., Robson, E. B., and Cook, P. J. L. (1980). Assignment of the human locus determining phosphoglycolate phosphatase (PGP) to chromosome 16. Ann, Hum, Genet. London 43,241-248.

Rackwitz, H. R., Zehetner, G., Murialdo, H., Delius, H., Chai, J. H., Poustka, A., Frischauf, A., and Lehrach, H. (1985). Analysis of cosmids using linearization by phage lambda terminase. Gene 40,259-266.

Rao, V. B., Thaker, V., and Black, L. W. (1992). A phage-T4 invitro packaging system for cloning long DNA-molecules. Gene 113, 25-33.

Read, A. P., Thakker, R. V., Davies, K. E., Mountford, R. C., Brenton, D. P., Davies, M., Glorieux, P., Harris, R., Hendy, G. N., King, A., McGlade, S., Peacock, C. J., Smith, R., and ORiordan, J. (1986). Mapping of human X-linked hypophosphatémie rickets by multilocus linkage analysis. Human Genetics 73, 267-270.

Riley, J., Butler, R., Ogilvie, D., Pinniear, R., Jenner, D., Powell, S., Anand, R., Smith, J. C., and Markham, A. P. (1990). A novel, rapid, method for the isolation of terminal sequences from yeast artificial chromosome (YAC) clones. Nucl. Acid Res. 18, 2887-2890.

Riordan, J. R., Rommens, J. M., Kerem, B. S., Alon, N., Rozmahel, R., Grzelczak, Z., Zielenski, J., Lok, S., Plavsic, N., Chou, J. L., Drumm, M. L., lannuzzi, M. C., Collins, P. S., and Tsui, L. C. (1989). Identification of the cystic-fibrosis gene - cloning and characterization of complementary-DNA. Science 245, 1066-1072.

Roberts, L. (1992). 2 chromosomes down, 22 to go. Science 258,28.

Rommens, J. M., lannuzzi, M. C., Kerem, B. S., Drumm, M. L., Melmer, G., Dean, M., Rozmahel, R., Cole, J. L., Kennedy, D., Hidaka, N., Zsiga, M., Buchwald, M., Riordan, J. R., Tsui, L. C., and Collins, P. S. (1989). Identification of the cystic-fibrosis gene - chromosome walking and jumping. Science 245,1059-1065.

Rose, E. A., Glaser, T., Jones, C., Smith, C. L., Lewis, W. H., Call, K. M., Minden, M., Champagne, E., Bonetta, L., Yeger, H., and Housman, D. E. (1990). Complete physical map of the WAGR region of llp l3 localizes a candidate Wilms' Tumor gene. Cell 60, 495-508.

134 Rosen, D. R., Siddique, T., Patterson, D., Figlewicz, D. A., Sapp, P., Hentati, A., Donaldson, D., Goto, J., Oregan, J. P., Deng, H. X., Rahmani, Z., Krizus, A., Mckennayasek, D., Cayabyab, A., Gaston, S. M., Berger, R., Tanzi, R. E., Halperin, J. J., Herzfeldt, B., Vandenbergh, R., Hung, W. Y., Bird, T., Deng, G., Mulder, D. W., Smyth, C., Laing, N. G., Soriano, E., Pericakvance, M. A., Haines, J., Rouleau, G. A., Gusella, J. S., Horvitz, H. R., and Brown, R. H. (1993). Mutations in cu/zn superoxide-dismutase gene are associated with familial amyotrophic-lateral-sclerosis. Nature 362, 59-62.

Ross, M. T., Nizetic, D., Nguyen, C., Knights, C., Vatcheva, R., Burden, N., Douglas, G., Zehetner, G., Ward, D. C., and Baldini, A. (1992). Selection of a human chromosome-21 enriched YAC sub-library using a chromosome-specific composite probe. Nat Genet 1, 284-290.

Rouleau, G. A., Merel, P., Lutchman, M., Sanson, M., Zucman, J., Marineau, C., Hoangxuan, K., Demczuk, S., Desmaze, C., and Plougastel, B. (1993). Alteration in a new gene encoding a putative membrane-organizing protein causes neurofibromatosis type-2. Nature 363,515-521.

Rowe, P., Goulding, J., Read, A., Lehrach, H., Francis, F., Hanauer, A., Oudet, C., Biancalana, V., Kooh, S. W., Davies, K. E., and ORiordan, J. (1994). Refining the genetic-map for the region flanking the x-linked hypophosphatémie rickets locus (Xp22.1-22.2). Human Genetics 93, 291- 294.

Rowe, P., Goulding, J., Read, A., Mountford, R., Hanauer, A., Oudet, C., Whyte, M. P., MeierEwert, S., Lehrach, H., Davies, K. E., and ORiordan, J. (1993). New markers for linkage analysis of X-linked hypophosphatémie rickets. Human Genetics 91, 571-575.

Roy, N., Mahadevan, M. S., McLean, M., Shutler, G., Yaraghi, Z., Farahani, R., Baird, S., Besner-Johnston, A., Lefebvre, C., Kang, X., Salih, M., Aubry, H., Tamai, K., Guan, X., loannou, P., Crawford, T. O., de Jong, P. J., Surh, L., Ikeda, J.-E., Korneluk, R. G., and MacKenzie, A. (1995). The gene for neuronal apoptosis inhibitory protein is partially deleted in individuals with spinal m,uscular atrophy. Cell 80, 167-178.

Roy, S., Martel, J., Ma, S. Y., and Tenenhouse, H. S. (1994). Increased renal 25-hydroxyvitamin- d(3)-24-hydroxylase messenger-ribonucleic-acid and immunoreactive protein in phosphate- deprived Hyp mice - a mechanism for accelerated l,25-dihydroxyvitamin-d(3) catabolism in X-linked hypophosphatémie rickets. Endocrinology 134,1761-1767.

135 Royer-Pokora, B., Kunkel, L. M., Monaco, A. P., Goff, S. C., Newburger, P. E., Baehner, R. L., Cole, F. S., Curnette, J. T., and Orkin, S. H. (1986). Cloning the gene for an inherited human disorder-chronic granulomatous disease-on the basis of its chromosomal location. Nature 322, 32-38.

Russo, J., Sunjevaric, I., Cayanis, E., Rothstein, R., Efstratiadis, A., and Fischer, S. (1993). Cosmid fingerprints for YAC contig identification and STS development. Cytogenet Cell Genet 62, 105-105.

Sambrook, J., Fritsch, E. F., and Maniatis, T. (1989). Molecular Cloning: A Laboratory Manual. Cold Spring Harbor, New York. Cold Spring Harbor Laboratory

Sanseau, P., Jackson, A., Senger, G., Kelly, A., Francis, F., Sheer, D., and Trowsdale, J. (1994). Cloning of the region between hla-dmb and lmp2 in the human major histocompatibility complex. Human Immunology 40,1-7.

Schaefer, L., Ferrero, G. B., Grillo, A., Bassi, M. T., Roth, E. J., Wapenaar, M. C., Vanommen, G., Mohandas, T. K., Rocchi, M., Zoghbi, H. Y., and Ballabio, A. (1993). A high-resolution deletion map of human chromosome-Xp22. Nature Genetics 4,272-279.

Schlessinger, D., Mandel, J.-L., Monaco, A. P., Nelson, D. L., and Willard, H. F. (1993). Report on the Fourth International Workshop on Human X Chromosome Mapping. Cytogenet. Cell Genet. 64,147-170.

Schultz, P. G., and Dervan, P. B. (1983). Sequence-specific double-strand cleavage of DNA by penta-N-methylpyrrolecarboxamide-EDTA-Fe(II). Proc. Natl. Acad. Sci. USA 80, 6834-6837.

Schwartz, D. C., and Cantor, C. R. (1984). Separation of yeast chromosome-sized DNAs by pulsed field gradient gel electrophoresis. Cell 37, 67-75.

Sealey, P. G., Wliittaker, P. A., and Southern, E. M. (1985). Removal of repeated sequences from hybridisation probes. Nucl. Acid Res. 13, 1905-1922.

Shepherd, N. S., Pfrogner, B. D., Coulby, J. N., Ackerman, S. L., Vaidyanathan, G., Sauer, R. H., Balkenhol, T. C., and Sternberg, N. (1994). Preparation and screening of an arrayed human genomic library generated with the PI cloning system. Proc Natl Acad Sci U S A 91, 2629-2633.

Sherman, P., Fink, G. R., and Hicks, J. B. (1986). Laboratory Course Manual for Methods in Yeast Genetics. In ed. Cold Spring Harbor Laboratory, New York. Vol. pp. 127-128

136 Shizuya, H., Birren, B., Kim, U. J., Mancino, V., Slepak, T., Tachiiri, Y., and Simon, M. (1992). Cloning and stable maintenance of 300-kilobase-pair fragments of human DNA in Escherichia- coli using an F-factor-based vector. Proc Natl Acad Sci U S A 89, 8794-8797.

Siddique, T., Phillips, K., Betz, H., Grenningloh, G., Warner, K., Hung, W. Y., Laing, N., and Roses, A. D. (1989). Rflps of the gene for the human glycine receptor on the X-chromosome. Nucl Acid Res 17,1785.

Sinclair, A. H., Berta, P., Palmer, M. S., Hawkins, J. R., Griffiths, B. L., Smith, M. J., Foster, J. W., Frischauf, A.-M., Lovell-Badge, R., and Goodfellow, P. N. (1990). A gene from the human sex-determining region encodes a protein with homology to a conserved DNA-binding motif. Nature 346, 240-244.

Smoller, D. A., Petrov, D., and Hartl, D. L. (1991). Characterization of bacteriophage PI library containing inserts of Drosophila DNA of 75-100 kilobase pairs. Chromo soma 100, 487- 494.

Solomon, E., Hiorns, L., Dalgleish, R., Tolstoshev, P., Crystal, R., and Sykes, B. (1983). Regional localization of the human alpha-2(i) collagen gene on chromosome-7 by molecular hybridization. Cytogenet Cell Genet35, 64-66.

SouthardSmith, M., Pierce, J. C., and Macdonald, R. J. (1994). Physical mapping of the rat- tissue kallikrein family in 2 gene clusters by analysis of PI-bacteriophage clones. Genomics 22, 404-417.

Staden, R. (1987). Computer handling of DNA sequencing projects. In Nucleic Acids and Sequence Analysis: a practical approach, ed. Bishop, M. J. and Rawlings, C. J., IRL Press, Oxford. Vol. pp. 173-217

Starr, T., and Wood, S. (1987). Isolation and characterization of DNA probes from the short arm of the human X chromosome that detect restriction fragment length polymorphisms. Genome 2 9 ,201-205.

Sternberg, N. (1990). Bacteriophage-Pl cloning system for the isolation, amplification, and recovery of dna fragments as large as 100 kilobase pairs. Proc Natl Acad Sci USA 8 7103-107. ,

Sternberg, N. (1994). The PI cloning system - past and future. Mamm Genome 5,397-404.

137 Sternberg, N., Ruether, J., and deRiel, K. (1990). Generation of a 50,000-member human DNA library with an average DNA insert size of 75-100 kbp in a bacteriophage PI cloning vector. New Biol 2, 151-62.

Sternberg, N., Smoller, D., and Braden, T. (1994). Three new developments in PI cloning - increased cloning efficiency, improved clone recovery, and a new PI mouse library. GATA 11, 171-180.

Sutherland, H. P., Pick, E., Francis, P., Lehrach, H., and Frischauf, A.-M. (submitted). Mapping around the Fused locus on mouse chromosome 17. Mamm. Gen.

Suzuki, H., Ozawa, N., Taga, C., Kano, T., Hattori, M., and Sakaki, Y. (1994). Genomic analysis of a NFl-related pseudogene on human chromosome 21. Gene 147,277-280.

Tabor, A., Andersen, O., Lundsteen, C , Niebuhr, E., and Sardemann, H. (1983). Interstitial deletion in the critical region of the long arm of the X-chromosome in a mentally-retarded boy and his normal mother. Human Genetics 64,196-199.

Tagle, D. A., and Collins, F. S. (1992). An optimized Alu-PCR primer pair for human-specific amplification of YACs and somatic cell hybrids. Hum Mol Genet 1 , 121-2.

Tang, L. B., Lenstra, R., Borchert, T. V., and Nagarajan, V. (1990). Isolation and characterization of levansucrase-encoding gene from Bacillus amyloliquefaciens. Gene 96, 89- 93.

Tenenhouse, H. S., Klugerman, A. H., and Neal, J. L. (1989). Effect of phosphonoformic acid, dietary phosphate and the Hyp mutation on kinetically distinct phosphate-transport processes in mouse kidney. Biochim Biophys Acta 984, 207-213.

Tenenhouse, H. S., Scriver, C. R., Mclnnes, R. R., and Glorieux, F. H. (1978). Renal handling of phosphate in vivo and in vitro by the X-linked hypophosphatémie male mouse: Evidence for a defect in the brush border membrane. Kidney International 14, 236-244.

Tennyson, C. N., Klamut, H. J., and Worton, R. G. (1995). The human dystrophin gene requires 16 hours to be transcribed and is cotranscriptionally spliced. Nat Genet. 9, 184-190.

138 Thakker, R. V., Read, A. P., Davies, K. E., Whyte, M. P., Weksburg, R., Glorieux, P., Davies, M., Mountford, R. C., Harris, R., King, A., Kim, G. S., Fraser, D., Kooh, S. W., and O Riordan, J. L. H. (1987). Bridging markers defining the map position of X-linked hypophosphatémie rickets. J. Med. Genet. 24, 756-760.

Ton, C. C. T., Hirvonen, H., Miwa, H., Weil, M. M., Monaghan, P., Jordan, T., van Heyningnen, V., Hastie, N. D., Meijers-Heijboer, H., Drechsler, M., et al and Saunders, G. F. (1991). Positional cloning and characterization of a paired box- and homeobox-containing gene from the aniridia region. Cell 67, 1059-1074.

Trofatter, J. A., Maccollin, M. M., Rutter, J. L., Murrell, J. R., Duyao, M. P., Parry, D. M., Eldridge, R., Kley, N., Menon, A. G., Pulaski, K., Haase, V. H., Ambrose, C. M., Munroe, D., Bove, C., Haines, J. L., Martuza, R. L., Macdonald, M. E., Seizinger, B. R., Short, M. P., Buckler, A. J., and Gusella, J. F. (1993). A novel moesin-like, ezrin-like, radixin-like gene is a candidate for the neurofibromatosis-2 tumor suppressor. Cell 72,791-800.

Uberbacher, E. C., and Mural, R. J. (1991). Locating protein-coding regions in human dna- sequences by a multiple sensor neural network approach. Proc Natl Acad Sci USA 88, 11261- 11265.

Verga, V., and Erickson, R. P. (1989). An extended long-range restriction map of the human sex- determining region on yp, including zfy, finds marked homology on Xp and no detectable Y sequences in an XX male. Am J Hum Genet 44,756-765.

Verkerk, A., Pieretti, M., Sutcliffe, J. S., Fu, Y. H., Kuhl, D., Pizzuti, A., Reiner, O., Richards, S., Victoria, M. F., Zhang, F. P., Eussen, B. E., Vanommen, G., Blonden, L., Riggins, G. J., Chastain, J. L., Kunst, C. B., Galjaard, H., Caskey, C. T., Nelson, D. L., Oostra, B. A., and Warren, S. T. (1991). Identification of a gene (fmr-1) containing a egg repeat coincident with a breakpoint cluster region exhibiting length variation in fragile-X syndrome. Cell 65, 905-914.

Vetrie, D., Vorechovsky, I., Sideras, P., Holland, J., Davies, A., Flinter, F., Hammarstrom, L., Kinnon, C., Levinsky, R., and Bobrow, M. (1993). The gene involved in X-linked agammaglobulinemia is a member of the src family of protein-tyrosine kinases. Nature 361, 226-233.

Vollrath, D., Foote, S., Hilton, A., Brown, L. G., Beerromero, P., Bogan, J. S., and Page, D. C. (1992). The human Y-chromosome - a 43-interval map based on naturally-occurring deletions. Science 258, 52-59.

139 Vulpe, C., Levinson, B., Whitney, S., Packman, S., and Gitschier, J. (1993). Isolation of a candidate gene for Menkes disease and evidence that it encodes a copper-transporting ATPase. Nat Genet 3, 7-13.

Walker, A. P., Muscatelli, P., and Monaco, A. P. (1993). Isolation of the human Xp21 glycerol kinase gene by positional cloning. Hum Mol Genet 2,107-114.

Wallace, M. R., Marchuk, D. A., Andersen, L. B., Letcher, R., Odeh, H. M., Saulino, A. M., Fountain, J. W., Brereton, A., Nicholson, J., Mitchell, A. L., Brownstein, B. H., and Collins, F. S. (1990). Type 1 neurofibromatosis gene: identification of a large transcript disrupted in three NFl patients. Science 249, 181-186.

Walter, M. A., and Goodfellow, P. N. (1993). Radiation hybrids: irradiation and fusion gene transfer. Trends Genet 9,352-356.

Walter, M. A., Spillett, D. J., Thomas, P., Weissenbach, J., and Goodfellow, P. N. (1994). A method for constructing radiation hybrid maps of whole genomes. Nat Genet 7, 22-28.

Wang, T. S.-F., Pearson, B. E., Suomalainen, H. A., Mohandas, T., Shapiro, L. J., Schroeder, J., and Korn, D. (1985). Assignment of the gene for human DNA polymerase alpha to the X chromosome. Proc. Natl. Acad. Sci. USA 82, 5270-5274.

Weber, J. L., and May, P. E. (1989). Abundant class of human DNA polymorphisms which can be typed using the polymerase chain reaction. Am. J. Hum. Genet. 4 4 ,388-396.

Weissenbach, J., Gyapay, G., Dib, C., Vignal, A., Morissette, J., Millasseau, P., Vaysseix, G., and Lathrop, M. (1992). A 2nd-generation linkage map of the human genome. Nature 359, 794- 801.

Winters, R. W., Graham, J. B., Williams, T. F., McFalls, V. W., and Burnett, C. H. (1958). A genetic study of familial hypophosphatemia and vitamin D resistant rickets with a review of the literature. Medicine 37, 97-142.

Wong, S. W., Wahl, A. F., Yuan, P. M., Arai, N., Pearson, B. E., Arai, K., Korn, D., Hunkapiller, M. W., and Wang, T. (1988). Human DNA-polymerase alpha-gene expression is cell-proliferation dependent and its primary structure is similar to both prokaryotic and eukaryotic replicative dna-polymerases. EMBO J 7, 37-47.

140 Yaspo, M.-L., Sanseau, P., Nizetic, D., Korn, B., Poustka, A., and Lehrach, H. (1994). Integrated transcriptional maps of large DNA regions: towards a transcriptional map of human chromosome 21. In Identification of Transcribed Sequences, ed. Hochgeschwender, U. and Gardiner, K., Plenum Press, New York. Vol. pp. 213-228

Zanaria, E., Muscatelli, P., Bardoni, B., Strom, T. M., Guioli, S., Guo, W., Lalli, E., Moser, C., Walker, A. P., McCabe, E. R. B., Meitinger, T., Monaco, A. P., Sassone-Corsi, P., and Camerino, G. (1994). An unusual member of the nuclear hormone receptor superfamily responsible for X- linked adrenal hypoplasia congenita. Nature 372, 635-641.

Zehetner, G., and Lehrach, H. (1994). The reference library-system - sharing biological- material and experimental-data. Nature 367, 489-491.

Zucman, J., Delattre, O., Desmaze, C , Azambuja, C., Rouleau, G., Dejong, P., Aurias, A., and Thomas, G. (1992). Rapid isolation of cosmids from defined subregions by differential Alu-PCR hybridization on chromosome-22-specific library. Genomics 13, 395-401.

141 GENOMICS 2 0 , 75-83 (1994)

Identification of YAC and Cosmid Clones Encompassing the ZFX-POLA Region Using Irradiation Hybrid Cell Lines

Fiona Francis,^ Frances Benham,* Chee Gee See,* M argaret Fox,* Yumiko Ishikawa-Brush,! A n t h o n y P. M onaco,t BiRGiT WEiss,$ Gudrun RAPPOiD,t Renata M. J. Hamvas, and Hans Lehrach

Genome Analysis Laboratory, Imperial Cancer Research Fund, 44 Lincoln's Inn Fields, London WC2A 3PX, United Kingdom; * Department of Genetics and Biometry and the MRC Fluman Biochemical Genetics Unit, Wolfson House, University College London, London, United Kingdom; tHuman Genetics Laboratory, Imperial Cancer Research Fund, Institute of Molecular Medicine, John Radcliffe Hospital, Oxford 0X3 9DU, United Kingdom; and tinstitut fur Humangenetik und Anthropologie, Ruprecht-Karls-Università't, Im Neuenheimer Feld 328, 6900 Heidelberg, Germany

Received August 25, 1993; revised November 20, 1993

INTRODUCTION The human Xp21.3—p22.1 region is poorly mapped relative to other X chromosome regions. To target cos­ The development of YAC contigs over well-defined mid and YAC clones specifically from Xp21.3—p22.1 genomic regions is allowing access to many areas of the for rapid contig construction, a hybridization-based human genome for transcriptional mapping and posi­ screening approach using irradiation hybrids has been tional cloning projects. In stretches of the genome that used. A/u-PCR products generated from hybrid lines containing small overlapping fragments from Xp21— remain less well characterized, and for which the density p22 were hybridized to an X chromosome cosmid li­ of mapped markers is low, YAC contig construction has brary, and cosmids predicted by their hybridization been hampered due to the need to first generate new pattern to map to the region of interest were analyzed probes. Generation of region-specific sequence-tagged by fluorescence in situ hybridization (FISH). Hybridiza­ sites (STSs; Olson et al, 1989) as landmarks detected by tion of the cosmids in pools to gridded YAC libraries polymerase chain reaction (PCR) assays has been the identified 15 YACs, which were verified and tested for main approach for YAC isolation and assembly into chimerism by FISH. Cosmid content analysis of the contigs. However, this has proved an arduous task (sum­ YACs defined two contigs, one with 12 YACs covering marized by Roberts, 1992) and the development of alter­ about 1.5 Mb and one w ith 3 YACs. F ive YACs from the native approaches for more rapid contig mapping in re­ 12-Y AC cluster had been previously recognized by gions in which few markers are as yet available is DNA polymerase alpha (POLA). ZFX identified a single needed. YAC; hence, the physical linkage of ZFX and POLA With the availability of high-density filter grids con­ was demonstrated within the contig. Four YACs had structed by robots from clone libraries, a hybridization- been isolated previously with ZFX and these extend the based screening approach can be used to identify clones contig to 2 Mb. Restriction mapping of several YACs corresponding to complex probes (Lehrach et al, 1990; demonstrates that ZFX and POLA are about 700 kb Ross et ai, 1992). Using region-specific sources of hu­ apart, a distance similar to that reported in the mouse man DNA such as inter-A/u-PCR products (Nelson et betw een Zfx and Pola. The order of these two loci and al, 1991) amplified from irradiation hybrid cell lines, it two additional loci identified by homologous mouse should be possible to isolate large numbers of clones linking clones was found to be conserved between from targeted regions (Monaco et al, 1991; Muscatelli et human and mouse: tel—ZFX—DXCrc57—DXCrcl40— al, 1992; Zucman et al, 1992; Kumlien et al, in press) for POLA—cen. We have shown that YAC contigs can be rapidly constructed from targeted regions without the subsequent contig construction. need for time-consuming YAC end rescue and chromo­ One region of the X chromosome that is as yet poorly somal walking. This approach also generates a series of mapped is Xp21.3-p22.1 (Schlessinger et al, 1993). W e ordered cosmids, which is particularly valuable for have utilized a set of irradiation hybrid cell lines retain­ marker generation in regions in which disease gene lo­ ing fragments in this region to generate new markers calization is hampered by low marker density. © 1994 and to develop a YAC clone resource. The hybrid cell Academic Press, Inc. lines, generated using high-dose irradiation (Benham et al, 1989) and containing multiple short fragments, were single-cell cloned in an attempt to segregate fragments ^ To whom correspondence should be addressed. Telephone: 071 from regions other than Xp21.3-p22.1 (Benham and 269 3510. Fax: 071 269 3068. Rowe, 1992). These subclonal lines contain various

7 5 0888-7543/94 $6.00 Copyright © 1994 by Academic Press, Inc. All rights of reproduction in any form reserved. 7 6 FRANCIS ET AL.

overlapping fragments of this Xp region and have been human lymphocytes incorporated with BrdU during the late S-phase used to develop YAC contigs in a screening approach of the cell cycle. The slides were stained with Hoescht 33258 (1 gg/m] that is summarized schematically in Fig. 1. in Sorensen buffer) before being irradiated with UV light for 6 min and then ethanol dehydrated. We report here the physical linkage of two Xp loci, The in situ hybridization procedure was done according to Pinkel et ZFX and POLA, within a 2-Mb YAC contig. ZFX has a i (1988), with the following adaptations. Incubation in proteinase K been previously localized to the Xp21.3-p22.1 boundary was for a 7-min period at a concentration of 50 gg/m l prior to fixation. by Page et al. (1990), and these studies also placed the The chromosomal DNA was denatured in 70% formamide/2X SSC at DNA polymerase alpha gene (POLA; Wong et al., 1988) 75°C for 5 min and then dehydrated in ice-cold ethanol. Ten micro­ on the proximal side of ZFX. This gene order has been liters of hybridization mixture, consisting of 200 ng of labeled probe DNA, 2 jug of unlabeled human Cot-1 DNA (GIBCO BRL) in 50% more recently confirmed by deletion mapping (Schaefer formamide, 10% dextran sulfate in 2X SSC was denatured at 75°C for et ai, 1993). This same gene order has been shown in 5 min and allowed to reanneal for 30-60 min at 37°C and then applied mouse (Hamvas et al., 1993), and we have compared this onto the slides. Hybridization proceeded in a moist chamber at 37°C new human YAC contig and associated physical map overnight. with the equivalent region in mouse. After hybridization, slides were washed three times for 5 min each in 50% formamide/2X SSC at 42°C, 5 times for 2 min each in 2X SSC at 42°C, and once in 4X SSC, 0.05% Tween at room temperature. MATERIALS AND METHODS Slides were preblocked with 5% nonfat milk powder in 4X SSC for 20 min at room temperature before being immunochemically stained. Irradiation hybrid cell lines. The high-dose (20000 and 50000 rad) The hybridized biotin-labeled probes were detected by incubating the irradiation hybrid lines known to contain human Xp22 fragments and slides for 20 min at room temperature successively in FITC-avidin (5 details of the subcloning procedure used to segregate out non-Xp22 gg/m l), biotinylated anti-avidin (5 jug/ml), and a second layer of fragments have been described previously (Benham et a i, 1989; B en­ FITC-avidin (5 jug/ml). After each antibody incubation, the slides ham and Rowe, 1992). After analysis for marker content, certain sub­ were washed three times for 5 min each in 4X SSC, 0.05% Tween. The clones were selected for A(u-PCR amplification as shown in Fig. 1. slides were washed in PBS twice for 5 min before being counterstained Alu-PCR amplification. The primer used for amplification was the with a 1 gg/m l propidium iodide and 1 gg/m \ DAPI mixture in an 3 -consensus sequence AlIV (Cotter et ai, 1990), and the inter-A(u antifade solution (Vector Labs, CA). A confocal laser-scanning micro­ amplification from these subclones has been described previously scope was used to detect the fluorescence signals. (Benham and Rowe, 1992). Briefly, the PCR was carried out in a reac­ Screening of YAC libraries. Two YAC libraries were used for clone tion buffer containing 10 m M Tris, pH 8.3, 50 mAf KCl, 1.5 m M isolation: the ICRF human YAC library (Larin et ai, 1991) con­ MgClj, 250 fiM each dNTP, and hybrid DNA at 2 fig/ml. An initial dénaturation of 6 min at 94°C was followed by 30 cycles of 93°C for 1 structed from lymphoblastoid cell line GM1416B with a karyotype min, 55°C for 1 min, and 72°C for 4 min. 48XXXX (Human Genetic Cell Repository, Camden, NJ) and a sec­ ond library constructed at the Centre d’Etude du Polymorphisme Hu­ Hybridization of Alu-PCR products to high-density cosmid filters. maine (CEPH) from an EBV-transformed human male lymphoblas­ The pool of A lu-PC R products from each hybrid line was hybridized to toid cell line (Albertsen et a l, 1990). Both libraries were robotically duplicate sets of a flow-sorted X chromosome cosmid library (Nizetic arrayed at a density of 20736 clones per 22 X 22-cm membrane et a i, 1991). The cosmid clones representing approximately four chro­ (Lehrach et ai, 1990). In addition, four YACs containing ZFX, mosome equivalents have been arrayed at high density on two 22 X 900A0501, 900A0401, 900A0301, and 900A0201, were identified from 22-cm filters (Hybond N"^, Amersham), each containing 9216 clones an earlier subset of the ICRF YAC library present on filter replicas (Lehrach et ai, 1990). Filters were prehybridized for 16 h with 100 produced using a 40,000 multipin transfer device (described in Larin jug/ml denatured, sheared human placental DNA (Sigma) and then et ai, 1991). Cosmid miniprep DNA was labeled by random hexanu­ hybridized for 16 h with probe at a concentration of 1 X 10® cpm/ml. Qiagen columns were used to purify Afw-PCR products and ~100 ng cleotide priming (Feinberg and Vogelstein, 1983), and pools were then of DNA was labeled by random hexanucleotide priming (Feinberg and created of up to six clones/pool. Labeled pools were competed with Vogelstein, 1983). Two different methods (giving comparable results) sheared human placental DNA (Sigma) according to Baxendale et al. of competing out the repetitive sequences from the Alu-PCR products (1991) with pYAC4 vector DNA added to a concentration of 25 gg/m l. were used. Different hybridization buffers were used in each case: (a) Hybridization conditions to YAC library filters were according to preannealing of labeled probe with human DNA bound to cellulose Ross et a i (1992) with slight modifications: prehybridization was per­ and hybridization to filters at 42°C in a formamide buffer as previ­ formed for 2 h only and hybridization for 16 h. Exposure to X-ray film ously described (Monaco et a i, 1991) and (b) competition of probe was as that for cosmid library filters. with 1.5 mg/ml sheared human placental DNA (Sigma) in 0.12 M YAC DNA preparation. Positive YACs were grown in selective sodium phosphate, pH 6.8, for 3 h at 65°C. Hybridizations were per­ media and DNA was prepared in agarose blocks as previously de­ formed at 65°C in 1% SDS, 0.5 M sodium phosphate, pH 7.2, 0.001 M scribed (Larin and Lehrach, 1990). YAC miniprep DNA used for EDTA, and 1% BSA (Church and Gilbert, 1984). Filters were washed FISH analysis was prepared by inoculating a single colony into 10 ml in 0.04 M sodium phosphate, pH 7.2, 0.1% SDS once at room tempera­ selective media and pelleting cells after 24 h growth at 30°C, before ture and then twice at 65°C for 30 min. Filters were exposed to Kodak yeast cell wall digestion in 4 mg/ml Novozym. YAC DNA in solution X-Omat film for 3-4 days at —70°C. was then prepared using an SDS/potassium acetate lysis procedure Selection of cosmids predicted to map to Xp21.3-p22.1. After a com­ (Sherman et ai, 1986). parison of hybridization patterns resulting from each A lu-PCR pool Probe content mapping of YACs. Undigested YACs were electro- screening, cosmids that were either positive with 50K19E, 20K2B, phoresed in 1% Sea Kem agarose pulsed-field gels (PFGs) in 0.5X and 20K2L pools or positive with 50K19E and negative with 50K19C TBE, using a CHEF Bio-Rad system. Sizes of YACs were estimated pools were selected (see Fig. 1). by comparison to yeast chromosome markers and hybridization of Cosmid DNA preparation. Selected cosmids were picked and blots with total human DNA. Blots were screened sequentially with grown overnight in LB media containing 30 ^g/ml kanamycin. Cos­ individual cosmids and single-copy probes to establish markers con­ mid DNA was prepared by the alkaline lysis procedure (Sambrook et tained within each YAC. Probes used included ICRF cosmids as de­ a l, 1989). tailed under Results, a cDNA clone from the POLA gene (Wong et ai, Fluorescence in situ hybridization (FISH). One microgram of DNA 1988), and two mouse linking clones, D XCrc57 and D X C rcl40 (Brock­ (cosmid or YAC miniprep) was labeled with biotin-14-dUTP by nick- dorff et a i, 1990). D XCrc57 is a 1.2-kh probe, and D XCrclIO is a translation (BioNick Labelling System, GIBCO BRL). Metaphase 2.5-kb probe; both are cloned into pBluescript and released with chromosome preparations were made by standard procedures from EcoRI and Eagl. The ZFX locus was detected using pDPl068, a 2.1-kb YAC CONTIG ENCOMPASSING ZFX-POLA 77

50K 19E 50K19C 20K2B 20K 2L

m a r k e r

ES sss U DXS43 Region of 0X3274 interest DXS41 ■ Xp21.3 1 ■ -p22.1 0X387 ■

0X37

1 1 .2 3 11.22 1 1.21

\T , 7 / Aki-PCR pcKils generated f Screen gridded X chromosome cosmid library sequentially with pools

Compare hybridisation patterns and select cosmids

IISII mapping T Pool cosmids (Xp21.3-p22 1) and hybridise to gridded YAC libraries T Assemble YAC contigs

FIG . 1. Schematic representation of the human fragment content present in irradiation hybrid subclones as shown by marker analysis (Benham and Rowe, 1992). Fragments in the region of interest are represented by black boxes, and striped boxes represent other fragments retained. Cosmids that were present when hybridized with Alu-PCR fragments from 50K19E and either absent from 50K19C hybridization or present also in 20K2B and 20K2L hybridizations were selected.

E’coRI mouse autosomal fragment derived from the mouse Zfa locus, pools were hybridized sequentially to high-density cos­ which detects Zfx, Zfa, Zfyl, and Zfy2 (Page et al., 1990). mid filter grids, an average of about 50 clones were iden­ PFG analysis of YACs. ICRF YACs 900G1063, 900G0732, tified per cell line. Thirteen cosmids were predicted to 900F0127, and 900G0821 were digested to completion with 30 units of map to the Xp21.3-p22.1 region by assessing the hybrid­ either Not\ or Nru\ (New England Biolabs) in a 200-gl reaction vol­ ume (according to R. M. J. Hamvas, in preparation). YAC blocks were ization patterns produced with the different lines. Three then electrophoresed in a 0.8% Sea Kem PFG in 0.5X TBE using a of the 13 cosmids (ICRFcl04; D0365, B0960, and hexagonal PFG system (LKB) with a pulse time of 25 s for 20 h at 200 C10128) were positive with A/u-PCR pools from lines V. Probes were hybridized sequentially to the same filter lift, and size 50K19E, 20K2B, and 20K2L, which had Xp21.3 estimations of fragments were based on comparison with yeast chro­ markers in common. The remaining 10 cosmids mosomal markers and lambda concatamers. YAC arm probes were derived from pBR322—the left (trp) arm probe is a 2.3-kb PvuW/ (ICRFcl04: C09145, E12186, H096, A0527, A0734, EcoRl fragment, and the right (URA3) arm probe is a 1.4-kb PvuW/ A0717, A0563, G05151, H03131, and E0863) were posi­ Sail fragment. ICRF YACs 900A0501, 900A0301, 900A0201, and tive in the 50K19E hybridization but negative in the 900A0401 were digested with the following enzymes: Notl, S’acll, 50K19C hybridization (Xp21.3-p22.1 markers absent). Nar\, and Nae\. None of these cosmids had been previously identified by any probe screening recorded in the ICRF Reference Li­ RESULTS brary Database (G. Zehetner, unpublished). Ten of 13 cosmids were localized by FISH to Xp21.3-p22.1, 2 Hybridization of Alu-PCR Product Pools to Cosmid mapped to the centromere (G05151 and E0863), and Filters H03131 mapped to Xq22. Figure 2 shows the FISH local­ Alu-PCR product pools were generated from four of ization of cosmids C10128 and B0960. D-segment as­ the subcloned irradiation hybrid cell lines. When the signments have been obtained for the followingcosmids: 78 FRANCIS ET AL.

i % ' # .m

I J ù f

FTG . 2. FISH mapping of two cosmids identified with Alu-PCR products from irradiation hybrids. Cosmid DNA was hybridized to XX human metaphase chromosome spreads. (A) I04C10128; (B) 104B()96(). Both cosmids gave a signal at the Xp21-p22 border.

C09145 (DXS1246), B0960 (DXS1247), E12186 and the third was also recognized by DXCrc57 (see Fig. 3 (DXS1248), C10128 (DXS1249), and H096 (DXS1250). and Table 1).

Creating YAC Contigs PFG Analysis and Mapping Cosmids in the YAC Contig To establish a more precise placement of markers and The 10 cosmids localized to Xp21.3-p22.1 were di­ cosmids across the region, a minimal selection of non­ vided into two pools and used to screen high-density chimeric YACs spanning ZFX and POLA was analyzed YAC library filters: 15 YACs were identified. Analysis by PFGE of DNA digests with rare-cutter restriction for cosmid content divided these YACs into two groups; enzymes. The selected YACs extending from telomere to 12 YACs fell into one contig containing 7 of the starting centromere in the order shown are ICRFy: 90001063, cosmids (Fig. 3), and the remaining 3 YACs (isolated by 90000732, 900F0127, and 90000821. Results showing 3 of the cosmids) were localized elsewhere in Xp22.1 (F. the fragments detected by various probes are summa­ Francis, submitted). Five of the 12 YACs in the first rized in Table 2, and the YAC contig map is presented in contig had been previously recognized by POLA (Wong Fig. 3. As shown in Fig. 4, both ZFX and the most telo- et ai, 1988). This confirmed the position of this contig at merically placed cosmid clone, C09145, detect a 320-kb Xp21.3-p22.1. The 12 YACs were subsequently screened Not\ fragment in 90001063. This cosmid also identifies with other probes from this region. The ZFX locus was the left arm Not\ fragment of this YAC and hence ap­ detected in one of the YACs and this enabled the orienta­ pears to span a Notl site, identifiable in 90000732, tion of the contig; the most telomeric YAC contained which is a YAC from the middle of the contig containing ZFX. Four further YACs containing ZFX had been previ­ neither ZFX or POLA. A second cosmid, B0960, and ously isolated, and these were also included in the probe DXCrc57 also identify the overlap region of these two typing analysis. Table 1 lists the 16 YACs identified in YACs. Both mouse linking clones detected loci within a the ZFX-POLA region, including size estimations, 270-kb Nrul fragment in 90000732, and one DXCrcMO, probe content, and FISH analysis performed for map­ also identified the left arm end of 900F0127, a POLA- ping confirmation and chimerism detection. YAC names containing YAC. POLA detects a 200-kb Nrul fragment beginning with 900 are YACs derived from the ICRF present in 900F0127, which is also detected by cosmid YAC library (Larin et ai, 1991) and those beginning E12186. This clone appears to be situated on the ZFX with 904 are derived from the CEPH YAC library (Al­ side of POLA, as it is still contained within 90000732. bertsen et ai, 1990). Two mouse linking clones, Both E12186 and POLA detect the left arm fragments of DXCrc57 and DXCrcMO (Brockdorff et ai, 1990), 90000821, which is a POLA-containing YAC extending known to map between Zfx and Pola in the mouse centromerically. The remaining four cosmids isolated (Ham vas et ai, 1993), were also used to probe the YAC over this region (A0734, C10128, D0365, and H096) map blots to allow a comparison of the human YAC contig on the centromeric side of POLA and detect the right with the equivalent region of the mouse X chromosome. arm Notl fragment (240 kb) of 90000821. H096 appears Each linking clone detected a specific subset of the hu­ to be most centromerically placed, detecting the right man YACs. DXCrc57 identified 5 YACs from this con­ arm Nrul fragment (40 kb) of this YAC. tig, 3 of which also contained ZFX. DXCrcMO identified Certain anomalies were detected in the ZFX-POLA 3 YACs from the contig: 2 were coidentified by POLA, set of PFO-mapped YACs: 90001063 appears to contain YAC CONTIG ENCOMPASSING ZFX-POLA 79

TEL CEN 8 2 S

Minimal span YACs - PFG m apped

11 (N ru l) iNrul Nrul YACs Identified Nrul I Using Irradiation Hybrid Approach

YACs positioned according to probe content I 904B0225S —•

9 O 4 H 0 p i9 3

I

Chimeric YAC

ZFX - containing YACs

I

N jrl I S » c ll S a c ll N o tl ' N otl L

0 1 Mb F IG . 3. Schematic representation of YACs identified in the ZFX-POLA region. At the top, positions of probes, new cosmids, and linking clones spanning the region are represented. The following cosmids have D-segment assignments; C09145 {DXS1246), B0960 (DXS1247), E12186 (DXS1248), C l0128 (DXS1249), and H096 (1)XS1250). The YACs are divided into three groups; uppermost there are four YACs representing the minimal span of YACs linking ZFX and POLA, which have been PFG mapped witb Notl and Nrul. The bracketed Nrul site close to the ZFX locus is a site that is present in 900G1063 but absent from 900G0732. This may represent a microdeletion in the latter YAC. The horizontal dashed line in the 900F0127 YAC represents a potential region of instability. The center group of eight YACs is mapped according to probe content, and chimeric YACs (as determined by FISH) are as indicated. YACs 9()4B02255, 9Q4F03124, and 904H03193 have the CEPH names 255B02, 124F03, and 193H03, respectively. At the bottom, there are four YACs that contain ZFX and have been mapped with Norl, SacII, NofI, and Nael.

4 Nrul fragments, two of which have very similar sizes of the contig to a total of 2 Mb. The upstream Notl site (80-90 kb). This would position an extra Nrul site that gave rise to this fragment is contained in a potential within this YAC that is absent from 900G0732. CpG island that contains Narl, S'acII, and Notl restric­ 900F0127 (represented in Fig. 3 by solid and broken tion sites. lines) also appears to have a number of Nrul sites that are undetectable in 900G0821. All four YACs appear to DISCUSSION be nonchimeric from FISH analyses. Probe typing using the cluster of three cosmids (A0734, C10128, and D0365) The human components of somatic cell hybrid lines and H096 in all the POLA-containing YACs together can now be easily accessed by amplification of human- with consideration of their sizes suggests that 900F0127 specific sequences between Alu repeats (Nelson et ai, may contain sequences not represented in the other 1991). This facilitates the use of hybrid cell lines as a YACs from this region. resource for isolating cloned DNA and new markers The four ZFX YACs, 900A0401, 900A0201, 900A0301, from a specific desired region (Monaco et ai, 1991; M u­ and 900A0501, were mapped using the enzymes Notl, scatelli et ai, 1992; Zucman et ai, 1992). We have evalu­ Nael, Narl, and SacII. These results are also summa­ ated this approach using high-dose irradiation hybrid rized in Fig. 3. In two of these YACs (A0301 and A0201), cell lines to target a poorly defined area of the X chro­ a 320-kb Notl fragment is detected by ZFX, allowing mosome and have established an ordered set of cosmid alignment of these YACs with 900G1063 and extension and YAC clones from this region within a 2-Mb contig. 8 0 FRANCIS ET AL.

TABLE 1 Probe Typing of YACs Identified in the ZFX-POLA Region, Including Sizes of YACs, and Results of FISH Analysis for Mapping Confirmation and to Detect Chimerism

Probe content

Size Xp21.3-p22.1 A0734, C10128, YAC (kh) (FISH) ZFX C09145 DXCrc57 B0960 DXCrcMO E12186 POLA and D0365 H096

900A0401 510 Not chim. * 900A0501 320 Not chim. * 900A0301 430 Not chim. *** 900A0201 660 N ot chim. **** 900G1063 510 Not chim. **** 900F0347 250 Chim., 7p ** 900G0732 550 Not chim. *** * * 900F06121 900 Chim., 2q ****** 900F0127 440 Not chim. *** 900A06103 440 Not chim. *** 900G0821 520 Not chim. **** 900G0921 460 Not chim. **** 900D0165 >1600 Chim., llq * 904B02255 460 Chim., Xq * * 904F03124 460 Chim., 4q * 904H03193 900 Not chim. * *

Note. Not chim. indicates that YACs contain sequences from Xp21.3-p22.1 alone. Other YACs are chimeric (Chim.) with noncontiguous sequences chromosomally localized. YACs that are named 900 are derived from the ICRF YAC library (Larin et at., 1991) and those named 904 are derived from the CEPH YAC library (Albertsen et ai, 1990). 904B02255, 904F03124, and 904H03193 have the CEPH names 255B02, 124F03, and 193H03, respectively. An asterisk denotes a positive hybridization result.

The use of high-dose irradiation hybrids has both ad­ this problem by using well-characterized subcloned irra­ vantages and disadvantages. The presence of relatively diation hybrid cell lines in a sequential screening ap­ small fragments that do not appear to be subject to ex­ proach. This has enabled isolation of clones from the tensive rearrangement (Benham et ai, 1989; Cox et al, region of interest with a high degree of accuracy. FISH 1989) allows the detailed analysis of particular regions. localization of the selected cosmids proved to be an im­ The presence of multiple fragments within a cell line, portant step in eliminating the few clones that may have however, could confuse the results. We have overcome been positively identified due to a high repetitive ele-

TABLE 2 YAC iVofI and iVrwI Fragments (in kb) Detected by Probes across the ZFX-POLA Region

YAC

900G1063 (510 kh) 900G0732 (550 kh) 900F0127 (440 kh) 900G0821 (520 kh)

N o tl N rul N otl N rul N otl N rul N otl Nrul

Probe Human 320 210 310 270 440 (uncut) 200 280 360 150 130 240 180 100 240 120 40 80/90 (2) 100 60 40 50 30 Left YAC arm 150 210 240 180 440 50 280 120 Right YAC arm 40 80/90 310 100 440 30 240 40 ZFX 320 80/90 C09145 320 210 310 270 150 240 D XCrc57 150 210 310 270 B0960 150 210 310 270 DXCrcMO 150 210 310 270 440 50 E12186 310 100 440 200 280 120 POLA 440 200 280 120 A0734 240 360 C10128 240 360 D0395 240 360 H096 240 40 YAC CONTIG ENCOMPASSING ZFX-POLA 81

L eft arm B 0 9 6 0 PROBES : Hum an Right arm ZFX C 0 9 1 4 5 Nr No Un Nr No Un Nr No Un Nr No Un Nr No Un Nr No Un

SIZE KB)

%

i

FIG. 4. PFG analysis of ICRF YAC 900G1063 with ordering of C09145 and B0960 cosmid clones. Nr, No, and Un represent Nrul, Notl, and undigested DNA, respectively. ZFX and C09145 are present on the same 320-kh Notl fragment, although C09145 extends across the Notl site to also detect the same fragments as B0960 and the left YAC arm probe. Weak fragments detected by the left YAC arm probe are caused by a minor contamination with the right YAC arm probe. The cosmid probes also hybridize to lambda concatamer markers (FMC) present to the right of the uncut DNA. ment content or because of preferentially retained cen­ have a relatively poor signal-to-background ratio and tromeric fragments in the hybrids (Benham et ai, 1989; also a limited lifespan and are therefore unsuited for Cox et ai, 1989). sequential screens with complex probes. We used cosmid The cosmids mapping to Xp21.3-p22.1 establish an filters in an attempt to minimize the level of false posi­ evenly spaced set of markers across a previously under­ tives obtained and to identify contiguous YACs using a mapped region. We have used these new probes as multi­ minimal number of YAC colony screens. This technique ple entry points to identify YACs from a gridded clone is now being complemented by the availability of analo­ array. Probing with pools of up to six cosmid clones en­ gous high-density filter grids of spotted Alu-FCR prod­ abled us to isolate 15 YACs in just two colony filter ucts of YAC libraries (Ross et ai, unpublished) that can screens. It is possible to establish overlaps between be directly screened with A/n-PCR products from irra­ YACs by screening individual PFG-purified YAC in­ diation hybrids. Due to the selective amplification of the serts against cosmid library filters (Baxendale et al., same specific subset of sequences in both probe and tar­ 1991). This has the advantage of identifying a large num­ get, and the high concentration of PCR product spotted ber of cosmids underlying each YAC suitable for gener­ on the filter, this strategy may offer an alternative ap­ ating a high-resolution map across the region of interest. proach to the identification of YAC clones. For our purposes, however, overlaps between contiguous The YAC contig we have described was constructed YACs were much more easily established by screening without the need for end-rescue and walking ap­ YAC blots with the cosmids that had already been iden­ proaches. In our experience, YAC end probes can be dif­ tified. By this means, the YACs were divided into two ficult to rescue, often requiring the application of several groups: a 12-YAC contig spanning 1.5 Mb in the different procedures. Further problems may arise if the Xp21.3-p22.1 region and 3 remaining YACs falling in a probes produced are weakly hybridizing or contain repet­ different region in Xp22.1. The 12-YAC contig showed itive sequences. The approach we describe offers an al­ the physical linkage of two loci, ZFX and POLA, allow­ ternative method for clone isolation, whereas the new ing positioning and orientation of the contig on the X probes identified in the first step not only enable us to chromosome (gene order according to Davies et ai, 1992; isolate overlapping YACs in the second step but are also Schaefer et ai, 1993). Using additional YACs derived available for facilitated contig assembly. from an earlier subset of the ICRF YAC library, the con­ PFG analysis of several YACs within the ZFX-POLA tig was extended to approximately 2 Mb contained contig has enabled ordering of the new probes across the within 16 YACs. These clones therefore provide a useful region. The cosmid clones detect new loci in Xp21.3- resource available for further mapping purposes. p22.1 that may prove useful for the generation of micro­ It is possible to screen YAC colony filters directly with satellite markers. The cross-hybridization of mouse loci A/u-PCR products generated from irradiation hybrid to sequences within the contig has been obtained by cell lines (Monaco et ai, 1991). However, YAC filters probing PFG blots with two mouse linking clones 8 2 FRANCIS ET AL.

(Brockdorff et ai, 1990). This has allowed comparison to Reference Library System and Neil Brockdorff for providing mouse the equivalent region in the mouse genome and identi­ linking clones. We also thank Anna-Maria Frischauf, Gillian Eates, fied regions that are highly likely to contain coding se­ and Roger Cox for helpful discussions and Hugues Roest Crollius for assistance with the Y AC contig figure. This work was supported by quences. We provide data to improve the resolution of grants from the MRC U.K. Human Genome Mapping Project, the the ZFX-POLA area of the human-mouse X chromo­ Wellcome Trust, and the Human Frontiers Science Program. some comparative map. The order Zfx-DXCrc57- DXCrcl40-Pola, which was recently established in the REFERENCES mouse genome (Hamvas et al., 1993), has been reflected Albertsen, H. M., Ahderrahim, H., Cann, H. M., Dausset, J., LePas- in the homologous human region. The physical distance lier, D., and Cohen, D. (1990). Construction and characterisation of separating ZFX and POLA can be estimated to be 700 a yeast artificial chromosome library containing seven haploid hu­ kb, while in the equivalent region in the mouse, these man genome equivalents. Proc. Natl. Acad. Sci. USA 87 : 4256- two loci were estimated to be separated minimally by 600 4260. kb and maximally by 750 kb. The signals obtained by Bates, G. P., Valdes, J., Hummerich, H., Baxendale, S., Le Paslier, hybridization of mouse linking clones to human Y AC D. L., Monaco, A. P., Tagle, D., MacDonald, M. E., Altherr, M., Ross, M., Brownstein, B. H., Bentley, D., Wasmuth, J. J., Gusella, blots places the loci detected by DXCrc57 within 400 kb J. F., Cohen, D., Collins, F., and Lehrach, H. (1992). Characterisa­ centromeric to ZFX, and DXCrcl40 approximately 200 tion of a Y AC contig spanning the Huntington’s Disease gene can­ kb telomeric to POLA. Again, these localizations are sim­ didate region. Nature Genet. 1: 180-187. ilar to those found in the mouse genome. Hence, these Baxendale, S., Bates, G. P., MacDonald, M. E., Gusella, J. F., and loci appear to be contained within an extensive con­ Lehrach, H. (1991). The direct screening of cosmid libraries with served linkage group with a similar probe order in man YAC clones. Nucleic Acids Res. 19: 6651. and mouse. Benham, F., Hart, K., Crolla, J., Bobrow, M., Francavilla, M., and Goodfellow, P. N. (1989). A method for generating hybrids contain­ The PFG analysis of the YACs has allowed an exami­ ing nonselected fragments of human chromosomes. Genomics 4: nation of their fidelity relative to one another, and a 509-517. potential region of instability was detected close to the Benham, F., and Rowe, P. (1992). Use of A lu-PC R to characterize POLA locus. Instability has previously been detected in hybrids containing multiple fragments and to generate new other genomic regions (Bates et ai, 1992) in which unsta­ Xp21.3-p22.2 markers. Genomics 12: 368-376. ble sequences in one Y AC were not necessarily found to Brockdorff, N., Montague, M., Smith, S., and Rastan, S. (1990). Con­ struction and analysis of linking libraries from the mouse X chro­ be unstable in another. Construction of a PFG map from mosome. Genomics 7: 573-578. genomic DNA and analysis of the remaining YACs in Church, G. M., and Gilbert, W. (1984). Genomic sequencing. Proc. the contig may provide further information regarding Natl. Acad. Sci. USA 81: 1991-1995. the stability of this region in Y AC clones. Cotter, F. E., Hampton, G. M., Nasipuri, S., Bodmer, W. F., and The more extensive PFG analysis of four ZFX YACs Young, B. D. (1990). Rapid isolation of human chromosome-speci­ has revealed the presence of one potential CpG island, fic DNA probes from a somatic cell hybrid. Genomics 7: 257-263. and it is unclear whether this island is associated with Cox, D. R., Pritchard, C. A., Uglum, E., Easher, D., Kobori, J., and the ZFX gene or upstream from this locus. Hamvas et at. Myers, R. (1989). Segregation of the Huntington disease region of human chromosome 4 in a somatic cell hybrid. Genomics 4: 399- (1993) recently reported a similar situation in the equiva­ 407. lent mouse region. Conserved GC-rich sequences were Davies, K. E., Mandel, J. L., Monaco, A. P., Nussbaum, R. L., and similarly observed in human genomic DNA close to both Willard, H. F. (1992). Report of the committee on the genetic con­ ZFX and ZFY loci by Verga and Erickson (1989). Addi­ stitution of the X chromosomes. Cytogenet. Cell. Genet. 58: 1-117. tional CpG islands are likely to exist at the region of Feinberg, A. P., and Vogelstein, B. (1983). A technique for radiolabel­ hybridization to the mouse linking clones. Pola is not ling DNA restriction endonuclease fragments to high specific activ­ associated with a CpG island in the mouse, and in our ity. Anal. Biochem. 132: 6-13. studies, POLA was associated with neither a Notl site Hamvas, R. M. J., Larin, Z., Brockdorff, N., Rastan, S., Lehrach, H., Chartier, F. L., and Brown, S. D. M. (1993). YAC clone contigs nor a Nrul site; hence, it is possible that the human gene surrounding the Zfx and Pola loci on the mouse X chromosome. is similarly not associated with an island. Genomics 17: 52-58. PFG analysis has enabled the construction of a 2- Kumlien, J., Labe 11a, T., Zehetnev, G., Vatcheva, R., Nizetic, D., and MbYAC map of this region, which includes seven new Lehrach, H. (1994). Efficient identification and regional position­ ordered cosmid markers, and the identification of sev­ ing of YAC and cosmid clones to human chromosome 21 by radia­ eral potential CpG islands. A starting combination of tion fusion hybrids. Mammal. Genome, in press. Alu-PCR products from irradiation hybrid cell lines and Larin, Z., and Lehrach, H. (1990). Yeast artificial chromosomes: An alternative approach to the molecular analysis of mouse develop­ high-density arrays of cosmid and Y AC clone libraries mental mutations. Genet. Res. 56: 203-208. have facilitated the assembly of the essential resources Larin, Z., Monaco, A. P., and Lehrach, H. (1991). Yeast artificial chro­ required for the identification of transcribed sequences mosome libraries containing large inserts from mouse and human from a region which, up until now, has remained rela­ DN A. Proc. Natl. Acad. Sci. USA 88: 3233-3237. tively poorly mapped. Lehrach, H., Drmanac, R., Hoheisel, J., Larin, Z., Lennon, G., Mon­ aco, A. P., Nizetic, D., Zehetner, G., and Poustka, A. (1990). Hybri­ disation fingerprinting in genome mapping and sequencing. In “Ge­ ACKNOWLEDGMENTS nome Analysis Volume I: Genetic and Physical Mapping” (K. E. Davies and S. M. Tilghman, Eds.), pp. 39-81, Cold Spring Harbor We thank members of the Genome Analysis Laboratory involved in Laboratory Press, Cold Spring Harbor, NY. the filter generation and distribution of clones associated with the Monaco, A. P., Lam, V. M. S., Zehetner, G., Lennon, G. G., Douglas, YAC CONTIG ENCOMPASSING ZFX-POLA 8 3

C., Nizetic, D., Goodfellow, P. N., and Lehrach, H. (1991). Mapping Ross, M., Nizetic, D., Nguyen, C., Knights, C., Vatcheva, R., Burden, irradiation hybrids to cosmid and yeast artificial chromosome li­ N., Douglas, C., Zehetner, G., Ward, D., Baldini, A., and Lehrach, braries by direct hybridisation of Alu-PCR products. Nucleic Acids H. (1992). Selection of a YAC sub-library enriched for chromosome Res. 19: 3315-3318. 21 by hybridisation with a chromosome specific composite probe. Muscatelli, F., Monaco, A. P., Goodfellow, P. N., Hors-Cayla, M. C., Nature Genet. 1: 284-290. Lehrach, H., and Fontes, M. (1992). Isolation of new probes from Sambrook, J., Fritsch, E. F., and Maniatis, T. (1989). “Molecular Xql2-ql3: An example of the screening of reference libraries with Cloning: A Laboratory Manual,” 2nd ed.. Cold Spring Harbor Labo­ Alu-PCR products from radiation hybrids. Cytogenet. Cell Genet. ratory, Cold Spring Harbor, NY. 61: 109-113. Schaefer, L., Ferrero, G. B., Grillo, A., Bassi, M. T., Roth, E. J., Wa- Nelson, D. L., Ballabio, A., Victoria, M. F., Pieretti, M., Bies, R. D., penaar, M. C., van Ommen, G.-J. B., Mohandas, T. K., Rocchi, M., Gibbs, R. A., Maley, J. A., Chinault, A. C., Webster, T. A., and Zoghbi, H. Y., and Ballabio, A. (1993). A high resolution deletion Caskey, C. T. (1991). Alu-primed PCR for regional assignment of map of human chromosome Xp22. Nature Genet. 4: 272-279. 110 yeast artificial chromosome clones from the human X chromo­ Schlessinger, D., Mandel, J-L., Monaco, A. P., Nelson, D. L., and some: Identification of clones associated with a disease locus. Proc. Willard, H. F., Eds. (1993). Report of the Fourth International Natl. Acad. Sci. USA 88: 6157-6161. Workshop on Human X Chromosome Mapping. Cytogenet. Cell Nizetic, D., Zehetner, G., Monaco, A. P., Gellen, L., Young, B. D., and Genet. 64: 147-170. Lehrach, H. (1991). Construction, arraying and high-density screening of large insert cosmid libraries of the human chromo­ Sherman, F., Fink, G. R., and Hicks, J. B. (1986). “Laboratory Course somes X and 21. Proc. Natl. Acad. Sci. USA 88: 3233-3237. Manual for Methods in Yeast Genetics,” pp. 127-128, Cold Spring Harbor Laboratory, New York. Olson, M., Hood, L., Cantor, C., and Botstein, D. (1989). A common language for physical mapping of the human genome. Science 2 4 5 : Verga, V., and Erickson, R. P. (1989). An extended long-range restric­ 1434-1435. tion map of the human sex-determining region on Yp, including ZFY, finds marked homology on Xp and no detectable Y sequences Page, D. C., Disteche, C. M., Simpson, E. M., De La Chapelle, A., Andersson, M., Alitalo, T., Brown, L. G., Green, P., and Akots, G. in an XX male. Am. J. Hum. Genet. 44: 756-765. (1990). Chromosomal localisation of ZFX—A human gene that Wong, S. W., Wahl, A. F., Yuan, P. M., Aral, N., Pearson, B. E., Aral, escapes X inactivation. Genomics 7: 37-46. K.-L, Korn, D., Hunkapiller, M. W., and Wang, T. S.-F. (1988). Pinkel, D., Landegent, J., Collins, C., Fuscoe, J., Segraves, R., Lucas, Human DNA polymerase alpha gene expression is similar to both J., and Gray, J. (1988). Fluorescence in situ hybridisation with hu­ prokaryotic and eukaryotic replicative DNA polymerases. EM BO J. man chromosome-specific libraries: Detection of trisomy 21 and 7: 37-47. translocations of chromosome 4. Proc. Natl. Acad. Sci. USA 85 : Zucman, J., Delattre, O., Desmaze, C., Azambuja, C., Rouleau, G., 9138-9142. De Jong, P., Aurias, A., and Thomas, G. (1992). Rapid isolation of Roberts, L. (1992). Two chromosomes down, 22 to go. Science 2 5 8 : cosmids from defined subregions by differential AZu-PCR hybridiza­ 28-30. tion on chromosome 22-specific library. Genomics 13: 395-401. GENOMICS 21, 229-237 (1994)

A YAC Contig Spanning the Hypophosphatémie Rickets Disease Gene (HYP) Candidate Region

Fiona Francis,’ Peter S. N . Rowe,* Michael J. EcoNS,t Chee Gee SEE,* Frances Benham,* Jeffrey L. H. O'Riordan,* M arc K. DREZNER,t Renata M. J. Hamvas, and Hans Lehrach

G enom e Analysis Laboratory, Imperial Cancer Research Fund, 44, Lincoln's Inn Fields, London WC2A 3PX, United Kingdom; Department of Medicine, Middlesex Hospital, Mortimer Street, London WIN 8AA, United Kingdom; f Department of Medicine and the Sarah W. Stedman Center For Nutritional Studies, Duke University Medical Center, Durham, North Carolina 27710; and ^Department of Genetics and Biometry, Wolfson House, University College London, 4, Stephenson Way, London NWl 2HE, United Kingdom

Received November 15, 1993; revised February 7, 1994

produces the same abnormalities. This mouse model Dominant X-linked hypophosphatémie rickets (HYP) has been extensively studied in an attempt to elucidate is the most common form of familial rickets. Linkage the biochemical defects involved in this disorder. The studies have localized the gene for this disorder to renal transport defect in these animals has been local­ Xp22.1 between the markers DXS365 and DXS274, a ized to the brush-border membrane of the proximal tu­ region estimated to be approximately 3.5 cM. We have bule (Tenenhouse et aL, 1978), and abnormal vitamin constructed a 1.5-Mh YAC contig encompassing this D metabolism has been detected specifically in the region by hybridization screening of high-density YAC proximal convoluted tubule (Nesbitt et aL, 1987). How­ clone filters. Rapid chromosome walking was achieved ever, it is not clear whether the abnormality affects the by direct hybridization of a pool of AZu-PCR products renal tubules per se or whether this is an endocrine derived from a YAC containing DXS365 to the filter grids. Overlaps between YACs in the contig were esti­ disorder. Support for a primary renal defect stems from mated by hybridization of end probes to YAC digest studies by Bell et aL (1988), who reported that primary blots and by analysis of cosmid fingerprints obtained cultures of renal epithelial cells from the H yp mouse by hybridization of YAC inserts to a fiow-sorted chro­ demonstrate a defect in phosphate transport and vita­ mosome X cosmid library. All YACs in the contig have min D metabolism. Parabiosis experiments (Meyer et been verified by fluorescence in situ hybridization. aL, 1989), however, and more recently cross-trans­ Several YACs spanning the HYP gene candidate region plantation of kidneys between normal and H yp mice were selected for further analysis by rare-cutter en­ (Nesbitt et aL, 1992) suggest that the renal defect may zyme digestion and pulsed-field gel electrophoresis. be secondary to the effect of a humoral factor. Further We estimate that the markers flanking the disease re­ support for this may be derived from a variant of the gion, DXS365 and DXS274, are less than 1 Mb apart. disease termed oncogenic hypophosphatémie osteoma­ This clone contig map provides an essential resource lacia (Hewison et aL, 1992), which can be difficult to for the isolation of the HYP gene. O 1994 Academic Press, Inc. distinguish from the inherited form. In this case it is thought that the tumor produces a circulating factor(s), which acts specifically on the proximal renal tubule to INTRODUCTION produce similar abnormalities. X-linked hypophosphatémie (vitamin D-resistant) Since the primary defect underlying HYP remains to rickets (HYP; McKusick, 1992; MIM No. 307800), in­ be conclusively determined, despite the usefulness of herited in a dominant mode, is the most common form the animal model, it has been necessary to adopt a of familial rickets. It is characterized by hypophos­ positional cloning approach to search for the disease phatemia due to decreased renal phosphate reabsorp­ gene. Genetic linkage analyses have localized the dis­ tion and is associated with disordered vitamin D me­ ease gene to distal Xp (Read et aL, 1986; Machler et tabolism and osteomalacic bone disease (Rasmussen aL, 1986), and the locus order Xpter-DXS43-HYP- and Tenenhouse, 1989). Eicher et aL (1976) have de­ DXS41-Xcen was established (Thakker et aL, 1987). scribed an X-linked mutation in the mouse (Hyp) that More recently, the genetic map in this region has been refined considerably with several new markers. ^ To whom correspondence should be addressed: Telephone: 071 DXS257 and DXS365 were established as telomeric 269 3510; Fax: 071 269 3068. flanking markers (Econs et aL, 1992), although initially

2 2 9 0888-7543/94 $6.00 Copyright © 1994 by Academic Press, Inc. All rights of reproduction in any form reserved. 2 3 0 FRANCIS ET AL.

M = Marker Hybond N* membranes for hybridization screening purposes, at a phi XI 74/H ael 11 density of 20,736 clones per 22 X 22-cm membrane (Lehrach ei aL, (bp) M - 4 5 6 7 - M 1990). Probes likely to contain repetitive elements were competed with sheared human placental DNA (Sigma) according to Baxenlale et al. (1991) with pYAC4 vector DNA added to a concentration cf 25 figlml. YAC library filters were prehybridized at 65°C in 0.5 M so­ dium phosphate (pH 7.2), 7% SDS, 1% BSA, 1 mM EDTA, 0.1 mg/ ml yeast tRNA, and if necessary with the addition of 0.5 mg/ml 210 denatured sheared human placental DNA. Filters were hybriized overnight at 65°C in the same buffer, omitting the placental DNA (Church and Gilbert, 1984). Filters were washed in 40 mM sodium phosphate, 0.1% SDS twice at room temperature and then once at 65°C for 20 min. Exposure to X-ray film (Kodak X-Omat AR) was for 2 to 5 days at -70°C, with a single intensifying screen. Probe typing of YACs. YAC DNA was prepared in agarose blocks as previously described (Larin et aL, 1991), and undigested Y\Cs FIG. 1. PCR assay verifying the cosmid and YACs containing were electrophoresed on 1% Sea Kem (EMC) agarose PFGs in C.5x the DXS365 locus. 1, water control; 2, human genomic DNA; 3, cos­ TBE using a CHEF Bio-Rad system. Sizes of YACs were estimated by mid DNA (104D0286); 4, 900C03129; 5, 900E01138; 6, 900H0391; 7, comparison to yeast chromosomal markers (FMC). Hmdlll-digested 83B05. An —210-bp fragment was amplified from human genomic YACs were electrophoresed on 0.8% Sea Kem conventional gels. Blots DNA, cosmid 104D0286, and YACs 900C03129, 900E01138, and were screened sequentially with probes from the area to establish 900H0391. markers contained within each YAC. PCR assay verifying the cosmid and YACs containing the DXS365 it was not possible to determine which of these was locus. A PCR assay was used to verify the cosmid and YACs con­ closest to the HYP locus. Since then, one recombination taining the DXS365 locus (see Fig. 1). PCR primers RX314-A and event has established DXS365 as the closest telomeric RX314-B were prepared according to Browne et al. (1992). PCR reac­ tions were performed in a volume of 50 pi containing 10 ng/pl of each flanking marker (Econs et aL, 1993). On the centro­ primer, 200 pM dNTPs, 50 mM KCl, 10 mM Tris-Cl, pH 8.3, 1.5 meric side of the disease gene locus, DXS274 has been mM MgCl2 , 0.01% gelatine, 2 units Tag polymerase (Advanced Bio­ demonstrated to be the closest flanking marker (Econs technologies Ltd.), and 1/100 of an agarose block containing YAC et al., 1993; Rowe et al., 1993). Econs et al. (1993) have DNA or 50 ng of genomic or cosmid DNA. A “hot-start” PCR reaction was performed using Ampliwax PCR gems (Perkin-Elmer Cetus) estimated the genetic distance between DXS365 and in a Perkin-Elmer Cetus system 9600 thermocycler. The following DXS274 to be approximately 3.5 cM. conditions were used: 2 min at 94°C; 30 cycles of 1 min at 94°C, 2 We have constructed a YAC contig encompassing min at 55°C, 1 min at 72°C; then a final extension of 6 min at 72°C. these markers in the HYP gene critical region. The FISH analysis. YAC miniprep DNA (1 pg) was labeled with bio- development of this clone resource will permit a sys­ tin-14-dUTP by nick-translation (BioNick Labelling System, Gibco tematic characterization of the gene transcripts local­ BRL). Metaphase chromosome preparations were made by standard ized in this area, an essential step in the identification procedures from human lymphocytes incorporated with BrdU during of the gene responsible for this disease. 123456789 10

MATERIALS AND METHODS 1. XYYYY 2. XY Probes. 104F0743 (DXS1688), 104A0717 (DXS1687), 104A0563 3. Y only (DXS1683), and 104H10174 (DXS1686) are cosmids obtained from 4. XXXX the ICRF flow-sorted chromosome X cosmid library (Nizetic et aL, 5. X only 6. Xp22.3-Xqter 1991). Cosmids were grown on L-agar plates containing 30 /xg/ml 7. Xp21.2-Xqter kanamycin, and DNA was prepared by the alkaline lysis procedure 8. Xql 3-Xqter (Sambrook et aL, 1989). DXS365 (Browne et aL, 1992) was screened 9. Xpter-Xq21 against the cosmid library, and in all hybridizations subsequent to 10. X (delXql3-Xq21) this, the cosmid identified, 1G4D0286, was used (see Fig. 1). A frag­ ment derived from DXS274 (Rowe et aL, 1993) and the cosmid this fragment identifies (104A0359) were used in hybridizations to iden­ FIG. 2. Hybridization of Trp end vectorette probe from 83B05 to tify this locus. Probes detecting YAC arm fragments required for somatic cell hybrid mapping panel. This mapping panel was com­ pulsed-field gel electrophoresis (PFGE) analysis of YACs were de­ posed of EcoRI-digested DNA from the following: the human cell line rived from pBR322; a 2.3-kb PvulHEcoRl fragment has homology to OXEN (49,XYYYY, Bishop et aL, 1983); a normal human male (XY); the left (Trp) YAC arm, and a 1.4-kb P vu lV S all fragment has homol­ the hamster-human hybrid 853 (Y only, Burk et aL, 1985), GM1416 ogy to the right (URA3) arm. All probes were labeled by random (48,XXXX) from the Institute for Medical Research, Camden, New hexamer priming (Feinberg and Vogelstein, 1983). Jersey; the mouse-human hybrid MOG (X only, Povey et aL, 1980); YAC clone isolation. Two YAC libraries were used for clone isola­ the mouse-human hybrid UCLA B2 (Xp22.3-Xqter, Curry et aL, tion: the ICRF human YAC library (Larin et aL, 1991) constructed 1984); the human cell line HEMAG (Xp21.2—Xqter, Boyd et aL, from a lymphoblastoid cell line GM1416B with a karyotype 48,XXXX 1988); the mouse-human hybrid FIR5 (Xql3-Xqter, Solomon et aL, (Human Genetic Cell Repository, Camden, NJ), and a second library 1983); the mouse-human hybrid HORL (Xpter-Xq21, Goodfellow et constructed at the Centre d’Etude du Polymorphisme Humaine aL, 1980); and TAGB8, which is a human male cell line containing (CEPH) from an EBV-transformed human male lymphoblastoid cell a deletion of the region Xql3-Xq21.3 (Tabor et aL, 1983). Bands line (Albertsen et aL, 1990). Throughout the text ICRF YACs have detected indicate that this probe detects a locus between Xp21.3 and names beginning with 900; the remaining YACs were derived from Xp22.3. This is a consistent result for the end of a nonchimeric YAC the CEPH YAC library. Both libraries were robotically arrayed on derived from Xp22.1. YAC CONTIG SPANNING HYP DISEASE GENE REGION 2 3 1 the late S-phase of the cell cycle. The slides were stained with comparison with yeast chromosomal markers and lambda concatam- Hoescht 33258 (1 //g/ml in Sorensen’s buffer) before being irradiated ers (EMC). with uv light for 6 min and then ethanol dehydrated. The in situ hybridization procedure was performed according to Pinkel et al. (1988), with the following adaptations. Incubation in RESULTS proteinase K was for a 7-min period at a concentration of 50 fig/ ml prior to fixation. The chromosomal DNA was denatured in 70% YAC Library Screening with Proximal Probes formamide/2x SSC at 75°C for 5 min and then dehydrated in ice- cold ethanol. Ten microliters of hybridization mixture, consisting of A cosmid recognized by DXS274,104A0359, was used 200 ng of labeled probe DNA, 2 fig of unlabeled human Cot-1 DNA as a hybridization probe to screen against high-density (GIBCO, BRL) in 50% formamide, 10% dextran sulfate in 2x SSC, was denatured at 75°C for 5 min, allowed to reanneal for 30-60 min filter grids of YAC libraries. As shown in Fig. 4, six at 37°C, and then applied onto the slides. Hybridization proceeded YACs were identified using this probe: ICRF YAC in a moist chamber at 37°C overnight. After hybridization, slides 900H0439 and CEPH YACs 83B05, 173A01, 80G04, were washed three times for 5 min each in 50% formamide/2x SSC 135C11, and 87D07. Screening with DXS41 against the at 42°C, five times for 2 min each in 2x SSC at 42°C, and once in library filters identified two CEPH YACs, 311F02 and 4 x SSC, 0.05% Tween at room temperature. Slides were preblocked with 5% nonfat milk powder in 4 x SSC for 20 min at room tempera­ 173A01 (also identified by DXS274). ture before being immunochemically stained. The hybridized biotin- labeled probes were detected by incubating the slides for 20 min at PCR Assay for Clones Identified at the DXS365 Locus room temperature successively in FITC-avidin (5 figlml), biotinyl- ated anti-avidin (5 fig/m\), and a second layer of FITC-avidin (5 fig! DXS365 (Browne et aL, 1992) contains repetitive ele­ ml). After each antibody incubation, the slides were washed three ments that make it unsuitable for direct hybridization times for 5 min each in 4 x SSC, 0.05% Tween. The slides were washed in PBS twice for 5 min before being counterstained with a to YAC library filters and Southern blots. This probe 1 figJwX propidium iodide and 1 figjnxl DAPI mixture in an antifade was therefore used to screen the ICRF flow-sorted solution (Vector Labs, CA). A confocal laser scanning microscope was chromosome X cosmid library (Nizetic et al., 1991), and used to detect the fluorescence signals. one cosmid, 104D0286, appeared positive against a YAC w alking strategies. End probes were generated from several high background. Three ICRF YACs, 900E01138, of the proximal YACs using the vectorette method (Riley et al., 1990). 900H0391, and 900C03129, were identified with this These ends were mapped back to Xp22.1 by hybridization using a panel of somatic cell hybrid lines, described in Fig. 2. Vectorette end cosmid. As shown in Fig, 1, a PCR assay was used to probes that were successfully mapped to Xp22 were prepared from verify that the cosmid and YACs contain the DXS365 PsH-digested 900H0439 (Trp end, 800 bp), Rsal-digested 83B05 (Trp locus. An ~ 2 10-bp band was amplified from human end, 620 bp), and BamHI-digested 83B05 (Ura end, 870 bp). All genomic DNA, cosmid DNA, and the three YACs men­ remaining YACs were typed by hybridization with the isolated end tioned above, A fourth YAC, 83B05, is negative in this probes, and the Trp end of 83B05 was selected for rescreening against YAC library filters. On the distal side of the disease gene candidate assay and hence does not contain the DXS365 locus, region, a pool of AZw-PCR products derived from a nonchimeric YAC was used to rescreen YAC library filters. Colony AZu-PCR was per­ Link-Up of Proximal and Distal YACs formed according to Coffey et al. (1992), and the pool of AZu-PCR products was hybridized directly to YAC library filters under compe­ Probe content mapping established that 83B05 (660 tition conditions as described above. kb) and 173A01 (370 kb) contain DXS274 and DXS41 YAC identification using irradiation hybrid DNA. DNA from a (Fig, 4), Genetically, DXS274 has been mapped closer panel of irradiation hybrids containing Xp22 fragments (Benham et to HYP than DXS41 (Rowe et aL, 1993), and therefore a l, 1989) was also used to isolate YACs from this region. Briefly, AZu-PCR products from hybrid DNA (Benham and Rowe, 1992) were an attempt was made to walk from DXS274 toward used to screen cosmid library filters, and the 10 cosmids identified DXS365, the distal flanking marker. Both end probes from Xp21-p22 were hybridized in pools to YAC library filters (Fran­ generated from 83B05 mapped to Xp22, These results cis et al., 1994). suggest that this YAC is not chimeric, and this has Cosmid Fingerprinting of YACs. Cosmids underlying the YACs been verified by FISH, The Trp end probe identified were identified by screening PFG-purified YAC inserts against gridded one DXS274 YAC, 80G04, that does not contain DXS41, cosmid library filters (Baxendale et al., 1991), and a cosmid fingerprint for each YAC was established (Fig. 3). Several cosmids were selected and this end probe was therefore selected to rescreen to verify the YAC overlaps by hybridization to YAC digest blots. YAC library filters. Initially, a cosmid, 104H10174, was PFG analysis. A selection of YACs from the contig encompassing isolated, and this was used to identify six new YACs DXS365, DXS274, and DXS41 was further analyzed by rare-cutter (ICRF 900A0472, 900D0959, 900B1239, 900B0218, enzyme digestions. 900E01138, 900A0472, and 83B05 were digested 900A01126, and CEPH 27C06), These results are sum­ to completion with 30 units of N otl or N rul, and partial digestions marized schematically in the YAC contig map in Fig, 4, were performed using the enzymes SssHII, E agl, and M lul. Ten units of enzyme was used in each case, and digestion was allowed 104H10174 also reidentified two DXS274 YACs, 87D07 to continue for 10 min (according to Ham vas et aL, 1993). All enzym es and 13511C, on the YAC colony filters, 87D07 is, how­ were obtained from New England Biolabs, and digestions were per­ ever, not identified by the end probe on Southern blots formed according to the manufacturer’s instructions. Samples were containing YAC DNA, suggesting that it may have de­ electrophoresed on 0.8% normal agarose gels using an LKB system leted some sequences during growth. with a hex electrode for 20 h at 200 V with a 40-s pulse time. Probes (including total human DNA to detect all fragments and YAC arm On the distal side of the disease gene candidate re­ probes to detect end fragments) were hybridized sequentially to the gion, YAC colony AZw-PCR (Coffey et aL, 1992) was same filter lift, and size estimations of fragments were based on used to amplify sequences from 900E01138, and the 2 3 2 FRANCIS ET AL.

f* ?*-53 ■ -• ■ =■ # ' Vk 4.'" /f ), • \

f f if'&BsüE.

FIG. 3. Autoradiographs of a cosmid library filter (containing 20,000 clones) hybridized with two different YAC inserts. On the left, hybridization with YAC 87D07 and on the right, 27C06. Several cosmids identified by both YAC inserts are indicated by arrows.

pool of products was hybridized directly to YAC library PFG Analysis of Minimal Span YACs filters. Five YACs were identified using this walking Complete N o tl and N ru l digests and partial diges­ approach, and of these, three had already been identi­ tions with E agl, SssHII, and M lu l were performed on fied by the 83B05 Trp end probe (900A0472, 900B1239, YACs 83B05, 900A0472, and 900E01138. Table 2 lists and 900D0959). As shown in Fig. 4, these three YACs the N o tl and N ru l fragment sizes detected in the com­ therefore span the gap between the distal and proximal plete digest PFG analysis. Initially, we assumed that YACs. The remaining two YACs identified were DXS274 and DXS41 recognized the same N ru l frag­ 900G1257 and 900C03129 (which contains DXS365). ment of approximately 200 kb, placing these markers Cosmid fingerprints were generated for several of the in close proximity. Partial mapping data and a consid­ YACs by hybridization of YAC inserts to cosmid library eration of the sum of total N ru l fragment sizes sug­ filters (Fig. 3). Several cosmids, including 104F0347 gest that these two markers in fact identify two N ru l and 104H10174, were selected for use as probes to ver­ fragments of similar size, with DXS274 detecting an ify the overlaps on YAC digest blots. internal fragment. Figure 5a shows NoH-digested 900E01138. This YAC was isolated with the DXS365 Coidentification of YACs Using Irradiation Hybrid cosmid, 104D0286, which recognizes an internal frag­ DNA ment of approximately 250 kb. Two sets of YACs were identified using irradiation Figure 5b shows results of probe hybridizations to hybrid DNA; 3 mapped to the HYP region 900G1257, partially digested YAC 83B05. DXS41 appears to be 900A0472, and 900D0959, and the remaining 12 situated close to the right arm end of this YAC, while mapped to a more proximal region of the X chromosome DXS274 detects some left arm end fragments. The (Francis et aL, 1994). The 3 YACs from the HYP region completed YAC map presented in Fig. 4 shows that were identified by cosmids 104A0717 and 104A0563, DXS274 can be placed minimally 180 kb and maxi­ although their exact positions in Xp22 were originally mally 310 kb from the left arm end of this YAC, as unknown. The YAC walking strategies described determined by a BssHII site and an E a g l site. There­ above coidentified these YACs and hence localized fore, the physical distance separating DXS41 and them between DXS365 and DXS274. A number of DXS274 is between 150 and 400 kb. 104H10174 spans other YACs from this contig have subsequently identi­ a BssHII site, and hence, two fragments are detected fied 104A0717 and 104A0563 from the cosmid library by this probe in 83B05 and 900A0472. This confirms filters. the alignment of these two YACs, and the extent of Table 1 summarizes data obtained for the YAC clones overlap is estimated to be ~150 kb. DXS274 appears isolated in the HYP region. All YACs have been to be localized within a 160-kb stretch proximal to this mapped to Xp by FISH, and those YACs that appear area. to be chimeric have been identified. Sizes of YACs have There are a large number of E a g l and BssHII sites been estimated, and the probe content for each YAC is in the region of overlap between 900A0472 and described. 900E01138, and difficulties were encountered in de- YAC CONTIG SPANNING HYP DISEASE GENE REGION 2 3 3

HYP

TEL CEN

E B

. M i : YACs m apped according to PFG analysis

900B 1239

900G 1257

9 0 0 0 0 9 5 9

9 0 0 A 0 1 126 YACs positioned according to probe content only

I35C11 I

87D07

80G04

1 ------1------1------1------1------1------1 - " — I------1------1------1------1------1------1------1------1------1 I 0 1 2 ' 3 Mb

= Chimeric YAC = Deleted YAC B, M, N. Nr, and E = BssHl 1, M/ul, Not], Nru^, and Eag) sites respectively

FIG. 4. Schematic representation of YACs identified in the HYP gene candidate region. At the top of the figure, approximate positions of probes and cosmids are represented in a telomeric to centromeric order. Cosmids 104F0743, 104A0717, 104A0563, and 104H10174 have the following D-segment assignments: DXS1688, DXS1687, DXS1683, and DXS1686, respectively. The three YACs represented at the top of the figure have been mapped more extensively than the remainder. As described under Results, DXS274 detects an internal Nrul fragment in YAC 83B05, and there are two possible positions for the internal Nrul site, as shown by asterisks. At least six E agl fragments could be detected in 900A0472 by hybridization with both 104H10174 and the right-arm probe. 104H10174 does not extend as far as the first E agl site from the right-arm side. Compared to the left-arm end (in the overlap region with 900E01138), where a cluster of enzyme sites has been detected, the middle portion of the YAC has few sites, and 104A0563, a cosmid in this area, detects relatively large complete digest fragments {M lul, 280 kb; RssHII, 320 kb; and E agl, 200 kh). Both cosmids 104F0743 and 104A0717 are present in the overlap region between 900A0472 and 900E01138. Consideration of the M lul sites places 104F0743 closer to the distal end of 900A0472, with 104A0717 more proximally situated. Common BssHII fragments in both 900E01138 and 900A0472 were detected with 104F0743 (—100 kb) and with 104A0717 (—40 kh). In 900E01138, 104D0286, containing the DXS365 locus, strongly detects an 80-kb BssHII fragment and a weaker partial digest fragment of —160 kb, and neither can be correlated with a situation at either end of the YAC. The BssHII sites at the ends leave an — 250-kb central portion containing an unknown number of sites. It is likely that the 80-kb BssHII fragment is situated internally within this region, depicted by bracketed BssHII sites. The remaining 14 YACs depicted in this figure have been positioned according to only their probe content and have not been digested with rare-cutter enzymes.

termining exact sizes of fragments due to their limited end of 900A0472 appears to reside in a potential CpG resolution in the bottom third of the PFG. However, an island, as BssHII and E agl fragments of a similar size overall common order of enzyme sites could be distin­ are also detected. guished in this region. 104A0717 potentially detects a The SssHII digestions were almost complete in this microdeletion close to the right arm end of 900E01138, analysis, and consequently it was not possible to deter­ since its hybridization to Hmdlll-digested DNA shows mine unequivocally the full number of sites in each missing fragments, compared to other YACs. This cos­ YAC by consideration of the partial digest fragments. mid also contains a BssHII site. As shown in Fig. 4, For this reason it has not been possible to define an the common N o tl site situated ~150 kb from the right exact position within 900E01138 for the internal arm end of 900E01138 and ~90 kb from the left arm SssHII fragment detected by the DXS365 cosmid. 2 3 4 FRANCIS ET AL.

TABLE 1 YACs Identified in the HYP Region

Probe content Size FISH YAC (kb) result DXS41 DXS274 104H10174 104A0563 104A0717 104F0743 DX3365

83B05“ 660 Not chim. *** 900A0472“ 650 Not chim. **** 900E01138“ 560 Not chim. *** 311F02 370 Chim. * 173A01 370 Not chim. ** 900H0439 250 Chim. (Ct.) * 80G04 420 Chim. (Ct.) ** 87D07 440 Not chim. ** 135C11 450 Not chim. ** 27C06 400 Not chim. ** 900B0218 660 Chim. ** 900A01126 >1000 Chim. **** 900D0959 480 Not chim. **** 900B1239 660 Not chim. **** 900G1257 290 Chim. ** 900C03129 570 Not chim. * 900H0391 680 Chim. *

Note. Not chim., not chimeric; Chim., chimeric; and Ct., cotransformant. The probe content of each YAC is as indicated; asterisks denote a positive hybridization. “ Minimal span YACs.

104D0286. However, this probe also detects internal scripts present in the area, with the aim of identifying Notl, Mlul, and E a g l fragments, suggesting a central a gene that contains a mutation in individuals who location for the DXS365 locus. have the disease. After the correct disease gene has been cloned, a biochemical analysis then ensues to un­ DISCUSSION derstand the encoded protein and its function. The development of a clone contig map encompassing Positional cloning is a powerful strategy for identi­ the critical region bridges the gap between genetic fying disease genes with unknown biological function studies and disease gene identification. Generally, YAC (Collins, 1992). Application of this approach relies upon clones are isolated for this purpose, since they contain initially determining the chromosomal localization of large inserts, simplifying and accelerating the coverage the responsible gene and narrowing down this region of extensive stretches of the genome. Ultimately, it may to the smallest possible interval by genetic linkage then be preferable to use the YAC to identify cosmid, studies. This is followed by a examination of the tran­ PI, or lambda clones. These clones provide an im-

TABLE 2 N o tl and N ru l Fragments (in kb) Detected in the Minimal Span YACs by Probes Across the HYP Region

YAC: 83B05 (660 kb) 900A0472 (650 kb) 900E01138 (560 kb)

Probe Enzyme: N otl N ru l N otl N rul N otl N rul

Human 660 (uncut) 210 560 650 (uncut) 250 400 200 90 150 160 160 130 90 20 Left arm 660 160 90 650 130-1 5 0 160 Right arm 660 210 560 650 150 400 DXS41 660 210 DXS274 660 200 H 10174 660 160 560 650 F0743 560 650 150 400 A0717 560 650 150 400 DXS365 250 400

Note. 900A0472 and 83B05 are uncut with N ru l and N otl, respectively. YAC CONTIG SPANNING HYP DISEASE GENE REGION 2 3 5

PROBES

Right YAC DXS41 DXS274 Left YAC HI 01 74 Size arm arm Probes (kb) E B M E B M E BMEBM EBM

Human Left YAC Right D0286 F0743 650 arm arm N un N un N un N un N un Size 500 (kb) 400 m 550 300

400 200 m 250 100

150

50

FIG. 5. PFG analysis of two YACs in the HYP contig. (a) Nofl-digested 900E01138. N, Nofl-digested DNA; un, uncut DNA. Left-arm and right-arm YAC probes are derived from pBR322, as described under Materials and Methods. Two fragments of similar size (130-150 kb) are detected by each YAC arm probe, and a larger internal fragment (—250 kb) is detected by the DXS365 cosmid, 104D0286, with a second internal —20 kb fragment detected in the hybridization with total human DNA. It is likely that the left-arm end probe detects two fragments, one of —130 kb and a second partial digest fragment of —150 kb. The cosmid 104F0743 is localized in the overlap region between 900E01138 and 900A0472, and partial digest data indicate that it is situated close to the right-arm end. (t) Partial digest analysis of 83B05. E, Eagl; B, BssHII; M, M lul. DXS41 detects internal M lul and E agl fragments that are close to the right-arm end of this YAC. DXS41 and DXS274 both detect a 480-kb BssHII fragment. DXS274 detects a 360-kb M lul fragment and a 300-kb E agl fragment, detected also by the left-arm YAC pBR322 probe, and the end cosmid 104H10174. This cosmid is deleted, and it only faintly detects the left-arm end BssHII fragment ( — 10 kb); however, it appears to span a BssHII site, as a second fragment is detected. Two BssHII fragments are similarly detected in YAC 900A0472 by this probe. portant resource of easily manipulable genomic DNA, was achieved by probe typing of the YACs and cosmid suitable for generating markers to refine genetic maps. fingerprinting (Baxendale et aL, 1991; Russo et aL, In addition, strategies for rapidly identifying tran­ 1993). scripts directly from cosmid and YAC clones, such as Three YACs (900E01138, 900A0472, and 83B05), exon-trapping (Buckler et aL, 1991) and cDNA selection representing a minimal span across the HYP region, techniques (Lovett et aL, 1991; Korn et aL, 1992), have were selected for more detailed analysis involving di­ recently been developed. gestion with rare-cutter enzymes and PFGE. Initially, The underlying pathophysiological mechanisms in­ complete digests with N o tl and N ru l were performed; volved in hypophosphatémie rickets remain to be eluci­ however, a lack of well-spaced enzyme sites made it dated, despite extensive biochemical studies of the H yp difficult to estimate the extent of YAC overlaps. Never­ mouse model. The lack of information regarding the theless, these data made it possible to orient the YACs protein products and their functions emphasizes the relative to each other and to construct a crude map. need for the application of positional cloning strategies. Partial digestions using the enzymes Mlul, Eagl, and For this reason, a YAC contig has been constructed in BssHII were performed in an attempt to refine this Xp22.1 encompassing the X-linked dominant hypo­ map. A large number of E agl and RssHII sites were phosphatémie rickets gene critical region and linking identified in the distal half of the contig, making it the markers DXS365, DXS274, and DXS41. Several difficult to determine exact fragment sizes from the different hybridization approaches were used to iden­ partial digest data. Verification of the number and posi­ tify these YAC clones, including screening with all the tion of these sites can be obtained by digesting the available single-copy probes from the region, generat­ remaining YACs from the contig and electrophoresing ing YAC end prohes, and using A/m-PCR pools derived on several gels with different resolutions. This would from nonchimeric YACs to screen directly against the enable a more accurate determination of potential CpG YAC library filters. This latter approach proved to be islands in this region. a rapid method of chromosome walking, and at the A high clone density (17 YACs over —1.5 Mb) makes same time an A/m-PCR fingerprint of the YAC was it likely that the entire disease gene critical region is generated. A similar approach using irradiation hybrid represented in this contig, unless a particularly unsta­ DNA as the template for A/m-PCR colocalized several ble part of the genome exists in this area. One region YACs from this contig. Verification of YAC overlaps of potential instability was detected in a cosmid 2 3 6 FRANCIS ET AL.

(104H10174) identified with the left arm end of 83B05. Baxendale, S., Bates, G. P., MacDonald, M. E., Gusella, J. F., and The size of this cosmid was found to be only ~20 kb, Lehrach, H. (1991). The direct screening of cosmid libraries with YAC clones. Nucleic Acids Res. 19: 6651. suggesting that some sequences have been deleted, and Bell, C. L., Tenenhouse, H. S., and Scriver, C. R. (1988). Primary this was evident in the PFG analysis. Instability in cultures of renal epithelial cells from X-linked hypophosphatémie YACs has been widely observed (for example. Bates et (.Hyp) mice express defects in phosphate transport and vitamin D aL, 1992; Palmieri et aL, 1993). In this study, a small metabolism. Am. J. Hum. Genet. 43: 293—303. deletion was identified in 87D07 in the region of the Benham, F., and Rowe, P. (1992). Use of Æm-PCR to characterize deleted cosmid. A second potentied microdeletion was hybrids containing multiple fragments and to generate new detected in YAC 900E01138 by hybridization of cosmid Xp21.3-p22.2 markers. Genomics 12: 368-376. 104A0717 to Hmdlll-digested YACs. Several other Benham, F., Hart, K., Crolla, J., Bobrow, M., Francavilla, M., and Goodfellow, P. N. (1989). A method for generating hybrids con­ YACs from the contig, however, appear to span both of taining nonselected fragments of human chromosomes. Genomics the deleted regions detected. Therefore, it appears that 4: 509-517. problems of this type may be overcome by obtaining Bishop, C. E., Guellaen, G., Geldwerth, D., Voss, R., Fellous, M.,and YACs from more than one library and analyzing as W eissenbach, J. (1983). Single-copy DNA sequences specific for the many YACs as possible from the region of instability. human Y chromosome. N ature 303: 831-832. The PFG analysis allowed an estimate of the physical Boyd, Y., Cockburn, D. C., Holt, S., Munro, E., Van Ommen, G. J., distances between DXS41, DXS274, and DXS365. Gillard, B., Affara, N., Ferguson-Smith, M., and Craig, I. (1988). Mapping of 12 translocation breakpoints in the Xp21 region with DXS41 lies between 150 and 400 kb proximal to respect to the locus for Duchenne muscular dystrophy. Cytogmet. DXS274, and DXS365 lies between 800 and 950 kb Cell Genet. 48: 28-34. distal to this locus. The development of a large number Browne, D., Barker, D., and Litt, M. (1992). Dinucleotide repeat of microsatellite markers covering the human genome polymorphisms at the DXS365, DXS443, and DXS451 loci. Hum. (Weissenbach et aL, 1992) has recently enabled a fur­ Mol. Genet. 1: 213. ther refinement of the HYP disease gene candidate re­ Buckler, A. J., Chang, D. D., Graw, S. L., Brook, D., Haber, D. A., gion by genetic linkage studies. A new marker, AF- Sharp, P. A., and Housman, D. E. (1991). Exon amplification: A strategy to isolate mammalian genes based on RNA splicing. Proc. M163yH2 (DXS1052), has been shown to be the closest Natl. Acad. Sci. USA 88: 4005—4009. marker to the HYP locus on the centromeric side (M. Burk, R. D., Ma, P., and Smith, K. D. (1985). Characterisation and Econs et aL, in preparation). This information narrows evolution of a single-copy sequence from the human Y chromosome. the interval to be examined in the search for the HYP Mol. Cell. Biol. 5: 5 7 6 -5 8 1 . disease gene. Church, G. M., and Gilbert, W. (1984). Genomic sequencing. Proc. The YAC clone contig described with its associated Natl. Acad. Sci. USA 81: 1991—1995. PFG map provides an important starting resource in Coffey, A. J., Roberts, R. G., Green, E. D., Cole, C. G., Butler, R., the search for the mutation causing hypophosphatémie Anand, R., Giannelli, F., and Bentley, D. R. (1992). Construction rickets and also for identifying other gene transcripts of a 2.6-Mb contig in yeast artificial chromosomes spanning the human dystrophin gene using an STS-based approach. Genomics in this area. 12: 474-484. Collins, F. S. (1992). Positional cloning; Let’s not call it reverse any­ ACKNOWLEDGMENTS more. Nature Genet. 1: 3 -6 . Curry, C. R., Magenis, R. E., Brown, M., Lanman, J. T., Tsai, J., We thank members of the Genome Analysis Laboratory involved O’Lague, P., (Goodfellow, P., Mohandas, T., Bergner, E. A., and in the filter generation and distribution of clones associated with the Shapiro, L. J. (1984). Inherited chondrodysplasia punctata due to Reference Library system, and Mark Ross for provision of a somatic a deletion of the terminal short arm of an X chromosome. N. Engl. cell hybrid mapping panel. We also thank Yumiko Ishikawa-Brush J. M ed. 311: 1010-1015. for fluorescence in situ hybridization analysis of some of the YACs Econs, M. J., Barker, D. F., Speer, M. C., Pericak-Vance, M. A., Fain, and Roger Cox and Anthony P. Monaco for helpful discussions. This P. R., and Drezner, M. K. (1992). Multilocus mapping of the X- work was supported by grants from the MRC UK Human Genome linked hypophosphatémie rickets gene. J. Clin. Endocrinol. Metab. Mapping Project, the Wellcome Trust, and the Human Frontiers 75: 201-206. Science Program and by NIH Grants MO1-RR30, AR27032, and Econs, M. J., Fain, P. R., Norman, M., Speer, M. C., Pericak-Vance, AR42228. M. A., Becker, P. A., Barker, D. F., Taylor, A., and Drezner, M. K. (1993). Flanking markers define the X-linked hypophosphatémie REFERENCES rickets gene locus. J. Bone Min. Res. 8: 1149-1152. Eicher, E. M., Southard, J. L., Scriver, C. R., and Glorieux, F. H. Albertsen, A., H. M., Ahderrahim, H., Cann, H. M., Dausset, J., Le (1976). Hypophosphatemia: Mouse model for human familial hypo­ Pasher, D., and Cohen, D. (1990). Construction and characteriza­ phosphatémie (vitamin D-resistant) rickets. Proc. Natl. Acad. Sci. tion of a yeast artificial chromosome library containing seven hap­ USA 73: 4667-4671. loid human genome equivalents. Proc. Natl. Acad. Sci. USA 87: Feinberg, A. P., and Vogelstein, B. (1983). A technique for radiolabel­ 4256-4260. ing DNA restriction endonuclease fragments to high specific activ­ Bates, G. P., Valdes, J., Hummerich, H., Baxendale, S., Le Pasher, ity. Anal. Biochem. 132: 6-13. D. L., Monaco, A. P., Tagle, D., MacDonald, M. E., Altherr, M., Francis, F., Benham, F., See, C. G., Fox, M., Ishikawa-Brush, Y., Ross, M., Brownstein, B. H., Bentley, D., Wasmuth, J. J., Gusella, Monaco, A. P., Weiss, B., Rappold, G., Hamvas, R. M. J., and Leh­ J. F., Cohen, D., Collins, F., and Lehrach, H. (1992). Characteriza­ rach, H. (1994). Identification of YAC and cosmid clones encom­ tion of a YAC contig spanning the Huntington's disease gene candi­ passing the ZFX-POLA region using irradiation hybrid cell lines. date region. Nature Genet. 1: 180-187. Genomics 20: 75-83. YAC CONTIG SPANNING HYP DISEASE GENE REGION 2 3 7

Goodfellow, P. N., Banting, G., Levy, R., Povey, S., and McMicheal, Pinkel, D., Landegent, J., Collins, C., Fuscoe, J., Segraves, R., Lucas, A. (1980). A human X-linked antigen defined by a monoclonal anti­ J., and Gray, J. (1988). Fluorescence in situ hybridisation with body. Somatic Cell Genet. 6; 7 7 7 -7 8 3 . human chromosome-specific libraries: Detection of trisomy 21 and Hamvas, R. M. J., Larin, Z., Brockdorff, N., Rastan, S., Lebracb, H., translocations of chromosome 4. Proc. Natl. Acad. Sci. USA 85: Chartier, F. L., and Brown, S. D. M. (1993). YAC clone contigs 9138-9142. surrounding the Zfx and Pola loci on the mouse X chromosome. Povey, S., Jeremiah, S. J., Barker, R. F., Hopkinson, D. A., Robson, Genomics 17: 52-58. E. B., and Cook, P. J. L. (1980). Assignment of the human locus Hewison, M., Karmali, R., and O’Riordan, J. L. H. (1992). Tumour determining phosphoglycolate phosphatase (PGP) to chromosome induced osteomalacia. Clin. Endocrinol. (Oxford) 37: 382—384. 16. Ann. Hum. Genet. 43: 241-248. Korn, B., Sedlacek, Z., Manca, A., Kioschis, P., Konecki, D., Lehrach, Rasmussen, H., and Tenenhouse, H. S. (1989). In “The Metabolic H., and Poustka, A. (1992). A strategy for the selection of tran­ B asis of Inherited D isease” (C. R. Scriver, A. L. Beaudet, W. S. scribed sequences in the Xq28 region. Hum. Mol. Genet. 1: 2 3 5 - Sly, and D. Valle, Eds.), Vol. II, 6th ed., McGraw-Hill, New York. 242. Read, A. P., Thakker, R. V., Davies, K. E., Mountford, R. C., Brenton, Larin, Z., Monaco, A. P., and Lehrach, H. (1991). Yeast artificial D. P., Davies, M., Glorieux, F., Harris, R., Hendy, G. N., King, A., chromosome libraries containing large inserts from mouse and hu­ McGlade, S., Peacock, C. J., Smith, R., and O’Riordan, J. L. H. man DNA. Proc. Natl. Acad. Sci. USA 88: 3233-3237. (1986). Mapping of human X-hnked hypophosphatémie rickets by Lehrach, H., Drmanac, R., Hoheisel, J., Larin, Z., Lennon, G., Mo­ multilocus linkage analysis. Hum. Genet. 73: 267-270. naco, A. P., Nizetic, D., Zehetner, G., and Poustka, A. (1990). In Riley, J., Butler, R., Ogilvie, D., Finniear, R., Jenner, D., Powell, S., “Genome Analysis,” Vol. I, “CJenetic and Physical Mapping” (K. E. Anand, R., Smith, J. C., and Markham, A. F. (1990). A novel, rapid Davies and S. M. Tilghman, Eds.), pp. 39-81, Cold Spring Harbor method for the isolation of terminal sequences from yeast artificial Laboratory Press, Cold Spring Harbor, NY. chromosome (YAC) clones. Nucleic Acids Res. 18: 2887-2890. Lovett, M., Kere, J., and Hinton, L. M. (1991). Direct selection: A Rowe, P. S. N., Goulding, J., Read, A. P., Mountford, R. C., Hanauer, method for the isolation of cDNAs encoded by large genomic re­ A., Gudet, C., Whyte, M. P., Meier-Ewert, S., Lehrach, H., Davies, gions. Proc. Natl. Acad. Sci. USA 88: 9628-9632. K. E., and O’Riordan, J. L. H. (1993). New markers for linkage Machler, M., Frey, D., Gal, A., Orth, U., Wienker, T. F., Fanconi, A., analysis of X-linked hypophosphatémie rickets. Hum. Genet. 91: and Schmid, W. (1986). X-linked dominant hypophosphatemia is 5 7 1 -5 7 5 . closely linked to DNA markers DXS41 and DXS43 at Xp22. Hum. Russo, J., Sunjevaric, I., Cayanis, E., Rothstein, R., Efstratiadis, A., Genet. 73: 271-275. and Fischer, S. (1993). Cosmid fingerprints for YAC contig identi­ McKusick, V. A. (1992). “Mendelian Inheritance in Man,” Vol. 2 ,10th fication and STS development. Cytogenet. Cell Genet. 62: 105. ed.. The Johns Hopkins Univ. Press, Baltimore/London. Sambrook, J., Fritsch, E. F., and Maniatis, T. (1989). “Molecular Meyer, R. A., Jr., Meyer, M. H., and Gray, R. W. (1989). Parabiosis Cloning: A Laboratory Manual,” 2nd ed.. Cold Spring Harbor Labo­ suggests a humoral factor is involved in X-hnked hypophos­ ratory Press, Cold Spring Harbor, NY. phatemia in mice. J. Bone Min. Res. 4: 493-500. Solomon, E., Hiorns, L., Dagleish, R., Tolstoshev, P., Crystal, R., and Nesbitt, T., Lobaugh, B., and Drezner, M. K. (1987). Abnormal para­ Sykes, B. (1983). Regional localisation of human alpha2(l) collagen thyroid hormone stimulation of 25-hydroxyvitamin D-1 alpha-hy- gene on chromosome 7 by molecular hybridisation. Cytogenet. Cell droxylase activity in the hypophosphatémie mouse: Evidence for Genet. 35: 64—66. a generalised defect of vitamin D metabolism. J. Clin. Invest. 79: Tabor, A., Andersen, O., Lundsteen, C., Niebuhr, E., and Sardemann, 15-1 9 . H. (1983). Interstitial deletion in the “critical region” of the long Nesbitt, T., Coffman, T. M., Griffiths, R., and Drezner, M. K. (1992). arm of the chromosome in a mentally retarded boy and his normal Crosstransplantation of kidneys in normal and H yp mice: Evidence mother. Hum. Genet. 64: 196-199. that the H yp mouse phenotype is unrelated to an intrinsic renal Tenenhouse, H. S., Scriver, C. R., Mclnnes, R. R., and Glorieux, defect. J. Clin. Invest. 89: 1453-1459. F. H. (1978). Renal handling of phosphate in vivo and in vitro by Nizetic, D., Zehetner, G., Monaco, A. P., Gellen, L., Young, B. D., the X-hnked hypophosphatémie male mouse: Evidence for a defect and Lehrach, H. (1991). Construction, arraying and high-density in the brush border membrane. Kidney Int. 14: 236-244. screening of large insert cosmid libraries of the human chromo­ Thakker, R. V., Read, A. P., Davies, K. E., Whyte, M. P., Weksberg, somes X and 21. Proc. Natl. Acad. Sci. USA 88: 3233-3237. R., Glorieux, F., Davies, M., Mountford, R. C., Harris, R., King, Palmieri, G., Romano, G., Casamassimi, A., D’Urso, M., Little, R. D., A., Kim, G. S., Fraser, D., Kooh, S. W., and O’Riordan, J. L. H. Abidi, F. E., Schlessinger, D., Lagerstrom, M., Malmgren, H., (1987). Bridging markers defining the map position of X-linked Steen-Bondeson, M.-L., Pettersson, U., and Landegren, U. (1993). hypophosphatémie rickets. J. Med. Genet. 24: 756-760. A 1.5-Mb YAC contig in Xq28 formatted with sequence-tagged sites Weissenbach, J., Gyapay, G., Dib, C., Vignal, A., Morissette, J., Mil- and including a region unstable in the clones. Genomics 16: 5 8 6 - lasseau. P., Vaysseix, G., and Lathrop, M. (1992). A second-genera­ 592. tion linkage map of the human genome. N ature 359: 794-801. 148

GATA 11(5-6): 148-157, 1994

Construction and Preliminary for size selection. PI clones have smaller insert sizes than YACs, BACs, and PACs and hence are Analysis of the ICRF less susceptible to shearing in solution. We have, however, adapted a PFGE method [11] to produce Human PI Library a reliable protocol for preparing PI libraries, which is described in detail in this report and sum­ marized schematically in Figure 1. FIONA FRANCIS, We have previously prepared a PI library from GÜNTHER ZEHETNER, the genome of the fission yeast Schizosaccharo- myces pombe. The whole-genome mapping proj­ MATTIAS HOGLUND, and ect involving S. pom be evaluated the effectiveness HANS LEHRACH of cosmid, PI, and YAC cloning procedures [15, 16]. A PI library was used to speed up the gap closure process in the construction of the cosmid map, primarily because of their larger insert size PI clone libraries have now been established as effec­ and also because PI clones were found to be more tive complements to cosmid and yeast artificial chro­ successful than cosmids in spanning rDNA repeat mosome libraries in long-range mapping projects. To allow general access to PI clones, we have constructed sequences. human and mouse PI libraries. Clones have been We have now prepared 47,000 PI clones from picked into microtiter plates and used to prepare high- human DNA. These clones have been arrayed density filter grids, providing an efficient and easy manually in microtiter dishes and robotically spot­ screening system. Filters are being made available to ted at high density onto filter membranes for hy­ other laboratories through the Reference Library Sys­ bridization screening purposes. Screening by hy­ tem. In this work, we have developed a reliable protocol bridization is an effective means of analyzing large for generating PI clones, based on the use of pulsed- numbers of clones in parallel and has the advan­ field gel electrophoresis for size selection of DNA. A tage of allowing the use of relatively complex I.2x genome coverage human library has been pro­ probes to isolate clones (for example, YAC in­ duced using this method. A preliminary analysis of this library is described. serts, pools of cDNAs). In this way, related clones from different libraries can be identified in one screening experiment. Sharing of library resources with the scientific community (in the form of li­ Introduction brary filters) and collation of experimental data In original protocols describing the construction of that are generated is the basis of the ICRF Refer­ PI libraries [1-4], high molecular weight DNA was ence Library System (RLS) [17, 18]. The prelimi­ manipulated in aqueous solution, and size frac­ nary analysis of the ICRF human PI library has tionation was performed using sucrose gradients. been provided by a combination of in-house The original yeast artificial chromosome (YAC) screens and the distribution of library filters to cloning procedures used these same techniques RLS participants. [5-7]; however, YAC cloning by pulsed-field gel electrophoresis (PFGE) has now largely super­ seded the original methods [8-12]. Two additional Materials and Methods cloning systems, the bacterial artificial chromo­ some (BAC) [13] and more recently PI artificial Preparation of Genomic DNA in chromosome (PAC) [14] systems both use PFGE Agarose Blocks High molecular weight DNA derived from the hu­ man lymphoblastoid cell line GM1416B (Human From the Genome Analysis Laboratory, Imperial Cancer Genetic Cell Repository, Camden, NJ, USA) with Research Fund (ICRF), London, England. Dr. Hoglund’s present address: Department of Clinical Ge­ a karyotype 48XXXX was prepared in agarose netics, University Hospital, S-221 85 Lund, Sweden. blocks (3 X 10^ cells/block) as described previ­ Address correspondence toF. Francis, Genome Analysis, ously [19]. Blocks were stored in 10 mM Tris pH Imperial Cancer Research Fund, 44, Lincoln’s Inn Fields, Lon­ don WC2A 3PX, United Kingdom. 7.5, 50 mM EDTA at 4°C and washed in TE (10 Received 6 April 1994; revised and accepted 17 May 1994. mM Tris-HCl pH 7.5 and 1 mM EDTA) thor-

© 1994 Elsevier Science Inc., 655 Avenue of the Americas, New York, NY 10010 1050-3862/94/$7.00 149

ICRF Human PI Library GATA 11(5-6): 148-157, 1994

HMW DNA pAdl OsacBII vector in agarose blocks Linearize^with Seal Partial digests (Mbol) CIP treat ends

CIP treit ends BamFII digest to produce Ï vector arms: First size selection by PFGE

\ / Ligation

Second size selection by PFGE

Agarase gel band containing DNA

Concentrate DNA by ethanol precipitation

Package DNA i Infection, and plating onto selective agar

Pick clones into microtiter dishes

Spot clones at high density onto Nylon membrane

Figure 1 PI cloning using pulsed-field gel electrophoresis (PFGE). PI cloning scheme, adapted from Pierce and Sternberg [22], summarizing the steps involved in preparing PI libraries starting from DNA embedded in agarose and using PFGE for size selection (described in detail in the text). oughly before digestion (typically 3 x 30 min at gestions, blocks were equilibrated extensively for room temperature on a rocker). 4 h at 4°C in fresh buffer containing a suitable dilution of Mbol (BRL, 0.04-0.08 U/block). Mbol was freshly diluted in a buffer containing 50 mM Partial Digestion of DNA in Agarose Blocks KCl, 10 mM Tris-HCl pH 7.5, 0.1 mM EDTA, 1 Reproducible partial digests were obtained as fol­ mM DTT, 200 |xg/ml BSA, and 50% glycerol. The lows: blocks were washed extensively in TE be­ reaction was initiated by the addition of Mg^^ to fore equilibration at 4°C in enzyme buffer 10 mM and continued at 37 °C for 4 h. The diges­ (-)Mg^^ (modified from O’Farrell et al. [20]). tion was terminated by the addition of EDTA to 20 This buffer contained 33 mM Tris-acetate (pH mM. Extent of digestion was estimated by PFGE 7.9), 66 mM potassium acetate, 0.5 mM dithio- of DNA, using a contour-clamped homogeneous threitol (DTT), 1 mM EDTA, and 2 mM spermi­ electric field (CHEF) apparatus (Biorad). DNA dine. To ensure complete penetration of enzyme was electrophoresed in a 1% agarose gel, 0.5x into the blocks, and hence reproducible partial di­ TBE using X. concatamer and Saccharomyces ce-

1994 Elsevier Science Inc., 655 Avenue of the Americas, New York, NY 10010 150

GATA 11(5-6); 148-157, 1994 F. Francis et al. revisiae chromosome markers (FMC). Puise times chloroform and precipitated with ethanol. The ef­ were ramped from 3 to 20 s, over a period of 20 h ficiency of the CIP treatment was assessed by li­ at 180 V. gating a small amount of DNA and comparing it by gel with unligated DNA. Vector arms were gener­ ated by cleavage with BamYil (New England Bio­ CIP Treatment of Partially Digested DNA labs) in the cloning site (for example, 50 jxg DNA Blocks containing partially digested DNA were in 300 |xl total reaction volume following manufac­ treated with CIP (calf intestinal alkaline phos­ turer’s conditions with 140 U of enzyme for 2 h at phatase, BMB) at 0.055 U/|xg DNA in Ix dephos­ 37°C), and the DNA was repurified with phenol- phorylation buffer (BMB). Incubation at 37°C was chloroform extractions and ethanol precipitation. for 3 h, followed by addition of EDTA to 20 mM. The vector arms were resuspended at a concen­ Proteinase K was added to a final concentration of tration of 1 mg/ml. A religation control was per­ 1 mg/ml, and blocks were incubated at 50°C for 30 formed and compared with unligated material on a min. Blocks were washed in TE, pH 7.6, for 2 x 30 gel. The integrity of the religated material was also min at 50°C, then twice in TE with PMSF at 40 assessed by packaging a small amount of DNA and |xg/ml at the same temperature. estimating the ratio of the number of colonies pro­ duced on a kanamycin plate compared with a kanamycin and sucrose plate. The expression of Initial Size Selection by PFGE the sacB gene, which is part of the PI vector, in Blocks were washed in TE, prior to loading in a the presence of sucrose causes bacterial cell death trough of a 1% low melting point (LMP) gel (Sea- [21]. Provided this gene is intact, religated vector Plaque GTG, FMC), 0 .5 x TBE. Blocks were arms without insert should not give rise to viable sliced lengthways and loaded evenly across the bacteria and therefore we would expect to see at trough. The gel was electrophoresed in a Biorad least one hundred times more colonies on a kana­ CHEF apparatus for 16 h using a fixed pulse time mycin agar plate compared with kanamycin with of 3 s at 180 V. These parameters compressed sucrose. DNA of —100 kb into the region of limiting mobil­ ity. Only marker lanes were stained and then re­ aligned with the remainder of the gel. This enabled Ligation excision of the limiting mobility DNA, without The agarose slice containing DNA from the initial ethidium bromide intercalation and nicking by ul­ size selection gel was equilibrated in 30 ml ligation traviolet light. The remainder of the gel was buffer containing 50 mM Tris-HCl (pH 7.5), 10 stained to verify integrity and correct excision of mM MgCl2, 30 mM NaCl, and Ix polyamines (0.3 the material. Excised DNA was either stored in mM spermine and 0.75 mM spermidine) for 4 x 30 TE50 (10 mM Tris-HCl pH 7.6 and 50 mM EDTA) min, transferred to an Eppendorf tube and melted or immediately equilibrated in ligation mix. at 6 8 °C for 15 min with an eightfold molar excess of vector arms. The DNA was allowed to equili­ brate to 37°C, before addition of ligation buffer Vector Arm Preparation containing 1 mM ATP, 1 mM DTT and T4 DNA The positive selection vector (pAdlOsacBll) has ligase (NEB 400 U/|xl) at 3 U/|xl total reaction vol­ been previously described [21]. The vector was ume. The mixture was stirred gently using a wide- linearized with ^cal obtained from New England bore pipette tip (yellow tip with the last millimeter Biolabs (for example, 50 |xg DNA in 300 |xl total cut off with a scalpel blade) on a gilson pipetman final reaction volume following manufacturer’s and incubated at 37°C for 1 h, followed by over­ recommended conditions, using 100 U of enzyme night incubation at room temperature. The reac­ for 4 h at 37°C), and then heated to 6 8 °C for 10 tion was terminated by the addition of EDTA to min. After cooling to room temperature, DNA was 20 mM. treated with CIP at 0.2 U/|xg of DNA and incu­ bated for 30 min at 37°C. The dephosphorylation reaction was terminated by the addition of nitrilo- Second Size Selection by PFGE triacetic acid to 15 mM and incubation at 6 8 °C for The ligated DNA was melted at 6 8 °C for 15 min 25 min. The DNA was extracted with phenol- and loaded evenly into a trough of a 1% LMP aga­

1994 Elsevier Science Inc., 655 Avenue of the Americas, New York, NY 10010 151

ICRF Human PI Library GATA 11(5-6): 148-157, 1994 rose gel and electrophoresed with a 4-s pulse time fection with phage lysate was NS3145 [2]. Clones at 180 V for 16 h. Stringency of size selection was were selected on LB agar containing 45 |xg/ml estimated first by varying the volume of ligation kanamycin and 5% sucrose. Clones were picked reaction loaded into the trough and analyzing the into microtiter dishes containing media (25 |xg/ml excised DNA on a PFG. The DNA compressed kanamycin) as previously described [23] and into the limiting mobility zone was excised as de­ stored at -70°C. scribed for the first size selection PFG. Preparation of High-Density Library Filters Agarase Reaction The library of human PI clones was replicated into The gel slice was equilibrated in 30 ml of a buffer 384-well dishes (Genetix, UK), and a duplicate containing 40 mM Tris-acetate (pH 8.0), 1 mM copy was used to prepare filter membranes of ro­ EDTA, Ix polyamines for 3 x 30 min at room botically spotted clones for screening by hybrid­ temperature, and melted at 68°C for 15 min. The ization. Clones were arrayed at densities of 20,736 DNA was cooled to 37°C, and 0.5 U agarase clones per 22 x 22-cm membrane as previously (Sigma) was added per 100 jxl of molten DNA by described [17]. using a wide-bore tip. The reaction was incubated at 37°C for 3 h. The efficiency of the agarase re­ action was verified by incubating the mixture on Screening of the Human PI Library ice for 30 min and checking for resolidification. Library filters were prehybridized at 65°C for 1 h Size-selected DNA was either packaged directly in hybridization buffer [24] containing 0.5 M so­ or concentrated by ethanol precipitation. dium phosphate (pH 7.2), 7% SDS, and 1 mM EDTA. YAC inserts were PFG purified and treated as described [25]. Probes were labeled by Concentration of Size-Selected DNA random hexamer priming [26], and those likely to Using a wide-bore tip, ammonium acetate (final contain repetitive sequences were preannealed concentration, 2.5 M) was gently mixed into the with 1.5 mg/ml sheared human placental DNA agarased solution, which was incubated on ice for (Sigma) for 2 h at 65°C in 0.12 M NazHPO^ [27, 10 min and then spun at half-speed in a microcen­ 28]. Hybridization was performed for 16 h at 65°C trifuge for 10 min. The supernatant was trans­ in the same hybridization buffer. Filters were ferred to a fresh tube using a wide-bore tip, and 3 washed in 40 mM sodium phosphate, 0.1% SDS, vol of ethanol was added and gently mixed to ho­ twice at room temperature, and then twice at 65°C mogeneity. After incubation on dry ice for 1 h, the for 20 min. When using whole cosmids as probes, DNA was centrifuged at maximum speed in a mi­ two further washes at 65°C were performed. Ex­ crocentrifuge for 30 min at 4°C. The supernatant posure to x-ray film (Kodak X-Omat AR) was for was removed and the pellet was washed with 70% 1-3 days at -70°C, with a single intensifying ethanol. After removing as much of the ethanol screen. wash as possible, TE was added to the DNA pel­ let, which was allowed to rehydrate at 4°C over­ night. PI DNA Preparation PI plasmid DNA was prepared by a modified ver­ sion of the alkaline lysis procedure [29]. Briefly, Storage of Size-Selected DNA single colonies were picked into 10-ml cultures of Ready-to-package DNA was stored by snap freez­ LB media containing 25 |xg/ml kanamycin and ing the DNA in liquid nitrogen with storage at grown overnight at 37°C. Cells were pelleted and -70°C. resuspended in 300 |xl of alkaline lysis buffer I. Cells were transferred to Eppendorf tubes, and 600 |xl of alkaline lysis buffer II was mixed in gent­ Packaging and Recovery of ly. A total of 450 |xl of alkaline lysis buffer III was Recombinant Clones added, and the tube was inverted several times. Preparation of packaging extracts and in vitro Cell debris was pelleted, and DNA recovered by packaging were performed as described [22]. The isopropanol precipitation. In the preparation of strain used to recover recombinant DNA after in­ the library, clones were sized as undigested DNA,

1994 Elsevier Science Inc., 655 Avenue of the Americas, New York, NY 10010 152

GATA 11(5-6); 14&-157, 1994 F. Francis et al. after removal of soluble proteins by precipitation equilibration of the DNA agarose blocks in reac­ with ammonium acetate (final concentration 2.5 tion buffer containing enzyme before initiating the M), and ethanol precipitation of the DNA. reaction with magnesium ions. Partially digested DNA was treated with alkaline phosphatase to prevent coligation events between noncontiguous Sizing of PI Clones fragments. For this reason, vector arms were not Undigested plasmid DNA was electrophoresed on treated with alkaline phosphatase at the cloning a 1% agarose PFG using a Biorad CHEF appara­ site ends and were added in excess to the ligation tus. Pulses were ramped from 4 to 20 s for 20 h at reaction. 180 V. Several clones of known size were used as markers. Size Selection We have incorporated two size selections within Results this procedure to eliminate a large proportion of Partial Digests the small fragments before ligation and to elimi­ nate the remaining small fragments and remove Partial digests were considered optimal if a high­ religated vector arms in a more stringent size se­ lighted region of ethidium bromide fluorescence lection after ligation (see Figure 3). Both size se­ was achieved between 100 and 150 kb, with a re­ lection PFGs are performed using short pulse duction of DNA from the limiting mobility region times, which cause the DNA of interest to be com­ and without the production of appreciable under­ pressed in the region of limiting mobility. DNA digested material (see Figure 2). This optimization can therefore be excised from the PFG in a rela­ of the degree of digestion has a major effect on tively concentrated form. Routinely, a sample of overall cloning efficiency. Initially, a range of en­ the excised DNA from the second size-selection zyme dilutions was tested on quarter blocks, and gel was reelectrophoresed on a PFG in order to the dilutions that appeared to give fragments of the assess the efficiency of size selection and the DNA correct size were selected for scaling up. Repro­ concentration. ducible partial digests were obtained by extensive

A B Further Concentration of DNA Initial tests Scale-up We assessed several techniques for concentrating and recovering linear DNA fragments of approxi­ m 1 2 3 4 1 5 6 7 mately 100 kb from agarose before packaging. These included electroelution directly from aga­ rose and the use of Qiagen columns (Diagen), Cen- tricon filters (Amicon), or phenol-chloroform ex­ ISOkb ISOkb tractions to purify DNA from an agarased solu­ lOOkb lOOkb tion. The problems of loss of DNA on membranes 50kb SOkb and in columns and shearing of DNA by manipu­ lations as a liquid were always encountered. Di­ rect ethanol precipitation of an agarased solution proved to be the simplest and most successful method tested.

Figure 2. Partial digests of genomic DNA. (A) Initial range of Efficiency of Producing Clones enzyme dilutions tested on DNA in agarose blocks {lanes 1-4): lane I, uncut DNA; lane 2, dilution producing underdigested The concentration of DNA in an agarased solution partial; lane 3, dilution producing approximately the desirable cannot be determined by conventional means. degree of digestion; and lane 4, dilution producing overdi­ gested DNA. (B) Scale-up experiment—a finer range of en­ Therefore, we cannot accurately say how many PI zyme dilutions (lanes 5-7), lane 5 showing a highlighted region clones were produced per microgram of size- of ethidium bromide fluorescence between 100 and 150 kb, selected DNA. However, DNA concentrations without appreciable underdigested material. Marker lanes (m) contain lambda concatamers (FMC). PFG conditions are as were estimated from ethidium-bromide-stained described in Materials and Methods. PFGs, suggesting that clones were produced at an

1994 Elsevier Science Inc., 655 Avenue of the Americas, New York, NY 10010 153

ICRF Human PI Library GATA 11(5-6); 148-157, 1994

m arkers m arkers DNA sam ple

~100kb Excised band 48.5 kb

8.3 kb

Figure 3. Second size-selection PFG. Ligated material electrophoresed on a 1% LMP (Seaplaque, FMC) PFG with 4-s pulse time for 16 h at 180 V (Biorad CHEF apparatus). Markers are (/) BRL high molecular weight (8.3^8.5 kb) and (2) \ concatamers (FMC). Marker lanes were excised, stained, and aligned with the remainder of the gel for band excision. DNA to be cloned was excised from the limiting mobility before staining the remainder of the gel. Banding patterns across the gel are due to the vector arms.

average efficiency of 1 x l(W|jLg size-selected phoresed on PFGs to establish average clone size. DNA. Initially, we sized a number of clones by frequent cutter enzyme digestion, normal gel electrophore­ sis, Southern blotting and hybridization with total Human PI Library Filters human DNA (data not shown). Several of these We have previously prepared filter membranes clones were subsequently electrophoresed undi­ containing S. pombe PI clones robotically arrayed gested on a PFG to establish a set of clones that at high density [15]. In this case, clones imprinted were used as markers in future clone-sizing exper­ on filters were grown overnight at room tempera­ iments (Figure 5). We estimate that the average ture on agar plates, followed by a 6-h induction at insert size of the ICRF human PI clones is at least 37°C on fresh plates containing I mM isopropyl-[3- 75 kb. D-thiogalactopyranoside (IPTG). In the prepara­ A long-range mapping project, which has con­ tion of the human PI library filters, clones were tributed greatly to the further characterization grown on filters overnight at 37°C without induc­ of the ICRF human PI library, involves the Ip36 tion and have produced relatively strong signals region. PI clone isolation and analysis from this by hybridization. Library filters are being distrib­ region is described in this issue (Lengauer et uted to the scientific community [18], and as of al.). March 1994 we have had 87 filter requests (see Table 1). A variety of probes of different complex­ ities can be hybridized successfully to PI library Discussion filters (Figure 4). We have hybridized single-copy probes, whole cosmid clones, single and pooled We have described in detail an alternative pro­ cDNAs, and YAC inserts and have obtained tocol for the production of PI clones, which strong signal intensities that are equivalent to uses PFGE for size selection. We have used those obtained on cosmid clone filters and much this protocol to construct and array 47,000 hu­ greater than that obtained for YAC clone library man PI clones, which represent a I.2x genome filters. The lifespan of a PI library filter is 8-10 coverage. We are now in a position to increase hybridizations, although with careful handling we clone numbers further with the aid of a robotic have used some filter membranes 16 times. picking device to array the clones into storage dishes. In the construction of this library, we have used Clone Characterization two size selections: before and after ligation. In During the production of the human PI library, fact, two size selections may not be necessary, DNA minipreps from >200 clones were electro­ and we have previously constructed an S. pombe

1994 Elsevier Science Inc., 655 Avenue of the Americas, New York, NY 10010 154

GATA 11(5-6): 148-157, 1994 F. Francis et al.

Table 1. Hybridizations to High-Density Pi Filters

C h r o m o so m e location of Number of used probes Probes U sers R eq uests Clones Distribution of PI library witbin lp32 2 1 2 4 the Reference Library System Ip35-p36.1 1 1 1 4 lp36 15 15 148 Number of filter requests: 87 l p 3 6 .Il-p35 1 1 1 17 Number of filters distributed: 397 lpter-p36.12 1 6 42 Number of probes used: 150 2ql2-ql4 3 1 3 3 Number of clone requests: 170 2 1 1 1 1 Number of clones identified: 1390 3 1 1 1 2 4pl6.3 4 4 29 4 1 1 1 20 5 1 1 1 1 6q25-q27 1 1 1 2 6 1 1 1 5 7q21-q22 1 1 1 5 Number and location of laboratories 8q2.4 1 1 1 8 using PI Reference Library Filters 8q24 1 2 18 9pl3 1 1 1 3 2 A u stria 9p3 6 1 1 1 53 1 F in la n d 9ql3-q21.1 1 4 16 6 F r a n c e 9q34 1 5 69 5 G e rm a n y 10pl3 1 1 1 12 1 Isr a e l llql3 1 1 1 5 3 Ita ly 14 1 2 12 2 The Netherlands 15qll-ql3 1 2 16 3 8 U K 16 1 1 1 1 5 U S A 17q21-qter 1 1 1 4 17q21 3 6 23 17q21.1 1 1 1 4 21q22 1 1 1 5 22qll 5 34 For Reference Library filler and clones from this 22 1 2 23 PI library plea.se contact the Reference Library Xp 1 1 1 1 DataBase. ICRF, 44 Lincoln's Inn Fields, Xpll.21 1 1 1 3 London WC2A 3PX. Fax: +44 71 269 3645 Xpll.23 1 1 1 7 Email: GENOME®ICRF.ICNET.UK Xp21.3 1 2 7 Xp2 2 4 1 4 17 Data from the RLDB are accessible through Xp22.1-p22.2 1 1 1 4 anonymous FTP from the NCBI data repository X p 2 2 .1 3 1 3 20 (at ncbi.nlm.nib.gov or 130.14.20.1 in directory Xq2.7-q2.8 1 2 4 /repository/RLDB/*) or online over the World Xq28 21 21 150 Wide Web from the RLDB information server at X 1 1 1 6 the URL http://gea.lif.icnet.uk/. X+Ypseudoauto 1 1 1 3 X+Y 5 2 5 30

library using one size selection only 115]. How­ instances polyamine addition does not appear to ever, we opted to perform two size selections for be crucial for the protection of 100-kb pieces of the human library as a precautionary measure: ini­ DNA in melting steps or to improve the efficiency tially to eliminate the possibility of coligation of PI cloning. events between two small inserts that might then We have used a simple method for sizing PI be packaged into one phage head and secondly to clones that can be performed from crude miniprep prevent religated vector arms from reducing the DNA and is therefore useful when analyzing large packing efficiency of viable recombinant mole­ numbers of samples, for example in the prepara­ cules. tion of a library. However, there is a narrow win­ We have incorporated polyamines into the clon­ dow of resolution on this type of sizing PFG, and ing procedure as a safeguard against degradation for more accurate estimates of insert size, Notl of D N A in agarose-melting steps [11]. In addition, digests of miniprep DNA are advisable. The ease polyamines have also been found to assist efficient of DNA isolation from PI clones compared with cloning of several hundred kilobase YACs in yeast YAC clones facilitates the use of PI clones as a transformations [9]. We have, however, occasion­ source of genomic DNA for the isolation of genetic ally experienced problems when the addition of markers. PI insert sizes may also allow entire polyamines has caused DNA precipitation. We genes, in many cases, to be encompassed in one have now constructed a 3x genome coverage clone. C57BL/6 mouse PI library (Francis et al., manu­ In at least two cases, PI clones have been iden­ script in preparation) using a protocol without tified that appear to delete sequences in growth. It polyamine addition. Results indicate that in some is not clear at present what the proportion of un-

1994 Elsevier Science Inc., 655 Avenue of the Americas, New York, NY 10010 155

ICRF Human PI Library GATA 11(5-6); 148-157, 1994

f ./ - ' . 4 ? '

tsi> ■■

Figure 4. Autoradiographs of an ICRF human PI library filter hybridized with three different probes. Each filter carries an array of 50,736 clones (—0.5 x genome coverage). These are present in 48 x 48 boxes each containing 9 PI clones, (a) Whole cosmid clone from the Huntington’s disease gene region, (b) Anonymous cDNA clone detecting seven strong positives. This cDNA clone may be a member of a gene family or contain a moderatively repetitive sequence, (c) PFG-purified YAC insert (500 kb). This filter membrane has been hybridized multiple times, and therefore signals are relatively weak.

Stable clones in the library is, and whether these clones), we are now directly arraying clones in a are solely due to the genomic region from which more compact form (384-well dishes) and spotting they are derived or are also dependent upon the clones on membranes at higher densities. Filters recipient host used to propagate the clones. We are, however, with internal duplications of each have used NS3I45 12], which contains point mu­ clone to help distinguish positives (18,000 clones tations in a number of host restriction and modi­ spotted in duplicate per membrane). These filters fication systems and contains the /ac/^ gene on an are distributed as part of the ICRF Reference Li­ F' plasmid. This strain may be subject to rever­ brary System. sions or loss of the F' plasmid. In the latter case, In-house human PI library screens have led to stabilization of some genomic DNA inserts may be the identification of clones in the 2.2 Mb Hunting­ lost due to amplification in copy number of the PI ton’s disease gene region in 4pl6.3 [30]. Pools of plasmid. NS3529 [21] is an alternative PI cloning probes from the ends of cosmid contigs identified host strain that has deletion mutations in several of 4 PI clones, two of which contained DNA se­ the host systems including an additional system quences that had not been found in a chromosome (mrr) and contains the lacV^ gene on a X prophage. 4 specific cosmid library (representing five chro­ This strain may therefore be preferable for propa­ mosome equivalents). Similar results have been gation of clones. obtained in the MHC region on chromosome 6 We have produced library filters containing PI [31], where one particular region initially resisted clones spotted at densities of 20,736 clones per 22 all cloning attempts in a cosmid vector. A PI clone X 22-cm membrane. For ease of handling the li­ was identified that spanned this gap. Cosmids braries (especially those containing >100,000 were subsequently identified from a flow-sorted

1994 Elsevier Science Inc., 655 Avenue of the Americas, New York, NY 10010 156

GATA 11(5-6); 148-157, 1994 F. Francis et al.

marker 10 rantdomiy colleagues at DuPont Merck for the advice and resources they PI clones selected PI clones have provided. This work was supported in part by MRC grants 9106753 and 9215554.

References lOOkb 1. Sternberg N: Proc Natl Acad Sci USA 87:103-107, 1990 2. Sternberg N, Ruether J, de Riel K: New Biol 2:151-162, 1990 3. Pierce JC, Sternberg N: Mamm Genome 3:550-558, 1992 4. Smoller D, Petrov D, Hartl D: Chromosoma 100:487-494, 1991 5. Burke DT, Carle GF, Olson MV: Science 236:806-812, 1987 6. Coulson A, Waterston R, Kiff J, Sulston J, Kohara Y: Figure 5. Sizing of PI clones. Uncut plasmid DNA from 10 Nature 335:184-186, 1988 randomly selected human FI clones has been electrophoresed 7. Garza D, AJioka JW , Burke DT, Hartl DL: Science 246: in a \% normal (SeaKem, FMC) PFG (Biorad CFIEF, 4-20 s, 641-646, 1989 20 h at 180 V). Marker clones are PI clones (previously sized 8. Anand R, Villasante A, Tyler-Smith C: Nucleic Acids Res by digestion), which range from 15 to 100 kb. From left to right, 17:3425-3433, 1989 markers are 15, 30, 50, 65, 95, and 100 kb. In each case, one- quarter of total minipreparation DNA has been loaded per well. 9. McCormick MK, Shero JH, Cheung MC, Kan YW, Hieter Under these conditions, PI clone DNA (prepared as described PA, Antonarakis SE: Proc Natl Acad Sci USA 86:9991- in Materials and Methods) appears to be in relaxed form ex­ 9995. 1989 clusively, as only a single band is visible in each lane. Occa­ 10. Albertsen HM, Ahderrahim H, Cann HC, Dausset J, Le- sionally, a band with size corresponding to linear DNA is also Paslier D, Cohen D: Proc Natl Acad Sci USA 87:5109- present. 5113, 1990 11. Larin Z, Monaco AP, Lehrach H: Proc Natl Acad Sci USA 88:4123-4127, 1991 chromosome 6 cosmid library although these 12. Lee JT, Murgia A, Sosnoski DM, Olivos IM, Nussbaum cosmids grew slowly. This region is known to con­ RL: Genomics 12:526-533, 1992 tain many repeat sequences, which may explain 13. Shizuya H, Birren B, Kim U-J, Mancino V, Slepak T, the underrepresentation of this sequence cloned in Tachiiri Y, Simon M: Proc Natl Acad Sci USA 89:8794- 8797, 1992 the cosmid system. In the DiGeorge’s syndrome 14. loannou PA, .4memiya CT, Games J, Kroisel PM, Shizuya region on chromosome 22, PI clones were identi­ H, Chen C, Batzer MA, de Jong PJ: Nature Genetics 6: fied in regions of YAC instability and where 84-89, 1994 cosmid walking attempts had failed [32]. 15. Hoheisel JD, Maier E, Mott R, McCarthy L, Grigoriex' These are just a few examples of the ICRF hu­ AV, Schalkwyk EC, Nizetic D, Francis F, Lehrach H: Cell 73:109-120, 1993 man PI library hybridization screens that have 16. Maier E, Hoheisel JD, McCarthy L, Mott R, Grigoriev been performed to date. A detailed evaluation of AV, Monaco AP, Larin Z, Lehrach H: Nature Genet 1: the use of ICRF human PI clones (especially for 273-277, 1992 FISH analysis) from the lp36 region is also dis­ 17. Lehrach H, Drmanac R, Hoheisel J, Larin Z, Lennon G, Monaco AP, Nizetic D, Zehetner G, Poustka A: In K E , cussed in this issue (Lengauer et al.). These ex­ Davies. Tilghman SM (eds) Genome Analysis, vol 1: Ge­ amples highlight the utility of PI clones that may netic and Physical Mapping. (Cold Spring Harbor, NY. contain DNA sequences underrepresented or un- Cold Spring Harbor Laboratory, 1990, pp 39-81 clonable in both cosmid and YAC systems and 18. Zehetner G, Lehrach H: Nature 367:489-491, 1994 hence emphasize the effectiveness of using this 19. Herrmann BG, Barlow DP, Lehrach H: Cell 48:813-825. 1987 system in parallel with other cloning systems. 20. O’Farrell PH, Kutter E, Nakanishi M: Mol Gen Genet Note added in proof. The karyotype of the cell line used to 179:421-435, 1980 produce this library is currently being reassessed. 21. Pierce JC, Sauer B. Sternberg N: Proc Natl Acad Sci USA 89:2056-2060, 1992 22. Pierce JC, Sternberg N: Methods Enzymol 216:549-574, 1992 The authors thank all members of the Genome Analysis Lab­ oratory who have been involved in duplication, filter produc­ 23. Nizetic D, Zehetner G, Monaco AP, Gellen L, Young BD, tion, and recovery of clones associated with the ICRF human Lehrach H: Proc Natl Acad Sci USA 88:3233-3237, 1991 PI library, and Dean Nizetic for helpful advice in the early 24. Church GM, Gilbert W: Proc Natl Acad Sci USA 81:1991- stages of this project. We thank Reference Library System 1995, 1984 participants for their contribution to the preliminary character­ 25. Baxendale S, Bates GP, MacDonald ME, Gusella JF, ization of this library. We are indebted to Nat Sternberg and Lehrach H: Nucleic Acids Res 19:6651, 1991

1994 Elsevier Science Inc., 655 Avenue of the Americas, New York, NY 10010 157 ICRF Human PI Library GATA 11(5-6): 148-157, 1994

26. Feinberg AP, Vogelstein B; Anal Biochem 132:6-13, 1983 Collins FS, Deaven LJ, Gusella JF, Lehrach H, Bates GP: 27. Sealey PG, Whittaker PA, Southern EM: Nucleic Acids Nature Genet 4:181-186, 1993 Res 13:1905, 1993. 31, Sanseau P, Jackson A, Senger G, Kelly A, Francis F, 28. Litt M, White RL: Proc Natl Acad Sci USA 82:6206, 1985 Sheer D, Trowsdale J: Hum Immunol 40:1-7, 1994 29. Sambrook J, Fritsch EF, Maniatis T: In Molecular Clon­ 32. Halford S, Wadey R, Roberts C, Daw SCM, Whiting JA, ing: A Laboratory Manual, 2nd ed. Cold Spring Harbor, O’Donnell H, Dunham I, Bentley D, Lindsay E, Baldini A, NY, Cold Spring Harbor Laboratory, 1989 Francis F, Lehrach H, Williamson R, Wilson DI, Goodship 30. Baxendale S, MacDonzild ME, Mott R, Francis F, Lin C, J, Cross I, Bum J, Scambler PJ: Hum Mol Genet 2:2099- Kirby SF, James M, Zehetner G, Hummerich H, Valdes J, 2107, 1993

1994 Elsevier Science Inc., 655 Avenue of the Americas, New York, NY 10010