Length Polymorphisms in Human Proline-Rich Protein Genes Generatedby Intragenic Unequal Crossing Over
Total Page:16
File Type:pdf, Size:1020Kb
Copyright 0 1988 by the Genetics Society of America Length Polymorphisms in Human Proline-Rich Protein Genes Generatedby Intragenic Unequal Crossing Over Karen M. Lyons, James H. Stein and Oliver Smithies Laboratory of Genetics, University of Wisconsin, Madison, Wisconsin 53706 Manuscript received September 25, 1987 Revised copy accepted May 16, 1988 ABSTRACT Southern blot hybridization analysis of genomic DNAs from 44 unrelated individuals revealed extensive insertion/deletion polymorphisms within the BstNI-type loci (PRBl, PRBP, PRB3and PRBI) of the human proline-rich protein (PRP) multigene family. Ten length variants were cloned, including alleles at each of the four PRB loci, and in every case the region of length difference was localized to the tandemly repetitious third exon. DNA sequences covering the region of length variation were determined for seven of the alleles. The data indicate (1) that the PRB loci can be divided into two subtypes, PRBl plus PRBP, and PRB3 plus PRBI, and (2) that the length differences result from different numbers of tandem repeats in the third exons. Variant chromosomes were also identified with different numbers of PRP loci resulting from homologous but unequal exchange between the PRBl and PRBP loci. The overall data are compatible with the observed length variants having been generated via homologous but unequal intragenic exchange. The results also indicate that these crossover events are sensitive to the amount of homology shared between the interacting DNA strands. Allelic length variants have arisen independently at least 20 times at the PRB loci, but only one has been detected at PRHa locus. Comparison of the detailed structures of the repetitious regions in PRB and PRH loci shows that the repeats in PRB genes are very similar to each other in sequence and in length. The PRH genes contain fewer repeats, which differ considerably in their individual lengths. These differences suggest that the larger number of length variants in PRB genes is related to their greater ease of homologous but unequal pairing compared to PRH genes. TUDIES of the evolution of families of repetitive The same types of recombinational events that oc- S DNAs have revealed that these sequences show cur inyeast between repeated sequences mayalso greater homogeneity within speciesthan between spe- occurwithin genes that have internally repetitious cies. SMITH(1 976) constructed computer models that structures. An increasing volume of literature docu- showed repeated unequal sister chromatid exchange ments the existence of such genes (see, for example, can generate and maintain this homogeneity. Other ALLISONet al. 1985; MUSKAVITCHand HOGNESS1982; investigators (SCHERERand DAVIS 1980; NAGYLAKI SHIN et al. 1985; WARREN,COROTTO and WOLBER and PETES 1982) have considered the role of gene 1986). conversions in similar processes of homogenization. The human proline-rich protein (PRP) multigene In addition to these model-building studies, recip- family (reviewed by AZEN 1988) provides a particu- rocal recombination and gene conversionbetween larly interesting opportunity to examine the types of equally and unequally paired repeated sequences have recombinational events that can occur within genes been directly demonstrated in yeast. Unequal sister with internally repetitious structures. This family con- chromatid exchange has been detected in the rDNA sists of six tandemly linked genes, which encode the repeats during mitosis and meiosis (SZOSTAKand WU complex array of PRPs found in human saliva (AZEN 1980; PETS 1980). Experiments involving dispersed et al. 1984; MAEDAet al. 1985; LYONS,AZEN, GOOD- repeats on the same chromosome have demonstrated MAN and SMITHIES 1988). that homologous but unequal recombination occurs The six PRP genes are arranged on chromosome in both meiosis (KLEINand PETES198 1; KLEIN1984; 12p (MAMULAet al. 1985) in the order5’ PRB2-PRBI- JACKSONand FINK1985; JINKS-ROBERTSON and PETES PRB4-PRH2-PRB3-PRHI 3’ (H.-S. KIM, unpublished 1986) andmitosis (JACKSON and FINK 198 1; ROEDER, data); the cluster spans a physical distance of approx- SMITHand LAMBIE1984). Gene conversions between imately 600 kbwith approximately 70 to 180 kb repeated genes have also been reported (AMSTUTZet separating adjacent genes. The six genes are of two al. 1985;JINKS-ROBERTSON and PETES1985; JACKSON types that canbe distinguished by their ability to and FINK 1981). hybridize to a probe derived from PRBl (AZENet al. Genetics 120: 267-278 (September, 1988) 268 K. M. Lyons, J. H. Stein and 0. Smithies 1984; MAEDA1985). Four of the genes (PRBZ, PRBB, 123456 PRB3 and PRBI) hybridize strongly to the probe and T " contain multiple BstNI restriction sites (BstNI-type or PRB genes). The remaining twogenes (PRHI and PRH2) do not hybridize as strongly to the probe and contain multiple Hue111 restriction sites (HaeIII-type or PRH genes). Nucleotide sequence analyses (MAEDA et al. 1985; KIM and MAEDA1986) have demonstrated that the PRPs contain a series of proline-rich tandem repeats whichvary in length from 16 to 21 amino acids (48-63 bp). Initial molecular studies of the human PRP multi- gene family (AZENet al. 1984) demonstrated that four of the six genesshow frequent insertion/deletion FIGURE 1.-Southern blot hybridilation analysis of PRP gene length polymorphisms. We haveextended these stud- patterns in six unrelated individuals. DNAs were digested with ies by analyzing at the DNA sequence level ten alleles EcoRI. The 980-bpHinfl fragment from pPRPII2.2RP was used as from these four loci. This analysis reveals that the a probe. Assignment of length variants to specific loci was per- length differences are due to changes in the numbers formed as described in the text. Alleles of PRBl are identified by white squares. Alleles of PRBZ, PRB3 and PRB4 are identified by of the proline-rich tandem repeats. We consider sev- white arrows placed on the center, right, and left, respectively, of eral models for the generation of these length differ- the hybridizing band. Bands corresponding to PRHl and PRH2 are ences and find that intragenic homologous but un- identified by white dots (shown only in lane 2). The lengths of equal exchange provides the most simple explanation polymorphic alleles are indicated in kb. among those examined. We also find that thenumber TABLE 1 of PRP loci has been altered in some individuals by unequal exchanges between different loci. Incidence of polymorphic alleles in 44 unrelated individuals ObservedfEx MATERIALS AND METHODS No. oFted" No. of individuals DNA preparation: High molecular weightDNA was Size (kb) of alleles in heterozygous prepared from peripheral blood leukocytes of 44 unrelated Locus EcoRI fragment population at locus ~ individuals, including three individuals examined in a pre- PRB44 1 6.2 18123.1 vious study (AZENet al. 1984) by the method of PONCZet 6.0 36 al. (1982). 7 5.8 Southernblot hybridizations: Human genomic DNA 5.6 1 (5.0 pg per lane) was digested with EcoRI, electrophoresed in 0.8% agarose gels, and transferred to nitrocellulose ac- PRBZ 4.6 2 514.8 cording to the method of SOUTHERN(1 975) with modifica- 3.9 83 tions as described by WAHL,STERN and STARK(1979). 3.7 2 Filters were hybridized to a nick-translated 980 bp Hinfl 3.3 1 fragment from the plasmid pPRPII2.2RP. which contains PRB3 4.4 1 1811 6.6 the 2.2-kbEcoRIIPstI fragment from PRPl (AZENet al. 4.3 67 1984) subcloned into PAT153. The 980 bp Hinfl fragment 4.2 2 used as the probe contains the tandemly repetitious exon 3 4.0 18 of PRBI. Hybridization conditions were as described in VANINet al. (1 983). PRB4 3.4 6 1011 1.7 Construction and screening of phage libraries: Insert 3.3 4 DNA was prepared by digesting human genomic DNA to 75 3.2 completion with BamHI or Hind111 (New England Biolabs). 3 3.0 Bacteriophage X Charon 35 (LOENENand BLATTNER1983) Calculated from observed allele frequencies, assuming Hardy- arms were prepared, and phage libraries were constructed Weinberg mating. essentiallyas described by MANIATIS,FRITSCH and SAM- BROOK (1982). The phage libraries were plated onto Esch- erichia coli strain K802recA. vided by the University of Wisconsin Genetics Computer Subcloning the polymorphicPRP fragments: DNA iso- Group (DEVEREUX,HABERLI and SMITHIES1984). lated from small recombinant bacteriophage growths was digested with EcoRI, subcloned into pAT153, and trans- formed into E. coli strain K802recA. RESULTS DNA sequencing and analysis: DNA sequence analysis Length polymorphisms: Previous studies (AZENet was performed as described by MAXAMand GILBERT(1977) using the modifications of SLIGHTOM,BLECHL and SMITHIES al. 1984) showed that thenick-translated 980-bp Hinfl (1980). Nucleotide sequences were determined for both fragment from PRBI detected polymorphisms in hu- strands. The sequences were analyzed using software pro- man genomic DNA samples from three individuals. Length Polymorphisms in PRP Genes 269 P E PV P PRBlM FIGURE2.-Maps of thefour human c c c 4 puls PRB loci illustrating their two subtypes and their allelic variants. A, PRBl; B, PRB2;C, PRB3 and D, PRB4. The re- striction sites indicated are BstEII (Bs), EcoRI (E), HpaI (H),PstI (P),PuuII (Pv), PRBZS SstI (S), Sf1 (Sf), and XbaI (X). Restric- I+ 1M bl tion sites unique to a particular gene are * PRBZL indicated by single-headed arrows. Re- 1+ P3a bl PRBZV striction sites in identical positions in PRBl and PRB2 or in PRB3 and PRB4 are indicated by double-headed arrows. The regions of length variation were - = 250 localized and are shown above or below the respective complete maps forthe shortest allele at each locus. The approx- E S PHX Ir 9M hn h PB P. PRB3L imate lengthdifference between the ii 4 i his, * 41 PRBIS shortest allele at each locus and the other C tt allele(s) is indicated in bp above the map srp PV BS of each length variant. The restriction maps of alleles at each locus are identical D pRB4S except for the PstI site indicated in pa- rentheses that is present in PRB2vLbut e' d.