Gene Families

By Dr. NIJAGAL B.S.

What are gene families?
• A gene family is a group of genes that share important characteristics.
• Genes in a family usually:
  – share a similar sequence of DNA building blocks (nucleotides)
  – provide instructions for making products (such as proteins) that have a similar structure or function
• In some cases, dissimilar genes are grouped together because their products:
  – work together as a unit, or
  – participate in the same process

Human Molecular Genetics. 2nd edition

Advantages of classification
• Classifying individual genes into families helps researchers:
  – describe how genes are related to each other
  – predict the function of newly identified genes based on their similarity to known genes
  – predict where and when a specific gene is active (expressed)
  – find clues for identifying genes that are involved in particular diseases

Challenges
• Sometimes not enough is known about a gene to assign it to an established family.
• Genes may fit into more than one family.
• No formal guidelines define the criteria for grouping genes.

RNA-encoding gene families often have numerous family members
• Ribosomal RNA (rRNA) genes
  – 28S, 18S and 5.8S cytoplasmic rRNAs: tandemly repeated about 300 times, in five clusters of about 60 tandem repeats each, located on the short arms of the acrocentric human chromosomes 13, 14, 15, 21 and 22
  – 5S cytoplasmic rRNA: several hundred gene copies in at least three clusters on the long arm of chromosome 1
• Transfer RNA (tRNA) genes
  – 40 different subfamilies, each with several members, which encode the different species of cytoplasmic tRNA
  – multiple copies of the genes specifying the individual cytoplasmic tRNA molecules
  – several defective gene copies (pseudogenes)
• Small nuclear RNA (snRNA) genes
  – a large dispersed family of genes
• Small nucleolar RNA (snoRNA) genes
  – a large family of about 200 genes whose products are found in the nucleolus
• Other RNA genes
  – the 7SL RNA component of the signal recognition particle, which is required for protein export, and the RNA component of telomerase
  – certain RNA genes encode products that are important in gene regulation

Polypeptide-encoding gene families
• Some genes encoding identical or functionally related products are clustered, but often they are dispersed over several chromosomes.
• Functionally identical genes
  – Some are encoded by recently duplicated genes in a gene cluster, e.g. the duplicated α-globin genes.
  – Some genes on different chromosomes encode identical polypeptides, e.g. members of histone gene subfamilies.
  – Histone genes fall into five groups in terms of structure: H1 (the linker histone) and the four core histones, H2A, H2B, H3 and H4.
  – Histone genes can also be classified into three groups according to expression:
    • replication-dependent (restricted to the S phase of the cell cycle)
    • replication-independent (expressed at a low level throughout the cell cycle to give so-called replacement histones)
    • tissue-specific, e.g. the H1t and H3t genes are expressed exclusively in the testis

Figure: Chromosomal distribution of the human histone gene family. Eleven clusters comprising a total of about 60 histone genes are distributed over seven human chromosomes. The two clusters on 6p contain the great majority of histone genes. Other clusters contain only one or two of the histone gene subtypes.

• Functionally similar genes
  – A large fraction of human genes are members of gene families whose individual genes are closely related but not identical in sequence.
  – In many such cases the genes are clustered and have arisen by tandem gene duplication.
  – Example: the different members of each of the α-globin and β-globin gene clusters.
• Functionally related genes
  – Some genes encode products that are not so closely related in structure but are clearly functionally related. They may be:
    • subunits of the same protein or macromolecular structure
    • components of the same metabolic or developmental pathway
    • required to bind specifically to each other, as in the case of ligands and their receptors
  – In such cases the genes are not clustered and are usually found on different chromosomes.

Nuclear genome (contd.)
• The total number of genes in the human genome has been estimated to be about 70 000–80 000.
• As all but 37 of these genes are located in the nuclear genome, this gives a rough estimate of about 3000 genes per chromosome.
• However, gene density can vary substantially between chromosomal regions and also between whole chromosomes.
• For example, heterochromatic regions are known to be composed very largely of repetitive noncoding DNA, and the centromeres and large regions of the Y chromosome, in particular, are notably devoid of genes.

Figure: FISH of a CpG island fraction from human DNA (CpG: neighboring cytosine and guanine residues on the same DNA strand in the 5′ → 3′ direction). The Texas red signal is derived from the CpG island probe, while the fluorescein isothiocyanate (FITC) green signal represents late replicating regions (which are mostly transcriptionally inactive). Black regions represent overlap of signals.

Gene distribution
• Recently, insight into gene distribution along the lengths of the different chromosomes has been obtained by hybridizing purified CpG island fractions of the genome (which are associated with perhaps about 56% of human genes) to metaphase chromosomes.
• On this basis, it is clear that gene density is high in subtelomeric regions and that some chromosomes (e.g. 19 and 22) are gene rich while others (e.g. 4 and 18) are gene poor.

Gene density
• The number of genes in the human genome has been the subject of much speculation; while the small mitochondrial genome is known to have precisely 37 genes, the number in the nuclear genome remains unknown.
• Theoretical calculations based on the mutational load that a genome can tolerate and on observed average mutation rates of human genes (~10⁻⁵ per gene per generation) suggest an upper limit of about 100,000.
• A variety of different approaches have been used to obtain more precise estimates of the total gene number.

Human gene organization

Average sizes of exons and introns in human genes
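The rough figures quoted in the slides (about 3000 genes per chromosome, and about 300 rRNA repeats in five clusters of about 60) can be checked with simple arithmetic. The sketch below is only an illustration: it assumes 24 distinct nuclear chromosomes (22 autosomes plus X and Y), which the slides do not state explicitly, and the function and variable names are hypothetical. The gene totals are the slide estimates, not independent data.

```python
# Back-of-the-envelope check of the estimates quoted in the slides.
# Assumption (not stated in the slides): 24 distinct nuclear chromosomes,
# i.e. 22 autosomes plus X and Y.
NUCLEAR_CHROMOSOMES = 24
MITOCHONDRIAL_GENES = 37  # from the slides: all but 37 genes are nuclear


def genes_per_chromosome(total_genes, chromosomes=NUCLEAR_CHROMOSOMES):
    """Average number of nuclear genes per chromosome for a total gene estimate."""
    nuclear_genes = total_genes - MITOCHONDRIAL_GENES
    return nuclear_genes / chromosomes


# Lower and upper bounds of the 70 000-80 000 estimate quoted in the slides.
low = genes_per_chromosome(70_000)
high = genes_per_chromosome(80_000)
print(f"{low:.0f} to {high:.0f} genes per chromosome")

# rRNA transcription units: five clusters of about 60 tandem repeats each.
rrna_repeats = 5 * 60
print(f"about {rrna_repeats} rRNA repeats")
```

Both bounds fall close to the slide's figure of about 3000 genes per chromosome. This is an average only; as the slides note, gene density varies considerably between chromosomes and between chromosomal regions.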
