University of Windsor Scholarship at UWindsor
Electronic Theses and Dissertations Theses, Dissertations, and Major Papers
2006
Characterization of cathepsin L genes and their cDNAs in the brine shrimp, Artemia franciscana.
Jian Ping Cao University of Windsor
Recommended Citation Cao, Jian Ping, "Characterization of cathepsin L genes and their cDNAs in the brine shrimp, Artemia franciscana." (2006). Electronic Theses and Dissertations. 1398. https://scholar.uwindsor.ca/etd/1398
This online database contains the full-text of PhD dissertations and Masters’ theses of University of Windsor students from 1954 forward. These documents are made available for personal study and research purposes only, in accordance with the Canadian Copyright Act and the Creative Commons license CC BY-NC-ND (Attribution, Non-Commercial, No Derivative Works). Under this license, works must always be attributed to the copyright holder (original author), cannot be used for any commercial purposes, and may not be altered. Any other use would require the permission of the copyright holder. Students may inquire about withdrawing their dissertation and/or thesis from this database. For additional inquiries, please contact the repository administrator via email ([email protected]) or by telephone at 519-253-3000 ext. 3208.
CHARACTERIZATION OF CATHEPSIN L GENES AND THEIR cDNAs IN THE BRINE SHRIMP, ARTEMIA FRANCISCANA
By:
JianPing Cao
A Thesis
Submitted to the Faculty of Graduate Studies and Research
through the Department of Biological Sciences
in Partial Fulfillment of the Requirements for
the Degree of Master of Science at the
University of Windsor
Windsor, Ontario, Canada
2006
Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.
Library and Archives Canada
Published Heritage Branch
395 Wellington Street, Ottawa ON K1A 0N4, Canada
ISBN: 978-0-494-17101-1
NOTICE: The author has granted a non-exclusive license allowing Library and Archives Canada to reproduce, publish, archive, preserve, conserve, communicate to the public by telecommunication or on the Internet, loan, distribute and sell theses worldwide, for commercial or non-commercial purposes, in microform, paper, electronic and/or any other formats.
The author retains copyright ownership and moral rights in this thesis. Neither the thesis nor substantial extracts from it may be printed or otherwise reproduced without the author's permission.
In compliance with the Canadian Privacy Act some supporting forms may have been removed from this thesis.
While these forms may be included in the document page count, their removal does not represent any loss of content from the thesis.
Cao JianPing
Copyright© 2006
All rights reserved
ABSTRACT
Embryos of the brine shrimp, Artemia franciscana, contain a novel cysteine
protease composed of a cathepsin L-like catalytic subunit of 28.5 kDa and a cell
adhesion protein of the FAS-1 family of 31.5 kDa. The cathepsin L-like subunit is
encoded by a cDNA derived from the CL-1 gene in A. franciscana, which has been
shown to be intron-less. The CL-1 gene was detected in genomic DNA prepared from
nuclei of Artemia, and a second CL gene (CL-2) was obtained from a genomic library
in EMBL3. Screening of Artemia adult and embryo cDNA libraries yielded sequences
homologous to the CL-2 gene, and confirmed that the CL-2 gene is also intron-less.
Artemia adult CL-2 cDNA was found to contain two distinct open reading frames
encoding the pro-peptide and mature protease. Several potential transcription factor
binding sites were identified, indicating the possibility of a functional promoter in the
5’ upstream sequence of the CL-2 gene.
Dedicated to Dr. A. H. Warner
ACKNOWLEDGEMENT
I wish to thank my supervisor Dr. A. H. Warner very much for providing me the opportunity to work in his laboratory as a Master’s student. His advice and training during this period are very much appreciated.
I would also like to thank my committee members, Dr. J. Hudson from the Department of Biological Sciences and Dr. B. Mutus from the Department of Chemistry and Biochemistry, for taking the time to review my thesis.
Many thanks are also due to the graduate students and the technical support staff of this department for their help.
TABLE OF CONTENTS
ABSTRACT
DEDICATION
ACKNOWLEDGEMENT
LIST OF FIGURES
INTRODUCTION
1. Artemia
2. Cysteine protease
3. Papain family
4. Cathepsin L
5. Pro-region of cathepsin L
6. Intracellular protease targeting
6.1 Mannose-6-phosphate dependent process
6.2 Mannose-6-phosphate independent process
7. Human cathepsin L gene
8. Cathepsin L genes of the shrimp Penaeus vannamei
9. Cathepsin L-like cysteine protease genes of Leishmania
10. Cathepsin L gene of Fasciola gigantica
11. Artemia franciscana cathepsin L
12. Cysteine protease inhibitors
13. Objectives
MATERIALS
METHODS
1. Isolation of cDNA clones coding for Artemia franciscana embryo cathepsin L
2. Construction of [32P]-labeled CL cDNA probe
3. Purification of PCR products
4. Cloning of PCR products
4.1 DNA ligation reaction
4.2 Transformation of competent Escherichia coli cells
5. Characterization of Artemia cathepsin L-1 gene isolated from genomic DNA
5.1 Isolation of Artemia nuclei
5.2 Isolation of Artemia franciscana genomic DNA
6. Screening of recombinant DNA clones for Artemia franciscana cathepsin L nucleotide sequences
7. Isolation of plasmid DNA from CL positive Artemia franciscana clones
8. Screening of an Artemia franciscana genomic DNA library in EMBL3
9. Isolation of lambda EMBL3 phage containing putative cathepsin L genomic DNA
10. Restriction analysis and Southern blotting of putative CL clones derived from the genomic DNA EMBL3 library
11. Polymerase chain reaction using phage DNA clones as substrate
12. Amplification of Artemia cathepsin L genes
13. DNA sequencing of Artemia cathepsin L clones
14. PCR analysis of Artemia franciscana adult cDNA library for the presence of cathepsin L sequence
15. PCR analysis of Artemia embryonic cDNA library for additional CL cDNAs
16. Analysis of Artemia franciscana genomic DNA in phage EMBL3 for additional CL genes
17. Attempt to identify the putative promoter sequence of Artemia franciscana CL genes
RESULTS
1. Isolation of cDNA clone coding for Artemia embryo cathepsin L
2. Isolation and analysis of cathepsin L-like clones from an Artemia franciscana genomic DNA library
3. Identification of DNA sequence in putative CL genomic clones from Artemia franciscana
4. PCR analysis of the Artemia franciscana phage DNA library
5. Attempt to amplify Artemia franciscana cathepsin L-1 and L-2 genes from different preparations of genomic DNA
6. Isolation of a cathepsin L cDNA from an Artemia adult cDNA library
7. Isolation of a cathepsin L cDNA representing the CL-2 gene from the Artemia embryo cDNA library
8. Identity of 5’ upstream sequences of Artemia franciscana CL genes
DISCUSSION
Appendix 1: Primers used in PCR experiments
Appendix 2: Primers designed on Artemia embryo CL-1 cDNA sequence
Appendix 3: Primers designed on Artemia genomic clone 9C sequence
REFERENCES
VITA AUCTORIS
LIST OF FIGURES
Figure 1. Analysis of a cDNA in vector pCR2.1 coding for Artemia franciscana embryo cathepsin L
Figure 2. EcoR I digestion of Artemia franciscana genomic DNA clones in λ EMBL3
Figure 3. PCR products from the use of putative CL phage DNA clones as substrate
Figure 4. Alignment of genomic clone A1 with Artemia embryo cathepsin L cDNA
Figure 5. Alignment of genomic clone A1 with Artemia cathepsin L genomic clone 9C (CL-2 gene)
Figure 6. Comparison of gene structure of Artemia cathepsin L cDNA and genomic clone 9C
Figure 7. PCR analysis of total DNA from Artemia franciscana EMBL3 genomic DNA library
Figure 8. EcoR I restriction endonuclease digestion of PCR generated fragments from EMBL3 DNA cloned into pCR2.1
Figure 9. Alignment of PCR derived clone 818 with Artemia cathepsin L genomic clone 9C (CL-2 gene)
Figure 10. Sequence alignment of PCR derived clone 818 from EMBL3 genomic DNA library with Artemia embryo cathepsin L cDNA
Figure 11. PCR products from Artemia genomic DNA prepared from various sources
Figure 12. EcoR I digestion of PCR product shown in Fig 11 after cloning in pCR2.1
Figure 13. Sequence alignment of DNA genomic clone 4271 with Artemia embryo cathepsin L cDNA
Figure 14. Comparison of PCR products obtained from an Artemia adult cDNA library and Artemia clone 9C representing the CL-2 gene
Figure 15. Comparison of DNA sequences derived from an Artemia franciscana adult cDNA library and Artemia CL-2 gene (clone 9C) by PCR
Figure 16. Open reading frames in Artemia adult cathepsin L cDNA sequence
Figure 17. PCR products derived from Artemia embryonic cDNA library
Figure 18. Comparison of DNA sequences derived from an Artemia franciscana embryo cDNA library and Artemia CL-2 gene (clone 9C) by PCR
Figure 19. Analysis of open reading frame in Artemia embryo cathepsin L-2 cDNA sequence
Figure 20. PCR products obtained using degenerate primers and Artemia genomic clone 9C as substrate
Figure 21. EcoR I digestion of PCR product using OPC-4 and CLR10 as primers shown in Fig 20 after cloning in pCR2.1
Figure 22. Comparison of DNA sequences of Artemia genomic clone 9C and its 5’ upstream with Artemia franciscana adult CL-2 cDNA and embryo CL cDNA
Figure 23. Analysis of 5’ upstream sequence of Artemia genomic clone 9C representing CL-2 gene
Figure 24. Putative transcription binding sites in 5’ upstream of Artemia genomic clone 9C
Figure 25. Comparison of the deduced partial amino acid sequence of Artemia CL-2 with Artemia CL-1 cDNA
Figure 26. Comparison of the deduced amino acid sequence of Artemia CL-1 and CL-2 cDNA with cathepsin L sequences from other organisms
Introduction
1. Artemia
Artemia franciscana is an anostracan crustacean, most commonly
known as a brine shrimp. The entire body of Artemia is covered with a thin, flexible
exoskeleton of chitin to which muscles are attached internally. Artemia is found on six
continents and consists of bisexual and parthenogenetic strains. In North America,
only bisexual species are found, whereas in Europe, Asia and Africa, both populations
occur (Criel and MacRae, 2002). Females of most Artemia strains reproduce either
ovoviviparously or oviparously, releasing either nauplius larvae or encysted embryos,
respectively (Jackson and Clegg, 1996; Liang and MacRae, 1999). They live in saline
environments and are able to tolerate large changes in salinity, ionic composition,
temperature and oxygen tension. Under ideal environmental conditions Artemia tends
to reproduce ovoviviparously, whereas under adverse environmental situations it
reproduces oviparously (Criel and MacRae, 2002).
In advance of over-wintering conditions, Artemia adult females secrete a
chitinous material which forms a hard shell cyst around the fertilized egg in the ovisac.
Encysted embryos enter a state known as diapause during which their metabolic
activity becomes arrested and the embryo undergoes dehydration (Clegg and Conte,
1980). When environmental conditions are adequate, the cysts rehydrate and resume
metabolic activity and their developmental program.
Among all macromolecules in Artemia embryos, proteolytic enzymes play
important roles in development. They are key enzymes in hatching, yolk platelet
degradation, protein synthesis control, and in proenzyme activation reactions (Criel
and MacRae, 2002). In general, proteolytic enzymes are divided into five groups based
on the reactive nucleophile in the catalytic site of the enzyme: the cysteine, serine,
aspartic and metallo-proteases, and an unclassified group of peptidases. In embryos of
Artemia franciscana a cysteine protease is the dominant proteolytic enzyme.
2. Cysteine protease
Cysteine proteases can be isolated from prokaryotic and eukaryotic organisms.
They are part of a family of hydrolytic enzymes that require an SH group in their active
site. Highly conserved Cys and His residues in the active site form a
thiolate-imidazolium ion pair that mediates catalysis (Storer and Menard, 1994). The
structure is stabilized by a highly conserved Asn, while a highly conserved Gln forms
the oxyanion hole, a crucial element in forming an electrophilic center to stabilize the
tetrahedral intermediate during hydrolysis (Sajid and McKerrow, 2001).
The cysteine proteases have been evolving for at least three million years
(Barrett and Rawlings, 2001). During this period, an ancestral peptidase may have
diversified first into a family with detectable similarity in amino acid sequences, and
then into a cluster of families that differ in many ways. Despite the differences in
amino acid sequences, the families in a cluster are ultimately related according to the
conservation of their protein fold. This group of related families is called a “clan”.
Conventionally, proteases are assigned to clans and families depending on a
number of characteristics including sequence similarity, possession of inserted peptide
loops and biochemical specificity to small peptide substrates. More recent
classification relies on sequence homology directly spanning the catalytic cysteine and
histidine residues (Sajid and McKerrow, 2001).
Clan CA is the largest clan of cysteine proteases, with about half of the total
families. It contains the families of papain (C1), calpain (C2), streptopain (C10) and
the ubiquitin-specific peptidases (C12, C19). Among them the papain family is the
largest family of cysteine proteases identified (Barrett and Rawlings, 2001).
3. Papain family
Cysteine peptidases of clan CA family Cl (papain family) can be found in the
animal and plant kingdoms as well as in some viruses and prokaryotes (Bernd 2003).
They exist predominantly within the endosomal and lysosomal compartment of cells.
They include cathepsins B, L, H, S, K, F, V, X, W, O and C. Among the cathepsins
there are two subfamilies: the cathepsin B-like subfamily and the cathepsin L-like subfamily.
The main differences between the two subfamilies are the insertions in cathepsin B,
including the “occluding loop”, and the much shorter proregion in cathepsin B
compared to other cysteine proteases (Carmona et al. 1996; Musil et al. 1991).
These enzymes share the general architecture of three catalytic residues, Cys25,
His159 and Asn175. The ionized state of the nucleophilic cysteine residue in the
active site is independent of substrate binding, making these and other cysteine
proteases a priori active (Polgar and Halasz, 1982). Except for cathepsin C, all
cysteine cathepsins are monomers consisting of two domains, R (right) and L (left),
which are arranged in a V-shaped configuration as shown below. At the top of the cleft,
the active site amino acids cysteine and histidine are positioned. The left domain (towards
the N-terminus) is composed largely of an α-helix, while a long helix runs through the
middle of the molecule. The right domain (towards the C-terminus) supports the
catalytic histidine and contains a β-barrel structure (Barrett and Rawlings, 2001).
Structure of mammalian cathepsin L. A, catalytic residue Cys25; B, catalytic residue His159; C, β-barrel structure in right domain; and D, α-helix structure in left domain (taken from Turk and Guncar, 2003).
Most of the papain family of enzymes are relatively small proteins with Mr values
in the range 20,000-35,000 (Brocklehurst et al. 1987; Polgar, 1989; Rawlings and
Barrett 1994; Berti and Storer, 1995), while cathepsin C is an oligomeric enzyme with
an Mr value of approximately 200,000 (Metrione et al. 1970; Dolenc et al. 1995). Cathepsin
C is a dipeptidyl aminopeptidase, not an endopeptidase like other cathepsins (Kirschke
et al. 1995). Cathepsins B, H and L are ubiquitous in lysosomes of mammals, whereas
cathepsin S has a more restricted localization (Barrett and Kirschke, 1981; Kirschke
and Wiederanders, 1994).
Most lysosomal cysteine proteases are synthesized as 30-50 kDa
prepro-enzymes, processed in the ER then directed into lysosomes where they serve
their function in protein hydrolysis. After removal of the signal peptide, the molecular
mass of these enzymes remains within the range of 20-35 kDa. The processing of
prepro-enzyme into active enzyme includes two steps, the removal of the
prepro-region of the enzyme, and one or more limited proteolytic cleavages within the
polypeptide backbone as well as at the N- and C-termini, respectively (Mach et al.
1994; Menard et al. 1998).
The functions of papain-like cysteine proteases are different in various
organisms. Plant papain proteases are mainly used to mobilize storage proteins in seeds.
Protein bodies in seeds contain both storage proteins and protease precursors. The
latter become activated after germination and begin degradation of the stored proteins
(Schlereth et al. 2001). Most parasitic papain-like cysteine proteases act extracellularly,
and they are important in the life cycle of parasites for invading tissues and cells,
gaining nutrients, hatching and even evading the host immune system (Sajid and
McKerrow, 2002). Primitive organisms dependent on phagocytosis use papain-like
proteases to digest phagocytosed proteins. The enzymes of these organisms are already
packed into lysosomes or acidified lysosome-like structures (Volkel et al. 1996;
Krasko et al. 1997; Gotthardt et al. 2002). Papain-like proteases in mammals are
primarily lysosomal enzymes. Only cathepsin W seems to be retained in the
endoplasmic reticulum (ER) (Wex et al. 2001). Cathepsins B, C, H, L, O are found in
the lysosomes of nearly all tissues and cells, thus probably fulfilling housekeeping
functions. Other cysteine proteases show a different distribution within tissues
suggesting specific functions not found in all mammals (Wiederanders, 2003).
There are several ways to regulate the activity of papain-like cysteine proteases.
The most important are pH and endogenous inhibitors. At low pH (2.8 - 3.8) mature
cysteine proteases, mainly cathepsins B, S and L, are denatured irreversibly (Turk et al.
1995). Endogenous protein inhibitors include the cystatins (stefins, cystatins,
kininogens) (Turk et al. 1997), thyropins (thyroglobulin type-1 domain inhibitors)
(Lenarčič et al. 1998), and the general protease inhibitor α-2-macroglobulin (Mason et
al. 1989).
Among members of the papain family, cathepsins B, H, L and S have been
studied extensively because they have been implicated in a variety of physiological
processes such as proenzyme activation (Eeckhout and Vales, 1997; Shinagawa et al.
1990; Kobayashi et al. 1991), enzyme inactivation (Bond and Barrett, 1980), antigen
presentation (Takahashi et al. 1989; Roche and Cresswell, 1991; Michalek et al. 1992),
hormone maturation (Uchiyami et al. 1989), tissue remodelling and bone matrix
resorption (Delaisse et al. 1980; Maciewicz et al. 1990; Guinec et al. 1993; Blondeau
et al. 1993).
4. Cathepsin L
Cathepsin L (CL) is a lysosomal cysteine protease in mammals. It is the most
active of all cysteine proteases in degrading protein substrates in vitro, such as
azocasein, collagen, or elastin (Barrett and Kirschke, 1981; Maciewicz et al. 1987;
Mason et al. 1989). CL was first purified from rat liver lysosomes by Bohley and
colleagues (1971) and was subsequently designated cathepsin L, with “L” denoting
lysosomes.
Some cathepsin L-like cysteine proteases have been detected in non-lysosomal
regions of eukaryotic cells and embryos. Cathepsin L-like cysteine proteases occur in
the cytoplasm of unfertilized eggs and around yolk granules in amphibian and fish
embryos (Miyata et al. 1995, 1998; Kestemont, 1999). A cytoplasmic cathepsin L-like
protease is required for gastrulation in Xenopus (Miyata and Kubo, 1997). CL is also
found in embryos of the flesh fly, Sarcophaga peregrina, where it is involved in differentiation of
(cultured) imaginal disks (Homma and Natori, 1996), and in Artemia franciscana
embryos (Warner et al. 1995; Warner and Matheson, 1998). A cathepsin L isoform
devoid of a signal peptide has been detected in the nucleus of murine NIH3T3 cells
(Goulet et al. 2004).
As a member of the papain family, cathepsin L has been shown to play a role in a
variety of intracellular and extracellular processes including antigen presentation
mentioned above (Villadangos et al., 1999), prohormone activation (Marx, 1987),
sperm maturation (Erickson-Lawrence et al., 1991), bone resorption (Delaisse and
Vaes, 1992), and extra-cellular matrix (ECM) remodeling (Yamada et al., 2000).
Over-expression of cathepsin L has been reported to be involved in several
diseases. Many human tumors and cancers of kidney, testicles, lung, colon, breast,
adrenal gland and ovary have been found to express very high levels of cathepsin L
(Chauhan et al., 1991). Cathepsin L is able to hydrolyze components of the ECM such
as collagen IV, fibronectin, and laminin, suggesting that it can degrade the ECM, thus
enabling tumor cell proliferation and invasion into surrounding tissues, as well as into
the vasculature. After entering blood vessels malignant cells can metastasize by
extravasation into other tissues (Koblinski et al. 2000; Szpaderska and Frankfater,
2000; Sever et al. 2002). Over-expression of cathepsin L has also been implicated in a
variety of inflammatory diseases, such as inflammatory myopathies, rheumatoid
arthritis, and periodontitis (Berdowska, 2004).
5. Pro-region of cathepsin L
The cysteine proteases known as cathepsin L are synthesized as prepro-enzymes
that require processing (cleavage of the N-terminal fragment) to become active
catalytically at pH 3.0-6.5. The pro-region contains two conserved motifs. The first is a
GNFD motif (Gly-X1-Asn-X1-Phe-X1-Asp), and a similar motif is also found in
the pro-region of the cathepsin B group (Ishidoh et al. 1987a; Vernet et al. 1995). Some
researchers have proposed that alteration of the charge state in the GNFD motif could
trigger the processing of the proenzyme, and the GNFD motif may participate in the
pH regulation of the processing (Vernet et al. 1995). The second conserved motif is the
ERFNIN motif (Glu-X3-Arg-X3-Phe-X2-Asn-X1-Tyr-X2-Asp), which distinguishes
cathepsins L and B (Karrer et al. 1993). The ERFNIN motif is required to maintain the
three-dimensional structure of the pro-region of CL (Coulombe et al. 1996; Cygler and
Mort, 1997).
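The motif spacings quoted above can be written as simple patterns for scanning a pro-region sequence. The following is a minimal sketch only: it takes the spacings exactly as given in the text (Gly-X1-Asn-X1-Phe-X1-Asp and Glu-X3-Arg-X3-Phe-X2-Asn-X1-Tyr-X2-Asp, where Xn is read as n residues of any type), the function name `find_motifs` is hypothetical, and the test sequence is a toy string constructed for illustration, not a real cathepsin L pro-region.

```python
import re

# Motif spacings as written in the text; "." matches any single residue.
# GNFD:   Gly-X1-Asn-X1-Phe-X1-Asp
# ERFNIN: Glu-X3-Arg-X3-Phe-X2-Asn-X1-Tyr-X2-Asp
GNFD_PATTERN = re.compile(r"G.N.F.D")
ERFNIN_PATTERN = re.compile(r"E.{3}R.{3}F.{2}N.Y.{2}D")


def find_motifs(pro_region: str) -> dict:
    """Return the 0-based start positions of each motif in a one-letter sequence."""
    seq = pro_region.upper()
    return {
        "GNFD": [m.start() for m in GNFD_PATTERN.finditer(seq)],
        "ERFNIN": [m.start() for m in ERFNIN_PATTERN.finditer(seq)],
    }


if __name__ == "__main__":
    # Hypothetical toy sequence containing one copy of each motif
    # (assumption: invented for illustration, not taken from any organism).
    toy = "MKA" + "EAAARAAAFAANAYAAD" + "LLK" + "GANAFAD" + "KS"
    print(find_motifs(toy))
```

A pro-region that matches the second pattern but lacks the cathepsin B-type features would, on the reasoning in the text, point towards the cathepsin L-like subfamily; the sketch only locates candidate motifs and makes no such call itself.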
The pro-region of cathepsin L can inhibit CL activity by covering the active site
cleft in a non-productive orientation (Carmona et al. 1996; Volkel et al. 1996).
Inhibition by the pro-region displays pH-dependency very similar to that required for
processing of the pro-cathepsin L, and the N-terminus of the pro-region is more
important for inhibition than the C-terminus (Carmona et al. 1996).
6. Intracellular protease targeting
6.1 Mannose-6-phosphate dependent process
The mannose-6-phosphate marker for delivery of enzymes to the lysosomes or
acidified vesicles can be found in the pro-peptide domain as well as in the catalytic
domain of the mature protease. The mannose-6-phosphate marker is generated
by transfer of phospho-GlcNAc (N-acetylglucosamine phosphate) to mannose
residues of cathepsin L by UDP-GlcNAc:lysosomal enzyme
N-acetylglucosamine-1-phosphotransferase. Removal of the terminal GlcNAc yields
the mannose-6-phosphate group, which binds cathepsin L to the
mannose-6-phosphate receptors for transport to the lysosomes (Wiederanders, 2003).
6.2 Mannose-6-phosphate independent process
A mannose-6-phosphate independent process for trafficking of lysosomal
proteins has also been suggested. A conserved 9 amino acid long peptide motif was
identified in the alternative trafficking process (Huete-Perez et al. 1999). It is located
close to the N-terminus of the pro-peptides. A receptor recognizing the motif has not
been identified so far, although a 43 kDa integral lysosomal membrane protein has been
described that binds mouse procathepsin L in a pH-dependent manner (McIntyre and
Erickson, 1993).
7. Human cathepsin L gene
Human cathepsin L is involved in many diseases. High levels of expression of the
cathepsin L gene can be found in various types of tumors and cancers (Izabela 2004).
The human cathepsin L gene is located on human chromosome 9q21-22 (Chauhan et al.
1993), and it encodes five mRNA species, namely hCATL A, AI, AII, AIII and hCATL
B, all differing in their 5’ untranslated regions (5’-UTRs). Of these, hCATL A, AI, AII
and AIII are produced by alternative splicing of the same primary transcript (Arora
and Chauhan, 2002). The hCATL AI, AII and AIII variants lack 27, 90 and 145 bases,
respectively, from the 3’ end of exon-1 (Rescheleit et al. 1996), while the hCATL B
variant includes sequence (182 bases) from the 3’ end of intron-1 of the primary
transcript (Chauhan et al. 1993). Interestingly, the hCATL AIII variant is the most
efficiently translated isoform. The predominance of this splice variant
(hCATL AIII) in malignant cells suggests that it plays a key role in the over-expression
of human cathepsin L in cancer (Arora and Chauhan, 2002).
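The relationship between the hCATL A-type variants can be illustrated by trimming a stand-in exon 1 from its 3’ end. This is a rough sketch under stated assumptions: hCATL A is treated as carrying the full-length exon 1 (implied by, but not stated in, the text), the function name `exon1_variants` is hypothetical, the placeholder sequence is invented, and hCATL B (which adds 182 bases of intron 1) is deliberately left out.

```python
# Bases missing from the 3' end of exon 1 in each variant, as quoted in the text.
# Assumption: hCATL A retains the full-length exon 1 (0 bases removed).
BASES_REMOVED = {"hCATL A": 0, "hCATL AI": 27, "hCATL AII": 90, "hCATL AIII": 145}


def exon1_variants(exon1: str) -> dict:
    """Model each variant's exon 1 by truncating the sequence at its 3' end."""
    variants = {}
    for name, removed in BASES_REMOVED.items():
        if removed > len(exon1):
            raise ValueError(
                f"exon 1 ({len(exon1)} bases) is shorter than the {removed} bases removed in {name}"
            )
        # Slice by explicit length so that removing 0 bases keeps the whole exon.
        variants[name] = exon1[: len(exon1) - removed]
    return variants


if __name__ == "__main__":
    # Hypothetical 200-base placeholder; the real exon 1 sequence is not given here.
    placeholder_exon1 = "ACGT" * 50
    for name, seq in exon1_variants(placeholder_exon1).items():
        print(name, len(seq))
```

The only point of the sketch is that the four variants share a common 5’ portion of exon 1 and differ in how much of its 3’ end is spliced out, so hCATL AIII carries the shortest 5’-UTR of the four.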
A 5’-flanking region of the human cathepsin L gene containing 3263 bp upstream
of the translation start site was identified previously (Jean et al. 2002). The promoter of
the human cathepsin L gene is TATA-less, and there are about forty-three potential
transcription factor binding sites in the 5’ upstream area referred to above (Seth et al.
2003). One specific region of 50 bp located 60 bp from the putative transcription
initiation site was found to have transcription factor binding sites required for
cathepsin L promoter activity. It contains a CCAAT motif and two GC boxes. The
CCAAT motif could bind transcription factor NF-Y, also named CBF or CP1. The two
GC boxes could bind transcription factors Sp1 (a general activator of transcription) and
Sp3, an activator or a repressor of Sp1-mediated activation.
8. Cathepsin L genes of the shrimp Penaeus vannamei
The cathepsin L gene structure of P. vannamei was identified by Le Boulay et al.
(1998). This gene expresses a cathepsin L-like enzyme in the hepatopancreas. It is
encoded by six exons. The six exons and their intervening introns span 1792 bp of
genomic DNA. Exon 1 encodes 14 amino acids of the ER signal sequence, and the first
12 amino acid residues of the pro-peptide. The remainder of the pro-peptide is encoded
by exon 2 and 90 bp of exon 3. The last 90 bp of exon 3 and all of exons 4-6 code for
the mature region of the protease. Sequence polymorphism was also found in the last
intron of the gene, giving rise to three variants of the enzyme. The gene structure of P.
vannamei cathepsin L is homologous to that of rat cathepsin L (63%), but it contains
fewer introns. When compared with rat cathepsin L, three of the conserved intron
positions were identified in the P. vannamei cathepsin L gene (Le Boulay et al. 1998).
In contrast, little or no similarity could be found between the Drosophila or
Plasmodium cathepsin L-like genes and that of the rat.
9. Cathepsin L-like cysteine protease genes of Leishmania
The Leishmania donovani complex causes a variety of diseases such as visceral
leishmaniasis, which is a serious health problem in tropical and subtropical countries
(Badaro et al. 1986; Evan et al. 1986; Tselentis et al. 1994). Characterization of
The cathepsin L-like cysteine proteases of the Leishmania donovani complex were first
characterized by Mundodi et al. (2001). In Leishmania chagasi the CL gene cluster has
five copies of tandemly arranged CL genes. The sequences coding for the pre-pro and
mature regions of the protease are conserved in all five genes of the cluster except
for the last gene (Ldccys1E). The Ldccys1E gene is identical to the first gene, Ldccys1A,
except for a deletion of 39 bases coding for 13 amino acids in the mature region,
including one of the active site histidine residues, and a truncated carboxyl terminal
extension. The CL gene organization of Leishmania chagasi is similar to that of
Leishmania donovani. L. donovani possesses six CL genes, but two are identical
(Ldccys1F and Ldccys1E) except that Ldccys1E lacks 39 bases in the mature region as
described above. The Ldccys1A and Ldccys1F proteases show cysteine protease
activities in gelatin gels, while Ldccys1E is inactive in gelatin gels.
10. Cathepsin L gene of Fasciola gigantica
The liver flukes Fasciola gigantica and Fasciola hepatica are causative agents
of fascioliasis in humans, which has been considered an increasingly important chronic
disease since 1980 (Chen and Mott, 1990; Esteban et al. 1998; Mas-Coma et al. 1999).
Fasciola cathepsin L is involved in crucial biological functions such as host protein
degradation, tissue penetration and immune evasion. For these reasons, the cathepsin
L-like cysteine proteases of liver flukes have been considered potential targets as
immunodiagnostic antigens for fascioliasis (Yamasaki et al. 1989; Fagbemi and
Guobadia, 1995; O’Neill et al. 1999) or as vaccine candidates (Wijffels et al. 1994a;
Dalton et al. 1996b).
The structure of the Fasciola gigantica cathepsin L gene was characterized by
Yamasaki et al. (2002). The gene consists of four exons and three introns spanning
approximately 2.0 kb in the genome. Exon 1 encodes 15 amino acid residues of the
pro-region and the first 21 residues of the mature enzyme. Of the three introns, two are
in the same position as in the mammalian cathepsin L gene. In the promoter region two
TATA boxes are located upstream of the transcription initiation site. The sequence of
the 5’-upstream region of the cathepsin L gene transcript is transcribed by cis-splicing.
The ERFNIN motif was also found in F. gigantica cathepsin L, and the processing of
procathepsin L is consistent with that found for mammalian cathepsin L.
11. Artemia franciscana cathepsin L
The major protease in Artemia embryos was identified twenty years ago as a
cysteine protease based on its inhibition by cysteine protease inhibitors (Warner and
Shridhar 1985). Since then, research using Artemia embryos and larvae has shown
that the cysteine protease activity is found mostly in non-lysosomal structures (Warner
and Shridhar 1985; Lu and Warner 1991; Warner et al. 1995). The non-lysosomal
Artemia cysteine protease has been implicated in yolk utilization and remodeling of the
extracellular matrix in early development, and in regulation of larval molting (Warner
et al. 1995; Warner and Matheson 1998).
Further study also confirmed that the Artemia cysteine protease belongs to the
cathepsin L group (Butler et al. 2001). This conclusion was based on assays using
substrates specific for cathepsin B (N-α-Cbz-Arg-Arg-4-methoxy-β-naphthylamide),
cathepsin H (L-leucine-β-naphthylamide), and cathepsin L
(N-Cbz-Phe-Arg-4-methoxy-β-naphthylamide). Artemia embryos contain at least
seven isoforms of cysteine proteases with pI values ranging from 4.6 to 6.2 (Butler et
al. 2001). Using HPLC and isoelectric focusing, the Artemia embryo cysteine protease
was found to be a unique heterodimeric protease of about 60 kDa, composed of
two tightly associated polypeptides, a catalytic subunit of 28.5 kDa and a non-catalytic
subunit of 31.5 kDa (Butler et al. 2001; Warner et al. 2004).
A cDNA encoding the 28.5 kDa catalytic subunit was isolated and the nucleotide
sequence was shown to have high homology with other cathepsin L-like cysteine
proteases (Butler et al. 2001). The proenzyme has 338 amino acids, while the mature
protein contains 217 amino acids. The first 20 amino acids of the proenzyme form an
endoplasmic reticulum secretory signal (Watson 1984; Nielsen et al. 1997; Butler et al.
2001).
Previous work in our lab demonstrated that the Artemia cathepsin L gene coding
for the embryo CL is intron-less in Artemia franciscana (Matt Shaw, unpublished). The
polymerase chain reaction was performed to amplify the cathepsin L gene in Artemia
franciscana using genomic DNA as template, and the results indicated a sequence
identical to that found for cDNA. More recently, the cathepsin L gene in Artemia
parthenogenetica, a parthenogenetic relative of Artemia franciscana, was identified
and found to contain one intron (Shamoon, unpublished). The DNA sequence of
Artemia parthenogenetica shares 98% identity with the cDNA of Artemia franciscana,
except for the fact that the CL gene in Artemia parthenogenetica contains an intron of
1085 bp in the prepro-coding region. Since it is believed that Artemia
parthenogenetica evolved more recently (5-6 mya) than Artemia franciscana, these
observations tend to support the “intron-late” theory of evolution.
The illustration on page 13 indicates the intron-exon structure of the cathepsin L
gene in various organisms.
12. Cysteine protease inhibitors
An imbalance between endogenous proteases and protease inhibitors may lead to
pathologies such as rheumatoid arthritis, multiple sclerosis, neurological disorders,
osteoporosis and tumors (Berdowska and Siewinski, 2000). As well, many studies
have shown that endogenous protease inhibitors play an important role in development
(Thiery 1984; Montesano et al. 1990; Matrisian and Hogan 1990). These inhibitors act
through steric blockage of substrate access to the enzyme catalytic center (Rzychon
et al. 2004). The cysteine protease inhibitors include cystatin, thyropin, chagasin, as
well as the inhibitor of apoptosis protein (IAP) family and staphostatin (Rzychon et al.
2004).

Intron and Exon Structure in the Cathepsin L Gene in Various Organisms

Organism         Total bp (Exon bp)   Intron sizes (bp)                      Exon sizes (bp)
Homo             5117 (1605)          1186, 345, 100, 166, 665, 550, 500     280, 136, 123, 147, 225, 163, 118, 305
Mus              7329 (1262)          1069, 715, 105, 765, 213, 2100, 1110   57, 137, 123, 147, 225, 163, 118, 403
Penaeus          1792 (976)           141, 97, 101, 166, 311                 79, 153, 180, 192, 166, 206
Dros.            6836 (1418)          1309, 4049, 60                         129, 117, 215, 957
C. ele.          2596 (1011)          474, 540, 571                          258, 102, 447, 204
A. par. (CL-1)   ~2000 (1014)         1086                                   -
Leishmania**     1326 (1326)          none                                   1326
Metapenaeus      1094 (1094)          none                                   1094
A. fran. (CL-1)  1014 (1014)          none                                   1014

The organisms shown here are as follows (from top to bottom): Human, mouse,
Penaeus, Drosophila, C. elegans, A. parthenogenetica, Leishmania, Metapenaeus, and
A. franciscana. ** Present in five copies in tandem, each as a single exon, with slight
differences at the 3’-end. Gene shown here is LclA.

Cystatins inhibit the activity of the papain family of cysteine proteases found in
viruses, bacteria, plants and animals. On the basis of sequence homology, the cystatin
superfamily is divided into three groups: stefins (family I), cystatins (family II) and
kininogens (family III) (Otto and Schirmeister, 1997; Barrett et al. 1998; Grzonka et al.
2001). The cystatins bind to amino acids adjacent to the protease active site,
obstructing the access of substrate, but they do not interact directly with the enzyme
catalytic center (Bode and Huber, 2000). Thyropin is a protease inhibitor whose
structure contains an arrangement designated the thyroglobulin type-1 domain
(Lenarcic et al. 1999). It has a similar mechanism of inhibition, but it is more specific
than the relatively non-selective cystatins (Guncar et al. 1999; Bode and Huber 2000).
Chagasin is a Trypanosoma cruzi protein that was recently characterized as a
tight-binding inhibitor of papain-like cysteine proteases (Santos et al. 2005). It is
inhibitory towards papain-like proteinases of bacterial, protozoan and mammalian
origin (Monteiro et al. 2001; Rigden et al. 2002; Sandersen et al. 2003). Members of the
IAP (inhibitor of apoptosis protein) family could function through direct inhibition of
caspases (Rzychon et al. 2004). Staphostatin is a newly identified cysteine protease
inhibitor and it has high specificity towards staphopains, bacterial papain-like cysteine
proteases (Rice et al. 2001; Massimi et al. 2002; Rzychon et al. 2003). Recent studies
have also shown that some novel inhibitor proteins are homologous to the pro-region
of papain-like cysteine proteases, such as mouse cytotoxic T-lymphocyte antigen
(CTLA-2), which is homologous to the pro-region of mouse cathepsin L, and Bombyx
cysteine protease inhibitor (BCPI) in the silkmoth Bombyx mori (Yamamoto et al.
2002).
In dormant Artemia embryos two kinds of cysteine (thiol) protease inhibitors
have been identified, one dialyzable and the other non-dialyzable (Nagainis and
Warner 1979; Warner and Shridhar 1985). Nothing is known about the nature of the
dialyzable inhibitor, but the non-dialyzable inhibitors have been partially characterized
(Nagainis and Warner, 1979). Using gel filtration and HPLC, three thiol-protease
inhibitors were identified and named TPI-1, TPI-2, TPI-3 (Warner and
Sonnenfeld-Karcz, 1992). TPI-2 and TPI-3 were found to be homogenous by
electrophoresis and chromatography on a C-18 column, while TPI-1 appeared to be
heterogeneous, and composed of two components with molecular masses of 11.8 and
13.6 kDa. The Artemia embryo TPI proteins belong to the cystatin superfamily, and are
similar to the members of the type I cystatin family (stefins) (Warner and
Sonnenfeld-Karcz, 1992).
13. Objectives
The objectives of this thesis are as follows: 1) to elucidate further the structure of
the cathepsin L gene(s) in Artemia franciscana and determine whether more than one
functional cathepsin L gene exists; 2) to characterize the cathepsin L gene(s) expressed
in Artemia adult tissue, and determine whether they have properties similar to the
cathepsin L gene(s) expressed in the embryo of Artemia franciscana; and 3) to begin
characterization of the promoter region of the cathepsin L genes in Artemia to help
elucidate the requirements for transcription.
Materials and Methods
Materials:
The Artemia franciscana genomic DNA library and cDNA libraries were
prepared by L. Sastre (Madrid, Spain). The Artemia genomic DNA library is in
bacteriophage lambda EMBL3, and it was maintained in E. coli K802 cells. The
Artemia adult cDNA library is in bacteriophage lambda gt11 and it was maintained in
E. coli rY1090⁻ cells. The Artemia embryo cDNA library is in bacteriophage lambda
ZAPII, and it was maintained in E. coli K802 cells.
Artemia franciscana genomic DNA was isolated from Artemia cysts obtained
from the Great Salt Lake in Utah and purchased from the Sanders Brine Shrimp
Company (Ogden, UT).
The Artemia franciscana cDNA clone representing the DNA sequence coding
for Artemia embryo cathepsin L-1 was obtained previously in our lab (see NCBI
database, AF147207) (Butler et al. 2001). A genomic clone (9C) representing the
cathepsin L-2 gene was isolated from the Artemia franciscana EMBL3 library and
provided by Matt Shaw in our lab (see NCBI database, AY557372) (unpublished).
All PCR primers were synthesized at Sigma-Genosys (Oakville, ON). The TA
Cloning Kit was purchased from Invitrogen (Burlington, ON). 32P-dCTP for making the
CL probe was from PerkinElmer (Boston, MA). PCR reagents and the Wizard
Miniprep Kit were both from Promega (Madison, WI). PCR products were purified
using the Wizard DNA Clean-Up system (Promega) and the QIA Gel Extraction Kit
(Qiagen) (Mississauga, ON). Molecular grade chemicals were from Sigma Chem. Co.
(Mississauga, ON). Restriction enzymes used in this study were obtained from
Promega. The DNAzol Reagent was from GIBCO-BRL (Burlington, ON).
Nitrocellulose membranes were obtained from Pall Life Sciences (East Hills, NY).
X-ray film was obtained from Kodak (Rochester, NY). Sequencing was performed
using the Thermo Sequenase Cy5.5 Dye Terminator Cycle Sequencing Kit (Amersham
Biosciences, Baie d’Urfé, QC) and the departmental DNA sequencer (Visible Genetics)
or by the Robarts Institute (London, ON).
Methods:
1. Isolation of cDNA clones coding for Artemia franciscana
embryo cathepsin L
A cDNA clone in Bluescript coding for Artemia franciscana embryo cathepsin L
was obtained from our laboratory stock (Butler et al., 2001) and amplified in E. coli
strain JM109 cells as follows. The transformed JM109 cells were grown in LB broth
with ampicillin and 1.5 ml stocks were stored at -80°C (see Butler et al. 2001), while
similar size aliquots were used to isolate plasmid DNA containing the embryo
cathepsin L cDNA using the small-scale alkaline lysis method (Birnboim and Doly, 1979;
Ish-Horowicz and Burke, 1981) with modifications as described more recently
(Sambrook et al. 1989). Cultures (1.5 ml) were transferred to microcentrifuge tubes
and centrifuged at 13,000 rpm for 2 minutes. The supernatant was removed and the
sediment was resuspended in 200 µl ice-cold Solution I (50 mM glucose, 25 mM
Tris-Cl, pH 8.0, 10 mM EDTA, pH 8.0), followed by addition of 200 µl Solution II (0.2
N NaOH, 1% SDS) and inversion of the tube five times. Next, 200 µl Solution III (5 M
potassium acetate 60 ml, glacial acetic acid 11.5 ml, H2O 28.5 ml) was added and the
tube inverted several times before centrifugation at 13,000 rpm for 10 minutes. The
supernatant was transferred to a clean tube and an equal volume of phenol:chloroform
(1:1) was added to the tube and mixed by vortexing. After centrifugation at 13,000 rpm
for 2 minutes, the supernatant was transferred to a clean tube, and 2 volumes of 95%
ethanol were added to precipitate the DNA. After standing for 30 minutes at room
temperature, the plasmid DNA was collected by centrifugation (5 minutes, 13,000 rpm),
washed with 70% ethanol, then air-dried for 10 minutes. Finally, the DNA pellet was
resuspended in 50 µl TE (pH 8.0) containing RNase (20 µg/ml), and purified using the
Wizard DNA Clean-Up system for use as template in making a [32P]-labeled CL cDNA
probe.
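As a quick check on the recipe for Solution III given above, the final concentrations can be worked out from the listed volumes. The sketch below assumes that glacial acetic acid is about 17.4 M (a typical handbook value that is not stated in this thesis); the function and variable names are illustrative only.

```python
# Final molarities in Solution III, computed from the volumes listed in the text.
# Assumption: glacial acetic acid is ~17.4 M (typical value, not from the thesis).
GLACIAL_ACETIC_ACID_M = 17.4

def solution_iii_molarity(k_acetate_ml=60.0, k_acetate_m=5.0,
                          acetic_acid_ml=11.5, water_ml=28.5):
    """Return (potassium M, total acetate M) for the mixed solution."""
    total_ml = k_acetate_ml + acetic_acid_ml + water_ml
    potassium_mmol = k_acetate_m * k_acetate_ml           # also mmol acetate from the salt
    acid_mmol = GLACIAL_ACETIC_ACID_M * acetic_acid_ml    # mmol acetate from the acid
    potassium_m = potassium_mmol / total_ml
    acetate_m = (potassium_mmol + acid_mmol) / total_ml
    return potassium_m, acetate_m

potassium_m, acetate_m = solution_iii_molarity()
print(f"K+ = {potassium_m:.1f} M, acetate = {acetate_m:.1f} M")
```

Under these assumptions the solution is about 3 M with respect to potassium and about 5 M with respect to acetate.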
2. Construction of [32P]-labeled CL cDNA probe
A [32P]-labeled cathepsin L probe was constructed by the polymerase chain
reaction using a previously cloned Artemia franciscana embryo cathepsin L cDNA as
template. The primers used in the PCR were CLF and CLR (see Appendix 1 and 2 for
details of sequence) at a final concentration of 1 pmol/µl, designed from published
nucleotide sequence information for Artemia cathepsin L cDNA (Butler et al. 2001).
The reaction also contained 2 µl [α-32P]dCTP with 2 mM dGTP, dATP, dTTP and 0.4
mM dCTP. Other materials in the reaction mixture were Bluescript plasmid containing
the Artemia embryo cathepsin L cDNA as template (250 ng), 1x PCR reaction buffer
containing 2.5 mM MgCl2, and 5 units of Taq DNA polymerase in a final volume of 50
µl. The PCR was performed under the following conditions: 94°C for 5 minutes; 35
cycles of 94°C for 1 minute, 50°C for 1 minute, 72°C for 2 minutes; then 72°C for 10
minutes at the end of the reaction. After the reaction, the PCR product was purified on
a 1 x 5 cm G50 Sephadex column using buffer E (10 mM Tris-Cl pH 8.0, 50 mM NaCl,
1 mM EDTA pH 8.0), and the amount of radioactivity in the PCR product was
determined by liquid scintillation counting. The [32P]-labeled probe consisted of 1018
bp representing the prepro- and mature regions of the Artemia embryo cathepsin L
cDNA (see Butler et al. 2001).
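For planning purposes, the length of the cycling program described above can be tallied as follows. This is a minimal sketch that counts only the programmed hold times and ignores ramping between temperatures; the function name is hypothetical.

```python
def program_minutes(initial_min, cycles, step_minutes, final_min):
    """Sum the programmed hold times of a PCR program, in minutes."""
    return initial_min + cycles * sum(step_minutes) + final_min

# 94°C for 5 min; 35 cycles of (1 min, 1 min, 2 min); 72°C for 10 min
total = program_minutes(5, 35, (1, 1, 2), 10)
print(total, "minutes of hold time")
```

With the values from the text this gives 155 minutes, so a run takes somewhat more than two and a half hours once ramp times are included.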
3. Purification of PCR products
All PCR products were purified using the Wizard DNA Clean-Up system. At
least 50 µl of the PCR reaction mix was added to a microcentrifuge tube with 1 ml
Clean-Up resin. The resin and sample were mixed by gentle inversion several times.
The whole mixture was transferred to a syringe barrel then dispensed through the
minicolumn. The resin was washed with 2 ml 80% isopropanol, then transferred to a
1.5 ml microcentrifuge tube and centrifuged for 2.5 minutes to dry the resin. The
minicolumn (resin) was transferred to a new microcentrifuge tube and 50 µl distilled
water was added to the minicolumn. After 1 minute the minicolumn-microcentrifuge
tube was centrifuged for 30 seconds to collect the eluted DNA. The concentration and
purity of the DNA eluted from the minicolumn were determined by spectrophotometry at
260 nm and 280 nm.
4. Cloning of PCR products
4.1 DNA ligation reaction
The TA Cloning Kit was used for cloning all PCR products. DNA ligation with
the vector (pCR2.1) was done in a total volume of 10 µl and contained 1 µl 10x ligation
buffer, 2 µl pCR2.1 vector (25 ng/µl) and 1 µl T4 DNA ligase (4.0 Weiss units). The
amount of vector (pCR2.1) used in the reaction depended on the amount of PCR
fragment available for the ligation reaction, so as to obtain a ratio of about 1:5 or 1:10
(vector:insert). The ligation reaction was carried out overnight at 14°C.
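A vector:insert ratio of this kind is usually interpreted as a molar ratio, so the mass of insert needed depends on the relative lengths of insert and vector. The sketch below shows one way to do that arithmetic. It assumes a vector length of about 3900 bp for pCR2.1 and uses the 1018 bp PCR product as an example insert; whether the ratio in this thesis was calculated on a molar basis is not stated, so this is an illustration only, and the function name is hypothetical.

```python
def insert_mass_ng(vector_ng, vector_bp, insert_bp, insert_per_vector):
    """Mass of insert (ng) giving the requested insert:vector molar ratio."""
    return vector_ng * (insert_bp / vector_bp) * insert_per_vector

VECTOR_BP = 3900       # assumed length of pCR2.1 (about 3.9 kb); not given in the text
vector_ng = 2 * 25     # 2 µl of vector at 25 ng/µl, as in the ligation reaction

for ratio in (5, 10):  # vector:insert of 1:5 and 1:10
    ng = insert_mass_ng(vector_ng, VECTOR_BP, insert_bp=1018, insert_per_vector=ratio)
    print(f"1:{ratio} -> {ng:.0f} ng insert")
```

A shorter insert needs proportionally less mass to reach the same molar ratio, which is why the calculation scales by insert length over vector length.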
4.2 Transformation of competent Escherichia coli cells.
Competent E. coli cells, INVαF’ (Invitrogen), were thawed on ice before their
transformation with the pCR2.1 vector with insert. To one vial of competent cells (50
µl) was added 2 µl β-mercaptoethanol and 2 µl of the ligation reaction. The treated
cells were mixed gently, incubated on ice for 30 minutes, then heated in a water bath for
30 seconds at 42°C. The vial was placed on ice for several minutes, then 250 µl of
S.O.C. medium (2% tryptone; 0.5% yeast extract; 10 mM NaCl; 2.5 mM KCl; 10 mM
MgCl2; 10 mM MgSO4; 20 mM glucose) were added. The vial was shaken horizontally
at 37°C for 1 hour at 225 rpm on a platform shaker. After incubation, 100 µl of the
transformed cells were spread on an LB plate containing 40 µg/ml X-gal (Sigma) and
50 µg/ml ampicillin (Sigma). The LB plate was incubated at 37°C overnight.
5. Characterization of Artemia cathepsin L-1 gene isolated
from genomic DNA
5.1 Isolation of Artemia nuclei
Initially, nuclei were isolated from newly hatched nauplius larvae of Artemia
franciscana following the method of Squires and Acey (1989) as described by Clegg et
al. (1994). Starting material was 1 g (wet weight) nauplii that had been obtained from
Artemia cysts incubated at 28°C for 15-18 hours. The nauplii were homogenized with
8 ml ice-cold homogenization buffer (HB) (10 mM Tris-HCl, pH 7.5; 10 mM MgCl2;
0.1% Nonidet P40). The homogenate was centrifuged briefly to remove fragments of
exo-skeleton and the sediment was washed twice with 5 ml homogenization buffer
(HB), and centrifuged at 2500 rpm for 10 minutes at 4°C to sediment nuclei and
residual yolk platelets. The nuclei-rich pellet was resuspended in 2-3 ml
homogenization buffer lacking Nonidet P40, then layered over a 75% Percoll solution
containing 0.15 M NaCl, 0.01 M MgCl2, and 0.01 M Tris-Cl, pH 7.5 in a centrifuge
tube. The Percoll-nuclei preparation was centrifuged at 14,500 rpm for 30 minutes at
4°C. The white fluffy zone just beneath the surface was collected, diluted with 5 ml
homogenization buffer lacking Nonidet P40 and centrifuged at 10,000 rpm to recover
nuclei.
5.2 Isolation of Artemia franciscana genomic DNA
Artemia genomic DNA was isolated from purified nuclei using the DNAzol
Reagent as follows. DNAzol (1 ml) was added to about 1-3 x 10^7 isolated nuclei and
the reaction vessel was inverted several times to lyse the nuclei. The vessel was
centrifuged at 10,000 rpm, then genomic DNA was precipitated from the supernatant
by the addition of 0.5 ml of 100% ethanol per 1 ml of DNAzol used for isolation. The
visible DNA precipitate was removed by spooling with a pipette tip. The DNA
precipitate was washed twice with 0.8-1 ml 75% ethanol. The DNA was air dried for
about 15-60 seconds to remove the ethanol then dissolved in 0.2-0.3 ml of 8 mM
NaOH. The concentration and purity of the DNA were determined by UV absorption at
260 nm and 280 nm.
6. Screening of recombinant DNA clones for Artemia
franciscana cathepsin L nucleotide sequences
Nitrocellulose filters used for screening were placed on agar plates with 50 µg/ml
ampicillin (Sigma). White bacterial colonies were transferred onto the filter, and then
onto a master agar plate with ampicillin but no filter. Colonies were streaked in
identical positions on both plates. The plates were incubated overnight at 37°C. The
next day both plates were marked with India ink in three positions, sealed with
parafilm and stored at 4°C.
The nitrocellulose filter was placed on Whatman 3MM paper saturated with 10%
SDS for 3 minutes, then treated as described below (see section 8).
Finally, hybridization with the [32P]-labeled CL probe was carried out as
described in section 8 and the filter was exposed to an X-ray film at -80°C until a signal of
the desired strength was obtained.
7. Isolation of plasmid DNA from CL positive Artemia
franciscana clones
Bacterial clones showing a positive signal when probed with 32P-labeled CL
cDNA were suspended in 500 µl LB, incubated with shaking at 37°C for 30 minutes,
then mixed with 3 ml LB broth containing 50 µg/ml ampicillin. The liquid cultures
were incubated overnight, then 1 ml of each bacterial culture was centrifuged for 2
minutes at 13,000 rpm. The supernatant was removed and plasmid DNA was isolated
from the pellet using the Wizard Miniprep Kit according to the manufacturer’s
directions.
The plasmid DNA was treated with the restriction enzyme EcoR I and the
reaction mixture was analyzed on a 1% agarose gel to confirm the presence of an insert
in the vector.
8. Screening of an Artemia franciscana genomic DNA library
in EMBL3
An Artemia franciscana genomic DNA library, constructed in bacteriophage
lambda EMBL3, was prepared and kindly supplied by L. Sastre (Madrid, Spain). To
screen the library for CL DNA sequence(s), E. coli K802 cells were grown overnight in
LB medium supplemented with 10 mM MgSO4 and 0.2% maltose, then various
dilutions of the EMBL3 phage were added to 0.5 ml K802 cells and each tube was
incubated at 37°C for 40 minutes. The infected cells were then added to 2.5 ml LB
containing 0.7% agarose and poured onto 90 mm LB/agar plates. The plates were
incubated overnight, and plaques from those plates showing distinct but non-confluent
plaques were transferred to nitrocellulose membranes as described previously (Benton
and Davis, 1977; see Sambrook et al. 1989). The membranes were placed on Whatman 3MM
filter paper and treated sequentially with the following solutions: 1) denaturing
solution (1.5 M NaCl, 0.5 M NaOH), 2) neutralizing solution (1.5 M NaCl, 0.5 M
Tris-Cl, pH 7.0), and 3) 2x SSC, each for 3 minutes. The membranes were then air
dried for 30 minutes and baked in a vacuum oven at 80°C for one hour. Prior to
treatment with the 32P-labeled CL probe, the membranes were prehybridized in 6x SSC
containing 5x Denhardt’s and 0.1% SDS at 64°C for one hour. Next, the hybridization
reaction was carried out in fresh pre-hybridization solution containing heat-denatured
32P-labeled CL probe (1-2 x 10^6 cpm) at 64°C overnight in a hybridization chamber
(Fisher Scientific). After the hybridization step, the membranes were washed in 6x
SSC containing 1% SDS for 20 minutes, 2x SSC containing 1% SDS for 20 minutes,
and 0.2x SSC containing 1% SDS for 10 minutes, then exposed to an X-ray film at
-80°C for up to 4 days. Plaques giving positive signals after hybridization were
“cored” with a sterile pipette tip and placed in 0.5 ml SM (0.1 M NaCl; 0.008 M
MgSO4·7H2O; 0.05 M Tris-Cl, pH 7.5; 0.01% gelatin) and one drop of chloroform was
added to each vessel. Phage released from the agar core were used to re-infect E. coli
K802 cells to further purify the phage to homogeneity.
9. Isolation of lambda EMBL3 phage containing putative
cathepsin L genomic DNA
E. coli K802 cells were grown overnight in LB medium supplemented with 10
mM MgSO4 and 0.2% maltose. Approximately 30 µl of a purified EMBL3 clone in
SM buffer, containing the putative CL sequence, was added to 0.1 ml K802 cells
containing 0.1 ml of 10 mM CaCl2 and 10 mM MgCl2. The mixture of cells and phage
was incubated at 37°C for 60 minutes, then added to 25 ml LB medium supplemented
with 10 mM MgCl2. The culture was incubated at 37°C overnight with shaking until
lysis had occurred. Next, 5 drops of chloroform were added, and the culture was
shaken for another 5 minutes. Following this, cellular debris was removed by
centrifugation for 20 minutes at 10,000 rpm and 4°C (Sorvall, SS-34 rotor). The
supernatant from the previous step was centrifuged at 40,000 rpm (Beckman L5
ultracentrifuge) for 2.5 hours at 4°C. The resulting pellet was resuspended in SM
solution containing DNase I and RNase A at final concentrations of 1 µg/ml and 10
µg/ml, respectively, followed by incubation at 37°C for 1.5 hours. Solid NaCl and
polyethylene glycol (PEG 8000) were then added to final concentrations of 1 M and
10% w/v, respectively, and the mixture was stored on ice overnight. The bacteriophage
particles were recovered by centrifugation at 10,000 rpm for 15 minutes at 4°C. The
phage pellet was resuspended in SM, then EDTA and proteinase K were added to final
concentrations of 0.04 M and 0.09 mg/ml, respectively. The mixture was incubated at
65°C for one hour. Proteins were removed by extractions first with phenol, then with
phenol:chloroform (1:1), and finally with chloroform only, using standard procedures.
After the final centrifugation the aqueous layer was treated with 1/10 volume of 3 M sodium
acetate, pH 5.2 and two volumes of cold 95% ethanol to precipitate the phage DNA.
After 3 hours at -20°C, the ethanol-insoluble material was recovered by centrifugation
at 10,000 rpm. The DNA pellet was washed twice with cold 70% ethanol. Finally, the
phage DNA precipitate was dried in a Speedvac and resuspended in distilled water. The
DNA concentration and purity were determined by UV absorption at 260 and 280 nm,
where 50 µg/ml DNA gives an absorbance of 1.0 at 260 nm. A 260/280 ratio of 1.9-2.0
was taken to represent “pure” DNA.
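The conversion described above can be written out as a short calculation. The sketch assumes a 1 cm path length and uses made-up absorbance readings and a made-up dilution factor purely for illustration; the function names are hypothetical.

```python
UG_PER_ML_PER_A260 = 50.0   # from the text: 50 µg/ml DNA gives A260 = 1.0

def dna_concentration(a260, dilution_factor=1.0):
    """DNA concentration (µg/ml) of the undiluted sample."""
    return a260 * UG_PER_ML_PER_A260 * dilution_factor

def is_pure(a260, a280, low=1.9, high=2.0):
    """True if the A260/A280 ratio falls in the range taken as "pure" DNA."""
    return low <= a260 / a280 <= high

# Illustrative readings only (not data from the thesis)
a260, a280, dilution = 0.25, 0.13, 100
print(dna_concentration(a260, dilution), "µg/ml")
print("ratio:", round(a260 / a280, 2), "pure:", is_pure(a260, a280))
```

A ratio well below 1.9 would suggest protein or phenol carry-over, in which case the extraction steps would normally be repeated before the DNA is used.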
10. Restriction analysis and Southern blotting of putative
CL clones derived from the genomic DNA EMBL3 library
Since Artemia embryo CL-1 cDNA contained an EcoR I restriction site just
before the mature protease coding sequence, recombinant EMBL3 clones containing
the putative CL genes were analyzed with the restriction enzyme EcoR I. Typically, the
reactions contained 1-2 µg phage DNA, buffer H (90 mM Tris-Cl, 10 mM MgCl2, 50
mM NaCl, pH 7.5) and 12 units of restriction enzyme EcoR I in a total volume of 20 µl.
The reactions were incubated at 37°C overnight, combined with loading dye
and subjected to electrophoresis on a 1.0% agarose gel with 1x TAE buffer. At the end
of the run (about 2 hours at 60 volts) the gel was incubated in 1x TAE buffer containing
ethidium bromide, then visualized using UV transillumination.
The EcoR I reaction products were transferred to a nitrocellulose membrane by
Southern blotting according to the standard protocol (Southern, 1975). After the
transfer was complete, the membrane was rinsed in distilled water for several minutes
then baked at 80°C for 90 minutes under vacuum. The prehybridization and
hybridization steps were performed as described above (see section 8). Finally the
membrane was exposed to an X-ray film at -80 °C for at least 18 hours.
11. Polymerase chain reaction using phage DNA clones as
substrate
Cathepsin L (CL) positive clones in EMBL3 were analyzed using the
polymerase chain reaction. The primers used were CLF and CLR3 (see Appendix 1
and 2). The reaction contained 150 ng of a pure CL EMBL3 clone as template, 1x PCR
reaction buffer, 2.5 mM MgCl2, 0.2 mM dNTP (dATP, dTTP, dCTP, dGTP) (final
concentration) and 5 units Taq DNA polymerase. PCR was performed under the
following conditions: 94°C for 5 minutes; 35 cycles of 94°C for 1 minute, 53°C for 1
minute, 72°C for 2 minutes; finally the reaction vessel was incubated at 72°C for 10
minutes to complete unfinished chain extensions.
The PCR product(s) was subjected to electrophoresis on a 1% agarose gel in 1x
TAE buffer. The bands of interest were localized using ethidium bromide, cut from
the gel using a clean scalpel and purified further using the QIA Gel Extraction Kit. The
concentration and purity of the DNA were determined by UV absorption at 260 nm and
280 nm.
12. Amplification of Artemia cathepsin L genes
The polymerase chain reaction was used to amplify the cathepsin L genes from
genomic DNA as follows. The primers used were CLF13 and CLR18 (see Appendix 1
and 2) at a final concentration of 1 pmol/µl. The template DNA was diluted to 100
ng/µl and 2.5 µl template DNA (250 ng) was added to a reaction vessel containing 1x
PCR reaction buffer, 2.5 mM MgCl2, 0.2 mM dNTP (dATP, dTTP, dCTP, dGTP) (final
concentration), 5 units Taq DNA polymerase and 1 mM betaine at 5 mg/ml (Sigma).
PCR was performed as follows: 94°C for 5 minutes; 35 cycles of 94°C for 1 minute,
54°C for 1 minute, 72°C for 2 minutes; 72°C for 10 minutes.
13. DNA sequencing of Artemia cathepsin L clones
DNA sequencing was conducted in two ways. Initially the Thermo Sequenase
Cy5.5 Dye Terminator Cycle Sequencing Kit and departmental DNA sequencer were
used. For each clone to be sequenced, the master mix contained 3.5 µl reaction buffer,
2 µl primer (20 pmol/µl), and 2 µl Thermo Sequenase DNA polymerase (10 U/µl).
PCR products and plasmid DNA were used as template. The total volume of the initial
reaction mix was 31.5 µl. The primers used in the sequencing reactions were CLF and
CLR3. Aliquots of the initial reaction mix (7 µl) were placed in four different
sequencing reaction vessels, each containing 2.5 µl of a different ddNTP (ddATP,
ddCTP, ddTTP, ddGTP). The reaction vessels were incubated for 2.5 minutes at 94°C,
then subjected to 35 cycles of 45 seconds at 94°C, 45 seconds at 52°C and 2 minutes
at 72°C, followed by 10 minutes at 72°C. After completion of the cycling program, 2 µl
of 7.5 M ammonium acetate and 30 µl of cold 95% ethanol were added to each of the
four reaction tubes.
The tubes were placed at -80°C for at least 18 hours. Next, the tubes were centrifuged
(4°C) at 13,000 rpm for 45 minutes. The supernatant was removed and the pellet was
washed with ice-cold 70% ethanol. The tubes were centrifuged again for 5 minutes.
The supernatant was removed and the pellet dried under vacuum. Formamide loading
dye (6 µl) was added to each pellet and the tubes were vortexed vigorously.
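The volumes quoted above for the sequencing reactions can be cross-checked with a little arithmetic. The sketch below uses only numbers from the text; it assumes that the remainder of the 31.5 µl mix is template plus water, which the text does not state explicitly.

```python
# Volumes in µl, taken from the text
buffer_ul, primer_ul, polymerase_ul = 3.5, 2.0, 2.0
total_mix_ul = 31.5
aliquot_ul, n_reactions = 7.0, 4      # one aliquot per ddNTP
ddntp_ul = 2.5

# Assumption: whatever is not buffer, primer or enzyme is template plus water
template_plus_water_ul = total_mix_ul - (buffer_ul + primer_ul + polymerase_ul)
leftover_ul = total_mix_ul - aliquot_ul * n_reactions
reaction_volume_ul = aliquot_ul + ddntp_ul

print(template_plus_water_ul, leftover_ul, reaction_volume_ul)
```

On these figures, 24 µl remains for template and water, 3.5 µl of mix is left over after the four aliquots, and each termination reaction has a volume of 9.5 µl.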
After heating at 70°C for 2.5 minutes, the products were subjected to
electrophoresis on a polyacrylamide gel in the Long Read Tower system under the
following conditions: gel temperature: 60°C; gel voltage: 2000 volts; laser power: 50%.
The electrophoresis buffer was 1x TBE. Prior to loading the samples (2.5 µl), the gel
was pre-run for 20 minutes. The electrophoresis running time was one hour.
Except for the above, all subsequent plasmid DNA clones were sent to the
Robarts Institute for sequencing. The primers used in the sequencing reaction were
provided by the Robarts Institute as follows:
T7 promoter (5’-TAATACGACTCACTATAGGG-3’)
M13 Forward (5’-CGCCAGGGTTTTCCCAGTCACGAC-3’)
M13 Reverse (5’-TCACACAGGAAACAGCTATGAC-3’).
After sequencing, the data were used for alignments withArtemia franciscana
cathepsin L cDNA (AF147207) and genomic DNA clone 9C (AY557372), whose
26
Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. sequences were in Genebank at the National Center for Biotechnology Information
(NCBI). The alignments were performed using Clustal W (1.82) program (Thompson
et al. 1994) at EBI (European Bioinformatics Institute) toolbox.
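As an illustration of how a percent identity figure can be derived from a pairwise alignment of the kind produced by Clustal W, the following Python sketch counts matching bases over the aligned columns. It is a minimal example and not the procedure used in this work: the function name percent_identity and the decision to skip columns containing a gap (dash) are assumptions made for illustration, and the toy sequences are not Artemia data.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over columns where both aligned sequences have a base.

    Assumption: columns containing a gap in either sequence (deleted or
    unsequenced positions) are excluded from the comparison.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for base_a, base_b in zip(aligned_a.upper(), aligned_b.upper()):
        if base_a == gap or base_b == gap:
            continue  # skip gap columns
        compared += 1
        if base_a == base_b:
            matches += 1
    if compared == 0:
        raise ValueError("no overlapping positions to compare")
    return 100.0 * matches / compared


# Toy alignment: 8 comparable columns, 7 of them identical.
identity = percent_identity("ACGT-ACGT", "ACGTTACGA")
print(f"{identity:.1f}% identical")
```

Counting gap columns as mismatches instead would lower the reported value, so the convention should be stated whenever identities are compared.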
14. PCR analysis of Artemia franciscana adult cDNA library
for the presence of cathepsin L sequence
An adult Artemia franciscana cathepsin L cDNA library constructed in
bacteriophage λgt11 was obtained from L. Sastre (Madrid, Spain). E. coli rY1090⁻ cells
were used as host and grown overnight in LB medium supplemented with 10 mM
MgSO4 and 0.2% maltose. Approximately 9 × 10⁵ λgt11 phage from the library
were added to 0.5 ml rY1090⁻ cells and incubated at 37°C for 60 minutes. The mixture
was then added to 2.5 ml LB with 0.7% agarose and poured on a 90 mm LB/agar plate.
The plates (6) were incubated overnight, and those showing confluent plaques were
saved. SM solution (5 ml) was added to each plate, and the plates were shaken at room
temperature for 4 hours. The SM solution on the plates was collected and centrifuged
at 10,000 rpm for 15 minutes at 4°C. The supernatant was used for phage DNA
isolation, following the protocol described in section 4 above. The absorbance of the phage DNA was
measured at 260 nm and 280 nm, and the purity and concentration of the DNA were
calculated as described above.
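The conversion from absorbance readings to DNA concentration and purity can be sketched as follows. This is an illustrative example only: it assumes the conventional factor of 50 µg/ml of double-stranded DNA per A260 unit (1 cm path length) and uses the A260/A280 ratio as a purity index (about 1.8 for clean DNA). The function names and the example readings are hypothetical and are not values from this work.

```python
DSDNA_UG_PER_ML_PER_A260 = 50.0  # conventional factor for double-stranded DNA


def dsdna_concentration_ug_per_ml(a260: float, dilution_factor: float = 1.0) -> float:
    """Concentration of the undiluted sample in µg/ml from an A260 reading."""
    return a260 * DSDNA_UG_PER_ML_PER_A260 * dilution_factor


def purity_ratio(a260: float, a280: float) -> float:
    """A260/A280 ratio; values well below about 1.8 suggest protein carry-over."""
    if a280 <= 0:
        raise ValueError("A280 must be positive")
    return a260 / a280


# Hypothetical readings for a sample diluted 1:10 before measurement.
concentration = dsdna_concentration_ug_per_ml(a260=0.5, dilution_factor=10)
ratio = purity_ratio(a260=0.5, a280=0.27)
print(f"{concentration:.0f} µg/ml, A260/A280 = {ratio:.2f}")
```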
Using total DNA prepared from an Artemia franciscana adult cDNA library,
PCR was performed using different pairs of primers designed to determine the
presence of sequence matching the Artemia CL-2 gene, as follows: CL9CF11, CLR11,
CLF11 and CLR10b (see Appendix 1 and 3). The reaction vessels contained 500 ng
template DNA, 1x PCR reaction buffer, 2.5 mM MgCl2, 0.2 mM dNTP (final
concentration) (dATP, dTTP, dCTP, dGTP), 5 units Taq DNA polymerase and 1 mM
Betaine. PCR was performed as follows: 94°C for 5 minutes; 35 cycles of 94°C for 1
minute, 51 or 54°C, depending on the primer pair, for 1 minute, 72°C for 2 minutes;
72°C for 10 minutes.
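The cycling program above can be written down as a small data structure, which makes it easy to check the total programmed time. The sketch below is illustrative only: the Step class and the program_minutes function are hypothetical names, and ramp times between temperatures are ignored.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Step:
    """One temperature hold in a thermal cycling program."""
    temp_c: float
    minutes: float


def program_minutes(initial: Step, cycle: List[Step], n_cycles: int, final: Step) -> float:
    """Total programmed hold time in minutes (ramp times are ignored)."""
    per_cycle = sum(step.minutes for step in cycle)
    return initial.minutes + n_cycles * per_cycle + final.minutes


# Program from the text, using the 51°C annealing option.
initial = Step(94, 5)
cycle = [Step(94, 1), Step(51, 1), Step(72, 2)]
final = Step(72, 10)
total = program_minutes(initial, cycle, 35, final)
print(f"{total} minutes of programmed holds")
```

With these values the holds add up to 5 + 35 × 4 + 10 = 155 minutes; the choice of 51 or 54°C for annealing does not change the total.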
The PCR products were analyzed on a 1% agarose gel and the major bands were
purified using the Wizard DNA Clean-Up system. The PCR-derived DNA fragments
were ligated into pCR 2.1, transfected into INVαF’ cells and grown on LB plates with
ampicillin as described above. Through the screening process, DNA clones showing a
positive signal with 32P-labeled CL cDNA were collected, grown in LB broth with
ampicillin and isolated using the Wizard Miniprep Kit. Clones containing putative CL
inserts were sent to the Robarts Institute for DNA sequencing.
The sequencing data were compared with Artemia franciscana embryo cathepsin
L cDNA (AF147207), representing the Artemia CL-1 gene, and Artemia franciscana
genomic DNA clone 9C (AY557372), representing the Artemia CL-2 gene. The program
used was Clustal W (1.82) (Thompson et al. 1994).
15. PCR analysis of Artemia embryonic cDNA library for
additional CL cDNAs
Bluescript phagemid DNA was prepared as described previously from the
Artemia franciscana cDNA library in λZAPII using a protocol from the supplier
(Butler et al. 2001).
Using total DNA prepared from an Artemia embryonic cDNA library, PCR was
performed using primer pair TP-7F and CLR11 (see Appendix 1 and 3). The reaction
vessels contained 276 ng template DNA, 1x PCR reaction buffer, 2.5 mM MgCl2, 0.2
mM dNTP (dATP, dTTP, dCTP, dGTP), 5 units Taq DNA polymerase and 1 mM
Betaine in 50 µl final volume. PCR was performed as follows: 94°C for 5 minutes; 35
cycles of 94°C for 1 minute, 51 or 54°C, depending on primer pair, for 1 minute, 72°C
for 2 minutes; 72°C for 10 minutes.
The PCR products were analyzed on a 1% agarose gel and the main bands were
purified using the Wizard DNA Clean-Up system. The PCR-derived DNA was ligated
into pCR 2.1, then INVαF’ cells were transfected and grown on LB plates with
ampicillin as described above. DNA clones showing a positive signal when probed
with 32P-labeled CL cDNA were collected, grown in LB broth with ampicillin and
isolated using the Wizard Miniprep Kit. Clones containing putative CL inserts were sent to
the Robarts Institute for DNA sequencing.
The sequencing data were compared with Artemia franciscana genomic clone 9C
(AY557372), representing the Artemia CL-2 gene, and with Artemia franciscana embryo
cathepsin L cDNA (AF147207). The program used was Clustal W (1.82) (Thompson et
al. 1994).
16. Analysis of Artemia franciscana genomic DNA in phage
EMBL3 for additional CL genes
Artemia franciscana genomic DNA cloned in EMBL3 was grown in E. coli K802
cells, purified as described above, then used as template for the polymerase chain
reaction using the following primers: CLF10 and CLR8 (see Appendix 1, 2 and 3). The
reaction contained 500 ng total phage DNA as template, 1x PCR reaction buffer, 2.5
mM MgCl2, 0.2 mM dNTP (dATP, dTTP, dCTP, dGTP), 5 units Taq DNA polymerase
and 1 mM Betaine in 50 µl final volume. PCR was performed under the following
conditions: 94°C for 5 minutes; 35 cycles of 94°C for 1 minute, 56°C for 1 minute,
72°C for 2 minutes; 72°C for 10 minutes.
The PCR product was purified using the Wizard DNA Clean-Up system and ligated
into vector pCR 2.1 as described above. DNA clones showing a positive signal using
our 32P-labeled CL cDNA probe were collected and purified, and the PCR product within
the plasmid was sequenced.
The sequencing data were analyzed for their similarity with Artemia franciscana
embryo cathepsin L cDNA (AF147207) and Artemia franciscana genomic clone 9C
(AY557372).
17. Attempt to identify the putative promoter sequence of
Artemia franciscana CL genes
Degenerate PCR was performed in an attempt to identify the upstream (5’)
promoter sequence of one of the Artemia franciscana CL genes. Artemia franciscana
genomic clone 9C contains the nucleotide sequence that we have designated CL-2, and so it
was used as template in the PCR reaction. The degenerate primers used were as
described by Badaracco et al. (1995), in addition to an internal primer, as
follows: OPC-2, OPC-4, OPC-8, OPC-9 and CLR10 (see Appendix 1 and 3).
The reaction contained 250 ng template, 1x PCR reaction buffer, 2.5 mM MgCl2,
0.2 mM dNTP (dATP, dTTP, dCTP, dGTP) (final concentration) and 5 units Taq DNA
polymerase. PCR was performed in two consecutive steps. Step 1: 94°C for 5 minutes;
10 cycles of 94°C for 1 minute, 35°C for 1 minute, 45°C for 12 minutes. Step 2: 30
cycles of 94°C for 1 minute, 56°C for 1 minute, 72°C for 2 minutes; 72°C for 7 minutes
after the 30 cycles to complete all extensions.
The PCR products were transferred to a nitrocellulose membrane by Southern
blotting (Southern, 1975) and the membrane processed for hybridization with a
[32P]-labeled CL probe as described above.
CL-positive PCR products were separated by electrophoresis on a 1% agarose gel,
and the band showing a positive signal was cut from the gel, purified using the QIA Gel
Extraction Kit, then ligated into vector pCR 2.1 and cloned as described above.
Positive clones were collected and purified, and the plasmid insert containing the PCR
product was sequenced.
Results:
1. Isolation of cDNA clone coding for Artemia embryo
cathepsin L
The plasmid pCR2.1 containing a cDNA coding for Artemia franciscana
cathepsin L (CL) used in this experiment was isolated using alkaline lysis and treated
with restriction endonuclease EcoR I as shown in Fig 1. The cloned CL cDNA
contained 1085 bp, and was composed of fragments of 335 bp and 750 bp. This cloned
cDNA was described previously (Butler et al. 2001). The cloned cDNA was used for
construction of a [32P]-labeled probe to detect similar sequences in genomic DNA and
cDNA libraries prepared from Artemia franciscana larvae and adults.
2. Isolation and analysis of cathepsin L-like clones from an
Artemia franciscana genomic DNA library
Ten putative CL-positive clones (A1 to A10) were isolated from a genomic DNA
library constructed in phage EMBL3 and analyzed using EcoR I digestion. Three of
these clones (A1, A2, A3) are shown in Fig 2. All clones showed the same EcoR I
restriction pattern displayed by Artemia franciscana genomic clone 9C isolated
previously (Fig 2), whose sequence is about 80% identical to Artemia embryo CL
cDNA. The restriction patterns of all the clones were similar in that they yielded three
bands: 10,000 bp, 8000 bp and 2400 bp. These results indicated that there are two
EcoR I digestion sites in the cloned sequence; however, only one band (8000 bp) gave
a signal with the 32P-labeled CL probe. These results demonstrated that while all clones
appeared to represent CL sequences, they were “identical” to genomic clone 9C, but
not to Artemia embryo CL cDNA as shown in Fig 1.
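The reasoning that three bands imply two EcoR I sites follows from the fact that n cuts in a linear molecule give n + 1 fragments. A minimal Python sketch of such an in silico digest is given below. It is illustrative only: the function name digest_linear and the toy sequence are hypothetical, and the only outside fact used is that EcoR I recognizes GAATTC and cuts after the first G.

```python
def digest_linear(seq: str, site: str = "GAATTC", cut_offset: int = 1) -> list:
    """Fragment lengths after cutting a linear sequence at every recognition site.

    cut_offset is the number of bases of the site left on the upstream
    fragment (1 for EcoR I, which cuts G^AATTC on the top strand).
    """
    seq = seq.upper()
    cuts = []
    start = seq.find(site)
    while start != -1:
        cuts.append(start + cut_offset)
        start = seq.find(site, start + 1)
    bounds = [0] + cuts + [len(seq)]
    return [end - begin for begin, end in zip(bounds, bounds[1:])]


# Toy linear molecule with two EcoR I sites: two cuts, three fragments.
toy = "A" * 10 + "GAATTC" + "C" * 20 + "GAATTC" + "T" * 5
fragments = digest_linear(toy)
print(fragments)
```

The fragment lengths always sum to the length of the input, which is a convenient check when band sizes estimated from a gel are compared with a sequence.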
[Figure 1: gel image, lanes 1 and 2. Ladder markers labeled at 4000, 2000, 1000, 750 and 250 bp; labeled bands: pCR2.1 vector, 750 bp and 335 bp.]
Fig. 1. Analysis of a cDNA in vector pCR2.1 coding for Artemia franciscana
embryo cathepsin L.
Lane 1, 1 kb ladder; lane 2, products of EcoR I digestion. The bands above were
visualized using ethidium bromide staining.
[Figure 2: panel A, gel image; panel B, autoradiograph; lanes 1 to 9 in each panel. Markers labeled at 8000, 3000 and 2000 bp.]
Fig. 2. EcoR I digestion of Artemia franciscana genomic DNA clones in λ EMBL3.
Panel A, the restriction pattern of all the clones analyzed. The restriction enzyme used
was EcoR I. Lane 1, 1 kb ladder; lane 2, clone 9C (control); lane 3, EcoR I treatment of
clone 9C; lane 4, clone A1; lane 5, EcoR I treatment of clone A1; lane 6, clone A2; lane
7, EcoR I treatment of clone A2; lane 8, clone A3; and lane 9, EcoR I treatment of
clone A3. Panel B, X-ray of clones in panel A after Southern blotting and hybridization
with 32P-labeled CL cDNA probe. Lanes 1-9 represent the same DNAs as shown in
panel A. All clones showed a strong hybridization signal at about 8000 bp after EcoR I
digestion.
3. Identification of DNA sequence in putative CL genomic
clones from Artemia franciscana
The polymerase chain reaction was used to amplify part of the insert in the CL-positive
clones (A1, A2, A3) shown in Fig 2, which could then be sequenced. The
primers for the PCR reaction were CLF and CLR3 (see Appendix 1 and 2), designed
from the sequence of Artemia embryo cDNA and expected to yield a fragment of
578 bp. The PCR products of these reactions are shown in Fig 3. The reactions yielded
products of approximately 600 bp, similar to the product obtained with
genomic clone 9C as substrate (lane 2).
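The expected product size of a primer pair can be estimated from a known template sequence by locating the forward primer and the reverse complement of the reverse primer. The Python sketch below illustrates the idea under simplifying assumptions: primers must match the template exactly (real primers tolerate some mismatches), only the first binding site of each primer is considered, and the function names and toy sequences are hypothetical. The actual CLF and CLR3 sequences are given in the Appendix and are not reproduced here.

```python
from typing import Optional

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence containing only A, C, G and T."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def amplicon_length(template: str, forward: str, reverse: str) -> Optional[int]:
    """Length of the product, primers included, or None if a primer has no exact site."""
    template = template.upper()
    forward = forward.upper()
    fwd_start = template.find(forward)
    if fwd_start == -1:
        return None
    # The reverse primer anneals to the top strand as its reverse complement.
    rev_site = reverse_complement(reverse)
    rev_start = template.find(rev_site, fwd_start + len(forward))
    if rev_start == -1:
        return None
    return rev_start + len(rev_site) - fwd_start


# Toy template: 4 flanking bases, a 6-base forward site, 30 bases of
# intervening sequence, a 6-base reverse site and 4 flanking bases.
toy_template = "TTTT" + "ACGTAC" + "G" * 30 + "CCATGA" + "AAAA"
print(amplicon_length(toy_template, forward="ACGTAC", reverse="TCATGG"))
```

A primer pair that returns None on one template and a length on another is the in silico analogue of primers that distinguish two related genes.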
According to the EcoR I restriction pattern and hybridization pattern of all ten
putative CL genomic clones isolated from the EMBL3 library, all clones appeared to
be identical, so only one clone (A1) was sequenced. The PCR product shown in lane 3
of Figure 3 was cloned into pCR2.1, then sequenced as described in the methods. The
primers in the separate sequencing reactions were CLF and CLR3, the same as the PCR
primers, so only a partial sequence was obtained, because the primer regions are
not usually seen clearly on the sequencing electropherogram. Thus, a sequence of 482 bases
was obtained and compared with Artemia cathepsin L cDNA (AF147207) using the
Clustal W (1.82) program (Thompson et al. 1994) (Fig 4). The sequence of the PCR
product from clone A1 was 90% identical with embryo CL cDNA, and this sequence
covered the prepro-region and part of the mature region of the cysteine protease
according to the embryo CL cDNA sequence. The most notable differences from the
embryo cDNA clone were a six-base gap (bp 318-323) in the prepro-coding region and
loss of the EcoR I site upstream from the mature protease coding region. The partial
sequence of clone A1 was also compared with Artemia franciscana genomic clone 9C
(AY557372), where the identity was 99% as shown in Figure 5. These results suggest
that genomic clone A1 is nearly identical to clone 9C except for a few nucleotide
polymorphisms. Overall, the CL gene in genomic clone 9C has 1049 base pairs and
was about 80% identical with Artemia embryo cathepsin L cDNA. Both clones A1 and
9C have the same restriction pattern, but they differ from embryonic CL cDNA as
[Figure 3: gel image, lanes 1 to 5; the primer dimer band is marked.]
Fig. 3. PCR products from the use of putative CL phage DNA clones as substrate.
PCR products were separated by electrophoresis on 1% agarose, and stained with
ethidium bromide as described in methods. The primers used were CLF and CLR3 (see
Appendix 1 and 2). Lane 1, 1 kb ladder; lane 2, PCR product using genomic clone 9C
as substrate; lanes 3-5, PCR products using genomic clones A1, A2 and A3,
respectively, as substrate. The major band of all reactions was approximately 600 bp.
Fig. 4. Alignment of genomic clone A1 with Artemia embryo cathepsin L cDNA.
The sequence of Artemia embryo cathepsin L cDNA (AF147207) was obtained from
the GenBank database. The alignment was performed using the program Clustal W (1.82).
The two sequences were 90% identical in the region compared. Asterisks indicate
identical base pairs. The arrowhead indicates the potential cleavage site between the prepro- and
mature regions. The boxed bases represent the EcoR I restriction site in Artemia embryo
cDNA that is lacking in genomic clone A1 (and clone 9C). The dashes within the clone A1
sequence indicate bases missing compared to the cDNA (bp 318-323 in cDNA). Other dashes
indicate the regions of the clone not sequenced.
Figure 4.
EcDNA1 CATCTTGTGGCAGACAATTACACAATGAAGCAGATTACTTTGATATTTTTACTGGGAGCTGTACTTGTGCAGTTAAGTGCTGCACTATCA 90
A1     ------
EcDNA1 CTGACAAATTTACTTGCTGATGAATGGCATCTATTCAAGGCTACACACAAGAAAGAATATCCAAGCCAACTTGAGGAGAAATTTAGAATG 180
A1     ------AATTTGCTTGCTGATGAATGGTATCTATTCAAGGCTAGACACAAGAAAGATTATCCAAGCCAACTTGAGGAAAAATTTAGAATG 84
EcDNA1 AAGATTTATTTGGAAAATAAACACAAAGTTGCCAAACATAACATCCTTTATGAAAAAGGCGAAAAGTCTTATCAAGTCGCAATGAATAAG 270
A1     AAGATTTATTTTGAAAATAAAGACAAAATTGCCAAACATAACATCCTTTATGAGAAAGGCGAAAAGTCTTATCAAGTTGCAATGAATCAG 174
EcDNA1 TTTGGAGATCTTCTTCATCATGAATTTAGATCTATCATGAATGGATACCAACATAAGAAACA|GAATTC|CTCAAGAGCTGAGAGCACTTTC 360
A1     TTTGGAGATCTTCTTCATCATGAATTTACATCTATCATGATTGGATA------TAAGAAATG AACTTC ACCCTTTGCTAAGAGCACTTTT 258
EcDNA1 ACTTTTATGGAGCCTGCTAATGTTGAAGTTCCAGAATCTGTTGACTGGAGGGTAAAAGGAGCCATAACTCCTGTAAAAGACCAAGGACAG 450
A1     ACTTTTATGGAGCCTGCTAATGTTACAGTTCCAGAATCTGTTGACTGGAGGGAAAAAGGAGCAGTAACTCCTGTAAAATACCAAGGACAG 348
EcDNA1 TGTGGTTCATGCTGGGCTTTCTCATCTACTGGTGCCTTGGAAGGTCAAACCTTCAGAAAAACAGGGAAGCTCATTTCTTTGAGTGAACAG 540
A1     TGTGCTTCTTGCTTGGCTTTTTCACCTACTGGTGCCTTGGAAAGTCAAACTTTCAGAAAAACAGGAAAGCTCATTTCTTTGAGTGAACAA 438
EcDNA1 AACTTGATTGATTGTTCTGGAAAATATGGAAATGAAGGATGCAATGGAGGATTAATGGACCAAGCTTTCCAGTATATCAAGGATAACAAG 630
A1     AACTTGATTGATTGTTCCGGTGAATATGGAAATTTAGGATGCAATGGAGGATTAATGGA------ 497
EcDNA1 GGAATTGACACTGAAAATACGTACCCTTATGAAGCTGAAGACAATGTCTGTCGTTATAATCCAAGGAACCGAGGTGCCATTGACCGTGGC 720
A1     ------
EcDNA1 TTTGTCCATATCCCATCTGGAGAAGAAGATAAGCTTAAGGCAGCTGTTGCCACTGTTGGACCTGTATCTGTTGCCATCGATGCCTCTCAT 810
A1     ------
EcDNA1 GAAAGTTTCCAATTCTATTCTAAAGGTGTTTACTATGAGCCATCATGTGACTCTGATGACCTAGACCACGGAGTTCTTGTGGTTGGCTAT 900
A1     ------
EcDNA1 GGTTCTGATAATGGCAAAGACTATTGGCTCGTTAAAAACTCGTGGTCTGAGCACTGGGGAGACGAAGGGTATATCAAGATTGCTCGCAAT 990
A1     ------
EcDNA1 CGCAAGAACCATTGTGGTATTGCTACTGCAGCTAGCTATCCACTTGTATAGATAGGGTTGTGGTAATTTTTGTGGATGTGTGTAATTGCA 1080
A1     ------
EcDNA1 TACGTTAAATTCTTATTCTCTTGATAGGTTTAGAGAGTTCTAGTTTTCAGTTTGATTCCGTAGATGACAGATTTTGTGACCATATTCGAG 1170
A1     ------
EcDNA1 AATAAAGCGTTTTTTTTACCTAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA 1229
A1     ------
Fig. 5. Alignment of genomic clone A1 with Artemia cathepsin L genomic clone
9C (CL-2 gene).
The sequence of Artemia cathepsin L genomic clone 9C (AY557372) was obtained
from the GenBank database. The alignment was performed using the program Clustal W
(1.82). The two sequences were 99% identical. Asterisks indicate identical base pairs.
Dashes indicate regions not yet sequenced.
Figure 5.
9C CAAATGAAGCAGATTACTTTGACATATTTACTAACAGCTGTAATGATATTTTTACTGTCAGTTGTACTTGTGCAGTTAAGTGCTACACAA 90
A1 ------
9C TCACAGTCAAATTTGCTTGCTGATGAATGGTATCTATTCAAGGCTAGACACAAGAAAGATTATCCAAGCCAACTTGAGGAAAAATTTAGA 180
A1 ---------AATTTGCTTGCTGATGAATGGTATCTATTCAAGGCTAGACACAAGAAAGATTATCCAAGCCAACTTGAGGAAAAATTTAGA 81
9C ATGAAGATTTATTTTGAAAATAAAGACAAAATTGCCAAACATAACATCCTTTATGAGAAAGGCGAAAAGTCTTATCAAGTTGCAATGAAT 270
A1 ATGAAGATTTATTTTGAAAATAAAGACAAAATTGCCAAACATAACATCCTTTATGAGAAAGGCGAAAAGTCTTATCAAGTTGCAATGAAT 171
9C CAGTTTGGAGATCTTCTTCATCATGAATTTACATCTATCATGATTGGATATAAGAAATGAACTTCACCCTTTGCTAAGAGCACTTTTACT 360
A1 CAGTTTGGAGATCTTCTTCATCATGAATTTACATCTATCATGATTGGATATAAGAAATGAACTTCACCCTTTGCTAAGAGCACTTTTACT 261
9C TTTATGGAGCCTGCTAATGTTACAGTTCCAGAATCTGTTGACTGGAGGGAAAAAGGAGCAGTAACTCCTGTAAAATACCCAGGACAGTGT 450
A1 TTTATGGAGCCTGCTAATGTTACAGTTCCAGAATCTGTTGACTGGAGGGAAAAAGGAGCAGTAACTCCTGTAAAATACCAAGGACAGTGT 351
9C GCTTCTTGCTTGGCTTTTTCACCTACTGGTGCCTTGGAAAGTCAAACTTTCAGAAAAACAGGAAAGCTCATTTCTTTGAGTGAACAAAAC 540
A1 GCTTCTTGCTTGGCTTTTTCACCTACTGGTGCCTTGGAAAGTCAAACTTTCAGAAAAACAGGAAAGCTCATTTCTTTGAGTGAACAAAAC 441
9C TTGATTGATTGTTCCGGTGAATATGGAAATTTAGGATGCAAAGGGGGATGGATAAGCCAAGCTTTTGAGTATATCAAGGATAACAAAGGA 630
A1 TTGATTGATTGTTCCGGTGAATATGGAAATTTAGGATGCAATGGAGGATTAATGGA------ 497
9C ATTGACACTGAAAATAAATATCATTATGAAGCTAAAGAAAATTTCTGTCGTGATAATCCAAGAAACCGAGGTGCAGTTGCCCTTGGCTTT 720
A1 ------
9C GTCAATATTCCATCTGGGGAAGAAGATAAACTTAAGGCAGCTGTTGCCACGGTTGGACCTGTTTCCGCTGTTATTGATGTCTCTCATGAA 810
A1 ------
9C GGTTTTCAATTCTATTCTAAGGGTGTTTACTATGAGCCATCATGTAAAACATCATTTGAACACCTAAACCACGAAGTTCTTGTAATTGGC 900
A1 ------
9C TGTGGTTCTGATAATGGCGAAGACTATTGGCTCGTTAAAAACTCATGGTCTAAGCACTGGGGAGACGAAGGGTACCTCAAGATTGCTCGC 990
A1 ------
9C AATCGCAAGAACCATTGTGGTGTTGCTACTGCAGCTCTCTATCCAATTGTATAGATAGGGTTGTGGTACTTTTTGTGATGTGTGTAATTG 1080
A1 ------
9C ACCACGGTACATCT 1094
A1 ------
A. Artemia franciscana embryo cathepsin L cDNA
[Map, sites in the order labeled in the original figure: m, E, m, HIII, HIII, PstI, stop]
B. Artemia franciscana cathepsin L genomic clone 9C (CL-2 gene)
[Map, sites in the order labeled in the original figure: m, stop, m, HIII, PstI, stop]
Fig. 6. Comparison of gene structure of Artemia cathepsin L cDNA and genomic
clone 9C.
E indicates EcoR I site. HIII indicates Hind III site. PstI indicates Pst I site, m indicates
potential translation start sites (methionine), and stop indicates stop codon. Panel A
represents the structure of Artemia embryo cathepsin L cDNA, which contains one
EcoR I site, two Hind III sites and one Pst I site. The cDNA has one open reading
frame of 1014 bp. Panel B represents the structure of genomic phage clone 9C (CL-2
gene), which contains one Hind III site, one Pst I site and two stop codons; it
contains two open reading frames of 328 bp and 680 bp, respectively.
shown in Fig 6. Since all clones isolated from the EMBL3 phage library have the same
restriction digestion pattern using EcoR I and Hind III, they all appear to be identical to
clone 9C. The sequence contained in genomic clone 9C has now been designated as
Artemia franciscana cathepsin L-2 gene (CL-2 gene).
4. PCR analysis of the Artemia franciscana phage DNA
library.
Since we were unable to isolate a phage from the EMBL3 library with a sequence
identical to embryo cDNA coding for cathepsin L, we decided to use PCR
to determine if there is a CL gene in the phage (EMBL3) library matching the Artemia
embryo cDNA sequence. This alternative to conventional screening of a phage
library should detect and amplify the desired CL gene sequence if present in the library.
The Artemia franciscana genomic DNA library constructed in EMBL3 was purified as
described in the methods. The primers chosen were designed based on the embryo CL
cDNA sequence and were designated CLF10 and CLR8 (see Appendix 1 and 2). The
sequence between the two primers covers the prepro- and mature regions of the
protease. These primers were also efficient in amplifying the Artemia CL-2 gene, as they
contain sequence similar to that of the CL-2 gene. The results in Figure 7 show that one band
of about 800 bp was produced using PCR and total genomic DNA in the EMBL3
library. The PCR product shown in Fig 7 was cloned in vector pCR 2.1, and several
white colonies were analyzed using EcoR I digestion and probing with 32P-labeled CL
cDNA. As shown in Fig 8, each of the four clones analyzed contained an insert of about
800 bp, and all were lacking the internal EcoR I site found in Artemia embryo CL
cDNA. All four clones were sequenced and showed about 97% identity, including the
expected amount of polymorphisms. The sequence data in Figure 9 compare genomic
clone 818 obtained using PCR with clone 9C obtained by screening a genomic DNA
library prepared in EMBL3, while the sequence data in Figure 10 compare genomic
clone 818 obtained using PCR with CL cDNA isolated from an Artemia embryo cDNA
library constructed in phage λZAPII. All four clones compared well (99% identical)
[Figure 7: gel image, lanes 1 and 2. Ladder markers labeled at 1000 and 750 bp; PCR product at about 800 bp.]
Fig. 7. PCR analysis of total DNA from Artemia franciscana EMBL3 genomic DNA library.
PCR products were separated by electrophoresis on 1% agarose, and stained with
ethidium bromide as described in methods. Lane 1 is 1 kb ladder, and lane 2 is the PCR
product generated from the Artemia EMBL3 genomic library using primer pair CLF10 and
CLR8 (see Appendix 1 and 2).
[Figure 8: gel image, lanes 1 to 5. Ladder markers labeled at 4000, 1000 and 750 bp; labeled bands: vector (pCR2.1) and insert (about 800 bp).]
Fig. 8. EcoR I restriction endonuclease digestion of PCR generated fragments
from EMBL3 DNA cloned into pCR2.1.
Lane 1, 1 kb ladder; lane 2, clone 818; lane 3, clone 819; lane 4, clone 820; lane 5,
clone 821. All four clones have inserts of about 800 base pairs.
Fig. 9. Alignment of PCR derived clone 818 with Artemia cathepsin L genomic
clone 9C (CL-2 gene).
The sequence of Artemia cathepsin L genomic clone 9C (AY557372) was obtained
from the GenBank database. The alignment was performed using the program Clustal W
(1.82). Asterisks indicate identical base pairs. Dashes indicate regions not sequenced.
The sequences used as primers (CLF10 and CLR8) in PCR are underlined.
Figure 9.
9C  CAAATGAAGCAGATTACTTTGACATATTTACTAACAGCTGTAATGATATTTTTACTGTCA 60
818 ------
9C  GTTGTACTTGTGCAGTTAAGTGCTACACAATCACAGTCAAATTTGCTTGCTGATGAATGG 120
818 ------
9C  TATCTATTCAAGGCTAGACACAAGAAAGATTATCCAAGCCAACTTGAGGAAAAATTTAGA 180
818 -ATCTATTCAAGGCTACACACAAGAAAGATTATCCAAGCCAACTTGAGGAAAAATTTAGA 59
     *************** *******************************************
9C  ATGAAGATTTATTTTGAAAATAAAGACAAAATTGCCAAACATAACATCCTTTATGAGAAA 240
818 ATGAAGATTTATTTTGAAAATAAAGACAAAATTGCCAAACATAACATCCTTTATGAGAAA 119
    ************************************************************
9C  GGCGAAAAGTCTTATCAAGTTGCAATGAATCAGTTTGGAGATCTTCTTCATCATGAATTT 300
818 GGCGAAAAGTCTTATCAAGTTGCAATGAATCAGTTTGGAGATCTTCTTCATCATGAATTT 179
9C  ACATCTATCATGATTGGATATAAGAAATGAACTTCACCCTTTGCTAAGAGCACTTTTACT 360
818 ACATCTATCATGATTGGATATAAGAAACGAACTTCACCCTTTGCTAAGAGCACTTTTACT 239
    *************************** ********************************
9C  TTTATGGAGCCTGCTAATGTTACAGTTCCAGAATCTGTTGACTGGAGGGAAAAAGGAGCA 420
818 TTTATGGAGCCTGCTAATGTTACAGTTCCAGAATCTGTTGACTGGAGGGAAAAAGGAGCA 299
9C  GTAACTCCTGTAAAATACCCAGGACAGTGTGCTTCTTGCTTGGCTTTTTCACCTACTGGT 480
818 GTAACTCCTGTAAAATACCAAGGACAGTGTGCTTCTTGCTTGGCTTTTTCACCTACTGGT 359
9C  GCCTTGGAAAGTCAAACTTTCAGAAAAACAGGAAAGCTCATTTCTTTGAGTGAACAAAAC 540
818 GCCTTGGAAAGTCAAACTTTCAGAAAAACAGGAAAGCTCATTTCTTTGAGTGAACAAAAC 419
9C  TTGATTGATTGTTCCGGTGAATATGGAAATTTAGGATGCAAAGGGGGATGGATAAGCCAA 600
818 TTGATTGATTGTTCCGGTGAATATGGAAATTTAGGATGCAAAGGGGGATGGATAAGCCAA 479
9C  GCTTTTGAGTATATCAAGGATAACAAAGGAATTGACACTGAAAATAAATATCATTATGAA 660
818 GCTTTTGAGTATATCAAGGATAACAAAGGAATTGACACTGAAAATAAATATCATTATGAA 539
9C  GCTAAAGAAAATTTCTGTCGTGATAATCCAAGAAACCGAGGTGCAGTTGCCCTTGGCTTT 720
818 GCTAAAGAAAATTTCTGTCGTGATAATCCAAGAAACCGAGGTGCAGTTGCCCTTGGCTTT 599
9C  GTCAATATTCCATCTGGGGAAGAAGATAAACTTAAGGCAGCTGTTGCCACGGTTGGACCT 780
818 GTCAATATTCCATCTGGGGAAGAAGATAAACTTAAGGCAGCTGTTGCCACGGTTGGACCT 659
9C  GTTTCCGCTGTTATTGATGTCTCTCATGAAGGTTTTCAATTCTATTCTAAGGGTGTTTAC 840
818 GTTTCCGCTGTTATTGATGTCTCTCATGAAGGTTTTCAATTCTATTCTAAGGGTGTTTAC 719
    ************************************************************
9C  TATGAGCCATCATGTAAAACATCATTTGAACACCTAAACCACGAAGTTCTTGTAATTGGC 900
818 TATGAGGCATCATGTAAAACATCATTTGAACACCTAAACCACGCAGTTCTTGTAATTGGC 779
    ****** ************************************ ****************
9C  TGTGGTTCTGATAATGGCGAAGACTATTGGCTCGTTAAAAACTCATGGTCTAAGCACTGG 960
818 TGTGGTTCTGATAATGGCGAAGACTAT------ 806
9C  GGAGACGAAGGGTACCTCAAGATTGCTCGCAATCGCAAGAACCATTGTGGTGTTGCTACT 1020
818 ------
9C  GCAGCTCTCTATCCAATTGTATAGATAGGGTTGTGGTACTTTTTGTGATGTGTGTAATTG 1080
818 ------
9C  ACCACGGTACATCT 1094
818 ------
Fig. 10. Sequence alignment of PCR derived clone 818 from EMBL3 genomic DNA library with Artemia embryo cathepsin L cDNA.
The sequence of Artemia embryo cathepsin L cDNA (AF147207) was obtained from
the GenBank database. The alignment was performed using Clustal W (1.82).
Asterisks indicate identical base pairs. Dashes at bp 318-323 in cDNA indicate missing
sequence in clone 818. Other dashes indicate regions not sequenced. The EcoR I
restriction site in Artemia embryo CL cDNA is boxed. See Fig 4 for a similar
comparison of sequences. The sequences used as primers in PCR (CLF10 and CLR8)
are underlined.
Figure 10.
EcDNA1 CATCTTGTGGCAGACAATTACACAATGAAGCAGATTACTTTGATATTTTTACTGGGAGCT 60
818    ------
EcDNA1 GTACTTGTGCAGTTAAGTGCTGCACTATCACTGACAAATTTACTTGCTGATGAATGGCAT 120
818    ----------------------------------------------------------AT 2
                                                                 **
EcDNA1 CTATTCAAGGCTACACACAAGAAAGAATATCCAAGCCAACTTGAGGAGAAATTTAGAATG 180
818    CTATTCAAGGCTACACACAAGAAAGATTATCCAAGCCAACTTGAGGAAAAATTTAGAATG 62
       ************************** ******************** ************
EcDNA1 AAGATTTATTTGGAAAATAAACACAAAGTTGCCAAACATAACATCCTTTATGAAAAAGGC 240
818    AAGATTTATTTTGAAAATAAAGACAAAATTGCCAAACATAACATCCTTTATGAGAAAGGC 122
       *********** ********* ***** ************************* ******
EcDNA1 GAAAAGTCTTATCAAGTCGCAATGAATAAGTTTGGAGATCTTCTTCATCATGAATTTAGA 300
818    GAAAAGTCTTATCAAGTTGCAATGAATCAGTTTGGAGATCTTCTTCATCATGAATTTACA 182
       ***************** ********* ****************************** *
EcDNA1 TCTATCATGAATGGATACCAACATAAGAAACA|GAATTC|CTCAAGAGCTGAGAGCACTTTC 360
818    TCTATCATGATTGGATA------TAAGAAACG AACTTC ACCCTTTGCTAAGAGCACTTTT 236
       ********** ******      ********   * ***   *    *** **********
EcDNA1 ACTTTTATGGAGCCTGCTAATGTTGAAGTTCCAGAATCTGTTGACTGGAGGGTAAAAGGA 420
818    ACTTTTATGGAGCCTGCTAATGTTACAGTTCCAGAATCTGTTGACTGGAGGGAAAAAGGA 296
       ************************  ************************** *******
EcDNA1 GCCATAACTCCTGTAAAAGACCAAGGACAGTGTGGTTCATGCTGGGCTTTCTCATCTACT 480
818    GCAGTAACTCCTGTAAAATACCAAGGACAGTGTGCTTCTTGCTTGGCTTTTTCACCTACT 356
       **  ************** *************** *** **** ****** *** *****
EcDNA1 GGTGCCTTGGAAGGTCAAACCTTCAGAAAAACAGGGAAGCTCATTTCTTTGAGTGAACAG 540
818    GGTGCCTTGGAAAGTCAAACTTTCAGAAAAACAGGAAAGCTCATTTCTTTGAGTGAACAA 416
       ************ ******* ************** ***********************
EcDNA1 AACTTGATTGATTGTTCTGGAAAATATGGAAATGAAGGATGCAATGGAGGATTAATGGAC 600
818    AACTTGATTGATTGTTCCGGTGAATATGGAAATTTAGGATGCAAAGGGGGATGGATAAGC 476
       ***************** **  ***********  ********* ** ****  **   *
EcDNA1 CAAGCTTTCCAGTATATCAAGGATAACAAGGGAATTGACACTGAAAATACGTACCCTTAT 660
818    CAAGCTTTTGAGTATATCAAGGATAACAAAGGAATTGACACTGAAAATAAATATCATTAT 536
       ********  ******************* *******************  ** * ****
EcDNA1 GAAGCTGAAGACAATGTCTGTCGTTATAATCCAAGGAACCGAGGTGCCATTGACCGTGGC 720
818    GAAGCTAAAGAAAATTTCTGTCGTGATAATCCAAGAAACCGAGGTGCAGTTGCCCTTGGC 596
       ****** **** *** ******** ********** ***********  *** ** ****
EcDNA1 TTTGTCCATATCCCATCTGGAGAAGAAGATAAGCTTAAGGCAGCTGTTGCCACTGTTGGA 780
818    TTTGTCAATATTCCATCTGGGGAAGAAGATAAACTTAAGGCAGCTGTTGCCACGGTTGGA 656
       ****** **** ******** *********** ******************** ******
EcDNA1 CCTGTATCTGTTGCCATCGATGCCTCTCATGAAAGTTTCCAATTCTATTCTAAAGGTGTT 840
818    CCTGTTTCCGCTGTTATTGATGTCTCTCATGAAGGTTTTCAATTCTATTCTAAGGGTGTT 716
       ***** ** * **  ** **** ********** **** ************** ******
EcDNA1 TACTATGAGCCATCATGTGAC------TCTGATGACCTAGACCACGGAGTTCTTGTGGTT 894
818    TACTATGAGGCATCATGTAAAACATCATTTGAACACCTAAACCACGCAGTTCTTGTAATT 776
       ********* ******** *       * ***  ***** ****** *********  **
EcDNA1 GGCTATGGTTCTGATAATGGCAAAGACTATTGGCTCGTTAAAAACTCGTGGTCTGAGCAC 954
818    GGCTGTGGTTCTGATAATGGCGAAGACTAT------ 806
       **** **************** ********
EcDNA1 TGGGGAGACGAAGGGTATATCAAGATTGCTCGCAATCGCAAGAACCATTGTGGTATTGCT 1014
818    ------
EcDNA1 ACTGCAGCTAGCTATCCACTTGTATAGATAGGGTTGTGGTAATTTTTGTGGATGTGTGTA 1074
818    ------
EcDNA1 ATTGCATACGTTAAATTCTTATTCTCTTGATAGGTTTAGAGAGTTCTAGTTTTCAGTTTG 1134
818    ------
EcDNA1 ATTCCGTAGATGACAGATTTTGTGACCATATTCGAGAATAAAGCGTTTTTTTTACCTAAA 1194
818    ------
EcDNA1 AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA 1229
818    ------
with Artemia genomic clone 9C, but less so (87%) with Artemia embryo cathepsin L
cDNA. The results showed that all four clones giving positive signals with 32P-labeled
CL cDNA probe contained sequence identical to the Artemia franciscana CL-2 gene, but
none were identical with Artemia embryo cDNA. These results suggest that the
genomic DNA library in EMBL3 is devoid of the CL-1 gene for reasons discussed
later.
5. Attempts to amplify Artemia franciscana cathepsin L-1 and
L-2 genes from different preparations of genomic DNA
Genomic DNA was prepared in our lab from nauplii of Artemia franciscana and
used as template to search for the cathepsin L genes, using PCR primers CLF13 and
CLR18 (see Appendix 1 and 2), synthesized from the sequence of Artemia embryo
cathepsin L cDNA. DNA prepared from genomic clone 9C and the EMBL3 library was
also used as template for comparison. Primers were designed to distinguish between
Artemia embryo CL cDNA and genomic clone 9C, representing the CL-1 gene and
the CL-2 gene, respectively. The PCR results shown in Figure 11 indicate that only
genomic DNA prepared from Artemia nauplii gave a PCR product. DNA prepared
from clone 9C and the EMBL3 library total DNA did not yield any PCR products using
this primer pair, while genomic DNA yielded a product of about 600 bp as
expected. The PCR-derived DNA fragment (lane 4, Fig 11) was purified and cloned
into vector pCR 2.1. It gave a positive signal when hybridized with 32P-labeled CL
cDNA, and when treated with EcoR I, it yielded one detectable band of the size
predicted from data in Figure 11 (see Figure 12). Sequencing yielded a product of 565
bp that was nearly identical (97%) with the Artemia franciscana embryo cathepsin L
cDNA sequence (AF147207) (see Fig 13). The sequence of the genomic clone also
contained one EcoR I restriction digestion site and two Hind III sites at the same
positions as in Artemia embryo CL cDNA. These results indicate that while freshly
prepared Artemia genomic DNA contains the CL-1 gene as predicted, the genomic
DNA library constructed in EMBL3 is deficient in the gene coding for CL-1 (for
[Gel image. Lanes 1-4; size markers at 750 bp and 500 bp; PCR product at about 600 bp; primer dimer band indicated.]
Fig. 11. PCR products from Artemia genomic DNA prepared from various sources.
PCR products were separated by electrophoresis on 1% agarose, and stained with
ethidium bromide as described in methods. The primers used in the PCR were CLF13
and CLR18 (see Appendix 1 and 2). Lane 1, 1 Kb ladder; lane 2, PCR reaction from
Artemia genomic 9C clone (CL-2 gene); lane 3, PCR reaction from total Artemia DNA
in EMBL3; lane 4, PCR product from Artemia franciscana genomic DNA. Only
Artemia genomic DNA yielded a product of around 600 bp with the primers used.
Betaine (10 µl) was added to each reaction to increase the amount of the products.
[Gel image. Lanes 1-3; size markers at 4000, 1000, 750, 500 and 250 bp; insert band indicated.]
Fig. 12. EcoR I digestion of PCR product shown in Fig 11 after cloning in pCR2.1.
Lane 1, 1 kb ladder; lane 2, plasmid containing Artemia embryo cathepsin L cDNA;
lane 3, Artemia genomic DNA clone (4271) derived from PCR as shown in Fig 11.
Fig. 13. Sequence alignment of DNA genomic clone 4271 with Artemia embryo
cathepsin L cDNA.
The sequence of Artemia embryo cathepsin L cDNA (AF147207) was obtained from
the GenBank database. The alignment was performed using the Clustal W program
(1.82). The sequence of genomic DNA clone (4271) was 97% identical with the Artemia
embryo cathepsin L cDNA sequence. Asterisks indicate identical bases. Dashes
indicate regions not sequenced. The EcoR I restriction site is boxed. Hind III
restriction sites are indicated with light underline. The sequences used as primers in
PCR are indicated with bold underline. The arrowhead indicates potential cleavage
site for mature region.
Figure 13.
EcDNAl CATCTTGTGGCAGACAATTACACAATGAAGCAGATTACTTTGATATTTTTACTGGGAGCTGTACTTGTGCAGTTAAGTGCTGCACTATCA 90
4271 ......
EcDNAl CTGACAAATTTACTTGCTGATGAATGGCATCTATTCAAGGCTACACACAAGAAAGAATATCCAAGCCAACTTGAGGAGAAATTTAGAATG 180
4271 ......
EcDNAl AAGATTTATTTGGAAA&TAAACACAAAGTTGCCAAACATAACATCCTTTATGAAAAAGGCGAAAAGTCTTATCAAGTCGCAATGAATAAG 270
4271 ......
EcDNAl TTTGGAGATCTTCTTCATCATGAATTTAGATCTATCATGAATGGATACCAACATAAGAAACA^AATT^CTCAAGAGCTGAGAGCACTTTC 360
4271 ...... CCAACATAAGAAACAfgAATTCtCTCAAGAQCTGAQAQTACTTTC 43
EcDNAl ACTTTTATGGAGCCTGCTAATGTTGAAGTTCCAGAATCTGTTGACTGGAGGGTAAAAGGAGCCATAACTCCTGTAAAAGACCAAGGACAG 450
4271 ACTTTTATGGAGCCTGCTAATGTTGAAGTTCCAGAATCTGTTGACTGGAGGGAAAAAGGAGCCATAACTCCTGTAAAGGACCAAGGACAG 133
EcDNAl TGTGGTTCATGCTGGGCTTTCTCATCTACTGGTGCCTTGGAAGGTCAAACCTTCAGAAAAACAGGGAAGCTCATTTCTTTGAGTGAACAG 540
4271 TGTGGTTCATGCTGGGCTTTCTCATCTACTGGTGCCCTGGAAGGTCAAACCTTCAGAAAAACAGGGAAGCTCATTTCTTTGAGTGAACAG 223
EcDNAl AACTTGATTGATTGTTCTGGAAAATATGGAAATGAAGGATGCAATGGAGGATTAATGQACCAAGCTTTCCAGTATATCAAGGATAACAAG 630
4271 AACTTGATTGATTGTTCTGQAAAATATQGAAATGAAGGATGCAATGGAGQATTGATGGACCAAGCTTTCCAGTATATCAAGGATAACAAG 313
EcDNAl GGAATTGACACTGAAAATACGTACCCTTA7GAAGCTGAAGACAATGTCTGTCGTTATAATCCAAGGAACCGAGGTGCCATTGACCGTGGC 720
4271 GGAATTGACACTGAAAATACGTATCCTTATGAAGCTGAAGACGATGTCTGTCGTTATAATCCAAGGAACCGAGGTGCAGTTGACCGCGGC 403
EcDNAl TTTQTCCATATCCCATCTOGAGAAGAAGATAAQCTTAAGGCAGCTGTTGCCACTGTTGQACCTQTATCTGTTGCCATCGATGCCTCTCAT 810
4271 TTTGTCGATATCCCATCTGGAGAAGAAGATAAGCTTAAGGCAGCTGTTGCCACGGTTGGACCTGTATCTGTTGCCATCGATGCCTCTCAT 493
EcDNAl GAAAGTTTCCAATTCTATTCTAAAGGTGTTTACTATGAGCCATCATGTGACTCTGATGACCTAGACCACGGAGTTCTTGTGGTTGGCTAT 900
4271 GAAAGTTTCCAATTCTATTCTAAAGGTGTTTAgTATGAGCCATgATQTGAgTgTGATGACCTAGACCACGGA...... 565
EcDNAl GGTTCTGATAATGGCAAAGACTATTGGCTCGTTAAAAACTCGTGGTCTGAGCACTGGGGAGACGAAGGGTATATCAAGATTGCTCGCAAT 990
4271 ......
EcDNAl CGCAAGAACCATTGTGGTATTGCTACTGCAGCTAGCTATCCACTTGTATAGATAGGGTTGTGGTAATTTTTGTGGATGTGTGTAATTGCA 1080
4271 ......
EcDNAl TACGTTAAATTCTTATTCTCTTGATAGGTTTAGAGAGTTCTAGTTTTCAGTTTGATTCCGTAGATGACAGATTTTGTGACCATATTCGAG 1170
4271 ......
EcDNAl AATAAAGCGTTTTTTTTACCTAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA 1229
4271 ......
possible reasons discussed later). These observations explain the inability of at least
three individuals in our lab to isolate the CL-1 gene from the Artemia genomic library
constructed in EMBL3.
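The restriction-site comparison described in this section (one EcoR I site and two Hind III sites at the same positions in the clone and the cDNA) can be illustrated with a minimal Python sketch. The function names and the two toy sequences are hypothetical and are not taken from the thesis; only the recognition sequences GAATTC (EcoR I) and AAGCTT (Hind III) are standard.

```python
import re

# Recognition sequences for the two enzymes named in the text.
# Both sites are palindromic, so scanning one strand is sufficient.
RECOGNITION_SITES = {"EcoRI": "GAATTC", "HindIII": "AAGCTT"}

def find_sites(seq, sites=RECOGNITION_SITES):
    """Return {enzyme: [1-based start positions]} for each recognition site."""
    seq = seq.upper()
    return {
        enzyme: [m.start() + 1 for m in re.finditer(f"(?={site})", seq)]
        for enzyme, site in sites.items()
    }

def same_site_pattern(seq_a, seq_b):
    """True if both sequences carry every site at the same positions."""
    return find_sites(seq_a) == find_sites(seq_b)

# Hypothetical toy sequences (assumptions for illustration only).
toy_cdna = "AAAAGAATTCCCCCAAGCTTGGAAGCTT"
toy_clone = "AAAAGAATTCCCTCAAGCTTGGAAGCTT"  # one substitution outside the sites

print(find_sites(toy_cdna))
print(same_site_pattern(toy_cdna, toy_clone))
```

Comparing positions this way only makes sense when both sequences are numbered from the same starting base, as they are in an alignment such as Fig 13.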
6. Isolation of a cathepsin L cDNA from an Artemia adult
cDNA library
In an attempt to determine whether the Artemia CL-2 gene is functional, an
Artemia franciscana adult cDNA library in λgt11 was analyzed using PCR to look for a
cDNA matching the CL-2 gene sequence. Total DNA from the adult cDNA library in
λgt11 was isolated following the procedures described in the methods, and the DNA
was used as template in a PCR reaction. This approach avoided the need to screen large
numbers of phage plaques, which is very time consuming and not always successful.
The PCR primers were designed based on the sequence of Artemia genomic clone 9C
as shown in Appendix 1 and 3. Primers CL9CF1 and CLR11 covered the prepro-region
of the CL-2 gene, while primers CLF11 and CLR10b covered part of the region coding
for the mature protease of the CL-2 gene. Artemia genomic clone 9C (CL-2 gene) was
used as a positive control substrate. The PCR was done successfully and the products
were separated by electrophoresis on a 1% agarose gel, stained with ethidium bromide
and visualized using UV transillumination. The results in Figure 14 show that the adult
cDNA library contained a cDNA sequence identical in size to that found using DNA
from genomic clone 9C. Primer pair CL9CF1 and CLR11 yielded a fragment of
around 300 bp, while primer pair CLF11 and CLR10b yielded a product of about 600
bp. The PCR products were purified, cloned into vector pCR 2.1, and those clones that
gave a signal with 32P-labeled CL cDNA were sequenced.
Sequences of the two PCR products derived from Artemia adult cDNA (see Fig
14) were aligned with Artemia cathepsin L genomic clone 9C as shown in Fig 15.
Overall, the clones were 97% identical with genomic clone 9C (CL-2) and had a
restriction pattern identical to that found in clone 9C. These results demonstrate that
[Gel images. Panel A: lanes 1-3; size markers at 750 bp and 500 bp; product at about 600 bp. Panel B: lanes 1-3; size markers at 500 bp and 250 bp; product at about 300 bp.]
Fig. 14. Comparison of PCR products obtained from an Artemia adult cDNA library and Artemia clone 9C representing the CL-2 gene.
PCR products were separated by electrophoresis on 1% agarose, and stained with
ethidium bromide as described in methods. Panel A shows PCR products obtained
using primers CLF11 and CLR10b. Lane 1, 1 kb ladder; lane 2, genomic clone 9C; and
lane 3, Artemia adult cDNA library. Both the control (genomic clone 9C) and adult
cDNA library yielded a similar size product of about 600 bp. Panel B shows PCR
products obtained using primers CL9CF1 and CLR11. Lanes 1, 2 and 3 represent the
same DNAs as given for panel A. The PCR product was about 300 bp in both lanes 2
and 3.
Fig. 15. Comparison of DNA sequences derived from an Artemia franciscana
adult cDNA library and Artemia CL-2 gene (clone 9C) by PCR.
The sequence of Artemia cathepsin L genomic clone 9C (CL-2 gene) (AY557372)
was obtained from the GenBank database. Products obtained from the two PCR
experiments shown in Fig 14 were cloned into pCR2.1, sequenced, then combined for
presentation here. In total, 874 bp of sequence was obtained using Artemia adult
cDNA as substrate. The alignment was performed using the program Clustal W (1.82).
The sequence of clones derived from Artemia adult cDNA was 97% identical with the
Artemia CL-2 gene. Asterisks indicate identical bases. Dashes indicate regions not
sequenced yet. The sequences used as primers in PCR are underlined. The
overlapping area of the two clones is indicated with bold letters.
Figure 15.
9C CAAATGAAGCAGATTACTTTGACATATTTACTAACAGCTGTAATGATATTTTTACTGTCA 60
ACDNA2 ------GCAGATTACTTTGACATATTTACTGGAAGCTGTACTGATATTTTTACTGTCA 52
************************ ******* *****************
9C GTTGTACTTGTGCAGTTAAGTGCTACACAATCACAGTCAAATTTGCTTGCTGATGAATGG 120
ACDNA2 GTTGTACTTGTGCAGTTAAGTGCTACACAATCACAGTCAAATTTGCTTGCTGATGAATGG 112
************************************************************
9C TATCTATTCAAGGCTAGACACAAGAAAGATTATCCAAGCCAACTTGAGGAAAAATTTAGA 180
ACDNA2 TATCTATTCAAGGCTAGACACAAGAAAGATTATCCAAGCCAACTTGGGGAAAAATTTAGA 172
********************************************** *************
9C ATGAAGATTTATTTTGAAAATAAAGACAAAATTGCCAAACATAACATCCTTTATGAGAAA 240
ACDNA2 ATGAAGATTTATTTTGGAAATAAAGACAAAATTGCCAAACATAACATCCTTTATGAGAAA 232
**************** *******************************************
9C GGCGAAAAGTCTTATCAAGTTGCAATGAATCAGTTTGGAGATCTTCTTCATCATGAATTT 300
ACDNA2 GGCGAAAAGTCTTATCAAGTTGCAATGAATCAGTTTGGAGATCTTCTTCATCATGAATTT 292
************************************************************
9C ACATCTATCATGATTGGATATAAGAAATGAACTTCACCCTTTGCTAAGAGCACTTTTACT 360
ACDNA2 ACATCTATCATGATTGGATATAAGAAATGAACTTCACCCTTTGCTAAGAGCACTTTTACT 352
************************************************************
9C TTTATGGAGCCTGCTAATGTTACAGTTCCAGAATCTGTTGACTGGAGGGAAAAAGGAGCA 420
ACDNA2 TTTATGGAGCCTGCTAACGTTACAGTTCCAGAATCTGTTGACTGGAGGGAAAAAGGAGCA 412
***************** ******************************************
9C GTAACTCCTGTAAAATACCCAGGACAGTGTGCTTCTTGCTTGGCTTTTTCACCTACTGGT 480
ACDNA2 GTAACTCATGTAAAATACCAAGGACAGTGTGCTTCTTGCTGGGCTTTTTCATCTACTGGT 472
******* *********** ******************** ********** ********
9C GCCTTGGAAAGTCAAACTTTCAGAAAAACAGGAAAGCTCATTTCTTTGAGTGAACAAAAC 540
ACDNA2 GCCTTGAAAAGTCAAACTTTCAGAAAAACAGGAAAGCTCATTTCTTTGAGTGAACAGAAC 532
****** ************************************************* ***
9C TTGATTGATTGTTCCGGTGAATATGGAAATTTAGGATGCAAAGGGGGATGGATAAGCCAA 600
ACDNA2 TTGATTGATTGTTCCGGTGAATATGGAAATTTAGGATGCAAAGAGGGATGGATAAGCCAA 592
******************************************* ****************
9C GCTTTTGAGTATATCAAGGATAACAAAGGAATTGACACTGAAAATAAATATCATTATGAA 660
ACDNA2 GCTTTTGAGTATATCAAGGATAACAAAGGAATTGACACTGAAAATAAATATCATTATGAA 652
************************************************************
9C GCTAAAGAAAATTTCTGTCGTGATAATCCAAGAAACCGAGGTGCAGTTGCCCTTGGCTTT 720
ACDNA2 GCTAAAGAAAATTTCTGTCGTGATAATCCAAGAAACCGAGGTGCAATTGCCCTTGGCTTT 712
********************************************* **************
9C GTCAATATTCCATCTGGGGAAGAAGATAAACTTAAGGCAGCTGTTGCCACGGTTGGACCT 780
ACDNA2 GTCAATATTCAATCTGGGGAAGAAGATAAACTTCAGGCAGCTGTTGCCACGGTTGGACCT 772
********** ********************** **************************
9C GTTTCCGCTGTTATTGATGTCTCTCATGAAGGTTTTCAATTCTATTCTAAGGGTGTTTAC 840
ACDNA2 GTTTCCGCTGTTATTGATGTCTCTCATGAAGGTTTTCAATTCTATTCTAAGGGTGTTTAC 832
************************************************************
9C TATGAGCCATCATGTAAAACATCATTTGAACACCTAAACCACGAAGTTCTTGTAATTGGC 900
ACDNA2 TATGAGCCATCATGTAAAACATCATTTGAACACCTAAACCAC------ 874
******************************************
9C TGTGGTTCTGATAATGGCGAAGACTATTGGCTCGTTAAAAACTCATGGTCTAAGCACTGG 960
ACDNA2
9C GGAGACGAAGGGTACCTCAAGATTGCTCGCAATCGCAAGAACCATTGTGGTGTTGCTACT 1020
ACDNA2
9C GCAGCTCTCTATCCAATTGTATAGATAGGGTTGTGGTACTTTTTGTGATGTGTGTAATTG 1080
ACDNA2
9C ACCACGGTACATCT 1094
ACDNA2
the Artemia franciscana CL-2 gene is expressed in adult Artemia franciscana and
probably represents a functional gene and not a pseudogene. Further, analysis of the
PCR product obtained using the Artemia adult cDNA library indicated two open
reading frames in the cDNA. The results in Fig 16 show the two open reading frames
in the Artemia adult cDNA sequence for cathepsin L, as identified using the ORF Finder at
NCBI, containing 49 and 133 amino acids, respectively.
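As an illustration of how an open reading frame is read off a nucleotide sequence, the following minimal Python sketch translates the first reading frame of the adult cDNA (nucleotides 173 to 322 as printed in Fig 16, panel A) with the standard genetic code. It is a simplified stand-in for the NCBI ORF Finder, not a reimplementation of it; the names CODON_TABLE, translate_orf and ORF_A are hypothetical.

```python
# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

def translate_orf(seq, start=0):
    """Translate codon by codon from `start` (0-based) until a stop codon.

    Unrecognized codons are reported as 'X'; the stop codon is not included.
    """
    seq = seq.upper()
    protein = []
    for i in range(start, len(seq) - 2, 3):
        residue = CODON_TABLE.get(seq[i:i + 3], "X")
        if residue == "*":
            break
        protein.append(residue)
    return "".join(protein)

# First open reading frame of the adult CL cDNA as printed in Fig 16, panel A.
ORF_A = (
    "atgaagatttattttggaaataaagacaaaattgccaaacataac"
    "atcctttatgagaaaggcgaaaagtcttatcaagttgcaatgaat"
    "cagtttggagatcttcttcatcatgaatttacatctatcatgatt"
    "ggatataagaaatga"
)

peptide = translate_orf(ORF_A)
print(peptide, len(peptide))
```

The translation stops at the TGA codon, giving the 49-residue peptide shown under the nucleotide sequence in Fig 16.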
7. Isolation of a cathepsin L cDNA representing the CL-2
gene from the Artemia embryo cDNA library
An Artemia embryo cDNA library in λZAPII was screened previously in our lab
(Butler, 2001), and an Artemia cathepsin L cDNA containing 1229 bp was isolated from
the library. The embryo CL cDNA was about 80% identical with genomic clone 9C
and therefore could not have been derived from that genomic clone. In order to search
for a cDNA in the Artemia embryo cDNA library identical with the CL-2 gene (clone
9C), PCR was performed. Total DNA from the Artemia embryo cDNA
library in λZAPII was converted into the Bluescript phagemid as described previously
(Butler et al. 2001), and 276 ng of DNA was used as substrate in PCR. The vector primer
TP-7F (T7 promoter) was used with internal primer CLR11 in PCR reactions (see
Appendix 1). The internal primer was specific for the CL-2 gene. The PCR products
were analyzed on a 1% agarose gel as shown in Fig 17. The PCR yielded several
products of different sizes because of the use of the T7 promoter primer in the reaction.
The gel was then blotted to a nitrocellulose membrane and probed with
32P-labeled CL cDNA. The membrane was washed and exposed to X-ray film at -80
°C. The PCR product obtained using TP-7F and CLR11 showed a strong positive signal at a band
of about 500 bp (Fig 17). This PCR product was purified using the Wizard DNA
Clean-Up system (Promega), then ligated into plasmid vector pCR 2.1. After
screening of plasmid DNA with 32P-labeled CL cDNA, the clones with a positive signal
were collected and their DNA was isolated using the Wizard Miniprep Kit (Promega), then
sequenced.
Fig. 16. Open reading frames in Artemia adult cathepsin L cDNA sequence.
Panel A, first open reading frame in Artemia adult CL cDNA. The putative translation
start codon, ATG, is underlined. Asterisk indicates the stop codon; Panel B, second
open reading frame in adult CL cDNA. Deduced amino acid sequence of adult CL
cDNA is shown under the nucleotide sequence. The numbers at the left indicate the
positions of the nucleotides and amino acids in the complete sequence.
Figure 16.
A.
173 atgaagatttattttggaaataaagacaaaattgccaaacataac
MKIYFGNKDKIAKHN
218 atcctttatgagaaaggcgaaaagtcttatcaagttgcaatgaat
ILYEKGEKSYQVAMN
263 cagtttggagatcttcttcatcatgaatttacatctatcatgatt
QFGDLLHHEFTSIMI
308 ggatataagaaatga 322
GYKK*
B.
356 atggagcctgctaacgttacagttccagaatctgttgactggagg
MEPANVTVPESVDWR
401 gaaaaaggagcagtaactcatgtaaaataccaaggacagtgtgct
EKGAVTHVKYQGQCA
446 tcttgctgggctttttcatctactggtgccttgaaaagtcaaact
SCWAFSSTGALKSQT
491 ttcagaaaaacaggaaagctcatttctttgagtgaacagaacttg
FRKTGKLISLSEQNL
536 attgattgttccggtgaatatggaaatttaggatgcaaagaggga
IDCSGEYGNLGCKEG
581 tggataagccaagcttttgagtatatcaaggataacaaaggaatt
WISQAFEYIKDNKGI
626 gacactgaaaataaatatcattatgaagctaaagaaaatttctgt
DTENKYHYEAKENFC
671 cgtgataatccaagaaaccgaggtgcaattgcccttggctttgtc
RDNPRNRGAIALGFV
716 aatattcaatctggggaagaagataaacttcaggcagctgttgcc
NIQSGEEDKLQAAVA
761 acggttggacctgtttccgctgttattgatgtctctcatgaaggt
TVGPVSAVIDVSHEG
806 tttcaattctattctaagggtgtttactatgagccatcatgtaaa
FQFYSKGVYYEPSCK
851 acatcatttgaacacctaaaccac 874
TSFEHLNH
[Gel image (panel A) and autoradiogram (panel B). Size markers at 500 bp and 250 bp; hybridization signal indicated.]
Fig. 17. PCR products derived from Artemia embryonic cDNA library.
PCR products were separated by electrophoresis on 1% agarose, and stained with
ethidium bromide as described in methods. Panel A, PCR product using primers TP-7F
and CLR11 (see Appendix 1). Panel B, X-ray film after the Southern blotting of the gel.
Lanes 1 and 2 represent the same DNAs as given for Panel A. In lane 2 the band
containing about 500 bp hybridized strongly with the 32P-labeled CL cDNA probe.
A total of 340 bp of sequence was obtained (excluding sequence associated with the
T7 promoter and vector) and aligned with Artemia cathepsin L genomic clone 9C as
shown in Figure 18. The insert consisting of 340 bp is 97% identical with genomic
clone 9C (CL-2), and demonstrates that the Artemia CL-2 gene is also expressed in
Artemia franciscana embryos. The results in Figure 19 (panel A) show the amino acid
sequence for the open reading frame in the Artemia embryo cDNA sequence representing
the cathepsin L-2 gene. The sequence, consisting of 49 amino acids, compares well with
the first open reading frame of Artemia adult CL cDNA (97%) as shown in Figure 19
(panel B).
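The identity figure quoted for the two 49-residue peptides can be checked with a minimal Python sketch. The two sequences are those printed in Fig 19, panel B; the function name percent_identity and the convention of skipping gapped columns are assumptions made for illustration, and Clustal W may count identity slightly differently.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over aligned columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if a != gap and b != gap]
    if not columns:
        raise ValueError("no comparable columns")
    matches = sum(a == b for a, b in columns)
    return 100.0 * matches / len(columns)

# Deduced peptides as printed in Fig 19, panel B.
embryo_orf = "MKIYFGNKDKIAKHNILYEKGEKSYQVAMNQFGDLLHHESTSIMIGYKK"
adult_orf = "MKIYFGNKDKIAKHNILYEKGEKSYQVAMNQFGDLLHHEFTSIMIGYKK"

mismatches = sum(a != b for a, b in zip(embryo_orf, adult_orf))
print(mismatches, round(percent_identity(embryo_orf, adult_orf), 2))
```

The two peptides differ at a single position (S versus F), i.e. 48 of 49 residues are identical, which is in line with the value of about 97% given in the text.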
8. Identity of 5’ upstream sequences of Artemia franciscana
CL genes
As described above, the Artemia franciscana CL-2 gene was confirmed to be
functional, or at least transcribed, so we attempted to identify the 5’ upstream part of
the CL-2 gene to help understand the transcriptional regulation of Artemia cathepsin L
genes. To obtain the 5’ flanking sequence of the CL-2 gene, PCR was
performed using degenerate primers (OPC-2, OPC-4, OPC-8, OPC-9) with an internal
primer (CLR10) (see Appendix 1) in the coding region of the CL-2 gene. Since genomic
clone 9C, containing an insert of about 8 kb, represented the sequence of the CL-2 gene
in Artemia, it was used as template for the PCR analysis. The PCR reactions were
performed in two consecutive steps according to the nature of the degenerate primers and
as described in the methods. The PCR products were separated by electrophoresis on a
1% agarose gel and we observed that each primer pair yielded one intense band and
several minor bands ranging in size from 250 to 1500 bp (see Fig 20). The complex
pattern was attributed to the degenerate primers, as they might have multiple binding
sites across the whole template DNA.
The gel containing the PCR products was blotted to a nitrocellulose membrane
and probed with 32P-labeled CL cDNA. Among the four primer pairs used in
PCR, only one CL-positive product was observed and this was with primer pair OPC-4
Fig. 18. Comparison of DNA sequences derived from an Artemia franciscana
embryo cDNA library and Artemia CL-2 gene (clone 9C) by PCR.
The sequence of Artemia cathepsin L genomic clone 9C (CL-2 gene) (AY557372)
was obtained from the GenBank database. The alignment was performed using the
program Clustal W (1.82). Asterisks indicate identical bases. Dashes indicate
regions not sequenced yet. The internal sequence used as primer in PCR is underlined
in bold, while the TP-7F primer of the vector is not shown.
Figure 18.
9C CAAATGAAGCAGATTACTTTGACATATTTACTAACAGCTGTAATGATATTTTTACTGTCAGTTGTACTTGTGCAGTTAAGTGCTACACAAT 91
ECDNA2 --CAAGAAGCAGATTACTTTGAAATATTTACTGGAAGCTGTACTGATATTTTTACTGTCAGTTGTACTTGTGCAGTTAAGTGCTACACAAT 90
9C CACAGTCAAATTTGCTTGCTGATGAATGGTATCTATTCAAGGCTAGACACAAGAAAGATTATCCAAGCCAACTTGAGGAAAAATTTAGAA 181
ECDNA2 CACAGTCAAATTTGCTTGCTGATGAATGGTATCTATTCAAGGCTAGACACAAGAAAGATTATCCAAGCCAACTTGGGGAAAAATTTAGAA 180
9C TGAAGATTTATTTTGAAAATAAAGACAAAATTGCCAAACATAACATCCTTTATGAGAAAGGCGAAAAGTCTTATCAAGTTGCAATGAATC 271
ECDNA2 TGAAGATTTATTTTGGAAATAAAGACAAAATTGCCAAACATAACATCCTTTATGAGAAAGGCGAAAAGTCTTATCAAGTTGCAATGAATC 270
9C AGTTTGGAGATCTTCTTCATCATGAATTTACATCTATCATGATTGGATATAAGAAATGAACTTCACCCTTTGCTAAGAGCACTTTTACTT 361
ECDNA2 AGTTTGGAGATCTTCTTCATCATGAATCTACATCTATCATGATTGGATATAAGAAATGAACTTCACCCTTT...... 341
9C TTATGGAGCCTGCTAATGTTACAGTTCCAGAATCTGTTGACTGGAGGGAAAAAGGAGCAGTAACTCCTGTAAAATACCCAGGACAGTGTG 451
ECDNA2
9C CTTCTTGCTTGGCTTTTTCACCTACTGGTGCCTTGGAAAGTCAAACTTTCAGAAAAACAGGAAAGCTCATTTCTTTGAGTGAACAAAACT 541
ECDNA2
9C TGATTGATTGTTCCGGTGAATATGGAAATTTAGGATGCAAAGGGGGATGGATAAGCCAAGCTTTTGAGTATATCAAGGATAACAAAGGAA 631
ECDNA2
9C TTGACACTGAAAATAAATATCATTATGAAGCTAAAGAAAATTTCTGTCGTGATAATCCAAGAAACCGAGGTGCAGTTGCCCTTGGCTTTG 721
ECDNA2
9C TCAATATTCCATCTGGGGAAGAAGATAAACTTAAGGCAGCTGTTGCCACGGTTGGACCTGTTTCCGCTGTTATTGATGTCTCTCATGAAG 811
ECDNA2
9C GTTTTCAATTCTATTCTAAGGGTGTTTACTATGAGCCATCATGTAAAACATCATTTGAACACCTAAACCACGAAGTTCTTGTAATTGGCT 901
ECDNA2
9C GTGGTTCTGATAATGGCGAAGACTATTGGCTCGTTAAAAACTCATGGTCTAAGCACTGGGGAGACGAAGGGTACCTCAAGATTGCTCGCA 991
ECDNA2
9C ATCGCAAGAACCATTGTGGTGTTGCTACTGCAGCTCTCTATCCAATTGTATAGATAGGGTTGTGGTACTTTTTGTGATGTGTGTAATTGA 1081
ECDNA2
9C CCACGGTACATCT 1094
ECDNA2
A.
179 atgaagatttattttggaaataaagacaaaattgccaaacataac
MKIYFGNKDKIAKHN
224 atcctttatgagaaaggcgaaaagtcttatcaagttgcaatgaat
ILYEKGEKSYQVAMN
269 cagtttggagatcttcttcatcatgaatctacatctatcatgatt
QFGDLLHHESTSIMI
314 ggatataagaaatga 328
GYKK*
B.
EcDNA MKIYFGNKDKIAKHNILYEKGEKSYQVAMNQFGDLLHHESTSIMIGYKK 49
AcDNA MKIYFGNKDKIAKHNILYEKGEKSYQVAMNQFGDLLHHEFTSIMIGYKK 49
Fig. 19. Analysis of open reading frame in Artemia embryo cathepsin L-2 cDNA
sequence.
Panel A, open reading frame in Artemia embryo CL-2 cDNA. The putative translation
start codon, ATG, is underlined. The numbers at the left indicate the positions of the
nucleotides and amino acids in the complete sequence. Asterisk indicates the stop codon;
Panel B, comparison of the open reading frame in Artemia embryo CL-2 cDNA with the first
open reading frame in Artemia adult CL-2 cDNA. Asterisks indicate identical
residues.
and CLR10. This primer pair yielded a positive signal for the band at about 400 bp (Fig
20, panel B, lane 5). The band was cut from the gel, purified, then cloned into plasmid
vector pCR 2.1. After screening of plasmid DNA with 32P-labeled CL cDNA, the clones
with a positive signal were purified and treated with EcoR I restriction endonuclease to
check for the presence of inserts, then one of these clones was sequenced. A total of 404 bp
of DNA sequence was obtained, including 260 bp of sequence 5’ to the sequence
determined previously for Artemia clone 9C. The newly obtained 5’ sequence was
combined with the previously determined sequence for clone 9C and compared with
Artemia adult CL cDNA and Artemia embryo CL-2 cDNA sequences as shown in
Figure 22. As expected, most of the 3’ end of the PCR product was nearly identical (97
%) with both adult CL cDNA and embryo CL cDNA derived from the CL-2 gene, but
the 5’ end of the PCR product had a surprisingly new and different sequence. First,
primer CLR10 appeared at both ends of the PCR product, but on opposite
(complementary) strands. Second, the 5’ end of the PCR product contained an open
reading frame with 69 amino acids for a sequence representing (possibly) another gene
(a DEAD-box helicase) as shown in Figure 23. The amino acid sequence of the first
open reading frame is about 30% identical with DEAD-box helicases. The 227 bp 5’
upstream sequence, excluding primers OPC-4 and CLR10 adjacent to one another, was
analyzed by Match-Public 1.0 (core similarity 0.95, matrix similarity 0.95) using the
TRANSFAC 6.0 database (Wingender et al. 2000; Wingender et al. 2001; Matys et al.
2003) to identify putative transcription factor binding sites as shown in Figure 24. Several
transcription factor binding sites were found, such as GATA-1, GATA-3, CDP CR3+HD,
GATA-3, C/EBP, CDP CR1, NF-Y and Lmo2 complex. Some of them have been
identified previously in the promoter region of genes coding for proteases (Bakhshi et
al. 2001).
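The observation that a single primer sequence occurs at both ends of a product on opposite strands can be checked by searching for the primer and for its reverse complement. The Python sketch below is a minimal illustration; the primer and product sequences are hypothetical toy values (the actual CLR10 sequence is given in Appendix 1 and is not reproduced here), and the function names are assumptions.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(COMPLEMENT)[::-1]

def find_all(template, query):
    """1-based start positions of every (possibly overlapping) exact match."""
    positions = []
    i = template.find(query)
    while i != -1:
        positions.append(i + 1)
        i = template.find(query, i + 1)
    return positions

def primer_sites(template, primer):
    """Locate a primer on the given strand and on the complementary strand."""
    template, primer = template.upper(), primer.upper()
    return {
        "forward": find_all(template, primer),
        "reverse": find_all(template, reverse_complement(primer)),
    }

# Hypothetical toy primer and product (assumptions for illustration only):
# the primer at the 5' end and its reverse complement at the 3' end.
toy_primer = "GATTACAGGT"
toy_product = toy_primer + "CCCCC" + reverse_complement(toy_primer)
print(primer_sites(toy_product, toy_primer))
```

A hit in both the "forward" and "reverse" lists, at opposite ends of the product, is the pattern described above for CLR10.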
[Gel image (panel A) and autoradiogram (panel B). Size markers at 1500, 750, 500 and 250 bp; hybridization signal indicated.]
Fig. 20. PCR products obtained using degenerate primers and Artemia genomic
clone 9C as substrate.
Panel A, PCR products were separated by electrophoresis on 1% agarose, and stained
with ethidium bromide as described in methods. Lane 1, 1 kb ladder; lane 2, PCR
products using primers OPC-8 and CLR10; lane 3, PCR products using primers OPC-9
and CLR10; lane 4, PCR products using primers OPC-2 and CLR10; lane 5, PCR
products using primers OPC-4 and CLR10. Panel B, hybridization reaction with
32P-labeled CL cDNA after the Southern blotting of the gel. Lanes 1-5 represent the
same DNAs as given for Panel A. Only one band of about 400 bp derived from use of
OPC-4 and CLR10 as primers showed a positive signal.
Fig. 21. EcoR I digestion of PCR product using OPC-4 and CLR10 as primers shown in Fig 20 after cloning in pCR2.1.
Lane 1, 1 kb ladder; lane 2, DNA clone (9C33) derived from PCR using OPC-4 and
CLR10 as shown in Fig 20.
Fig. 22. Comparison of DNA sequences of Artemia genomic clone 9C and its 5’ upstream sequence with Artemia franciscana adult CL-2 cDNA and embryo CL
cDNA.
The sequence of Artemia cathepsin L genomic clone 9C (CL-2 gene) (AY557372)
(lacking the newly acquired 5’ sequence) was obtained from the GenBank database
and the 5’ upstream sequence obtained in this experiment was added (indicated with
bold letters). The alignment was performed using the program Clustal W (1.82).
Asterisks indicate identical bases. Dashes indicate regions not sequenced. The
sequences used as primers in PCR to obtain the 5’-extended sequence are underlined
in bold. Primer CLR10 sequence appeared at the 5’ end of the sequence next to primer
OPC-4 as shown in the box as well as at position bp 381 - 405 (underlined). The
arrowhead indicates potential cleavage site for mature region. The open reading
frame identified in the 5’ upstream sequence of genomic clone 9C is indicated as light
underline.
Figure 22.
ECDNA2
ACDNA2
9C [TCTTGTSTGTAGCCTTQAATA] 20
ECDNA2
ACDNA2
9C |gat|ccgcatctactggatcagataaaacgctggcccatatgctgccatcgatcgtccaca 80
ECDNA2
ACDNA2
9C TTAAAAACAAGGCAAATCGTAAAAAAGGAGATGGAACCATAGCTTTTATTTCCGCTCAAG 140
ECDNA2
ACDNA2
9C CTAGAGAATTGGCAAAACAGATCAAAGATGTGGCAGAAAAATATGGAGCAGTATCTTGCA 200
ECDNA2
ACDNA2
9C TAAGAGGTACATGTGTCTTCGGTGGGTCTCCAAAGAAAGAAACTGAACATAATACTTGCA 260
ECDNA2 CAA-GAAGCAGATTACTTTGAAATATTTACTGGAAGCTGTACTGATATTTTTACTGTCAG 59
ACDNA2 GCAGATTACTTTGACATATTTACTGGAAGCTGTACTGATATTTTTACTGTCAG 53
9C CAATGAAGCAGATTACCTTGACATATTTACTAACAACTGTAATGATATTTTTACTGTCAG 320
********* **** ********* * ***** ******************
E C DN A 2 TTGTACTTGTGCAGTTAAGTGCTACACAATCACAGTCAAATTTGCTTGCTGATGAATGGT 119
ACDNA2 TTGTACTTGTGCAGTTAAGTGCTACACAATCACAGTCAAATTTGCTTGCTGATGAATGGT 113
9C TTGTACTTGAGCAGTTAAGTGCTACACAATCACAGTCAAATTTGCTTGCTGATGAATGGT 3 80
********* **************************************************
ECDNA2 ATCTATTCAAGGCTAGACACAAGAAAGATTATCCAAGCCAACTTGGGGAAAAATTTAGAA 17 9
ACDNA2 ATCTATTCAAGGCTAGACACAAGAAAGATTATCCAAGCCAACTTGGGGAAAAATTTAGAA 173
9C ATCTATTCAAGGCTACACACAAGAAAGATTATCCAAGCCAACTTGAGGAAAAATTTAGAA 44 0
*************** ***************************** **************
ECDNA2 TGAAGATTTATTTTGGAAATAAAGACAAAATTGCCAAACATAACATCCTTTATGAGAAAG 23 9
ACDNA2 TGAAGATTTATTTTGGAAATAAAGACAAAATTGCCAAACATAACATCCTTTATGAGAAAG 233
9C TGAAGATTTATTTTGAAAATAAAGACAAAATTGCCAAACATAACATCCTTTATGAGAAAG 500
*************** ********************************************
ECDNA2 GCGAAAAGTCTTATCAAGTTGCAATGAATCAGTTTGGAGATCTTCTTCATCATGAATCTA 299
ACDNA2 GCGAAAAGTCTTATCAAGTTGCAATGAATCAGTTTGGAGATCTTCTTCATCATGAATTTA 293
9C GCGAAAAGTCTTATCAAGTTGCAATGAATCAGTTTGGAGATCTTCTTCATCATGAATTTA 560
********************************************************* * *
ECDNA2 CATCTATCATGATTGGATATAAGAAATGAACTTCACCCTTT------3 4 0
ACDNA2 CATCTATCATGATTGGATATAAGAAATGAACTTCACCCTTTGCTAAGAGCACTTTTACTT 353
9C CATCTATCATGATTGGATATAAGAAATGAACTTCACCCTTTGCTAAGAGCACTTTTACTT 620
*****************************************
ECDNA2 ------
ACDNA2 TTATGGAGCCTGCTAACGTTACAGTTCCAGAATCTGTTGACTGGAGGGAAAAAGGAGCAG 413
9C TTATGGAGCCTGCTAATGTTACAGTTCCAGAATCTGTTGACTGGAGGGAAAAAGGAGCAG 680 '
ECDNA2 ------
ACDNA2 TAACTCATGTAAAATACCAAGGACAGTGTGCTTCTTGCTGGGCTTTTTCATCTACTGGTG 473
9C TAACTCCTGTAAAATACCCAGGACAGTGTGCTTCTTGCTTGGCTTTTTCACCTACTGGTG 74 0
ECDNA2 ------
ACDNA2 CCTTGAAAAGTCAAACTTTCAGAAAAACAGGAAAGCTCATTTCTTTGAGTGAACAGAACT 533
9C CCTTGGAAAGTCAAACTTTCAGAAAAACAGGAAAGCTCATTTCTTTGAGTGAACAAAACT 800
ECDNA2 ------
ACDNA2 TGATTGATTGTTCCGGTGAATATGGAAATTTAGGATGCAAAGAGGGATGGATAAGCCAAG 593
9C TGATTGATTGTTCCGGTGAATATGGAAATTTAGGATGCAAAGGGGGATGGATAAGCCAAG 860
ECDNA2 ------
ACDNA2 CTTTTGAGTATATCAAGGATAACAAAGGAATTGACACTGAAAATAAATATCATTATGAAG 653
9C CTTTTGAGTATATCAAGGATAACAAAGGAATTGACACTGAAAATAAATATCATTATGAAG 92 0
EcDNA2 ------
ACDNA2 CTAAAGAAAATTTCTGTCGTGATAATCCAAGAAACCGAGGTGCAATTGCCCTTGGCTTTG 713
9C CTAAAGAAAATTTCTGTCGTGATAATCCAAGAAACCGAGGTGCAGTTGCCCTTGGCTTTG 980
ECDNA2 ------
ACDNA2 TCAATATTCAATCTGGGGAAGAAGATAAACTTCAGGCAGCTGTTGCCACGGTTGGACCTG 773
9C TCAATATTCCATCTGGGGAAGAAGATAAACTTAAGGCAGCTGTTGCCACGGTTGGACCTG 1040
ECDNA2 ------
ACDNA2 TTTCCGCTGTTATTGATGTCTCTCATGAAGGTTTTCAATTCTATTCTAAGGGTGTTTACT 833
9 C TTTCCGCTGTTATTGATGTCTCTCATGAAGGTTTTCAATTCTATTCTAAGGGTGTTTACT 1100
ECDNA2 ------
ACDNA2 ATGAGCCATCATGTAAAACATCATTTGAACACCTAAACCAC------8 7 4
9C ATGAGCCATCATGTAAAACATCATTTGAACACCTAAACCACGAAGTTCTTGTAATTGGCT 116 0
ECDNA2 ------
AcDNA2 ------
9C GTGGTTCTGATAATGGCGAAGACTATTGGCTCGTTAAAAACTCATGGTCTAAGCACTGGG 122 0
ECDNA2 ------
ACDNA2 ------
9C GAGACGAAGGGTACCTCAAGATTGCTCGCAATCGCAAGAACCATTGTGGTGTTGCTACTG 128 0
ECDNA2 ------
ACDNA2 ------
9C CAGCTCTCTATCCAATTGTATAGATAGGGTTGTGGTACTTTTTGTGATGTGTGTAATTGA 134 0
ECDNA2 ------
ACDNA2 ------
9C CCACGGTACATCT 1353
A.
50 ctggcccatatgctgccatcgatcgtccacattaaaaacaaggca
LAHMLPSIVHIKNKA
95 aatcgtaaaaaaggagatggaaccatagcttttatttccgctcaa
NRKKGDGTIAFISAQ
140 gctagagaattggcaaaacagatcaaagatgtggcagaaaaatat
ARELAKQIKDVAEKY
185 ggagcagtatcttgcataagaggtacatgtgtcttcggtgggtct
GAVSCIRGTCVFGGS
230 ccaaagaaagaaactgaacataatacttgc 260
PKKETEHNTC
B.
DEAD LAHMLPSIVHIKNKANRKKGDGTIAFISAQARELAKQIKDVAEKYGAVSCIRGTCVFGGS 60
5'-9C AAFLIPILEKLDP------SPKKDGPQALILAPTREIiALQIAEVARKLGKHTNLKWVIYGGT 107
* * **************** * *
DEAD PKKE 64
5'-9C SIDK 111
Fig. 23. Analysis of 5’ upstream sequence of Artemia genomic clone 9C
representing CL-2 gene.
Panel A, open reading frame in 5’ upstream sequence of revised Artemia genomic
clone 9C. The numbers at the left indicate the positions of the nucleotides and amino
acids in the complete sequence; Panel B, BLAST analysis of the amino acid sequence
of the open reading frame in panel A. The best match is a DEAD-box helicase.
Asterisks indicate identical residues.
[Binding-site map. Arrows above and below the sequence mark putative sites labeled BR-C Z4, CDP CR1, GATA-X, CDP CR3+HD, Lmo2 complex, GATA-2, GATA-1, NF-Y, Elk-1, Barbie box, C/EBP, GATA-3 and FOXJ2.]
TGGATCAGATAAAACGCTGGCCCATATGCTGCCATCGATCGTCCACATTAAAAACAAGGC 60
AAATCGTAAAAAAGGAGATGGAACCATAGCTTTTATTTCCGCTCAAGCTAGAGAATTGGC 120
AAAACAGATCAAAGATGTGGCAGAAAAATATGGAGCAGTATCTTGCATAAGAGGTACATG 180
TGTCTTCGGTGGGTCTCCAAAGAAAGAAACTGAACATAATACTTGCA 227
Fig. 24. Putative transcription factor binding sites in the 5’ upstream sequence of Artemia genomic clone 9C.
The 227 bp 5’ upstream sequence of genomic clone 9C was analyzed by Match™
Public 1.0 using the TRANSFAC 6.0 database. The putative transcription factor binding
sites are shown with arrows indicating their direction.
Discussion:
Cysteine proteases such as cathepsin L have been studied extensively because
they play essential roles in intracellular protein degradation in animal cells. Most
cysteine proteases are synthesized as prepro-enzymes, then undergo proteolytic
processing in the endoplasmic reticulum (ER). In most eukaryotes the mature enzymes
are sent to lysosomes for storage (Ishidoh and Kominami, 1995); however, in Artemia
embryos and larvae, most of the cathepsin L-like cysteine protease activity appears to
be non-lysosomal (Warner and Shridhar 1985; Lu and Warner 1991; Warner et al.
1995). Recently, some cathepsin L cysteine proteases have been identified in
non-lysosomal regions of several organisms. A cathepsin L isoform in murine NIH3T3
cells, which is devoid of an ER signal peptide in the prepro-sequence, has been found
in the nucleus during the G1-S transition phase of the cell cycle, and experimental data
suggest that it functions in the regulation of cell cycle progression through proteolytic
processing of the CDP/Cux transcription factor (Goulet et al. 2004). In the shrimp
Metapenaeus ensis, a cathepsin L encoded by an intron-less gene was found in the
germinal vesicle and thought to be involved in male chromatin remodeling (Hu and
Leung, 2003). Non-lysosomal cathepsin L has also been identified in Xenopus embryos
(Miyata and Kubo, 1997), Sarcophaga peregrina (Homma and Natori, 1996) and
Onchocerca volvulus (Lustigman et al. 1996).
An Artemia franciscana cathepsin L cDNA of 1229 bp, coding for a protease
with 217 amino acids, was isolated previously in our lab from an
embryo cDNA library (Butler et al. 2001). The cDNA encodes the catalytic subunit of
cathepsin L found in eggs and young larvae of the brine shrimp. This cDNA was used
as template to make a [32P]-labeled probe in search of genomic sequences that might
provide clues as to how the CL gene is regulated in Artemia. An Artemia franciscana
genomic library in EMBL3 was screened using the probe and several clones were
isolated. Prior to this study, the Artemia genomic library in EMBL3 had been searched
by a previous worker (Matt Shaw), who had also isolated several putative CL clones. All
clones isolated from the library showed an identical restriction pattern and one clone
(9C) was sequenced. The sequence of clone 9C was entered into the GenBank database
in October 2004. Since clone 9C showed only 85 % homology with Artemia embryo
CL cDNA, further screening of the genomic library in EMBL3 was carried out to
isolate a genomic clone matching the embryo cDNA in order to study the promoter
region of the gene. Since the genomic library constructed in phage EMBL3 should
represent all cathepsin L genes in Artemia, several months were spent screening and
analyzing putative CL gene clones. Ten additional CL-positive clones were isolated,
and upon restriction enzyme analysis, all showed the same pattern, which differed from
that observed for Artemia embryo CL cDNA. Three clones were chosen for further
analysis using PCR, and one clone of the PCR products was sequenced. A 482 bp PCR
fragment (see Figure 4) showed that the newly isolated clones were only 90% identical
with Artemia embryo CL cDNA, but 99% identical with genomic DNA clone 9C
previously isolated in our lab. Subsequently, clone 9C was designated as the Artemia
franciscana CL-2 gene.
To analyze the Artemia franciscana EMBL3 genomic library further for the
presence of a CL gene matching the embryonic cDNA, PCR was carried out with
total EMBL3 DNA as template and primer pair CLF10 and CLR8. Primers CLF10 and
CLR8 were designed based on the sequence of CL cDNA; however, the amplified PCR
fragment was identical to the CL-2 gene (clone 9C). In fact, all PCR products derived
from the EMBL3 library DNA were 99 % identical with clone 9C
(CL-2), and only 87% identical with CL cDNA.
As genomic DNA clones matching the embryo CL cDNA could not be found
in the Artemia EMBL3 genomic DNA library, I performed PCR using
genomic DNA isolated from Artemia larvae as template to test for the
presence of the CL-1 gene and confirm the observation of Matt Shaw in our lab. This
experiment yielded a DNA fragment of 565 bp, and sequencing showed it to be 97%
identical with embryo CL cDNA. While the PCR product did not cover the entire CL
cDNA, it included most of the mature region and part of the prepro-region of cathepsin L.
The 565 bp fragment also contained one EcoR I restriction site, two Hind III
sites and one Pst I site at the same positions found in Artemia embryo CL cDNA. These
results showed that the gene corresponding to Artemia embryo CL cDNA was indeed present
in Artemia genomic DNA.
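
The restriction-site comparison described above can be illustrated with a simple string search. This is a minimal sketch, not the procedure used in this study: the helper name `find_sites` is hypothetical, only the three enzymes named in the text are included, and the test sequence is nucleotides 301-360 of the embryo CL-1 cDNA as listed in Appendix 2.

```python
# Minimal sketch: locate restriction enzyme recognition sites by exact string search.
# Recognition sequences (5'->3') for the three enzymes named in the text; all three
# are palindromic, so scanning one strand is sufficient.
SITES = {"EcoRI": "GAATTC", "HindIII": "AAGCTT", "PstI": "CTGCAG"}


def find_sites(seq: str, site: str) -> list:
    """Return 0-based start positions of every (possibly overlapping) match."""
    seq = "".join(seq.upper().split())
    site = site.upper()
    positions = []
    start = seq.find(site)
    while start != -1:
        positions.append(start)
        start = seq.find(site, start + 1)
    return positions


# Nucleotides 301-360 of the embryo CL-1 cDNA (Appendix 2).
excerpt = "tctatcatga atggatacca acataagaaa cagaattcct caagagctga gagcactttc"
hits = {name: find_sites(excerpt, site) for name, site in SITES.items()}
print(hits)                     # {'EcoRI': [32], 'HindIII': [], 'PstI': []}
print(301 + hits["EcoRI"][0])   # 333, the 1-based cDNA coordinate of the EcoR I site
```

Running the same search over a cDNA and over a PCR fragment, after bringing both to a common coordinate system, is one way to check that sites fall at the same positions.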
Thus, while conventional screening of a phage DNA library and PCR analysis of
the library DNA did not yield a clone matching Artemia embryo CL cDNA, the gene
coding for Artemia embryo CL (CL-1 gene) was identified in Artemia genomic DNA
isolated from purified nuclei. Given that the Artemia franciscana genomic library in
EMBL3 has been screened in our lab by myself and Matt Shaw without finding a gene
matching Artemia embryo CL cDNA, it appears either that the gene matching embryo CL
cDNA (CL-1) was lost during the construction of the Artemia genomic library in EMBL3
(or its re-amplification), or that the cloned gene expressed a protease that was
harmful or lethal to the host cells carrying the Artemia CL-1 gene.
In comparing Artemia genomic DNA clones with Artemia embryo CL cDNA, no
introns were found in the CL gene sequences. This observation is consistent with the
conclusion obtained previously in our lab (Matt Shaw, unpublished) showing that the
Artemia CL-1 gene is intronless.
Introns were discovered in 1977 (Berget et al. 1977; Chow et al. 1977; Jeffreys
and Flavell, 1977). By definition introns do not code for protein, but they are thought
to protect an organism’s coding regions of DNA from
damage by environmental factors. However, in 1990 Liu and Maxwell showed
that intronic sequences in the mouse hsc70 heat shock gene are the source of U14, a
small nucleolar RNA (snoRNA). Also, it has been shown that the second intron in the
human apolipoprotein B gene is required for expression of this gene in liver (Brooks et
al. 1994). Introns may also increase the rate of meiotic crossing over within a coding
sequence (Fedorova and Fedorov, 2003). Introns are present in most eukaryotic
organisms including Homo, Mus, Penaeus, Drosophila, Fasciola, and even in
single-cell organisms such as the yeast S. cerevisiae, which contains about 300 introns
(Fedorova and Fedorov, 2003).
Initially, the lack of introns in the sequence of the Artemia franciscana cathepsin L
gene was thought to be unique, but recently cathepsin L-like cysteine protease genes in
Leishmania donovani (Mundodi et al. 2001) and the shrimp Metapenaeus ensis
cathepsin L (MeCatL) (Hu and Leung, 2003) have been shown to be intron-less. By
comparing MeCatL with the CL genes PCP1 and PCP2 from the marine shrimp
Penaeus vannamei, each containing five introns (Le Boulay et al. 1998), Hu and
Leung found that MeCatL shares a high degree of sequence identity with PCP1 and
PCP2, suggesting that they have the same ancestor and diverged from one another
recently. They hypothesized that the MeCatL gene has lost all five introns during
evolution. They also suggested that the double-strand-break repair (DSBR) machinery
might play a role in cDNA-mediated homologous recombination (cDMHR) that causes
the loss of introns (Hu and Leung, 2005).
The “intron-early” theory, first proposed in 1978, hypothesized that introns are
very ancient genetic elements which existed at the beginning of life before the
divergence of eukaryotes and prokaryotes (Doolittle, 1978; Darnel, 1978). The theory
suggests that introns made an essential contribution to the evolution of genes via “exon
shuffling” which created genes from exon “pieces” by recombination within introns
(Doolittle, 1978; Darnel, 1978; Roy et al. 1999; Gilbert, 1987; Fedorov, 2001).
Accordingly, many biologists believe that introns are lost in the course of evolution
(Gilbert et al. 1986). The prokaryotic lineage completely lost its introns, whereas early
introns were retained in the eukaryotes (Fedorov et al. 2002). In 2003, Roy et al.
compared 10,020 introns in human-mouse orthologs and 1,459 in mouse-rat, and
found evidence of intron loss in mammals, but no gain in introns during evolution.
In contrast, the intron-late theory, formulated in 1991, hypothesizes that introns
appeared relatively recently in the genomes of eukaryotes, long after their
divergence from prokaryotes (Cavalier-Smith, 1991; Palmer and Logsdon, 1991). This
theory is supported by the fact that introns can behave as transposable elements which
are capable of being inserted into a gene or deleted from it, and by the
observation that intron positions vary in homologous genes of different organisms
(Logsdon et al. 1995; Logsdon, 1998; Logsdon, Stoltzfus and Doolittle, 1998). The
proponents of this theory have suggested that introns arose as “selfish” elements, and
play no constructive role in evolution. They suggest that introns are spread as mobile
elements that invade genes by insertion into short 4- to 5-nt-long “proto-splice sites”
(Dibb and Newman, 1989). Studies of the triosephosphate isomerase (TPI) gene
support an insertional origin of all its known introns (Logsdon et al. 1995;
Kwiatowski et al. 1995). Also, three introns in the Drosophila Xdh gene were shown to
be recent insertions (Tarrio et al. 1998). Divergent structures of Caenorhabditis
elegans cytochrome P450 genes also suggest intron insertions (Gotoh, 1998).
The Artemia franciscana cathepsin L genes (CL-1, CL-2) appear to be consistent with the
intron-late theory. Cysteine proteases arose early in evolution, most likely before the
divergence of eukaryotes and prokaryotes (Berti and Storer, 1995). Artemia is
considered to be an ancient eukaryote and similar in age to prokaryotes, whose genes
lack introns. Rita Shamoon in our lab identified the cathepsin L-1 gene of Artemia
parthenogenetica, a parthenogenetic relative of Artemia franciscana that evolved more
recently (5-6 mya) than Artemia franciscana. The CL-1 gene sequence of Artemia
parthenogenetica shares 98% identity with the cDNA of Artemia franciscana, but the
CL gene in Artemia parthenogenetica contains an intron of 1085 bp in the
prepro-region, whereas the CL-1 gene in Artemia franciscana is intron-less. At this
time we have no information on the structure of the CL-2 gene (if it exists) in Artemia
parthenogenetica.
In contrast with the CL genes, the actin genes in two Artemia species, Artemia
franciscana and Artemia parthenogenetica, contain several introns, as do those of
mammals (Ortega et al. 1996). One actin gene isolated from Artemia parthenogenetica
contains 3 introns, while two other actin genes isolated from Artemia franciscana
contain 5 and 6 introns, respectively. In the gene coding for the Artemia franciscana
Na/K-ATPase α1 subunit, ten of the 14 introns are located in identical positions as in the
human Na/K-ATPase α3 subunit gene (Garcia-Saez et al. 1997). These findings
suggest that the actin and ATPase genes may have evolved differently from the CL genes in
Artemia, and more genes need to be analyzed in Artemia to better understand the
mechanisms involved in intron loss or intron gain in this species.
To confirm the functionality of the CL-2 gene in Artemia franciscana, both adult
and embryo cDNA libraries were analyzed using PCR in search of a cDNA matching
the CL-2 gene. The Artemia adult cDNA library in λgt11 was amplified using internal
primers prepared from sequence in the CL-2 gene, and two fragments with a combined
total of 874 bp of sequence were obtained, representing most of the CL-2 cDNA. The
Artemia embryo cDNA library was also analyzed using PCR and yielded products that
represented both the CL-1 and CL-2 genes. The latter was unexpected because no
cathepsin L matching the CL-2 gene sequence has ever been isolated or identified in
Artemia embryos and larvae.
The 874 bp sequence of CL-2 cDNA obtained from the adult cDNA library was
analyzed with the Open Reading Frame Finder from NCBI, and two open reading frames
were identified, suggesting that the Artemia CL-2 cDNA might encode two proteins.
The deduced amino acid sequence of Artemia adult cDNA (CL-2) was compared with the
amino acid sequence of Artemia franciscana embryo cathepsin L (ECL-1) (see Fig 25)
as well as cathepsin L sequences from other organisms (see Fig 26). The deduced
amino acid sequence from the adult CL-2 clone (ACL-2) shows 77 % identity
overall with the amino acid sequence of ECL-1. High amino acid identity with other
cathepsins L was also observed: the fruit fly Drosophila melanogaster (57 %), the
flesh fly Sarcophaga peregrina (57 %) and the shrimp Metapenaeus ensis (51 %). The
Artemia cathepsin L coded by the CL-2 cDNA (ACL-2) shares the active site Cys and
His with other cathepsin L-like proteases. As the cDNA sequence is not complete, we
have no information on the amino acids encoded at its 3’ end, including the active site Asn. As shown in
Figure 25, the first open reading frame of CL-2 cDNA, with 49 amino acids, appears to
encode part of the pro-peptide, while the second open reading frame, with 173 amino
acids, encodes most (79 %) of the mature region of the protease. The mature CL
proteases of Drosophila melanogaster and Sarcophaga peregrina have high identity
(60 % and 59 %, respectively) with the second open reading frame of ACL-2 (see
Figure 26). As well, the mature region of ECL-1 is 76 % identical with the second open
reading frame of ACL-2, while the pro-region of ECL-1 is 81 % identical with the first
open reading frame of ACL-2. The first open reading frame was also compared with
Drosophila melanogaster (46 %), the flesh fly Sarcophaga peregrina (48 %) and
ACL-2  ---------------------------------------------------MKIYFGNKD 9
ECL-1  MKQITLIFLLGAVLVQLSAALSLTNLLADEWHLFKATHKKEYPSQLEEKFRMKIYLENKH 60
       **** * *

ACL-2  KIAKHNILYEKGEKSYQVAMNQFGDLLHHEFTSIMIGYKK--------------MEPANV 55
ECL-1  KVAKHNILYEKGEKSYQVAMNKFGDLLHHEFRSIMNGYQHKKQNSSRAESTFTFMEPANV 120
       * ******************* ********* *** ** ******

ACL-2  TVPESVDWREKGAVTHVKYQGQCASCWAFSSTGALKSQTFRKTGKLISLSEQNLIDCSGE 115
ECL-1  EVPESVDWRVKGAITPVKDQGQCGSCWAFSSTGALEGQTFRKTGKLISLSEQNLIDCSGK 180

ACL-2  YGNLGCKEGWISQAFEYIKDNKGIDTENKYHYEAKENFCRDNPRNRGAIALGFVNIQSGE 175
ECL-1  YGNEGCNGGLMDQAFQYIKDNKGIDTENTYPYEAEDNVCRYNPRNRGAIDRGFVHIPSGE 240
       *** ** * *** ************ * *** * ** ******** *** * ***

ACL-2  EDKLQAAVATVGPVSAVIDVSHEGFQFYSKGVYYEPSCKTS-FEH--LNH---------- 219
ECL-1  EDKLKAAVATVGPVSVAIDASHESFQFYSKGVYYEPSCDSDDLDHGVLVVGYGSDNGKDY 300

ACL-2  -------------------------------------- 222
ECL-1  WLVKNSWSEHWGDEGYIKIARNRKNHCGIATAASYPLV 338
Fig. 25. Comparison of the deduced partial amino acid sequence of Artemia CL-2 with Artemia CL-1 cDNA.
The sequence of Artemia embryo cDNA (ECL-1) (AF147207) was obtained from the
GenBank database. The alignment was performed using the program Clustal W (1.82).
Asterisks indicate identical amino acids. Dashes are used to optimize the alignment.
The active site Cys and His are boxed. Arrowhead indicates the putative cleavage site
of the pro-peptide from the mature enzyme. The first open reading frame of Artemia
adult cDNA (ACL-2) is underlined, and the rest of the amino acid sequence belongs to the
second open reading frame. The signal peptide for the ER in Artemia embryo cDNA is
boxed. The area showing a double underline contains bases between the two open reading
frames which are not translated.
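
The identity percentages quoted for these alignments can be illustrated with a small calculation over two aligned rows. This is a sketch under stated assumptions rather than the method used by Clustal W or BLAST: identity is taken here as identical residues divided by the number of ungapped columns, and the function name `percent_identity` is hypothetical. Other tools may use a different denominator, so values need not match the figures reported in the text.

```python
# Minimal sketch: percent identity between two rows of a pairwise alignment.
# Assumption: columns containing a gap in either row are excluded from the count.
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    identical = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # skip gapped columns
        compared += 1
        if a == b:
            identical += 1
    if compared == 0:
        return 0.0
    return 100.0 * identical / compared


# First 40 columns of the second alignment block in Fig. 25.
acl2 = "KIAKHNILYEKGEKSYQVAMNQFGDLLHHEFTSIMIGYKK"
ecl1 = "KVAKHNILYEKGEKSYQVAMNKFGDLLHHEFRSIMNGYQH"
print(percent_identity(acl2, ecl1))  # 85.0 (34 of 40 columns identical)
```

The 85 % obtained for this 40-residue window is only a local value and is not the pro-region or whole-sequence identity reported in the Discussion.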
Fig. 26. Comparison of the deduced amino acid sequence of Artemia CL-1 and CL-2 cDNA with cathepsin L sequences from other organisms.
Sequences used in the alignments were obtained from the GenBank database, and are the
CP1 of fruit fly, Drosophila melanogaster, (DCP1) (AF012089); flesh fly,
Sarcophaga peregrina, (D16533); shrimp, Metapenaeus ensis, (Y126713); Artemia
embryo cDNA (ECL-1) (AF147207) and Artemia adult cDNA (ACL-2). The
alignment was performed using the program Clustal W (1.82). Asterisks indicate
identical amino acids. Dashes are used to optimize the alignment. The active site Cys
and His are boxed. Arrowhead indicates the putative cleavage site of the pro-peptide
from the mature enzyme. The first open reading frame of Artemia adult cDNA
(ACL-2) is underlined, and the rest of the amino acid sequence belongs to the second open
reading frame.
Figure 26.
ACL-2  ---------------------------------------------------MKIYFGNKD 9
ECL-1  MKQITLIFLLGAVLVQLSAALSLTNLLADEWHLFKATHKKEYPSQLEEKFRMKIYLENKH 60
DCP1   --MRTAVLLPLLALLAVAQAVSFADVVMEEWHTFKLEHRKNYQDETEERFRLKIFNENKH 58
Flesh  --MRT-VLVALLALVALTQAISPLDLIKEEWHTYKLQHRKNYANEVEERFRMKIFNENRH 57
Met    --MKALSVLACVVAVAVASP----------WQDFKVQYGRHYGTAREDLYRQSVFEQNQQ 48
       *

ACL-2  KIAKHNILYEKGEKSYQVAMNQFGDLLHHEFTSIMIGYKK------------------ME 51
ECL-1  KVAKHNILYEKGEKSYQVAMNKFGDLLHHEFRSIMNGYQHKKQNSSRAEST----FTFME 116
DCP1   KIAKHNQRFAEGKVSFKLAVNKYADLLHHEFRQLMNGFNYTLHKQLRAADESFKGVTFIS 118
Flesh  KIAKHNQLFAQGKVSYKLGLNKYADMLHHEFKETMNGYNHTLRQLMRERTG-LVGATYIP 116
Met    FIEDHNAKFENGEVTFTLKMNQFGDMTSEEFAATMNGFLNVPTRHP---------VAILE 99
       * * * ******

▼
ACL-2  PANVTVPESVDWREKGAVTHVKYQGQCASCWAFSSTGALKSQTFRKTGKLISLSEQNLID 111
ECL-1  PANVEVPESVDWRVKGAITPVKDQGQCGSCWAFSSTGALEGQTFRKTGKLISLSEQNLID 176
DCP1   PAHVTLPKSVDWRTKGAVTAVKDQGHCGSCWAFSSTGALEGQHFRKSGVLVSLSEQNLVD 178
Flesh  PAHVTVPKSVDWREHGAVTGVKDQGHCGSCWAFSSTGALEGQHFRKAGVLVSLSEQNLVD 176
Met    ADDETLPKHVDWRTKGAVTPVKDQKQCGSCWAFSTTGSLEGQHFLKDGKLVSLSEQNLVD 159
       * * * * * ** * ** * * ****** ** * * * * * * ******* *

ACL-2  CSGEYGNLGCKEGWISQAFEYIKDNKGIDTENKYHYEAKENFCRDNPRNRGAIALGFVNI 171
ECL-1  CSGKYGNEGCNGGLMDQAFQYIKDNKGIDTENTYPYEAEDNVCRYNPRNRGAIDRGFVHI 236
DCP1   CSTKYGNNGCNGGLMDNAFRYIKDNGGIDTEKSYPYEAIDDSCHFNKGTVGATDRGFTDI 238
Flesh  CSTKYGNNGCNGGLMDNAFRYIKDNGGIDTEKSYPYEGIDDSCHFNKATIGATDTGFVDI 236
Met    CSGKFGNMGCCGGLMDQAFKYIKENKGIDTEESYPYEAQDGKCRFDSSNVGATDTGFVDI 219
       ** ** ** * *** * ***** * ** * ** ** *

ACL-2  QSGEEDKLQAAVATVGPVSAVIDVSHEGFQFYSKGVYYEPSCKTS-FEH--LNH------ 219
ECL-1  PSGEEDKLKAAVATVGPVSVAIDASHESFQFYSKGVYYEPSCDSDDLDHGVLVVGYGSD- 295
DCP1   PQGDEKKMAEAVATVGPVSVAIDASHESFQFYSEGVYNEPQCDAQNLDHGVLVVGFGTDE 298
Flesh  PEGDEEKMKKAVATMGPVSVAIDASHESFQLYSEGVYNEPECDEQNLDHGVLVVGYGTDE 296
Met    AHGEENSLMKAVANIGPISVAIDASHPSFQFYHQGVYYEKECSSTMLDHGVLAIGYGETD 279
       * * *** +* * ** ** ** * *** * * * *

ACL-2  ------------------------------------------- 222
ECL-1  NGKDYWLVKNSWSEHWGDEGYIKIARNRKNHCGIATAASYPLV 338
DCP1   SGEDYWLVKNSWGTTWGDKGFIKMLRNKENQCGIASASSYPLV 341
Flesh  SGMDYWLVKNSWGTTWGEQGYIKMARNQNNQCGIATASSYPTV 339
Met    DGKEYWLVKNSWNTSWGDKGFIQMSRNKKNNCGIASQASYPLV 322
shrimp Metapenaeus ensis (34 %) in the pro-region. However, it should be noted that
Western blot analysis of the CL from various life cycle stages in Artemia revealed a
cathepsin L of about 24 kDa after 14 days in culture, representing a late juvenile stage
in Artemia development (unpublished data). These results suggest that the second open
reading frame of the CL-2 gene, which codes for at least 173 amino acids of the mature protein,
may code for a cathepsin L specific to adult tissue and not embryos and larvae. This
conclusion is supported by HPLC and SDS-PAGE analysis of cysteine proteases of
Artemia franciscana in our lab, which have indicated the presence of only one cysteine
protease in Artemia embryos, arising from the CL-1 gene (Butler et al. 2001).
In mammals the prepro-region of the cysteine protease is responsible for proper
targeting of the enzymes (Hanewinkel et al. 1987; Cuozzo et al. 1995) and correct
folding of the enzymes (Smith and Gottesman 1989; Coulombe et al. 1996), as well as
inhibiting the activity of mature proteases in vitro (Cygler and Mort 1997). Recently,
several studies have focused on the auto-catalytic processing of the prepro-enzyme
under an acidic environment in the ER (Turk et al. 2000). The activation is triggered by
a drop in pH from 8.0 to 5.3, which weakens the interactions between the propeptide
and the catalytic domain (Carmona et al. 1996; Fox et al. 1992). Under acidic
conditions, the pro-peptide is less tightly bound to the active site, and can be cleaved to
yield the mature enzyme (Menard et al. 1998; Rozman et al. 1999). Since the second open
reading frame of adult CL-2 cDNA lacks the sequence coding for the prepro-region of the
protease, including the ER localization signal peptide, the Artemia cathepsin L
encoded by the second open reading frame would not enter the ER for proteolytic
processing and subsequent localization in the lysosomes. This is consistent with the
fact that in Artemia franciscana, most of the cysteine protease activity is found in
non-lysosomal areas of embryos, but whether this occurs in adult tissues is not known.
In murine NIH3T3 cells where a cathepsin L isoform devoid of a signal peptide has
been detected in the nucleus, the authors speculate that this cathepsin L isoform
contains a very short pre-domain, which does not bind tightly to the active groove of the
mature protease, thereby allowing auto-catalytic processing to occur at higher pH
(Goulet et al. 2004). As well, a cytosolic chaperone could serve as a substitute for the
pro-peptide (Frydman, 2001). A novel form of cathepsin B associated with
intracellular membranes in human tumors was shown to lack a signal peptide and part
of the pro-peptide, but it could still become active (Mehtani et al. 1998). Artemia
cathepsin L coded by the CL-2 gene could follow a similar pathway. It is also
possible that the 49 amino acids encoded by the first open reading frame of Artemia
adult cDNA could substitute for the pro-peptide normally present, as found for the
Artemia embryo cathepsin L coded by the CL-1 gene, since its sequence is 81 %
identical with the pro-peptide encoded by Artemia embryo CL-1 cDNA. Given the
lack of hydrophobicity in the 49 amino acid prepro-enzyme fragment, it is unlikely that
this fragment could enter the ER for degradation, as is thought to happen with
the pro-peptide coded by Artemia embryo CL-1 cDNA. However, since no matching
protein for the first open reading frame of CL-2 cDNA has been identified on Western
blots of Artemia franciscana embryo or adult preparations, if one is synthesized it must
be degraded shortly after translation.
While analyzing the Artemia embryo cDNA library, a PCR product of 340 bp,
identical to the 5’ end of the Artemia adult cDNA and representing the CL-2 gene as
shown in Figure 19, was found. Since a cysteine protease encoded by the CL-2 cDNA
has not been found in the embryos of Artemia, it remains unknown whether this (CL-2)
cDNA is functional or not. Clearly more work needs to be done with both Artemia
embryo and adult cDNA libraries to obtain the full sequence of the cDNA coded by the
CL-2 gene. Towards this goal, the 3’ end of embryo CL-2 cDNA has to be identified.
Eventually, the cysteine protease observed on Western blots of Artemia juveniles needs
to be isolated and sequenced for comparison with adult cDNA sequences to test the above
hypothesis.
In Artemia franciscana, regulation of promoter occupancy in the actin 302 gene
and the sarco/endoplasmic reticulum Ca2+-ATPase-encoding gene has been identified
(Martinez-Lamparero et al. 2003). Transcriptional regulation of these genes appears to
be associated with deactivation of cryptobiosis, leading to enhanced metabolism and
development of Artemia embryos. To understand the mechanism involved in
transcriptional regulation of the Artemia cathepsin L genes, the 5’ upstream sequence
of the cathepsin L genes must be identified. Toward this objective I performed PCR
using a degenerate primer with an internal primer. Artemia genomic clone 9C
representing the CL-2 gene was used as template. The degenerate primers, also called
random primers, were those used by other researchers, and each was paired with a primer
designed against a conserved region of the cysteine protease. Random primers have been used in PCR
for universal amplification of prevailing DNA or for amplification of unknown
intervening sequences that are not generally defined in length or sequence (Eggeling
and Spielvogel, 1995). This technique has been widely used for the identification
of viral genomes (Rose, 2005), for cloning and sequencing of the Lactococcus lactis
subsp. lactis recA gene (Duwart et al. 1992), and for obtaining sequence from
yeast artificial chromosome (YAC) insert ends (Swensen, 1996).
Our degenerate primers were designed according to primers used by Badaracco
et al. (1995) to randomly amplify polymorphic DNA in the phylogenetic study of
bisexual Artemia. In my experiments four degenerate primers were used with
conserved primers in PCR reactions, yielding multiple bands in each case. Of the several
PCR products generated (see Fig 20), only one showed a positive signal on a Southern
blot when probed with [32P]-labeled CL cDNA. This PCR product was ligated into a
plasmid vector, cloned and sequenced. Sequencing yielded a product of 404 bp
including 260 bp of 5’ upstream sequence of genomic clone 9C. The degenerate primer
OPC-4 was identified at one end of the PCR product, but the conserved primer
(CLR10) appeared at the 5’ end of both strands of newly synthesized DNA for reasons not
yet clear. The sequence of (genomic) clone 9C is 97% identical with Artemia adult and
embryo CL-2 cDNA, and no intron or putative splicing sites were identified in the
genomic clone 9C sequence. This observation demonstrates that the Artemia CL-2
gene is also intron-less like the CL-1 gene in Artemia franciscana, which supports the
conclusion that Artemia cathepsin L genes lack introns as mentioned above.
Analysis of the 5’ end of clone 9C, after PCR extension, revealed an open reading
frame of 69 amino acids prior to the coding sequence for the CL-2 gene. When
analyzed by BLAST (NCBI), the sequence was similar to a gene coding for
DEAD-box helicases (see Figure 23) with an E value of 2E-08. DEAD-box helicases
belong to a family of proteins involved in ATP-dependent RNA unwinding, needed in a
variety of cellular processes including splicing, ribosome biogenesis and RNA
degradation. The name derives from the sequence of the Walker B motif (motif II),
which contains the ATP-binding region (Marchler-Bauer, 2005). It is unusual to
associate the putative promoter region with another gene, especially if the amino acid
sequence in the open reading frame is only 30 % identical with the DEAD-box helicase.
However, as the 5’ upstream sequence of clone 9C obtained here is not extensive, the
meaning of this sequence is not clear. On the other hand, several putative transcription
factor binding sites were identified in the 227 bp 5’ upstream sequence as shown in
Figure 24, indicating that this area could include the (putative) promoter region of the
Artemia CL-2 gene. However, the 227 bp sequence was found to have a high AT
content (59.5 %), which is in contrast with the promoter region of many lysosomal
enzymes, where a high GC content has been found for human cathepsin L (Bakhshi et
al. 2001), rat cathepsin L gene (Charron et al. 2002), human cathepsin B (Yan et al.
2000) and human cathepsin S (Shi et al. 1994). Clearly, additional 5’ upstream
sequence of Artemia genomic clone 9C is needed to more fully characterize the
promoter region of the Artemia franciscana cathepsin L-2 gene. As well, functional
analysis will be needed to identify the core promoter region of the CL genes in Artemia
to better understand the transcription mechanism involved in this gene.
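
The AT content quoted above can be reproduced directly from the 227 bp sequence printed in Figure 24. The sketch below is illustrative; the function name `at_content` is hypothetical and the calculation is a plain base count, not the output of any particular analysis package.

```python
# Minimal sketch: percentage of A and T bases in a nucleotide sequence.
def at_content(seq: str) -> float:
    seq = "".join(seq.upper().split())
    if not seq:
        raise ValueError("empty sequence")
    return 100.0 * (seq.count("A") + seq.count("T")) / len(seq)


# The 227 bp 5' upstream sequence of genomic clone 9C, as printed in Fig. 24.
upstream = (
    "TGGATCAGATAAAACGCTGGCCCATATGCTGCCATCGATCGTCCACATTAAAAACAAGGC"
    "AAATCGTAAAAAAGGAGATGGAACCATAGCTTTTATTTCCGCTCAAGCTAGAGAATTGGC"
    "AAAACAGATCAAAGATGTGGCAGAAAAATATGGAGCAGTATCTTGCATAAGAGGTACATG"
    "TGTCTTCGGTGGGTCTCCAAAGAAAGAAACTGAACATAATACTTGCA"
)
print(len(upstream))                   # 227
print(round(at_content(upstream), 1))  # 59.5
```

The sequence contains 135 A or T bases out of 227, which gives the 59.5 % stated in the text.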
Future work will focus on several points as follows: first, completion of the
sequences of CL-2 cDNA in Artemia adult and embryo libraries using the PCR method
or conventional screening procedure; second, isolation of the cysteine protease
observed on Western blots of Artemia juveniles; third, a search for
additional 5’ upstream sequence of the Artemia CL-2 gene, identifying the core promoter
region using deletion analysis, and the transcription initiation site; and fourth,
completion of the sequence of the CL-1 gene, and a search for its 5’ promoter sequence to
understand the expression and regulation of the cathepsin L-1 gene in Artemia
embryos.
Appendix 1: Primers used in PCR experiments.
Primer    Sequence                             Tm
CLF       5’-CAATGAAGCAGATTACTTTGA-3’          57°C
CL9CF1    5’-GCAGATTACTTTGACATATTTACT-3’       55°C
CLF11     5’-TATAAGAAATGAACTTCACCCTTT-3’       58.7°C
CLF13     5’-CCAACATAAGAAACAGAATTCCTC-3’       61.5°C
CLF10     5’-ATCTATTCAAGGCTACACACAAGA-3’       60.6°C
CLR       5’-TATACAAGTGGATAGCTAGCT-3’          52.5°C
CLR3      5’-TCCATTAATCCTCCATTGCAT-3’          62.9°C
CLR8      5’-ATAGTCTTCGCCATTATCAGAACC-3’       63°C
CLR10     5’-TCTTGTGTGTAGCCTTGAATAGAT-3’       60.6°C
CLR10b    5’-GTGGTTTAGGTGTTCAAATGATGT-3’       62.6°C
CLR11     5’-AAAGGGTGAAGTTCATTTCTTATA-3’       58.7°C
CLR18     5’-TCCGTGGTCTAGGTCATCAGAGTC-3’       67.5°C
OPC-2     5’-GTCAGGCGTC-3’                     33.4°C
OPC-4     5’-CCGCATCTAG-3’                     30°C
OPC-8     5’-TGGACCGGTG-3’                     40.7°C
OPC-9     5’-CTCACCGTCC-3’                     33.TC
TP-7F     5’-TTGTAATACGACTCACTATAGGGC-3’       60°C
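
Two of the forward/reverse primer pairs in this table cover the same region on opposite strands, which can be checked by computing reverse complements. The sketch below is illustrative only; the helper name `reverse_complement` is hypothetical and the primer sequences are copied from the table above.

```python
# Minimal sketch: reverse complement of a DNA primer (A<->T, C<->G, then reversed).
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    return seq.upper().translate(COMPLEMENT)[::-1]


# Primer sequences as listed in Appendix 1.
primers = {
    "CLF10": "ATCTATTCAAGGCTACACACAAGA",
    "CLR10": "TCTTGTGTGTAGCCTTGAATAGAT",
    "CLF11": "TATAAGAAATGAACTTCACCCTTT",
    "CLR11": "AAAGGGTGAAGTTCATTTCTTATA",
}
print(reverse_complement(primers["CLF10"]) == primers["CLR10"])  # True
print(reverse_complement(primers["CLF11"]) == primers["CLR11"])  # True
```

CLR10 is the exact reverse complement of CLF10, and CLR11 of CLF11, which is consistent with each pair sharing the same listed Tm (60.6°C and 58.7°C).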
Appendix 2: Primers designed on Artemia embryo CL-1 cDNA sequence.

The start codon (ATG), the stop codon (TAG) and the EcoR I site (GAATTC) are shown in upper case. Primers are listed to the right of the row on which they are marked; --> indicates a forward primer and <-- a reverse primer.

   1 catcttgtgg cagacaatta cacaATGaag cagattactt tgatattttt actgggagc    --> CLF
  61 gtacttgtgc agttaagtgc tgcactatca ctgacaaatt tacttgctga tgaatggcat
 121 ctattcaagg ctacacacaa gaaagaatat ccaagccaac ttgaggagaa atttagaatg   <-- CLR10, --> CLF10
 181 aagatttatt tggaaaataa acacaaagtt gccaaacata acatccttta tgaaaaaggc
 241 gaaaagtctt atcaagtcgc aatgaataag tttggagatc ttcttcatca tgaatttaga
 301 tctatcatga atggatacca acataagaaa caGAATTCct caagagctga gagcactttc   --> CLF13
 361 acttttatgg agcctgctaa tgttgaagtt ccagaatctg ttgactggag ggtaaaagga
 421 gccataactc ctgtaaaaga ccaaggacag tgtggttcat gctgggcttt ctcatctact
 481 ggtgccttgg aaggtcaaac cttcagaaaa acagggaagc tcatttcttt gagtgaacag
 541 aacttgattg attgttctgg aaaatatgga aatgaaggat gcaatggagg attaatggac   <-- CLR3
 601 caagctttcc agtatatcaa ggataacaag ggaattgaca ctgaaaatac gtacccttat
 661 gaagctgaag acaatgtctg tcgttataat ccaaggaacc gaggtgccat tgaccgtggc
 721 tttgtccata tcccatctgg agaagaagat aagcttaagg cagctgttgc cactgttgga
 781 cctgtatctg ttgccatcga tgcctctcat gaaagtttcc aattctattc taaaggtgtt
 841 tactatgagc catcatgtga ctctgatgac ctagaccacg gagttcttgt ggttggctat   <-- CLR18
 901 ggttctgata atggcaaaga ctattggctc gttaaaaact cgtggtctga gcactgggga   <-- CLR8
 961 gacgaagggt atatcaagat tgctcgcaat cgcaagaacc attgtggtat tgctactgca
1021 gctagctatc cacttgtaTA Gatagggttg tggtaatttt tgtggatgtg tgtaattgca   <-- CLR
1081 tacgttaaat tcttattctc ttgataggtt tagagagttc tagttttcag tttgattccg
1141 tagatgacag attttgtgac catattcgag aataaagcgt tttttttacc taaaaaaaaa
1201 aaaaaaaaaa aaaaaaaaaa aaaaaaaaa
Appendix 3: Primers designed on Artemia genomic clone 9C sequence.

Primers are listed to the right of the row on which they are marked; --> indicates a forward primer and <-- a reverse primer.

   1 caaatgaagc agattacttt gacatattta ctaacagctg taatgatatt tttactgtca   --> CL9CF1
  61 gttgtacttg tgcagttaag tgctacacaa tcacagtcaa atttgcttgc tgatgaatgg
 121 tatctattca aggctagaca caagaaagat tatccaagcc aacttgagga aaaatttaga   <-- CLR10, --> CLF10
 181 atgaagattt attttgaaaa taaagacaaa attgccaaac ataacatcct ttatgagaaa
 241 ggcgaaaagt cttatcaagt tgcaatgaat cagtttggag atcttcttca tcatgaattt
 301 acatctatca tgattggata taagaaatga acttcaccct ttgctaagag cacttttact   <-- CLR11, --> CLF11
 361 tttatggagc ctgctaatgt tacagttcca gaatctgttg actggaggga aaaaggagca
 421 gtaactcctg taaaataccc aggacagtgt gcttcttgct tggctttttc acctactggt
 481 gccttggaaa gtcaaacttt cagaaaaaca ggaaagctca tttctttgag tgaacaaaac
 541 ttgattgatt gttccggtga atatggaaat ttaggatgca aagggggatg gataagccaa
 601 gcttttgagt atatcaagga taacaaagga attgacactg aaaataaata tcattatgaa
 661 gctaaagaaa atttctgtcg tgataatcca agaaaccgag gtgcagttgc ccttggcttt
 721 gtcaatattc catctgggga agaagataaa cttaaggcag ctgttgccac ggttggacct
 781 gtttccgctg ttattgatgt ctctcatgaa ggttttcaat tctattctaa gggtgtttac
 841 tatgagccat catgtaaaac atcatttgaa cacctaaacc acgaagttct tgtaattggc   <-- CLR10b
 901 tgtggttctg ataatggcga agactattgg ctcgttaaaa actcatggtc taagcactgg   <-- CLR8
 961 ggagacgaag ggtacctcaa gattgctcgc aatcgcaaga accattgtgg tgttgctact
1021 gcagctctct atccaattgt atagataggg ttgtggtact ttttgtgatg tgtgtaattg
1081 accacggtac atct
94
Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. References:
Arora, S., Chauhan, S. S. 2002. Identification and characterization of a novel human cathepsin L splice variant.Gene 293: 123-131.
Badaracco, G, Bellorini, M., Landaberger, N. 1995. Phylogenetic study of bisexual Artemia using random amplified polymorphic DNA.J. Mol. Evol. 41: 150-154.
Badaro, R., Jones, T.C., Lorenco, R., 1986. A prospective study of visceral leishmaniasis in an endemic area of Brazil.J. Infect. Dis. 154: 639-649.
Barrett, A. J., and Kirchke, H. 1981. Cathepsin B, cathepsin H, and cathepsin L. Methods Enzymol. 80: 535-561.
Barrett, A. and Rawlings, N. D. 2001. Evolutionary Lines of Cysteine Peptidases.Bio. Chem. 382: 727-733.
Barrett. A. J., Rawlings, N. D., Woessner J. F. 1998.Handbook o f Proteolytic Enzymes. Academic Press, San Diego, California.
Benton, W. D. and R.W. Davis. 1977. Screening Xgt recombinant clones by hybridization to single plaques in situ.Science 196:180.
Berdowska, I. 2003. Cysteine protease as disease markers.Clinica ChimicaActa 342: 41-69.
Berget, S. M., C. Moore, and P.A. Sharp, 1977. Spliced segments at the 5_terminus of adenovirus 2 late mRNA.Proc. Natl. Acad. Sci. USA 74: 3171-3175.
Berti, P. J., Storer, A. C. 1995. Alignment/phylogeny of the papain superfamily of cysteine proteases.J. Mol. Biol. 246(2): 273-283.
Bimboim, H.C. and J. Doly. 1979. A rapid alkaline extraction procedure for screening recombinant plasmid DNA.Nucleic Acids Res. 7: 1513.
Blondeau,X., Vidmar,S.L., Emod, I., Pagano, M., Turk, V. and Keil-Dlouha, V. (1993). Generation of matrix-degrading proteolytic system from fibronectin by cathepsins B, G H and L.Biol. Chem. Hoppe-Seyler, 374: 651-656.
Bode, W., Huber, R. 2000. Structural basis of the endoproteinase — protein inhibitor interaction.Biochim. Biophys. Acta 1477: 241—252.
95
Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. Bromme, D., Nallaseth, F. S., and Turk, B. 2004. Production and activation of recombinant papain-like cysteine proteases.Methods 32(2): 199-206. Review.
Britton, C., Murray, L. 2002. A cathepsin L protease essential forCaenorhabditis elegans embryogenesis is functionally conserved in parasitic nematodes.Mol. Biochem. Parasit. 122: 21-33.
Brocklehurst K, Kowlessur D, O'Driscoll M, Patel G, Quenby S, Salih E, Templeton W, Thomas EW, Willenbrock F. 1987. Substrate-derived two-protonic-state electrophiles as sensitive kinetic specificity probes for cysteine proteinases. Activation of 2-pyridyl disulphides by hydrogen-bonding.Biochem J. 244(1): 173-181.
Brooks, A. R., B.P. Nagy, S. Taylor, W.S. Simonet, J.M. Taylor, and B. Levy-Wilson, 1994. Sequences containing the second-intron enhancer are essential for transcription of the human apolipoprotein B gene in the livers of transgenic mice.Mol. Cell. Biol. 14: 2243-2256.
Butler, A. M., Aiton, A. L. and Warner, A. H. 2001. Characterization of a novel heterodimeric cathepsin L-like protease and cDNA encoding the catalytic subunit of the protease in embryos ofArtemia franciscana. Biochem. Cell Biol. 79: 43-56.
Carmona, E., Dufour, E., Ploufee, C., Takebe, S., Mason, P., Mort, J. S., and Menard, J. 1996. Potency and selectivity of the cathepsin L propeptide as an inhibitor of cysteine protease. Biochem. 35: 8149-8157.
Cavalier-Smith, T., 1991. Intron phylogeny: a new hypothesis.Trend. Genet. 7: 145-148.
Chauhan, S.S., Popescu, N.C., Ray, D., Fleischmann, R., Gottesman, M.M., Troen, B.R., 1993. Cloning, genomic organization and chromosomal localization of human cathepsin L. J. Biol. Chem. 218: 1039- 1045.
Chen, M.G., Mott, K.E., 1990. Progress in morbidity due Frasciolato hepatica infection.Trrop. Dis. Bull. 87: 1-37.
Chow, L.T., R.E. Gelinas, J.R. Broker, and R.J. Roberts, 1977. An amazing sequence arrangement at the 5_ends of adenovirus 2 messenger RNA.Cell 12: 1-8.
Clegg, J.S. and Conte, F.P.1980. A review of the cellular and development biology of Artemia. In: The brine ShrimpArtemia. Vol. 2. (Persoone, G., Sorgeoloos, P., Roels, O., and Jaspers, E. editors), pp. 11-54, Universa Press, Wetteren.
96
Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. Collette, J., Bocock, J. P., Ahn, K., Chapman, R. L., Godbold, G., Yeyeodu, S., and Erickson, A.H. 2004. Biosynthesis and alternative targeting of the lysosomal cysteine protease cathepsin L. Int. Rev. Cytol, 1: 241.
Criel, G RJ. and Macrae, T.H. 2002. Chapter 1.Artemia morphology and structure. In: Artemia basic and applied biology. (Abatzopoulos, J., Beardmore, J. S., and Sorgeloos, P.), pp. 1-38, Kluwer Academic Publishers, London.
Criel, G.R.J. and Macrae, T.H. 2002. Chapter 2. Reproductive biologyArtemia. of In: Artemia basic and applied biology. (Abatzopoulos, J., Beardmore, J. S., and Sorgeloos, P.), pp. 39-128, Kluwer Academic Publishers, London.
Cygler, M. and Mort, J.S. 1997. Proregion structure of members of the papain superfamily. Mode of inhibition of enzymatic activity.Biochimie 79: 645-652.
Dalton, J.P., McGonigle, S., Rolph, T.P., Andrews, S.J., 1996. Induction of protective immunity in cattle against infection withFasciola hepatica by vaccination with cathepsin L proteinase and hemoglobin.Infect. Immun. 64: 5066-74.
Darnel, J.E., 1978. Implications of RNA. RNA splicing in evolution of eukaryotic cells. Science 202: 1257-1260.
Delaisse, J.M., Vaes, G., 1992. In: Griffin, B.R., Gay, C.V. (Eds.). Biology and Physiology of the Osteoclast,CRC Press, Boca Raton, FL, p. 290.
Delaisse,J.M., Eeckhout,Y. and Vaes,G. 1980. Inhibition of bone resorption in culture by inhibitors of thiol proteinases.Biochem. J., 192: 365-368.
Dibb, N. J., Newman, A. J. 1989. Evidence that introns arose at proto-splice sites. EMBOJ. 8(7): 2015-2021.
Dolenc I., Turk B., Pungercic G, RitonjaA., Turk V. 1995. Oligomeric structure and substrate induced inhibition of human cathepsin J.C. Biol. Chem. 270(37): 21626-21631.
Doolittle, W.F., 1978. Genes in pieces: were they ever together?Nature 272: 581-582.
Duwat, P., Ehrlich, S. D., and Gruss, A. 1992. Use of degenerate primers for polymerase chain reaction cloning and sequencing of the Lactococcus lactis subsp. lactis recA gene. Appl Environ Microbiol. 58(8): 2674-2678.
97
Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. Eeckhout,Y. and Vaes,G. 1977. Further studies on the activation of procollagenase, the latent precursor of bone collagenase. Effects of lysosomal cathepsin B, plasmin and kallikrein, and spontaneous activation.Biochem. J., 166: 21-31.
Erickson-Lawrence, M., Zabludofe, S.D., Wright, W.W., 1991. Cyclic protein 2, a secretory product of rat sertoli cells is the proenzyme form of cathepsinMol. L. Endocrinol. 5: 1789-1798.
Esteban, J.G., Bargues, M.D., Mas-Coma, S., 1998. Geographical distribution, diagnosis and treatment of human fascioliasis: a review.Res. Rev. Parasitol. 57: 309-18.
Evans, T.G., Teixeira, M.J., McAuliffe, I.T., 1986. Epidemiology of visceral leishmaniasis Northeast Brazil. J. Infect. Dis. 166: 1124-1132.
Fagbemi, B.O., Guobadia, E.E., 1995. Immunodiagnosis of fascioliasis in ruminants using a 28-kDa cysteine protease ofFasciola gigantica. Vet. Parasitol. 57: 309-18.
Fedorov, A., Merican, A. F., Gilbert W. 2002. Large-scale comparison of intron positions among animal, plant, and fungal genes.Proc Natl Acad Sci USA. 99(25): 16128-16133.
Fedorova, J., and Fedorova, A. 2003. Introns in gene evolution.Genetica 118: 123-131.
Frydman, J. 2001. Folding of newly translated protein in vivo: the role of molecular chaperones. Annu. Rev. Biochem. 70: 603-647.
Garci, A., Perona, R. and Sastre, L. 1997. Polymorphism and structure of the gene coding for the al subunit of the Artemia franciscana Na/K-ATPase.Biochem. J. 321: 509-518.
Gilbert W, Marchionni M, McKnight G: On the antiquity of introns.Cell 1986, 46: 151-154.
Gotoh O. 1998. Divergent structures of Caenorhabditis elegans cytochromeP450 genes suggest the frequent loss and gain of introns during the evolution of nematodes. Mol. Biol. Evol. 15(11): 1447-1459.
Gotthardt, D., Warnatz, H.J., Henschel, O., Bruckert, F., Schleicher, M., Soldati, T. 2002. High-resolution dissection of phagosome maturation reveals distinct membrane trafficking phases. Mol Biol Cell. 13: 3508-3520.
98
Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. Goulet, B., Baruch, A., Moon, N-S., Poirer, M., Sansregret, L., Erickson, A., Bogyo, M., and Nepveu, A. 2004. A cathepsin L isoform that is devoid of a signal peptide localizes to the nucleus in S phase and processes the CDP/Cux transcription factor. Mol. Cell, 14: 207-219.
Grzonka, Z., Jankowska, E., Kasprzykowski, F., Kasprzykowska, R., Lankiewicz, L., Wiczk, W., Wieczerzak, E., Ciarkowski, J., Drabik, P., Jankowski, R., Kozak, M., Jask olski, M., Grubb A. 2001. Structural studies of cysteine proteases and their inhibitors. Acta Biochim Polon. 48: 1-20.
Guinec,N., Dalet-Fumeron,V. and Pagano,M. 1993. "In vitro" study of basement membrane degradation by the cysteine proteinases, cathepsins B, B-like and L. Digestion of collagen IV, laminin, fibronectin, and release of gelatinase activities from basement membrane fibronectin.Biol. Chem. Hoppe- Seyler, 374:1135-1146.
Guncar, G., Pungercic, G., Klemencic, I., Turk, V., Turk, D. 1999. Crystal structure of MHCclass II-associated p41 Ii fragment bound to cathepsin L reveals the structural basis for differentiation between cathepsins L andEMBOJ. S. 18: 793-803.
Homma, K., and Natori, S. 1996. Identification of substrate proteins for cathepsin L that are selectively hydrolyzed during the differentiation of imaginal discs of Sarcophagaperegrina. Eur. J. Biochem. 240: 443-447.
Hu, K. G., and Leung, P. C. 2004. Shrimp cathepsin L encoded by an intronless gene has predominant expression in hepatopancrease, and occurs in the nucleus of oocyte. Comp. Biochem. Physiol 137: B 21-33.
Hu, K. G., and Leung, P. C. 2006. Complete, precise, and innocuous loss of multiple introns in the currently intronless, active cathepsin L-like genes, and inference from this event. (Article in press).
Huete-Perez, J.A., Engel, J.C., Brinen, L.S., Mottram, J.C., McKerrow, J.H. 1999. Protease trafficking in two primitive eukaryotes is mediated by a prodomain protein motif. JBiol Chem. 274: 16249-16256.
Hughes, A. L. 1994. Evolution of cysteine protease in eukaryotes.Mol. Phylogenet. Evol. 3(4): 310-321.
Ish-Horowicz, D. and J.F. Burke. 1981. Rapid and efficient cosmid cloning.Nucleic Acids Res. 9: 2989.
Ishidoh K, Toeatari T, Imajoh S, Kawasaki H, Kominami E, Katunuma N, Suzuki K. 1987. Molecular cloning and sequencing of cDNA for rat cathepsin L.FEBS Lett. 1987. 223(1): 69-73.
99
Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. Ishidoh, K., and Kominami, E. 1995. Procathepsin L degrades extracellular matrix proteins in the presence of glycosaminoglycans in vitro.Biochem. Biophys. Res. Commun. 217: 624-631.
Ishidoh, K., Imajoh, S., Emori, Y., Ohno, S., Kawasaki, H., Minami, Y., Kominami, E., Katunuma, N., and Suzuki, K. 1987. Molecular cloning and sequencing of cDNA for rat cathepsin H. Homology in pro-peptide regions of cysteine proteinases.FEBS Lett. 226: 33-37.
Jackson, S. A. and Clegg, J.S. 1996. Ontogeny of low molecular weight stress protein p26 during early development of the brine shrimp,Artemia franciscana. Development Growth and Differentiation 32: 41-49.
Jean, D., Guillaume, N., and Frade, R. 2002. Characterization of human cathepsin L promoter and identification of binding sites for NF-Y, Spl and Sp3 that are essential for its activity.Biochem. J. 361: 173-184.
Jeffreys, A.J. and R.A. Flavell, 1977. The rabbit beta-globin gene contains a large insert in the coding sequence. Cell 12: 1097-1108.
Karrer, K.M., Peiffer, S.L., DiTomas, M.E.1993. Two distinct gene subfamilies within the family of cysteine protease genes.Proc Natl Acad Sci USA. 90(7): 3063-3067.
Kestemont, P., Cooremans, J., Ayad, A.A., and Melard, C. 1999. Cathepsin L in eggs and larvae of perchPerea fluviatilis. Fish Physiol. Biochem. 21: 59-64.
Kirschke H., Barrett, A.J., Rawlings, N.D. 1995. Proteinases 1: lysosomal cysteine proteinases. Protein Profile. 2(14): 1581-1643.
Kirschke H, Wiederanders B. 1994. Cathepsin S and related lysosomal endopeptidases. Methods. Enzymol. 244: 500-511.
Kobayashi,H., Schmitt,M., Goretzki,L., Chucholowski,N., Calvete,J., Kramer,M., Gunzaler,W.A., Janicke,F. and Graeff,H. 1991. Cathepsin B efficiently activates the soluble and the tumor cell receptor-bound form of the proenzyme urokinase-type plasminogen activator (Pro-uPA).J. Biol. Chem., 266: 5147-5152.
Koblinski, J. E., Ahram, M., Sloane, B.F. 2000. Unraveling the role of proteases in cancer. Clin. Chim. Acta. 291: 113- 135.
Krasko, A., Gamulin, V., Seack, J,. Steffen, R., Schroder, H.C., Muller, W.E. 1997. Cathepsin, a major protease of the marine spongeGeodia cydonium: purification of the enzyme and molecular cloning of cDNA.Mol. Mar. Biol. Biotechnol. 6: 296-307.
100
Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. Kuipers, A.G., and Jongsma, M.A. 2004. Isolation and characterization of cathepsin L-like cysteine protease cDNAs from western flower thripsFrankliniella ( occidentalis). Comp. Biochem. Physiol. 139: B. 65-75.
Kwaitowski, J., Krawczyk, M., Kornacki, M., Bailey, K., Ayala, F.J. 1995. Evidence against the exon theory of genes derived from the triosephosphate isomerase.Proc Natl Acad Sci USA 92: 8503-8506.
Lamparero, A. M., Casero, M. C., Caro, J. O., and Sastre, L. 2003. Regulation of promoter occupancy during activation of cryptobiotic embryos from the crustacean Artemiafranciscana. J. Exp. Biol. 206:1565-1573.
Lazzarino, D. and Gabe,l C.A. 1990. Protein determinants impair recognition of procathepsin L phosphorylated oligosaccharides by the cation-independent mannose 6-phosphate receptor. J. Biol. Chem. 265: 11864-11871.
Le Boulay, C., Sellos, D., Van Wormhoudt, A. 1998. Cathepsin L gene organization in crustaceans. Gene. 218(1-2): 77-84.
Lenarcic B, Turk V. 1999. Thyroglobulin type-1 domains in equistatin inhibit both papain-like cysteine proteinases and cathepsinJ D. Biol Chem. 274: 563—566.
Liang, P. and MacRae, T.H. 1999. The synthesis of a small heat shock/a-crystallin protein in Artemia and its relationship to stress tolerance during development. Developmental Biology 207: 445-456.
Liu, J. and E.S. Maxwell, 1990. Mouse U14 snRNA is encoded in an intron of the mouse cognate hsc70 heat shock gene. Nucl. Acids Res. 18: 6565- 6571.
Logsdon, J.M., 1998. The recent origin of spliceosomal introns revised.Curr. Opin. Genet. Dev. 8: 637-648.
Logsdon, J.M., A. Stoltzfus, and W.F. Doolittle, 1998. Molecular evolution: recent cases of spliceosomal intron gain?Curr. Biol. 8: R560-R563.
Logsdon, J.M., M.G. Tyshenko, C. Dixon, J.D. Jafari, V.K. Walker, and J.D. Palmer, 1995. Seven newly discovered intron positions in the triose-phosphate isomerase gene: evidence for the intronslate theory.Proc. Natl. Acad. Sci. USA 92: 8507-8511.
Lu, J. and Warner, A.H. 1991. Immunodetection of thiol protease levels in various populations o f Artemia cysts and during development.Biochem. Cell Biol. 69: 96-101.
101
Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. Lustigman, S., McKerrow, J.H., Shah, K., Lui, J., Huima, T., Hough, M., and Brotman, B. 1996. Cloning of a cysteine protease required for the molting ofOnchocerca volvulus third stage larvae.J. Biol. Chem. 271: 3081 - 3089.
Maciewicz, R.A., and Etherington, D.J. 1988. Enzyme immunoassay for cathepsin B and cathepsin L in synovial fluids from patients with arthritis.Biochem. Soc. Trans. 16: 812-813.
Maciewicz,R.A., Wotton,S.F., Etherington,D.J. and Duance,V.C. 1990. Susceptibility of the cartilage collagens types II, IX and XI to degradation by the cysteine proteinases, cathepsins B and L.FEBSLett., 269: 189-193.
Maniatis, T., and R. Reed, 2002. An extensive network of coupling among gene expression machines.Nature 416: 499-506.
Marchler-Bauer, A., Anderson, J.B., Cherukuri, P.F., DeWeese-Scott, C., Geer, L.Y., Gwadz, M., He, S., Hurwitz, D.I., Jackson, J.D., Ke, Z., Lanczycki, C.J., Liebert, C.A., Liu, C., Lu, F., Marchler, G.H., Mullokandov, M., Shoemaker, B.A., Simonyan, V., Song, J.S., Thiessen, P.A., Yamashita, R.A., Yin, J.J., Zhang, D., Bryant, S.H. 2005. "CDD: a Conserved Domain Database for protein classification.", Nucleic Acids Res. 33: D 192-196.
Marx, J.L., 1987. A new wave of enzymes for cleaving prohormones.Science 235: 285-286.
Mas-Coma, S., Esteban, J.G., Bargues, M.D., 1999. Epidemiology of human fascioliasis: a review and proposed new classification.Bull. WHO 77: 340-346.
Massimi, I., Park, E., Rice, K., Muller-Esterl, W., Sauder, D., McGavin, M.J. 2002. Identification of a novel maturation mechanism and restricted substrate specificity for the SspB cysteine protease ofStaphylococcus aureus. Biol Chem. 277: 41770—41777.
Matys,V., Fricke,E., Geffers,R., Gossling,E., Haubrock,M., Hehl,R., Homischer,K., Karas,D., Kel,A.E., Kel-Margoulis,O.V. 2003. TRANSFAC: transcriptional regulation, from patterns to profiles. Nucleic Acids Res., 31: 374-378.
McGrath, M. E. 1999. The Lysosomal Cysteine Proteases.Annu. Rev. Biophys. Biomol. Struct. 28: 181-204.
McIntyre, G F. and Erickson, A. H. 1993. The lysosomal proenzyme receptor that binds procathepsin L to microsomal membranes at pH 5 is a 43-kDa integral membrane protein.Proc Natl Acad Sci USA. 90: 10588-10592.
102
Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. Menard, R., Carmona, E., Takebe, S., Dufour, E., Plouffe, C., Mason, P., and Mort, J. S. 1998. Autocatalytic Processing of Recombinant Human Procathepsin J.L. Biol. Chem. 273(8): 4478-4484.
Mehtani, S., Gong, G., Panella, J., Subbiah, S., Peffley, D. M., and Frankfater, A. 1998. In Vivo Expression of an Alternatively Spliced Human Tumor Message That Encodes a Truncated Form of Cathepsin B.J. Biol. Chem. 273: 13236-13244.
Metrione, R. M., Okuda, Y., Fairclough, G F Jr. 1970. Subunit structure of dipeptidyl transferase. Biochem. 9(12): 2427-2432.
Michalek,M.T., Beancerraf,B. and Rock,K.L. 1992. The class II MHC-restricted presentation of endogenously synthesized ovalbumin displays clonal variation, requires endosomal/lysosomal processing, and is up-regulated by heat shock.J. Immunol., 148: 1016-1024.
Miyata, S., and Kubo, T. 1997. Inhibition of gastrulation Xenopusin embryos by an antibody against a cathepsin L-like protease.Dev. Growth Differ. 39: 111-115.
Miyata, S., Nishibe, Y., and Kihara, H.K. 1995. Effects on properties of a thiol protease from Xenopus embryos in substrate assay conditions.Cell Biol. Int. 19: 33-38.
Monteiro, A.C.S., Abrahamson, M., Lima, A.P.C.A, Vannier-Santos MA, Scharfstein J. 2001. Identification, characterization and localization of chagasin, a tight-binding cysteine protease inhibitor inTrypanosoma cruzi. J Cell Sci. 114: 3933-3942.
Mundodi, V., Somanna, A., Farrell, P. J., Gedamu, L. 2002 . Genomic organization and functional expression of differentially regulated cysteine protease genesLeishmania of donovani complex. Gene 282: 257-265.
Musil, D., Zucic, D., Turk, D., Engh, R.A., Mayr, I., Huber, R., Popovic, T., Turk, V., Towatari, T., Katunuma, N.. 1991. The refined 2.15 A X-ray crystal structure of human liver cathepsin B: the structural basis for its specificity.EMBOJ. 10(9): 2321-2330.
Nagainis, P. A., and Warner, A.H. 1979. Evidence for the presence of an acid protease and protease inhibitors in dormant embryosArtemia of salina. Dev. Biol. 68: 259-270.
Neilson, H., Engelbrecht, J., Brunak, S. and von Hiejne, G. 1997. Identification of prokaryotic and eukaryotic signal peptides and prediction from their cleavage sites. Protein Eng. 10: 1-6.
103
Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. O’Neil, S.M., Parkinson, M., Dowd, A.J., Strauss, W., Angles, R., Dalton, J.P., 1999. Immunodiagnosis of human fascioliasis using recombinantFasciola hepatica cathepsin LI cysteine protease.Am. J. Trop. Med. Hyg. 60: 749-51.
Ortega, M. A., Diaz-Guerra, M., Sastre, L. 1996. Actin gene structure in two Artemia species, A. franciscana and A. parthenogenetica.J. Mol. Evol. 143(3): 224-35.
Otto, H. H., and Schirmeister, T. 1997. Cysteine proteases and their inhibitors.Chem Rev.; 97: 133-71.
Palmer, J.D., and J.M. Logsdon, 1991. The recent origin of introns.Curr. Opin. Genet. Dev. 1: 470-477.
Polgar, L. and Halasz, P. 1982. Current problems in mechanistic studies of serine and cysteine proteinases.Biochem J. 207(1): 1-10.
Quandt, K., Freeh, K., Karas, H., Wingender, E., Werner, T., 1995. Matlnd and Matlnspector - new fast and versatile tools for detection of consensus matches in nucleotide sequence data.Nucleic Acids Res. 23: 4878-4884.
Reed, R., and K. Magni, 2001. A new view of mRNA export: separating the wheat from the chaff. Nat. Cell Biol. 3: E201-E204.
Rescheleit, D.K., Rommerskirch, W.J., Weideranders, B., 1996. Sequence analysis and distribution of two new human cathepsin L splice variants.FEBS Lett. 394: 345-348.
Rice, K., Perlata, R., Bast, D., Azavedo, J., McGavin, M. J. 2001. Description of staphylococcus serine protease (ssp) operon inStaphylococcus aureus and nonpolar inactivation of sspA-encoded serine protease.Infect Immun.; 69: 159—169.
Rigden, D., Moscolov, V. V., Galperin, M. 2002. Sequence conservation in the chagasin family suggests a comon trend in cysteine proteinase binding by unrelated protein inhibitors.Protein Sci; 11: 1971—1977.
Roche,P. A. and Cresswell,P. 1991. Proteolysis of the class II-associated invariant chain generates a peptide binding site in intracellular HLA-DR molecules.Proc. Natl Acad. Sci. USA, 88: 3150-3154.
Rose, T.M. 2005. CODEHOP-mediated PCR - a powerful technique for the identification and characterization of viral genomes.Virol. J. 2:20.
Roy, S. W., Fedorov, A., and Gilbert, W. 2003. Large-scale comparison of intron positions in mammalian genes shows intron loss but no gain.Proc Natl Acad Sci USA. 100(12): 7158-7162.
104
Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. Roy, S. W., Nosaka, M., de Souza, S. J., Gilbert, W. 1999. Centripetal modules and ancient introns.Gene. 238(1): 85-91.
Rzychon, M., Sabat, A., Kosowska, K., Dubin, A., Potempa, J. 2003. Staphostatins: an expanding new group of proteinase inhibitors with a unique specificity for the regulation of staphopains,Staphylococcus spp. cysteine proteinases.Mol Microbiol. 49: 1051-1066.
Rzychon, M., Chmiel, D. and Niemczyk, J. S. 2004. Modes of inhibition of cysteine proteases. Acta Biochimica Polonica 51(4): 861-873.
Sahagian, G.G., and Gottesman, M.M. 1982. The predominant protein of transformed murine fibroblasts carries the lysosomal mannose 6-phosphate recognition marker.J. Biol. Chem. 257: 11145-11150.
Sajid, M., and McKerrow, J. H. 2002. Cysteine protease of parasitic organisms.Mol. Biochem. Parasitol. 120: 1-21.
Sambrook, J., Fritsch, E. F., Maniatis, T. 1989. Molecular cloning. A laboratory Manual. 2nd ed. Cold Spring Harbor Laboratory Press.
Sandersen, S. J., Westrop, G. D., Scharfstein, J., Mottram, J. C., Coombs, G. H. 2003 Functional conservation of a natural cysteine peptidase inhibitor in protozoan and bacterial pathogens. FEBS Lett. 542: 12-16.
Santos, C. C., Sant'anna, C., Terres, A., Cunha-e-Silva, N. L., Scharfstein, J., de A Lima, A. P. 2005. Chagasin, the endogenous cysteine-protease inhibitor of Trypanosoma cruzi, modulates parasite differentiation and invasion of mammalian cells. J. Cell. Sci. 118: 901-915.
Sastre, L. 1999. Isolation and characterization of the gene coding for Artemia ffanciscana TATA-binding protein: expression in cryptobiotic and developing embryos. Biochim. Biophy. Acta 1445: 271-282.
Schlereth A, Standhardt D, Mock HP, Muntz K. 2001. Stored cysteine proteinases start globulin mobilization in protein bodies of embryonic axes and cotyledons during vetch (Vicia sativa L.) seed germination.Planta. 212(5-6): 718-727.
Seth, P., Mahajanl, V. S., Chauhan, S. S. 2003. Transcription of human cathepsin L mRNA species hCATL B from a novel alternative promoter in the first intron of its gene. Gene 321: 83-91.
Sever, N., Filipic, M., Brzin, J., Lah, T.T. 2002. Effect of cysteine proteinase inhibitors on murine B16 melanoma cell invasion in vitro.Biol. Chem. 383: 839-842.
105
Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. Shi, GP., Webb, A.C., Foster, K.E., Knoll, J.H., Lemere, C.A., Munger, J.S., Chapman, H.A. 1994. Human cathepsin S: chromosomal localization, gene structure, and tissue distribution. J. Biol. Chem. 269(15): 11530-11536.
Shinagawa,T., Do,Y.S., Baxter,J.K., Carilli,C., Schilling,J. and Hsueh,W.A. 1990. Identification of an enzyme in human kidney that correctly processes prorenin.Proc. Natl Acad. Sci. USA, 87: 1927-1933.
Smith, S. M., Gottesman, M. M. 1989. Activity and deletion analysis of recombinant human cathepsin L expressed in Escherichia coli. J. Biol. Chem. 264(34): 20487-20495.
Southern, E.M. 1975. Detection of specific sequences among DNA fragments separated by gel electrophoresis.J. Mol. Biol. 98: 503.
Steams, N.A., Dong, J., Pan. J.-X., Brenner, D.A., and Sahagian, G.G. 1990. Comparison of cathepsin L synthesized by normal and transformed cells at the gene, message, protein and oligosaccharide levels.Arch. Biochem. Biophys. 283: 447-457.
Storer, A.C. and Menard, R. 1994. Catalytic mechanism in papain family of cysteine peptidases. Methods Enzymol. 244: 486-500.
Swensen, J. 1996. PCR with random primers to obtain sequence from yeast artificial chromosome insert ends or plasmids.BioTechniques 20: 486-491.
Szpaderska, A and Frankfater, A. 2001. An intracellular form of cathepsin B contributes to invasiveness in cancer.Cancer Res. 61: 3493-500.
Takahashi,H., Cease,K.B. and Berzofsky,J.A. 1989. Identification of proteases that process distinct epitopes on the same protein.J. Immunol., 142: 2221-2229.
Tarrio R, Rodriguez-Trelles F, Ayala FJ. 1998. NewDrosophila introns originate by duplication.Proc Natl Acad Sci USA 95: 1658-1662.
Tselentis, Y., Gilkas, A., Chaniotis, B., 1994. Kala-azar in Athens basin.Lancet. 343(8913): 1635.
Turk, B., Bieth, J.G., Dolence, I., Turk, D. Cimerman N, Kos J, Colic A, Stoka V, Turk V. 1995. Regulation of the activity of lysosomal cysteine proteinases by pH-induced inactivation and/ or endogenous protein inhibitors, cystatins.Biol. Chem. 376(4): 225-230.
Turk, B., Turk, V., Turk, D. 1997. Structural and functional aspects of papain-like cysteine proteinases and their protein inhibitors.Biol. Chem. 378:141- 150.
106
Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. Turk, B., Turk, D., Turk, V. 2000. Lysosomal cysteine proteases: more than scavengers.Biochim. Biophy. Acta 1477: 98-111.
Uchiyama,Y., Watanabe,T. Watanabe,M., Ishii,I., Matsuba,H., Waguri,S. and Kominami,E. 1989. Immunocytochemical localization of prorenin, renin, and cathepsins B, H, and L in juxtaglomerular cells of rat kidney.J. Histochem. Cytochem., 37: 691-696.
Vemet, T., Berti, P.J., de Montigny, C., Musil, R., Tessier, D.C., Menard, R., Magny, M.C., Storer, A.C., Thomas, D.Y. 1995. Processing of the papain precursor. The ionization state of a conserved amino acid motif within the Pro region participates in the regulation of intramolecular processing. J. Biol. Chem. 270(18): 10838-10846.
Vemet, T., Berti, P. J., de Montigny, C., Musil, R., Tessier, D. C., Me'nard, R., Magny, M.-C., Storer, A. C., and Thomas, D. Y. 1995.J. Biol. Chem. 270: 10838-10846.
Villadangos, J.A., Bryant, R.A., Deussing, J., Drissen, C., Lennon-Dumenil, A.M., Riese, R.J., Saftig, P., Shi, G.P., Chapman, H.A., Peters, C., Ploeghl, H.L., 1999. Proteases involved in MHC class II antigen presentation.Immunol. Rev. 172: 109-120.
Volk, H., Kurz, U., Linder, J. Klumpp, S., Gnau, V., Jung, G., and Schultz, J. 1996. Cathepsin L is anintracellular and extracellular protease.Eur. J. Biochem. 238: 198-206.
Von Eggeling, F. and Spielvogel, H. 1995. Applications of random PCR.Cell. Mol. Biol. (Noisy-le-grand). 41(5): 653-670.
Warner, A.H. and Shridhar, V. 1985. Purification and characterization of a cytosol protease from dormant cysts of the brine shrimpArtemia. J. Biol. Chem. 260: 7008-7014.
Warner, A. H. and Sonnenfeld-Karcz, M. J. 1992. Purification and partial characterization of thiol protease inhibitors from embryos of the brine shrimpArtemia. Biochem. Cell Biol. 70: 1020-1029.
Warner, A.H., and Matheson, C. 1998. Release of proteases from larvae of the brine shrimp Artemia franciscana and their potential role during the molting process.Comp. Biochem. Physiol. 119:B 255-263.
Warner, A.H., Perz, M.J., Osahan, J.K., and Zielinski, B.S. 1995. Potential role in development of the major cysteine protease in larvae of the brine shrimpArtemia franciscana. Cell Tissue Res. 282: 221-231.
107
Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. Watson, M.E. 1984. Compilation of published signal sequences.Nucleic Acids Res. 12: 5145-5164.
Wex, T., Buhling, F., Wex, H., Gunther, D., Malfertheiner, P., Weber, E., and Bromme, D. 2001. Human cathepsin W, a cysteine protease predominantly expressed in NK cells, is mainly localized in the endoplasmic reticulum. J. Immunol. 167: 2172-2178.
Wiederanders, B. 2003. Structure-function relationships in class CA1 cysteine peptidase propeptides. Acta Biochimica Polonica 50: 691-713.
Wijffels, G.L., Panaccio, M., Salvatore, L., Wilson, L., Walker, I.D., and Spithill, T.W. 1994. The secreted cathepsin L-like proteinases of the trematode, Fasciola hepatica, contain 3-hydroxyproline residues. Biochem. J. 299: 781-790.
Wingender, E., Chen, X., Fricke, E., Geffers, R., Hehl, R., Liebich, I., Krull, M., Matys, V., Michael, H., Ohnhäuser, R. et al. 2001. The TRANSFAC system on gene expression regulation. Nucleic Acids Res. 29: 281-283.
Wingender, E., Chen, X., Hehl, R., Karas, H., Liebich, I., Matys, V., Meinhardt, T., Prüß, M., Reuter, I., and Schacherer, F. 2000. TRANSFAC: an integrated system for gene expression regulation. Nucleic Acids Res. 28: 316-319.
Wolters, P.J. and Chapman, H.A. 2000. Importance of lysosomal cysteine proteases in lung disease. Respir. Res. 1: 170-177.
Yamasaki, H., Aoki, T., and Oya, H. 1989. A cysteine proteinase from the liver fluke Fasciola spp.: purification, characterization, localization and application to immunodiagnosis. Jpn. J. Parasitol. 38: 373-384.
Yamasaki, H., Mineki, R., Murayama, K., Ito, A., and Aoki, T. 2002. Characterization and expression of the Fasciola gigantica cathepsin L gene. International Journal for Parasitology 32: 1031-1042.
Yan, S., Berquin, I.M., Troen, B.R., and Sloane, B.F. 2000. Transcription of human cathepsin B is mediated by Sp1 and Ets family factors in glioma. DNA Cell Biol. 19(2): 79-91.
VITA AUCTORIS
Name: Cao JianPing
Born: February 13th, 1980, Shanghai, P.R. China.
Education: 2003-2006
Master of Science Program, Department of Biological Sciences, University of Windsor, Windsor, Ontario, Canada.
1998-2003 Bachelor of Medicine, Department of Medicine, Shanghai Second Medical University, Shanghai, P.R. China.
Publication:
Abstract: Structure of the cathepsin L gene in two species of the crustacean, Artemia. A.H. Warner, M.F. Shaw, R. Shamoon, and P.J. Cao. The International Proteolysis Society, Quebec City, Canada, October 15-19, 2005.