US 20040146980A1
(19) United States
(12) Patent Application Publication
Merkulov et al.
(10) Pub. No.: US 2004/0146980 A1
(43) Pub. Date: Jul. 29, 2004

(54) ISOLATED HUMAN PROTEINS, NUCLEIC ACID MOLECULES ENCODING HUMAN LIPASE PROTEINS, AND USES THEREOF

(75) Inventors: Gennady V. Merkulov, Baltimore, MD (US); Karen A. Ketchum, Germantown, MD (US); Valentina Di Francesco, Rockville, MD (US); Ellen M. Beasley, Darnestown, MD (US)

(22) Filed: Mar. 18, 2004

Related U.S. Application Data

(62) Division of application No. 10/003,302, filed on Dec. 6, 2001, which is a division of application No. 09/820,001, filed on Mar. 29, 2001, now Pat. No. 6,387,680.

Publication Classification

(51) Int. Cl. ...... C12N 9/20; C07H 21/04
(52) U.S. Cl. ...... 435/69.1; 435/198; 435/320.1; 435/325; 536/23.2

Correspondence Address:
CELERA GENOMICS CORP.
ATTN: WAYNE MONTGOMERY, VICE PRES, INTEL PROPERTY
45 WEST GUDE DRIVE
C2-4iii.20
ROCKVILLE, MD 20850 (US)

(73) Assignee: APPLERA CORPORATION, Norwalk, CT

(21) Appl. No.: 10/802,805

(57) ABSTRACT

The present invention provides amino acid sequences of peptides that are encoded by genes within the human genome, the lipase peptides of the present invention. The present invention specifically provides isolated peptide and nucleic acid molecules, methods of identifying orthologs and paralogs of the lipase peptides, and methods of identifying modulators of the lipase peptides.

Patent Application Publication Jul. 29, 2004 Sheet 1 of 36 US 2004/0146980 A1

1 CTOTTACTOT TCAGCOTGAT GTCAAAAGCA AAAGTTCAGA AGITCOTCAT
51 CAATAAGGAG TCCTTGTGAG CAGGTGAAGC TCATCTAACTAGGCATTTCT
101 ATGATGTGGC TGCTTT TAAC AACAACTTGT TTGATCTGTG GAACTT TAAA
151 TGCTGGTGGA TTCCTTGATT TGGAAAATGA AGTGAATCOT GAGGTGTGGA
201 TGAATACTAG TGAAATCATC ATCTACAATG, GCTACCCCAG TGAAGAG TAT
251 GAAGTCACCA CTGAAGATGG GTATATACTC CTTGTCAACA GAATTCCTTA
301 TGGGCGAACA CATGCTAGGA GCACAGGTCC CCGGCCAGTT GTGTATATGC
351 AGCATGCCCT GITTGCAGAC AATGCCTACT GGCTTGAGAWA TATGCCAAT
401 GGAAGCCT TG GATTCCTTCT AGCAGATGCA GGT TATGATG TATGGATGGG
451 AAACAGTCGG GGAAACACTT GGTCAAGAAG ACACAAAACA CTOTCAGAGA
501 CAGATGAGAA ATTCTGGGCC TTTAGTTTTG ATGAAATGGC CAAATATGAT
551 OTCCCAGGAG TAATAGACTT CATTG TAAAT AAAACTGGTC AGGAGAAATT
601 GTATTTCATT GGACATTCAC TTGGCACTAC AATAGGGTTT GTAGCCTTTT
651 CCACCATGCC TGAACTCGCA CAAAGAATCA AAATGAATTT TGCCTTGGGT
701 CCTACGAC CATCAAATA TCCCACGGGC ATTTTTACCA GGTTTTTCT
751 ACTTCCAAAT TOCATAATCA AGGOTG TT GG TACCAAA GGT T TOT T T T
801 TAGAAGATAA GAAAACGAAG ATTAGCTCTA CCAAAATC TG CAACAATAAG
851 ATACTCTGGT TIGATAG TAG CGAAT I TATG CCT TATGGG CTGGATCCAA
901 CAAGAAAAAT ATGAATCAGA GTCGAATGGA TGTGTATATG TCACATGCTC
951 CCACTCG! TC ATCAGTACAC AACATTCTGC ATATAAAACA GCTT TACCAC
1001 TCTGATGAAT TCAGAGCTTA TGACTGGGGA AATGACGCTGATAATATGAA
1051 ACATTACAAT CAGAGTCATC CCCCTATATA TGACCTGACT GCCATGAAAG
1101 TGCCTACTGC TATT TGGGCT GGTGGACATG ATGTCCTCGG AACACCCCAG
1151 GATGTGGCCA GGATACTCCC TCAAACAAG AG TOTT TCAT TAGTGCTTAAG
1201 CCATTGCCA GAATGGGAAC CCACCTTGA TTTTGTCTGG GGCCTTGATG
1251 CCCCTCAACG GATGTTCAGTGGAAATCATA ACCTTTAATG AAGGCATATT
1301 TCCTAAATGC CAATGCATTT TACCTTTTTC AATTTAAAGG TTGGTTTCCA
1351 AAGCCCITTAC (SEQ ID NO: 1)

FEATURES:
5' UTR: 1 - 100
Start Codon: 101
Stop Codon: 1286
3' UTR: 1289

Homologous proteins:
Top 10 BLAST Hits:
CRA|18000004922653 /altid=gi|7434997 /def=pir||G01416 lysosomal... 431 e-120
CRA|18000004903706 /altid=gi|542751 /def=pir||S41408 lysosomal... 430 e-119
CRA|18000004924799 /altid=gi|4557721 /def=ref|NP_000226.1| lipa... 428 e-119
CRA|98000043616611 /altid=gi|12844223 /def=dbj|BAB26283.1 (AK0... 415 e-115
CRA|98000043617058 /altid=gi|12845127 /def=dbj|BAB26629.1 (AK0... 415 e-115
CRA|98000043616593 /altid=gi|12844194 /def=dbj|BAB26272.1 (AK0... 414 e-115
CRA|98000043617174 /altid=gi|12845372 /def=dbj|BAB26725.1 (AK0... 414 e-115

FIG. 1A

CRA|98000043617140 /altid=gi|12845298 /def=dbj|BAB26697.1 (AK0... 414 e-115
CRA|98000043617224 /altid=gi|12845477 /def=dbj|BAB26766.1 (AK0... 414 e-114
CRA|98000043616955 /altid=gi|12844939 /def=dbj|BAB26556.1 (AK0... 414 e-114

EST:
gi|8003062 /dataset=dbest /taxon=960... 62 4e-07
gi|8000757 /dataset=dbest /taxon=960... 54 9e-05

EXPRESSION INFORMATION FOR MODULATORY USE:
gi|8003062 Stomach normal
gi|8000757 Stomach normal

Tissue expression:
Human leukocyte

FIG. 1B

1 MMWLLLTTTC LICGTLNAGG FLDLENEVNP EVWMNTSEII IYNGYPSEEY
51 EVTTEDGYIL LVNRIPYGRT HARSTGPRPV VYMQHALFAD NAYWLENYAN
101 GSLGFLLADA GYDVWMGNSR GNTWSRRHKT LSETDEKFWA FSFDEMAKYD
151 LPGVIDFIVN KTGQEKLYFI GHSLGTTIGF VAFSTMPELA QRIKMNFALG
201 PTISFKYPTG IFTRFFLLPN SIIKAVFGTK GFFLEDKKTK IASTKICNNK
251 ILWLICSEFM SLWAGSNKKN MNQSRMDVYM SHAPTGSSVH NILHIKQLYH
301 SDEFRAYDWG NDADNMKHYN QSHPPIYDLT AMKVPTAIWA GGHDVLGTPQ
351 DVARILPQIK SLSLVLSLLP EWEPTFDFVW GLDAPQRMFS GNHNL (SEQ ID NO: 2)
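The coordinates listed under FEATURES in FIG. 1 (start codon at 101, stop codon at 1286) relate SEQ ID NO: 1 to SEQ ID NO: 2. The following is a minimal sketch of that relationship, not part of the original disclosure: it assumes the standard genetic code, treats the listed positions as 1-based positions of the first base of each codon, and uses the 101 row of SEQ ID NO: 1 with the OCR spacing removed. The names `translate`, `coding_length`, and `ROW_101` are hypothetical.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna):
    """Translate every complete codon of dna; trailing bases are ignored."""
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


def coding_length(start_codon, stop_codon):
    """Return (codons including stop, encoded residues).

    Both arguments are assumed to be the 1-based position of the first
    base of the respective codon, as in the FEATURES block of FIG. 1.
    """
    nucleotides = (stop_codon + 2) - start_codon + 1
    if nucleotides % 3 != 0:
        raise ValueError("start and stop codons are not in the same frame")
    codons = nucleotides // 3
    return codons, codons - 1


# Bases 101-150 of SEQ ID NO: 1 (the row beginning at the start codon).
ROW_101 = "ATGATGTGGCTGCTTTTAACAACAACTTGTTTGATCTGTGGAACTTTAAA"

print(translate(ROW_101))        # MMWLLLTTTCLICGTL
print(coding_length(101, 1286))  # (396, 395)
```

Under these assumptions the first sixteen translated residues agree with the start of SEQ ID NO: 2, and the coordinates imply a 395-residue peptide, which is the length of the sequence shown in FIG. 2.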

FEATURES:
Functional domains and key regions:

[1] PDOC00001 PS00001 ASN_GLYCOSYLATION
N-glycosylation site
Number of matches: 5
1 35-38 NTSE
2 100-103 NGSL
3 160-163 NKTG
4 272-275 NQSR
5 320-323 NQSH

[2] PDOC00005 PS00005 PKC_PHOSPHO_SITE
Protein kinase C phosphorylation site
Number of matches: 4
1 125-127 SRR
2 204-206 SFK
3 243-245 STK
4 266-268 SNK

[3] PDOC00006 PS00006 CK2_PHOSPHO_SITE
Casein kinase II phosphorylation site
Number of matches: 8
1 53-56 TTED
2 130-133 TLSE
3 132-135 SETD
4 142-145 SFDE
5 162-165 TGQE
6 185-188 TMPE
7 274-277 SRMD
8 348-351 TPQD

[4] PDOC00007 PS00007 TYR_PHOSPHO_SITE

FIG. 2A

Tyrosine kinase phosphorylation site
161-168 KTGQEKLY

[5] PDOC00008 PS00008 MYRISTYL
N-myristoylation site
Number of matches: 4
1 14-19 GTLNAG
2 117-122 GNSRGN
3 121-126 GNTWSR
4 175-180 GTTIGF

[6] PDOC00110 PS00120 LIPASE_SER
Lipases, serine active site
167-176 LYFIGHSLGT

Membrane spanning structure and domains:
Helix Begin End Score Certainty
1 3 23 1.398 Certain
2 167 187 1.637 Certain
3 248 268 0.715 Putative
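The N-glycosylation entries above can be checked against SEQ ID NO: 2 with a short pattern scan. The following is a sketch only, not part of the original disclosure. It assumes the PROSITE PS00001 consensus N-{P}-[ST]-{P}, expressed here as a regular expression, and it assumes a transcription of SEQ ID NO: 2 in which OCR damage has been repaired against the coding sequence of FIG. 1. The names `SEQ_ID_NO_2`, `PS00001`, and `scan` are hypothetical.

```python
import re

# SEQ ID NO: 2 as read from FIG. 2 (an assumed transcription; OCR damage
# in the figure was repaired against the coding sequence of FIG. 1).
SEQ_ID_NO_2 = (
    "MMWLLLTTTCLICGTLNAGGFLDLENEVNPEVWMNTSEIIIYNGYPSEEY"
    "EVTTEDGYILLVNRIPYGRTHARSTGPRPVVYMQHALFADNAYWLENYAN"
    "GSLGFLLADAGYDVWMGNSRGNTWSRRHKTLSETDEKFWAFSFDEMAKYD"
    "LPGVIDFIVNKTGQEKLYFIGHSLGTTIGFVAFSTMPELAQRIKMNFALG"
    "PTISFKYPTGIFTRFFLLPNSIIKAVFGTKGFFLEDKKTKIASTKICNNK"
    "ILWLICSEFMSLWAGSNKKNMNQSRMDVYMSHAPTGSSVHNILHIKQLYH"
    "SDEFRAYDWGNDADNMKHYNQSHPPIYDLTAMKVPTAIWAGGHDVLGTPQ"
    "DVARILPQIKSLSLVLSLLPEWEPTFDFVWGLDAPQRMFSGNHNL"
)

# N-{P}-[ST]-{P}: Asn, any residue but Pro, Ser or Thr, any residue but Pro.
# The lookahead lets overlapping candidate sites be reported.
PS00001 = re.compile(r"(?=(N[^P][ST][^P]))")


def scan(sequence, pattern):
    """Return (1-based start, matched residues) for every pattern hit."""
    return [(m.start() + 1, m.group(1)) for m in pattern.finditer(sequence)]


for start, site in scan(SEQ_ID_NO_2, PS00001):
    print(f"{start}-{start + len(site) - 1} {site}")
```

With this transcription the scan reports five sites, at 35-38, 100-103, 160-163, 272-275 and 320-323, which agrees with the five matches listed under entry [1].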

BLAST Alignment to Top Hit:
>CRA|18000004903706 /altid=gi|542751 /def=pir||S41408 lysosomal acid lipase (EC 3.1.1.-) / sterol esterase (EC 3.1.1.13) precursor - human /org=human /taxon=9606 /dataset=nraa /length=399
Length = 399
Score = 430 bits (1094), Expect = e-119
Identities = 211/394 (53%), Positives = 274/394 (68%), Gaps = 2/394 (0%)

Query: 2 MWLLLTTTCLICGTLNAGGFLDLENEVNPEVWMNTSEIIIYNGYPSEEYEVTTEDGYILL 61
M CL-- TL-H+ G WPE MN SETI Y GPSEEY V EDGYL
Sbjct: 3 MRFLGLVVCLVLWTLHSEGSGGKLTAVDPETNMNVSEIISYWGFPSEEYLVETEDGYILC 62

Query: 62 VNRIPYGRTHARSTGPRPVVYMQHALFADNAYWLENYANGSLGFLLADAGYDVWMGNSRG 121
-NRIP--GR -- GP+PWH-QH L AD++ W+ N AN SLGF-LADAGHDWMGNSRG
Sbjct: 63 LNRIPHGRKNHSDKGPKPVVFLQHGLLADSSNWVTNLANSSLGFILADAGFDVWMGNSRG 122

FIG.2B


1 TTATGGCCTA ACCTTTTTAA CTTTGAGTTA TTTTCAAGAG AAAATTTGAA
51 AAAGCAGCCT TTGAGGAGAA AGAAGCAATC CAACAAACAA AAAGATAACC
101 ACACTGTAAT AGGAAATGTG TTTTGAATAG GACATTGGAA GAAAAATAAT
151 AATCATTTT ACAGGTAGAT CCCAAAGTCA AGGATCTATG TTCAACCATG
201 TGTGTTCCAC CATOTTCACAATTGAATGAGTAACCATCATTAAGCAGTTA
251 GOTTAGGCCG TAA TATGAT CTTGGACTGA GATTTCAAAA ATACCACAGG
301 COTTCTGAAA GGTACCCCT TTCTAGOTCC ACTATCATCT AAT I T TATA
351 AAAAAAAAAA AAAAGGAAAA ATTTGAGCTT CTAGAGAGTA GGGGCTACCA
401 TTTTGTATCC CACAGGGCCA AGGAACAAGT TTTAATG TAT TCATTTAAAT
451 TAATTTCAGT ATGAGTATTG AAAATATAA TAGAAA TATT GAACATAT
501 ATATT TOTA TATACT TITTA TTATATAGAA AATATA TATT ACAGAATATA
551 TTATTAAATA TTGTAGAACA ATATATAATA CAGAAAAATA TATAATACTC
601 AG TAATATAT TAAATAOTTA T TAAAATAGC AAGOTTATAT AGGAAGAG, TG
651 ATGGAGCATT GTGAGAAAGT TTCAGCTTTA TTTCTTTGAC ATTACTTTGT
701 T TCTGCACAA ACAAAAGAAT TACAGGAATT GTCCAGATTA TTCAAATAAC
751 TCGAAGTTGA GGAGGGAATA TAAGTCAATGATGTAGAAAC TCTTT TAAGA
801 TTTGAGCTAG CCTACAATCT GTAAAGATCT GTGAAATTGA ACTATATTTG
851 TGCTAT I TCC ATAT TAAGTC AAGGCAACAA ATCAATATIA ATAATAATAA
901 CATAGCACTT CTAGAACTTT CTAAAGAGTC CAATAAAGTT TTG TTAGAAA
951 GGATTGTTTT TGAAGTTAAA, AACCATGAGA AATTCCAGGA AAATCCACAT
1001 ACCTATGCCA TCATACTATC AATCAGGGCA AAACATGCTT GAGTCTT TCA
1051 TCAAGACTAA ATGATTAAGG AGTGGTACAT. AACTTTTCCC TGTTCTGACT
1101 AGCTGAACAC TCCITTAC TCCACAT I TG TITAATGGC ATGAAATITC
1151 CCACTCCACT AAAACAGATC TTAGGATTG GACAACACAA AATATCATTT
1201 GTTTTGAAAG GATTTGAGGA TAAATCCAAA CTAATAGAACTGAAACTTCT
1251 ATATTATGCT GGG TAGCAAC T TAGTTTTCC CITACCOT TOT TOATGCTGGG
1301 AGATGAAAGA GATTCAGTTA CGGCTTAAGC TCCACAGGCA TACAAAGTGA
1351 AGOAGAAAAC TGAGGCACGT G GCCLTCCAT TATOTGG TAT CTCATG, TGGG
1401 GCTTAGAGGT AAATTGTCGT TATTTGGCCT CCATTTCTGC CTTTAACCAC
1451 TGG TG TAAAC AAAGG TTACT GIGCCAAAGT TGACAGCAAC CCAAATCCCT
1501 TTGGCATGTG AAT TAGTTTC CTCTGCCATACTGCTAGTTC CAAATTCCTT
1551 CTGGTTTCAG GATTTAGGAG TCAGGGTTGC CTGATCTTCT CAAATGAGTT
1601 ACAGTCACGC ACATCCCTAC ACACTGCATG GTTGGCACTA GTTOCTTGAT
1651 ATATGTTACT CCGTTTGATC CTCATGAAGG ATCAAATGGG GAAGGGAGAT
1701 ACTATTGTCT CGATTGTCC ATTAAGATCT TGAGTATGT CTACT CCCT
1751 GTTTGACACA CTGGTTTGAA AATGTTGCTA AGTCTTCCCA ACAATGACAG
1801 ATACTICAGTG GAAACATGAA GGATTCCGTC AAACTGGTTA T TTTGCATCA
1851 TGTAGACCACTATTTCCCAA CCTGCAAGTG CATCATGGCC TTTGGTGTGT
1901 CAGGGACACG CCTGGGTGT GTGTCTCAGT CTAAAGCTTC CTCCTTTCA
1951 CAAGCTTCCT GTTTCTCATCTCTCTAGCTT CTAACTGTCA CTGTAATCAT
2001 CTOTTACTOT TCAGCOTGAT GTCAAAAGCA AAAG TCAGA AGTTCOTCAT
2051 CAATAAGGAG TCOTTGTGAG CAGGTGAAGIC TCA TOTAACIT AGGTAAGATG
2101 AAGAT????C ATAACCAGGA GGOAGGT TGG AAGGTGCCAG TTG?????GC
2151 AGTCAGGTGC AAGAGCTOTG CAGTGAGGCT GCCTGAGTGT CCATCCTAGA
2201 TCTCTCACCT CTTGGCTCTG TGACCTGAG CAGGTCTTAA ATCTCTCTAA
2251 GCCTGT TAATGA TAAAATGAGG ATAATAA TAG TACCAAAAT

FIG. 3A

TOAGAGTTTTAAAAATAGGCCOTGAACTGAAGCAAGAGGTAAACTAGGGAAGOOTOAGGA GAACTGAGACTTCTCCAGAGAGAAGTATCTGGGATTTAACTTCTTCTAAT GAGGCTGG TOOATGAAOTOOTTAAAOOAAGGGGGGTAT TGOTOATOTT TOTOT TGAGOOOO A,G) TTTGTCATAAT GAAVAATGGGGGTTACATCOTTO GGTGATCTAGGAGGCCCTATTTTC GTOCTAGCATACAGCAT CAAAATTT GCGTTAGCTTTCATGATTCTTACCCTAAC TAT TOT T T TOTAAAAAACAT T TGT TOAGOT TACCAGTCTGATGAAT TCAGAGCTTAT GACTGGGGAAATGACGCTGATAATATGAAACATTACAATCAGGTGAGCTATTTACAGTAA CCCCAGOATGCTGATT TIGATAAATATAVATAAAAAATTATT TGAGGGTGGAAAGAOTOO

16966 AGTAGATGACATAAATGAACACCACCT TAAATCAGAGT TTAAAAATAGGCCCTGAACTG AAGCAAGAGGAAAC TAGGGAAGCCTCAGGAGAACTGAGAC CTCCAGAGAGAAGTATC TGGGAT TEAACITCTT CTAATGAGGCTGGT TT TCCATGAACTTTTCCT || TAAACCAAG GGGGGTATTGCTGATCTTTCTGTTGAGCCCCAT TTGTCATAAT TGTAAAATGGGTGGTTA CATCCTCTGGTGA TOTAGGAGCCCTA TCGTCCTAGCATTACAGCAT TOTAAAAT T,G) TGCTGTTAGOTTTCATGATTCTACCCTAACTATTCTTT TTCTAAAAAACATTTGTTTCA GCTTACCACTCTGATGAATTCAGAGCTTATGACTGGGGAAATGACGC GATAATATGAA ACATACAAT CAGGTGAGCA I ACAGTAACCCCAGGATGCTGA I T GAAAAT TATA ATAAAAAATTATTTGAGGGTGGAAAGACTGCTACCGTCATTTGGTGGCATTTATACTGA TAGAACTTTTT TAAAAAAATT I TAA T I TAAT I T TAAT T AT T TOAGAAAAT T TATAA 17147 GGGGTATTGCTCATOTTTCTGTTGAGCCCCATTTGTCATAATTGTAAAATGGGGGTTAC ATCCTTCTGGTGATCTAGGAGCCCTATTCGTCCTAGCATACAGCATTITTCTAAAATT TGCTGTTAGCTTTCAT GATTCTTACCCTAACTATTCTTTT TCTAAAAAACATTTG ITTCA GOTTTACCACTCTGATGAATTCAGAGCTTATGACTGGGGAAATGACGCTGATAATATGAA ACAI TACAA CAGG TGAGC IAT I TACAG TAACCCCAGCAEGC GAI TGATAAAI TATA A, G TAAAAAAT TATT GAGGGGGAAAGAOTCOACCTGTCA I TGGTGGCAT TATACTGAT AGAACTTTTTTT TAAAAAAATTT TAAT TITTAAT T T TAVAT T TATT TCAGAAAA TATAVAA T TAAAGAAGCATATACAAAGAAACT TACATCAT G TG TAIAT COT TOCATCCAGAGATAACT AGATGTACTAACATTTGGTGTATTTATTCCAATTTCTCAGTATTATATTGCTTTAGA CAACT T T TAACI TOTA TTTACT TAAGOTA TAGTAAGAGATAJAOTAVATATAACTGAGG

17219 ATCTAGGAGCCC TAT T T TCGTCCTAGCATACAGCA, T TTCTAAAAT T TGCTG | AGO T T TCATGATTCTTACCCTAACTATCT TOTAAAAAACATT TG TITCAGCTTTACCACTC TGATGAATTCAGAGOTTATGACTGGGGAAATGACGCTGATAATATGAAACATTACAATCA GGTGAGCTATT TACAGTAACCCCAGOATGCTGAT T T TGATAVAAT TAAATAAAAAATTAT TGAGGG TGGAAAGACITCCTACCTGTCATGGGGCATATACTGATAGAACLT T,C) TAAAAAAAT ITTAATTAVAT TAAT TTAT I TCAGAAAAT TATAVAATTAAAGAAGCAT AACAAAGAAACTTACATCAT|GG TAATCOT CCATOCAGAGATAJAOTAGATGTAOTAAC AT TTTGGTGATTAT TOCAAT T T TOTCAGTATTATA TGOTT TAGACAACTT T TAVATC TTCTATTTTACTAAGCTATAGTAAGAGATAACTAATAAACTGAGGGATTTT AAATG CATTTTTAATGGOTACATAATAGAAATTATTCATAAAAATCTTTACAGCATAAATGAAT

FIG. 3AA

2301 AGGGAGATTT TCAGAGCTTA AAAACATAC GTGAACTATT TAGAGTAATG
2351 CCTGCCATAA GGGGACTCAG TAGCTTATTA TTAGTTTCATACAATTTGAA
2401 AAGTTTCATA ATATTTGCAG ATATAAGATG ATCTTCAACC AGATAGCTAA
2451 TG TATGCAAA GC TATT TAGC TTCAGAAG TA AAC TOTGCAT TTCTAGAAG T
2501 AAA TATTAC TT GT TATAG TGAATTAT CT GTAVATATTTA TOTOT TGOTC
2551 ACTTTTATAA GAAAAATAGT GAAAGCATTT ATTAAGAACT TACACTGCAC
2601 TAAATG TITAT AT ATGACTTA ATOOTCACTA TAACCOTATG AGATAGGTTA
2651 CATTATTGTC CTAATTTTAC TAACAAGGAA ACCAAGAGAC AAAGCTACTA
2701 AAACACTTGC CTGAGGTTAG ACATCTTCTT CTGTGGTGAG GCTGGATTTC
2751 AAATTTAGAC CATTTGACTG TAGCACTTAT ATGATGAGCA TGCTGTTAG
2801 TGITTATAGTG TTGGTCTACC TTTGAATAGA CATACT TTTA AACCATGGCA
2851 AGGAAGTGAG ACTGCACATT GAAATATG TA AAATTTGCCT TTGGGTGCCA
2901 CGTGAGAAAT AGTCACATCA CTAGAAACTA ATCATAAGCT TTTGTGITTG
2951 GTTAAAGTT TATTGATCCA TTTTTCTTGT TTACTTTGTG GGATACTGGG
3001 CTAACTAGG GGATACCTCC ACITACT TGGCCATGGT ATGAAAACCT
3051 GTCCTCTGAA TCTTTAGATA TTTTGGCAAA TTG TAGGCAA ACAAAGACTT
3101 AAAGCAATTC AACCTTGATT AAAATAAGAC CAAAAATGCC TOCATACTTG
3151 AT TAAATT TA TTTCAT TTTA GGAVACTGGAT TFATAATOAAG ACAAOTTOTA
3201 CATGAAAAAA TAGATTAATA GTGOTOCAAGTAGTTCACT GTAT TATTC
3251 CTT TT TAAC AT TATCTGCC TCGGTGT TA *, TCAAGT TIT CATTAA CAT
3301 TAATAAT I TC ACTAATCAT T T TATT CAT AATCAACATT GATAGITTAVAA
3351 ATTAATCTGT GAATATTAAA TGTTTTATGC CAGGCATTTC TATGATGTGG
3401 CTGCTTTTAA CAACAACTTG TTTGATCTGT GGAACTTTAA ATGCTGGTGG
3451 ATTCCT GATTTGGAAAATG AAGTGAATCC TGAGGTGTGG ATGAATACTG
3501 TAAGTCATGG AAAACTG GA AGAACATCAA ATAAAGCAGG ACTAATGGAG
3551 TATGAGGTTA CGAAAGGTCC TGTTGTAACA GAAAATCTCT GATAAAACAG
3601 ATAAAATG TA GATGGTTTTT AACCTCTGCA AGAGTCAAGC TAG TAGATC
3651 TTTGTCTGAAAAACAAATAC TGTCCGGTAA TGAAAACCAA ATTGTGCTAT
3701 GGCTATOT A TOTATCAT CITATCTATOT ATCTATCITAT CITATCTATOT
3751 ATCTATCTAT T TATCTATC T ATCTATAGAT AGAACC TCCT CT | T TGAAT T
3801 TAGITT I TAA GAATATCAAG CITAT T TGT TG ATATACATGA TTGCCTTCTA
3851 TTGATCTATA GTTCTATTAC TTTTAAAGCA AGAGGGGTCT CAAAAGACAA
3901 TTGACTTGAT AATATAGOTT TGTCAGAAAG AATGGGTCAA TGCTAAATTT
3951 TCCCCCAACC CCCCAAAATA TTAGCCAATA GTAGATATTT TTTAAAATTC
4001 TACTAT TTT GTATAAGAC TTTATT TATT AAT IT TACAG TACCTGGTG
4051 CTACAAAT T! CAGATAATTC ACCCTAATAA GCACACAACA GATGGTI GT
4101 TTTGATTCCT TTTTATATCC TTGGAGAAG TTCCACTAAC GAC GTATTT
4151 TTACTGGGCA GAGTGAAATC ATCATCTACA ATGGCTACCC CAGTGAAGAG
4201 TATGAAGTCA CCACTGAAGA TGGGTATATA CTCCTTGTCA ACAGAATTCC
4251 TTATGGGCGA ACACATGCTA GGAGCACAGGTACAAGATAT GTCTCTCCTG
4301 AAAAGGGGAC TGCATTGACC TCCTGCTTCT CAGGAGGAATTTAATGCTAG
4351 ATATGCATCA ACAGAGT TTA TCAAAATTGG TITTGAATTAT TGGATTAGTC
4401 TTTAAATAGT TATCAGGGAG GCTCACTCTT TGCCTGATAA TTCTCTGAAG
4451 ACAGACAGGA ACCTAAAAAT ACAAACAGCA AGACTGATCT TGCTAACTGC
4501 AACCAGAGGT ACT TGTTAGG GTG TAAACAG AAAGGCAGAG CCTGCAT T T
4551 GTCACIOTCAT TACTGATT TA TCATIG TIGGAA AAT GOTT TG TOCCAGGAAA

FIG. 3B

18628 AAAATGAAACAAAATCAACACGCACAT TCAAGATCATTATGG CAAGTACTAAAGTATGT GAGAGTGT TAATGTCCT TAGAATTTGGCCACAGT TAGCTGGTCCTACTCTGCTCCAAGCC GGTCCTATOG, TGAATAATCTCATGATGCCAATATACATOTOTOOAAAAA AOTAGTO CAACAGI TGOTOCTOCOAAGCACAGCAT ACTOGCIATATOTATA TATGAGTATAAGAGAATAACCCATGTAAGCTCCATGAGGG TAGGGATTCTCATC A,G) I T TGTCAGTCAGTGT T T TOCATO TGAAGAGTACATGACAAT TACTGGGCTCCCAGTA TCTATGTGTTGCATTAATGAAATTTCTTAACTTTAATCTACCTCAAAATGTCTCTATCTT CT TGATTCCTCC TCOTT TOTOTATCAGAVAAVATGATGGCCTCT TA TTTCCAAGTTAT TCCGGTCCTGTGCCCTTGATCCCATCTOTTCTCACTTCCCCTTCCTTCCTGCCTCCATTC TCC TGTCCCT TAT GAAAAACAAGCAAGACCATCAAT TO TATICAAG T TATICAT TATG TCAC

18655 TCAAGATCATTATGGTCAAGTACTAAAGTATGTGAGAGG TAATGTCCTTAGAATTTGG CCACAGTTAGCTGGTCCTACTCTGCTCCAAGCCGGTCCTATTT TGTGAATTAATCTGATT TGATGCCAAT TTT TA' TACAT TO CTCCAAAAAACTAG TO CAVACAGT T TGO TOOT COLT CAAGT TCACAGCAT TATOTOTGCTATATOTATATTTTATTGAGTATAAGAGAATTAACCC ATGTAAGCTCCATGAGGGTAGGGATTTCTCATCGTT TGTTCACCAGTGTT TTCTGATCT T,G) GAAGAGACATGACAATTACTGGGCTCCCAGTATCTAGTG || TGCATTAATGAAATTTCT TAACTTAATCTACCTCAAAATGTCTCTATOTTCTTGATTCTCTCC TCOTT TO TOTATC AGAAAATGATGG TOOTOT TAT T T TOOAAGT TAT TOOGG TOOTG, TGOOOT TGATOCOAT OT CTTCTCACTTCCCCTTCCTCCTGCCTCCATTCTCCTGTCCCTATGAAAAACAAGCAAG ACCACAATTCTATCAAGTTATCATTAGTCACTCGTTCTTACAACAATT TAGA

18984 CAGTATCTATGTGTTGCATTAATGAAATTTCT AACTT TAATCTACCTCAAAATGTCTCT ATCT TCT TGATTCTCTCCITTCCT TTCTCTATCAGAAAATGATGGTCCTCTTATTT TCCAA GT TA TCCGGTCCTGTGCCCT TGATCCCATO TOT OTCACT TOCCOLT TOOT TOO I GCCTC CATTCTCC GT COCOTTAT GAAAAACAAGOAAGIACOAT CAMATTOTAT OAAGTT ATOATTAT GTCACTCTGTTOTTATCAACATATTT TTAGTATTGAAGAGGGCTTCTTCTACTTACTCCT G, T AACCTTGTACAA TGTAGTTTAGGTCTTCATC TTTATCATAGCTACCTTATTTAAAGTC ACCCATGGOT T T TAAT TGCCAAAT TOTAATGGCCTATOTTCACCT TTTGAAATGTGTATG TTOG TACCACAGCTCCTTGAAACCAGTCCOCTGACTTGGACTTCCATAACACAATGA TTCTGAT I COT TOT GITT TIGGA TG TCC T T TGTCCCAGGCACCTGGOTACTCCACC TTCCACCTCTCTGAAATCATTAGCATTCCCCAAGGATTCTTCAAAACTOTCTT TCTTCCT

19407 CGT TACCACAGTC. CCTTGAAACTCAGTCCCCTGACT TGGACTTCCATAACACAATGATT TCTGATTT TCCT TCTG | T TG TGAT TGT TCCT TT TGTCCCAGGCACTGGCITACTCCACCITT CCACCCTOTGAAATCATTAGCATTCCCCAAGGATTOTTCAAAACTCTOTTTOTTCCTTG GAGAAGTCAGCATAGCT TTAVA T TGGACCAT TOTATGGOJ TACAGA CAGGA CTTGCCTTCAACCTATTCTTTCTGTAGGTGATTCCATTAACTGTTGCCCATATGGTAGTC C, TD GAAGACAGACCTCCGAGAAATGACCCTGTCCCAAAACCCGCAAATGTCCAAAT COTAGCCTGACAT TCAGACTGATTATOTGCCTCCAAGTT TATA TOOTATCATAT TOOT TTATATATTCTGTTCCCAGGACACTGGGAAGCTTGOCATIOCGATCATAGCCTACAA ACTCTCCTGCCCCCACTCACCCTCATOTOTGCTG TCAAAATGCAACCTCCCTCAAGA

FIG. 3BB

4601 ATGGATCCTC TCATTGTCAG AAGGAGATT TOTAGGTTGT ATGAAATTGA
4651 CTCTGGGGCA CCCAAGAAGA ACCTCTCCTG CTCCCACTAA AATTAAGGGG
4701 CCTCCCTCTG CAGGATAAAAAACAATCTAGTTAAATGACA ACGCATTTCT
4751 GAAAAGTTTT CCAGGACTGA AAACCTTAAC ATCCACATAC ACTTTGATCT
4801 AAGGGACAGA CGGTTCATAG AATGAAAGAG TATGGTGTCA ATAAGGCTTG
4851 AATTCTAGAA TGAGGAGCCA GCCATGCCAT AGCAGGGGAA TGATACTCOT
4901 TAAAAGGGAA AATTTAACTA CAAATCCTCT GAAGTAGAAA TGATAAGAAT
4951 AACCAAAATA TCTGCAATGGTTCAATAGCA AATAATTAT TGGCAGCGC
5001 TTACCGTGTT CATTTTGCAT CTTTTTTCCC ACCACACATA TTAAGGAGCA
5051 GCTGAAGTCA TGITT TIGACAT TOTCTCCCTC TTTTATOTCIC AGTTTCAGAA
5101 TGAAAAATGA GAGTGAGATA TGAGTAGT TACTAGTAA AATATGAAAC
5151 ACCCAGTTAA ATTTGAAGGT CAGATAAACA ACAAATAATT TTGTATAAGT
5201 CTCA TTTAA GATAATACTA AAAAGTCATT ATTTAT TCAC TATTATCACT
5251 ATTATAAAA TTTTGTAGAG CATCCTGGAT CTTTTTGCTT ACTTTTGTTT
5301 TTATTTTTTG CTAAATCTGG CAATCCCAGG CACATGTGTGAAGGAGCTGT
5351 GAAATATAAA AGGAGAAAAC TATGGGA AAGAT T TGGC TAAGGAGAG
5401 ATAATTTTGG AAAGATTTAG AATTAAAGAT CATTCATTAG ATGTAATGTT
5451 CTAAATACTT TATATCAGTT AAACTTOTCA TCAACAATAT GAGATGGGTA
5501 CCACTAATAG TCACCATTTC ACAAATGATG AAA TAAGGC ACAACCGGTT
5551 ATGTTAAGAG GCCTAAAGTC CACAAATAGC AAGCTGACAGACCAGAATTT
5601 AAGCCCAGGC ATGCTGGCTC CAGAGCCTGT GCTCTTAGTC ATTAAATTAT
5651 AG, TGCOTTAC TTGACOTTCC ACCOTGGT TA CT TTGGATIOT CCCTGAATGC
5701 TCTCTCTCCC TCAGAAATAC TGGAAGTTGG CAGAGGGACA CTGAGCTGAG
5751 CATATTATTG TAGTTTTTAAATGCTCTOCA CTGGACAGAA GATGGGGGAT
5801 TTGAATAGAA ATT TGGTGAG GAACTAATCA GTGTCCATT T ACACTCACCT
5851 CCTCTTCCTC CCTGGAAGAG CTATAGGACT TGAG TAAGCA TGATAAATTT
5901 CGTGTCTTTG TAAACCACAC CCAGGAAATT TGTATATACA AATACATAGA
5951 GCACAGTAGT TATCAGGACA GACTT TGACA TAAAAAGAAC TGGGTTTGAG
6001 TCCCTGCTCT GGCCT TOT TA TCTGGGTGGC, CCTCTGGGAA AGTTACT TAA
6051 CTACATAAAG TITTGTTTCC ATATCTACAA AATGAGGTTT CTCAAAATAG
6101 CAGCTAGTTT ATAGAGTTGT TGCAAGAATT TAGTAAGCTA ATACATATAA
6151 ATACGTCAAC ATAGCACCAG GTACAAAAAT ATGTGCTCAA GAAACTGAAG
6201 TTACCTGATT ATAATGCTCT ATACTATTGA CAAGGGAAAA GTGAAAACAG
6251 TTT TGT TITT ACCATGTGTG TATGTGTGTGG TGTCTGTGAT G TTCCGACA
6301 TGCTCAT T AACATAAATT ACTCCACTC T T TOTO TOTC TCC T TOTC
6351 T T TCTCCCTC TCTCATCTTA CCCITTTCCCC CACCAGGTCC CCGGCCAGITT
6401 GTGTATATGC AGCATGCCCT GITT TGCAGAC AATGCCTACT GGCTTGAGAA
6451 TTATGCCAAT GGAAGCCTTG GATTCCTTCT AGCAGATGCA GGT TATGATG
6501 TATGGATGGG AAACAGTCGG GGAAACACTT GGTCAAGAAG ACACAAAACA
6551 CTCTCAGAGA CAGATGAGAA A TOTGGGCC TT TAGG TAAA TAT TAGCTAA
6601 GAAAAC-TCAA GGGGGAAATT GGAGGCAATT TTAAAAAAAT AACGTGGACG
6651 CTATTAATGA TTATCTTTGA CGCTTGAAGT CATATAGCTC CTTGTAGTTT
6701 CTGTTAAGATCTCAAAGGAG GGTAACAGCA AGAAGCTCTGATTTTTCACT
6751 GATTCTCCCA CAAGCAAAGT ATGGCATTTC AACAAGATCA TTTTTACATC
6801 CAATTCTG TG AAI TCLTA TGC AT TAAAAG TA TG TCCAAAGA GACAGCTCAG
6851 GAAATTATCA TGACCAATGTGCACATTCAT TCAGCCAATG TTTACTGAGT

FIG. 3C

GTCATTTCACAGGACCCCTOTTTCTATGAAGCCCTCAGGGGAAATAATTTTTTGCCTTT

19531 CTCTCTGAAMATOATTAGCATTCCCCAAGGATCTTCAAAACTO TOT TOT TOOTTGGAGA AG TCAGCATAGOTAATGGACCATCHATGGOTATO TAGATOAGGACTG COT TCAACOTAT TOT TTCTGTAGGGAT TOCA TAACGTGCCCATATGGTAGTCOGAA GACAGACCCCGAGAAAGACCCT GTCTCCAAAACTTCOGCAATAGTCCAAATTCOT AGCOTGACATTCAGACTTGAT TATOTGCOTCCAAG TATA COLT ATCAT ATT COTT TA T,C) ATAT TCTGT TCT CCAGG ACACTGGGAAGCT TGCCAT CCI GAI CATAGCCTACAAACTC TTCCTGCCTCCCACTCACCCTCATCTCTGCTGTCAAAATGCAACCTTCCCTCAAGAGTCA TT TOACAGGACOOOTOT TOTATGAAGOOOTOAGGTGGAAATAAT TTTTTGOOTT T CCATTTTATTTTGGAGTGTTTATGGCATTTAACATACCTTACTTTGTATACAAATAT TT GCCTTGCTCCCTCTT TTGCAAATTTCTTAAAGGTAGAGACCATTGTATGTTTTCTTCATA

1991 CTCATCCTGCTGTCAAAATGCAACCTTCCCTCAAGAGTCATTCACAGGACCCCTCT CITATGAAGGCCCTCAGGTGGAAATAA TTTT TGCC.TTTTTTCCATT TTA TTTGGAGTG TT TATGGCAT TAACAACCT TACTTTGTATACAAATATT TGCCTGCTCCCTOTT TGC AAAT I TO AAAGGTAGAGACCAT GITATGT T T TOT CATATGTGCTGGTGCOITAVACAG AACTATGGCCATTGTCCACATTCATTTAGCAGCOTTTGTAGTTATTGC TTGAGGAGOTT C,T) w OTOTCAT GAATGCCCT TGOTT TOTO TOCCACAGAGT CATCCCCOLTATATATGACCTGACT GCCAGAAAGTGCC TACTGCATTGGGCTGGTGGACATGAGCCTOGAACACCCCAG GATGTGGCCAGGATACTCCCTCAAATCAAGAGTCTTCATTACT T TAAGCTATTGCCAGAT TGGAACCACTTTGATTTGTOTGGGGCCTCGATGCCCCTCAACGGATGTACAGTGAAATC ATAGCT TTAATGAAGGCATATTCCTAAATGCAATGCATTTACTITTCAAT TAAAAGTTGC

2O199 TTTGAGGAGOTTCCTCTGATGAATGCCCT TGCT TCTCTCCCACAGAGTCATCCCCCTAT AATGACCTGACTGCCATGAAAGTGCCTACGCTATTGGGCTGGTGGACATGAGCCT CGTAACACCCCAGGATGTGGCCAGGATACTCCCTCAAATCAAGAGTCTTCAT TACT || TAA GCTAT TGCCAGATTGGAACCAC TTGATT TTGTCTGGGGCCCGATGCCCCTCAACGGAT GTACAGTGAVAATCATAGOT I TAAT GAAGGCATA TCOTAAATGCAAT GOA I TAO I T O A,G) AT TAAAAGT GOT TCCAAGCCCAAAGGGACT TAGAAAAAAGGT AACCAACAATGAGG T TGTCCCCCAGCACCCTGGGGGAGATGCACAGGGAGTOTGTTTTCCAAGTCAATTGTGT TAGTIGT TAT TATGI T TAGAGACA TOT I TGCATGGGACCATO TACAGGTCOT TATAAACA ATGAGGTAGATTAGGOAAAAAGATAAACAAGTGOTACTOTATOTGGCATTAAGTOTAA TTAAATTGTAATTTTTAGGGCATACCATGAAGTATAGAAAGTCTGAAGOTTCAAAGGAA

20243 AGAGTCATCCCCCTATATATGACCTGACTGCCATGAAAGTGCCTACTGCTATTTGGGCTG GTGGACATGATGTCCCGTAACACCCCAGGATGTGGCCAGGATACTCCCTCAAATCAAGA GTCTTCATTACTTAAGCTATTGCCAGATTGGAACCACTTGATTTGCTGGGGCCTCG ATGCCCCTCAACGGATGTACAGTGAAATCATAGOTTTAATGAAGGCATATTCCTAAATGC AATGCATTTACTTTTCAATTAAAAGTGCTTCCAAGCCCATAAGGGACTAGAAAAAAT G,A) GTAACCAACAATGAGGTGCCCCCAGCACCCTGGGGGAGATGCACAGGGAGTCTGT T TCCAAGTCAATTGTGTTAGTGTATTTATGTTTAGAGACATOTTTGCAGGGACCATOTA

FIG. 3CC

6901 GGCTACTGTA TGCGCTGTTC TAGGCCCCGA ACATTCAAAC AGGGAACAGA
6951 CAAACLTCITGA CCTCACAAAG CITATOGTCA TAGTGAT AVATACAVA
7001 GTCATTGCTC CTGGATTGCC AATCAACTGT GTAAAGATGATTTGGACCAG
7051 GACCTTATTGATTTAGAGAA ACTGTGATTGATTTAGAGAA ACTGAGATCG
7101 CACATAG TAC CATTTTCAGG AAAAC TCCAA TATAGATTT TTAAAACOTT
7151 GTAATGGGC. AVAITGAAGAAG AATOTTTTTT GATATOT TGT TTCTTTTAVAT
7201 GGAAGAGTTT TCTGCTGTCA CCAGAGGACA GGCTGATGCC TGCGATAGAC
7251 TT TOTT TOT CAGGCCTAVA GCTCCCTGTT GGTTTGAAA CCTGATGCTA
7301 GAACAGACTG TGI TATTCCTA TACATTAAT AAAACA TCA GITACCCACG
7351 AAAGTTTGAG AATAGTGGAG GAATAGAATA GAATGTTATA GTCTGAGTTC
7401 TTGGGCAGGG GCAAGCATCA GGAAATATTG AATCATTAGTCTTTAGGAGG
7451 TGTCACAACA ATTCTGCTAT TCTTG TAAGT CCCAATCTAT AGATTTCCTC
7501 ACATGTTCT T T TAATAAACA GGCTCAGO T TATGGAATA CCTGATTTGA
7551 CTAVAATGTA TATAGGCCCT TT TGTTCCTC CTGTCTGAAG AACAAAATAC
7601 TAGTACTATG GAATAT TGGT ATATAT TAAA TATATATCTA TATATCCATG
7651 TGGACAGGAA TACTACTACT AACAACATCT TACTGAGCAC CCACTGGCAG
7701 CCAGAGTCGT TTCTTTCATA CTATTAAACC CCGITAGCAG CCCCGTAAAC
7751 CAGGTACTAC CCGTTATT TCCCAAATGA GAAAACATAGGGTCAGAGCA
7801 TTCAGTAAT TTCTCAAGAG TTGCAAAGGC CATAAATAGT:AGAATCATGA
7851 I TACAAAAC CCCTG TTTCC AAAGATGGGT AT TAAATGGT COTAACAAT T
7901 GTGAAGCCTC ATGTGGGAGTCAGAAGTAGA GGCACACAAG CCAGATGGGG
7951 AAAGGGAGGG CAAAGAAAAG CAAGAGAAGG GAAGGAAGAG GAGGGATCAT
8001 AAGGTTGAAC TTCAAATATC ATACACAAGT TTCGAAAGTG TTCCTCTTAT
8051 AAGGAAGTAA AATGTACATA TGCAGAAAAA CAAAAAGCTA CAATAGCCTA
8101 CATATAATTG GATAAATAAT GAAATACACA TTGAATCTAA GTAAACAGCA
8151 TAGAATCTGG GTGTAAAAAA GAAGTGAGCA AGTGCTCTGA GTTTTAAACT
8201 TAAACTTGCA AGTATTATA AAAGCCCCTG TITTATTTTG CAGTTTTGAT
8251 GAAATGGCCA AATATGATCT COCAGGAGTA ATAGACITTCA T TG TAAATAA
8301 AACTGGTCAG GAGAAATTGT ATTTCATTGG ACATTCACTT GGCACTACAA
8351 TAGGTATG | T TATGAGGGTC ACTG T TAGGT GTG T | T T TGA GGGTCAGT T T
8401 TOOAGAGIC TTACAGGAGT TCACCITTAT GTTGGAATAA AACAACTG TIT
8451 ACTTATAGTG CCCTCAATTC CCTGTCCTCTGCTGGGAATA ACCCTAGTAC
8501 TCTAAGTAGC TGTGAGCCTG CAGTGCACAGACTATATGTA GGGCAAACOT
8551 TTCCTGGGTC TCTGGTCACA GCAGCATATT GACTACGGTG ATGCAATTTC
8601 CCAGGAATAA CATGTGTTCC AAATTCAAAG AAATAATTCC ACAGAGTAAG
8651 T TTCTAGATT COC TOTGAGC TGAAAAAG TA AAAT TOAATG CCATGGAATA
8701 TGGCTGAAAC ATAATAAATG TGCATCAATC ATCTOTT TOT CACAACCCAA
8751 ATGGGATTT TAAAAAATAA AAGGGAAGGG CITTATACCTA TATT TAAACA
8801 AAT TGAAAAG, GCATGGT TAT ATTTGT TTGT GAGT TGGAAC ACACAAGCTT
8851 ACTATAATAA ATCAATTGAG CITTATO TATT CAGTGTGTGA T T TAG TATTT
8901 ATGAAATAGC AAGTAAATGT AAGCACTATG TAGAAATTTC TAAAGTTTTT
8951 TAAGCTGACA ACTTACTTCT TAATTTACET ACTTTACTTA ATTTACITTA
9001 CAATTTACTT TCCAGGTATT TTGGAAAGAA ATCAATAATC TAGT TCCAAG
9051 TAAAAGTTGA AAGGAACCCA CACTAATAAA AGCTTTGAAT TTGTCATTGA
9101 ACTTCCACTA AAGT I TCCAA TTTTAAGAGA ATAAATCATG TGAAAGTGCA
9151 ATATTTCAGT T TAGGGAAA ATT TTCATTA TCACCACTAT CATCAGTAAC

FIG. 3D

CAGGTCCTTAAAACAATGAGGTAGATTAGGCAAAAAGAAAACAAGTTGCTACTOTATC TGGCATTTAAGTCTAAT TAAATTGTAATTTT (AGGGCATACCATGAAGTATAGAAATGTC TGAAGCTCAAAGGAACAGTGAVAATTCOTT TAAGGCCATATGGAAACCCTGT TGTCA

20640 GACATOTTTGCATGGGACCATOTACAGGTCCTTATAAACAATGAGG TAGATTAGGCAAAA AGATAAACAAGTTGCTACTCTATCTGGCATTTAAGTCTAATTAAATT GTAATTTTTAGGG CATACGATGAAGTATAGAAATGTCTGAAGOTTCAAAGGAACAGTGAAATTCCTTTAAGGT COTATA TGGAAACOLTOTGT GT CAT TTTAT TATAT GGAT TGOITAT GGCAAT GGACAGAG TGTGGGATTAGGAGGAGGGCCTGTAACTTOTTTATAAAAGTTOTTAGOTATCCTGAAGA T, C) GTATAGACATTTTTACTTTTTTAGG TATTTTCAACATCAGAAAT TCAAAAAAGTCCCCAA AGAT TOT TOCAGAGAAGCCO TOT T T TO TACAAOT TA OCCTGGOTA TOTGCGTAAACG GAATCTTGAACCCATAATAGGATACATGTATAAAATCT TCC || TAT TAAAGCAGAAATAAA TTGTACAGCATCAATATCATTT TATAATCATAGGGAGGCTTCT TTGTTTAGCATGTAATG COCCOT T TACAGGOTT TTTGT TOT TTGAGGGGTTGAACAT TOCATGAAAAACTGACAGA

21156 AGGCTC ITGTAGCATG ITAATGCCCCCTACAGGOT TTTGTOTGAGGGG | T GAACAT ICCATGAAAAACGACAGATAGGAAACTGACAATAAAAGATTGAGCTAAAGATG GAAGCAGAAAGTACTAGGCTAGAAGTCTCAAACAT TAAGTATTTTC CCCCATCT AAAAGCAAGAGAAGCCACCAAAATAT T TACOTAVATGGAAACCTGATGCCGCAT I T T GTAACCACCACTGGCTGCTACATAGAGAATGGAT TAGAAGATGCCAACAAAAGATOT G, C) AGCAAGTOTGTAAATOTGATCAAGTG ITOTGATGCAGGOTGATATCOTTOTG TGCTAAGA GAGATGATCCT GGAAAATCCAGAGCCAGCTCCATAATACITTCCTGCTCTGCTGGCAAA TCCACAAGCTGCTGGCCCCTGGAGCCATTCTTCTCTCAAAACTAGCATTCATCAATTTAA TGTATACG TATTGATGGGGAATAATGGTCAGTATGAAAACCATGTGATAATATGGAAAAA ACCCATGATAAATGTTATGTGAAGAGAAGAAAAGAAACTGGTAGAACATGTGATTG

21163 TTTGTTTAGCATGTAATGCCCCCTTTACAGGCTTTT GTTCT || TGAGGGG TTGAACATT CCAGAAAAACTGACAGAAGGAAACTGACAATAAAAGATTGAGCTAAAGATGGAAGCAG AAAG AC AGGOTAGAAG OOTAAACAT TAAG AT TOT TOOTCOAT O TAAAAGCA AGAGAAGCCACCAAAATATTTTACCTAATGGAAACCGATTGCCGCAT TTTGAACCA CCACTTTGGCTGCTACATAGAGAATGGATTAGAAGATGCCAACAAAAGATTCTGAGCAAG A,T CTGTAAATCTGATCAAGTGTTCTGATGCAGGCTGATATCCITCAGTGCTAAGAGAGATGA TCCTGGAAAA TOCAGAGCCAGO COATAA TAOT T I GCTGCCTGCGGCAAACCACAA GCTGOGGCCCCGGAGCCATTO TOTCTCAAAACTAGCATCATCAATTAATGTATAC GTATTGATGGGGAATAATGGTCACTATGAAAACCATGTGAIAATATGGAAAAATACGCAT GATATAATGTTAGTGAAGAGAAGAAAATGAAACTGGTAGAACTATGTGATTGCAAATAT

21425 AATGGATTAGAAGATGCCAACAAAAGATTOTGAGCAAGTOTG TAAATOTGATCAAGTGT T CTGATGCAGGCTGATATCCTTCTGTGCTAAGAGAGATGATCCTTGGAAAATCCAGAGCCA GO TOCATAATACTTCCTGCTCTGCTGGCAAATCCACAAGCTGCTGGCCCCGGAGCCAT TCTTCTCCAAAACAGCATCATCAVATTTAAG TATACGTATTGATGGGGAATAATGGT CACTATGAAAACCAGTGATAATATGGAAAAATACCCATGATATAATG TATGTGAAGAG G,A)

FIG. 3DD

9201 AAACATATAT TCATTAGTAT TTTAGAT TGA CAGGCACTTT COAAGCTCAG
9251 AACAGGCAGT TAGCATCAGT CAGCATATAC TAAAAAAG TA TCAAAGAACT
9301 CATAGGAGAT CAAAAATGCC ACCAATAGGIC AAATAATTAC AG TATOTAAC
9351 ACTTATTGAG CATCG TAT GGTAGGGTC TTGTGTTCAG GACCTTCCCC
9401 ACAGTATCTC CCTCTGATCT TOAAAACAAC CCGAATGTTA TTATCCCCAT
9451 GTCATAGAAG AAGAAACACA AGTTCAGAAC ACAGATTCAA ACCAGATGTA
9501 TCTGATTTCA CCAATAGGGT GTGTAAGGATTCCGGAGAAA TGGTGTAGAG
9551 AAGAAGAAAT GACT TAGTT GGTTTTGGAA AGTGGG TAGG ACT TAGATAT
9601 GCTCTTATAC TTGATCTGCA AAAAAAAAAA AAAAAACCAT GGAGAATTTG
9651 ATTACTGTG CTCTGTGT TT CA I TAGGAC ATAAATA T T T TAGTGACTG
9701 TTG TGCAT TTTGGACAGA GCAATTTCTG TITATG TAAGG AGCACCCACT
9751 CTTTG TAGGA CATT TAGTAG GTCCCAGCCC AT TAAACAGG GGCTCTGCAGT
9801 CAGCGTGACC CTCAAAAATC TCACCTCCAC ACATTTCCAA ACACCCTCTG
9851 GGGAAGTACT ATTCCTGAT T CAGAGTCTTT TTATCAATTG TTCAGTCAAT
9901 TATTTCAGTT CTTCTTTTTC TGGCCAAGAC AGTTT TAATG TTCCAACAAG
9951 TG T TTCAG TA CACACATACA CACACACACA CACACAOACA CACACACACA
10001 CACATGCTAG TGGAGGCCCA GGAAGGGACC TCTGGAAACC AAATTATATG
10051 GATATTCTCC CTAGCCTACC CAGTGTGTGCTAATCTCCA TCCTCACAGA
10101 TATACAAAGGGGTGCAATGC TACTGCTGAA AGAGCAAAGC AAATGGAGAT
10151 GCCTGGTCCT TACTGGGCCA TCGTGGATGC TAGGGAAAGC CCCTTTOTTT
10201 TTGGAAACAG GGAAGAGTCTAGAGGGTTGA AAAACACCCA GTAAGACACT
10251 GGGAGCAGTGAAATTTCATT CCATAGTGAG AAAGAAAACC TGTTAGAATA
10301 ACTGGGTGAT GCTGCAGAAA GAAATCAATT CACCTCCTGT GACTGATTAT
10351 TTGCTTCTGG AAGCTCTGTGATTCATTCTG GCATCTCAGA GTTAGGGATG
10401 AAATGAGAAT GTTGCCAGCA TTTACCCCATGCTTGGGAAG TTTACACAGC
10451 AG TAGCTACT CCAGCAGCTT AACCATCACC TTTCCCCTGC CAACTACTCC
10501 ATTTCCCCCA ATCAAGTCAA ACTGTCCATA AATAGAATAA AATAAAATTG
10551 GAGACTTGAG AGCAGAGAAG ACTGAAGGCA GATTATCTTT ATAGAATAAC
10601 TCAGAAGACT TCCAAT TCAT CCCCAGTATG ATCACGATAG AAGGAAAAAA
10651 TGACTAAGCA GAGCCCCAAT T T TGT TAGAA ACAT TGCGTA AG TATT TATT
10701 TTTACAAGAT TGTOTTATCT CCTGTTCTCT CAGGGTTGT AGCC TTTCC
10751 ACCATGCCTG AACTGGCACA AAGAATCAAA ATGAATTTTG CCTTGGGTCC
10801 TACGATCTCA TOAAA TATC CCACGGGCAT TACCAGG TCTAC
10851 TTCCAAATC CATAATCAAG GTAGGCTCCT, TTCAACAAAA TGTACCTGAG
10901 GATOTCAT TT TGGATCATAA ATCOTTAT TA' TT TTCAAATIC TACTG TAAAG
10951 TAAAAG TAGG AAATT TAGAT AAAATIC TATA GAACT AGAC TOTG, TGGG TA
11001 TG TGCTTG TG TA TG TG TG TIC COTGCGTG, TG CGCATG TOTG TGCCATAG TA
11051 TCTGCAGGTT CTGTAATACA ATTTACTATA CAAGGTCATC AGCAGGCTGA
11101 GTATATGTCA GAATTTCTAG CTGAACTGAGTGCTATATGA CAACAAGGAT
11151 TTCTTGITT TTCCCAAGTG TTTTTTGTTC CATT TAGTCA GGTAGGTCAA
11201 TGAATTCACA TTGCCCAAAT GAAAGACACT TCAAGTTACC CATAATCACT
11251 GATGTGTCCA ATTTTGACATTAGAAAAACC TGATTAATAT ATTCCTTCCA
11301 ATATGGAAAC TTGCCCTAAT AAC TAAAGCT AAGAT TCCAA AGCCTAAATG
11351 ATTACAGCT CAAGTATTAATTCAAATAT TATTGGTTAT TTTCAGGAG
11401 TTGAAAAAGT CATTTGGTTG CCAATTGTGG ATTTGGGATT TTATCTATA
11451 AAGGGTT TT T T TT T | T | TTC TCT T TGCT T T TGTT TCTCTA CAAAGGTCAT

FIG. 3E

AGAAAATGAAACTGG TAGAAGTATGTGATTGCAAATATATACAAATATTAAAACAATTAT ATGAOTT ATAAAATAT TG TATATAATGAAAACTGAAGCAATATAAAAAATAAAATTAG TTGTGTCAGGGTTAGITAVACATGATGAGTGATTAA TAGTTTT TAAT ITT TAVATATAGTAATG ACATAATGTTACAACTTGTCCAAATCTCACAAACATAATATTCAGTAAAGGAAGATAAAC ATAAAAGAATACATATTTT ATTAT ACAT TTT ATG TAGGCTAVA TTGATGGT TOTGAAAGC Chromosome map:

FIG. 3EE Patent Application Publication Jul. 29, 2004 Sheet 16 of 36 US 2004/0146980 A1 11501 TGCCACAATGAACACAGCAT TTAATCAAAT TCCAGATTGG CCTTTGAACT 11551. TGGGATGATG GATAAAATGG ATTTGGGCCA AAATTGAAGT CAAGGAGACC 11601 AGTTAGAATA TCAAAATAAT TCATATATAA GAAAATGAGA CGTTGGTTTG 11651. GGGTAGAGTG GTAGGAATGA AAAAAATTATTTGTGAGCTA ACACAAGGAA 11701 TAATTTCCAT AGGGCCTAAT AATAGTTAGG TCTGATAATA CTATGGTCTG 11751 ATAA TAGTTT TATTGTATTG TTTACTGAGA GCACAAATGA TG TAACTTCC 11801 TTATTCAAGA GCTTTTCTAG TTTATTTAAA AATGTGTTGA CATCAGTTAG 11851 GTTTTAATGT TTTOTATATT TGGACAGTGT GAGCAAACTA ATTTGTTAAA 11901 TTAAATTCAG AGAGAGATAC ATOTATCTGT AAATACATAT ATGCGTTGTT 11951 TGTGTTGGCTC TTCCTACATA GGTCAGCTAT AAGGCAAATA ATGTTCCTGG 12001 GT IATC TCAG T | TCACAT I T CCCACTGTCA ATAT TCCTGC TACT | T TAAG 12051 TCCCATATCC TGCTCTTTTC TTCCGTCAGT TTCCCCCAGA AGCTCCAAGA 12101 CCCCACCAGG AATCCCCATC CAAGTTTACT TTCCCAACTC CTGGAAGTTT 12151 CAAT GTGCT GCCTTGTGA CAT TATCATA TCTTTTCTGT TCAATGGTTG 12201 CTCTCTG GCTCACTGT CTCTACT CAGCCTGAGA GOTGGOTAAT 12251 CTGGGACAGT ACTCGAATGC AG TG TACACA TGGG TAACAT GGAAAACCCC 12301 GAT TTTCCCT TATATTCAAG GTAT TATT TG ACCT TAAGAA AAACTGT T T T 12351 ACATTTCATA CCAATAATG AGAAAAAAAT ATTGGCAAGC ACTGACTGGG 12401 CAGAATACAG GGAAGCT TCA CTATGGAGAA GTGAATTTGG GATTGAGGGC 12451 CT IT TATTGCA ATCTCCT TGT AAATAA TATT TIGATACTTCTT COTCATCTGG 12501 AGACACATTC CTAAGTAACT TTTCCTGAAT AATTTGGTCT COTTGACTGA 12551 ATCAGTAAGT ACAAATAGAT CCCCAAGCAT GGCTCTTTCC TAGAATGAAA 12601 GAAATGTCAA GAAGTCTGAA GATGATTCTT GAATTTTGGT TT TTGCTAT 12651 TGCTATTTGG GOTTGTTGTC CTTGTTGTTG CTATTGAGTT GAGCTCCTA 12701 TATATCTGG TACTAATCC CTGTAATAT GGATAGTCG CAAATAT 12751 A TOTCAT TCA AAGATAAT TA T TATT TACT T CATAGGCTG TTT TGG TAC 12801 CAAAGGTTTC T TTTTAGAAG ATAAGAAAAC GAAGATAGCT TCTACCAAAA 12851. 
TOTGCAACAA TAAGATACTC TGGTTGATAT GTAGCGAATTTATGTCCTTA 12901 TGGGCTGGAT CCAACAAGAA AAA TATGAAT CAGGTATGTA TGA TAAT TAT 12951 AGGGCCATTT GATACCTTAA GAAATTCCAG CTTTCCTTTGACTCATTTTG 13001 ATATATCTAT TTACTGTATA AA TCATATG GTATTCCAAA CCCTTAAAGA 13051 CAGATT TTTT T T TGCTTT TA AAAAT GTT TA TGGGATATA ATAGTTG TAC 13101 ATAT TTATGA GACACATATA T | T TGATATA AGCATACAAT GTGTAATGAC 13151 CAAATCAGGG TAAT TGGGAT ATCCATCACC TCAAGCAT T T ATCATT TOTT 13201 TT TG TAGAG ACAT TOTAAT T TGACTC TC TAGT TATT TT GAAATATACA 13251 ATGAATTATT GITTAACITATA GTCATCOTAT TGTGCATGCC AGACT TTAGT 13301 CCTTCTAACG GTATTTTGGT ACCCATTAAC CAATGCCTCT TTATCCTTCC 13351 CCCACCCCTA CITACCTTTCC CAGCCTCTGG TAACCATCAT TOT TOTCACT 13401 ATCTCTATAA GGTCAGTTTT TTTTTAAACT CCCCTATATG AGTGAGAACA 13451 TGCAGTAT TT G TOT I I T TGT GCCTGGCT TA TTTCACTTAA TGTAATGTTC 13501 TOTAA TTCA TCCACATTAT TGCAAATGAC ATGA T CAT TOT TO TATG 13551 GCTGTCTATA TG TACCACAT TT TATT TATC CACTCATCTG TTGATGGACA 13601. CTTAGGCTGA TTCATACT TGGTCATGT GAAAGGCT GTACAAACA 13651 TGGGGGTGCA GATGCTCT CCATGGATTGATTTCCTTT TTTCGA 13701 ATATAGACCT AGCACTGGAA TTGCTGGATCATATGGTAAT TCTACTTTTA 13751 GTT | TTTGAG GATCCCTCAT ACTC TCCCC A TAGTTCCTG TACTAAT TTA FIG 3F Patent Application Publication Jul. 
29, 2004 Sheet 17 of 36 US 2004/0146980 A1 13801 CATTCCTACC AACAGTCTGT GCAAGAGTTC TCTTTTCTCC ACATTCTTGT 13851 CAGCATCCAT TATTGCCTAT CTT TT TGATA AAAGCTATTT TAACTGGAGT 13901 GAGATAG TAC TTCATTG TAG TT TAGTTCG CATTO TOTA ATGATTAGTA 13951 AGTTGAACA TTGTTT TAA TGTACCTCT 1 GGCTATTTGT ATGTCTTCTT 14001 T TGAGAAATG TCTAGTCAGA TCT T T TGTCC AT T T TTAAAT CAGATTTT T T 14051 TTTTGCAATT GAGTTATAT G ACOT OTTAT ATTAT TOTGGT TACTAATCCC 14101 TTGTCAGATG GGTAGTTTAC AAAATTC TCTCATTCAA CAGGTTCTT 14151 AGTCACT GTGATGGTC TCCTGCT TGCAGAAGCT TAGCTG 14201 ACGTAATCTA ATT TGTTCAT GITT TGCTTG GTTGGCCTGTG CATTTGAGGG 14251 CTTACCTCAA ATTGGCCCAG ACCAATGTCC CGGAGTGCTT CTGTAATGTT 14301 TGTTTTTAG AGTTTCATA GTTTTAGGTC TTAAATGTGT CTTTAATCCA 14351 TT TTGAT I T T GITT T T TGTAT CTGGCAAGAG A TAGAGATOT AATT CATTC 1440 TTCTGCATAT GGATATOTAG TTTTCCCAGC ATCATT TOTT GTGGAAAT TG 14451 TCCTTTGCCC AATGTATGTT CTTGATGCCT TTGTTGAAAA TTAGTTGACT 14501 ATAAAG, TGT GGATT TATTT GTGGGT TOTT TAT TOTGTTC CAT TGGTCTA 14551 TGGTOTGTT TTTATGCCAGTATCATGCAG TTTTGATAT TACAGGTTTG 14601 TAGTATAATT TGAAGTCAGG TCATGTGATG CCTCCAGOTT TGTTOTTTTT 14651 TOTCAGAATC TTATATTTAG AAAAACGTAA AGACTCCAAC AAAAAACCTG 14701 CTAGAACTGA TAAACAAATT CAT TAAAT TT GCAGGATACA ACATCAACAT 14751 ACAAAATTCA GCAGCATTTCAATATGCCAA GAGCAAATAA TCTTAAAAAA 14801 AAGAAAGAAA AAAAAACAAG AAATAATCCC ATTATAATA GCTACAAATA 14851 AAATAAAACA CCTAGGAATA AACCATACCA AAGAAG TGAA AGAT T TO TAC 14901 AATGAAAACT ATAAAACACT GATGAAAGAA ATTGAAAATGACATTAAAAA 14951 ATGGAAAGGTATTCCATGTT CATGGATTGC AAGAATCAAT ATTGTTAAAA 15001 TGTCCATATG ATCCAAAACA ATCTACAGAT TCAATGCAAT CCCTATCAAA 15051 ATACCAATGA CAT TCTTCAT TGAAATAAAA AAAAAGCCTA AAATTTAAGT 15101 GGAACCATGA AGG TAGATGT C TGC TATACA TAGAAGAT TA AGTACTCAAC 15151 AAACCTTGAA TATGAAGACT GGGGAAGTGA ATAGGCAGCT TCACTCTTCT 15201 ATCCCTGGT GAAATAGG AGAATGGATG TATAATG GGTAGCAGT 15251 TOT TACATGIT TOTCAATCAG CCAAAC TA CTACAGTCAA T T TGAATT TA 15301. TTGCATTTGA ATATATTGGA TTAAAAATAA AATCCTAAAA AAGGAGAGAA 15351. 
GCACATATAA ACCTGCGTCT TAT (TCATGT GTTCCTTTCT TTGTGGGTGA 15401 CT TT TGT TT T GAAATAAAAC CTGCAAAATA ACAGGACAGG GTGGAAGGGA 15451 GATGGGATCC CCTCTTTATG AAGAAGCAGC AGTCCTGTTT TATCACCTCT 15501 TCATTTCTG TITATTGAGAA TCAAGAAGA AGGAGGAGGA AGAGTTCACA 15551. TCCACAGACT GGTGTGGTTGAATAG T G TC TCTACTGTAT TCCAAATAGC 15601 AGCCAATGAG GCTGTTACAG TGAAGCCAGT CCCAAGATAA TTG TCTGTA 15651 CCCCTAT TOT CITAAGAAGCT AAATTGTGTT AGACTGAAAC CCATAAGGAA 15701 CCATTGTTCA AAGTTGGCTT GATCAAAAGT AAAGATTTTT AATAGTTTCT 15751 CTTAAT TAGA T TATT TOTA AGACATAGAVA TATGATTAC TATT TTATOT 15801 CTATAAT T T T CATCTGTATA ACGT T TACAA ATACTGAAAT AACOTT TGGA 15851 AAAAATGGC TAGCT ACTGCAA TATAT TATCCCCATA 15901 AAAGCOTAGG AAAT TGG TAC TATOGACTTTT AGTAT GTTCA TITTAVATAGAT 15951. GAAAACACAG AAACTCAAAG ATGTTAAATA TGGTGGCCAA GTTCACAAAG 16001. CTGATCATA ACAACAACAG GGCCTGAACT CCTGGTTTTC TGATTAATC 16051 TGTGACAGTG CACCTGGGTG CGCATGCATG CATCACCCCC ACACTTGCAC FIG. 3G Patent Application Publication Jul. 29, 2004 Sheet 18 of 36 US 2004/0146980 A1

16101 ATAGAACCTT TCCTAGTTGG CTTTGCTCCA TGATGACCAT TACTGTTCCT 16151 TOTACTTCAAAATAAGCAAATTATCCTACA GATTCAGAGC TGGTACAGGT 16201 GTGCGTCAA GCAGCCCATT CCATAGTCA GCTTGGGTT CACTCACAT 16251 AAAGTATTGA CCTAAATGGT ATATTTATCTAGATAATTCT ACCTTGTTAT 16301 TT TCAAAGCC CCAGTCT TGT TTGCTAAT TC TG TGCATCAT TT T TCTCTGA 16351 TTCTGAAAGG CAAAATTTTG TTGGGCAATT GCTGTAATAT GAGTTTTATC 16401 TCOTT TAGAG TOGAATGGAT GTGTATATGT CACATGCTCC CACTGGTTCA 16451 TCAGTACACA ACATTCTGCA TATAAAACAG GTAGAGTCTT AGTCATGGAA 16501 AACCATTOCA ATCOTTATTT TOAATA TATT TAAAAAGACA GAATTGACCC 16551 TGT TAACAGG CCTACCOTAA GAATOTTAAG AGCTGCTTC CAGTTGTCC 16601. TTGCTGCCTT CTGTATGCCT TGATTTCCCT GGAATTTAAG AGAAAGGATG 16651 TATGG TACA GACCAAGTAG ATGACATAAA TGAACACCAC CITTAAATCAG 16701 AGTTTTAAAA ATAGGCCCTG AACTGAAGCA AGAGG TAAAC TAGGGAAGCC 16751 TCAGGAGAACTGAGACTTCT CCAGAGAGAA GTATCTGGGA TTTAACTTCT 16801, TTCTAATGAG GCTTGGTTT CCATGAACTT, TTCCTTTAAA CCAAGGGGGG 16851 TATGCTCAT CT TOG TG AGCCCCAT TT GTCATAA TG TAAAAGGGT 16901 GGT TACATCC TTCTGGTGAT CTAGGAGCCC TATTTTCG TC CTAGCATACA 16951 GCATTT TTCT AAAATTITGCT GT TAGCTTTC ATGATTCTA COCTAACTAT 17001 TOT TTT TOTA AAAAACAT TT GITT TCAGO T TACCACCTG ATGAATTCAG 17051 AGCTTATGAC TGGGGAAATG ACGCTGATAA TATGAAACAT TACAATCAGG 17101 TGAGCTATTT ACAGTAACCC CAGCATGCTGATTTTGATAA ATTATAATAA 17151 AAAAT TATTT GAGGG TGGAA AGACTCC TAC CTG TOATT TG GTGGOATT TA 17201 TACTGATAGA ACTTTTTTTT AAAAAAATTT TAATT TAAT TTTAAT TAT 17251 TCAGAAAAT TATAAATA AAGAAGCATA TAGAAAGAAA CTACATCAT 17301 GTGTAATCCT TCCATCCAGA GATAACTAGA TG TACTAACA TTTTGGTGTA 17351 TTTATTCCAA TT TTCTCAGT ATTATATTGC TT TTAGACAA CTTTTAATCT 17401 T TOTAT TT TA CT TAAGOITAT AG TAAGAGAT AACTAATATA ACTGAGGGAT 17451 TAAATGC ATAATG GCTACATAAT AGAAATAT TCATAAAAAT 17501 CTTTACAGCA TAAATGAATA TACACTTTTT AATACCAACA GAAAAATTAG 17551 AATTCCATAT GAAAGTTGAA TAAGTATTAC CCAACATTGA AGACTTGGGT 17601 CGTAAGGCAT CTTTCTCCAT ATAGOTTTAT GACATAAAAA TOTGTAGCCT 17651. GTTTAGCAC CGTACTTTTA A TAMATCCG 'TCACECATTTT TCTGTCTCA 17701. 
TAGCCAGGGG CTTGGCTTATAAGTATGAAC TAAGCAAACT AAATTAAATT 17751 GTT TAAG TA TTTTCCCAGG CTATCATATT TTAAGCTATT TACT GGTGCA 17801 ACTATAGATT ATTAATAAGT TGT TTCTGAG GATCAAAACA ATCAGACTAA 17851 TCAATTTCTC AATAATGAAT TGGCCTGTTA GAGGAATAAT TCTACTAATC 17901 CTTAAAACCA CITACAAGAGA TAGACCATGT ATAT TT TATT TA TTT TAAA 17951 AATAAGT TTA AGATGTGATT TACATACAAG AACAT TACTA ATT TTGTGTG 18001 CCCATT TAA TAAGTTTTGA CAAATATATT TATT TGTGTA ACCACACCAC 18051 AATCTAAATA TAGGACGT ATATCACCAC TAAAAGT TCCTGCTC 18101 CTGAGACTAT TTATAGACAC AAATGCGTGTATTTGCAAAT GCTTAGAAAA 18151 GG TCTAGAAA AAAAAACAGT AAATGTTAAA GTGG TATCT TCAGAGAGAA 18201 GAAAGAAGAA AAGAAGTGGA TGGACATGAAACAG TAAAGG ACCCTCATTT 18251 TGGACTTTAC ATATGTOTGT TTTCTTCCATTATTTTGAAT AAACATGCTA 18301 TAT TATAAA TTAT TTACAT TTACAAGAAA ATGAAACAAA ATCAACACGC 18351 ACATTCAAGA TCAT TATGGT CAAG TACTAA AGTATGTGAG AG TGITTAATG FIG. 3H Patent Application Publication Jul. 29, 2004 Sheet 19 of 36 US 2004/0146980 A1 18401 TCCTTAGAAT TTGGCCACAG TTAGCTGGTC CTACTCTGCT CCAAGCCGGT 18451 CCTATTTTGT GAATTAATCT CATTTGATGC CAATTTTTAT TACATTCTCT 18501 CCAAAAAACTAGTCTCAACA GTTTGCTOTC TCCTCAAGTT CACAGCATTA 18551 TCTCTGCTAT ATCTATATTT TATTGAGTATAAGAGAATTA ACCCATGTAA 18601 GCTCCATGAG GGTAGGGATT TOT CATCGTT TGTTCACCA GGTTTTCTC 18651 ATCTTGAAGA GTACATGACA ATTACTGGGC TCCCAGTATC TATGTGITGC 18701 ATTAATGAAA TTTCTTAACT TTAATCTACC TCAAAATGTC TCTATGTTCT 18751 TGATTCTCTC CTTCCITT TCT CITATCAGAAA ATGATGGTCC TOTTAT T T TC 18801 CAAGTTAT TC CGGTCCTGTG CCCITTGATCC CATCTCT TCT CACTTCCCCT 18851 TOOT COTGC CTCCAT TOTC CTGTOCOTTA TGAAAAACAA GCAAGACCAT 18901 CAATTCTATC AAGTTATCAT TATGTCACTC TGTTCTTATC AACATATTTT 18951 TAGTATTGAA GAGGGCTTOT TOTACTTACT COTGAACCTT GTACAATGTA 19001 G TTAGGTCT TCATCTTTTT ATCATAGCTA CCITTATTTAA AGTCACCCAT 19051 GGCTTTTAAT TGCCAAATC AATGGCCTAT CTCACCTT TGAAATGTG T 19101 TATGTTCGT ACCACAGTCT CCITTGAAACT CAGTCCCCTG ACTTGGACTT 19151. 
CCATAACACAATGATTTCTG ATTTCCTTC TGTTTGTGA TG TCC TT 19201 GTCCCAGGCA OTGGCTACTC CACCTCCAC CTCTOTGAAA TCATAGCAT 19251 TCCCCAAGGA T TCT TCAAAA CTC TCT T TCT TCCT TGGAGA AG TCAGCATA 19301 GCTTAATTT GGACCATTTC, TATGGCTTAT CTAGATTTTT TCAGGACTTG 19351 CCTTCAACCT ATTCTTTCTG TAGGTGATTC CATTAACTGT. TGCCCATATG 19401 GTAGTCCGAA GACAGACCTC CGAGAAATGA CCCTTGTCTC CAAAACTTCC 19451 GCAATATGTC CAAATTTCCT AGCCTGACAT TCAGACTTG ATTATCTGCC 19501 TOCAAGTTTA TATOCTATOA TATTCCTTTA TATATTOTGT TOTCCAGG TA 19551 CACTGGGAAG CTTGCCATTC CTGATCATAG CCTACAAACT CTTCCTGCOT 19601 CCCACTCACC CTCATCTCTG CTGTCAAAAT GCAACCTTCC CTCAAGAGTC 19651 ATTTCACAGG ACCCCTCTTT CTATGAAGCC CTCAGGGGA AATAATTTTT 19701 TGCC.TTTTTT TOCAT TTAT TTTGGAGTG TT TATGGCAT TTAVACATACC 19751 TACT TGTA TACAAATAT TGCCTGOTO COTCTGC AAATTCTA 19801 AAGGTAGAGA CCATTGTATG TTTTCTTCAT ATGTTGCTGG TGCCTAACAG 19851 AACTATGGCC AT TGTCCACA TTCAT T TAGC AGCCT I TGTA GT TATTGCTT 19901 TGAGGAGCTT CCTCCATGA ATGCCCTGC TT TOTCTCCC ACAGAGTCAT 19951 CCCCCTATAT ATGACCTGAC TGCCATGAAA GTGCCTACTG CTATTTGGGC 20001 TGGTGGACAT GATGTCCTCG TAACACCCCA GGATGTGGCC AGGATACTCC 20051. CTCAAATCAA GAGTCTTCAT TACTTAAGC TATTGCCAGA TTGGAACCAC 20101 TTTGATTTTG TCTGGGGCCT CGATGCCCCT CAACGGATGT ACAGTGAAAT 20151 CATAGCTTA ATGAAGGCAT ATTCCTAAAT GCAATGCATT TACTTTTCAA 20201 TTAAAAGTTG CTTCCAAGCC CATAAGGGAC TTTAGAAAAAATGGTAACCA 20251 ACAATGAGGT TGTCCCCCAG CACCCTGGGG GAGATGCACA GTGGAGTCTG 20301 TTTTCCAAGT CAATGTGITT AGTGT TATT T ATGITT TAGAG ACATOTT TGC 20351 ATGGGACCAT CTACAGGTCC TTATAAACAA TGAGGTAGAT TAGGCAAAAA 20401 GATAAACAAG TTGCTACTCT ATCTGGCATT TAAGTCTAAT TAAATTGTAA 20451 TTTTTAGGGCATACCATGAA GTATAGAAAT GTCTGAAGCT TCAAAGGAAC 20501 AGTGAAATTC CTTAAGGTC CTATATGGAA ACCTCTGTTG TCATT T TATT 20551 TATATGGATT GCTATGGCAA TGGACAGAGT GTGGGATTAG GAGGAGGGCC 20601 TGTAACTTCT TTATAAAAGT TTCTTAGCTA TCCTGAAGAT GTATAGACAT 20651 TTT TACTT T T T TAGG TATT I TCAACATCAG AAATTCAAAA AAGTCCCCAA FIG 3 Patent Application Publication Jul. 29, 2004 Sheet 20 of 36 US 2004/0146980 A1

2O7O1 AGATTOTTOO AGAGAAGOOO TOT TTOTTA OAATOTATO COTGGOTATO 20751 TGCGTAAACG GAATOTTGAA CICCATAAT AG GATACATGTA TAAAATIOTTC 20801 CTTATAAAG CAGAAATAAA TTGTACAGCA TCAATATCAT TTTATAATCA 20851, TAGGGAGGOT TOTGITA GCATGTAATG CCCCCTAC AGGOTG 20901. TTCTTTGAGG GGTTTGAACA TTCCATGAAAAACTGACAGA TAGGAAACTG 20951 ACAATAAAAG ATTGAGCTAA AGATGGAAGC AGAAAGTACT AGGCTAGATA 21001 GTCTCTAAAC ATTAAGTATT TTCTTCCTCC ATCTTAAAAG CAATGAGAAG 21051 CCACCAAAAT ATTACCITA ATGGAAACCT GATGCCGCA TGTAAC 21101 CACCACTTG GCTGCTACAT AGAGAATGGA TTAGAAGATIG CCAACAAAAG 21151 ATTOTGAGCA AGTOTG TAAA TOTGATCAAG TGTTOTGATG CAGGCTGATA 21201 TCCTTCTGTG CTAAGAGAGA TGATCCTTGG AAAATCCAGA GCCAGCTCCA 21251 TAATACTTTC CTGCTCTGCT GGCAAATCCACAAGCTGCTG GCCCCTGGAG 21.301 CCATTCTTCT CTCAAAACTA GCATTCATCA ATTTAATGTA TACGTATTGA 21351 TGGGGAATAA TGGTCACTAT GAAAACCATG TGATAATATG GAAAAATACC 21401 CATGATATAA TGTTATGTGA AGAGAAGAAA ATGAAACTGG TAGAACTATG 21451 TGATTGCAAA TATATACAAA TAT TAAAACA ATTA ATGAC TTTATAAAAT 21501 AT TTGTATAT AATGAAAACT GAAGCAATAT AAAAAATAAA ATTAGTTGTG 21551 TCAGGGTAGT AACATGATGA GTGATAVATA GITTAAT · TAVATATAG 21601. TAATGACATA ATGTTACAAC TTGTCCAAAT CTCACAAACA TAATATTCAG 21651 TAAAGGAAGA TAAACATAAA AGAATACATA TTTATTATA CATTTTTATG 21701 TAGGCTAATT GATGGTTOTG AAAGCCTTAA AAAGCTTACT TTTAGGAGGA 21751 GAATCATGCC TTGGAGGACT CTAGGGTCCA GAAAAATGTC CTAATACTAG 21801 AGCTAGGTGC AGTCAGATTA ATTAAATAC ATTTCATTAT TTTGTCTGGA 21851 ATACCAAGAT GACTTCCAAG CAGGAATGGA GTCTAGCAAC ACTTTACTGA 21901 TGGGGAACTT GGCCACAGAC TTGTAATACA AATTTTIGGA TATGTTGACA 21951 ATGTTTCTCC T TA T T TOT TACT TATACA AAGCAAGAAA TTTGGCTCAC 22001 AACCTTGAAA CAGACTTACC AGGTTCCTCC AGTTTCCCAA GCCTCAATAT 22051 CITOATGCTA TITTAA CSEQ ID NO: 3)

SNPs:

DNA Position   Major   Minor

165            G       A
226            A       G
231            T       C
359            A
544            G       T
598            C       T
1621           A       G
2330           C       T
2498           A       G
2791           T       C
2877           T       C

FIG. 3J

2879
2912
3076
3745
3752
3762
3833
4399
4945
5056
5280
5790
5901
6457
6632
6763
6955
7017
7151
7308
7321
7542
8597
8803
9016
9967
10008
10363
10684
11177
12345
12349
13115
13354
13373
14677
14734
14747
14808
15086
15414
15722
15861
16264
16314
16877

FIG. 3K

16966          T       G
17147          A       G
17219          T       C
18628          A       G
18655          T       G
18984          G       T
19407          C       T
19531          T       C
19911          C       ?
20199          A       G
20243          G       A
20640          T       C
21.56          G       C
21163          A       T
21425          G       A
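The SNP table above pairs each DNA position with a major and a minor allele. As a minimal sketch of how such rows could be read into records, assuming whitespace-separated tokens and allowing rows whose allele columns are missing, the following may help. The names `Snp` and `parse_snp_rows` are hypothetical and are not part of the application.

```python
from typing import List, NamedTuple, Optional

# Tokens accepted as alleles; "-" is assumed here to denote a deletion.
ALLELES = {"A", "C", "G", "T", "-"}


class Snp(NamedTuple):
    position: int
    major: Optional[str]
    minor: Optional[str]


def parse_snp_rows(text: str) -> List[Snp]:
    """Parse 'position major minor' rows from flattened table text.

    A numeric token starts a new row; the next one or two allele tokens
    fill the major and minor columns. Unrecognized tokens are skipped,
    so a row may end up with one or both alleles left as None.
    """
    rows = []
    for token in text.split():
        if token.isdigit():
            rows.append([int(token), None, None])
        elif token in ALLELES and rows:
            row = rows[-1]
            if row[1] is None:
                row[1] = token
            elif row[2] is None:
                row[2] = token
    return [Snp(*row) for row in rows]


if __name__ == "__main__":
    for snp in parse_snp_rows("165 G A 226 A G 359 A 2879 2912"):
        print(snp)
```

Rows with missing alleles are kept rather than dropped, so positions whose allele columns are absent still appear in the result.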

Context:

DNA Position

165
TTATGGCCTAACCTTTTTAACTT TGAGTTATTTTCAAGAGAAAATTTGAAAAAGCAGCOT TTGAGGAGAAAGAAGCAACCAACAAACAAAAAGA TAACCACACTG TAATAGGAAATG TG TT TGAATAGGACAT TGGAAGAAAAATAAAATCAT T T I TACAG
[G,A]
TAGATCCCAAAGTCAAGGATCTATGTTCAACCATGGTG TCCACCATOTTCACAMATIGA ATGAGTAACCATCATTAAGCAGTTAGCTAGGCCGTAATATGATTCTTGGACTGAGATTT OAAAAATAOOAOAGGOOTOTOAAAGGTAOOOOT TTOTAGOTOOAOTATOATOTAATT T TAT TAAAAAAAAAAAAAAAGGAAAAAT TTGAGOTTCTAGAGAGTAGGGGOTIACCAT TTTG TATOOOACAGGGOOAAGGAACAAGT TTAATGTAT TOATTAAAT TAATTTOAGTATGAG
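Each context entry gives the sequence flanking a SNP on the 5' side, then the allele pair, then the sequence on the 3' side. A minimal sketch of how the two allelic versions of the local sequence could be assembled from such an entry follows. The function name `allele_sequences` is hypothetical, and the short flanks in the example are abbreviated from the entry for position 165 purely for illustration.

```python
from typing import Tuple


def allele_sequences(five_prime: str,
                     alleles: Tuple[str, str],
                     three_prime: str) -> Tuple[str, str]:
    """Return (major_sequence, minor_sequence) for one context entry.

    `alleles` is the (major, minor) pair listed between the flanks.
    A "-" allele is treated here as a deletion, an assumption made
    for this sketch only.
    """
    major, minor = (a.replace("-", "") for a in alleles)
    return (five_prime + major + three_prime,
            five_prime + minor + three_prime)


if __name__ == "__main__":
    # Abbreviated flanks around position 165, alleles [G,A].
    major_seq, minor_seq = allele_sequences("ACAG", ("G", "A"), "TAGATCC")
    print(major_seq)
    print(minor_seq)
```

The variable base sits at index `len(five_prime)` of either returned string, which gives a quick way to check a context entry against the full genomic sequence.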

226 TTAT GGCC TAACOTTTTT AACTTTGAGTT ATT TTCAAGAAGAAAATTITGAAAAAGCAGCOT TTGAGGAGAAAGAAGCAATCCAACAAACAAAAAGATAACCACACTG TAATAGGAAATG TG TTTTGAATAGGACATTGGAAGAAAAATAAAATCATTTTTACAGGTAGATCCCAAAGTCA AGGATCTATGTTCAACCATGGTG TCCACCATCTTCACAATTGA A,G) TGAGTAACCATCATAAGCAGTAGOTTAGGOOGTAATATGATOTGGACITGAGATC AAAAAACCACAGGCOTTCTGAAAGG TACCCCTTTCTAGCOCACTATCATCTAATTTT A TAAAAAAAAAAAAAAAGGAAAAATTGAGOTOTAGAGAGTAGGGGCTACOAT GT A COCACAGGGCCAAGGAACAAGTTTTAVATGITAT TOATTITAAAT TAAT I TCAGTATGAGT ATGAAATATAAATAGAAATATTGTAACATTATATATTTOTATATACTTTTATTATAT

231 T ATGGCCAACOT T I T TAACTTTGAGT TATT TOAAGAGAAAAT GAAAAAGCAGCOT TGAGGAGAAAGAAGCAATCCAACAAACAAAAAGAAACCACAC GAATAGGAAA GTG TTTGAATAGGACATGGAAGAAAAATAATAATCATTTTACAGG, AGATCCCAAAGTCA AGGATCTATGTTCAACCATGTGTGTTCCACCATCTTCACAATTGAATGAG T,C) FIG 3L Patent Application Publication Jul. 29, 2004 Sheet 23 of 36 US 2004/0146980 A1

AACCATCATTAAGCAGTTAGO TAGGCOGTAATAGATTOTTGGAOTGAGATTCAAAAA TACOAOAGGOOTTOTCAAAGGTTACOOOT TOTAGOTOOAOTATOATOTAATTAT AA AAAAAAAAAAAAAGGAAAAATTTGAGOTTOTAGAGAGTAGGGGOTACCATTTTG TATCOC ACAGGGCCAAGGAACAAGTTTTAATGTATTCATTTAAATTAATTTCAGTATGAGTATTGA AATATATAATAGAAATA TGTAACAT TATATAT | T TCTATATACT | TAT TATATAGAAA

359 CTTTGAGGAGAAAGAAGCAATCCAACAAACAAAAAGATAACCACAC GTAATAGGAAAG GT UTTGAATAGGACAT TGGAAGAAAAATAVATAATCAT TT TACAGGTAGATCCCAAAGT CAAGGATCTATGTTCAACCATGTGTGTTCCACCATCTTCACAAT GAATGAGTAACCATC ATTAAGCAG TAGCTTAGGCOGTAATATGATTOTTGGAOTGAGA T ICAAAAATACCACA GGCCT TCTGAAAGG TACCCCTTTOTAGCTCCACTATCATCTAATTTATAAAAAAAAA A y - AAAAAGGAAAAATTI GAGOT TCTAGAGAGIT AGGGGC TACOAT T T T GTA COCACAGGGCC AVAGGAMACAVAGTT TAAT GTATTOATT TAAATTAMAT TTCAGTAT GAGTAT TGAAVATATATA ATAGAVAATAT TG TAVACAT TATATAT TTTOTATATAC T T TATTATATAGAAAATATATAT TACAGAATATATTATTAAATATTG TAGAACAATATATAATACAGAAAAATATATAATACT CAGTAATATATTAAATACTTAT TAAAATAGCAAGCTTATATAGGAAGAGGATGGAGCAT

544 GCAGI TAGC TAGGCCGTAATATGATTCTTGGACTGAGA TTCAAAAAACCACAGGCCT TOT GAAAGGTTACCCC T TOTAGO TOCACTATCATOLTAAT T T TA TAAAAAAAAAAAAAA AGGAAAAATTTGAGCTOTAGAGAGTAGGGGCTACCATTTGTATCCCACAGGGCCAAGG AACAAGT I T TAATGT AT TOAT*TTAVAATTAVATTTCAGTATGAGTATGAAATATATAATAG AAATAT TGTAACAT IATATA I T TCTATATACT | T TAT IATATAGAAAATATATAT TACA G, T) AATA ATTATTAAATAT GAGAACAATA ATAATACAGAAAAAAAAATACTTCAGTA ATATATTAAATACTTATTAAAATAGCAAGCTTATATAGGAAGAGTGATGGAGCATTGTGA GAMAAGT T CAGO T TAT I TOT I TIGACAT TACT TGTTCTGCACAAACAAAAGAAT TACA GGAAT TGTCCAGAT TATTCAAATAACTCGAAGT TGAGGAGGGAATATAAGTCAATGATGT AGAAACTCTTTTAAGATTTGAGCTAGCCTACAATCTGTAAAGATCTGTGAAATTGAACTA

598 AGGCOTTOTGAAAGG TACOOCTTTOTAGOTOCOAO ATOA TOTAATT TATTAAAAAAAA AAAAAAAGGAAAAATTTGAGOTTCTAGAGAGTAGGGGCTACGATTTTGTATCCCACAGGG CCAAGGAACAAGTT TAATGTATTOA TAVAA TAAT TTCAGTATGAGTATTGAAAATA TAVATAGAVAATATTG TAACAT TAAT ATT TOTATATACT TTTATTATATAGAAAATATAT ATT ACAGAVATATATTAT TAAA TATT GTAGAVACAATATATAATACAGAAAAATATATAATA C, TD TCAGTAATATAT AAAACT TAT TAAAATAGCAAGCT ATATAGGAAGAGTGATGGAGCA T TGTGAGAAAGT I TCAGOT T TAT I TOT T TIGACAT TAO T GITT TOTGCACAAACAAAAGA ATT ACAGGAMATTGT COAGATT AT OAAATAAOTOGAAGTTGAGGAGGGAVATATAAG TOAA TGATGTAGAAACTCT TTTAAGAT T TGAGC TAGCCTACAATO EGTAAAGATO GTGAAAT T GAACTATATTTGTGCTATTTCCATATTAAGTCAAGGCAACAAATCAATATTAATAATAAT

1621 CGGCTTAAGCTCCACAGGCATACAAAGTGAAGCAGAAAACTGAGGCACGTG TGCOTCCAT TATCTGGTATCCATGTGGGGCTAGAGGTAAATTGTOGTATTGGCCTCCATTTCGC CITTAACCACCTGGTGTAAACAAAGGT ACTGTGCCAAAGT TIGACAGCAAGCCAAATCOOT TTGGCATGTGAATTAGTTTCCTCTGCCATACTGCTAGTTCCAAATTCCTTCTGGTTTCAG FIG 3M Patent Application Publication Jul. 29, 2004 Sheet 24 of 36 US 2004/0146980 A1 GATTTAGGAGTCAGGGTTGCOTCATCTTCCAAATGAGT TACAGTCACGCACATCCCTAC A,G) CACTGCATGGTGGCACTAGTTCCTGATATAGTTACTCCGTTGATCCTGATGAAGGA TCAAATGGGGAAGGGAGATACTATTGTOTOTGATTGTCCATTAAGATOTTGAGTATGTTC TAO TCCCTGTT GACACAGTGGT TGAAVAATGT TGOTAAG TOT TOCCAACAATGACAGA TACT CAGTGGAMAMACATGAAGGATTCCGTCAAACT GGTTAT TTT GOATCAT GTAGACOACT ATT TOCCAACCTGCAAGTGCATCATGGCCTTGGTGTGTCAGGGACACGCCTGGGTGTG

2330 AAAAGT TCAGAAGTTOOTCATCAATAAGGAGTOOTTGTGAGCAGGTGAAGOTCATOTAAC TAGG TAAGAGAAGACTATCAAACCAGGAGGCAGGTGGAAGGGCCAGTGCACGG CAGTCAGG TGCAAGAGOTOTGCAGTGAGGGGGGGAGTG TOCATOOTAGATOTOTCACC TOTTGGCTCTGTGACCT TGAGCAGGTCT TAAVATOTO TOTAAGCCTGTTTTTT TAAT TG A TAAAAGAGGATAA TAATAG TACCAAAATTAGGGAGAT TT CAGAGOT TAAAAACATA C, TD GTGAACTATT TAGAGTAAGCCTGCCATAWAGGGGAOTCAGTAGOT TATTAT TAGT TOAT ACAAT TGAAAAGT T TCATAATAT T TGCAGATIATAAGATGATCT TCAACCAGATAGCTAA TGTATGCAAAGCTATTTAGOTTCAGAAGTAAACTOTGCATTOTAGAAGTTAAATATTAC T TGT TATAGTGAAT TATCTGAA TATTATOTO GOTCACTT TTATAAGAAAAA TAGT GAAAGCATTITAT TAAGAAOTT ACACTGCACTAAAT GTT ATTAT ATGACTAATOOT CACTA

2498 AGATOTOCACOOTTGGCTCTGTGACOTTGAGCAGGTCTTAAATCTCTCTAAGCCTTTG T TAAT GATAAAATGAGGATAATAATAG TACCAAAAI TAGGGAGAT ICAGAGC TTAAATAACATACGTGAACTATTTAGAGTAATGCCTGCCAAAGGGGACT CAG TAGCTA TTAT TAGTTTCAACAA T GAAAAGT T TOATAMATAT T TGCAGATATAAGATGATOT TCA ACCAGATAGCTAAG TAGCAAAGCTATTTAGCTTCAGAAGTAAACTCTGCATTTCTAGA A,G) GTTAAA TATTACTTTGTTA TAGTGAAT TATOGAATATT TATOTOTT GOTOACTTTAT AAGAAAAATAGTGAAAGCATTTATTAAGAACTTACACTGCACTAAATGTTATATATGACT TAATCCTCACTATAACCCTATGAGATAGGTTACATTATTGTCCTAATTTTACTAACAAGG AAACCAAGAGACAAAGCTACTAAAACACTTGCCTGAGG TAGACATCTTCTTCTGTGGTG AGGCTGGATTTCAAAT TTAGACCATTTGACTG TAGCACT TATATGATGAGCATGCTG TT 2791 T TOTAGAAGTTAAVATATTACTTTGITTATAGTGAATTATCTGTAATATTATOTCTTGGCTC ACTTTATAAGAAAAAAGGAAAGCATTTATTAAGAACTTACACGCACTAAATGTAT AATGACAATCCCACTATAACCCTAGAGATAGGTTACATTATGTCCTAATTTTAC TAACAAGGAAACCAAGAGACAAAGCACAAAACACT TGCCTGAGGT TAGACA OTTO IT CTGTGG || GAGGCTGGATTTCAAATTTAGACCATTTGACTGTAGCACTTATATGATGAGCA T,C) GCTGTTTAGTGTTATAGTGTTGGTCTACCTTTGAATAGACATACTTTTAAACCATGGCAA GGAAGGAGACTGCACAT GAAVATATGAAAAT T TGCCTGGGGCCACGTGAGAAATA GTCACATCACTAGAAACTAATCATAAGCTT TTGTGTTTGGTTAAAGTT TTATTGATCCAT TOTGTACTGTGGGATTACTGGGCTAAC TAGGGGATACCLTCCACITACT GGCCATGG TATGAAAACCTGTCCTOTGAATOT I TAGATATT TGGCAAATTG TAGGCAAA

2877 ATTTATTAAGAAGT ACACT GCAGTAAATG TATATATGAOTTAATCOTCAGTATAACCC TAT GAGATAGGT ACATTAT TGTOCTAAT TTTACTAVACAVAGGAAACCAAGAGACAAAGCT FIG 3N Patent Application Publication Jul. 29, 2004 Sheet 25 of 36 US 2004/0146980 A1

ACAAAACACGCCTGAGGTTAGACATOLTOTTCTGTGGTGAGGCTGGATTTCAAAT T T AGACCATTGACTGTAGCACT TATATGATGAGCATGCTGT TAGTGTATAGTGTTGGTC TACCTTTGAATAGACATACTTTTAAACCATGGCAAGGAAGTGAGACTGCACATTGAAATA TC) GTAAAAT TGOOTTTGGGTGCCAOGTGAGAAATAG CACATCACTAGAAAGTAATCATAA GCTTTTGTGTTTGGTTAAAGTTTTATTGATCCATTITTCTTGTTTACTTTGTGGGATACT GGGCTTAACTAGGGGATACCTOCACTTTTTACTTGGCCATGGTATGAAAACCTGTOCOT GAATOTTTAGATATTTTGGCAAATTGTAGGCAAACAAAGACTTAAAGCAATTCAACCTTG ATTAAAATAAGACCAAAAATGCCTCCATACTTGATTAAATTTATTTCATTTTAGGAACTG

2879 AAAGAAC ACACGCACAAAG AAAGAC AACCCACAAACCCA TGAGATAGGTT ACATTATT GOOT AVATTTTAO TAVACAVAGGAAAOOAAGAGIACAAAGOIT AC AAAACACGCCTGAGGT TAGACA TOT TOT TOTGTGGTGAGGCTGGATTTCAAATT TAG ACCATTTGACTGAGCAO TATATGAGAGGATGCTGT T TAGGT ATAGTGTGG TOTA CCT TGAATAGACATACT I TAAACCATGGCAAGGAAG TGAGACTGCACATTGAAATATG

AAAAT T TGCGT T TGGGTGCCACGTGAGAAAAGTCACATCAGTAGAAACTAATCATAAGC TTTTGTGTTTGGTTAAAGT TTATTGATCCATTT C GTTACTTTGGGGATACTGG GOT TAVACAGGGGATACOCCACTT TTACTTGGCCATGGTATGAAAACCTGTTCCTCTGA ATCTTAGATATTTTGGCAAATTGTAGGCAAACAAAGACTTAAAGCAATCAACC, GA TAAAATAAGACCAAAAATGCCTCCATACT TGATTAAATT TATTTCATT TTAGGAACTGGA 2912 TATGACTTAATCCTCACTATAACCCTATGAGATAGGTTACATTATTGTCCTAATT TACT AACAAGGAAACCAAGAGACAAAGCTACTAAAACACT GCCTGAGGTTAGACATCTTC TC TGTGGTGAGGCTGGATTTCAAATTTAGACCATTTGACTGTAGCACTTATATGATGAGCAT. GOTGITT TAGTGT TATAGTGTTGGOTACOTT TGAATAGACATAOTT TAAVACOATGGCAA GGAAGTGAGACTGCACATTGAAATAGTAAAATT GCOTTTGGGTGOCAOGTGAGAAATA A,G) TCACATCAGTAGAAACTAATCAAAGCTT T TGGTTGGT TAAAGT TT TATTGATCCATT TTTCTTGTTTACTTGTGGGATACTGGGCTAACTAGGGGATACCTCCACTTTTTACTTG GCCATGG AT GAAAACCGTCCCTGAATCTTTAGATATTTTGGCAAAT G TAGGCAAAC AAAGACTTAAAGCAATTCAACOTT GATTAAAATAAGACCAAAAATGCGTOCATACTTGAT TAAAT TATTT CATTTTAGGAVACTGGATTATAMATOAAGACAAOTTOTACATGAAAAAATA

3076
OTTA TATGATGAGCATGCTGTT TAGTGTTATAGTGTGGOTACOTT TGAATAGACATAC TAAACCATGGCAVAGGAAGTGAGACITGCACATGAAATATGTAAAATGCOTGGG TGCCACGTGAGAAATAG TCACATCACITAGAAACTAATCATAAGCT T TTG TGTT TGGTTAA AG TITAT TGATOOATTTTTOTITGTT TACTTTG TGGGATAOTGGGOTTAVACTAGGGGATA CCTCCACTTT T TACTTGGCCATGGTATGAAAACCTG TCCTCTGAATOTT TAGATATTTTG
[G,T]
CAAATTG TAGGCAAACAAAGACTTAAAGCAATTCAACCITGAT (AAAATAAGACCAAAAA TGCCTCCATACTTGATTAAATTTATTTCATTTTAGGAACTGGATTATAATCAAGACAACT TOTACATGAAAAAATAGATTAATAG GOTCCAAGTTAG TCACTG TATT TATTOOTTTTT ATACATTATCTGCCTTOGGTGTTATTCAAG ITTCATTAATCATTAATAATTTCACTAAT CATTTTATTTCATTAATCAACATTGATAGTTAAAATTAATCTGTGAATATTAAATGTTTT

FIG. 3O

3745 TGGGGATTCCTTGATTTGGAAAATGAAGTGAATCOGAGGTGTGGATGAATACTGTAAG TCATGGAAAACTGTGAAGAACATCAAATAAAGCAGGACTAATGGAGTATGAGGTTACGAA AGGTCCTGT TGAACAGAAAA TOTOT GAAAAACAGATAAAAGTAGATGGT I T TAACC TCTGCAAGAGTCAAGCTAGTTAGATCTTTGTCTGAAAAACAAATACTGCCGG, AATGAA AACCAAATTGTGOTAT TGIGCTATOTATOTATOTATOTATOTATOTATOTATOTATOTAT C,G) TATCITATOTTATO TATT TATCTATOTATCITATAGATAGAACCCCTCTTTTGAAT TATGT TTAAGAATATCAAGCTATTTGTTGATATACATGATTGCCTTCATTGATCTATAGTTCT ATTACITT TAAAGCAAGAGGGG TOTCAAAAGACAATTGACTTGATAATATAGOTTTGTCA GAAAGAATGGGTCAATGCTAAATTTTOCCCCAACCCCCCAAAATATTAGCCAATAG TAGA ATT TTT TAAAAT TOTACT TATTG TATTAAGACT T TATT TATTAVATTITTACAGTTACC

3752 TTCCITTGATTGGAAAATGAAG TGAATCCTGAGGTGTGGATGAATACTG TAAGTCAIGGA AAAGTG TGAAGAACATCAAATAAAGCAGGAGTAATGGAGTATGAGG TAGGAAAGG TOOT GT TGAACAGAAAACTTCTGATAAAACAGAAAAAGTAGA GGT TI I TAACCCTGCAA GAG TCAAGOTAG TAGATO TTG TOTGAAAAACAAATAOTG TOCGGTAATGAAAACCAAA TGTGCTATTGTGCTAT CATCTATO TACTATCITACTATCITATCITACTATOTA TOTA [T y - . . CAT CAT T ACT ACT ACTATAGATAGAACCTCCC. I GAA T ATG AAGA ATATCAAGOTA T TGT GATATACATGATTGCC TOTA TGATCATAG TOTA AO T TTAAAGCAAGAGGGGTCTCAAAAGACAATTGACTTGATAATATAGOTTGTCAGAAAGAA TGGGTCAATGCTAAATTTCCCCCAACCCCCCAAAATATTAGCCAATAGTAGATATTT TAAAAT OTACT TATT I TGTATAAGACT T TATTA TAAT TT TACAGTTACCTGGTGCT

3762 TGGAAAATGAAGTGAATCCTGAGGTGTGGATGAATACTGTAAGTCATGGAAAACTGTGAA GAACA CAAATAAAGCAGGACTAATGGAGATGAGGTTACGAAAGGTCCTGTTGAACAG AAAATOTOTGATAAAACAGATAAAATG TAGATGGT TTTTAACOTOTGCAAGAGTCAAGOT AGTTAGATOTT TGTCTGAAAAACAAATACTGTCCGGTAATGAAAACCAAAGTGCTAT T GTGOTA TOTA CITATO TATCITATCITATO ATCTATO A CITAT CITATO A OTACTATT - C y ?[ ATCTATCTATOTATAGATAGAACOTCOTCTTTGAATTATGT TAAGAATATOAAGOT ATT TGTTGATATACATGAT TGCCT TOTA TGATOTATAGT TOTAT TACT T T TAAAGCAAG AGGGGTCTCAAAAGACAATTGACTTGATAATATAGCTTTGTCAGAAAGAATGGGTCAATG OTAVAATT TCCCCCAACCCCCCAAAAAT TAGCCAAAGTAGAAT I T T I TAAVAA TOTA CITTATT TIGTATTAAGACT TTATT TATTAAT TT TACAGTTACCTGGTGCTACAAATTTCA

3833 AAAGCAGGACTAATGGAGTATGAGGTTACGAAAGGTCCTGTTGTAACAGAAAATCTCTGA TAAAACAGAAAAAGAGATGG TTTAACCCTGCAAGAGCAAGCTAG TAGATC TGTC-TGAAAAACAAATACTGTCCGGTAATGAAAACCAAAT TGTGCTATTGTGCTATCTA CITATCITATCITATOTATCITATO TATCITATOTATCITATCITATOTTATO TATT TACTATO TAT CTATAGATAGAACCTCCTOTTTTGAATTATGTTTTAAGAATATCAAGCTATTTGTTGAT A,G) TACATGAT TGOOTTOTA TGATCATAG TOTAT TACT T T AAAGGCAAGAGGGGCCAA AAGACAATTGACTTGATAATATAGOTTTGTCAGAAAGAATGGGTCAATGCTAAATTTTCC CCCAACCCCCCAAAATA TAGCCAAAGAGATATTTTTTAAAATTOTACITTAT TTGA TAAGACITATATAATTACAGTACOTGGTGCTACAAATCAGATAATCACC FIG 3P Patent Application Publication Jul. 29, 2004 Sheet 27 of 36 US 2004/0146980 A1

OTAVATAWAGCACACAACAGATGGT TGT T T GATCC T T TATATCOTT TGGAGAAG TO

4399 GTGATICCITATATCCTGGAGAAGTCCAC TAVACIGACTGTATACTGGG CAGAGTGAAATCATCATCTACAATGGCTACCCCAGTGAAGAGTATGAAGTCACCACTGAA GATGGGTATATACTCCTTGTCAACAGAATTCCTTATGGGCGAACACATGCTAGGAGCACA GG TACAAGATAG TOTOTOOTGAAAAGGGGACTGCAITGACCTCCTGCTTCCAGGAGGA ATTTAATGCTAGATAGCATCAAGAGAGTT ATCAAAA TGGTTTGAATTATTGGATTAG T,C) OTTAAATAGTTACAGGGAGGOTCACTOTTGCOTGATAATTOTOTGAMAGACAGACAGG AACOTAAAAATACAAACAGCAAGAOTGATOTTGOTAACTGCAACCAGAGGTACTTG TTAG GGTGTAAVACAGAAAGGCAGAGCCTGCATTTTGTCACOTCAT TAOGA TATCATGTGGA AAMATT GOTT TGTCCCAGGAAAATGGACCTCCAT TGTCAGAAGGAGAT TOTAGG TG TATGAAAT TGACTCTGGGGCACCCAAGAAGAACCTCTCCTGCTCCCACTAAAATTAAGGG

4945 AATTGACTCTGGGGCACCCAAGAAGAACCTCTGCTGCTCCCACTAAAATTAAGGGGCOTC COTOTGCAGGATAAAAAACAA TOTAG TAAATGACAACGCAT TTOTGAAAAGTTTTOCAG GACTGAAAACCFTAACATCCACATACACT TTGATCTAAGGGACAGACGGTTCATAGAATG AAAGAG TAITGG TGTCAATAAGGCT TGAAT TCTAGAATGAGGAGCCAGCCATGCCATAGCA GGGGAATGATACTCCTTAAAAGGGAAAATTTAACTACAAATCCTCTGAAGTAGAAATGAT A,G) . . . AGAATAACCAAAATATCTGCAATGGTCAAAGCAAATAA TITATGGCAGGTGCT ACC GTGT TCAT T TGCATOT T T T I TOOOACOACACATATAAGGAGOAGOTCAAGTCATGT T T GACATTOTOTOCOTOT TTTAT OTCCAGTTTCAGAATGAAAAATGAGAGTGAGAT ATGAGT AG T T TACTAGT TAAAA TATGAAACACGCAGT TAAAT T GAAGGTCAGATAAACAACAAA TAATT TTG TATAAGTCTCATTTTAMAGATAAACTAAAAAGTCATTATTITATTCACTATTA

5056 GTCOAGGACTGAAAACOT TAACATOCACATACACT T GATOTAAGGGACAGACGGT T CATAGAATGAAAGAGTATGGTGTCAATAAGGCTTGAATTCTAGAATGAGGAGCCAGCCAT GCCATAGCAGGGGAAGATACTGCTTAAAAGGGAAAAT I AACTACAAACCCTGAAGT AGAAATGATAAGAATAACCAAAATATOTGCAATGGTTCAATAGCAAAAATT TATT GGCA GOGO ACCGTGTCAT TTTGCATOLIITTTTTCCCACCACACATAT TAAGGAGCAGOGA A,G) GTCATGTTTGACATTCTCTCCCCTTTTATCTCCAGTTTCAGAATGAAAAATGAGAGTGA GATATGAGTAGTTTTACTAGTTAAAATATGAAACACCCAGTTAAATTTGAAGGTCAGATA AACAACAAATAAT I | TGTATAAGTOTOAT T T TAAGATAVATACTAAAAAGTCA TATT TAT TCACTATTATCACTAT T TATAAAATTT TGTAGAGCATCCTGGATCT T T T TGC TACT T T T GT T II AT T T I TGCTAAATCGGCAACCCAGGCACATGGGAAGGAGC GTGAAATA

5280
AAAAATTTATTGGCAGOTGCTTACCGTGTTCATTTTGCATOTTTTTTCCCACCACACAT ATTAAGGAGCAGOTGAAGTCATGTTTGACAT TOTOTOCOTOTTTTATOTCCAG ITTCAGA ATGAAAAATGAGGAGTGAGATAT GAGTAGTTTTACTAGTTAAAAAGAAACACOCOAGTTA AATTTGAAGGTCAGATAAACAACAAATAATTTTGTATAAGTCTGATTTTAAGATAATACT AAAAAGTCATTATTITATTCACTAT TATOACTATT TATAAAATT TTG TAGAGCATOOGGA
[T,A]
CT TTTGCTTACTTTGTTTTTATTTTGCTAAATOTGGCAATCCCAGGCACAGTGTG AAGGAGOTG GAAATATAAAAGGAGAAAACT TATGGGAAAGATTTGGCTTAAGGAGAG

FIG. 3Q

ATAATTTTGGAAAGATTTAGAATTAAAGATCATTCATTAGATGTAATGTTCTAAATACTT TATATCAGTTAAACTTCTGATCAACAATATGAGATGGGTACCACTAATAGTCACCATTTC ACAAATGATGAAATTAAGGCACAACCGGTTATGTTAAGAGGCCTAAAGTCCACAAATAGC

5790
TGAGATCGGTACCACTAATAGTCACCATTTCACAAATCATCAAATTAAGGCACAACCGGT TATGTTAAGAGGCCTAAAGTCCACAAATAGCAAGCTGACAGACCAGAATTAAGCCCAGG CATGCTGGCTCCAGAGCCTGTGCTCTTAGTCATTAAATTATAGTGCCTTACTTGACCTTC CACCCGGTTACTTGGATCTCCCTGAATGCTCTCCTCCCTCAGAAATACTGGAAGTTG GCAGAGGGACACTGAGCTGAGCATATTATTGTAGTTTTAAATGCTCTOCACTGGACAGA
[A,G]
GATGGGGGATTTGAATAGAAATTTGGTGAGGAACTAATCAGTGCCATTTACACTCACOT CCTOTTCCTCCCTGGAAGAGCTATAGGACTTGAGTAAGCATGATAAATTTCGTGTOTTTG TAAACCACACCICAGGAVAAT TTG TATATACAAATACATAGAGOACAGAGI TATOAGGACA GAOTT TIGACATAAAAAGAACTGGGTT GAGTCCCTGCTGTGGCCT TOT TATCTGGGTGGC CCTCTGGGAAAGT TACT TAACACATAAAGT I T TGT TI TOCATATO TACAAAATGAGGTT

5901 AAGCCCAGGCATGCTGGCTCCAGAGCCTGTGCTOTTAGTCATTAAATTATAGTGCCTTAC TTGACCTCCACCCTGGTTACTTGGATCTCCCTGAATGCTCTCTCTCCCTCAGAAATAC TGGAAGTTGGCAGAGGGACACTGAGCTGAGCAATTATTGTAGTTT TAAATGCTCTCCA CTGGACAGAAGATGGGGGATTTGAATAGAAA TTGGTGAGGAACAATCAGTGTCCAT T ACACTCACOCOCTTCOCOOGGAAGAGCAAGGACTTGAGAAGCAGATAAATT C y T GG TOT T GAAACCACACICCAGGAAAT I TGTATATACAAATACATAGAGCACAGAGTT ATCAGGACAGACT I TIGACAAAAAAGAACTGGGT I TGAGTGCCCTGCTGTGGCCT TOT AT OTGGGTGGOCOTOTGGGAAAGTTACTTAACTACATAAAG ITTTGITTCCATATOTACAAA ATOACSGTOTOAAAATAGOAGOTAGTAAGAG TGT TGOAAGAATT TAGTAAGOAA TACATATAAATAOGTCAACATAGCACCAGGTACAAAAATATGTGOTCAAGAAACTGAAGT

6457 CAACATAGCACCAGGACAAAAATATGTGCTCAAGAAACTGAAGTTACCTGATTATAAG COATACTAT GACAAGGGAAAAGTGAAAACAGT T TG TACCAGTGTGTAGG TGTGTGTCTGTGATGTTCCGACATGCTCTAT T TAACATAAATTACTCTCACTCTTCC TCTCCTCTTTCCTTCTCCCTCTCTCATCT ACCCTTTCCCCCACCAGGTCCCCGGCC AGTGTGTATATGCAGCATGCCOTGITGCAGACAATGCCTACTGGOTGAGAVATATGC C, T AAGGAAGCC GGATTOO TOTAGOAGATGCAGGT ATGATGTAGGAGGGAAACAGT CGGGGAAACACTTGGTCAAGAAGACACAAAACACTCTCAGAGACAGAGAGAAATTCTGG GOOTTAGGTAAATATAGCTAAGAAAACTCAAGGGGGAAA TGGAGGCAAT T TTAAAAA AATAACGTGGACGCTATTAAT GATTATCTTTGACGCTTGAAG CATATAGCTCCT TGTAG T T TCTGI TAAGATO TCAAAGGAGGGTAACAGCAAGAAGOTO, TGAT I T T TCACTGATTCTC

6632 TTCTCTCTCTCCTTTCTCTCTCCCTCTCTCACTTACCCTTTCCCCCACCAGGTCCC CGGCCAG T G T G TATATGCAGCATGCCCT G T GCAGACAATGCCTACTGGCTTGAGAAT TATGCCAATGGAAGCOTTGGATTOOTTOTAGCAGATGCAGGTTATGATG TATGGATGGGA AACAGCGGGGAAACACTTGGCAAGAAGACACAAAACACTC TCAGAGACAGAGAGAAA TTCTGGGCC TTAGGTAAATA AGOAAGAAAACCAAGGGGGAAATTGGAGGCAATT T, A FIG 3R Patent Application Publication Jul. 29, 2004 Sheet 29 of 36 US 2004/0146980 A1

AAAAAAATAVACOGGACGCTA TAVAGATTATC TGAOGC GAVAGTCAATAGOTOCT TGTAGTTTCTGAAGATOTCAAAGGAGGGAACAGCAAGAAGCTOTGATTTTTCACTGA OTOCCACAAGCAAAGTATGGCA TCAACAAGATCA TT TACATCOAVAT TOTGTGAA TTOTATGCATTAAAAGTATGTCCAAAGAGACAGCTCAGGAAATTATCATGACCAATGTGC ACATTICAT TCAGCCAATGTTTACTGAGTGGCACCTGTATGCGCTGTCTAGGCCCGGAAC

6763 AAGCCTTGGATTCCTTCTAGCAGATGCAGGTTATGATGTATGGATGGGAAACAGTCGGGG AAACACTTGGTCAAGAAGACACAAAACACTOTCAGAGACAGATGAGAAAT TOTGGGCOTT TAGGTAAATATAGCTAAGAAAACTCAAGGGGGAAATTGGAGGCAATTTTAAAAAAATAA CGTGGACGCTATTAATGATTATCTTTGACGCTTGAAGTCATATAGCTCCTTGTAGTTTCT GTAAGATOTOAAAGGAGGGTAAOAGOAAGAAGOTOTOAT TT TOAOTGATOTOOOAOA A,G) GCAAAGTATGGCATTOAACAAGATCAT T T TACATOCAVAT TOTGTGAAT TOTATGCAT AAAAGTAGTCCAAAGAGACAGO CAGGAAAT AT CATGACCAATGTGCACA CATTICA GCCAATGTTTACTGAGTGGCTACTGTATGCGCTGTTOTAGGCCCCGAACATTCAAACAGG GAACAGACAAACTCTGACCTCACAAAGCT TATGTTCATT TAGTGATAAT TTTACAAGTC A TGCTCCTGGATGCCAACAACTGTGTAAAGAGAT T TGGACCAGGACCT AT GAT T 6955. TAATGATTATCTTTGACGCTTGAAGTCATATAGCTCCTTGTAGTTTCTGTTAAGATCTCA AAGGAGGG TAACAGCAAGAAGCill CT GAT T T T TCACill GAT I CTCCCACAAGCAAAG TAITGG CATTTCAACAAGATCATT TTTACATCCAATTCTGTGAATTCTATGCATTAAAAGTATGTC CAAAGAGACAGO CAGIGAAAT AT CATGACCAAT GGCACAT CATCAGCCAAT GTTTA CTGAGTGGCTACTGTA GOGOTGT TOTAGGCCCCGAACA TCAAACAGGGAACAGACAAA - T y C TCGACCTCACAAAGCTTATGTTCAT T I TAGGA AAT T T TACAAGTCAT GOTCCTGGA TTGCCAATCAACTGTGTAAAGATGATTTGGACCAGGACCTTATTGATTTAGAGAAACTGT GAT TIGAT I TAGAGAAACTGAGATCGCACATAG TACCATTTTCAGGAAAACCCAAAT TA GATTT || TAAAACCTTGTTAATGGGCAATGAAGAAGAATCTT || TTTGATATCTTGTT TCTT TTAATGGAAGAGTTTTCTGCTGTCACCAGAGGACAGGCTGATGCCTGCGATAGACTTTTC

701.7 GGAGGGTAACAGCAAGAAGOTOTGATTTT TCAOTGATTOTOCCACAAGCAAAGTATGGCA TTTCAACAAGATCAT T T TACATCCAA TCTGGAAT TOTA GCAT TAAAAGTATGTCCA AAGAGACAGCTCAGGAAATTATCATGACCAATGTGCACATTCATTCAGCCAATGTTTACT GAGTGGCTACTGTATGCGCTGT TO AGGCCCCGAACA CAAACAGGGAACAGACAAACT CGACCTCACAAAGCTTATGTTCATTTTAGTGATAATTTTACAAGTCATTGCTCCTGGAT T, G GCCAACAACTGGAAAGAT GAT TTGGACCAGGACCT TA TGAT T TAGAGAAACTGTGA TTGATTAGAGAAACTGAGATOGCACATAGTACCATTT TCAGGAAAACTCCAATA TAGA TAAAACCTGTAATGGGCAATGAAGAAGAATCTGATACTGTCT AATGGAAGAGTTTTCTGCTGTCACCAGAGGACAGGCTGATGCCTGCGATAGACTTTTCTT TOT TOAGGCCTAAGCTCCCTGT TGGT TGAAACCTGATGCTAGAACAGACTGTGTAT TO

71.51 GAAAT ATCATGACCAATGTGCACAT TCA CAGCCAAGT I TAGTGAGTGGCAACTGTA TGCGCGTTCTAGGCCCCGAACAT TCAAACAGGGAACAGACAAACTCTGACCTCACAAAG CTTATGTTCATTTTAGTGATAATTTTACAAG CATTGCTCCTGGATTGCCAATCAACTGT GAAAGAGA TGGACCAGGACCT TA TGAT TAGAGAAACTGGAT GAT TAGAGAA FIG 3S Patent Application Publication Jul. 29, 2004 Sheet 30 of 36 US 2004/0146980 A1 ACTGAGATCGCACATAGTACCATCAGIGAAAACTCCAVATATAGATAAAACCIT GT TAAGGGCAATGAAGAAGAACTT GAATCTGTTCTTT TAATGGAAGAGTTTT CTGCTGTCACCAGAGGACAGGCTGATGCCTGCGATAGACTTTTCTTTCTTCAGGCCTAAG CTCCCTGTTGGTTTGTAAACCTGATGCTAGAACAGACTGTGTATTOOTATACATTAVATA AAACATTCAGTACCCAOTGAAAGTTTGAGAATAG TGGAGGAA TAGAATAGAATG TATAG TCTGAGTCTTGGGCAGGGGCAAGCATCAGGAAATATTGAATCATAGTCTTAGGAGGT

7308 CT COGGATTGCCAATCAAOTGTGTAAAGATGATTTGGACCAGGAOOTTATTGATTTAGA GAAACTGTGATT GATTTAGAGAAACTGAGATCGCACATAGTACCATTTTCAGGAAAACTC CAATATTAGAT TT TAAAACCT TGT TAATGGGCAATGAAGAAGAATCTTT T T TGATATCT TGTTCTT TTAATGGAAGAGTTT CTGCTGTCACCAGAGGACAGGCGATGCCTGCGATA GACTT TTCTT TOTTCAGGCCTAAGOTCCCTGTTGGT || TGAAACCTGATGCTAGAACAGA C,G) , TGTGTATTCCTATACAT AATAAAACATTCAGTACCCAC GAAAGTTGAGAAAGTGG AGGAATAGAATAGAATGT ATAGTC-TGAGT GT TGGGCAGGGGCAAGCATCAGGAAATAT TGAATCATTAGTCTTTAGGAGGTGTCACAACAATTCTCCTATTCTTGTAAGTCCCAATCT ATAGATTTOOTCACATGTOTTTTAATAAACAGGCTTOTAGCTTATGGAATACOTGATTT GACTAAATGTTATATAGGCCCTTTTGTTCCTCCTGTOTGAAGAACAAAATACTAGTACTA

7321 AACAACTGTG TAAAGAT GAT I TGGACCAGGACOT TA TGA I TAGAGAAACTGGA TG AT AGAGAAACTGAGATOGCACATAGACCATTCAGGAAAACTCCAATATTAGATT TTAAAACCT || GTTAATGGGCAATGAAGAAGAATCT TETTTGATATCTTGTTTCTITTAAT GGAAGAGT I T TOT GOTGTCACCAGAGGACAGGCTGAGCCTGCGATAGAO T I TO T TOT CAGGCCAAGCTCCCTGTTGG TTG TAVAACO GATGCTAGAACAGACTGTG AT COA T, CD TACAT (AATAAAACAT TCAGACCCACTGAAAGTT GAGAATAGTGGAGGAATAGAATAG AATGTTATAGTCTGAGTCTGGGCAGGGGCAAGCATCAGGAAATATGAATCATTAGTC TT TAGGAGGTGTCACAACAAT TOT OCCITAT TOT TG TAAGTCCCAA TOTA TAGAT T TOOTOA CAGTTO TTTTAVA TAAACAGGOTTCTAGCTTAT GGAATACC TGATT GRACTAAAT GTTAT ATAGGOOOTTTTGTTOOTOOTGTOTGAAGAACAAAATAOTAGTAGTATGGAATATTGG TA

7542 GOGATAGAO I TOT I TOTTCAGGCCTAAGCTCCCTGTTGGTTTGAAACCTGATGCTAG AACAGACTGTGTAT TCCAT TACAT TAATAAAACAT TCAGTACOCACTGAAAGT T TGAGA ATAGTGGAGGAATAGAAAGAATGTTATAGTCTGAGT CTTGGGCAGGGGCAAGCATCAG GAAATATTGAATCATTAGTCT || TAGGAGGGTCACAACAATTCTGCTATTCTTG TAAGTC CCAA TOTA TAGA I CCTCACATGITTO I TAAAAACAGGOT TOTAGOT TATGGAATAC C, T GATT TGAGTAVAMAG TATA AGGOOCOTT T TGT TOOT CO GTOTGAAGAMACAAMAATACA G TAGTATGGAATA TGG TATATATTAAATATATATOTATATATOCATG TGGACAGGAATA CTACTACAACAACA OTTACTGAGCACCGACTGGCAGCCAGAGTOGTT TOT CATACT ATTAAACCCCGTTAGCAGCCCCGTAAACCAGGIACACCOTGTTTATTTCCCAAATGAGA AAACATAGGCTCAGAGCATTTCAGTAATTTCTCAAGAGTGCAAAGGCCATAAATAGTAG

8597 ATAAAACTGG ICAGGAGAAATTG TAT ICATTGGACATTCACTGGCACTACAATAGG TA TGTTTATGAGGGTCACTGTTAGGTGTGTTT TGAGGGTCAGTTTTOTCAGAGTOTTACAG

FIG. 3T

GAGTTCACCTTTATGTTGGAATAAAACAACTGT ACT THATAG TGCCC. TCAATTOCOTGTC) C TCTGC TGGGAATAACCCITAG TACTCITAAG TAGC TG. TGAGCC TGCAG TGCACAGACITATA GTAGGGCAAAGGTTTCCTGGGCTCTGGTCAGAGCACGCATA TGACTACGGTGATGCAA T, C TCCCAGGAAAACAGTGTTCCAAA TCAAAGAAATAAT TOCACAGAGAAGT OAG ATTOOOTOTGAGOTGAAAAAGTAAAATTCAATGCCATGGAATATGGOTGAAACATAATAA ATGTGCATCAATCATCTCTTTCTCACAACCCAAATGGGATTTTTAAAAAATAAAAGGGAA GGGCTATACCTATATTTAAACAAATTGAAAAGGCATGGTTATATTTGTTGTGAGTTGG AACACACAAGOTTAGTATAATAAATCAA TGAGOTTATOTATTCAGTGTGTGATTTAGTA

88O3 TAAGTAGOTO, TGAGOOTGOACTGOAOAGAOTATATGTAGGGOAAAOOTTTOOTGGGTOTO TGGTCAGAGCAGGA TATTGACTACGGTGATGCAATTTCCCAGGAAAACATGTGTTCCAA ATTCAAAGAAATAATTOCACAGAGTAAGTTTOTAGATTOOOTOTGAGOTGAAAAAGTAAA AT TOAATGCOATGGAATATGGCTGAAACATAATAAATGTGCATCAATCATOTOTT TOTCA CAACCCAAATGGGATTTTTAAAAAATAAAAGGGAAGGGG TATACCTATATTTAAACAAA CT …??????? TGAAAAGGCAGGT TAAT T TGT T G GAGT TGGAACACACAAGGT TAGTATAAAAAC AATTGAGCTTACTA CAGTGTGTGATT TAGTAT TAGAAATAGCAAGAAATGTAAG CACTATGTAGAAATTTCTAAAGTTT TE TAAGCTGACAACTACTTCTTAATTTACTTACT TACAATTACT ACAATTACT TOCAGGATT TGGAAAGAAATCAATAATCTAG TTCCAAGTAAAAGT TGAAAGGAACCCACACTAATAAAAGCT || TGAATTGTCATTGAACT

9016 AAATGTGCATCAATCATCTCTTTCTCACAACCCAAATGGGATTT TAAAAAATAAAAGGG AAGGGGT ATACOLTATA TAAACAAAT TGAAAAGGCATGGT TATATT GITT TGTGAGTIT GGAACACACAAGCT TACTATAAAAAT CAATGAGO TAOTAT CAGGTGTGATT AG TAT TATGAAATAGCAAGAAATGTAAGCACTATGTAGAAATTTCTAAAGTTTTTTAAGC TGACAACTTACTTCT (AAT || TACTACTTTACTTAATTTACTTTACAATTTACTTTCCAG G,A) TAT T T TGGAVAAGAAAT CAATAMATOAGT TCCAAG TAAAAG, TGAAAGGAACCCACACTAA TAAAAGOTTTGAATTTG CAT GAACTTCCACTAAAGT TTCCAATTT TAAGAGAATAAAT CATGTGAAAGTGCAATAT T TCAG | T TAGGGAAATAT T | TCAI IATCACCACTATCATCAG TAACAAACATATAT TCAT TAG TAll T T TAGAT TGACAGGCACT T | CCAAGCTCAGAACAGG CAGT TAGCATCAG CAGOATATAGTAAAAAAGTATCAAAGAAOTCATAGGAGATCAAAAA

9967 GT TI TCAT T TAGGACATAAATAT T T I TAGTGACTGTGT T I GCATT TGGACAGAGCAAT T TCTGTTATGAAGGAGCACCCACTOTTGTAGGACATTTAGAGGTCCCAGCCCATTAAA CAGGGC-TC-TGCAGTCAGCGTGACCCTCAAAAACTICACCTCCACACATTTCCAAACACCC TCTGGGGAAGTACTA TCCTGATTCAGAGTC TTT TATCAATGTTCAGTCAA TATT TC AG TOT TOTTTTTCTGGCCAAGACAGITT TAATGTTCCAACAAGTGITT TCAGACACACA T, C ACACACACACACACACACACACACACACACACACACATGCTAGTGGAGGCCCAGGAAGGG ACCTCTGGAAACCAAATTATATGGATATTCCCCTAGCCTACCCAGTGTTGTGCTAATCT CCATOOTCACAGATATACAAAGGGGTGCAATGCTAOTGOTGAAAGAGCAAAGCAAATGGA GATGCCTGGTCCTTACGGGCCATCGTGGATGCAGGGAAAGCCCCTTTCTTTGGAAA CAGGGAAGAGTCTAGAGGGT TGAAAAACACCCAGAAGACACTGGGAGCAGTGAAATTTC FIG 3U Patent Application Publication Jul. 29, 2004 Sheet 32 of 36 US 2004/0146980 A1

10008 CATTTTGGACAGAGCAATTTCTGTTATGTAAGGAGCACCCACTOTTTGTAGGACATTTAG TAGG i CCCAGCCCAT TAAACAGGGOTC) TGCAG TCAGCG TGACCC TCAAAAATO TCACCTC CACACATTTCCAAACACOCTCTGGGGAAGACTATTCOTGATTCAGAGTC TTTACAA TGTCAGTCAATTATTTCAGT TO TOTT TTCTGGCCAAGACAGT TITAATGTCCAAC AAGGTTTCAGTACACACATACACACACACACACACACACACACACACACACACACATGC CT AGTGGAGGCCCAGGAAGGGACCTOGGAAACCAAATTATATGGATATTCTCCCTAGCOTA CCCAGTGTTGTGCTAATCTOCATCCTCACAGATATACAAAGGGGTGCAATGCTACTGCTG AAAGAGCAAAGCAAATGGAGATGCCTGG TOCTTACTGGGCCATOGTGGATGCTAGGGAAA GOCCOTT TOT TTTGGAAACAGGGAAGAGTOTAGAGGGTGAAAAACACOCAGTAAGACA CTGGGAGCAGTGAAATTCATTCCATAGTGAGAAAGAAAACCTGTAGAATAACTGGGTG

10363 AGCCTACCCAG TGTGTGOTAACTCCATCO CACAGATATACAAAGGGGGCAAGOTA CTGCTGAAAGAGCAAAGCAAATGGAGATGCCGGTCC ACTGGGCCACGTGGATGCTA GGGAAAGCCCCTTTCTT TTTGGAAACAGGGAAGAGTCTAGAGGG || TGAAAAACACCCAGT AAGACACTGGGAGCAGTGAAAT TTCACCATAG GAGAAAGAAAACCGT TAGAAT AAC TGGGTGATGCGCAGAAAGAAACAAT CACCTCCTGTGACTGAT AT GOT TOTGGAA G, A - CTCTGTGATCATTCTGGCATCTCAGAGT TAGGGATGAAATGAGAATGTGCCAGCATTT ACCCCA GO TGGGAAGIT TACACAGCAGTAGOTACTCCAGCAGO TAACCATCACOTTT CCCCTGCCAACACTCCATTTCCCCCAATCAAG CAAACTG CCATAAATAGAAAAAAT AAAAT TGGAGACTTGAGAGCAGAGAAGACTGAAGGCAGAT TAT CT TA' TAGAAAACT CA GAAGACTTCCAATTCATCCCCAGTATGATCACGATAGAAGGAAAAAATGACTAAGCAGAG

10684 TOTCAGAG TTAGGGATGAAATGAGAATG TGCCAGCAT TTACOOCATGCTTGGGAAGT TT ACACAGCAGTAGCTACTCCAGCAGC TAACCATCACCTTTCCCCTGCCAACTACTCCATT TCCCCCAA TCAAGTCAAACTGTCCAAAAAGAATAAAATAAAAGGAGACT TGAGAGC AGAGAAGACGAAGGGAGAT ACT I AAGAAAACOAGAAGAO COAA ICAOCO CAG TATGATCACGATAGAAGGAAAAAATGACTAAGCAGAGCCCCAATT TTGTTAGAAACA T, C TGCGTAWAGTATT TATT TI TACAAGAT TG TOT TA OITOOTGTTCTCTOAGGGTTTG TAGCO T TOCACOACOOTGAACTGGCACAAAGAATOAAAATOAATTTGOOTTGGGTOOTACG ATCTGATTCAAATATCCCACGGGCATTTTTACCAGGTTT TTTCTACTTCCAAATTCCATA ATCAAGGTAGGCTCCTTTCAACAAAATGTACCTGAGGATCTCATTGGATCATAAATCC TATT ATT TTCAAATO-TACT GTAAAG TAAAAG TAGGAAATTT AGATAAAATCTAT AGAVAC

11177 COTTTCAACAAAAT GTACOTGAGGJATCTCATTITGGA TOAAAATCOTTAT TATT TOA AATCTACTGTAAAG AAAAGTAGGAAA TTAGATAAAATCTAAGAACTTAGACTCTGTG GG TATGTGCTGTGATGTGTGTCCCTGCGTGTGGCGCATGTCTGTGCCATAGTATCTGCA GGTTCTGTAATACAATT TACTATACAAGGTCATCAGCAGGCTGAGTATATGTCAGAAT T T CTAGCTGAACLTIGAGTGCTATATGACAACAAGGATOTGTCCCAAGTGT G,T) TTOCATT TAGTCAGGTAGGTCAATGAATTCACATTGCCCAAAGAAAGACACTTCAAGT ACCCATAATCACTGATGTGTCCAATT || TGACAT TAGAAAAACCTGATTAATATATTCCIT CCAA TAGGAAACTGCCCAATAACAAAGCAAGA CCAAAGCCAAAT GAT TACA GCTCAAGTATTAATTCAAATATTTATTGGTTATTTCAGGAGTTGAAAAAGTCATTTGG FIG 3V Patent Application Publication Jul. 29, 2004 Sheet 33 of 36 US 2004/0146980 A1

TTGCCAAT TGTGGATTGGGATT TTATOTATTAAAGGGT TTT TTTTTOTC TTGC

12345 TTTAAGTCCCATATCCTGCTCTTTTCTTCCGTCAGTTTCCCCCAGAAGCTCCAAGACCCC ACCAGGAATCCCCATCCAAGTTACTTTCCCAACTCCTGGAAGTTTCAATTGTGCTGCCT TTGIGACATIATCATATCT TT TCTGTTCAATGGI TGCT TCTOTT TGGCTCACTGT TCTCT ACTTTTCAGCCTGAGAGCTGGCTAATCTGGGACAGTACTOGAATGCAGTGTACACATGGG TAVACATGGAAAACCCCGAT ITTCCCT TAAT TCAAGGA TATT TGACCT TAAGAAAAAC T, C) GTTACATCATACOAATTAATGAGAAAAAAATATTGGCAAGCACTGACTGGGCAGAA TACAGGGAAGCTTCAC ATGGAGAAGTGAAT I TGGGAT TGAGGGCCTA GCAATOC CTTGTAAATAATATTTGAACTCTTCCTCATCGGAGACACATTCCTAAGAACTTTTCC TGAATAATTTGGTCTCCTTGACTGAATCAGTAAGTACAAATAGATCCCCAAGCATGGCTC T TTCCTAGAATGAAAGAAATGTCAAGAAGTCTGAAGATGATCTGAAI T I TGG T I T I TT

12349 AG TOCCATATCCTGCTOTT T TO TCCGCAGGTTTCCCCCAGAAGCTCCAAGACCCCACCA GGAATCCCCATCCAAGTTTACT TTCCCAACTCCTGGAAGT TTCAAT TGTGCTGCCTTTGT GACATTATCATATCITTTCGTTCAAGGTTGCTTCTOTTTGGCTCACTGTTCTOTACT T CAGCCTGAGAGCTGGCAACTGGGACAGITACTOGAATGCAGTGTACACAGGGTTAAC AIGGAAAACCCCGAT I TCCCT TATAI TCAAGG TA TAT I TGACC TAAGAAAAAC G T C, TD v. TACA TCATACCAAT TAATGAGAAAAAAATAT TGGCAAGCACTGACTGGGCAGAATACA GGGAAGCTTCACTATGGAGAAGTGAATTTGGGATTGAGGGCC TTATGCAATCTGCTTG TAAATAATATTGATACTCTCCTCACGGAGACACATTCCAAGTAACTTCCTGAA TAATTTGGTCTGCTTGACTGAATCAGTAAGTACAAATAGATCCCCAAGCATGGCTCTTTC CTAGAATGAAAGAAATG TCAAGAAGTOTGAAGATGATOTGAATGGTGOTA

1315 AGAAGATAAGAAAACGAAGATAGCT TCIACCAAAATCIGCAACAATAAGATAC TCIGG | TGAATGTAGOGAVA I ATGTCC ATGGGCTGGACCAACAAGAAAAA TATGAACAGG TATGTATGATAVAT TATAGGGCCATT TIGATACO TAAGAAATTCOAGO TTCCT TTGACTC ATTTTGATATATCATTTACTG TATAAATTCATATGGTATTCCAAACCCTTAAAGACAGA TT TIGOITAAAAATGTA TGGGTATATAATTAGTGTACATATATGAGACA C, T ATATATTT GATATAAGCATACAATGTGTAATGACCAAACAGGGTAATTGGGAATCCA TOACOTCAAGCAT T TATCAT I TOT I T T TGT TAGAGACAT TOTAVAT TTGACTCTTCTAGITT ATTTTGAAVATATACAATGAATTAT TGTT AVACTATIAGTCATOOT ATT GT GOATGCCAGACT TTAGTCCTTOTAACGGTATTTTGGTACCCATTAACCAATGCCTOTTTATCCTTCCCCCAC CCCTACTACCTTTCCCAGCCTCTGG TAACCATCATTCTTCTGACTATCTCTATAAGGTCA

13354 ATTTTTTTTTGCTTTTAAAAATGTTTATGGGTATATAATAGTTGTACATATTTATGAGAC ACATATATTT GATATAAGCATACAATGGTAATGACCAAATCAGGGTAATTGGGATAC CATCACCTCAAGCATTATCATTOTTTTGTTAGAGACATTOTAATTTGACTOTTOTAG T TATT GAAVATATACAATGAAT TA TGT TAAOTATAG CATCCITATGTGGATGCCAGA CTT TAGTTCCT TOTAACGGTATTTTGGITACICCATTAACCAAGCCTCTTACC CCCCC T, A CCCCTACACCTTCCCAGCCTCTGG TAACCATCA TCTTCTCACTATO TOTA TAAGGTC AG TTTTTTTAAACTTCCCCATATGAGTGAGAACATGCAGTATGTCTTTTTGGCCT FIG. 3W Patent Application Publication Jul. 29, 2004 Sheet 34 of 36 US 2004/0146980 A1

GGO TATTTCACTAATGTAATGTCTCTAATTTCACCACATTATTGCAAATGACA TGA TT CATTO TOT TATGGCTGTCTATATGTACCACATT TTA ATCCACTCATCTGT TGA TGGACACTTAGGCTGATT TCATATOTGGTCATTGTGAATAGGCTGTACTAAACATGGG

13373 AATGT TTATGGGTATATAATAGTTGTACATATTTATGAGACACATATATTTTGATATAAG CATACAATGTGTAATGACCAAATCAGGGTAAT TGGGATATOCATCACCTCAAGCATTTAT CATTTCTITTG TAGAGACATOTAAT I TGACTOTOTAG TAT TGAAATATACAAT GAATTATTGITTAAOTATAGTCATCOTATTGTGCATGCGCAGAOTT TAGTCOTTOTAACGGT ATTTTGGTACCCATTAACCAATGCCTOTTTATCCTTCCCCCACCCCTACTACCTTTCCCA C,G) CCTCTGGTAACCATCATTCTTCTCACTATCTCTATAAGGTCAGTTTTTTTTTAAACTCCC CTATATGAGTGAGAACATGCAGTATTGTCTTTTTGTGCCTGGCTTATTTCACTTAATGT AATGTTCTCTAATTTCATCCACATTATTGCAAATGACAT GATTTCATTCT TCTTATGGCT GTOTATATGTACCACATTTTATTTATCCACTCATCTGTTGATGGACACTTAGGCTGATTT CATATOTTGGTCATTGTGAATAGTGCTGTACTAAACATGGGGGTGCAGATOTOTOTTOCA

1 4677 AGAGATAGAGATOTAAT TICAT TOTTOGCATATGGATATOTAGTT CCCAGCATCAT TCTTGTGGAAATTGTGCTTTGCCCAATGTATGCTTGATGCGTT || GTTGAAAATTAGT GAOTATAAATGTGTGGAT T A T GGGG I TOT I TATOTG CCA GG | O TAGTGC TGT I I TATGCCAGTATCATGCAGI I TGAT TAT TACAGGT TGTAGTATAAT I GAAGT CAGGTCATGTGATGCGTCCAGCT TGTTCTTTTTTCTCAGAATCTTATATTTAGAAAAAC C,G) TAAAGACCCAACAAAAAACCTGCTAGAACTGATAAACAAAT TCAT TAAATT TGCAGGAT ACAACATCAACATACAAAAGAGCAGGATTCAA TATGCCAAGAGCAAATAATOTTAAA AAAAAGAAAGAAAAAAAAACAAGAAATAATOCCAT TTATAATAGOTACAAATAAAATAAA ACACOTAGGAAAAACCATACCAAAGAAGTGAAAGAT TTCACAAGAAAAOTATAAAAC ACTGATGAAAGAAATTGAAAATGACATTAAAAAATGGAAAGGTATTCCATGTTCATGGAT 14734 AT TOT TGTGGAAATGTTCCTGCCCAATGTATGTTCTTGATGCCTGTTGAAVAATTA GT TGACTATAAATGTGTGGATT TA T TGGGGTTCTT TATTCTGTTCCAT TGGTCTATGT GTOTGT TTT TAT GOICAGTA CAT GOAGTT TTGATTAT TACAGGTT TG TAG TATAAT TTGA AGTCAGGTCATGTGATGCCTCCAGCTGTCTCTCAGAATCTATATAGAAA . AACG TAAAGACTCCAACAAAAAACCTGC TAGAACT GATAAACAAAT TCAT TAAAT TGCA G,A) GATACAACATCAACATACAAAATTCAGCAGCAT T TOAANTATGCCAAGAGCAAATAATOTT AAAAAAAAGAAAGAAAAAAAAACAAGAAAAATCCCATTATAATAGCTACAAATAAAAT AAAACACCTAGGAATAAACCATACCAAAGAAGTGAAAGATTTOTACAATGAAAAGTATAA AACACTGATGAAAGAAATGAAAATGACATAAAAAAGGAAAGGTATOCATG TCATG GATTGCAAGAATCAATATTGT TAAAATGTCCATATGATCCAAAACAATCTACAGATTCAA 14747 A TGTCOTT TGCCCAATGTATG TOT TGATGCOTT TGT TGAAAAT TAGTGACTATAAAT GTGTGGAT I TAT I TGGGG TOT I AT TCTGTTCCAT TGGTCTATGTGCTGT T T T ATG CCAG TATOATGCAGTTTTGATT ATTACAGG TTT GTAGTATAAT TGAAGOAGGTCATG GATGCCTCCAGOTTGTTOTTTTCTCAGAATOTTATATTTAGAAAAACGTAAAGACTC CAACAAAAAACCTGCTAGAACTGATAAACAAATTCATTAAATT TGCAGGATACAACATCA A,G) FIG 3X Patent Application Publication Jul. 29, 2004 Sheet 35 of 36 US 2004/0146980 A1

CATACAAAATTCAGCAGCAT TCAATATGCCAAGAGCAAATAATO TAAAAAAAAGAAAG AAAAAAAAACAAGAAATAATCOCATTTATAATAGOTACAAATAAAATAAAACACOTAGGA ATAAACCATACCAAAGAAGTGAAAGAT TOTACAATGAAAAGTATAAAACACTGATGAAA GAMAA TGAAAAGACA TAAAAAAGGAAAGGT AT TOCAG CAGGATGCAAGAAC AAA GTAAAAGCCAAGACCAAAACAACACAGACAAGCAACCCAC

14808 TGTGGATTTATTTGTGGGTTOTTTATTOTGTTCCATTGGTOTATGTGTOTGTTTTATGC CAGTATCATGCAGGTTTTGATTAT TACAGGTTTGAGTATAATT GAAGTCAGGTCAGTG ATGCCTCCAGOT T TGT TOT | T TT TCTCAGAATCT TATAT T TAGAAAAACGTAAAGACTCC AACAAAAAACCTGCTAGAACTGATAAACAAATTCATTAAATTTGCAGGATACAACATCAA CATACAAAATTCAGCAGCAT T TCAATATGCCAAGAGCAAATAATCT TAAAAAAAAGAAAG - g A AAAAAAAACAAGAAATAATOCCATTATAATAGCTACAAATAAAATAAAACACC TAGGAA TAAACCATACCAAAGAAGTGAAAGA TTCTACAATGAAAACTATAAAACACTGATGAAAG AAAT TGAAAAGACA TAAAAAATGGAAAGGTAT CCAGT TOTA GGAT I GCAAGAACA ATAT TGT TAAAATG CCATA GAICCAAAACAATCTACAGAT TCAAGCAA CCCTATCA AAATACCAATGACATTCTTCATTGAAATAAAAAAAAAGCCTAAAATTTAAGGGAACCAT

15086 AATAA TOT AAAAAAAAGAAAGAAAAAAAAACAAGAAATAA TCCCA TATAA TAGCAC AAATAAAATAAAACACCTAGGAATAAACCATACCAAAGAAGTGAAAGAT T IOTACAATGA AAAGTATAAAACACTGATGAAAGAAAT TGAAAATGACAT TAAAAAATGGAAAGG TAT TOC AG TOATGGAT TGCAAGAACAATA G TAAAATGTCCATATGATCCAAAACAA TOTA CAGATTCAAT GCAATCCCTAT CAAAAACCAAT GACA O TCA TGAAAAAAAAAAAA - A, G CCTAAAATTAAGTGGAACCATGAAGGTAGATGCTGCTATACATAGAAGAT TAAGACT CAACAAACCTTGAATATGAAGACTGGGGAAGTGAATAGGCAGCTTCAC CTTCTATTCCC TGGTGAAATTTAGGAGAATGGATGTTTTATAAGGG, AGCAGTTTCTTACATGATCTCAA TCAGCCATAACTTACTACAGTCAATTTGAATTTATTGCATTTGAATATATTGGATTAAAA ATAAAATCCTAAAAAAGGAGAGAAGCACATATAAACOTGCGT OTTAT TOA GT GT TOOT

1544 TAGATGTOTGOATACATAGAAGATTAAG TAOTCAACAAACOTTGAATATGAAGAGGGG GAAGTGAATAGGCAGCTTCACTCTTCTATTCCCTGGTGAAATTTAGGAGAATGGATGTT T TATAAT GGG TAGCAGTTTCT ACATGTTOT OAA TOAGCOATAAOTTACT ACAGT OAAT TT GAATTTATTGCATTTGAATATATTGGATTAAAAATAAAATCCTAAAAAAGGAGAGAAGCA CATATAAACCTGCGCT TAT T CA TGG TOO I TOT TGTGGGTGACT TGT TGAA A,G) TAAAACCTGCAAAAAACAGGACAGGGTGGAAGGGAGAGGGATCCCC TOT I TATGAAGA AGCAGCAGTCCTGTTTTATOACC TOT TOATT TTCTGT TA TGAGAWAT TOAAGAAGAAGGA GGAGGAAGAG TCACATCCACAGACTGG TG IGGI TGAATAG TG TOTOTACTG TATTCCA AATAGOAGOCAATGAGGO TO I TACAGTGAAGOCAG TCOCAAGATAAT TG TO TGTACCCC TATTCTCTAAGAAGCTAAAT TGTGT TAGACTGAAACCCATAAGGAACCA GTTCAAAGT

15722 TGCAAAAAACAGGACAGGGT GGAAGGGAGATGGGATCCCCTCill T TAT GAAGAAGCAGCA GTOCTGT ITTATCACCTCTTCATTT TCTGTTATTGAGAATTCAAGAAGAAGGAGGAGGAA GAGT TOACATOCACAGACTGGTGTGGTTGAATAG T G TOTO TAOTG TATTCCAAAAGCA GCCAATGAGGCTG TACAGTGAAGCCAGTCCCAAGATAATTGTCTGACCCCA TOTO

FIG. 3Y

TAAGAAGCTAAAT GTGT AGACTGAAACCCATAAGGAACCAT TGT ICAAAG TGGCTG T,C) TCAAAAGTAAAGATAVATAG TOTOTAVATAGATATCTAAGACATAGAVAT ATGATTACTAT TTTATOTO TATAVATTTTCATCTCTATAACGTT TACAAATACTGAAAAA CCTTGGAAAAAATTGGCTT TAGCTT TAGTTTTGCAATATT TATT TATCCCCATAAA AGCOTAGGAAAT TGG TAGTATGAOTTTTAGTATG TCATT TAATAGATGAAAACACAGAA ACTCAAAGATGTTAAATATGGTGGCCAAG CACAAAGCTGATCATTAACAACAACAGGG

15861 GGGTGGTTGAATAGT GTCTCACTGA CCAAATAGCAGCCAATGAGGCTGTTACAG TGAAGCCAGTCCCAAGATAATTGTTCTGTACCCCTATTCTCTAAGAAGCTAAATTGGTT AGACTGAAACCCATAAGGAACCATTG T TCAAAG. T TGGCT TGT TCAAAAGTAAAGAT TT T T AATAGTTCTCTAATTAGAT TATT TOTAAGACATAGAVATTATGATTACTATT TTATOT CATAA I T CATO OLTA AACGT T TACAAAACTGAAAAACC T IGGAAAAAA TGGC

TT TAGCTT TACT TTGCAATATT TATT TTATCCCCATAAAAGCCTAGGAAATTGGITACT ATGACT TAGTAG TOAT T TAATAGATGAAAACACAGAAACCAAAGATGT TAAVATAT GGTGGCCAAGT CACAAAGC (GATCATTAACAACAACAGGGCCTGAACTCCGG TCT GAT I AACTGGACAGTGCACCTGGGGCGCAGGATGCATCACCCCCACACT TGCACA TAGAACOTTTOOTAGTTGGOTTTGOTCCATGATGACCATTACTG TOOT IOTAGT TCAAA

16264 CTCAAAGATG AAAAGGTGGCCAAG CACAAAGCGA CA TAACAACAACAGGGC OTGAACTTCCTGGTTTTCTGAT I TAAT O TGTGACAGTGCACCTGGGGCGCATGOAT GOAT CACCCCCACAOTGCACAAGAACOTTOOTAG TIGGOTT TGOTCCATGATGACCAT TAC TGTTCCTTCTACTTCAAAATAAGCAAATTATCCTACAGATTCAGAGOTGGTACAGGTGTG CTGTCAAGCAGCCCAT ICCAT TAGTCAGCTTGGGTTCACCACATAAAGTATTGACCT A, TD AATGGTATATTTATCTAGATAATTCTACCTTGTTATTTTCAAAGCCCCAGTCTTGTTTGC TAATTCTGTGCATCATTT TTCTCTGATTCTGAAAGGCAAAAT TT TGT TGGGCAATTGCTG TAVATATGAGT TI TATOTOOT TAGAGTOGAATGGATGTGTATAG CACATGCTCCCACT GG TOSACAGITACACAACA TOT GCATATAAAACAGGTAGAGTOT TAGTCAGGAAAACC ATTOOAATOOTTATT TTCAATATATT TAAAAAGACAGAATTGACCC GT TAVACAGGCOTA

16314 ACAACAGGGCCTGAACTCCTGGTTTTCTGATT TAATCTGTGACAGTGCACCTGGGTGCGC AGCATGCATCACCCCCACACT GCACATAGAACCTTTCCTAGTGGCTTGCTCCATGA TGACCATTAC GT TCCTTCACT CAAAAAAGCAAATTATCCTACAGATTCAGAGCTGG TACAGGTGTGCTGTCAAGCAGCOOAT TOCAT TAGTCAGOT TGGG I TCACT CACA TAVAA GTATTGACCTAAAGGTATATT TATOT AGATAVAT TOTACO GT TATT TTCAAAGCCCCA G,A TCTTG TT TGCTAATCTGGCATCATTTTTCTCTGATTCTGAAAGGCAAAATTTTGTTGG. GCAAT TGCTGTAATATGAGTTTTATOTOOTTTAGAGTCGAATGGATG TGTATATGTCACA TGCTCCCAC GG TCATCAGTACACAACATTOTGCATATAAAACAGGTAGAGTOTTAGTC ATGGAAAACCATOCAATCOTTATTTTCAATATA TTAAAAAGACAGAAT TGACCCTGTT AACAGGCCITACCCTAAGAATCT TAAGAGCT TGCT TCCAGT T | G TCCT TGCTGCCT TCTGT

16877 TAAGAGCTTGGCTCCAGTTTGTCCT TGCTGCCTCTGTATGCCTGATTTCCCTGGAATT TAAGAGAAAGGATGTTATGG TACAGACCAAGTAGATGACATAAATGAACACCACOTTAAA

FIG. 3Z

ISOLATED HUMAN LIPASE PROTEINS, NUCLEIC ACID MOLECULES ENCODING HUMAN LIPASE PROTEINS, AND USES THEREOF

FIELD OF THE INVENTION

[0001] The present invention is in the field of lipase proteins that are related to the lysosomal acid lipase subfamily, recombinant DNA molecules, and protein production. The present invention specifically provides novel peptides and proteins that effect protein phosphorylation and nucleic acid molecules encoding such peptide and protein molecules, all of which are useful in the development of human therapeutics and diagnostic compositions and methods.

BACKGROUND OF THE INVENTION

[0002] Lipases

[0003] The lipases comprise a family of enzymes with the capacity to catalyze hydrolysis of compounds including phospholipids, mono-, di-, and triglycerides, and acyl-CoA thioesters. Lipases play important roles in lipid digestion and metabolism. Different lipases are distinguished by their substrate specificity, tissue distribution and subcellular localization.

[0004] Lipases have an important role in digestion. Triglycerides make up the predominant type of lipid in the human diet. Prior to absorption in the small intestine, triglycerides are broken down to monoglycerides and free fatty acids to allow solubilization and emulsification before micelle formation in conjunction with bile acids and phospholipids secreted by the liver. Secreted lipases that act within the lumen include lingual, gastric and pancreatic lipases, each having the ability to act under appropriate pH conditions. Modulating the activity of these enzymes has the potential to alter the processing and absorption of dietary fats. This may be important in the treatment of obesity or malabsorption syndromes such as those that occur in the presence of pancreatic insufficiency.

[0005] Lipases have an important role in lipid transport and lipoprotein metabolism. Subsequent to absorption across the intestinal mucosa, fatty acids are transported in complexes with [...] and protein molecules termed apolipoproteins. These complexes include particles known as chylomicrons, very low density lipoproteins ("VLDLs"), low density lipoproteins ("LDLs") and high density lipoproteins ("HDLs") depending upon their particular forms. Lipoprotein lipase and hepatic lipase are bound to and act at the endothelial surfaces of extrahepatic and hepatic tissues, respectively. Deficiencies of these enzymes are associated with pathological levels of circulating lipoprotein particles. Lipoprotein lipase functions as a homodimer and has the dual functions of triglyceride hydrolase and ligand/bridging factor for receptor-mediated lipoprotein uptake. Severe mutations that cause LPL deficiency result in type I hyperlipoproteinemia, while less extreme mutations in LPL are linked to many disorders of lipoprotein metabolism.

[0006] Lipases have an important role in lipolysis. Free fatty acids derived from adipose tissue triglycerides are the most important fuel in mammals, providing more than half the caloric needs during fasting. The enzyme hormone-sensitive lipase plays a vital role in the mobilization of free fatty acids from adipose tissue by controlling the rate of lipolysis of stored triglycerides. Hormone-sensitive lipase is activated by catecholamines through cyclic AMP-mediated phosphorylation of serine-563. Dephosphorylation is induced by insulin. While mice with homozygous-null mutations of their hormone-sensitive lipase genes induced by homologous recombination have been shown to have enlarged adipocytes in their brown adipose tissue and to a lesser extent their white adipose tissue, they are not obese. White adipose tissue from homozygous null mice retains 40% of its wild-type activity, suggesting that one or more other, as yet uncharacterized, enzymes also mediate the hydrolysis of triglycerides stored in adipocytes. Hormone-sensitive lipase does not show sequence homology to the other characterized mammalian lipase proteins.

[0007] The present invention has substantial similarity to lysosomal acid lipase. Human lysosomal acid lipase/cholesteryl ester hydrolase (EC 3.1.1.13) is structurally related to enteric acid lipases, but lacks significant homology with any characterized neutral lipases.

[0008] The lysosomal enzyme catalyzes the deacylation of triacylglyceryl and cholesteryl ester core lipids of endocytosed low density lipoproteins; this activity is deficient in patients with Wolman disease and cholesteryl ester storage disease.

[0009] Its amino acid sequence, as deduced from the 2.6-kilobase cDNA nucleotide sequence, is 58 and 57% identical to those of human gastric lipase and rat lingual lipase, respectively, both of which are involved in the preduodenal breakdown of ingested triglycerides. Notable differences in the primary structure of the lysosomal lipase that may account for discrete catalytic and transport properties include the presence of 3 new cysteine residues, in addition to the 3 that are conserved in this lipase gene family, and of two additional potential N-linked glycosylation sites.

[0010] Two major disorders, the severe infantile-onset Wolman disease and the milder late-onset cholesteryl ester storage disease (CESD), are seemingly caused by mutations in different parts of the lysosomal acid lipase (LIPA) gene.

[0011] Burton and Reed (1981) demonstrated material crossreacting with antibodies to acid lipase in fibroblasts of 3 patients with Wolman disease and 3 with cholesterol ester storage disease. Quantitation of the cross-reacting material (CRM) showed normal levels in both cell types. Enzyme activity was reduced about 200-fold in Wolman disease fibroblasts and 50- to 100-fold in cholesterol ester storage disease cells. The allelic nature of Wolman and cholesteryl ester storage diseases is supported by the occurrence of possible genetic compounds, i.e., cases of intermediate severity (Schmitz and Assmann, 1989). In both Wolman disease and cholesteryl ester storage disease, Chatterjee et al. (1986) demonstrated that renal tubular cells shed in the urine are laden with cholesteryl esters and triacylglycerol and that LIPA is lacking in these cells. Yoshida and Kuriyama (1990) described lysosomal acid lipase deficiency in rats.

[0012] Aslanidis et al. (1994) summarized the exon structure of the LIPA gene, which consists of 10 exons, together with the sizes of genomic EcoRI and SacI fragments hybridizing to each exon. The DNA sequence of the putative promoter region was presented. Anderson et al. (1994) isolated and sequenced the gene for LIPA. They found that it is spread over 36 kb of genomic DNA. The 5-prime flanking region is GC-rich and has characteristics of a housekeeping gene promoter.

[0013] Du et al. (1998) produced a mouse model of lysosomal acid lipase deficiency by a null mutation produced by targeted disruption of the mouse gene. Homozygous knockout mice produced no Lip1 mRNA, protein, or enzyme activity. The homozygous deficient mice were born in mendelian ratios, were normal appearing at birth, and followed normal development into adulthood. However, massive accumulation of triglycerides and cholesteryl esters occurred in several organs. By 21 days, the liver developed a yellow-orange color and was up to 2 times larger than normal. The accumulated cholesteryl esters and triglycerides were approximately 30-fold greater than normal. The heterozygous mice had approximately 50% of normal enzyme activity and did not show lipid accumulation. Male and female homozygous deficient mice were fertile and could be bred to produce progeny. This mouse model is a phenotypic model of human CESD and a biochemical and histopathologic mimic of human Wolman disease.

[0014] For a review related to lysosomal acid lipase, see Anderson et al., Proc. Nat. Acad. Sci. 91: 2718-2722, 1994; Anderson et al., Genomics 15: 245-247, 1993; Anderson et al., J. Biol. Chem. 266: 22479-22484, 1991; Aslanidis et al., Genomics 20: 329-331, 1994; Aslanidis et al., Genomics 33: 85-93, 1996; Assmann et al., In: Stanbury, J. B.; Wyngaarden, J. B.; Fredrickson, D. S.; Goldstein, J. L.; Brown, M. S.: Metabolic Basis of Inherited Disease. New York: McGraw-Hill (pub.) (5th ed.) 1983. Pp. 803-819; Beaudet et al., J. Pediat. 90: 910-914, 1977; Besley et al., Clin. Genet. 26: 195-203, 1984; Burton et al., Am. J. Hum. Genet. 33: 203-208, 1981; Byrd et al., Acta Neuropath. 45: 37-42, 1979; Cagle et al., Am. J. Med. Genet. 24: 711-722, 1986; Chatterjee et al., Clin. Genet. 29: 360-368, 1986; Christomanou et al., Hum. Genet. 57: 440-441, 1981; Coates et al., Am. J. Med. Genet. 2: 397-407, 1978; Crocker et al., Pediatrics 35: 627-640, 1965; Desai et al., Am. J. Med. Genet. 26: 689-698, 1987; Di Bisceglie et al., Hepatology 11: 764-772, 1990; Du et al., Hum. Molec. Genet. 7: 1347-1354, 1998; Fujiyama et al., Hum. Mutat. 8: 377-380, 1996; Hoeg et al., Am. J. Hum. Genet. 36: 1190-1203, 1984; Kahana et al., Pediatrics 42: 70-76, 1968; Klima et al., J. Clin. Invest. 92: 2713-2718, 1993; Koch et al., Somat. Cell Genet. 7: 345-358, 1981; Koch et al., Cell Genet. 25: 174, 1979; Konno et al., Tohoku J. Exp. Med. 90: 375-389, 1966; Lake et al., J. Clin. Path. 24: 617-620, 1971; Lake et al., J. Pediat. 76: 262-266, 1970; Lough et al., Arch. Path. 89: 103-110, 1970; Marshall et al., Arch. Dis. Child. 44: 331-341, 1969; Maslen et al., Am. J. Hum. Genet. 53 (Suppl.): A926, 1993; Muntoni et al., Hum. Genet. 95: 491-494, 1995; Muntoni et al., Hum. Genet. 97: 265-267, 1996; Pagani et al., Hum. Molec. Genet. 5: 1611-1617, 1996; Patrick et al., Nature 222: 1067-1068, 1969; Roytta et al., Clin. Genet. 42: 1-7, 1992; Schaub et al., Europ. J. Pediat. 135: 45-53, 1980; Schiff et al., Clinical aspects. Am. J. Med. 44: 538-546, 1968; Schmitz et al., The Metabolic Basis of Inherited Disease. New York: McGraw-Hill (pub.) (6th ed.) 1989. Pp. 1623-1644; Sloan et al., J. Clin. Invest. 51: 1923-1926, 1972; Spiegel-Adolf et al., Confin. Neurol. 28: 399-406, 1966; Wolman et al., Pediatrics 28: 742-757, 1961; Yokoyama et al., J. Inherit. Metab. Dis. 15: 291-292, 1992; Yoshida et al., Lab. Animal Sci. 40: 486-489, 1990; Young et al., Arch. Dis. Child. 45: 664-668, 1970.

[0015] As identified above and in the cited references, lipase proteins are a major target for drug action and development. Accordingly, it is valuable to the field of pharmaceutical development to identify and characterize previously unknown members of the lipase family of proteins. The present invention advances the state of the art by providing previously unidentified human proteins that have homology to known members of the lipase family of proteins.

[0016] Lipase proteins, particularly members of the lysosomal acid lipase subfamily, are a major target for drug action and development. Accordingly, it is valuable to the field of pharmaceutical development to identify and characterize previously unknown members of this subfamily of lipase proteins. The present invention advances the state of the art by providing previously unidentified human lipase proteins that have homology to members of the lysosomal acid lipase subfamily.

SUMMARY OF THE INVENTION

[0017] The present invention is based in part on the identification of amino acid sequences of human lipase peptides and proteins that are related to the lysosomal acid lipase subfamily, as well as allelic variants and other mammalian orthologs thereof. These unique peptide sequences, and nucleic acid sequences that encode these peptides, can be used as models for the development of human therapeutic targets, aid in the identification of therapeutic proteins, and serve as targets for the development of human therapeutic agents that modulate lipase activity in cells and tissues that express the lipase. Experimental data as provided in FIG. 1 indicates expression in the normal stomach and human leukocyte.

DESCRIPTION OF THE FIGURE SHEETS

[0018] FIG. 1 provides the nucleotide sequence of a cDNA molecule that encodes the lipase protein of the present invention (SEQ ID NO:1). In addition, structure and functional information is provided, such as ATG start, stop and tissue distribution, where available, that allows one to readily determine specific uses of inventions based on this molecular sequence. Experimental data as provided in FIG. 1 indicates expression in the normal stomach and human leukocyte.

[0019] FIG. 2 provides the predicted amino acid sequence of the lipase of the present invention (SEQ ID NO:2). In addition, structure and functional information such as protein family, function, and modification sites is provided where available, allowing one to readily determine specific uses of inventions based on this molecular sequence.

[0020] FIG. 3 provides genomic sequences that span the gene encoding the lipase protein of the present invention (SEQ ID NO:3). In addition, structure and functional information, such as intron/exon structure, promoter location, etc., is provided where available, allowing one to readily determine specific uses of inventions based on this molecular sequence. 72 SNPs, including 6 indels, have been identified in the gene encoding the transporter protein provided by the present invention and are given in FIG. 3.
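The FIG. 3 description above counts indels among the reported SNPs. As a minimal sketch of how such variant records might be tallied by class, the Python below assumes each record carries a position and two alleles, with "-" marking an absent base. The `Variant` type, the `classify` and `summarize` helpers, the "-" notation, and the example records are all hypothetical illustrations and are not taken from FIG. 3.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Variant:
    """One variant record (hypothetical layout, not the patent's own format)."""
    position: int  # 1-based position in the genomic sequence
    ref: str       # first allele; "-" marks an absent base (assumed notation)
    alt: str       # second allele


def classify(variant: Variant) -> str:
    """Return 'indel' if an allele is a gap or the allele lengths differ, else 'snp'."""
    if "-" in (variant.ref, variant.alt) or len(variant.ref) != len(variant.alt):
        return "indel"
    return "snp"


def summarize(variants) -> Counter:
    """Count variants by class."""
    return Counter(classify(v) for v in variants)


# Made-up records for illustration only; positions and alleles are invented.
example = [
    Variant(1200, "A", "G"),
    Variant(3450, "T", "C"),
    Variant(5021, "-", "A"),
]

counts = summarize(example)
print(counts["snp"], counts["indel"])
```

Single-base substitutions and indels are kept as separate classes here so that a total and an indel subtotal can both be read from one tally, which mirrors how the description reports an overall count with the indels as a subset.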

DETAILED DESCRIPTION OF THE INVENTION

[0021] General Description

[0022] The present invention is based on the sequencing of the human genome. During the sequencing and assembly of the human genome, analysis of the sequence information revealed previously unidentified fragments of the human genome that encode peptides that share structural and/or sequence homology to protein/peptide/domains identified and characterized within the art as being a lipase protein or part of a lipase protein and are related to the lysosomal acid lipase subfamily. Utilizing these sequences, additional genomic sequences were assembled and transcript and/or cDNA sequences were isolated and characterized. Based on this analysis, the present invention provides amino acid sequences of human lipase peptides and proteins that are related to the lysosomal acid lipase subfamily, nucleic acid sequences in the form of transcript sequences, cDNA sequences and/or genomic sequences that encode these lipase peptides and proteins, nucleic acid variation (allelic information), tissue distribution of expression, and information about the closest art known protein/peptide/domain that has structural or sequence homology to the lipase of the present invention.

[0023] In addition to being previously unknown, the peptides that are provided in the present invention are selected based on their ability to be used for the development of commercially important products and services. Specifically, the present peptides are selected based on homology and/or structural relatedness to known lipase proteins of the lysosomal acid lipase subfamily and the expression pattern observed. Experimental data as provided in FIG. 1 indicates expression in the normal stomach and human leukocyte. The art has clearly established the commercial importance of members of this family of proteins and proteins that have expression patterns similar to that of the present gene. Some of the more specific features of the peptides of the present invention, and the uses thereof, are described herein, particularly in the Background of the Invention and in the annotation provided in the Figures, and/or are known within the art for each of the known lysosomal acid lipase family or subfamily of lipase proteins.

[0024] Specific Embodiments

[0025] Peptide Molecules

[0026] The present invention provides nucleic acid sequences that encode protein molecules that have been identified as being members of the lipase family of proteins and are related to the lysosomal acid lipase subfamily (protein sequences are provided in FIG. 2, transcript/cDNA sequences are provided in FIG. 1 and genomic sequences are provided in FIG. 3). The peptide sequences provided in FIG. 2, as well as the obvious variants described herein, particularly allelic variants as identified herein and using the information in FIG. 3, will be referred to herein as the lipase peptides of the present invention, lipase peptides, or peptides/proteins of the present invention.

[0027] The present invention provides isolated peptide and protein molecules that consist of, consist essentially of, or comprise the amino acid sequences of the lipase peptides disclosed in FIG. 2 (encoded by the nucleic acid molecule shown in FIG. 1, transcript/cDNA or FIG. 3, genomic sequence), as well as all obvious variants of these peptides that are within the art to make and use. Some of these variants are described in detail below.

[0028] As used herein, a peptide is said to be "isolated" or "purified" when it is substantially free of cellular material or free of chemical precursors or other chemicals. The peptides of the present invention can be purified to homogeneity or other degrees of purity. The level of purification will be based on the intended use. The critical feature is that the preparation allows for the desired function of the peptide, even if in the presence of considerable amounts of other components (the features of an isolated nucleic acid molecule are discussed below).

[0029] In some uses, "substantially free of cellular material" includes preparations of the peptide having less than about 30% (by dry weight) other proteins (i.e., contaminating protein), less than about 20% other proteins, less than about 10% other proteins, or less than about 5% other proteins. When the peptide is recombinantly produced, it can also be substantially free of culture medium, i.e., culture medium represents less than about 20% of the volume of the protein preparation.

[0030] The language "substantially free of chemical precursors or other chemicals" includes preparations of the peptide in which it is separated from chemical precursors or other chemicals that are involved in its synthesis. In one embodiment, the language "substantially free of chemical precursors or other chemicals" includes preparations of the lipase peptide having less than about 30% (by dry weight) chemical precursors or other chemicals, less than about 20% chemical precursors or other chemicals, less than about 10% chemical precursors or other chemicals, or less than about 5% chemical precursors or other chemicals.

[0031] The isolated lipase peptide can be purified from cells that naturally express it, purified from cells that have been altered to express it (recombinant), or synthesized using known protein synthesis methods. Experimental data as provided in FIG. 1 indicates expression in the normal stomach and human leukocyte. For example, a nucleic acid molecule encoding the lipase peptide is cloned into an expression vector, the expression vector introduced into a host cell and the protein expressed in the host cell. The protein can then be isolated from the cells by an appropriate purification scheme using standard protein purification techniques. Many of these techniques are described in detail below.

[0032] Accordingly, the present invention provides proteins that consist of the amino acid sequences provided in FIG. 2 (SEQ ID NO:2), for example, proteins encoded by the transcript/cDNA nucleic acid sequences shown in FIG. 1 (SEQ ID NO:1) and the genomic sequences provided in FIG. 3 (SEQ ID NO:3). The amino acid sequence of such a protein is provided in FIG. 2. A protein consists of an amino acid sequence when the amino acid sequence is the final amino acid sequence of the protein.

[0033] The present invention further provides proteins that consist essentially of the amino acid sequences provided in FIG. 2 (SEQ ID NO:2), for example, proteins encoded by the transcript/cDNA nucleic acid sequences shown in FIG. 1 (SEQ ID NO:1) and the genomic sequences provided in FIG. 3 (SEQ ID NO:3). A protein consists essentially of an amino acid sequence when such an amino acid sequence is present with only a few additional amino acid residues, for example from about 1 to about 100 or so additional residues, typically from 1 to about 20 additional residues in the final protein.

[0034] The present invention further provides proteins that comprise the amino acid sequences provided in FIG. 2 (SEQ ID NO:2), for example, proteins encoded by the transcript/cDNA nucleic acid sequences shown in FIG. 1 (SEQ ID NO:1) and the genomic sequences provided in FIG. 3 (SEQ ID NO:3). A protein comprises an amino acid sequence when the amino acid sequence is at least part of the final amino acid sequence of the protein. In such a fashion, the protein can be only the peptide or have additional amino acid molecules, such as amino acid residues (contiguous encoded sequence) that are naturally associated with it or heterologous amino acid residues/peptide sequences. Such a protein can have a few additional amino acid residues or can comprise several hundred or more additional amino acids. The preferred classes of proteins that are comprised of the lipase peptides of the present invention are the naturally occurring mature proteins. A brief description of how various types of these proteins can be made/isolated is provided below.

[0035] The lipase peptides of the present invention can be attached to heterologous sequences to form chimeric or fusion proteins. Such chimeric and fusion proteins comprise a lipase peptide operatively linked to a heterologous protein having an amino acid sequence not substantially homologous to the lipase peptide. "Operatively linked" indicates that the lipase peptide and the heterologous protein are fused in-frame. The heterologous protein can be fused to the N-terminus or C-terminus of the lipase peptide.

[0036] In some uses, the fusion protein does not affect the activity of the lipase peptide per se. For example, the fusion protein can include, but is not limited to, enzymatic fusion proteins, for example beta-galactosidase fusions, yeast two-hybrid GAL fusions, poly-His fusions, MYC-tagged, HI-tagged and Ig fusions. Such fusion proteins, particularly poly-His fusions, can facilitate the purification of recombinant lipase peptide. In certain host cells (e.g., mammalian host cells), expression and/or secretion of a protein can be increased by using a heterologous signal sequence.

[0037] A chimeric or fusion protein can be produced by standard recombinant DNA techniques. For example, DNA fragments coding for the different protein sequences are ligated together in-frame in accordance with conventional techniques. In another embodiment, the fusion gene can be synthesized by conventional techniques including automated DNA synthesizers. Alternatively, PCR amplification of gene fragments can be carried out using anchor primers which give rise to complementary overhangs between two consecutive gene fragments which can subsequently be annealed and re-amplified to generate a chimeric gene sequence (see Ausubel et al., Current Protocols in Molecular Biology, 1992). Moreover, many expression vectors are commercially available that already encode a fusion moiety (e.g., a GST protein). A lipase peptide-encoding nucleic acid can be cloned into such an expression vector such that the fusion moiety is linked in-frame to the lipase peptide.

[0038] As mentioned above, the present invention also provides and enables obvious variants of the amino acid sequence of the proteins of the present invention, such as naturally occurring mature forms of the peptide, allelic/sequence variants of the peptides, non-naturally occurring recombinantly derived variants of the peptides, and orthologs and paralogs of the peptides. Such variants can readily be generated using art-known techniques in the fields of recombinant nucleic acid technology and protein biochemistry. It is understood, however, that variants exclude any amino acid sequences disclosed prior to the invention.

[0039] Such variants can readily be identified/made using molecular techniques and the sequence information disclosed herein. Further, such variants can readily be distinguished from other peptides based on sequence and/or structural homology to the lipase peptides of the present invention. The degree of homology/identity present will be based primarily on whether the peptide is a functional variant or non-functional variant, the amount of divergence present in the paralog family and the evolutionary distance between the orthologs.

[0040] To determine the percent identity of two amino acid sequences or two nucleic acid sequences, the sequences are aligned for optimal comparison purposes (e.g., gaps can be introduced in one or both of a first and a second amino acid or nucleic acid sequence for optimal alignment and non-homologous sequences can be disregarded for comparison purposes). In a preferred embodiment, at least 30%, 40%, 50%, 60%, 70%, 80%, or 90% or more of the length of a reference sequence is aligned for comparison purposes. The amino acid residues or nucleotides at corresponding amino acid positions or nucleotide positions are then compared. When a position in the first sequence is occupied by the same amino acid residue or nucleotide as the corresponding position in the second sequence, then the molecules are identical at that position (as used herein amino acid or nucleic acid "identity" is equivalent to amino acid or nucleic acid "homology"). The percent identity between the two sequences is a function of the number of identical positions shared by the sequences, taking into account the number of gaps, and the length of each gap, which need to be introduced for optimal alignment of the two sequences.

[0041] The comparison of sequences and determination of percent identity and similarity between two sequences can be accomplished using a mathematical algorithm. (Computational Molecular Biology, Lesk, A. M., ed., Oxford University Press, New York, 1988; Biocomputing: Informatics and Genome Projects, Smith, D. W., ed., Academic Press, New York, 1993; Computer Analysis of Sequence Data, Part 1, Griffin, A. M., and Griffin, H. G., eds., Humana Press, New Jersey, 1994; Sequence Analysis in Molecular Biology, von Heijne, G., Academic Press, 1987; and Sequence Analysis Primer, Gribskov, M. and Devereux, J., eds., M Stockton Press, New York, 1991). In a preferred embodiment, the percent identity between two amino acid sequences is determined using the Needleman and Wunsch (J. Mol. Biol. (48):444-453 (1970)) algorithm which has been incorporated into the GAP program in the GCG software package (available at http://www.gcg.com), using either a BLOSUM 62 matrix or a PAM250 matrix, and a gap weight of 16, 14, 12, 10, 8, 6, or 4 and a length weight of 1, 2, 3, 4, 5, or 6. In yet another preferred embodiment, the percent identity between two nucleotide sequences is determined using the GAP program in the GCG software package (Devereux, J., et al., Nucleic Acids Res. 12(1):387 (1984)) (available at http://www.gcg.com), using a NWSgapdna.CMP matrix and a gap weight of 40, 50, 60, 70, or 80 and a length weight of 1, 2, 3, 4, 5, or 6. In another embodiment, the percent identity between two amino acid or nucleotide sequences is determined using the algorithm of E. Myers and W. Miller (CABIOS, 4:11-17 (1989)) which has been incorporated into the ALIGN program (version 2.0), using a PAM120 weight residue table, a gap length penalty of 12 and a gap penalty of 4.

[0042] The nucleic acid and protein sequences of the present invention can further be used as a "query sequence" to perform a search against sequence databases to, for example, identify other family members or related sequences. Such searches can be performed using the NBLAST and XBLAST programs (version 2.0) of Altschul, et al. (J. Mol. Biol. 215:403-10 (1990)). BLAST nucleotide searches can be performed with the NBLAST program, score=100, wordlength=12 to obtain nucleotide sequences homologous to the nucleic acid molecules of the invention. BLAST protein searches can be performed with the XBLAST program, score=50, wordlength=3 to obtain amino acid sequences homologous to the proteins of the invention. To obtain gapped alignments for comparison purposes, Gapped BLAST can be utilized as described in Altschul et al. (Nucleic Acids Res. 25(17):3389-3402 (1997)). When utilizing BLAST and gapped BLAST programs, the default parameters of the respective programs (e.g., XBLAST and NBLAST) can be used.

[0043] Full-length pre-processed forms, as well as mature processed forms, of proteins that comprise one of the peptides of the present invention can readily be identified as having complete sequence identity to one of the lipase peptides of the present invention as well as being encoded by the same genetic locus as the lipase peptide provided herein. As indicated by the data presented in FIG. 3, the map position was determined to be on chromosome 10 by ePCR.

[0044] Allelic variants of a lipase peptide can readily be identified as being a human protein having a high degree (significant) of sequence homology/identity to at least a portion of the lipase peptide as well as being encoded by the same genetic locus as the lipase peptide provided herein. Genetic locus can readily be determined based on the genomic information provided in FIG. 3, such as the genomic sequence mapped to the reference human. As indicated by the data presented in FIG. 3, the map position was determined to be on chromosome 10 by ePCR. As used herein, two proteins (or a region of the proteins) have significant homology when the amino acid sequences are typically at least about 70-80%, 80-90%, and more typically at least about 90-95% or more homologous. A significantly homologous amino acid sequence, according to the present invention, will be encoded by a nucleic acid sequence that will hybridize to a lipase peptide encoding nucleic acid molecule under stringent conditions as more fully described below.

[0045] FIG. 3 provides information on SNPs that have been identified in a gene encoding the transporter protein of the present invention. 72 SNP variants were found, including 6 indels (indicated by a "-"). SNPs, identified at different nucleotide positions in introns and regions 5' and 3' of the ORF, may affect control/regulatory elements.

[0046] Paralogs of a lipase peptide can readily be identified as having some degree of significant sequence homology/identity to at least a portion of the lipase peptide, as being encoded by a gene from humans, and as having similar activity or function. Two proteins will typically be considered paralogs when the amino acid sequences are typically at least about 60% or greater, and more typically at least about 70% or greater homology through a given region or domain. Such paralogs will be encoded by a nucleic acid sequence that will hybridize to a lipase peptide encoding nucleic acid molecule under moderate to stringent conditions as more fully described below.

[0047] Orthologs of a lipase peptide can readily be identified as having some degree of significant sequence homology/identity to at least a portion of the lipase peptide as well as being encoded by a gene from another organism. Preferred orthologs will be isolated from mammals, preferably primates, for the development of human therapeutic targets and agents. Such orthologs will be encoded by a nucleic acid sequence that will hybridize to a lipase peptide encoding nucleic acid molecule under moderate to stringent conditions, as more fully described below, depending on the degree of relatedness of the two organisms yielding the proteins.

[0048] Non-naturally occurring variants of the lipase peptides of the present invention can readily be generated using recombinant techniques. Such variants include, but are not limited to deletions, additions and substitutions in the amino acid sequence of the lipase peptide. For example, one class of substitutions is the conserved amino acid substitution. Such substitutions are those that substitute a given amino acid in a lipase peptide by another amino acid of like characteristics. Typically seen as conservative substitutions are the replacements, one for another, among the aliphatic amino acids Ala, Val, Leu, and Ile; interchange of the hydroxyl residues Ser and Thr; exchange of the acidic residues Asp and Glu; substitution between the amide residues Asn and Gln; exchange of the basic residues Lys and Arg; and replacements among the aromatic residues Phe and Tyr. Guidance concerning which amino acid changes are likely to be phenotypically silent is found in Bowie et al., Science 247:1306-1310 (1990).

[0049] Variant lipase peptides can be fully functional or can lack function in one or more activities, e.g. ability to bind substrate, ability to hydrolyze substrate, etc. Fully functional variants typically contain only conservative variation or variation in non-critical residues or in non-critical regions. FIG. 2 provides the result of protein analysis and can be used to identify critical domains/regions. Functional variants can also contain substitution of similar amino acids that result in no change or an insignificant change in function. Alternatively, such substitutions may positively or negatively affect function to some degree.

[0050] Non-functional variants typically contain one or more non-conservative amino acid substitutions, deletions, insertions, inversions, or truncation or a substitution, insertion, inversion, or deletion in a critical residue or critical region.

[0051] Amino acids that are essential for function can be identified by methods known in the art, such as site-directed mutagenesis or alanine-scanning mutagenesis (Cunningham et al., Science 244:1081-1085 (1989)), particularly using the results provided in FIG. 2. The latter procedure introduces single alanine mutations at every residue in the molecule.

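Returning to the percent-identity definition given in paragraph [0040], the following is a minimal sketch of a Needleman and Wunsch style global alignment followed by an identity count. It is not the GAP program: the scoring values (match +1, mismatch -1, linear gap -2) are illustrative assumptions rather than the BLOSUM 62 or PAM250 matrices and gap/length weights named in the text, and dividing by the full alignment length (gap columns included) is one of several possible conventions.

```python
def needleman_wunsch(a: str, b: str, match: int = 1, mismatch: int = -1,
                     gap: int = -2) -> tuple[str, str]:
    """Globally align a and b with a linear gap penalty (assumed scoring)."""
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    for i in range(1, rows):
        score[i][0] = i * gap
    for j in range(1, cols):
        score[0][j] = j * gap
    for i in range(1, rows):
        for j in range(1, cols):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)
    # Trace back from the bottom-right cell to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = len(a), len(b)
    while i > 0 or j > 0:
        sub = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            out_a.append(a[i - 1]); out_b.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-"); i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1]); j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b))


def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Identical, non-gap columns as a percentage of all alignment columns."""
    identical = sum(x == y and x != "-" for x, y in zip(aligned_a, aligned_b))
    return 100.0 * identical / len(aligned_a)


aligned = needleman_wunsch("ACGT", "AGT")
print(aligned)                     # ('ACGT', 'A-GT')
print(percent_identity(*aligned))  # 75.0
```

Because gap columns count toward the denominator here, each introduced gap lowers the reported identity, which is one simple way of "taking into account the number of gaps, and the length of each gap".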
The resulting mutant molecules are then tested for biological activity such as lipase activity or in assays such as an in vitro proliferative activity. Sites that are critical for binding partner/substrate binding can also be determined by structural analysis such as crystallization, nuclear magnetic resonance or photoaffinity labeling (Smith et al., J. Mol. Biol. 224:899-904 (1992); de Vos et al. Science 255:306-312 (1992)).

[0052] The present invention further provides fragments of the lipase peptides, in addition to proteins and peptides that comprise and consist of such fragments, particularly those comprising the residues identified in FIG. 2. The fragments to which the invention pertains, however, are not to be construed as encompassing fragments that may be disclosed publicly prior to the present invention.

[0053] As used herein, a fragment comprises at least 8, 10, 12, 14, 16, or more contiguous amino acid residues from a lipase peptide. Such fragments can be chosen based on the ability to retain one or more of the biological activities of the lipase peptide or could be chosen for the ability to perform a function, e.g. bind a substrate or act as an immunogen. Particularly important fragments are biologically active fragments, peptides that are, for example, about 8 or more amino acids in length. Such fragments will typically comprise a domain or motif of the lipase peptide, e.g., active site, a transmembrane domain or a substrate-binding domain. Further, possible fragments include, but are not limited to, domain or motif containing fragments, soluble peptide fragments, and fragments containing immunogenic structures. Predicted domains and functional sites are readily identifiable by computer programs well known and readily available to those of skill in the art (e.g., PROSITE analysis). The results of one such analysis are provided in FIG. 2.

[0054] Polypeptides often contain amino acids other than the 20 amino acids commonly referred to as the 20 naturally occurring amino acids. Further, many amino acids, including the terminal amino acids, may be modified by natural processes, such as processing and other post-translational modifications, or by chemical modification techniques well known in the art. Common modifications that occur naturally in lipase peptides are described in basic texts, detailed monographs, and the research literature, and they are well known to those of skill in the art (some of these features are identified in FIG. 2).

[0055] Known modifications include, but are not limited to, acetylation, acylation, ADP-ribosylation, amidation, covalent attachment of flavin, covalent attachment of a heme moiety, covalent attachment of a nucleotide or nucleotide derivative, covalent attachment of a lipid or lipid derivative, covalent attachment of phosphatidylinositol, cross-linking, cyclization, disulfide bond formation, demethylation, formation of covalent crosslinks, formation of cystine, formation of pyroglutamate, formylation, gamma carboxylation, glycosylation, GPI anchor formation, hydroxylation, iodination, methylation, myristoylation, oxidation, proteolytic processing, phosphorylation, prenylation, racemization, selenoylation, sulfation, transfer-RNA mediated addition of amino acids to proteins such as arginylation, and ubiquitination.

[0056] Such modifications are well known to those of skill in the art and have been described in great detail in the scientific literature. Several particularly common modifications, glycosylation, lipid attachment, sulfation, gamma carboxylation of glutamic acid residues, hydroxylation and ADP-ribosylation, for instance, are described in most basic texts, such as Proteins - Structure and Molecular Properties, 2nd Ed., T. E. Creighton, W. H. Freeman and Company, New York (1993). Many detailed reviews are available on this subject, such as by Wold, F., Posttranslational Covalent Modification of Proteins, B. C. Johnson, Ed., Academic Press, New York 1-12 (1983); Seifter et al. (Meth. Enzymol. 182: 626-646 (1990)) and Rattan et al. (Ann. N.Y. Acad. Sci. 663:48-62 (1992)).

[0057] Accordingly, the lipase peptides of the present invention also encompass derivatives or analogs in which a substituted amino acid residue is not one encoded by the genetic code, in which a substituent group is included, in which the mature lipase peptide is fused with another compound, such as a compound to increase the half-life of the lipase peptide (for example, polyethylene glycol), or in which the additional amino acids are fused to the mature lipase peptide, such as a leader or secretory sequence or a sequence for purification of the mature lipase peptide or a pro-protein sequence.

[0058] Protein/Peptide Uses

[0059] The proteins of the present invention can be used in substantial and specific assays related to the functional information provided in the Figures; to raise antibodies or to elicit another immune response; as a reagent (including the labeled reagent) in assays designed to quantitatively determine levels of the protein (or its binding partner or ligand) in biological fluids; and as markers for tissues in which the corresponding protein is preferentially expressed (either constitutively or at a particular stage of tissue differentiation or development or in a disease state). Where the protein binds or potentially binds to another protein or ligand (such as, for example, in a lipase-effector protein interaction or lipase-ligand interaction), the protein can be used to identify the binding partner/ligand so as to develop a system to identify inhibitors of the binding interaction. Any or all of these uses are capable of being developed into reagent grade or kit format for commercialization as commercial products.

[0060] Methods for performing the uses listed above are well known to those skilled in the art. References disclosing such methods include "Molecular Cloning: A Laboratory Manual", 2d ed., Cold Spring Harbor Laboratory Press, Sambrook, J., E. F. Fritsch and T. Maniatis eds., 1989, and "Methods in Enzymology: Guide to Molecular Cloning Techniques", Academic Press, Berger, S. L. and A. R. Kimmel eds., 1987.

[0061] The potential uses of the peptides of the present invention are based primarily on the source of the protein as well as the class/action of the protein. For example, lipases isolated from humans and their human/mammalian orthologs serve as targets for identifying agents for use in mammalian therapeutic applications, e.g. a human drug, particularly in modulating a biological or pathological response in a cell or tissue that expresses the lipase. Experimental data as provided in FIG. 1 indicates that lipase proteins of the present invention are expressed in normal stomach detected by a virtual northern blot. In addition, PCR-based tissue screening panel indicates expression in human leukocyte. A large percentage of pharmaceutical agents are being developed that modulate the activity of lipase proteins, particularly members of the lysosomal acid lipase subfamily (see Background of the Invention). The structural and functional information provided in the Background and Figures provide specific and substantial uses for the molecules of the present invention, particularly in combination with the expression information provided in FIG. 1. Experimental data as provided in FIG. 1 indicates expression in the normal stomach and human leukocyte. Such uses can readily be determined using the information provided herein, that which is known in the art, and routine experimentation.

[0062] The proteins of the present invention (including variants and fragments that may have been disclosed prior to the present invention) are useful for biological assays related to lipases that are related to members of the lysosomal acid lipase subfamily. Such assays involve any of the known lipase functions or activities or properties useful for diagnosis and treatment of lipase-related conditions that are specific for the subfamily of lipases that the one of the present invention belongs to, particularly in cells and tissues that express the lipase. Experimental data as provided in FIG. 1 indicates that lipase proteins of the present invention are expressed in normal stomach detected by a virtual northern blot. In addition, PCR-based tissue screening panel indicates expression in human leukocyte.

[0063] The proteins of the present invention are also useful in drug screening assays, in cell-based or cell-free systems. Cell-based systems can be native, i.e., cells that normally express the lipase, as a biopsy or expanded in cell culture. Experimental data as provided in FIG. 1 indicates expression in the normal stomach and human leukocyte. In an alternate embodiment, cell-based assays involve recombinant host cells expressing the lipase protein.

[0064] The polypeptides can be used to identify compounds that modulate lipase activity of the protein in its natural state or an altered form that causes a specific disease or pathology associated with the lipase. Both the lipases of the present invention and appropriate variants and fragments can be used in high-throughput screens to assay candidate compounds for the ability to bind to the lipase. These compounds can be further screened against a functional lipase to determine the effect of the compound on the lipase activity. Further, these compounds can be tested in animal or invertebrate systems to determine activity/effectiveness. Compounds can be identified that activate (agonist) or inactivate (antagonist) the lipase to a desired degree.

[0065] Further, the proteins of the present invention can be used to screen a compound for the ability to stimulate or inhibit interaction between the lipase protein and a molecule that normally interacts with the lipase protein, e.g. a substrate. Such assays typically include the steps of combining the lipase protein with a candidate compound under conditions that allow the lipase protein, or fragment, to interact with the target molecule, and to detect the formation of a complex between the protein and the target or to detect the biochemical consequence of the interaction with the lipase protein and the target, such as any of the associated effects of hydrolysis.

[0066] Candidate compounds include, for example, 1) peptides such as soluble peptides, including Ig-tailed fusion peptides and members of random peptide libraries (see, e.g., Lam et al., Nature 354:82-84 (1991); Houghten et al., Nature 354:84-86 (1991)) and combinatorial chemistry-derived molecular libraries made of D- and/or L-configuration amino acids; 2) phosphopeptides (e.g., members of random and partially degenerate, directed phosphopeptide libraries, see, e.g., Songyang et al., Cell 72:767-778 (1993)); 3) antibodies (e.g., polyclonal, monoclonal, humanized, anti-idiotypic, chimeric, and single chain antibodies as well as Fab, F(ab')2, Fab expression library fragments, and epitope-binding fragments of antibodies); and 4) small organic and inorganic molecules (e.g., molecules obtained from combinatorial and natural product libraries).

[0067] One candidate compound is a soluble fragment of the receptor that competes for substrate binding. Other candidate compounds include mutant lipases or appropriate fragments containing mutations that affect lipase function and thus compete for substrate. Accordingly, a fragment that competes for substrate, for example with a higher affinity, or a fragment that binds substrate but does not allow release, is encompassed by the invention.

[0068] Any of the biological or biochemical functions mediated by the lipase can be used as an endpoint assay. These include all of the biochemical or biochemical/biological events described herein, in the references cited herein, incorporated by reference for these endpoint assay targets, and other functions known to those of ordinary skill in the art or that can be readily identified using the information provided in the Figures, particularly FIG. 2. Specifically, a biological function of a cell or tissues that expresses the lipase can be assayed. Experimental data as provided in FIG. 1 indicates that lipase proteins of the present invention are expressed in normal stomach detected by a virtual northern blot. In addition, PCR-based tissue screening panel indicates expression in human leukocyte.

[0069] Binding and/or activating compounds can also be screened by using chimeric lipase proteins in which the amino terminal extracellular domain, or parts thereof, the entire transmembrane domain or subregions, such as any of the seven transmembrane segments or any of the intracellular or extracellular loops and the carboxy terminal intracellular domain, or parts thereof, can be replaced by heterologous domains or subregions. For example, a substrate-binding region can be used that interacts with a different substrate than that which is recognized by the native lipase. Accordingly, a different set of signal transduction components is available as an end-point assay for activation. This allows for assays to be performed in other than the specific host cell from which the lipase is derived.

[0070] The proteins of the present invention are also useful in competition binding assays in methods designed to discover compounds that interact with the lipase (e.g. binding partners and/or ligands). Thus, a compound is exposed to a lipase polypeptide under conditions that allow the compound to bind or to otherwise interact with the polypeptide. Soluble lipase polypeptide is also added to the mixture. If the test compound interacts with the soluble lipase polypeptide, it decreases the amount of complex formed or activity from the lipase target. This type of assay is particularly useful in cases in which compounds are sought that interact with specific regions of the lipase. Thus, the soluble polypeptide that competes with the target lipase region is designed to contain peptide sequences corresponding to the region of interest.

[0071] To perform cell free drug screening assays, it is sometimes desirable to immobilize either the lipase protein, or fragment, or its target molecule to facilitate separation of complexes from uncomplexed forms of one or both of the proteins, as well as to accommodate automation of the assay.

[0072] Techniques for immobilizing proteins on matrices can be used in the drug screening assays. In one embodiment, a fusion protein can be provided which adds a domain that allows the protein to be bound to a matrix. For example, glutathione-S-transferase fusion proteins can be adsorbed onto glutathione sepharose beads (Sigma Chemical, St. Louis, Mo.) or glutathione derivatized microtitre plates, which are then combined with the cell lysates (e.g., 35S-labeled) and the candidate compound, and the mixture incubated under conditions conducive to complex formation (e.g., at physiological conditions for salt and pH). Following incubation, the beads are washed to remove any unbound label, and the matrix immobilized and radiolabel determined directly, or in the supernatant after the complexes are dissociated. Alternatively, the complexes can be dissociated from the matrix, separated by SDS-PAGE, and the level of

8:1693-1696; and Brent WO94/10300), to identify other proteins, which bind to or interact with the lipase and are involved in lipase activity.

[0076] The two-hybrid system is based on the modular nature of most transcription factors, which consist of separable DNA-binding and activation domains. Briefly, the assay utilizes two different DNA constructs. In one construct, the gene that codes for a lipase protein is fused to a gene encoding the DNA binding domain of a known transcription factor (e.g., GAL-4). In the other construct, a DNA sequence, from a library of DNA sequences, that encodes an unidentified protein ("prey" or "sample") is fused to a gene that codes for the activation domain of the known transcription factor. If the "bait" and the "prey" proteins are able to interact, in vivo, forming a lipase-dependent complex, the DNA-binding and activation domains of the transcription factor are brought into close proximity. This proximity allows transcription of a reporter gene (e.g., LacZ) which is operably linked to a transcriptional regulatory site responsive to the transcription factor.
Expression of the reporter lipase-binding protein found in the bead fraction quantitated gene can be detected and cell colonies containing the from the gel using Standard electrophoretic techniques. For functional transcription factor can be isolated and used to example, either the polypeptide or its target molecule can be obtain the cloned gene which encodes the protein which immobilized utilizing conjugation of biotin and Streptavidin interacts with the lipase protein. using techniques well known in the art. Alternatively, anti 0077. This invention further pertains to novel agents bodies reactive with the protein but which do not interfere identified by the above-described Screening assays. Accord with binding of the protein to its target molecule can be ingly, it is within the Scope of this invention to further use derivatized to the Wells of the plate, and the protein trapped an agent identified as described herein in an appropriate in the wells by antibody conjugation. Preparations of a animal model. For example, an agent identified as described lipase-binding protein and a candidate compound are incu herein (e.g., a lipase-modulating agent, an antisense lipase bated in the lipase protein-presenting Wells and the amount nucleic acid molecule, a lipase-specific antibody, or a lipase of complex trapped in the well can be quantitated. Methods binding partner) can be used in an animal or other model to for detecting Such complexes, in addition to those described determine the efficacy, toxicity, or Side effects of treatment above for the GST-immobilized complexes, include immu with Such an agent. Alternatively, an agent identified as nodetection of complexes using antibodies reactive with the described herein can be used in an animal or other model to lipase protein target molecule, or which are reactive with determine the mechanism of action of Such an agent. 
Fur lipase protein and compete with the target molecule, as well thermore, this invention pertains to uses of novel agents as enzyme-linked assays which rely on detecting an enzy identified by the above-described Screening assays for treat matic activity associated with the target molecule. ments as described herein. 0.073 Agents that modulate one of the lipases of the 0078. The lipase proteins of the present invention are also present invention can be identified using one or more of the useful to provide a target for diagnosing a disease or above assays, alone or in combination. It is generally predisposition to disease mediated by the peptide. Accord preferable to use a cell-based or cell free System first and ingly, the invention provides methods for detecting the then confirm activity in an animal or other model System. presence, or levels of, the protein (or encoding mRNA) in a Such model Systems are well known in the art and can cell, tissue, or organism. Experimental data as provided in readily be employed in this context. FIG. 1 indicates expression in the normal stomach and 0.074) Modulators of lipase protein activity identified human leukocyte. The method involves contacting a bio according to these drug Screening assays can be used to treat logical Sample with a compound capable of interacting with a Subject with a disorder mediated by the lipase pathway, by the lipase protein Such that the interaction can be detected. treating cells or tissues that express the lipase. Experimental Such an assay can be provided in a Single detection format data as provided in FIG. 1 indicates expression in the or a multi-detection format Such as an antibody chip array. normal Stomach and human leukocyte. These methods of 0079. One agent for detecting a protein in a sample is an treatment include the Steps of administering a modulator of antibody capable of Selectively binding to protein. 
A bio lipase activity in a pharmaceutical composition to a Subject logical Sample includes tissues, cells and biological fluids in need of Such treatment, the modulator being identified as isolated from a Subject, as well as tissues, cells and fluids described herein. present within a Subject. 0075. In yet another aspect of the invention, the lipase 0080. The peptides of the present invention also provide proteins can be used as "bait proteins' in a two-hybrid assay targets for diagnosing active protein activity, disease, or or three-hybrid assay (see, e.g., U.S. Pat. No. 5.283,317; predisposition to disease, in a patient having a variant Zervos et al. (1993) Cel 72:223-232; Madura et al. (1993) peptide, particularly activities and conditions that are known J. Biol Chem. 268:12046-12054; Bartel et al. (1993) Bio for other members of the family of proteins to which the techniques 14:920-924; Iwabuchi et al. (1993) Oncogene present one belongs. Thus, the peptide can be isolated from US 2004/0146980 A1 Jul. 29, 2004 a biological Sample and assayed for the presence of a genetic 0083. The peptides are also useful for treating a disorder mutation that results in aberrant peptide. This includes characterized by an absence of, inappropriate, or unwanted amino acid Substitution, deletion, insertion, rearrangement, expression of the protein. Experimental data as provided in (as the result of aberrant splicing events), and inappropriate FIG. 1 indicates expression in the normal stomach and post-translational modification. Analytic methods include human leukocyte. Accordingly, methods for treatment altered electrophoretic mobility, altered tryptic peptide include the use of the lipase protein or fragments. digest, altered lipase activity in cell-based or cell-free assay, alteration in Substrate or antibody-binding pattern, altered 0084 Antibodies isoelectric point, direct amino acid Sequencing, and any 0085. 
The invention also provides antibodies that selec other of the known assay techniques useful for detecting tively bind to one of the peptides of the present invention, a mutations in a protein. Such an assay can be provided in a protein comprising Such a peptide, as well as variants and Single detection format or a multi-detection format Such as fragments thereof. AS used herein, an antibody Selectively an antibody chip array. binds a target peptide when it binds the target peptide and 0081. In vitro techniques for detection of peptide include does not significantly bind to unrelated proteins. An anti enzyme linked immunosorbent assays (ELISAS), Western body is still considered to selectively bind a peptide even if blots, immunoprecipitations and immunofluorescence using it also binds to other proteins that are not Substantially a detection reagent, Such as an antibody or protein binding homologous with the target peptide So long as Such proteins agent. Alternatively, the peptide can be detected in Vivo in a share homology with a fragment or domain of the peptide Subject by introducing into the Subject a labeled anti-peptide target of the antibody. In this case, it would be understood antibody or other types of detection agent. For example, the that antibody binding to the peptide is still Selective despite antibody can be labeled with a radioactive marker whose Some degree of cross-reactivity. presence and location in a Subject can be detected by 0086 As used herein, an antibody is defined in terms Standard imaging techniques. Particularly useful are meth consistent with that recognized within the art: they are ods that detect the allelic variant of a peptide expressed in a multi-Subunit proteins produced by a mammalian organism Subject and methods which detect fragments of a peptide in in response to an antigen challenge. The antibodies of the a Sample. 
present invention include polyclonal antibodies and mono 0082 The peptides are also useful in pharmacogenomic clonal antibodies, as well as fragments of Such antibodies, analysis. Pharmacogenomics deal with clinically significant including, but not limited to, Fab or F(ab), and Fv frag hereditary variations in the response to drugs due to altered mentS. drug disposition and abnormal action in affected perSons. See, e.g., Eichelbaum, M. (Clin. Exp. Pharmacol. Physiol. 0087 Many methods are known for generating and/or 23(10-11):983-985 (1996)), and Linder, M. W. (Clin. Chem. identifying antibodies to a given target peptide. Several Such 43(2):254-266 (1997). The clinical outcomes of these varia methods are described by Harlow, Antibodies, Cold Spring tions result in Severe toxicity of therapeutic drugs in certain Harbor Press, (1989). individuals or therapeutic failure of drugs in certain indi 0088. In general, to generate antibodies, an isolated pep viduals as a result of individual variation in metabolism. tide is used as an immunogen and is administered to a Thus, the genotype of the individual can determine the way mammalian organism, Such as a rat, rabbit or mouse. The a therapeutic compound acts on the body or the way the full-length protein, an antigenic peptide fragment or a fusion body metabolizes the compound. Further, the activity of protein can be used. Particularly important fragments are drug metabolizing enzymes effects both the intensity and those covering functional domains, Such as the domains duration of drug action. Thus, the pharmacogenomics of the identified in FIG. 2, and domain of Sequence homology or individual permit the Selection of effective compounds and divergence amongst the family, Such as those that can effective dosages of Such compounds for prophylactic or readily be identified using protein alignment methods and as therapeutic treatment based on the individual's genotype. presented in the Figures. 
The discovery of genetic polymorphisms in Some drug metabolizing enzymes has explained why Some patients do 0089 Antibodies are preferably prepared from regions or not obtain the expected drug effects, Show an exaggerated discrete fragments of the lipase proteins. Antibodies can be drug effect, or experience Serious toxicity from Standard prepared from any region of the peptide as described herein. drug dosages. Polymorphisms can be expressed in the phe However, preferred regions will include those involved in notype of the extensive metabolizer and the phenotype of the function/activity and/or lipase/binding partner interaction. poor metabolizer. Accordingly, genetic polymorphism may FIG. 2 can be used to identify particularly important regions lead to allelic protein variants of the lipase protein in which while Sequence alignment can be used to identify conserved one or more of the lipase functions in one population is and unique Sequence fragments. different from those in another population. The peptides thus 0090 An antigenic fragment will typically comprise at allow a target to ascertain a genetic predisposition that can least 8 contiguous amino acid residues. The antigenic pep affect treatment modality. Thus, in a ligand-based treatment, tide can comprise, however, at least 10, 12, 14, 16 or more polymorphism may give rise to amino terminal extracellular amino acid residues. Such fragments can be Selected on a domains and/or other Substrate-binding regions that are physical property, Such as fragments correspond to regions more or less active in Substrate binding, and lipase activa that are located on the Surface of the protein, e.g., hydro tion. Accordingly, Substrate dosage would necessarily be philic regions or can be Selected based on Sequence unique modified to maximize the therapeutic effect within a given population containing a polymorphism. As an alternative to ness (see FIG. 2). 
genotyping, specific polymorphic peptides could be identi 0091) Detection on an antibody of the present invention fied. can be facilitated by coupling (i.e., physically linking) the US 2004/0146980 A1 Jul. 29, 2004 antibody to a detectable Substance. Examples of detectable 0096. Additionally, antibodies are useful in pharmacoge Substances include various enzymes, prosthetic groups, fluo nomic analysis. Thus, antibodies prepared against polymor rescent materials, luminescent materials, bioluminescent phic proteins can be used to identify individuals that require materials, and radioactive materials. Examples of Suitable modified treatment modalities. The antibodies are also use enzymes include horseradish peroxidase, alkaline phos ful as diagnostic tools as an immunological marker for phatase, 3-galactosidase, or ; examples aberrant protein analyzed by electrophoretic mobility, iso of Suitable prosthetic group complexes include Streptavidin/ electric point, tryptic peptide digest, and other physical biotin and avidin/biotin; examples of Suitable fluorescent assays known to those in the art. materials include umbelliferone, fluorescein, fluorescein isothiocyanate, rhodamine, dichlorotriazinylamine fluores 0097. The antibodies are also useful for tissue typing. cein, dansyl chloride or phycoerythrin; an example of a Experimental data as provided in FIG. 1 indicates expres luminescent material includes luminol; examples of biolu Sion in the normal Stomach and human leukocyte. Thus, minescent materials include luciferase, luciferin, and where a specific protein has been correlated with expression aequorin, and examples of Suitable radioactive material in a specific tissue, antibodies that are specific for this include I, I, S or H. protein can be used to identify a tissue type. 0098. The antibodies are also useful for inhibiting protein 0092 Antibody Uses function, for example, blocking the binding of the lipase 0093. 
The antibodies can be used to isolate one of the peptide to a binding partner Such as a Substrate. These uses proteins of the present invention by Standard techniques, can also be applied in a therapeutic context in which Such as affinity chromatography or immunoprecipitation. treatment involves inhibiting the protein's function. An The antibodies can facilitate the purification of the natural antibody can be used, for example, to block binding, thus protein from cells and recombinantly produced protein modulating (agonizing or antagonizing) the peptides activ expressed in host cells. In addition, Such antibodies are ity. Antibodies can be prepared against Specific fragments useful to detect the presence of one of the proteins of the containing Sites required for function or against intact pro present invention in cells or tissues to determine the pattern tein that is associated with a cell or cell membrane. See FIG. of expression of the protein among various tissues in an 2 for Structural information relating to the proteins of the organism and over the course of normal development. present invention. Experimental data as provided in FIG. 1 indicates that lipase 0099. The invention also encompasses kits for using proteins of the present invention are expressed in normal antibodies to detect the presence of a protein in a biological stomach detected by a virtual northern blot. In addition, Sample. The kit can comprise antibodies Such as a labeled or PCR-based tissue Screening panel indicates expression in labelable antibody and a compound or agent for detecting human leukocyte. Further, Such antibodies can be used to protein in a biological Sample, means for determining the detect protein in situ, in vitro, or in a cell lysate or Super amount of protein in the Sample, means for comparing the natant in order to evaluate the abundance and pattern of amount of protein in the Sample with a Standard; and expression. 
Also, Such antibodies can be used to assess instructions for use. Such a kit can be Supplied to detect a abnormal tissue distribution or abnormal expression during Single protein or epitope or can be configured to detect one development or progression of a biological condition. Anti of a multitude of epitopes, Such as in an antibody detection body detection of circulating fragments of the full length array. Arrays are described in detail below for nuleic acid protein can be used to identify turnover. arrays and Similar methods have been developed for anti 0094) Further, the antibodies can be used to assess body arrayS. expression in disease States Such as in active Stages of the disease or in an individual with a predisposition toward 0100 Nucleic Acid Molecules disease related to the protein's function. When a disorder is 0101 The present invention further provides isolated caused by an inappropriate tissue distribution, developmen nucleic acid molecules that encode a lipase peptide or tal expression, level of expression of the protein, or protein of the present invention (cDNA, transcript and expressed/processed form, the antibody can be prepared genomic sequence). Such nucleic acid molecules will con against the normal protein. Experimental data as provided in sist of, consist essentially of, or comprise a nucleotide FIG. 1 indicates expression in the normal stomach and Sequence that encodes one of the lipase peptides of the human leukocyte. If a disorder is characterized by a specific present invention, an allelic variant thereof, or an Ortholog or mutation in the protein, antibodies specific for this mutant paralog thereof. protein can be used to assay for the presence of the Specific mutant protein. 0102) As used herein, an "isolated nucleic acid molecule is one that is separated from other nucleic acid present in the 0.095 The antibodies can also be used to assess normal natural source of the nucleic acid. 
Preferably, an “isolated” and aberrant Subcellular localization of cells in the various nucleic acid is free of Sequences which naturally flank the tissues in an organism. Experimental data as provided in nucleic acid (i.e., Sequences located at the 5' and 3' ends of FIG. 1 indicates expression in the normal stomach and the nucleic acid) in the genomic DNA of the organism from human leukocyte. The diagnostic uses can be applied, not which the nucleic acid is derived. However, there can be only in genetic testing, but also in monitoring a treatment Some flanking nucleotide Sequences, for example up to modality. Accordingly, where treatment is ultimately aimed about 5 KB, 4KB, 3 KB, 2 KB, or 1 KB or less, particularly at correcting expression level or the presence of aberrant contiguous peptide encoding Sequences and peptide encod Sequence and aberrant tissue distribution or developmental ing Sequences within the same gene but Separated by introns expression, antibodies directed against the protein or rel in the genomic Sequence. The important point is that the evant fragments can be used to monitor therapeutic efficacy. nucleic acid is isolated from remote and unimportant flank US 2004/0146980 A1 Jul. 29, 2004

ing Sequences Such that it can be Subjected to the Specific tures are either noted in FIGS. 1 and 3 or can readily be manipulations described herein Such as recombinant expres identified using computational tools known in the art. AS Sion, preparation of probes and primers, and other uses discussed below, Some of the non-coding regions, particu Specific to the nucleic acid Sequences. larly gene regulatory elements Such as promoters, are useful for a variety of purposes, e.g. control of heterologous gene 0103 Moreover, an "isolated” nucleic acid molecule, expression, target for identifying gene activity modulating Such as a transcript/cDNA molecule, can be Substantially compounds, and are particularly claimed as fragments of the free of other cellular material, or culture medium when genomic Sequence provided herein. produced by recombinant techniques, or chemical precur sors or other chemicals when chemically synthesized. How 0109 The isolated nucleic acid molecules can encode the ever, the nucleic acid molecule can be fused to other coding mature protein plus additional amino or carboxyl-terminal or regulatory Sequences and Still be considered isolated. amino acids, or amino acids interior to the mature peptide (when the mature form has more than one peptide chain, for 0104 For example, recombinant DNA molecules con instance). Such sequences may play a role in processing of tained in a vector are considered isolated. Further examples a protein from precursor to a mature form, facilitate protein of isolated DNA molecules include recombinant DNA mol trafficking, prolong or shorten protein half-life or facilitate ecules maintained in heterologous host cells or purified manipulation of a protein for assay or production, among (partially or substantially) DNA molecules in solution. Iso other things. 
As generally is the case in situ, the additional lated RNA molecules include in vivo or in vitro RNA amino acids may be processed away from the mature protein transcripts of the isolated DNA molecules of the present by cellular enzymes. invention. Isolated nucleic acid molecules according to the present invention further include Such molecules produced 0110. As mentioned above, the isolated nucleic acid Synthetically. molecules include, but are not limited to, the Sequence encoding the lipase peptide alone, the Sequence encoding the 0105. Accordingly, the present invention provides mature peptide and additional coding Sequences, Such as a nucleic acid molecules that consist of the nucleotide leader or Secretory Sequence (e.g., a pre-pro or pro-protein sequence shown in FIG. 1 or 3 (SEQ ID NO:1, transcript Sequence), the Sequence encoding the mature peptide, with Sequence and SEQ ID NO:3, genomic sequence), or any or without the additional coding Sequences, plus additional nucleic acid molecule that encodes the protein provided in non-coding Sequences, for example introns and non-coding FIG. 2, SEQ ID NO:2. A nucleic acid molecule consists of 5' and 3' Sequences Such as transcribed but non-translated a nucleotide Sequence when the nucleotide sequence is the Sequences that play a role in transcription, mRNA proceSS complete nucleotide Sequence of the nucleic acid molecule. ing (including splicing and polyadenylation signals), ribo 0106 The present invention further provides nucleic acid some binding and stability of mRNA. In addition, the molecules that consist essentially of the nucleotide Sequence nucleic acid molecule may be fused to a marker Sequence shown in FIG. 1 or 3 (SEQ ID NO:1, transcript sequence encoding, for example, a peptide that facilitates purification. and SEQ ID NO:3, genomic sequence), or any nucleic acid 0.111) Isolated nucleic acid molecules can be in the form molecule that encodes the protein provided in FIG. 
2, SEQ of RNA, such as mRNA, or in the form DNA, including ID NO:2. A nucleic acid molecule consists essentially of a cDNA and genomic DNA obtained by cloning or produced nucleotide Sequence when Such a nucleotide Sequence is by chemical Synthetic techniqueS or by a combination present with only a few additional nucleic acid residues in thereof. The nucleic acid, especially DNA, can be double the final nucleic acid molecule. Stranded or Single-Stranded. Single-stranded nucleic acid can 0107 The present invention further provides nucleic acid be the coding Strand (Sense Strand) or the non-coding Strand molecules that comprise the nucleotide Sequences shown in (anti-Sense Strand). FIG. 1 or 3 (SEQ ID NO:1, transcript sequence and SEQID 0112 The invention further provides nucleic acid mol NO:3, genomic sequence), or any nucleic acid molecule that ecules that encode fragments of the peptides of the present encodes the protein provided in FIG. 2, SEQ ID NO:2. A invention as well as nucleic acid molecules that encode nucleic acid molecule comprises a nucleotide Sequence obvious variants of the lipase proteins of the present inven when the nucleotide Sequence is at least part of the final tion that are described above. Such nucleic acid molecules nucleotide Sequence of the nucleic acid molecule. In Such a may be naturally occurring, Such as allelic variants (same fashion, the nucleic acid molecule can be only the nucleotide locus), paralogs (different locus), and orthologs (different Sequence or have additional nucleic acid residues, Such as organism), or may be constructed by recombinant DNA nucleic acid residues that are naturally associated with it or methods or by chemical Synthesis. Such non-naturally heterologous nucleotide Sequences. 
Such a nucleic acid occurring variants may be made by mutagenesis techniques, molecule can have a few additional nucleotides or can including those applied to nucleic acid molecules, cells, or comprises Several hundred or more additional nucleotides. A organisms. Accordingly, as discussed above, the variants can brief description of how various types of these nucleic acid contain nucleotide Substitutions, deletions, inversions and molecules can be readily made/isolated is provided below. insertions. Variation can occur in either or both the coding 0108). In FIGS. 1 and 3, both coding and non-coding and non-coding regions. The variations can produce both Sequences are provided. Because of the Source of the present conservative and non-conservative amino acid Substitutions. invention, humans genomic sequence (FIG. 3) and cDNA/ 0113. The present invention further provides non-coding transcript sequences (FIG. 1), the nucleic acid molecules in fragments of the nucleic acid molecules provided in FIGS. the Figures will contain genomic intronic Sequences, 5' and 1 and 3. Preferred non-coding fragments include, but are not 3' non-coding Sequences, gene regulatory regions and non limited to, promoter Sequences, enhancer Sequences, gene coding intergenic Sequences. In general Such Sequence fea modulating Sequences and gene termination Sequences. US 2004/0146980 A1 Jul. 29, 2004

Such fragments are useful in controlling heterologous gene expression and in developing screens to identify gene-modulating agents. A promoter can readily be identified as being 5' to the ATG start site in the genomic sequence provided in FIG. 3.

[0114] A fragment comprises a contiguous nucleotide sequence of 12 or more nucleotides. Further, a fragment could be at least 30, 40, 50, 100, 250 or 500 nucleotides in length. The length of the fragment will be based on its intended use. For example, the fragment can encode epitope-bearing regions of the peptide, or can be useful as DNA probes and primers. Such fragments can be isolated using the known nucleotide sequence to synthesize an oligonucleotide probe. A labeled probe can then be used to screen a cDNA library, genomic DNA library, or mRNA to isolate nucleic acid corresponding to the coding region. Further, primers can be used in PCR reactions to clone specific regions of the gene.

[0115] A probe/primer typically comprises a substantially purified oligonucleotide or oligonucleotide pair. The oligonucleotide typically comprises a region of nucleotide sequence that hybridizes under stringent conditions to at least about 12, 20, 25, 40, 50 or more consecutive nucleotides.

[0116] Orthologs, homologs, and allelic variants can be identified using methods well known in the art. As described in the Peptide Section, these variants comprise a nucleotide sequence encoding a peptide that is typically 60-70%, 70-80%, 80-90%, and more typically at least about 90-95% or more homologous to the nucleotide sequence shown in the Figure sheets or a fragment of this sequence. Such nucleic acid molecules can readily be identified as being able to hybridize under moderate to stringent conditions to the nucleotide sequence shown in the Figure sheets or a fragment of the sequence. Allelic variants can readily be determined by genetic locus of the encoding gene. As indicated by the data presented in FIG. 3, the map position was determined to be on chromosome 10 by ePCR.

[0117] FIG. 3 provides information on SNPs that have been identified in a gene encoding the lipase protein of the present invention. 72 SNP variants were found, including 6 indels (indicated by a “-”). SNPs, identified at different nucleotide positions in introns and regions 5' and 3' of the ORF, may affect control/regulatory elements.

[0118] As used herein, the term “hybridizes under stringent conditions” is intended to describe conditions for hybridization and washing under which nucleotide sequences encoding a peptide at least 60-70% homologous to each other typically remain hybridized to each other. The conditions can be such that sequences at least about 60%, at least about 70%, or at least about 80% or more homologous to each other typically remain hybridized to each other. Such stringent conditions are known to those skilled in the art and can be found in Current Protocols in Molecular Biology, John Wiley & Sons, N.Y. (1989), 6.3.1-6.3.6. One example of stringent hybridization conditions is hybridization in 6× sodium chloride/sodium citrate (SSC) at about 45° C., followed by one or more washes in 0.2× SSC, 0.1% SDS at 50-65° C. Examples of moderate to low stringency hybridization conditions are well known in the art.

[0119] Nucleic Acid Molecule Uses

[0120] The nucleic acid molecules of the present invention are useful as probes, primers, and chemical intermediates, and in biological assays. The nucleic acid molecules are useful as a hybridization probe for messenger RNA, transcript/cDNA and genomic DNA to isolate full-length cDNA and genomic clones encoding the peptide described in FIG. 2 and to isolate cDNA and genomic clones that correspond to variants (alleles, orthologs, etc.) producing the same or related peptides shown in FIG. 2. 72 SNPs, including 6 indels, have been identified in the gene encoding the lipase protein provided by the present invention and are given in FIG. 3.

[0121] The probe can correspond to any sequence along the entire length of the nucleic acid molecules provided in the Figures. Accordingly, it could be derived from 5' noncoding regions, the coding region, and 3' noncoding regions. However, as discussed, fragments are not to be construed as encompassing fragments disclosed prior to the present invention.

[0122] The nucleic acid molecules are also useful as primers for PCR to amplify any given region of a nucleic acid molecule and are useful to synthesize antisense molecules of desired length and sequence.

[0123] The nucleic acid molecules are also useful for constructing recombinant vectors. Such vectors include expression vectors that express a portion of, or all of, the peptide sequences. Vectors also include insertion vectors, used to integrate into another nucleic acid molecule sequence, such as into the cellular genome, to alter in situ expression of a gene and/or gene product. For example, an endogenous coding sequence can be replaced via homologous recombination with all or part of the coding region containing one or more specifically introduced mutations.

[0124] The nucleic acid molecules are also useful for expressing antigenic portions of the proteins.

[0125] The nucleic acid molecules are also useful as probes for determining the chromosomal positions of the nucleic acid molecules by means of in situ hybridization methods. As indicated by the data presented in FIG. 3, the map position was determined to be on chromosome 10 by ePCR.

[0126] The nucleic acid molecules are also useful in making vectors containing the gene regulatory regions of the nucleic acid molecules of the present invention.

[0127] The nucleic acid molecules are also useful for designing ribozymes corresponding to all, or a part, of the mRNA produced from the nucleic acid molecules described herein.

[0128] The nucleic acid molecules are also useful for making vectors that express part, or all, of the peptides.

[0129] The nucleic acid molecules are also useful for constructing host cells expressing a part, or all, of the nucleic acid molecules and peptides.

[0130] The nucleic acid molecules are also useful for constructing transgenic animals expressing all, or a part, of the nucleic acid molecules and peptides.
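As an illustration of the percentage thresholds used in paragraphs [0116] and [0118] (60%, 70%, 80% or more), the following minimal sketch computes a simple percent identity between two sequences that have already been aligned to equal length. It rests on assumptions not found in the specification: plain positional identity is used as a stand-in for "homology", the function names `percent_identity` and `meets_threshold` are hypothetical, and actual hybridization behavior also depends on salt, temperature, and probe length, none of which are modeled here.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percent of positions that match between two pre-aligned sequences.

    Assumption: both sequences were aligned beforehand and have equal
    length; a gap character ("-") never counts as a match.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be pre-aligned to equal length")
    if not seq_a:
        raise ValueError("sequences must be non-empty")
    matches = sum(
        1
        for a, b in zip(seq_a.upper(), seq_b.upper())
        if a == b and a != "-"
    )
    return 100.0 * matches / len(seq_a)


def meets_threshold(seq_a: str, seq_b: str, threshold: float = 60.0) -> bool:
    """True when identity reaches the chosen threshold (e.g. 60, 70 or 80)."""
    return percent_identity(seq_a, seq_b) >= threshold
```

A caller could, for example, compare a candidate variant against a fragment of the figure sequence and test it against each of the 60%, 70%, and 80% levels in turn.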

[0131] The nucleic acid molecules are also useful as hybridization probes for determining the presence, level, form and distribution of nucleic acid expression. Experimental data as provided in FIG. 1 indicates that lipase proteins of the present invention are expressed in normal stomach, as detected by a virtual northern blot. In addition, a PCR-based tissue screening panel indicates expression in human leukocytes. Accordingly, the probes can be used to detect the presence of, or to determine levels of, a specific nucleic acid molecule in cells, tissues, and in organisms. The nucleic acid whose level is determined can be DNA or RNA. Accordingly, probes corresponding to the peptides described herein can be used to assess expression and/or gene copy number in a given cell, tissue, or organism. These uses are relevant for diagnosis of disorders involving an increase or decrease in lipase protein expression relative to normal results.

[0132] In vitro techniques for detection of mRNA include Northern hybridizations and in situ hybridizations. In vitro techniques for detecting DNA include Southern hybridizations and in situ hybridization.

[0133] Probes can be used as a part of a diagnostic test kit for identifying cells or tissues that express a lipase protein, such as by measuring a level of a lipase-encoding nucleic acid in a sample of cells from a subject, e.g., mRNA or genomic DNA, or determining if a lipase gene has been mutated. Experimental data as provided in FIG. 1 indicates that lipase proteins of the present invention are expressed in normal stomach, as detected by a virtual northern blot. In addition, a PCR-based tissue screening panel indicates expression in human leukocytes.

[0134] Nucleic acid expression assays are useful for drug screening to identify compounds that modulate lipase nucleic acid expression.

[0135] The invention thus provides a method for identifying a compound that can be used to treat a disorder associated with nucleic acid expression of the lipase gene, particularly biological and pathological processes that are mediated by the lipase in cells and tissues that express it. Experimental data as provided in FIG. 1 indicates expression in the normal stomach and human leukocytes. The method typically includes assaying the ability of the compound to modulate the expression of the lipase nucleic acid and thus identifying a compound that can be used to treat a disorder characterized by undesired lipase nucleic acid expression. The assays can be performed in cell-based and cell-free systems. Cell-based assays include cells naturally expressing the lipase nucleic acid or recombinant cells genetically engineered to express specific nucleic acid sequences.

[0136] The assay for lipase nucleic acid expression can involve direct assay of nucleic acid levels, such as mRNA levels. In this embodiment the regulatory regions of these genes can be operably linked to a reporter gene such as luciferase.

[0137] Thus, modulators of lipase gene expression can be identified in a method wherein a cell is contacted with a candidate compound and the expression of mRNA determined. The level of expression of lipase mRNA in the presence of the candidate compound is compared to the level of expression of lipase mRNA in the absence of the candidate compound. The candidate compound can then be identified as a modulator of nucleic acid expression based on this comparison and be used, for example, to treat a disorder characterized by aberrant nucleic acid expression. When expression of mRNA is statistically significantly greater in the presence of the candidate compound than in its absence, the candidate compound is identified as a stimulator of nucleic acid expression. When nucleic acid expression is statistically significantly less in the presence of the candidate compound than in its absence, the candidate compound is identified as an inhibitor of nucleic acid expression.

[0138] The invention further provides methods of treatment, with the nucleic acid as a target, using a compound identified through drug screening as a gene modulator to modulate lipase nucleic acid expression in cells and tissues that express the lipase. Experimental data as provided in FIG. 1 indicates that lipase proteins of the present invention are expressed in normal stomach, as detected by a virtual northern blot. In addition, a PCR-based tissue screening panel indicates expression in human leukocytes. Modulation includes both up-regulation (i.e., activation or agonization) and down-regulation (i.e., suppression or antagonization) of nucleic acid expression.

[0139] Alternatively, a modulator for lipase nucleic acid expression can be a small molecule or drug identified using the screening assays described herein as long as the drug or small molecule inhibits the lipase nucleic acid expression in the cells and tissues that express the protein. Experimental data as provided in FIG. 1 indicates expression in the normal stomach and human leukocytes.

[0140] The nucleic acid molecules are also useful for monitoring the effectiveness of modulating compounds on the expression or activity of the lipase gene in clinical trials or in a treatment regimen. Thus, the gene expression pattern can serve as a barometer for the continuing effectiveness of treatment with the compound, particularly with compounds to which a patient can develop resistance. The gene expression pattern can also serve as a marker indicative of a physiological response of the affected cells to the compound. Accordingly, such monitoring would allow either increased administration of the compound or the administration of alternative compounds to which the patient has not become resistant. Similarly, if the level of nucleic acid expression falls below a desirable level, administration of the compound could be commensurately decreased.

[0141] The nucleic acid molecules are also useful in diagnostic assays for qualitative changes in lipase nucleic acid expression, and particularly in qualitative changes that lead to pathology. The nucleic acid molecules can be used to detect mutations in lipase genes and gene expression products such as mRNA. The nucleic acid molecules can be used as hybridization probes to detect naturally occurring genetic mutations in the lipase gene and thereby to determine whether a subject with the mutation is at risk for a disorder caused by the mutation. Mutations include deletion, addition, or substitution of one or more nucleotides in the gene; chromosomal rearrangement, such as inversion or transposition; and modification of genomic DNA, such as aberrant methylation patterns or changes in gene copy number, such as amplification. Detection of a mutated form of the lipase gene associated with a dysfunction provides a diagnostic tool for an active disease or susceptibility to disease when the disease results from overexpression, underexpression, or altered expression of a lipase protein.

[0142] Individuals carrying mutations in the lipase gene can be detected at the nucleic acid level by a variety of techniques. FIG. 3 provides information on SNPs that have been identified in a gene encoding the lipase protein of the present invention. 72 SNP variants were found, including 6 indels (indicated by a “-”). SNPs, identified at different nucleotide positions in introns and regions 5' and 3' of the ORF, may affect control/regulatory elements. As indicated by the data presented in FIG. 3, the map position was determined to be on chromosome 10 by ePCR. Genomic DNA can be analyzed directly or can be amplified by using PCR prior to analysis. RNA or cDNA can be used in the same way. In some uses, detection of the mutation involves the use of a probe/primer in a polymerase chain reaction (PCR) (see, e.g., U.S. Pat. Nos. 4,683,195 and 4,683,202), such as anchor PCR or RACE PCR, or, alternatively, in a ligation chain reaction (LCR) (see, e.g., Landegren et al., Science 241:1077-1080 (1988); and Nakazawa et al., PNAS 91:360-364 (1994)), the latter of which can be particularly useful for detecting point mutations in the gene (see Abravaya et al., Nucleic Acids Res. 23:675-682 (1995)). This method can include the steps of collecting a sample of cells from a patient, isolating nucleic acid (e.g., genomic, mRNA or both) from the cells of the sample, contacting the nucleic acid sample with one or more primers which specifically hybridize to a gene under conditions such that hybridization and amplification of the gene (if present) occurs, and detecting the presence or absence of an amplification product, or detecting the size of the amplification product and comparing the length to a control sample. Deletions and insertions can be detected by a change in size of the amplified product compared to the normal genotype. Point mutations can be identified by hybridizing amplified DNA to normal RNA or antisense DNA sequences.

[0143] Alternatively, mutations in a lipase gene can be directly identified, for example, by alterations in restriction enzyme digestion patterns determined by gel electrophoresis.

[0144] Further, sequence-specific ribozymes (U.S. Pat. No. 5,498,531) can be used to score for the presence of specific mutations by development or loss of a ribozyme cleavage site. Perfectly matched sequences can be distinguished from mismatched sequences by cleavage digestion assays or by differences in melting temperature.

[0145] Sequence changes at specific locations can also be assessed by nuclease protection assays such as RNase and S1 protection or the chemical cleavage method. Furthermore, sequence differences between a mutant lipase gene and a wild-type gene can be determined by direct DNA sequencing. A variety of automated sequencing procedures can be utilized when performing the diagnostic assays (Naeve, C. W., (1995) Biotechniques 19:448), including sequencing by mass spectrometry (see, e.g., PCT International Publication No. WO 94/16101; Cohen et al., Adv. Chromatogr. 36:127-162 (1996); and Griffin et al., Appl. Biochem. Biotechnol. 38:147-159 (1993)).

[0146] Other methods for detecting mutations in the gene include methods in which protection from cleavage agents is used to detect mismatched bases in RNA/RNA or RNA/DNA duplexes (Myers et al., Science 230:1242 (1985); Cotton et al., PNAS 85:4397 (1988); Saleeba et al., Meth. Enzymol. 217:286-295 (1992)), electrophoretic mobility of mutant and wild type nucleic acid is compared (Orita et al., PNAS 86:2766 (1989); Cotton et al., Mutat. Res. 285:125-144 (1993); and Hayashi et al., Genet. Anal. Tech. Appl. 9:73-79 (1992)), and movement of mutant or wild-type fragments in polyacrylamide gels containing a gradient of denaturant is assayed using denaturing gradient gel electrophoresis (Myers et al., Nature 313:495 (1985)). Examples of other techniques for detecting point mutations include selective oligonucleotide hybridization, selective amplification, and selective primer extension.

[0147] The nucleic acid molecules are also useful for testing an individual for a genotype that, while not necessarily causing the disease, nevertheless affects the treatment modality. Thus, the nucleic acid molecules can be used to study the relationship between an individual's genotype and the individual's response to a compound used for treatment (pharmacogenomic relationship). Accordingly, the nucleic acid molecules described herein can be used to assess the mutation content of the lipase gene in an individual in order to select an appropriate compound or dosage regimen for treatment. FIG. 3 provides information on SNPs that have been identified in a gene encoding the lipase protein of the present invention. 72 SNP variants were found, including 6 indels (indicated by a “-”). SNPs, identified at different nucleotide positions in introns and regions 5' and 3' of the ORF, may affect control/regulatory elements.

[0148] Thus nucleic acid molecules displaying genetic variations that affect treatment provide a diagnostic target that can be used to tailor treatment in an individual. Accordingly, the production of recombinant cells and animals containing these polymorphisms allows effective clinical design of treatment compounds and dosage regimens.

[0149] The nucleic acid molecules are thus useful as antisense constructs to control lipase gene expression in cells, tissues, and organisms. A DNA antisense nucleic acid molecule is designed to be complementary to a region of the gene involved in transcription, preventing transcription and hence production of lipase protein. An antisense RNA or DNA nucleic acid molecule would hybridize to the mRNA and thus block translation of mRNA into lipase protein.

[0150] Alternatively, a class of antisense molecules can be used to inactivate mRNA in order to decrease expression of lipase nucleic acid. Accordingly, these molecules can treat a disorder characterized by abnormal or undesired lipase nucleic acid expression. This technique involves cleavage by means of ribozymes containing nucleotide sequences complementary to one or more regions in the mRNA that attenuate the ability of the mRNA to be translated. Possible regions include coding regions and particularly coding regions corresponding to the catalytic and other functional activities of the lipase protein, such as substrate binding.

[0151] The nucleic acid molecules also provide vectors for gene therapy in patients containing cells that are aberrant in lipase gene expression. Thus, recombinant cells, which include the patient's cells that have been engineered ex vivo and returned to the patient, are introduced into an individual where the cells produce the desired lipase protein to treat the individual.

[0152] The invention also encompasses kits for detecting the presence of a lipase nucleic acid in a biological sample. Experimental data as provided in FIG. 1 indicates that lipase proteins of the present invention are expressed in normal stomach, as detected by a virtual northern blot. In addition, a PCR-based tissue screening panel indicates expression in human leukocytes. For example, the kit can comprise reagents such as a labeled or labelable nucleic acid or agent capable of detecting lipase nucleic acid in a biological sample; means for determining the amount of lipase nucleic acid in the sample; and means for comparing the amount of lipase nucleic acid in the sample with a standard. The compound or agent can be packaged in a suitable container. The kit can further comprise instructions for using the kit to detect lipase protein mRNA or DNA.

[0153] Nucleic Acid Arrays

[0154] The present invention further provides nucleic acid detection kits, such as arrays or microarrays of nucleic acid molecules that are based on the sequence information provided in FIGS. 1 and 3 (SEQ ID NOS:1 and 3).

[0155] As used herein “Arrays” or “Microarrays” refers to an array of distinct polynucleotides or oligonucleotides synthesized on a substrate, such as paper, nylon or other type of membrane, filter, chip, glass slide, or any other suitable solid support. In one embodiment, the microarray is prepared and used according to the methods described in U.S. Pat. No. 5,837,832, Chee et al., PCT application WO95/11995 (Chee et al.), Lockhart, D. J. et al. (1996; Nat. Biotech. 14: 1675-1680) and Schena, M. et al. (1996; Proc. Natl. Acad. Sci. 93: 10614-10619), all of which are incorporated herein in their entirety by reference. In other embodiments, such arrays are produced by the methods described by Brown et al., U.S. Pat. No. 5,807,522.

[0156] The microarray or detection kit is preferably composed of a large number of unique, single-stranded nucleic acid sequences, usually either synthetic antisense oligonucleotides or fragments of cDNAs, fixed to a solid support. The oligonucleotides are preferably about 6-60 nucleotides in length, more preferably 15-30 nucleotides in length, and most preferably about 20-25 nucleotides in length. For a certain type of microarray or detection kit, it may be preferable to use oligonucleotides that are only 7-20 nucleotides in length. The microarray or detection kit may contain oligonucleotides that cover the known 5' or 3' sequence, sequential oligonucleotides which cover the full length sequence, or unique oligonucleotides selected from particular areas along the length of the sequence. Polynucleotides used in the microarray or detection kit may be oligonucleotides that are specific to a gene or genes of interest.

[0157] In order to produce oligonucleotides to a known sequence for a microarray or detection kit, the gene(s) of interest (or an ORF identified from the contigs of the present invention) is typically examined using a computer algorithm which starts at the 5' or at the 3' end of the nucleotide sequence. Typical algorithms will then identify oligomers of defined length that are unique to the gene, have a GC content within a range suitable for hybridization, and lack predicted secondary structure that may interfere with hybridization. In certain situations it may be appropriate to use pairs of oligonucleotides on a microarray or detection kit. The “pairs” will be identical, except for one nucleotide that preferably is located in the center of the sequence. The second oligonucleotide in the pair (mismatched by one) serves as a control. The number of oligonucleotide pairs may range from two to one million. The oligomers are synthesized at designated areas on a substrate using a light-directed chemical process. The substrate may be paper, nylon or other type of membrane, filter, chip, glass slide or any other suitable solid support.

[0158] In another aspect, an oligonucleotide may be synthesized on the surface of the substrate by using a chemical coupling procedure and an ink jet application apparatus, as described in PCT application WO95/251116 (Baldeschweiler et al.) which is incorporated herein in its entirety by reference. In another aspect, a “gridded” array analogous to a dot (or slot) blot may be used to arrange and link cDNA fragments or oligonucleotides to the surface of a substrate using a vacuum system, thermal, UV, mechanical or chemical bonding procedures. An array, such as those described above, may be produced by hand or by using available devices (slot blot or dot blot apparatus), materials (any suitable solid support), and machines (including robotic instruments), and may contain 8, 24, 96, 384, 1536, 6144 or more oligonucleotides, or any other number between two and one million which lends itself to the efficient use of commercially available instrumentation.

[0159] In order to conduct sample analysis using a microarray or detection kit, the RNA or DNA from a biological sample is made into hybridization probes. The mRNA is isolated, and cDNA is produced and used as a template to make antisense RNA (aRNA). The aRNA is amplified in the presence of fluorescent nucleotides, and labeled probes are incubated with the microarray or detection kit so that the probe sequences hybridize to complementary oligonucleotides of the microarray or detection kit. Incubation conditions are adjusted so that hybridization occurs with precise complementary matches or with various degrees of less complementarity. After removal of nonhybridized probes, a scanner is used to determine the levels and patterns of fluorescence. The scanned images are examined to determine degree of complementarity and the relative abundance of each oligonucleotide sequence on the microarray or detection kit. The biological samples may be obtained from any bodily fluids (such as blood, urine, saliva, phlegm, gastric juices, etc.), cultured cells, biopsies, or other tissue preparations. A detection system may be used to measure the absence, presence, and amount of hybridization for all of the distinct sequences simultaneously. This data may be used for large-scale correlation studies on the sequences, expression patterns, mutations, variants, or polymorphisms among samples.

[0160] Using such arrays, the present invention provides methods to identify the expression of the lipase proteins/peptides of the present invention. In detail, such methods comprise incubating a test sample with one or more nucleic acid molecules and assaying for binding of the nucleic acid molecule with components within the test sample. Such assays will typically involve arrays comprising many genes, at least one of which is a gene of the present invention and/or alleles of the lipase gene of the present invention. FIG. 3 provides information on SNPs that have been identified in a gene encoding the lipase protein of the present invention. 72 SNP variants were found, including 6 indels (indicated by a “-”). SNPs, identified at different nucleotide positions in introns and regions 5' and 3' of the ORF, may affect control/regulatory elements.
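The oligonucleotide selection outlined in paragraph [0157] can be sketched in a few lines. This is only an illustrative sketch under stated assumptions, not the algorithm used for the present invention: the function names, the default length of 20 nucleotides, and the 40-60% GC window are hypothetical choices; uniqueness is checked only within the supplied sequence rather than against a whole genome; and the secondary-structure screen mentioned in [0157] is omitted. The mismatch control simply swaps the central base for its complement, which is one of several possible ways to build the one-base-mismatched partner of each pair.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def gc_fraction(oligo: str) -> float:
    """Fraction of G and C bases in an oligonucleotide."""
    return (oligo.count("G") + oligo.count("C")) / len(oligo)


def mismatch_control(oligo: str) -> str:
    """Copy of the oligo with its central base replaced by the complement."""
    mid = len(oligo) // 2
    return oligo[:mid] + COMPLEMENT[oligo[mid]] + oligo[mid + 1:]


def candidate_oligos(sequence, length=20, gc_min=0.4, gc_max=0.6):
    """Scan from the 5' end and return (start, oligo, control) tuples.

    A window is kept when it contains only A/C/G/T, its GC fraction lies
    within [gc_min, gc_max], and it occurs exactly once in `sequence`.
    """
    sequence = sequence.upper()
    pairs = []
    for start in range(len(sequence) - length + 1):
        oligo = sequence[start:start + length]
        if set(oligo) - set("ACGT"):
            continue  # skip windows with ambiguous or unread bases
        if not gc_min <= gc_fraction(oligo) <= gc_max:
            continue
        # Unique only if this is the first occurrence and no later one exists.
        first_here = sequence.find(oligo) == start
        no_later_copy = sequence.find(oligo, start + 1) == -1
        if first_here and no_later_copy:
            pairs.append((start, oligo, mismatch_control(oligo)))
    return pairs
```

Each returned tuple holds the perfect-match oligonucleotide and its single-mismatch control, mirroring the “pairs” described above.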

[0161] Conditions for incubating a nucleic acid molecule with a test sample vary. Incubation conditions depend on the format employed in the assay, the detection methods employed, and the type and nature of the nucleic acid molecule used in the assay. One skilled in the art will recognize that any one of the commonly available hybridization, amplification or array assay formats can readily be adapted to employ the novel fragments of the Human genome disclosed herein. Examples of such assays can be found in Chard, T., An Introduction to Radioimmunoassay and Related Techniques, Elsevier Science Publishers, Amsterdam, The Netherlands (1986); Bullock, G. R. et al., Techniques in Immunocytochemistry, Academic Press, Orlando, Fla. Vol. 1 (1982), Vol. 2 (1983), Vol. 3 (1985); Tijssen, P., Practice and Theory of Enzyme Immunoassays: Laboratory Techniques in Biochemistry and Molecular Biology, Elsevier Science Publishers, Amsterdam, The Netherlands (1985).

[0162] The test samples of the present invention include cells, protein or membrane extracts of cells. The test sample used in the above-described method will vary based on the assay format, nature of the detection method and the tissues, cells or extracts used as the sample to be assayed. Methods for preparing nucleic acid extracts of cells are well known in the art and can readily be adapted in order to obtain a sample that is compatible with the system utilized.

[0163] In another embodiment of the present invention, kits are provided which contain the necessary reagents to carry out the assays of the present invention.

[0164] Specifically, the invention provides a compartmentalized kit to receive, in close confinement, one or more containers which comprises: (a) a first container comprising one of the nucleic acid molecules that can bind to a fragment of the Human genome disclosed herein; and (b) one or more other containers comprising one or more of the following: wash reagents, reagents capable of detecting presence of a bound nucleic acid.

[0165] In detail, a compartmentalized kit includes any kit in which reagents are contained in separate containers. Such containers include small glass containers, plastic containers, strips of plastic, glass or paper, or arraying material such as silica. Such containers allow one to efficiently transfer reagents from one compartment to another compartment such that the samples and reagents are not cross-contaminated, and the agents or solutions of each container can be added in a quantitative fashion from one compartment to another. Such containers will include a container which will accept the test sample, a container which contains the nucleic acid probe, containers which contain wash reagents (such as phosphate buffered saline, Tris-buffers, etc.), and containers which contain the reagents used to detect the bound probe. One skilled in the art will readily recognize that the previously unidentified lipase gene of the present invention can be routinely identified using the sequence information disclosed herein and can be readily incorporated into one of the established kit formats which are well known in the art, particularly expression arrays.

[0166] Vectors/Host Cells

[0167] The invention also provides vectors containing the nucleic acid molecules described herein. The term “vector” refers to a vehicle, preferably a nucleic acid molecule, which can transport the nucleic acid molecules. When the vector is a nucleic acid molecule, the nucleic acid molecules are covalently linked to the vector nucleic acid. With this aspect of the invention, the vector includes a plasmid, single or double stranded phage, a single or double stranded RNA or DNA viral vector, or artificial chromosome, such as a BAC, PAC, YAC, or MAC.

[0168] A vector can be maintained in the host cell as an extrachromosomal element where it replicates and produces additional copies of the nucleic acid molecules. Alternatively, the vector may integrate into the host cell genome and produce additional copies of the nucleic acid molecules when the host cell replicates.

[0169] The invention provides vectors for the maintenance (cloning vectors) or vectors for expression (expression vectors) of the nucleic acid molecules. The vectors can function in prokaryotic or eukaryotic cells or in both (shuttle vectors).

[0170] Expression vectors contain cis-acting regulatory regions that are operably linked in the vector to the nucleic acid molecules such that transcription of the nucleic acid molecules is allowed in a host cell. The nucleic acid molecules can be introduced into the host cell with a separate nucleic acid molecule capable of affecting transcription. Thus, the second nucleic acid molecule may provide a trans-acting factor interacting with the cis-regulatory control region to allow transcription of the nucleic acid molecules from the vector. Alternatively, a trans-acting factor may be supplied by the host cell. Finally, a trans-acting factor can be produced from the vector itself. It is understood, however, that in some embodiments, transcription and/or translation of the nucleic acid molecules can occur in a cell-free system.

[0171] The regulatory sequences to which the nucleic acid molecules described herein can be operably linked include promoters for directing mRNA transcription. These include, but are not limited to, the left promoter from bacteriophage λ, the lac, TRP, and TAC promoters from E. coli, the early and late promoters from SV40, the CMV immediate early promoter, the adenovirus early and late promoters, and retrovirus long-terminal repeats.

[0172] In addition to control regions that promote transcription, expression vectors may also include regions that modulate transcription, such as repressor binding sites and enhancers. Examples include the SV40 enhancer, the cytomegalovirus immediate early enhancer, polyoma enhancer, adenovirus enhancers, and retrovirus LTR enhancers.

[0173] In addition to containing sites for transcription initiation and control, expression vectors can also contain sequences necessary for transcription termination and, in the transcribed region, a ribosome binding site for translation. Other regulatory control elements for expression include initiation and termination codons as well as polyadenylation signals. The person of ordinary skill in the art would be aware of the numerous regulatory sequences that are useful in expression vectors. Such regulatory sequences are described, for example, in Sambrook et al., Molecular Cloning: A Laboratory Manual. 2nd. ed., Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y., (1989).

[0174] A variety of expression vectors can be used to express a nucleic acid molecule. Such vectors include chromosomal, episomal, and virus-derived vectors, for example vectors derived from bacterial plasmids, from bacteriophage, from yeast episomes, from yeast chromosomal elements, including yeast artificial chromosomes, from viruses such as baculoviruses, papovaviruses such as SV40, Vaccinia viruses, adenoviruses, poxviruses, pseudorabies viruses, and retroviruses. Vectors may also be derived from combinations of these sources such as those derived from plasmid and bacteriophage genetic elements, e.g. cosmids and phagemids. Appropriate cloning and expression vectors for prokaryotic and eukaryotic hosts are described in Sambrook et al., Molecular Cloning: A Laboratory Manual. 2nd. ed., Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y., (1989).

[0175] The regulatory sequence may provide constitutive expression in one or more host cells (i.e. tissue specific) or may provide for inducible expression in one or more cell types such as by temperature, nutrient additive, or exogenous factor such as a hormone or other ligand. A variety of vectors providing for constitutive and inducible expression in prokaryotic and eukaryotic hosts are well known to those of ordinary skill in the art.

[0176] The nucleic acid molecules can be inserted into the vector nucleic acid by well-known methodology. Generally, the DNA sequence that will ultimately be expressed is joined to an expression vector by cleaving the DNA sequence and the expression vector with one or more restriction enzymes and then ligating the fragments together. Procedures for restriction enzyme digestion and ligation are well known to those of ordinary skill in the art.

[0177] The vector containing the appropriate nucleic acid molecule can be introduced into an appropriate host cell for propagation or expression using well-known techniques. Bacterial cells include, but are not limited to, E. coli, Streptomyces, and Salmonella typhimurium. Eukaryotic cells include, but are not limited to, yeast, insect cells such as Drosophila, animal cells such as COS and CHO cells, and plant cells.

[0178] As described herein, it may be desirable to express the peptide as a fusion protein. Accordingly, the invention provides fusion vectors that allow for the production of the peptides. Fusion vectors can increase the expression of a recombinant protein, increase the solubility of the recombinant protein, and aid in the purification of the protein by acting for example as a ligand for affinity purification. A proteolytic cleavage site may be introduced at the junction of the fusion moiety so that the desired peptide can ultimately be separated from the fusion moiety. Proteolytic enzymes include, but are not limited to, factor Xa, thrombin, and enterokinase. Typical fusion expression vectors include pGEX (Smith et al., Gene 67:31-40 (1988)), pMAL (New England Biolabs, Beverly, Mass.) and pRIT5 (Pharmacia, Piscataway, N.J.) which fuse glutathione S-transferase (GST), maltose E binding protein, or protein A, respectively, to the target recombinant protein. Examples of suitable inducible non-fusion E. coli expression vectors include pTrc (Amann et al., Gene 69:301-315 (1988)) and pET 11d (Studier et al., Gene Expression Technology: Methods in Enzymology 185:60-89 (1990)).

Expression Technology: Methods in Enzymology 185, Academic Press, San Diego, Calif. (1990) 119-128). Alternatively, the sequence of the nucleic acid molecule of interest can be altered to provide preferential codon usage for a specific host cell, for example E. coli. (Wada et al., Nucleic Acids Res. 20:2111-2118 (1992)).

[0180] The nucleic acid molecules can also be expressed by expression vectors that are operative in yeast. Examples of vectors for expression in yeast, e.g., S. cerevisiae, include pYepSec1 (Baldari, et al., EMBO J. 6:229-234 (1987)), pMFa (Kurjan et al., Cell 30:933-943 (1982)), pJRY88 (Schultz et al., Gene 54:113-123 (1987)), and pYES2 (Invitrogen Corporation, San Diego, Calif.).

[0181] The nucleic acid molecules can also be expressed in insect cells using, for example, baculovirus expression vectors. Baculovirus vectors available for expression of proteins in cultured insect cells (e.g., Sf9 cells) include the pAc series (Smith et al., Mol. Cell Biol. 3:2156-2165 (1983)) and the pVL series (Luckow et al., Virology 170:31-39 (1989)).

[0182] In certain embodiments of the invention, the nucleic acid molecules described herein are expressed in mammalian cells using mammalian expression vectors. Examples of mammalian expression vectors include pCDM8 (Seed, B. Nature 329:840 (1987)) and pMT2PC (Kaufman et al., EMBO J. 6:187-195 (1987)).

[0183] The expression vectors listed herein are provided by way of example only of the well-known vectors available to those of ordinary skill in the art that would be useful to express the nucleic acid molecules. The person of ordinary skill in the art would be aware of other vectors suitable for maintenance, propagation or expression of the nucleic acid molecules described herein. These are found for example in Sambrook, J., Fritsch, E. F., and Maniatis, T. Molecular Cloning: A Laboratory Manual. 2nd, ed., Cold Spring Harbor Laboratory, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y., 1989.

[0184] The invention also encompasses vectors in which the nucleic acid sequences described herein are cloned into the vector in reverse orientation, but operably linked to a regulatory sequence that permits transcription of antisense RNA. Thus, an antisense transcript can be produced to all, or to a portion, of the nucleic acid molecule sequences described herein, including both coding and non-coding regions. Expression of this antisense RNA is subject to each of the parameters described above in relation to expression of the sense RNA (regulatory sequences, constitutive or inducible expression, tissue-specific expression).

[0185] The invention also relates to recombinant host cells containing the vectors described herein. Host cells therefore include prokaryotic cells, lower eukaryotic cells such as yeast, other eukaryotic cells such as insect cells, and higher eukaryotic cells such as mammalian cells.

[0186] The recombinant host cells are prepared by introducing the vector constructs described herein into the cells by techniques readily available to the person of ordinary skill in the art.
These include, but are not limited to, calcium 0179 Recombinant protein expression can be maximized phosphate transfection, DEAE-dextran-mediated transfec in host bacteria by providing a genetic background wherein tion, cationic lipid-mediated transfection, electroporation, the host cell has an impaired capacity to proteolytically transduction, infection, lipofection, and other techniques cleave the recombinant protein. (Gottesman, S., Gene such as those found in Sambrook, et al. (Molecular Cloning: US 2004/0146980 A1 Jul. 29, 2004 18

A Laboratory Manual. 2nd, ed., Cold Spring Harbor Labo 0194 Uses of Vectors and Host Cells ratory, Cold Spring Harbor Laboratory Press, Cold Spring 0.195 The recombinant host cells expressing the peptides Harbor, N.Y., 1989). described herein have a variety of uses. First, the cells are 0187 Host cells can contain more than one vector. Thus, useful for producing a lipase protein or peptide that can be different nucleotide Sequences can be introduced on different further purified to produce desired amounts of lipase protein vectors of the same cell. Similarly, the nucleic acid mol or fragments. Thus, host cells containing expression vectors ecules can be introduced either alone or with other nucleic are useful for peptide production. acid molecules that are not related to the nucleic acid 0196) Host cells are also useful for conducting cell-based molecules Such as those providing trans-acting factors for assays involving the lipase protein or lipase protein frag expression vectors. When more than one vector is intro ments, Such as those described above as well as other duced into a cell, the vectors can be introduced indepen formats known in the art. Thus, a recombinant host cell dently, co-introduced or joined to the nucleic acid molecule expressing a native lipase protein is useful for assaying VectOr. compounds that Stimulate or inhibit lipase protein function. 0188 In the case of bacteriophage and viral vectors, these 0.197 Host cells are also useful for identifying lipase can be introduced into cells as packaged or encapsulated protein mutants in which these functions are affected. If the Virus by Standard procedures for infection and transduction. mutants naturally occur and give rise to a pathology, host Viral vectors can be replication-competent or replication cells containing the mutations are useful to assay com defective. 
In the case in which Viral replication is defective, pounds that have a desired effect on the mutant lipase protein replication will occur in host cells providing functions that (for example, Stimulating or inhibiting function) which may complement the defects. not be indicated by their effect on the native lipase protein. 0189 Vectors generally include selectable markers that 0198 Genetically engineered host cells can be further enable the selection of the subpopulation of cells that used to produce non-human transgenic animals. A transgenic contain the recombinant vector constructs. The marker can animal is preferably a mammal, for example a rodent, Such be contained in the same vector that contains the nucleic acid as a rat or mouse, in which one or more of the cells of the molecules described herein or may be on a separate vector. animal include a transgene. A transgene is exogenous DNA Markers include tetracycline or amplicillin-resistance genes which is integrated into the genome of a cell from which a for prokaryotic host cells and dihydrofolate reductase or transgenic animal develops and which remains in the neomycin resistance for eukaryotic host cells. However, any genome of the mature animal in one or more cell types or marker that provides Selection for a phenotypic trait will be tissueS of the transgenic animal. These animals are useful for effective. Studying the function of a lipase protein and identifying and 0190. While the mature proteins can be produced in evaluating modulators of lipase protein activity. Other bacteria, yeast, mammalian cells, and other cells under the examples of transgenic animals include non-human pri control of the appropriate regulatory Sequences, cell- free mates, sheep, dogs, cows, goats, chickens, and amphibians. 
transcription and translation Systems can also be used to 0199 A transgenic animal can be produced by introduc produce these proteins using RNA derived from the DNA ing nucleic acid into the male pronuclei of a fertilized constructs described herein. oocyte, e.g., by microinjection, retroviral infection, and 0191) Where secretion of the peptide is desired, which is allowing the oocyte to develop in a pseudopregnant female difficult to achieve with multi-transmembrane domain con foster animal. Any of the lipase protein nucleotide Sequences taining proteins Such as lipases, appropriate Secretion signals can be introduced as a transgene into the genome of a are incorporated into the vector. The Signal Sequence can be non-human animal, Such as a mouse. endogenous to the peptides or heterologous to these pep 0200 Any of the regulatory or other sequences useful in tides. expression vectors can form part of the transgenic Sequence. 0.192 Where the peptide is not secreted into the medium, This includes intronic Sequences and polyadenylation sig which is typically the case with lipases, the protein can be nals, if not already included. A tissue-specific regulatory isolated from the host cell by Standard disruption proce Sequence(s) can be operably linked to the transgene to direct dures, including freeze thaw, Sonication, mechanical disrup expression of the lipase protein to particular cells. tion, use of lysing agents and the like. The peptide can then be recovered and purified by well-known purification meth 0201 Methods for generating transgenic animals via ods including ammonium Sulfate precipitation, acid extrac embryo manipulation and microinjection, particularly ani tion, anion or cationic exchange chromatography, phospho mals. Such as mice, have become conventional in the art and cellulose chromatography, hydrophobic-interaction are described, for example, in U.S. Pat. Nos. 
4,736,866 and chromatography, affinity chromatography, hydroxylapatite 4,870,009, both by Leder et al., U.S. Pat. No. 4,873, 191 by chromatography, lectin chromatography, or high perfor Wagner et al. and in Hogan, B., Manipulating the Mouse mance liquid chromatography. Embryo, (Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y., 1986). Similar methods are used for produc 0193 It is also understood that depending upon the host tion of other transgenic animals. A transgenic founder ani cell in recombinant production of the peptides described mal can be identified based upon the presence of the herein, the peptides can have various glycosylation patterns, transgene in its genome and/or expression of transgenic depending upon the cell, or maybe non-glycosylated as mRNA in tissues or cells of the animals. A transgenic when produced in bacteria. In addition, the peptides may founder animal can then be used to breed additional animals include an initial modified methionine in Some cases as a carrying the transgene. Moreover, transgenic animals carry result of a host-mediated process. ing a transgene can further be bred to other transgenic US 2004/0146980 A1 Jul. 29, 2004 animals carrying other transgenes. A transgenic animal also morula or blastocyst and then transferred to pseudopregnant includes animals in which the entire animal or tissues in the female foster animal. The offspring born of this female animal have been produced using the homologously recom foster animal will be a clone of the animal from which the binant host cells described herein. cell, e.g., the Somatic cell, is isolated. 0202) In another embodiment, transgenic non-human ani 0204 Transgenic animals containing recombinant cells mals can be produced which contain Selected Systems that that express the peptides described herein are useful to allow for regulated expression of the transgene. One conduct the assays described herein in an in Vivo context. 
example of Such a System is the cre/loxP recombinase Accordingly, the various physiological factors that are system of bacteriophage P1. For a description of the cre/loxP present in Vivo and that could effect Substrate binding, and recombinase system, see, e.g., Lakso et al. PNAS 89:6232 lipase protein activation, may not be evident from in Vitro 6236 (1992). Another example of a recombinase system is cell-free or cell-based assayS. Accordingly, it is useful to the FLP recombinase system of S. cerevisiae (O'Gorman et provide non-human transgenic animals to assay in vivo al. Science 251:1351-1355 (1991). If a cre/loxP recombinase lipase protein function, including Substrate interaction, the System is used to regulate expression of the transgene, effect of Specific mutant lipase proteins on lipase protein animals containing transgenes encoding both the Cre recom function and Substrate interaction, and the effect of chimeric binase and a Selected protein is required. Such animals can lipase proteins. It is also possible to assess the effect of null be provided through the construction of “double” transgenic mutations, that is mutations that Substantially or completely animals, e.g., by mating two transgenic animals, one con eliminate one or more lipase protein functions. taining a transgene encoding a Selected protein and the other 0205 All publications and patents mentioned in the containing a transgene encoding a recombinase. above Specification are herein incorporated by reference. 0203 Clones of the non-human transgenic animals Various modifications and variations of the described described herein can also be produced according to the method and System of the invention will be apparent to those methods described in Wilmut, I. et al. Nature 385:810-813 skilled in the art without departing from the Scope and Spirit (1997) and PCT International Publication Nos. WO of the invention. Although the invention has been described 97/07668 and WO 97/07669. 
In brief, a cell, eg, a somatic in connection with Specific preferred embodiments, it should cell, from the transgenic animal can be isolated and induced be understood that the invention as claimed should not be to exit the growth cycle and enter G. phase. The quiescent unduly limited to Such specific embodiments. Indeed, vari cell can then be fused, e.g., through the use of electrical ous modifications of the above-described modes for carrying pulses, to an enucleated oocyte from an animal of the same out the invention which are obvious to those skilled in the Species from which the quiescent cell is isolated. The field of molecular biology or related fields are intended to be reconstructed oocyte is then cultured Such that it develops to within the Scope of the following claims.
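
By way of illustration only, two of the sequence-level manipulations described above, namely production of an antisense transcript from an insert cloned in reverse orientation and alteration of a coding sequence toward the preferential codon usage of a host such as E. coli, can be sketched in Python as follows. The function names and the short table of preferred codons are assumptions made for this example and are not taken from the specification; a working table would be drawn from the measured codon usage of the chosen host.

```python
# Illustrative sketch only. All names below are hypothetical, and the
# preferred-codon table is a placeholder rather than measured host data.

COMPLEMENT = str.maketrans("acgtACGT", "tgcaTGCA")


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of a DNA string."""
    return dna.translate(COMPLEMENT)[::-1]


def antisense_rna(sense_dna: str) -> str:
    """Return the RNA complementary to the sense strand, as would be
    transcribed from an insert cloned in reverse orientation."""
    return reverse_complement(sense_dna).upper().replace("T", "U")


# Toy subset of the genetic code: codon -> one-letter residue.
RESIDUE_OF = {
    "CTT": "L", "CTC": "L", "CTA": "L", "CTG": "L", "TTA": "L", "TTG": "L",
    "ACT": "T", "ACC": "T", "ACA": "T", "ACG": "T",
    "TGT": "C", "TGC": "C",
}

# Assumed host-preferred codon for each residue in the toy table.
PREFERRED_CODON = {"L": "CTG", "T": "ACC", "C": "TGC"}


def recode_for_host(cds: str) -> str:
    """Replace each codon with the assumed preferred synonymous codon.
    Codons outside the toy table are left unchanged."""
    cds = cds.upper()
    out = []
    for i in range(0, len(cds), 3):
        codon = cds[i:i + 3]
        residue = RESIDUE_OF.get(codon)  # None if codon is not in the table
        out.append(PREFERRED_CODON.get(residue, codon))
    return "".join(out)
```

Because only synonymous codons are exchanged, the peptide encoded by the recoded sequence is unchanged.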

SEQUENCE LISTING

<1 60 > NUMBER OF SIEQ IID INOS : 4 <2 10 > SIEQ IID NO 1 &2 11s LENGTH 1360 &212> TYPE DNA <213> ORGANISM: Homo sapiens <400> SEQUENCE: 1 citcttacitct tcagoctdat gttcaaaag.ca aaagttcaga agttcc toat caataaggag 60 tcc.tt gt gag caggit gaagic tcatcta act ag gcatttct at gatgtggc tgc titt taac 120 aacaacttgt ttgatctgtg galactittaaa togctggtgga titcc ttgatt to gaaaatga 18O agt gaatcct gaggt gtgga tgaatactag tgaaatcatic atctacaatg gctaccccag 240 tgaag agtat gaagtcacca ct gala gatgg gtatatactic cttgtcaaca galattcctta 3OO tgggc galaca catgctagga go acaggtoc coggccagtt gtgtatatgc agcatgcc ct 360 gitt tgcagac aatgc ct act gg citt gaga a titat gccaat ggaagic cittg gatit ccttct 420 agcagatgca ggttatgat g tatggatggg aa acagtcgg ggaaac actt ggtcaa galag 480 acacaaaaca ctc tc ag aga cagat gaga a att ctgggcc titta gtttt g at gaaatggc 540

caa at at gat ct ccc ag gag ta ata ga citit catt gitaaat aa aacit ggt c aggagaaatt 600 gitatttc att gg ac att cac itt ggc actac aata ggg titt gtagcctttit ccaccat gcc 660 tiga actiggc a caaagaatca aaat gaattit tgc cittgggit cc tac gaitct catt ca aata 72O US 2004/0146980 A1 Jul. 29, 2004 2O

-continued toccacgggc atttittacca ggitttitttct actitccaaat tccataatca aggctgttitt 78 0 tggtaccalaa gqtttcttitt tagaagataa gaaaacgaag atagottcta ccaaaatctg 840 caacaataag atactctggit toatatgtag cqaattitatgtc.cittatggg citggatccala 9 OO caag aaaaat atgaat caga gtcgaatgga tgt gtatatig tcacatgctic ccactggttc 96.O atcagtacac aacattctgc atattaaaaca gctttaccac tct gat gaat tcagagctta 1020 tg actgggga a at gacgc tg ata at atgaa a cattacaat cagagtcatc ccc citatata 1080 tgacct gact gcc atgaaag togcct actgc tatttgggct ggtggacatg atgtc.citcgg 1140 aaca cc cc ag gat gtg gcca ggata ctc cc tca aatcaag agt cittit cat tagtig citaag 1200 ccitatt gc ca gaatggga ac ccaccttt ga tittit gtc:tgg ggc citt gat g ccc ct ca ac g 1260 gat gitt cagt ggaaat cata accttitaat g a ag gcat att ticcita aatgc caat gcattit 1320 taccttittto a atttaaagg ttggtttcca aagcccttac 1360

<210> SEQ ID NO 2
<211> LENGTH: 395
<212> TYPE: PRT
<213> ORGANISM: Homo sapiens
<400> SEQUENCE: 2

Met Met Trp Leu Leu Leu Thr Thr Thr Cys Leu Ile Cys Gly Thr Leu
1               5                   10                  15

Asn Ala Gly Gly Phe Leu Asp Leu Glu Asn Glu Val Asn Pro Glu Val
            20                  25                  30

Trp Met Asn Thr Ser Glu Ile Ile Ile Tyr Asn Gly Tyr Pro Ser Glu
        35                  40                  45

Glu Tyr Glu Val Thr Thr Glu Asp Gly Tyr Ile Leu Leu Val Asn Arg
    50                  55                  60

Ile Pro Tyr Gly Arg Thr His Ala Arg Ser Thr Gly Pro Arg Pro Val
65                  70                  75                  80

Val Tyr Met Gln His Ala Leu Phe Ala Asp Asn Ala Tyr Trp Leu Glu
                85                  90                  95

Asn Tyr Ala Asn Gly Ser Leu Gly Phe Leu Leu Ala Asp Ala Gly Tyr
            100                 105                 110

Asp Val Trp Met Gly Asn Ser Arg Gly Asn Thr Trp Ser Arg Arg His
        115                 120                 125

Lys Thr Leu Ser Glu Thr Asp Glu Lys Phe Trp Ala Phe Ser Phe Asp
    130                 135                 140

Glu Met Ala Lys Tyr Asp Leu Pro Gly Val Ile Asp Phe Ile Val Asn
145                 150                 155                 160

Lys Thr Gly Gln Glu Lys Leu Tyr Phe Ile Gly His Ser Leu Gly Thr
                165                 170                 175

Thr Ile Gly Phe Val Ala Phe Ser Thr Met Pro Glu Leu Ala Gln Arg
            180                 185                 190

Ile Lys Met Asn Phe Ala Leu Gly Pro Thr Ile Ser Phe Lys Tyr Pro
        195                 200                 205

Thr Gly Ile Phe Thr Arg Phe Phe Leu Leu Pro Asn Ser Ile Ile Lys
    210                 215                 220

Ala Val Phe Gly Thr Lys Gly Phe Phe Leu Glu Asp Lys Lys Thr Lys
225                 230                 235                 240

Ile Ala Ser Thr Lys Ile Cys Asn Asn Lys Ile Leu Trp Leu Ile Cys
                245                 250                 255

Ser Glu Phe Met Ser Leu Trp Ala Gly Ser Asn Lys Lys Asn Met Asn
            260                 265                 270

Gln Ser Arg Met Asp Val Tyr Met Ser His Ala Pro Thr Gly Ser Ser
        275                 280                 285

Val His Asn Ile Leu His Ile Lys Gln Leu Tyr His Ser Asp Glu Phe
    290                 295                 300

Arg Ala Tyr Asp Trp Gly Asn Asp Ala Asp Asn Met Lys His Tyr Asn
305                 310                 315                 320

Gln Ser His Pro Pro Ile Tyr Asp Leu Thr Ala Met Lys Val Pro Thr
                325                 330                 335

Ala Ile Trp Ala Gly Gly His Asp Val Leu Gly Thr Pro Gln Asp Val
            340                 345                 350

Ala Arg Ile Leu Pro Gln Ile Lys Ser Leu Ser Leu Val Leu Ser Leu
        355                 360                 365

Leu Pro Glu Trp Glu Pro Thr Phe Asp Phe Val Trp Gly Leu Asp Ala
    370                 375                 380

Pro Gln Arg Met Phe Ser Gly Asn His Asn Leu
385                 390                 395
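
By way of illustration only, the correspondence between the cDNA of SEQ ID NO: 1 and the peptide of SEQ ID NO: 2 can be checked computationally. The Python sketch below converts three-letter residue codes to one-letter codes and translates a coding sequence with the standard genetic code. The function names are assumptions of this example, and the 48-nucleotide fragment is a cleaned-up reading of the start of the open reading frame in the extracted listing, not an authoritative copy of the filed sequence.

```python
# Illustrative sketch only; function names are hypothetical.

THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}

# Standard genetic code with bases ordered T, C, A, G at each codon position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def to_one_letter(listing_row: str) -> str:
    """Convert space-separated three-letter residue codes to one-letter
    codes, skipping the position numbers that accompany each row."""
    return "".join(
        THREE_TO_ONE[tok] for tok in listing_row.split() if not tok.isdigit()
    )


def translate(cds: str) -> str:
    """Translate a coding sequence codon by codon, stopping at a stop codon."""
    cds = cds.upper()
    residues = []
    for i in range(0, len(cds) - 2, 3):
        aa = CODON_TABLE[cds[i:i + 3]]
        if aa == "*":
            break
        residues.append(aa)
    return "".join(residues)


# First row of SEQ ID NO: 2 as it appears in the listing.
FIRST_ROW = ("Met Met Trp Leu Leu Leu Thr Thr Thr Cys "
             "Leu Ile Cys Gly Thr Leu 1 5 10 15")

# Assumed reading of the first 16 codons of the open reading frame.
ORF_START = "atgatgtggctgcttttaacaacaacttgtttgatctgtggaacttta"
```

Under these assumptions, `translate(ORF_START)` and `to_one_letter(FIRST_ROW)` yield the same 16-residue string, which is the kind of agreement one would expect between a cDNA and the peptide it encodes.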

<2 10 > SIEQ IID NO 3 &2 11s LENGTH 220 67 <2 12> TYPE : DONA <213> ORGANISM: Homo sapiens <400> SEQUIENCE : 3 titatggccita accitttittaa citttgagtta ttittcaagag aaaatttgaa aaag.cagoct 60 ttgaggagaa agaagcaatc caaicaaac aa aaagata acc acactgtaat aggaaatgtg 120 tttt gaatag gacattggaa gaaaaataat aatcattttt acaggtagat cccaiaagtca 18O aggatctatig titcaaccatig tgt gttccac catcttcaca att gaat gag taaccatcat 240 taagcagtta gcttaggccg taatatgatt cittggactga gattitcaaaa ataccacagg 3OO ccttctgaaa ggittacccct ttctagotcc actatoatct aattittatta aaaaaaaaaa 360 aaaaggaaaa att tgagcitt citagagagita ggggctacca titttgtatc c c ac agggcc a 420 aggaacaagit tttaatgtat tcatttaa at taatttcagt at gagtattg aaatatataa 480 tagaaatatt gitaacattat attattttcta tatacttitta titatatagaa aatatatatt 540 acagaatata titattaaata ttg tagaaca atatataata cagaaaaata tataatactic 600 agtaatatat taaatactta ttaaaatagc aagcttatat aggaagagtg atggagcatt 660 gtgagaaagt titcagcttta tttctttgac attacttitgt titctgcacaa acaaaagaat 72 O tacaggaatt gitccagatta titcaaataac tcgaagttga ggaggga ata taagtcaatg 78 0 atgtagaa ac t cittittaaga titt gagct ag c citacaatcit gitaalagat ct gitgaaattiga 840 actatatttg tdctatttcc atattaagttcaaggcaacaa atcaatatta ataataataa 9 OO catagcactt ctagaacttt ctaaa gagtc caataaagtt ttgttagaaa ggattgtttt 96.O tgaagttaaa alaccat gaga alattccagga aaatccacat acctatgcca tcatactatic 1020 aatcaggg ca aaa catgctt gagitcttt ca tca agacitaa atgattaagg agitggta cat 1080 aacttittccc tgttctgact agctgaac ac ttccttittac tccacatttg tttaattggc 1140 US 2004/0146980 A1 Jul. 29, 2004 22

-continued atgaaattitc ccactccact aaaacagatc ttaggatttg gacaacacaa aatat cattt 200 gttttgaaag gatttgagga taa atcca aa cita atagaac tgalaacttct atatitatgct 260 gggt ag Calac t tagttltt CC C tacC Citt Cit it Cait gCit gigg agat gaa ag a gatt C agitta 320 cggcttaa.gc ticcacaggca tacaaagtga agcagaaaac to agg cacgt gtgcc to cat 38O tatctggtat ctcatgtggg gcttagaggt aaattgtcgt tatttggcct ccattitctgc 4 40 ctttaaccac tggtgtaaac aaaggttact gtgccaaagt tgacagcaac ccaaatccct 5 OO ttggcatgtg aattagtttic citctgccata citgctagttic caaattcctt citggtttcag 560 gatt taggag it cagggitt gc ctcat cittct caa at ga gitt a cagt ca cgc a catc ccitac 62O a cactgcaitig gitt ggc acta gitt cc.ttgat aitatgitt act cc gtttgatc ctc at gaagg 680 at ca aatggg gaagggagat act att gtc:t ctg att gtcc attaa gaitct t gagt at gitt 740 ctacttccct gtttgacaca ctggtttgaa alatgttgcta agtcttccca acaat gacag 800 atacticagtg gaalacatgaa goatt.ccgto: aaactggitta ttittgcatca totag acciac 860 tatttc cca a cct gcaiagtig cat catgggcc titt ggtg tgt caggg ac ac g c cittgggtigt 920 gtgtctdagt citaaagctitc citccttittca caagctitcct gtttctdatc. tctdtag citt 98O citaactgtca citgtaatcat citcttactict tcagoctdat gtcaaaag.ca aaagttcaga 20 40 agittcctcat caataaggag tccttgtgag caggtgaagc t catctaact aggtalagatg 2100 aagatctatc ataaccagga gigcag gitt gig aaggt giccag ttg cactiggc agtcaggtgc 216 O aagagctctg cagtgaggct gcctgagt gt ccatcctaga tctctcacct cittggctctg 2220 tgaccittgag caggtottaa atctotctaa goctttgttt ttittaattga taaaatgagg 228O aitaata at ag taccaa aatt agggagatitit t ca gagcitta aataacatac gt galactatt 234. 
O tagagtaatg cct gccataa goggacticag tag cittatta ttagttt cat acaatttgaa 24 OO a agttt cata atatttgc ag atata agat g at ctt ca acc a gata gcta a tgtat gc aa a 2460 gctatttagc titcagaagita a actct.gcat ttc tagaagt taaatattac titt gittatag 252O tgaattatct gtaatattta totcttgctc acttittataa gaaaaatagt gaaag cattt 258O attaagaact tacactgcac taaatgttat atatgactta atcctcacta taaccctatg 264 O agatag gitta cattatt gtc: cita attittac taaca ag gaa accaa gagac aaa gcitacita 27 OO aaacacttgc ctgaggttag a catcttctt citgtggtgag gottggatttc alaatt tagac 276 O. catt tigacitg tag cactitat atg at gag ca tgc tgitt tag tgittata gtg ttg gtc:tacc 282O tittgaataga catacttitta aaccatggca aggaagtgag actgcacatt gaaatatgta 2880 aaattt gc ct it tgggt gc ca c gt gagaaat agt cacat ca citaga alacta atc ata agcit 2.940 tttgtgtttig gttaaagttt tattgatcca tttttcttgtt ttactttgtg ggatactggg 3OOO cittaac tagg ggatacctico acttitt tact togcc atggit atgaaaacct g to citctgaa 3060 tctttagata ttttggcaaa ttigtaggcaa acaaagacitt aaagcaattic aaccttgatt 312 O aaaataagac caaaaatgcc toccatacttig attaaattta ttt cattitta gigaactggat 318O tataatcaag acaactitcta catgaaaaaa tag attaata gtgct coaag titagttcact 324 O gtatttattc cttitttatac attatctgcc ttcggtgtta ttcaagttitt cattaatcat 33OO taataattto actaatcatt ttatttaatt aatcaacatt gatagttaaa attaatctgt 3360 gaatattaaa tgttttatgc caggcatttic tatgatgtgg ctgcttttaa caaicaacttg 342O


-continued tctctctccc tcagaaatac tggaagttgg cagagggaca ctgagctgag catattattg 576 O. tagtttittaa atgctotcca citggacagaa gatgggggat ttgaatagaa atttggtgag 582O galactaatca gtgtc.cattt acacticacct cottctitcc to cotggaagag citatagg act 588 O tgagita ag ca tgataaattit c gt gt citttg taa acca cac ccaggaaatit tgt at ataca 594 O a atacatag a gcacagtagt tatcaggaca gactttigaca taa aaagaac tggg titt gag 6 OOO tccctgctct ggccttctta tctgggtggc cctctgggaa agttactta a cita cataaag 6060 tttt gtttcc atat citacaa aatgagg titt citicaa aatag cagct agttt ata gagtitigt 61.20 tgica aga att tagitaa gcita atacatataa atac gtcaac aitagc accag gita caaaaat 618O atgtgcto aa gaaactgaag ttacctgatt ataatgctot atact attga caagggaaaa 624 O gitgaaaac ag ttttt gtttt accat gtg tg tatt gt gt gtg tgt citt gt gat gittit ccg ac a 6300 tgctctattt aacataaatt actctcactc tttctctctc tctcttitctc tttctccctc 6360 totcatctta coctittcc cc caccaggtoc coggccagtt gtgtatatgc agcatgccct 642O gtttgcagac aat gcctact ggcttgagala titatgccaat ggaagccttg gattccttct 64.80 a gcagatgca g gittat gatig tatggatggg a aacagt cgg gga aaca citt ggt caagaag 654. O acacaaaa.ca citctoagaga cagatgagaa attctgg gcc tittaggtaaa tattagctaa 6600 gaaaactca a gigggga alatt ggaggcaatt tta aalaalaat alacgtggacg citattaatga 6660 ttatctttga cgcttgaagt catatagctc cttgtagttit ctgttaagat citcaaaggag 672O ggta acagoa agaagctctg atttittcact gattotocca caa.gcaaagt atggcatttic 678 O. aiaca agat Ca tittitta catc Calatt Citgtg a att Citat gc attaaaaglt a tgt cCara ag a 6840 gaca gcctc ag gaaattatca tgiaccaat gt g ca cattcat tcagic caat g tttactgagt 69 OO gigct act gita tgc gct gittc taggc ccc ga a catt caaac aggga ac ag a caa actic tiga 696 O cctcacaaag cittatgttica titttagtgat aattttacaa gtcattgctic citggattgcc 7 O2O a atcaact gt gitaalagat ga titt gg accag gac cittattig att tagaga a act gt gatt g 708O atttagagala act gagatcg cacatagtac cattttcagg aaaactccaa tattagattt 714. 
O ttaaaacctt gttaatgggc aatgaagaag aatctttittit gatatcttgt ttcttittaat 72O O ggaagagttt totgctgtca coagaggaca ggctgatgcc toc gatagac ttittctttct 726 O tcaggcctaa gctccctgtt ggtttgtaaa cctgatgcta gaacagactg tgtattc cta 732O ttacattaat aaaacattica gtacccactg aaagtitt gag aatagtggag gaatagalata 7 38 O gaat gttata gtctgagttic titgggcaggg gcaagcatica ggaaatattig aatcattagt 440 ctttaggagg tgtcacaaca attctcctat tcttgtaagt cccaatctat agatttoctc 7500 acat gittcitt titaataaaca gigcittctagc titatg gaata cct gatttiga citaaatgitta 756 O tatag gcc ct titt gitt cctc ct gtc:tga ag aacaa aatac tagita citatg gaat att ggt 762O atatattaaa tatatatota tatatocatg tagg acaggaa tactactact aacaacatct 7 68 O tact gagcac ccactggcag cca gagtcgt ttcttitcata ctattaa acc ccgttagcag 774. O cc.ccgtaaac Cagg tactac cotgtttatt toccaaatga gaaaacatag gotcagagca 78 0 0 tittcagta at ttctoaa.gag ttgcaaaggc cataaatagt agaat catga tttacaaaac 786 O ccct gtttcc aaagatgggt attaaatggit cottaacaatt gtgaa.gc.ctic atgtgggagt 7920 Cagaagtaga ggCacacaag C Cagatgggg aaagg gaggg caaagaaaag Caagaga agg 798O US 2004/0146980 A1 Jul. 29, 2004 25

-continued gaagga ag ag gagggatcat a ag gittiga ac titicaa at atc ataca caagt ttc gaiaagt g 804. O titcctc ttat a ag gaagt aa aat gitacata tgc ag aa aaa caaaa ag cita caata gccita 8100 catata attg gataaataat gaaatacaca ttgaatctaa gtaaa.ca.gca tagaatctgg 81 60 gtgt aaaaaa gaagtgagca agt gctctga gttttaaact taalacttgca agtatttata 8220 aaagcccctg ttttattttig cagittttgat gaa atggcca aatat gatcit cccaggagta 828O atagactt ca ttgtaaataa a actggtcag gagaaattgt attt cattgg a catt ca ctt 8340 ggCactac aa taggtatgtt tat gagggtc actgttaggt gtgtttttga gggtcagttt 84 OO tc tc ag agtc ttacaggagt t caccttt at gitt ggaataa aaca act gitt a cittatagtig 84 60 cc citcaattic cct gtcct cit gctgg gaata accotagtac totalagtagc tigtgagcctg 852O cagtgcacag actatatgta gggcaaacct titcctggg to tctggtoaca gcagoatatt 858O gactacggtg atgcaatttc c caggaata a catgtgttcc alaattca alag alaatalattcc 864. O ac ag agita ag titt citagatit ccctct gagc tgaaaaagita aaatticaat g c catgga ata 87 OO tggctdaaac ataataaatg togcatcaatc atctotttct cacaa.cccaa atgggattitt 876O taaaaaataa aagggaaggg cittataccta tatttaaa.ca aattgaaaag goatggittat 882O atttgtttgt gagttggaac acacaagctt actataataa atcaattgag cittatctatt 888O cagt gt gt ga titt agt attt atgaaatagc a agitaaat gitt a ag cactat g tagaaatttc 894 O taaagttttt taagatgaca acttacttct taatttactt actttactta atttacttta 9 OOO caatttactt tccaggtatt ttggaaagaa atcaataatc tagttccaag taaaagttga 9 O60 aaggaaccoa cactaataaa agctttgaat ttgtcattga actitcCacta aagtttccala 912 O ttttaag aga ataaat catg tgaaagtgca atatttcagt ttagggaaat atttt catta 918O tcaccactat catcagtaac aaacatatat tcattagtat tttagattga caggcacttt 924 O ccaagctdag aac agg cagt tag catcagt cagcatatac taaaaaagta toaaagaact 93OO cataggagat caaaaatgcc accaataggc aaata attac agtatctaac a cittattgag 936 O cattcgttat gtgtagggtc ttgtgttcag gaccttcccc acagtatctc cctctgatct 9420 tcaaaacaac cc gaat gtta ttatccccat ctcatagaag aagaaiacaca agttcag aac 94.80 acagattcaa accagatgta totgatttca ccalatagggit gtgtaaggat toc ggagaaa 
9540 tggt gtagag aagaag aaat gactttagitt ggtttitggaa agtgggt agg acttagatat 96.OO gctcttatac ttgatctgca aaaaaaaaaa aaaaaaccat ggaga atttg attatctgtg 9 660 cttct gitgittit catttaggac ataaatattt ttagt gactg ttg ttt.gcat titt ggacaga 972 O gcaatttctg titatgtaagg agcacccact ctttg tagga catttag tag gitcccagccc 978O attaaacagg gctctgcagt cagogtgacc citcaaaaatc. tcaccitccac acatttccala 984 O acaccctcctg gg gaagtact attcctgatt cagagitcttt ttatcaattg titolagitcaat 9900 tatttcagtt cittctttittctggccaagac agttittaatg titccaacaag tatttcagta 996 O cacacataca cacacacaca cacacacaca cacacacaca cacat gctag tggaggccca 10 020 ggaagg gacc totggaalacc aaattatatg gatattotcc ctagoctacc cagtgttgtg 10080 citaatctoca to citcacaga tatacaaagg ggtgcaatgc tactgctgaa agagcaaagc 10140 aaatggagat gccctggtcct tactgggc ca tcgtggatgc taggg aaag c c cctttcttt 10 20 0 ttggaaac ag gga aga gtc:t a gagg gitt ga aaa ac ac cca gita ag ac act ggg ag ca gtg 10 260 US 2004/0146980 A1 Jul. 29, 2004 26

-continued aaatttcatt ccatagtgag aaagaaaacc tgttagaata actgggtgat gctgcagaaa O320 gaaatcaatt cacctcctgt gactgattat ttgcttctgg aagctctgtg attcattctg O38O gcat ct caga gitt agggat g a aatgaga at gitt gc cagca tittaccc cat g cittgggaag 04 40 tttacacagc agtagctact ccagcagctt aaccatcacc tttcccctgc caactactcc O5 OO attitcc.ccca atcaagtcaa actgtccata aatagaataa aataaaattig gagacitt gag O560 agcagagaag actgaaggca gattatctitt atagaataac to agaag act tccaatt cat O 620 cccc agitatg at cacgat ag a agga aaa aa tga citaagca gagcc ccaat titt gttaga a O 680 acattgcgta agtatttatt tttacaagat tgtcttatct cctgttctct cagggtttgt Of 40 a gcctttit cc accat gcc.tg a actig gcaca a ag aatcaaa atgaattitt g c cittggg tcc O8OO tacgatctca ttcaaatatc ccacgggcat ttittaccagg tittittitctac ttccaaattc O 860 cataat ca ag gtaggct cct tt ca a caaaa tgt acct gag gat ct cattt tggat cata a O920 atccttatta ttttcaaatc tactgtaaag taaaagtagg aaatttagat aaaatctata O98O galacitt ag ac t citg tgggita tg tgc ttg tg tatt gt gt gtc: cct gc gt gtg cgc at gtc:t g O4. 
O t|gcc at agita tct gcag gitt citigta ataca att tactata caaggtcat c agc aggc tga 100 gitat at git ca gaatttct ag ctgaa ctg ag tgc tatat ga caaca aggat ttttc tt gitt 160 titcc caagtg ttttttgittc catttagtca ggtaggt caa tgaattcaca ttg cccaaat 220 gaaagacact tcaagttacc cataatcact gat gt gtcca attttgacat tagaaaaacc 280 tgattaatat attccttcca atatggaaac ttgccctaat aactaaagct aagattccaa 34 O agcc-taaatg tattacagot caagtattaa ttcaaatatt tattggittat ttittcaggag 400 ttgaaaaagt cat ttg gitt g c ca attigt gg att tggg att titat citatta a agggttttt 460 tttttttttc tctttgcttt tgtttctcta caaaggtcat tgoccacaatg aacacagcat 52O ttaatcaaat to cagattgg cctittgaact toggatgatg gataaaatgg atttgggcca 58O aaattgaagt caaggaga.cc agittagaata toaaaataat toatatataa gaaaatgaga 640 c gitt gg tittig ggg tagagtig gtaggaat ga aaa aaattat ttg tgagcita a caca agga a FOO taattitcc at agg gcc taat aatagittagg totgataata citatggtotg ataatagttt 760 tattgtattg tttact gaga gcacaaatga tigtaactitcc titattoaaga gcttittctag 820 tittattitaaa aat gitgitt ga catcagttag gittittaatgt tittctatatt tggacagtigt 88O gagcaaacta atttgttaaa ttaaattcag agagagatac atctatctgt aaatacatat 940 at gc gittgitt tgt gitt gictc ttc ct acata ggt cagcitat a aggcaa ata atg ttc ct.gg 2OOO gittatcitcag titt cacattit ccc actigt ca atattcctgc tacttittaag tcc catatcc 2060 tgctcittittc titcc.gtcagt titccc.ccaga agctccaaga ccccaccagg aatcc.ccatc 2120 caagtttact titcccaactc. citggaagttt caattgttgct gcctttgttga cattatcata 218O tcttttctgt tcaatggttg cittctctttg gctcactgtt ctctactttt cagcctgaga 2240 gctggctaat citgggacagt acticgaat gc agt gtacaca tgggtaalcat ggaaaacccc 2300 gattittccct tatattcaag gtattatttg accttaagaa aaactgtttt acatttoata 2360 ccaattaatg agaaaaaaat attggcaa.gc actgactggg cagaatacag ggaagcttca 2420 citat ggagaa gitgaattit gg gatt gagggc citt tatt gca atc tc ctitigt aaata at att 24.80 tgatactctt cct catctgg agacacattic citaagtaact tittcctgaat aatttggtct 2540 US 2004/0146980 A1 Jul. 29, 2004 27

-continued c citt gact ga atc agitaagt aca aaltagat ccc ca ag cat gigctc tttcc tagaatgaaa 2600 gaaat gtcaa gaagtc:tgaa gat gattc tt gaattitt ggt ttttt gcitat tgc tatt tgg 2660 gcitt gitt gtc: cittgitt gittig citatt gagitt gag ctc ctita tatattct.gg tittactaatcc 2720 cittgtaatat ggatagtctg caaatattitt atcto attca aagataatta ttatttactt 2 78 0 toataggctg tttittggtac caaaggtttc tttittagaag ataagaaaac gaagatagot 284 O t citaccaa aa tct gcaacaa taagata ctc tggitt gatat gita gc ga att tatt gtc ctita 29 OO tgggctggat cca acaag aa aaatatgaat caggtatgta t gata attat agggccattt 2960 gattacctttaa gaaatticcag ctttcctttig act cattittig atatatcitat ttactgtata 3020 aatticatatig gtattccaaa ccctitaaaga cagatttttt titt gcittitta aaa at gttta 3O8O tgggtatata ata gitt gitac atattitat ga gacacatata titttgatata a gcatacaat 314 O gtgtaatgac caaatcaggg taattgggat atc catcacc toaag cattt atcatttctt 3200 tittgttagag acattctaat titigactcttc tagittatttt gaaatataca atgaattatt 326 O gttaactata gtcatcctat tgtgcatgcc agactttagt ccttotaacg gtattittggt 3320 acco attaac caatgcctct ttatcctitcc cccaccocta citaccttitcc cagoctotgg 3380 taaccatcat tcttctcact atctctataa ggtcagtittt tittttaaact cccctatatg 34 40 agrtgagaaca tgcagtattt gtcttttt gt gcctggctta ttt cacttaa t gtaatgttc 3500 totaattitca tocacattat tdcaaatgac atgattitcat tcttctitat g gctgtctata 356 O tgittaccacat tittattitatc cactcatctg ttgat ggaca cittaggc tga titt catatcit 362O tggtoattgt gaatagtgct gtact aaa.ca togggggtgca gatgtcticitt coatggattg 3 680 atttcctttt titttttctga atatagacct agcactggaa titgctggatic atatggtaat 3740 tctactttta gttttttgag gatccctcat actcttcccc atagtitcctg tactaattta 38 OO cattccitacc aacagtctgt gcaag agttc. 
tcttittctoc acattcttgt cagdatccat 3860 tatt gc citat cittttt gata aaa gcitattt taacit ggagt gagat agitac ttc attgtag 392 O ttttagittcg catttctcta atgattagta atgttgaaca ttgtttttaa tgtaccitctt 398O ggctatttigt atgtcttc tt titgagaaatig tctactcaga tctttt.gtcc atttttaaat 4040 cagattittitt ttttgcaatt gagttatatg acctcitttat atattctggt tactaatccc 41 OO ttgtcagatg ggtagitttac aaatattttc tctcattcaa caggittcttt agittocacttt 416 O gttgatggtc tcctttgctt tgcagaagct ttttagcttg acgtaatcta atttgttcat 4220 gittit gcitit tg gitt gcc tg tg cattit gaggg cittacct caa att ggcc cag accaat gtcc 428O cggagtgctt ctgtaatgtt tgttttttag tagtttoata gttttaggtc ttaaatgtgt 434 O ctttaatcca titttgattitt gttttitgtat citggcaagag atagagatct aattt cattc 4 400 ttctgcatat ggatatotag titttcccago atcatttctt gtggaaattig toctittgccc 4 460 aatgtatgtt cittgatgcct ttgttgaaaa ttagttgact ataaatgtgt ggatttattt 4520 gtgggittcitt tattctg ttc cattig gtc:ta tgt gtc:tgitt tittat giccag tatcatgcag 4580 ttittgattat tac aggtttg tagtataatt tgaagtcagg tcatgtgatg cctccagctt 4 640 tgttctttitt tctcagaatc titatatttag aaaaac gitaa agactccaac aaaaaacctg 47 OO ctagaact ga taaaca a att cattaaattt g ca ggataca a cat caacat a caaaatt ca 476 O gcago atttcaatatgcc aa gag caaataa tottaaaaaa aagaaagaaa aaaaaacaag 482O


-continued tgagotattt acagtaaccc cagoatgotg attttgataa attataataa aaaattattt 716 O gagggtggaa agactic ctac citgtcatttg gtggcattta tactgataga actttitttitt 220 aaaaaaattt taattttaat tttaatttat ttcagaaaat ttataaatta aagaagaata 728 O tacaaagaaa cittacatcat gtgtaatcct tccatccaga gatalactaga tigtactaa.ca 734 O ttittggtgta tittattocaa titttctoagt attatattgc titttagacaa cittittaatct 7400 tt ctat ttta cttaag ctat a gt aa gagat a acta atata act gagggat ttttaaat gc 460 atttttaatig gctacata at agaaattatt t cataaaaat ctttacagca taaat gaata 752 O tacactttitt aataccaa.ca gaaaaattag aattic catat gaaagttgaa taagtattac 758 O ccaacattga a gacttgggt cgtaaggcat cittitctccat attagctittat gacataaaaa 764 O totgtagcct tatttagcac cqtacttitta attaatcctg. tcaccatttt totgttctoa 7700 tagccagggg cittggcittat a agtatgaac taagcaaact aa attaa att gttittaagta 776 O ttttcccagg ctatcatatt ttaagctatt tactggtgca actatagatt attaataagt 782O tgtttctgag gaticaaaaca atcagactaa tcaatttctic aat aatgaat tggcctgtta 788 O gaggaataat totactaatc cittaaaacca citacaa.gaga tag accatgt atattittatt 794. O tatttttaaaaataagttta agatgtgatt tacatacaag aacattacta attttgt gtg 8 OOO tcccatittaa taagttttga caaatatatt tatttigt gita accac accac aaltcitaaata 8060 taggac gttt atatcaccac taaaagitttt tttcctgcctc citigagactat ttatagacac 812O aaatgcgtgt atttgcaaat gottagaaaa gqtctagaaa aaaaaac agt aaatgttaaa 818O gtggittatct tcagagagaa gaaagaagaa aagaagtgga tigg acatgaa acagtaaagg 824 O accctcattit tg gactittac aitatgtct.gt tittcttc cat tattttgaat aaacatgcta 8300 tatttataaa titatttacat ttaca agaaa atgaaacaaa atcaacacgc acatt caaga 8360 tcattatggt caagtact aa agtatgtgag agt gttaatg tccttagaat titggccacag 842O ttagctggtc ctactctgct ccaagccggt cctatttitgt gaattaatct catttgatgc 848 O caatttittat tacattctot coaaaaaact agtctdaaca gtttgctcitc. 
tccitcaagtt 854 O cacagaatta totctgotat atctatattt tattgagtat aagagaatta acccatgtaa 8600 gcto catgag g g tagggatt totcatcgtt ttgttcacca gtgttittct c atcttgaaga 8660 gitacat gaca attactgggc tcc cagtatc itat gitgitt gc attaat gaa a titt citta act 872O ttaatctacc toaaaatgtc. tctatottct to attctdtc ctitcctttct citatcagaaa 878O atgatggtcc tcttattttic caagttattc cggtcctgtg cccttgatcc catctcttct 884. O cactitcccct tcctitcctgc citccattcto citgtc.ccitta taaaaaacaa goaag accat 89 OO caattictato aagttatcat tatgtcactc tagttcttatc aacatattitt tag tattgaa 896 O gagggcttct tctactitact cotgaaccitt gtacaatgta gtttaggtot to atctttitt 9 O20 atcatagcta ccttatttaa agit cacccat ggcttittaat tgccaaattc aatggcctat 9 O8O cttcaccttt tgaaatgt gt tatgttcgtt accacagtct ccttgaaact cagtcccctg 914 O acttggactt ccataacaca atgatttctg attttccttc tgtttgtgat tgttcctttt 920 O gtoccaggca citggct acto cacct tccac citctotgaaa to attagcat tocccaagga 9260 ttcttcaaaa citctotttct tccttggaga agtcagdata gctittaattt g gaccatttc 932O tatggcttat ctagatttitt tcaggacttg ccttoaacct attctttctg taggtgatto 9380 US 2004/0146980 A1 Jul. 29, 2004 30

-continued cattaactgt tgcccatatg gtagtccgaa gacagacctc cgagaaatga cccttgtctc 944. O caaaactitcc gcaatatgtc. caaattitcct agcctgacat tcag actittg attatctg.cc 95 OO tccaagttta tatcctatca tattocttta tatattotgt tctccaggta cactgggaag 956 O cttgccattc ctgatcatag cctacaaact cittcctgcct cccactcacc ctcatctctg 962O ctgtcaaaat gcaaccttcc ctcaagagtoc atttcacagg acccctccttt citatgaagcc 968O ctcaggtgga aataatttitt tgccttttitt tccattittat ttttggagtg tttatggcat 974. O ttaacatacc ttactttgta tacaaatatt tgc?ttgctc cctcttttgc aaattitcitta 98OO aaggtagaga ccattgtatgttittcttcat atgttgctgg togcctaa.cag alactatogcc 986 O attgtccaca titcatttagc agcctttgta gittattgctt tgaggagctt cctct catga 992 O atgcccttgc tttctctccc a cagagtcat ccccctatat atgacctgac tgccatgaaa 998O gtgcct acto citatttgggc tiggtggacat gatgtc.citcg talacaccc.ca ggatgtggcc 20040 aggatact.cc citcaaatcaa gagtottcat tactittaa.gc tattgccaga ttggalaccac 20100 tittgattitltg tct ggggc Cit Cgat gCCC Cit Calac ggat gºt acagt galaat CatagCttta 210 16 0 atgaaggcat attcctaaat gcaatgcatt tacttttcaattaaaagttg cittccaagcc 20220 cata aggg ac tittagaaaaa atggtaac ca a caat gaggit tgt cc ccc ag cac cc tgggg 20 280 gagat g caca gtg gagtc:tg ttttc caagt ca att gitgitt agt gittattt at gitt tagag 20 34 0 acatctittgc atgggaccat ctacaggtcc ttataaacaa tgaggtagat taggcaaaaa 20 400 gattaaacaag titgctactct atctggcatt taagtctaat taaattgtaal tittttagggc 20460 ataccatgaa gtatagaaat gtctgaagct tcalaaggaac agt gaaattic citttaag gtc 20520 ctatatggaa acctctgttg tcattttatt tatatggatt gctatggcaa tggacagagt 20580 gtgggattag gaggagggcc tgtaacttct ttataaaagt ttcittagcta tcctgaagat 20640 gtatagac at ttttactttt ttaggtattt tcaacatcag aaattcaaaa aagitcccca a 20700 agatt?ttcc agagaagccc tctttt?tta caatcttatc cctggctatc tgcgtaaac g 20760 gaatcttgaa cccataatag gatacatgta taaaatcttic cittattaaag cagaaataaa 20820 ttgtacagca tcaatatcat tttataatca tagggaggct tctttgttta gcatgtaatg 20880 ccccctttac aggcttitttg ttctttgagg ggtttgaaca ttccatgaaa aactgacaga 20940 taggaaactg a 
caataaaag attgagctaa agatggaagc a gaaagtact aggctagata 21000 gtct citaaac attaag tatt ttc ttc ctcc atcittaaaag caatgagaag ccaccaaaat 21060 attittaccta atggaaacct gattgcc.gca tttttgtaac caccactittg gctgctacat 21120 agagaatgga ttagaagatg ccaacaaaag att?tgagca agtctgtaaa tot gatcaag 21 180 tglttct gat g c ag gct gata tccttc ttg tg cita agagaga t gatc ct tgg aaa at ccaga 2124 0 gccagotcca taatacttitc citgctctgct ggcaaatcca caagctgct g g cc cc to gag 21300 ccattcttct cticaaaacta gcattcat ca attitaat gita tacgtattga tggggaataa 21360 tggtoactat gaaaac catg tdataatatg gaaaaatacc catgatataa tottatgtga 21420 agagaagaaa atgaaiact gg tagalactatg tgatt gc aaa tatatacaa a tatta aaaca 21 480 attatatgac tittataaaat atttgtatat aatgaaaact gaa.gcaatat aaaaaataaa 21540 attagittg tg t ca ggg tagit a ac at gat ga gtg atta ata gttttta att tittaa tatag 21 6 0 0 taat gacata atgttacaac ttgtccaaat ctcacaaaca taatatt cag taaagga aga 21660 US 2004/0146980 A1 Jul. 29, 2004 31

-continued taaacatalaa aga atacata ttttattata cattittitatg tagg cita att gat g gittctg 21720 aaagccttaa aaagcttact tttaggagga gaatcat gcc ttggaggact ctagggtcca 21780 gaaaaatgtc ctaatactag agctaggtgc agtcagatta attataatac atttcattat 21840 ttt gtc tigga ataccaagat g acttccaag cagga atgga gtctagocaac actittactg a 21 90 0 tgggga actt ggc cacagac titgta ataca aatttttgga tatgttgaca atgtttctc c 21 9 6 0 ttatttttct tacttataca aagcaagaaa tttggcctcac aaccttgaaa cagacittacc 22020 aggttcctcc agtttcccaa gcctcaatat citcattgcta tttttaa 22O67

<210> SEQ ID NO 4
<211> LENGTH: 392
<212> TYPE: PRT
<213> ORGANISM: Homo sapiens
<400> SEQUENCE: 4

Met Arg Phe Leu Gly Leu Val Val Cys Leu Val Leu Trp Thr Leu His
1 5 10 15

Ser Glu Gly Ser Gly Gly Lys Leu Thr Ala Val Asp Pro Glu Thr Asn
20 25 30

Met Asn Val Ser Glu Ile Ile Ser Tyr Trp Gly Phe Pro Ser Glu Glu
35 40 45

Tyr Leu Val Glu Thr Glu Asp Gly Tyr Ile Leu Cys Leu Asn Arg Ile
50 55 60

Pro His Gly Arg Lys Asn His Ser Asp Lys Gly Pro Lys Pro Val Val
65 70 75 80

Phe Leu Gln His Gly Leu Leu Ala Asp Ser Ser Asn Trp Val Thr Asn
85 90 95

Leu Ala Asn Ser Ser Leu Gly Phe Ile Leu Ala Asp Ala Gly Phe Asp
100 105 110

Val Trp Met Gly Asn Ser Arg Gly Asn Thr Trp Ser Arg Lys His Lys
115 120 125

Thr Leu Ser Val Ser Gln Asp Glu Phe Trp Ala Phe Ser Tyr Asp Glu
130 135 140

Met Ala Lys Tyr Asp Leu Pro Ala Ser Ile Asn Phe Ile Leu Asn Lys
145 150 155 160

Thr Gly Gln Glu Gln Val Tyr Tyr Val Gly His Ser Gln Gly Thr Thr
165 170 175

Ile Gly Phe Ile Ala Phe Ser Gln Ile Pro Glu Leu Ala Lys Arg Ile
180 185 190

Lys Met Phe Phe Ala Leu Gly Pro Val Ala Ser Val Ala Phe Cys Thr
195 200 205

Ser Pro Met Ala Lys Leu Gly Arg Leu Pro Asp His Leu Ile Lys Asp
210 215 220

Leu Phe Gly Asp Lys Glu Phe Leu Pro Gln Ser Ala Phe Leu Lys Trp
225 230 235 240

Leu Gly Thr His Val Cys Thr His Val Ile Leu Lys Glu Leu Cys Gly
245 250 255

Asn Leu Cys Phe Leu Leu Cys Gly Phe Asn Glu Arg Asn Leu Asn Met
260 265 270

Ser Arg Val Asp Val Tyr Thr Thr His Ser Pro Ala Gly Thr Ser Val
275 280 285

Gln Asn Met Leu His Trp Ser Gln Ala Val Phe Gln Lys Phe Gln
290 295 300

Ala Phe Asp Trp Gly Ser Ser Ala Asn Tyr Phe His Tyr Asn Gln
305 310 315 320

Ser Tyr Pro Pro Thr Tyr Asn Val Asp Met Leu Val Pro Thr Ala
325 330 335

Val Trp Ser Gly Gly His Asp Trp Leu Ala Asp Val Asp Val Asn
340 345 350

Ile Leu Leu Thr Gln Ile Thr Asn Leu Val Phe His Ser Ile Pro
355 360

Glu Trp Glu His Leu Asp Phe Ile Trp Gly Leu Asp Ala Pro Trp
370 375

Leu Tyr Asn Ile Ile Asn Leu
385 390

That which is claimed is:

1. An isolated peptide consisting of an amino acid sequence selected from the group consisting of:
   (a) an amino acid sequence shown in SEQ ID NO:2;
   (b) an amino acid sequence of an allelic variant of an amino acid sequence shown in SEQ ID NO:2, wherein said allelic variant is encoded by a nucleic acid molecule that hybridizes under stringent conditions to the opposite strand of a nucleic acid molecule shown in SEQ ID NOS:1 or 3;
   (c) an amino acid sequence of an ortholog of an amino acid sequence shown in SEQ ID NO:2, wherein said ortholog is encoded by a nucleic acid molecule that hybridizes under stringent conditions to the opposite strand of a nucleic acid molecule shown in SEQ ID NOS:1 or 3; and
   (d) a fragment of an amino acid sequence shown in SEQ ID NO:2, wherein said fragment comprises at least 10 contiguous amino acids.

2. An isolated peptide comprising an amino acid sequence selected from the group consisting of:
   (a) an amino acid sequence shown in SEQ ID NO:2;
   (b) an amino acid sequence of an allelic variant of an amino acid sequence shown in SEQ ID NO:2, wherein said allelic variant is encoded by a nucleic acid molecule that hybridizes under stringent conditions to the opposite strand of a nucleic acid molecule shown in SEQ ID NOS:1 or 3;
   (c) an amino acid sequence of an ortholog of an amino acid sequence shown in SEQ ID NO:2, wherein said ortholog is encoded by a nucleic acid molecule that hybridizes under stringent conditions to the opposite strand of a nucleic acid molecule shown in SEQ ID NOS:1 or 3; and
   (d) a fragment of an amino acid sequence shown in SEQ ID NO:2, wherein said fragment comprises at least 10 contiguous amino acids.

3. An isolated antibody that selectively binds to a peptide of claim 2.

4. An isolated nucleic acid molecule consisting of a nucleotide sequence selected from the group consisting of:
   (a) a nucleotide sequence that encodes an amino acid sequence shown in SEQ ID NO:2;
   (b) a nucleotide sequence that encodes of an allelic variant of an amino acid sequence shown in SEQ ID NO:2, wherein said nucleotide sequence hybridizes under stringent conditions to the opposite strand of a nucleic acid molecule shown in SEQ ID NOS:1 or 3;
   (c) a nucleotide sequence that encodes an ortholog of an amino acid sequence shown in SEQ ID NO:2, wherein said nucleotide sequence hybridizes under stringent conditions to the opposite strand of a nucleic acid molecule shown in SEQ ID NOS:1 or 3;
   (d) a nucleotide sequence that encodes a fragment of an amino acid sequence shown in SEQ ID NO:2, wherein said fragment comprises at least 10 contiguous amino acids; and
   (e) a nucleotide sequence that is the complement of a nucleotide sequence of (a)-(d).

5. An isolated nucleic acid molecule comprising a nucleotide sequence selected from the group consisting of:
   (a) a nucleotide sequence that encodes an amino acid sequence shown in SEQ ID NO:2;
   (b) a nucleotide sequence that encodes of an allelic variant of an amino acid sequence shown in SEQ ID NO:2, wherein said nucleotide sequence hybridizes under stringent conditions to the opposite strand of a nucleic acid molecule shown in SEQ ID NOS:1 or 3;
   (c) a nucleotide sequence that encodes an ortholog of an amino acid sequence shown in SEQ ID NO:2, wherein said nucleotide sequence hybridizes under stringent conditions to the opposite strand of a nucleic acid molecule shown in SEQ ID NOS:1 or 3;
   (d) a nucleotide sequence that encodes a fragment of an amino acid sequence shown in SEQ ID NO:2, wherein said fragment comprises at least 10 contiguous amino acids; and
   (e) a nucleotide sequence that is the complement of a nucleotide sequence of (a)-(d).

6. A gene chip comprising a nucleic acid molecule of claim 5.

7. A transgenic non-human animal comprising a nucleic acid molecule of claim 5.

8. A nucleic acid vector comprising a nucleic acid molecule of claim 5.

9. A host cell containing the vector of claim 8.

10. A method for producing any of the peptides of claim 1 comprising introducing a nucleotide sequence encoding any of the amino acid sequences in (a)-(d) into a host cell, and culturing the host cell under conditions in which the peptides are expressed from the nucleotide sequence.

11. A method for producing any of the peptides of claim 2 comprising introducing a nucleotide sequence encoding any of the amino acid sequences in (a)-(d) into a host cell, and culturing the host cell under conditions in which the peptides are expressed from the nucleotide sequence.

12. A method for detecting the presence of any of the peptides of claim 2 in a sample, said method comprising contacting said sample with a detection agent that specifically allows detection of the presence of the peptide in the sample and then detecting the presence of the peptide.

13. A method for detecting the presence of a nucleic acid molecule of claim 5 in a sample, said method comprising contacting the sample with an oligonucleotide that hybridizes to said nucleic acid molecule under stringent conditions and determining whether the oligonucleotide binds to said nucleic acid molecule in the sample.

14. A method for identifying a modulator of a peptide of claim 2, said method comprising contacting said peptide with an agent and determining if said agent has modulated the function or activity of said peptide.

15. The method of claim 14, wherein said agent is administered to a host cell comprising an expression vector that expresses said peptide.

16. A method for identifying an agent that binds to any of the peptides of claim 2, said method comprising contacting the peptide with an agent and assaying the contacted mixture to determine whether a complex is formed with the agent bound to the peptide.

17. A pharmaceutical composition comprising an agent identified by the method of claim 16 and a pharmaceutically acceptable carrier therefor.

18. A method for treating a disease or condition mediated by a human lipase protein, said method comprising administering to a patient a pharmaceutically effective amount of an agent identified by the method of claim 16.

19. A method for identifying a modulator of the expression of a peptide of claim 2, said method comprising contacting a cell expressing said peptide with an agent, and determining if said agent has modulated the expression of said peptide.

20. An isolated human lipase peptide having an amino acid sequence that shares at least 70% homology with an amino acid sequence shown in SEQ ID NO:2.

21. A peptide according to claim 20 that shares at least 90 percent homology with an amino acid sequence shown in SEQ ID NO:2.

22. An isolated nucleic acid molecule encoding a human lipase peptide, said nucleic acid molecule sharing at least 80 percent homology with a nucleic acid molecule shown in SEQ ID NOS:1 or 3.

23. A nucleic acid molecule according to claim 22 that shares at least 90 percent homology with a nucleic acid molecule shown in SEQ ID NOS:1 or 3.

* * * * *
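By way of illustration only, the percentage thresholds recited in claims 20-23 and the "at least 10 contiguous amino acids" limitation of the fragment claims can be checked computationally. The Python sketch below is not taken from this application. It assumes that the two sequences have already been aligned by some external tool (with "-" marking gaps), that simple percent identity over aligned columns is an acceptable stand-in for "homology", and that sequences are written in one-letter code. The function names `percent_identity` and `is_contiguous_fragment` and the toy sequences are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over the columns of a pre-computed pairwise alignment.

    Assumption: both strings are the same length and use '-' for gaps.
    Columns where both sequences have a gap are ignored; a gap opposite
    a residue counts as a mismatch.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [
        (a, b)
        for a, b in zip(aligned_a.upper(), aligned_b.upper())
        if not (a == "-" and b == "-")
    ]
    if not columns:
        raise ValueError("no aligned columns to compare")
    matches = sum(1 for a, b in columns if a == b and a != "-")
    return 100.0 * matches / len(columns)


def is_contiguous_fragment(fragment: str, reference: str, min_len: int = 10) -> bool:
    """True if `fragment` has at least `min_len` residues and occurs as an
    uninterrupted substring of `reference`."""
    return len(fragment) >= min_len and fragment.upper() in reference.upper()


if __name__ == "__main__":
    # Toy sequences for demonstration only; they are not from the sequence listing.
    reference = "ACDEFGHIKLMNPQ"
    print(percent_identity("ACDEFGHIKL", "ACDEFGHIKV"))      # 9 of 10 columns match
    print(is_contiguous_fragment("DEFGHIKLMN", reference))   # 10 contiguous residues
    print(is_contiguous_fragment("DEFGHIKLM", reference))    # only 9 residues
```

The result depends on how the alignment was produced and on how gapped columns are counted, so a figure computed this way is only indicative of whether a candidate sequence meets a recited threshold.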