US 20040078837A1 (19) United States (12) Patent Application Publication (10) Pub. No.: US 2004/0078837 A1 Shannon et al. (43) Pub. Date: Apr. 22, 2004

(54) FOUR HUMAN ZINC-FINGER-CONTAINING PROTEINS: MDZ3, MDZ4, MDZ7 AND MDZ12

(76) Inventors: Mark E. Shannon, Livermore, CA (US); Yizhong Gu, Cupertino, CA (US); Cung-Tuong Nguyen, San Jose, CA (US)

Correspondence Address: FISH & NEAVE, 1251 AVENUE OF THE AMERICAS, 50TH FLOOR, NEW YORK, NY 10020-1105 (US)

(21) Appl. No.: 09/922,181

(22) Filed: Aug. 2, 2001

Publication Classification

(51) Int. Cl.7 ...... A01K 67/027; C07H 21/04; C12N 9/64; C12P 21/02; C12N 5/06

(52) U.S. Cl. ...... 800/18; 435/69.1; 435/226; 435/320.1; 435/354; 536/23.2

(57) ABSTRACT

The invention provides isolated nucleic acids that encode MDZ3, MDZ4, MDZ7 and MDZ12, and fragments thereof, vectors for propagating and expressing MDZ3, MDZ4, MDZ7 and MDZ12 nucleic acids, host cells comprising the nucleic acids and vectors of the present invention, proteins, fragments, and protein fusions of the novel MDZ3, MDZ4, MDZ7 and MDZ12 isoforms, and antibodies thereto. The invention further provides transgenic cells and non-human organisms comprising human MDZ3, MDZ4, MDZ7 and MDZ12 nucleic acids, and transgenic cells and non-human organisms with targeted disruption of the endogenous orthologue of the human MDZ3, MDZ4, MDZ7 or MDZ12 gene. The invention further provides pharmaceutical formulations of the nucleic acids, proteins, and antibodies of the present invention, and diagnostic, investigational, and therapeutic methods based on the MDZ3, MDZ4, MDZ7 or MDZ12 nucleic acids, proteins, and antibodies of the present invention.

url:)No.. Patent Application Publication Apr. 22, 2004 Sheet 4 of 32 US 2004/0078837 A1 nt : SEQ ID NO: 1 aa: SEQ ID NO: 3 1. c gga tag CaC togg Cda CCC tag C9g gtg aga ggC CCt 38 toa ggg ccg cqg Cogg gtt gag cqC acc atc acatct aag 77 cca toa gca agt ttgttg gtt tta atc. tcc aaa ata C9t 116 ctt gat ttt g to toga Ctc ttt gCC acc acc Ctg atC taa 155 goc ctt atc atc togc titgaat cac taa ctt gtc. tcc act 194 tcc agt ttt taa aag agt togc titc cat ttg act ttt tot 233 g to togC tdt acc aac ata ta gtt toa gga ggg gtC att 272 gat gCa gtC att CtC agt CtC CtC gga ggg agt Ctg aag M L K E H P E M A E A P Q 311 ATG CTT AAA GAG CAT CCA GAG ATG GCG GAA GCT CCT CAG O O L G I P W W K L E K. E 350 CAG CAG TTG GGT ATT CCT GTG GTG AAA CTG GAGAAA GAG L P W G R G R E D P S P E 389 TTG CCA TGG GGC AGA GGA AGG GAG GAC CCT AGT CCA GAG T F R L R F R Q F R Y O E 428 ACT TTT CGG CTG AGG TTT CGG CAG TTC CGC TAC CAG GAG A A G P Q E A L R G L Q E 467 GCA GCT GGA CCC CAG GAA GCT. CTT AGG GGG CTC CAG GAG L C R R W L R P E L H T K 506 CTC TGT CGT CGG TGG CTG AGG CCC GAG TTG CAC ACC AAG E Q I L E L L W L E O F L 545 GAG CAG ATC CTG GAG CTG CTG GTG CTG GAG CAG TTC CTC T I L. P R E F Y A W I R E 584 ACT ATC CTG CCC CGC GAG TTC TAC GCC TGG ATC CGG GAG H G P E S G. K. A. L. A. A M W 623 CAT GGC CCA GAG AGT GGC AAG GCC CTG GCC GCC ATG GTG FIG. 3A Patent Application Publication Apr. 22, 2004 Sheet 5 of 32 US 2004/0078837 A1

        E   D   L   T   E   R   A   L   E   A   K   A   V
  662  GAG GAC CTG ACA GAA AGA GCA CTG GAG GCC AAG GCG GTT
        P   C   H   R   Q   G   E   Q   E   E   T   A   L
  701  CCA TGC CAC AGG CAG GGA GAG CAG GAG GAA ACA GCA CTT
        C   R   G   A   W   E   P   G   I   Q   L   G   P
  740  TGC AGA GGC GCT TGG GAG CCA GGC ATC CAG CTG GGG CCA
        V   E   V   K   P   E   W   G   M   P   P   G   E
  779  GTG GAG GTT AAG CCT GAA TGG GGG ATG CCC CCT GGG GAA
        G   V   Q   G   P   D   P   G   T   E   E   Q   L
  818  GGA GTT CAA GGT CCA GAC CCA GGT ACC GAG GAG CAG CTC
        S   Q   D   P   G   D   E   T   R   A   F   Q   E
  857  AGT CAG GAC CCT GGA GAT GAG ACA CGG GCC TTC CAG GAG
        Q   A   L   P   V   L   Q   A   G   P   G   L   P
  896  CAA GCA CTA CCT GTT CTG CAG GCG GGT CCT GGC CTC CCC
        A   V   N   P   R   D   Q   E   M   A   A   G   F
  935  GCA GTG AAT CCC AGA GAC CAA GAG ATG GCA GCT GGG TTC
        F   T   A   G   S   Q   G   L   G   P   F   K   D
  974  TTT ACT GCT GGA TCG CAG GGG TTG GGG CCA TTT AAA GAT
        M   A   L   A   F   P   E   E   E   W   R   H   V
 1013  ATG GCC CTG GCC TTC CCT GAG GAG GAG TGG AGG CAT GTG
        T   P   A   Q   I   D   C   F   G   E   Y   V   E
 1052  ACC CCA GCC CAG ATA GAC TGC TTT GGG GAG TAT GTG GAA
        P   Q   D   C   R   V   S   P   G   G   G   S   K
 1091  CCG CAG GAC TGC AGG GTC TCT CCA GGC GGT GGG AGC AAG
        E   K   E   A   K   P   P   Q   E   D   L   K   G
 1130  GAA AAG GAG GCA AAA CCC CCA CAG GAA GAC CTG AAA GGG
        A   L   V   A   L   T   S   E   R   F   G   E   A
 1169  GCG CTG GTG GCA CTG ACA TCA GAG AGG TTT GGG GAA GCC
        S   L   Q   G   P   G   L   G   R   V   C   E   Q
 1208  TCT CTC CAG GGC CCT GGG CTC GGA AGG GTC TGT GAG CAG
        E   P   G   G   P   A   G   S   A   P   G   L   P
 1247  GAG CCT GGT GGC CCT GCA GGC AGT GCG CCT GGG CTT CCT

FIG. 3B

P P Q H G A I P L P D E V 1286 CCT CCC CAG CAC GGT GCC ATC CCC CTG CCT GAC GAA GTC K T H S S F W K P F O C P 1325 AAA ACC CAC AGC TCC TTC TGG AAG CCT TTC CAG TGC CCT E C G K G F S R S S N L W 1364 GAG TGT GGG AAA GGA TTC AGT CGG AGC TCC AAT CTC GTC R H Q R T H E E K S Y G C 1403 AGG CAC CAG CGA ACC CAC GAA GAG AAG TCT TAT GGC TGT W E C G K G F T L R E Y L. 1442 GTG GAG TGT GGG AAG GGC TTT ACC CTG AGA GAG TAC CTG M K H Q R T H L G K R P Y 1481 ATG AAG CAC CAG AGA ACC CAC CTG GGA AAG AGG CCC TAC W C S E C W K T F S Q R H 1520 GTG TGC AGC GAG TGC TGG AAA ACC TTC AGC CAG AGA CAC H L E W H Q R S H T G E K 1559 CAC CTG GAG GTG CAC CAG CGC AGC CAC ACT GGG GAG AAG P H K C G D C W K S F S R 1598 CCC CAC AAG TGC GGG GAC TGC TGG AAG AGC TTC AGC CGC R. Q H L Q W H R R T H T G 1637 AGG CAG CAC CTG CAG GTG CAC CGG AGG ACG CAC ACC GGG E K P Y T C E C G K S F S 1676 GAG AAG CCC TAC ACC TGC GAG TGT GGC AAG AGC TTC AGC R N A N L, A W H R R A H T 1715 AGG AAT GCC AAT CTG GCG GTG CAC CGG CGT GCC CAC ACT G E K P Y G C Q V C G K R 1754 GGC GAG AAG CCA TAT GGG TGC CAG GTG TGC GGG AAG CGG F S K G E R L V R H Q R I 1793 TTC AGC AAA GGG GAG CGG CTG GTC CGA CAC CAG AGA ATC H T G E K P Y H C P A C G 832 CAT ACA GGG GAG AAG CCC TAC CAC TGT CCT GCC TGC GGG R S F N O R S I L N R H Q 1871 CGA AGC TTC AAC CAG AGG TCC ATC CTC AAC CGG CAC CAG FIG. 3C Patent Application Publication Apr. 22, 2004 Sheet 7 of 32 US 2004/0078837 A1 K T Q H R Q E P L V Q * 1910 AAG ACC CAG CAC CGC CAG GAG CCG CTG GTG CAG TGA goa 1949 tag cag gtg gCaggC agc act atC att Cat Ctt Cogg ata 1988 gCa ctd gcd acc Cta gCd ggt gag agg CCC ttC agg gCC 2027 gog gCd ggt toga gCd CaC Cat CaC at C taa gCC atc agC 2066 aag ttt gtt ggt ttt aat Ctc caa aat acg tot toga titt 2105 tot ctg act ctt togc cac cac CCt gat cta agc Cct tat 2144 cat ctg ctt gaa toa cta act togt Citc Cac ttg Cag titt 2183 tta aaa gag ttg ctt CCattt gac ttt ttctgtctg. 
Ctg 2222 tac Caa Cat atg agt ttc agg agg ggit Cat toga togC agt 2261 cat tot Cag tot CCt Cgg agg gag tot gaa gat gct taa 2300 aga gCatcC aga gat ggC gga agc toc toa gCa gCa gtt 2339 ggg tat tcc togt get gaa act gga gaa aga gtt gCC atg 2378 gog Cag agg aag gga gqa CCC tag toC aga gaC ttt tog 2417 gCt gag gtt tog gca gtt CC9 Cta Cca gga goC agC togg 2456 acc CCagga agc tict tag ggg gct CCagga gCt Ctg tog 2495 tog gtg gCt gag gCC Coga gtt gCa CaC Caa gaga gCa gat 2534 CCt gga gct gct ggt gct gga gCa gtt CCt cac tat cot 2573 gCC CC9 C9a gtt Cta CGC Ctg gat CC9 gga gCa togg CCC 2612 aga gag togg Caa ggC CCt ggC C9C Cat ggt gga gga CCt 2651 gaC aga aag agc act gga ggC caa ggC ggit toC atg CCa 2690 Cag gCaggg aga gCagga gga aac agc act ttg cag agg 2729 CgC ttg gga gCC agg Cat CCa gCt ggg gCC agt gga ggit 2768 taa gCC toga atg ggg gat gCC CCC togg gga agg agt toa 2807 agg toC aga CCC agg tac C9a gga gCa gct cag toa gga 2846 CCC togg agatga gaC acg ggC Ctt CCagga gCa agc act FIG. 3D Patent Application Publication Apr. 22, 2004 Sheet 8 of 32 US 2004/0078837 A1 2885 acc togt tot gca ggC ggg toC togg CCt CCC CdC agt gaa 2924 tcc cag aga Cca aga gat ggC agc togg gtt Ctt tac togC 2963 tgg atC gCaggg gtt gogg gcc att taa aga tat ggc cct 3002 goC Ctt CCC toga gga gga gtg gag gCa tdt gaC CCC agc 3041 CCa gat aga Ctg Ctt tog gaga gta tot goa acC gCa gga 3080 Ctg Cag got Ctc. tcc agg C9g togg gag Caa goa aaa gga 3119 goC aaa acc CCC aca gaga aga. 
CCt gaa agg ggC gCt ggt 3158 goC act gaC at C aga gag gtt togg goga agc ctC tot CCa 31.97 ggg CCC togg gCt C9g aag gigt Citg toga gCa gga gCC togg 3236 togg CCC togC agg Cag togC gCC togg gCt toc toc toc CCa 3275 gCa C9g togC Cat CCC CCt gCC toga Coga agt caa aac CCa 3314 Cag Ctc Ctt Ctg gaa gCC titt CCa gtg CCC toga gtg togg 3353 gaa agg att Cag tog gag CtC Caa tot Cdt cag gCa cca 3392 gCg aac CCa Cqa aga gaa gtC tta togg Ctg togt ga gtg 3431 togg gaa ggg Ctt tac CCt gag aga gta Cct gat gaa gCa 3470 CCa gag aac CCa Cct gogg aaa gag gCC Cta Cdt gtg Cag 3509 Cqa gtg Ctg gaa aac Ctt Cag CCa gag aca CCa cot gga 3548 ggt gCa Cca gCd Cag CCa Cac togg gga gaa gCC cca caa 3587 gtg Cog gga Ctg Ctg gaa gag Ctt Cag CCG Cag gCa gCa 3626 cct gCaggit gca CC9 gag gaC gCa Cac C9g gga gaa gCC 3665 Cta Cac Ctg C9a gtg togg Caa gag Ctt cag cag gaa togc 3704 Caa tot goC ggit gca CC9 gCd togC CCa cac togg cqa gaa 3743 gCC ata togg gtg cca ggt gtg cqg gaa gCd gtt cag caa 3782 agg gga gC9 gct ggt CC9 aca Cca gag aat CCa tac agg 3821 ga gaa gCC Cta Cca Ctg toc togc ctd cqg gog aag ctt 3860 Caa Cca gag gtc. Cat CCt caa cog gCa coa gaa gac coa FIG. 3E Patent Application Publication Apr. 22, 2004 Sheet 9 of 32 US 2004/0078837 A1 3899 gCa CC9 CCagga gCC gct ggit gca gtg agc ata gca got 3938 ggc agg cag cac cat cat tca tot t FIG. 3F
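The relationship between the codon rows and the amino-acid rows of FIG. 3C can be checked mechanically. The following Python sketch is illustrative only and is not part of the disclosure: it translates three codon rows of FIG. 3C (nt 1325 to 1441) with the standard genetic code and then searches the result for the Kruppel (C2H2)-type backbone. The pattern CX2CX3FX5LX2HX3H is an assumed reading of the consensus given in the Background, whose subscripts are lost in this text; the names CODON_TABLE, translate, FIG_3C_ROWS and C2H2 are hypothetical.

```python
import re
from itertools import product

# Standard genetic code in the conventional T, C, A, G codon ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(codon_rows):
    """Translate whitespace-separated codon rows into one-letter residues."""
    codons = " ".join(codon_rows).upper().split()
    return "".join(CODON_TABLE[c] for c in codons)


# Codon rows as printed in FIG. 3C, starting at nt 1325, 1364 and 1403.
FIG_3C_ROWS = [
    "AAA ACC CAC AGC TCC TTC TGG AAG CCT TTC CAG TGC CCT",
    "GAG TGT GGG AAA GGA TTC AGT CGG AGC TCC AAT CTC GTC",
    "AGG CAC CAG CGA ACC CAC GAA GAG AAG TCT TAT GGC TGT",
]

# Assumed consensus: C-X2-C-X3-F-X5-L-X2-H-X3-H, X being any residue.
C2H2 = re.compile(r"C.{2}C.{3}F.{5}L.{2}H.{3}H")

protein = translate(FIG_3C_ROWS)
# Each hit is (zero-based start index within the translated segment, matched residues).
fingers = [(m.start(), m.group()) for m in C2H2.finditer(protein)]

print(protein)
print(fingers)
```

Under these assumptions the three rows contain one complete finger. Reading residues from the codons also resolves two characters that the text extraction renders ambiguously in the amino-acid rows: the residue shown as O in the first row translates as Q (CAG), and the final residue of the second row, shown as W, translates as V (GTC).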

ElNo.,No.ti Patent Application Publication Apr. 22, 2004 Sheet 12 of 32 US 2004/0078837 A1 nt : SEQ ID NO: 3027 aa: SEQ ID NO: 3029 1 gtt gtC aag gat toa gag. Cag atg toga tCt gCg Cta 40 goC atC tCC tCt CaC gga tgc Ctc gat CtC ggg gtt tCC 79 aga aga alala gat Caa 999 aga att at C aag aat aga tta M A I T 11.8 titt ttC tga ata gtt cal CCt ttg ATG GCC ATA ACC TTG T L O T A E M Q E G L L 157 ACC CTT CAG ACT GCA GAG ATG CAG GAA GGA CTT CTG GCA W K W K. E E E E E H S C 196 GTG AAG GTA AAG GAG GAA GAG GAG GAA CAT TCC TGT GGG P E S G L. S R N N P H T 235 CCA GAA TCA GGC CTG TCA AGA AAT AAC CCT CAT ACC AGA E F R R F R Q F C Y 274 GAG ATC TTT CGT AGA CGC TTC AGG CAG TTC TGC TAT CAG E S P G P R E A L. Q R L 313 GAG TCC CCT GGG CCC CGG GAG GCT. CTT CAA AGA CTC CAG E L C H Q W I R P E M H 352 GAG CTC TGC CAT CAG TGG CTG AGA CCA GAG ATG CAC ACC K E Q I L E L L W L E Q 391 AAG GAG CAG ATC CTA GAG CTG CTG GTG CTG GAG CAG TTC L T I L P E E L. Q A W W 430 CTG ACT ATC CTG CCT GAG GAG CTC CAG GCC TGG GTC AGA Q H R P V S G E E A W T 469 CAG CAC CGT CCT GTG. AGT GGA GAG GAG GCA GTG ACT GTG L E D L E R E L D D P G 508 CTG GAG GAT TTG GAG. AGA GAG CTG GAT GAC CCA GGA GAG FIG.6A Patent Application Publication Apr. 22, 2004 Sheet 13 of 32 US 2004/0078837 A1

        Q   V   L   S   H   A   H   E   Q   E   E   F   V
  547  CAG GTC CTG AGC CAT GCT CAT GAA CAG GAA GAG TTT GTA
        K   E   K   A   T   P   G   A   A   Q   E   S   S
  586  AAG GAG AAG GCA ACT CCA GGA GCA GCT CAG GAG TCA TCA
        N   D   Q   F   Q   T   L   E   E   Q   L   G   Y
  625  AAT GAC CAA TTC CAA ACC TTG GAA GAG CAG CTT GGG TAT
        N   L   R   E   V   C   P   V   Q   E   I   D   G
  664  AAT TTG CGA GAG GTG TGC CCA GTT CAA GAG ATT GAT GGC
        K   A   G   T   W   N   V   E   L   A   P   K   R
  703  AAG GCT GGG ACT TGG AAT GTG GAG TTA GCC CCA AAG AGG
        E   I   S   Q   E   V   K   S   L   I   Q   V   L
  742  GAG ATT TCT CAG GAA GTG AAA TCT CTT ATA CAA GTT CTT
        G   K   Q   N   G   N   I   T   Q   I   P   E   Y
  781  GGA AAA CAG AAT GGT AAT ATT ACT CAG ATT CCT GAG TAT
        G   D   T   C   D   R   E   G   R   L   E   K   Q
  820  GGA GAT ACC TGT GAC CGT GAG GGC AGA TTG GAA AAG CAA
        R   V   S   S   S   V   E   R   P   Y   I   C   S
  859  AGG GTG AGC TCT TCA GTG GAG AGA CCC TAT ATC TGT AGT
        E   C   G   K   S   F   T   Q   N   S   I   L   I
  898  GAA TGT GGA AAA AGC TTC ACC CAG AAT TCC ATC CTT ATC
        E   H   Q   R   T   H   T   G   E   K   P   Y   E
  937  GAG CAC CAG AGA ACA CAC ACA GGT GAG AAG CCT TAT GAA
        C   D   E   C   G   R   A   F   S   Q   R   S   G
  976  TGT GAT GAG TGT GGG CGG GCC TTC AGC CAG AGG TCA GGC
        L   F   Q   H   Q   R   L   H   T   G   E   K   R
 1015  CTA TTC CAG CAC CAG AGA CTC CAC ACT GGG GAG AAG CGC
        Y   Q   C   S   V   C   G   K   A   F   S   Q   N
 1054  TAC CAG TGC AGT GTT TGT GGC AAA GCC TTC AGC CAG AAT
        A   G   L   F   H   H   L   R   I   H   T   G   E
 1093  GCC GGG CTT TTC CAT CAC CTC AGA ATT CAC ACT GGG GAG

FIG. 6B

        K   P   Y   Q   C   N   Q   C   N   K   S   F   S
 1132  AAG CCT TAC CAG TGC AAT CAG TGC AAT AAG AGT TTT AGT
        R   R   S   V   L   I   K   H   Q   R   I   H   T
 1171  CGA CGT TCA GTC CTC ATT AAG CAT CAG AGA ATT CAC ACT
        G   E   R   P   Y   E   C   E   E   C   G   K   N
 1210  GGA GAG AGA CCT TAT GAA TGT GAA GAA TGT GGC AAG AAC
        F   I   Y   H   C   N   L   I   Q   H   R   K   V
 1249  TTC ATT TAC CAT TGC AAC CTA ATC CAG CAT CGG AAA GTC
        H   P   V   A   E   S   S
 1288  CAC CCA GTG GCT GAA TCA AGC TAG ctc ctt gga aca ggt
 1327  agg

FIG. 6C

ZT I“OTEZOOOVI [][] Patent Application Publication Apr. 22, 2004 Sheet 17 of 32 US 2004/0078837 A1 nt : SEQ ID NO: 4407 aa: SEQ ID NO: 4409 1 CC togt tCC CgC tgC CCC tCG gCC tgg CaC Cag gag 39 tac toa 9ag CtC aaa gCt ggg atC tgC agt CCC tita CCC 78 act cag tgC aCg CCG CCt aag gCt ttg CGC ttC acC titt 117 act CaC CtC gaa gCC Ctg gaC atc CgC atc tgC CCt aag

156 act tct CaC CtC agt agC aga agg aag tog Cgt Cag Ctg 195 gcc aca gCC tCt CtC Cta gga gaC Cdt CCG gga aaa 234 agt cag ggt aga CCC toga 99C CCC tda gCt CCg gct att 273 ttc aga tCt gtC gct cot tCa CCC toa gCC ttt Caa a Ca 312 99C CaC tCC alala aaa aag CCC aat CaC agc Ctt CCt tct 351 tot CCt 99C Ctt CCg gCa Ctg toc aat Caa Cot acg CCa 390 tot atC 99a titt toa gtt CCC alala CCC gct ttt atc. tcg 429 gtg gaa 99a gaa gtg gag. 9C9 tgg agC CC9 gag gCC 468 Cag gat CCC gaC ggt 9ag agC tCt gCa gCt ttC agC agg 507 ggC Caa Cag gaa gCa gga toC agg gat ggg aat gag 546 aag aag agg Ctg aag aag tot CCa alala Cala aaa. Gag 585 gtg gaa gtg gCt gtC aag gag togg tgg CCC agC 624 gtC gCC CCa gag ttC tgC aac CCt agg Cag agC CCC M. N. L. K. D T L T R. R. L 663 ATG AAT TGG CTC AAG GAC ACT CTG ACC CGA AGA CTG P H P D C G R. N. F S Y 703 CCC CAC TGC CCA GAC TGT GGC CGC AAC TTC AGC TAC P S A S H Q R W H S G 742 CCT TCC CTG GCC AGC CAC CAG CGG GTC CAC TCC GGG FIG. 9A Patent Application Publication Apr. 22, 2004 Sheet 18 of 32 US 2004/0078837 A1

E R P F S C G O C Q A R F 781 GAG CGG CCC TTC TCC TGC GGC CAG TGT CAG GCG CGT TTC S Q R R Y L L Q H Q F I H 820 TCC CAG CGC AGG TAC CTG CTC CAG CAT CAG TTC ATC CAC T G E K P Y P C P D C G R 859 ACC GGC GAG AAG CCC TAC CCC TGC CCC GAC TGC GGG CGC R F R Q R G S L A I H R R 898 CGC TTC CGC CAG AGG GGT TCC CTG GCT ATC CAC AGG CGG A H T G E K P Y A C S D C 937 GCT CAC ACC GGG GAG AAG CCT TAC GCG TGC TCA GAC TGC K S R F T Y P Y L L A I H 976 AAG AGT CGC TTC ACT TAC CCC TAC CTG CTG GCC ATC CAC Q R K H T G E K P Y S C P 1015 CAG CGC AAG CAC ACG GGC GAG AAG CCC TAC AGC TGC CCC D C S L R F A Y T S L L A 1054 GAT TGC AGC CTC CGT TTC GCC TAC ACC TCC CTG CTG GCC I H R I H T G E K P Y P 1093 ATC CAC AGG CGC ATA CAC ACC GGC GAG AAG CCC TAC CCC C P D G R R F T Y S S L 1132 TGT CCT GAC TGC GGC CGC CGC TTC ACC TAT TCT TCC CTC L. L. S H R R I H S S R P 1171 CTC CTC AGT CAC CGG CGC ATT CAC TCC GAC AGC CGG CCC F P C W E C G K G K R K 1210 TTC CCC TGC GTG GAG TGT GGG AAA GGC TTC AAG CGC AAG T A L E A H R W I H R S C 1249 ACC GCC CTG GAA GCC CAT CGG TGG ATC CAC CGC TCC TGC S E R R A W Q Q A W W G R 1288 AGC GAG AGG CGC GCG TGG CAG CAG GCC GTG GTG GGG CGT S E P I P W L G G K D P P 1327 TCA GAG CCC ATC CCT GTT TTG GGA GGC AAG GAT CCC CCA W H F R H F P D I F Q E C 1366 GTT CAC TTC CGG CAC TTT CCA GAT ATA TTT CAA GAG TGT G 1405 GGG TGA tgg Cdt toa CaC ada Ctg gtC agC gtt tCC Ctg FIG. 9B Patent Application Publication Apr. 22, 2004 Sheet 19 of 32 US 2004/0078837 A1 1443 gag agg aag agg caa gat att togC atgttc Cct gga titt 1482 togt att ttt toga taa aga tat att Ctt gagg CCa Cag tag 1521 ctg gag ata taa togC C9g agg att Ctt ttt ttt ttt ttt 1560 ttg aga Cag agt Ctg tot Cta ttg CCt ggg Ctg gag togC 1599 agt gcc cca agc tac gCt Cac togC aag Ctc CaC CtC Ctg 1638 got toa cac cat tot CCt gct tca gtc. tcc cqa gta gct 1677 goa attaca agc acc C9C Cac CaC gCC Caa cta ata titt 1716 togt att ttt agt aga gaC ggg got ttc acC gtg tta gCC 1755 agg atg gtc. 
tcg atC toC toga Ctt Cogt gat CCt COC gCC 1794 tcg gCC toC Caa agt gct ggg attaca gC gtg agc CaC 1833 togC acC cag CCt Ctt ttt ttt ttt gag atg gag ttt cqC 1872 tot togt togC CCaggC tag agt gca atg gCa toga tot togg 1911 ctC act gCa acc toc gCC toc tag gtt caa gCd att cto 1950 Ctg CCC cag CCt Ctt gag tag Ctg gaga tta cag gCa ccc 1989 acq acc atg cct goC taa ttg cat ttt tac tag aga cag 2028 gtt toa Cca tot togg CCaggc togg tot coa att cot gac 2067 CtC agg toga toC acc Coga ctt ggC CtC CCa aag ttctgg 2106 gat tac att ttt ttt tta aag aaa gaataa attaat tdt 2145 gat taa agt toga aat Caa ggC ata gtt aaa aaa aaa aaa 2184 aaa aaa aaa aaa aaa ninc ctg ttc ccg ctg ccc ctc ggg 2223 Ctg gCa Ctg CCagga gta CtC aga gct caa agc tigg gat 2262 Ctg Cag toc Ctt acc Cac toa gtg cac goc goc taa ggc 2301 ttt gC9 Ctt CaC Ctt tac toa Cct cqa agc cct gga cat 2340 cc.g. Cat Ctg CCC taa gac ttc. tca cot cag tag cag aag 2379 gaa gtC gCg tda gCt ggC Cac agc cto tct cot agg aga 2418 cc.g. tcc ggg aaa agc gag toa gag tag acc ctg agg ccc FIG.9C Patent Application Publication Apr. 22, 2004 Sheet 20 of 32 US 2004/0078837 A1 2457 ctc agc toc ggc tat ttt Cag at C tdt C9C to C ttC acc 2496 ctc agc ctt toa aac agg CCaCtC Caa aaa aaa gCC Caa 2535 toa cag cot toc ttc ttc. 
tcc togg CCt toC ggC act gtc 2574 caa toa acg tac goc atc tat cqg att ttc agt toC Caa 2613 acc cqC ttt tat Ctc gtg ggt gga agg aga agt gga ggC 2652 gtg gag ccc ggaggC CCagga toC C9a C9g toga gag CtC 2691 toc agc titt Cag Cag ggg CCa agg aca gga agC agg atc 2730 cag gga togg gaa toga gaa gaa ga aag gCt gaa gaa gtg 2769 toC aaa aca aaa aga ggt goC gCa toga agt ggC tot Caa 2808 gga gtg gtg gCC Cag Cdt CGC Ctg CCC aga gtt Ctg Caa 2847 ccc tag gCa gag CCC cat gaa toC Ctg gct Caa goa CaC 2886 tot gac cog aag act gCC CCa ctC ttg CCC aga Ctg togg 2925 ccg caa ctt cag cta CCC titC cct CCt ggC Cag CCaCCa 2964 gCd ggt CCa Ctc C9g gga gC9 gCC Ctt CtC Ct9 C99 CCa 3003 gtg tda ggC gCg titt CtC CCa gCd Cag gta CCt gCt CCa 3042 goa toa gtt cat CCa cac cqg Coga gaa gCC Cta CCC Ctg 3081 CCC Coga Ctg C9g gCd CCG Ctt CCG CCa gag gagg ttc. CCt 3120 ggC tat CCa Cag gCd ggC toa CaC C9g gga gaa gCC tta 3159 cogc gtg Ctc aga Citg Caa gag tog Ctt CaC tta CCC cta 31.98 cct gct ggC Cat CCaCCa gCd Caa gCa CaC ggg Cda gaa 3237 goc cta cag ctg. CCC Coga ttg cag cot CCG ttt coc cta 3276 cac ctC cct gct ggC Cat CCa Cag gCd cat aca cac cqg 3315 Coga gaa gCC Cta CCC ctg. tcc toga Ctg Cogg CCG cog ctt 3354 cac cta ttc ttc cot CCt cot cag toa cog gCd cat toa 3393 CtC C9a Cag CCG gCC Ctt CCC ctg. Cdt gga gtg togg gaa 3432 agg Ctt Caa gC9 Caa gaC CgC CCt gga agc CCatcg gtg FIG. 9D Patent Application Publication Apr. 22, 2004 Sheet 21 of 32 US 2004/0078837 A1 3471 gat coa CCG ctc Ctg Cag C9a gag gC9 CgC gtg gCa gCa 3510 ggc cgt ggit gagg gC9 ttC aga gCC Cat CCC togt ttt ggg 3549 agg caa goa toc CCC agt toa Ctt CC9 gCa Ctt to C aga 3588 tat att toa aga gtg togg gtg atg gC9 ttC aca Caa act 3627 ggt cag Cdt ttc Cct gaga gag gaa gag gca aga tat ttg 3666 cat gtt coc togg att ttg tat ttt ttgata aag ata tat 3705 tot togg gcc acagta gCt gga gat ata atg CC9 gag gat 3744 tct ttt ttt ttt ttt ttt gag aca gag tot gtc. 
tct att 3783 gCC togg gCt gga gtg Cag togg CCC aag Cta C9C toa Ctg 3822 caa gCt coa cot CCt ggg ttc aca CCattc. tcc togC titC 3861 agt cto CCG agt agc tigg aat tac aag CaC CC9 CCaCCa 3900 cqC cca act aat att ttg tat ttt tag tag aga C9g ggg 3939 ttt cac cqt gtt agc Cag gat got Ctc gat Ctc Ctg act 3978 tog toga toc toc cgc ctc ggc Ctc cca aag togC togg gat 4017 tac agg cqt gag CCactg. cac CCa gCC tot ttt ttt ttt 4056 toga gat goga gtt tog CtC ttgttg CCC agg Cta gag togC 4095 aat goc atg atc ttg gCt cac togcaac ctC cqC Ctc cta 4134 got toa agc gat tot CCt gCC CCa gCC tot tga gta gCt 4173 ggg attaca ggC acc CaC gaC cat gCC togg Cta att gCa 4212 ttt tta Cta gag aca get ttc acc atgttg gCC agg Ctg 4251 gtc. tcc aat toC toga Cct Cag gtg atc. cac CCG act togg 4290 cct CCC aaa gtt cto gga tta cat ttt ttt ttt aaa gaa 4329 aga ata aat taa ttg toga tta aag ttgaaa toa agg cat 4368 agt taa aaa aaa aaa aaa aaa aaa aaa aaa a FIG.9E

nt : SEQ ID NO: 5770 aa: SEQ ID NO: 5772 1 gaa tCC C99 tCg ggit tot gogg agg CaC CgC CtC 999 gtt 40 gCd 99C C99 gtg Cogg CtC ggC ggt 99a 99a CtC act tCC 79 to c tCC at C CCC ggC tog gCC Ctg 999 C99 aac toga tga W. L. G T S K. S. 118 cqC ttg ata ATG TGG CTG GGG ACT TCA GGG AAG AGT GGG C L E N P Q E 157 TTA CCT GGA CAC TGC TTA GAG AAT CCT CTC CAG GAA TGC L E E. W. A K. G 196 CAC CCA GCA CAG TTA GAA GAA TGG GCT CTC AAA GGA ATT W I S Q P O K 235 TCC AGG CCT AGT GTA ATC TCC CAG CCG GAG CAG AAA GAA L P L. Q N E A 274 GAG CCA TGG GTC CTA CCA CTC AAC TTT GAG GCG AGG E S H D 313 AAG ATC CCG AGG AA AGC CAC ACA GAC TGT GAG CAT CAG N Q D N 352 GTG GCA AAG CTC AAT CAG GAC AAT TCT GAA ACA GCA GAA S S E R 391 TGT GGA ACA TCC TCA GAA AGG ACC AAT GAT CTT S W G 430 TCT CAT ACT CTT AGT TGG GGA GGA AAC TGG GAG CAA GGC G Q H 469 CTA GAA TTA GAA GGG CAA CAT GGA ACC CTT CCA GGA GAG S F S Q 508 GGC CAG CTG GAG TCC TTT TCA CAG GAG AGG GAT TTA AAC G Y W 547 AAG CTC CTG GAT GGA TAT GTA GGA GAG AAG CCT ATG FIG. 12A Patent Application Publication Apr. 22, 2004 Sheet 25 of 32 US 2004/0078837 A1

        A   E   C   G   K   S   F   N   Q   S   S   Y   L
  586  GCA GAA TGC GGG AAA AGC TTT AAC CAG AGT TCC TAT CTC
        I   R   H   L   R   T   H   T   G   E   R   P   Y
  625  ATA AGA CAC CTA AGA ACC CAC ACT GGC GAG AGG CCC TAT
        T   C   I   E   C   G   K   G   F   K   Q   S   S
  664  ACG TGC ATT GAG TGT GGG AAA GGC TTC AAA CAG AGC TCA
        D   L   V   T   H   R   R   T   H   T   G   E   K
  703  GAC CTT GTC ACC CAT CGC AGA ACA CAC ACA GGA GAG AAG
        P   Y   Q   C   K   G   C   E   K   K   F   S   D
  742  CCC TAC CAA TGC AAG GGG TGT GAG AAG AAA TTC AGC GAC
        S   S   T   L   I   K   H   Q   R   T   H   T   G
  781  AGC TCA ACA CTC ATC AAA CAT CAG AGA ACC CAC ACA GGG
        E   R   P   Y   E   C   P   E   C   G   K   T   F
  820  GAG AGA CCC TAT GAG TGC CCA GAG TGT GGA AAG ACT TTT
        G   R   K   P   H   L   I   M   H   Q   R   T   H
  859  GGG CGG AAG CCA CAC CTC ATA ATG CAC CAA AGA ACC CAC
        T   G   E   K   P   Y   A   C   L   E   C   H   K
  898  ACA GGC GAG AAG CCC TAC GCG TGC CTG GAA TGT CAC AAA
        S   F   S   R   S   S   N   F   I   T   H   Q   R
  937  AGC TTC AGT CGA AGC TCA AAT TTC ATC ACT CAC CAG AGG
        T   H   T   G   V   K   P   Y   R   C   N   D   C
  976  ACC CAC ACA GGG GTG AAG CCT TAC AGG TGT AAT GAC TGT
        G   E   S   F   S   Q   S   S   D   L   I   K   H
 1015  GGG GAG AGT TTT AGC CAG AGC TCG GAT TTG ATT AAG CAC
        Q   R   T   H   T   G   E   R   P   F   K   C   P
 1054  CAA CGA ACC CAC ACG GGA GAA CGG CCC TTC AAA TGC CCG
        E   C   G   K   G   F   R   D   S   S   H   F   V
 1093  GAG TGC GGG AAG GGC TTC AGA GAT AGT TCT CAT TTT GTA
        A   H   M   S   T   H   S   G   E   R   P   F   S
 1132  GCT CAC ATG AGC ACT CAT TCA GGA GAG AGG CCT TTC AGT
        C   P   D   C   H   K   S   F   S   Q   S   S   H
 1171  TGT CCT GAC TGC CAC AAA AGC TTC AGT CAG AGC TCA CAT
        L   V   T   H   Q   R   T   H   T   G   E   R   P
 1210  TTG GTC ACG CAC CAA AGA ACA CAC ACA GGT GAG AGA CCT

FIG. 12B

F K C E N C G K G F A D S 1249 TTT AAG TGC GAA AAC TGT. GGG AAA GGA TTC GCC GAC AGC S A L I K H Q R I H T G E 1288 TCC GCC CTC ATT. AAG CAC CAA CGA ATC CAC ACC GGA GAA R P Y K C G E C G K. S F N 1327 AGA CCC TAC AAA TGT GGA GAG TGT GGG AAG AGC TTC AAT Q S S H F I T H Q R I H L 1366 CAG AGC TCC CAC TTT ATT ACC CAT CAG CGA ATC CAC TTA G D R P Y R C P E C G K T 1405 GGA GAC AGG CCC TAT CGA TGT CCT GAG TGT GGC AAG ACC F N Q R S H F L T H Q R. T 1444 TTC AAT CAG CGT TCC CAT TTC CTC ACA CAC CAG AGA ACG H T G E K P F H C S K C N 1483 CAT ACA GGA GAA AAA CCT TTC CAC TGT AGT AAA TGT AAC K S F. R O K. A H L L C H Q 1522 AAG AGC TTC CGT CAG AAA GCG CAT CTT, TTA TGC CAT CAA N T H. L. I 1561 AAC ACC CAT TTG ATT TAG gaa gta gtC ttt ggt gtt Cag 1600 Ctg Ctc CCt togC acattt toa ttg cta ctg. tct tca agc 1639 acc CCa aat aga gaa aac Ctg ggc gtc agt ggC toa att 1678 togg gCC Ctg atc tat tot coc tot ttc ttg tot atgtta 1717 taa Cag aga gaga taa act taa agg gtc. caa ata acg gtc 1756 Caa aaa aaa aaa aaa aaa aaa aaa a FIG. 12C Patent Application Publication Apr. 22, 2004 Sheet 27 of 32 US 2004/0078837 A1 nt : SEQ ID NO: 6938 aa: SEQ ID NO: 6939 and 6940

M W L G T S G K S G L P G 13 ATG TGG CTG GGG ACT TCA. GGG AAG AGT GGG TTA CCT GGA 39 H C L E N P L Q E C H P A 26 CAC TGC TTA GAG AAT CCT CTC CAG GAA TGC CAC CCA GCA 78 O L E E W A L K G L G W T 39 CAG TTA GAA GAA TGG GCT. CTC AAA GGA CTG GGT TGG ACT 117 L T S A T k 45 CTC ACC TCT GCC ACT TAA cttctgagacttctgaggtotttgtgg 162 aaaaggagaatttcCaggCCtagtgtaatctocCagcCggagcagaaagaag 214 agccatgggtoctaccactCcaaaactittgaggCgaggaagatcCCgaggga 266 aagccacacagactgtgagcatcaggtggCaaagctCaatcaggacaattct 31.8 gaaacagoagaacaatgtggaacatcCtcagaaaggacCaataaagatctitt 370 CtCatactCttagttggggaggaaactgggagCaaggCCtagaattagaagg 422 gcaacatggaaccCttcCaggagagggCCagctggagtcCttttCacaggag 474 M C 2 agggatttaaacaagCtcCtggatggatatgtaggagagaagCCtATG TGT 525 A E C G K S F N Q S S Y L. 15 GCA GAA TGC GGG AAA AGC TTT AAC CAG AGT TCC TAT CTC 564 I R H L R T H T G E R P Y 28 ATA AGA CAC CTA AGA ACC CAC ACT GGC GAG AGG CCC TAT 603 T C I E C G K G F K Q S S 41 ACG TGC ATT GAG TGT GGG AAA GGC TTC AAA CAG AGC TCA 642 D L W T H R R T H T G E K. 54 GAC CTT GTC ACC CAT CGC AGA ACA CAC ACA GGA GAG AAG 68 P Y O C K G C E K K F S D 67 CCC TAC CAA TGC AAG GGG TGT GAG AAG AAA TTC AGC GAC 720 FIG. 13A Patent Application Publication Apr. 22, 2004 Sheet 28 of 32 US 2004/0078837 A1

 S   S   T   L   I   K   H   Q   R   T   H   T   G     80
AGC TCA ACA CTC ATC AAA CAT CAG AGA ACC CAC ACA GGG   759
 E   R   P   Y   E   C   P   E   C   G   K   T   F     93
GAG AGA CCC TAT GAG TGC CCA GAG TGT GGA AAG ACT TTT   798
 G   R   K   P   H   L   I   M   H   Q   R   T   H    106
GGG CGG AAG CCA CAC CTC ATA ATG CAC CAA AGA ACC CAC   837
 T   G   E   K   P   Y   A   C   L   E   C   H   K    119
ACA GGC GAG AAG CCC TAC GCG TGC CTG GAA TGT CAC AAA   876
 S   F   S   R   S   S   N   F   I   T   H   Q   R    132
AGC TTC AGT CGA AGC TCA AAT TTC ATC ACT CAC CAG AGG   915
 T   H   T   G   V   K   P   Y   R   C   N   D   C    145
ACC CAC ACA GGG GTG AAG CCT TAC AGG TGT AAT GAC TGT   954
 G   E   S   F   S   Q   S   S   D   L   I   K   H    158
GGG GAG AGT TTT AGC CAG AGC TCG GAT TTG ATT AAG CAC   993
 Q   R   T   H   T   G   E   R   P   F   K   C   P    171
CAA CGA ACC CAC ACG GGA GAA CGG CCC TTC AAA TGC CCG  1032
 E   C   G   K   G   F   R   D   S   S   H   F   V    184
GAG TGC GGG AAG GGC TTC AGA GAT AGT TCT CAT TTT GTA  1071
 A   H   M   S   T   H   S   G   E   R   P   F   S    197
GCT CAC ATG AGC ACT CAT TCA GGA GAG AGG CCT TTC AGT  1110
 C   P   D   C   H   K   S   F   S   Q   S   S   H    210
TGT CCT GAC TGC CAC AAA AGC TTC AGT CAG AGC TCA CAT  1149
 L   V   T   H   Q   R   T   H   T   G   E   R   P    223
TTG GTC ACG CAC CAA AGA ACA CAC ACA GGT GAG AGA CCT  1188
 F   K   C   E   N   C   G   K   G   F   A   D   S    236
TTT AAG TGC GAA AAC TGT GGG AAA GGA TTC GCC GAC AGC  1227
 S   A   L   I   K   H   Q   R   I   H   T   G   E    249
TCC GCC CTC ATT AAG CAC CAA CGA ATC CAC ACC GGA GAA  1266
 R   P   Y   K   C   G   E   C   G   K   S   F   N    262
AGA CCC TAC AAA TGT GGA GAG TGT GGG AAG AGC TTC AAT  1305
 Q   S   S   H   F   I   T   H   Q   R   I   H   L    275
CAG AGC TCC CAC TTT ATT ACC CAT CAG CGA ATC CAC TTA  1344
 G   D   R   P   Y   R   C   P   E   C   G   K   T    288
GGA GAC AGG CCC TAT CGA TGT CCT GAG TGT GGC AAG ACC  1383

FIG. 13B

 F   N   Q   R   S   H   F   L   T   H   Q   R   T    301
TTC AAT CAG CGT TCC CAT TTC CTC ACA CAC CAG AGA ACG  1422
 H   T   G   E   K   P   F   H   C   S   K   C   N    314
CAT ACA GGA GAA AAA CCT TTC CAC TGT AGT AAA TGT AAC  1461
 K   S   F   R   Q   K   A   H   L   L   C   H   Q    327
AAG AGC TTC CGT CAG AAA GCG CAT CTT TTA TGC CAT CAA  1500
 N   T   H   L   I    332
AAC ACC CAT TTG ATT TAG  1518

FIG. 13C
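The H-C link described in the Background (TGEKPYX, X being any amino acid) can be located between adjacent MDZ12 repeats with a simple pattern search. The sketch below is a hedged illustration and not part of the disclosure: the residues are those read from the four codon rows of FIG. 13B that end at nt 837, 876, 915 and 954, and the relaxed pattern, which tolerates any residue at the third position of the link, is a hypothetical choice made here only to show that the link is conserved rather than invariant.

```python
import re

# Residues read from the codon rows of FIG. 13B ending at nt 837, 876, 915, 954.
ROWS = ["GRKPHLIMHQRTH", "TGEKPYACLECHK", "SFSRSSNFITHQR", "THTGVKPYRCNDC"]
segment = "".join(ROWS)

# H-C link exactly as stated in the Background: TGEKPY followed by any residue.
STRICT_LINK = re.compile(r"TGEKPY.")
# Hypothetical relaxation: allow any residue at the third position.
RELAXED_LINK = re.compile(r"TG[A-Z]KPY.")

strict_hits = [(m.start(), m.group()) for m in STRICT_LINK.finditer(segment)]
relaxed_hits = [(m.start(), m.group()) for m in RELAXED_LINK.finditer(segment)]

# A link should sit between the last His of one finger and the first Cys of the next.
flanks = [(segment[m.start() - 1], segment[m.end()])
          for m in RELAXED_LINK.finditer(segment)]

print(strict_hits)
print(relaxed_hits)
print(flanks)
```

In this segment the exact TGEKPYX form occurs once and a single-substitution variant (TGVKPYR) occurs once; both are flanked by histidine on the N-terminal side and cysteine on the C-terminal side, consistent with a link that joins tandem C2H2 repeats.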

FOUR HUMAN ZINC-FINGER-CONTAINING PROTEINS: MDZ3, MDZ4, MDZ7 AND MDZ12

REFERENCE TO SEQUENCE LISTING SUBMITTED ON COMPACT DISC

[0001] The present application includes a Sequence Listing filed on one CD-R disc, provided in duplicate, containing a single file named Sequence.txt, having 1,048,233 bytes, last modified on Aug. 2, 2001 and recorded Aug. 2, 2001. The Sequence Listing contained in said file on said disc is incorporated herein by reference in its entirety.

[0002] 1. Field of the Invention

[0003] The present invention relates to four novel human zinc-finger-containing proteins: MDZ3, MDZ4, MDZ7 and MDZ12, and a splice variant of MDZ12, MDZ12b (the original MDZ12 is here named MDZ12a for this reason). More specifically, the invention provides isolated nucleic acid molecules encoding MDZ3, MDZ4, MDZ7, MDZ12, fragments thereof, vectors and host cells comprising isolated nucleic acid molecules encoding MDZ3, MDZ4, MDZ7 and MDZ12; MDZ3, MDZ4, MDZ7 and MDZ12 polypeptides, antibodies, transgenic cells and non-human organisms, and diagnostic, therapeutic, and investigational methods of using the same.

BACKGROUND OF THE INVENTION

[0004] Zinc-finger (ZNF)-containing genes collectively represent the largest family of sequence-specific nucleic acid binding regulatory proteins in mammalian genomes, with 600-1000 family members estimated for the human genome (Hoovers et al., Genomics 12:254-263 (1992)). Most ZNF-containing proteins are thought to act as DNA-binding regulators of transcription, either activating or repressing expression of specific genes (El-Baradi and Pieler, Mech. Dev. 35:155-169 (1991)). However, a few family members have been found to bind specifically to RNA (e.g. ZNF74; Grondin et al., J. Biol. Chem. 271:15458-15467 (1996)).

[0005] The most common form of the ZNF motif, termed Kruppel or C2H2-type, was first described as a repeated domain in the Xenopus transcription factor TFIIIA (Miller et al., EMBO J. 4:1609-1614 (1985)), but is named after a similar domain found in the Drosophila Kruppel protein, which also functions as a transcription factor (Rosenberg et al., Nature 319:336-339 (1986)). Kruppel (C2H2)-type ZNF motifs share a common backbone polypeptide sequence of CX2CX3FX5LX2HX3H, where X can be any amino acid. ZNF motifs may be found dispersed throughout a polypeptide or may be clustered into one or more tandemly repeated blocks. In cases where the repeats are arranged in tandem, adjacent motifs are typically joined by a highly conserved stretch of seven amino acids (TGEKPYX, with X being any amino acid) termed the H-C link (Schuh et al., Cell 47:1025-1032 (1986)). The conserved cysteine (C) and histidine (H) residues collectively coordinate a single zinc ion, with the intervening amino acids forming an alpha helical structure that is responsible for specific binding to a nucleotide triplet. Among proteins with multiple, clustered ZNF motifs, each repeated unit binds to a different, but consecutive, nucleotide triplet (Suzuki et al., Nucl. Acids Res. 22:3397-3405 (1994)). Thus, for proteins with multiple ZNF repeats, it is the concerted activity of most or all of the repeat units that determines the protein's DNA binding properties and ultimate biological function.

[0006] Another hallmark of ZNF proteins is that they frequently contain additional functional domains that further define their biological activities. One motif that is commonly found in conjunction with ZNF motifs in mammalian proteins is the Kruppel-associated box (KRAB) motif, first described by Bellefroid et al., Proc. Natl. Acad. Sci. USA 88:3608-3612 (1991). This motif can be divided into two sub-elements referred to as KRAB A and B domains. However, not all proteins contain both A and B domains (Bellefroid et al., EMBO J. 12:1363-1374 (1993)). Several studies have demonstrated that the KRAB A domain acts as a repressor of transcription (Margolin et al., Proc. Natl. Acad. Sci. USA 91:4509-4513 (1994); Witzgall et al., Proc. Natl. Acad. Sci. USA 91:4515-4518 (1994); Pengue et al., Nucl. Acids Res. 22:2908-2914 (1994)). The KRAB B domain by itself does not convey strong transcriptional repressor function, but may enhance the activity of an associated KRAB A domain (Margolin et al., Proc. Natl. Acad. Sci. USA 91:4509-4513 (1994)). A general mechanism for transcriptional repression appears to be shared by all members of the KRAB subfamily of ZNF proteins, and involves the recruitment of the common co-repressor protein KAP1/KRIP-1/TIF-beta by KRAB-ZNF proteins to sites of gene repression (Friedman et al., Genes Devel. 10:2067-2078 (1996); Kim et al., Proc. Natl. Acad. Sci. USA 93:15299-15304 (1996); Moosmann et al., Nucl. Acids Res. 24:4849-4867 (1996)).

[0007] A second conserved motif that is often found in Kruppel (C2H2)-type ZNF proteins is called the SCAN box or LeR (leucine-rich) domain (Yokoyama et al., Biochim. Biophys. Acta. 1353:13-17 (1997); Williams et al., Mol. Cell Biol. 19:8526-8535 (1999)). The SCAN box functions primarily as an oligomerization domain that allows self-association of proteins or association between different proteins containing compatible SCAN boxes (Williams et al., Mol. Cell Biol. 19:8526-8535 (1999)). Typically, the SCAN box is located at the N-terminus of the protein while the ZNF motifs are clustered near its C-terminus. In contrast to the KRAB domain, the SCAN box does not have a direct activator or repressor role in specific gene transcription.

[0008] ZNF proteins have been implicated in a wide range of biological activities that includes regulation of cell proliferation and differentiation as well as controlling patterns of embryonic development. For example, mouse Krox20 is a ZNF protein that is expressed specifically in the hindbrain, and disruption of the gene by mutation results in abnormalities in brain development and death shortly after birth (Swiatek and Gridley, Genes Devel. 7:2071-2084 (1993)). Another ZNF protein, human SALL1, has been shown to be the gene responsible for Townes-Brocks Syndrome, an autosomal-dominant disorder characterized by abnormalities in the development of ears, limbs and kidneys (Kohlhase et al., Nature Genet. 18:81-83 (1998)).

[0009] ZNF proteins have also been implicated in tumorigenesis in humans. For example, the product of the Wilms Tumor gene (WT1), a Kruppel-type ZNF protein, is normally expressed during kidney development. However, mutations that affect the DNA binding ZNF motifs can result in embryonal renal neoplasia (Call et al., Cell 60:509-520 (1990)). Mutations in other ZNF genes have also been shown to be the causative agents in a number of other cancers, including acute promyelocytic leukemia (Chen et al., EMBO J. 12:1161-1167 (1993); Shakinovich et al., Mol.
Cell Biol. 18:5533-5545 (1998)) and t(8;13) leukemia/lymphoma syndrome (Xiao et al., Nature Genet. 18:84-87 (1998)).

0010 Recent reports suggest that at least one-third, and likely a higher percentage, of human genes are alternatively spliced. Hanke et al., Trends Genet. 15(1):389-390 (1999); Mironov et al., Genome Res. 9:1288-93 (1999); Brett et al., FEBS Lett. 474(1):83-6 (2000). Alternative splicing has been proposed to account for at least part of the difference between the number of genes recently called from the completed human genome draft sequence, 30,000 to 40,000 (International Human Genome Sequencing Consortium, Nature 409:860-921 (15 Feb. 2001)), and earlier predictions of human gene number that routinely ranged as high as 120,000, Liang et al., Nature Genet. 25(2):239-240 (2000). With the Drosophila homolog of one human gene reported to have 38,000 potential alternatively spliced variants, Schmucker et al., Cell 101:671 (2000), it now appears that alternative splicing may permit the relatively small number of human coding regions to encode millions, perhaps tens of millions, of structurally distinct proteins and protein isoforms.

0011 Given the demonstrated roles of ZNF proteins in cell proliferation, differentiation and development, in combination with tumorigenesis when expressed aberrantly, there is a need to identify and to characterize human genes that encode ZNF-containing proteins. Given the importance of alternative splicing in providing further tissue-specific and developmental regulation of human proteins, there is a need to identify and to characterize splice variants of ZNF-containing proteins.

SUMMARY OF THE INVENTION

0012 The present invention solves these and other needs in the art by providing isolated nucleic acids that encode four novel human zinc-finger-containing proteins (MDZ3, MDZ4, MDZ7 and MDZ12a) as well as a splice variant of MDZ12a, termed MDZ12b, and fragments thereof.

0013 MDZ3 encodes a protein with a SCAN box and a more divergent KRAB domain, in addition to 7 ZNF motifs. MDZ4 encodes a protein with a SCAN box domain in addition to 5 ZNF motifs. MDZ7 encodes a protein that contains 7 ZNF motifs and no additional known elements. MDZ12 encodes a protein with a divergent and possibly partial form of the KRAB motif (containing only the B domain) along with 12 ZNF repeats.

0014 We have also isolated a splice variant of MDZ12, designated MDZ12b (and have thus denominated the originally isolated transcript MDZ12a), which is expressed in some of the tissues tested. MDZ12b may encode a protein with 12 ZNF repeats (termed MDZ12bL) if the internal initiation methionine is used for protein translation; otherwise it encodes a 44-amino acid peptide (termed MDZ12bS) due to the introduction of a stop codon in the inserted exon.

0015 In other aspects, the invention provides vectors for propagating and expressing the nucleic acids of the present invention, host cells comprising the nucleic acids and vectors of the present invention, proteins, protein fragments, and protein fusions of any of the four novel human zinc-finger-containing proteins: MDZ3, MDZ4, MDZ7, MDZ12a and MDZ12b (S and L), and antibodies thereto.

0016 The invention further provides pharmaceutical formulations of the nucleic acids, proteins, and antibodies of the present invention.

0017 In other aspects, the invention provides transgenic cells and non-human organisms comprising nucleic acids of any one or more of four novel human zinc-finger-containing genes (MDZ3, MDZ4, MDZ7, MDZ12a and MDZ12b) and transgenic cells and non-human organisms with targeted disruption of the endogenous orthologue of any of the four novel human zinc-finger-containing genes, MDZ3, MDZ4, MDZ7, MDZ12a and MDZ12b.

0018 The invention additionally provides diagnostic, investigational, and therapeutic methods based on the nucleic acids, proteins, antibodies, mimetics, agonists, and antagonists of the present invention.

BRIEF DESCRIPTION OF THE DRAWINGS

0019 The above and other objects and advantages of the present invention will be apparent upon consideration of the following detailed description taken in conjunction with the accompanying drawings, in which like characters refer to like parts throughout, and in which:

0020 FIGS. 1A-1C schematize the protein domain structure of MDZ3, including the overall structure of MDZ3 and the alignment of SCAN box and KRAB domain in MDZ3 with similar motifs;

0021 FIG. 2 is a map showing the genomic structure of MDZ3 encoded at chromosome 7q22.1;

0022 FIG. 3 presents the nucleotide and predicted amino acid sequences of MDZ3;

0023 FIGS. 4A and 4B schematize the protein domain structure of MDZ4, including the overall structure of MDZ4 and the alignment of the SCAN box in MDZ4 with similar motifs;

0024 FIG. 5 is a map showing the genomic structure of MDZ4 encoded at chromosome 6p21.3-22.2;

0025 FIG. 6 presents the nucleotide and predicted amino acid sequences of MDZ4;

0026 FIG. 7 schematizes the protein domain structure of MDZ7;

0027 FIG. 8 is a map showing the genomic structure of MDZ7 encoded at chromosome 16p11.2;

0028 FIG. 9 presents the nucleotide and predicted amino acid sequences of MDZ7;

0029 FIGS. 10A and 10B schematize the protein domain structure of MDZ12, including the overall structure of MDZ12a and MDZ12bL (the longer ORF generated using the internal translation initiation site) and the alignment of the KRAB domain in MDZ12a with similar motifs;

0030 FIG. 11 is a map showing the genomic structure of MDZ12a and MDZ12b encoded at chromosome 15q26.1;

0031 FIG. 12 presents the nucleotide and predicted amino acid sequences of MDZ12a;

0032 FIG. 13 presents the nucleotide and predicted amino acid sequences of MDZ12b;

0033 FIG. 14 presents the RT-PCR analysis of the MDZ3 gene expression;

0034 FIG. 15 presents the RT-PCR analysis of the MDZ7 gene expression; and

0035 FIG. 16 presents the RT-PCR analysis of the MDZ12 gene expression.

DETAILED DESCRIPTION OF THE INVENTION

0036 Mining the sequence of the human genome for novel human genes, the present inventors have identified four novel human zinc-finger-containing genes: MDZ3, MDZ4, MDZ7 and MDZ12 (a and b). Each of the four genes acts as a sequence-specific nucleic acid binding regulatory protein, and plays a role in cell proliferation, differentiation and development. When expressed aberrantly, the genes contribute to neoplasia.

0037 Detailed Description of the MDZ3 Gene

0038 As schematized in FIG. 1, the protein product of the newly isolated MDZ3 gene shares certain protein domains and an overall structural organization with a family of other zinc-finger proteins. The shared structural features strongly imply that MDZ3 plays a role similar to that of the SCAN box- and KRAB motif-containing Kruppel family zinc-finger proteins as a sequence-specific nucleic acid binding regulatory protein, and is likely to participate in protein-protein interactions with other transcription modulators. Thus, MDZ3 is a clinically useful diagnostic marker and potential therapeutic agent for a variety of diseases, including developmental disorders and cancer.

0039 In common with the other family members of the SCAN box-containing Kruppel zinc-finger proteins, MDZ3 has a SCAN box near the N-terminus, which has been shown to participate in protein-protein interactions. The C-terminal region of the MDZ3 protein contains seven copies of C2H2 zinc fingers. There is also weak homology to the KRAB domain in the middle of MDZ3.

0040 FIG. 2 shows the genomic organization of MDZ3.

0041 At the top is shown approximately 14 kb of the 177 kb bacterial artificial chromosome (BAC), with GenBank accession number, that spans the MDZ3 locus.

0042 As shown in FIG. 2, MDZ3, encoding a protein of 544 amino acids, comprises exons 1-8. Predicted molecular weight of the protein, prior to any post-translational modification, is 61.4 kD.

0043 As further discussed in the examples herein, expression of MDZ3 was assessed using RT-PCR analysis. RT-PCR product for MDZ3 was clearly produced from brain, testis, heart and bone marrow, but not from lung, liver, or skeletal muscle.

0044 Detailed Description of the MDZ4 Gene

0045 As schematized in FIG. 4, the product of the newly isolated MDZ4 gene shares certain protein domains and an overall structural organization with a family of zinc finger proteins. The shared structural features strongly imply that MDZ4 plays a role similar to that of the SCAN box-containing Kruppel family zinc-finger proteins as a potential transcription regulator, and is likely to participate in protein-protein interactions with other transcription modulators. Thus, MDZ4 is a clinically useful diagnostic marker and potential therapeutic agent for a variety of diseases, including developmental disorders and cancer.

0046 In common with the other family members of the SCAN box-containing Kruppel zinc-finger proteins, MDZ4 has a SCAN box near the N-terminus, which has been shown to participate in protein-protein interactions. The C-terminal region of the MDZ4 protein contains five copies of C2H2 zinc fingers.

0047 FIG. 5 shows the genomic organization of MDZ4.

0048 At the top is shown about 9 kb of the 128 kb P1 artificial chromosome (PAC), with GenBank accession number, that spans the MDZ4 locus. The genome-derived single-exon probe first used to demonstrate expression from this locus is shown below the PAC and is labeled "500"; the 500 bp probe includes sequence drawn from exon two as well as flanking intron two.

0049 As shown in FIG. 5, MDZ4, encoding a protein of 389 amino acids, comprises exons 1-4. Predicted molecular weight, prior to any post-translational modification, is 44.9 kD.

0050 As further discussed in the examples herein, expression of MDZ4 was assessed using hybridization to genome-derived single exon microarrays. Microarray analysis of exons 2 and 3 showed expression in all tissues tested, including bone marrow, brain, heart, HeLa, adult liver, fetal liver, lung, placenta and prostate.

0051 Detailed Description of the MDZ7 Gene

0052 As schematized in FIG. 7, the newly isolated MDZ7 gene product is mainly composed of seven tandemly arrayed Kruppel-type (C2H2) zinc finger repeats. Such a structure implies that MDZ7 is likely to function in sequence-specific DNA binding and impart a regulatory effect on specific gene expression. Thus, MDZ7 is a clinically useful diagnostic marker and potential therapeutic agent for a variety of diseases, including developmental disorders and cancer.

0053 FIG. 8 shows the genomic organization of MDZ7.

0054 At the top is shown about 5.5 kb of the 121 kb bacterial artificial chromosome (BAC), with GenBank accession number, that spans the MDZ7 locus.

0055 As shown in FIG. 8, MDZ7 comprises four exons and encodes a protein of 248 amino acids. Predicted molecular weight of the MDZ7 protein, prior to any post-translational modification, is 72.0 kD.

0056 As further discussed in the examples herein, expression of MDZ7 was assessed using RT-PCR. RT-PCR analysis of MDZ7 showed expression only in testes, but not in brain, lung, liver, kidney, skeletal muscle, heart, whole fetus, or HeLa cells.

0057 Detailed Description of the MDZ12 Gene

0058 As schematized in FIG. 10A, and further shown in FIG. 11, the newly isolated MDZ12 encodes an MDZ12a isoform and potentially two other isoforms, which we designate 12bL (for "long") and 12bS (for "short"); FIG. 10A shows the 12b isoform. MDZ12a contains a partial KRAB motif as well as twelve C2H2 zinc fingers. MDZ12bL encodes a protein with 12 C2H2 zinc fingers. Such features strongly imply that MDZ12a (and MDZ12bL, if translated) plays a role as a potential transcription regulator, and is likely to participate in protein-protein interactions with other transcription modulators. Thus MDZ12 is a clinically useful diagnostic marker and potential therapeutic agent for a variety of diseases, including developmental disorders and cancer.

0059 FIG. 11 shows the genomic organization of MDZ12.

0060 At the top is shown a portion of the 173 kb bacterial artificial chromosome (BAC), with GenBank accession number, that spans the MDZ12 locus. The genome-derived single-exon probe first used to demonstrate expression from this locus includes sequence drawn solely from exon 4 of MDZ12a.

0061 As shown in FIG. 11, MDZ12a encodes a protein of 483 amino acids, comprising exons 1-4. Predicted molecular weight of the protein, prior to any post-translational modification, is 55.1 kD. The inclusion of a novel exon between exons 2 and 3 introduces an in-frame stop codon in MDZ12b, and thus MDZ12b encodes a short polypeptide of 44 amino acids (MDZ12bS). The use of an internal methionine as initiation methionine in MDZ12b could potentially further encode a 332 amino acid protein (MDZ12bL). The predicted molecular weight of the MDZ12bL protein, prior to any post-translational modification, is 38.2 kD.

0062 As further discussed in the examples herein, expression of MDZ12 was assessed using RT-PCR. The abundance of PCR product indicates that MDZ12a is expressed in all tissues examined, with highest expression in brain, heart, skeletal muscle, testis and HeLa cells. MDZ12b, however, is expressed with lower to much lower abundance compared with MDZ12a in bone marrow, brain, heart, kidney, placenta, skeletal muscle, testis and HeLa cells, with almost no expression in liver.

0063 As more fully described below, the present invention provides isolated nucleic acids that encode MDZ3, MDZ4, MDZ7, MDZ12a and MDZ12b and fragments thereof. The invention further provides vectors for propagation and expression of the nucleic acids of the present invention, host cells comprising the nucleic acids and vectors of the present invention, proteins, protein fragments, and protein fusions of the present invention, and antibodies specific for all or any one of the isoforms. The invention provides pharmaceutical formulations of the nucleic acids, proteins, and antibodies of the present invention. The invention further provides transgenic cells and non-human organisms comprising human MDZ3, MDZ4, MDZ7 or MDZ12 nucleic acids, and transgenic cells and non-human organisms with targeted disruption of the endogenous orthologue of the human MDZ3, MDZ4, MDZ7 or MDZ12. The invention additionally provides diagnostic, investigational, and therapeutic methods based on the MDZ3, MDZ4, MDZ7, MDZ12 nucleic acids, proteins, and antibodies of the present invention.

0064 Definitions

0065 Unless defined otherwise, all technical and scientific terms used herein have the meaning commonly understood by one of ordinary skill in the art to which this invention belongs.

0066 As used herein, "nucleic acid" (synonymously, "polynucleotide") includes polynucleotides having natural nucleotides in native 5'-3' phosphodiester linkage (e.g., DNA or RNA) as well as polynucleotides that have nonnatural nucleotide analogues, nonnative internucleoside bonds, or both, so long as the nonnatural polynucleotide is capable of sequence-discriminating basepairing under experimentally desired conditions. Unless otherwise specified, the term "nucleic acid" includes any topological conformation; the term thus explicitly comprehends single-stranded, double-stranded, partially duplexed, triplexed, hairpinned, circular, and padlocked conformations.

0067 As used herein, an "isolated nucleic acid" is a nucleic acid molecule that exists in a physical form that is nonidentical to any nucleic acid molecule of identical sequence as found in nature; "isolated" does not require, although it does not prohibit, that the nucleic acid so described has itself been physically removed from its native environment.

0068 For example, a nucleic acid can be said to be "isolated" when it includes nucleotides and/or internucleoside bonds not found in nature. When instead composed of natural nucleosides in phosphodiester linkage, a nucleic acid can be said to be "isolated" when it exists at a purity not found in nature, where purity can be adjudged with respect to the presence of nucleic acids of other sequence, with respect to the presence of proteins, with respect to the presence of lipids, or with respect to the presence of any other component of a biological cell, or when the nucleic acid lacks sequence that flanks an otherwise identical sequence in an organism's genome, or when the nucleic acid possesses sequence not identically present in nature.

0069 As so defined, "isolated nucleic acid" includes nucleic acids integrated into a host cell chromosome at a heterologous site, recombinant fusions of a native fragment to a heterologous sequence, recombinant vectors present as episomes or as integrated into a host cell chromosome.

0070 As used herein, an isolated nucleic acid "encodes" a reference polypeptide when at least a portion of the nucleic acid, or its complement, can be directly translated to provide the amino acid sequence of the reference polypeptide, or when the isolated nucleic acid can be used, alone or as part of an expression vector, to express the reference polypeptide in vitro, in a prokaryotic host cell, or in a eukaryotic host cell.

0071 As used herein, the term "exon" refers to a nucleic acid sequence found in genomic DNA that is bioinformatically predicted and/or experimentally confirmed to contribute contiguous sequence to a mature mRNA transcript.

0072 As used herein, the phrase "open reading frame" and the equivalent acronym "ORF" refer to that portion of a transcript-derived nucleic acid that can be translated in its entirety into a sequence of contiguous amino acids. As so defined, an ORF has length, measured in nucleotides, exactly divisible by 3. As so defined, an ORF need not encode the entirety of a natural protein.

0073 As used herein, the phrase "ORF-encoded peptide" refers to the predicted or actual translation of an ORF.

0074 As used herein, the phrase "degenerate variant" of a reference nucleic acid sequence intends all nucleic acid

sequences that can be directly translated, using the standard genetic code, to provide an amino acid sequence identical to that translated from the reference nucleic acid sequence.

0075 As used herein, the term "microarray" and the equivalent phrase "nucleic acid microarray" refer to a substrate-bound collection of plural nucleic acids, hybridization to each of the plurality of bound nucleic acids being separately detectable. The substrate can be solid or porous, planar or non-planar, unitary or distributed.

0076 As so defined, the term "microarray" and phrase "nucleic acid microarray" include all the devices so called in Schena (ed.), DNA Microarrays: A Practical Approach (Practical Approach Series), Oxford University Press (1999) (ISBN: 0199637768); Nature Genet. 21(1) (suppl):1-60 (1999); and Schena (ed.), Microarray Biochip: Tools and Technology, Eaton Publishing Company/BioTechniques Books Division (2000) (ISBN: 1881299376), the disclosures of which are incorporated herein by reference in their entireties.

0077 As so defined, the term "microarray" and phrase "nucleic acid microarray" also include substrate-bound collections of plural nucleic acids in which the plurality of nucleic acids are distributably disposed on a plurality of beads, rather than on a unitary planar substrate, as is described, inter alia, in Brenner et al., Proc. Natl. Acad. Sci. USA 97(4):1665-1670 (2000), the disclosure of which is incorporated herein by reference in its entirety; in such case, the term "microarray" and phrase "nucleic acid microarray" refer to the plurality of beads in aggregate.

0078 As used herein with respect to solution phase hybridization, the term "probe", or equivalently, "nucleic acid probe" or "hybridization probe", refers to an isolated nucleic acid of known sequence that is, or is intended to be, detectably labeled. As used herein with respect to a nucleic acid microarray, the term "probe" (or equivalently "nucleic acid probe" or "hybridization probe") refers to the isolated nucleic acid that is, or is intended to be, bound to the substrate. In either such context, the term "target" refers to nucleic acid intended to be bound to probe by sequence complementarity.

0079 As used herein, the expression "probe comprising SEQ ID NO:X", and variants thereof, intends a nucleic acid probe, at least a portion of which probe has either (i) the sequence directly as given in the referenced SEQ ID NO:X, or (ii) a sequence complementary to the sequence as given in the referenced SEQ ID NO:X, the choice as between sequence directly as given and complement thereof dictated by the requirement that the probe be complementary to the desired target.

0080 As used herein, the phrases "expression of a probe" and "expression of an isolated nucleic acid" and their linguistic equivalents intend that the probe (or, respectively, the isolated nucleic acid), or a probe (or, respectively, isolated nucleic acid) complementary in sequence thereto, can hybridize detectably under high stringency conditions to a sample of nucleic acids that derive from mRNA transcripts from a given source. For example, and by way of illustration only, expression of a probe in "liver" means that the probe can hybridize detectably under high stringency conditions to a sample of nucleic acids that derive from mRNA obtained from liver.

0081 As used herein, "a single exon probe" comprises at least part of an exon ("reference exon") and can hybridize detectably under high stringency conditions to transcript-derived nucleic acids that include the reference exon. The single exon probe will not, however, hybridize detectably under high stringency conditions to nucleic acids that lack the reference exon and that instead consist of one or more exons that are found adjacent to the reference exon in the genome.

0082 For purposes herein, "high stringency conditions" are defined for solution phase hybridization as aqueous hybridization (i.e., free of formamide) in 6xSSC (where 20xSSC contains 3.0 M NaCl and 0.3 M sodium citrate), 1% SDS at 65° C. for at least 8 hours, followed by one or more washes in 0.2xSSC, 0.1% SDS at 65° C. "Moderate stringency conditions" are defined for solution phase hybridization as aqueous hybridization (i.e., free of formamide) in 6xSSC, 1% SDS at 65° C. for at least 8 hours, followed by one or more washes in 2xSSC, 0.1% SDS at room temperature.

0083 For microarray-based hybridization, standard "high stringency conditions" are defined as hybridization in 50% formamide, 5xSSC, 0.2 μg/μl poly(dA), 0.2 μg/μl human cot1 DNA, and 0.5% SDS, in a humid oven at 42° C. overnight, followed by successive washes of the microarray in 1xSSC, 0.2% SDS at 55° C. for 5 minutes, and then 0.1xSSC, 0.2% SDS, at 55° C. for 20 minutes. For microarray-based hybridization, "moderate stringency conditions", suitable for cross-hybridization to mRNA encoding structurally- and functionally-related proteins, are defined to be the same as those for high stringency conditions but with reduction in temperature for hybridization and washing to room temperature (approximately 25° C.).

0084 As used herein, the terms "protein", "polypeptide", and "peptide" are used interchangeably to refer to a naturally-occurring or synthetic polymer of amino acid monomers (residues), irrespective of length, where amino acid monomer here includes naturally-occurring amino acids, naturally-occurring amino acid structural variants, and synthetic non-naturally occurring analogs that are capable of participating in peptide bonds. The terms "protein", "polypeptide", and "peptide" explicitly permit of post-translational and post-synthetic modifications, such as glycosylation.

0085 The term "oligopeptide" herein denotes a protein, polypeptide, or peptide having 25 or fewer monomeric subunits.

0086 The phrases "isolated protein", "isolated polypeptide", "isolated peptide" and "isolated oligopeptide" refer to a protein (or respectively to a polypeptide, peptide, or oligopeptide) that is nonidentical to any protein molecule of identical amino acid sequence as found in nature; "isolated" does not require, although it does not prohibit, that the protein so described has itself been physically removed from its native environment.

0087 For example, a protein can be said to be "isolated" when it includes amino acid analogues or derivatives not found in nature, or includes linkages other than standard peptide bonds.

0088 When instead composed entirely of natural amino acids linked by peptide bonds, a protein can be said to be

"isolated” when it exists at a purity not found in nature molecule. Among Such fragments are Fab, Fab', Fv, F(ab)', where purity can be adjudged with respect to the presence of and Single chain FV (ScPV) fragments. proteins of other Sequence, with respect to the presence of non-protein compounds, Such as nucleic acids, lipids, or 0097. Derivatives within the scope of the term include other components of a biological cell, or when it exists in a antibodies (or fragments thereof) that have been modified in composition not found in nature, Such as in a host cell that Sequence, but remain capable of Specific binding to a target does not naturally express that protein. molecule, including: interspecies chimeric and humanized antibodies, antibody fusions; heteromeric antibody com 0089. A “purified protein” (equally, a purified polypep plexes and antibody fusions, Such as diabodies (bispecific tide, peptide, or oligopeptide) is an isolated protein, as above antibodies), Single-chain diabodies, and intrabodies (see, described, present at a concentration of at least 95%, as e.g., Marasco (ed.), Intracellular Antibodies. Research and measured on a weight basis with respect to total protein in Disease Applications, Springer-Verlag New York, Inc. a composition. A “Substantially purified protein' (equally, a (1998) (ISBN: 3540641513), the disclosure of which is Substantially purified polypeptide, peptide, or oligopeptide) incorporated herein by reference in its entirety). is an isolated protein, as above described, present at a 0098. As used herein, antibodies can be produced by any concentration of at least 70%, as measured on a weight basis known technique, including harvest from cell culture of with respect to total protein in a composition. native B lymphocytes, harvest from culture of hybridomas, 0090. As used herein, the phrase “protein isoforms” recombinant expression Systems, and phage display. 
refers to a plurality of proteins having nonidentical primary 0099 AS used herein, “antigen” refers to a ligand that can amino acid Sequence but that share amino acid Sequence be bound by an antibody; an antigen need not itself be encoded by at least one common eXon. immunogenic. The portions of the antigen that make contact 0.091 AS used herein, the phrase “alternative splicing” with the antibody are denominated “epitopes”. and its linguistic equivalents includes all types of RNA 0100 “Specific binding” refers to the ability of two processing that lead to expression of plural protein isoforms molecular species concurrently present in a heterogeneous from a single gene; accordingly, the phrase “splice Vari (inhomogeneous) sample to bind to one another in prefer ant(s)' and its linguistic equivalents embraces mRNAS ence to binding to other molecular species in the Sample. transcribed from a given gene that, however processed, Typically, a Specific binding interaction will discriminate collectively encode plural protein isoforms. For example, over adventitious binding interactions in the reaction by at and by way of illustration only, Splice variants can include least two-fold, more typically by at least 10-fold, often at exon insertions, exon extensions, exon truncations, exon least 100-fold; when used to detect analyte, Specific binding deletions, alternatives in the 5' untranslated region (“5' UT”) is sufficiently discriminatory when determinative of the and alternatives in the 3' untranslated region (“3' UT"). Such presence of the analyte in a heterogeneous (inhomogeneous) 3' alternatives include, for example, differences in the Site of Sample. Typically, the affinity or avidity of a specific binding RNA transcript cleavage and site of poly(A) addition. See, reaction is least about 107M, with specific binding reac e.g., Gautheret et al., Genome Res. 8:524–530 (1998). tions of greater Specificity typically having affinity or avidity 0092. 
As used herein, “orthologues” are separate occur of at least 10 M to at least about 10 M. rences of the same gene in multiple species. The Separate 0101 AS used herein, “molecular binding partners'-and occurrences have similar, albeit nonidentical, amino acid equivalently, “Specific binding partners'-refer to pairs of Sequences, the degree of Sequence Similarity depending, in molecules, typically pairs of biomolecules, that exhibit part, upon the evolutionary distance of the Species from a Specific binding. Nonlimiting examples are receptor and common ancestor having the same gene. ligand, antibody and antigen, and biotin to any of avidin, 0093. As used herein, the term “paralogues' indicates Streptavidin, neutraVidin and captAvidin. Separate occurrences of a gene in one species. The Separate 0102) The term “antisense”, as used herein, refers to a occurrences have similar, albeit nonidentical, amino acid nucleic acid molecule Sufficiently complementary in Sequences, the degree of Sequence Similarity depending, in Sequence, and Sufficiently long in that complementary part, upon the evolutionary distance from the gene duplica Sequence, as to hybridize under intracellular conditions to (i) tion event giving rise to the Separate occurrences. a target mRNA transcript or (ii) the genomic DNA strand 0094 AS used herein, the term “homologues” is generic complementary to that transcribed to produce the target to “orthologues' and “paralogues'. mRNA transcript. 0.095 As used herein, the term “antibody” refers to a 0103) The term “portion', as used with respect to nucleic polypeptide, at least a portion of which is encoded by at least acids, proteins, and antibodies, is Synonymous with "frag one immunoglobulin gene, or fragment thereof, and that can ment. bind Specifically to a desired target molecule. The term 0104) Nucleic Acid Molecules includes naturally-occurring forms, as well as fragments and derivatives. 0105. 
In a first aspect, the invention provides isolated nucleic acids that encode MDZ3, MDZA, MDZ7 or MDZ12, 0.096 Fragments within the scope of the term “antibody” variants having at least 65% sequence identity thereto, include those produced by digestion with various proteases, degenerate variants thereof, variants that encode MDZ3, those produced by chemical cleavage and/or chemical dis MDZ4, MDZ7 or MDZ12 proteins having conservative or Sociation, and those produced recombinantly, So long as the moderately conservative Substitutions, cross-hybridizing fragment remains capable of Specific binding to a target nucleic acids, and fragments thereof. US 2004/0078837 A1 Apr. 22, 2004
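The definitions given earlier for "ORF" (a length exactly divisible by 3), "encodes", and "degenerate variant" (a sequence that translates, under the standard genetic code, to the identical amino acid sequence) can be illustrated with a short Python sketch. This is only an illustration under stated assumptions: the helper names `translate`, `is_orf`, and `is_degenerate_variant` are hypothetical and not part of the disclosure; the reference sequence is the first 13 codons of the MDZ3 coding region as printed in FIG. 3A; and the variant sequence was constructed here by swapping in synonymous codons, not taken from the Sequence Listing.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order
# (TTT, TTC, TTA, TTG, TCT, ...). "*" marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def _normalize(seq: str) -> str:
    """Upper-case, drop spaces, and read RNA (U) as DNA (T)."""
    return seq.upper().replace(" ", "").replace("U", "T")


def translate(seq: str) -> str:
    """Translate a nucleotide sequence codon by codon from its first base."""
    seq = _normalize(seq)
    if len(seq) % 3 != 0:
        raise ValueError("sequence length is not divisible by 3")
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, len(seq), 3))


def is_orf(seq: str) -> bool:
    """True if the length is a non-zero multiple of 3 and every codon
    encodes an amino acid (no stop codon anywhere in the sequence)."""
    seq = _normalize(seq)
    return len(seq) > 0 and len(seq) % 3 == 0 and "*" not in translate(seq)


def is_degenerate_variant(candidate: str, reference: str) -> bool:
    """True if both sequences translate to the identical amino acid sequence."""
    return translate(candidate) == translate(reference)


# First 13 codons of the MDZ3 ORF as printed in FIG. 3A.
REFERENCE = "ATG CTT AAA GAG CAT CCA GAG ATG GCG GAA GCT CCT CAG"
# Hypothetical variant built for this example from synonymous codons.
VARIANT = "ATG CTG AAG GAA CAC CCG GAG ATG GCC GAG GCA CCA CAA"

if __name__ == "__main__":
    print(translate(REFERENCE))                        # MLKEHPEMAEAPQ
    print(is_orf(REFERENCE))                           # True
    print(is_degenerate_variant(VARIANT, REFERENCE))   # True
```

The two sequences differ at most codons yet give the same peptide, which is the sense in which one is a degenerate variant of the other. Because `_normalize` reads U as T, an RNA rendering of the same sequence is treated as equivalent.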

[0106] FIGS. 3, 6, 9, 12 and 13 present the nucleotide sequences of the MDZ3, MDZ4, MDZ7, MDZ12a and MDZ12b cDNA clones, respectively, with predicted amino acid translations. The sequences are further presented in the Sequence Listing, incorporated herein by reference in its entirety, in SEQ ID NOs: 1 (full length nucleotide sequence of human MDZ3 cDNA), 3 (full length amino acid sequence of MDZ3), 3027 (full length nucleotide sequence of human MDZ4 cDNA), 3029 (full length amino acid sequence of MDZ4), 4407 (full length nucleotide sequence of human MDZ7 cDNA), 4409 (full length amino acid sequence of MDZ7), 5770 (full length nucleotide sequence of human MDZ12a cDNA), 5772 (full length amino acid sequence of MDZ12a), 6938 (full length nucleotide sequence of human MDZ12b cDNA), 6939 (full length amino acid sequence of MDZ12bS) and 6940 (full length amino acid sequence of MDZ12bL).

[0107] Unless otherwise indicated, each nucleotide sequence is set forth herein as a sequence of deoxyribonucleotides. It is intended, however, that the given sequence be interpreted as would be appropriate to the polynucleotide composition: for example, if the isolated nucleic acid is composed of RNA, the given sequence intends ribonucleotides, with uridine substituted for thymidine.

[0108] Unless otherwise indicated, nucleotide sequences of the isolated nucleic acids of the present invention were determined by sequencing a DNA molecule that had resulted, directly or indirectly, from at least one enzymatic polymerization reaction (e.g., reverse transcription and/or polymerase chain reaction) using an automated sequencer (such as the MegaBACE™ 1000, Molecular Dynamics, Sunnyvale, Calif., USA), or by reliance upon such sequence or upon genomic sequence prior-accessioned into a public database. Unless otherwise indicated, all amino acid sequences of the polypeptides of the present invention were predicted by translation from the nucleic acid sequences so determined.

[0109] As a consequence, any nucleic acid sequence presented herein may contain errors introduced by erroneous incorporation of nucleotides during polymerization, by erroneous base calling by the automated sequencer (although such sequencing errors have been minimized for the nucleic acids directly determined herein, unless otherwise indicated, by the sequencing of each of the complementary strands of a duplex DNA), or by similar errors accessioned into the public database.

[0110] Accordingly, the MDZ3, MDZ4, MDZ7, MDZ12a and MDZ12b cDNA clones described herein have been separately deposited in a public repository (American Type Culture Collection, Manassas, Va., USA, “ATCC”). Each of the MDZ3, MDZ4, MDZ7 cDNA clones was sent for deposit to ATCC in a discrete tube on Aug. 1, 2001 and received by ATCC Aug. 2, 2001, and respectively accorded accession numbers of ____, ____, ____. The two splice variants of the MDZ12 gene, MDZ12a and MDZ12b, were sent for deposit in admixture in a single tube to ATCC on Aug. 1, 2001, received at ATCC on Aug. 2, 2001, and jointly accorded the single accession number ____. Any errors in sequence reported herein can be determined and corrected by sequencing nucleic acids propagated from the deposited clones using standard techniques.

[0111] Single nucleotide polymorphisms (SNPs) occur frequently in eukaryotic genomes (more than 1.4 million SNPs have already been identified in the human genome, International Human Genome Sequencing Consortium, Nature 409:860-921 (2001)), and the sequence determined from one individual of a species may differ from other allelic forms present within the population. Additionally, small deletions and insertions, rather than single nucleotide polymorphisms, are not uncommon in the general population, and often do not alter the function of the protein.

[0112] Accordingly, it is an aspect of the present invention to provide nucleic acids not only identical in sequence to those described with particularity herein, but also to provide isolated nucleic acids at least about 65% identical in sequence to those described with particularity herein, typically at least about 70%, 75%, 80%, 85%, or 90% identical in sequence to those described with particularity herein, usefully at least about 91%, 92%, 93%, 94%, or 95% identical in sequence to those described with particularity herein, usefully at least about 96%, 97%, 98%, or 99% identical in sequence to those described with particularity herein, and, most conservatively, at least about 99.5%, 99.6%, 99.7%, 99.8% and 99.9% identical in sequence to those described with particularity herein. These sequence variants can be naturally occurring or can result from human intervention, as by random or directed mutagenesis.

[0113] For purposes herein, percent identity of two nucleic acid sequences is determined using the procedure of Tatusova et al., “Blast 2 sequences - a new tool for comparing protein and nucleotide sequences”, FEMS Microbiol Lett. 174:247-250 (1999), which procedure is effectuated by the computer program BLAST 2 SEQUENCES, available online at

[0114] http://www.ncbi.nlm.nih.gov/blast/bl2seq/bl2.html.

[0115] To assess percent identity of nucleic acids, the BLASTN module of BLAST 2 SEQUENCES is used with default values of (i) reward for a match: 1; (ii) penalty for a mismatch: -2; (iii) open gap 5 and extension gap 2 penalties; (iv) gap X dropoff 50, expect 10, word size 11, filter; and both sequences are entered in their entireties.

[0116] As is well known, the genetic code is degenerate, with each amino acid except methionine translated from a plurality of codons, thus permitting a plurality of nucleic acids of disparate sequence to encode the identical protein. As is also well known, codon choice for optimal expression varies from species to species. The isolated nucleic acids of the present invention being useful for expression of MDZ3, MDZ4, MDZ7 or MDZ12 proteins and protein fragments, it is, therefore, another aspect of the present invention to provide isolated nucleic acids that encode MDZ3, MDZ4, MDZ7 or MDZ12 proteins and portions thereof not only identical in sequence to those described with particularity herein, but degenerate variants thereof as well.

[0117] As is also well known, amino acid substitutions occur frequently among natural allelic variants, with conservative substitutions often occasioning only de minimis change in protein function.

[0118] Accordingly, it is an aspect of the present invention to provide nucleic acids not only identical in sequence to those described with particularity herein, but also to provide isolated nucleic acids that encode MDZ3, MDZ4, MDZ7 or

MDZ12, and portions thereof, having conservative amino acid substitutions, and also to provide isolated nucleic acids that encode MDZ3, MDZ4, MDZ7 or MDZ12, and portions thereof, having moderately conservative amino acid substitutions.

[0119] Although there are a variety of metrics for calling conservative amino acid substitutions, based primarily on either observed changes among evolutionarily related proteins or on predicted chemical similarity, for purposes herein a conservative replacement is any change having a positive value in the PAM250 log-likelihood matrix reproduced herein below (see Gonnet et al., Science 256(5062):1443-5 (1992)):

[PAM250 log-likelihood matrix: table not recoverable from this copy.]

[0120] For purposes herein, a “moderately conservative” replacement is any change having a nonnegative value in the PAM250 log-likelihood matrix reproduced herein above.

[0121] As is also well known in the art, relatedness of nucleic acids can also be characterized using a functional test, the ability of the two nucleic acids to base-pair to one another at defined hybridization stringencies.

[0122] It is, therefore, another aspect of the invention to provide isolated nucleic acids not only identical in sequence to those described with particularity herein, but also to provide isolated nucleic acids (“cross-hybridizing nucleic acids”) that hybridize under high stringency conditions (as defined herein below) to all or to a portion of various of the isolated MDZ3, MDZ4, MDZ7 or MDZ12 nucleic acids of the present invention (“reference nucleic acids”), as well as cross-hybridizing nucleic acids that hybridize under moderate stringency conditions to all or to a portion of various of the isolated MDZ3, MDZ4, MDZ7 or MDZ12 nucleic acids of the present invention.

[0123] Such cross-hybridizing nucleic acids are useful, inter alia, as probes for, and to drive expression of, proteins related to the proteins of the present invention as alternative isoforms, homologues, paralogues, and orthologues. Particularly useful orthologues are those from other primate species, such as chimpanzee, rhesus macaque, monkey, baboon, orangutan, and gorilla; from rodents, such as rats, mice, guinea pigs; from lagomorphs, such as rabbits; and from domestic livestock, such as cow, pig, sheep, horse, goat and chickens.

[0124] For purposes herein, high stringency conditions are defined as aqueous hybridization (i.e., free of formamide) in 6×SSC (where 20×SSC contains 3.0 M NaCl and 0.3 M sodium citrate), 1% SDS at 65° C. for at least 8 hours, followed by one or more washes in 0.2×SSC, 0.1% SDS at 65° C. For purposes herein, moderate stringency conditions are defined as aqueous hybridization (i.e., free of formamide) in 6×SSC, 1% SDS at 65° C. for at least 8 hours, followed by one or more washes in 2×SSC, 0.1% SDS at room temperature.
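The salt concentrations implied by the stringency definitions can be worked out from the stated composition of 20×SSC (3.0 M NaCl and 0.3 M sodium citrate). The following is a minimal arithmetic sketch, assuming simple linear dilution of the 20× stock; the function names `ssc_molarity` and `stock_volume_ml` are hypothetical and are not part of the source.

```python
# Composition of the 20x SSC stock as stated in the text.
NACL_20X_M = 3.0
CITRATE_20X_M = 0.3


def ssc_molarity(fold):
    """Return NaCl and sodium citrate molarity of `fold`-times SSC.

    Assumes the working solution is a plain dilution of the 20x stock.
    """
    if fold <= 0:
        raise ValueError("fold must be positive")
    return {
        "NaCl_M": fold * NACL_20X_M / 20.0,
        "citrate_M": fold * CITRATE_20X_M / 20.0,
    }


def stock_volume_ml(fold, final_volume_ml):
    """Volume of 20x stock needed to prepare `final_volume_ml` of `fold`x SSC."""
    if fold <= 0 or fold > 20:
        raise ValueError("fold must be in the range (0, 20]")
    return final_volume_ml * fold / 20.0


if __name__ == "__main__":
    # Hybridization buffer, moderate-stringency wash, high-stringency wash.
    for fold in (6, 2, 0.2):
        print(fold, ssc_molarity(fold), stock_volume_ml(fold, 1000.0))
```

On this arithmetic the high stringency wash (0.2×SSC) carries one tenth the salt of the moderate stringency wash (2×SSC), and is also run at the higher temperature.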

[0125] The hybridizing portion of the reference nucleic acid is typically at least 15 nucleotides in length, often at least 17 nucleotides in length. Often, however, the hybridizing portion of the reference nucleic acid is at least 20 nucleotides in length, 25 nucleotides in length, and even 30 nucleotides, 35 nucleotides, 40 nucleotides, and 50 nucleotides in length. Of course, cross-hybridizing nucleic acids that hybridize to a larger portion of the reference nucleic acid (for example, to a portion of at least 50 nt, at least 100 nt, at least 150 nt, 200 nt, 250 nt, 300 nt, 350 nt, 400 nt, 450 nt, or 500 nt or more) or even to the entire length of the reference nucleic acid, are also useful.

[0126] The hybridizing portion of the cross-hybridizing nucleic acid is at least 75% identical in sequence to at least a portion of the reference nucleic acid. Typically, the hybridizing portion of the cross-hybridizing nucleic acid is at least 80%, often at least 85%, 86%, 87%, 88%, 89% or even at least 90% identical in sequence to at least a portion of the reference nucleic acid. Often, the hybridizing portion of the cross-hybridizing nucleic acid will be at least 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical in sequence to at least a portion of the reference nucleic acid sequence. At times, the hybridizing portion of the cross-hybridizing nucleic acid will be at least 99.5% identical in sequence to at least a portion of the reference nucleic acid.

[0127] The invention also provides fragments of various of the isolated nucleic acids of the present invention.

[0128] By “fragments” of a reference nucleic acid is here intended isolated nucleic acids, however obtained, that have a nucleotide sequence identical to a portion of the reference nucleic acid sequence, which portion is at least 17 nucleotides and less than the entirety of the reference nucleic acid. As so defined, “fragments” need not be obtained by physical fragmentation of the reference nucleic acid, although such provenance is not thereby precluded.

[0129] In theory, an oligonucleotide of 17 nucleotides is of sufficient length as to occur at random less frequently than once in the three gigabase human genome, and thus to provide a nucleic acid probe that can uniquely identify the reference sequence in a nucleic acid mixture of genomic complexity. As is well known, further specificity can be obtained by probing nucleic acid samples of subgenomic complexity, and/or by using plural fragments as short as 17 nucleotides in length collectively to prime amplification of nucleic acids, as, e.g., by polymerase chain reaction (PCR).

[0130] As further described herein below, nucleic acid fragments that encode at least 6 contiguous amino acids (i.e., fragments of 18 nucleotides or more) are useful in directing the expression or the synthesis of peptides that have utility in mapping the epitopes of the protein encoded by the reference nucleic acid. See, e.g., Geysen et al., “Use of peptide synthesis to probe viral antigens for epitopes to a resolution of a single amino acid,” Proc. Natl. Acad. Sci. USA 81:3998-4002 (1984); and U.S. Pat. Nos. 4,708,871 and 5,595,915, the disclosures of which are incorporated herein by reference in their entireties.

[0131] As further described herein below, fragments that encode at least 8 contiguous amino acids (i.e., fragments of 24 nucleotides or more) are useful in directing the expression or the synthesis of peptides that have utility as immunogens. See, e.g., Lerner, “Tapping the immunological repertoire to produce antibodies of predetermined specificity,” Nature 299:592-596 (1982); Shinnick et al., “Synthetic peptide immunogens as vaccines,” Annu. Rev. Microbiol. 37:425-46 (1983); Sutcliffe et al., “Antibodies that react with predetermined sites on proteins,” Science 219:660-6 (1983), the disclosures of which are incorporated herein by reference in their entireties.

[0132] The nucleic acid fragment of the present invention is thus at least 17 nucleotides in length, typically at least 18 nucleotides in length, and often at least 24 nucleotides in length. Often, the nucleic acid of the present invention is at least 25 nucleotides in length, and even 30 nucleotides, 35 nucleotides, 40 nucleotides, or 45 nucleotides in length. Of course, larger fragments having at least 50 nt, at least 100 nt, at least 150 nt, 200 nt, 250 nt, 300 nt, 350 nt, 400 nt, 450 nt, or 500 nt or more are also useful, and at times preferred.

[0133] Having been based upon the mining of genomic sequence, rather than upon surveillance of expressed message, the present invention further provides isolated genome-derived nucleic acids that include portions of the MDZ3, MDZ4, MDZ7 or MDZ12 gene.

[0134] The invention particularly provides genome-derived single exon probes.

[0135] As further described in commonly owned and copending U.S. patent application Ser. No. 09/864,761, filed May 23, 2001; Ser. No. 09/774,203, filed Jan. 29, 2001; and Ser. No. 09/632,366, filed Aug. 3, 2000, the disclosures of which are incorporated herein by reference in their entireties, a “single exon probe” comprises at least part of an exon (“reference exon”) and can hybridize detectably under high stringency conditions to transcript-derived nucleic acids that include the reference exon. The single exon probe will not, however, hybridize detectably under high stringency conditions to nucleic acids that lack the reference exon and instead consist of one or more exons that are found adjacent to the reference exon in the genome.

[0136] Genome-derived single exon probes typically further comprise, contiguous to a first end of the exon portion, a first intronic and/or intergenic sequence that is identically contiguous to the exon in the genome. Often, the genome-derived single exon probe further comprises, contiguous to a second end of the exonic portion, a second intronic and/or intergenic sequence that is identically contiguous to the exon in the genome.

[0137] The minimum length of genome-derived single exon probes is defined by the requirement that the exonic portion be of sufficient length to hybridize under high stringency conditions to transcript-derived nucleic acids. Accordingly, the exon portion is at least 17 nucleotides, typically at least 18 nucleotides, 20 nucleotides, 24 nucleotides, 25 nucleotides or even 30, 35, 40, 45, or 50 nucleotides in length, and can usefully include the entirety of the exon, up to 100 nt, 150 nt, 200 nt, 250 nt, 300 nt, 350 nt, 400 nt or even 500 nt or more in length.

[0138] The maximum length of genome-derived single exon probes is defined by the requirement that the probes contain portions of no more than one exon, that is, be unable to hybridize detectably under high stringency conditions to nucleic acids that lack the reference exon but include one or more exons that are found adjacent to the reference exon in the genome.
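The random-occurrence argument given above for 17-nucleotide oligonucleotides can be checked with a short calculation. The sketch below is illustrative only and rests on assumptions the source does not state: bases are treated as independent and equally frequent, and both strands of the genome are counted as potential match sites. The function names are hypothetical.

```python
GENOME_SIZE_NT = 3_000_000_000  # the "three gigabase" figure quoted in the text


def expected_random_hits(k, genome_size=GENOME_SIZE_NT, strands=2):
    """Expected number of chance occurrences of one specific k-mer.

    Simplifying assumption: each base is independent and A, C, G and T are
    equally likely, so a given k-mer matches any one start position with
    probability 1 / 4**k.
    """
    if k < 1 or k > genome_size:
        raise ValueError("k must be between 1 and the genome size")
    positions = genome_size - k + 1  # possible start sites on one strand
    return strands * positions / 4 ** k


def shortest_length_below_one_hit(genome_size=GENOME_SIZE_NT, strands=2):
    """Smallest k whose expected number of chance occurrences is below one."""
    k = 1
    while expected_random_hits(k, genome_size, strands) >= 1.0:
        k += 1
    return k


if __name__ == "__main__":
    for k in (15, 16, 17, 18):
        print(k, expected_random_hits(k))
    print(shortest_length_below_one_hit())
```

Under these assumptions the expected count first falls below one at 17 nucleotides when both strands are counted. Real genomes contain repeats and skewed base composition, so the figure is a guide rather than a guarantee of uniqueness.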

[0139] Given variable spacing of exons through eukaryotic genomes, the maximum length of single exon probes of the present invention is typically no more than 25 kb, often no more than 20 kb, 15 kb, 10 kb or 7.5 kb, or even no more than 5 kb, 4 kb, 3 kb, or even no more than about 2.5 kb in length.

[0140] The genome-derived single exon probes of the present invention can usefully include at least a first terminal priming sequence not found in contiguity with the rest of the probe sequence in the genome, and often will contain a second terminal priming sequence not found in contiguity with the rest of the probe sequence in the genome.

[0141] The present invention also provides isolated genome-derived nucleic acids that include nucleic acid sequence elements that control transcription of the MDZ3, MDZ4, MDZ7 or MDZ12 gene.

[0142] With a complete draft of the human genome now available, genomic sequences that are within the vicinity of the MDZ3, MDZ4, MDZ7 or MDZ12 coding region (and that are additional to those described with particularity herein) can readily be obtained by PCR amplification.

[0143] The isolated nucleic acids of the present invention can be composed of natural nucleotides in native 5'-3' phosphodiester internucleoside linkage (e.g., DNA or RNA) or can contain any or all of nonnatural nucleotide analogues, nonnative internucleoside bonds, or post-synthesis modifications, either throughout the length of the nucleic acid or localized to one or more portions thereof.

[0144] As is well known in the art, when the isolated nucleic acid is used as a hybridization probe, the range of such nonnatural analogues, nonnative internucleoside bonds, or post-synthesis modifications will be limited to those that permit sequence-discriminating basepairing of the resulting nucleic acid. When used to direct expression of RNA or protein in vitro or in vivo, the range of such nonnatural analogues, nonnative internucleoside bonds, or post-synthesis modifications will be limited to those that permit the nucleic acid to function properly as a polymerization substrate. When the isolated nucleic acid is used as a therapeutic agent, the range of such changes will be limited to those that do not confer toxicity upon the isolated nucleic acid.

[0145] For example, when desired to be used as probes, the isolated nucleic acids of the present invention can usefully include nucleotide analogues that incorporate labels that are directly detectable, such as radiolabels or fluorophores, or nucleotide analogues that incorporate labels that can be visualized in a subsequent reaction, such as biotin or various haptens.

[0146] Common radiolabeled analogues include those labeled with ³²P, ³³P, and ³⁵S, such as α-P-dATP, α-P-dCTP, α-P-dGTP, α-P-dTTP, α-P-3'dATP, α-P-ATP, α-P-CTP, α-P-GTP, α-P-UTP, α-³⁵S-dATP, γ-³⁵S-GTP, γ-P-dATP, and the like.

[0147] Commercially available fluorescent nucleotide analogues readily incorporated into the nucleic acids of the present invention include Cy3-dCTP, Cy3-dUTP, Cy5-dCTP, Cy5-dUTP (Amersham Pharmacia Biotech, Piscataway, N.J., USA), fluorescein-12-dUTP, tetramethylrhodamine-6-dUTP, Texas Red®-5-dUTP, Cascade Blue®-7-dUTP, BODIPY® FL-14-dUTP, BODIPY® TMR-14-dUTP, BODIPY® TR-14-dUTP, Rhodamine Green™-5-dUTP, Oregon Green® 488-5-dUTP, Texas Red®-12-dUTP, BODIPY® 630/650-14-dUTP, BODIPY® 650/665-14-dUTP, Alexa Fluor® 488-5-dUTP, Alexa Fluor® 532-5-dUTP, Alexa Fluor® 568-5-dUTP, Alexa Fluor® 594-5-dUTP, Alexa Fluor® 546-14-dUTP, fluorescein-12-UTP, tetramethylrhodamine-6-UTP, Texas Red®-5-UTP, Cascade Blue®-7-UTP, BODIPY® FL-14-UTP, BODIPY® TMR-14-UTP, BODIPY® TR-14-UTP, Rhodamine Green™-5-UTP, Alexa Fluor® 488-5-UTP, Alexa Fluor® 546-14-UTP (Molecular Probes, Inc., Eugene, Oreg., USA).

[0148] Protocols are available for custom synthesis of nucleotides having other fluorophores. Henegariu et al., “Custom Fluorescent-Nucleotide Synthesis as an Alternative Method for Nucleic Acid Labeling,” Nature Biotechnol. 18:345-348 (2000), the disclosure of which is incorporated herein by reference in its entirety.

[0149] Haptens that are commonly conjugated to nucleotides for subsequent labeling include biotin (biotin-11-dUTP, Molecular Probes, Inc., Eugene, Oreg., USA; biotin-21-UTP, biotin-21-dUTP, Clontech Laboratories, Inc., Palo Alto, Calif., USA), digoxigenin (DIG-11-dUTP, alkali labile, DIG-11-UTP, Roche Diagnostics Corp., Indianapolis, Ind., USA), and dinitrophenyl (dinitrophenyl-11-dUTP, Molecular Probes, Inc., Eugene, Oreg., USA).

[0150] As another example, when desired to be used for antisense inhibition of transcription or translation, the isolated nucleic acids of the present invention can usefully include altered, often nuclease-resistant, internucleoside bonds. See Hartmann et al. (eds.), Manual of Antisense Methodology (Perspectives in Antisense Science), Kluwer Law International (1999) (ISBN: 079238539X); Stein et al. (eds.), Applied Antisense Oligonucleotide Technology, Wiley-Liss (1998) (ISBN: 0471172790); Chadwick et al. (eds.), Oligonucleotides as Therapeutic Agents, Symposium No. 209, John Wiley & Son Ltd (1997) (ISBN: 0471972797), the disclosures of which are incorporated herein by reference in their entireties. Such altered internucleoside bonds are often desired also when the isolated nucleic acid of the present invention is to be used for targeted gene correction, Gamper et al., Nucl. Acids Res. 28(21):4332-4339 (2000), the disclosure of which is incorporated herein by reference in its entirety.

[0151] Modified oligonucleotide backbones often preferred when the nucleic acid is to be used for antisense purposes are, for example, phosphorothioates, chiral phosphorothioates, phosphorodithioates, phosphotriesters, aminoalkylphosphotriesters, methyl and other alkyl phosphonates including 3'-alkylene phosphonates and chiral phosphonates, phosphinates, phosphoramidates including 3'-amino phosphoramidate and aminoalkylphosphoramidates, thionophosphoramidates, thionoalkylphosphonates, thionoalkylphosphotriesters, and boranophosphates having normal 3'-5' linkages, 2'-5' linked analogs of these, and those having inverted polarity wherein the adjacent pairs of nucleoside units are linked 3'-5' to 5'-3' or 2'-5' to 5'-2'. Representative U.S. patents that teach the preparation of the above phosphorus-containing linkages include, but are not limited to, U.S. Pat. Nos. 3,687,808; 4,469,863; 4,476,301; 5,023,243; 5,177,196; 5,188,897; 5,264,423; 5,276,019; 5,278,302; 5,286,717; 5,321,131; 5,399,676; 5,405,939;

5,453,496; 5,455,233; 5,466,677; 5,476,925; 5,519,126; 5,536,821; 5,541,306; 5,550,111; 5,563,253; 5,571,799; 5,587,361; and 5,625,050, the disclosures of which are incorporated herein by reference in their entireties.

[0152] Preferred modified oligonucleotide backbones for antisense use that do not include a phosphorus atom have backbones that are formed by short chain alkyl or cycloalkyl internucleoside linkages, mixed heteroatom and alkyl or cycloalkyl internucleoside linkages, or one or more short chain heteroatomic or heterocyclic internucleoside linkages. These include those having morpholino linkages (formed in part from the sugar portion of a nucleoside); siloxane backbones; sulfide, sulfoxide and sulfone backbones; formacetyl and thioformacetyl backbones; methylene formacetyl and thioformacetyl backbones; alkene containing backbones; sulfamate backbones; methyleneimino and methylenehydrazino backbones; sulfonate and sulfonamide backbones; amide backbones; and others having mixed N, O, S and CH2 component parts. Representative U.S. patents that teach the preparation of the above backbones include, but are not limited to, U.S. Pat. Nos. 5,034,506; 5,166,315; 5,185,444; 5,214,134; 5,216,141; 5,235,033; 5,264,562; 5,264,564; 5,405,938; 5,434,257; 5,466,677; 5,470,967; 5,489,677; 5,541,307; 5,561,225; 5,596,086; 5,602,240; 5,610,289; 5,602,240; 5,608,046; 5,610,289; 5,618,704; 5,623,070; 5,663,312; 5,633,360; 5,677,437; and 5,677,439, the disclosures of which are incorporated herein by reference in their entireties.

[0153] In other preferred oligonucleotide mimetics, both the sugar and the internucleoside linkage are replaced with novel groups, such as peptide nucleic acids (PNA).

[0154] In PNA compounds, the phosphodiester backbone of the nucleic acid is replaced with an amide-containing backbone, in particular by repeating N-(2-aminoethyl)glycine units linked by amide bonds. Nucleobases are bound directly or indirectly to aza nitrogen atoms of the amide portion of the backbone, typically by methylene carbonyl linkages.

[0155] The uncharged nature of the PNA backbone provides PNA/DNA and PNA/RNA duplexes with a higher thermal stability than is found in DNA/DNA and DNA/RNA duplexes, resulting from the lack of charge repulsion between the PNA and DNA or RNA strand. In general, the Tm of a PNA/DNA or PNA/RNA duplex is 1° C. higher per base pair than the Tm of the corresponding DNA/DNA or DNA/RNA duplex (in 100 mM NaCl).

[0156] The neutral backbone also allows PNA to form stable DNA duplexes largely independent of salt concentration. At low ionic strength, PNA can be hybridized to a target sequence at temperatures that make DNA hybridization problematic or impossible. And unlike DNA/DNA duplex formation, PNA hybridization is possible in the absence of magnesium. Adjusting the ionic strength, therefore, is useful if competing DNA or RNA is present in the sample, or if the nucleic acid being probed contains a high level of secondary structure.

[0157] PNA also demonstrates greater specificity in binding to complementary DNA. A PNA/DNA mismatch is more destabilizing than a DNA/DNA mismatch. A single mismatch in a mixed PNA/DNA 15-mer lowers the Tm by 8-20° C. (15° C. on average). In the corresponding DNA/DNA duplexes, a single mismatch lowers the Tm by 4-16° C. (11° C. on average). Because PNA probes can be significantly shorter than DNA probes, their specificity is greater.

[0158] Additionally, nucleases and proteases do not recognize the PNA polyamide backbone with nucleobase sidechains. As a result, PNA oligomers are resistant to degradation by enzymes, and the lifetime of these compounds is extended both in vivo and in vitro. In addition, PNA is stable over a wide pH range.

[0159] Because its backbone is formed from amide bonds, PNA can be synthesized using a modified peptide synthesis protocol. PNA oligomers can be synthesized by both Fmoc and tBoc methods. Representative U.S. patents that teach the preparation of PNA compounds include, but are not limited to, U.S. Pat. Nos. 5,539,082; 5,714,331; and 5,719,262, each of which is hereby incorporated herein by reference; automated PNA synthesis is readily achievable on commercial synthesizers (see, e.g., “PNA User's Guide,” Rev. 2, February 1998, Perseptive Biosystems Part No. 60138, Applied Biosystems, Inc., Foster City, Calif.).

[0160] PNA chemistry and applications are reviewed, inter alia, in Ray et al., FASEB J. 14(9):1041-60 (2000); Nielsen et al., Pharmacol Toxicol. 86(1):3-7 (2000); Larsen et al., Biochim Biophys Acta. 1489(1):159-66 (1999); Nielsen, Curr. Opin. Struct. Biol. 9(3):353-7 (1999), and Nielsen, Curr. Opin. Biotechnol. 10(1):71-5 (1999), the disclosures of which are incorporated herein by reference in their entireties.

[0161] Differences from nucleic acid compositions found in nature (e.g., nonnative bases, altered internucleoside linkages, post-synthesis modification) can be present throughout the length of the nucleic acid or can, instead, usefully be localized to discrete portions thereof. As an example of the latter, chimeric nucleic acids can be synthesized that have discrete DNA and RNA domains and demonstrated utility for targeted gene repair, as further described in U.S. Pat. Nos. 5,760,012 and 5,731,181, the disclosures of which are incorporated herein by reference in their entireties. As another example, chimeric nucleic acids comprising both DNA and PNA have been demonstrated to have utility in modified PCR reactions. See Misra et al., Biochem. 37:1917-1925 (1998); see also Finn et al., Nucl. Acids Res. 24:3357-3363 (1996), incorporated herein by reference.

[0162] Unless otherwise specified, nucleic acids of the present invention can include any topological conformation appropriate to the desired use; the term thus explicitly comprehends, among others, single-stranded, double-stranded, triplexed, quadruplexed, partially double-stranded, partially-triplexed, partially-quadruplexed, branched, hairpinned, circular, and padlocked conformations. Padlock conformations and their utilities are further described in Baner et al., Curr. Opin. Biotechnol. 12:11-15 (2001); Escude et al., Proc. Natl. Acad. Sci. USA 96(19):10603-7 (1999); Nilsson et al., Science 265(5181):2085-8 (1994), the disclosures of which are incorporated herein by reference in their entireties. Triplex and quadruplex conformations, and their utilities, are reviewed in Praseuth et al., Biochim. Biophys. Acta. 1489(1):181-206 (1999); Fox, Curr. Med. Chem. 7(1):17-37 (2000); Kochetkova et al., Methods Mol. Biol. 130:189-201 (2000); Chan et al., J. Mol. Med. 75(4):267-82 (1997), the disclosures of which are incorporated herein by reference in their entireties.

[0163] The nucleic acids of the present invention can be detectably labeled.

[0164] Commonly-used labels include radionuclides, such as 32P, 33P, 35S, 3H (and for NMR detection, 13C and 15N), haptens that can be detected by specific antibody or high affinity binding partner (such as avidin), and fluorophores.

[0165] As noted above, detectable labels can be incorporated by inclusion of labeled nucleotide analogues in the nucleic acid. Such analogues can be incorporated by enzymatic polymerization, such as by nick translation, random priming, polymerase chain reaction (PCR), terminal transferase tailing, and end-filling of overhangs, for DNA molecules, and in vitro transcription driven, e.g., from phage promoters, such as T7, T3, and SP6, for RNA molecules. Commercial kits are readily available for each such labeling approach.

[0166] Analogues can also be incorporated during automated solid phase chemical synthesis.

[0167] As is well known, labels can also be incorporated after nucleic acid synthesis, with the 5' phosphate and 3' hydroxyl providing convenient sites for post-synthetic covalent attachment of detectable labels.

[0168] Various other post-synthetic approaches permit internal labeling of nucleic acids.

[0169] For example, fluorophores can be attached using a cisplatin reagent that reacts with the N7 of guanine residues (and, to a lesser extent, adenine bases) in DNA, RNA, and PNA to provide a stable coordination complex between the nucleic acid and fluorophore label (Universal Linkage System) (available from Molecular Probes, Inc., Eugene, Oreg., USA and Amersham Pharmacia Biotech, Piscataway, N.J., USA); see Alers et al., Genes, Chromosomes & Cancer, Vol. 25, pp. 301-305 (1999); Jelsma et al., J. NIH Res. 5:82 (1994); Van Belkum et al., BioTechniques 16:148-153 (1994), incorporated herein by reference. As another example, nucleic acids can be labeled using a disulfide-containing linker (FastTag Reagent, Vector Laboratories, Inc., Burlingame, Calif., USA) that is photo- or thermally coupled to the target nucleic acid using aryl azide chemistry; after reduction, a free thiol is available for coupling to a hapten, fluorophore, sugar, affinity ligand, or other marker.

[0170] Multiple independent or interacting labels can be incorporated into the nucleic acids of the present invention.

[0171] For example, both a fluorophore and a moiety that in proximity thereto acts to quench fluorescence can be included to report specific hybridization through release of fluorescence quenching, Tyagi et al., Nature Biotechnol. 14:303-308 (1996); Tyagi et al., Nature Biotechnol. 16:49-53 (1998); Sokol et al., Proc. Natl. Acad. Sci. USA 95:11538-11543 (1998); Kostrikis et al., Science 279:1228-1229 (1998); Marras et al., Genet. Anal. 14:151-156 (1999); U.S. Pat. Nos. 5,846,726 and 5,925,517, or to report exonucleotidic excision, U.S. Pat. No. 5,538,848; Holland et al., Proc. Natl. Acad. Sci. USA 88:7276-7280 (1991); Heid et al., Genome Res. 6(10):986-94 (1996); Kuimelis et al., Nucleic Acids Symp. Ser. (37):255-6 (1997); U.S. Pat. No. 5,723,591, the disclosures of which are incorporated herein by reference in their entireties.

[0172] So labeled, the isolated nucleic acids of the present invention can be used as probes, as further described below.

[0173] Nucleic acids of the present invention can also usefully be bound to a substrate. The substrate can be porous or solid, planar or non-planar, unitary or distributed; the bond can be covalent or noncovalent. Bound to a substrate, nucleic acids of the present invention can be used as probes in their unlabeled state.

[0174] For example, the nucleic acids of the present invention can usefully be bound to a porous substrate, commonly a membrane, typically comprising nitrocellulose, nylon, or positively-charged derivatized nylon; so attached, the nucleic acids of the present invention can be used to detect MDZ3, MDZ4, MDZ7 or MDZ12 nucleic acids present within a labeled nucleic acid sample, either a sample of genomic nucleic acids or a sample of transcript-derived nucleic acids, e.g. by reverse dot blot.

[0175] The nucleic acids of the present invention can also usefully be bound to a solid substrate, such as glass, although other solid materials, such as amorphous silicon, crystalline silicon, or plastics, can also be used. Such plastics include polymethylacrylic, polyethylene, polypropylene, polyacrylate, polymethylmethacrylate, polyvinylchloride, polytetrafluoroethylene, polystyrene, polycarbonate, polyacetal, polysulfone, celluloseacetate, cellulosenitrate, nitrocellulose, or mixtures thereof.

[0176] Typically, the solid substrate will be rectangular, although other shapes, particularly disks and even spheres, present certain advantages. Particularly advantageous alternatives to glass slides as support substrates for arrays of nucleic acids are optical discs, as described in Demers, "Spatially Addressable Combinatorial Chemical Arrays in CD-ROM Format," international patent publication WO 98/12559, incorporated herein by reference in its entirety.

[0177] The nucleic acids of the present invention can be attached covalently to a surface of the support substrate or applied to a derivatized surface in a chaotropic agent that facilitates denaturation and adherence by presumed noncovalent interactions, or some combination thereof.

[0178] The nucleic acids of the present invention can be bound to a substrate to which a plurality of other nucleic acids are concurrently bound, hybridization to each of the plurality of bound nucleic acids being separately detectable. At low density, e.g. on a porous membrane, these substrate-bound collections are typically denominated macroarrays; at higher density, typically on a solid support, such as glass, these substrate-bound collections of plural nucleic acids are colloquially termed microarrays. As used herein, the term microarray includes arrays of all densities. It is, therefore, another aspect of the invention to provide microarrays that include the nucleic acids of the present invention.

[0179] The isolated nucleic acids of the present invention can be used as hybridization probes to detect, characterize, and quantify MDZ3, MDZ4, MDZ7 or MDZ12 nucleic acids in, and isolate MDZ3, MDZ4, MDZ7 or MDZ12 nucleic acids from, both genomic and transcript-derived nucleic acid samples. When free in solution, such probes are typically, but not invariably, detectably labeled; bound to a substrate, as in a microarray, such probes are typically, but not invariably, unlabeled.

[0180] For example, the isolated nucleic acids of the present invention can be used as probes to detect and characterize gross alterations in the MDZ3, MDZ4, MDZ7

or MDZ12 genomic locus, such as deletions, insertions, translocations, and duplications of the MDZ3, MDZ4, MDZ7 or MDZ12 genomic locus through fluorescence in situ hybridization (FISH) to chromosome spreads. See, e.g., Andreeff et al. (eds.), Introduction to Fluorescence In Situ Hybridization: Principles and Clinical Applications, John Wiley & Sons (1999) (ISBN: 0471013455), the disclosure of which is incorporated herein by reference in its entirety. The isolated nucleic acids of the present invention can be used as probes to assess smaller genomic alterations using, e.g., Southern blot detection of restriction fragment length polymorphisms. The isolated nucleic acids of the present invention can be used as probes to isolate genomic clones that include the nucleic acids of the present invention, which thereafter can be restriction mapped and sequenced to identify deletions, insertions, translocations, and substitutions (single nucleotide polymorphisms, SNPs) at the sequence level.

[0181] The isolated nucleic acids of the present invention can also be used as probes to detect, characterize, and quantify MDZ3, MDZ4, MDZ7 or MDZ12 nucleic acids in, and isolate MDZ3, MDZ4, MDZ7 or MDZ12 nucleic acids from, transcript-derived nucleic acid samples.

[0182] For example, the isolated nucleic acids of the present invention can be used as hybridization probes to detect, characterize by length, and quantify MDZ3, MDZ4, MDZ7 or MDZ12 mRNA by northern blot of total or poly-A+-selected RNA samples. For example, the isolated nucleic acids of the present invention can be used as hybridization probes to detect, characterize by location, and quantify MDZ3, MDZ4, MDZ7 or MDZ12 message by in situ hybridization to tissue sections (see, e.g., Schwarzacher et al., In Situ Hybridization, Springer-Verlag New York (2000) (ISBN: 0387915966), the disclosure of which is incorporated herein by reference in its entirety). For example, the isolated nucleic acids of the present invention can be used as hybridization probes to measure the representation of MDZ3, MDZ4, MDZ7 or MDZ12 clones in a cDNA library. For example, the isolated nucleic acids of the present invention can be used as hybridization probes to isolate MDZ3, MDZ4, MDZ7 or MDZ12 nucleic acids from cDNA libraries, permitting sequence level characterization of MDZ3, MDZ4, MDZ7 or MDZ12 messages, including identification of deletions, insertions, truncations (including deletions, insertions, and truncations of exons in alternatively spliced forms) and single nucleotide polymorphisms.

[0183] All of the aforementioned probe techniques are well within the skill in the art, and are described at greater length in standard texts such as Sambrook et al., Molecular Cloning: A Laboratory Manual (3rd ed.), Cold Spring Harbor Laboratory Press (2001) (ISBN: 0879695773); Ausubel et al. (eds.), Short Protocols in Molecular Biology: A Compendium of Methods from Current Protocols in Molecular Biology (4th ed.), John Wiley & Sons, 1999 (ISBN: 047132938X); and Walker et al. (eds.), The Nucleic Acids Protocols Handbook, Humana Press (2000) (ISBN: 0896034593), the disclosures of which are incorporated herein by reference in their entirety.

[0184] As described in the Examples herein below, the nucleic acids of the present invention can also be used to detect and quantify MDZ3, MDZ4, MDZ7 or MDZ12 nucleic acids in transcript-derived samples (that is, to measure expression of the MDZ3, MDZ4, MDZ7 or MDZ12 gene) when included in a microarray. Measurement of MDZ3, MDZ4, MDZ7 or MDZ12 expression has particular utility in diagnosis and treatment of a variety of diseases, including developmental disorders and cancer, as further described in the Examples herein below.

[0185] As would be readily apparent to one of skill in the art, each MDZ3, MDZ4, MDZ7 or MDZ12 nucleic acid probe, whether labeled, substrate-bound, or both, is thus currently available for use as a tool for measuring the level of MDZ3, MDZ4, MDZ7 or MDZ12 expression in each of the tissues in which expression has already been confirmed, notably testes for MDZ7; brain, testis, heart and bone marrow for MDZ3; bone marrow, brain, heart, HeLa, adult liver, fetal liver, lung, placenta and prostate for MDZ4; and brain, heart, kidney, placenta, skeletal muscle, testis, HeLa cells, bone marrow and liver for MDZ12. The utility is specific to the probe: under high stringency conditions, the probe reports the level of expression of message specifically containing that portion of the MDZ3, MDZ4, MDZ7 or MDZ12 gene included within the probe.

[0186] Measuring tools are well known in many arts, not just in molecular biology, and are known to possess credible, specific, and substantial utility. For example, U.S. Pat. No. 6,016,191 describes and claims a tool for measuring characteristics of fluid flow in a hydrocarbon well; U.S. Pat. No. 6,042,549 describes and claims a device for measuring exercise intensity; U.S. Pat. No. 5,889,351 describes and claims a device for measuring viscosity and for measuring characteristics of a fluid; U.S. Pat. No. 5,570,694 describes and claims a device for measuring blood pressure; U.S. Pat. No. 5,930,143 describes and claims a device for measuring the dimensions of machine tools; U.S. Pat. No. 5,279,044 describes and claims a measuring device for determining an absolute position of a movable element; U.S. Pat. No. 5,186,042 describes and claims a device for measuring action force of a wheel; and U.S. Pat. No. 4,246,774 describes and claims a device for measuring the draft of smoking articles such as cigarettes.

[0187] As for tissues not yet demonstrated to express MDZ3, MDZ4, MDZ7 or MDZ12, the MDZ3, MDZ4, MDZ7 or MDZ12 nucleic acid probes of the present invention are currently available as tools for surveying such tissues to detect the presence of MDZ3, MDZ4, MDZ7 or MDZ12 nucleic acids.

[0188] Survey tools (i.e., tools for determining the presence and/or location of a desired object by search of an area) are well known in many arts, not just in molecular biology, and are known to possess credible, specific, and substantial utility. For example, U.S. Pat. No. 6,046,800 describes and claims a device for surveying an area for objects that move; U.S. Pat. No. 6,025,201 describes and claims an apparatus for locating and discriminating platelets from non-platelet particles or cells on a cell-by-cell basis in a whole blood sample; U.S. Pat. No. 5,990,689 describes and claims a device for detecting and locating anomalies in the electromagnetic protection of a system; U.S. Pat. No. 5,984,175 describes and claims a device for detecting and identifying wearable user identification units; and U.S. Pat. No. 3,980,986 ("Oil well survey tool") describes and claims a tool for finding the position of a drill bit working at the bottom of a borehole.
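By way of illustration only, the base-pairing logic behind the hybridization probe uses described in paragraphs [0179] to [0185] can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions: the sequences are invented toy strings and not any MDZ sequence, the function names `reverse_complement` and `probe_sites` are hypothetical, and a perfect-match string search is only a crude stand-in for hybridization under high stringency conditions.

```python
# Toy illustration only: sequences are invented and are NOT MDZ sequences.
# Exact matching is an assumed, simplified proxy for high stringency.

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def probe_sites(probe, target):
    """Return 0-based start positions on `target` at which `probe`
    could anneal with perfect complementarity."""
    site = reverse_complement(probe)  # the stretch of target the probe pairs with
    target = target.upper()
    positions = []
    start = target.find(site)
    while start != -1:
        positions.append(start)
        start = target.find(site, start + 1)
    return positions


# Hypothetical transcript-derived strand and an antisense 17-mer probe.
target = "TTGACCGTAGCTAGGCTAACGTTAGC"
probe = reverse_complement("CCGTAGCTAGGCTAACG")
print(probe_sites(probe, target))
```

A probe that finds no site returns an empty list, which mirrors the point made above that the signal reported is specific to the portion of the gene included within the probe.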

[0189] As noted above, the nucleic acid probes of the present invention are useful in constructing microarrays; the microarrays, in turn, are products of manufacture that are useful for measuring and for surveying gene expression.

[0190] When included on a microarray, each MDZ3, MDZ4, MDZ7 or MDZ12 nucleic acid probe makes the microarray specifically useful for detecting that portion of the MDZ3, MDZ4, MDZ7 or MDZ12 gene included within the probe, thus imparting upon the microarray device the ability to detect a signal where, absent such probe, it would have reported no signal. This utility makes each individual probe on such microarray akin to an antenna, circuit, firmware or software element included in an electronic apparatus, where the antenna, circuit, firmware or software element imparts upon the apparatus the ability newly and additionally to detect signal in a portion of the radio-frequency spectrum where previously it could not; such devices are known to have specific, substantial, and credible utility.

[0191] Changes in the level of expression need not be observed for the measurement of expression to have utility.

[0192] For example, where gene expression analysis is used to assess toxicity of chemical agents on cells, the failure of the agent to change a gene's expression level is evidence that the drug likely does not affect the pathway of which the gene's expressed protein is a part. Analogously, where gene expression analysis is used to assess side effects of pharmacologic agents (whether in lead compound discovery or in subsequent screening of lead compound derivatives), the inability of the agent to alter a gene's expression level is evidence that the drug does not affect the pathway of which the gene's expressed protein is a part.

[0193] WO 99/58720, incorporated herein by reference in its entirety, provides methods for quantifying the relatedness of a first and second gene expression profile and for ordering the relatedness of a plurality of gene expression profiles, without regard to the identity or function of the genes whose expression is used in the calculation.

[0194] Gene expression analysis, including gene expression analysis by microarray hybridization, is, of course, principally a laboratory-based art. Devices and apparatus used principally in laboratories to facilitate laboratory research are well-established to possess specific, substantial, and credible utility. For example, U.S. Pat. No. 6,001,233 describes and claims a gel electrophoresis apparatus having a cam-activated clamp; for example, U.S. Pat. No. 6,051,831 describes and claims a high mass detector for use in time-of-flight mass spectrometers; for example, U.S. Pat. No. 5,824,269 describes and claims a flow cytometer. As is well known, few gel electrophoresis apparatuses, TOF-MS devices, or flow cytometers are sold for consumer use.

[0195] Indeed, and in particular, nucleic acid microarrays, as devices intended for laboratory use in measuring gene expression, are well-established to have specific, substantial and credible utility. Thus, the microarrays of the present invention have at least the specific, substantial and credible utilities of the microarrays claimed as devices and articles of manufacture in the following U.S. patents, the disclosures of each of which is incorporated herein by reference: U.S. Pat. No. 5,445,934 ("Array of oligonucleotides on a solid substrate"); U.S. Pat. No. 5,744,305 ("Arrays of materials attached to a substrate"); and U.S. Pat. No. 6,004,752 ("Solid support with attached molecules").

[0196] Genome-derived single exon probes and genome-derived single exon probe microarrays have the additional utility, inter alia, of permitting high-throughput detection of splice variants of the nucleic acids of the present invention, as further described in copending and commonly owned U.S. patent application Ser. No. 09/632,366, filed Aug. 3, 2000, the disclosure of which is incorporated herein by reference in its entirety.

[0197] The isolated nucleic acids of the present invention can also be used to prime synthesis of nucleic acid, for purpose of either analysis or isolation, using mRNA, cDNA, or genomic DNA as template.

[0198] For use as primers, at least 17 contiguous nucleotides of the isolated nucleic acids of the present invention will be used. Often, at least 18, 19, or 20 contiguous nucleotides of the nucleic acids of the present invention will be used, and on occasion at least 20, 22, 24, or 25 contiguous nucleotides of the nucleic acids of the present invention will be used, and even 30 nucleotides or more of the nucleic acids of the present invention can be used to prime specific synthesis.

[0199] The nucleic acid primers of the present invention can be used, for example, to prime first strand cDNA synthesis on an mRNA template.

[0200] Such primer extension can be done directly to analyze the message. Alternatively, synthesis on an mRNA template can be done to produce first strand cDNA. The first strand cDNA can thereafter be used, inter alia, directly as a single-stranded probe, as above-described; as a template for sequencing, permitting identification of alterations, including deletions, insertions, and substitutions, both normal allelic variants and mutations associated with abnormal phenotypes; or as a template, either for second strand cDNA synthesis (e.g., as an antecedent to insertion into a cloning or expression vector), or for amplification.

[0201] The nucleic acid primers of the present invention can also be used, for example, to prime single base extension (SBE) for SNP detection (see, e.g., U.S. Pat. No. 6,004,744, the disclosure of which is incorporated herein by reference in its entirety).

[0202] As another example, the nucleic acid primers of the present invention can be used to prime amplification of MDZ3, MDZ4, MDZ7 or MDZ12 nucleic acids, using transcript-derived or genomic DNA as template.

[0203] Primer-directed amplification methods are now well-established in the art. Methods for performing the polymerase chain reaction (PCR) are compiled, inter alia, in McPherson, PCR (Basics: From Background to Bench), Springer Verlag (2000) (ISBN: 0387916008); Innis et al. (eds.), PCR Applications: Protocols for Functional Genomics, Academic Press (1999) (ISBN: 0123721857); Gelfand et al. (eds.), PCR Strategies, Academic Press (1998) (ISBN: 0123721822); Newton et al., PCR, Springer-Verlag New York (1997) (ISBN: 0387915060); Burke (ed.), PCR: Essential Techniques, John Wiley & Son Ltd (1996) (ISBN: 047195697X); White (ed.), PCR Cloning Protocols: From Molecular Cloning to Genetic Engineering, Vol. 67, Humana Press (1996) (ISBN: 0896033430); McPherson et al. (eds.), PCR 2: A Practical Approach, Oxford University Press, Inc. (1995) (ISBN: 0199634254), the disclosures of which are incorporated herein by reference in their entireties. Methods for performing RT-PCR are collected, e.g., in Siebert et al. (eds.), Gene Cloning and Analysis by RT-PCR, Eaton Publishing Company/BioTechniques Books Division, 1998 (ISBN: 1881299147); Siebert (ed.), PCR Technique: RT-PCR, Eaton Publishing Company/BioTechniques Books (1995) (ISBN: 1881299139), the disclosure of which is incorporated herein by reference in its entirety.

[0204] Isothermal amplification approaches, such as rolling circle amplification, are also now well-described. See, e.g., Schweitzer et al., Curr. Opin. Biotechnol. 12(1):21-7 (2001); U.S. Pat. Nos. 6,235,502, 6,221,603, 6,210,884, 6,183,960, 5,854,033, 5,714,320, 5,648,245, and international patent publications WO 97/19193 and WO 00/15779, the disclosures of which are incorporated herein by reference in their entireties. Rolling circle amplification can be combined with other techniques to facilitate SNP detection. See, e.g., Lizardi et al., Nature Genet. 19(3):225-32 (1998).

[0205] As further described below, nucleic acids of the present invention, inserted into vectors that flank the nucleic acid insert with a phage promoter, such as T7, T3, or SP6 promoter, can be used to drive in vitro expression of RNA complementary to either strand of the nucleic acid of the present invention. The RNA can be used, inter alia, as a single-stranded probe, in cDNA-mRNA subtraction, or for in vitro translation.

[0206] As will be further discussed herein below, nucleic acids of the present invention that encode MDZ3, MDZ4, MDZ7 or MDZ12 protein or portions thereof can be used, inter alia, to express the MDZ3, MDZ4, MDZ7 or MDZ12 proteins or protein fragments, either alone, or as part of fusion proteins.

[0207] Expression can be from genomic nucleic acids of the present invention, or from transcript-derived nucleic acids of the present invention.

[0208] Where protein expression is effected from genomic DNA, expression will typically be effected in eukaryotic, typically mammalian, cells capable of splicing introns from the initial RNA transcript. Expression can be driven from episomal vectors, such as EBV-based vectors, or can be effected from genomic DNA integrated into a host cell chromosome. As will be more fully described below, where expression is from transcript-derived (or otherwise intron-less) nucleic acids of the present invention, expression can be effected in a wide variety of prokaryotic or eukaryotic cells.

[0209] Expressed in vitro, the protein, protein fragment, or protein fusion can thereafter be isolated, to be used, inter alia, as a standard in immunoassays specific for the proteins, or protein isoforms, of the present invention; to be used as a therapeutic agent, e.g., to be administered as passive replacement therapy in individuals deficient in the proteins of the present invention, or to be administered as a vaccine; to be used for in vitro production of specific antibody, the antibody thereafter to be used, e.g., as an analytical reagent for detection and quantitation of the proteins of the present invention or to be used as an immunotherapeutic agent.

[0210] The isolated nucleic acids of the present invention can also be used to drive in vivo expression of the proteins of the present invention. In vivo expression can be driven from a vector (typically a viral vector, often a vector based upon a replication incompetent retrovirus, an adenovirus, or an adeno-associated virus (AAV)) for purpose of gene therapy. In vivo expression can also be driven from signals endogenous to the nucleic acid or from a vector, often a plasmid vector, such as pVAX1 (Invitrogen, Carlsbad Calif., USA), for purpose of "naked" nucleic acid vaccination, as further described in U.S. Pat. Nos. 5,589,466; 5,679,647; 5,804,566; 5,830,877; 5,843,913; 5,880,104; 5,958,891; 5,985,847; 6,017,897; 6,110,898; 6,204,250, the disclosures of which are incorporated herein by reference in their entireties.

[0211] The nucleic acids of the present invention can also be used for antisense inhibition of transcription or translation. See Phillips (ed.), Antisense Technology, Part B, Methods in Enzymology Vol. 314, Academic Press, Inc. (1999) (ISBN: 012182215X); Phillips (ed.), Antisense Technology, Part A, Methods in Enzymology Vol. 313, Academic Press, Inc. (1999) (ISBN: 0121822141); Hartmann et al. (eds.), Manual of Antisense Methodology (Perspectives in Antisense Science), Kluwer Law International (1999) (ISBN: 079238539X); Stein et al. (eds.), Applied Antisense Oligonucleotide Technology, Wiley-Liss (1998) (ISBN: 0471172790); Agrawal et al. (eds.), Antisense Research and Application, Springer-Verlag New York, Inc. (1998) (ISBN: 3540638334); Lichtenstein et al. (eds.), Antisense Technology: A Practical Approach, Vol. 185, Oxford University Press, Inc. (1998) (ISBN: 0199635838); Gibson (ed.), Antisense and Ribozyme Methodology: Laboratory Companion, Chapman & Hall (1997) (ISBN: 3826100794); Chadwick et al. (eds.), Oligonucleotides as Therapeutic Agents, Symposium No. 209, John Wiley & Son Ltd (1997) (ISBN: 0471972797), the disclosures of which are incorporated herein by reference in their entireties.

[0212] Nucleic acids of the present invention, particularly cDNAs of the present invention, that encode full-length human MDZ3, MDZ4, MDZ7 or MDZ12 protein isoforms, have additional, well-recognized, immediate, real world utility as commercial products of manufacture suitable for sale.

[0213] For example, Invitrogen Corp. (Carlsbad, Calif., USA), through its Research Genetics subsidiary, sells full length human cDNAs cloned into one of a selection of expression vectors as GeneStorm® expression-ready clones; utility is specific for the gene, since each gene is capable of being ordered separately and has a distinct catalogue number, and utility is substantial, each clone selling for $650.00 US. Similarly, Incyte Genomics (Palo Alto, Calif., USA) sells clones from public and proprietary sources in multi-well plates or individual tubes.

[0214] Nucleic acids of the present invention that include genomic regions encoding the human MDZ3, MDZ4, MDZ7 or MDZ12 protein, or portions thereof, have yet further utilities.

[0215] For example, genomic nucleic acids of the present invention can be used as amplification substrates, e.g. for preparation of genome-derived single exon probes of the present invention, as described above and in copending and commonly-owned U.S. patent application Ser. No. 09/864,761, filed May 23, 2001, Ser. No. 09/774,203, filed Jan. 29, 2001, and Ser. No. 09/632,366, filed Aug. 3, 2000, the disclosures of which are incorporated herein by reference in their entireties.

[0216] As another example, genomic nucleic acids of the present invention can be integrated non-homologously into

the genome of somatic cells, e.g. CHO cells, COS cells, or 293 cells, with or without amplification of the insertional locus, in order, e.g., to create stable cell lines capable of producing the proteins of the present invention.

[0217] As another example, more fully described herein below, genomic nucleic acids of the present invention can be integrated nonhomologously into embryonic stem (ES) cells to create transgenic non-human animals capable of producing the proteins of the present invention.

[0218] Genomic nucleic acids of the present invention can also be used to target homologous recombination to the human MDZ3, MDZ4, MDZ7 or MDZ12 locus, respectively. See, e.g., U.S. Pat. Nos. 6,187,305; 6,204,061; 5,631,153; 5,627,059; 5,487,992; 5,464,764; 5,614,396; 5,527,695 and 6,063,630; and Kmiec et al. (eds.), Gene Targeting Protocols, Vol. 133, Humana Press (2000) (ISBN: 0896033600); Joyner (ed.), Gene Targeting: A Practical Approach, Oxford University Press, Inc. (2000) (ISBN: 0199637938); Sedivy et al., Gene Targeting, Oxford University Press (1998) (ISBN: 071677013X); Tymms et al. (eds.), Gene Knockout Protocols, Humana Press (2000) (ISBN: 0896035727); Mak et al. (eds.), The Gene Knockout FactsBook, Vol. 2, Academic Press, Inc. (1998) (ISBN: 0124660444); Torres et al., Laboratory Protocols for Conditional Gene Targeting, Oxford University Press (1997) (ISBN: 019963677X); Vega (ed.), Gene Targeting, CRC Press, LLC (1994) (ISBN: 084938950X), the disclosures of which are incorporated herein by reference in their entireties.

[0219] Where the genomic region includes transcription regulatory elements, homologous recombination can be used to alter the expression of MDZ3, MDZ4, MDZ7 or MDZ12, both for purpose of in vitro production of MDZ3, MDZ4, MDZ7 or MDZ12 protein from human cells, and for purpose of gene therapy. See, e.g., U.S. Pat. Nos. 5,981,214; 6,048,524; 5,272,071.

[0220] Fragments of the nucleic acids of the present invention smaller than those typically used for homologous recombination can also be used for targeted gene correction or alteration, possibly by cellular mechanisms different from those engaged during homologous recombination.

[0221] For example, partially duplexed RNA/DNA chimeras have been shown to have utility in targeted gene correction, U.S. Pat. Nos. 5,945,339, 5,888,983, 5,871,984, 5,795,972, 5,780,296, 5,760,012, 5,756,325, 5,731,181, the disclosures of which are incorporated herein by reference in their entireties. So too have small oligonucleotides fused to triplexing domains been shown to have utility in targeted gene correction, Culver et al., "Correction of chromosomal point mutations in human cells with bifunctional oligonucleotides," Nature Biotechnol. 17(10):989-93 (1999), as have oligonucleotides having modified terminal bases or modified terminal internucleoside bonds, Gamper et al., Nucl. Acids Res. 28(21):4332-9 (2000), the disclosures of which are incorporated herein by reference.

[0222] The isolated nucleic acids of the present invention can also be used to provide the initial substrate for recombinant engineering of MDZ3, MDZ4, MDZ7 or MDZ12 protein variants having desired phenotypic improvements. Such engineering includes, for example, site-directed mutagenesis, random mutagenesis with subsequent functional screening, and more elegant schemes for recombinant evolution of proteins, as are described, inter alia, in U.S. Pat. Nos. 6,180,406; 6,165,793; 6,117,679; and 6,096,548, the disclosures of which are incorporated herein by reference in their entireties.

[0223] Nucleic acids of the present invention can be obtained by using the labeled probes of the present invention to probe nucleic acid samples, such as genomic libraries, cDNA libraries, and mRNA samples, by standard techniques. Nucleic acids of the present invention can also be obtained by amplification, using the nucleic acid primers of the present invention, as further demonstrated in Example 1, herein below. Nucleic acids of the present invention of fewer than about 100 nt can also be synthesized chemically, typically by solid phase synthesis using commercially available automated synthesizers.

[0224] "Full Length" Human MDZ3, MDZ4, MDZ7 or MDZ12 Nucleic Acids

[0225] In a first series of nucleic acid embodiments, the invention provides isolated nucleic acids that encode the entirety of the MDZ3, MDZ4, MDZ7 or MDZ12a, MDZ12bS, MDZ12bL protein. As discussed above, the "full-length" nucleic acids of the present invention can be used, inter alia, to express full length MDZ3, MDZ4, MDZ7 or MDZ12a, MDZ12bS, MDZ12bL protein. The full-length nucleic acids can also be used as nucleic acid probes; used as probes, the isolated nucleic acids of these embodiments will respectively hybridize to MDZ3, MDZ4, MDZ7 or MDZ12, or to portions thereof.

[0226] In a first such embodiment, the invention provides an isolated nucleic acid comprising (i) the nucleotide sequence of the nucleic acid of any of ATCC deposits (MDZ3), (MDZ4), (MDZ7), and (MDZ12a and MDZ12b), (ii) the nucleotide sequence of any of SEQ ID NOs: 1, 3027, 4407, 5770, 6938 or (iii) the complement of (i) or (ii). The ATCC deposits have, and SEQ ID NOs: 1, 3027, 4407, 5770, 6938 present, the entire cDNA of MDZ3, MDZ4, MDZ7, and MDZ12a and MDZ12b, respectively, including the 5' untranslated (UT) region and 3' UT (except for MDZ12b).

[0227] In a second embodiment, the invention provides an isolated nucleic acid comprising (i) the nucleotide sequence of SEQ ID NOs: 2, 3028, 4408, 5771, (ii) a degenerate variant of the nucleotide sequence of SEQ ID NOs: 2, 3028, 4408, 5771, or (iii) the complement of (i) or (ii). SEQ ID NOs: 2, 3028, 4408, 5771 present the open reading frame (ORF) from SEQ ID NOs: 1, 3027, 4407, 5770, respectively.

[0228] In a third embodiment, the invention provides an isolated nucleic acid of no more than about 100 kb, often no more than about 75 kb, typically no more than about 50 kb, and usefully no more than about 25 kb, even no more than about 10 kb, comprising (i) a nucleotide sequence that encodes a polypeptide with the amino acid sequence of any of SEQ ID NOs: 3, 3029, 4409, 5772, 6939, or 6940, or (ii) the complement of a nucleotide sequence that encodes a polypeptide with the amino acid sequence of any of SEQ ID NOs: 3, 3029, 4409, 5772, 6939, or 6940. SEQ ID NOs: 3, 3029, 4409, 5772, 6939, 6940 provide the amino acid sequence of MDZ3, MDZ4, MDZ7, MDZ12a, MDZ12bS and MDZ12bL, respectively.

[0229] In a fourth embodiment, the invention provides an isolated nucleic acid of no more than about 100 kb, often no

more than about 75 kb, typically no more than about 50 kb, and usefully no more than about 25 kb, even no more than about 10 kb, having a nucleotide sequence that (i) encodes a polypeptide having the sequence of any of SEQ ID NOs: 3, 3029, 4409, 5772, 6939, or 6940 with conservative amino acid substitutions, (ii) encodes a polypeptide having the sequence of any of SEQ ID NOs: 3, 3029, 4409, 5772, 6939, or 6940 with moderately conservative amino acid substitutions, or (iii) the complement of (i) or (ii).

[0230] Selected Partial Nucleic Acids

[0231] In a second series of nucleic acid embodiments, the invention provides isolated nucleic acids that encode select portions of MDZ3, MDZ4, MDZ7 and MDZ12, respectively. As will be further discussed herein below, these "partial" nucleic acids can be used, inter alia, to express specific portions of MDZ3, MDZ4, MDZ7 and MDZ12, respectively. These "partial" nucleic acids can also be used, inter alia, as nucleic probes.

[0232] Selected Partial Nucleic Acids of MDZ3

[0233] In a first such embodiment, the invention provides an isolated nucleic acid comprising (i) the nucleotide sequence of SEQ ID NO: 4, (ii) the nucleotide sequence of SEQ ID NO: 5, (iii) the nucleotide sequence of SEQ ID NO: 6, (iv) a degenerate variant of SEQ ID NO: 6, or (v) the complement of any of (i)-(iv). SEQ ID NO: 4 presents that portion of MDZ3 not present in known ESTs, with SEQ ID NO: 5 representing the 5' UT portion of SEQ ID NO: 4 and SEQ ID NO: 6 representing the coding region of SEQ ID NO: 4. Often, the isolated nucleic acids of this embodiment are no more than about 100 kb in length, often no more than about 75 kb in length, or 50 kb, or even 25 kb in length, and can be no more than about 15 kb in length, and frequently no more than about 10 kb in length.

[0234] In another embodiment, the invention provides an isolated nucleic acid comprising (i) a nucleotide sequence that encodes SEQ ID NO: 7 or (ii) the complement of a nucleotide sequence that encodes SEQ ID NO: 7, wherein the isolated nucleic acid is no more than about 100 kb in length, often no more than about 75 kb in length, or 50 kb, or even 25 kb in length, and can be no more than about 15 kb in length, and frequently no more than about 10 kb in length.

[0235] SEQ ID NO: 7 is the amino acid sequence encoded by SEQ ID NO: 6.

[0236] In another embodiment, the invention provides an isolated nucleic acid comprising (i) a nucleotide sequence that encodes SEQ ID NO: 7 with conservative substitutions, (ii) a nucleotide sequence that encodes SEQ ID NO: 7 with moderately conservative substitutions, or (iii) the complement of (i) or (ii), wherein the isolated nucleic acid is no more than about 100 kb in length, often no more than about 75 kb in length, or 50 kb, or even 25 kb in length, and can be no more than about 15 kb in length, and frequently no more than about 10 kb in length.

[0237] Selected Partial Nucleic Acids of MDZ4

[0238] In a first such embodiment, the invention provides an isolated nucleic acid comprising (i) the nucleotide sequence of SEQ ID NO: 3030, (ii) the nucleotide sequence of SEQ ID NO: 3031, (iii) the nucleotide sequence of SEQ ID NO: 3032, (iv) a degenerate variant of the nucleotide sequence of SEQ ID NO: 3032, or (v) the complement of any of (i)-(iv). SEQ ID NO: 3030 presents a portion of MDZ4 not present in known ESTs, with SEQ ID NO: 3031 representing the 5' UT portion of SEQ ID NO: 3030 and SEQ ID NO: 6 representing the coding region of SEQ ID NO: 3030, wherein the isolated nucleic acids of this embodiment are no more than about 100 kb in length, often no more than about 75 kb in length, or 50 kb, or even 25 kb in length, and can be no more than about 15 kb in length, and frequently no more than about 10 kb in length.

[0239] In another embodiment, the invention provides an isolated nucleic acid comprising (i) a nucleotide sequence that encodes SEQ ID NO: 3033 or (ii) the complement of a nucleotide sequence that encodes SEQ ID NO: 3033, wherein the isolated nucleic acid is no more than about 100 kb in length, typically no more than about 75 kb in length, frequently no more than about 50 kb in length. SEQ ID NO: 3033 is the amino acid sequence encoded by that portion of MDZ4 not present in known EST fragments. Often, the isolated nucleic acids of this embodiment are no more than about 25 kb in length, often no more than about 15 kb in length, and frequently no more than about 10 kb in length.

[0240] In another embodiment, the invention provides an isolated nucleic acid comprising (i) a nucleotide sequence that encodes SEQ ID NO: 3033 with conservative substitutions, (ii) a nucleotide sequence that encodes SEQ ID NO: 3033 with moderately conservative substitutions, or (iii) the complement of (i) or (ii), wherein the isolated nucleic acid is no more than about 100 kb in length, typically no more than about 75 kb in length, and often no more than about 50 kb in length. Often, the isolated nucleic acids of this embodiment are no more than about 25 kb in length, often no more than about 15 kb in length, and frequently no more than about 10 kb in length. In another embodiment, the invention provides an isolated nucleic acid comprising (i) the nucleotide sequence of SEQ ID NO: 3034, (ii) the nucleotide sequence of SEQ ID NO: 3035, (iii) a degenerate variant of the nucleotide sequence of SEQ ID NO: 3035, (iv) the nucleotide sequence of SEQ ID NO: 3036, or (v) the complement of any of (i)-(iv), wherein the isolated nucleic acid is no more than about 100 kb in length, typically no more than about 75 kb in length, and often no more than about 50 kb in length. Often, the isolated nucleic acids of this embodiment are no more than about 25 kb in length, often no more than about 15 kb in length, and frequently no more than about 10 kb in length.

[0241] SEQ ID NO: 3034 presents a portion of MDZ4 not present in known ESTs, with SEQ ID NO: 3035 representing a coding region portion of SEQ ID NO: 3034, and with SEQ ID NO: 3036 representing the 3' UT portion of SEQ ID NO: 3034.

[0242] In another embodiment, the invention provides an isolated nucleic acid comprising (i) a nucleotide sequence that encodes SEQ ID NO: 3037 or (ii) the complement of a nucleotide sequence that encodes SEQ ID NO: 3037, wherein the isolated nucleic acid is no more than about 100 kb in length, typically no more than about 75 kb in length, frequently no more than about 50 kb in length. SEQ ID NO: 3037 is the amino acid sequence encoded by SEQ ID NO: 3035. Often, the isolated nucleic acids of this embodiment are no more than about 25 kb in length, often no more than about 15 kb in length, and frequently no more than about 10 kb in length.

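Several of the embodiments above recite a "degenerate variant" of a coding sequence, that is, a different nucleotide sequence that encodes the same amino acids. The following is a minimal illustrative sketch of that idea, not part of the disclosure. It assumes the standard genetic code, and the function names (`translate`, `count_synonymous_sequences`, `is_degenerate_variant`) are hypothetical. The example input is the four codons ATG CTT AAA GAG, which FIG. 3A shows translated as M L K E; it is used only as a short toy input.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order ("*" = stop).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(cds: str) -> str:
    """Translate a coding sequence codon by codon; trailing partial codons are ignored."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, usable, 3))


def count_synonymous_sequences(cds: str) -> int:
    """Count nucleotide sequences (including the input) that encode the same peptide."""
    total = 1
    for aa in translate(cds):
        total *= sum(1 for a in CODON_TABLE.values() if a == aa)
    return total


def is_degenerate_variant(a: str, b: str) -> bool:
    """True if a and b differ in nucleotide sequence but encode the same amino acids."""
    a, b = a.upper(), b.upper()
    return a != b and len(a) == len(b) and translate(a) == translate(b)


if __name__ == "__main__":
    toy = "ATGCTTAAAGAG"  # toy input: first four codons shown in FIG. 3A
    print(translate(toy))
    print(count_synonymous_sequences(toy))
    print(is_degenerate_variant(toy, "ATGCTGAAGGAA"))
```

Because the count grows multiplicatively with each codon, a full-length coding region has an astronomically large set of degenerate variants, which is why the sketch counts them rather than enumerating them.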
[0243] In another embodiment, the invention provides an isolated nucleic acid comprising (i) a nucleotide sequence that encodes SEQ ID NO: 3037 with conservative substitutions, (ii) a nucleotide sequence that encodes SEQ ID NO: 3037 with moderately conservative substitutions, or (iii) the complement of (i) or (ii), wherein the isolated nucleic acid is no more than about 100 kb in length, typically no more than about 75 kb in length, and often no more than about 50 kb in length. Often, the isolated nucleic acids of this embodiment are no more than about 25 kb in length, often no more than about 15 kb in length, and frequently no more than about 10 kb in length.

[0244] Selected Partial Nucleic Acids of MDZ7

[0245] In a first such embodiment, the invention provides an isolated nucleic acid comprising the nucleotide sequence of SEQ ID NO: 4410 or SEQ ID NO: 4411, or the complement thereof, wherein the isolated nucleic acid of this aspect of the invention is no more than about 100 kb in length, often no more than about 75 kb in length, more typically no more than about 50 kb in length. Often, the isolated nucleic acids of this embodiment are no more than about 25 kb in length, often no more than about 15 kb in length, and frequently no more than about 10 kb in length.

[0246] SEQ ID NOs: 4410 and 4411 present those portions of MDZ7 not present in known ESTs.

[0247] Selected Partial Nucleic Acids of MDZ12

[0248] In a first such embodiment, the invention provides an isolated nucleic acid comprising (i) the nucleotide sequence of SEQ ID NO: 5773, (ii) a degenerate variant of SEQ ID NO: 5773, or (iii) the complement of (i) or (ii), wherein the isolated nucleic acid is no more than about 100 kb in length, typically no more than about 75 kb in length, more typically no more than about 50 kb in length. Often, the isolated nucleic acids of this embodiment are no more than about 25 kb in length, often no more than about 15 kb in length, and frequently no more than about 10 kb in length.

[0249] SEQ ID NO: 5773 encodes a portion of MDZ12a not present in known ESTs.

[0250] In another embodiment, the invention provides an isolated nucleic acid comprising (i) a nucleotide sequence that encodes SEQ ID NO: 5774 or (ii) the complement of a nucleotide sequence that encodes SEQ ID NO: 5774, wherein the isolated nucleic acid is no more than about 100 kb in length, typically no more than about 75 kb in length, frequently no more than about 50 kb in length. SEQ ID NO: 5774 is the amino acid sequence encoded by that portion of MDZ12a not found in any EST fragments. Often, the isolated nucleic acids of this embodiment are no more than about 25 kb in length, often no more than about 15 kb in length, and frequently no more than about 10 kb in length.

[0251] In another embodiment, the invention provides an isolated nucleic acid comprising (i) a nucleotide sequence that encodes SEQ ID NO: 5774 with conservative substitutions, (ii) a nucleotide sequence that encodes SEQ ID NO: 5774 with moderately conservative substitutions, or (iii) the complement of (i) or (ii), wherein the isolated nucleic acid is no more than about 100 kb in length, typically no more than about 75 kb in length, and often no more than about 50 kb in length. Often, the isolated nucleic acids of this embodiment are no more than about 25 kb in length, often no more than about 15 kb in length, and frequently no more than about 10 kb in length.

[0252] In another such embodiment, the invention provides an isolated nucleic acid comprising (i) the nucleotide sequence of SEQ ID NO: 6941, (ii) a degenerate variant of SEQ ID NO: 6941, or (iii) the complement of (i) or (ii), wherein the isolated nucleic acid is no more than about 100 kb in length, typically no more than about 75 kb in length, more typically no more than about 50 kb in length. SEQ ID NO: 6941 is the novel exon inserted in the MDZ12b splice variant. Often, the isolated nucleic acids of this embodiment are no more than about 25 kb in length, often no more than about 15 kb in length, and frequently no more than about 10 kb in length.

[0253] In another embodiment, the invention provides an isolated nucleic acid comprising (i) a nucleotide sequence that encodes SEQ ID NO: 6942 or (ii) the complement of a nucleotide sequence that encodes SEQ ID NO: 6942, wherein the isolated nucleic acid is no more than about 100 kb in length, typically no more than about 75 kb in length, frequently no more than about 50 kb in length. SEQ ID NO: 6942 is the amino acid sequence encoded by the novel MDZ12b exon before the stop codon. Often, the isolated nucleic acids of this embodiment are no more than about 25 kb in length, often no more than about 15 kb in length, and frequently no more than about 10 kb in length.

[0254] In another embodiment, the invention provides an isolated nucleic acid comprising (i) a nucleotide sequence that encodes SEQ ID NO: 6942 with conservative substitutions, (ii) a nucleotide sequence that encodes SEQ ID NO: 6942 with moderately conservative substitutions, or (iii) the complement of (i) or (ii), wherein the isolated nucleic acid is no more than about 100 kb in length, typically no more than about 75 kb in length, and often no more than about 50 kb in length. Often, the isolated nucleic acids of this embodiment are no more than about 25 kb in length, often no more than about 15 kb in length, and frequently no more than about 10 kb in length.

[0255] Cross-Hybridizing Nucleic Acids

[0256] In another series of nucleic acid embodiments, the invention provides isolated nucleic acids that hybridize to various of the MDZ3, MDZ4, MDZ7 and MDZ12 nucleic acids of the present invention. These cross-hybridizing nucleic acids can be used, inter alia, as probes for, and to drive expression of, proteins that are related to MDZ3, MDZ4, MDZ7 and MDZ12 of the present invention as further isoforms, homologues, paralogues, or orthologues.

[0257] Cross-Hybridizing Nucleic Acids of MDZ3

[0258] In a first such embodiment, the invention provides an isolated nucleic acid comprising a sequence that hybridizes under high stringency conditions to a probe the nucleotide sequence of which consists of at least 17 nt, 18, 19, 20, 21, 22, 23, 24, 25, 30, 40, or 50 nt of SEQ ID NO: 4 or of the complement of SEQ ID NO: 4, wherein the isolated nucleic acid is no more than about 100 kb in length, typically no more than about 75 kb in length, and often no more than about 50 kb in length. Often, the isolated nucleic acids of this embodiment are no more than about 25 kb in length, often no more than about 15 kb in length, and frequently no more than about 10 kb in length.

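The cross-hybridizing embodiments define a probe as at least 17 contiguous nucleotides of a reference sequence or of its complement. The sketch below illustrates only that membership test; it does not model hybridization or stringency. It is a minimal example under stated assumptions: "complement" is read as the reverse complement written 5' to 3', the helper names are hypothetical, and `TOY_REFERENCE` is a short string chosen for illustration, not SEQ ID NO: 4 or any other sequence of the listing.

```python
# Base-pairing table for DNA; case is preserved.
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")

# Hypothetical 39-nt reference used only to exercise the functions below.
TOY_REFERENCE = "ATGCTTAAAGAGCATCCAGAGATGGCGGAAGCTCCTCAG"


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def windows(seq: str, length: int = 17):
    """Yield every contiguous window of `length` nucleotides from seq."""
    for start in range(len(seq) - length + 1):
        yield seq[start:start + length]


def is_probe_of(candidate: str, reference: str, min_len: int = 17) -> bool:
    """True if candidate is >= min_len nt long and is a contiguous stretch
    of the reference or of its reverse complement."""
    candidate, reference = candidate.upper(), reference.upper()
    if len(candidate) < min_len:
        return False
    return candidate in reference or candidate in reverse_complement(reference)


if __name__ == "__main__":
    print(len(list(windows(TOY_REFERENCE))))             # number of 17-nt windows
    print(is_probe_of(TOY_REFERENCE[5:25], TOY_REFERENCE))  # 20-nt stretch of the reference
    print(is_probe_of(TOY_REFERENCE[:16], TOY_REFERENCE))   # 16 nt is below the minimum
```

Passing a different `min_len` or window `length` (18, 19, 20, and so on up to 50) covers the other probe lengths recited in these embodiments.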
[0259] In a further embodiment, the invention provides an isolated nucleic acid comprising a sequence that hybridizes under moderate stringency conditions to a probe the nucleotide sequence of which consists of at least 17 nt, 18, 19, 20, 21, 22, 23, 24, 25, 30, 40, or 50 nt of SEQ ID NO: 4 or of the complement of SEQ ID NO: 4, wherein the isolated nucleic acid is no more than about 100 kb in length, typically no more than about 75 kb in length, and often no more than about 50 kb in length. Often, the isolated nucleic acids of this embodiment are no more than about 25 kb in length, often no more than about 15 kb in length, and frequently no more than about 10 kb in length.

[0260] In a further embodiment, the invention provides an isolated nucleic acid comprising a sequence that hybridizes under high stringency conditions to a hybridization probe the nucleotide sequence of which (i) encodes a polypeptide having the sequence of SEQ ID NO: 7, (ii) encodes a polypeptide having the sequence of SEQ ID NO: 7 with conservative amino acid substitutions, or (iii) is the complement of (i) or (ii), wherein the isolated nucleic acid is no more than about 100 kb in length, typically no more than about 75 kb in length, and often no more than about 50 kb in length. Often, the isolated nucleic acids of this embodiment are no more than about 25 kb in length, often no more than about 15 kb in length, and frequently no more than about 10 kb in length.

[0261] Cross-Hybridizing Nucleic Acids of MDZ4

[0262] In a first such embodiment, the invention provides an isolated nucleic acid comprising a sequence that hybridizes under high stringency conditions to a probe the nucleotide sequence of which consists of at least 17 nt, 18, 19, 20, 21, 22, 23, 24, 25, 30, 40, or 50 nt of SEQ ID NO: 3030 or of the complement of SEQ ID NO: 3030, wherein the isolated nucleic acid is no more than about 100 kb in length, typically no more than about 75 kb in length, and often no more than about 50 kb in length. Often, the isolated nucleic acids of this embodiment are no more than about 25 kb in length, often no more than about 15 kb in length, and frequently no more than about 10 kb in length.

[0263] In a further embodiment, the invention provides an isolated nucleic acid comprising a sequence that hybridizes under moderate stringency conditions to a probe the nucleotide sequence of which consists of at least 17 nt, 18, 19, 20, 21, 22, 23, 24, 25, 30, 40, or 50 nt of SEQ ID NO: 3030 or of the complement of SEQ ID NO: 3030, wherein the isolated nucleic acid is no more than about 100 kb in length, typically no more than about 75 kb in length, and often no more than about 50 kb in length. Often, the isolated nucleic acids of this embodiment are no more than about 25 kb in length, often no more than about 15 kb in length, and frequently no more than about 10 kb in length.

[0264] In a further embodiment, the invention provides an isolated nucleic acid comprising a sequence that hybridizes under high stringency conditions to a hybridization probe the nucleotide sequence of which (i) encodes a polypeptide having the sequence of SEQ ID NO: 3033, (ii) encodes a polypeptide having the sequence of SEQ ID NO: 3033 with conservative amino acid substitutions, or (iii) is the complement of (i) or (ii), wherein the isolated nucleic acid is no more than about 100 kb in length, typically no more than about 75 kb in length, and often no more than about 50 kb in length. Often, the isolated nucleic acids of this embodiment are no more than about 25 kb in length, often no more than about 15 kb in length, and frequently no more than about 10 kb in length.

[0265] In yet another embodiment, the invention provides an isolated nucleic acid comprising a sequence that hybridizes under high stringency conditions to a probe the nucleotide sequence of which consists of at least 17 nt, 18, 19, 20, 21, 22, 23, 24, 25, 30, 40, or 50 nt of SEQ ID NO: 3034 or of the complement of SEQ ID NO: 3034, wherein the isolated nucleic acid is no more than about 100 kb in length, typically no more than about 75 kb in length, and often no more than about 50 kb in length. Often, the isolated nucleic acids of this embodiment are no more than about 25 kb in length, often no more than about 15 kb in length, and frequently no more than about 10 kb in length.

[0266] In a further embodiment, the invention provides an isolated nucleic acid comprising a sequence that hybridizes under moderate stringency conditions to a probe the nucleotide sequence of which consists of at least 17 nt, 18, 19, 20, 21, 22, 23, 24, 25, 30, 40, or 50 nt of SEQ ID NO: 3034 or of the complement of SEQ ID NO: 3034, wherein the isolated nucleic acid is no more than about 100 kb in length, typically no more than about 75 kb in length, and often no more than about 50 kb in length. Often, the isolated nucleic acids of this embodiment are no more than about 25 kb in length, often no more than about 15 kb in length, and frequently no more than about 10 kb in length.

[0267] In a further embodiment, the invention provides an isolated nucleic acid comprising a sequence that hybridizes under high stringency conditions to a hybridization probe the nucleotide sequence of which (i) encodes a polypeptide having the sequence of SEQ ID NO: 3037, (ii) encodes a polypeptide having the sequence of SEQ ID NO: 3037 with conservative amino acid substitutions, or (iii) is the complement of (i) or (ii), wherein the isolated nucleic acid is no more than about 100 kb in length, typically no more than about 75 kb in length, and often no more than about 50 kb in length. Often, the isolated nucleic acids of this embodiment are no more than about 25 kb in length, often no more than about 15 kb in length, and frequently no more than about 10 kb in length.

[0268] Cross-Hybridizing Nucleic Acids of MDZ7

[0269] In a first such embodiment, the invention provides an isolated nucleic acid comprising a sequence that hybridizes under high stringency conditions to a probe the nucleotide sequence of which consists of SEQ ID NO: 4410 or the complement of SEQ ID NO: 4410, wherein the isolated nucleic acid is no more than about 100 kb in length, typically no more than about 75 kb in length, and often no more than about 50 kb in length. Often, the isolated nucleic acids of this embodiment are no more than about 25 kb in length, often no more than about 15 kb in length, and frequently no more than about 10 kb in length.

[0270] In a further embodiment, the invention provides an isolated nucleic acid comprising a sequence that hybridizes under moderate stringency conditions to a probe the nucleotide sequence of which consists of SEQ ID NO: 4410 or the complement of SEQ ID NO: 4410, wherein the isolated nucleic acid is no more than about 100 kb in length, typically no more than about 75 kb in length, and often no more than about 50 kb in length. Often, the isolated nucleic acids of this embodiment are no more than about 25 kb in length, often no more than about 15 kb in length, and frequently no more than about 10 kb in length.

[0271] In a further embodiment, the invention provides an isolated nucleic acid comprising a sequence that hybridizes under high stringency conditions to a probe the nucleotide sequence of which consists of SEQ ID NO: 4411 or the complement of SEQ ID NO: 4411, wherein the isolated nucleic acid is no more than about 100 kb in length, typically no more than about 75 kb in length, and often no more than about 50 kb in length. Often, the isolated nucleic acids of this embodiment are no more than about 25 kb in length, often no more than about 15 kb in length, and frequently no more than about 10 kb in length.

[0272] In a further embodiment, the invention provides an isolated nucleic acid comprising a sequence that hybridizes under moderate stringency conditions to a probe the nucleotide sequence of which consists of SEQ ID NO: 4411 or the complement of SEQ ID NO: 4411, wherein the isolated nucleic acid is no more than about 100 kb in length, typically no more than about 75 kb in length, and often no more than about 50 kb in length. Often, the isolated nucleic acids of this embodiment are no more than about 25 kb in length, often no more than about 15 kb in length, and frequently no more than about 10 kb in length.

[0273] Cross-Hybridizing Nucleic Acids of MDZ12

[0274] In a first such embodiment, the invention provides an isolated nucleic acid comprising a sequence that hybridizes under high stringency conditions to a probe the nucleotide sequence of which consists of SEQ ID NO: 5773 or the complement of SEQ ID NO: 5773, wherein the isolated nucleic acid is no more than about 100 kb in length, typically no more than about 75 kb in length, and often no more than about 50 kb in length. Often, the isolated nucleic acids of this embodiment are no more than about 25 kb in length, often no more than about 15 kb in length, and frequently no more than about 10 kb in length.

[0275] In a further embodiment, the invention provides an isolated nucleic acid comprising a sequence that hybridizes under moderate stringency conditions to a probe the nucleotide sequence of which consists of SEQ ID NO: 5773 or the complement of SEQ ID NO: 5773, wherein the isolated nucleic acid is no more than about 100 kb in length, typically no more than about 75 kb in length, and often no more than about 50 kb in length. Often, the isolated nucleic acids of this embodiment are no more than about 25 kb in length, often no more than about 15 kb in length, and frequently no more than about 10 kb in length.

[0276] In a further embodiment, the invention provides an isolated nucleic acid comprising a sequence that hybridizes under high stringency conditions to a hybridization probe the nucleotide sequence of which (i) encodes a polypeptide having the sequence of SEQ ID NO: 5774, (ii) encodes a polypeptide having the sequence of SEQ ID NO: 5774 with conservative amino acid substitutions, or (iii) is the complement of (i) or (ii), wherein the isolated nucleic acid is no more than about 100 kb in length, typically no more than about 75 kb in length, and often no more than about 50 kb in length. Often, the isolated nucleic acids of this embodiment are no more than about 25 kb in length, often no more than about 15 kb in length, and frequently no more than about 10 kb in length.

[0277] In another such embodiment, the invention provides an isolated nucleic acid comprising a sequence that hybridizes under high stringency conditions to a probe the nucleotide sequence of which consists of SEQ ID NO: 6941 or the complement of SEQ ID NO: 6941, wherein the isolated nucleic acid is no more than about 100 kb in length, typically no more than about 75 kb in length, and often no more than about 50 kb in length. Often, the isolated nucleic acids of this embodiment are no more than about 25 kb in length, often no more than about 15 kb in length, and frequently no more than about 10 kb in length.

[0278] In a further embodiment, the invention provides an isolated nucleic acid comprising a sequence that hybridizes under moderate stringency conditions to a probe the nucleotide sequence of which consists of SEQ ID NO: 6941 or the complement of SEQ ID NO: 6941, wherein the isolated nucleic acid is no more than about 100 kb in length, typically no more than about 75 kb in length, and often no more than about 50 kb in length. Often, the isolated nucleic acids of this embodiment are no more than about 25 kb in length, often no more than about 15 kb in length, and frequently no more than about 10 kb in length.

[0279] In a further embodiment, the invention provides an isolated nucleic acid comprising a sequence that hybridizes under high stringency conditions to a hybridization probe the nucleotide sequence of which (i) encodes a polypeptide having the sequence of SEQ ID NO: 6942, (ii) encodes a polypeptide having the sequence of SEQ ID NO: 6942 with conservative amino acid substitutions, or (iii) is the complement of (i) or (ii), wherein the isolated nucleic acid is no more than about 100 kb in length, typically no more than about 75 kb in length, and often no more than about 50 kb in length. Often, the isolated nucleic acids of this embodiment are no more than about 25 kb in length, often no more than about 15 kb in length, and frequently no more than about 10 kb in length.

[0280] Particularly Useful Nucleic Acids

[0281] Particularly Useful Nucleic Acids of MDZ3

[0282] Particularly useful among the above-described MDZ3 nucleic acids are those that are expressed, or the complement of which are expressed, in brain, testis, heart and bone marrow.

[0283] Also particularly useful among the above-described MDZ3 nucleic acids are those that encode, or the complement of which encode, a polypeptide having at least one C2H2 (Kruppel family) zinc finger, and especially those that encode 7 C2H2 zinc fingers in tandem, those that encode a SCAN domain, those that encode a KRAB domain, and those that include all of a SCAN domain, KRAB domain, and 7 zinc fingers.

[0284] Also particularly useful are those that encode, or the complement of which encode, a polypeptide having sequence-specific nucleic acid binding regulatory activity, and that participates in protein-protein interactions with other transcription modulators.

[0285] Particularly Useful Nucleic Acids of MDZ4

[0286] Particularly useful among the above-described MDZ4 nucleic acids are those that are expressed, or the complement of which are expressed, in bone marrow, brain, heart, HeLa, adult liver, fetal liver, lung, placenta and prostate.

[0287] Other particularly useful embodiments of the MDZ4 nucleic acids above-described are those that encode, or the complement of which encode, a polypeptide that has at least one C2H2 (Kruppel family) zinc finger, and especially those that encode 5 C2H2 zinc fingers in tandem, those that encode a SCAN domain, and those that include all of a SCAN domain and 5 zinc fingers.

[0288] Also particularly useful among the above-described MDZ4 nucleic acids are those that encode, or the complement of which encode, a polypeptide having sequence-specific nucleic acid binding regulatory activity, and that participates in protein-protein interactions with other transcription modulators.

[0289] Particularly Useful Nucleic Acids of MDZ7

[0290] Particularly useful among the above-described MDZ7 nucleic acids are those that are expressed, or the complement of which are expressed, in testes, preferably at a level greater than that in kidney, liver, lung, brain or heart, typically at a level at least two-fold that in kidney, liver, lung, brain or heart, often at least three-fold, four-fold, or even five-fold that in kidney, liver, lung, brain or heart.

[0291] Also particularly useful among the above-described MDZ7 nucleic acids are those that encode, or the complement of which encode, a polypeptide having at least one C2H2 (Kruppel family) zinc finger, especially those having a plurality of zinc fingers in tandem, particularly those having 7 zinc fingers in tandem.

[0292] Also particularly useful among the above-described MDZ7 nucleic acids are those that encode, or the complement of which encode, a polypeptide having sequence-specific nucleic acid binding regulatory activity, and that functions in sequence-specific modulation of gene expression.

[0293] Particularly Useful Nucleic Acids of MDZ12

[0294] Particularly useful among the above-described MDZ12 nucleic acids are those that are expressed, or the complement of which are expressed, in brain, heart, kidney, placenta, skeletal muscle, testis, HeLa cells, bone marrow and liver.

[0295] Also particularly useful among the above-described nucleic acids are those that encode, or the complement of which encode, a polypeptide having a C2H2 (Kruppel family) zinc finger, particularly those having a plurality of such zinc fingers in tandem, especially those having at least 5, often at least 12, zinc fingers in tandem. Also particularly useful among the above-described nucleic acids are those that encode, or the complement of which encode, a polypeptide having a KRAB-B domain, especially those having both a KRAB domain and at least one, preferably a plurality, especially at least 10, often at least 12, zinc finger domains.

[0296] Particularly useful nucleic acids are those that encode, or the complement of which encode, polypeptides that act as sequence-specific transcription regulators, and that interact with other transcriptional modulators by protein-protein interactions.

[0297] Nucleic Acid Fragments

[0298] In another series of nucleic acid embodiments, the invention provides fragments of various of the isolated nucleic acids of the present invention which prove useful, inter alia, as nucleic acid probes, as amplification primers, and to direct expression or synthesis of epitopic or immunogenic protein fragments.

[0299] Nucleic Acid Fragments of MDZ3

[0300] In a first embodiment, the invention provides an isolated nucleic acid comprising at least 17 nucleotides, 18 nucleotides, 20 nucleotides, 24 nucleotides, or 25 nucleotides of (i) SEQ ID NO: 4, (ii) a degenerate variant of SEQ ID NO: 6, or (iii) the complement of (i) or (ii), wherein the isolated nucleic acid is no more than about 100 kb in length, typically no more than about 75 kb in length, more typically no more than about 50 kb in length. Often, the isolated nucleic acids of this embodiment are no more than about 25 kb in length, often no more than about 15 kb in length, and frequently no more than about 10 kb in length.

[0301] The invention also provides an isolated nucleic acid comprising (i) a nucleotide sequence that encodes a peptide of at least 8 contiguous amino acids of SEQ ID NO: 7, or (ii) the complement of a nucleotide sequence that encodes a peptide of at least 8 contiguous amino acids of SEQ ID NO: 7, wherein the isolated nucleic acid is no more than about 100 kb in length, typically no more than about 75 kb in length, more typically no more than about 50 kb in length. Often, the isolated nucleic acids of this embodiment are no more than about 25 kb in length, often no more than about 15 kb in length, and frequently no more than about 10 kb in length.

[0302] The invention also provides an isolated nucleic acid comprising (i) a nucleotide sequence that encodes a peptide of at least 15 contiguous amino acids of SEQ ID NO: 7, or (ii) the complement of a nucleotide sequence that encodes a peptide of at least 15 contiguous amino acids of SEQ ID NO: 7, wherein the isolated nucleic acid is no more than about 100 kb in length, typically no more than about 75 kb in length, more typically no more than about 50 kb in length. Often, the isolated nucleic acids of this embodiment are no more than about 25 kb in length, often no more than about 15 kb in length, and frequently no more than about 10 kb in length.

[0303] The invention also provides an isolated nucleic acid comprising a nucleotide sequence that encodes (i) a polypeptide having the sequence of at least 8 contiguous amino acids of SEQ ID NO: 7 with conservative amino acid substitutions, (ii) a polypeptide having the sequence of at least 15 contiguous amino acids of SEQ ID NO: 7 with conservative amino acid substitutions, (iii) a polypeptide having the sequence of at least 8 contiguous amino acids of SEQ ID NO: 7 with moderately conservative substitutions, (iv) a polypeptide having the sequence of at least 15 contiguous amino acids of SEQ ID NO: 7 with moderately conservative substitutions, or (v) the complement of any of (i)-(iv), wherein the isolated nucleic acid is no more than about 100 kb in length, typically no more than about 75 kb in length, more typically no more than about 50 kb in length. Often, the isolated nucleic acids of this embodiment are no more than about 25 kb in length, often no more than about 15 kb in length, and frequently no more than about 10 kb in length.

[0304] Nucleic Acid Fragments of MDZ4

[0305] In a first embodiment, the invention provides an isolated nucleic acid comprising at least 17 nucleotides, 18 nucleotides, 20 nucleotides, 24 nucleotides, or 25 nucleotides of (i) SEQ ID NO: 3030, (ii) a degenerate variant of SEQ ID NO: 3032, or (iii) the complement of (i) or (ii), wherein the isolated nucleic acid is no more than about 100 kb in length, typically no more than about 75 kb in length, more typically no more than about 50 kb in length. Often, the isolated nucleic acids of this embodiment are no more than about 25 kb in length, often no more than about 15 kb in length, and frequently no more than about 10 kb in length.

[0306] The invention also provides an isolated nucleic acid comprising (i) a nucleotide sequence that encodes a peptide of at least 8 contiguous amino acids of SEQ ID NO: 3033, (ii) a nucleotide sequence that encodes a peptide of at least 15 contiguous amino acids of SEQ ID NO: 3033, or (iii) the complement of (i) or (ii), wherein the isolated nucleic acid is no more than about 100 kb in length, typically no more than about 75 kb in length, more typically no more than about 50 kb in length. Often, the isolated nucleic acids of this embodiment are no more than about 25 kb in length, often no more than about 15 kb in length, and frequently no more than about 10 kb in length.

[0307] The invention also provides an isolated nucleic acid comprising a nucleotide sequence that (i) encodes a polypeptide having the sequence of at least 8 contiguous amino acids of SEQ ID NO: 3033 with conservative amino acid substitutions, (ii) encodes a polypeptide having the sequence of at least 8 contiguous amino acids of SEQ ID NO: 3033 with moderately conservative amino acid substitutions, (iii) encodes a polypeptide having the sequence of at least 15 contiguous amino acids of SEQ ID NO: 3033 with conservative amino acid substitutions, (iv) encodes a polypeptide having the sequence of at least 15 contiguous amino acids of SEQ ID NO: 3033 with moderately conservative amino acid substitutions, or (v) the complement of any one of (i)-(iv), wherein the isolated nucleic acid is no more than about 100 kb in length, typically no more than about 75 kb in length, more typically no more than about 50 kb in length. Often, the isolated nucleic acids of this embodiment are no more than about 25 kb in length, often no more than about 15 kb in length, and frequently no more than about 10 kb in length.

[0308] In a further embodiment, the invention provides an isolated nucleic acid comprising at least 17 nucleotides, 18 nucleotides, 20 nucleotides, 24 nucleotides, or 25 nucleotides of (i) SEQ ID NO: 3034, (ii) a degenerate variant of SEQ ID NO: 3035, or (iii) the complement of (i) or (ii), wherein the isolated nucleic acid is no more than about 100 kb in length, typically no more than about 75 kb in length, more typically no more than about 50 kb in length. Often, the isolated nucleic acids of this embodiment are no more than about 25 kb in length, often no more than about 15 kb in length, and frequently no more than about 10 kb in length.

[0309] The invention also provides an isolated nucleic acid comprising (i) a nucleotide sequence that encodes a peptide of at least 8 contiguous amino acids of SEQ ID NO: 3037, or (ii) the complement of a nucleotide sequence that encodes a peptide of at least 8 contiguous amino acids of SEQ ID NO: 3037, wherein the isolated nucleic acid is no more than about 100 kb in length, typically no more than about 75 kb in length, more typically no more than about 50 kb in length. Often, the isolated nucleic acids of this embodiment are no more than about 25 kb in length, often no more than about 15 kb in length, and frequently no more than about 10 kb in length.

[0310] The invention also provides an isolated nucleic acid comprising a nucleotide sequence that (i) encodes a polypeptide having the sequence of at least 8 contiguous amino acids of SEQ ID NO: 3037 having conservative amino acid substitutions, (ii) encodes a polypeptide having the sequence of at least 8 contiguous amino acids of SEQ ID NO: 3037 having moderately conservative amino acid substitutions, (iii) encodes a polypeptide having the sequence of at least 15 contiguous amino acids of SEQ ID NO: 3037 having conservative amino acid substitutions, (iv) encodes a polypeptide having the sequence of at least 15 contiguous amino acids of SEQ ID NO: 3037 having moderately conservative amino acid substitutions, or (v) is the complement of any of (i)-(iv), wherein the isolated nucleic acid is no more than about 100 kb in length, typically no more than about 75 kb in length, more typically no more than about 50 kb in length. Often, the isolated nucleic acids of this embodiment are no more than about 25 kb in length, often no more than about 15 kb in length, and frequently no more than about 10 kb in length.

[0311] Nucleic Acid Fragments of MDZ7

[0312] In a first embodiment of this aspect of the invention, the invention provides an isolated nucleic acid comprising at least 17 nucleotides, 18 nucleotides, 20 nucleotides, 24 nucleotides, or 25 nucleotides of (i) SEQ ID NO: 4410, or (ii) the complement thereof, wherein the isolated nucleic acid is no more than about 100 kb in length, typically no more than about 75 kb in length, more typically no more than about 50 kb in length. Often, the isolated nucleic acids of this embodiment are no more than about 25 kb in length, often no more than about 15 kb in length, and frequently no more than about 10 kb in length.

[0313] In a second embodiment, the invention provides an isolated nucleic acid comprising at least 17 nucleotides, 18 nucleotides, 20 nucleotides, 24 nucleotides, or 25 nucleotides of (i) SEQ ID NO: 4411, or (ii) the complement thereof, wherein the isolated nucleic acid is no more than about 100 kb in length, typically no more than about 75 kb in length, more typically no more than about 50 kb in length. Often, the isolated nucleic acids of this embodiment are no more than about 25 kb in length, often no more than about 15 kb in length, and frequently no more than about 10 kb in length.

[0314] Nucleic Acid Fragments of MDZ12

[0315] In a first embodiment, the invention provides an isolated nucleic acid comprising at least 17 nucleotides, 18 nucleotides, 20 nucleotides, 24 nucleotides, or 25 nucleotides of (i) SEQ ID NO: 5773, (ii) a degenerate variant of SEQ ID NO: 5773, or (iii) the complement of (i) or (ii), wherein the isolated nucleic acid is no more than about 100 kb in length, typically no more than about 75 kb in length, more typically no more than about 50 kb in length. Often, the isolated nucleic acids of this embodiment are no more than about 25 kb in length, often no more than about 15 kb in length, and frequently no more than about 10 kb in length.

[0316] The invention also provides an isolated nucleic acid comprising (i) a nucleotide sequence that encodes a peptide of at least 8 contiguous amino acids of SEQ ID NO:

5774, or (ii) the complement of a nucleotide sequence that encodes a peptide of at least 8 contiguous amino acids of SEQ ID NO: 5774, wherein the isolated nucleic acid is no more than about 100 kb in length, typically no more than about 75 kb in length, more typically no more than about 50 kb in length. Often, the isolated nucleic acids of this embodiment are no more than about 25 kb in length, often no more than about 15 kb in length, and frequently no more than about 10 kb in length.

[0317] The invention also provides an isolated nucleic acid comprising (i) a nucleotide sequence that encodes a peptide of at least 15 contiguous amino acids of SEQ ID NO: 5774, or (ii) the complement of a nucleotide sequence that encodes a peptide of at least 15 contiguous amino acids of SEQ ID NO: 5774, wherein the isolated nucleic acid is no more than about 100 kb in length, typically no more than about 75 kb in length, more typically no more than about 50 kb in length. Often, the isolated nucleic acids of this embodiment are no more than about 25 kb in length, often no more than about 15 kb in length, and frequently no more than about 10 kb in length.

[0318] The invention also provides an isolated nucleic acid comprising a nucleotide sequence that (i) encodes a polypeptide having the sequence of at least 8 contiguous amino acids of SEQ ID NO: 5774 with conservative amino acid substitutions, or (ii) encodes a polypeptide having the sequence of at least 15 contiguous amino acids of SEQ ID NO: 5774 having conservative amino acid substitutions, (iii) encodes a polypeptide having the sequence of at least 8 contiguous amino acids of SEQ ID NO: 5774 with moderately conservative amino acid substitutions, (iv) encodes a polypeptide having the sequence of at least 15 contiguous amino acids of SEQ ID NO: 5774 having moderately conservative amino acid substitutions, or (v) the complement of any one of (i)-(iv), wherein the isolated nucleic acid is no more than about 100 kb in length, typically no more than about 75 kb in length, more typically no more than about 50 kb in length. Often, the isolated nucleic acids of this embodiment are no more than about 25 kb in length, often no more than about 15 kb in length, and frequently no more than about 10 kb in length.

[0319] In another embodiment, the invention provides an isolated nucleic acid comprising at least 17 nucleotides, 18 nucleotides, 20 nucleotides, 24 nucleotides, or 25 nucleotides of (i) SEQ ID NO: 6941, (ii) a degenerate variant of SEQ ID NO: 6941, or (iii) the complement of (i) or (ii), wherein the isolated nucleic acid is no more than about 100 kb in length, typically no more than about 75 kb in length, more typically no more than about 50 kb in length. Often, the isolated nucleic acids of this embodiment are no more than about 25 kb in length, often no more than about 15 kb in length, and frequently no more than about 10 kb in length.

[0320] The invention also provides an isolated nucleic acid comprising (i) a nucleotide sequence that encodes a peptide of at least 8 contiguous amino acids of SEQ ID NO: 6942, or (ii) the complement of a nucleotide sequence that encodes a peptide of at least 8 contiguous amino acids of SEQ ID NO: 6942, wherein the isolated nucleic acid is no more than about 100 kb in length, typically no more than about 75 kb in length, more typically no more than about 50 kb in length. Often, the isolated nucleic acids of this embodiment are no more than about 25 kb in length, often no more than about 15 kb in length, and frequently no more than about 10 kb in length.

[0321] The invention also provides an isolated nucleic acid comprising a nucleotide sequence that (i) encodes a polypeptide having the sequence of at least 8 contiguous amino acids of SEQ ID NO: 6942 with conservative amino acid substitutions, (ii) encodes a polypeptide having the sequence of at least 15 contiguous amino acids of SEQ ID NO: 6942 with conservative amino acid substitutions, (iii) encodes a polypeptide having the sequence of at least 8 contiguous amino acids of SEQ ID NO: 6942 with moderately conservative amino acid substitutions, (iv) encodes a polypeptide having the sequence of at least 15 contiguous amino acids of SEQ ID NO: 6942 with moderately conservative amino acid substitutions, or (v) the complement of any one of (i)-(iv).

[0322] Single Exon Probes

[0323] The invention further provides genome-derived single exon probes having portions of no more than one exon of the above-described genes. As further described in commonly owned and copending U.S. patent application Ser. No. 09/632,366, filed Aug. 3, 2000 ("Methods and Apparatus for High Throughput Detection and Characterization of Alternatively Spliced Genes"), the disclosure of which is incorporated herein by reference in its entirety, such single exon probes have particular utility in identifying and characterizing splice variants. In particular, such single exon probes are useful for identifying and discriminating the expression of distinct isoforms of genes.

[0324] Single Exon Probes of MDZ3

[0325] The invention further provides genome-derived single exon probes having portions of no more than one exon of the MDZ3 gene.

[0326] In a first embodiment, the invention provides an isolated nucleic acid comprising a nucleotide sequence of no more than one portion of SEQ ID NOs: 8-15 or the complement of SEQ ID NOs: 8-15, wherein the portion comprises at least 17 contiguous nucleotides, 18 contiguous nucleotides, 20 contiguous nucleotides, 24 contiguous nucleotides, 25 contiguous nucleotides, or 50 contiguous nucleotides of any one of SEQ ID NOs: 8-15, or their complement. In a further embodiment, the exonic portion comprises the entirety of the referenced SEQ ID NO: or its complement.

[0327] In other embodiments, the invention provides isolated single exon probes having the nucleotide sequence of any one of SEQ ID NOs: 16-23.

[0328] Single Exon Probes of MDZ4

[0329] The invention further provides genome-derived single exon probes having portions of no more than one exon of the MDZ4 gene.

[0330] In a first embodiment, the invention provides an isolated nucleic acid comprising a nucleotide sequence of no more than one portion of SEQ ID NOs: 3038-3041 or the complement of SEQ ID NOs: 3038-3041, wherein the portion comprises at least 17 contiguous nucleotides, 18 contiguous nucleotides, 20 contiguous nucleotides, 24 contiguous nucleotides, 25 contiguous nucleotides, or 50 contiguous nucleotides of any one of SEQ ID NOs: 3038-3041, or their complement. In a further embodiment, the exonic portion comprises the entirety of the referenced SEQ ID NO: or its complement.

[0331] In other embodiments, the invention provides isolated single exon probes having the nucleotide sequence of any one of SEQ ID NOs: 3042-3045.

[0332] Single Exon Probes of MDZ7

[0333] The invention further provides genome-derived single exon probes having portions of no more than one exon of the MDZ7 gene.

[0334] In a first embodiment, the invention provides an isolated nucleic acid comprising a nucleotide sequence of no more than one portion of SEQ ID NOs: 4412-4415 or the complement of SEQ ID NOs: 4412-4415, wherein the portion comprises at least 17 contiguous nucleotides, 18 contiguous nucleotides, 20 contiguous nucleotides, 24 contiguous nucleotides, 25 contiguous nucleotides, or 50 contiguous nucleotides of any one of SEQ ID NOs: 4412-4415, or their complement. In a further embodiment, the exonic portion comprises the entirety of the referenced SEQ ID NO: or its complement.

[0335] In other embodiments, the invention provides isolated single exon probes having the nucleotide sequence of any one of SEQ ID NOs: 4416-4419.

[0336] Single Exon Probes of MDZ12

[0337] The invention further provides genome-derived single exon probes having portions of no more than one exon of the MDZ12 gene.

[0338] In a first embodiment, the invention provides an isolated nucleic acid comprising a nucleotide sequence of no more than one portion of SEQ ID NOs: 5775-5778, 6941 or the complement of SEQ ID NOs: 5775-5778, 6941, wherein the portion comprises at least 17 contiguous nucleotides, 18 contiguous nucleotides, 20 contiguous nucleotides, 24 contiguous nucleotides, 25 contiguous nucleotides, or 50 contiguous nucleotides of any one of SEQ ID NOs: 5775-5778, 6941, or their complement. In a further embodiment, the exonic portion comprises the entirety of the referenced SEQ ID NO: or its complement.

[0339] In other embodiments, the invention provides isolated single exon probes having the nucleotide sequence of any one of SEQ ID NOs: 5779-5782 and 6941.

[0340] Transcription Control Nucleic Acids

[0341] Transcription Control Nucleic Acids of MDZ3

[0342] In another aspect, the present invention provides genome-derived isolated nucleic acids that include nucleic acid sequence elements that control transcription of the MDZ3 gene. These nucleic acids can be used, inter alia, to drive expression of heterologous coding regions in recombinant constructs, thus conferring upon such heterologous coding regions the expression pattern of the native MDZ3 gene. These nucleic acids can also be used, conversely, to target heterologous transcription control elements to the MDZ3 genomic locus, altering the expression pattern of the MDZ3 gene itself.

[0343] In a first such embodiment, the invention provides an isolated nucleic acid comprising the nucleotide sequence of SEQ ID NO: 24 or its complement, wherein the isolated nucleic acid is no more than about 100 kb in length, typically no more than about 75 kb in length, more typically no more than about 50 kb in length. Often, the isolated nucleic acids of this embodiment are no more than about 25 kb in length, often no more than about 15 kb in length, and frequently no more than about 10 kb in length.

[0344] In another embodiment, the invention provides an isolated nucleic acid comprising at least 17, 18, 20, 24, or 25 nucleotides of the sequence of SEQ ID NO: 24 or its complement, wherein the isolated nucleic acid is no more than about 100 kb in length, typically no more than about 75 kb in length, more typically no more than about 50 kb in length. Often, the isolated nucleic acids of this embodiment are no more than about 25 kb in length, often no more than about 15 kb in length, and frequently no more than about 10 kb in length.

[0345] Transcription Control Nucleic Acids of MDZ4

[0346] In another aspect, the present invention provides genome-derived isolated nucleic acids that include nucleic acid sequence elements that control transcription of the MDZ4 gene. These nucleic acids can be used, inter alia, to drive expression of heterologous coding regions in recombinant constructs, thus conferring upon such heterologous coding regions the expression pattern of the native MDZ4 gene. These nucleic acids can also be used, conversely, to target heterologous transcription control elements to the MDZ4 genomic locus, altering the expression pattern of the MDZ4 gene itself.

[0347] In a first such embodiment, the invention provides an isolated nucleic acid comprising the nucleotide sequence of SEQ ID NO: 3046 or its complement, wherein the isolated nucleic acid is no more than about 100 kb in length, typically no more than about 75 kb in length, more typically no more than about 50 kb in length. Often, the isolated nucleic acids of this embodiment are no more than about 25 kb in length, often no more than about 15 kb in length, and frequently no more than about 10 kb in length.

[0348] In another embodiment, the invention provides an isolated nucleic acid comprising at least 17, 18, 20, 24, or 25 nucleotides of the sequence of SEQ ID NO: 3046 or its complement, wherein the isolated nucleic acid is no more than about 100 kb in length, typically no more than about 75 kb in length, more typically no more than about 50 kb in length. Often, the isolated nucleic acids of this embodiment are no more than about 25 kb in length, often no more than about 15 kb in length, and frequently no more than about 10 kb in length.

[0349] Transcription Control Nucleic Acids of MDZ7

[0350] In another aspect, the present invention provides genome-derived isolated nucleic acids that include nucleic acid sequence elements that control transcription of the MDZ7 gene. These nucleic acids can be used, inter alia, to drive expression of heterologous coding regions in recombinant constructs, thus conferring upon such heterologous coding regions the expression pattern of the native MDZ7 gene. These nucleic acids can also be used, conversely, to target heterologous transcription control elements to the MDZ7 genomic locus, altering the expression pattern of the MDZ7 gene itself.

[0351] In a first such embodiment, the invention provides an isolated nucleic acid comprising the nucleotide sequence

of SEQ ID NO: 4420 or its complement, wherein the isolated nucleic acid is no more than about 100 kb in length, typically no more than about 75 kb in length, more typically no more than about 50 kb in length. Often, the isolated nucleic acids of this embodiment are no more than about 25 kb in length, often no more than about 15 kb in length, and frequently no more than about 10 kb in length.

[0352] In another embodiment, the invention provides an isolated nucleic acid comprising at least 17, 18, 20, 24, or 25 nucleotides of the sequence of SEQ ID NO: 4420 or its complement, wherein the isolated nucleic acid is no more than about 100 kb in length, typically no more than about 75 kb in length, more typically no more than about 50 kb in length. Often, the isolated nucleic acids of this embodiment are no more than about 25 kb in length, often no more than about 15 kb in length, and frequently no more than about 10 kb in length.

[0353] Transcription Control Nucleic Acids of MDZ12

[0354] In another aspect, the present invention provides genome-derived isolated nucleic acids that include nucleic acid sequence elements that control transcription of the MDZ12 gene. These nucleic acids can be used, inter alia, to drive expression of heterologous coding regions in recombinant constructs, thus conferring upon such heterologous coding regions the expression pattern of the native MDZ12 gene. These nucleic acids can also be used, conversely, to target heterologous transcription control elements to the MDZ12 genomic locus, altering the expression pattern of the MDZ12 gene itself.

[0355] In a first such embodiment, the invention provides an isolated nucleic acid comprising the nucleotide sequence of SEQ ID NO: 5783 or its complement, wherein the isolated nucleic acid is no more than about 100 kb in length, typically no more than about 75 kb in length, more typically no more than about 50 kb in length. Often, the isolated nucleic acids of this embodiment are no more than about 25 kb in length, often no more than about 15 kb in length, and frequently no more than about 10 kb in length.

[0356] In another embodiment, the invention provides an isolated nucleic acid comprising at least 17, 18, 20, 24, or 25 nucleotides of the sequence of SEQ ID NO: 5783 or its complement, wherein the isolated nucleic acid is no more than about 100 kb in length, typically no more than about 75 kb in length, more typically no more than about 50 kb in length. Often, the isolated nucleic acids of this embodiment are no more than about 25 kb in length, often no more than about 15 kb in length, and frequently no more than about 10 kb in length.

[0357] Vectors and Host Cells

[0358] In another aspect, the present invention provides vectors that comprise one or more of the isolated nucleic acids of the present invention, and host cells in which such vectors have been introduced.

[0359] The vectors can be used, inter alia, for propagating the nucleic acids of the present invention in host cells (cloning vectors), for shuttling the nucleic acids of the present invention between host cells derived from disparate organisms (shuttle vectors), for inserting the nucleic acids of the present invention into host cell chromosomes (insertion vectors), for expressing sense or antisense RNA transcripts of the nucleic acids of the present invention in vitro or within a host cell, and for expressing polypeptides encoded by the nucleic acids of the present invention, alone or as fusions to heterologous polypeptides. Vectors of the present invention will often be suitable for several such uses.

[0360] Vectors are by now well-known in the art, and are described, inter alia, in Jones et al. (eds.), Vectors: Cloning Applications: Essential Techniques (Essential Techniques Series), John Wiley & Son Ltd, 1998 (ISBN: 047196266X); Jones et al. (eds.), Vectors: Expression Systems: Essential Techniques (Essential Techniques Series), John Wiley & Son Ltd, 1998 (ISBN: 0471962678); Gacesa et al., Vectors: Essential Data, John Wiley & Sons, 1995 (ISBN: 0471948411); Cid-Arregui (eds.), Viral Vectors: Basic Science and Gene Therapy, Eaton Publishing Co., 2000 (ISBN: 188129935X); Sambrook et al., Molecular Cloning: A Laboratory Manual (3rd ed.), Cold Spring Harbor Laboratory Press, 2001 (ISBN: 0879695773); Ausubel et al. (eds.), Short Protocols in Molecular Biology: A Compendium of Methods from Current Protocols in Molecular Biology (4th ed.), John Wiley & Sons, 1999 (ISBN: 047132938X), the disclosures of which are incorporated herein by reference in their entireties. Furthermore, an enormous variety of vectors are available commercially. Use of existing vectors and modifications thereof being well within the skill in the art, only basic features need be described here.

[0361] Typically, vectors are derived from virus, plasmid, prokaryotic or eukaryotic chromosomal elements, or some combination thereof, and include at least one origin of replication, at least one site for insertion of heterologous nucleic acid, typically in the form of a polylinker with multiple, tightly clustered, single cutting restriction sites, and at least one selectable marker, although some integrative vectors will lack an origin that is functional in the host to be chromosomally modified, and some vectors will lack selectable markers. Vectors of the present invention will further include at least one nucleic acid of the present invention inserted into the vector in at least one location.

[0362] Where present, the origin of replication and selectable markers are chosen based upon the desired host cell or host cells; the host cells, in turn, are selected based upon the desired application.

[0363] For example, prokaryotic cells, typically E. coli, are typically chosen for cloning. In such case, vector replication is predicated on the replication strategies of coliform-infecting phage (such as phage lambda, M13, T7, T3 and P1) or on the replication origin of autonomously replicating episomes, notably the ColE1 plasmid and later derivatives, including pBR322 and the pUC series plasmids. Where E. coli is used as host, selectable markers are, analogously, chosen for selectivity in gram negative bacteria: e.g., typical markers confer resistance to antibiotics, such as ampicillin, tetracycline, chloramphenicol, kanamycin, streptomycin, zeocin; auxotrophic markers can also be used.

[0364] As another example, yeast cells, typically S. cerevisiae, are chosen, inter alia, for eukaryotic genetic studies, due to the ease of targeting genetic changes by homologous recombination and to the ready ability to complement genetic defects using recombinantly expressed proteins, for identification of interacting protein components, e.g. through use of a two-hybrid system, and for protein expression. Vectors of the present invention for use in yeast will typically, but not invariably, contain an origin of replication suitable for use in yeast and a selectable marker that is functional in yeast.

[0365] Integrative YIp vectors do not replicate autonomously, but integrate, typically in single copy, into the yeast genome at low frequencies and thus replicate as part of the host cell chromosome; these vectors lack an origin of replication that is functional in yeast, although they typically have at least one origin of replication suitable for propagation of the vector in bacterial cells. YEp vectors, in contrast, replicate episomally and autonomously due to presence of the yeast 2 micron plasmid origin (2 μm ori). The YCp yeast centromere plasmid vectors are autonomously replicating vectors containing centromere sequences, CEN, and autonomously replicating sequences, ARS; the ARS sequences are believed to correspond to the natural replication origins of yeast chromosomes. YACs are based on yeast linear plasmids, denoted YLp, containing homologous or heterologous DNA sequences that function as telomeres (TEL) in vivo, as well as containing yeast ARS (origins of replication) and CEN (centromeres) segments.

[0366] Selectable markers in yeast vectors include a variety of auxotrophic markers, the most common of which are (in Saccharomyces cerevisiae) URA3, HIS3, LEU2, TRP1 and LYS2, which complement specific auxotrophic mutations, such as ura3-52, his3-D1, leu2-D1, trp1-D1 and lys2-201. The URA3 and LYS2 yeast genes further permit negative selection based on specific inhibitors, 5-fluoro-orotic acid (FOA) and α-aminoadipic acid (αAA), respectively, that prevent growth of the prototrophic strains but allow growth of the ura3 and lys2 mutants, respectively. Other selectable markers confer resistance to, e.g., zeocin.

[0367] As yet another example, insect cells are often chosen for high efficiency protein expression. Where the host cells are from Spodoptera frugiperda (e.g., Sf9 and Sf21 cell lines, and expresSF+™ cells (Protein Sciences Corp., Meriden, Conn., USA)), the vector replicative strategy is typically based upon the baculovirus life cycle. Typically, baculovirus transfer vectors are used to replace the wild-type AcMNPV polyhedrin gene with a heterologous gene of interest. Sequences that flank the polyhedrin gene in the wild-type genome are positioned 5' and 3' of the expression cassette on the transfer vectors. Following cotransfection with AcMNPV DNA, a homologous recombination event occurs between these sequences, resulting in a recombinant virus carrying the gene of interest and the polyhedrin or p10 promoter. Selection can be based upon visual screening for lacZ fusion activity.

[0368] As yet another example, mammalian cells are often chosen for expression of proteins intended as pharmaceutical agents, and are also chosen as host cells for screening of potential agonists and antagonists of a protein or a physiological pathway.

[0369] Where mammalian cells are chosen as host cells, vectors intended for autonomous extrachromosomal replication will typically include a viral origin, such as the SV40 origin (for replication in cell lines expressing the large T-antigen, such as COS1 and COS7 cells), the papilloma virus origin, or the EBV origin for long term episomal replication (for use, e.g., in 293-EBNA cells, which constitutively express the EBV EBNA-1 gene product and adenovirus E1A). Vectors intended for integration, and thus replication as part of the mammalian chromosome, can, but need not, include an origin of replication functional in mammalian cells, such as the SV40 origin. Vectors based upon viruses, such as adenovirus, adeno-associated virus, vaccinia virus, and various mammalian retroviruses, will typically replicate according to the viral replicative strategy.

[0370] Selectable markers for use in mammalian cells include resistance to neomycin (G418), blasticidin, hygromycin and to zeocin, and selection based upon the purine salvage pathway using HAT medium.

[0371] Plant cells can also be used for expression, with the vector replicon typically derived from a plant virus (e.g., cauliflower mosaic virus, CaMV; tobacco mosaic virus, TMV) and selectable markers chosen for suitability in plants.

[0372] For propagation of nucleic acids of the present invention that are larger than can readily be accommodated in vectors derived from plasmids or virus, the invention further provides artificial chromosomes (BACs, YACs, HACs and PACs) that comprise MDZ3, MDZ4, MDZ7 and MDZ12 nucleic acids, respectively, often genomic nucleic acids.

[0373] The BAC system is based on the well-characterized E. coli F-factor, a low copy plasmid that exists in a supercoiled circular form in host cells. The structural features of the F-factor allow stable maintenance of individual human DNA clones as well as easy manipulation of the cloned DNA. See Shizuya et al., Keio J. Med. 50(1):26-30 (2001); Shizuya et al., Proc. Natl. Acad. Sci. USA 89(18):8794-7 (1992).

[0374] YACs are based on yeast linear plasmids, denoted YLp, containing homologous or heterologous DNA sequences that function as telomeres (TEL) in vivo, as well as containing yeast ARS (origins of replication) and CEN (centromeres) segments.

[0375] HACs are human artificial chromosomes. Kuroiwa et al., Nature Biotechnol. 18(10):1086-90 (2000); Henning et al., Proc. Natl. Acad. Sci. USA 96(2):592-7 (1999); Harrington et al., Nature Genet. 15(4):345-55 (1997). In one version, long synthetic arrays of alpha satellite DNA are combined with telomeric DNA and genomic DNA to generate linear microchromosomes that are mitotically and cytogenetically stable in the absence of selection.

[0376] PACs are P1-derived artificial chromosomes. Sternberg, Proc. Natl. Acad. Sci. USA 87(1):103-7 (1990); Sternberg et al., New Biol. 2(2):151-62 (1990); Pierce et al., Proc. Natl. Acad. Sci. USA 89(6):2056-60 (1992).

[0377] Vectors of the present invention will also often include elements that permit in vitro transcription of RNA from the inserted heterologous nucleic acid. Such vectors typically include a phage promoter, such as that from T7, T3, or SP6, flanking the nucleic acid insert. Often two different such promoters flank the inserted nucleic acid, permitting separate in vitro production of both sense and antisense strands.

[0378] Expression vectors of the present invention (that is, those vectors that will drive expression of polypeptides from the inserted heterologous nucleic acid) will often include a variety of other genetic elements operatively linked to the protein-encoding heterologous nucleic acid insert, typically genetic elements that drive transcription,

such as promoters and enhancer elements, those that facilitate RNA processing, such as transcription termination and/or polyadenylation signals, and those that facilitate translation, such as ribosomal consensus sequences.

[0379] For example, vectors for expressing proteins of the present invention in prokaryotic cells, typically E. coli, will include a promoter, often a phage promoter, such as phage lambda pL promoter, the trc promoter, a hybrid derived from the trp and lac promoters, the bacteriophage T7 promoter (in E. coli cells engineered to express the T7 polymerase), or the araBAD operon. Often, such prokaryotic expression vectors will further include transcription terminators, such as the aspA terminator, and elements that facilitate translation, such as a consensus ribosome binding site and translation termination codon, Schomer et al., Proc. Natl. Acad. Sci. USA 83:8506-8510 (1986).

[0380] As another example, vectors for expressing proteins of the present invention in yeast cells, typically S. cerevisiae, will include a yeast promoter, such as the CYC1 promoter, the GAL1 promoter, ADH1 promoter, or the GPD promoter, and will typically have elements that facilitate transcription termination, such as the transcription termination signals from the CYC1 or ADH1 gene.

[0381] As another example, vectors for expressing proteins of the present invention in mammalian cells will include a promoter active in mammalian cells. Such promoters are often drawn from mammalian viruses, such as the enhancer-promoter sequences from the immediate early gene of the human cytomegalovirus (CMV), the enhancer-promoter sequences from the Rous sarcoma virus long terminal repeat (RSV LTR), and the enhancer-promoter from SV40. Often, expression is enhanced by incorporation of polyadenylation sites, such as the late SV40 polyadenylation site and the polyadenylation signal and transcription termination sequences from the bovine growth hormone (BGH) gene, and ribosome binding sites. Furthermore, vectors can include introns, such as intron II of rabbit β-globin gene and the SV40 splice elements.

[0382] Vector-driven protein expression can be constitutive or inducible.

[0383] Inducible vectors include either naturally inducible promoters, such as the trc promoter, which is regulated by the lac operon, and the trp promoter, which is regulated by tryptophan, the MMTV-LTR promoter, which is inducible by dexamethasone, or can contain synthetic promoters and/or additional elements that confer inducible control on adjacent promoters. Examples of inducible synthetic promoters are the hybrid Plac/ara-1 promoter and the PLtetO-1 promoter. The PLtetO-1 promoter takes advantage of the high expression levels from the PL promoter of phage lambda, but replaces the lambda repressor sites with two copies of operator 2 of the Tn10 tetracycline resistance operon, causing this promoter to be tightly repressed by the Tet repressor protein and induced in response to tetracycline (Tc) and Tc derivatives such as anhydrotetracycline.

[0384] As another example of inducible elements, hormone response elements, such as the glucocorticoid response element (GRE) and the estrogen response element (ERE), can confer hormone inducibility where vectors are used for expression in cells having the respective hormone receptors. To reduce background levels of expression, elements responsive to ecdysone, an insect hormone, can be used instead, with coexpression of the ecdysone receptor.

[0385] Expression vectors can be designed to fuse the expressed polypeptide to small protein tags that facilitate purification and/or visualization.

[0386] For example, proteins of the present invention can be expressed with a polyhistidine tag that facilitates purification of the fusion protein by immobilized metal affinity chromatography, for example using Ni-NTA resin (Qiagen Inc., Valencia, Calif., USA) or TALON™ resin (cobalt immobilized affinity chromatography medium, Clontech Labs, Palo Alto, Calif., USA). As another example, the fusion protein can include a chitin-binding tag and self-excising intein, permitting chitin-based purification with self-removal of the fused tag (IMPACT™ system, New England Biolabs, Inc., Beverly, Mass., USA). Alternatively, the fusion protein can include a calmodulin-binding peptide tag, permitting purification by calmodulin affinity resin (Stratagene, La Jolla, Calif., USA), or a specifically excisable fragment of the biotin carboxylase carrier protein, permitting purification of in vivo biotinylated protein using an avidin resin and subsequent tag removal (Promega, Madison, Wis., USA). As another useful alternative, the proteins of the present invention can be expressed as a fusion to glutathione-S-transferase, the affinity and specificity of binding to glutathione permitting purification using glutathione affinity resins, such as Glutathione-Superflow Resin (Clontech Laboratories, Palo Alto, Calif., USA), with subsequent elution with free glutathione.

[0387] Other tags include, for example, the Xpress epitope, detectable by anti-Xpress antibody (Invitrogen, Carlsbad, Calif., USA), a myc tag, detectable by anti-myc tag antibody, the V5 epitope, detectable by anti-V5 antibody (Invitrogen, Carlsbad, Calif., USA), FLAG® epitope, detectable by anti-FLAG® antibody (Stratagene, La Jolla, Calif., USA), and the HA epitope.

[0388] For secretion of expressed proteins, vectors can include appropriate sequences that encode secretion signals, such as leader peptides. For example, the pSecTag2 vectors (Invitrogen, Carlsbad, Calif., USA) are 5.2 kb mammalian expression vectors that carry the secretion signal from the V-J2-C region of the mouse Ig kappa-chain for efficient secretion of recombinant proteins from a variety of mammalian cell lines.

[0389] Expression vectors can also be designed to fuse proteins encoded by the heterologous nucleic acid insert to polypeptides larger than purification and/or identification tags. Useful protein fusions include those that permit display of the encoded protein on the surface of a phage or cell, fusions to intrinsically fluorescent proteins, such as those that have a green fluorescent protein (GFP)-like chromophore, fusions to the IgG Fc region, and fusions for use in two hybrid systems.

[0390] Vectors for phage display fuse the encoded polypeptide to, e.g., the gene III protein (pIII) or gene VIII protein (pVIII) for display on the surface of filamentous phage, such as M13. See Barbas et al., Phage Display: A Laboratory Manual, Cold Spring Harbor Laboratory Press (2001) (ISBN 0-87969-546-3); Kay et al. (eds.), Phage Display of Peptides and Proteins: A Laboratory Manual, San Diego: Academic Press, Inc., 1996; Abelson et al. (eds.),

Combinatorial Chemistry, Methods in Enzymology vol. 267, Academic Press (May 1996).
[0391] Vectors for yeast display, e.g. the pYD1 yeast display vector (Invitrogen, Carlsbad, Calif., USA), use the α-agglutinin yeast adhesion receptor to display recombinant protein on the surface of S. cerevisiae. Vectors for mammalian display, e.g., the pDisplay™ vector (Invitrogen, Carlsbad, Calif., USA), target recombinant proteins using an N-terminal cell surface targeting signal and a C-terminal transmembrane anchoring domain of platelet derived growth factor receptor.
[0392] A wide variety of vectors now exist that fuse proteins encoded by heterologous nucleic acids to the chromophore of the substrate-independent, intrinsically fluorescent green fluorescent protein from Aequorea victoria (“GFP”) and its variants. These proteins are intrinsically fluorescent: the GFP-like chromophore is entirely encoded by its amino acid sequence and can fluoresce without requirement for cofactor or substrate.
[0393] Structurally, the GFP-like chromophore comprises an 11-stranded β-barrel (β-can) with a central α-helix, the central α-helix having a conjugated π-resonance system that includes two aromatic ring systems and the bridge between them. The π-resonance system is created by autocatalytic cyclization among amino acids; cyclization proceeds through an imidazolinone intermediate, with subsequent dehydrogenation by molecular oxygen at the Cα-Cβ bond of a participating tyrosine.
[0394] The GFP-like chromophore can be selected from GFP-like chromophores found in naturally occurring proteins, such as A. victoria GFP (GenBank accession number AAA27721), Renilla reniformis GFP, FP583 (GenBank accession no. AF168419) (DsRed), FP593 (AF272711), FP483 (AF168420), FP484 (AF168424), FP595 (AF246709), FP486 (AF168421), FP538 (AF168423), and FP506 (AF168422), and need include only so much of the native protein as is needed to retain the chromophore's intrinsic fluorescence. Methods for determining the minimal domain required for fluorescence are known in the art. Li et al., “Deletions of the Aequorea victoria Green Fluorescent Protein Define the Minimal Domain Required for Fluorescence,” J. Biol. Chem. 272:28545-28549 (1997).
[0395] Alternatively, the GFP-like chromophore can be selected from GFP-like chromophores modified from those found in nature. Typically, such modifications are made to improve recombinant production in heterologous expression systems (with or without change in protein sequence), to alter the excitation and/or emission spectra of the native protein, to facilitate purification, to facilitate or as a consequence of cloning, or are a fortuitous consequence of research investigation.
[0396] The methods for engineering such modified GFP-like chromophores and testing them for fluorescence activity, both alone and as part of protein fusions, are well-known in the art. Early results of these efforts are reviewed in Heim et al., Curr. Biol. 6:178-182 (1996), incorporated herein by reference in its entirety; a more recent review, with tabulation of useful mutations, is found in Palm et al., “Spectral Variants of Green Fluorescent Protein,” in Green Fluorescent Proteins, Conn (ed.), Methods Enzymol. Vol. 302, pp. 378-394 (1999), incorporated herein by reference in its entirety. A variety of such modified chromophores are now commercially available and can readily be used in the fusion proteins of the present invention.
[0397] For example, EGFP (“enhanced GFP”), Cormack et al., Gene 173:33-38 (1996); U.S. Pat. Nos. 6,090,919 and 5,804,387, is a red-shifted, human codon-optimized variant of GFP that has been engineered for brighter fluorescence, higher expression in mammalian cells, and for an excitation spectrum optimized for use in flow cytometers. EGFP can usefully contribute a GFP-like chromophore to the fusion proteins of the present invention. A variety of EGFP vectors, both plasmid and viral, are available commercially (Clontech Labs, Palo Alto, Calif., USA), including vectors for bacterial expression, vectors for N-terminal protein fusion expression, vectors for expression of C-terminal protein fusions, and for bicistronic expression.
[0398] Toward the other end of the emission spectrum, EBFP (“enhanced blue fluorescent protein”) and BFP2 contain four amino acid substitutions that shift the emission from green to blue, enhance the brightness of fluorescence and improve solubility of the protein, Heim et al., Curr. Biol. 6:178-182 (1996); Cormack et al., Gene 173:33-38 (1996). EBFP is optimized for expression in mammalian cells whereas BFP2, which retains the original jellyfish codons, can be expressed in bacteria; as is further discussed below, the host cell of production does not affect the utility of the resulting fusion protein. The GFP-like chromophores from EBFP and BFP2 can usefully be included in the fusion proteins of the present invention, and vectors containing these blue-shifted variants are available from Clontech Labs (Palo Alto, Calif., USA).
[0399] Analogously, EYFP (“enhanced yellow fluorescent protein”), also available from Clontech Labs, contains four amino acid substitutions, different from EBFP, Ormö et al., Science 273:1392-1395 (1996), that shift the emission from green to yellowish-green. Citrine, an improved yellow fluorescent protein mutant, is described in Heikal et al., Proc. Natl. Acad. Sci. USA 97:11996-12001 (2000). ECFP (“enhanced cyan fluorescent protein”) (Clontech Labs, Palo Alto, Calif., USA) contains six amino acid substitutions, one of which shifts the emission spectrum from green to cyan. Heim et al., Curr. Biol. 6:178-182 (1996); Miyawaki et al., Nature 388:882-887 (1997). The GFP-like chromophore of each of these GFP variants can usefully be included in the fusion proteins of the present invention.
[0400] The GFP-like chromophore can also be drawn from other modified GFPs, including those described in U.S. Pat. Nos. 6,124,128; 6,096,865; 6,090,919; 6,066,476; 6,054,321; 6,027,881; 5,968,750; 5,874,304; 5,804,387; 5,777,079; 5,741,668; and 5,625,048, the disclosures of which are incorporated herein by reference in their entireties. See also Conn (ed.), Green Fluorescent Protein, Methods in Enzymol. Vol. 302, pp. 378-394 (1999), incorporated herein by reference in its entirety. A variety of such modified chromophores are now commercially available and can readily be used in the fusion proteins of the present invention.
[0401] Fusions to the IgG Fc region increase serum half life of protein pharmaceutical products through interaction with the FcRn receptor (also denominated the FcRp receptor and the Brambell receptor, FcRb), further described in international patent application nos. WO 97/43316, WO 97/34631, WO 96/32478, WO 96/18412.

[0402] For long-term, high-yield recombinant production of the proteins, protein fusions, and protein fragments of the present invention, stable expression is particularly useful.
[0403] Stable expression is readily achieved by integration into the host cell genome of vectors having selectable markers, followed by selection for integrants.
[0404] For example, the pUB6/V5-His A, B, and C vectors (Invitrogen, Carlsbad, Calif., USA) are designed for high-level stable expression of heterologous proteins in a wide range of mammalian tissue types and cell lines. pUB6/V5-His uses the promoter/enhancer sequence from the human ubiquitin C gene to drive expression of recombinant proteins: expression levels in 293, CHO, and NIH3T3 cells are comparable to levels from the CMV and human EF-1α promoters. The bsd gene permits rapid selection of stably transfected mammalian cells with the potent antibiotic blasticidin.
[0405] Replication incompetent retroviral vectors, typically derived from Moloney murine leukemia virus, prove particularly useful for creating stable transfectants having integrated provirus. The highly efficient transduction machinery of retroviruses, coupled with the availability of a variety of packaging cell lines (such as RetroPack™ PT67, EcoPack2™-293, AmphoPack-293, GP2-293 cell lines, all available from Clontech Laboratories, Palo Alto, Calif., USA), allow a wide host range to be infected with high efficiency; varying the multiplicity of infection readily adjusts the copy number of the integrated provirus. Retroviral vectors are available with a variety of selectable markers, such as resistance to neomycin, hygromycin, and puromycin, permitting ready selection of stable integrants.
[0406] The present invention further includes host cells comprising the vectors of the present invention, either present episomally within the cell or integrated, in whole or in part, into the host cell chromosome.
[0407] Among other considerations, some of which are described above, a host cell strain may be chosen for its ability to process the expressed protein in the desired fashion. Such post-translational modifications of the polypeptide include, but are not limited to, acetylation, carboxylation, glycosylation, phosphorylation, lipidation, and acylation, and it is an aspect of the present invention to provide MDZ3, MDZ4, MDZ7 and MDZ12 proteins, respectively, with such post-translational modifications.
[0408] As noted earlier, host cells can be prokaryotic or eukaryotic. Representative examples of appropriate host cells include, but are not limited to, bacterial cells, such as E. coli, Caulobacter crescentus, Streptomyces species, and Salmonella typhimurium; yeast cells, such as Saccharomyces cerevisiae, Schizosaccharomyces pombe, Pichia pastoris, Pichia methanolica; insect cell lines, such as those from Spodoptera frugiperda (e.g., Sf9 and Sf21 cell lines, and expresSF™ cells (Protein Sciences Corp., Meriden, Conn., USA)), Drosophila S2 cells, and Trichoplusia ni High Five® Cells (Invitrogen, Carlsbad, Calif., USA); and mammalian cells. Typical mammalian cells include COS1 and COS7 cells, Chinese hamster ovary (CHO) cells, NIH 3T3 cells, 293 cells, HEPG2 cells, HeLa cells, L cells, murine ES cell lines (e.g., from strains 129/SV, C57/BL6, DBA-1, 129/SVJ), K562, Jurkat cells, and BW5147. Other mammalian cell lines are well known and readily available from the American Type Culture Collection (ATCC) (Manassas, Va., USA) and the National Institute of General Medical Sciences (NIGMS) Human Genetic Cell Repository at the Coriell Cell Repositories (Camden, N.J., USA).
[0409] Methods for introducing the vectors and nucleic acids of the present invention into the host cells are well known in the art; the choice of technique will depend primarily upon the specific vector to be introduced and the host cell chosen.
[0410] For example, phage lambda vectors will typically be packaged using a packaging extract (e.g., Gigapack® packaging extract, Stratagene, La Jolla, Calif., USA), and the packaged virus used to infect E. coli. Plasmid vectors will typically be introduced into chemically competent or electrocompetent bacterial cells.
[0411] E. coli cells can be rendered chemically competent by treatment, e.g., with CaCl₂, or a solution of Mg²⁺, Mn²⁺, Ca²⁺, Rb⁺ or K⁺, dimethyl sulfoxide, dithiothreitol, and hexamine cobalt(III), Hanahan, J. Mol. Biol. 166(4):557-80 (1983), and vectors introduced by heat shock. A wide variety of chemically competent strains are also available commercially (e.g., Epicurian Coli® XL10-Gold® Ultracompetent Cells (Stratagene, La Jolla, Calif., USA); DH5α competent cells (Clontech Laboratories, Palo Alto, Calif., USA); TOP10 Chemically Competent E. coli Kit (Invitrogen, Carlsbad, Calif., USA)).
[0412] Bacterial cells can be rendered electrocompetent (that is, competent to take up exogenous DNA by electroporation) by various pre-pulse treatments; vectors are introduced by electroporation followed by subsequent outgrowth in selected media. An extensive series of protocols is provided online in Electroprotocols (BioRad, Richmond, Calif., USA) (http://www.bio-rad.com/LifeScience/pdf/New Gene Pulser.pdf).
[0413] Vectors can be introduced into yeast cells by spheroplasting, treatment with lithium salts, electroporation, or protoplast fusion.
[0414] Spheroplasts are prepared by the action of hydrolytic enzymes (a snail-gut extract, usually denoted Glusulase, or Zymolyase, an enzyme from Arthrobacter luteus) to remove portions of the cell wall in the presence of osmotic stabilizers, typically 1 M sorbitol. DNA is added to the spheroplasts, and the mixture is co-precipitated with a solution of polyethylene glycol (PEG) and Ca²⁺. Subsequently, the cells are resuspended in a solution of sorbitol, mixed with molten agar and then layered on the surface of a selective plate containing sorbitol. For lithium-mediated transformation, yeast cells are treated with lithium acetate, which apparently permeabilizes the cell wall, DNA is added and the cells are co-precipitated with PEG. The cells are exposed to a brief heat shock, washed free of PEG and lithium acetate, and subsequently spread on plates containing ordinary selective medium. Increased frequencies of transformation are obtained by using specially-prepared single-stranded carrier DNA and certain organic solvents. Schiestl et al., Curr. Genet. 16(5-6):339-46 (1989). For electroporation, freshly-grown yeast cultures are typically washed, suspended in an osmotic protectant, such as sorbitol, mixed with DNA, and the cell suspension pulsed in an electroporation device. Subsequently, the cells are spread on the surface of plates containing selective media. Becker et

al., Methods Enzymol. 194:182-7 (1991). The efficiency of transformation by electroporation can be increased over 100-fold by using PEG, single-stranded carrier DNA and cells that are in late log-phase of growth. Larger constructs, such as YACs, can be introduced by protoplast fusion.
[0415] Mammalian and insect cells can be directly infected by packaged viral vectors, or transfected by chemical or electrical means.
[0416] For chemical transfection, DNA can be coprecipitated with CaPO₄ or introduced using liposomal and nonliposomal lipid-based agents. Commercial kits are available for CaPO₄ transfection (CalPhos™ Mammalian Transfection Kit, Clontech Laboratories, Palo Alto, Calif., USA), and lipid-mediated transfection can be practiced using commercial reagents, such as LIPOFECTAMINE™ 2000, LIPOFECTAMINE™ Reagent, CELLFECTIN® Reagent, and LIPOFECTIN® Reagent (Invitrogen, Carlsbad, Calif., USA), DOTAP Liposomal Transfection Reagent, FuGENE 6, X-tremeGENE Q2, DOSPER (Roche Molecular Biochemicals, Indianapolis, Ind., USA), Effectene™, PolyFect®, Superfect® (Qiagen, Inc., Valencia, Calif., USA). Protocols for electroporating mammalian cells can be found online in Electroprotocols (Bio-Rad, Richmond, Calif., USA) (http://www.bio-rad.com/LifeScience/pdf/New Gene Pulser.pdf). See also, Norton et al. (eds.), Gene Transfer Methods: Introducing DNA into Living Cells and Organisms, BioTechniques Books, Eaton Publishing Co. (2000) (ISBN 1-881299-34-1), incorporated herein by reference in its entirety.
[0417] Other transfection techniques include transfection by particle bombardment. See, e.g., Cheng et al., Proc. Natl. Acad. Sci. USA 90(10):4455-9 (1993); Yang et al., Proc. Natl. Acad. Sci. USA 87(24):9568-72 (1990).
[0418] Proteins
[0419] In another aspect, the present invention provides MDZ3, MDZ4, MDZ7 and MDZ12 proteins, respectively, various fragments thereof suitable for use as antigens (e.g., for epitope mapping) and for use as immunogens (e.g., for raising antibodies or as vaccines), fusions of MDZ3, MDZ4, MDZ7 and MDZ12 polypeptides and fragments to heterologous polypeptides, and conjugates of the proteins, fragments, and fusions of the present invention to other moieties (e.g., to carrier proteins, to fluorophores).
[0420] FIGS. 3, 6, 9, 12 and 13 present the predicted amino acid sequences encoded by the MDZ3, MDZ4, MDZ7, MDZ12a and MDZ12b (S and L) cDNA clones, respectively. The amino acid sequences are further presented, respectively, in SEQ ID Nos: 3, 3029, 4409, 5772, 6939 and 6940.
[0421] Unless otherwise indicated, amino acid sequences of the proteins of the present invention were determined as a predicted translation from a nucleic acid sequence. Accordingly, any amino acid sequence presented herein may contain errors due to errors in the nucleic acid sequence, as described in detail above. Furthermore, single nucleotide polymorphisms (SNPs) occur frequently in eukaryotic genomes (more than 1.4 million SNPs have already been identified in the human genome, International Human Genome Sequencing Consortium, Nature 409:860-921 (2001)) and the sequence determined from one individual of a species may differ from other allelic forms present within the population. Small deletions and insertions can often be found that do not alter the function of the protein.
[0422] Accordingly, it is an aspect of the present invention to provide proteins not only identical in sequence to those described with particularity herein, but also to provide isolated proteins at least about 65% identical in sequence to those described with particularity herein, typically at least about 70%, 75%, 80%, 85%, or 90% identical in sequence to those described with particularity herein, usefully at least about 91%, 92%, 93%, 94%, or 95% identical in sequence to those described with particularity herein, usefully at least about 96%, 97%, 98%, or 99% identical in sequence to those described with particularity herein, and, most conservatively, at least about 99.5%, 99.6%, 99.7%, 99.8% and 99.9% identical in sequence to those described with particularity herein. These sequence variants can be naturally occurring or can result from human intervention by way of random or directed mutagenesis.
[0423] For purposes herein, percent identity of two amino acid sequences is determined using the procedure of Tatusova et al., “Blast 2 sequences - a new tool for comparing protein and nucleotide sequences”, FEMS Microbiol Lett. 174:247-250 (1999), which procedure is effectuated by the computer program BLAST 2 SEQUENCES, available online at
[0424] http://www.ncbi.nlm.nih.gov/blast/bl2seq/bl2.html,
[0425] To assess percent identity of amino acid sequences, the BLASTP module of BLAST 2 SEQUENCES is used with default values of (i) BLOSUM62 matrix, Henikoff et al., Proc. Natl. Acad. Sci USA 89(22):10915-9 (1992); (ii) open gap 11 and extension gap 1 penalties; and (iii) gap x_dropoff 50, expect 10, word size 3, filter, and both sequences are entered in their entireties.
[0426] As is well known, amino acid substitutions occur frequently among natural allelic variants, with conservative substitutions often occasioning only de minimis change in protein function.
[0427] Accordingly, it is an aspect of the present invention to provide proteins not only identical in sequence to those described with particularity herein, but also to provide isolated proteins having the sequence of MDZ3, MDZ4, MDZ7 and MDZ12 proteins, respectively, or portions thereof, with conservative amino acid substitutions. It is a further aspect to provide isolated proteins having the sequence of MDZ3, MDZ4, MDZ7 and MDZ12 proteins, respectively, and portions thereof, with moderately conservative amino acid substitutions. These conservatively-substituted and moderately conservatively-substituted variants can be naturally occurring or can result from human intervention.
[0428] Although there are a variety of metrics for calling conservative amino acid substitutions, based primarily on either observed changes among evolutionarily related proteins or on predicted chemical similarity, for purposes herein a conservative replacement is any change having a positive value in the PAM250 log-likelihood matrix reproduced herein below (see Gonnet et al., Science 256(5062):1443-5 (1992)):

[Table: PAM250 log-likelihood matrix of amino acid substitution scores; values not recoverable]
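The definitions keyed to this matrix reduce to a sign test on a single matrix entry: a replacement is “conservative” when its score is positive, and the paragraph that follows extends this to “moderately conservative” for any nonnegative score. A minimal Python sketch of that test is given here. The names `classify_substitution`, `lookup`, and `TOY_SCORES` are hypothetical, the label "non-conservative" is shorthand introduced only for this example, and the three scores are placeholders chosen to exercise each branch; they are not values from the PAM250 matrix of Gonnet et al.

```python
from typing import Dict, Tuple

ScoreMatrix = Dict[Tuple[str, str], float]

# Placeholder scores for illustration only; NOT taken from the PAM250 matrix.
TOY_SCORES: ScoreMatrix = {
    ("I", "L"): 1.0,
    ("A", "G"): 0.0,
    ("W", "G"): -1.0,
}


def lookup(matrix: ScoreMatrix, a: str, b: str) -> float:
    """Return the score for the pair (a, b), treating the matrix as symmetric."""
    if (a, b) in matrix:
        return matrix[(a, b)]
    return matrix[(b, a)]


def classify_substitution(matrix: ScoreMatrix, original: str, replacement: str) -> str:
    """Return the strictest label that applies to a single replacement.

    A positive score is "conservative". A nonnegative score is "moderately
    conservative", so every conservative change also satisfies that looser
    definition; only the strictest label is reported here.
    """
    score = lookup(matrix, original, replacement)
    if score > 0:
        return "conservative"
    if score == 0:
        return "moderately conservative"
    return "non-conservative"
```

In practice the `matrix` argument would be populated from the full substitution matrix rather than from the toy table.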

[0429] For purposes herein, a “moderately conservative” replacement is any change having a nonnegative value in the PAM250 log-likelihood matrix reproduced herein above.
[0430] As is also well known in the art, relatedness of proteins can also be characterized using a functional test, the ability of the encoding nucleic acids to base-pair to one another at defined hybridization stringencies.
[0431] It is, therefore, another aspect of the invention to provide isolated proteins not only identical in sequence to those described with particularity herein, but also to provide isolated proteins (“hybridization related proteins”) that are encoded by nucleic acids that hybridize under high stringency conditions (as defined herein above) to all or to a portion of various of the isolated nucleic acids of the present invention (“reference nucleic acids”). It is a further aspect of the invention to provide isolated proteins (“hybridization related proteins”) that are encoded by nucleic acids that hybridize under moderate stringency conditions (as defined herein above) to all or to a portion of various of the isolated nucleic acids of the present invention (“reference nucleic acids”).
[0432] The hybridization related proteins can be alternative isoforms, homologues, paralogues, and orthologues of the MDZ3, MDZ4, MDZ7 and MDZ12 proteins, respectively, of the present invention. Particularly useful orthologues are those from other primate species, such as chimpanzee, rhesus macaque monkey, baboon, orangutan, and gorilla, from rodents, such as rats, mice, guinea pigs, from lagomorphs, such as rabbits, and from domestic livestock, such as cow, pig, sheep, horse, and goat.
[0433] Relatedness of proteins can also be characterized using a second functional test, the ability of a first protein competitively to inhibit the binding of a second protein to an antibody.
[0434] It is, therefore, another aspect of the present invention to provide isolated proteins not only identical in sequence to those described with particularity herein, but also to provide isolated proteins (“cross-reactive proteins”) that competitively inhibit the binding of antibodies to all or to a portion of various of the isolated MDZ3, MDZ4, MDZ7 and MDZ12 proteins, respectively, of the present invention (“reference proteins”). Such competitive inhibition can readily be determined using immunoassays well known in the art.
[0435] Among the proteins of the present invention that differ in amino acid sequence from those described with particularity herein (including those that have deletions and insertions causing up to 10% non-identity, those having conservative or moderately conservative substitutions, hybridization related proteins, and cross-reactive proteins), those that substantially retain one or more MDZ3, MDZ4, MDZ7 or MDZ12 activities are particularly useful. As described above, those activities include transcription regulation and protein-protein interaction.
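The identity figures used in this description (for example, “at least about 65% identical” or “up to 10% non-identity”) are defined by the BLAST 2 SEQUENCES procedure. As a rough illustration only, the Python sketch below computes percent identity for two sequences that have already been aligned, with `-` marking gap positions. It does not perform an alignment and will not necessarily reproduce BLASTP output; the function names, the gap convention, and the choice to divide by the full alignment length are all assumptions made for this example.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of alignment columns in which both sequences carry the same residue.

    Both inputs must be the same length (already aligned); '-' denotes a gap
    and never counts as a match.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("alignment is empty")
    matches = sum(
        1 for x, y in zip(aligned_a, aligned_b) if x == y and x != "-"
    )
    return 100.0 * matches / len(aligned_a)


def meets_threshold(aligned_a: str, aligned_b: str, threshold: float = 65.0) -> bool:
    """True when the aligned pair reaches the given percent-identity threshold."""
    return percent_identity(aligned_a, aligned_b) >= threshold
```

For instance, `percent_identity("MLKE-HPE", "MLKEAHPE")` counts seven matching columns out of eight. Other conventions (such as dividing by the shorter sequence length) give different numbers, which is why the description ties the definition to one specific program and parameter set.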

0436 Residues that are tolerant of change while retaining protein fragments are useful, inter alia, as antigenic and function can be identified by altering the protein at known immunogenic fragments of MDZ3, MDZ4, MDZ7 or residues using methods known in the art, Such as alanine MDZ12, respectively. Scanning mutagenesis, Cunningham et al., Science 244(4908): 1081-5 (1989); transposon linker scanning 0442. By “fragments” of a protein is here intended iso mutagenesis, Chen et al., Gene 263(1-2):39-48 (2001); lated proteins (equally, polypeptides, peptides, oligopep combinations of homolog- and alanine-Scanning mutagen tides), however obtained, that have an amino acid sequence esis, Jin et al., J. Mol. Biol. 226(3):851-65 (1992); combi identical to a portion of the reference amino acid Sequence, natorial alanine Scanning, Weiss et al., Proc. Natl. Acad. Sci which portion is at least 6 amino acids and less than the USA 97(16):8950-4 (2000), followed by functional assay. entirety of the reference nucleic acid. AS So defined, "frag Transposon linker Scanning kits are available commercially ments' need not be obtained by physical fragmentation of (New England Biolabs, Beverly, Mass., USA, catalog. no. the reference protein, although Such provenance is not E7-102S; EZ:TNTM In-Frame Linker Insertion Kit, cata thereby precluded. logue no. EZIO4KN, Epicentre Technologies Corporation, 0443 Fragments of at least 6 contiguous amino acids are Madison, Wis., USA). useful in mapping B cell and T cell epitopes of the reference 0437 AS further described below, the isolated proteins of protein. See, e.g., Geysen et al., “Use of peptide Synthesis to the present invention can readily be used as Specific immu probe Viral antigens for epitopes to a resolution of a single nogens to raise antibodies that Specifically recognize MDZ3, amino acid, 'Proc. Natl. Acad. Sci. USA 81:3998-4002 MDZ4, MDZ7 or MDZ12 (MDZ12a, MDZ12bS, (1984) and U.S. Pat. Nos. 
MDZ12bL) proteins, their isoforms, homologues, paralogues, and/or orthologues. The antibodies, in turn, can be used, inter alia, specifically to assay for the MDZ3, MDZ4, MDZ7 or MDZ12 proteins of the present invention (e.g., by ELISA for detection of protein in fluid samples, such as serum, by immunohistochemistry or laser scanning cytometry, for detection of protein in tissue samples, or by flow cytometry, for detection of intracellular protein in cell suspensions), for specific antibody-mediated isolation and/or purification of MDZ3, MDZ4, MDZ7 or MDZ12 proteins, as for example by immunoprecipitation, and for use as specific agonists or antagonists of MDZ3, MDZ4, MDZ7 or MDZ12 action.

[0438] The isolated proteins of the present invention are also immediately available for use as specific standards in assays used to determine the concentration and/or amount specifically of the MDZ3, MDZ4, MDZ7 or MDZ12 proteins of the present invention. As is well known, ELISA kits for detection and quantitation of protein analytes typically include isolated and purified protein of known concentration for use as a measurement standard (e.g., the human interferon-γ OptEIA kit, catalog no. 555142, Pharmingen, San Diego, Calif., USA, includes human recombinant gamma interferon, baculovirus produced).

[0439] The isolated proteins of the present invention are also immediately available for use as specific biomolecule capture probes for surface-enhanced laser desorption ionization (SELDI) detection of protein-protein interactions, WO 98/59362; WO 98/59360; WO 98/59361; and Merchant et al., Electrophoresis 21(6):1164-77 (2000), the disclosures of which are incorporated herein by reference in their entireties. Analogously, the isolated proteins of the present invention are also immediately available for use as specific biomolecule capture probes on BIACORE surface plasmon resonance probes. See Weinberger et al., Pharmacogenomics 1(4):395-416 (2000); Malmqvist, Biochem. Soc. Trans. 27(2):335-40 (1999).

[0440] The isolated proteins of the present invention are also useful as a therapeutic supplement in patients having a specific deficiency in MDZ3, MDZ4, MDZ7 or MDZ12 production, respectively.

[0441] In another aspect, the invention also provides fragments of various of the proteins of the present invention. The

4,708,871 and 5,595,915, the disclosures of which are incorporated herein by reference in their entireties. Because the fragment need not itself be immunogenic, part of an immunodominant epitope, nor even recognized by native antibody, to be useful in such epitope mapping, all fragments of at least 6 amino acids of the proteins of the present invention have utility in such a study.

[0444] Fragments of at least 8 contiguous amino acids, often at least 15 contiguous amino acids, have utility as immunogens for raising antibodies that recognize the proteins of the present invention. See, e.g., Lerner, "Tapping the immunological repertoire to produce antibodies of predetermined specificity," Nature 299:592-596 (1982); Shinnick et al., "Synthetic peptide immunogens as vaccines," Annu. Rev. Microbiol. 37:425-46 (1983); Sutcliffe et al., "Antibodies that react with predetermined sites on proteins," Science 219:660-6 (1983), the disclosures of which are incorporated herein by reference in their entireties. As further described in the above-cited references, virtually all 8-mers, conjugated to a carrier, such as a protein, prove immunogenic, that is, prove capable of eliciting antibody for the conjugated peptide; accordingly, all fragments of at least 8 amino acids of the proteins of the present invention have utility as immunogens.

[0445] Fragments of at least 8, 9, 10 or 12 contiguous amino acids are also useful as competitive inhibitors of binding of the entire protein, or a portion thereof, to antibodies (as in epitope mapping), and to natural binding partners, such as subunits in a multimeric complex or to receptors or ligands of the subject protein; this competitive inhibition permits identification and separation of molecules that bind specifically to the protein of interest, U.S. Pat. Nos. 5,539,084 and 5,783,674, incorporated herein by reference in their entireties.

[0446] The protein, or protein fragment, of the present invention is thus at least 6 amino acids in length, typically at least 8, 9, 10 or 12 amino acids in length, and often at least 15 amino acids in length. Often, the protein of the present invention, or fragment thereof, is at least 20 amino acids in length, even 25 amino acids, 30 amino acids, 35 amino acids, or 50 amino acids or more in length. Of course, larger fragments having at least 75 amino acids, 100 amino acids, or even 150 amino acids are also useful, and at times preferred.
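The minimum fragment lengths stated above (at least 6 contiguous amino acids for epitope mapping, at least 8 for use as immunogens, often at least 15) define a set of candidate fragments that is easy to enumerate for any given protein. The following Python sketch is illustrative only and is not part of the original disclosure: the function names are hypothetical, and the example peptide is simply the first 13 residues of the translation shown in FIG. 3A.

```python
from typing import Iterator


def contiguous_fragments(sequence: str, min_len: int = 6) -> Iterator[str]:
    """Yield every contiguous fragment of `sequence` with at least `min_len` residues.

    Hypothetical helper for illustration; `min_len` defaults to the 6-residue
    minimum that the text associates with epitope mapping.
    """
    n = len(sequence)
    for length in range(min_len, n + 1):
        for start in range(n - length + 1):
            yield sequence[start:start + length]


def count_fragments(seq_len: int, min_len: int = 6) -> int:
    """Closed-form count of contiguous fragments of length >= min_len."""
    if seq_len < min_len:
        return 0
    m = seq_len - min_len + 1  # number of fragments of the minimum length
    return m * (m + 1) // 2    # m + (m - 1) + ... + 1


# First 13 residues of the translation shown in FIG. 3A (illustrative input only).
EXAMPLE_PEPTIDE = "MLKEHPEMAEAPQ"

if __name__ == "__main__":
    for threshold in (6, 8, 15):
        fragments = list(contiguous_fragments(EXAMPLE_PEPTIDE, threshold))
        print(f">= {threshold} residues: {len(fragments)} fragments")
```

For a 13-residue peptide this yields 36 fragments at the 6-residue threshold, 21 at the 8-residue threshold, and none at the 15-residue threshold; the count grows quadratically with sequence length.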

[0447] The present invention further provides fusions of each of the proteins and protein fragments of the present invention to heterologous polypeptides.

[0448] By fusion is here intended that the protein or protein fragment of the present invention is linearly contiguous to the heterologous polypeptide in a peptide-bonded polymer of amino acids or amino acid analogues; by "heterologous polypeptide" is here intended a polypeptide that does not naturally occur in contiguity with the protein or protein fragment of the present invention. As so defined, the fusion can consist entirely of a plurality of fragments of any one of the MDZ3, MDZ4, MDZ7 or MDZ12 proteins, respectively, in altered arrangement; in such case, any of the MDZ3, MDZ4, MDZ7 or MDZ12 fragments can be considered heterologous to the other MDZ3, MDZ4, MDZ7 or MDZ12 fragments in the fusion protein. More typically, however, the heterologous polypeptide is not drawn from the MDZ3, MDZ4, MDZ7 or MDZ12 protein itself.

[0449] The fusion proteins of the present invention will include at least one fragment of the protein of the present invention, which fragment is at least 6, typically at least 8, often at least 15, and usefully at least 16, 17, 18, 19, or 20 amino acids long. The fragment of the protein of the present invention to be included in the fusion can usefully be at least 25 amino acids long, at least 50 amino acids long, and can be at least 75, 100, or even 150 amino acids long. Fusions that include the entirety of the proteins of the present invention have particular utility.

[0450] The heterologous polypeptide included within the fusion protein of the present invention is at least 6 amino acids in length, often at least 8 amino acids in length, and usefully at least 15, 20, and 25 amino acids in length. Fusions that include larger polypeptides, such as the IgG Fc region, and even entire proteins (such as GFP chromophore-containing proteins), have particular utility.

[0451] As described above in the description of vectors and expression vectors of the present invention, which discussion is incorporated herein by reference in its entirety, heterologous polypeptides to be included in the fusion proteins of the present invention can usefully include those designed to facilitate purification and/or visualization of recombinantly-expressed proteins. Although purification tags can also be incorporated into fusions that are chemically synthesized, chemical synthesis typically provides sufficient purity that further purification by HPLC suffices; however, visualization tags as above described retain their utility even when the protein is produced by chemical synthesis, and when so included render the fusion proteins of the present invention useful as directly detectable markers of MDZ3, MDZ4, MDZ7 or MDZ12 presence.

[0452] As also discussed above, heterologous polypeptides to be included in the fusion proteins of the present invention can usefully include those that facilitate secretion of recombinantly expressed proteins (into the periplasmic space or extracellular milieu for prokaryotic hosts, into the culture medium for eukaryotic cells) through incorporation of secretion signals and/or leader sequences.

[0453] Other useful protein fusions of the present invention include those that permit use of the protein of the present invention as bait in a yeast two-hybrid system. See Bartel et al. (eds.), The Yeast Two-Hybrid System, Oxford University Press (1997) (ISBN: 0195109384); Zhu et al., Yeast Hybrid Technologies, Eaton Publishing (2000) (ISBN 1-881299-15-5); Fields et al., Trends Genet. 10(8):286-92 (1994); Mendelsohn et al., Curr. Opin. Biotechnol. 5(5):482-6 (1994); Luban et al., Curr. Opin. Biotechnol. 6(1):59-64 (1995); Allen et al., Trends Biochem. Sci. 20(12):511-6 (1995); Drees, Curr. Opin. Chem. Biol. 3(1):64-70 (1999); Topcu et al., Pharm. Res. 17(9):1049-55 (2000); Fashena et al., Gene 250(1-2):1-14 (2000), the disclosures of which are incorporated herein by reference in their entireties. Typically, such fusion is to either E. coli LexA or yeast GAL4 DNA binding domains. Related bait plasmids are available that express the bait fused to a nuclear localization signal.

[0454] Other useful protein fusions include those that permit display of the encoded protein on the surface of a phage or cell, fusions to intrinsically fluorescent proteins, such as green fluorescent protein (GFP), and fusions to the IgG Fc region, as described above, which discussion is incorporated here by reference in its entirety.

[0455] The proteins and protein fragments of the present invention can also usefully be fused to protein toxins, such as Pseudomonas exotoxin A, diphtheria toxin, Shiga toxin A, anthrax toxin lethal factor, ricin, in order to effect ablation of cells that bind or take up the proteins of the present invention.

[0456] The isolated proteins, protein fragments, and protein fusions of the present invention can be composed of natural amino acids linked by native peptide bonds, or can contain any or all of nonnatural amino acid analogues, nonnative bonds, and post-synthetic (post-translational) modifications, either throughout the length of the protein or localized to one or more portions thereof.

[0457] As is well known in the art, when the isolated protein is used, e.g., for epitope mapping, the range of such nonnatural analogues, nonnative inter-residue bonds, or post-synthesis modifications will be limited to those that permit binding of the peptide to antibodies. When used as an immunogen for the preparation of antibodies in a non-human host, such as a mouse, the range of such nonnatural analogues, nonnative inter-residue bonds, or post-synthesis modifications will be limited to those that do not interfere with the immunogenicity of the protein. When the isolated protein is used as a therapeutic agent, such as a vaccine or for replacement therapy, the range of such changes will be limited to those that do not confer toxicity upon the isolated protein.

[0458] Non-natural amino acids can be incorporated during solid phase chemical synthesis or by recombinant techniques, although the former is typically more common.

[0459] Solid phase chemical synthesis of peptides is well established in the art. Procedures are described, inter alia, in Chan et al. (eds.), Fmoc Solid Phase Peptide Synthesis: A Practical Approach (Practical Approach Series), Oxford Univ. Press (March 2000) (ISBN: 0199637245); Jones, Amino Acid and Peptide Synthesis (Oxford Chemistry Primers, No 7), Oxford Univ. Press (August 1992) (ISBN: 0198556683); and Bodanszky, Principles of Peptide Synthesis (Springer Laboratory), Springer Verlag (December 1993) (ISBN: 0387564314), the disclosures of which are incorporated herein by reference in their entireties.

[0460] For example, D-enantiomers of natural amino acids can readily be incorporated during chemical peptide synthesis: peptides assembled from D-amino acids are more resistant to proteolytic attack; incorporation of D-enantiomers can also be used to confer specific three dimensional conformations on the peptide. Other amino acid analogues commonly added during chemical synthesis include ornithine, norleucine, phosphorylated amino acids (typically phosphoserine, phosphothreonine, phosphotyrosine), L-malonyltyrosine, a non-hydrolyzable analog of phosphotyrosine (Kole et al., Biochem. Biophys. Res. Com. 209:817-821 (1995)), and various halogenated phenylalanine derivatives.

[0461] Amino acid analogues having detectable labels are also usefully incorporated during synthesis to provide a labeled polypeptide.

[0462] Biotin, for example (indirectly detectable through interaction with avidin, streptavidin, neutravidin, captavidin, or anti-biotin antibody), can be added using biotinoyl-(9-fluorenylmethoxycarbonyl)-L-lysine (FMOC biocytin) (Molecular Probes, Eugene, Oreg., USA). (Biotin can also be added enzymatically by incorporation into a fusion protein of an E. coli BirA substrate peptide.) The FMOC and tBOC derivatives of dabcyl-L-lysine (Molecular Probes, Inc., Eugene, Oreg., USA) can be used to incorporate the dabcyl chromophore at selected sites in the peptide sequence during synthesis. The aminonaphthalene derivative EDANS, the most common fluorophore for pairing with the dabcyl quencher in fluorescence resonance energy transfer (FRET) systems, can be introduced during automated synthesis of peptides by using EDANS-FMOC-L-glutamic acid or the corresponding tBOC derivative (both from Molecular Probes, Inc., Eugene, Oreg., USA). Tetramethylrhodamine fluorophores can be incorporated during automated FMOC synthesis of peptides using (FMOC)-TMR-L-lysine (Molecular Probes, Inc., Eugene, Oreg., USA).

[0463] Other useful amino acid analogues that can be incorporated during chemical synthesis include aspartic acid, glutamic acid, lysine, and tyrosine analogues having allyl side-chain protection (Applied Biosystems, Inc., Foster City, Calif., USA); the allyl side chain permits synthesis of cyclic, branched-chain, sulfonated, glycosylated, and phosphorylated peptides.

[0464] A large number of other FMOC-protected non-natural amino acid analogues capable of incorporation during chemical synthesis are available commercially, including, e.g., Fmoc-2-aminobicyclo[2.2.1]heptane-2-carboxylic acid, Fmoc-3-endo-aminobicyclo[2.2.1]heptane-2-endo-carboxylic acid, Fmoc-3-exo-aminobicyclo[2.2.1]heptane-2-exo-carboxylic acid, Fmoc-3-endo-amino-bicyclo[2.2.1]hept-5-ene-2-endo-carboxylic acid, Fmoc-3-exo-amino-bicyclo[2.2.1]hept-5-ene-2-exo-carboxylic acid, Fmoc-cis-2-amino-1-cyclohexanecarboxylic acid, Fmoc-trans-2-amino-1-cyclohexanecarboxylic acid, Fmoc-1-amino-1-cyclopentanecarboxylic acid, Fmoc-cis-2-amino-1-cyclopentanecarboxylic acid, Fmoc-1-amino-1-cyclopropanecarboxylic acid, Fmoc-D-2-amino-4-(ethylthio)butyric acid, Fmoc-L-2-amino-4-(ethylthio)butyric acid, Fmoc-L-buthionine, Fmoc-S-methyl-L-cysteine, Fmoc-2-aminobenzoic acid (anthranilic acid), Fmoc-3-aminobenzoic acid, Fmoc-4-aminobenzoic acid, Fmoc-2-aminobenzophenone-2-carboxylic acid, Fmoc-N-(4-aminobenzoyl)-β-alanine, Fmoc-2-amino-4,5-dimethoxybenzoic acid, Fmoc-4-aminohippuric acid, Fmoc-2-amino-3-hydroxybenzoic acid, Fmoc-2-amino-5-hydroxybenzoic acid, Fmoc-3-amino-4-hydroxybenzoic acid, Fmoc-4-amino-3-hydroxybenzoic acid, Fmoc-4-amino-2-hydroxybenzoic acid, Fmoc-5-amino-2-hydroxybenzoic acid, Fmoc-2-amino-3-methoxybenzoic acid, Fmoc-4-amino-3-methoxybenzoic acid, Fmoc-2-amino-3-methylbenzoic acid, Fmoc-2-amino-5-methylbenzoic acid, Fmoc-2-amino-6-methylbenzoic acid, Fmoc-3-amino-2-methylbenzoic acid, Fmoc-3-amino-4-methylbenzoic acid, Fmoc-4-amino-3-methylbenzoic acid, Fmoc-3-amino-2-naphthoic acid, Fmoc-D,L-3-amino-3-phenylpropionic acid, Fmoc-L-Methyldopa, Fmoc-2-amino-4,6-dimethyl-3-pyridinecarboxylic acid, Fmoc-D,L-2-amino-2-thiophenacetic acid, Fmoc-4-(carboxymethyl)piperazine, Fmoc-4-carboxypiperazine, Fmoc-4-(carboxymethyl)homopiperazine, Fmoc-4-phenyl-4-piperidinecarboxylic acid, Fmoc-L-1,2,3,4-tetrahydronorharman-3-carboxylic acid, Fmoc-L-thiazolidine-4-carboxylic acid, all available from The Peptide Laboratory (Richmond, Calif., USA).

[0465] Non-natural residues can also be added biosynthetically by engineering a suppressor tRNA, typically one that recognizes the UAG stop codon, by chemical aminoacylation with the desired unnatural amino acid. Conventional site-directed mutagenesis is used to introduce the chosen stop codon UAG at the site of interest in the protein gene. When the acylated suppressor tRNA and the mutant gene are combined in an in vitro transcription/translation system, the unnatural amino acid is incorporated in response to the UAG codon to give a protein containing that amino acid at the specified position. Liu et al., Proc. Natl Acad. Sci. USA 96(9):4780-5 (1999); Wang et al., Science 292(5516):498-500 (2001).

[0466] The isolated proteins, protein fragments and fusion proteins of the present invention can also include nonnative inter-residue bonds, including bonds that lead to circular and branched forms.

[0467] The isolated proteins and protein fragments of the present invention can also include post-translational and post-synthetic modifications, either throughout the length of the protein or localized to one or more portions thereof.

[0468] For example, when produced by recombinant expression in eukaryotic cells, the isolated proteins, fragments, and fusion proteins of the present invention will typically include N-linked and/or O-linked glycosylation, the pattern of which will reflect both the availability of glycosylation sites on the protein sequence and the identity of the host cell. Further modification of glycosylation pattern can be performed enzymatically.

[0469] As another example, recombinant polypeptides of the invention may also include an initial modified methionine residue, in some cases resulting from host-mediated processes.

[0470] When the proteins, protein fragments, and protein fusions of the present invention are produced by chemical synthesis, post-synthetic modification can be performed before deprotection and cleavage from the resin or after deprotection and cleavage. Modification before deprotection and cleavage of the synthesized protein often allows greater control, e.g. by allowing targeting of the modifying moiety to the N-terminus of a resin-bound synthetic peptide.
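The site-directed mutagenesis step of the suppressor-tRNA approach described in paragraph 0465 above, in which the UAG stop codon is placed at the site of interest, can be sketched at the sequence level. The Python below is a hypothetical illustration and not part of the original disclosure: it assumes an in-frame coding sequence written as DNA (so UAG appears as TAG), the function names are invented for this example, and the input is the first seven codons of the open reading frame shown in FIG. 3A.

```python
AMBER_DNA = "TAG"  # DNA-level equivalent of the UAG (amber) stop codon in mRNA


def split_codons(cds: str) -> list[str]:
    """Split an in-frame coding sequence into consecutive three-base codons."""
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    return [cds[i:i + 3] for i in range(0, len(cds), 3)]


def introduce_amber_codon(cds: str, position: int) -> str:
    """Return a copy of `cds` with the codon at 1-based `position` replaced by TAG.

    This only models the sequence change; it says nothing about how the
    mutagenesis itself is carried out in the laboratory.
    """
    codons = split_codons(cds.upper())
    if not 1 <= position <= len(codons):
        raise ValueError("codon position is outside the coding sequence")
    codons[position - 1] = AMBER_DNA
    return "".join(codons)


# First seven codons of the open reading frame in FIG. 3A (M L K E H P E).
EXAMPLE_CDS = "ATGCTTAAAGAGCATCCAGAG"

if __name__ == "__main__":
    # Replace codon 3 (AAA, lysine) with the amber codon.
    print(introduce_amber_codon(EXAMPLE_CDS, 3))
```

In an in vitro transcription/translation system supplemented with the acylated suppressor tRNA, the TAG introduced at codon 3 would be the position at which the unnatural amino acid is incorporated.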

[0471] Useful post-synthetic (and post-translational) modifications include conjugation to detectable labels, such as fluorophores.

[0472] A wide variety of amine-reactive and thiol-reactive fluorophore derivatives have been synthesized that react under nondenaturing conditions with N-terminal amino groups and epsilon amino groups of lysine residues, on the one hand, and with free thiol groups of cysteine residues, on the other.

[0473] Kits are available commercially that permit conjugation of proteins to a variety of amine-reactive or thiol-reactive fluorophores: Molecular Probes, Inc. (Eugene, Oreg., USA), e.g., offers kits for conjugating proteins to Alexa Fluor 350, Alexa Fluor 430, Fluorescein-EX, Alexa Fluor 488, Oregon Green 488, Alexa Fluor 532, Alexa Fluor 546, Alexa Fluor 546, Alexa Fluor 568, Alexa Fluor 594, and Texas Red-X.

[0474] A wide variety of other amine-reactive and thiol-reactive fluorophores are available commercially (Molecular Probes, Inc., Eugene, Oreg., USA), including Alexa Fluor® 350, Alexa Fluor® 488, Alexa Fluor® 532, Alexa Fluor® 546, Alexa Fluor® 568, Alexa Fluor® 594, Alexa Fluor® 647 (monoclonal antibody labeling kits available from Molecular Probes, Inc., Eugene, Oreg., USA), BODIPY dyes, such as BODIPY 493/503, BODIPY FL, BODIPY R6G, BODIPY 530/550, BODIPY TMR, BODIPY 558/568, BODIPY 558/568, BODIPY 564/570, BODIPY 576/589, BODIPY 581/591, BODIPY TR, BODIPY 630/650, BODIPY 650/665, Cascade Blue, Cascade Yellow, Dansyl, lissamine rhodamine B, Marina Blue, Oregon Green 488, Oregon Green 514, Pacific Blue, rhodamine 6G, rhodamine green, rhodamine red, tetramethylrhodamine, Texas Red (available from Molecular Probes, Inc., Eugene, Oreg., USA).

[0475] The polypeptides of the present invention can also be conjugated to fluorophores, other proteins, and other macromolecules, using bifunctional linking reagents.

[0476] Common homobifunctional reagents include, e.g., APG, AEDP, BASED, BMB, BMDB, BMH, BMOE, BM[PEO]3, BM[PEO]4, BS3, BSOCOES, DFDNB, DMA, DMP, DMS, DPDPB, DSG, DSP (Lomant's Reagent), DSS, DST, DTBP, DTME, DTSSP, EGS, HBVS, Sulfo-BSOCOES, Sulfo-DST, Sulfo-EGS (all available from Pierce, Rockford, Ill., USA); common heterobifunctional cross-linkers include ABH, AMAS, ANB-NOS, APDP, ASBA, BMPA, BMPH, BMPS, EDC, EMCA, EMCH, EMCS, KMUA, KMUH, GMBS, LC-SMCC, LC-SPDP, MBS, M2C2H, MPBH, MSA, NHS-ASA, PDPH, PMPI, SADP, SAED, SAND, SANPAH, SASD, SATP, SBAP, SFAD, SIA, SIAB, SMCC, SMPB, SMPH, SMPT, SPDP, Sulfo-EMCS, Sulfo-GMBS, Sulfo-HSAB, Sulfo-KMUS, Sulfo-LC-SPDP, Sulfo-MBS, Sulfo-NHS-LC-ASA, Sulfo-SADP, Sulfo-SANPAH, Sulfo-SIAB, Sulfo-SMCC, Sulfo-SMPB, Sulfo-LC-SMPT, SVSB, TFCS (all available from Pierce, Rockford, Ill., USA).

[0477] The proteins, protein fragments, and protein fusions of the present invention can be conjugated, using such cross-linking reagents, to fluorophores that are not amine- or thiol-reactive.

[0478] Other labels that usefully can be conjugated to the proteins, protein fragments, and fusion proteins of the present invention include radioactive labels, echosonographic contrast reagents, and MRI contrast agents.

[0479] The proteins, protein fragments, and protein fusions of the present invention can also usefully be conjugated using cross-linking agents to carrier proteins, such as KLH, bovine thyroglobulin, and even bovine serum albumin (BSA), to increase immunogenicity for raising anti-MDZ3, anti-MDZ4, anti-MDZ7 or anti-MDZ12 antibodies.

[0480] The proteins, protein fragments, and protein fusions of the present invention can also usefully be conjugated to polyethylene glycol (PEG); PEGylation increases the serum half life of proteins administered intravenously for replacement therapy. Delgado et al., Crit. Rev. Ther. Drug Carrier Syst. 9(3-4):249-304 (1992); Scott et al., Curr. Pharm. Des. 4(6):423-38 (1998); DeSantis et al., Curr. Opin. Biotechnol. 10(4):324-30 (1999), incorporated herein by reference in their entireties. PEG monomers can be attached to the protein directly or through a linker, with PEGylation using PEG monomers activated with tresyl chloride (2,2,2-trifluoroethanesulphonyl chloride) permitting direct attachment under mild conditions.

[0481] The isolated proteins of the present invention, including fusions thereof, can be produced by recombinant expression, typically using the expression vectors of the present invention as above-described or, if fewer than about 100 amino acids, by chemical synthesis (typically, solid phase synthesis), and, on occasion, by in vitro translation.

[0482] Production of the isolated proteins of the present invention can optionally be followed by purification.

[0483] Purification of recombinantly expressed proteins is now well within the skill in the art. See, e.g., Thorner et al. (eds.), Applications of Chimeric Genes and Hybrid Proteins, Part A: Gene Expression and Protein Purification (Methods in Enzymology, Volume 326), Academic Press (2000) (ISBN: 0121822273); Harbin (ed.), Cloning, Gene Expression and Protein Purification: Experimental Procedures and Process Rationale, Oxford Univ. Press (2001) (ISBN: 0195132947); Marshak et al., Strategies for Protein Purification and Characterization: A Laboratory Course Manual, Cold Spring Harbor Laboratory Press (1996) (ISBN: 0-87969-385-1); and Roe (ed.), Protein Purification Applications, Oxford University Press (2001), the disclosures of which are incorporated herein by reference in their entireties, and thus need not be detailed here.

[0484] Briefly, however, if purification tags have been fused through use of an expression vector that appends such tag, purification can be effected, at least in part, by means appropriate to the tag, such as use of immobilized metal affinity chromatography for polyhistidine tags. Other techniques common in the art include ammonium sulfate fractionation, immunoprecipitation, fast protein liquid chromatography (FPLC), high performance liquid chromatography (HPLC), and preparative gel electrophoresis.

[0485] Purification of chemically-synthesized peptides can readily be effected, e.g., by HPLC.

[0486] Accordingly, it is an aspect of the present invention to provide the isolated proteins of the present invention in pure or substantially pure form.

[0487] A purified protein of the present invention is an isolated protein, as above described, that is present at a concentration of at least 95%, as measured on a weight basis (w/w) with respect to total protein in a composition. Such purities can often be obtained during chemical synthesis without further purification, as, e.g., by HPLC. Purified proteins of the present invention can be present at a concentration (measured on a weight basis with respect to total protein in a composition) of 96%, 97%, 98%, and even 99%. The proteins of the present invention can even be present at levels of 99.5%, 99.6%, and even 99.7%, 99.8%, or even 99.9% following purification, as by HPLC.

[0488] Although high levels of purity are particularly useful when the isolated proteins of the present invention are used as therapeutic agents (such as vaccines, or for replacement therapy), the isolated proteins of the present invention are also useful at lower purity. For example, partially purified proteins of the present invention can be used as immunogens to raise antibodies in laboratory animals.

[0489] Thus, in another aspect, the present invention provides the isolated proteins of the present invention in substantially purified form. A "substantially purified protein" of the present invention is an isolated protein, as above described, present at a concentration of at least 70%, measured on a weight basis with respect to total protein in a composition. Usefully, the substantially purified protein is present at a concentration, measured on a weight basis with respect to total protein in a composition, of at least 75%, 80%, or even at least 85%, 90%, 91%, 92%, 93%, 94%, 94.5% or even at least 94.9%.

[0490] In preferred embodiments, the purified and substantially purified proteins of the present invention are in compositions that lack detectable ampholytes, acrylamide monomers, bis-acrylamide monomers, and polyacrylamide.

[0491] The proteins, fragments, and fusions of the present invention can usefully be attached to a substrate. The substrate can be porous or solid, planar or non-planar; the bond can be covalent or noncovalent.

[0492] For example, the proteins, fragments, and fusions of the present invention can usefully be bound to a porous substrate, commonly a membrane, typically comprising nitrocellulose, polyvinylidene fluoride (PVDF), or cationically derivatized, hydrophilic PVDF; so bound, the proteins, fragments, and fusions of the present invention can be used to detect and quantify antibodies, e.g. in serum, that bind specifically to the immobilized protein of the present invention.

[0493] As another example, the proteins, fragments, and fusions of the present invention can usefully be bound to a substantially nonporous substrate, such as plastic, to detect and quantify antibodies, e.g. in serum, that bind specifically to the immobilized protein of the present invention. Such plastics include polymethylacrylic, polyethylene, polypropylene, polyacrylate, polymethylmethacrylate, polyvinylchloride, polytetrafluoroethylene, polystyrene, polycarbonate, polyacetal, polysulfone, celluloseacetate, cellulosenitrate, nitrocellulose, or mixtures thereof; when the assay is performed in a standard microtiter dish, the plastic is typically polystyrene.

[0494] The proteins, fragments, and fusions of the present invention can also be attached to a substrate suitable for use as a surface enhanced laser desorption ionization source; so attached, the protein, fragment, or fusion of the present invention is useful for binding and then detecting secondary proteins that bind with sufficient affinity or avidity to the surface-bound protein to indicate biologic interaction therebetween. The proteins, fragments, and fusions of the present invention can also be attached to a substrate suitable for use in surface plasmon resonance detection; so attached, the protein, fragment, or fusion of the present invention is useful for binding and then detecting secondary proteins that bind with sufficient affinity or avidity to the surface-bound protein to indicate biological interaction therebetween.

[0495] MDZ3 Proteins

[0496] In a first series of protein embodiments, the invention provides an isolated MDZ3 polypeptide having an amino acid sequence encoded by the cDNA in ATCC Deposit No. ______, or the amino acid sequence in SEQ ID NO: 3, which are full length MDZ3 proteins. When used as immunogens, the full length proteins of the present invention can be used, inter alia, to elicit antibodies that bind to a variety of epitopes of the MDZ3 protein.

[0497] The invention further provides fragments of the above-described polypeptides, particularly fragments having at least 6 amino acids, typically at least 8 amino acids, often at least 15 amino acids, and even the entirety of the sequence given in SEQ ID NO: 3.

[0498] The invention further provides fragments of at least 6 amino acids, typically at least 8 amino acids, often at least 15 amino acids, and even the entirety of the sequence given in SEQ ID NO: 7.

[0499] As described above, the invention further provides proteins that differ in sequence from those described with particularity in the above-referenced SEQ ID NOS., whether by way of insertion or deletion, by way of conservative or moderately conservative substitutions, as hybridization related proteins, or as cross-hybridizing proteins.

[0500] Particularly useful among the above-described proteins are those having at least one C2H2 (Kruppel family) zinc finger, and especially those that have a plurality of C2H2 zinc fingers in tandem, particularly those that have 7 tandem C2H2 zinc fingers. Also particularly useful among the above-described fragments are those having a SCAN domain, those that encode a KRAB domain, and those that include all of a SCAN domain, KRAB domain, and 7 zinc fingers.

[0501] Also particularly useful are those proteins that have sequence-specific nucleic acid binding regulatory activity, and that participate in protein-protein interactions with other transcription modulators.

[0502] The invention further provides fusions of the proteins and protein fragments herein described to heterologous polypeptides.

[0503] MDZ4 Proteins

[0504] In a first series of protein embodiments, the invention provides an isolated MDZ4 polypeptide having an amino acid sequence encoded by the cDNA in ATCC Deposit No. ______, or the amino acid sequence in SEQ ID NO: 3029, which are full length MDZ4 proteins. When used as immunogens, the full length proteins of the present invention can be used, inter alia, to elicit antibodies that bind to a variety of epitopes of the MDZ4 protein.

[0505] The invention further provides fragments of the above-described polypeptides, particularly fragments having at least 6 amino acids, typically at least 8 amino acids, often at least 15 amino acids, and even the entirety of the sequence given in SEQ ID NO: 3029.

[0506] The invention further provides fragments of at least 6 amino acids, typically at least 8 amino acids, often at least 15 amino acids, and even the entirety of the sequence given in SEQ ID NOS: 3033 and 3037.

[0507] As described above, the invention further provides proteins that differ in sequence from those described with particularity in the above-referenced SEQ ID NOS., whether by way of insertion or deletion, by way of conservative or moderately conservative substitutions, as hybridization related proteins, or as cross-hybridizing proteins.

[0508] Particularly useful among such proteins are those that have at least one C2H2 (Kruppel family) zinc finger, and especially those that have 5 C2H2 zinc fingers in tandem, those that have a SCAN domain, and those that include all of a SCAN domain and 5 zinc fingers.

[0509] Also particularly useful among the above-described MDZ4 proteins are those that have sequence-specific nucleic acid binding regulatory activity, and that participate in protein-protein interactions with other transcription modulators.

[0510] The invention further provides fusions of the proteins and protein fragments herein described to heterologous polypeptides.

[0511] MDZ7 Proteins

[0512] In a first series of protein embodiments, the invention provides an isolated MDZ7 polypeptide having an amino acid sequence encoded by the cDNA in ATCC Deposit No. ______, or the amino acid sequence in SEQ ID NO: 4409, which are full length MDZ7 proteins. When used as immunogens, the full length proteins of the present invention can be used, inter alia, to elicit antibodies that bind to a variety of epitopes of the MDZ7 protein.

[0513] The invention further provides fragments of the above-described polypeptides, particularly fragments having at least 6 amino acids, typically at least 8 amino acids, often at least 15 amino acids, and even the entirety of the sequence given in SEQ ID NO: 4409.

[0514] As described above, the invention further provides proteins that differ in sequence from those described with particularity in the above-referenced SEQ ID NOS., whether by way of insertion or deletion, by way of conservative or moderately conservative substitutions, as hybridization related proteins, or as cross-hybridizing proteins, with those that substantially retain a MDZ7 activity particularly useful.

[0515] Particularly useful among the above-described MDZ7 proteins are those that have at least one C2H2 (Kruppel family) zinc finger, especially those having a plurality of zinc fingers in tandem, particularly those having 7 zinc fingers in tandem.

[0516] Also particularly useful among the above-described MDZ7 proteins are those that have sequence-specific nucleic acid binding regulatory activity, and that function in sequence-specific modulation of gene expression.

[0517] The invention further provides fusions of the proteins and protein fragments herein described to heterologous polypeptides.

[0518] MDZ12 Proteins

[0519] In a first series of protein embodiments, the invention provides an isolated MDZ12a polypeptide having an amino acid sequence encoded by the cDNA in ATCC Deposit No. ______, or the amino acid sequence in SEQ ID NO: 5772, which are full length MDZ12a proteins. When used as immunogens, the full length proteins of the present invention can be used, inter alia, to elicit antibodies that bind to a variety of epitopes of the MDZ12a protein.

[0520] The invention further provides fragments of the above-described polypeptides, particularly fragments having at least 6 amino acids, typically at least 8 amino acids, often at least 15 amino acids, and even the entirety of the sequence given in SEQ ID NO: 5772.

[0521] The invention further provides fragments of at least 6 amino acids, typically at least 8 amino acids, often at least 15 amino acids, and even the entirety of the sequence given in SEQ ID NO: 5774.

[0522] In another series of protein embodiments, the invention provides an isolated MDZ12bS polypeptide having an amino acid sequence encoded by the MDZ12bS part of the MDZ12b cDNA in ATCC Deposit No. ______, or the amino acid sequence in SEQ ID NO: 6939, a full length MDZ12bS protein. When used as immunogens, the full length proteins of the present invention can be used, inter alia, to elicit antibodies that bind to a variety of epitopes of the MDZ12bS protein.

[0523] The invention further provides fragments of the above-described polypeptides, particularly fragments having at least 6 amino acids, typically at least 8 amino acids, often at least 15 amino acids, and even the entirety of the sequence given in SEQ ID NO: 6939.

[0524] In another series of protein embodiments, the invention provides an isolated MDZ12bL polypeptide having an amino acid sequence encoded by the MDZ12bL portion of the MDZ12b cDNA in ATCC Deposit No. ______, or the amino acid sequence in SEQ ID NO: 6940, which is a full length MDZ12bL protein. When used as immunogens, the full length proteins of the present invention can be used, inter alia, to elicit antibodies that bind to a variety of epitopes of the MDZ12bL protein.

[0525] The invention further provides fragments of the above-described polypeptides, particularly fragments having at least 6 amino acids, typically at least 8 amino acids, often at least 15 amino acids, and even the entirety of the sequence given in SEQ ID NO: 6940.

[0526] The invention further provides fragments of at least 6 amino acids, typically at least 8 amino acids, often at least 15 amino acids, and even the entirety of the sequence given in SEQ ID NO: 6942.

[0527] As described above, the invention further provides proteins that differ in sequence from those described with particularity in the above-referenced SEQ ID NOS., whether by way of insertion or deletion, by way of conservative or moderately conservative substitutions, as hybridization related proteins, or as cross-hybridizing proteins, with those that substantially retain a MDZ12 activity particularly useful.

[0528] Particularly useful among the above-described MDZ12 proteins are those that have a C2H2 (Kruppel family) zinc finger, particularly those having a plurality of such zinc fingers in tandem, especially those having at least 5, often at least 12, zinc fingers in tandem. Also particularly useful among the above-described proteins are those that have a KRAB-B domain, especially those having both a KRAB domain and at least one, preferably a plurality, especially at least 10, often at least 12, zinc finger domains.

[0529] Particularly useful proteins are those that act as sequence-specific transcription regulators, and that interact with other transcriptional modulators by protein-protein interactions.

[0530] The invention further provides fusions of the proteins and protein fragments herein described to heterologous polypeptides.

[0531] Antibodies and Antibody-Producing Cells

[0532] In another aspect, the invention provides antibodies, including fragments and derivatives thereof, that bind specifically to MDZ3, MDZ4, MDZ7 or MDZ12 proteins and protein fragments of the present invention or to one or more of the proteins and protein fragments encoded by the isolated MDZ3, MDZ4, MDZ7 or MDZ12 nucleic acids of the present invention. The antibodies of the present invention can be specific for all of linear epitopes, discontinuous epitopes, or conformational epitopes of such proteins or

75-fold, and often by more than 100-fold, and on occasion by more than 500-fold or 1000-fold. When used to detect the proteins or protein fragments of the present invention, the antibody of the present invention is sufficiently specific when it can be used to determine the presence of the protein of the present invention in samples derived from human tissues expressing each of the genes (See, e.g., Examples 1, 2, 3, and 4).

[0537] Typically, the affinity or avidity of an antibody (or antibody multimer, as in the case of an IgM pentamer) of the present invention for a protein or protein fragment of the present invention will be at least about 1×10⁻⁶ molar (M), typically at least about 5×10⁻⁷ M, usefully at least about 1×10⁻⁷ M, with affinities and avidities of at least 1×10⁻⁸ M, 5×10⁻⁹ M, and 1×10⁻¹⁰ M proving especially useful.

[0538] The antibodies of the present invention can be naturally-occurring forms, such as IgG, IgM, IgD, IgE, and IgA, from any mammalian species.

[0539] Human antibodies can, but will infrequently, be drawn directly from human donors or human cells. In such case, antibodies to the proteins of the present invention will typically have resulted from fortuitous immunization, such as autoimmune immunization, with the protein or protein fragments of the present invention. Such antibodies will typically, but will not invariably, be polyclonal.

[0540] Human antibodies are more frequently obtained using transgenic animals that express human immunoglobulin genes, which transgenic animals can be affirmatively immunized with the protein immunogen of the present invention.
Human Ig-transgenic mice capable of producing protein fragments, either as present on the protein in its human antibodies and methods of producing human anti native conformation or, in Some cases, as present on the bodies therefrom upon Specific immunization are described, proteins as denatured, as, e.g., by Solubilization in SDS. interalia, in U.S. Pat. Nos. 6,162,963; 6,150,584, 6,114,598; 0533. In other embodiments, the invention provides anti 6,075,181; 5,939,598; 5,877,397; 5,874.299; 5,814,318; bodies, including fragments and derivatives thereof, the 5,789,650; 5,770,429; 5,661,016; 5,633,425; 5,625,126; binding of which can be competitively inhibited by one or 5,569,825; 5,545,807; 5,545,806, and 5,591,669, the disclo more of the MDZ3, MDZ4, MDZ7 or MDZ12 proteins and Sures of which are incorporated herein by reference in their protein fragments of the present invention, or by one or more entireties. Such antibodies are typically monoclonal, and are of the proteins and protein fragments encoded by the iso typically produced using techniques developed for produc lated MDZ3, MDZ4, MDZ7 or MDZ12 nucleic acids of the tion of murine antibodies. present invention. 0541 Human antibodies are particularly useful, and often 0534 AS used herein, the term “antibody” refers to a preferred, when the antibodies of the present invention are polypeptide, at least a portion of which is encoded by at least to be administered to human beings as in Vivo diagnostic or one immunoglobulin gene, which can bind Specifically to a therapeutic agents, since recipient immune response to the first molecular Species, and to fragments or derivatives administered antibody will often be substantially less than thereof that remain capable of Such specific binding. that occasioned by administration of an antibody derived 0535. By “bind specifically” and “specific binding” is from another Species, Such as mouse. here intended the ability of the antibody to bind to a first 0542. 
IgG, IgM, Ig|D, IgE and IgA antibodies of the molecular species in preference to binding to other molecu present invention are also usefully obtained from other lar species with which the antibody and first molecular mammalian Species, including rodents-typically mouse, Species are admixed. An antibody is Said Specifically to but also rat, guinea pig, and hamster-lagomorphs, typically “recognize” a first molecular species when it can bind rabbits, and also larger mammals, Such as Sheep, goats, Specifically to that first molecular species. cows, and horses. In Such cases, as with the transgenic human-antibody-producing non-human mammals, fortu 0536 AS is well known in the art, the degree to which an itous immunization is not required, and the non-human antibody can discriminate as among molecular species in a mammal is typically affirmatively immunized, according to mixture will depend, in part, upon the conformational relat Standard immunization protocols, with the protein or protein edness of the Species in the mixture; typically, the antibodies fragment of the present invention. of the present invention will discriminate over adventitious binding to non-MDZ3, non-MDZ4, non-MDZ7 or non 0543. As discussed above, virtually all fragments of 8 or MDZ12 proteins by at least two-fold, more typically by at more contiguous amino acids of the proteins of the present least 5-fold, typically by more than 10-fold, 25-fold, 50-fold, invention can be used effectively as immunogens when US 2004/0078837 A1 Apr. 22, 2004 39 conjugated to a carrier, typically a protein Such as bovine present invention can be cloned from hybridomas and there thyroglobulin, keyhole limpet hemocyanin, or bovine Serum after expressed in other host cells. 
Nor need the two neces albumin, conveniently using a bifunctional linker Such as Sarily be performed together: e.g., genes encoding antibod those described elsewhere above, which discussion is incor ies Specific for the proteins and protein fragments of the porated by reference here. present invention can be cloned directly from B cells known to be specific for the desired protein, as further described in 0544 Immunogenicity can also be conferred by fusion of U.S. Pat. No. 5,627,052, the disclosure of which is incor the proteins and protein fragments of the present invention porated herein by reference in its entirety, or from antibody to other moieties. displaying phage. 0545 For example, peptides of the present invention can be produced by Solid phase Synthesis on a branched polyl 0550 Recombinant expression in host cells is particu ysine core matrix; these multiple antigenic peptides (MAPs) larly useful when fragments or derivatives of the antibodies provide high purity, increased avidity, accurate chemical of the present invention are desired. definition and improved Safety in vaccine development. Tam 0551 Host cells for recombinant antibody production et al., Proc. Natl. Acad. Sci. USA 85:5409-5413 (1988); either whole antibodies, antibody fragments, or antibody Posnett et al., J. Biol. Chem. 263, 1719-1725 (1988). derivatives-can be prokaryotic or eukaryotic. 0546 Protocols for immunizing non-human mammals 0552) Prokaryotic hosts are particularly useful for pro are well-established in the art, Harlow et al. (eds.), Antibod ducing phage displayed antibodies of the present invention. ies. A Laboratory Manual, Cold Spring Harbor Laboratory (1998) (ISBN: 0879693142); Coligan et al. (eds.), Current 0553 The technology of phage-displayed antibodies, in Protocols in Immunology, John Wiley & Sons, Inc. 
(2001) which antibody variable region fragments are fused, for (ISBN: 0-471-52276-7); Zola, Monoclonal Antibodies: example, to the gene III protein (pII) or gene VIII protein Preparation and Use of Monoclonal Antibodies and Engi (pVIII) for display on the Surface of filamentous phage, Such neered Antibody Derivatives (Basics: From Background to as M13, is by now well-established, Sidhu, Curr. Opin. Bench), Springer Verlag (2000) (ISBN: 0387915907), the Biotechnol. 11(6):610-6 (2000); Griffiths et al., Curr. Opin. disclosures of which are incorporated herein by reference, Biotechnol. 9(1):102-8 (1998); Hoogenboom et al., Immu and often include multiple immunizations, either with or notechnology, 4(1):1-20 (1998); Rader et al., Current Opin without adjuvants Such as Freund's complete adjuvant and ion in Biotechnology 8:503-508 (1997); Aujame et al., Freund's incomplete adjuvant. Human Antibodies 8:155-168 (1997); Hoogenboom, Trends in Biotechnol. 15:62-70 (1997); de Kruif et al., 17:453-455 0547 Antibodies from nonhuman mammals can be poly (1996); Barbas et al., Trends in Biotechnol. 14:230-234 clonal or monoclonal, with polyclonal antibodies having (1996); Winter et al., Ann. Rev. Immunol. 433-455 (1994), certain advantages in immunohistochemical detection of the and techniques and protocols required to generate, propa proteins of the present invention and monoclonal antibodies gate, Screen (pan), and use the antibody fragments from Such having advantages in identifying and distinguishing particu libraries have recently been compiled, Barbas et al., Phage lar epitopes of the proteins of the present invention. Display: A Laboratory Manual, Cold Spring Harbor Labo 0548. Following immunization, the antibodies of the ratory Press (2001) (ISBN 0-87969-546-3); Kay et al. (eds.), present invention can be produced using any art-accepted Phage Display of Peptides and Proteins. A Laboratory technique. Such techniques are well known in the art, Manual, Academic Press, Inc. 
(1996); Abelson et al. (eds.), Coligan et al. (eds.), Current Protocols in Immunology, John Combinatorial Chemistry, Methods in Enzymology vol. Wiley & Sons, Inc. (2001) (ISBN: 0-471-52276-7); Zola, 267, Academic Press (May 1996), the disclosures of which Monoclonal Antibodies. Preparation and Use of Mono are incorporated herein by reference in their entireties. clonal Antibodies and Engineered Antibody Derivatives 0554 Typically, phage-displayed antibody fragments are (Basics. From Background to Bench), Springer Verlag ScFv fragments or Fab fragments, when desired, full length (2000) (ISBN: 0387915907); Howard et al. (eds), Basic antibodies can be produced by cloning the variable regions Methods in Antibody Production and Characterization, from the displaying phage into a complete antibody and CRC Press (2000) (ISBN: 0849394.457); Harlow et al. expressing the full length antibody in a further prokaryotic (eds.), Antibodies: A Laboratory Manual, Cold Spring Har or a eukaryotic host cell. bor Laboratory (1998) (ISBN: 0879693142); Davis (ed.), Monoclonal Antibody Protocols, Vol. 45, Humana Press 0555 Eukaryotic cells are also useful for expression of (1995) (ISBN: 0896033082); Delves (ed.), Antibody Pro the antibodies, antibody fragments, and antibody derivatives duction: Essential Techniques, John Wiley & Son Ltd (1997) of the present invention. (ISBN: 0471970.107); Kenney, Antibody Solution. An Anti body Methods Manual, Chapman & Hall (1997) (ISBN: 0556. For example, antibody fragments of the present 0412141914), incorporated herein by reference in their invention can be produced in Pichia pastoris, Takahashi et al., Biosci. Biotechnol. Biochem. 64(10):2138-44 (2000); entireties, and thus need not be detailed here. Freyre et al., J. Biotechnol. 76(2-3): 157-63 (2000); Fischer 0549 Briefly, however, such techniques include, inter et al., Biotechnol. Appl. Biochem. 30 (Pt 2): 117-20 (1999); alia, production of monoclonal antibodies by hybridomas Pennell et al., Res. 
Immunol. 149(6):599-603 (1998); Eldin and expression of antibodies or fragments or derivatives et al., J. Immunol. Methods. 201(1):67-75 (1997); and in thereof from host cells engineered to express immunoglo Saccharomyces cerevisiae, Frenken et al., Res. Immunol. bulin genes or fragments thereof. These two methods of 149(6):589-99 (1998); Shusta et al., Nature Biotechnol. production are not mutually exclusive: genes encoding anti 16(8):773-7 (1998), the disclosures of which are incorpo bodies Specific for the proteins or protein fragments of the rated herein by reference in their entireties. US 2004/0078837 A1 Apr. 22, 2004 40
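Paragraph [0543] states that virtually all fragments of 8 or more contiguous amino acids can serve as immunogens when conjugated to a carrier. As a rough illustration only, the Python sketch below enumerates every contiguous fragment of at least a chosen length from a protein sequence. The function names `contiguous_fragments` and `count_fragments` are hypothetical and are not part of the disclosure; the 13-residue example string is just the first row of residues shown in FIG. 3A.

```python
from typing import Iterator


def contiguous_fragments(sequence: str, min_len: int = 8) -> Iterator[str]:
    """Yield every contiguous fragment of at least `min_len` residues.

    Fragments are produced shortest first, and left to right within a length.
    """
    if min_len < 1:
        raise ValueError("min_len must be at least 1")
    n = len(sequence)
    for length in range(min_len, n + 1):
        for start in range(n - length + 1):
            yield sequence[start:start + length]


def count_fragments(sequence: str, min_len: int = 8) -> int:
    """Closed-form count: the sum over lengths L of (n - L + 1)."""
    n = len(sequence)
    if n < min_len:
        return 0
    k = n - min_len + 1
    return k * (k + 1) // 2


if __name__ == "__main__":
    example = "MLKEHPEMAEAPQ"  # first 13 residues shown in FIG. 3A
    for fragment in contiguous_fragments(example, min_len=8):
        print(fragment)
    print(count_fragments(example, min_len=8))
```

The default of 8 mirrors the threshold named in [0543]; passing `min_len=6` or `min_len=15` would correspond to the other fragment lengths recited for the polypeptides.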

[0557] Antibodies, including antibody fragments and derivatives, of the present invention can also be produced in insect cells, Li et al., Protein Expr. Purif. 21(1):121-8 (2001); Ailor et al., Biotechnol. Bioeng. 58(2-3):196-203 (1998); Hsu et al., Biotechnol. Prog. 13(1):96-104 (1997); Edelman et al., Immunology 91(1):13-9 (1997); and Nesbit et al., J. Immunol. Methods 151(1-2):201-8 (1992), the disclosures of which are incorporated herein by reference in their entireties.

[0558] Antibodies and fragments and derivatives thereof of the present invention can also be produced in plant cells, Giddings et al., Nature Biotechnol. 18(11):1151-5 (2000); Gavilondo et al., Biotechniques 29(1):128-38 (2000); Fischer et al., J. Biol. Regul. Homeost. Agents 14(2):83-92 (2000); Fischer et al., Biotechnol. Appl. Biochem. 30 (Pt 2):113-6 (1999); Fischer et al., Biol. Chem. 380(7-8):825-39 (1999); Russell, Curr. Top. Microbiol. Immunol. 240:119-38 (1999); and Ma et al., Plant Physiol. 109(2):341-6 (1995), the disclosures of which are incorporated herein by reference in their entireties.

[0559] Mammalian cells useful for recombinant expression of antibodies, antibody fragments, and antibody derivatives of the present invention include CHO cells, COS cells, 293 cells, and myeloma cells.

[0560] Verma et al., J. Immunol. Methods 216(1-2):165-81 (1998), review and compare bacterial, yeast, insect and mammalian expression systems for expression of antibodies.

[0561] Antibodies of the present invention can also be prepared by cell free translation, as further described in Merk et al., J. Biochem. (Tokyo) 125(2):328-33 (1999) and Ryabova et al., Nature Biotechnol. 15(1):79-84 (1997), and in the milk of transgenic animals, as further described in Pollock et al., J. Immunol. Methods 231(1-2):147-57 (1999), the disclosures of which are incorporated herein by reference in their entireties.

[0562] The invention further provides antibody fragments that bind specifically to one or more of the proteins and protein fragments of the present invention, to one or more of the proteins and protein fragments encoded by the isolated nucleic acids of the present invention, or the binding of which can be competitively inhibited by one or more of the proteins and protein fragments of the present invention or one or more of the proteins and protein fragments encoded by the isolated nucleic acids of the present invention.

[0563] Among such useful fragments are Fab, Fab', Fv, F(ab')2, and single chain Fv (scFv) fragments. Other useful fragments are described in Hudson, Curr. Opin. Biotechnol. 9(4):395-402 (1998).

[0564] It is also an aspect of the present invention to provide antibody derivatives that bind specifically to one or more of the proteins and protein fragments of the present invention, to one or more of the proteins and protein fragments encoded by the isolated nucleic acids of the present invention, or the binding of which can be competitively inhibited by one or more of the proteins and protein fragments of the present invention or one or more of the proteins and protein fragments encoded by the isolated nucleic acids of the present invention.

[0565] Among such useful derivatives are chimeric, primatized, and humanized antibodies; such derivatives are less immunogenic in human beings, and thus more suitable for in vivo administration, than are unmodified antibodies from non-human mammalian species.

[0566] Chimeric antibodies typically include heavy and/or light chain variable regions (including both CDR and framework residues) of immunoglobulins of one species, typically mouse, fused to constant regions of another species, typically human. See, e.g., U.S. Pat. No. 5,807,715; Morrison et al., Proc. Natl. Acad. Sci. USA 81(21):6851-5 (1984); Sharon et al., Nature 309(5966):364-7 (1984); Takeda et al., Nature 314(6010):452-4 (1985), the disclosures of which are incorporated herein by reference in their entireties. Primatized and humanized antibodies typically include heavy and/or light chain CDRs from a murine antibody grafted into a non-human primate or human antibody V region framework, usually further comprising a human constant region, Riechmann et al., Nature 332(6162):323-7 (1988); Co et al., Nature 351(6326):501-2 (1991); U.S. Pat. Nos. 6,054,297; 5,821,337; 5,770,196; 5,766,886; 5,821,123; 5,869,619; 6,180,377; 6,013,256; 5,693,761; and 6,180,370, the disclosures of which are incorporated herein by reference in their entireties.

[0567] Other useful antibody derivatives of the invention include heteromeric antibody complexes and antibody fusions, such as diabodies (bispecific antibodies), single chain diabodies, and intrabodies.

[0568] The antibodies of the present invention, including fragments and derivatives thereof, can usefully be labeled. It is, therefore, another aspect of the present invention to provide labeled antibodies that bind specifically to one or more of the proteins and protein fragments of the present invention, to one or more of the proteins and protein fragments encoded by the isolated nucleic acids of the present invention, or the binding of which can be competitively inhibited by one or more of the proteins and protein fragments of the present invention or one or more of the proteins and protein fragments encoded by the isolated nucleic acids of the present invention.

[0569] The choice of label depends, in part, upon the desired use.

[0570] For example, when the antibodies of the present invention are used for immunohistochemical staining of tissue samples, the label can usefully be an enzyme that catalyzes production and local deposition of a detectable product.

[0571] Enzymes typically conjugated to antibodies to permit their immunohistochemical visualization are well known, and include alkaline phosphatase, β-galactosidase, glucose oxidase, horseradish peroxidase (HRP), and urease. Typical substrates for production and deposition of visually detectable products include o-nitrophenyl-beta-D-galactopyranoside (ONPG); o-phenylenediamine dihydrochloride (OPD); p-nitrophenyl phosphate (PNPP); p-nitrophenyl-beta-D-galactopyranoside (PNPG); 3,3'-diaminobenzidine (DAB); 3-amino-9-ethylcarbazole (AEC); 4-chloro-1-naphthol (CN); 5-bromo-4-chloro-3-indolyl-phosphate (BCIP); ABTS®; BluoGal; iodonitrotetrazolium (INT); nitroblue tetrazolium chloride (NBT); phenazine methosulfate (PMS); phenolphthalein monophosphate (PMP); tetramethylbenzidine (TMB); tetranitroblue tetrazolium (TNBT); X-Gal; X-Gluc; and X-Glucoside.

[0572] Other substrates can be used to produce products for local deposition that are luminescent. For example, in the presence of hydrogen peroxide (H2O2), horseradish peroxidase (HRP) can catalyze the oxidation of cyclic diacylhydrazides, such as luminol. Immediately following the oxidation, the luminol is in an excited state (intermediate reaction product), which decays to the ground state by emitting light. Strong enhancement of the light emission is produced by enhancers, such as phenolic compounds. Advantages include high sensitivity, high resolution, and rapid detection without radioactivity and requiring only small amounts of antibody. See, e.g., Thorpe et al., Methods Enzymol. 133:331-53 (1986); Kricka et al., J. Immunoassay 17(1):67-83 (1996); and Lundqvist et al., J. Biolumin. Chemilumin. 10(6):353-9 (1995), the disclosures of which are incorporated herein by reference in their entireties. Kits for such enhanced chemiluminescent detection (ECL) are available commercially.

[0573] The antibodies can also be labeled using colloidal gold.

[0574] As another example, when the antibodies of the present invention are used, e.g., for flow cytometric detection, for scanning laser cytometric detection, or for fluorescent immunoassay, they can usefully be labeled with fluorophores.

[0575] There are a wide variety of fluorophore labels that can usefully be attached to the antibodies of the present invention.

[0576] For flow cytometric applications, both for extracellular detection and for intracellular detection, common useful fluorophores can be fluorescein isothiocyanate (FITC), allophycocyanin (APC), R-phycoerythrin (PE), peridinin chlorophyll protein (PerCP), Texas Red, Cy3, Cy5, fluorescence resonance energy tandem fluorophores such as PerCP-Cy5.5, PE-Cy5, PE-Cy5.5, PE-Cy7, PE-Texas Red, and APC-Cy7.

[0577] Other fluorophores include, inter alia, Alexa Fluor® 350, Alexa Fluor® 488, Alexa Fluor® 532, Alexa Fluor® 546, Alexa Fluor® 568, Alexa Fluor® 594, Alexa Fluor® 647 (monoclonal antibody labeling kits available from Molecular Probes, Inc., Eugene, Oreg., USA), BODIPY dyes, such as BODIPY 493/503, BODIPY FL, BODIPY R6G, BODIPY 530/550, BODIPY TMR, BODIPY 558/568, BODIPY 558/568, BODIPY 564/570, BODIPY 576/589, BODIPY 581/591, BODIPY TR, BODIPY 630/650, BODIPY 650/665, Cascade Blue, Cascade Yellow, Dansyl, lissamine rhodamine B, Marina Blue, Oregon Green 488, Oregon Green 514, Pacific Blue, rhodamine 6G, rhodamine green, rhodamine red, tetramethylrhodamine, Texas Red (available from Molecular Probes, Inc., Eugene, Oreg., USA), and Cy2, Cy3, Cy3.5, Cy5, Cy5.5, Cy7, all of which are also useful for fluorescently labeling the antibodies of the present invention.

[0578] For secondary detection using labeled avidin, streptavidin, captavidin or neutravidin, the antibodies of the present invention can usefully be labeled with biotin.

[0579] When the antibodies of the present invention are used, e.g., for western blotting applications, they can usefully be labeled with radioisotopes, such as 33P, 32P, 35S, 3H, and 125I.

[0580] As another example, when the antibodies of the present invention are used for radioimmunotherapy, the label can usefully be 228Th, 227Ac, 225Ac, 223Ra, 213Bi, 212Pb, 212Bi, 211At, 203Pb, 194Os, 188Re, 186Re, 153Sm, 149Tb, 131I, 125I, 111In, 105Rh, 99mTc, 97Ru, 90Y, 90Sr, 88Y, 72Se, 67Cu, or 47Sc.

[0581] As another example, when the antibodies of the present invention are to be used for in vivo diagnostic use, they can be rendered detectable by conjugation to MRI contrast agents, such as gadolinium diethylenetriaminepentaacetic acid (DTPA), Lauffer et al., Radiology 207(2):529-38 (1998), or by radioisotopic labeling. As would be understood, use of the labels described above is not restricted to the application for which they were mentioned.

[0582] The antibodies of the present invention, including fragments and derivatives thereof, can also be conjugated to toxins, in order to target the toxin's ablative action to cells that display and/or express the proteins of the present invention. Commonly, the antibody in such immunotoxins is conjugated to Pseudomonas exotoxin A, diphtheria toxin, Shiga toxin A, anthrax toxin lethal factor, or ricin. See Hall (ed.), Immunotoxin Methods and Protocols (Methods in Molecular Biology, Vol 166), Humana Press (2000) (ISBN: 0896037754); and Frankel et al. (eds.), Clinical Applications of Immunotoxins, Springer-Verlag New York, Incorporated (1998) (ISBN: 3540640975), the disclosures of which are incorporated herein by reference in their entireties, for review.

[0583] The antibodies of the present invention can usefully be attached to a substrate, and it is, therefore, another aspect of the invention to provide antibodies that bind specifically to one or more of the proteins and protein fragments of the present invention, to one or more of the proteins and protein fragments encoded by the isolated nucleic acids of the present invention, or the binding of which can be competitively inhibited by one or more of the proteins and protein fragments of the present invention or one or more of the proteins and protein fragments encoded by the isolated nucleic acids of the present invention, attached to a substrate.

[0584] Substrates can be porous or nonporous, planar or nonplanar.

[0585] For example, the antibodies of the present invention can usefully be conjugated to filtration media, such as NHS-activated Sepharose or CNBr-activated Sepharose for purposes of immunoaffinity chromatography.

[0586] For example, the antibodies of the present invention can usefully be attached to paramagnetic microspheres, typically by biotin-streptavidin interaction, which microsphere can then be used for isolation of cells that express or display the proteins of the present invention. As another example, the antibodies of the present invention can usefully be attached to the surface of a microtiter plate for ELISA.

[0587] As noted above, the antibodies of the present invention can be produced in prokaryotic and eukaryotic cells. It is, therefore, another aspect of the present invention to provide cells that express the antibodies of the present invention, including hybridoma cells, B cells, plasma cells, and host cells recombinantly modified to express the antibodies of the present invention.
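The label choices described in paragraphs [0570] through [0581] amount to a lookup from intended use to a class of label. The Python sketch below captures only that mapping, as an illustration. The dictionary keys and the function name `suggested_labels` are hypothetical identifiers chosen here and do not come from the disclosure, and the text itself notes that a label is not restricted to the application for which it is mentioned.

```python
from typing import Dict, List

# Label classes by intended use, paraphrased from paragraphs [0570]-[0581].
# The keys are hypothetical identifiers chosen for this sketch only.
LABELS_BY_USE: Dict[str, List[str]] = {
    "immunohistochemistry": ["enzyme"],
    "flow_cytometry": ["fluorophore"],
    "scanning_laser_cytometry": ["fluorophore"],
    "fluorescent_immunoassay": ["fluorophore"],
    "secondary_detection_with_avidin": ["biotin"],
    "western_blotting": ["radioisotope"],
    "radioimmunotherapy": ["radioisotope"],
    "in_vivo_diagnostic": ["MRI contrast agent", "radioisotope"],
}


def suggested_labels(use: str) -> List[str]:
    """Return the label classes the text mentions for `use`.

    An empty list means the text names no label for that use; it does not
    mean that no label is suitable.
    """
    return list(LABELS_BY_USE.get(use, []))


if __name__ == "__main__":
    for use in ("immunohistochemistry", "in_vivo_diagnostic", "elisa_capture"):
        print(use, suggested_labels(use))
```

A flat dictionary is used because the text gives one or two label classes per application and no ranking among them.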

[0588] In yet a further aspect, the present invention provides aptamers evolved to bind specifically to one or more of the proteins and protein fragments of the present invention, to one or more of the proteins and protein fragments encoded by the isolated nucleic acids of the present invention, or the binding of which can be competitively inhibited by one or more of the proteins and protein fragments of the present invention or one or more of the proteins and protein fragments encoded by the isolated nucleic acids of the present invention.

[0589] MDZ3 Antibodies

[0590] In a first series of antibody embodiments, the invention provides antibodies, both polyclonal and monoclonal, and fragments and derivatives thereof, that bind specifically to, or the binding of which can be competitively inhibited by, a polypeptide having an amino acid sequence encoded by the MDZ3 cDNA in ATCC Deposit No. ______, or having the amino acid sequence of SEQ ID NO: 3.

[0591] In a second series of antibody embodiments, the invention provides antibodies, both polyclonal and monoclonal, and fragments and derivatives thereof, that bind specifically to, or the binding of which can be competitively inhibited by, a polypeptide having the amino acid sequence of SEQ ID NO: 7.

[0592] In a third series of antibody embodiments, the invention provides antibodies, both polyclonal and monoclonal, and fragments and derivatives thereof, that bind specifically to, or the binding of which can be competitively inhibited by, polypeptides encoded by any of the MDZ3 nucleic acids of the present invention, as above-described.

[0593] Such antibodies are useful in in vitro immunoassays, such as ELISA, western blot or immunohistochemical assay, for detection of MDZ3 and related proteins. Such antibodies are also useful in isolating and purifying MDZ3 proteins, including related cross-reactive proteins, by immunoprecipitation, immunoaffinity chromatography, or magnetic bead-mediated purification.

[0594] In other embodiments, the invention further provides the above-described antibodies detectably labeled, and in yet other embodiments, provides the above-described antibodies attached to a substrate.

[0595] MDZ4 Antibodies

[0596] In a first series of antibody embodiments, the invention provides antibodies, both polyclonal and monoclonal, and fragments and derivatives thereof, that bind specifically to, or the binding of which can be competitively inhibited by, a polypeptide having an amino acid sequence encoded by the MDZ4 cDNA in ATCC Deposit No. ______, or having the amino acid sequence of SEQ ID NO: 3029.

[0597] In a second series of antibody embodiments, the invention provides antibodies, both polyclonal and monoclonal, and fragments and derivatives thereof, that bind specifically to, or the binding of which can be competitively inhibited by, a polypeptide having the amino acid sequence of SEQ ID NO: 3033.

[0598] In a third series of antibody embodiments, the invention provides antibodies, both polyclonal and monoclonal, and fragments and derivatives thereof, that bind specifically to, or the binding of which can be competitively inhibited by, a polypeptide having the amino acid sequence of SEQ ID NO: 3037.

[0599] In a fourth series of antibody embodiments, the invention provides antibodies, both polyclonal and monoclonal, and fragments and derivatives thereof, that bind specifically to, or the binding of which can be competitively inhibited by, polypeptides encoded by any of the MDZ7 nucleic acids of the present invention, as above-described.

[0600] Such antibodies are useful in in vitro immunoassays, such as ELISA, western blot or immunohistochemical assay, for detection of MDZ3 and related proteins. Such antibodies are also useful in isolating and purifying MDZ4 proteins, including related cross-reactive proteins, by immunoprecipitation, immunoaffinity chromatography, or magnetic bead-mediated purification.

[0601] In other embodiments, the invention further provides the above-described antibodies detectably labeled, and in yet other embodiments, provides the above-described antibodies attached to a substrate.

[0602] MDZ7 Antibodies

[0603] In a first series of antibody embodiments of this aspect of the invention, the invention provides antibodies, both polyclonal and monoclonal, and fragments and derivatives thereof, that bind specifically to, or the binding of which can be competitively inhibited by, a polypeptide having an amino acid sequence encoded by the MDZ7 cDNA in ATCC Deposit No. ______, or having the amino acid sequence of SEQ ID NO: 4409.

[0604] Such antibodies are useful in in vitro immunoassays, such as ELISA, western blot or immunohistochemical assay, for detection of MDZ7 and related proteins. Such antibodies are also useful in isolating and purifying MDZ7 proteins, including related cross-reactive proteins, by immunoprecipitation, immunoaffinity chromatography, or magnetic bead-mediated purification.

[0605] In other embodiments, the invention further provides the above-described antibodies detectably labeled, and in yet other embodiments, provides the above-described antibodies attached to a substrate.

[0606] MDZ12a, MDZ12bS and MDZ12bL Antibodies

[0607] The invention further provides antibodies, polyclonal or monoclonal, and fragments and derivatives thereof, that bind specifically to, or the binding of which can be competitively inhibited by, a polypeptide having an amino acid sequence encoded by the MDZ12a cDNA in ATCC Deposit No. ______, or having the amino acid sequence of SEQ ID NO: 5772. Depending upon the epitope recognized, certain of such antibodies can cross-react with MDZ12bS, others with MDZ12bL, and yet others with neither MDZ12bS nor MDZ12bL. These subsets can be discriminated by screening the antibodies for ability to bind to the MDZ12bS and MDZ12bL polypeptides of the present invention, the sequences of which are set forth respectively in SEQ ID NOS: 6939 and 6940.

[0608] In another series of embodiments, the invention provides antibodies, both polyclonal and monoclonal, and fragments and derivatives thereof, that bind specifically to, or the binding of which can be competitively inhibited by, a polypeptide having the amino acid sequence of SEQ ID NO: 5774.

[0609] In another series of embodiments, the invention provides antibodies, including fragments and derivatives thereof, both monoclonal and polyclonal, that bind specifically to, or the binding of which can be competitively inhibited by, a MDZ12bS polypeptide encoded by ATCC Deposit No. ______, or having the amino acid sequence of SEQ ID NO: 6939. Such antibodies will cross-react with MDZ12a but not with MDZ12bL.

[0610] In yet another series of embodiments, the invention provides antibodies, including fragments and derivatives thereof, monoclonal and polyclonal, that bind specifically to, or the binding of which can be competitively inhibited by, a MDZ12bL polypeptide encoded by ATCC Deposit No. ______, or having the amino acid sequence of SEQ ID NO: 6940. Such antibodies will cross-react with MDZ12a but not with MDZ12bS.

[0611] Such antibodies are useful in in vitro immunoassays, such as ELISA, western blot or immunohistochemical assay, for detection of MDZ7 and related proteins. Such antibodies are also useful in isolating and purifying MDZ12 proteins, including related cross-reactive proteins, by immunoprecipitation, immunoaffinity chromatography, or magnetic bead-mediated purification.

[0612] In other embodiments, the invention further provides the above-described antibodies detectably labeled, and in yet other embodiments, provides the above-described antibodies attached to a substrate.

[0613] Pharmaceutical Compositions

[0614] MDZ3, MDZ4, MDZ7 and MDZ12 are important

The Science and Practice of Pharmacy, 20th ed., Lippincott, Williams & Wilkins (2000) (ISBN: 0683306472); Ansel et al., Pharmaceutical Dosage Forms and Drug Delivery Systems, 7th ed., Lippincott Williams & Wilkins Publishers (1999) (ISBN: 0683305727); and Kibbe (ed.), Handbook of Pharmaceutical Excipients, American Pharmaceutical Association, 3rd ed. (2000) (ISBN: 091733096X), the disclosures of which are incorporated herein by reference in their entireties, and thus need not be described in detail herein.

[0619] Briefly, however, formulation of the pharmaceutical compositions of the present invention will depend upon the route chosen for administration. The pharmaceutical compositions utilized in this invention can be administered by various routes including both enteral and parenteral routes, including oral, intravenous, intramuscular, subcutaneous, inhalation, topical, sublingual, rectal, intra-arterial, intramedullary, intrathecal, intraventricular, transmucosal, transdermal, intranasal, intraperitoneal, intrapulmonary, and intrauterine.

[0620] Oral dosage forms can be formulated as tablets, pills, dragees, capsules, liquids, gels, syrups, slurries, suspensions, and the like, for ingestion by the patient.

[0621] Solid formulations of the compositions for oral administration can contain suitable carriers or excipients, such as carbohydrate or protein fillers, such as sugars, including lactose, sucrose, mannitol, or sorbitol; starch from corn, wheat, rice, potato, or other plants; cellulose, such as methyl cellulose, hydroxypropylmethyl-cellulose, sodium carboxymethylcellulose, or microcrystalline cellulose; gums including arabic and tragacanth; proteins such as gelatin and collagen; inorganics, such as kaolin, calcium carbonate, dicalcium phosphate, sodium chloride; and other agents such as acacia and alginic acid.
for transcriptional regulation and protein-protein interac 0622 Agents that facilitate disintegration and/or solubi tions with other transcription modulators; defects in MDZ3, lization can be added, Such as the cross-linked polyvinyl MDZ4, MDZ7 or MDZ12 expression, activity, distribution, pyrrollidone, agar, alginic acid, or a Salt thereof, Such as localization, and/or Solubility are a cause of human disease, Sodium alginate, microcrystalline cellulose, corn Starch, which disease can manifest as a disorder of brain, testis, Sodium Starch glycolate, and alginic acid. heart or bone marrow function for MDZ3; bone marrow, brain, heart, hela, adult liver, fetal liver, lung, placenta or 0623 Tablet binders that can be used include acacia, prostate function for MDZ4; testes function for MDZ7; and methylcellulose, Sodium carboxymethylcellulose, polyvi brain, heart, kidney, placenta, Skeletal muscle, testis, bone nylpyrrolidone (Povidone"M), hydroxypropyl methylcellu marrow or liver function for MDZ12. lose, Sucrose, Starch and ethylcellulose. 0615. Accordingly, pharmaceutical compositions com 0624 Lubricants that can be used include magnesium prising nucleic acids, proteins, and antibodies of the present Stearates, Stearic acid, Silicone fluid, talc, waxes, oils, and invention, as well as mimetics, agonists, antagonists, or colloidal Silica. inhibitors of MDZ3, MDZ4, MDZ7 or MDZ12 activity, can be administered as therapeutics for treatment of MDZ3, 0625 Fillers, agents that facilitate disintegration and/or MDZ4, MDZ7 or MDZ12 defects, respectively. Solubilization, tablet binders and lubricants, including the aforementioned, can be used Singly or in combination. 0616) Thus, in another aspect, the invention provides pharmaceutical compositions comprising the nucleic acids, 0626 Solid oral dosage forms need not be uniform nucleic acid fragments, proteins, protein fusions, protein throughout. 
fragments, antibodies, antibody derivatives, antibody frag 0627 For example, dragee cores can be used in conjunc ments, mimetics, agonists, antagonists, and inhibitors of the tion with Suitable coatings, Such as concentrated Sugar present invention. Solutions, which can also contain gum arabic, talc, polyvi 0617 Such a composition typically contains from about nylpyrrollidone, carbopol gel, polyethylene glycol, and/or 0.1 to 90% by weight of a therapeutic agent of the invention titanium dioxide, lacquer Solutions, and Suitable organic formulated in and/or with a pharmaceutically acceptable Solvents or Solvent mixtures. carrier or excipient. 0628 Oral dosage forms of the present invention include 0618) Pharmaceutical formulation is a well-established push-fit capsules made of gelatin, as well as Soft, Sealed art, and is further described in Gennaro (ed.), Remington. capsules made of gelatin and a coating, Such as glycerol or US 2004/0078837 A1 Apr. 22, 2004 44

sorbitol. Push-fit capsules can contain active ingredients mixed with a filler or binders, such as lactose or starches, lubricants, such as talc or magnesium stearate, and, optionally, stabilizers. In soft capsules, the active compounds can be dissolved or suspended in suitable liquids, such as fatty oils, liquid, or liquid polyethylene glycol with or without stabilizers.

[0629] Additionally, dyestuffs or pigments can be added to the tablets or dragee coatings for product identification or to characterize the quantity of active compound, i.e., dosage.

[0630] Liquid formulations of the pharmaceutical compositions for oral (enteral) administration are prepared in water or other aqueous vehicles and can contain various suspending agents such as methylcellulose, alginates, tragacanth, pectin, kelgin, carrageenan, acacia, polyvinylpyrrolidone, and polyvinyl alcohol. The liquid formulations can also include solutions, emulsions, syrups and elixirs containing, together with the active compound(s), wetting agents, sweeteners, and coloring and flavoring agents.

[0631] The pharmaceutical compositions of the present invention can also be formulated for parenteral administration.

[0632] For intravenous injection, water soluble versions of the compounds of the present invention are formulated in, or if provided as a lyophilate, mixed with, a physiologically acceptable fluid vehicle, such as 5% dextrose (“D5”), physiologically buffered saline, 0.9% saline, Hanks' solution, or Ringer's solution.

[0633] Intramuscular preparations, e.g. a sterile formulation of a suitable soluble salt form of the compounds of the present invention, can be dissolved and administered in a pharmaceutical excipient such as Water-for-Injection, 0.9% saline, or 5% glucose solution. Alternatively, a suitable insoluble form of the compound can be prepared and administered as a suspension in an aqueous base or a pharmaceutically acceptable oil base, such as an ester of a long chain fatty acid (e.g., ethyl oleate), fatty oils such as sesame oil, triglycerides, or liposomes.

[0634] Parenteral formulations of the compositions can contain various carriers such as vegetable oils, dimethylacetamide, dimethylformamide, ethyl lactate, ethyl carbonate, isopropyl myristate, ethanol, polyols (glycerol, propylene glycol, liquid polyethylene glycol, and the like).

[0635] Aqueous injection suspensions can also contain substances that increase the viscosity of the suspension, such as sodium carboxymethyl cellulose, sorbitol, or dextran. Non-lipid polycationic amino polymers can also be used for delivery. Optionally, the suspension can also contain suitable stabilizers or agents that increase the solubility of the compounds to allow for the preparation of highly concentrated solutions.

[0636] Pharmaceutical compositions of the present invention can also be formulated to permit injectable, long-term, deposition.

[0637] The pharmaceutical compositions of the present invention can be administered topically.

[0638] A topical semi-solid ointment formulation typically contains a concentration of the active ingredient from about 1 to 20%, e.g., 5 to 10%, in a carrier such as a pharmaceutical cream base. Various formulations for topical use include drops, tinctures, lotions, creams, solutions, and ointments containing the active ingredient and various supports and vehicles. In other transdermal formulations, typically in patch-delivered formulations, the pharmaceutically active compound is formulated with one or more skin penetrants, such as 2-N-methyl-pyrrolidone (NMP) or Azone.

[0639] Inhalation formulations can also readily be formulated. For inhalation, various powder and liquid formulations can be prepared.

[0640] The pharmaceutically active compound in the pharmaceutical compositions of the present invention can be provided as the salt of a variety of acids, including but not limited to hydrochloric, sulfuric, acetic, lactic, tartaric, malic, and succinic acid. Salts tend to be more soluble in aqueous or other protonic solvents than are the corresponding free base forms.

[0641] After pharmaceutical compositions have been prepared, they are packaged in an appropriate container and labeled for treatment of an indicated condition.

[0642] The active compound will be present in an amount effective to achieve the intended purpose. The determination of an effective dose is well within the capability of those skilled in the art.

[0643] A “therapeutically effective dose” refers to that amount of active ingredient (for example MDZ3, MDZ4, MDZ7 or MDZ12 protein, fusion protein, or fragments thereof, antibodies specific for MDZ3, MDZ4, MDZ7 or MDZ12, agonists, antagonists or inhibitors of MDZ3, MDZ4, MDZ7 or MDZ12) which ameliorates the signs or symptoms of the disease or prevents progression thereof, as would be understood in the medical arts; cure, although desired, is not required.

[0644] The therapeutically effective dose of the pharmaceutical agents of the present invention can be estimated initially by in vitro tests, such as cell culture assays, followed by assay in model animals, usually mice, rats, rabbits, dogs, or pigs. The animal model can also be used to determine an initial useful concentration range and route of administration.

[0645] For example, the ED50 (the dose therapeutically effective in 50% of the population) and LD50 (the dose lethal to 50% of the population) can be determined in one or more cell culture or animal model systems. The dose ratio of toxic to therapeutic effects is the therapeutic index, which can be expressed as LD50/ED50. Pharmaceutical compositions that exhibit large therapeutic indices are particularly useful.

[0646] The data obtained from cell culture assays and animal studies is used in formulating an initial dosage range for human use, and preferably provides a range of circulating concentrations that includes the ED50 with little or no toxicity. After administration, or between successive administrations, the circulating concentration of active agent varies within this range depending upon pharmacokinetic factors well known in the art, such as the dosage form employed, sensitivity of the patient, and the route of administration.
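By way of illustration only, the therapeutic index defined above as LD50/ED50 can be computed as in the following minimal sketch. The function name and the dose values are hypothetical and are not data from this application; the sketch merely assumes that both doses are positive and expressed in the same units.

```python
def therapeutic_index(ld50: float, ed50: float) -> float:
    """Return the therapeutic index, expressed as LD50 / ED50.

    Both doses must be positive and given in the same units (e.g., mg/kg).
    """
    if ld50 <= 0 or ed50 <= 0:
        raise ValueError("LD50 and ED50 must be positive")
    return ld50 / ed50


# Hypothetical values, for illustration only (not measured data).
example_ld50 = 200.0  # mg/kg, dose lethal to 50% of the population
example_ed50 = 5.0    # mg/kg, dose effective in 50% of the population
example_ti = therapeutic_index(example_ld50, example_ed50)
print(f"therapeutic index = {example_ti:.1f}")
```

A larger ratio indicates a wider margin between the effective dose and the lethal dose, which is why compositions exhibiting large therapeutic indices are described as particularly useful.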

[0647] The exact dosage will be determined by the practitioner, in light of factors specific to the subject requiring treatment. Factors that can be taken into account by the practitioner include the severity of the disease state, general health of the subject, age, weight, gender of the subject, diet, time and frequency of administration, drug combination(s), reaction sensitivities, and tolerance/response to therapy. Long-acting pharmaceutical compositions can be administered every 3 to 4 days, every week, or once every two weeks depending on half-life and clearance rate of the particular formulation.

[0648] Normal dosage amounts may vary from 0.1 to 100,000 micrograms, up to a total dose of about 1 g, depending upon the route of administration. Where the therapeutic agent is a protein or antibody of the present invention, the therapeutic protein or antibody agent typically is administered at a daily dosage of 0.01 mg to 30 mg/kg of body weight of the patient (e.g., 1 mg/kg to 5 mg/kg). The pharmaceutical formulation can be administered in multiple doses per day, if desired, to achieve the total desired daily dose.

[0649] Guidance as to particular dosages and methods of delivery is provided in the literature and generally available to practitioners in the art. Those skilled in the art will employ different formulations for nucleotides than for proteins or their inhibitors. Similarly, delivery of polynucleotides or polypeptides will be specific to particular cells, conditions, locations, etc.

[0650] Conventional methods, known to those of ordinary skill in the art of medicine, can be used to administer the pharmaceutical formulation(s) of the present invention to the patient. The pharmaceutical compositions of the present invention can be administered alone, or in combination with other therapeutic agents or interventions.

[0651] Therapeutic Methods

[0652] The present invention further provides methods of treating subjects having defects in MDZ3, MDZ4, MDZ7 or MDZ12 (e.g., in expression, activity, distribution, localization, and/or solubility of MDZ3, MDZ4, MDZ7 or MDZ12), which can manifest as a disorder of brain, testis, heart or bone marrow function for MDZ3; bone marrow, brain, heart, HeLa, adult liver, fetal liver, lung, placenta and prostate function for MDZ4; testes function for MDZ7; and brain, heart, kidney, placenta, skeletal muscle, testis, bone marrow or liver function for MDZ12. As used herein, “treating” includes all medically-acceptable types of therapeutic intervention, including palliation and prophylaxis (prevention) of disease.

[0653] In one embodiment of the therapeutic methods of the present invention, a therapeutically effective amount of a pharmaceutical composition comprising MDZ3, MDZ4, MDZ7 or MDZ12 protein, fusion, fragment or derivative thereof is administered to a subject with a clinically-significant MDZ3, MDZ4, MDZ7 or MDZ12 defect.

[0654] Protein compositions are administered, for example, to complement a deficiency in native MDZ3, MDZ4, MDZ7 or MDZ12, respectively. In other embodiments, protein compositions are administered as a vaccine to elicit a humoral and/or cellular immune response to MDZ3, MDZ4, MDZ7 or MDZ12, respectively. The immune response can be used to modulate activity of MDZ3, MDZ4, MDZ7 or MDZ12, respectively, or, depending on the immunogen, to immunize against aberrant or aberrantly expressed forms, such as mutant or inappropriately expressed isoforms. In yet other embodiments, protein fusions having a toxic moiety are administered to ablate cells that aberrantly accumulate MDZ3, MDZ4, MDZ7 or MDZ12, respectively.

[0655] In another embodiment of the therapeutic methods of the present invention, a therapeutically effective amount of a pharmaceutical composition comprising nucleic acid of the present invention is administered. The nucleic acid can be delivered in a vector that drives expression of MDZ3, MDZ4, MDZ7 or MDZ12 protein, fusion, or fragment thereof, or without such vector.

[0656] Nucleic acid compositions that can drive expression of MDZ3, MDZ4, MDZ7 or MDZ12 are administered, for example, to complement a deficiency in native MDZ3, MDZ4, MDZ7 or MDZ12, or as DNA vaccines. Expression vectors derived from virus, replication deficient retroviruses, adenovirus, adeno-associated (AAV) virus, herpes virus, or vaccinia virus can be used (see, e.g., Cid-Arregui (ed.), Viral Vectors: Basic Science and Gene Therapy, Eaton Publishing Co., 2000 (ISBN: 188129935X)), as can plasmids. Antisense nucleic acid compositions, or vectors that drive expression of MDZ3, MDZ4, MDZ7 or MDZ12 antisense nucleic acids, are administered to downregulate transcription and/or translation of MDZ3, MDZ4, MDZ7 or MDZ12 in circumstances in which excessive production, or production of aberrant protein, is the pathophysiologic basis of disease.

[0657] Antisense compositions useful in therapy can have sequence that is complementary to coding or to noncoding regions of the MDZ3, MDZ4, MDZ7 or MDZ12 genes, respectively. For example, oligonucleotides derived from the transcription initiation site, e.g., between positions -10 and +10 from the start site, are particularly useful.

[0658] Catalytic antisense compositions, such as ribozymes, that are capable of sequence-specific hybridization to MDZ3, MDZ4, MDZ7 or MDZ12 transcripts, are also useful in therapy. See, e.g., Phylactou, Adv. Drug Deliv. Rev. 44(2-3):97-108 (2000); Phylactou et al., Hum. Mol. Genet. 7(10):1649-53 (1998); Rossi, Ciba Found. Symp. 209:195-204 (1997); and Sigurdsson et al., Trends Biotechnol. 13(8):286-9 (1995), the disclosures of which are incorporated herein by reference in their entireties.

[0659] Other nucleic acids useful in the therapeutic methods of the present invention are those that are capable of triplex helix formation in or near the MDZ3, MDZ4, MDZ7 or MDZ12 genomic locus, respectively. Such triplexing oligonucleotides are able to inhibit transcription, Intody et al., Nucleic Acids Res. 28(21):4283-90 (2000); McGuffie et al., Cancer Res. 60(14):3790-9 (2000), the disclosures of which are incorporated herein by reference, and pharmaceutical compositions comprising such triplex forming oligos (TFOs) are administered in circumstances in which excessive production, or production of aberrant protein, is a pathophysiologic basis of disease.

[0660] In another embodiment of the therapeutic methods of the present invention, a therapeutically effective amount of a pharmaceutical composition comprising an antibody (including fragment or derivative thereof) of the present invention is administered. As is well known, antibody compositions are administered, for example, to antagonize activity of MDZ3, MDZ4, MDZ7 or MDZ12, respectively, or to target therapeutic agents to sites of MDZ3, MDZ4, MDZ7 or MDZ12 presence and/or accumulation.

[0661] In another embodiment of the therapeutic methods of the present invention, a pharmaceutical composition comprising a non-antibody antagonist of MDZ3, MDZ4, MDZ7 or MDZ12 is administered. Antagonists of MDZ3, MDZ4, MDZ7 or MDZ12 can be produced using methods generally known in the art. In particular, purified MDZ3, MDZ4, MDZ7 or MDZ12 (MDZ12a, MDZ12bS, or MDZ12bL) can be used to screen libraries of pharmaceutical agents, often combinatorial libraries of small molecules, to identify those that specifically bind and antagonize at least one activity of MDZ3, MDZ4, MDZ7 or MDZ12, respectively.

[0662] In other embodiments a pharmaceutical composition comprising an agonist of MDZ3, MDZ4, MDZ7 or MDZ12 is administered. Agonists can be identified using methods analogous to those used to identify antagonists.

[0663] In still other therapeutic methods of the present invention, pharmaceutical compositions comprising host cells that express MDZ3, MDZ4, MDZ7 or MDZ12, fusions, or fragments thereof can be administered. In such cases, the cells are typically autologous, so as to circumvent xenogeneic or allotypic rejection, and are administered to complement defects in MDZ3, MDZ4, MDZ7 or MDZ12 production or activity.

[0664] In other embodiments, pharmaceutical compositions comprising the MDZ3, MDZ4, MDZ7 or MDZ12 proteins, nucleic acids, antibodies, antagonists, and agonists of the present invention can be administered in combination with other appropriate therapeutic agents. Selection of the appropriate agents for use in combination therapy can be made by one of ordinary skill in the art according to conventional pharmaceutical principles. The combination of therapeutic agents or approaches can act additively or synergistically to effect the treatment or prevention of the various disorders described above, providing greater therapeutic efficacy and/or permitting use of the pharmaceutical compositions of the present invention using lower dosages, reducing the potential for adverse side effects.

[0665] Transgenic Animals and Cells

[0666] In another aspect, the invention provides transgenic cells and non-human organisms comprising MDZ3, MDZ4, MDZ7 or MDZ12 isoform nucleic acids, and transgenic cells and non-human organisms with targeted disruption of the endogenous orthologue of the human MDZ3, MDZ4, MDZ7 or MDZ12 gene, respectively.

[0667] The cells can be embryonic stem cells or somatic cells. The transgenic non-human organisms can be chimeric, nonchimeric heterozygotes, and nonchimeric homozygotes.

[0668] Diagnostic Methods

[0669] The nucleic acids of the present invention can be used as nucleic acid probes to assess the levels of MDZ3, MDZ4, MDZ7, MDZ12a or MDZ12b mRNA in disease tissues or cells, and antibodies of the present invention can be used to assess the expression levels of MDZ3, MDZ4, MDZ7, MDZ12a or MDZ12b proteins in disease tissues or cells to diagnose a variety of diseases, including developmental disorders and cancer.

[0670] The following examples are offered for purpose of illustration, not limitation.

EXAMPLE 1

Identification and Characterization of cDNAs Encoding MDZ3 Proteins

[0671] Bioinformatic algorithms were applied to human genomic sequence data to identify putative exons. Using a graphical display particularly designed to facilitate computerized query of the resulting exon, eight exons were identified as belonging to the same gene.

[0672] Marathon-Ready™ HeLa cell cDNA (Clontech Laboratories, Palo Alto, Calif.) was used as a substrate for standard RACE (rapid amplification of cDNA ends). Marathon-Ready™ cDNAs are adaptor-ligated double stranded cDNAs suitable for 3' and 5' RACE. Chenchik et al., BioTechniques 21:526-532 (1996); Chenchik et al., CLONTECHniques X(1):5-8 (January 1995). RACE techniques are described, inter alia, in the Marathon-Ready™ cDNA User Manual (Clontech Labs., Palo Alto, Calif., USA, Mar. 30, 2000, Part No. PT1156-1 (PRO3517)), Ausubel et al. (eds.), Short Protocols in Molecular Biology: A Compendium of Methods from Current Protocols in Molecular Biology, 4th edition (April 1999), John Wiley & Sons (ISBN: 047132938X) and Sambrook et al. (eds.), Molecular Cloning: A Laboratory Manual (3rd ed.), Cold Spring Harbor Laboratory Press (2000) (ISBN: 0879695773), the disclosures of which are incorporated herein by reference in their entireties.

[0673] Two overlapping RACE products were cloned that together span 2.0 kilobases and that together appear to contain the entire coding region of the gene to which the exons contribute; for reasons described below, we termed this cDNA MDZ3.

[0674] The MDZ3 cDNA was sequenced on both strands using a MegaBace™ sequencer (Molecular Dynamics, Inc., Sunnyvale, Calif., USA). Sequencing both strands provided us with the exact chemical structure of the cDNA, which is shown in FIG. 3 and further presented in the SEQUENCE LISTING as SEQ ID NO: 1, and placed us in actual physical possession of the entire set of single-base incremented fragments of the sequenced clone, starting at the 5' and 3' termini.

[0675] MDZ3 cDNA was sent to the American Type Culture Collection (ATCC) for deposit on Aug. 1, 2001, received by ATCC on Aug. 2, 2001, and accorded accession number ______.

[0676] As shown in FIG. 3, the MDZ3 cDNA spans 1981 nucleotides and contains an open reading frame from nucleotide 311 through and including nt 1945 (inclusive of termination codon), predicting a protein of 544 amino acids with a (posttranslationally unmodified) molecular weight of 61.4 kD. The open reading frame appears full length with in-frame 5' stop codons, a methionine start codon and a stop codon.

[0677] BLAST query of genomic sequence identified one BAC, spanning 14 kb, that constitutes the minimum set of clones encompassing the cDNA sequence. Based upon the known origin of the BAC (GenBank accession number AC005020.5), the MDZ3 gene can be mapped to human chromosome 7q22.1.
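The open reading frame coordinates reported above for the MDZ3 cDNA can be checked with simple arithmetic, as in the following sketch. The function name and the returned field names are hypothetical conveniences; the only inputs taken from the text are the cDNA length (1981 nt), the ORF boundaries (nt 311 through nt 1945, inclusive of the termination codon) and the predicted molecular weight (61.4 kD).

```python
def orf_summary(cdna_len: int, orf_start: int, orf_end: int, mw_kd: float) -> dict:
    """Summarize an ORF given 1-based inclusive cDNA coordinates.

    orf_end is the last nucleotide of the termination codon.
    """
    orf_len = orf_end - orf_start + 1
    if orf_len % 3 != 0:
        raise ValueError("ORF length is not a multiple of three")
    codons = orf_len // 3
    residues = codons - 1  # the termination codon encodes no amino acid
    return {
        "five_prime_utr_nt": orf_start - 1,
        "orf_nt": orf_len,
        "three_prime_utr_nt": cdna_len - orf_end,
        "codons": codons,
        "residues": residues,
        # average mass per residue implied by the predicted molecular weight
        "avg_residue_da": mw_kd * 1000 / residues,
    }


mdz3 = orf_summary(cdna_len=1981, orf_start=311, orf_end=1945, mw_kd=61.4)
print(mdz3)
```

The ORF spans 1635 nucleotides, i.e., 545 codons including the termination codon, which agrees with the stated 544 amino acids. The implied average of roughly 113 Da per residue is in line with the commonly used approximation of about 110 Da per amino acid residue.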

[0678] Comparison of the cDNA and genomic sequences identified 8 exons. Exon organization is listed in Table 1.

TABLE 1
MDZ3 Exon Structure

Exon no.   cDNA range   genomic range   BAC accession
1          1-52         78844-78793     AC005020.5
2          53-179       77284-77158
3          180-264      76761-76677
4          265-697      76248-75816
5          698-899      74436-74235
6          900-991      73260-73169
7          992-1115     71752-71629
8          1116-1981    66618-65753

[0679] FIG. 2 schematizes the exon organization of the MDZ3 clone.

[0680] At the top is shown the bacterial artificial chromosome (BAC), with GenBank accession number, that spans the MDZ3 locus.

[0681] As shown in FIG. 2, MDZ3, encoding a protein of 544 amino acids, is comprised of exons 1-8. Predicted molecular weight of the protein, prior to any post-translational modification, is 61.4 kD.

[0682] As further discussed in the examples herein, expression of MDZ3 was assessed using RT-PCR. RT-PCR product for MDZ3 was clearly produced from brain, testis, heart and bone marrow, but not from lung, liver, or skeletal muscle.

[0683] The sequence of the MDZ3 cDNA was used as a BLAST query into the GenBank nr and dbEST databases. The nr database includes all non-redundant GenBank coding sequence translations, sequences derived from the 3-dimensional structures in the Brookhaven Protein Data Bank (PDB), sequences from SwissProt, sequences from the protein information resource (PIR), and sequences from protein research foundation (PRF). The dbEST (database of expressed sequence tags) includes ESTs, short, single pass read cDNA (mRNA) sequences, and cDNA sequences from differential display experiments and RACE experiments. BLAST search identified two human ESTs (BE540085, AW892105), multiple mouse and rat ESTs, one EST from bovine (BE750886) and one from pig (BF079982) as having sequence closely related to MDZ3.

[0684] Globally, human MDZ3 resembles a 604 amino acid residue hypothetical zinc finger protein, KIAA0426, at 36% amino acid identity and 47% amino acid similarity over 560 amino acids.

[0685] Motif searches using Pfam (http://pfam.wustl.edu), SMART (http://smart.embl-heidelberg.de), and PROSITE pattern and profile databases (http://www.expasy.ch/prosite), identified several known domains shared with SCAN box containing Kruppel family zinc-finger proteins.

[0686] FIG. 1 shows the domain structure of MDZ3 and the alignment of SCAN box and KRAB domain in MDZ3 with similar motifs.

[0687] As schematized in FIG. 1, the newly isolated gene product shares certain protein domains and an overall structural organization with SCAN box-containing Kruppel family zinc-finger proteins. The shared structural features strongly imply that MDZ3 plays a role similar to that of other SCAN box-containing Kruppel family zinc-finger proteins as regulators of gene expression, and participates in protein-protein interactions with other transcription modulators. Thus MDZ3 is a clinically useful diagnostic marker and potential therapeutic agent for a range of diseases, including developmental disorders and cancer.

[0688] Possession of the genomic sequence permitted search for promoter and other control sequences for the MDZ3 gene.

[0689] A putative transcriptional control region, inclusive of promoter and downstream elements, was defined as 1 kb around the transcription start site, itself defined as the first nucleotide of the MDZ3 cDNA clone. The region, drawn from sequence of BAC AC005020.5, has the sequence given in SEQ ID NO: 24, which lists 1000 nucleotides before the transcription start site.

[0690] Transcription factor binding sites were identified using a web based program (http://motif.genome.ad.jp/), including two binding sites for sex-determining region Y gene product (SRY, 474-480 bp and 482-488 bp), for myoblast determining factor (961-970 bp), and for homeo domain factor Nkx-2.5/Csx, tinman homolog (589-597 bp, with numbering according to SEQ ID NO: 24), amongst others.

[0691] We have thus identified a newly described human gene, MDZ3, which shares certain protein domains and an overall structural organization with SCAN box containing Kruppel family zinc-finger proteins. The shared structural features strongly imply that the MDZ3 protein plays a role similar to the other SCAN box containing Kruppel family zinc-finger proteins. It likely functions as a regulator of gene expression and participates in protein-protein interactions with other transcription modulators; thus, the MDZ3 proteins and nucleic acids are clinically useful diagnostic markers and potential therapeutic agents for a range of diseases, including developmental disorders and cancer.

EXAMPLE 2

Identification and Characterization of cDNAs Encoding MDZ4 Proteins

[0692] Predicating our gene discovery efforts on use of genome-derived single exon probes and hybridization to genome-derived single exon microarrays (an approach that we have previously demonstrated will readily identify novel genes that have proven refractory to mRNA-based identification efforts), we identified an exon in raw human genomic sequence that is particularly expressed in human bone marrow, brain, heart, HeLa, adult liver, fetal liver, lung, placenta and prostate.

[0693] Briefly, bioinformatic algorithms were applied to human genomic sequence data to identify putative exons. Each of the predicted exons was amplified from genomic DNA, typically centering the putative coding sequence within a larger amplicon that included flanking noncoding sequence. These genome-derived single exon probes were arrayed on a support and expression of the bioinformatically predicted exons assessed through a series of simultaneous two-color hybridizations to the genome-derived single exon microarrays.
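The step of centering a predicted exon within a larger amplicon that includes flanking noncoding sequence can be sketched as follows. This is an illustrative assumption about how such a window might be chosen, not the procedure actually used; the function name, the 500 nt amplicon length, the contig length and the exon coordinates are all hypothetical.

```python
from typing import Tuple


def centered_amplicon(exon_start: int, exon_end: int,
                      amplicon_len: int, contig_len: int) -> Tuple[int, int]:
    """Return 1-based inclusive (start, end) of an amplicon of fixed length
    that centers the exon, shifted if needed to stay within the contig."""
    exon_len = exon_end - exon_start + 1
    if exon_len <= 0:
        raise ValueError("exon_end must not precede exon_start")
    if not exon_len <= amplicon_len <= contig_len:
        raise ValueError("amplicon must contain the exon and fit in the contig")
    left_flank = (amplicon_len - exon_len) // 2
    start = exon_start - left_flank
    # shift the window so that it does not run off either end of the contig
    start = max(1, min(start, contig_len - amplicon_len + 1))
    return start, start + amplicon_len - 1


# Hypothetical 150 nt exon centered in a hypothetical 500 nt amplicon.
window = centered_amplicon(exon_start=1000, exon_end=1149,
                           amplicon_len=500, contig_len=100_000)
print(window)
```

In this hypothetical case the exon is flanked by 175 nt of noncoding sequence on each side; near the ends of a contig the window is shifted rather than shortened, so the exon is still contained but no longer exactly centered.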

[0694] The approach and procedures are further described in detail in Penn et al., "Mining the Human Genome using Microarrays of Open Reading Frames," Nature Genetics 26:315-318 (2000); commonly owned and copending U.S. patent application Ser. No. 09/864,761, filed May 23, 2001, Ser. No. 09/774,203, filed Jan. 29, 2001, and Ser. No. 09/632,366, filed Aug. 3, 2000, the disclosures of which are incorporated herein by reference in their entireties.

[0695] Using a graphical display particularly designed to facilitate computerized query of the resulting exon-specific expression data, as further described in commonly owned and copending U.S. patent application Ser. No. 09/864,761, filed May 23, 2001, and Ser. No. 09/774,203, filed Jan. 29, 2001, the disclosures of which are incorporated herein by reference in their entireties, two exons were identified that are expressed in all the human tissues tested; subsequent analysis revealed that the two exons belong to the same gene.

[0696] Table 2 summarizes the microarray expression data obtained using genome-derived single exon probes corresponding to exons 2 and 3. Each probe was completely sequenced on both strands prior to its use on a genome-derived single exon microarray; sequencing confirmed the exact chemical structure of each probe. An added benefit of sequencing is that it placed us in possession of a set of single base-incremented fragments of the sequenced nucleic acid, starting from the sequencing primer's 3' OH. (Since the single exon probes were first obtained by PCR amplification from genomic DNA, we were of course additionally in possession of an even larger set of single base-incremented fragments of each of the single exon probes, each fragment corresponding to an extension product from one of the two amplification primers.)

[0697] Signals and expression ratios are normalized values measured and calculated as further described in commonly owned and copending U.S. patent application Ser. No. 09/864,761, filed May 23, 2001, Ser. No. 09/774,203, filed Jan. 29, 2001, and Ser. No. 09/632,366, filed Aug. 3, 2000, the disclosures of which are incorporated herein by reference in their entireties.

TABLE 2
Expression Analysis
Genome-Derived Single Exon Microarray

                Ampl 9581 (exon 2)           Ampl 31808 (exon 3)
                Signal   Expression ratio    Signal   Expression ratio
ADULT LIVER     info     nic                 1.4      -1.04
BONE MARROW     1.32     -1.69               info     info
BRAIN           1.43     -1.63               1.75     info
FETAL LIVER     2.17     -1.11               1.07     -1.16
HEART           1.78     -1.22               122      -1.07
HELA            2.04     -1.12               1.75     1.01
LUNG            0.78     nic                 info     info
PLACENTA        1.23     -2.54               1.61     1.
PROSTATE        155      nic                 info     info

[0698] As shown in Table 2, significant expression of exons 2 and 3 was seen only in human bone marrow, brain, heart, HeLa, adult liver, fetal liver, lung, placenta and prostate.

[0699] Marathon-Ready™ placenta cDNA (Clontech Laboratories, Palo Alto, Calif., USA) was used as a substrate for standard RACE (rapid amplification of cDNA ends) to obtain a cDNA clone that spans 1.3 kilobases and appears to contain the entire coding region of the gene to which the exon contributes; for reasons described below, we termed this cDNA MDZ4. Marathon-Ready™ cDNAs are adaptor-ligated double stranded cDNAs suitable for 3' and 5' RACE. Chenchik et al., BioTechniques 21:526-532 (1996); Chenchik et al., CLONTECHniques X(1):5-8 (January 1995). RACE techniques are described, inter alia, in the Marathon-Ready™ cDNA User Manual (Clontech Labs., Palo Alto, Calif., USA, Mar. 30, 2000, Part No. PT1156-1 (PR03517)), Ausubel et al. (eds.), Short Protocols in Molecular Biology: A Compendium of Methods from Current Protocols in Molecular Biology, 4th edition (April 1999), John Wiley & Sons (ISBN: 047132938X) and Sambrook et al. (eds.), Molecular Cloning: A Laboratory Manual (3rd ed.), Cold Spring Harbor Laboratory Press (2000) (ISBN: 0879695773), the disclosures of which are incorporated herein by reference in their entireties.

[0700] The MDZ4 cDNA was sequenced on both strands using a MegaBace™ sequencer (Molecular Dynamics, Inc., Sunnyvale, Calif., USA). Sequencing both strands provided us with the exact chemical structure of the cDNA, which is shown in FIG. 6 and further presented in the SEQUENCE LISTING as SEQ ID NO: 3027, and placed us in actual physical possession of the entire set of single-base incremented fragments of the sequenced clone, starting at the 5' and 3' termini.

[0701] MDZ4 cDNA was sent for deposit to the American Type Culture Collection on Aug. 1, 2001, received at ATCC on Aug. 2, 2001, and accorded accession number ______.

[0702] As shown in FIG. 6, the MDZ4 cDNA spans 1329 nucleotides and contains an open reading frame from nucleotide 142 through and including nt 1311 (inclusive of termination codon), predicting a protein of 389 amino acids with a (posttranslationally unmodified) molecular weight of 44.9 kD. The clone appears full length, with the reading frame opening with a methionine and terminating with a stop codon.

[0703] BLAST query of genomic sequence identified one PAC, spanning 128 kb, that constitutes the minimum set of clones encompassing the cDNA sequence. Based upon the known origin of the PAC (GenBank accession number Z98745.1), the MDZ4 gene can be mapped to human chromosome 6p21.3-22.2.

[0704] Comparison of the cDNA and genomic sequences identified 4 exons. Exon organization is listed in Table 3.

TABLE 3
MDZ4 Exon Structure

Exon no.   cDNA range   genomic range   PAC accession
1          1-64         62122-62059     Z98745.1
2          65-549       55003-54519

TABLE 3-continued
MDZ4 Exon Structure

Exon no.   cDNA range   genomic range   PAC accession
3          550-697      54267-54120
4          698-1328     53738-53107

[0705] FIG. 5 schematizes the exon organization of the MDZ4 clone.

[0706] At the top is shown the P1 artificial chromosome (PAC), with GenBank accession number, that spans the MDZ4 locus. The genome-derived single-exon probe first used to demonstrate expression from this locus is shown below the PAC and is labeled "500". The 500 bp probe includes sequence drawn from exon two as well as flanking intron two.

[0707] As shown in FIG. 5, MDZ4, encoding a protein of 389 amino acids, comprises exons 1-4. Predicted molecular weight, prior to any post-translational modification, is 44.9 kD.

[0708] As further discussed in the examples herein, expression of MDZ4 was assessed using hybridization to genome-derived single exon microarrays. Microarray analysis of exons 2 and 3 showed expression in all tissues tested, namely, in human bone marrow, brain, heart, HeLa, adult liver, fetal liver, lung, placenta and prostate.

[0709] The sequence of the MDZ4 cDNA was used as a BLAST query into the GenBank nr and dbEst databases. The nr database includes all non-redundant GenBank coding sequence translations, sequences derived from the 3-dimensional structures in the Brookhaven Protein Data Bank (PDB), sequences from SwissProt, sequences from the protein information resource (PIR), and sequences from protein research foundation (PRF). The dbEst (database of expressed sequence tags) includes ESTs, short, single pass read cDNA (mRNA) sequences, and cDNA sequences from differential display experiments and RACE experiments. BLAST search identified one human EST (BF698315) that skips exon 2, and two ESTs from pig (BF079982, BE233395) as having sequence closely related to MDZ4.

[0710] Globally, human MDZ4 resembles a family of SCAN box containing Kruppel family zinc-finger proteins, including ZNF165 protein (GenBank Accession number: P49910; 40% amino acid identity and 55% amino acid similarity over 447 a.a.), ZNF193 protein (GenBank Accession number: O15535; 54% amino acid identity and 68% amino acid similarity over 387 a.a.) and ZNF232 protein (GenBank Accession number: Q9UNY5; 47% amino acid identity and 59% amino acid similarity over 387 a.a.).

[0711] Motif searches using Pfam (http://pfam.wustl.edu), SMART (http://smart.embl-heidelberg.de), and PROSITE pattern and profile databases (http://www.expasy.ch/prosite), identified several known domains shared with SCAN box containing Kruppel family zinc-finger proteins.

[0712] FIG. 4 shows the domain structure of MDZ4, including the overall structure of MDZ4 and the alignment of the SCAN box in MDZ4 with similar motifs.

[0713] As schematized in FIG. 4, the newly isolated gene product shares certain protein domains and an overall structural organization with SCAN box containing Kruppel family zinc-finger proteins. The shared structural features strongly imply that MDZ4 plays a role similar to that of SCAN box containing Kruppel family zinc-finger proteins as a potential transcription regulator, and is likely to participate in protein-protein interactions with other transcription modulators. Thus, MDZ4 is a clinically useful diagnostic marker and potential therapeutic agent for a variety of diseases, including developmental disorders and cancer.

[0714] Possession of the genomic sequence permitted search for promoter and other control sequences for the MDZ4 gene.

[0715] A putative transcriptional control region, inclusive of promoter and downstream elements, was defined as 1 kb around the transcription start site, itself defined as the first nucleotide of the MDZ4 cDNA clone. The region, drawn from sequence of PAC Z98745.1, has the sequence given in SEQ ID NO: 3046, which lists 1000 nucleotides before the transcription start site.

[0716] Transcription factor binding sites were identified using a web based program (http://motif.genome.ad.jp/), including a binding site for CdxA (771-777 bp) and for cap signal for transcription initiation (984-991 bp, with numbering according to SEQ ID NO: 3046), amongst others.

[0717] We have thus identified a newly described human gene, MDZ4, which shares certain protein domains and an overall structural organization with SCAN box containing Kruppel family zinc-finger proteins. The shared structural features strongly imply that the MDZ4 protein plays a role similar to SCAN box containing Kruppel family zinc-finger proteins, as a potential transcription regulator, and is likely to participate in protein-protein interactions with other transcription modulators. Thus MDZ4 nucleic acids and proteins are clinically useful diagnostic markers and potential therapeutic agents for a variety of diseases, including developmental disorders and cancer.

EXAMPLE 3

Identification and Characterization of cDNAs Encoding MDZ7 Proteins

[0718] Bioinformatic algorithms were applied to human genomic sequence data to identify putative exons. Using a graphical display particularly designed to facilitate computerized query of the resulting exon, four exons were identified as belonging to the same gene.

[0719] Marathon-Ready™ placenta cDNA (Clontech Laboratories, Palo Alto, Calif., USA) was used as a substrate for standard RACE (rapid amplification of cDNA ends). Marathon-Ready™ cDNAs are adaptor-ligated double stranded cDNAs suitable for 3' and 5' RACE. Chenchik et al., BioTechniques 21:526-532 (1996); Chenchik et al., CLONTECHniques X(1):5-8 (January 1995). RACE techniques are described, inter alia, in the Marathon-Ready™ cDNA User Manual (Clontech Labs., Palo Alto, Calif., USA, Mar. 30, 2000, Part No. PT1156-1 (PR03517)), Ausubel et al. (eds.), Short Protocols in Molecular Biology: A Compendium of Methods from Current Protocols in Molecular Biology, 4th edition (April 1999), John Wiley & Sons (ISBN:

047132938X) and Sambrook et al. (eds.), Molecular Cloning: A Laboratory Manual (3rd ed.), Cold Spring Harbor Laboratory Press (2000) (ISBN: 0879695773), the disclosures of which are incorporated herein by reference in their entireties.

[0720] Three overlapping RACE products were cloned that together contain the complete sequence of MDZ7, a cDNA clone that collectively spans 2.2 kilobases and appears to contain the entire coding region of the gene to which the exon contributes; for reasons described below, we termed this cDNA MDZ7.

[0721] The MDZ7 cDNA was sequenced on both strands using a MegaBace™ sequencer (Molecular Dynamics, Inc., Sunnyvale, Calif., USA). Sequencing both strands provided us with the exact chemical structure of the cDNA, which is shown in FIG. 9 and further presented in the SEQUENCE LISTING as SEQ ID NO: 4407, and placed us in actual physical possession of the entire set of single-base incremented fragments of the sequenced clone, starting at the 5' and 3' termini.

[0722] In order to assess expression in a variety of tissues, we generated a pair of PCR primers to analyze the expression pattern of the human MDZ7 gene in standard RT-PCR experiments (Sambrook et al., Molecular Cloning: 3rd edition, 2001). RT-PCR product for MDZ7 was produced from testis. These experiments placed us in possession of a near-complete set of fragments of the template.

[0723] MDZ7 cDNA was sent for deposit to the American Type Culture Collection on Aug. 1, 2001, received at ATCC Aug. 2, 2001, and accorded accession number ______.

[0724] As shown in FIG. 9, the MDZ7 cDNA spans 2198 nucleotides and contains an open reading frame from nucleotide 663 through and including nt 1409 (inclusive of termination codon), predicting a protein of 248 amino acids with a (posttranslationally unmodified) molecular weight of 28.8 kD. The open reading frame appears full length with an in-frame 5' stop codon, a methionine start codon and a stop codon before a 3' poly-A tail.

[0725] BLAST query of genomic sequence identified one BAC, spanning 121 kb, that constitutes the minimum set of clones encompassing the cDNA sequence. Based upon the known origin of the BAC (GenBank accession number AC002310.1), the MDZ7 gene can be mapped to human chromosome 16p11.2.

[0726] Comparison of the cDNA and genomic sequences identified 4 exons. Exon organization is listed in Table 4.

TABLE 4
MDZ7 Exon Structure

Exon no.   cDNA range   genomic range   PAC accession
1          1-396        1531-1926       AC002310.1
2          397-525      2497-2625
3          526-1458     4304-5236
4          1459-2198    6181-6920

[0727] FIG. 8 schematizes the exon organization of the MDZ7 clone.

[0728] At the top is shown the bacterial artificial chromosome (BAC), with GenBank accession numbers, that spans the MDZ7 locus. As shown in FIG. 8, MDZ7 is comprised of four exons and encodes a protein of 248 amino acids. Predicted molecular weight of the MDZ7 protein, prior to any post-translational modification, is 28.8 kD.

[0729] As further discussed in the examples herein, expression of MDZ7 was assessed using RT-PCR. RT-PCR analysis showed MDZ7 expression only in testes, but not in brain, lung, liver, kidney, skeletal muscle, heart, whole fetus, or HeLa cells.

[0730] The sequence of the MDZ7 cDNA was used as a BLAST query into the GenBank nr and dbEst databases. The nr database includes all non-redundant GenBank coding sequence translations, sequences derived from the 3-dimensional structures in the Brookhaven Protein Data Bank (PDB), sequences from SwissProt, sequences from the protein information resource (PIR), and sequences from protein research foundation (PRF). The dbEst (database of expressed sequence tags) includes ESTs, short, single pass read cDNA (mRNA) sequences, and cDNA sequences from differential display experiments and RACE experiments. BLAST search identified multiple human ESTs as having sequence closely related to MDZ7.

[0731] Motif searches using Pfam (http://pfam.wustl.edu), SMART (http://smart.embl-heidelberg.de), and PROSITE pattern and profile databases (http://www.expasy.ch/prosite), identified several Kruppel family zinc-finger motifs.

[0732] FIG. 7 shows the domain structure of MDZ7.

[0733] As schematized in FIG. 7, the newly isolated MDZ7 is mainly composed of seven tandemly arrayed Kruppel-type (C2H2) zinc finger repeats. Such a structure implies that MDZ7 is likely to function in sequence-specific DNA binding and impart a regulatory effect on specific gene expression. Thus MDZ7 nucleic acids and proteins are clinically useful diagnostic markers and potential therapeutic agents for a variety of diseases, including developmental disorders and cancer.

[0734] Possession of the genomic sequence permitted search for promoter and other control sequences for the MDZ7 gene.

[0735] A putative transcriptional control region, inclusive of promoter and downstream elements, was defined as 1 kb around the transcription start site, itself defined as the first nucleotide of the MDZ7 cDNA clone. The region, drawn from sequence of BAC AC002310.1, has the sequence given in SEQ ID NO: 4420, which lists 1000 nucleotides before the transcription start site.

[0736] Transcription factor binding sites were identified using a web based program (http://motif.genome.ad.jp/), including a binding site for cap signal for transcription initiation (846-853 bp, with numbering according to SEQ ID NO: 4420), amongst others.

[0737] We have thus identified a newly described human gene, MDZ7, which contains seven Kruppel family zinc finger motifs. The structural features strongly imply that the

MDZ7 protein plays a role similar to other Kruppel family zinc-finger proteins, as a potential transcription regulator. Thus, MDZ7 nucleic acids and proteins are clinically useful diagnostic markers and potential therapeutic agents for a variety of diseases, including developmental disorders and cancer.

EXAMPLE 4

Identification and Characterization of cDNAs Encoding MDZ12 Proteins

[0738] Bioinformatic algorithms were applied to human genomic sequence data to identify putative exons. Using a graphical display particularly designed to facilitate computerized query of the resulting exon, four exons were identified as belonging to the same gene.

[0739] To RACE out the full length MDZ12a gene, human heart marathon-ready cDNA (Clontech) was used as the template, and oligonucleotides OL612 (5'-TTACTAAATCAAATGGGTGTTTTGATGGCATAAA-3') (SEQ ID NO:7036) and OL613 (5'-GCATCAGGTGGCAAAGCTCAATCAGGACA-3') (SEQ ID NO:7037) were used to PCR out a 1.2 kb fragment of the open reading frame (ORF) using protocols according to the manufacturer's instructions (Clontech). Marathon-Ready™ cDNAs are adaptor-ligated double stranded cDNAs suitable for 3' and 5' RACE. Chenchik et al., BioTechniques 21:526-532 (1996); Chenchik et al., CLONTECHniques X(1):5-8 (January 1995). RACE techniques are described, inter alia, in the Marathon-Ready™ cDNA User Manual (Clontech Labs., Palo Alto, Calif., USA, Mar. 30, 2000, Part No. PT1156-1 (PR03517)), Ausubel et al. (eds.), Short Protocols in Molecular Biology: A Compendium of Methods from Current Protocols in Molecular Biology, 4th edition (April 1999), John Wiley & Sons (ISBN: 047132938X) and Sambrook et al. (eds.), Molecular Cloning: A Laboratory Manual (3rd ed.), Cold Spring Harbor Laboratory Press (2000) (ISBN: 0879695773), the disclosures of which are incorporated herein by reference in their entireties.

[0740] Using a similar protocol, oligonucleotides OL614 (5'-CACTCAATGCACGTATAGGGCCTCTCGCC-3') (SEQ ID NO:7038) and OL615 (5'-TGTCCTGATTGAGCTTTGCCACCTGATGC-3') (SEQ ID NO:7039) were used to PCR out the 5' end of the gene; oligonucleotides OL311 (5'-GGTCACCTGTACGCTCCTCTCCATGTCTCTTC-3') (SEQ ID NO:7040) and OL312 (5'-CTGTTTGGCTTCCGACCTGCTCCTCACC-3') (SEQ ID NO:7041) were used to PCR out the 3' end of the gene. The PCR fragments were sequenced using a MegaBACE™ sequencer. The final contig of the sequences revealed a 1.78 kb novel gene with a 1.45 kb ORF.

[0741] To subclone MDZ12a into a cloning vector, the RACE product generated with oligonucleotides OL612 and OL638 (5'-CCACCATGTGGCTGGGGACTTCAGGGAAGAGTGGGTTAC-3') (SEQ ID NO:7042) against human heart marathon-ready cDNA (Clontech) was ligated and T/A cloned into pGEM-T Easy vector (Promega Corp.). Individual clones were picked and inserts sequenced. A second MDZ12 transcript, with the insertion of an extra exon with 66 nucleotides between exons 2 and 3 of MDZ12a, was identified. This transcript is named MDZ12b. The insertion of the extra exon introduces an early stop codon, reducing the initial ORF to 44 amino acids. However, the MDZ12b transcript does still contain another ORF downstream with 332 amino acids.

[0742] The MDZ12a cDNA was sequenced on both strands using a MegaBace™ sequencer (Molecular Dynamics, Inc., Sunnyvale, Calif., USA). Sequencing both strands provided us with the exact chemical structure of the cDNA, which is shown in FIG. 12 and further presented in the SEQUENCE LISTING as SEQ ID NO: 5770, and placed us in actual physical possession of the entire set of single-base incremented fragments of the sequenced clone, starting at the 5' and 3' termini.

[0743] MDZ12a and MDZ12b cDNA were deposited together in a single tube at the American Type Culture Collection; the deposit was sent to ATCC on Aug. 1, 2001, received at ATCC Aug. 2, 2001, and accorded accession number ______. As shown in FIG. 12, the MDZ12a cDNA spans 1780 nucleotides and contains an open reading frame from nucleotide 127 through and including nt 1578 (inclusive of termination codon), predicting a protein of 483 amino acids with a (posttranslationally unmodified) molecular weight of 55.1 kD. The open reading frame appears full length with in-frame 5' stop codons, a methionine start codon and a stop codon.

[0744] As shown in FIG. 13, the MDZ12b cDNA spans at least 1518 nucleotides and contains two open reading frames. The shorter ORF, MDZ12bS, is from nucleotide 1 through and including nt 135 (inclusive of termination codon), predicting a polypeptide of 44 amino acids. The longer ORF, MDZ12bL, is from nucleotide 520 through and including nt 1518 (inclusive of termination codon), predicting a polypeptide of 332 amino acids with a (posttranslationally unmodified) molecular weight of 38.2 kD.

[0745] BLAST query of genomic sequence identified one BAC that constitutes the minimum set of clones encompassing the cDNA sequence. Based upon the known origin of the BAC (GenBank accession number AC018946.5), the MDZ12 gene can be mapped to human chromosome 15q26.1.

[0746] Comparison of the cDNA and genomic sequences identified 4 exons for MDZ12a. Exon organization is listed in Table 5. The additional exon found in MDZ12b is numbered as 2.

TABLE 5
Exon Structure of the MDZ12 gene

Exon no.   cDNA range   genomic range    BAC accession
1          1-107        155292-155186    AC018946.5
2          1-66         152246-152181
2          108-230      152955-152833
3          231-337      145120-145226
4          338-1756     146307-147730

[0747] FIG. 11 schematizes the exon organization of the MDZ12 clones.

[0748] At the top is shown the bacterial artificial chromosome (BAC), with GenBank accession numbers, that spans the MDZ12 locus.
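The open reading frame coordinates quoted for FIG. 12 and FIG. 13 can be cross-checked against the stated polypeptide lengths with simple arithmetic: an ORF given as an inclusive nucleotide range that includes the termination codon encodes one residue per codon, less the stop. The snippet below is an illustrative check only; the helper name `orf_protein_length` is hypothetical, and the coordinates are taken directly from the text above.

```python
def orf_protein_length(first_nt, last_nt, includes_stop=True):
    """Residues encoded by an ORF given 1-based inclusive cDNA coordinates."""
    n_nt = last_nt - first_nt + 1
    if n_nt <= 0 or n_nt % 3 != 0:
        raise ValueError("ORF span must be a positive multiple of 3 nt")
    codons = n_nt // 3
    # The termination codon does not contribute an amino acid.
    return codons - 1 if includes_stop else codons


# ORF coordinates as stated in the text (inclusive of termination codon).
ORFS = {
    "MDZ12a": (127, 1578),
    "MDZ12bS": (1, 135),
    "MDZ12bL": (520, 1518),
}

if __name__ == "__main__":
    for name, (first, last) in ORFS.items():
        print(name, orf_protein_length(first, last))
```

Under these coordinates the helper returns 483, 44 and 332 residues for MDZ12a, MDZ12bS and MDZ12bL respectively, in agreement with the lengths given in the text.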

[0749] As shown in FIG. 11, MDZ12a encodes a protein of 483 amino acids, comprising exons 1-4. Predicted molecular weight of the protein, prior to any post-translational modification, is 55.1 kD. The inclusion of a novel exon between exons 2 and 3 introduces an in-frame stop codon in MDZ12b, and thus MDZ12b encodes a short polypeptide of 44 amino acids (MDZ12bS). The use of an internal methionine as initiation methionine in MDZ12b could potentially encode a 332 amino acid protein (MDZ12bL). The predicted molecular weight of the MDZ12bL protein, prior to any post-translational modification, is 38.2 kD.

[0750] Expression of MDZ12 was assessed using RT-PCR. The abundance of PCR product indicates that MDZ12a is expressed in all tissues examined, with highest expression in brain, heart, skeletal muscle, testis and HeLa cells. MDZ12b, however, is expressed with lower to much lower abundance compared with MDZ12a in bone marrow, brain, heart, kidney, placenta, skeletal muscle, testis and HeLa cells, with almost no expression in liver.

[0751] The sequences of the MDZ12a and MDZ12b cDNA were used as BLAST queries into the GenBank nr and dbEst databases. The nr database includes all non-redundant GenBank coding sequence translations, sequences derived from the 3-dimensional structures in the Brookhaven Protein Data Bank (PDB), sequences from SwissProt, sequences from the protein information resource (PIR), and sequences from protein research foundation (PRF). The dbEst (database of expressed sequence tags) includes ESTs, short, single pass read cDNA (mRNA) sequences, and cDNA sequences from differential display experiments and RACE experiments. BLAST search identified multiple human ESTs as having sequence closely related to MDZ12.

[0752] Motif searches using Pfam (http://pfam.wustl.edu), SMART (http://smart.embl-heidelberg.de), and PROSITE pattern and profile databases (http://www.expasy.ch/prosite), identified several known domains shared with KRAB domain containing Kruppel family zinc-finger proteins for MDZ12a and Kruppel family zinc-fingers for MDZ12bL.

[0753] FIG. 10A shows the domain structure of MDZ12, including the overall structure of MDZ12a and MDZ12bL, with FIG. 10B showing the alignment of KRAB domain in MDZ12a with similar motifs.

[0754] As schematized in FIGS. 10A and 10B, the newly isolated MDZ12a contains a partial KRAB motif as well as twelve copies of C2H2 zinc fingers. The MDZ12bL contains twelve copies of C2H2 zinc fingers. Such features strongly imply that MDZ12 plays a role as a potential transcription regulator, and is likely to participate in protein-protein interactions with other transcription modulators. Thus MDZ12 nucleic acids and proteins are clinically useful diagnostic markers and potential therapeutic agents for a variety of diseases, including developmental disorders and cancer.

[0755] Possession of the genomic sequence permitted search for promoter and other control sequences for the MDZ12 gene.

[0756] A putative transcriptional control region, inclusive of promoter and downstream elements, was defined as 1 kb around the transcription start site, itself defined as the first nucleotide of the MDZ12 cDNA clone. The region, drawn from sequence of BAC AC018946.5, has the sequence given in SEQ ID NO: 5783, which lists 1000 nucleotides before the transcription start site.

[0757] Transcription factor binding sites were identified using a web based program (http://motif.genome.ad.jp/), including a binding site for signal transducers and activators of transcription (STATx, 837-845 bp) and for GATA-binding factor 1 (GATA-1, 926-935 bp), with numbering according to SEQ ID NO: 5783, amongst others.

[0758] We have thus identified a newly described human gene, MDZ12, which shares certain protein domains and an overall structural organization with KRAB box containing Kruppel family zinc-finger proteins. The shared structural features strongly imply that the MDZ12a and MDZ12bL proteins play a role similar to KRAB box containing Kruppel family zinc-finger proteins, as potential transcription regulators, and are likely to participate in protein-protein interactions with other transcription modulators. Thus MDZ12 is a clinically useful diagnostic marker and potential therapeutic agent for a variety of diseases, including developmental disorders and cancer.

EXAMPLE 5

RT-PCR Analysis of Expression of Human MDZ3

[0759] RT-PCR analysis was used to determine the expression pattern of the human MDZ3 gene. A forward primer (5'-GATGGCGGAAGCTCCTCAGC) (SEQ ID NO:7043) and a reverse primer (5'-AGTCCTGCGGTTCCACATAC) (SEQ ID NO:7044) derived from the middle of the MDZ3 gene were used in standard RT-PCRs (Sambrook et al., Molecular Cloning (3rd ed.), 2001). Templates for the PCRs were obtained from brain, liver, lung, bone marrow, heart, skeletal muscle, and testis, and reactions were carried out according to the following schedule: 94° C., 20 seconds; 68° C., 20 seconds; 72° C., 60 seconds, for 35 cycles. PCR products were separated on an agarose gel and visualized with a Typhoon™ fluorimager and ImageQuant™ software (Molecular Dynamics, Sunnyvale, Calif.). RT-PCR product for MDZ3 was produced from brain, testis, heart and bone marrow, but not from lung, liver, or skeletal muscle (FIG. 14).

EXAMPLE 6

RT-PCR Analysis of Expression of Human MDZ7

[0760] RT-PCR analysis was used to determine the expression pattern of the human MDZ7 gene. A forward primer (5'-TCAGATCTGTCGCTCCTTCA) (SEQ ID NO:7045) and a reverse primer (5'-GCAGTCTGAGCACGCGTAAG) (SEQ ID NO:7046) derived from the open reading frame of MDZ7 were used in standard RT-PCRs (Sambrook et al., Molecular Cloning: 3rd edition, 2001). Templates for the PCRs were obtained from whole fetus, liver, lung, kidney, heart, and testis, and reactions were carried out according to the following schedule: 94° C., 20 seconds; 68° C., 20 seconds; 72° C., 60 seconds, for 35 cycles. PCR products were separated on an agarose gel and visualized with a Typhoon™ fluorimager and ImageQuant™ software (Molecular Dynamics, Sunnyvale, Calif.). RT-PCR product for MDZ7 was only produced from testis (FIG. 15).
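The RT-PCR conditions in Examples 5 and 6 can be written down as data, which makes it easy to tabulate the primers and the cycling program. The sketch below is purely illustrative and is not part of the described method: the dictionary keys, the helper names `gc_fraction` and `total_hold_seconds`, and the choice to ignore ramp times and any initial denaturation or final extension steps (none are stated in the text) are assumptions.

```python
from collections import Counter

# Primer sequences as given in Examples 5 and 6 (written 5' to 3').
PRIMERS = {
    "MDZ3_forward": "GATGGCGGAAGCTCCTCAGC",  # SEQ ID NO:7043
    "MDZ3_reverse": "AGTCCTGCGGTTCCACATAC",  # SEQ ID NO:7044
    "MDZ7_forward": "TCAGATCTGTCGCTCCTTCA",  # SEQ ID NO:7045
    "MDZ7_reverse": "GCAGTCTGAGCACGCGTAAG",  # SEQ ID NO:7046
}

# One cycle of the stated schedule: (temperature in deg C, hold in seconds).
CYCLE_STEPS = [(94, 20), (68, 20), (72, 60)]
N_CYCLES = 35


def gc_fraction(seq):
    """Fraction of G and C bases in an oligonucleotide."""
    counts = Counter(seq.upper())
    return (counts["G"] + counts["C"]) / len(seq)


def total_hold_seconds(steps, cycles):
    """Summed hold time over all cycles; ramp times are ignored."""
    return cycles * sum(seconds for _, seconds in steps)


if __name__ == "__main__":
    for name, seq in PRIMERS.items():
        print(f"{name}: {len(seq)} nt, GC fraction {gc_fraction(seq):.2f}")
    print("hold time (s):", total_hold_seconds(CYCLE_STEPS, N_CYCLES))
```

Ignoring ramp times, the stated program amounts to 3500 seconds of hold time, a little under an hour.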

EXAMPLE 7

RT-PCR Analysis of MDZ12 Expression

[0761] To explore the potential function of the MDZ12 gene, the expression of the MDZ12 gene in human tissues was examined by PCR using marathon-ready cDNAs. Oligonucleotides OL612 and OL638 were used to amplify both MDZ12a and MDZ12b from human cDNAs of bone marrow, brain, heart, kidney, liver, placenta, skeletal muscle, testis and HeLa cells. The PCR conditions were according to a touchdown PCR procedure. The tubes containing the oligonucleotides, cDNA and Taq polymerase were first incubated at 94° C. for 15 seconds followed by 72° C. for 2 minutes, for 5 cycles. The tubes were then incubated at 94° C. for 15 seconds followed by 70° C. for 2 minutes, for 5 cycles. Finally the tubes were incubated at 94° C. for 15 seconds followed by 68° C. for 2 minutes, for 25 cycles. The result of the expression profile is shown in FIG. 16. The abundance of PCR product indicates that MDZ12a is expressed in all tissues examined, with highest expression in brain, heart, skeletal muscle, testis and HeLa cells (FIG. 16). MDZ12b, however, is expressed with lower to much lower abundance compared with MDZ12a in bone marrow, brain, heart, kidney, placenta, skeletal muscle, testis and HeLa cells, with almost no expression in liver.

EXAMPLE 8

Preparation and Labeling of Useful Fragments of MDZ3, MDZ4, MDZ7 and MDZ12

[0762] Useful fragments of MDZ3, MDZ4, MDZ7 and MDZ12 are produced by PCR, using standard techniques, or solid phase chemical synthesis using an automated nucleic acid synthesizer. Each fragment is sequenced, confirming the exact chemical structure thereof.

[0763] The exact chemical structure of preferred fragments is provided in the attached SEQUENCE LISTING, the disclosure of which is incorporated herein by reference in its entirety. The following summary identifies the fragments whose structures are more fully described in the SEQUENCE LISTING:

SEQ ID NO: 1 (nt, full length MDZ3 cDNA)
SEQ ID NO: 2 (nt, cDNA ORF of MDZ3)
SEQ ID NO: 3 (aa, full length protein of MDZ3)
SEQ ID NO: 4 (nt, nt 201-1721 of MDZ3)
SEQ ID NO: 5 (nt, 5' UT portion of SEQ ID NO: 4)
SEQ ID NO: 6 (nt, coding region of SEQ ID NO: 4)
SEQ ID NO: 7 (aa, aa 1-470, CDS entirely within SEQ ID NO: 6)
SEQ ID NOs: 8-15 (nt, exon 1-8 from genomic sequence of MDZ3)
SEQ ID NOs: 16-23 (nt, 500 bp genomic amplicon centered about exons 1-8 of MDZ3)
SEQ ID NO: 24 (nt, 1000 bp putative promoter of MDZ3)
SEQ ID NOs: 25-1529 (nt, 17-mers scanning nt 201-1721 of human MDZ3)
SEQ ID NOs: 1530-3026 (nt, 25-mers scanning nt 201-1721 of human MDZ3)
SEQ ID NO: 3027 (nt, full length MDZ4 cDNA)
SEQ ID NO: 3028 (nt, cDNA ORF of MDZ4)
SEQ ID NO: 3029 (aa, full length protein of MDZ4)
SEQ ID NO: 3030 (nt, nt 65-549, portion of MDZ4)
SEQ ID NO: 3031 (nt, 5' UT portion of SEQ ID NO: 3030)
SEQ ID NO: 3032 (nt, coding region of SEQ ID NO: 3030)
SEQ ID NO: 3033 (aa, aa 1-136, CDS entirely within SEQ ID NO: 3032)
SEQ ID NO: 3034 (nt, nt 1095-1329, portion of MDZ4)
SEQ ID NO: 3035 (nt, coding region of SEQ ID NO: 3034)
SEQ ID NO: 3036 (nt, 3' UT portion of SEQ ID NO: 3034)
SEQ ID NO: 3037 (aa, aa 319-389, CDS entirely within SEQ ID NO: 3035)
SEQ ID NOs: 3038-3041 (nt, exons 1-4, from genomic sequence, of MDZ4)
SEQ ID NOs: 3042-3045 (nt, 500 bp genomic amplicon centered about exons 1-4 of MDZ4)
SEQ ID NO: 3046 (nt, 1000 bp putative promoter of MDZ4)
SEQ ID NOs: 3047-3515 (nt, 17-mers scanning nt 65-549 of human MDZ4)
SEQ ID NOs: 3516-3976 (nt, 25-mers scanning nt 65-549 of human MDZ4)
SEQ ID NOs: 3977-4195 (nt, 17-mers scanning nt 1095-1329 of human MDZ4)
SEQ ID NOs: 4196-4406 (nt, 25-mers scanning nt 1095-1329 of human MDZ4)
SEQ ID NO: 4407 (nt, full length MDZ7 cDNA)
SEQ ID NO: 4408 (nt, cDNA ORF of MDZ7)
SEQ ID NO: 4409 (aa, full length protein of MDZ7)
SEQ ID NO: 4410 (nt, nt 1-395, portion of MDZ7)
SEQ ID NO: 4411 (nt, nt 1459-1778, portion of MDZ7)
SEQ ID NOs: 4412-4415 (nt, exons 1-4, from genomic sequence, of MDZ7)
SEQ ID NOs: 4416-4419 (nt, 500 bp genomic amplicon centered about exon 1-4 of MDZ7)
SEQ ID NO: 4420 (nt, 1000 bp putative promoter of MDZ7)
SEQ ID NOs: 4421-4799 (nt, 17-mers scanning nt 1-395 of human MDZ7)
SEQ ID NOs: 4800-5170 (nt, 25-mers scanning nt 1-395 of human MDZ7)
SEQ ID NOs: 5171-5474 (nt, 17-mers scanning nt 1459-1778 of human MDZ7)
SEQ ID NOs: 5475-5769 (nt, 25-mers scanning nt 1459-1778 of human MDZ7)
SEQ ID NO: 5770 (nt, full length MDZ12 cDNA)
SEQ ID NO: 5771 (nt, cDNA ORF of MDZ12)
SEQ ID NO: 5772 (aa, full length protein of MDZ12)
SEQ ID NO: 5773 (nt, nt 352-948, portion of MDZ12)
SEQ ID NO: 5774 (aa, aa 76-274, portion of MDZ12)
SEQ ID NOs: 5775-5778 (nt, exons 1-4, from genomic sequence, of MDZ12)
SEQ ID NOs: 5779-5782 (nt, 500 bp genomic amplicon centered about exon 1-4 of MDZ12)
SEQ ID NO: 5783 (nt, 1000 bp putative promoter of MDZ12)
SEQ ID NOs: 5784-6364 (nt, 17-mers scanning nt 352-948 of human MDZ12)
SEQ ID NOs: 6365-6937 (nt, 25-mers scanning nt 352-948 of human MDZ12)
SEQ ID NO: 6938 (nt, cDNA of MDZ12b)
SEQ ID NO: 6939 (aa, full length protein of MDZ12bS)
SEQ ID NO: 6940 (aa, full length protein of MDZ12bL)
SEQ ID NO: 6941 (nt, novel exon (nt 105-170 portion) of MDZ12b)
SEQ ID NO: 6942 (aa, novel exon (before stop codon) of MDZ12b)
SEQ ID NO: 6943 (nt, 500 bp genomic amplicon centered about novel exon of MDZ12b)
SEQ ID NOs: 6944-6993 (nt, 17-mers scanning novel exon of human MDZ12b)
SEQ ID NOs: 6994-7035 (nt, 25-mers scanning novel exon of human MDZ12b)
SEQ ID NO: 7036 (nt, oligo OL612)
SEQ ID NO: 7037 (nt, oligo OL613)
SEQ ID NO: 7038 (nt, oligo OL614)
SEQ ID NO: 7039 (nt, oligo OL615)
SEQ ID NO: 7040 (nt, oligo OL311)
SEQ ID NO: 7041 (nt, oligo OL312)
SEQ ID NO: 7042 (nt, oligo OL638)
SEQ ID NO: 7043 (nt, forward primer for MDZ3 RT PCR)
SEQ ID NO: 7044 (nt, reverse primer for MDZ3 RT PCR)

SEQ ID NO: 7045 (nt, forward primer for MDZ7 RT PCR)
SEQ ID NO: 7046 (nt, reverse primer for MDZ7 RT PCR)

[0764] Upon confirmation of the exact structure, each of the above-described nucleic acids of confirmed structure is recognized to be immediately useful as a MDZ3, MDZ4, MDZ7 or MDZ12-specific probe.

[0765] For use as labeled nucleic acid probes, the above-described MDZ3, MDZ4, MDZ7 or MDZ12 nucleic acids are separately labeled by random priming. As is well known in the art of molecular biology, random priming places the investigator in possession of a near-complete set of labeled fragments of the template of varying length and varying starting nucleotide.

[0766] The labeled probes are used to identify the MDZ3, MDZ4, MDZ7 or MDZ12 gene on a Southern blot, and are used to measure expression of MDZ3, MDZ4, MDZ7 or MDZ12 mRNA on a northern blot and by RT-PCR, using standard techniques.

EXAMPLE 9

Production of MDZ3, MDZ4, MDZ7 or MDZ12 Protein

[0767] The full length MDZ3, MDZ4, MDZ7, MDZ12a or MDZ12b cDNA clone is cloned into the mammalian expression vector pcDNA3.1/HisA (Invitrogen, Carlsbad, Calif., USA), transfected into COS7 cells, transfectants selected with G418, and protein expression in transfectants confirmed by detection of the anti-Xpress™ epitope according to manufacturer's instructions. Protein is purified using immobilized metal affinity chromatography and vector-encoded protein sequence is then removed with enterokinase, per manufacturer's instructions, followed by gel filtration and/or HPLC.

[0768] Following epitope tag removal, MDZ3, MDZ4, MDZ7, MDZ12a, MDZ12bS or MDZ12bL protein is present at a concentration of at least 70%, measured on a weight basis with respect to total protein (i.e., w/w), and is free of acrylamide monomers, bis acrylamide monomers, polyacrylamide and ampholytes. Further HPLC purification provides MDZ3, MDZ4, MDZ7, MDZ12a, MDZ12bS or MDZ12bL protein at a concentration of at least 95%, measured on a weight basis with respect to total protein (i.e., w/w).

EXAMPLE 10

Production of Anti-MDZ3, MDZ4, MDZ7 or MDZ12 Antibody

[0769] Purified proteins prepared as in Example 6 are separately conjugated to carrier proteins and used to prepare murine monoclonal antibodies by standard techniques. Initial screening with the unconjugated purified proteins, followed by competitive inhibition screening using peptide fragments of the MDZ3, MDZ4, MDZ7, MDZ12a, MDZ12bS or MDZ12bL, identifies monoclonal antibodies with specificity for MDZ3, MDZ4, MDZ7, MDZ12a, MDZ12bS or MDZ12b.

EXAMPLE 11

Use of MDZ3, MDZ4, MDZ7 or MDZ12 Probes and Antibodies for Diagnosis

[0770] After informed consent is obtained, samples are drawn from tumors, and tested for MDZ3, MDZ4, MDZ7, MDZ12a or MDZ12b mRNA levels by standard techniques and tested additionally for MDZ3, MDZ4, MDZ7, MDZ12a or MDZ12b protein levels using anti-MDZ3, MDZ4, MDZ7, MDZ12a or MDZ12b antibodies in a standard ELISA. After data are unblinded, a statistically significant correlation of aberrant expression of each of the above-described genes is seen with various tumor types.

EXAMPLE 12

Use of MDZ3, MDZ4, MDZ7 or MDZ12 Nucleic Acids, Proteins, and Antibodies in Therapy

[0771] Once mutations of MDZ3, MDZ4, MDZ7 or MDZ12 have been detected in patients, normal MDZ3, MDZ4, MDZ7 or MDZ12 is reintroduced into the patient's tumor cells by introduction of expression vectors that drive MDZ3, MDZ4, MDZ7 or MDZ12 expression or by introducing MDZ3, MDZ4, MDZ7 or MDZ12 proteins into cells, with statistically significant improvement in patient longevity.

[0772] In patients in whom expression is increased, antibodies for the mutated forms of MDZ3, MDZ4, MDZ7 or MDZ12 are used to block the function of the abnormal forms of the protein.

EXAMPLE 13

MDZ3 Disease Associations

[0773] Diseases that map to the MDZ3 chromosomal region are shown in Table 6. Mutation of the MDZ3 gene contributes to one or more of these conditions.

TABLE 6
Diseases mapped to human chromosome 7q22.1 (MDZ3 region)

mim num   disease                             chromosomal location
209850    Autism, susceptibility to           7q
145290    Hyperreflexia                       7q
603511    Limb-girdle muscular dystrophy-1D   7q

EXAMPLE 14

MDZ4 Disease Associations

[0774] Diseases that map to the MDZ4 chromosomal region are shown in Table 7. Mutation of the MDZ4 gene contributes to one or more of these conditions.

TABLE 7
Diseases mapped to human chromosome 6p21.3 (MDZ4 region)

mim num   disease                                       chromosomal location
106300    Ankylosing spondylitis                        6p21.3
108800    Atrial septal defect, secundum type           6p21.3
146520    Hypotrichosis simplex of scalp                6p21.3
222100    Insulin-dependent diabetes mellitus-1         6p21.3
137100    Immunoglobulin A deficiency                   6p21.3
146850    Immune suppression to streptococcal antigen   6p21.3
604809    Panbronchiolitis, diffuse                     6p21.3
167250    Paget disease of bone                         6p21.3
177900    Psoriasis susceptibility 1                    6p21.3
179450    Ragweed sensitivity                           6p21.3
150270    Laryngeal adductor paralysis                  6p21.3-p21.2

EXAMPLE 15

MDZ7 Disease Associations

[0775] Diseases that map to the MDZ7 chromosomal region are shown in Table 8. Mutation of the MDZ7 gene contributes to one or more of these conditions.

TABLE 8
Diseases mapped to chromosome 16p11.2 (MDZ7 region)

mim num   disease                                                chromosomal location
157700    Mitral valve prolapse, familial                        16p12.1-p11.2
602066    Convulsions, infantile and paroxysmal choreoathetosis  16p12-q12
266600    Inflammatory bowel disease-1                           16p12-q13
186580    Arthrocutaneouveal granulomatosis                      16p12-q21
128200    Paroxysmal kinesigenic choreoathetosis                 16p11.2-q12.1

EXAMPLE 16

MDZ12 Disease Associations

[0776] Diseases that map to the MDZ12 chromosomal region are shown in Table 9. Mutation of the MDZ12 gene contributes to one or more of these conditions.

TABLE 9
Diseases mapped to chromosome 15q26.1 (MDZ12 region)

mim num   disease                                                                        chromosomal location
604329    Hypertension, essential, susceptibility to, 2                                  15q
214900    Cholestasis-lymphedema syndrome                                                15q
604416    Pyogenic sterile arthritis, pyoderma gangrenosum, and acne (PAPA syndrome)     15q24-q26.1
603813    Hypercholesterolemia, familial, autosomal recessive, 1                         15q25-q26
600318    Insulin-dependent diabetes mellitus-3                                          15q26
166800    Otosclerosis 1                                                                 15q26.1-qter

[0777] All patents, patent publications, and other published references mentioned herein are hereby incorporated by reference in their entireties as if each had been individually and specifically incorporated by reference herein. While preferred illustrative embodiments of the present invention are described, one skilled in the art will appreciate that the present invention can be practiced by other than the described embodiments, which are presented for purposes of illustration only and not by way of limitation. The present invention is limited only by the claims that follow.

SEQUENCE LISTING

The patent application contains a lengthy "Sequence Listing" section. A copy of the "Sequence Listing" is available in electronic form from the USPTO web site (http://seqdata.uspto.gov/sequence.html?DocID=20040078837). An electronic copy of the "Sequence Listing" will also be available from the USPTO upon request and payment of the fee set forth in 37 CFR 1.19(b)(3).
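The 17-mer and 25-mer "scanning" entries summarized in Example 8 can be illustrated with a short sketch. It assumes that "scanning" means every contiguous window of the stated length, advanced one nucleotide at a time; the publication does not state the step size, but the assumption is consistent with the entry counts (nt 201-1721 of MDZ3 is 1521 nt long, giving 1505 17-mers and 1497 25-mers, the same as the number of entries in SEQ ID NOs: 25-1529 and 1530-3026). The function names and the toy sequence are hypothetical and are not taken from the Sequence Listing.

```python
def scanning_kmers(seq: str, k: int) -> list[str]:
    """Return every contiguous k-mer of seq, stepping one nucleotide at a time.

    Assumption: a step of one nucleotide; the source does not state the step.
    """
    if k < 1:
        raise ValueError("k must be a positive integer")
    # A sequence of length n contains n - k + 1 windows of length k.
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def scanning_kmer_count(first_nt: int, last_nt: int, k: int) -> int:
    """Number of k-mers that scan the inclusive, 1-based region first_nt..last_nt."""
    region_length = last_nt - first_nt + 1
    return max(region_length - k + 1, 0)


if __name__ == "__main__":
    toy = "ATGCTTAAAGAGCATCCAGAG"  # hypothetical 21-nt toy sequence
    for kmer in scanning_kmers(toy, 17):
        print(kmer)
    # Counts for the MDZ3 region nt 201-1721 listed in Example 8.
    print(scanning_kmer_count(201, 1721, 17))
    print(scanning_kmer_count(201, 1721, 25))
```

Keeping the count in a separate function makes it easy to compare a listed SEQ ID NO range against the length of the region it is said to scan without needing the sequence itself.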

What is claimed is:

1. An isolated nucleic acid that encodes a zinc finger containing protein, comprising:
(a) a nucleotide sequence selected from the group consisting of:
(i) SEQ ID NO:1, SEQ ID NO:3027, SEQ ID NO:4407, SEQ ID NO:5770, and SEQ ID NO:6938;
(ii) the complement of the sequences set forth in (i);
(iii) the nucleotide sequence of SEQ ID NO:2, SEQ ID NO:3028, SEQ ID NO:4408, and SEQ ID NO:5771;
(iv) a degenerate variant of the sequences set forth in (iii);
(v) the complement of the sequences set forth in (iii) and (iv); and
(vi) the nucleotide sequence of the cDNAs having ATCC accession nos. (MDZ3), (MDZ4), (MDZ7), and (MDZ12a and MDZ12b); or
(b) a nucleotide sequence selected from the group consisting of:
(i) a nucleotide sequence that encodes a polypeptide having the sequence of SEQ ID NO:3, SEQ ID NO:3029, SEQ ID NO:4409, SEQ ID NO:5772, and SEQ ID NO:6940;
(ii) a nucleotide sequence that encodes a polypeptide having the sequence of SEQ ID NO:3, SEQ ID NO:3029, SEQ ID NO:4409, SEQ ID NO:5772, SEQ ID NO:6939, and SEQ ID NO:6940, with conservative amino acid substitutions, and
(iii) the complement of the sequences set forth in (i) and (ii),
wherein said isolated nucleic acid comprising a nucleotide sequence selected from group (b) is no more than about 100 kb in length.

2. The isolated nucleic acid of claim 1 wherein said nucleic acid, or the complement of said nucleic acid, encodes a polypeptide having sequence-specific DNA binding activity.

3. The isolated nucleic acid of claim 1, wherein said nucleic acid, or the complement of said nucleic acid, is expressed in testis.

4. A nucleic acid probe, comprising:
(a) the nucleic acid of claim 1; or
(b) at least 17 contiguous nucleotides of SEQ ID NO:4, SEQ ID NO:3030, SEQ ID NO:3034, SEQ ID NO:4410, SEQ ID NO:4411, SEQ ID NO:5773, or SEQ ID NO:6941,
wherein said probe according to (b) is no longer than about 100 kb in length.

5. The probe of claim 4, wherein said probe is detectably labeled.

6. The probe of claim 4, attached to a substrate.

7. A microarray, wherein at least one probe of said array is a probe according to claim 4.

8. The isolated nucleic acid molecule of claim 1, wherein said nucleic acid molecule is operably linked to one or more expression control elements.

9. A replicable vector comprising a nucleic acid molecule of claim 1.

10. A replicable vector comprising an isolated nucleic acid molecule of claim 8.

11. A host cell transformed to contain the nucleic acid molecule of any one of claims 1 or 8-10, or the progeny thereof.

12. A method for producing a polypeptide, the method comprising: culturing the host cell of claim 11 under conditions in which the protein encoded by said nucleic acid molecule is expressed.

13. An isolated polypeptide produced by the method of claim 12.

14. An isolated polypeptide, comprising:
(a) an amino acid sequence selected from the group consisting of:
(i) SEQ ID NO:3, SEQ ID NO:3029, SEQ ID NO:4409, SEQ ID NO:5772, SEQ ID NO:6939, and SEQ ID NO:6940; and
(ii) the amino acid sequence of the cDNAs having ATCC accession nos. { };
(b) an amino acid sequence having at least 65% amino acid sequence identity to that of (a)(i) or (a)(ii);
(c) an amino acid sequence according to (a)(i) or (a)(ii) in which at least 95% of deviations from the sequence of (a)(i) or (a)(ii) are conservative substitutions; or
(d) a fragment of at least 8 contiguous amino acids of any of (a)-(c).

15. A fusion protein, said fusion protein comprising a polypeptide of claim 14 fused to a heterologous amino acid sequence.

16. The fusion protein of claim 15, wherein said heterologous amino acid sequence is a detectable moiety.

17. The fusion protein of claim 16, wherein said detectable moiety is fluorescent.

18. The fusion protein of claim 15, wherein said heterologous amino acid sequence is an Ig Fc region.

19. An isolated antibody, or antigen-binding fragment or derivative thereof, the binding of which can be competitively inhibited by a polypeptide of claim 14.

20. A transgenic non-human animal modified to contain the nucleic acid molecule of any one of claims 1 or 8-10.

21. A transgenic non-human animal unable to express the endogenous orthologue of the nucleic acid molecule of claim 1.

22. A method of identifying agents that modulate the expression of MDZ3, MDZ4, MDZ7, or MDZ12, the method comprising:
contacting a cell or tissue sample believed to express MDZ3, MDZ4, MDZ7, or MDZ12 with a chemical or biological agent, and then
comparing the amount of MDZ3, MDZ4, MDZ7, or MDZ12 expression in said cell or tissue sample with that of a control,
changes in the amount relative to control identifying an agent that modulates expression of MDZ3, MDZ4, MDZ7, or MDZ12.

23. A method of identifying agonists and antagonists of MDZ3, MDZ4, MDZ7, or MDZ12, the method comprising:
contacting a cell or tissue sample believed to express MDZ3, MDZ4, MDZ7, or MDZ12 with a chemical or biological agent, and then
comparing the activity of MDZ3, MDZ4, MDZ7, or MDZ12 with that of a control,
increased activity relative to a control identifying an agonist, decreased activity relative to a control identifying an antagonist.

24. A purified agonist of the polypeptide of claim 14.

25. A purified antagonist of the polypeptide of claim 14.

26. A method of identifying a specific binding partner for a polypeptide according to claim 14, the method comprising:
contacting said polypeptide to a potential binding partner; and
determining if the potential binding partner binds to said polypeptide.

27. The method of claim 26, wherein said contacting is performed in vivo.

28. A purified binding partner of the polypeptide of claim 14.

29. A method for detecting a target nucleic acid in a sample, said target being a nucleic acid according to claim 1, the method comprising:
a) hybridizing the sample with a probe comprising at least 17 contiguous nucleotides of a sequence complementary to said target nucleic acid in said sample under high stringency hybridization conditions, and
b) detecting the presence or absence, and optionally the amount, of said binding.

30. A method of diagnosing a disease caused by mutation in MDZ3, MDZ4, MDZ7 or MDZ12, comprising:
detecting said mutation in a sample of nucleic acids that derives from a subject suspected to have said disease.

31. A method of diagnosing or monitoring a disease caused by altered expression of MDZ3, MDZ4, MDZ7 or MDZ12, comprising:
determining the level of expression of MDZ3, MDZ4, MDZ7, or MDZ12 in a sample of nucleic acids or proteins that derives from a subject suspected to have said disease,
alterations from a normal level of expression providing diagnostic and/or monitoring information.

32. A diagnostic composition comprising the nucleic acid of claim 1, said nucleic acid being detectably labeled.

33. The diagnostic composition of claim 32, wherein said composition is further suitable for in vivo administration.

34. A diagnostic composition comprising the polypeptide of claim 14, said polypeptide being detectably labeled.

35. The diagnostic composition of claim 34, wherein said composition is further suitable for in vivo administration.

36. A diagnostic composition comprising the antibody, or antigen-binding fragment or derivative thereof, of claim 19.

37. The diagnostic composition of claim 36, wherein said antibody or antigen-binding fragment or derivative thereof is detectably labeled.

38. The diagnostic composition of claim 37, wherein said composition is further suitable for in vivo administration.

39. A pharmaceutical composition comprising the nucleic acid of claim 1 and a pharmaceutically acceptable excipient.

40. A pharmaceutical composition comprising the polypeptide of claim 14 and a pharmaceutically acceptable excipient.

41. A pharmaceutical composition comprising the antibody or antigen-binding fragment or derivative thereof of claim 19 and a pharmaceutically acceptable excipient.

42. A pharmaceutical composition comprising the agonist of claim 24 and a pharmaceutically acceptable excipient.

43. A pharmaceutical composition comprising the antagonist of claim 25 and a pharmaceutically acceptable excipient.

44. A method for treating or preventing a disorder associated with decreased expression or activity of MDZ3, MDZ4, MDZ7, or MDZ12, the method comprising administering to a subject in need of such treatment an effective amount of the pharmaceutical composition of any of claims 39, 40 or 42.

45. A method for treating or preventing a disorder associated with increased expression or activity of MDZ3, MDZ4, MDZ7 or MDZ12, the method comprising administering to a subject in need of such treatment an effective amount of the pharmaceutical composition of claim 41 or 43.

46. A method of modulating the expression of a nucleic acid according to claim 1, the method comprising:
administering an effective amount of an agent which modulates the expression of a nucleic acid according to claim 1.

47. A method of modulating at least one activity of a polypeptide according to claim 14, the method comprising:
administering an effective amount of an agent which modulates at least one activity of a polypeptide according to claim 14.
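Two sequence relationships recur in the claims above: the "complement" of a recited sequence (claim 1) and a probe comprising "at least 17 contiguous nucleotides" of a recited sequence (claims 4 and 29). The following is a minimal sketch of how such relationships might be checked computationally. It is an illustration only, not a method described in this publication: it takes "complement" to mean the reverse complement of an unambiguous DNA string, which is an assumption, and the function names and the toy reference string are hypothetical.

```python
# Map each unambiguous DNA base to its Watson-Crick partner.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string (A, C, G, T only)."""
    return seq.translate(_COMPLEMENT)[::-1]


def longest_shared_run(a: str, b: str) -> int:
    """Length of the longest contiguous substring common to a and b."""
    best = 0
    prev = [0] * (len(b) + 1)
    for ch_a in a:
        cur = [0] * (len(b) + 1)
        for j, ch_b in enumerate(b, start=1):
            if ch_a == ch_b:
                # Extend the run that ended at the previous position of both.
                cur[j] = prev[j - 1] + 1
                best = max(best, cur[j])
        prev = cur
    return best


def has_contiguous_match(probe: str, reference: str, min_len: int = 17) -> bool:
    """True if probe, or its reverse complement, shares at least min_len
    contiguous nucleotides with reference."""
    probe, reference = probe.upper(), reference.upper()
    return (
        longest_shared_run(probe, reference) >= min_len
        or longest_shared_run(reverse_complement(probe), reference) >= min_len
    )


if __name__ == "__main__":
    reference = "ATGCTTAAAGAGCATCCAGAGATGGCGGAAGCTCCTCAG"  # hypothetical toy reference
    sense_probe = reference[5:25]                  # 20 nt taken from the reference
    antisense_probe = reverse_complement(sense_probe)
    print(has_contiguous_match(sense_probe, reference))
    print(has_contiguous_match(antisense_probe, reference))
    print(has_contiguous_match(reference[:16], reference))  # only 16 nt available
```

The quadratic dynamic-programming comparison is adequate for short probes; for references on the order of the 100 kb upper bound recited in the claims, an indexed approach such as hashing all 17-mers of the reference would likely be preferable.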