US 2014O134726A1 (19) United States (12) Patent Application Publication (10) Pub. No.: US 2014/0134726A1 D'Amour et al. (43) Pub. Date: May 15, 2014

(54) MARKERS OF DEFINITIVE ENDODERM (60) Provisional application No. 60/736,598, filed on Nov. 14, 2005, provisional application No. 60/532,004, (71) Applicant: ViaCyte, Inc., San Diego, CA (US) filed on Dec. 23, 2003, provisional application No. 60/586,566, filed on Jul. 9, 2004, provisional applica (72) Inventors: Kevin Allen D'Amour, San Diego, CA tion No. 60/587,942, filed on Jul 14, 2004. (US); Alan D. Agulnick, San Diego, CA (US); Emmanuel Edward Baetge, Encinitas, CA (US) Publication Classification (73) Assignee: ViaCyte, Inc., San Diego, CA (US) (51) Int. C. CI2N5/071 (2006.01) (21) Appl. No.: 14/066,441 (52) U.S. C. CPC ...... CI2N5/0617 (2013.01) (22) Filed: Oct. 29, 2013 USPC ...... 435/366; 435/325; 435/405

Related U.S. Application Data (57) ABSTRACT (63) Continuation of application No. 12/093.590, filed on Jul. 21, 2008, now Pat. No. 8,586,357, filed as appli Disclosed herein are reagent-cell complexes comprising one cation No. PCT/US06/43932 on Nov. 13, 2006, said or more definitive endoderm cells. Also described herein are application No. 12/093.590 is a continuation-in-part of compositions for detecting definitive endoderm. Method of application No. 11/021,618, filed on Dec. 23, 2004, enriching, isolating and/or purifying definitive endoderm now Pat. No. 7,510,876. cells are also disclosed. Patent Application Publication US 2014/0134726 A1

I"OIH

Patent Application Publication May 15, 2014 Sheet 2 of 38 US 2014/0134726 A1

«…:%%;vez,sze?wº…«******&3&:;?&&#******************************…«,»º«º»,«»,

Patent Application Publication May 15, 2014 Sheet 3 of 38 US 2014/0134726 A1

¿?

re-was-----ee-earaxe-rrrrraaaaaa-swararaasaase-scal'

****************??--- ×,

R saw xxxxx SS

*re-assassessar Patent Application Publication May 15, 2014 Sheet 4 of 38 US 2014/0134726 A1

Patent Application Publication May 15, 2014 Sheet 5 of 38 US 2014/0134726 A1

Patent Application Publication May 15, 2014 Sheet 6 of 38 US 2014/0134726 A1

Patent Application Publication May 15, 2014 Sheet 7 of 38 US 2014/0134726 A1

A Sex & sess& &xpressies B $$$$$$xx 3&š8ws &&is

sway awar. AX4 8 ------~~ rurururururus warrarwa-awa is -

's Patent Application Publication May 15, 2014 Sheet 8 of 38 US 2014/0134726 A1

xx C2 s site

Šs & e 38 8 & & S S. $8. & S. se 8 woxo s k S& & & &s 8 se see s & ass&six 888& Patent Application Publication May 15, 2014 Sheet 9 of 38 US 2014/0134726 A1

6."OIH Patent Application Publication May 15, 2014 Sheet 10 of 38 US 2014/0134726 A1

F.C.. O

Patent Application Publication May 15, 2014 Sheet 12 of 38 US 2014/0134726 A1

Patent Application Publication May 15, 2014 Sheet 13 of 38 US 2014/0134726 A1

r^YYYYYYYYYYYYrrara-a-a-a-aa-ra------rra-a-a-a-a-a------arrasaar aaaa-ra-warrara-na-na-marrawn---- **************************•••••••••••••••~~~~~~~~ SS - t 8 so R

h h

-…….……………………….……****************************~~~~~~~~~~

------rrawawr Warrara-a-rammyxxxxxaaaaaaaaaaaaaaa------x-xxx-aawm-wears

??~~~~~*****••••••?

tax. W Patent Application Publication May 15, 2014 Sheet 14 of 38 US 2014/0134726 A1

xis: അff332838433

& S. Ss &

|

FG. a. Patent Application Publication May 15, 2014 Sheet 15 of 38 US 2014/0134726 A1

Patent Application Publication May 15, 2014 Sheet 16 of 38 US 2014/0134726 A1

Patent Application Publication May 15, 2014 Sheet 17 of 38 US 2014/0134726 A1

Patent Application Publication May 15, 2014 Sheet 18 of 38 US 2014/0134726 A1

XXXXXXX-XXXXXa)

××××××

exxxsas WIWIWITITZ277||M|MIT

saaaaassaxxxx.cccssoccasses US 2014/0134726 A1 **********************************************

C5wns s Patent Application Publication May 15, 2014 Sheet 20 of 38 US 2014/0134726 A1

s

00‘OIH *****************--~~~~

******************************????????????????åååå,-,-,-,-,-,-,-,-,-,-,-,-

$8 ******************…*..*,,+.*? sy X&$88s busiege anaeia ); ***************************>)*)*)*)*)*)*)*)*)*)*)(---,-,-,-,-).********************????????????? ****************************************************************************************************************************)*)*)*)*)*)*)*)*+,·,· Patent Application Publication May 15, 2014 Sheet 21 of 38 US 2014/0134726 A1

?gyv?

IZ(OIH ******************************~~~~~~~~~~~~~~~~~~~~~~~~~~~.~~~~~~~~~~);

888 S$8,8x8 8888: 8888.8 Strixww.sssssssssswsrxxx-xx-www.xxxxxwww.xw.xxxxxxwww.yesws.xxxxx xxxxxYxxxxxx Patent Application Publication May 15, 2014 Sheet 22 of 38 US 2014/0134726 A1

a-Yaaaaaaaaaaaa. aaaaaaaaaa-Maxxxaasaaraaaaaaaanx

************•~~~

"OIHZZ

----·-

888 8 X838 & 8 & 888. ^^^^^^^***********************~~~~~,~~~~~~~------~--~~~~~……………………………?? aNasswaaaaX-rw-W-----a-e'-a'------a- x''Asakara'a-a-a-a-a-a-a- Patent Application Publication May 15, 2014 Sheet 23 of 38 US 2014/0134726 A1

: | c. s - $8 sawaxaaw: or soa &- & & uoissoluxetosses sixa vagelo8 seas Patent Application Publication May 15, 2014 Sheet 24 of 38 US 2014/0134726 A1

**************~~~~~~~~~,~~************** •?????•

~~~~~~~~~~~

s ------as-as-xx-aaraaaaaaaa-...-a, ------Yaaaa-MaxM

88

SS

&

8 . .

& 8 is

R arra-aaaaaaaaaaaaaaaaaaSSaraaaaaaaaaaaaaaaaaaaaa (B)

SS$8

V

R

towns Patent Application Publication May 15, 2014 Sheet 25 of 38 US 2014/0134726 A1

Patent Application Publication May 15, 2014 Sheet 26 of 38 US 2014/0134726 A1

w axe-rawa-N-awam asaasaxx-xx-aaaaaaaaaaaaa-Wa-Wentwaaaaaa~~~~~--~~~~); Patent Application Publication May 15, 2014 Sheet 27 of 38 US 2014/0134726 A1

~~~~*~~~~#~~~~*~*~~~~.~~~~.~~~~…………;

& servaxas-Naaaaaaasax ????? za~~===~~~~~~~~a~~ (~~~*~*****************************¿¿. #******************------******************************)*)*)*)*)*)*)*)*)*)*)*)*)*)(---) (C Patent Application Publication May 15, 2014 Sheet 28 of 38 US 2014/0134726 A1

***^^*^^******************************«*^^º««««…--~~~~!!!!!!………………………….…

------TZ·!

Patent Application Publication May 15, 2014 Sheet 29 of 38 US 2014/0134726 A1

SOXY SOX air Patent Application Publication May 15, 2014 Sheet 30 of 38 US 2014/0134726 A1

x s

Srxys's waxyw$Yisrs-s-s s 8 S.wr

s

s 8.

six it. gax SS s

***********gxºxºzºg: Patent Application Publication May 15, 2014 Sheet 31 of 38 US 2014/0134726A1

(B) ------M.------as------asaaaa •••••¿•)›***********************·:-

abamaxx-xxx

AxAXXXX-AM-xxxxxxxxxaaaaaaaaassrs.

$$$$$.

| {4{ {| |{ •}} *}~~~~#~~~~#~~~~*~~~~#~?---- w ##################

*~~~~~~~*~~~~*~~~~~~ axaaaaaaaaaaay's Patent Application Publication May 15, 2014 Sheet 32 of 38 US 2014/0134726A1

m is

FIG. 32 Patent Application Publication May 15, 2014 Sheet 33 of 38 US 2014/0134726A1

(A) strachyury

*~~~~'+'~~~~~~~~~~~~~~~~~);-)))-,

Www.a.s. (C) Patent Application Publication May 15, 2014 Sheet 34 of 38 US 2014/0134726 A1

S&S

$8:$rsssssssswasaraxxar S8------'''''''''''''',: "...... star------

3-SSSS&S

-s areassessoaxxxYYYYYa------axx-xx-s

F.C. 3. Patent Application Publication May 15, 2014 Sheet 35 of 38 US 2014/0134726 A1

Siksis''

SS: wartw8taxxxxww.

: ~~~#~~~~

888:

------

º?

FIG. 34

Patent Application Publication May 15, 2014 Sheet 36 of 38 US 2014/0134726 A1

8x8x

#*! |{| {? *

& Sarasara-earease-area

SS S rwarxWraxxx-xxxxYxxws 8 & ico S

&S %(*?

******************reeee!~~~~~~•••••••••••••••••••••

FIG. . Patent Application Publication May 15, 2014 Sheet 37 of 38 US 2014/0134726 A1

Patent Application Publication May 15, 2014 Sheet 38 of 38 US 2014/0134726 A1

x -

$3. sexistern

SSSSS------

(~~~.

8:S$ $&S

-->

*****j,w US 2014/0134726 A1 May 15, 2014

MARKERS OF DEFINITIVE ENDODERM the mature organism in addition to extraembryonic tissues (e.g. placenta) and germ cells. Although pluripotency imparts RELATED APPLICATIONS extraordinary utility upon hESCs, this property also poses 0001. This application is a nonprovisional application unique challenges for the study and manipulation of these which claims priority under 35 U.S.C. S 119(e) to U.S. Pro cells and their derivatives. Owing to the large variety of cell visional Patent Application No. 60/736,598, entitled MARK types that may arise in differentiating hESC cultures, the vast ERS OF DEFINITIVE ENDODERM, filed Nov. 14, 2005; majority of cell types are produced at very low efficiencies. this application is also a continuation-in-part of U.S. patent Additionally, success in evaluating production of any given application Ser. No. 1 1/021,618, entitled DEFINITIVE cell type depends critically on defining appropriate markers. ENDODERM, filed Dec. 23, 2004, which claims priority Achieving efficient, directed differentiation is of great impor under 35 U.S.C. S 119(e) as a nonprovisional application to tance for therapeutic application of hESCs. U.S. Provisional Patent Application No. 60/587,942, entitled 0006. In order to use hESCs as a starting material togen CHEMOKINE CELL SURFACE RECEPTOR FOR THE erate cells that are useful in cell therapy applications, it would ISOLATION OF DEFINITIVE ENDODERM, filed Jul 14, be advantageous to overcome the foregoing problems. For 2004; and U.S. Provisional Patent Application No. 60/586, example, it would be useful to identify and isolate cell types, 566, entitled CHEMOKINE CELL SURFACE RECEPTOR such as definitive endoderm, that can later differentiate into FOR THE ISOLATION OF DEFINITIVE ENDODERM, pancreatic islet/B-cells, as well as other useful cell types. filed Jul. 9, 2004. and U.S. Provisional Patent Application No. 60/532,004, entitled DEFINITIVE ENDODERM, filed Dec. SUMMARY OF THE INVENTION 23, 2003. The disclosure of each of the foregoing priority applications is incorporated herein by reference in its entirety. 0007 Embodiments of the present invention relate to methods for producing a cell population enriched in definitive FIELD OF THE INVENTION endoderm cells. The method can include the following steps: providing a cell population including definitive endoderm 0002 The present invention relates to the fields of medi cells with a reagent that binds to a marker selected from Table cine and cell biology. In particular, the present invention 3, and separating definitive endoderm cells bound to the relates to the identification and isolation of definitive endo reagent from cells that are not bound to the reagent, thereby derm cells. producing a cell population enriched in definitive endoderm cells. In some embodiments, the marker is selected from the BACKGROUND group consisting of AGPAT3, APOA2, C20orf56, C21orf129, 0003 Human pluripotent stem cells, such as embryonic CALCR, CCL2, CER1, CMKOR1, CRIP1, CXCR4, stem (ES) cells and embryonic germ (EG) cells, were first CXorf1, DIO3, DIO3OS, EB-1, EHHADH, ELOVL2, isolated in culture without fibroblastfeeders in 1994 (Bongso EPSTI1, FGF17, FLJ10970, FLJ21195, FLJ22471, et al., 1994) and with fibroblast feeders (Hogan, 1997). Later, FLJ23514, FOXA2, FOXQ1, GATA4, GPR37, GSC, Thomson, Reubinoff and Shamblott established continuous LOC283537, MYL7, NPPB, NTN4, PRSS2, RTN4RL1, cultures of human ES and EG cells using mitotically inacti SEMA3E, SIAT8D, SLC5A9, SLC40A1, SOX17, SPOCK3, vated mouse feeder layers (Reubinoffet al., 2000; Shamblott TMOD1, TRPA1, TTN, AW166727, AI821586, BF941609, et al., 1998; Thomson et al., 1998). AI916532, BC034407, N63706 and AW772192. In preferred 0004 Human ES and EG cells (hESCs) offer unique embodiments, the marker is selected from the group consist opportunities for investigating early stages of human devel ing of CALCR, CMKOR1, CXCR4, GPR37, NTN4, opment as well as for therapeutic intervention in several dis RTN4RL1, SEMA3E, SLC40A1 SLC5A9 and TRPA1. The ease states, such as diabetes mellitus and Parkinson's disease. reagent used to produce a cell population enriched in defini For example, the use of insulin-producing B-cells derived tive endoderm can be an antibody, such as a polyclonal or from hESCs would offer a vast improvement over current cell monoclonal antibody that binds to one of the markers therapy procedures that utilize cells from donor pancreases described herein. for the treatment of diabetes. However, presently it is not 0008. In certain embodiments, the method of producing a known how to generate an insulin-producing B-cell from cell population enriched for definitive endoderm cells can hESCs. As such, current cell therapy treatments for diabetes also include the following steps: obtaining a cell population mellitus, which utilize islet cells from donor pancreases, are comprising pluripotent human cells; providing the cell popu limited by the scarcity of high quality islet cells needed for lation with at least one growth factor of the TGFB Superfamily transplant. Cell therapy for a single Type I diabetic patient in an amount Sufficient to promote differentiation of said requires a transplant of approximately 8x10 pancreatic islet pluripotent cells to definitive endoderm cells; and allowing cells. (Shapiro et al., 2000; Shapiro et al., 2001a; Shapiro et sufficient time for definitive endoderm cells to form. In fur al., 2001b). As such, at least two healthy donor organs are ther embodiments, the at least one growth factor can be required to obtain sufficient islet cells for a successful trans selected from the following: Nodal, activin A, activin B and plant. Human embryonic stem cells offer a source of starting combinations thereof. In some embodiments, the at least one material from which to develop Substantial quantities of high growth factor can be provided to said cell population at a quality differentiated cells for human cell therapies. concentration ranging from about 1 ng/ml to about 1000 0005. Two properties that make hESCs uniquely suited to ng/ml. In further preferred embodiments, the at least one cell therapy applications are pluripotence and the ability to growth factor can be provided in a concentration of at least maintain these cells in culture for prolonged periods. Pluri about 100 ng/ml. In other preferred embodiments, the cell potency is defined by the ability of hESCs to differentiate to population can be grown in a medium including less than derivatives of all 3 primary germ layers (endoderm, meso about 10% serum. In yet other preferred embodiments, the derm, ectoderm) which, in turn, formall Somatic cell types of pluripotent human cells can be human embryonic stem cells. US 2014/0134726 A1 May 15, 2014

0009. Additional embodiments of the invention relate to CXorf1, DIO3, DIO3OS, EB-1, EHHADH, ELOVL2, other methods of producing a cell population enriched in EPSTI1, FGF17, FLJ10970, FLJ21195, FLJ22471, definitive endoderm cells, which can include the steps of: FLJ23514, FOXA2, FOXQ1, GATA4, GPR37, GSC, obtaining a population of pluripotent cells, wherein at least LOC283537, MYL7, NPPB, NTN4, PRSS2, RTN4RL1, one cell of the pluripotent cell population includes a nucleic SEMA3E, SIAT8D, SLC5A9, SLC40A1, SOX17, SPOCK3, acid encoding a fluorescent or a biologically active TMOD1, TRPA1, TTN, AW166727, AI821586, BF941609, fragment thereof, which has been operably linked to a pro AI916532, BC034407, N63706 and AW772192. In preferred moter that controls the expression of a marker selected from embodiments, the definitive endoderm cell expresses a cell Table 3: differentiating the pluripotent cells so as to produce Surface marker selected from the group consisting of definitive endoderm cells, wherein the definitive endoderm CALCR, CMKOR1, CXCR4, GPR37, NTN4, RTN4RL1, cells express said fluorescent protein; and separating the SEMA3E, SLC40A1 SLC5A9 and TRPA1. definitive endoderm cells from cells that do not substantially 0013 The reagent of the reagent-cell complex can be, for express said fluorescent protein. In preferred embodiments, example, an antibody. In preferred embodiments, the bound the marker is selected from the group consisting of AGPAT3, marker in the reagent-cell complex can be a cell-surface APOA2, C20orf56, C21orf129, CALCR, CCL2, CER1, marker selected from the group consisting of CALCR, CMKOR1, CRIP1, CXCR4, CXorf1, DIO3, DIO3OS, EB-1, CMKOR1, CXCR4, GPR37, NTN4, RTN4RL1, SEMA3E, EHHADH, ELOVL2, EPSTI1, FGF17, FLJ10970, SLC40A1 SLC5A9 and TRPA1. FLJ21195, FLJ22471, FLJ23514, FOXA2, FOXQ1, GATA4, 0014. In accordance with the embodiments described GPR37, GSC, LOC283537, MYL7, NPPB, NTN4, PRSS2, herein, aspects of the present invention relate to compositions RTN4RL1, SEMA3E, SIAT8D, SLC5A9, SLC40A1, of cell populations that include human cells wherein at least SOX17, SPOCK3, TMOD1, TRPA1, TTN, AW166727, about 10% of the cells are reagent-cell complexes, wherein AI821586, BF941609, AI916532, BC034407, N63706 and the complex can include a human definitive endoderm cell AW772192. In other embodiments, the marker is selected expressing a marker selected from Table 3, and wherein the from the group consisting of CALCR, CMKOR1, CXCR4, definitive endoderm cell is a multipotent cell that can differ GPR37, NTN4, RTN4RL1, SEMA3E, SLC40A1, SLC5A9 entiate into cells of the gut tube or organs derived therefrom. and TRPA1. Methods known in the art, such as fluorescence In such embodiments, the marker is expressed by the defini activated cell sorting (FACS), can be used to separate the tive endoderm cell, wherein the marker is selected from Table fluorescently-tagged definitive endoderm cells from cells that 3, and a reagent bound to the marker. In some embodiments, do not substantially express said fluorescent protein. the marker bound by the reagent in the reagent-cell complex 0010. In some embodiments, the step of differentiating the is selected from the group consisting of AGPAT3, APOA2, pluripotent cells can include providing the cell population C20orf56, C21orf129, CALCR, CCL2, CER1, CMKOR1, with at least one growth factor of the TGFB superfamily in an CRIP1, CXCR4, CXorf1, DIO3, DIO3OS, EB-1, EHHADH, amount sufficient to promote differentiation of said pluripo ELOVL2, EPSTI1, FGF17, FLJ10970, FLJ21195, tent cells to definitive endoderm cells, and allowing sufficient FLJ22471, FLJ23514, FOXA2, FOXQ1, GATA4, GPR37, time for definitive endoderm cells to form. For example, in GSC, LOC283537, MYL7, NPPB, NTN4, PRSS2, some embodiments, the cell population can be provided with RTN4RL1, SEMA3E, SIAT8D, SLC5A9, SLC40A1, at least one growth factor selected from the group consisting SOX17, SPOCK3, TMOD1, TRPA1, TTN, AW166727, of Nodal, activin A, activin B and combinations thereof. In AI821586, BF941609, AI916532, BC034407, N63706 and preferred embodiments, the at least one growth factor can be AW772192. In preferred embodiments, the marker bound by provided to said cell population at a concentration ranging the reagent is selected from the group consisting of CALCR, from about 1 ng/ml to about 1000 ng/ml. In other preferred CMKOR1, CXCR4, GPR37, NTN4, RTN4RL1, SEMA3E, embodiments, the at least one growth factor can be provided SLC40A1 SLC5A9 and TRPA1. In further embodiments, the in a concentration of at least about 100 ng/ml. In still other expression of one or more of the markers selected from Table preferred embodiments, the cell population can be grown in a 3 in definitive endoderm cells can be greater than expression medium including less than about 10% serum. In yet other ofa marker selected from the group consisting of OCT4, AFP. preferred embodiments, the pluripotent human cells can be Thrombomodulin (TM), SPARC and SOX7. human embryonic stem cells. 0015. In some embodiments, the cell population includes 0011. In some embodiments, methods, such as affinity human cells wherein at least about 20% of the cells are based separation or magnetic-based separation, are used to reagent-cell complexes as described herein. In other embodi enrich, isolate or Substantially purify preparations of defini ments, the cell population includes human cells wherein at tive endoderm cells which bind to the reagent. least about 40% of the cells are reagent-cell complexes as 0012. In accordance with the methods described herein, described herein. In yet other embodiments, the cell popula embodiments of the present invention relate to compositions tion includes human cells wherein at least about 60% of the useful for the enrichment and isolation of definitive endoderm cells are reagent-cell complexes as described herein. In yet cells. Some embodiments relate to an ex vivo reagent-cell other embodiments, the cell population includes human cells complex. The ex vivo complex can include a human definitive wherein at least about 80% of the cells are reagent-cell com endoderm cell expressing a marker selected from Table 3 and plexes as described herein. In yet other embodiments, the cell a reagent bound to the marker. The definitive endoderm cell of population includes human cells wherein at least about 95% the ex vivo reagent-cell complex is a multipotent cell that can of the cells are reagent-cell complexes as described herein. differentiate into cells of the guttube or organs derived there 0016. Additionally, embodiments of the present invention from. In certain embodiments, the marker expressed by the also relate to isolated antibodies that bind to markers that are human definitive endoderm cell is selected from the group differentially expressed in definitive endoderm cells. For consisting of AGPAT3, APOA2, C20orf56, C21orf129, example, some embodiments relate to any of the markers CALCR, CCL2, CER1, CMKOR1, CRIP1, CXCR4, shown in Table 3. In preferred embodiments, the antibody US 2014/0134726 A1 May 15, 2014

binds to a polypeptide encoded by AGPAT3, APOA2, EPSTI1, FGF17, FLJ10970, FLJ21195, FLJ22471, C20orf56, C21orf129, CALCR, CCL2, CER1, CMKOR1, FLJ23514, FOXA2, FOXQ1, GATA4, GPR37, GSC, CRIP1, CXCR4, CXorf1, DIO3, DIO3OS, EB-1, EHHADH, LOC283537, MYL7, NPPB, NTN4, PRSS2, RTN4RL1, ELOVL2, EPSTI1, FGF17, FLJ10970, FLJ21195, SEMA3E, SIAT8D, SLC5A9, SLC40A1, SOX17, SPOCK3, FLJ22471, FLJ23514, FOXA2, FOXQ1, GATA4, GPR37, TMOD1, TRPA1, TTN, AW166727, AI821586, BF941609, GSC, LOC283537, MYL7, NPPB, NTN4, PRSS2, AI916532, BC034407, N63706 and AW772192. In other pre RTN4RL1, SEMA3E, SIAT8D, SLC5A9, SLC40A1, ferred embodiments, the method can also include detection of SOX17, SPOCK3, TMOD1, TRPA1, TTN, AW166727, expression of at least one marker selected from the group AI821586, BF941609, AI916532, BC034407, N63706 and consisting of OCT4, AFP, TM, SPARC and SOX7 in cells of AW772192. In other preferred embodiments, the antibody the cell population. In some embodiments, the expression of binds to a marker that is expressed on the cell Surface. Such as the at least one marker selected from Table 3 can be greater for example, CALCR, CMKOR1, CXCR4, GPR37, NTN4, than expression of the at least one marker selected from the RTN4RL1, SEMA3E, SLC40A1 SLC5A9 and TRPA1. group consisting of OCT4, AFP, TM, SPARC and SOX7 in 0017 Antibodies described herein can be either poly clonal or monoclonal. Further, in certain embodiments, the the definitive endoderm cells. In further embodiments, the antibodies can be single chain, and/or modified, for example method can also include detecting expression of at least one to incorporate a label. marker selected from the group consisting of SOX17 and 0018. Other embodiments of the present invention relate CXCR4, wherein expression of the at least one marker to compositions for the identification of definitive endoderm. selected from Table 3 and expression of the at least one The composition can include a first oligonucleotide that marker selected from the group consisting of SOX17 and hybridizes to a first marker, wherein the first marker is CXCR4 is greater than expression of the at least one marker selected from Table 3; and a second oligonucleotide that selected from the group consisting of OCT4, AFP, TM, hybridizes to a second marker, wherein the second marker is SPARC and SOX7 in the definitive endoderm cells. In some selected from Table 3, and wherein the second marker is embodiments, the expression of the at least one marker is different from the first marker. In preferred embodiments, the determined by quantitative polymerase chain reaction first marker and the second marker can be selected from the (Q-PCR). In other embodiments, the expression of the at least group consisting of AGPAT3, APOA2, C20orf56, C21orf129, one marker is determined by immunocytochemistry. CALCR, CCL2, CER1, CMKOR1, CRIP1, CXCR4, 0020. Other embodiments relate to methods of identifying CXorf1, DIO3, DIO3OS, EB-1, EHHADH, ELOVL2, a differentiation factor capable of promoting the differentia EPSTI1, FGF17, FLJ10970, FLJ21195, FLJ22471, tion of human definitive endoderm cells in a cell population FLJ23514, FOXA2, FOXO1, GATA4, GPR37, GSC, including human cells. Methods can include the steps of LOC283537, MYL7, NPPB, NTN4, PRSS2, RTN4RL1, obtaining a cell population including human definitive endo SEMA3E, SIAT8D, SLC5A9, SLC40A1, SOX17, SPOCK3, derm cells, providing a candidate differentiation factor to the TMOD1, TRPA1, TTN, AW 166727, AI821586, BF941609, cell population, determining expression of a marker selected AI916532, BC034407, N63706, AW772192, OCT4, AFP, from Table 3 in the cell population at a first time point, Thrombomodulin (TM), SPARC and SOX7. In other pre determining expression of the same marker in the cell popu ferred embodiments, the first marker and the second marker lation at a second time point, wherein the second time point is can be selected from the group consisting of AGPAT3, Subsequent to the first time point and wherein the second time APOA2, C20orf56, C21orf129, CALCR, CCL2, CER1, point is Subsequent to providing said cell population with said CMKOR1, CRIP1, CXCR4, CXorf1, DIO3, DIO3OS, EB-1, candidate differentiation factor and determining if expression EHHADH, ELOVL2, EPSTI1, FGF17, FLJ10970, of the marker in the cell population at the second time point is FLJ21195, FLJ22471, FLJ23514, FOXA2, FOXQ1, GATA4, increased or decreased as compared to expression of the GPR37, GSC, LOC283537, MYL7, NPPB, NTN4, PRSS2, marker in the cell population at the first time point, wherein an RTN4RL1, SEMA3E, SIAT8D, SLC5A9, SLC40A1, increase or decrease in expression of said marker in the cell SOX17, SPOCK3, TMOD1, TRPA1, TTN, AW166727, population indicate can that said candidate differentiation AI821586, BF941609, AI916532, BC034407, N63706 and factor is capable of promoting the differentiation of said AW772192. In still other preferred embodiments, the first human definitive endoderm cells. In preferred embodiments, marker and second marker are selected from the group con the marker is selected from the group consisting of AGPAT3, sisting of CALCR, CMKOR1, CXCR4, GPR37, NTN4, APOA2, C20orf56, C21orf129, CALCR, CCL2, CER1, RTN4RL1, SEMA3E, SLC40A1 SLC5A9 and TRPA1. The CMKOR1, CRIP1, CXCR4, CXorf1, DIO3, DIO3OS, EB-1, compositions above can also include a polymerase, such as an EHHADH, ELOVL2, EPSTI1, FGF17, FLJ10970, RNA-dependent DNA polymerase and/or a DNA-dependent FLJ21195, FLJ22471, FLJ23514, FOXA2, FOXQ1, GATA4, DNA polymerase. In certain embodiments, at least one oli GPR37, GSC, LOC283537, MYL7, NPPB, NTN4, PRSS2, gonucleotide includes a label. In some embodiments, the RTN4RL1, SEMA3E, SIAT8D, SLC5A9, SLC40A1, marker can be a nucleic acid, for example DNA or RNA. SOX17, SPOCK3, TMOD1, TRPA1, TTN, AW166727, 0019. Other embodiments relate to methods of detecting AI821586, BF941609, AI916532, BC034407, N63706 and definitive endoderm cells that can include the steps of detect AW772192. In other embodiments, the human definitive ing the presence of definitive endoderm cells in a cell popu endoderm cells include at least about 10% of the human cells lation by detecting expression of at least one marker selected in the cell population. In preferred embodiments, the human from Table 3 in cells of the cell population. In preferred definitive endoderm cells include about 90% of the human embodiments, the at least one marker can be selected from the cells in said cell population. In yet other embodiments, the group consisting of AGPAT3, APOA2, C20orf56, C21orf129, human definitive endoderm cells can differentiate into cells, CALCR, CCL2, CER1, CMKOR1, CRIP1, CXCR4, tissues or organs derived from the guttube in response to the CXorf1, DIO3, DIO3OS, EB-1, EHHADH, ELOVL2, candidate differentiation factor. US 2014/0134726 A1 May 15, 2014

0021. In embodiments described herein, the first time language which permits the inclusion of any additional ele point can be prior to providing the candidate differentiation ments. With this in mind, additional embodiments of the factor to the cell population, whereas in other embodiments, present inventions are described with reference to the num the first time point can be at approximately the same time as bered paragraphs below: providing the candidate differentiation factor to the cell popu 0027 1. An isolated antibody that binds to a marker lation. In still other embodiments, the first time point can be selected from Table 3. Subsequent to providing the candidate differentiation factor to 0028 2. The antibody of paragraph 1, wherein said marker the cell population. In some embodiments, expression of the is selected from the group consisting of AGPAT3, APOA2, marker can be determined by quantitative polymerase chain C20orf56, C21orf129, CALCR, CCL2, CER1, CMKOR1, reaction (Q-PCR), whereas in other embodiments, expression CRIP1, CXCR4, CXorf1, DIO3, DIO3OS, EB-1, EHHADH, of the marker can be determined by immunocytochemistry. ELOVL2, EPSTI1, FGF17, FLJ10970, FLJ21195, 0022 Candidate differentiation factors can include known FLJ22471, FLJ23514, FOXA2, FOXQ1, GATA4, GPR37, and unknown factors, and may be Small molecules, polypep GSC, LOC283537, MYL7, NPPB, NTN4, PRSS2, tides, or the like. In certain embodiments, the candidate dif RTN4RL1, SEMA3E, SIAT8D, SLC5A9, SLC40A1, ferentiation factor can include a growth factor. In some SOX17, SPOCK3, TMOD1, TRPA1, TTN, AW166727, embodiments, the candidate differentiation factor can be pro AI821586, BF941609, AI916532, BC034407, N63706 and vided to the cell population at a concentration of between AW772192. about 1 ng/ml to about 1 mg/ml. In preferred embodiments, 0029. 3. The antibody of paragraph 2, wherein said marker the candidate differentiation factor can be provided to the cell comprises a polypeptide selected from the group consisting population at a concentration of about 100 ng/ml of CALCR, CMKOR1, CXCR4, GPR37, NTN4, RTN4RL1, 0023. Some embodiments of the present invention relate SEMA3E, SLC40A1 SLC5A9 and TRPA1. to cell cultures comprising definitive endoderm cells, wherein 0030. 4. The antibody of paragraph 1, wherein said anti the definitive endoderm cells are multipotent cells that can body is polyclonal. differentiate into cells of the gut tube or organs derived from 0031 5. The antibody of paragraph 1, wherein said anti the gut tube. In accordance with certain embodiments, the body is monoclonal. definitive endoderm cells are mammalian cells, and in a pre 0032 6. The antibody of paragraph 1, wherein said anti ferred embodiment, the definitive endoderm cells are human body is a single-chain antibody. cells. In some embodiments of the present invention, defini tive endoderm cells express or fail to significantly express 0033 7. The antibody of paragraph 1, wherein said anti certain markers. In some embodiments, one or more markers body is labeled. selected from Table 3 are expressed in definitive endoderm 0034 8. An ex vivo reagent-cell complex comprising a cells. In other embodiments, one or more markers selected human definitive endoderm cell expressing a marker selected from the group consisting of OCT4, alpha-fetoprotein (AFP). from Table 3, said definitive endoderm cell being a multipo Thrombomodulin (TM), SPARC and SOX7 are not signifi tent cell that can differentiate into cells of the gut tube or cantly expressed in definitive endoderm cells. organs derived therefrom, and a reagent bound to said marker. 0024. In accordance with other embodiments of the 0035 9. The reagent-cell complex of paragraph 8, wherein present invention, methods of producing definitive endoderm said marker is selected from the group consisting of AGPAT3, from pluripotent cells are described. In some embodiments, APOA2, C20orf56, C21orf129, CALCR, CCL2, CER1, pluripotent cells are derived from a morula. In some embodi CMKOR1, CRIP1, CXCR4, CXorf1, DIO3, DIO3OS, EB-1, ments, pluripotent cells are stem cells. Stem cells used in EHHADH, ELOVL2, EPSTI1, FGF17, FLJ10970, these methods can include, but are not limited to, embryonic FLJ21195, FLJ22471, FLJ23514, FOXA2, FOXQ1, GATA4, stem cells. Embryonic stem cells can be derived from the GPR37, GSC, LOC283537, MYL7, NPPB, NTN4, PRSS2, embryonic inner cell mass or from the embryonic gonadal RTN4RL1, SEMA3E, SIAT8D, SLC5A9, SLC40A1, ridges. Embryonic stem cells can originate from a variety of SOX17, SPOCK3, TMOD1, TRPA1, TTN, AW166727, animal species including, but not limited to, various mamma AI821586, BF941609, AI916532, BC034407, N63706 and lian species including humans. In a preferred embodiment, AW772192. human embryonic stem cells are used to produce definitive 0036) 10. The reagent-cell complex of paragraph 9, endoderm. In certain embodiments the hESCs are maintained wherein said marker comprises a polypeptide selected from on a feeder layer. In such embodiments, the feeder layer cells the group consisting of CALCR, CMKOR1, CXCR4, can be cells, such as fibroblasts, which are obtained from GPR37, NTN4, RTN4RL1, SEMA3E, SLC40A1 SLC5A9 humans, mice or any other Suitable organism. and TRPA1. 0025. In some embodiments of the present invention, the 0037 11. The reagent-cell complex of paragraph 10, compositions comprising definitive endoderm cells and wherein said reagent comprises an antibody. hESCs also includes one or more growth factors. Such growth 0038 12. The reagent-cell complex of paragraph 10, factors can include growth factors from the TGFB superfam wherein said reagent comprises a ligand for a receptor. ily. In Such embodiments, the one or more growth factors 0039 13. A cell population comprising human cells comprise the Nodal/Activin and/or the BMP subgroups of the wherein at least about 10% of said cells form reagent-cell TGFB superfamily of growth factors. In some embodiments, complexes according to paragraph 8. the one or more growth factors are selected from the group 0040 14. A cell population comprising human cells consisting of Nodal, Activin A, Activin B, BMP4, Wnt3a or wherein at least about 20% of said cells form reagent-cell combinations of any of these growth factors complexes according to paragraph 8. 0026. In certain jurisdictions, there may not be any gener 0041) 15. A cell population comprising human cells ally accepted definition of the term “comprising.” As used wherein at least about 40% of said cells form reagent-cell herein, the term “comprising is intended to represent “open’ complexes according to paragraph 8. US 2014/0134726 A1 May 15, 2014

0042. 16. A cell population comprising human cells TMOD1, TRPA1, TTN, AW166727, AI821586, BF941609, wherein at least about 60% of said cells form reagent-cell AI916532, BC034407, N63706 and AW772192. complexes according to paragraph 8. 0053 27. The composition of paragraph 24, wherein said 0043. 17. A cell population comprising human cells marker is a nucleic acid. wherein at least about 80% of said cells form reagent-cell 0054, 28. The composition of paragraph 27, wherein said complexes according to paragraph 8. nucleic acid is DNA. 0044) 18. A cell population comprising human cells 0055 29. The composition of paragraph 27, wherein said wherein at least about 95% of said cells form reagent-cell nucleic acid is RNA. complexes according to paragraph 8. 0056 30. The composition of paragraph 24, wherein at 0045. 19. The cell population of any one of paragraphs least one oligonucleotide comprises a label. 13-18, wherein said marker is selected from the group con sisting of AGPAT3, APOA2, C20orf56, C21orf129, CALCR, 0057. 31. The composition of paragraph 24 further com CCL2, CER1, CMKOR1, CRIP1, CXCR4, CXorf1, DIO3, prising a polymerase. DIO3OS, EB-1, EHHADH, ELOVL2, EPSTI1, FGF17, 0.058 32. The composition of paragraph 31, wherein the FLJ10970, FLJ21195, FLJ22471, FLJ23514, FOXA2, polymerase comprises a DNA-dependent DNA polymerase. FOXO1, GATA4, GPR37, GSC, LOC283537, MYL7, 0059) 33. The composition of paragraph 31, wherein the NPPB, NTN4, PRSS2, RTN4RL1, SEMA3E, SIAT8D, polymerase comprises an RNA-dependent DNA polymerase. SLC5A9, SLC40A1, SOX17, SPOCK3, TMOD1, TRPA1, 0060 34. The method of paragraph 33, wherein the RNA TTN, AW166727, AI821586, BF941609, AI916532, dependent DNA polymerase comprises Moloney Murine BC034407, N63706 and AW772192. Leukemia Virus reverse transcriptase or Avian Myeloblasto 0046. 20. The cell population of paragraph 19, wherein sis Virus (AMV) reverse transcriptase. expression of said marker is greater than expression of a 0061 35. A method for producing a cell population marker selected from the group consisting of OCT4, AFP. enriched in definitive endoderm cells, said method compris Thrombomodulin (TM), SPARC and SOX7 in definitive ing the steps of providing a cell population comprising defini endoderm cells. tive endoderm cells with a reagent that binds to a marker 0047. 21. The cell population of paragraph 19, wherein selected from Table 3, and separating definitive endoderm said marker comprises a polypeptide selected from the group cells bound to said reagent from cells that are not bound to consisting of CALCR, CMKOR1, CXCR4, GPR37, NTN4, said reagent, thereby producing a cell population enriched in RTN4RL1, SEMA3E, SLC40A1 SLC5A9 and TRPA1. definitive endoderm cells. 004.8 22. The cell population of paragraph 20, wherein 0062 36. The method of paragraph 35, wherein said said reagent comprises an antibody. marker is selected from the group consisting of AGPAT3, 0049 23. The cell population of paragraph 20, wherein APOA2, C20orf56, C21orf129, CALCR, CCL2, CER1, said reagent comprises a ligand for a receptor. CMKOR1, CRIP1, CXCR4, CXorf1, DIO3, DIO3OS, EB-1, 0050. 24. A composition for the identification of definitive EHHADH, ELOVL2, EPSTI1, FGF17, FLJ10970, endoderm, said composition comprising a first oligonucle FLJ21195, FLJ22471, FLJ23514, FOXA2, FOXQ1, GATA4, otide that hybridizes to a first marker, wherein said first GPR37, GSC, LOC283537, MYL7, NPPB, NTN4, PRSS2, marker is selected from the group consisting of Table 3. RTN4RL1, SEMA3E, SIAT8D, SLC5A9, SLC40A1, OCT4, AFP. Thrombomodulin (TM), SPARC and SOX7, and SOX17, SPOCK3, TMOD1, TRPA1, TTN, AW166727, a second oligonucleotide that hybridizes to a second marker, AI821586, BF941609, AI916532, BC034407, N63706 and wherein said second marker is selected from the group con AW772192. sisting of Table 3, OCT4, AFP. Thrombomodulin (TM), 0063. 37. The method of paragraph 36, wherein the marker SPARC and SOX7, and wherein said second marker is dif is selected from the group consisting of CALCR, CMKOR1, ferent from said first marker. CXCR4, GPR37, NTN4, RTN4RL1, SEMA3E, SLC40A1, 0051 25. The composition of paragraph 24, wherein said SLC5A9 and TRPA1. first marker and said second marker are selected from the 0064. 38. The method of paragraph 37, wherein said group consisting of AGPAT3, APOA2, C20orf56, C21orf129, reagent comprises an antibody or a ligand for a receptor. CALCR, CCL2, CER1, CMKOR1, CRIP1, CXCR4, 0065. 39. The method of paragraph 35 further comprising CXorf1, DIO3, DIO3OS, EB-1, EHHADH, ELOVL2, the steps of obtaining a cell population comprising pluripo EPSTI1, FGF17, FLJ10970, FLJ21195, FLJ22471, tent human cells, providing said cell population with at least FLJ23514, FOXA2, FOXO1, GATA4, GPR37, GSC, one growth factor of the TGFB Superfamily in an amount LOC283537, MYL7, NPPB, NTN4, PRSS2, RTN4RL1, sufficient to promote differentiation of said pluripotent cells SEMA3E, SIAT8D, SLC5A9, SLC40A1, SOX17, SPOCK3, to definitive endoderm cells, and allowing sufficient time for TMOD1, TRPA1, TTN, AW 166727, AI821586, BF941609, definitive endoderm cells to form. AI916532, BC034407, N63706, AW772192, OCT4, AFP, Thrombomodulin (TM), SPARC and SOX7. 0.066 40. The method of paragraph 39, wherein said at 0052 26. The composition of paragraph 25, wherein said least one growth factor is selected from the group consisting first marker and said second marker are selected from the of Nodal, activin A, activin B and combinations thereof. group consisting of AGPAT3, APOA2, C20orf56, C21orf129, 0067 41. The method of paragraph 40, wherein said at CALCR, CCL2, CER1, CMKOR1, CRIP1, CXCR4, least one growth factor is provided to said cell population at a CXorf1, DIO3, DIO3OS, EB-1, EHHADH, ELOVL2, concentration ranging from about 1 ng/ml to about 1000 EPSTI1, FGF17, FLJ10970, FLJ21195, FLJ22471, ng/ml. FLJ23514, FOXA2, FOXO1, GATA4, GPR37, GSC, 0068 42. The method of paragraph 41, wherein said at LOC283537, MYL7, NPPB, NTN4, PRSS2, RTN4RL1, least one growth factor is provided in a concentration of at SEMA3E, SIAT8D, SLC5A9, SLC40A1, SOX17, SPOCK3, least about 100 ng/ml US 2014/0134726 A1 May 15, 2014

0069 43. The method of paragraph 39, wherein said cell FLJ10970, FLJ21195, FLJ22471, FLJ23514, FOXA2, population is grown in a medium comprising less than about FOXO1, GATA4, GPR37, GSC, LOC283537, MYL7, 10% serum. NPPB, NTN4, PRSS2, RTN4RL1, SEMA3E, SIAT8D, 0070 44. The method of paragraph 39, wherein said pluri SLC5A9, SLC40A1, SOX17, SPOCK3, TMOD1, TRPA1, potent human cells are human embryonic stem cells. TTN, AW166727, AI821586, BF941609, AI916532, 0071 45. A method of producing a cell population BC034407, N63706 and AW772192. enriched in definitive endoderm cells, said method compris I0082 56. The method of paragraph 54 further comprising ing the steps of obtaining a population of pluripotent cells, detecting expression of at least one marker selected from the wherein at least one cell of said pluripotent cell population group consisting of OCT4, AFP, TM, SPARC and SOX7 in comprises a nucleic acid encoding a fluorescent protein or a cells of said cell population, wherein the expression of said at biologically active fragment thereof, which has been oper least one marker selected from Table 3 is greater than the ably linked to a promoter that controls the expression of a expression of said at least one marker selected from the group marker selected from Table 3, differentiating said pluripotent consisting of OCT4, AFT, TM, SPARC and SOX7 in said cells so as to produce definitive endoderm cells, wherein said definitive endoderm cells. definitive endoderm cells express said fluorescent protein, I0083 57. The method of Paragraph 54, further comprising and separating said definitive endoderm cells from cells that detecting expression of at least one marker selected from the do not substantially express said fluorescent protein. group consisting of SOX17 and CXCR4, and at least one 0072 46. The method of paragraph 45, wherein said marker from Table 3 other than SOX17 and CXCR4, wherein marker is selected from the group consisting of AGPAT3, expression of said at least one marker selected from Table 3 APOA2, C20orf56, C21orf129, CALCR, CCL2, CER1, other than SOX17 and CXCR4 and expression of said at least CMKOR1, CRIP1, CXCR4, CXorf1, DIO3, DIO3OS, EB-1, one marker selected from the group consisting of SOX17 and EHHADH, ELOVL2, EPSTI1, FGF17, FLJ10970, CXCR4 is greater than the expression of said at least one FLJ21195, FLJ22471, FLJ23514, FOXA2, FOXQ1, GATA4, marker selected from the group consisting of OCT4, AFP. GPR37, GSC, LOC283537, MYL7, NPPB, NTN4, PRSS2, TM, SPARC and SOX7 in said definitive endoderm cells. RTN4RL1, SEMA3E, SIAT8D, SLC5A9, SLC40A1, I008.4 58. The method of paragraph 54, wherein expres SOX17, SPOCK3, TMOD1, TRPA1, TTN, AW166727, sion of said at least one marker is determined by quantitative AI821586, BF941609, AI916532, BC034407, N63706 and polymerase chain reaction (Q-PCR). AW772192. I0085 59. The method of paragraph 54, wherein expres 0073 47. The method of paragraph 45, wherein fluores sion of said at least one marker is determined by immunocy cence activated cell sorting (FACS) is used to separate said tochemistry. definitive endoderm cells from said cells that do not substan I0086) 60. A method of identifying a differentiation factor tially express said fluorescent protein. capable of promoting the differentiation of human definitive 0074 48. The method of paragraph 45, wherein said step endoderm cells in a cell population comprising human cells, of differentiating said pluripotent cells further comprises pro said method comprising the steps of obtaining a cell popula viding said cell population with at least one growth factor of tion comprising human definitive endoderm cells, providing a the TGFB superfamily in an amount sufficient to promote candidate differentiation factor to said cell population, deter differentiation of said pluripotent cells to definitive endoderm mining expression of a marker selected from Table 3 in said cells, and allowing sufficient time for definitive endoderm cell populationata first time point, determining expression of cells to form. the same marker in said cell populationata second time point, 0075 49. The method of paragraph 48, wherein said at wherein said second time point is Subsequent to said first time least one growth factor is selected from the group consisting point and wherein said second time point is Subsequent to of Nodal, activin A, activin B and combinations thereof. providing said cell population with said candidate differen 0076 50. The method of paragraph 49, wherein said at tiation factor, and determining if expression of the marker in least one growth factor is provided to said cell population at a said cell population at said second time point is increased or concentration ranging from about 1 ng/ml to about 1000 decreased as compared to expression of the marker in said cell ng/ml. population at said first time point, wherein an increase or 0077 51. The method of paragraph. 50, wherein said at decrease in expression of said marker in said cell population least one growth factor is provided in a concentration of at indicates that said candidate differentiation factor is capable least about 100 ng/ml of promoting the differentiation of said human definitive 0078) 52. The method of paragraph 45, wherein said cell endoderm cells. population is grown in a medium comprising less than about I0087 61. The method of paragraph 60, wherein said 10% serum. marker is selected from the group consisting of AGPAT3, 0079 53. The method of paragraph 45, wherein said pluri APOA2, C20orf56, C21orf129, CALCR, CCL2, CER1, potent human cells are human embryonic stem cells. CMKOR1, CRIP1, CXCR4, CXorf1, DIO3, DIO3OS, EB-1, 0080 54. A method of detecting definitive endoderm cells, EHHADH, ELOVL2, EPSTI1, FGF17, FLJ10970, said method comprising detecting the presence of definitive FLJ21195, FLJ22471, FLJ23514, FOXA2, FOXQ1, GATA4, endoderm cells in a cell population by detecting expression of GPR37, GSC, LOC283537, MYL7, NPPB, NTN4, PRSS2, at least one marker selected from Table 3 in cells of said cell RTN4RL1, SEMA3E, SIAT8D, SLC5A9, SLC40A1, population. SOX17, SPOCK3, TMOD1, TRPA1, TTN, AW166727, 0081 55. The method of paragraph 54, wherein said at AI821586, BF941609, AI916532, BC034407, N63706 and least one marker is selected from a group consisting of AW772192. AGPAT3, APOA2, C20orf56, C21orf129, CALCR, CCL2, I0088 62. The method of paragraph 60, wherein said CER1, CMKOR1, CRIP1, CXCR4, CXorf1, DIO3, human definitive endoderm cells comprise at least about 10% DIO3OS, EB-1, EHHADH, ELOVL2, EPSTI1, FGF17, of the human cells in said cell population. US 2014/0134726 A1 May 15, 2014

0089 63. The method of paragraph 60, wherein said endoderm, or islet/beta-cell. Some factors useful for mediat human definitive endoderm cells comprise at least about 90% ing this transition are members of the TGFB family which of the human cells in said cell population. include, but are not limited to, activins, nodals and BMPs. 0090) 64. The method of paragraph 60, wherein said Exemplary markers for defining the definitive endoderm tar human definitive endoderm cells differentiate into cells, tis get cell are SOX17, GATA4, HNF3b, MIXL1 and CXCR4. Sues or organs derived from the gut tube in response to said 0104 FIG. 2 is a diagram of the human SOX17 cDNA candidate differentiation factor. which displays the positions of conserved motifs and high 0091 65. The method of paragraph 60, wherein said first lights the region used for the immunization procedure by time point is prior to providing said candidate differentiation GENOVAC. factor to said cell population. 0105 FIG. 3 is a relational dendrogram illustrating that 0092 66. The method of paragraph 60, wherein said first SOX17 is most closely related to SOX7 and somewhat less to time point is at approximately the same time as providing said SOX18. The SOX17 are more closely related among candidate differentiation factor to said cell population. species homologs than to other members of the SOXgroup F 0093 67. The method of paragraph 60, wherein said first Subfamily within the same species. time point is Subsequent to providing said candidate differ 0106 FIG. 4 is a Western blot probed with the rat anti entiation factor to said cell population. SOX17 antibody. This blot demonstrates the specificity of 0094 68. The method of paragraph 60, wherein expres this antibody for human SOX17 protein over-expressed in sion of said marker is determined by quantitative polymerase fibroblasts (lane 1) and a lack of immunoreactivity with chain reaction (Q-PCR). 0095 69. The method of paragraph 60, wherein expres EGFP (lane 2) or the most closely related SOX family mem sion of said marker is determined by immunocytochemistry. ber, SOX7 (lane 3). 0096 70. The method of paragraph 60, wherein said can 0107 FIGS. 5A-B are micrographs showing a cluster of didate differentiation factor comprises a small molecule. SOX17+ cells that display a significant number of AFP+ 0097. 71. The method of paragraph 60, wherein said can co-labeled cells (A). This is in striking contrast to other didate differentiation factor comprises a polypeptide. SOX17+ clusters (B) where little or no AFP+ cells are 0098 72. The method of paragraph 60, wherein said can observed. didate differentiation factor comprises a growth factor. 0.108 FIGS. 6A-C are micrographs showing parietal 0099 73. The method of paragraph 60, wherein said can endoderm and SOX17. Panel A shows immunocytochemistry didate differentiation factor is provided to said cell population for human Thrombomodulin (TM) protein located on the cell at a concentration of between about 1 ng/ml to about 1 mg/ml. surface of parietal endoderm cells in randomly differentiated 0100 74. The method of paragraph 73, wherein said can cultures of hES cells. Panel B is the identical field shown in A didate differentiation factor is provided to said cell population double-labeled for TM and SOX17. Panel C is the phase at a concentration of about 100 ng/ml. contrast image of the same field with DAPI labeled nuclei. 0101. It will be appreciated that the methods and compo Note the complete correlation of DAPI labeled nuclei and sitions described above relate to cells cultured in vitro. How SOX17 labeling. ever, the above-described in vitro differentiated cell compo 0109 FIGS. 7A-B are bar charts showing SOX17 sitions may be used for in vivo applications. expression by quantitative PCR (Q-PCR) and anti-SOX17 0102 Additional embodiments of the present invention positive cells by SOX17-specific antibody. Panel A shows may also be found in U.S. Provisional Patent Application No. that activin A increases SOX17 gene expression while ret 60/532,004, entitled DEFINITIVE ENDODERM, filed Dec. inoic acid (RA) strongly suppresses SOX17 expression rela 23, 2003: U.S. Provisional Patent Application No. 60/566, tive to the undifferentiated control media (SR20). Panel B 293, entitled PDX 1 EXPRESSING ENDODERM, filed Apr. shows the identical pattern as well as a similar magnitude of 27, 2004; U.S. Provisional Patent Application No. 60/586, these changes is reflected in SOX17 cell number, indicating 566, entitled CHEMOKINE CELL SURFACE RECEPTOR that Q-PCR measurement of SOX17 gene expression is very FOR THE ISOLATION OF DEFINITIVE ENDODERM, reflective of changes at the single cell level. filed Jul. 9, 2004; U.S. Provisional Patent Application No. 0110 FIG. 8A is a bar chart which shows that a culture of 60/587,942, entitled CHEMOKINE CELL SURFACE differentiating hESCs in the presence of activin A maintains RECEPTOR FOR THE ISOLATION OF DEFINITIVE a low level of AFP gene expression while cells allowed to ENDODERM, filed Jul. 14, 2004; U.S. patent application randomly differentiate in 10% fetal bovine serum (FBS) Ser. No. 11/021,618, entitled DEFINITIVE ENDODERM, exhibit a strong upregulation of AFP. The difference in filed Dec. 23, 2004, and U.S. patent application Ser. No. expression levels is approximately 7-fold. 11/115,868, entitled PDX 1 EXPRESSING ENDODERM, 0111 FIGS. 8B-C are images of two micrographs show filed Apr. 26, 2005, U.S. patent application Ser. No. 1 1/165, ing that the Suppression of AFP expression by activin A is also 305, entitled METHODS FOR IDENTIFYING FACTORS evident at the single cell level as indicated by the very rare and FOR DIFFERENTIATING DEFINITIVE ENDODERM, small clusters of AFP cells observed in activin A treatment filed Jun. 23, 2005, the disclosures of which are incorporated conditions (bottom) relative to 10% FBS alone (top). herein by reference in their entireties. 0112 FIGS. 9A-B are comparative images showing the quantitation of the AFP cell number using flow cytometry. BRIEF DESCRIPTION OF THE DRAWINGS This figure demonstrates that the magnitude of change in AFP 0103 FIG. 1 is a schematic of a proposed differentiation gene expression (FIG. 8A) in the presence (right panel) and pathway for the production of beta-cells from hESCs. The absence (left panel) of activin A exactly corresponds to the first step in the pathway commits the ES cell to the definitive number of AFP cells, further supporting the utility of Q-PCR endoderm lineage and represents an early step in the further analyses to indicate changes occurring at the individual cell differentiation of ES cells to pancreatic endoderm, endocrine level. US 2014/0134726 A1 May 15, 2014

0113 FIGS. 10A-F are micrographs which show that I0120 FIG. 17 is a micrograph showing the appearance of exposure of hESCs to nodal, activin A and activin B (NAA) definitive endoderm and visceral endoderm in vitro from yields a striking increase in the number of SOX17 cells over hESCs. The regions of visceral endoderm are identified by the period of 5 days (A-C). By comparing to the relative AFP"/SOX17 while definitive endoderm displays the abundance of SOX17 cells to the total number of cells complete opposite profile, SOX17"/AFP'. This field was present in each field, as indicated by DAPI stained nuclei selectively chosen due to the proximity of these two regions to (D-F), it can be seen that approximately 30-50% of all cells each other. However, there are numerous times when are immunoreactive for SOX17 after five days treatment with SOX17"/AFP' regions are observed in absolute isolation NAA from any regions of AFP" cells, suggesting the separate origi 0114 FIG. 11 is a bar chart which demonstrates that nation of the definitive endoderm cells from visceral endo activin A (0, 10, 30 or 100 ng/ml) dose-dependently increases derm cells. SOX17 gene expression in differentiating hESCs. Increased I0121 FIG. 18 is a diagram depicting the TGFB family of expression is already robust after 3 days of treatment on ligands and receptors. Factors activating AR Smads and BR adherent cultures and continues through Subsequent 1, 3 and Smads are useful in the production of definitive endoderm 5 days of Suspension culture as well. from human embryonic stem cells (see, J Cell Pysiol. 187: 0115 FIGS. 12A-C are bar charts which demonstrate the 265-76). effect of activin A on the expression of MIXL 1 (panel A), 0.122 FIG. 19 is a bar chart showing the induction of GATA4 (panel B) and HNF3b (panel C). Activin A dose SOX17 expression over time as a result of treatment with dependent increases are also observed for three other markers individual and combinations of TGFB factors. of definitive endoderm; MIXL1, GATA4 and HNF3b. The I0123 FIG. 20 is a bar chart showing the increase in magnitudes of increased expression in response to activin SOX17 cell number with time as a result of treatment with dose are strikingly similar to those observed for SOX17, combinations of TGFB factors. strongly indicating that activin A is specifying a population of 0.124 FIG. 21 is a bar chart showing induction of SOX17 cells that co-express all four (SOX17', MIXL1. expression over time as a result of treatment with combina GATA4" and HNF3b). tions of TGFB factors. 0116 FIGS. 13 A-C are bar charts which demonstrate the 0.125 FIG.22 is a bar chart showing that activin A induces effect of activin A on the expression of AFP (panel A), SOX7 a dose-dependent increase in SOX17 cell number. (panel B) and SPARC (panel C). There is an activin A dose 0.126 FIG. 23 is a bar chart showing that addition of dependent decrease in expression of the visceral endoderm Wnt3a to activin A and activin B treated cultures increases marker AFP. Markers of primitive endoderm (SOX7) and parietal endoderm (SPARC) remain either unchanged or SOX17 expression above the levels induced by activin A and exhibit Suppression at Some time points indicating that activin activin B alone. A does not act to specify these extra-embryonic endoderm I0127 FIGS. 24A-C are bar charts showing differentiation cell types. This further supports the fact that the increased to definitive endoderm is enhanced in low FBS conditions. expression of SOX17, MIXL1, GATA4, and HNF3b are due Treatment of hESCs with activins A and B in media contain to an increase in the number of definitive endoderm cells in ing 2% FBS (2AA) yields a 2-3 times greater level of SOX17 response to activin A. expression as compared to the same treatment in 10% FBS 0117 FIGS. 14A-B are bar charts showing the effect of media (10AA) (panel A). Induction of the definitive endo activin A on ZIC1 (panel A) and Brachyury expression (panel derm marker MIXL 1 (panel B) is also affected in the same B) Consistent expression of the neural marker ZIC1 demon way and the suppression of AFP (visceral endoderm) (panel strates that there is not a dose-dependent effect of activin A on C) is greater in 2% FBS than in 10% FBS conditions. neural differentiation. There is a notable suppression of I0128 FIGS. 25A-D are micrographs which show SOX17" mesoderm differentiation mediated by 100 ng/ml of activin A cells are dividing in culture. SOX17 immunoreactive cells are treatment as indicated by the decreased expression of present at the differentiating edge of an hESC colony (C, D) brachyury. This is likely the result of the increased specifica and are labeled with proliferating cell nuclear antigen tion of definitive endoderm from the mesendoderm precur (PCNA) (panel B) yet are not co-labeled with OCT4 (panel sors. Lower levels of activin A treatment (10 and 30 ng/ml) C). In addition, clear mitotic figures can be seen by DAPI maintain the expression of brachyury at later time points of labeling of nuclei in both SOX17 cells (arrows) as well as differentiation relative to untreated control cultures. OCT4", undifferentiated hESCs (arrowheads) (D). 0118 FIGS. 15A-B are micrographs showing decreased I0129 FIG. 26 is a bar chart showing the relative expres parietal endoderm differentiation in response to treatment sion level of CXCR4 in differentiating hESCs under various with activins. Regions of TM" parietal endoderm are found media conditions. through the culture (A) when differentiated in serum alone, I0130 FIGS. 27A-Darebarcharts that show how a panel of while differentiation to TM cells is scarce when activins are definitive endoderm markers share a very similar pattern of included (B) and overall intensity of TM immunoreactivity is expression to CXCR4 across the same differentiation treat lower. ments displayed in FIG. 26. 0119 FIGS. 16A-D are micrographs which show marker I0131 FIGS. 28A-E are bar charts showing how markers expression in response to treatment with activin A and activin for mesoderm (BRACHYURY, MOX1), ectoderm (SOX1, B. hESCs were treated for four consecutive days with activin ZIC1) and visceral endoderm (SOX7) exhibit an inverse rela A and activin Band triple labeled with SOX17, AFP and TM tionship to CXCR4 expression across the same treatments antibodies. Panel A SOX17; Panel B AFP; Panel displayed in FIG. 26. C TM; and Panel D. Phase/DAPI. Notice the numerous I0132 FIGS. 29 A-F are micrographs that show the relative SOX17 positive cells (A) associated with the complete difference in SOX17 immunoreactive cells across three of the absence of AFP (B) and TM (C) immunoreactivity. media conditions displayed in FIGS. 26-28. US 2014/0134726 A1 May 15, 2014

0.133 FIGS.30A-Care flow cytometry dot plots that dem tube such as the lungs, liver, thymus, parathyroid and thyroid onstrate the increase in CXCR4" cell number with increasing glands, gallbladder and pancreas (Grapin-Botton and Mel concentration of activin Aadded to the differentiation media. ton, 2000; Kimelman and Griffin, 2000: Tremblay et al., 0134 FIGS. 31A-D are bar charts that show the CXCR4" 2000; Wells and Melton, 1999: Wells and Melton, 2000). A cells isolated from the high dose activin A treatment (A100 very important distinction should be made between the CX--) are even further enriched for definitive endoderm mark definitive endoderm and the completely separate lineage of ers than the parent population (A100). cells termed primitive endoderm. The primitive endoderm is 0135 FIG.32 is a bar chart showing gene expression from primarily responsible for formation of extra-embryonic tis CXCR4" and CXCR4 cells isolated using fluorescence-ac Sues, mainly the parietal and visceral endoderm portions of tivated cell sorting (FACS) as well as gene expression in the the placental yolk sac and the extracellular matrix material of parent populations. This demonstrates that the CXCR4" cells Reichert’s membrane. contain essentially all the CXCR4 gene expression present in 0.141. During gastrulation, the process of definitive endo each parent population and the CXCR4 populations contain derm formation begins with a cellular migration event in very little or no CXCR4 gene expression. which mesendoderm cells (cells competent to form meso 0.136 FIGS. 33A-D are bar charts that demonstrate the derm or endoderm) migrate through a structure called the depletion of mesoderm (BRACHYURY. MOX1), ectoderm primitive streak. the streak and through the node (a special (ZIC1) and visceral endoderm (SOX7) gene expression in the ized structure at the anterior-most region of the streak). As CXCR4 cells isolated from the high dose activin Atreatment migration occurs, definitive endoderm populates first the which is already Suppressed in expression of these non-de most anterior gut tube and culminates with the formation of finitive endoderm markers. the posterior end of the gut tube. 0.137 FIGS.34A-Mare bar charts showing the expression 0142. In vivo analyses of the formation of definitive endo patterns of marker genes that can be used to identify definitive derm, such as the studies in Zebrafish and Xenopus by Conlon endoderm cells. The expression analysis of definitive endo et al., 1994: Feldman et al., 1998; Zhou et al., 1993: Aoki et derm markers, FGF17, VWF, CALCR, FOXQ1, CMKOR1 al., 2002: Dougan et al., 2003; Tremblay et al., 2000; Vincent and CRIP1 is shown in panels G-L, respectively. The expres et al., 2003; Alexander et al., 1999; Alexander and Stainier, sion analysis of previously described lineage marking genes, 1999: Kikuchi et al., 2001; Hudson et al., 1997 and in mouse SOX17, SOX7, SOX17/SOX7, TM, ZIC1, and MOX1 is by Kanai-AZuma et al., 2002 lay a foundation for how one shown in panels A-F, respectively. Panel M shows the expres might attempt to approach the development of a specific germ sion analysis of CXCR4. With respect to each of panels A-M, layer cell type in the culture dish using human embryonic the column labeled hESC indicates gene expression from stem cells. There are two aspects associated with in vitro ESC purified human embryonic stem cells; 2NF indicates cells culture that pose major obstacles in the attempt to recapitulate treated with 2% FBS, no activin addition; 0.1A100 indicates development in the culture dish. First, organized germ layer cells treated with 0.1% FBS, 100 ng/ml activin A; 1A100 or organ structures are not produced. The majority of germ indicates cells treated with 1% FBS, 100 ng/ml activin A; and layer and organ specific genetic markers will be expressed in 2A100 indicates cells treated with 2% FBS, 100 ng/ml activin a heterogeneous fashion in the differentiating HESC culture A. system. Therefore it is difficult to evaluate formation of a 0138 FIGS. 35A-D show the in vivo differentiation of specific tissue or cell type due to this lack of organ specific definitive endoderm cells that are transplanted under the kid boundaries. Almost all genes expressed in one cell type ney capsule of immunocompromised mice. Panels: A heta within a particular germ layer or tissue type are expressed in toxylin-eosin staining showing gut-tube-like structures; other cells of different germ layer or tissue types as well. B-antibody immunoreactivity against hepatocyte specific Without specific boundaries there is considerably less means antigen (liver); C antibody immunoreactivity against Villin to assign gene expression specificity with a small sample of (intestine); and D—antibody immunoreactivity against 1-3 genes. Therefore one must examine considerably more CDX2 (intestine). genes, some of which should be present as well as some that 0139 FIGS. 36A-C are charts showing the normalized should not be expressed in the particular cell type of the organ relative expression levels of markers for liver (albumin and or tissue of interest. Second, the timing of gene expression PROX1) and lung (TITF1) tissues in cells contacted with patterns is crucial to movement down a specific developmen Wnt3A at 20 ng/ml. FGF2 at 5 ng/ml or FGF2 at 100 ng/ml on tal pathway. days 5-10. DE refers to definitive endoderm. Panels: A al 0.143 To further complicate matters, it should be noted bumin, B PROX1, and C TITF1. that stem cell differentiation in vitro is rather asynchronous, likely considerably more so than in vivo. As such, one group DETAILED DESCRIPTION of cells may be expressing genes associated with gastrulation, 0140. A crucial stage in early human development termed while another group may be starting final differentiation. gastrulation occurs 2-3 weeks after fertilization. Gastrulation Furthermore, manipulation of HESC monolayers or embry is extremely significant because it is at this time that the three oid bodies (EBs) with or without exogenous factor applica primary germ layers are first specified and organized (Lu et tion may result in profound differences with respect to overall al., 2001; Schoenwolf and Smith, 2000). The ectoderm is gene expression pattern and state of differentiation. For these responsible for the eventual formation of the outer coverings reasons, the application of exogenous factors must be timed of the body and the entire nervous system whereas the heart, according to gene expression patterns within a heterogeneous blood, bone, skeletal muscle and other connective tissues are cell mixture in order to efficiently move the culture down a derived from the mesoderm. Definitive endoderm is defined specific differentiation pathway. It is also beneficial to con as the germ layer that is responsible for formation of the entire sider the morphological association of the cells in the culture gut tube which includes the esophagus, stomach and Small vessel. The ability to uniformly influence hESCs when and large intestines, and the organs which derive from the gut formed into so called embryoid bodies may be less optimal US 2014/0134726 A1 May 15, 2014 than hESCs grown and differentiated as monolayers and or composition, washing temperature, sand salt concentration. HESC colonies in the culture vessel. In general, longer oligonucleotides may anneal at relatively 0144. As an effective way to deal with the above-men high temperatures, while shorter oligonucleotides generally tioned problems, some embodiments of the present invention anneal at lower temperatures. Hybridization generally contemplate combining a method for differentiating cells depends on the ability of denatured nucleic acid to reanneal with a method for the enrichment, isolation and/or purifica when complementary strands are present in an environment tion and identification of intermediate cell types in the differ below their melting temperature. The higher the degree of entiation pathway. desired homology between the oligonucleotide and hybridiz able sequence, the higher the relative temperature which can DEFINITIONS be used. As a result, it follows that higher relative tempera 0145 Certain terms and phrases as used throughout this tures would tend to make the reaction conditions more strin application have the meanings provided as follows: gent, while lower temperatures less so. For additional details 0146. As used herein, "embryonic' refers to a range of and explanation of stringency of hybridization reactions, see developmental stages of an organism beginning with a single Ausubeletal. Current Protocols in Molecular Biology, Wiley Zygote and ending with a multicellular structure that no Interscience Publishers, (1995). longer comprises pluripotent or totipotent cells other than 0154 “Stringent conditions” or “high stringency condi developed gametic cells. In addition to embryos derived by tions, as used herein, are identified by those that: (1) employ gamete fusion, the term "embryonic' refers to embryos low ionic strength and high temperature for washing, for derived by somatic cell nuclear transfer. example 0.015 M sodium chloride/0.0015M sodium citrate/ 0147 As used herein, “multipotent” or “multipotent cell 0.1% SDS at 50° C.; (2) employ during hybridization a dena refers to a cell type that can give rise to a limited number of turing agent, such as formamide, for example 50% (v/v) other particular cell types. Multipotent cells are committed to formamide with 0.1% bovine serum albumin/0.1% Ficoll/0. one or more, but not all, embryonic and/or extraembryonic 1% polyvinylpyrrolidone/50 mM sodium phosphate buffer at cell fates. Thus, in contrast to pluripotent cells, multipotent pH 6.5 with 750 mM sodium chloride, 75 mM sodium citrate cells cannot give rise to each of the three embryonic cell at 42°C., or (3) for example, employ 50% formamide, 5xSSC lineages as well as extraembryonic cells. (0.75M NaCl, 0.075 M sodium citrate) 50 mM sodium phos 0148. In some embodiments. hESCs can be derived from a phate (pH 6.8), 0.1% sodium pyrophosphate, 5xDenhardt's “preimplantation embryo. As used herein, “preimplantation solution, sonicated salmon sperm DNA (50 ug/mL) 0.1% embryo refers to an embryo between the stages of fertiliza SDS, and 10% dextran sulfate at 42°C., with washes at 42°C. tion and implantation. Implantation generally takes place 7-8 in 0.2xSSC (sodium chloride/sodium citrate) and 50% for days after fertilization. However, implantation may take place mamide at 55° C., followed by a high-stringency wash con about 1, about 2, about 3, about 4, about 5, about 6, about 7, sisting of 0.1xSSC containing EDTA at 55° C. about 8, about 9, about 10, about 11, about 12, about 13, about 0155 “Moderately stringent conditions' may be identi 14 or greater than about 14 days after fertilization. In some fied as described by Sambrook et al., Molecular Cloning: A embodiments, a preimplantation embryo has not progressed Laboratory Manual, New York: Cold Spring Harbor Press, beyond the blastocyst stage. 1989, and include the use of washing solution and hybridiza 0149. As used herein, “expression” refers to the produc tion conditions (e.g., temperature, ionic strength and% SDS) tion of a material or Substance as well as the level or amount less stringent that those described above. An example of of production of a material or Substance. Thus, determining moderately stringent conditions is overnight incubation at the expression of a specific marker refers to detecting either 37° C. in a solution comprising: 20% formamide, 5xSSC the relative or absolute amount of the marker that is expressed (150 mM NaCl, 15 mM trisodium citrate), 50 mM sodium or simply detecting the presence or absence of the marker. phosphate (pH 7.6), 5xDenhardt’s solution, 10% dextransul 0150. As used herein, “marker” refers to any molecule that fate, and 20 mg/ml denatured sheared salmon sperm DNA, can be observed or detected. For example, a marker can followed by washing the filters in 1xSSC at about 37-50° C. include, but is not limited to, a nucleic acid, Such as a tran The skilled artisan will recognize how to adjust the tempera Script of a specific gene, a polypeptide product of a gene, a ture, ionic strength, etc. as necessary to accommodate factors non-gene product polypeptide, a glycoprotein, a carbohy Such as probe length and the like. drate, a glycolipid, a lipid, a lipoprotein or a small molecule 0156. As used herein, the term “label refers to, for (for example, molecules having a molecular weight of less example, radioactive, fluorescent, biological or enzymatic than 10,000 amu) tags or labels of standard use in the art. A label can be con 0151. When used in connection with cell cultures and/or jugated, or otherwise bound, to nucleic acids, polypeptides, cell populations, the term "portion” means any non-Zero Such as antibodies, or Small molecules. For example, oligo amount of the cell culture or cell population, which ranges nucleotides of the present invention can be labeled subse from a single cell to the entirety of the cell culture or cells quent to synthesis, by incorporating biotinylated dNTPs or population. rNTP, or some similar means (e.g., photo-cross-linking a 0152 With respect to cells in cell cultures or in cell popu psoralen derivative of biotin to RNAs), followed by addition lations, the phrase “substantially free of means that the of labeled Streptavidin (e.g., phycoerythrin-conjugated specified cell type of which the cell culture or cell population streptavidin) or the equivalent. Alternatively, when fluores is free, is present in an amount of less than about 5% of the cently-labeled oligonucleotide probes are used, fluorescein, total number of cells present in the cell culture or cell popu lissamine, phycoerythirin, rhodamine (PerkinElmer Cetus), lation. Cy2, Cy3, Cy3.5, Cy5, Cy5.5, Cy7, FluorX (Amersham) and 0153 “Stringency” of hybridization reactions is readily others, can be attached to nucleic acids. Non-limiting determinable by those skilled in the art, and generally is an examples of detectable labels that may be conjugated to empirical calculation based upon oligonucleotide length and polypeptides Such as antibodies include but are not limited to US 2014/0134726 A1 May 15, 2014

radioactive labels, such as H, C, C, F, 'P, S, Cu, are both used to identify and determine the amount or relative 7°Br, Y, Tc, In, II, PI, or 77Lu, enzymes, such as proportions of markers, such as those selected from Table 3. horseradish peroxidase, fluorophores, chromophores, chemi 0.161. By using methods, such as those described above, to luminescent agents, chelating complexes, dyes, colloidal determine the expression of one or more appropriate markers, gold or latex particles. it is possible to identify definitive endoderm cells, as well as determine the proportion of definitive endoderm cells in a cell Definitive Endoderm Cells and Processes Related Thereto culture or cell population. For example, in some embodi ments of the present invention, the definitive endoderm cells 0157 Embodiments described herein relate to the produc or cell populations that are produced express one or more tion, identification, and isolation and/or enrichment of defini markers selected from Table 3 at a level of about 2 orders of tive endoderm cells in culture by differentiating pluripotent magnitude greater than non-definitive endoderm cell types or cells, such as stem cells. As described above, definitive endo cell populations. In certain embodiments, the non-definitive derm cells do not differentiate into tissues produced from endoderm cell types consist only of hESCs. In other embodi ectoderm or mesoderm, but rather, differentiate into the gut ments, the definitive endoderm cells or cell populations that tube as well as organs that are derived from the gut tube. In are produced express one or more markers selected from certain preferred embodiments, the definitive endoderm cells Table 3 at a level of more than 2 orders of magnitude greater are derived from hESCs. Such processes can provide the basis than non-definitive endoderm cell types or cell populations. for efficient production of human endodermal derived tissues In particular embodiments, the non-definitive endoderm cell Such as pancreas, liver, lung, stomach, intestine, thyroid and types consist only of hESCs. In still other embodiments, the thymus. For example, production of definitive endoderm may definitive endoderm cells or cell populations that are pro be the first step in differentiation of a stem cell to a functional duced express one or more of the markers selected from the insulin-producing B-cell. To obtain useful quantities of insu group consisting of AGPAT3, APOA2, C20orf56, C21orf129, lin-producing B-cells, high efficiency of differentiation is CALCR, CCL2, CER1, CMKOR1, CRIP1, CXCR4, desirable for each of the differentiation steps that occur prior CXorf1, DIO3, DIO3OS, EB-1, EHHADH, ELOVL2, to reaching the pancreatic islet/B-cell fate. Since differentia EPSTI1, FGF17, FLJ10970, FLJ21195, FLJ22471, tion of stem cells to definitive endoderm cells represents an FLJ23514, FOXA2, FOXQ1, GATA4, GPR37, GSC, early step towards the production of functional pancreatic LOC283537, MYL7, NPPB, NTN4, PRSS2, RTN4RL1, islet/B-cells (as shown in FIG. 1), high efficiency of differen SEMA3E, SIAT8D, SLC5A9, SLC40A1, SOX17, SPOCK3, tiation at this step is particularly desirable. TMOD1, TRPA1, TTN, AW166727, AI821586, BF941609, 0158. In view of the desirability of efficient differentiation AI916532, BCO34407, N63706 and AW772192 at a level of of pluripotent cells to definitive endoderm cells, some aspects about 2 or more than 2 orders of magnitude greater than of the differentiation processes described herein relate to in non-definitive endoderm cell types or cell populations. In vitro methodology that results inapproximately 50-80% con certain embodiments, the non-definitive endoderm cell types version of pluripotent cells to definitive endoderm cells. Typi consist only of hESCs. In some embodiments described cally, such methods encompass the application of culture and herein, definitive endoderm cells do not substantially express growth factor conditions in a defined and temporally specified PDX1. fashion. Further enrichment of the cell population for defini (0162 Embodiments described herein also relate to defini tive endoderm cells can be achieved by isolation and/or puri tive endoderm compositions. For example, Some embodi fication of the definitive endoderm cells from other cells in the ments relate to cell cultures comprising definitive endoderm, population by using a reagent that specifically binds to defini whereas others relate to cell populations enriched in definitive tive endoderm cells. As such, some embodiments described endoderm cells. Some preferred embodiments relate to cell herein relate to definitive endoderm cells as well as methods cultures which comprise definitive endoderm cells, wherein for producing and isolating and/or purifying Such cells. at least about 50-80% of the cells in culture are definitive 0159. In order to determine the amount of definitive endo endoderm cells. An especially preferred embodiment relates derm cells in a cell culture or cell population, a method of to cells cultures comprising human cells, wherein at least distinguishing this cell type from the other cells in the culture about 50-80% of the human cells in culture are definitive or in the population is desirable. Accordingly, certain endoderm cells. Because the efficiency of the differentiation embodiments described herein relate to cell markers whose procedure can be adjusted by modifying certain parameters, presence, absence and/or relative expression levels are spe which include but are not limited to, cell growth conditions, cific for definitive endoderm and methods for detecting and growth factor concentrations and the timing of culture steps, determining the expression of Such markers. the differentiation procedures described herein can result in 0160. In some embodiments described herein, the pres about 5%, about 10%, about 15%, about 20%, about 25%, ence, absence and/or level of expression of a marker is deter about 30%, about 35%, about 40%, about 45%, about 50%, mined by quantitative PCR (Q-PCR). For example, the about 55%, about 60%, about 65%, about 70%, about 75%, amount of transcript produced by certain genetic markers, about 80%, about 85%, about 90%, about 95%, or greater such as one or more markers selected from Table 3 is deter than about 95% conversion of pluripotent cells to definitive mined by quantitative Q-PCR. In other embodiments, immu endoderm. In other preferred embodiments, conversion of a nohistochemistry is used to detect the proteins expressed by pluripotent cell population, such as a stem cell population, to the above-mentioned genes. In a preferred embodiment, Substantially pure definitive endoderm cell population is con immunohistochemistry is used to detect one or more cell templated. Surface markers selected from the group consisting of 0163 The compositions and methods described herein CALCR, CMKOR1, CXCR4, GPR37, NTN4, RTN4RL1, have several useful features. For example, the cell cultures SEMA3E, SLC40A1 SLC5A9 and TRPA1. In still other and cell populations comprising definitive endoderm as well embodiments, Q-PCR and immunohistochemcial techniques as the methods for producing such cell cultures and cell US 2014/0134726 A1 May 15, 2014

populations are useful for modeling the early stages of human Wnt family members are useful for the production of defini development. Furthermore, the compositions and methods tive endoderm cells. In certain differentiation processes, com described herein can also serve for therapeutic intervention in binations of any of the above-mentioned growth factors can disease states, such as diabetes mellitus. For example, since be used. definitive endoderm serves as the source for only a limited 0168 With respect to some of the processes for the differ number of tissues, it can be used in the development of pure entiation of pluripotent stem cells to definitive endoderm tissue or cell types. cells, the above-mentioned growth factors are provided to the Production of Definitive Endoderm from Pluripotent Cells cells So that the growth factors are present in the cultures at 0164 Processes for differentiating pluripotent cells to pro concentrations sufficient to promote differentiation of at least duce cell cultures and enriched cell populations comprising a portion of the stem cells to definitive endoderm cells. In definitive endoderm is described below and in U.S. patent Ser. Some processes, the above-mentioned growth factors are No. 1 1/021,618, entitled DEFINITIVE ENDODERM, filed present in the cell culture at a concentration of at least about Dec. 23, 2004, the disclosure of which is incorporated herein 5 ng/ml, at least about 10 ng/ml, at least about 25 ng/ml, at by reference in its entirety. In some of these processes, the least about 50 ng/ml, at least about 75 ng/ml, at least about pluripotent cells used as starting material are stem cells. In 100 ng/ml, at least about 200 ng/ml, at least about 300 ng/ml, certain processes, definitive endoderm cell cultures and at least about 400 ng/ml, at least about 500 ng/ml, at least enriched cell populations comprising definitive endoderm about 1000 ng/ml, at least about 2000 ng/ml, at least about cells are produced from embryonic stem cells. A preferred 3000 ng/ml, at least about 4000 ng/ml, at least about 5000 method for deriving definitive endoderm cells utilizes human ng/ml or more than about 5000 ng/ml. embryonic stem cells as the starting material for definitive 0169. In certain processes for the differentiation of pluri endoderm production. Such pluripotent cells can be cells that potent stem cells to definitive endoderm cells, the above originate from the morula, embryonic inner cell mass or those mentioned growth factors are removed from the cell culture obtained from embryonic gonadal ridges. Human embryonic Subsequent to their addition. For example, the growth factors stem cells can be maintained in culture in a pluripotent state can be removed within about one day, about two days, about without substantial differentiation using methods that are three days, about four days, about five days, about six days, known in the art. Such methods are described, for example, in about seven days, about eight days, about nine days or about U.S. Pat. Nos. 5.453,357, 5,670,372, 5,690,926 5,843,780, ten days after their addition. In a preferred processes, the 6,200,806 and 6,251,671 the disclosures of which are incor growth factors are removed about four days after their addi porated herein by reference in their entireties. tion. 0.165. In some processes for producing definitive endo 0170 Cultures of definitive endoderm cells can be grown derm cells, hESCs are maintained on a feeder layer. In such in medium containing reduced serum or no serum. Under processes, any feeder layer which allows hESCs to be main certain culture conditions, serum concentrations can range tained in a pluripotent state can be used. One commonly used from about 0.05% V/v to about 20% V/v. For example, in some feeder layer for the cultivation of human embryonic stem differentiation processes, the serum concentration of the cells is a layer of mouse fibroblasts. More recently, human medium can be less than about 0.05% (v/v), less than about fibroblast feeder layers have been developed for use in the 0.1% (v/v), less than about 0.2% (v/v), less than about 0.3% cultivation of hESCs (see US Patent Application No. 2002/ (v/v), less than about 0.4% (v/v), less than about 0.5% (v/v), 00721 17, the disclosure of which is incorporated herein by less than about 0.6% (v/v), less than about 0.7% (v/v), less reference in its entirety). Alternative processes for producing than about 0.8% (v/v), less than about 0.9% (v/v), less than definitive endoderm permit the maintenance of pluripotent about 1% (v/v), less than about 2% (v/v), less than about 3% HESC without the use of a feeder layer. Methods of main (v/v), less than about 4% (v/v), less than about 5% (v/v), less taining pluripotent hESCs under feeder-free conditions have than about 6% (v/v), less than about 7% (v/v), less than about been described in US Patent Application No. 2003/0175956, 8% (v/v), less than about 9% (v/v), less than about 10% (v/v), the disclosure of which is incorporated herein by reference in less than about 15% (v/v) or less than about 20% (v/v). In its entirety. Some processes, definitive endoderm cells are grown without 0166 The human embryonic stem cells used herein can be serum or with serum replacement. In still other processes, maintained in culture either with or without serum. In some definitive endoderm cells are grown in the presence of B27. In embryonic stem cell maintenance procedures, serum replace Such processes, the concentration of B27 Supplement can ment is used. In others, serum free culture techniques, such as range from about 0.1% V/v to about 20% V/v. those described in US Patent Application No. 2003/0190748, the disclosure of which is incorporated herein by reference in Monitoring the Differentiation of Pluripotent Cells to its entirety, are used. Definitive Endoderm 0167 Stem cells are maintained in culture in a pluripotent (0171 The progression of the HESC culture to definitive state by routine passage until it is desired that they be differ endoderm can be monitored by determining the expression of entiated into definitive endoderm. In some processes, differ markers characteristic of definitive endoderm. In some pro entiation to definitive endoderm is achieved by providing to cesses, the expression of certain markers is determined by the stem cell culture a growth factor of the TGFB superfamily detecting the presence or absence of the marker. Alterna in an amount sufficient to promote differentiation to definitive tively, the expression of certain markers can be determined by endoderm. Growth factors of the TGFB superfamily which measuring the level at which the marker is present in the cells are useful for the production of definitive endoderm are of the cell culture or cell population. In Such processes, the selected from the Nodal/Activin or BMP subgroups. In some measurement of marker expression can be qualitative or preferred differentiation processes, the growth factor is quantitative. One method of quantitating the expression of selected from the group consisting of Nodal, activinA, activin markers that are produced by marker genes is through the use Band BMP4. Additionally, the growth factor Wnt3a and other of quantitative PCR (Q-PCR). Methods of performing US 2014/0134726 A1 May 15, 2014

Q-PCR are well known in the art. Other methods which are ing, but not limited to, one or more markers selected from known in the art can also be used to quantitate marker gene Table 3. Since definitive endoderm cells express the CXCR4 expression. For example, the expression of a marker gene marker gene at a level higher than that of the SOX7 marker product can be detected by using antibodies specific for the gene, the expression of both CXCR4 and SOX7 can be moni marker gene product of interest. In certain processes, the tored. In other processes, expression of both the CXCR4 expression of marker genes characteristic of definitive endo marker gene and the OCT4 marker gene, is monitored. Addi derm as well as the lack of significant expression of marker tionally, because definitive endoderm cells express the genes characteristic of hESCs and other cell types is deter CXCR4 marker gene at a level higher than that of the AFP. mined. SPARC or Thrombomodulin (TM) marker genes, the expres 0172. As described further in the Examples below, a reli sion of these genes can also be monitored. able marker of definitive endoderm is the SOX17 gene. As 0.175. It will be appreciated that in some embodiments such, the definitive endoderm cells produced by the processes described herein, the expression of one or more markers described herein express the SOX17 marker gene, thereby selected from Table3 indefinitive endoderm cells is increased producing the SOX17 gene product. Other markers of defini as compared to the expression of one or more markers tive endoderm are markers selected from Table 3. Since selected from the group consisting of OCT4, SPARC, AFP, definitive endoderm cells express the SOX17 marker gene at TM and SOX7 in definitive endoderm cells. In preferred a level higher than that of the SOX7 marker gene, which is embodiments, the expression of one or more markers selected characteristic of primitive and visceral endoderm (see Table from the group consisting of AGPAT3, APOA2, C20orf56, 1), in some processes, the expression of both SOX17 and C21orf129, CALCR, CCL2, CER1, CMKOR1, CRIP1, SOX7 is monitored. In other processes, expression of the both CXCR4, CXorf1, DIO3, DIO3OS, EB-1, EHHADH, the SOX17 marker gene and the OCT4 marker gene, which is ELOVL2, EPSTI1, FGF17, FLJ10970, FLJ21195, characteristic of hESCs, is monitored. Additionally, because FLJ22471, FLJ23514, FOXA2, FOXQ1, GATA4, GPR37, definitive endoderm cells express the SOX17 marker gene at GSC, LOC283537, MYL7, NPPB, NTN4, PRSS2, a level higher than that of the AFP, SPARC or Thrombomodu RTN4RL1, SEMA3E, SIAT8D, SLC5A9, SLC40A1, lin (TM) marker genes, the expression of these genes can also SOX17, SPOCK3, TMOD1, TRPA1, TTN, AW166727, be monitored. AI821586, BF941609, AI916532, BC034407, N63706 and 0173 Another marker of definitive endoderm is the AW772192 in definitive endoderm cells is increased as com CXCR4 gene. The CXCR4 gene encodes a cell surface pared to the expression of one or more markers selected from chemokine receptor whose ligand is the chemoattractant the group consisting of OCT4, SPARC, AFP, TM and SOX7 SDF-1. The principal roles of the CXCR4 receptor-bearing in definitive endoderm cells. cells in the adult are believed to be the migration of hemato 0176 It will be appreciated that expression of CXCR4 in poetic cells to the bone marrow, lymphocyte trafficking and endodermal cells does not preclude the expression of SOX17. the differentiation of various B cell and macrophage blood Furthermore, the expression of one or more markers selected cell lineages Kim, C., and Broxmeyer, H. J. Leukocyte Biol. from Table 3 in definitive endoderm does not preclude the 65, 6-15 (1999). The CXCR4 receptor also functions as a expression of other markers selected from Table 3. As such, coreceptor for the entry of HIV-1 into T-cells Feng, Y., et al. definitive endoderm cells produced by the processes Science, 272,872-877 (1996). In an extensive series of stud described herein will substantially express SOX17, CXCR4 ies carried out by McGrath, K. E. et al. Dev. Biology 213, and one or more other markers selected from Table 3, but will 442-456 (1999), the expression of the chernokine receptor not substantially express AFP, TM, SPARC or PDX1. CXCR4 and its unique ligand, SDF-1 Kim, C., and Brox (0177. It will be appreciated that the expression of SOX17, myer, H., J. Leukocyte Biol. 65, 6-15 (1999), were delin CXCR4 and/or one or more other markers selected from eated during early development and adult life in the mouse. Table 3 is induced over a range of different levels in definitive The CXCR4/SDF1 interaction in development became endoderm cells depending on the differentiation conditions. apparent when it was demonstrated that if either gene was As such, in Some embodiments described herein, the expres disrupted in transgenic mice Nagasawa et al. Nature, 382, sion of SOX17, CXCR4 and/or one or more other markers 635-638 (1996), Ma, Q., etal Immunity, 10,463-471 (1999) selected from Table 3 in definitive endoderm cells or cell it resulted in late embryonic lethality. McGrathet al. demon populations is at least about 2-fold higher to at least about strated that CXCR4 is the most abundant chemokine receptor 10,000-fold higher than the expression of SOX17, CXCR4 messenger RNA detected during early gastrulating embryos and/or one or more other markers selected from Table 3 in (E7.5) using a combination of RNase protection and in situ non-definitive endoderm cells or cell populations, for hybridization methodologies. In the gastrulating embryo, example pluripotent stem cells. In other embodiments, the CXCR4/SDF-1 signaling appears to be mainly involved in expression of SOX17, CXCR4 and/or one or more other inducing migration of primitive-streak germlayer cells and is markers selected from Table 3 in definitive endoderm cells or expressed on definitive endoderm, mesoderm and extraem cell populations is at least about 4-fold higher, at least about bryonic mesoderm present at this time. In E7.2-7.8 mouse 6-fold higher, at least about 8-fold higher, at least about embryos, CXCR4 and alpha-fetoprotein are mutually exclu 10-fold higher, at least about 15-fold higher, at least about sive indicating a lack of expression in visceral endoderm 20-fold higher, at least about 40-fold higher, at least about McGrath, K. E. et al. Dev. Biology 213,442-456 (1999). 80-fold higher, at least about 100-fold higher, at least about 0.174 Since definitive endoderm cells produced by differ 150-fold higher, at least about 200-fold higher, at least about entiating pluripotent cells express the CXCR4 marker gene, 500-fold higher, at least about 750-fold higher, at least about expression of CXCR4 can be monitored in order to track the 1000-fold higher, at least about 2500-fold higher, at least production of definitive endoderm cells. Additionally, defini about 5000-fold higher, at least about 7500-fold higher or at tive endoderm cells produced by the methods described least about 10,000-fold higher than the expression of SOX17, herein express other markers of definitive endoderm includ CXCR4 and/or one or more other markers selected from US 2014/0134726 A1 May 15, 2014

Table 3 in non-definitive endoderm cells or cell populations, stantially expressed in one or more non-definitive endoderm for example pluripotent stem cells. In some embodiments, the cell types, and separating cells bound by the reagent from expression of SOX17, CXCR4 and/or one or more other cells that are not bound by the reagent. markers selected from Table 3 in definitive endoderm cells or cell populations is infinitely higher than the expression of 0180. In some embodiments, the marker can be any SOX17, CXCR4 and/or one or more other markers selected marker that is expressed in definitive endoderm cells but not from Table 3 in non-definitive endoderm cells or cell popu Substantially expressed in one or more non-definitive endo lations, for example pluripotent stem cells. In preferred derm cell types. Such markers include, but are not limited to, embodiments of the above-described processes, the expres markers that are more highly expressed in definitive endo sion of one or more markers selected from Table 3, including derm cells than hESCs, mesoderm cells, ectoderm cells, and/ one or more markers selected from the group consisting of or extra-embryonic endoderm cells. Non-limiting examples AGPAT3, APOA2, C20orf56, C21orf129, CALCR, CCL2, ofuseful markers expressed in definitive endoderm cells, but CER1, CMKOR1, CRIP1, CXCR4, CXorf1, DIO3, not substantially expressed in one or more non-definitive DIO3OS, EB-1, EHHADH, ELOVL2, EPSTI1, FGF17, endoderm cell types, are provided in Table 3. Certain pre FLJ10970, FLJ21195, FLJ22471, FLJ23514, FOXA2, ferred markers include one or more markers selected from the FOXO1, GATA4, GPR37, GSC, LOC283537, MYL7, group consisting of AGPAT3, APOA2, C20orf56, C21orf129, NPPB, NTN4, PRSS2, RTN4RL1, SEMA3E, SIAT8D, CALCR, CCL2, CER1, CMKOR1, CRIP1, CXCR4, SLC5A9, SLC40A1, SOX17, SPOCK3, TMOD1, TRPA1, CXorf1, DIO3, DIO3OS, EB-1, EHHADH, ELOVL2, TTN, AW166727, AI821586, BF941609, AI916532, EPSTI1, FGF17, FLJ10970, FLJ21195, FLJ22471, BC034407, N63706 and AW772192, in definitive endoderm FLJ23514, FOXA2, FOXQ1, GATA4, GPR37, GSC, cells or cell populations is increased as compared to the LOC283537, MYL7, NPPB, NTN4, PRSS2, RTN4RL1, expression of one or more markers selected from the group SEMA3E, SIAT8D, SLC5A9, SLC40A1, SOX17, SPOCK3, consisting of AGPAT3, APOA2, C20orf56, C21orf129, TMOD1, TRPA1, TTN, AW166727, AI821586, BF941609, CALCR, CCL2, CER1, CMKOR1, CRIP1, CXCR4, AI916532, BC034407, N63706 and AW772192 (highly CXorf1, DIO3, DIO3OS, EB-1, EHHADH, ELOVL2, expressed markers described in Table 4). Other preferred EPSTI1, FGF17, FLJ10970, FLJ21195, FLJ22471, markers include one or more markers selected from the group FLJ23514, FOXA2, FOXO1, GATA4, GPR37, GSC, consisting of CALCR, CMKOR1, CXCR4, GPR37, NTN4, LOC283537, MYL7, NPPB, NTN4, PRSS2, RTN4RL1, RTN4RL1, SEMA3E, SLC40A1, SLC5A9 and TRPA1 (cell SEMA3E, SIAT8D, SLC5A9, SLC40A1, SOX17, SPOCK3, surface markers described in Table 5). TMOD1, TRPA1, TTN, AW 166727, AI821586, BF941609, 0181. In some embodiments, the reagent can be any type AI916532, BC034407, N63706 and AW772192 in non-de of affinity tag that is specific, or at least partially specific, for finitive endoderm cells or cell populations. definitive endoderm cells, such as antibodies, ligands or other 0178. Additionally, it will be appreciated that there is a binding agents that are specific, or at least partially specific, to range of differences between the expression level of SOX17, a marker molecule, such as a polypeptide, that is present in CXCR4 and/or one or more other markers selected from definitive endoderm cells but which is not substantially Table 3 and the expression levels of the OCT4, SPARC, AFP. present in other cell types that are found in a cell culture TM and/or SOX7 markers in definitive endoderm cells. As produced by the methods described herein. Such, in some embodiments described herein, the expression 0182 Reagents that bind to definitive endoderm associ of SOX17, CXCR4 and/or one or more other markers selected ated markers described herein can form reagent-cell com from Table 3 is at least about 2-fold higher to at least about plexes with definitive endoderm cells. In some embodiments, 10,000-fold higher than the expression of OCT4, SPARC, Subsequent to providing a cell population with a reagent that AFP, TM and/or SOX7 markers. In other embodiments, the binds to a marker, unbound reagent can be removed from the expression of SOX17, CXCR4 and/or one or more other composition by washing, stirring, and/or filtering. One markers selected from Table 3 is at least about 4-fold higher, example of reagent-cell complex is an antibody-antigen com at least about 6-fold higher, at least about 8-fold higher, at plex. In preferred embodiments of Such complexes, the anti least about 10-fold higher, at least about 15-fold higher, at gen marker can be localized to the cell Surface. Another least about 20-fold higher, at least about 40-fold higher, at example of a reagent-cell complex is a receptor-ligand com least about 80-fold higher, at least about 100-fold higher, at plex, wherein the receptor is expressed in definitive endoderm least about 150-fold higher, at least about 200-fold higher, at cells but not Substantially expressed in one or more non least about 500-fold higher, at least about 750-fold higher, at definitive endoderm cell types. In preferred embodiments, the least about 1000-fold higher, at least about 2500-fold higher, ligand can bind to the receptor-binding domain without at least about 5000-fold higher, at least about 7500-fold appreciably affecting the receptor activity. In still other pre higher or at least about 10,000-fold higher than the expression ferred embodiments, the ligand is attached to another mol of OCT4, SPARC, AFP, TM and/or SOX7 markers. In some ecule so as to facilitate easy manipulation of the reagent cell embodiments, OCT4, SPARC, AFP, TM and/or SOX7 mark complex. For example, in cases where the ligand is a small ers are not significantly expressed in definitive endoderm molecule, the ligand can be linked to a macromolecule. Such cells. as a carbohydrate, a protein or a synthetic polymer. In some Enrichment, Isolation and/or Purification of Definitive Endo embodiments, the macromolecules can be coupled to, for derm example, labels, particles. Such as magnetic particles, or Sur 0179. With respect to aspects of the present invention, faces, such as a Solid Support. Table 5 lists several polypep definitive endoderm cells can be enriched, isolated and/or tides which are localized at the cell Surface, including recep purified by contacting a cell population that includes defini tors, which are expressed in definitive endoderm cells but not tive endoderm cells with a reagent that binds to a marker, Substantially expressed in one or more non-definitive endo which is expressed in definitive endoderm cells but not sub derm cell types. US 2014/0134726 A1 May 15, 2014

0183 In some embodiments of the present invention, an 0188 The variable regions of an antibody are composed of antibody is prepared for use as an affinity tag for the enrich a light chain and a heavy chain. Light and heavy chain vari ment, isolation and/or purification of definitive endoderm able regions have been cloned and expressed in foreign hosts, cells. Methods for making antibodies, for example, the meth while maintaining their binding ability. Therefore, it is pos ods described in Kohler, G., and C. Milstein. 1975. Continu sible to generate a single chain structure from the multiple ous cultures of fused cells secreting antibody of predefined chain aggregate (the antibody). Such that the single chain specificity. Nature 256:495-497, the disclosure of which is structure will retain the three-dimensional architecture of the incorporated herein by reference in its entirety, are well multiple chain aggregate. known in the art. Additionally, antibodies and fragments can 0189 In some embodiments, single polypeptide chain Fv be made by other standardized methods (See, for example, E. fragments having the characteristic binding ability of multi Harlow et al., Antibodies, A Laboratory Manual, Cold Spring chain variable regions of antibody molecules can be used to Harbor Laboratory, Cold Spring Harbor, N.Y., 1988). The bind one or markers selected from Table 3. Such single chain isolation, identification, and molecular construction of anti fragments can be produced, for example, following the meth bodies has been developed to such an extent that the choices ods described in U.S. Pat. No. 5,260,203, the disclosure of are almost inexhaustible. Therefore, examples of methods for which is incorporated herein by reference in its entirety, using making various antibodies, antibody fragments and/or anti a computer based system and method to determine chemical body complexes will be provided with the understanding that structures. These chemical structures are used for converting this only represents a sampling of what is known to those of two naturally aggregated but chemically separated light and ordinary skill in the art. heavy polypeptide chains from an antibody variable region 0184 Antibodies can be whole immunoglobulin (IgG) of into a single polypeptide chain which will fold into a three any class, e.g., IgG, IgM, IgA, Ig|D, IgE, chimericantibodies dimensional structure very similar to the original structure of or hybrid antibodies with dual or multiple antigen or epitope the two polypeptide chains. The two regions may be linked specificities, or fragments, such as F(ab'), Fab', Fab and the using an amino acid sequence as a bridge. like, including hybrid fragments. It will be understood that 0190. The single polypeptide chain obtained from this any immunoglobulin or any natural, synthetic, or genetically method can then be used to prepare a genetic sequence that engineered protein that acts like an antibody by binding to encodes it. The genetic sequence can then be replicated in marker of definitive endoderm can be used. appropriate hosts, further linked to control regions, and trans formed into expression hosts, where it is expressed. The 0185. Preparations of polyclonal antibodies can be made resulting single polypeptide chain binding protein, upon using standard methods which are well known in the art. refolding, has the binding characteristics of the aggregate of Antibodies can include antiserum preparations from a variety the original two (heavy and light) polypeptide chains of the of commonly used animals, e.g., goats, primates, donkeys, variable region of the antibody. Swine, rabbits, horses, hens, guinea pigs, rats, or mice, and (0191 In a further embodiment, the antibodies used herein even human antisera after appropriate selection and purifica can multivalent forms of single-chain antigen-binding pro tion. Animal antisera are raised by inoculating the animals teins. Multivalent forms of single-chain antigen-binding pro with immunogenic epitopes from one or more recombi teins have additional utility beyond that of the monovalent nantly-expressed polypeptide markers selected from Table 3. single-chain antigen-binding proteins. In particular, a multi The animals are then bled and the serum or an immunoglo Valent antigen-binding protein has more than one antigen bulin-containing serum fraction is recovered. binding site which results in an enhanced binding affinity. The 0186 Hybridoma-derived monoclonal antibodies (hu multivalent antibodies can be produced using the method man, monkey, rat, mouse, or the like) are also suitable for use disclosed in U.S. Pat. No. 5,869,620, the disclosure of which in the present invention. Monoclonal antibodies typically dis is incorporated herein by reference in its entirety. The method play a high degree of specificity. Such antibodies are readily involves producing a multivalent antigen-binding protein by prepared by conventional procedures for the immunization of linking at least two single-chain molecules, each single chain mammals with preparations such as, the immunogenic molecule having two binding portions of the variable region epitopes of one or more markers selected from Table 3, fusion of an antibody heavy or light chain linked into a single chain of immune lymph or spleen cells with an immortal myeloma protein. In this way the antibodies can have binding sites for cell line, and isolation of specific hybridoma clones. In addi different parts of an antigen or have binding sites for multiple tion to the foregoing procedure, alternate method of preparing antigens. monoclonal antibodies are also useful. Such methods can 0.192 In another embodiment, the antibody can be an oli include interspecies fusions and genetic engineering manipu gomer. The oligomer is produced as in WO98/18943, the lations of hyperVariable regions. disclosure of which is incorporated herein by reference in its 0187. In one embodiment of the present invention, the entirety, by first isolating a specific ligand from a phage antibody is a single chain Fv region. Antibody molecules have display library. Oligomers overcome the problem of the iso two generally recognized regions, in each of the heavy and lation of mostly low affinity ligands from these libraries, by light chains. These regions are the so-called “variable' region oligomerizing the low-affinity ligands to produce high affin which is responsible for binding to the specific antigen in ity oligomers. The oligomers are constructed by producing a question, and the so-called "constant region which is respon fusion protein with the ligand fused to a semi-rigid hinge and sible for biological effector responses such as complement a coiled coil domain from, for example, Cartilage Oligomeric binding, binding to neutrophils and macrophages, etc. The Matrix Protein (COMP). When the fusion protein is constant regions have been separated from the antibody mol expressed in a host cell, it self assembles into oligomers. ecule and variable binding regions have been obtained. Frag 0193 Preferably, the oligomers are peptabodies (Terskikh ments containing the variable region, but lacking the constant et al., Biochemistry 94:1663-1668 (1997)). Peptabodies can region, are Sufficient for antigen binding. be exemplified as IgM antibodies which are pentameric with US 2014/0134726 A1 May 15, 2014

each binding site having low-affinity binding, but able to bind based on SDF-1 is a useful reagent in the enrichment, isola in a high affinity manner as a complex. Peptabodies are made tion and/or purification methods described herein. Likewise, using phage-display random peptide libraries. A short peptide SDF-1 fragments, SDF-1 fusions or SDF-1 mimetics that ligand from the library is fused via a semi-rigid hinge at the retain the ability to form reagent-cell complexes are useful in N-terminus of a COMP pentamerization domain. The fusion the enrichment, isolation and/or purification methods protein is expressed in bacteria where it assembles into a described herein. In some embodiments, the ligand is a ligand pentameric antibody which shows high affinity for its target. that binds to one of the cell surface markers selected from the Depending on the affinity of the ligand, an antibody with very group consisting of CALCR, CMKOR1, CXCR4, GPR37, high affinity can be produced. NTN4, RTN4RL1, SEMA3E, SLC40A1 SLC5A9 and 0194 Several generally applicable methods for using anti TRPA1. In preferred embodiments, a ligand that binds to one bodies in affinity-based isolation and purification methods of the cell Surface markers selected from the group consisting are known in the art. Such methods can be implemented for of CALCR, CMKOR1, CXCR4, GPR37, NTN4, RTN4RL1, use with the antibodies and cells described herein. In pre SEMA3E, SLC40A1 SLC5A9 and TRPA1 is attached ferred embodiments, an antibody which binds to one of the directly to a macromolecule or a Solid Support. In other pre cell Surface markers selected from the group consisting of ferred embodiments, a ligand that binds to one of the cell CALCR, CMKOR1, CXCR4, GPR37, NTN4, RTN4RL1, Surface markers selected from the group consisting of SEMA3E, SLC40A1 SLC5A9 and TRPA1 is attached to a CALCR, CMKOR1, CXCR4, GPR37, NTN4, RTN4RL1, magnetic bead then allowed to bind to definitive endoderm SEMA3E, SLC40A1, SLC5A9 and TRPA1 is attached via a cells in a cell culture which has been enzymatically treated to linker to a macromolecule or a solid Support. In such embodi reduce intercellular and Substrate adhesion, to form reagent ments, non-bound cells can be removed from the ligand-cell cell complexes. The reagent-cell complexes are then exposed complexes by washing. to a movable magnetic field which is used to separate reagent bound definitive endoderm cells from unbound cells. Once 0.197 Alternatively, in some embodiments of the pro the definitive endoderm cells are physically separated from cesses described herein, definitive endoderm cells are fluo other cells in culture, the antibody binding is disrupted and rescently labeled by recombinant expression of a fluorescent the cells are replated in appropriate tissue culture medium, polypeptide or fragment thereof in definitive endoderm cells. thereby producing a population of cells enriched for definitive For example, a nucleic acid encoding green fluorescent pro endoderm. tein (GFP) or a nucleic acid encoding a different expressible (0195 Other methods of enrichment, isolation and/or puri fluorescent marker gene is used to label definitive endoderm fication using antibodies to formantibody-cell complexes are cells. Various fluorescent marker genes are known in the art, contemplated. For example, in Some embodiments, a primary Such as luciferase, enhanced green fluorescent protein antibody which binds to a marker expressed in definitive (EGFP), DS-Red monomer fluorescent protein (Clontech) endoderm cells, but not substantially expressed in one or and others. Any such markers can be used with the enrich more non-definitive endoderm cell types, is incubated with a ment, isolation and/or purification methods described herein. definitive endoderm-containing cell culture that has been For example, in Some embodiments, at least one copy of a treated to reduce intercellular and substrate adhesion, to form nucleic acid encoding GFP or a biologically active fragment antibody-cell complexes. In preferred embodiments, the pri thereof is introduced into a pluripotent cell, preferably a mary antibody comprises one or more antibodies that bind to human embryonic stem cell, downstream of the promoter of a cell Surface polypeptide selected from the group consisting any of the marker genes described herein that are expressed in of CALCR, CMKOR1, CXCR4, GPR37, NTN4, RTN4RL1, definitive endoderm cells, but not substantially expressed in SEMA3E, SLC40A1 SLC5A9 and TRPA. The cells are then one or more non-definitive endoderm cell types, including washed, centrifuged and resuspended. The cell Suspension is those listed in Table 3, such that the expression of the GFP then incubated with a secondary antibody, such as an FITC gene product or biologically active fragment thereof is under conjugated antibody that is capable of binding to the primary control of the marker gene's promoter. In some embodiments, antibody of the reagent-cell complex. The cells are then the entire coding region of the nucleic acid, which encodes the washed, centrifuged and resuspended in buffer. The cell sus marker, is replaced by a nucleic acid encoding GFP or a pension is then analyzed and Sorted using a fluorescence biologically active fragment thereof. In other embodiments, activated cell sorter (FACS). Antibody-bound cells are col the nucleic acid encoding GFP or a biologically active frag lected separately from cells that are not bound by the anti ment thereof is fused in frame with at least a portion of the body, thereby resulting in the isolation of such cell types. If nucleic acid encoding the marker, thereby generating a fusion desired, the enriched cell compositions can be further purified protein. In such embodiments, the fusion protein retains a by using an alternate affinity-based method or by additional fluorescent activity similar to GFP. rounds of sorting using the same or different markers that are 0198 The fluorescently-tagged cells produced by the specific for definitive endoderm. above-described methods are then differentiated to definitive 0196. Additional methods of enrichment, isolation and/or endoderm cells using the differentiation methods described purification related to using a ligand-based reagent to form herein. Because definitive endoderm cells express the fluo ligand-cell complexes. For example, ligand-cell complexes rescent marker gene, whereas non-endoderm cells do not, can include a ligand-based reagent and receptor that is these two cell types can be separated. In some embodiments, expressed on definitive endoderm but not substantially cell Suspensions comprising a mixture of fluorescently-la expressed in one or more non-definitive endoderm cell types. beled definitive endoderm cells and unlabeled non-definitive In Such embodiments, ligand-based reagents are used as an endoderm cells are sorted using a FACS. Definitive endoderm affinity tag for the enrichment, isolation and/or purification of cells are collected separately from non-definitive endoderm definitive endoderm cells. By way of example, the chemokine cells, thereby resulting in the isolation of such cell types. If SDF-1, which is a ligand for CXCR4, or other molecules desired, the isolated cell compositions can be further purified US 2014/0134726 A1 May 15, 2014

by additional rounds of Sorting using the same or different conversion of pluripotent cells to definitive endoderm. In markers that are specific for definitive endoderm cells. processes in which isolation of definitive endoderm cells is 0199. Using the methods and compositions described employed, for example, by using an affinity reagent that binds herein, enriched, isolated and/or purified populations of to a marker selected from the group consisting of CALCR, definitive endoderm cells and/or tissues can be produced in CMKOR1, CXCR4, GPR37, NTN4, RTN4RL1, SEMA3E, vitro from HESC cultures or cell populations which have SLC40A1, SLC5A9 and TRPA1, a substantially pure defini undergone differentiation for from about 1 hour to greater tive endoderm cell population can be recovered. than about 144 hours. In some embodiments, the cells 0204. Some embodiments described herein relate to com undergo random differentiation. In a preferred embodiment, positions, such as cell populations and cell cultures, that however, the cells are directed to differentiate primarily into comprise both pluripotent cells, such as stem cells, and defini definitive endoderm cells. Some preferred enrichment, isola tive endoderm cells. For example, using the methods tion and/or purification methods relate to the in vitro produc described herein, compositions comprising mixtures of tion of definitive endoderm cells from human embryonic stem hESCs and definitive endoderm cells can be produced. In cells. Some embodiments, compositions comprising at least about 5 0200. Using the methods described herein, cell popula definitive endoderm cells for about every 95 pluripotent cells tions or cell cultures can be enriched in definitive endoderm are produced. In other embodiments, compositions compris cell content by at least about 2- to about 1000-fold as com ing at least about 95 definitive endoderm cells for about every pared to untreated or unenriched cell populations or cell cul 5 pluripotent cells are produced. Additionally, compositions tures. In some embodiments, definitive endoderm cells can be comprising other ratios of definitive endoderm cells to pluri enriched by at least about 5- to about 500-fold as compared to potent cells are contemplated. For example, compositions untreated or unenriched cell populations or cell cultures. In comprising at least about 1 definitive endoderm cell for about other embodiments, definitive endoderm cells can be every 1,000,000 pluripotent cells, at least about 1 definitive enriched from at least about 10- to about 200-fold as com endoderm cell for about every 10,000 pluripotent cells, at pared to untreated or unenriched cell populations or cell cul least about 1 definitive endoderm cell for about every 100,000 tures. In still other embodiments, definitive endoderm cells pluripotent cells, at least about 1 definitive endoderm cell for can be enriched from at least about 20- to about 100-fold as about every 1000 pluripotent cells, at least about 1 definitive compared to untreated or unenriched cell populations or cell endoderm cell for about every 500 pluripotent cells, at least cultures. In yet other embodiments, definitive endoderm cells about 1 definitive endoderm cell for about every 100 pluripo can be enriched from at least about 40- to about 80-fold as tent cells, at least about 1 definitive endoderm cell for about compared to untreated or unenriched cell populations or cell every 10 pluripotent cells, at least about 1 definitive endo cultures. In certain embodiments, definitive endoderm cells derm cell for about every 5 pluripotent cells, at least about 1 can be enriched from at least about 2- to about 20-fold as definitive endoderm cell for about every 2 pluripotent cells, at compared to untreated or unenriched cell populations or cell least about 2 definitive endoderm cells for about every 1 cultures. pluripotent cell, at least about 5 definitive endoderm cells for 0201 In preferred embodiments, definitive endoderm about every 1 pluripotent cell, at least about 10 definitive cells are enriched, isolated and/or purified from other non endoderm cells for about every 1 pluripotent cell, at least definitive endoderm cells after the stem cell cultures are about 20 definitive endoderm cells for about every 1 pluripo induced to differentiate towards the definitive endoderm lin tent cell, at least about 50 definitive endoderm cells for about eage. It will be appreciated that the above-described enrich every 1 pluripotent cell, at least about 100 definitive endo ment, isolation and purification procedures can be used with derm cells for about every 1 pluripotent cell, at least about Such cultures at any stage of differentiation. 1000 definitive endoderm cells for about every 1 pluripotent 0202 In addition to the procedures just described, defini cell, at least about 10,000 definitive endoderm cells for about tive endoderm cells may also be isolated by other techniques every 1 pluripotent cell, at least about 100,000 definitive for cell isolation. Additionally, definitive endoderm cells may endoderm cells for about every 1 pluripotent cell and at least also be enriched or isolated by methods of serial subculture in about 1,000,000 definitive endoderm cells for about every 1 growth conditions which promote the selective survival or pluripotent cell are contemplated. In some embodiments, the selective expansion of said definitive endoderm cells. pluripotent cells are human pluripotent stem cells. In certain embodiments the stem cells are derived from a morula, the Compositions Comprising Definitive Endoderm inner cell mass of an embryo or the gonadal ridges of an 0203 Cell compositions produced by the above-described embryo. In certain other embodiments, the pluripotent cells methods include cell cultures comprising definitive endo are derived from the gondal or germ tissues of a multicellular derm and cell populations enriched in definitive endoderm. structure that has developed past the embryonic stage. For example, cell cultures which comprise definitive endo 0205 Some embodiments described herein relate to cell derm cells, wherein at least about 50-80% of the cells in cultures or cell populations comprising from at least about culture are definitive endoderm cells, can be produced. 5% definitive endoderm cells to at least about 95% definitive Because the efficiency of the differentiation process can be endoderm cells. In some embodiments the cell cultures or cell adjusted by modifying certain parameters, which include but populations comprise mammaliancells. In preferred embodi are not limited to, cell growth conditions, growth factor con ments, the cell cultures or cell populations comprise human centrations and the timing of culture steps, the differentiation cells. For example, certain specific embodiments relate to cell procedures described herein can result in about 5%, about cultures comprising human cells, wherein from at least about 10%, about 15%, about 20%, about 25%, about 30%, about 5% to at least about 95% of the human cells are definitive 35%, about 40%, about 45%, about 50%, about 55%, about endoderm cells. Other embodiments relate to cell cultures 60%, about 65%, about 70%, about 75%, about 80%, about comprising human cells, wherein at least about 5%, at least 85%, about 90%, about 95%, or greater than about 95% about 10%, at least about 15%, at least about 20%, at least US 2014/0134726 A1 May 15, 2014

about 25%, at least about 30%, at least about 35%, at least RTN4RL1, SEMA3E, SIAT8D, SLC5A9, SLC40A1, about 40%, at least about 45%, at least about 50%, at least SOX17, SPOCK3, TMOD1, TRPA1, TTN, AW166727, about 55%, at least about 60%, at least about 65%, at least AI821586, BF941609, AI916532, BC034407, N63706 and about 70%, at least about 75%, at least about 80%, at least AW772192 is greater than the expression of the OCT4, about 85%, at least about 90% or greater than 90% of the SPARC, AFP, TM and/or SOX7 marker in at least about 10% human cells are definitive endoderm cells. In embodiments of the human cells, in at least about 15% of the human cells, where the cell cultures or cell populations comprise human in at least about 20% of the human cells, in at least about 25% feeder cells, the above percentages are calculated without of the human cells, in at least about 30% of the human cells, respect to the human feeder cells in the cell cultures or cell in at least about 35% of the human cells, in at least about 40% populations. of the human cells, in at least about 45% of the human cells, 0206. Further embodiments described herein relate to in at least about 50% of the human cells, in at least about 55% compositions. Such as cell cultures or cell populations, com of the human cells, in at least about 60% of the human cells, prising humancells. Such as human definitive endoderm cells, in at least about 65% of the human cells, in at least about 70% wherein the expression of SOX17, CXCR4 and/or one or of the human cells, in at least about 75% of the human cells, more other markers selected from Table 3 is greater than the in at least about 80% of the human cells, in at least about 85% expression of the OCT4, SPARC, alpha-fetoprotein (AFP). of the human cells, in at least about 90% of the human cells, Thrombomodulin (TM) and/or SOX7 marker in at least about in at least about 95% of the human cells or in greater than 95% 5% of the human cells. In other embodiments, the expression of the human cells. In embodiments where the cell cultures or of SOX17, CXCR4 and/or one or more other markers selected cell populations comprise human feeder cells, the above per from Table 3 is greater than the expression of the OCT4, centages are calculated without respect to the human feeder SPARC, AFP, TM and/or SOX7 marker in at least about 10% cells in the cell cultures or cell populations. In certain of the human cells, in at least about 15% of the human cells, embodiments, the definitive endoderm cells expressing one in at least about 20% of the human cells, in at least about 25% or more markers selected from the group consisting of of the human cells, in at least about 30% of the human cells, AGPAT3, APOA2, C20orf56, C21orf129, CALCR, CCL2, in at least about 35% of the human cells, in at least about 40% CER1, CMKOR1, CRIP1, CXCR4, CXorf1, DIO3, of the human cells, in at least about 45% of the human cells, DIO3OS, EB-1, EHHADH, ELOVL2, EPSTI1, FGF17, in at least about 50% of the human cells, in at least about 55% FLJ10970, FLJ21195, FLJ22471, FLJ23514, FOXA2, of the human cells, in at least about 60% of the human cells, FOXO1, GATA4, GPR37, GSC, LOC283537, MYL7, in at least about 65% of the human cells, in at least about 70% NPPB, NTN4, PRSS2, RTN4RL1, SEMA3E, SIAT8D, of the human cells, in at least about 75% of the human cells, SLC5A9, SLC40A1, SOX17, SPOCK3, TMOD1, TRPA1, in at least about 80% of the human cells, in at least about 85% TTN, AW166727, AI821586, BF941609, AI916532, of the human cells, in at least about 90% of the human cells, BC034407, N63706 and AW772192 do not express signifi in at least about 95% of the human cells or in greater than 95% cant levels or amounts of PDX1 (PDX1-negative). of the human cells. In embodiments where the cell cultures or 0208. Additional embodiments described herein relate to cell populations comprise human feeder cells, the above per compositions. Such as cell cultures or cell populations, com centages are calculated without respect to the human feeder prising mammalian endodermal cells, such as human endo cells in the cell cultures or cell populations. In certain derm cells, wherein the expression of SOX17, CXCR4 and/or embodiments, the definitive endoderm cells expressing one or more other markers selected from Table 3 is greater SOX17, CXCR4 and/or one or more other markers selected than the expression of the OCT4, SPARC, AFP, TM and/or from Table 3 do not express significant levels or amounts of SOX7 marker in at least about 5% of the endodermal cells. In PDX1 (PDX1-negative). other embodiments, the expression of SOX17, CXCR4 and/ 0207. It will be appreciated that some embodiments or one or more other markers selected from Table 3 is greater described herein relate to compositions. Such as cell cultures than the expression of the OCT4, SPARC, AFP, TM and/or or cell populations, comprising human cells, such as human SOX7 marker in at least about 10% of the endodermal cells, definitive endoderm cells, wherein the expression of one or in at least about 15% of the endodermal cells, in at least about more markers selected from the group consisting of AGPAT3, 20% of the endodermal cells, in at least about 25% of the APOA2, C20orf56, C21orf129, CALCR, CCL2, CER1, endodermal cells, in at least about 30% of the endodermal CMKOR1, CRIP1, CXCR4, CXorf1, DIO3, DIO3OS, EB-1, cells, in at least about 35% of the endodermal cells, in at least EHHADH, ELOVL2, EPSTI1, FGF17, FLJ10970, about 40% of the endodermal cells, in at least about 45% of FLJ21195, FLJ22471, FLJ23514, FOXA2, FOXQ1, GATA4, the endodermal cells, in at least about 50% of the endodermal GPR37, GSC, LOC283537, MYL7, NPPB, NTN4, PRSS2, cells, in at least about 55% of the endodermal cells, in at least RTN4RL1, SEMA3E, SIAT8D, SLC5A9, SLC40A1, about 60% of the endodermal cells, in at least about 65% of SOX17, SPOCK3, TMOD1, TRPA1, TTN, AW166727, the endodermal cells, in at least about 70% of the endodermal AI821586, BF941609, AI916532, BC034407, N63706 and cells, in at least about 75% of the endodermal cells, in at least AW772192 is greater than the expression of the OCT4, about 80% of the endodermal cells, in at least about 85% of SPARC, AFP, TM and/or SOX7 markers in from at least about the endodermal cells, in at least about 90% of the endodermal 5% to greater than at least about 95% of the human cells. In cells, in at least about 95% of the endodermal cells or in other embodiments, the expression of one or more markers greater than 95% of the endodermal cells. In certain embodi selected from the group consisting of AGPAT3, APOA2, ments, the mammalian endodermal cells expressing SOX17. C20orf56, C21orf129, CALCR, CCL2, CER1, CMKOR1, CXCR4 and/or one or more other markers selected from CRIP1, CXCR4, CXorf1, DIO3, DIO3OS, EB-1, EHHADH, Table 3 do not express significant levels or amounts of PDX1 ELOVL2, EPSTI1, FGF17, FLJ10970, FLJ21195, (PDX1-negative). FLJ22471, FLJ23514, FOXA2, FOXQ1, GATA4, GPR37, 0209. It will be appreciated that some embodiments GSC, LOC283537, MYL7, NPPB, NTN4, PRSS2, described herein relate to compositions, such as cell cultures US 2014/0134726 A1 May 15, 2014

or cell populations comprising mammalian endodermal cells, of the human cells, in at least about 15% of the human cells, wherein the expression of one or more markers selected from in at least about 20% of the human cells, in at least about 25% the group consisting of AGPAT3, APOA2, C20orf56, of the human cells, in at least about 30% of the human cells, C21orf129, CALCR, CCL2, CER1, CMKOR1, CRIP1, in at least about 35% of the human cells, in at least about 40% CXCR4, CXorf1, DIO3, DIO3OS, EB-1, EHHADH, of the human cells, in at least about 45% of the human cells, ELOVL2, EPSTI1, FGF17, FLJ10970, FLJ21195, in at least about 50% of the human cells, in at least about 55% FLJ22471, FLJ23514, FOXA2, FOXQ1, GATA4, GPR37, of the human cells, in at least about 60% of the human cells, GSC, LOC283537, MYL7, NPPB, NTN4, PRSS2, in at least about 65% of the human cells, in at least about 70% RTN4RL1, SEMA3E, SIAT8D, SLC5A9, SLC40A1, of the human cells, in at least about 75% of the human cells, SOX17, SPOCK3, TMOD1, TRPA1, TTN, AW166727, in at least about 80% of the human cells, in at least about 85% AI821586, BF941609, AI916532, BC034407, N63706 and of the human cells, in at least about 90% of the human cells, AW772192 is greater than the expression of the OCT4, in at least about 95% of the human cells or in greater than 95% SPARC, AFP, TM and/or SOX7 markers in from at least about of the human cells. In embodiments where the cell cultures or 5% to greater than at least about 95% of the endodermal cells. cell populations comprise human feeder cells, the above per In other embodiments, the expression of one or more markers centages are calculated without respect to the human feeder selected from the group consisting of AGPAT3, APOA2, cells in the cell cultures or cell populations. In certain C20orf56, C21orf129, CALCR, CCL2, CER1, CMKOR1, embodiments, the definitive endoderm cells expressing CRIP1, CXCR4, CXorf1, DIO3, DIO3OS, EB-1, EHHADH, SOX17, CXCR4 and/or one or more other markers selected ELOVL2, EPSTI1, FGF17, FLJ10970, FLJ21195, from Table 3 do not express significant levels or amounts of FLJ22471, FLJ23514, FOXA2, FOXQ1, GATA4, GPR37, PDX1 (PDX1-negative). GSC, LOC283537, MYL7, NPPB, NTN4, PRSS2, 0211. It will be appreciated that some embodiments RTN4RL1, SEMA3E, SIAT8D, SLC5A9, SLC40A1, described herein relate to compositions, such as cell cultures SOX17, SPOCK3, TMOD1, TRPA1, TTN, AW166727, or cell populations, comprising human cells, such as human AI821586, BF941609, AI916532, BC034407, N63706 and definitive endoderm cells, wherein the expression of one or AW772192 is greater than the expression of the OCT4, more markers selected from the group consisting of AGPAT3, SPARC, AFP, TM and/or SOX7 marker in at least about 10% APOA2, C20orf56, C21orf129, CALCR, CCL2, CER1, of the endodermal cells, in at least about 15% of the endoder CMKOR1, CRIP1, CXCR4, CXorf1, DIO3, DIO3OS, EB-1, mal cells, in at least about 20% of the endodermal cells, in at EHHADH, ELOVL2, EPSTI1, FGF17, FLJ10970, least about 25% of the endodermal cells, in at least about 30% FLJ21195, FLJ22471, FLJ23514, FOXA2FOXQ1, GATA4, of the endodermal cells, in at least about 35% of the endoder GPR37, GSC, LOC283537, MYL7, NPPB, NTN4, PRSS2, mal cells, in at least about 40% of the endodermal cells, in at RTN4RL1, SEMA3E, SIAT8D, SLC5A9, SLC40A1, least about 45% of the endodermal cells, in at least about 50% SOX17, SPOCK3, TMOD1, TRPA1, TTN, AW166727, of the endodermal cells, in at least about 55% of the endoder AI821586, BF941609, AI916532, BC034407, N63706 and mal cells, in at least about 60% of the endodermal cells, in at AW772192 is greater than the expression of the OCT4, least about 65% of the endodermal cells, in at least about 70% SPARC, AFP, TM and/or SOX7 markers in from at least about of the endodermal cells, in at least about 75% of the endoder 5% to greater than at least about 95% of the human cells. In mal cells, in at least about 80% of the endodermal cells, in at other embodiments, the expression of one or more markers least about 85% of the endodermal cells, in at least about 90% selected from the group consisting of AGPAT3, APOA2, of the endodermal cells, in at least about 95% of the endoder C20orf56, C21orf129, CALCR, CCL2, CER1, CMKOR1, mal cells or in greater than 95% of the endodermal cells. In CRIP1, CXCR4, CXorf1, DIO3, DIO3OS, EB-1, EHHADH, certain embodiments, the mammalian endodermal cells ELOVL2, EPSTI1, FGF17, FLJ10970, FLJ21195, expressing one or more markers selected from the group FLJ22471, FLJ23514, FOXA2, FOXQ1, GATA4, GPR37, consisting of AGPAT3, APOA2, C20orf56, C21orf129, GSC, LOC283537, MYL7, NPPB, NTN4, PRSS2, CALCR, CCL2, CER1, CMKOR1, CRIP1, CXCR4, RTN4RL1, SEMA3E, SIAT8D, SLC5A9, SLC40A1, CXorf1, DIO3, DIO3OS, EB-1, EHHADH, ELOVL2, SOX17, SPOCK3, TMOD1, TRPA1, TTN, AW166727, EPSTI1, FGF17, FLJ10970, FLJ21195, FLJ22471, AI821586, BF941609, AI916532, BC034407, N63706 and FLJ23514, FOXA2, FOXO1, GATA4, GPR37, GSC, AW772192 is greater than the expression of the OCT4, LOC283537, MYL7, NPPB, NTN4, PRSS2, RTN4RL1, SPARC, AFP, TM and/or SOX7 marker in at least about 10% SEMA3E, SIAT8D, SLC5A9, SLC40A1, SOX17, SPOCK3, of the human cells, in at least about 15% of the human cells, TMOD1, TRPA1, TTN, AW 166727, AI821586, BF941609, in at least about 20% of the human cells, in at least about 25% AI916532, BC034407, N63706 and AW772192 do not of the human cells, in at least about 30% of the human cells, express significant levels or amounts of PDX1 (PDX1-nega in at least about 35% of the human cells, in at least about 40% tive). of the human cells, in at least about 45% of the human cells, 0210. Further embodiments described herein relate to in at least about 50% of the human cells, in at least about 55% compositions. Such as cell cultures or cell populations, com of the human cells, in at least about 60% of the human cells, prising humancells. Such as human definitive endoderm cells, in at least about 65% of the human cells, in at least about 70% wherein the expression of SOX17, CXCR4 and/or one or of the human cells, in at least about 75% of the human cells, more other markers selected from Table 3 is greater than the in at least about 80% of the human cells, in at least about 85% expression of the OCT4, SPARC, alpha-fetoprotein (AFP). of the human cells, in at least about 90% of the human cells, Thrombomodulin (TM) and/or SOX7 marker in at least about in at least about 95% of the human cells or in greater than 95% 5% of the human cells. In other embodiments, the expression of the human cells. In embodiments where the cell cultures or of SOX17, CXCR4 and/or one or more other markers selected cell populations comprise human feeder cells, the above per from Table 3 is greater than the expression of the OCT4, centages are calculated without respect to the human feeder SPARC, AFP, TM and/or SOX7 marker in at least about 10% cells in the cell cultures or cell populations. In certain US 2014/0134726 A1 May 15, 2014 20 embodiments, the definitive endoderm cells expressing one AI821586, BF941609, AI916532, BC034407, N63706 and or more markers selected from the group consisting of AW772192 do not express significant levels or amounts of AGPAT3, APOA2, C20orf56, C21orf129, CALCR, CCL2, PDX1 (PDX1-negative). CER1, CMKOR1, CRIP1, CXCR4, CXorf1, DIO3, 0215. In one embodiment, a description of a definitive DIO3OS, EB-1, EHHADH, ELOVL2, EPSTI1, FGF17, endoderm cell based on the expression of marker genes is, FLJ10970, FLJ21195, FLJ22471, FLJ23514, FOXA2, SOX17 high, MIXL1 high, AFP low, SPARC low, Thrombo FOXO1, GATA4, GPR37, GSC, LOC283537, MYL7, modulin low, SOX7 low, CXCR4 high. In other embodi NPPB, NTN4, PRSS2, RTN4RL1, SEMA3E, SIAT8D, ments, a description of a definitive endoderm cell based on the SLC5A9, SLC40A1, SOX17, SPOCK3, TMOD1, TRPA1, expression of marker genes is one or more markers selected TTN, AW166727, AI821586, BF941609, AI916532, from the group consisting of AGPAT3, APOA2, C20orf56, BC034407, N63706 and AW772192 do not express signifi C21orf129, CALCR, CCL2, CER1, CMKOR1, CRIP1, cant levels or amounts of PDX1 (PDX1-negative). CXCR4, CXorf1, DIO3, DIO3OS, EB-1, EHHADH, 0212. Using the methods described herein, compositions ELOVL2, EPSTI1, FGF17, FLJ10970, FLJ21195, comprising definitive endoderm cells substantially free of FLJ22471, FLJ23514, FOXA2, FOXQ1, GATA4, GPR37, other cell types can be produced. In some embodiments GSC, LOC283537, MYL7, NPPB, NTN4, PRSS2, described herein, the definitive endoderm cell populations or RTN4RL1, SEMA3E, SIAT8D, SLC5A9, SLC40A1, cell cultures produced by the methods described herein are SOX17, SPOCK3, TMOD1, TRPA1, TTN, AW166727, substantially free of cells that significantly express the OCT4, AI821586, BF941609, AI916532, BC034407, N63706 and SOX7, AFP, SPARC, TM, ZIC1 or BRACH marker genes. AW772192 high; AFP low; SPARC low; Thrombomodulin 0213 Some embodiments described herein relate to low and SOX7 low. enriched, isolated and/or purified compositions, such as cell cultures or cell populations, comprising human cells, such as Reagent-Cell Complexes human definitive endoderm cells, wherein the expression of 0216) Some aspects of the present invention relate to com SOX17, CXCR4 and/or one or more other markers selected positions, such as cell cultures and/or cell population, that from Table 3 is greater than the expression of the OCT4, comprise complexes of one or more definitive endoderm cells SPARC, alpha-fetoprotein (AFP), Thrombomodulin (TM) bound to one or more reagents (reagent-cell complexes). For and/or SOX7 marker in at least about 96% of the human cells, example, cell cultures and/or cell populations comprising in at least about 97% of the human cells, in at least about 98% reagent-cell complexes, wherein at least about 5 to at least of the human cells, in at least about 99% of the human cells or about 100% of the definitive endoderm cells in culture are in in at least about 100% of the human cells. In certain embodi the form of reagent-cell complexes, can be produced. In other ments, enriched, isolated and/or purified compositions of embodiments, cell cultures and/or cell populations can be definitive endoderm cells expressing SOX17, CXCR4 and/or produced which comprise at least about 5%, at least about one or more other markers selected from Table 3 do not 10%, at least about 15%, at least about 20%, at least about express significant levels or amounts of PDX1 (PDX1-nega 25%, at least about 30%, at least about 35%, at least about tive). 40%, at least about 45%, at least about 50%, at least about 0214. It will be appreciated that some embodiments 55%, at least about 60%, at least about 65%, at least about described herein relate to enriched, isolated and/or purified 70%, at least about 75%, at least about 80%, at least about compositions. Such as cell cultures or cell populations, com 85%, at least about 90%, at least about 95%, at least about prising humancells. Such as human definitive endoderm cells, 96%, at least about 97%, at least about 98%, at least about wherein the expression of one or more markers selected from 99% or at least about 100% reagent-cell complexes. In some the group consisting of AGPAT3, APOA2, C20orf56, embodiments, the reagent cell complexes comprise one or C21orf129, CALCR, CCL2, CER1, CMKOR1, CRIP1, more definitive endoderm cells bound to one or more anti CXCR4, CXorf1, DIO3, DIO3OS, EB-1, EHHADH, bodies that bind to a marker selected from Table 3. In pre ELOVL2, EPSTI1, FGF17, FLJ10970, FLJ21195, ferred embodiments, the reagent cell complexes comprise FLJ22471, FLJ23514, FOXA2, FOXQ1, GATA4, GPR37, one or more definitive endoderm cells bound to one or more GSC, LOC283537, MYL7, NPPB, NTN4, PRSS2, antibodies that bind to a marker selected from the group RTN4RL1, SEMA3E, SIAT8D, SLC5A9, SLC40A1, consisting of AGPAT3, APOA2, C20orf56, C21orf129, SOX17, SPOCK3, TMOD1, TRPA1, TTN, AW166727, CALCR, CCL2, CER1, CMKOR1, CRIP1, CXCR4, AI821586, BF941609, AI916532, BC034407, N63706 and CXorf1, DIO3, DIO3OS, EB-1, EHHADH, ELOVL2, AW772192 is greater than the expression of the OCT4, EPSTI1, FGF17, FLJ10970, FLJ21195, FLJ22471, SPARC, AFP, TM and/or SOX7 markers in from at least about FLJ23514, FOXA2, FOXQ1, GATA4, GPR37, GSC, 96%, in at least about 97% of the human cells, in at least about LOC283537, MYL7, NPPB, NTN4, PRSS2, RTN4RL1, 98% of the human cells, in at least about 99% of the human SEMA3E, SIAT8D, SLC5A9, SLC40A1, SOX17, SPOCK3, cells or in at least about 100% of the human cells. In certain TMOD1, TRPA1, TTN, AW166727, AI821586, BF941609, embodiments, the enriched, isolated and/or purified definitive AI916532, BC034407, N63706 and AW772192. In other pre endoderm cells expressing one or more markers selected from ferred embodiments, the reagent cell complexes comprise the group consisting of AGPAT3, APOA2, C20orf56, one or more definitive endoderm cells bound to one or more C21orf129, CALCR, CCL2, CER1, CMKOR1, CRIP1, antibodies that bind to a marker selected from the group CXCR4, CXorf1, DIO3, DIO3OS, EB-1, EHHADH, consisting of CALCR, CMKOR1, CXCR4, GPR37, NTN4, ELOVL2, EPSTI1, FGF17, FLJ10970, FLJ21195, RTN4RL1, SEMA3E, SLC40A1 SLC5A9 and TRPA1. In FLJ22471, FLJ23514, FOXA2, FOXQ1, GATA4, GPR37, still other embodiments, the reagent cell complexes comprise GSC, LOC283537, MYL7, NPPB, NTN4, PRSS2, one or more definitive endoderm cells bound to one or more RTN4RL1, SEMA3E, SIAT8D, SLC5A9, SLC40A1, ligands that bind to a marker selected Table 3. In preferred SOX17, SPOCK3, TMOD1, TRPA1, TTN, AW166727, embodiments, the reagent cell complexes comprise one or US 2014/0134726 A1 May 15, 2014

more definitive endoderm cells bound to one or more ligands CALCR, CCL2, CER1, CMKOR1, CRIP1, CXCR4, that bind to a marker selected from the group consisting of CXorf1, DIO3, DIO3OS, EB-1, EHHADH, ELOVL2, AGPAT3, APOA2, C20orf56, C21orf129, CALCR, CCL2, EPSTI1, FGF17, FLJ10970, FLJ21195, FLJ22471, CER1, CMKOR1, CRIP1, CXCR4, CXorf1, DIO3, FLJ23514, FOXA2, FOXQ1, GATA4, GPR37, GSC, DIO3OS, EB-1, EHHADH, ELOVL2, EPSTI1, FGF17, LOC283537, MYL7, NPPB, NTN4, PRSS2, RTN4RL1, FLJ10970, FLJ21195, FLJ22471, FLJ23514, FOXA2, SEMA3E, SIAT8D, SLC5A9, SLC40A1, SOX17, SPOCK3, FOXO1, GATA4, GPR37, GSC, LOC283537, MYL7, TMOD1, TRPA1, TTN, AW166727, AI821586, BF941609, NPPB, NTN4, PRSS2, RTN4RL1, SEMA3E, SIAT8D, AI916532, BC034407, N63706 and AW772192. In other pre SLC5A9, SLC40A1, SOX17, SPOCK3, TMOD1, TRPA1, ferred embodiments, the reagent cell complexes comprise TTN, AW166727, AI821586, BF941609, AI916532, one or more definitive endoderm cells bound to one or more BC034407, N63706 and AW772192. In other preferred ligands that bind to a marker selected from the group consist embodiments, the reagent cell complexes comprise one or ing of CALCR, CMKOR1, CXCR4, GPR37, NTN4, more definitive endoderm cells bound to one or more ligands RTN4RL1, SEMA3E, SLC40A1 SLC5A9 and TRPA1. that bind to a marker selected from the group consisting of 0218. In some embodiments, the definitive endoderm cells CALCR, CMKOR1, CXCR4, GPR37, NTN4, RTN4RL1, present in the reagent-cell complexes that have been SEMA3E, SLC40A1 SLC5A9 and TRPA1. described above express SOX17, CXCR4 and/or one or more 0217. Some embodiments described herein relate to cell other markers selected from Table 3 to a greater extent than cultures and/or cell populations comprising from at least OCT4, SPARC, alpha-fetoprotein (AFP), Thrombomodulin about 5% reagent cell complexes to at least about 95% (TM) and/or SOX7. In preferred embodiments, the definitive reagent-cell complexes. In some embodiments the cell cul endoderm cells expressing SOX17, CXCR4 and/or one or tures or cell populations comprise mammalian cells. In pre more other markers selected from Table 3 do not express ferred embodiments, the cell cultures or cell populations significant levels or amounts of PDX1 (PDX1-negative). In comprise human cells. For example, certain specific embodi other embodiments, the definitive endoderm cells present in ments relate to cell cultures comprising human cells, wherein the reagent-cell complexes that have been described above from at least about 5% to at least about 95% of the human cells express one or more markers selected from the group consist are definitive endoderm cells. Other embodiments relate to ing of AGPAT3, APOA2, C20orf56, C21orf129, CALCR, cell cultures comprising human cells, wherein at least about CCL2, CER1, CMKOR1, CRIP1, CXCR4, CXorf1, DIO3, 5%, at least about 10%, at least about 15%, at least about 20%, DIO3OS, EB-1, EHHADH, ELOVL2, EPSTI1, FGF17, at least about 25%, at least about 30%, at least about 35%, at FLJ10970, FLJ21195, FLJ22471, FLJ23514, FOXA2, least about 40%, at least about 45%, at least about 50%, at FOXO1, GATA4, GPR37, GSC, LOC283537, MYL7, least about 55%, at least about 60%, at least about 65%, at NPPB, NTN4, PRSS2, RTN4RL1, SEMA3E, SIAT8D, least about 70%, at least about 75%, at least about 80%, at SLC5A9, SLC40A1, SOX17, SPOCK3, TMOD1, TRPA1, least about 85%, at least about 90% or greater than 90% of the TTN, AW166727, AI821586, BF941609, AI916532, human cells are reagent cell complexes. In embodiments BC034407, N63706 and AW772192 to a greater extent than where the cell cultures or cell populations comprise human OCT4, SPARC, AFP, TM and/or SOX7. In preferred embodi feeder cells, the above percentages are calculated without ments, the definitive endoderm cells expressing one or more respect to the human feeder cells in the cell cultures or cell markers selected from the group consisting of AGPAT3, populations. In some embodiments, the reagent cell com APOA2, C20orf56, C21orf129, CALCR, CCL2, CER1, plexes comprise one or more definitive endoderm cells bound CMKOR1, CRIP1, CXCR4, CXorf1, DIO3, DIO3OS, EB-1, to one or more antibodies that bind to a marker selected from EHHADH, ELOVL2, EPSTI1, FGF17, FLJ10970, Table 3. In preferred embodiments, the reagent cell com FLJ21195, FLJ22471, FLJ23514, FOXA2, FOXQ1, GATA4, plexes comprise one or more definitive endoderm cells bound GPR37, GSC, LOC283537, MYL7, NPPB, NTN4, PRSS2, to one or more antibodies that bind to a marker selected from RTN4RL1, SEMA3E, SIAT8D, SLC5A9, SLC40A1, the group consisting of AGPAT3, APOA2, C20orf56, SOX17, SPOCK3, TMOD1, TRPA1, TTN, AW166727, C21orf129, CALCR, CCL2, CER1, CMKOR1, CRIP1, AI821586, BF941609, AI916532, BC034407, N63706 and CXCR4, CXorf1, DIO3, DIO3OS, EB-1, EHHADH, AW772192 do not express significant levels or amounts of ELOVL2, EPSTI1, FGF17, FLJ10970, FLJ21195, PDX1 (PDX1-negative). In yet other embodiments, the FLJ22471, FLJ23514, FOXA2, FOXQ1, GATA4, GPR37, definitive endoderm cells present in the reagent-cell com GSC, LOC283537, MYL7, NPPB, NTN4, PRSS2, plexes that have been described above express one or more RTN4RL1, SEMA3E, SIAT8D, SLC5A9, SLC40A1, cell-surface markers selected from the group consisting of SOX17, SPOCK3, TMOD1, TRPA1, TTN, AW166727, CALCR, CMKOR1, CXCR4, GPR37, NTN4, RTN4RL1, AI821586, BF941609, AI916532, BC034407, N63706 and SEMA3E, SLC40A1, SLC5A9 and TRPA1. In preferred AW772192. In other preferred embodiments, the reagent cell embodiments, the definitive endoderm cells expressing one complexes comprise one or more definitive endoderm cells or more cell-surface markers selected from the group consist bound to one or more antibodies that bind to a marker selected ing of CALCR, CMKOR1, CXCR4, GPR37, NTN4, from the group consisting of CALCR, CMKOR1, CXCR4, RTN4RL1, SEMA3E, SLC40A1 SLC5A9 and TRPA1. GPR37, NTN4, RTN4RL1, SEMA3E, SLC40A1, SLC5A9 0219. Additional embodiments described herein relate to and TRPA1. In still other embodiments, the reagent cell com compositions, such as cell cultures and/or cell populations, plexes comprise one or more definitive endoderm cells bound that comprise both pluripotent cells, such as stem cells, and to one or more ligands that bind to a marker selected Table 3. reagent-cell complexes. For example, using the methods In preferred embodiments, the reagent cell complexes com described herein, compositions comprising mixtures of prise one or more definitive endoderm cells bound to one or hESCs and reagent-cell complexes of definitive endoderm more ligands that bind to a marker selected from the group cells can be produced. In some embodiments, compositions consisting of AGPAT3, APOA2, C20orf56, C21orf129, comprising at least about 5 reagent-cell complexes for about US 2014/0134726 A1 May 15, 2014 22 every 95 pluripotent cells are provided. In other embodi Immunodetection of Definitive Endoderm ments, compositions comprising at least about 95 reagent 0223) In some embodiments, immunochemistry is used to cell complexes for about every 5 pluripotent cells are pro detect the presence, absence, and/or level of expression of vided. Additionally, compositions comprising other ratios of proteins encoded by the above-mentioned genes. Accord reagent-cell complexes cells to pluripotent cells are contem ingly, some embodiments relate to isolated antibodies that plated. For example, compositions comprising at least about bind to certain markers. Such as any of the polypeptides 1 reagent-cell complex for about every 1,000,000 pluripotent encoded by the nucleic acids described in Table 3 including, cells, at least about 1 reagent-cell complex for about every but not limited to, AGPAT3, APOA2, C20orf56, C21orf129, 100,000 pluripotent cells, at least about 1 reagent-cell com CALCR, CCL2, CER1, CMKOR1, CRIP1, CXCR4, plex cell for about every 10,000 pluripotent cells, at least CXorf1, DIO3, DIO3OS, EB-1, EHHADH, ELOVL2, about 1 reagent-cell complex for about every 1000 pluripo EPSTI1, FGF17, FLJ10970, FLJ21195, FLJ22471, tent cells, at least about 1 reagent-cell complex for about FLJ23514, FOXA2, FOXQ1, GATA4, GPR37, GSC, every 500 pluripotent cells, at least about 1 reagent-cell com LOC283537, MYL7, NPPB, NTN4, PRSS2, RTN4RL1, plex for about every 100 pluripotent cells, at least about 1 SEMA3E, SIAT8D, SLC5A9, SLC40A1, SOX17, SPOCK3, reagent-cell complex for about every 10 pluripotent cells, at TMOD1, TRPA1, TTN, AW166727, AI821586, BF941609, least about 1 reagent-cell complex for about every 5 pluripo AI916532, BCO34407, N63706 and AW772192. Preferred tent cells, at least about 1 reagent-cell complex for about embodiments relate to antibodies that bind to CALCR, every 2 pluripotent cells, at least about reagent-cell com CMKOR1, CXCR4, GPR37, NTN4, RTN4RL1, SEMA3E, plexes for about every 1 pluripotent cell, at least about 5 SLC40A1 SLC5A9 and TRPA1. reagent-cell complexes for about every 1 pluripotent cell, at 0224. The term “antibody'is used in the broadest sense least about 10 definitive endoderm cells for about every 1 and unless specified specifically covers single monoclonal pluripotent cell, at least about 20 reagent-cell complexes for antibodies (including agonist and antagonist antibodies) and about every 1 pluripotent cell, at least about 50 reagent-cell antibody compositions with polyepitopic specificity. The complexes for about every 1 pluripotent cell, at least about term “antibody'also includes obvious variants, derivatives, reagent-cell complexes for about every 1 pluripotent cell, at analogs, fragments, mimetics, all of which Substantially least about 1000 reagent-cell complexes for about every 1 retain the binding characteristics and other properties of the pluripotent cell, at least about 10,000 reagent-cell complexes stated antibody. for about every 1 pluripotent cell, at least about 100,000 0225 Monoclonal antibodies (mo Abs) refer to antibodies reagent-cell complexes for about every 1 pluripotent cell and obtained from a population of substantially homogeneous at least about 1,000,000 reagent-cell complexes for about antibodies, i.e., the individual antibodies comprising the every 1 pluripotent cell are contemplated. In some embodi population are identical except for possible naturally occur ments of the present invention, the pluripotent cells are ring mutations that may be present in minor amounts. In some human pluripotent stem cells. In certain embodiments the embodiments, monoclonal antibodies may be highly specific, stem cells are derived from a morula, the inner cell mass of an being directed against a single epitopic region of an antigen. embryo or the gonadal ridges of an embryo. In certain other Furthermore, in contrast to conventional (polyclonal) anti embodiments, the pluripotent cells are derived from the gon body preparations that typically include different antibodies dal or germ tissues of a multicellular structure that has devel directed against different determinants (epitopes), each oped past the embryonic stage. mo Ab is directed against a single determinant on the antigen. In addition to their specificity, monoclonal antibodies are Identification and Quantitation of Definitive Endoderm Cells advantageous in that they can be synthesized by hybridoma 0220. The desirability of detecting or determining the culture, uncontaminated by other immunoglobulins. amount of definitive endoderm cells in a cell culture or cell 0226. The modifier “monoclonal indicates the character population, and methods of distinguishing this cell type from of the antibody as being obtained from a substantially homo the other cells in the culture or in the population is readily geneous population of antibodies, and is not to be construed apparent. Accordingly, certain embodiments described as requiring production of the antibody by any particular herein relate to reagents and methods for the detection, iden method. As described above, the monoclonal antibodies to be tification, and quantitation of cell markers whose presence, used in accordance with the invention may be made by the absence and/or relative expression levels are indicative for hybridoma method first described by Kohler et al., Nature, definitive endoderm. 256:495 (1975), or may be made by recombinant DNA meth 0221) Some embodiments provide a method of detecting ods (see, e.g., U.S. Pat. No. 4,816,567 to Cabilly et al.). The definitive endoderm cells in a cell population that includes the "monoclonal antibodies' also include clones of antigen-rec steps of detecting the expression of at least one of the markers ognition and binding-site containing antibody fragments (FV described in Table 3. In some embodiments, the method also clones) isolated from phageantibody libraries using the tech includes the step of detecting the expression of at least one niques described in Clackson et al., Nature, 352:624–628 marker selected from the group consisting of OCT4, AFP. (1991) and Marks et al., J. Mol. Biol., 222:581-597 (1991), TM, SPARC and SOX7. In certain embodiments, the expres for example. sion of one or more markers selected from Table 3 is greater 0227 Furthermore, as described above, in some embodi than the expression of OCT4, AFP, TM, SPARC, or SOX7. ments, the antibodies can be “antibody fragments.” which 0222. As discussed below, expression levels of the mark refer to a portion of an intact antibody comprising the antigen ers may be determined by any method known to those skilled binding site or variable region of the intact antibody, wherein in the art, including but not limited to immunocytochemistry the portion is free of the constant heavy chain domains of the or quantitative PCR (Q-PCR). Embodiments of the present Fc region of the intact antibody. Examples of antibody frag invention relate to compositions useful in the detection and ments include Fab, Fab', Fab'-SH, F(ab'), and Fv fragments: quantitation of definitive endoderm cells. diabodies; any antibody fragment that is a polypeptide having US 2014/0134726 A1 May 15, 2014 a primary structure consisting of one uninterrupted sequence 87: 1874), strand displacement amplification (SDA), tran of contiguous amino acid residues (referred to herein as a scription-mediated amplification (TMA) (See, Kwoh (1989) “single-chain antibody fragment” or 'single chain polypep Proc. Natl. Acad Sci. USA 86: 1173), cycling probe technol tide'), including without limitation (1) single-chain Fv (scFv) ogy (CPT), solid phase amplification (SPA), nuclease depen molecules; (2) single chain polypeptides containing only one dent signal amplification (NDSA), rolling circle amplifica light chain variable domain, or a fragment thereof that con tion technology (RCA), Anchored strand displacement tains the three CDRs of the light chain variable domain, amplification, Solid-phase (immobilized) rolling circle without an associated heavy chain moiety; and (3) single amplification, Q Beta replicase amplification and other RNA chain polypeptides containing only one heavy chain variable polymerase mediated techniques (e.g., NASBA, Cangene, region, or a fragment thereof containing the three CDRs of the Mississauga, Ontario) are useful in the methods described heavy chain variable region, without an associated light chain herein. These and other techniques are also described in moiety. In some embodiments, antibody fragments further Berger (1987) Methods Enzymol. 152:307-316; Sambrook, encompass multispecific or multivalent structures formed Ausubel, Mullis (1987) U.S. Pat. Nos. 4,683,195 and 4,683, from the aforementioned antibody fragments. 202: Amheim (1990) C&EN 36-47; Lomell J. Clin. Chem., 0228. It will be appreciated that a wide variety of tech 35:1826 (1989); Van Brunt, Biotechnology, 8:291-294 niques are useful for the immunodetection of molecules (1990); Wu (1989) Gene 4:560; Sooknanan (1995) Biotech including the markers described herein. For example, flow nology 13:563-564. For example, several real-time reverse cytometry, immunomicroscopy, Western blotting, direct and transcription-PCR (RT-PCR) based techniques are available indirect sandwich assays, such as ELISA, Western blotting, that enable mRNA detection and quantitation. and the like. These techniques are described, herein and, for 0231. Detection and quantitation of nucleic acids is also example in Malik and Lillehoj (1994), herein incorporated by known to those skilled in the art. Non-limiting examples of reference in its entirety. amplification and detection techniques include, for example, TaqMan probe technology (See, European Patent EP 0543 Nucleic Acid-Based Detection of Definitive Endoderm 942), molecular beacon probe technology (See, Tyagi et al., 0229. In other embodiments, nucleic acid hybridization (1996) Nat. Biotech. 14:303-308.), Scorpion probe technol and/or amplification techniques are used to detect the pres ogy (See, Thewell (2000), Nucl. Acids Res. 28:3752), nano ence, absence and/or level of expression of a marker. For particle probe technology (See, Elghanian, et al. (1997) Sci example, the amount of transcript produced by certain genetic ence 277: 1078-1081.) and Amplifluor probe technology markers, such as any of the markers listed in Table 3, includ (See, U.S. Pat. Nos. 5,866,366; 6,090,592: 6,117,635; and ing but not limited to AGPAT3, APOA2, C20orf56, 6,117,986), fluorescence resonance energy transfer (FRET)- C21orf129, CALCR, CCL2, CER1, CMKOR1, CRIP1, based methods such as adjacent hybridization of probes (in CXCR4, CXorf1, DIO3, DIO3OS, EB-1, EHHADH, cluding probe-probe and probe-primer methods) (See, J. R. ELOVL2, EPSTI1, FGF17, FLJ10970, FLJ21195, Lakowicz, “Principles of Fluorescence Spectroscopy.” Klu FLJ22471, FLJ23514, FOXA2, FOXQ1, GATA4, GPR37, wer Academic/Plenum Publishers, New York, 1999). GSC, LOC283537, MYL7, NPPB, NTN4, PRSS2, 0232. Accordingly, some embodiments of the present RTN4RL1, SEMA3E, SIAT8D, SLC5A9, SLC40A1, invention relate to compositions useful in nucleic acid based SOX17, SPOCK3, TMOD1, TRPA1, TTN, AW166727, techniques for the detection, identification and/or quantita AI821586, BF941609, AI916532, BC034407, N63706 and tion of definitive endoderm cells in a cell population. Some AW772192, can be determined by nucleic acid hybridization embodiments relate to composition that include a first oligo techniques, such as, for example, Northern blots, slot blots, nucleotide that hybridizes to a first marker, such as those RNase protection, in, situ hybridization, and the like. Tech described herein, and a second oligonucleotide that hybrid niques for the detection and quantitation of specific nucleic izes to a second marker that other than the first marker, Such acids are well-known to those skilled in the art and are as those described herein. By way of example, the first and described in, for example, Ausubel et al., (1997), Current second oligonucleotides can hybridize to different markers Protocols of Molecular Biology, John Wiley and Sons (1997); selected from Table 3. In preferred embodiments, the first and Gene transcription. RNA analysis: essential techniques. second oligonucleotides can hybridize to different markers 1996. K. Docherty (ed.), Chichester; and Reue, K, mRNA selected from the group consisting of AGPAT3, APOA2, Ouantitation Techniques. Considerations for Experimental C20orf56, C21orf129, CALCR, CCL2, CER1, CMKOR1, Design and Application, (1998), J. Nutr. 128(11):203 CRIP1, CXCR4, CXorf1, DIO3, DIO3OS, EB-1, EHHADH, 8-2044, each of which is herein incorporated by reference in ELOVL2, EPSTI1, FGF17, FLJ10970, FLJ21195, their entirety. Several sophisticated techniques exist for the FLJ22471, FLJ23514, FOXA2, FOXQ1, GATA4, GPR37, sensitive and specific detection nucleic acids GSC, LOC283537, MYL7, NPPB, NTN4, PRSS2, 0230. In some embodiments, nucleic acid detection tech RTN4RL1, SEMA3E, SIAT8D, SLC5A9, SLC40A1, niques such as those described herein are used in combination SOX17, SPOCK3, TMOD1, TRPA1, TTN, AW166727, with nucleic acid amplification techniques. Several nucleic AI821586, BF941609, AI916532, BC034407, N63706 and acid amplification techniques are also useful in detection of AW772192. In other preferred embodiments, the oligonucle the presence/absence and/or level of expression markers. The otides hybridize to the above markers under stringent condi skilled artisan will appreciate that any nucleic acid amplifi tions. cation method that can be adapted to detect expression levels 0233. As used herein, oligonucleotides refer to polynucle of genes, such as ligase chain reaction (LCR) (See, Wu (1989) otides that are generally at least about 10 nucleotides in Genomics 4:560; Landegren (1988) Science 241:1077; Bar length, at least about 12 nucleotides in length, at least about ringer (1990) Gene 89:117), nucleic acid sequence-based 14 nucleotides in length, at least about 16 nucleotides in amplification (NASBA), self-sustained sequence replication length, at least about 18 nucleotides in length, at least about (3SR) (See, Guatelli (1990) Proc. Natl. Acad. Sci. USA, 20 nucleotides in length, at least about 22 nucleotides in US 2014/0134726 A1 May 15, 2014 24 length, at least about 24 nucleotides in length, at least about second time point is increased or decreased as compared to 26 nucleotides in length, at least about 28 nucleotides in expression of the marker at the first time point, then the length, at least about 30 nucleotides in length, at least about candidate differentiation factor is capable of promoting the 35 nucleotides in length, at least about 40 nucleotides in differentiation of definitive endoderm cells. length, at least about 45 nucleotides in length, at least about 0239. Some embodiments of the screening methods 50 nucleotides in length or greater than 50 nucleotides in described herein utilize cell populations or cell cultures length. which comprise human definitive endoderm cells. For 0234. It will be appreciated by one skilled in the art that example, the cell population can be a Substantially purified hybridization of the oligonucleotides to marker sequence is population of human definitive endoderm cells. Alternatively, achieved by selecting sequences which are at least Substan the cell population can be an enriched population of human tially complementary to the target or reference nucleic acid definitive endoderm cells, wherein at least about 90%, at least sequence. In some embodiments, this includes base-pairing about 91%, at least about 92%, at least about 93%, at least of the oligonucleotide target nucleic acid sequence over the about 94%, at least about 95%, at least about 96%, at least entire length of the oligonucleotide sequence. Such about 97% or greater than at least about 97% of the human sequences can be referred to as “fully complementary” with cells in the cell population are human definitive endoderm respect to each other. Where an oligonucleotide is referred to cells. In other embodiments described herein, the cell popu as “substantially complementary with respect to a nucleic lation comprises human cells wherein at least about 10%, at acid sequence herein, the two sequences can be fully comple least about 15%, at least about 20%, at least about 25%, at mentary, or they may form mismatches upon hybridization, least about 30%, at least about 35%, at least about 40%, at but retain the ability to hybridize under the conditions used to least about 45%, at least about 50%, at least about 55%, at detect the presence marker. least about 60%, at least about 65%, at least about 70%, at 0235. As set forth above, oligonucleotides of the embodi least about 75%, at least about 80%, at least about 85% or ments described herein may be used in an amplification reac greater than at least about 85% of the human cells are human tion. Accordingly, the compositions may include a poly definitive endoderm cells. In some embodiments, the cell merase, such as an RNA-dependent DNA polymerase. In population includes non-human cells such as non-human some embodiments, the RNA-dependent DNA polymerase is feeder cells. In other embodiments, the cell population a reverse transciptase. Moloney Murine Leukemia Virus includes human feeder cells. In Such embodiments, at least reverse transcriptase and Avian Myeloblastosis Virus (AMV) about 10%, at least about 15%, at least about 20%, at least reverse transcriptase are non-limiting examples of reverse about 25%, at least about 30%, at least about 35%, at least transcriptase enzymes commonly used by those skilled in the about 40%, at least about 45%, at least about 50%, at least art and which are useful in the embodiments described herein. about 55%, at least about 60%, at least about 65%, at least Further embodiments may also include DNA-dependent about 70%, at least about 75%, at least about 80%, at least DNA polymerases, such as Taq polymerase and the like. about 85%, at least about 90%, at least about 95% or greater 0236 Advantageously, the oligonucleotides described than at least about 95% of the human cells, other than said herein can include a label, Such as a radioactive label, a feeder cells, are human definitive endoderm cells. fluorescent label, or any other type of label that facilitates the detection and/or quantitation of nucleic acid markers, such as 0240. In embodiments of the screening methods described those described herein. herein, the cell population is contacted or otherwise provided 0237. In still other embodiments, Q-PCR and immunohis with a candidate (test) differentiation factor. The candidate tochemical techniques are both used to identify and deter differentiation factor can comprise any molecule that may mine the amount or relative proportions of Such markers. have the potential to promote the differentiation of human definitive endoderm cells. In some embodiments described Identification of Factors Capable of Promoting the herein, the candidate differentiation factor comprises a mol Differentiation of Definitive Endoderm Cells ecule that is known to be a differentiation factor for one or more types of cells. In alternate embodiments, the candidate 0238 Certain screening methods described herein relate differentiation factor comprises a molecule that in not known to methods for identifying at least one differentiation factor to promote cell differentiation. In preferred embodiments, the that is capable of promoting the differentiation of definitive candidate differentiation factor comprises molecule that is endoderm cells. In some embodiments of these methods, cell not known to promote the differentiation of human definitive populations comprising definitive endoderm cells. Such as endoderm cells. human definitive endoderm cells, are obtained. The cell popu lation is then provided with a candidate differentiation factor. 0241. In some embodiments of the screening methods At a first time point, which is prior to or at approximately the described herein, the candidate differentiation factor com same time as providing the candidate differentiation factor, prises a small molecule. In preferred embodiments, a small expression of a marker is determined. Alternatively, expres molecule is a molecule having a molecular mass of about sion of the marker can be determined after providing the 10,000 amu or less. In some embodiments, the small mol candidate differentiation factor. At a second time point, which ecule comprises a retinoid. In some embodiments, the Small is Subsequent to the first time point and Subsequent to the step molecule comprises retinoic acid. of providing the candidate differentiation factor to the cell 0242. In other embodiments described herein, the candi population, expression of the same marker is again deter date differentiation factor comprises a polypeptide. The mined. Whether the candidate differentiation factor is polypeptide can be any polypeptide including, but not limited capable of promoting the differentiation of the definitive to, a glycoprotein, a lipoprotein, an extracellular matrix pro endoderm cells is determined by comparing expression of the tein, a cytokine, a chemokine, a peptide hormone, an inter marker at the first time point with the expression of the marker leukin or a growth factor. Preferred polypeptides include at the second time point. If expression of the marker at the growth factors. In some preferred embodiments, the candi US 2014/0134726 A1 May 15, 2014 date differentiation factors comprises one or more growth tidine, Amphotericin B, Ascorbic acid, Ascrorbate, isobutylx factors selected from the group consisting of FGF10, FGF4. anthine, indomethacin, B-glycerolphosphate, nicotinamide, FGF2, Wnt3A and Wnt3B. DMSO, Thiazolidinediones, and TWS 119. 0243 In some embodiments of the screening methods 0244. In some embodiments of the screening methods described herein, the candidate differentiation factors com described herein, the candidate differentiation factor is pro prise one or more growth factors selected from the group vided to the cell population in one or more concentrations. In consisting of Amphiregulin, B-lymphocyte stimulator, IL-16, some embodiments, the candidate differentiation factor is Thymopoietin, TRAIL/Apo-2, Pre B cell colony enhancing provided to the cell population so that the concentration of the factor, Endothelial differentiation-related factor 1 (EDF1). candidate differentiation factor in the medium Surrounding Endothelial monocyte activating polypeptide II, Macrophage the cells ranges from about 0.1 ng/ml to about 10 mg/ml. In migration inhibitory factor (MIF), Natural killer cell enhanc Some embodiments, the concentration of the candidate dif ing factor (NKEFA), Bone mophogenetic protein 2. Bone ferentiation factor in the medium Surrounding the cells ranges mophogenetic protein 8 (osteogeneic protein 2), Bone mor from about 1 ng/ml to about 1 mg/ml. In other embodiments, phogenic protein 6, Bone morphogenic protein 7. Connective the concentration of the candidate differentiation factor in the tissue growth factor (CTGF), CGI-149 protein (neuroendo medium Surrounding the cells ranges from about 10 ng/ml to crine differentiation factor), Cytokine A3 (macrophage about 100 ng/ml. In still other embodiments, the concentra inflammatory protein 1-alpha), Gliablastoma cell differentia tion of the candidate differentiation factor in the medium tion-related protein (GBDR1), Hepatoma-derived growth surrounding the cells ranges from about 100 ng/ml to about 10 factor, Neuromedin U-25 precursor, Vascular endothelial ng/ml. In preferred embodiments, the concentration of the growth factor (VEGF), Vascular endothelial growth factor B candidate differentiation factor in the medium Surrounding (VEGF-B), T-cell specific RANTES precursor, thymic den the cells is about 5 ng/ml, about 25 ng/ml, about 50 ng/ml, dritic cell-derived factor 1, Transferrin, Interleukin-1 (IL 1), about 75 ng/ml, about 100 ng/ml, about 125 ng/ml, about 150 Interleukin-2 (IL2), Interleukin-3 (IL3), Interleukin-4 (IL4), ng/ml, about 175 ng/ml, about 200 ng/ml, about 225 ng/ml, Interleukin-5 (IL5), Interleukin-6 (IL6), Interleukin-7 (IL 7). about 250 ng/ml, about 275 ng/ml, about 300 ng/ml, about Interleukin-8 (IL 8), Interleukin-9 (IL. 9), Interleukin-10 (IL 325 ng/ml, about 350 ng/ml, about 375 ng/ml, about 400 10), Interleukin-11 (IL 11), Interleukin-12 (IL 12), Interleu ng/ml, about 425 ng/ml, about 450 ng/ml, about 475 ng/ml. kin-13 (IL 13), Granulocyte-colony stimulating factor about 500 ng/ml, about 525 ng/ml, about 550 ng/ml, about (G-CSF), Granulocyte macrophage colony stimulating factor 575 ng/ml, about 600 ng/ml, about 625 ng/ml, about 650 (GM-CSF), Macrophage colony stimulating factor (M-CSF), ng/ml, about 675 ng/ml, about 700 ng/ml, about 725 ng/ml, Erythropoietin, Thrombopoietin, Vitamin D, Epidermal about 750 ng/ml, about 775 ng/ml, about 800 ng/ml, about growth factor (EGF), Brain-derived neurotrophic factor, Leu 825 ng/ml, about 850 ng/ml, about 875 ng/ml, about 900 kemia inhibitory factor, Thyroid hormone, Basic fibroblast ng/ml, about 925 ng/ml, about 950 ng/ml, about 975 ng/ml, growth factor (bFGF), aFGF, FGF-4, FGF-6, Keratinocyte about 1 g/ml, about 2 g/ml, about 3 ug/ml, about 4 Lig/ml. growth factor (KGF). Platelet-derived growth factor (PDGF). about 5 g/ml, about 6 g/ml, about 7 Lig/ml, about 8 Lug/ml. Platelet-derived growth factor-BB, beta nerve growth factor, about 9 g/ml, about 10 ug/ml, about 11 Lug/ml, about 12 activin A, Transforming growth factor beta1 (TGF-B1), Inter ug/ml, about 13 Jug/ml, about 14 ug/ml, about 15 g/ml, about feron-C, Interferon-B, Interferon-Y, Tumor necrosis factor-C. 16 g/ml, about 17 g/ml, about 18 Lug/ml, about 19 Jug/ml. Tumor necrosis factor-B, Burst promoting activity (BPA), about 20 g/ml, about 25 g/ml, about 50 g/ml, about 75 Erythroid promoting activity (EPA), PGE, insulin growth ug/ml, about 100 ug/ml, about 125 ug/ml, about 150 ug/ml. factor-1 (IGF-1), IGF-II, Neutrophin growth factor (NGF), about 175 ug/ml, about 200 g/ml, about 250 ug/ml, about Neutrophin-3, Neutrophin 4/5, Ciliary neurotrophic factor, 300 ug/ml, about 350 ug/ml, about 400 ug/ml, about 450 Glial-derived nexin, Dexamethasone, 3-mercaptoethanol, ug/ml, about 500 ug/ml, about 550 ug/ml, about 600 ug/ml, Retinoic acid, Butylated hydroxyanisole, 5-azacytidine, about 650 ug/ml, about 700 g/ml, about 750 ug/ml, about Amphotericin B, Ascorbic acid, AScrorbate, isobutylxan 800 ug/ml, about 850 g/ml, about 900 ug/ml, about 950 thine, indomethacin, 3-glycerolphosphate, nicotinamide, ug/ml, about 1000 g/ml or greater than about 1000 ug/ml. DMSO, Thiazolidinediones, TWS 119, oxytocin, vaso pressin, melanocyte-stimulating hormone, corticortropin, 0245. In certain embodiments of the screening methods lipotropin, thyrotropin, growth hormone, prolactin, luteiniz described herein, the cell population is provided with a can ing hormone, human chorionic gonadotropin, follicle stimu didate differentiation factor which comprises any molecule lating hormone, corticotropin-releasing factor, gonadotropin other than foregut differentiation factor. For example, in some releasing factor, prolactin-releasing factor, prolactin embodiments, the cell population is provided with a candi inhibiting factor, growth-hormone releasing factor, date differentiation factor which comprises any molecule Somatostatin, thyrotropin-releasing factor, calcitonin gene other than a retinoid, a member of the TGFB Superfamily of related peptide, parathyroid hormone, glucagon-like peptide growth factors, FGF10 or FGF4. In some embodiments, the 1, glucose-dependent insulinotropic polypeptide, gastrin, cell population is provided with a candidate differentiation secretin, cholecystokinin, motilin, vasoactive intestinal pep factor which comprises any molecule other than retinoic acid. tide. Substance P. pancreatic polypeptide, peptide tyrosine 0246. In some embodiments, steps of the screening meth tyrosine, neuropeptide tyrosine, insulin, glucagon, placental ods described herein comprise determining expression of at lactogen, relaxin, angiotensin II, calctriol, atrial natriuretic least one marker at a first time point and a second time point. peptide, and melatonin. thyroxine, triiodothyronine, calcito In some of these embodiments, the first time point can be nin, estradiol, estrone, progesterone, testosterone, cortisol, prior to or at approximately the same time as providing the corticosterone, aldosterone, epinephrine, norepinepherine, cell population with the candidate differentiation factor. androstiene, calcitriol, collagen, Dexamethasone, B-mercap Alternatively, in Some embodiments, the first time point is toethanol, Retinoic acid. Butylated hydroxyanisole, 5-azacy Subsequent to providing the cell population with the candi US 2014/0134726 A1 May 15, 2014 26 date differentiation factor. In some embodiments, expression factor-1 (PDX1). In other embodiments, the marker is of a plurality of markers is determined at a first time point. homeobox A13 (HOXA13) or homeobox C6 (HOXC6). 0247 Some preferred markers for use in the above Additionally, in other embodiments, the marker is indicative embodiments include one or more markers selected from of liver cells or liver precursor cells. In certain preferred Table 3. In other preferred embodiments, the one or more embodiments, the marker is albumin, hepatocyte specific markers are selected from the group consisting of AGPAT3, antigen (HSA) or prospero-related homeobox 1 (PROX1). In APOA2, C20orf56, C21orf129, CALCR, CCL2, CER1, other embodiments, the marker is indicative of lung or lung CMKOR1, CRIP1, CXCR4, CXorf1, DIO3, DIO3OS, EB-1, precursor cells. In some preferred embodiments, the marker EHHADH, ELOVL2, EPSTI1, FGF17, FLJ10970, is thyroid transcription factor 1 (TITF1). In yet other embodi FLJ21195, FLJ22471, FLJ23514, FOXA2, FOXQ1, GATA4, ments, the marker is indicative of intestinal or intestinal pre GPR37, GSC, LOC283537, MYL7, NPPB, NTN4, PRSS2, cursor cells. In additional preferred embodiments, the marker RTN4RL1, SEMA3E, SIAT8D, SLC5A9, SLC40A1, is villin, glucose transporter-2 (GLUT2), apolipoprotein A1 SOX17, SPOCK3, TMOD1, TRPA1, TTN, AW166727, (APOA1), vascular cell adhesion molecule-1 (VACM1), von AI821586, BF941609, AI916532, BC034407, N63706 and Willebrand factor (VWF), CXC-type chemokine receptor 4 AW772192. In still other preferred embodiments, the one or (CXCR4) or caudal type homeobox transcription factor 2 more markers are selected from the group consisting of (CDX2). In still other embodiments, the marker is indicative CALCR, CMKOR1, CXCR4, GPR37, NTN4, RTN4RL1, of stomach or stomach precursor cells. In additional preferred SEMA3E, SLC40A1 SLC5A9 and TRPA1. embodiments, the marker is VCAM1, VWF or CXCR4. In 0248. In addition to determining expression of at least one other embodiments, the marker is indicative of thyroid or marker at a first time point, some embodiments of the screen thyroid precursor cells. In such embodiments, the marker is ing methods described herein contemplate determining TITF1. In still other embodiments, the marker is indicative of expression of at least one marker at a second time point, thymus or thymus precursor cells. which is subsequent to the first time point and which is sub 0251. In some embodiments of the screening methods sequent to providing the cell population with the candidate described herein, sufficient time is allowed to pass between differentiation factor. In such embodiments, expression of the providing the cell population with the candidate differentia same marker is determined at both the first and second time tion factor and determining marker expression at the second points. In some embodiments, expression of a plurality of time point. Sufficient time between providing the cell popu markers is determined at both the first and second time points. lation with the candidate differentiation factor and determin In such embodiments, expression of the same plurality of ing expression of the marker at the second time point can be markers is determined at both the first and second time points. as little as from about 1 hour to as much as about 10 days. In In some embodiments, marker expression is determined at a Some embodiments, the expression of at least one marker is plurality of time points, each of which is Subsequent to the determined multiple times Subsequent to providing the cell first time point, and each of which is Subsequent to providing population with the candidate differentiation factor. In some the cell population with the candidate differentiation factor. embodiments, Sufficient time is at least about 1 hour, at least In certain embodiments, marker expression is determined by about 6 hours, at least about 12 hours, at least about 18 hours, Q-PCR. In other embodiments, marker expression is deter at least about 24 hours, at least about 30 hours, at least about mined by immunocytochemistry. 36 hours, at least about 42 hours, at least about 48 hours, at 0249 Some preferred markers for use in the above least about 54 hours, at least about 60 hours, at least about 66 embodiments include one or more markers selected from hours, at least about 72 hours, at least about 78 hours, at least Table 3. In other preferred embodiments, the one or more about 84 hours, at least about 90 hours, at least about 96 hours, markers are selected from the group consisting of AGPAT3, at least about 102 hours, at least about 108 hours, at least APOA2, C20orf56, C21orf129, CALCR, CCL2, CER1, about 114 hours, at least about 120 hours, at least about 126 CMKOR1, CRIP1, CXCR4, CXorf1, DIO3, DIO3OS, EB-1, hours, at least about 132 hours, at least about 138 hours, at EHHADH, ELOVL2, EPSTI1, FGF17, FLJ10970, least about 144 hours, at least about 150 hours, at least about FLJ21195, FLJ22471, FLJ23514, FOXA2, FOXQ1, GATA4, 156 hours, at least about 162 hours, at least about 168 hours, GPR37, GSC, LOC283537, MYL7, NPPB, NTN4, PRSS2, at least about 174 hours, at least about 180 hours, at least RTN4RL1, SEMA3E, SIAT8D, SLC5A9, SLC40A1, about 186 hours, at least about 192 hours, at least about 198 SOX17, SPOCK3, TMOD1, TRPA1, TTN, AW166727, hours, at least about 204 hours, at least about 210 hours, at AI821586, BF941609, AI916532, BC034407, N63706 and least about 216 hours, at least about 222 hours, at least about AW772192. In still other preferred embodiments, the one or 228 hours, at least about 234 hours or at least about 240 hours. more markers are selected from the group consisting of 0252. In some embodiments of the methods described CALCR, CMKOR1, CXCR4, GPR37, NTN4, RTN4RL1, herein, it is further determined whether the expression of the SEMA3E, SLC40A1 SLC5A9 and TRPA1. marker at the second time point has increased or decreased as 0250 In certain embodiments of the screening methods compared to the expression of this marker at the first time described herein, the marker having its expression is deter point. An increase or decrease in the expression of the at least mined at the first and second time points is a marker that is one marker indicates that the candidate differentiation factor associated with the differentiation of human definitive endo is capable of promoting the differentiation of the definitive derm cells to cells which are the precursors of cells which endoderm cells. Similarly, if expression of a plurality of make up tissues and/or organs that are derived from the gut markers is determined, it is further determined whether the tube. In some embodiments, the tissues and/or organs that are expression of the plurality of markers at the second time point derived from the gut tube comprise terminally differentiated has increased or decreased as compared to the expression of cells. In some embodiments, the marker is indicative of pan this plurality of markers at the first time point. An increase or creatic cells or pancreatic precursor cells. In preferred decrease in marker expression can be determined by measur embodiments, the marker is pancreatic-duodenal homeobox ing or otherwise evaluating the amount, level or activity of the US 2014/0134726 A1 May 15, 2014 27 marker in the cell population at the first and second time 0256. It will be appreciated by those of skill in the art that points. Such determination can be relative to other markers, stem cells or other pluripotent cells can also be used as start for example housekeeping gene expression, or absolute. In ing material for the differentiation procedures described certain embodiments, wherein marker expression is herein. For example, cells obtained from embryonic gonadal increased at the second time point as compared with the first ridges, which can be isolated by methods known in the art, time point, the amount of increase is at least about 2-fold, at can be used as pluripotent cellular starting material. least about 5-fold, at least about 10-fold, at least about 20-fold, at least about 30-fold, at least about 40-fold, at least EXAMPLE 2 about 50-fold, at least about 60-fold, at least about 70-fold, at least about 80-fold, at least about 90-fold, at least about hESCyt-25 Characterization 100-fold or more than at least about 100-fold. In some embodiments, the amount of increase is less than 2-fold. In (0257. The human embryonic stem cell line, hESCyt-25 embodiments where marker expression is decreased at the has maintained a normal morphology, karyotype, growth and second time point as compared with the first time point, the self-renewal properties over 18 months in culture. This cell amount of decrease is at least about 2-fold, at least about line displays strong immunoreactivity for the OCT4, SSEA-4 5-fold, at least about 10-fold, at least about 20-fold, at least and TRA-1-60 antigens, all of which, are characteristic of about 30-fold, at least about 40-fold, at least about 50-fold, at undifferentiated hESCs and displays alkaline phosphatase least about 60-fold, at least about 70-fold, at least about activity as well as a morphology identical to other established 80-fold, at least about 90-fold, at least about 100-fold or more HESC lines. Furthermore, the human stem cell line, hESCyt than at least about 100-fold. In some embodiments, the 25, also readily forms embryoid bodies (EBs) when cultured amount of decrease is less than 2-fold. in Suspension. As a demonstration of its pluripotent nature, hESCyT-25 differentiates into various cell types that repre 0253) In some embodiments of the screening methods sent the three principal germ layers. Ectoderm production described herein, after providing the cell population with a was demonstrated by Q-PCR for ZIC1 as well as immunocy candidate differentiation factor, the human definitive endo tochemistry (ICC) for nestin and more mature neuronal mark derm cells differentiate into one or more cell types of the ers. Immunocytochemical staining for B-III tubulin was definitive endoderm lineage. In some embodiments, after pro observed in clusters of elongated cells, characteristic of early viding the cell population with a candidate differentiation neurons. Previously, we treated EBs in suspension with ret factor, the human definitive endoderm cells differentiate into inoic acid, to induce differentiation of pluripotent stem cells cells that are derived from the guttube. Such cells include, but to visceral endoderm (VE), an extra-embryonic lineage. are not limited to, cells of the pancreas, liver, lungs, stomach, Treated cells expressed high levels of C.-fetoprotein (AFP) intestine, thyroid, thymus, pharynx, gallbladder and urinary and SOX7, two markers of VE, by 54 hours of treatment. bladder as well as precursors of such cells. Additionally, these Cells differentiated in monolayer expressed AFP in sporadic cells can further develop into higher order structures Such as patches as demonstrated by immunocytochemical staining. tissues and/or organs. As will be described below, the hESCyT-25 cell line was also 0254 Having generally described this invention, a further capable of forming definitive endoderm, as validated by real understanding can be obtained by reference to certain specific time quantitative polymerase chain reaction (Q-PCR) and examples which are provided herein for purposes of illustra immunocytochemistry for SOX17 and HNF3beta in the tion only, and are not intended to be limiting. absence of AFP and SOX7 expression. To demonstrate dif ferentiation to mesoderm, differentiating EBs were analyzed EXAMPLE1 for Brachyury gene expression at several time points. Brachyury expression increased progressively over the Human ES Cells course of the experiment. In view of the foregoing, the hES CyT-25 line is pluripotent as shown by the ability to form cells 0255 For our studies of endoderm development we representing the three germ layers as well as extraembryonic employed human embryonic stem cells, which are pluripo lineages. tent and can divide seemingly indefinitely in culture while maintaining a normal karyotype. ES cells were derived from EXAMPLE 3 the 5-day-old embryo inner cell mass using either immuno logical or mechanical methods for isolation. In particular, the Production of SOX17 Antibody human embryonic stem cell line hESCyt-25 was derived from a supernumerary frozen embryo from an in vitro fertilization 0258 Aprimary obstacle to the identification of definitive cycle following informed consent by the patient. Upon thaw endoderm in hESC cultures is the lack of appropriate tools. ing the hatched blastocyst was plated on mouse embryonic We therefore undertook the production of an antibody raised fibroblasts (MEF), in ES medium (DMEM, 20% FBS, non against human SOX17 protein. essentialamino acids, beta-mercaptoethanol, and FGF2). The (0259. The marker SOX17 is expressed throughout the embryo adhered to the culture dish and after approximately definitive endoderm as it forms during gastrulation and its two weeks, regions of undifferentiated hESCs were trans expression is maintained in the gut tube (although levels of ferred to new dishes with MEFs. Transfer was accomplished expression vary along the A-P axis) until around the onset of with mechanical cutting and a brief digestion with dispase, organogenesis. SOX17 is also expressed in a Subset of extra followed by mechanical removal of the cell clusters, washing embryonic endoderm cells. No expression of this protein has and re-plating. Since derivation, hESCyt-25 has been serially been observed in mesoderm or ectoderm. It has now been passaged over 100 times. We employed the hESCyt-25 discovered that SOX17 is an appropriate marker for the human embryonic stem cell line as our starting material for definitive endoderm lineage when used in conjunction with the production of definitive endoderm. markers to exclude extra-embryonic lineages. US 2014/0134726 A1 May 15, 2014 28

0260. As described in detail herein, the SOX17 antibody (Invitrogen, Carlsbad, Calif.). Total cellular lysates were col was utilized to specifically examine effects of various treat lected 36 hours post-transfection in 50 mMTRIS-HCl (pH8), ments and differentiation procedures aimed at the production 150 mM NaCl, 0.1% SDS, 0.5% deoxycholate, containing a of SOX17 positive definitive endoderm cells. Other antibod cocktail of protease inhibitors (Roche Diagnostics Corpora ies reactive to AFP, SPARC and Thrombomodulin were also tion, Indianapolis, Ind.). Western blot analysis of 100 ug of employed to rule out the production of visceral and parietal cellular proteins, separated by SDS-PAGE on NuPAGE endoderm (extra-embryonic endoderm). (4-12% gradient polyacrylamide, Invitrogen, Carlsbad, 0261. In order to produce an antibody against SOX17, a Calif.), and transferred by electro-blotting onto PDVF mem portion of the human SOX17 cDNA (SEQID NO: 1) corre branes (Hercules, Calif.), were probed with a 1/1000 dilution sponding to amino acids 172-414 (SEQ ID NO: 2) in the of the rat SOX17 anti-serum in 10 mMTRIS-HCl (pH8), 150 carboxyterminal end of the SOX17 protein (FIG. 2) was used mM NaCl, 10% BSA, 0.05% Tween-20 (Sigma, St. Louis, for genetic immunization in rats at the antibody production Mo.), followed by Alkaline Phosphatase conjugated anti-rat company, GENOVAC (Freiberg, Germany), according to IgG (Jackson ImmunoResearch Laboratories, West Grove, procedures developed there. Procedures for genetic immuni Pa.), and revealed through Vector Black Alkaline Phosphatase Zation can be found in U.S. Pat. Nos. 5,830,876, 5,817,637, staining (Vector Laboratories, Burlingame, Calif.). The pro 6,165,993 and 6,261,281 as well as International Patent teins size standard used was wide range color markers Application Publication Nos. WO00/29442 and WO99/ (Sigma, St. Louis, Mo.). 13915, the disclosures of which are incorporated herein by 0264. In FIG.4, protein extracts made from human fibro reference in their entireties. blast cells that were transiently transfected with SOX17, 0262. Other suitable methods for genetic immunization SOX7 or EGFP cDNAs were probed on Western blots with are also described in the non-patent literature. For example, the SOX17 antibody. Only the protein extract from hSOX17 Barry et al. describe the production of monoclonal antibodies transfected cells produced a band of ~51 Kda which closely by genetic immunization in Biotechniques 16: 616-620, matched the predicted 46 Kda molecular weight of the human 1994, the disclosure of which is incorporated herein by ref SOX17 protein. There was no reactivity of the SOX17 anti erence in its entirety. Specific examples of genetic immuni body to extracts made from either human SOX7 or EGFP Zation methods to produce antibodies against specific pro transfected cells. Furthermore, the SOX17 antibody clearly teins can be found, for example, in Costaglia et al., (1998) labeled the nuclei of human fibroblast cells transfected with Genetic immunization against the human thyrotropin recep the hSOX17 expression construct but did not label cells trans tor causes thyroiditis and allows production of monoclonal fected with EGFP alone. As such, the SOX17 antibody exhib antibodies recognizing the native receptor, J. Immunol. 160: its specificity by ICC. 1458-1465; Kilpatricketal (1998) Genegun delivered DNA based immunizations mediate rapid production of murine EXAMPLE 4 monoclonal antibodies to the Flt-3 receptor, Hybridoma 17: 569-576; Schmolke et al., (1998) Identification of hepatitis G Validation of SOX17 Antibody as a Marker of virus particles in human serum by E2-specific monoclonal Definitive Endoderm antibodies generated by DNA immunization, J. Virol. 72: 0265. As evidence that the SOX17 antibody is specific for 4541-4545; Krasemann et al., (1999) Generation of mono human SOX17 protein and furthermore marks definitive clonal antibodies against proteins with an unconventional endoderm, partially differentiated hESCs were co-labeled nucleic acid-based immunization strategy, J. Biotechnol. 73: with SOX17 and AFP antibodies. It has been demonstrated 119-129; and Ulivieri et al., (1996) Generation of a mono that SOX17, SOX7, which is a closely related member of the clonal antibody to a defined portion of the Heliobacter pylori SOX gene family subgroup F (FIG. 3), and AFP are each vacuolating cytotoxin by DNA immunization, J. Biotechnol. expressed in visceral endoderm. However, AFP and SOX7 are 51:191-194, the disclosures of which are incorporated herein not expressed in definitive endoderm cells at levels detectable by reference in their entireties. by ICC, and thus, they be can employed as negative markers 0263 SOX7 and SOX18 are the closest Sox family rela for bonifide definitive endoderm cells. It was shown that tives to SOX17 as depicted in the relational dendrogram SOX17 antibody labels populations of cells that exist as dis shown in FIG. 3. We employed the human SOX7 polypeptide crete groupings of cells or are intermingled with AFP positive as a negative control to demonstrate that the SOX17 antibody cells. In particular, FIG. 5A shows that small numbers of is specific for SOX17 and does not react with its closest SOX17 cells were co-labeled with AFP; however, regions family member. In particular, to demonstrate that the anti were also found where there were little or no AFP cells in the body produced by genetic immunization is specific for field of SOX17" cells (FIG. 5B). Similarly, since parietal SOX17, SOX7 and other proteins were expressed in human endoderm has also been reported to express SOX17, antibody fibroblasts, and then, analyzed for cross reactivity with the co-labeling with SOX17 together with the parietal markers SOX17 antibody by Western blot and ICC. For example, the SPARC and/or Thrombomodulin (TM) can be used to iden following methods were utilized for the production of the tify the SOX17" cells which are parietal endoderm. As shown SOX17, SOX7 and EGFP expression vectors, their transfec in FIGS. 6A-C, Thrombomodulin and SOX17 co-labelled tion into human fibroblasts and analysis by Western blot. parietal endoderm cells were produced by random differen Expression vectors employed for the production of SOX17, tiation of hES cells. SOX7, and EGFP were pCMV6 (OriGeneTechnologies, Inc., 0266. In view of the above cell labelling experiments, the Rockville, Md.), pCMV-SPORT6 (Invitrogen, Carlsbad, identity of a definitive endoderm cell can be established by Calif.) and plGFP-N1 (Clonetech, Palo Alto, Calif.), respec the marker profile SOX17”/AFP/ITM' or SPARC'). In tively. For protein production, telomerase immortalized other words, the expression of the SOX17 marker is greater MDX human fibroblasts were transiently transfected with than the expression of the AFP marker, which is characteristic supercoiled DNA in the presence of Lipofectamine 2000 of visceral endoderm, and the TM or SPARC markers, which US 2014/0134726 A1 May 15, 2014 29 are characteristic of parietal endoderm. Accordingly, those for “differentiation bias’ within the population. The bias cells positive for SOX17 but negative for AFP and negative towards a particular differentiation pathway, prior to the overt for TM or SPARC are definitive endoderm. differentiation of those cellular phenotypes, is unrecogniz 0267 As a further evidence of the specificity of the speci able using immunocytochemical techniques. For this reason, ficity of the SOX17'/AFP/TM/SPARC' marker profile as Q-PCR provides a method of analysis that is at least comple predictive of definitive endoderm, SOX17 and AFP gene mentary and potentially much Superior to immunocytochemi expression was quantitatively compared to the relative num cal techniques for screening the Success of differentiation ber of antibody labeled cells. As shown in FIG. 7A, hESCs treatments. Additionally, Q-PCR provides a mechanism by treated with retinoic acid (visceral endoderm inducer), or which to evaluate the success of a differentiation protocol in Activin A (definitive endoderm inducer), resulted in a 10-fold a quantitative format at semi-high throughput scales of analy difference in the level of SOX17 mRNA expression. This S1S. result mirrored the 10-fold difference in SOX17 antibody 0271 The approach taken here was to perform relative labeled cell number (FIG.7B). Furthermore, as shown in FIG. quantitation using SYBR Green chemistry on a Rotor Gene 8A, Activin A treatment of hESCs suppressed AFP gene 3000 instrument (Corbett Research) and a two-step RT-PCR expression by 6.8-fold in comparison to no treatment. This format. Such an approach allowed for the banking of cDNA was visually reflected by a dramatic decrease in the number of samples for analysis of additional marker genes in the future, AFP labeled cells in these cultures as shown in FIGS. 8B-C. thus avoiding variability in the reverse transcription effi To quantify this further, it was demonstrated that this approxi ciency between samples. mately 7-fold decrease in AFP gene expression was the result of a similar 7-fold decrease in AFP antibody-labeled cell 0272 Primers were designed to lie over exon-exon bound number as measured by flow cytometry (FIGS. 9A-B). This aries or span introns of at least 800 bp when possible, as this result is extremely significant in that it indicates that quanti has been empirically determined to eliminate amplification tative changes in gene expression as seen by Q-PCR mirror from contaminating genomic DNA. When marker genes were changes in cell type specification as observed by antibody employed that do not contain introns or they possess pseudo staining. genes, DNase I treatment of RNA samples was performed. 0268 Incubation of hESCs in the presence of Nodal fam 0273 We routinely used Q-PCR to measure the gene ily members (Nodal, Activin A and Activin B-NAA) resulted expression of multiple markers of target and non-target cell in a significant increase in SOX17 antibody-labeled cells over types in order to provide a broad profile description of gene time. By 5 days of continuous activin treatment greater than expression in cell samples. The markers relevant for the early 50% of the cells were labeled with SOX17(FIGS. 1A-F). phases of hESC differentiation (specifically ectoderm, meso There were few or no cells labeled with AFP after 5 days of derm, definitive endoderm and extra-embryonic endoderm) activin treatment. and for which validated primer sets are available are provided 0269. In summary, the antibody produced against the car below in Table 1. The human specificity of these primer sets boxy-terminal 242 amino acids of the human SOX17 protein has also been demonstrated. This is an important fact since the identified human SOX17 protein on Western blots but did not hESCs were often grown on mouse feeder layers. Most typi recognize SOX7, it’s closest Sox family relative. The SOX17 cally, triplicate samples were taken for each condition and antibody recognized a subset of cells in differentiating HESC independently analyzed in duplicate to assess the biological cultures that were primarily SOX17"/AFP' (greater than variability associated with each quantitative determination. 95% of labeled cells) as well as a small percentage (<5%) of 0274. To generate PCR template, total RNA was isolated cells that co-label for SOX17 and AFP (visceral endoderm). Treatment of HESC cultures with activins resulted in a using RNeasy (Qiagen) and quantitated using RiboGreen marked elevation of SOX17 gene expression as well as (Molecular Probes). Reverse transcription from 350-500 ng SOX17 labeled cells and dramatically suppressed the expres of total RNA was carried out using the iScript reverse tran Sion of AFP mRNA and the number of cells labeled with AFP Scriptase kit (BioRad), which contains a mix of oligo-dT and antibody. random primers. Each 20 LL reaction was Subsequently diluted up to 100 uL total volume and 3 uI was used in each 10 uL Q-PCR reaction containing 400 nM forward and EXAMPLE 5 reverse primers and 5 uL 2X SYBR Green master mix Q-PCR Gene Expression Assay (Qiagen). Two step cycling parameters were used employing a 5 second denature at 85-94°C. (specifically selected accord 0270. In the following experiments, real-time quantitative ing to the melting temp of the amplicon for each primer set) RT-PCR (Q-PCR) was the primary assay used for screening followed by a 45 second anneal/extend at 60° C. Fluorescence the effects of various treatments on HESC differentiation. In data was collected during the last 15 seconds of each exten particular, real-time measurements of gene expression were sion phase. A three point, 10-fold dilution series was used to analyzed for multiple marker genes at multiple time points by generate the standard curve for each run and cycle thresholds Q-PCR. Marker genes characteristic of the desired as well as (Ct’s) were converted to quantitative values based on this undesired cell types were evaluated to gain a better under standard curve. The quantitated values for each sample were standing of the overall dynamics of the cellular populations. normalized to housekeeping gene performance and then aver The strength of Q-PCR analysis includes its extreme sensi age and standard deviations were calculated for triplicate tivity and relative ease of developing the necessary markers, samples. At the conclusion of PCR cycling, a melt curve as the genome sequence is readily available. Furthermore, the analysis was performed to ascertain the specificity of the extremely high sensitivity of Q-PCR permits detection of reaction. A single specific product was indicated by a single gene expression from a relatively small number of cells peak at the Tappropriate for that PCR amplicon. In addition, within a much larger population. In addition, the ability to reactions performed without reverse transcriptase served as detect very low levels of gene expression provides indications the negative control and do not amplify. US 2014/0134726 A1 May 15, 2014 30

0275 A first step in establishing the Q-PCR methodology patterns of differentiation to definitive endoderm and extra was validation of appropriate housekeeping genes (HGs) in embryonic endoderm as well as to mesodermal and neural the experimental system. Since the HG was used to normalize cell types. across samples for the RNA input, RNA integrity and RT 0279. In view of the above data it is clear that increasing efficiency, it was of value that the HG exhibited a constant doses of activin resulted in increasing SOX17 gene expres level of expression over time in all sample types in order for sion. Further this SOX17 expression predominantly repre the normalization to be meaningful. We measured the expres sented definitive endoderm as opposed to extra-embryonic sion levels of Cyclophilin G, hypoxanthine phosphoribosyl endoderm. This conclusion stems from the observation that transferase 1 (HPRT), beta-2-microglobulin, hydroxymeth SOX17 gene expression was inversely correlated with AFP, ylbiane synthase (HMBS), TATA-binding protein (TBP), and SOX7, and SPARC gene expression. glucuronidase beta (GUS) in differentiating hESCs. Our results indicated that beta-2-microglobulin expression levels increased over the course of differentiation and therefore we EXAMPLE 6 excluded the use of this gene for normalization. The other genes exhibited consistent expression levels overtime as well Directed Differentiation of Human ES Cells to as across treatments. We routinely used both Cyclophilin G Definitive Endoderm and GUS to calculate a normalization factor for all samples. The use of multiple HGs simultaneously reduces the variabil 0280 Human ES cell cultures will randomly differentiate ity inherent to the normalization process and increases the if they are cultured under conditions that do not actively reliability of the relative gene expression values. maintain their undifferentiated state. This heterogeneous dif 0276 After obtaining genes for use in normalization, ferentiation results in production of extra-embryonic endo Q-PCR was then utilized to determine the relative gene derm cells comprised of both parietal and visceral endoderm expression levels of many marker genes across Samples (AFP, SPARC and SOX7 expression) as well as early ecto receiving different experimental treatments. The marker dermal and mesodermal derivatives as marked by ZIC1 and genes employed have been chosen because they exhibit Nestin (ectoderm) and Brachyury (mesoderm) expression. enrichment in specific populations representative of the early Definitive endoderm cell appearance has not traditionally germ layers and in particular have focused on sets of genes been examined or specified for lack of specific antibody that are differentially expressed in definitive endoderm and markers in ES cell cultures. As such, and by default, early extra-embryonic endoderm. These genes as well as their rela definitive endoderm production in ES cell cultures has not tive enrichment profiles are highlighted in Table 1. been well studied. Since no good antibody reagents for defini tive endoderm cells have been available, most of the charac TABLE 1. terization has focused on ectoderm and extra-embryonic endoderm. Overall, there are significantly greater numbers of Germ. Layer Gene Expression Domains extra-embryonic and neurectodermal cell types in compari Endoderm SOX17 definitive, visceral and parietal endoderm son to SOX17" definitive endoderm cells in randomly differ MDXL1 endoderm and mesoderm entiated ES cell cultures. GATA4 definitive and primitive endoderm HNF3b definitive endoderm and primitive endoderm, 0281. As undifferentiated HESC colonies expand on a bed mesoderm, neural plate of fibroblast feeders the edges of the colony take on alterna GSC endoderm and mesoderm tive morphologies that are distinct from those cells residing SOX7 visceral endoderm within the interior of the colony. Many of these outer edge Extra- AFP visceral endoderm, liver embryonic SPARC parietal endoderm cells can be distinguished by their less uniform, larger cell TM parietal endoderm?trophectoderm body morphology and by the expression of higher levels of Ectoderm ZIC1 neural tube, neural progenitors OCT4. It has been described that as ES cells begin to differ Mesoderm BRACH nascent mesoderm entiate they alter the levels of OCT4 expression up or down relative to undifferentiated ES cells. Alteration of OCT4 lev 0277 Since many genes are expressed in more than one els above or below the undifferentiated threshold may signify germ layer it is useful to quantitatively compare expression the initial stages of differentiation away from the pluripotent levels of many genes within the same experiment. SOX17 is State. expressed in definitive endoderm and to a smaller extent in 0282. When undifferentiated colonies were examined by visceral and parietal endoderm. SOX7 and AFP are expressed SOX17 immunocytochemistry, occasionally small 10-15-cell in visceral endoderm at this early developmental time point. clusters of SOX17-positive cells were detected at random SPARC and TM are expressed in parietal endoderm and locations on the periphery and at the junctions between undif Brachyury is expressed in early mesoderm. ferentiated ESC colonies. As noted above, these scattered 0278 Definitive endoderm cells were predicted to express pockets of outer colony edges appeared to be some of the first high levels of SOX17 mRNA and low levels of AFP and cells to differentiate away from the classical ESC morphol SOX7 (visceral endoderm), SPARC (parietal endoderm) and ogy as the colony expanded in size and became more Brachyury (mesoderm). In addition, ZIC1 was used here to crowded.Younger, smaller fully undifferentiated colonies (<1 further rule out induction of early ectoderm. Finally, GATA4 mm, 4-5 days old) showed no SOX17 positive cells within or and HNF3b were expressed in both definitive and extra-em at the edges of the colonies while older, larger colonies (1-2 bryonic endoderm, and thus, correlate with SOX17 expres mm diameter, D5 days old) had sporadic isolated patches of sion in definitive endoderm (Table 1). A representative SOX17 positive, AFP negative cells at the periphery of some experiment is shown in FIGS. 11-14 which demonstrates how colonies or in regions interior to the edge that were differen the marker genes described in Table 1 correlate with each tiated away from classical HESC morphology described pre other among the various samples, thus highlighting specific viously. Given that this was the first development of an effec US 2014/0134726 A1 May 15, 2014

tive SOX17 antibody, definitive endoderm cells generated in derm. In some embodiments of the present invention, it is such early “undifferentiated ESC cultures have never been therefore valuable to remove BMP4 from the treatment previously demonstrated. within 4 days of addition. 0283 Based on negative correlations of SOX17 and (0289. To determine the effect of TGFB factor treatment at SPARC gene expression levels by Q-PCR, the vast majority the individual cell level, a time course of TGFB factor addi of these SOX17 positive, AFP negative cells will be negative tion was examined using SOX17 antibody labeling. As pre for parietal markers by antibody co-labeling. This was spe viously shown in FIGS. 10A-F, there was a dramatic increase cifically demonstrated for TM-expressing parietal endoderm in the relative number of SOX17 labeled cells over time. The cells as shown in FIGS. 15A-B. Exposure to Nodal factors relative quantitation (FIG. 20) shows more than a 20-fold Activin A and B resulted in a dramatic decrease in the inten increase in SOX17-labeled cells. This result indicates that both the numbers of cells as well SOX17 gene expression sity to TM expression and the number of TM positive cells. level are increasing with time of TGFB factor exposure. As By triple labeling using SOX17, AFP and TM antibodies on shown in FIG. 21, after four days of exposure to Nodal, an activin treated culture, clusters of SOX17 positive cells ActivinA, Activin Band BMP4, the level of SOX17 induction which were also negative for AFP and TM were observed reached 168-fold over undifferentiated hESCS. FIG. 22 (FIGS. 16A-D). These are the first cellular demonstrations of shows that the relative number of SOX17-positive cells was SOX17 positive definitive endoderm cells in differentiating also dose responsive. Activin A doses of 100 ng/mL or more ESC cultures (FIGS. 16A-D and 17). were capable of potently inducing SOX17 gene expression 0284 With the SOX17 antibody and Q-PCR tools and cell number. described above we have explored a number of procedures 0290. In addition to the TGFB family members, the Wnt capable of efficiently programming ESCs to become family of molecules may play a role in specification and/or SOX17"/AFP/SPARC/TM' definitive endoderm cells. We maintenance of definitive endoderm. The use of Wnt mol applied a variety of differentiation protocols aimed at increas ecules was also beneficial for the differentiation of hESCs to ing the number and proliferative capacity of these cells as definitive endoderm as indicted by the increased SOX17 gene measured at the population level by Q-PCR for SOX17 gene expression in samples that were treated with activins plus expression and at the level of individual cells by antibody Wnt3a over that of activins alone (FIG. 23). labeling of SOX17 protein. 0291 All of the experiments described above were per 0285. We were the first to analyze and describe the effect formed using tissue culture medium containing 10% serum of TGFB family growth factors, such as Nodal/activin/BMP. with added factors. Surprisingly, we discovered that the con for use in creating definitive endoderm cells from embryonic centration of serum had an effect on the level of SOX17 stem cells in in vitro cell cultures. In typical experiments, expression in the presence of added activins as shown in Activin A, Activin B, BMP or combinations of these growth FIGS. 24A-C. When serum levels were reduced from 10% to factors were added to cultures ofundifferentiated human stem 2%, SOX17 expression tripled in the presence of Activins A cell line hESCyt-25 to begin the differentiation process. and B. 0292 Finally, we demonstrated that activin induced 0286 As shown in FIG. 19, addition of Activin A at 100 SOX17" cells divide in culture as depicted in FIGS. 25A-D. ng/ml resulted in a 19-fold induction of SOX17 gene expres The arrows show cells labeled with SOX17/PCNA/DAPI that sion vs. undifferentiated hESCs by day 4 of differentiation. are in mitosis as evidenced by the PCNA/DAPI-labeled Adding Activin B, a second member of the activin family, mitotic plate pattern and the phase contrast mitotic profile. together with Activin A, resulted in a 37-fold induction over undifferentiated hESCs by day 4 of combined activin treat EXAMPLE 7 ment. Finally, adding a third member of the TGFB family from the Nodal/Activin and BMP subgroups, BMP4, together Chemokine Receptor 4 (CXCR4) Expression with Activin A and Activin B, increased the fold induction to Correlates with Markers for Definitive 57 times that of undifferentiated hESCs (FIG. 19). When SOX17 induction with activins and BMP was compared to no 0293 Endoderm and not Markers for Mesoderm, Ecto factor medium controls 5-, 10-, and 15-fold inductions derm or Visceral Endoderm resulted at the 4-day time point. By five days of triple treat 0294 As described above, ESCs can be induced to differ ment with Activins A, B and BMP, SOX17 was induced more entiate to the definitive endoderm germ layer by the applica than 70 times higher than hESCs. These data indicate that tion of cytokines of the TGFB family and more specifically of higher doses and longer treatment times of the Nodal/activin the activin/nodal subfamily. Additionally, we have shown that TGFB family members results in increased expression of the proportion of fetal bovine serum (FBS) in the differentia SOX17. tion culture medium effects the efficiency of definitive endo derm differentiation from ESCs. This effect is such that at a 0287 Nodal and related molecules Activin A, B and BMP given concentration of activin A in the medium, higher levels facilitate the expression of SOX17 and definitive endoderm of FBS will inhibit maximal differentiation to definitive endo formation in vivo or in vitro. Furthermore, addition of BMP derm. In the absence of exogenous activin A, differentiation results in an improved SOX17 induction possibly through the of ESCs to the definitive endoderm lineage is very inefficient further induction of Cripto, the Nodal co-receptor. and the FBS concentration has much milder effects on the 0288 We have demonstrated that the combination of differentiation process of ESCs. Activins A and B together with BMP4 result in additive 0295). In these experiments, hESCs were differentiated by increases in SOX17 induction and hence definitive endoderm growing in RPMI medium (Invitrogen, Carlsbad, Calif.; formation. BMP4 addition for prolonged periods (>4 days), cathé1870-036) supplemented with 0.5%, 2.0% or 10% FBS in combination with Activin A and B may induce SOX17 in and either with or without 100 ng/mL activin A for 6 days. In parietal and visceral endoderm as well as definitive endo addition, a gradient of FBS ranging from 0.5% to 2.0% over US 2014/0134726 A1 May 15, 2014 32 the first three days of differentiation was also used in con exogenous activin A was used. Under these conditions the junction with 100 ng/mL of activin A. After the 6 days, SOX17 cells also appeared in clusters and these clusters replicate samples were collected from each culture condition were smaller and much more rare than those found in the high and analyzed for relative gene expression by real-time quan activin A, low FBS treatment (FIGS. 29 C and F). These titative PCR. The remaining cells were fixed for immunof results demonstrate that the CXCR4 expression patterns not luorescent detection of SOX17 protein. only correspond to definitive endoderm gene expression but 0296. The expression levels of CXCR4 varied dramati also to the number of definitive endoderm cells in each con cally across the 7 culture conditions used (FIG. 26). In gen dition. eral, CXCR4 expression was high inactivin Atreated cultures (A100) and low in those which did not receive exogenous EXAMPLE 8 activin A (NF). In addition, among the A100 treated cultures, CXCR4 expression was highest when FBS concentration was Differentiation Conditions that Enrich for Definitive lowest. There was a remarkable decrease in CXCR4 level in Endoderm Increase the Proportion of CXCR4 the 10% FBS condition such that the relative expression was Positive Cells more in line with the conditions that did not receive activin A 0301 The dose of activin A also effects the efficiency at (NF). which definitive endoderm can be derived from ESCs. This 0297. As described above, expression of the SOX17, GSC, example demonstrates that increasing the dose of activin A MIXL1. and HNF3? genes is consistent with the character increases the proportion of CXCR4" cells in the culture. ization of a cell as definitive endoderm. The relative expres (0302 hESCs were differentiated in RPMI media supple sion of these four genes across the 7 differentiation conditions mented with 0.5%-2% FBS (increased from 0.5% to 1.0% to mirrors that of CXCR4 (FIGS. 27A-D). This demonstrates 2.0% over the first 3 days of differentiation) and either 0, 10. that CXCR4 is also a marker of definitive endoderm. or 100 ng/mL of activin A. After 7 days of differentiation the 0298 Ectoderm and mesoderm lineages can be distin cells were dissociated in PBS without Ca"/Mg" containing guished from definitive endoderm by their expression of vari 2% FBS and 2 mM (EDTA) for 5 minutes at room tempera ous markers. Early mesoderm expresses the genes Brachyury ture. The cells were filtered through 35 um nylon filters, and MOX1 while nascent neuro-ectoderm expresses SOX1 counted and pelleted. Pellets were resuspended in a small and ZIC1. FIGS. 28A-D demonstrate that the cultures which volume of 50% human serum/50% normal donkey serum and did not receive exogenous activin A were preferentially incubated for 2 minutes on ice to block non-specific antibody enriched for mesoderm and ectoderm gene expression and binding sites. To this, 1 uL of mouse anti-CXCR4 antibody that among the activin A treated cultures, the 10% FBS con (Abeam, catfiab10403-100) was added per 50 uL (containing dition also had increased levels of mesoderm and ectoderm approximately 10 cells) and labeling proceeded for 45 min marker expression. These patterns of expression were inverse utes on ice. Cells were washed by adding 5 mL of PBS to that of CXCR4 and indicated that CXCR4 was not highly containing 2% human serum (buffer) and pelleted. A second expressed in mesoderm or ectoderm derived from ESCs at wash with 5 mL of buffer was completed then cells were this developmental time period. resuspended in 50 uL buffer per 10 cells. Secondary anti 0299 Early during mammalian development, differentia body (FITC conjugated donkey anti-mouse; Jackson Immu tion to extra-embryonic lineages also occurs. Of particular noResearch, cati 715-096-151) was added at 5 ug/mL final relevance here is the differentiation of visceral endoderm that concentration and allowed to label for 30 minutes followed by shares the expression of many genes in common with defini two washes in buffer as above. Cells were resuspended at tive endoderm, including SOX17. To distinguish definitive 5x10° cells/mL in buffer and analyzed and sorted using a endoderm from extra-embryonic visceral endoderm one FACS Vantage (Beckton Dickenson) by the staff at the flow should examine a marker that is distinct between these two. cytometry core facility (The Scripps Research Institute). SOX7 represents a marker that is expressed in the visceral Cells were collected directly into RLT lysis buffer (Qiagen) endoderm but not in the definitive endoderm lineage. Thus, for subsequent isolation of total RNA for gene expression culture conditions that exhibit robust SOX17 gene expression analysis by real-time quantitative PCR. in the absence of SOX7 expression are likely to contain (0303. The number of CXCR4" cells as determined by flow definitive and not visceral endoderm. It is shown in FIG. 28E cytometry were observed to increase dramatically as the dose that SOX7 was highly expressed in cultures that did not of activinA was increased in the differentiation culture media receive activin A, SOX7 also exhibited increased expression (FIGS. 30A-C). The CXCR4" cells were those falling within even in the presence of activin A when FBS was included at the R4 gate and this gate was set using a secondary antibody 10%. This pattern is the inverse of the CXCR4 expression only control for which 0.2% of events were located in the R4 pattern and Suggests that CXCR4 is not highly expressed in gate. The dramatically increased numbers of CXCR4" cells visceral endoderm. correlates with a robust increase in definitive endoderm gene 0300. The relative number of SOX17 immunoreactive expression as activin A dose is increased (FIGS. 31 A-D). (SOX17') cells present in each of the differentiation condi tions mentioned above was also determined. When hESCs EXAMPLE 9 were differentiated in the presence of high dose activin A and low FBS concentration (0.5%-2.0%) SOX17 cells were Isolation of CXCR4 Positive Cells Enriches for ubiquitously distributed throughout the culture. When high Definitive Endoderm Gene Expression and Depletes dose activin A was used but FBS was included at 10% (v/v), Cells Expressing Markers of Mesoderm, Ectoderm the SOX17 cells appeared at much lower frequency and and Visceral Endoderm always appeared in isolated clusters rather than evenly dis tributed throughout the culture (FIGS. 29A and C as well as B 0304. The CXCR4" and CXCR4 cells identified in and E). A further decrease in SOX17 cells was seen when no Example 8 above were collected and analyzed for relative US 2014/0134726 A1 May 15, 2014

gene expression and the gene expression of the parent popu TABLE 2 lations was determined simultaneously. Composition of Definitive Endodern Cultures 0305. The relative levels of CXCR4 gene expression was dramatically increased with increasing dose of activin A Percent Percent Percent Percent of Definitive Extraembryonic ES (FIG. 32). This correlated very well with the activin A dose Marker(s) culture Endoderm endododerm cells dependent increase of CXCR4" cells (FIGS.30A-C). It is also SOX17 70-80 100 clear that isolation of the CXCR4" cells from each population Thrombomodulin <2 O 75 accounted for nearly all of the CXCR4 gene expression in that AFP <1 O 25 population. This demonstrates the efficiency of the FACS CXCR4 70-80 100 O ECAD 10 O 1OO method for collecting these cells. other (ECAD neg.) 10-22 0306 Gene expression analysis revealed that the CXCR4" cells contain not only the majority of the CXCR4 gene Total 100 100 100 1OO expression, but they also contained other gene expression for markers of definitive endoderm. As shown in FIGS. 31A-D, 0310. In particular, Table 2 indicates that CXCR4 and the CXCR4" cells were further enriched over the parent A100 SOX17 positive cells (endoderm) comprised from 70%–80% population for SOX17, GSC, HNF3B, and MIXL1. In addi of the cells in the cell culture. Of these SOX17-expressing tion, the CXCR4 fraction contained very little gene expres cells, less than 2% expressedTM (parietal endoderm) and less sion for these definitive endoderm markers. Moreover, the than 1% expressed AFP (visceral endoderm). After subtract CXCR4 and CXCR4 populations displayed the inverse pat ing the proportion of TM-positive and AFP-positive cells tern of gene expression for markers of mesoderm, ectoderm (combined parietal and visceral endoderm; 3% total) from the proportion of SOX17/CXCR4 positive cells, it can be seen and extra-embryonic endoderm. FIGS. 33 A-D shows that the that about 67% to about 77% of the cell culture was definitive CXCR4 cells were depleted for gene expression of endoderm. Approximately 10% of the cells were positive for Brachyury, MOX1, ZIC1, and SOX7 relative to the A100 E-Cadherin (ECAD), which is a marker for hESCs, and about parent population. This A100 parent population was already 10-20% of the cells were of other cell types. low in expression of these markers relative to the low dose or 0311. We have discovered that the purity of definitive no activin A conditions. These results show that the isolation endoderm in the differentiating cell cultures that are obtained of CXCR4" cells from hESCs differentiated in the presence prior to FACS separation can be improved as compared to the of high activin A yields a population that is highly enriched above-described low serum procedure by maintaining the for and substantially pure definitive endoderm. FBS concentration at s().5% throughout the 5-6 day differen tiation procedure. However, maintaining the cell culture at EXAMPLE 10 s0.5% throughout the 5-6 day differentiation procedure also results in a reduced number of total definitive endoderm cells that are produced. Quantitation of Definitive Endoderm Cells in a Cell 0312 Definitive endoderm cells produced by methods Population Using CXCR4 described herein have been maintained and expanded in cul ture in the presence of activin for greater than 50 days without 0307 To confirm the quantitation of the proportion of appreciable differentiation. In such cases, SOX17, CXCR4, definitive endoderm cells present in a cell culture or cell MIXL1, GATA4. HNF3? expression is maintained over the population as determined previously herein and as deter culture period. Additionally, TM, SPARC, OCT4, AFP. mined in U.S. Provisional Patent Application No. 60/532, SOX7, ZIC1 and BRACH were not detected in these cultures. 004, entitled DEFINITIVE ENDODERM, filed Dec. 23, It is likely that such cells can be maintained and expanded in 2003, the disclosure of which is incorporated herein by ref culture for substantially longer than 50 days without appre erence in its entirety, cells expressing CXCR4 and other ciable differentiation. markers of definitive endoderm were analyzed by FACS. 0308 Using the methods such as those described in the EXAMPLE 11 above Examples, hESCs were differentiated to produce definitive endoderm. In particular, to increase yield and purity Additional Markers of Definitive Endoderm Cells expressed in differentiating cell cultures, the serum concen 0313. In the following experiment, RNA was isolated tration of the medium was controlled as follows: 0.2% FBS on from purified definitive endoderm and human embryonic day 1, 1.0% FBS on day 2 and 2.0% FBS on days 3-6. stem cell populations. Gene expression was then analyzed by Differentiated cultures were sorted by FACS using three cell gene chip analysis of the RNA from each purified population. surface epitopes, E-Cadherin, CXCR4, and Thrombomodu Q-PCR was also performed to further investigate the potential lin. Sorted cell populations were then analyzed by Q-PCR to of genes expressed in definitive endoderm, but not in embry determine relative expression levels of markers for definitive onic stem cells, as a marker for definitive endoderm. and extraembryonic-endoderm as well as other cell types. 0314 Human embryonic stem cells (hESCs) were main CXCR4 sorted cells taken from optimally differentiated cul tained in DMEM/F 12 media supplemented with 20% Knock tures resulted in the isolation of definitive endoderm cells that Out Serum Replacement, 4 ng/mL recombinant human basic were >98% pure. fibroblast growth factor (bFGF), 0.1 mM 2-mercaptoethanol, 0309 Table 2 shows the results of a marker analysis for a L-glutamine, non-essential amino acids and penicillin/strep definitive endoderm culture that was differentiated from tomycin. hESCs were differentiated to definitive endoderm hESCs using the methods described herein. by culturing for 5 days in RPMI media supplemented with US 2014/0134726 A1 May 15, 2014 34

100 ng/mL of recombinant human activin A, fetal bovine 0316 Purified RNA was submitted in duplicate to Expres serum (FBS), and penicillin/streptomycin. The concentration sion Analysis (Durham, N.C.) for generation of the expres of FBS was varied each day as follows: 0.1% (first day), 0.2% sion profile data using the Affymetrix platform and U133 Plus 2.0 high-density oligonucleotide arrays. Data presented is a (second day), 2% (days 3-5). group comparison that identifies genes differentially 0315 Cells were isolated by fluorescence activated cell expressed between the two populations, hESCs and definitive sorting (FACS) in order to obtain purified populations of endoderm. hESCs and definitive endoderm for gene expression analysis. 0317 Genes that exhibited an upward change in expres Immuno-purification was achieved for hESCs using SSEA4 sion level over that found in hESCs are described in Table 3. antigen (R&D Systems, catfiFAB 1435P) and for definitive The “Gene Symbol column refers to the unique name given endoderm using CXCR4 (R&D Systems, catfiFAB170P). to the gene by the HUGO Committee. Cells were dissociated using trypsin/EDTA (Invitrogen, The “Raw Fold Change' column refers to the fold difference cati25300-054), washed in phosphate buffered saline (PBS) in signal for each gene in definitive endoderm compared to containing 2% human serum and resuspended in 100% hESCs. The “Unigene.” “LocusLink” and “OMIM columns human serum on ice for 10 minutes to block non-specific refer to identifiers that are descriptive and functional annota binding. Staining was carried out for 30 minutes on ice by tions derived from current National Center for Biotechnology adding 200 uL of phycoerythrin-conjugated antibody to Information (NCBI) releases of the UniGene, LocusLink, 5x10° cells in 800 uL human serum. Cells were washed twice OMIM, and HomoloGene publicly available databases. The with 8 mL of PBS buffer and resuspended in 1 mL of the column entitled “SeqerivedFrom' provides the Genbank same. FACS isolation was carried out by the core facility of Accession Number for a primary database sequence. Such as The Scripps Research Institute using a FACS Vantage (BD a fragment of a , from which the listed gene was Biosciences). Cells were collected directly into RLT lysis derived. The column entitled “Gene Descriptor provides a buffer and RNA was isolated by RNeasy according to the description of the function of the polypeptide that is encoded manufacturers instructions (Qiagen). by the gene named in the first column. TABLE 3 Up-regulated markers in definitive endodern

Raw Fold Locus- SeqDerived Gene Symbol Change Unigene Link OMIM From Gene Descriptor ABCB4 S.22 HS.73812 S244 171060 BCO2O618 ATP-binding cassette, Sub-family B (MDR/TAP), member 4 ABCC4 7.18 HS.30791S 10257 6OS250 AI948SO3 ATP-binding cassette, Sub-family C (CFTR/MRP), member 4 ABI2 5.95 HS.387906 101S2 6.06442 AFO70S66 abl interactor 2 ACADL 7.09 HS.4301.08 33 201460 NM OO1608 acyl-Coenzyme A dehydrogenase, long chain ACE2 26.83 HS.178098 59272 300335 NM O21804 angiotensin I converting enzyme (peptidyl-dipeptidase A) 2 ACOX3 13.04 HS.12773 831O 603402 BFOSS171 acyl-Coenzyme A oxidase 3, pristanoyl ACPP 7.06 HS.388677 SS 171790 AI659898 acid phosphatase, prostate ACSL1 14.73 HS.406,678 2180 152425 NM 021122 acyl-CoA synthetase long-chain family member 1 ACSL3 6.01 HS.268O12 2181 6O2371 BFS12846 acyl-CoA synthetase long-chain family member 3 ACTA1 8.03 HS.1288 58 102610 NM 001100 actin, alpha 1, skeletal muscle ADAM19 6.68 HS.2893.68 8728 603640 Y13786 a disintegrin and metalloproteinase domain 19 (meltrin beta) ADAMTS18 17.6S HS.188746 170692 6O7512 AIf3312O a disintegrin-like and metalloprotease (reprolysin type) with thrombospondin type 1 motif, 18 ADAMTS9 16.19 HS.318751 S6999 6OS421 AF4888O3 a disintegrin-like and metalloprotease (reprolysin type) with thrombospondin type 1 motif, 9 ADRA1B 5.21 HS.416813 147 104220 NM OOO679 adrenergic, alpha-1B-, receptor AGPAT3 53.81 HS.443657 56894 AI3373OO 1-acylglycerol-3-phosphate O-acyltransferase 3 AHI1 6.21 HS.273294 S4806 AV658469 Abelson helper integration site AIP1 6.21 HS.22599 98.63 606382 NM 012301 atrophin-1 interacting protein 1 ALAD 5.43 HS.1227 210 125270 BCOOO977 aminolevulinate, delta-, dehydratase AMHR2 946 HS.437877 269 600956 NM 020547 anti-Mullerian hormone receptor, type II ANGPT2 3.75 HS.115181 285 601922 NM OO1147 angiopoietin 2 ANGPTL1 6.19 HS.304398 9068 603874 BFOO2O46 angiopoietin-like 1 ANK2 O.11 HS.409783 287 106410 AF131823 ankyrin 2, neuronal ANKH 26.52 HS.156727 56172 605145 NM O19847 ankylosis, progressive homolog (mouse) ANKRD6 4.01 HS.30991 22881 NM O14942 ankyrin repeat domain 6 ANXA3 S.O6 HS.442733 306 106490 M63310 annexin A3 APC 6.40 HS.75081 324 175100 NM 000038 adenomatosis polyposis coli APOA1 S.O.S HS.93.194 33S 107680 XO2162 apolipoprotein A-I APOA2 33.51 HS.237658 336 107670 NM OO1643 apolipoprotein A-II APOBEC3G S.76 HS.286849 60489 607 113 NM O21822 apolipoprotein B mRNA editing enzyme, catalytic polypeptide-like 3G APOC1 S.13 HS.268,571 341 107710 NM OO1645 apolipoprotein C-I APOL3 5.88 HS.241535 80833 607253 NM O14349 apolipoprotein L, 3 ARF4L S.64 HS.183153 379 600732 NM OO1661 ADP-ribosylation factor 4-like ARG99 5.44 HS.528689 83857 AU151239 ARG99 protein ARHGAP18. 8.34 HS.413282 93663 AU158022 Rho GTPase activating protein 18 ARHGAP24 O.96 HS.4428O1 83478 AI743534 Rho GTPase activating protein 24 ARHGEF4 5.19 HS.6066 50649 605216 NM. O15320 Rho guanine nucleotide exchange factor (GEF) 4 ARHGEF7 S.10 HS.172813 8874 605477 AL831814 Rho guanine nucleotide exchange factor (GEF) 7 US 2014/0134726 A1 May 15, 2014 35

TABLE 3-continued Up-regulated markers in definitive endodern

Raw Fold Locus SeqLDerived Gene Symbol Change Unigene Link OMIM From Gene Descriptor ARKS 8.63 S.200598 98.91 608130 NM 014840 AMP-activated protein kinase family member 5 ARMCX1 5.24 S. 9728 S1309 3.00362 NM 016608 armadillo repeat containing, X-linked 1 ARRB1 13.03 S.112278 408 107940 NM OO4041 arrestin, beta 1 ARSE 7.26 S.386975 415 300180 NM 000047 arylsulfatase E (chondrodysplasia punctata 1) ASAM 7.31 S.135121 79827 BG112263 adipocyte-specific adhesion molecule ASB4 10.03 S.413226 S1666 605761 BE22O587 ankyrin repeat and SOCS box-containing 4 ASBS 47.32 S.352364 140458 BF589787 ankyrin repeat and SOCS box-containing 5 AUTS2 13.27 S.296,720 26053 6O727O A417756 autism susceptibility candidate 2 BCAR3 547 S.2O1993 8412 604704 NM 003567 breast cancer anti-estrogen resistance 3 BCAT2 7.68 S.S12670 587 113530 AKO23909 branched chain aminotransferase 2, mitochondrial BCL2L14 8.79 S.11962 79370 606126 AISS4912 BCL2-like 14 (apoptosis facilitator) BCL7C 7.8O S.3031.97 92.74 605847 NM 004765 B-cell CLL/lymphoma 7C BHLHBS 7.16 S.388788 27319 AL134708 basic helix-loop-helix domain containing, class B, 5 BIRC3 741 S.127799 330 6O1721 U37S46 baculoviral IAP repeat-containing 3 BMP2 7.76 S.73853 6SO 112261 NM 001200 bone morphogenetic protein 2 BMPER 5.53 S.71730 1686.67 608699 A4232O1 BMP-binding endothelial regulator precursor protein BMPR2 6.OS S.S32SO 659 600799 A457436 bone morphogenetic protein receptor, type II (serine/threonine kinase) 10.60 S.435976 56853 AI6977O1 bruno-like 4, RNA binding protein (Drosophila) 6.56 S.1085O2 54836 NM O17688 B-box and SPRY domain containing 5.88 S.75462 7832 6O1597 NM OO6763 BTG family, member 2 S.13 S.1594.94 695 300300 NM 000061 Bruton agammaglobulinemia tyrosine kinase 9.09 S.106254 282973 AL137SS1 chromosome 10 open reading frame 39 12.58 S.311 003 93426 BCO34821 hromosome 10 open reading frame 94 S.13 S.2721OO 2912S BMS47346 hromosome 11 open reading frame 21 7.13 4O7975 AF339828 hromosome 13 open reading frame 25 13.95 S.41502 79686 NM 024633 hromosome 14 open reading frame 139 5.05 S.285091 753 606571 AWOO8SOS hromosome 18 open reading frame 1 1769 S.146O25 79839 NM O24781 hromosome 18 open reading frame 14 36.01 140828 AL121722 hromosome 20 open reading frame 56 10.03 S.386685 90625 BCOOS107 hromosome 21 open reading frame 105 77.87 S.3506.79 15O135 NM 152506 hromosome 21 open reading frame 129 6.17 S.155361 755 603.191 U84569 hromosome 21 open reading frame 2 5.57 S473997 S4084 BCO21197 hromosome 21 open reading frame 29 6.67 S.222909 S4083 AL117578 hromosome 21 open reading frame 30 15.16 S.1281 727 120900 NM OO1735 omplement component 5 6.29 S.SO8741 93.15 607332 AIf33949 hromosome 5 open reading frame 13 9.47 63914 NM 022084 hromosome 6 open reading frame 164 11.05 S.375746 285753 BES67344 hromosome 6 open reading frame 182 6.58 S.437SO8 10758 607.043 AI916498 hromosome 6 open reading frame 4 9.39 347744 AW103116 hromosome 6 open reading frame 52 6.13 S.225962 26236 ABO16900 hromosome 6 open reading frame 54 21.05 S.20537 79632 A420S63 hromosome 6 open reading frame 60 547 S.287738 8O129 NM O25059 hromosome 6 open reading frame 97 8.66 S.31.8791 83648 BE856336 hromosome 8 open reading frame 13 7.04 S.291342 2O3O81 NM 145656 hromosome 8 open reading frame 6 22.03 S.119947 158326 AI824037 hromosome 9 open reading frame 154 10.19 S.49605 158219 AWOO 1030 hromosome 9 open reading frame 52 12.97 S.2O8914 84267 AW983691 hromosome 9 open reading frame 64 8.78 S.190877 157983 NM 152569 hromosome 9 open reading frame 66 7:44 S. 96641 169693 AW271796 hromosome 9 open reading frame 71 11.26 S.65425 793 114050 NM 004929 calbindin 1, 28 kDa 38.31 S.640 799 114131 NM OO1742 calcitonin receptor 7.15 S.111460 817 6O7708 BF797381 calcium calmodulin-dependent protein kinase (CaM kinase) II delta 5.83 S.296341 10486 N90755 CAP, adenylate cyclase-associated protein, 2 (yeast) 9.64 S.288,196 8573 3OO172 AI65922S calcium calmodulin-dependent serine protein kinase (MAGUK family) CASP1 9.90 S.2490 834 147678 AIf 1.96SS caspase 1, apoptosis-related cysteine protease (interleukin 1, beta, convertase) CBLB S.96 S.436986 868 604491 NM 004351 Cas-Br-M (murine) ecotropic retroviral transforming sequence b CCKBR 27.45 S.2O3 887 118445 BCOOO740 cholecystokinin B receptor CCL2 111.46 S.303649 6347 15810S S69738 chemokine (C-C motif) ligand 2 CCR4 5.70 S506129 1233 604836 NM OO5508 chemokine (C-C motif) receptor 4 CD163 9.73 S.74O76 9332 60SS45 Z22969 CD163 antigen CD44 5.25 S.306278 960 107269 AW851.559 CD44 antigen (homing function and Indian blood group system) CD8O 8.85 S.838 941 1122O3 AYO81815 CD80 antigen (CD28 antigen ligand 1, B7-1 antigen) CD99 5.86 S.283477 4267 313470, U82164 CD99 antigen CDGAP 10.17 S.3OO670 57514 ABO33O3O KIAA1204 protein CDH12 9.44 S.333997 1010 600S62 L33477 cadherin 12, type 2 (N-cadherin 2) CDH2 8.85 S.334131 1OOO 114020 NM OO1792 cadherin 2, type 1, N-cadherin (neuronal) CDKSR2 S.13 S.158460 8941 603764 RS1311 cyclin-dependent kinase 5, regulatory Subunit 2 (p39) US 2014/0134726 A1 May 15, 2014 36

TABLE 3-continued Up-regulated markers in definitive endodern

Raw Fold Locus- SeqLDerived Gene Symbol Change Unigene Link OMIM From Gene Descriptor CDKN2B 8.45 HS.72901 1030 600431 AW444761 cyclin-dependent kinase inhibitor 2B (p15, inhibits CDK4) CDW92 5.10 HS.414728 23446 606105 AW444881 CDW92 antigen CER1 33.04 HS.248.204 9350 603777 NM 005454 cerberus 1 homolog, cysteine knot superfamily (Xenopus laevis) CFLAR 21.93 HS.3SS724 8837 603599 AFOO5775 CASP8 and FADD-like apoptosis regulator CFTR S.SO HS.41 1882 1080 602421. S64699 cystic fibrosis transmembrane conductance regulator, ATP-binding cassette (sub-family C, member 7) CGI-38 1967 HS.412685 S1673 NM 016140 brain specific protein CLDN1 7.32 HS.7327 9076 603718 AF101051 claudin 1 CLDN11 7.86 HS.31595 5010 601,326 AW264204 claudin 11 (oligodendrocyte transmembrane protein) CMKOR1 53.39 HS.231853 S7007 AI817041 chemokine orphan receptor 1 CNTN3 7.56 H.S.S12593 5067 601325 BE221817 contactin 3 (plasmacytoma associated) CNTN4 S.13 HS-12111S 152330 607280 R42166 contactin 4 COL22A1 5.12 HS-116394 169044 BE349115 collagen, type XXII, alpha 1 COL4A3 9.58 HS.407817 28S 120070 A694562 collagen, type IV, alpha 3 (Goodpasture antigen) COLSA2 6.62 HS.283393 290 120190 NM 000393 collagen, type V, alpha 2 COL8A1 11.41 HS.114599 295 120251 AL359062 collagen, type VIII, alpha 1 COL9A2 7.94 HS.418O12 298 120260 AIf3346S collagen, type IX, alpha 2 COLEC12 24.49 HS.29423 81035 607621 NM 030781 collectin sub-family member 12 CPE 13.90 HS.7S360 363 114855 NM 001873 carboxypeptidase E CPXCR1 S.O3 HS.458292 S3336 AKO98.646 CPX chromosome region, candidate 1 CRIP1 50.56 396 123875 NM 001311 cysteine-rich protein 1 (intestinal) CST1 6.20 HS.123114 469 123855 NM OO1898 cystatin SN CTSH 5.20 HS.114931 512 116820 NM 004390 cathepsin H CXCR4 SS.31 HS.421986 7852 162643 A.J224869 chemokine (C-X-C motif) receptor 4 CXorf1 S4.35 HS.106688 9142 NM 004709 chromosome X open reading frame 1 CYLC1 11.56 HS.444230 S38 603121 Z22780 cylicin, basic protein of sperm head cytoskeleton 1 CYP27A1 1984 HS.82568 593 606530 NM 000784 cytochrome P450, family 27, subfamily A, polypeptide 1 DACT2 11.72 HS.248294 168OO2 AF318336 dapper homolog2, antagonist of beta-catenin (xenopus) DDC 8.62 HS.408106 644 107930 NM 000790 dopa decarboxylase (aromatic L-amino acid decarboxylase) DDIT4L 5.33 Hs.107515 115265 607730 AA528140 DNA-damage-inducible transcript 4-like DIO3 33.07 HS.49322 735 601038 NM 001362 deiodinase, iodothyronine, type III DIO3OS 59.10 HS.406958 64150 608523 AF305836 deiodinase, iodothyronine, type III opposite strand DIRAS2 8.07 Hs. 165636 54769 607863 NM 017594 DIRAS family, GTP-binding RAS-like 2 J222E13.1 28.34 HS.301947 2531.90 AL5901.18 kraken-like DKFZp434C184 5.73 HSS31492 399474 N63821 cDNA DKFZp434C184 gene DKFZp564I1922 6.26 HS.72157 25878 AF2455.05 adlican DKFZp566FO947 7.27 HS.22O597 94023 AL137518 hypothetical gene DKFZp566FO947 DKFZPS86M1120 S.29 HS.159068 83450 BC040276 hypothetical protein DKFZp586M1120 DLC1 13.62 HS.87OO 10395 604258 AA524250 deleted in liver cancer 1 DLEC1 9.06 HS.277589 9940 604050 NM 007337 deleted in lung and esophageal cancer 1 DLEU2 8.62 HS.4464O6 8847 605766 AF264787 deleted in lymphocytic leukemia, 2 DLGS 5.88 HSSOO245 9231 604090 AI809998 discs, large homolog 5 (Drosophila) DNAH14 7.14 HS.381,271 1772 603341 U61741 dynein, axonemal, heavy polypeptide 14 DNAJC6 9.80 HS-129587 9829 608375 AV729634. DnaJ (Hsp40) homolog, subfamily C, member 6 DNAD1 6.48 HS.43883O 291O3 NM 013238 DnaJ (Hsp40) homolog, Subfamily D, member 1 DNMT3L 14.53 Hs. 157237 29947 606588 NM 013369 DNA (cytosine-5-)-methyltransferase 3-like DOCKS 5.24 HS.383OO2 8OOOS AL832744 dedicator of cytokinesis 5 DOCK8 16.36 HS.S28687 81704 AL161725 dedicator of cytokinesis 8 DPYD 33.20 HS.1602 806 274270 NM 000110 dihydropyrimidine dehydrogenase DSCAM 8.25 HS.49002 826 6O2S23 BESO306S Down syndrome cell adhesion molecule DSCAML.1 S.S6 HS.397966 S7453 AKO25940 Down syndrome cell adhesion molecule like 1 DSCR6 21.62 HS.254560 S3820 NM 018962 Down syndrome critical region gene 6 DUSP4 22.18 HS.417962 846 602747 NM 001394 dual specificity phosphatase 4 EB-1 162.04 HS.372732 56899 607815 AWOO5572 E2a-Pbx1-associated protein EBAF 14.86 HS.25195 7044 601877 NM 003240 endometrial bleeding associated factor (left-right determination, factor A; transforming growth factor beta Superfamily) EBI2 S.68 HS784 880 605741 NM 004951 Epstein-Barr virus induced gene 2 (lymphocyte-Specific G protein coupled receptor) ED1 8.13 HS.1OS4O7 896 300451 NM 001399 ectodermal dysplasia 1, anhidrotic EDG3 20.71 HS.3S3892 903 601965 NM 005226 endothelial differentiation, sphingolipid G-protein-coupled receptor, 3 EDG3 12.22 HS.4257 903 601965 AA534817 endothelial differentiation, sphingolipid G-protein-coupled receptor, 3 EDNRA S.S8 HS.2112O2 909 131243 AU1 18882 endothelin receptor type A EGF 13.71 HS.4.19815 950 131530 NM 001963 epidermal growth factor (beta-urogastrone) EHEHADH 85.52 HS.432443 962 607037 NM 001966 enoyl-Coenzyme A, hydratase/3-hydroxyacyl Coenzyme A dehydrogenase 5.20 Hs. 164144 56648 605782 AV747725 eukaryotic translation initiation factor 5A2 7.28 HS.444.695 98.44 606420 NM 014800 engulfment and cell motility 1 (ced-12 homolog, C. elegans) 70.71 HS.246.107 54898 BFSO8639 elongation of very long chain fatty acids (FEN1/Elo2, SUR4/Elo3, yeast)-like 2 EMR2 5.66 Hs.137354 30817 606100 NM 013447 egf-like module containing, mucin-like, hormone receptor-like 2 ENPEP 10.17 HS.4357.65 2028 138297 L12468 glutamylaminopeptidase (aminopeptidase A) US 2014/0134726 A1 May 15, 2014 37

TABLE 3-continued Up-regulated markers in definitive endodern

Raw Fold Locus SeqLDerived Gene Symbol Change Unigene Link OMIM From Gene Descriptor EOMES 16.02 S.147279 832O 604615 NM 005442 eomesodermin homolog (Xenopus laevis) EPHB3 8.99 S.2913 2O49 6O1839 NM 004443 EPOR 6.92 S.127826 2057 133171 BU727288 erythropoietin receptor EPSTI1 34.25 S.3438OO 94240 607441 AA6332O3 epithelial stromal interaction 1 (breast) ERBB4 6.35 S.1939 2O66 600543 M OO5235 v-erb-a erythroblastic leukemia viral oncogene homolog 4 (avian) ERO1LB 5.94 S.424926 566OS M 019891 ERO1-like beta (S. cerevisiae) ETS2 6.96 S.292.477 2114 164740 L575.509 v-ets erythroblastosis virus E26 Oncogene homolog 2 (avian) EVC 26.80 S.28051 2121 604831 M O14556 Ellis van Creveld syndrome FAM13C1 5.24 S.311205 22096S CO36453 amily with sequence similarity 13, member C1 FAM38B S.63 S.293907 63895 W2698.18 amily with sequence similarity 38, member B FANK1 9.17 S.352591 92.565 U143929 fibronectin type 3 and ankyrin repeat domains 1 FBS1 7.33 S.247186 64319 6086O1 W 1 3 45 2 3 fibrosin 1 FBXL14 7.12 S.367956 144699 M 152441 F-box and leucine-rich repeat protein 14 FBXL8 10.33 S.75486 55336 M O18378 F-box and leucine-rich repeat protein 8 FBXO32 5.70 S.403933 114907 606604 F2444O2 F-box only protein 32 FBXO4 6.29 S.437345 26272 M O18007 F-box only protein 4 FCGR2A 6.65 S.352642 2212 146790 F416711 Fc fragment of IgG, low affinity IIa, receptor for (CD32) FER1L3 15.13 S.362731 26509 604603 F2O7990 er-1-like 3, myoferlin (C. elegans) FES 9.36 S.7636 2242 190030 M OO2005 eline sarcoma oncogene FGD6 S.11 S.17O623 55785 KO26881 FYVE, RhoGEF and PH domain containing 6 FGF17 107.07 S.248.192 8822 603725 M OO3867 fibroblast growth factor 17 8 14.46 S.S7710 2253 6OO483 M OO6119 fibroblast growth factor 8 (androgen-induced) S.OO S.497972 11328 LOSO187 FK506 binding protein 9, 63 kDa 30.88 S.183114 79822 NM 030672 hypothetical protein FLJ10312 SO.86 S.173233 552.73 NM 018286 hypothetical protein FLJ10970 10.14 S.98324 S4520 AASO4256 hypothetical protein FLJ10996 8.96 S.31792 55296 NM 018317 hypothetical protein FLJ11082 9.25 S.176227 SS314 AW173.720 hypothetical protein FLJ11155 5.55 S.33368 55784 AIO52257 hypothetical protein FLJ11175 5.76 S.254642 79719 AL136715 hypothetical protein FLJ11506 16.70 S.88144 64799 NM O22784 hypothetical protein FLJ12476 8.43 S.156148 65250 NM 023073 hypothetical protein FLJ13231 6.67 S.193048 8O161 NM O25091 hypothetical protein FLJ13330 6.74 S.4947S7 387751 AL833700 very large inducible Pase 1 S.66 S.152717 795.67 NM O24519 hypothetical protein FLJ13725 S.09 S.144655 84.915 BESS1088 hypothetical protein 14721 13.09 S.129563 54785 NM O17622 hypothetical protein 20014 5.65 S.7099 S4872 AV6SO183 hypothetical protein 20265 547 S.438867 55652 AW149696 hypothetical protein 20489 10.30 S.341806 79745 NM O24692 hypothetical protein 2 1 O 6 9 83.56 S.207407 64388 BFO64262 rO t e l rel 8. e d tO DA 5.58 S.192570 79912 AISS4909 34.33 S.387266 80212 NM O25140 imkain beta 2 34:12 S.113009 79781 W52934 hypothetical protein FLJ22527 7.90 S.297792 79971 ALS34095 putative NFkB activating protein 373 46.65 S.144913 60494 NM O21827 hypothetical protein FLJ23514 S.34 S.164705 79864 NM 024806 hypothetical protein FLJ23554 6.51 S.62604 90853 AKO96.192 hypothetical protein FLJ25348 5.97 158696 NM 153016 hypothetical protein FLJ30672 21.12 S84522 148534 AIOO4375 hypothetical protein FLJ31842 5.81 S.484250 2834.17 AA7587S1 hypothetical protein FLJ32949 S.82 2843SO AA180985 hypothetical protein FLJ34222 5.56 S.369235 160492 NM 152590 hypothetical protein FLJ36004 1544 S.1534.85 2841 61 R4618O hypothetical protein FLJ37451 16.72 S.991.04 285O25 NM 173648 hypothetical protein FLJ395.02 20.67 S.512228 349075 KO97282 hypothetical protein FLJ39963 5.67 S.98288 2O6338 246,369 aeverin 5.71 S.256.632 146S47 M 1735O2 hypothetical protein FLJ90661 25.08 S.24889 56776 606373 CO14364 ormin 2 6.81 S.396595 2330 603957 M 001461 flavin containing monooxygenase 5 7.60 S.SOO3 23380 606524 2638.19 ormin binding protein 2 8.62 S.163484 3169. 602294 M 004496 orkhead box A1 252.32 S.155651 317O 600288 BO28021 orkhead box A2 9.62 S.348883 2296 6O1090 M OO1453 orkhead box C1 25.76 S.87236 2299 6O1093 M 012188 orkhead box I1 43.02 S.2974S2 94234 676059 orkhead box Q1 5.53 S.98751 8939 6O3S36 W O 8 5 4 8 9 air upstream element (FUSE) binding protein 3 10.18 S.152251 7855 601723 M 003468 rizzled homolog 5 (Drosophila) 37.68 S.302634 832S 6061.46 L121749 rizzled homolog 8 (Drosophila) GAL3ST1 7.31 S.17958 9514 6O2300 M 004861 galactose-3-O-sulfotransferase 1 GATA3 1322 S.16994.6 262S 131320 796.169 GATA binding protein 3 US 2014/0134726 A1 May 15, 2014 38

TABLE 3-continued Up-regulated markers in definitive endodern

Raw Fold Locus- SeqLDerived Gene Symbol Change Unigene Link OMIM From Gene Descriptor GATA4 44.12 HS243987 2626 600576 AV700724 GATA binding protein 4 GATA6 24.26 HSSO924 2627 6O1656 D87811 GATA binding protein 6 GATM S.63 HS.7S335 2628 602360 NM 001482 glycine amidinotransferase (L-arginine:glycine amidinotransferase) GBA S.30 HS28.2997 2629 606463 D13.287 glucosidase, beta; acid (includes glucosylceramidase) GDF15 S.O6 HS.296.638 9518 605312 AA129612 growth differentiation factor 15 GA4 13.38 HS.296310 2701 121012 NM 002060 gap junction protein, alpha 4, 37 kDa (connexin 37) GJB3 7.07 HS.415770 2707 603324 BFO60667 gap junction protein, beta 3, 31 kDa (connexin 31) GMPR S.80 HS.1435 2766 139265 NM 006877 guanosine monophosphate reductase GNG2 9.92 Hs. 112928 54331 606981 AKO26424 guanine nucleotide binding protein (G protein), gamma 2 GPC6 6.51 HSSO8411 10O82 6O4404 AI6512SS glypican 6 GPM6A 15.76 HS.75819 2823 601275 D49958 glycoprotein M6A GPR115 6.03 HS.15O131 221393 W67511 G protein-coupled receptor 115 GPR126 8.44 HS.41917O S7211 ALO33377 G protein-coupled receptor 126 GPR2 786 HS.278446 2826 6.00240 NM 016602 G protein-coupled receptor 2 GPR37 33.44 HS.406094 2861 6O2S83 U87460 G protein-coupled receptor 37 (endothelin receptor type B-like) GPR39 15.38 HS.S12O73 2863 602886 AV717094 G protein-coupled receptor 39 GPR49 21.48 HS.1667OS 8549 606667 ALS24520 G protein-coupled receptor 49 GPR81 6.22 HS.326712 27198 606923 AF345568 G protein-coupled receptor 81 GPSM2 16.16 HS.278338 29899 NM 013296 G-protein signalling modulator 2 (AGS3-like, C. elegans) GREB1 5.95 HS.438037 96.87 NM 014668 GREB1 protein GRK7 5.60 HS.351818 131890 606987 NM 139209 G protein-coupled receptor kinase 7 GRP 1911 HS.153444 2922 137260 NM 002091 gastrin-releasing peptide GSC 524.27 HS.440438 145258 138890 AY1774.07 goosecoid GW112 11.OO HS.273.321 10562 AL390736 differentially expressed in hematopoietic lineages H2AFY 29.04 HS.75258 9555 AFO44286 H2A histone family, member Y HAS2 7.56 HS.159226 3O37 6O1636 AI374.739 hyaluronan synthase 2 HCN1 14.55 Hs.353176 348980 602780 BM682352 hyperpolarization activated cyclic nucleotide-gated potassium channel 1 Hes4 6.20 HS.154029. 57801 608060 NM 021170 bLH factor Hes4 HGF S.O2 HS.396530 3O82 142409 U46O1O hepatocyte growth factor (hepapoietin A; Scatter factor) HHEX 6.58 HS.118651 3O87 60442O Z21533 hematopoietically expressed homeobox HIPK2 2O2S HS.39746S 28996 606868 AF2O7702 homeodomain interacting protein kinase 2 HLXB9 18.27 HS.37035 3110 142994 AIf38.662 homeo box HB9 HNF4A 38.25 HS.54424 3172 600281 AIO32108 hepatocyte nuclear factor 4, alpha HOXD13 1568 HS.152414 3239 142989 NM 000523 homeo box D13 HP 14.05 HS.403931 3240 1401.00 NM 005143 haptoglobin HSD17B1 6.50 HS.448861 3292 109684 NM 000413 hydroxysteroid (17-beta) dehydrogenase 1 HTATIP2 S.S4 HS.907S3 10553 60S628 BG1534O1 HIV-1 Tat interactive protein 2, 30 kDa COSL 5.24 HS.1415S 23308 605717 AL355690 inducible T-cell co-stimulator ligand FITS 6.36 HS.2S2839 241.38 NM 012420 interferon-induced protein with tetratricopeptide repeats 5 GFBP6 20.26 HS.274313 3489 146735 NM 002178 insulin-like growth factor binding protein 6 HH S.76 HS.115274 3549 6OO726 AA628967 Indian hedgehog homolog (Drosophila) L18R1 2SO1 HS.1593O1 8809 604494 NM 003855 interleukin 18 receptor 1 L1R1 5.36 HS.82112 3554. 147810 NM 000877 interleukin 1 receptor, type I MAP1 47.52 Hs. 159955 170575 608084 BCO29442 immunity associated protein 1 TGA1 6.58 HS.43932O 3672 192968 BG619261 integrin, alpha 1 TGAS 566 HS.149609 3678 135620 NM 002205 integrin, alpha 5 (fibronectin receptor, alpha polypeptide) TGA9 1312 HS.222 3680 603963 AI4791.76 integrin, alpha 9 TPKB 19.74 HS.78877 3707 147522 AA348410 inositol 1,4,5-trisphosphate 3-kinase B MD3 12.61 HS.10391S 2313S AI83O331 jumonji domain containing 3 KCNG1 12.30 HS.118695 3755 603788 AI332979 potassium voltage-gated channel, Subfamily G, member 1 KCNH8 13.59 HS.410629 131096 NM 144633 potassium voltage-gated channel, Subfamily H (eag-related), member 8 KCNJ3 22.66 HS.199776 3760 601534 AK026384 potassium inwardly-rectifying channel, subfamily J, member 3 KCNK1 6.73 HS.376874 377S 6O1745 AL833343 potassium channel, Subfamily K, member 1 KCNK12 7.71 Hs.252617 56660 607366 NM 022055 potassium channel, subfamily K, member 12 KCNV2 26.04 HS.441357 169522 6O76O4 AI2O6888 potassium channel, Subfamily V, member 2 KIAAO258 6.12 HS.47313 9827 AI690081 KIAAO258 KIAAO318 S.49 HS.22SO14 23504 BE549770 RIM binding protein 2 KIAAO484 S.10 57240 AA732995 KIAAO)484 protein KIAAO626 S.O2 HS.1781.21 9848 NM 021647 KIAA0626 gene product KIAAO774 7.72 HS.222O1 23281 AI8184.09 KIAAO774 KIAAO792 5.85 HS-119387 9725 AW510783 KIAAO792 gene product KIAAO825 S.O3 Hs.1947SS 23004 AB020632 KIAAO825 protein KIAAO882 23.04 HS.411317 23158 AI348094 KIAAO882 protein KIAAO895 12.12 HS-6224 23366 AB020702 KIAAO895 protein KIAA1161 11.92 HS.181679 S7462 ABO32987 KIAA1161 KIAA12O2 22.67 HS.380697 S7477 AIOO542O KIAA1202 protein KIAA1211 5.43 HS2O5293 S7482 AI991996 KIAA1211 protein KIAA1447 5.19 Hs.512733 57597 NM O24696 KIAA1447 protein KIAA1618 S.21 HS.437033 S7714 AA976354 KIAA1618 US 2014/0134726 A1 May 15, 2014 39

TABLE 3-continued Up-regulated markers in definitive endodern

Raw Fold Locus SeqLDerived Gene Symbol Change Unigene Link OMIM From Gene Descriptor KIAA1679 S.98 S.68533 80731 ABO51466 KIAA1679 protein KIAA1728 7.82 S.437362 85461 ABOS1515 KIAA1728 protein KIAA1729 5.19 S.99073 85460 AI13.8969 KIAA1729 protein KIAA1856 7.82 S383.245 84629 AI936,523 KIAA1856 protein KIFC3 7.78 S.231.31 3801 60453S BCOO1211 kinesin family member C3 KLFS S.42 S84728 688 6O2903 AF132818 Kruppel-like factor 5 (intestinal) KYNU 2.90 S.444471 8942 236800 D55639 kynureninase (L-kynurenine hydrolase) LAMB3 6.26 S.436983 3914 1SO310 L2S541 aminin, beta 3 LAMP3 6.OO S.10887 27074 605883 NM O14398 ysosomal-associated membrane protein 3 LEFTB O.76 S.278239 10637 603O37 NM 020997 eft-right determination, factor B LEPREL1 5.44 S.42824 SS214 NM 018192 eprecan-like 1 LHX 3.00 S.443727 3975 6O1999 NM 005568 LIM homeobox. 1 LIFR O.18 S.4465O1 3977 151443 AA7O1657 eukemia inhibitory factor receptor LMLN 7.70 S.432613 89782 BFOS 6991 eishmanolysin-like (metallopeptidase M8 family) LOC128153 7.17 S.99214 128153 AA90SS08 hypothetical protein BCO14608 LOC131873 O.87 S.263560 131873 76.1416 hypothetical protein LOC131873 LOC132604 6.82 132604 CO44793 GalNAc transferase 10 isoform-like LOC1488.98 S.S4 S. 61884 1488.98 W298597 hypothetical protein BC007899 LOC148987 740 14.8987 CO40313 hypothetical protein LOC148987 LOC149414 1.75 149414 F218941 ormin 2-Like LOC151878 7.05 151878 WOO9761 hypothetical protein LOC151878 LOC152084 S.62 152084 CO41967 hypothetical protein LOC152084 LOC153682 25.68 153682 L137383 hypothetical protein LOC153682 LOC1572.78 21.30 S.363O26 1572.78 KO74886 hypothetical protein LOC157278 LOC162O73 5.32 S.28890 162O73 CO15343 hypothetical protein LOC162073 LOC19992O 12.19 19992O 452457 hypothetical protein LOC199920 LOC2O1175 8.06 S.205326 201175 F258593 hypothetical protein LOC201175 LOC2O3806 5.74 S.256916 2O3806 KO92S6S hypothetical protein LOC203806 LOC2S3012 7.75 S.443169 253012 6001.75 hypothetical protein LOC253012 LOC253970 5.72 253970 E218152 hypothetical protein LOC253970 LOC283537 68.29 S.117167 283537 KO2672O hypothetical protein LOC283537 LOC2836.66 6.OS 2836.66 W OO 6 1 8 5 hypothetical protein LOC283666 LOC28.3887 5.17 S.171285 28.3887 215529 hypothetical protein LOC283887 LOC284OO1 6.04 S.526416 284OO1 FOOf 146 hypothetical protein LOC284001 LOC284367 14.25 S.132045 284367 8O1574 hypothetical protein LOC284367 LOC284542 14.80 S. 61504 284542 FO60736 hypothetical protein LOC284542 LOC284561 6.O1 284561 KO23S48 hypothetical protein LOC284561 LOC284615 8.69 284615 L359622 hypothetical protein LOC284615 LOC284739 5.97 S.978.40 284739 L157500 hypothetical protein LOC284739 LOC284898 S.82 S.350813 284898 CO36876 hypothetical protein LOC284898 LOC284950 6.21 284950 KO95038 hypothetical protein LOC284950 LOC28SO45 5.97 S.434660 28SO45 KO95182 hypothetical protein LOC285045 LOC285878 S.63 285878 420977 hypothetical protein LOC285878 LOC33900S S.O2 S.212670 33900S 824O78 hypothetical protein LOC339005 LOC348O94 12.72 S.207157 34.8094 760630 hypothetical protein LOC348094 LOC349136 7.15 S.174373 349136 96.8904 hypothetical protein LOC349136 LOC360O3O 6.38 S.38.5546 360O3O O 3 6 2 2 6 homeobox C14 LOC375616 25.34 S.443744 375616 955614 kielin-like LOCS 1066 21.78 S.113019 S1066 M O15931 S485 LOCS1326 S.04 S.416744 S1326 F493886 ARF protein LOC90342 6.85 90342 L133022 similar to fer-1 like protein 3 LOC90736 9.70 S.415414 90736 G285399 hypothetical protein BC000919 LOC91526 7.85 S.11571 91526 U157224 hypothetical protein DKFZp434D2328 LRIG3 26.20 S.19669 121227 627704 eucine-rich repeats and immunoglobulin-like domains 3 LRRFIP1 S.61 S.S.12387 92.08 6032S6 COO4958 eucine rich repeat (in FLII) interacting protein 1 LW-1 32.53 S1402 M O16153 LW-1 MADH6 6.35 S.153863 4.091 602931 193899 MAD, mothers against decapentaplegic homolog 6 (Drosophila) MAML3 8.03 S.31O320 55534 569476 mastermind-like 3 (Drosophila) MAN1A1 10.19 S.2551.49 4121 604344 G2871.53 mannosidase, alpha, class 1A, member 1 MANEA 18.25 S.46903 79694 587307 mannosidase, endo-alpha MAP3K13 10.61 S.406946 91.75 60491S M 004721 mitogen-activated protein kinase kinase kinase 13 MAP3K4 6.15 S390428 4216 602425 633.559 mitogen-activated protein kinase kinase kinase 4 MATN3 10.63 S.278461 4148 602109 M 002381 matrilin 3 MCC 9.39 S4O9515 41.63 1593.50 E967311 mutated in colorectal cancers MCLC 13.56 S.93121 23155 AA4O6603 Mid-1-related chloride channel 1 MERTK 6.55 S.3061.78 10461 604705 NM OO6343 c-mer proto-oncogene tyrosine kinase MGC13057 9.53 S3893.11 84281 BCOOSO83 hypothetical protein MGC13057 MGC14276 6.08 S.434283 253150 B E195670 hypothetical protein MGC14276 MGC14289 S.OO S.152618 92092 A. 1884.45 similar to RIKEN cDNA 1200014N16 gene MGC19764 6.65 S.14691 162394 A435399 hypothetical protein MGC19764 MGC26989 6.83 25.4268 AL832216 hypothetical protein MGC26989 US 2014/0134726 A1 May 15, 2014 40

TABLE 3-continued Up-regulated markers in definitive endodern

Raw Fold Locus SeqLDerived Gene Symbol Change Unigene Link OMIM From Gene Descriptor MGC33835 10.65 S.367947 222662 BCO28630 hypothetical protein MGC33835 MGC34032 1641 S.213897 2O4962 AAOO1450 hypothetical protein MGC34032 MGC3S130 10.58 S388746 148581 NM 152489 hypothetical protein MGC35130 MGC3S4O2 6.35 S.146844 3.99669 AKO96828 hypothetical protein MGC35402 MGC39518 5.83 S.372046 2851.72 BCO3929S hypothetical protein MGC39518 MGC398.21 11.27 S.351906 284440 BI599587 hypothetical protein MGC39821 MGC39900 8.56 S422848 286527 HO9657 hypothetical protein MGC39900 MGC45441 9.54 S.36567 149473 AU153816 hypothetical protein MGC45441 MGC45594 8.34 S.408.128 284273 BCO3378O hypothetical protein MGC45594 MGC46719 8.84 S.356.109 128077 AWSOO18O hypothetical protein MGC46719 MGCS2498 8.36 S.424,589 348378 AI2O2632 hypothetical protein MGC52498 MGC5395 10.77 S.378738 79026 BG287862 hypothetical protein MGC5395 MGC8721 S.OO S.27992.1 S1669 AI351653 hypothetical protein MGC8721 MGST2 13.23 S.81874 4258 6O1733 NM 002413 microsomal glutathione S-transferase 2 MIB S.49 S.34892 57534 608677 BEO48628 DAPK-interacting protein 1 MIDORI 12.95 S.3O1242 57538 ABO377S1 ikely ortholog of mouse myocytic induction differentiation originator MIXL1 16.37 S.282O79 83881 AF21 1891 Mix1 homeobox-like 1 (Xenopus laevis) MME 12.25 S.307734 4311 120520 NM OO7287 membrane metallo-endopeptidase (neutral endopeptidase, enkephalinase, CALLA, CD10) MMP14 13.59 S.2399 4323 6007 S4 NM OO4995 matrix metalloproteinase 14 (membrane-inserted) MMP21 6.36 S.314141 11.8856 608416 NM 147191 matrix metalloproteinase 21 MPRG 12.47 S.257511 S4852 607781 934,557 membrane progestin receptor gamma MRC1 S.O2 S.75182 4360 153618 M 002438 mannose receptor, C type 1 MSMB 25.23 S.255462 4477 1571.45 M 002443 microseminoprotein, beta MTUS1 6.10 S.7946 57509 LO96842 mitochondrial tumor Suppressor 1 MYBPC1 7.33 S.169849 4604 16O794 F593.509 myosin binding protein C, slow type MYCT1 14.81 S.1816O 8O177 242583 myc target MYL4 11.75 S.356717 4635 160770 58.851 myosin, light polypeptide 4, alkali; atrial, embryonic MYL7 38.38 S.75636 S8498 M 021223 myosin, light polypeptide 7, regulatory MYO3A 6.69 S.148228 S3904 606808 AA44328O myosin IIIA MYOCD 6.86 S.42128 93649 606127 093327 myocardin NBL1 8.62 S.439671 4681 600613 M O05380 neuroblastoma, Suppression of tumorigenicity 1 NBR2 6.10 SSOO268 10230 M OO5821 neighbor of BRCA1 gene 2 NCR1 9.38 S.97084 9437 604530 M 004829 natural cytotoxicity triggering receptor 1 NEBL 10.59 S.SO2S 10529 605491 ESO2910 nebullette NEXN 9.60 S.22370 91624 F114264 nexilin (Factin binding protein) NEASC 35.64 S.13349 23114 821777 neurofascin NFATC2 6.56 S.356321 4773 600490 77.0171 nuclear factor of activated T-cells, cytoplasmic, calcineurin dependent 2 LA 6.83 S.81328 4792 164008 O781.67 nuclear factor of kappa light polypeptide gene enhancer in B-cells inhibitor, alpha IL1 6.39 S.2764 4795 FO97419 nuclear factor of kappa light polypeptide gene enhancer in B-cells inhibitor-like 1 NODAL 7.89 S.370414 4838 6O126S OSO866 nodal homolog (mouse) NOG 6.96 S.2482O1 9241 6O2991 L575177 noggin NOL3 5.46 S.462911 8996 60.5235 FO43244 nucleolar protein 3 (apoptosis repressor with CARD domain) NPL 6.11 S.64896 8O896 368358 N-acetylneuraminate pyruvate lyase (dihydrodipicolinate synthase) NPPB 35.43 S.219140 4879 60O295 M OO2521 natriuretic peptide precursor B 1933 S.268.490 190 3OO473 M 000475 nuclear receptor Subfamily O, group B, member 1 6.71 S.126608 2908 138040 934556 nuclear receptor Subfamily 3, group C, member 1 (glucocorticoid receptor) NRG 15.88 S.453951 3O84 142445 M O13959 neuregulin 1 NRP1 14:29 S.173S48 8829 602069 F145712 neuropilin 1 NRTN S.20 S.234775 4902 602O18 L161995 neurturin NTN4 57.13 S.102541 59.277 F278532 netrin 4 NUDT14 7.42 S.436170 2S6281 F111170 nudix (nucleoside diphosphate linked moiety X)-type motif 14 NUDT4 13.66 S355399 11163 W 5 1 1 1 3 5 nudix (nucleoside diphosphate linked moiety X)-type motif 4 OMD 6.06 S.94O70 4958 M 005014 osteomodulin 5.43 S.444988 79541 COO5587 olfactory receptor, family 2, Subfamily A, member 4 ORS1B2 5.59 79345 F137396 olfactory receptor, family 51, Subfamily B, member 2 OSTF S.20 S47011 26578 U929456 osteoclast stimulating factor 1 5.44 S.278901 23533 BG236366 phosphoinositide-3-kinase, regulatory Subunit, polypeptide p101 P2RY 2 9.81 S.444983 648OS AA810452 purinergic receptor P2Y, G-protein coupled, 12 PAG 18.02 S.266.175 SS824 A. KOOO68O phosphoprotein associated with glycosphingolipid-enriched microdomains PAX6 48.76 S.89506 SO8O 6O7108 WO88232 paired box gene 6 (aniridia, keratitis) PCDH 10 12.SO S.146858 57575 608286 640307 protocadherin 10 PCDH 7 6.54 S.443020 5099 602988 E644.809 BH-protocadherin (brain-heart) PCDH A2 6.83 S6146 6063O8 COO3126 protocadherin alpha 2 PCDH B13 6.32 S.2838O3 S6123 606339 AA489646 protocadherin beta 13 PCDH 10.65 S.119693 26167 606331 AF152S28 protocadherin beta 5 US 2014/0134726 A1 May 15, 2014 41

TABLE 3-continued Up-regulated markers in definitive endodern

Raw Fold Locus- SeqLDerived Gene Symbol Change Unigene Link OMIM From Gene Descriptor PCTP 6.11 Hs.285218 58488 606055 NM 021213 phosphatidylcholine transfer protein PDE4D 8.34 HS.28482 S144 600129 USO157 phosphodiesterase 4D, cAMP-specific (phosphodiesterase E3 dunce homolog, Drosophila) PDE4DIP 7.74 Hs.502577 9659 608117 H24473 phosphodiesterase 4D interacting protein (myomegalin) PDE4DIP 6.27 HSSO2S82 9659 608117 ABOO7923 phosphodiesterase 4D interacting protein (myomegalin) PDZK1. 29.68 HS.15456 5174 603831 NM 002614 PDZ domain containing 1 PHLDB2 5.43 HS.7378 90102 AKO25444 pleckstrinhomology-like domain, family B, member 2 PIGC S.O4 HS.386487 S279 6O1730 ALO3S301 phosphatidylinositol glycan, class C PITX2 12.09 HS.92282 5308 601542 NM 000325 paired-like homeodomain transcription factor 2 PKHD1L1 11.52 HS.17O128 93O3S 60.7843 AV7O6971 polycystic kidney and hepatic disease 1 (autosomal recessive)-like 1 PLA1A 6.48 HS.17752 51365 607460 NM O15900 phospholipase A1 member A PLA2G4D 5.98 HS.38O225 283748 BCO34571 phospholipase A2, group IVD (cytosolic) PLAGL1 6.84 HS-132911 5325 603044 NM 002656 pleiomorphic adenoma gene-like 1 PLCE1 18.17 Hs.103417 51196 608414 NM 016341 phospholipase C, epsilon 1 PLD1 11.78 HS.38O819 5337 6O2382 AI378587 phospholipase D1, phophatidylcholine-specific PLXNA2 23.20 HS.3OO622 5362 601054 AI688418 plexin A2 PLXNA2 S.48 HS.3S006S 5362 601054 AI694S45 plexin A2 PORCN 6.62 HS.386453 64840 NM 022825 porcupine homolog (Drosophila) PPAPDC1 7.83 HS.40479 196051 BF130943 phosphatidic acid phosphatase type 2 domain containing 1 PPFIBP2 7.OO HS-12953 849S 603142 AI69218O PTPRF interacting protein, binding protein 2 (liprin beta 2) PPP1R16B 846 HS.45719 26051 AB020630 protein phosphatase 1, regulatory (inhibitor) subunit 16B PPP3CC O.69 HS.752O6 5533 114107 NM 0056.05 protein phosphatase 3 (formerly 2B), catalytic subunit, gamma isoform (calcineurin Agamma) PRDM1 3.93 HS.381140 639 603423 AI692659 PR domain containing 1, with ZNF domain PREX1 6.21 HS.4372S7 S7580 606905 AL445.192 phosphatidylinositol 3,4,5-trisphosphate-dependent RAC exchanger 1 PRND 2.95 Hs.406696 23627 604263 AL133396 prion protein 2 (dublet) PROO471 5.73 HS.381.337 28994 AF111846 PROO471 protein PRO2012 8.53 HS.283066 SS478 NM 018614 hypothetical protein PRO2012 PROS1 3.38 HS.64O16 5627 176880 NM 000313 protein S (alpha) PRSS2 2OS.O3 HS.435699 S646 NM 002770 protease, serine, 3 (mesotrypsin) PRV1 24.58 Hs.232165 57126 162860 NM 020406 polycythemia rubravera 1 PTGFR 6.53 HS.89418 5737 600563 NM 000959 prostaglandin F receptor (FP) PTPN5 6.24 HS.79092 84867 176879 H29627 protein tyrosine phosphatase, non-receptor type 5 (striatum enriched) PTPRM 8.83 HS.1541.51 S797 176888 BCO294.42 protein tyrosine phosphatase, receptor type, M QPCT 8.60 HS.79033 25797 607065 NM 012413 glutaminyl-peptide cyclotransferase (glutaminyl cyclase) RAB17 20.30 HS.44278 64284 NM 022449 RAB17, member RAS oncogene family RAB18 5.00 HS.414357 22931 602207 AW367507 RAB18, member RAS oncogene family RAI1 7.07 HS.438904 10743 607642 BF984830 retinoic acid induced 1 RARB 5.89 HS.436538 5915 180220 NM 000965 retinoic acid receptor, beta RASGEF1B 6.86 HS.352SS2 153O2O BCO36784 RasGEF domain family, member 1B RASGRP1 7.86 Hs. 189527 10.125 603962 NM 005.739 RAS guanyl releasing protein 1 (calcium and DAG-regulated) RASSF6 5.44 HS.158857 166824 AI167789 Ras association (RalGDSAF-6) domain family 6 RBM2O 11.OO HS.11663.O 282996 AIS391.18 RNA binding motif protein 20 RBM24 21.38 HS2O1619 221662 AI677701 RNA binding motif protein 24 RBP5 1434 HS.246O46 83758 AY007436 retinol binding protein 5, cellular RBPMS S.21 HS.1958.25 11O3O 6O1558 AIO1709S RNA binding protein with multiple splicing RGC32 27.89 HS.76640 28984 NM 014059 response gene to complement 32 RGS11 7.12 HS.65756 8786 603895 NM 003834 regulator of G-protein signalling 11 RGS13 S.1S HS.1716S 6003 607 190 BC036950 regulator of G-protein signalling 13 RGSS S.70 HS249SO 8490 603276 AF159570 regulator of G-protein signalling 5 RGS8 7.50 HS.458417 85397 607 189 R371.01 regulator of G-protein signalling 8 RHOBTB3 21.23 HS.31653 22836 607353 NM O14899 Rho-related BTB domain containing 3 RIN2 7.OO HS.4463O4 54453 AL136924 Ras and Rab interactor 2 RNASE1 8.63 HS78224 6035 180440 NM 002933 ribonuclease, RNase A family, 1 (pancreatic) RNASEA. 10.29 HS.283749 6038. 601030 AIf 61728 ribonuclease, RNase A family, 4 ROR2 10.29 HS2O8O8O 4920 602337 NM 004560 receptor tyrosine kinase-like orphan receptor 2 RP26 8.04 HS.14S140 375298 608381 AI936034 retinitis pigmentosa 26 (autosomal recessive) RS1 S.49 HS.1493.76 6247 312700 AL049684 retinoschisis (X-linked, juvenile) 1 RTN4RL1 60.66 HS.22917 146760 HO6251 reticulon 4 receptor-like 1 RUFY2 S. 60 HS.297.044 SS680 NM 017987 RUN and FYVE domain containing 2 RWDD3 9.89 HS.196585 2S950 AW295367 RWD domain containing 3 S100A10 6.73 HS.143873 6281 114085 BF 12615S S100 calcium binding protein A10 (annexin II ligand, calpactin I, light polypeptide (p11)) S100A11 7.07 HS.417004 6282 603114 NM 005620 S100 calcium binding protein A11 (calgizZarin) S100A13 7.21 HS.446592 6284 601989 NM 005979 S100 calcium binding protein A13 S100A14 26.13 Hs.288998 57402 607986 NM 020672 S100 calcium binding protein A14 S100A16 11.27 HS.8182 140576 AA045184 S100 calcium binding protein A16 S100A2 S.97 HSS15713 6273 176993 NM 005978 S100 calcium binding protein A2 S1 OOZ 12.38 HS.3S2172 170591 AF437876 S100Z. protein US 2014/0134726 A1 May 15, 2014 42

TABLE 3-continued Up-regulated markers in definitive endodern

Raw Fold Locus SeqLDerived Gene Symbol Change Unigene Link OMIM From Gene Descriptor S AMD3 12.06 S44.0508 154075 AI129628 sterile alpha motif domain containing 3 S CA1 8.18 S.434961 6310 BF438383 spinocerebellar ataxia 1 (olivopontocerebellar ataxia 1, autosomal dominant, ataxin 1) S 6.62 S.379191 79966 608370 AL571375 Stearoyl-CoA desaturase 4 S S.98 S.186877 11280 604385 AF150882 sodium channel, voltage-gated, type XI, alpha S 8.34 S.272374 S4536 AF220217 SEC15-like 1 (S. cerevisiae) S 7:44 S.252451 10371 603961 BF102683 sema domain, immunoglobulin domain (Ig), short basic domain, secreted, (Semaphorin) 3A S EMA3E 135.96 S.S28721 9723 NM 012431 sema domain, immunoglobulin domain (Ig), short basic domain, secreted, (semaphorin) 3E EMASA S.42 S.S28.707 9037 NM OO3966 sema domain, seven thrombospondin repeats (type 1 and type 1 like), transmembrane domain (TM) and short cytoplasmic domain, (Semaphorin) 5A ERHL 27.81 S.398O85 94.009 607979 AL450314 serine hydrolase-like ETT 5.97 S.160208 80854 606594 AKO24846 SET domain-containing protein 7 GPP1 5.85 S.24678 81537 BE880703 sphingosine-1-phosphate phosphatase 1 GSH 10.71 S.31074 6448 605270 NM OOO199 N-sulfoglucosamine Sulfohydrolase (sulfamidase) H3BGR 8.65 S.474.38 6450 602230 NM OO7341 SH3 domain binding glutamic acid-rich protein HOX2 12.75 S.SS967 6474 6O2SO4 AFO22654 short stature homeobox2 HREW1 15.74 S.25924 55966 AA835OO4 transmembrane protein SHREW1 AT7B 1312 S.288215 10610 NM OO6456 sialyltransferase 7 (alpha-N-acetylneuraminyl-2,3-beta-galactosyl 1,3)-N-acetylgalactosaminide alpha-2,6-Sialyltransferase) B. S 34.34 S.408.614 6489 6O1123 L32867 sialyltransferase 8A (alpha-N-acetylneuraminate:alpha-2.8- sialyltransferase, GD3 synthase) 7.63 S.3O2341 81.28 NM OO6011 sialyltransferase 8B (alpha-2,8-sialyltransferase) s 6.55 S.298923 S1046 NM O15879 sialyltransferase 8C (alpha2,3Galbeta1,4GlcNAcalpha 2.8- sialyltransferase) 161.93 S.3O8628 7903 422986 sialyltransferase 8D (alpha-2.8-polysialyltransferase) s 13.35 S.415117 8869 O17540 sialyltransferase 9 (CMP-NeuAc:lactosylceramide alpha-2,3- sialyltransferase; GM3 synthase) GLEC10 7.88 S.284.813 89790 606091 M 03.3130 sialic acid binding Ig-like lectin 10 PA1L2 5.51 S.406879 57568 BO37810 signal-induced proliferation-associated 1 like 2 LAC2-B S.O1 S.138380 23O86 BO14524 SLAC2-B LC1A1 14.09 S.91139 6505 133550 W 23 5 O 6 1 solute carrier family 1 (neuronal epithelial high affinity glutamate transporter, system Xag), member 1 5.45 S.104637 6512 604471 M OO6671 solute carrier family 1 (glutamate transporter), member 7 18.00 S.354O13 115111 6084.79 758950 solute carrier family 26, member 7 7.97 S.441716 10568 604217 F146796 solute carrier family 34 (sodium phosphate), member 2 7.42 S.448979 234.43 60S632 COOS136 solute carrier family 35 (UDP-N-acetylglucosamine (UDP-GlcNAc) transporter), member A3 S 11.46 S.158748 148641 F968270 solute carrier family 35, member F3 S 44.06 S4O9875 30061 L136944 solute carrier family 40 (iron-regulated transporter), member 1 S 79.96 S.37890 2OOO10 767388 solute carrier family 5 (sodium glucose cotransporter), member 9 S 6.86 S.413095 S4716 M 020208 solute carrier family 6 (neurotransmitter transporter), member 20 S 12.84 S.83974 6578 M OO5630 solute carrier organic anion transporter family, member 2A1 S 13.29 S.32O368 84631 L109653 SLIT and NTRK-like family, member 2 S 737 S.444445 6604 NM 003078 SWI/SNF related, matrix associated, actin dependent regulator of chromatin, Subfamily d, member 3 S MPD1 5.76 S.77813 6609 M59917 sphingomyelin phosphodiesterase 1, acid lysosomal (acid sphingomyelinase) S OCS1 S.92 S50640 8651 603597 U88326 Suppressor of cytokine signaling 1 S OCS3 941 S.436943 9021 604176 AI.244908 Suppressor of cytokine signaling 3 S ORCS1 24.07 S.348923 114815 606283 AI675836 sortilin-related VPS10 domain containing receptor 1 S OX17 30.2O S.98367 64321 NM 0224.54 SRY (sex determining region Y)-box 17 S OX18 10.36 S8619 S4345 6O1618 NM 018419 SRY (sex determining region Y)-box 18 S POCK3 62.58 S.159425 50859 607989 AI808090 sparcosteonectin, cwcv and kazal-like domains proteoglycan (testican) 3 S PTAN1 5.56 S.387905 6709 1828.10 AKO26484 spectrin, alpha, non-erythrocytic 1 (alpha-fodrin) S RGAP1 9.61 S.408259 57522 606523 BCO29919 SLIT-ROBO Rho GTPase activating protein 1 S SX2 11.44 S.289105 6757 3OO192 BCOO2818 synovial sarcoma, X breakpoint 2 S SX3 1243 S.178749 10214 3OO325 NM 021014 synovial sarcoma, X breakpoint 3 S SX4 8.27 S.278632 6759 3OO326 BCOO5325 synovial sarcoma, X breakpoint 4 S TAT4 6.73 S.80642 6775 600SS8 NM OO3151 signal transducer and activator of transcription 4 S TC1 1886 S.25590 6781 601.185 U46768 Stanniocalcin 1 S TMN2 33.93 S.9 OOOS 11075 6OO621 BF967657 stathmin-like 2 S ULF2 5.84 S.43857 55959 AL133OO1 Sulfatase 2 S ULT1E1 S.O8 S.S.4576 6783 6OOO43 NM 005420 Sulfotransferase family 1E, estrogen-preferring, member 1 S V2B 16.36 S.8071 98.99 185861 NM 014848 synaptic vesicle glycoprotein 2B S YNE1 840 S.282117 23345 608441 AFO43290 spectrin repeat containing, nuclear envelope 1 TAF11 S.83126 6882 600772 BQ709323 TAF11 RNA polymerase II, TATA box binding protein (TBP)- associated factor, 28 kDa US 2014/0134726 A1 May 15, 2014 43

TABLE 3-continued Up-regulated markers in definitive endodern

Raw Fold Locus SeqLDerived Gene Symbol Change Unigene Link OMIM From Gene Descriptor TAGLN2 9. OS S.406504 84O7 604634 NM 003564 transgelin 2 TAIP-2 9.08 S.287442 80O34 NM 024969 TGF-beta induced apotosis protein 2 TAL2 6.OO S.247978 6887 186855 NM 005421 T-cell acute lymphocytic leukemia 2 TA-LRRP 7.71 S.387915 23507 R41498 T-cell activation leucine repeat-rich protein TCF2 9.87 S.408093 6928 189907 NM 000458 transcription factor 2, hepatic; LF-B3; variant hepatic nuclear factor TDRD7 1946 S.416543 23424 AW129593 tudor domain containing 7 TGFB1 S.40 HS.1103 7040 19018O BCOOO12S transforming growth factor, beta 1 (Camurati-Engelmann disease) TIMP3 5.72 S.245.188 7078 188826 NM OOO362 issue inhibitor of metalloproteinase 3 (Sorsby fundus dystrophy, pseudoinflammatory) TIRP 6.36 S.278391 353376 608321 AI4(OO110 TRIF-related adaptor molecule TLE 8.87 S.332.173 7089 601041 M994.36 transducin-like enhancer of split 2 (E(Sp1) homolog, Drosophila) TM4SF1 8.02 S.351316 4O71 191155 AI189753 transmembrane 4 Superfamily member 1 TM4SF12 5.88 23554 NM 012338 transmembrane 4 Superfamily member 12 TMOD1 74.09 7111 190930 NM OO3275 tropomodulin 1 TNFSF10 15.05 8743 603598 NM OO3810 tumor necrosis factor (ligand) Superfamily, member 10 TNNC1 31.06 7134 191040 AFO2O769 troponin C, slow TPK1 S.18 27010 606370 NM 022445 hiamin pyrophosphokinase 1 TPTE 11.46 7179 604336 AL137389 transmembrane phosphatase with tensin homology TRADD 8.22 8717 603SOO NM 003789 TNFRSF1A-associated via death domain TRDN 12.67 10345 603283 AA192306 triadin TRPA1 103.37 8989 604775 AASO2609 transient receptor potential cation channel, Subfamily A, member 1 TRPM4 S.OO 54.795 606936 NM O17636 transient receptor potential cation channel, Subfamily M, member 4 TSGA10 6.41 80705 6O7166 AYO14284 estis specific, 10 TSHR S.96 7253 603372 BEO45816 hyroid stimulating hormone receptor TSPYL5 17.94 85453 AIO96375 TSPY-like S TTC3 12.26 7267 602259 AI652848 etratricopeptide repeat domain 3 TTLL3 7.35 26140 AA534291 tubulin tyrosine ligase-like family, member 3 TTN 76.16 7273 188840 NM 003319 itin TUBE1 8.85 51175 6O7345 BE55O254 tubulin, epsilon 1 UBE21 7.26 S146S N64O79 ubiquitin-conjugating enzyme E2, J1 (UBC6 homolog, yeast) UBXD3 5.81 127733 T86344 UBX domain containing 3 UNC13C 32.43 145790 AL8344O7 unc-13 homolog C (C. elegans) UNC93A 32.18 S4346 607995 ALO21331 unc-93 homolog A (C. elegans) UNQ467 5.30 388533 W69083 KIPV467 UNQ827 S.O8 40O830 BEO44548 KFLL827 USP3 7.90 9960 604728 AFO77040 ubiquitin specific protease 3 USP53 6.36 54532 AW188464 ubiquitin specific protease 53 WANGL1 7.57 81839 NM 024062 vang-like 1 (van gogh, Drosophila) WCAM1 1840 7412 192225 NM 001078 vascular cell adhesion molecule 1 WGCNL1 S.61 259232 BE22O480 voltage gated channel like 1 VIL1 14.42 7429 193040 NM 007127 willin 1 VNN3 6.94 5.5350 606592 NM O78625 wanin 3 VPS13C 6.16 3 2 8 1 O 9 54832 AA828371 vacuolar protein sorting 13C (yeast) VTN 6.33 7448 1931.90 NM 000638 vitronectin (serum spreading factor, somatomedin B, complement S-protein) VWF 13.24 S.440848 7450 1934OO NM 000552 von Willebrand factor WASPIP 5.76 S.401414 7456 6O2357 AWOS8622 Wiskott-Aldrich syndrome protein interacting protein WBSCR24 6.28 S.126451 155382 S21163 Williams Beuren syndrome chromosome region 24 WDR31 6.10 S.133331 114987 F589326 WD repeat domain 31 WFDC2 8.49 Hs.2719 10406 M OO6103 WAP four-disulfide core domain 2 WT1 1992 HS1145 7490 6O7102 M O24426 Wilms tumor 1 XIST S.11 7503 314670 E644917 X (inactive)-specific transcript XLHSRF-1 10.49 HS.974O 25981 OO4779 heat shock regulated 1 YAF2 1236 S.348380 101.38 607S34 6794 YY1 associated factor 2 YPEL2 15.77 S.368672 3884.03 ESO2982 yippee-like 2 (Drosophila) ZAP70 14.98 S.234569 7535 176947 BO83211 Zeta-chain (TCR) associated protein kinase 70 kDa ZFPM2 6.50 S.106309 23414 603693 M 012082 Zinc finger protein, multitype 2 ZNF148 S.13 S.442787 7707 6O1897 W 5 9 4 1 6 7 Zinc finger protein 148 (pHZ-52) ZNF335 7.13 S.174193 63925 Zinc finger protein 335 ZNF396 1881 S.351 OOS 2S2884 F533251 Zinc finger protein 396 ZNF436 7.11 S.293798 8O818 829509 Zinc finger protein 436 ZNF439 5.39 S.378527 90594 29327 Zinc finger protein 439 ZNF471 7.23 S.23O188 57573 LO42S23 Zinc finger protein 471 ZNF533 8.23 HS. 6295 151126 O38422 Zinc finger protein 533 ZNF555 9.1O S.12471 148254 FOS2118 Zinc finger protein 555 ZNF7 8.09 HS2O76 7553 194531 862153 Zinc finger protein 7 (KOX 4, clone HF.16) 116.51 S.11873 W 1 6 7 7 2 7 Transcribed sequences 66.25 S.177968 821586 Transcribed sequence with weak similarity to protein ref: NP 066928.1 (H. sapiens) phospholipid scramblase 1 Homo Sapiens US 2014/0134726 A1 May 15, 2014 44

TABLE 3-continued Up-regulated markers in definitive endodern

Raw Fold Locus SeqLDerived Gene Symbol Change Unigene Link OMIM From Gene Descriptor S4.91 S.160418 BF941609 Transcribed sequences 52.28 S.444751 AI916S32 Transcribed sequences SO.21 S.390285 4.01221 BCO344O7 CDNA clone IMAGE: 5268504, partial cols 43.32 S.S18877 BCO14345 Clone IMAGE: 3935474, mRNA 42.12 S.446121 AW450381 Transcribed sequences 41.67 S.445324 BFS13800 Transcribed sequences 4O16 S.S.14745 N637O6 CDNA FLJ41084 fis, clone ADRGL2010974 35.98 S.434969 AI917371 Transcribed sequences 35.50 S.1454.04 345557 AI6943.25 Similar to RIKEN cDNA B130016O10 gene (LOC345557), mRNA 33.79 S.127462 AI190292 Transcribed sequences 33.43 S.S.0850 AI127440 CDNA FLJ3.0128 fis, clone BRACE1000 124 32.63 S.348762 AI143879 CDNA FLJ25677 fis, clone TSTO4054 3O42 AI248OSS Consensus includes gb: AI248.055 (FEA = EST /DB XREF = gi: 38434.52 /DB XREF = est: qhé4c02.x1 iCLONE = IMAGE: 1849442 UG = HS.137263 ESTs 3O.OS S.7888 AW772192 CDNA FLJ44318 fis, clone TRACH3000780 29.77 S.382051 BCO1458S Clone IMAGE: 4047715, mRNA 29.74 S.446340 AW2914O2 Transcribed sequences 28.51 S.112742 401141 BFSO8344 CDNA clone IMAGE: 6301163, containing frame-shift errors 28.41 S.118317 AU147152 CDNA FLJ12088 fis, clone HEMBB1002545 27.71 S.360628 AW8736O4 CDNA FLJ42757 fis, clone BRAWH3001712 24.86 S.158853 AW274369 Transcribed sequences 24.14 AFOO9316 gb: AF0093.16.1 /DB XREF = gi: 2331119 TID = Hs2.384893.1 CNT = 1 FEA = mRNA TIER = ConsEnd STK = 0 UG = HS.384893 fUG TITLE = Homo sapiens clone TUB2 Cri-du-chat region mRNA /DEF = Homo sapiens clone TUB2 Cri-du-chat region mRNA. 24.04 HS.S31661 AFO88O10 Full length insert cDNA clone YY74E10 23.38 HS287436 AU145336 CDNA FLJ11655 fis, clone HEMBA1004554 22.75 HS.106975 H10408 Transcribed sequences 21.99 BCO37932 gb: BCO37932.1/DB XREF = gi: 23138809 TID = Hs2.385469.1 CNT = 2 FEA = mRNA TIER = ConsEnd STK = 0 UG = HS.385469 /UG TITLE = Homo sapiens, clone IMAGE: 5285165, mRNA /DEF = Homo sapiens, clone IMAGE: 5285165, mRNA. 2126 S-451 103 BCO41486 CDNA clone IMAGE: 5492202, partial cols 20.96 S.269873 AF339813 Clone IMAGE: 297403, mRNA sequence 20.87 S.113150 AV7391.82 Transcribed sequences 2010 S.191144 T83380 Transcribed sequences 9.98 S.S2S221 W29.1482 Transcribed sequences 9.17 S34673S E222344 Clone IMAGE: 3881549, mRNA 8.97 S.256230 W418842 CDNA FLJ31245 fis, clone KIDNE2005062 7.92 S.265.194 F223214 Transcribed sequences 7.65 S.S23019 8O1626 Transcribed sequence with weak similarity to protein ref: NP 055301.1 (H. sapiens) neuronal thread protein Homo Sapiens 6.87 S.121518 Transcribed sequence with strong similarity to protein sp: P00722 (E. coli) BGAL ECOLI Beta-galactosidase 6.53 S.126622 W341.701 Transcribed sequences 6.52 S.348682 L833609 Similar to mitochondrial carrier triple repeat 1 (LOC4O1612), mRNA 6.41 S.2O1875 E6454.80 Transcribed sequences 6.35 S.296O14 U148154 CDNA FLJ14136 fis, clone MAMMA1002744 6.33 S.375762 955.713 Clone IMAGE: 4828750, mRNA 6.33 S390443 CO43538 Clone IMAGE: 5170153, mRNA 6.27 S.488530 FO86391 Full length insert cDNA clone ZD73HO4 5.79 S.244.283 O18404 MRNA, cDNA DKFZp686I19109 (from clone DKFZp686I19109) 5.55 S.S19952 L83.3685 MRNA, cDNA DKFZp667O0522 (from clone DKFZp667O0522) 5.45 S.13188 W3OO488 Human HepG2 partial cDNA, clone himd5d04m5. 5.37 S.311250 L137360 MRNA, cDNA DKFZp434CO326 (from clone DKFZp43400326) 5.33 S.96917 F591483 Transcribed sequences 5.15 S.316856 CO4O219 Clone IMAGE: 4818734, mRNA 5.15 S.SOSOO3 G208091 CDNA FLJ37414 fis, clone BRAWH10OO157 5.05 S.126995 M969275 CDNA FLJ46728 fis, clone TRACH3O19142 4.91 S.345792 A. 684SS1 Transcribed sequence with weak similarity to protein ref: NP 077243.1 (M. musculus) ribosome binding protein 1 isoform mRRp.61 Mus musculus 4.91 S.441051 769647 CDNA clone IMAGE: 5296,106, partial cols 4.75 S.526642 498OS Transcribed sequences 4.18 S.264606 741597 Full length insert cDNA clone ZD68B12 3.94 S.S28S4O CO3949S Similar to double homeobox protein (LOC4O1074), mRNA 3.54 S.241559 L109791 MRNA full length insert cDNA clone EUROIMAGE 151432 3.49 S.2S3690 BCO42378 Clone IMAGE: 5277693, mRNA 3.17 S.291 O15 AA651631 Transcribed sequences US 2014/0134726 A1 May 15, 2014 45

TABLE 3-continued Up-regulated markers in definitive endodern

Raw Fold Locus SeqLDerived Gene Symbol Change Unigene Link OMIM From Gene Descriptor 13.13 HS.S2928S AA588.092 Transcribed sequences 12.65 HS128439 AI190306 Transcribed sequences 12.63 HS.435959 BFS11336 Transcribed sequences 12.55 HS2O2S12 AI697714 Transcribed sequences 12.51 HS126918 R1SOO4 Transcribed sequences 12.21 AL832724 gb: AL832724.1 /DB XREF = gi: 21733304 (TID = Hs2.376962.1 CNT = 1 FEA = mRNA TIER = ConsEnd STK = 0 UG = HS.376962 /UG TITLE = Homo sapiens mRNA, cDNA DKFZp313I209 (from clone DKFZp313I209) DEF = Homo sapiens mRNA, cDNA DKFZp313I209 (from clone DKFZp313I209). 12.16 HS.385547 390845 BCO36040 Similar to KIAAO563-related gene (LOC390845), mRNA 12.08 HS.172778 ABO46770 CDNA FLJ38287 fis, clone FCBBF3008362, moderately similar to PLEXIN 4 PRECURSOR 12O6 HS.446.559 BGO24649 Transcribed sequence with strong similarity to protein sp: P00722 (E. coli) BGAL ECOLI Beta-galactosidase 11.82 AKO55534 gb: AKO55534.1 /DB XREF = gi: 16550279 TID = Hs2.350858.1 CNT = 1 FEA = mRNA TIER = ConsEnd STK = 0 UG = HS.350858 /UG TITLE = Homo sapiens cDNA FLJ30972 fis, clone HEART2000492. DEF = Homo sapiens cDNA FLJ30972 fis, clone HEART2000492. 11.80 HS.348434 BCO33956 Clone IMAGE: 5286779, mRNA 11.66 HS.158528 BFOO2339 Transcribed sequences 11.58 HS.110406 AIFOO341 Transcribed sequences 11.57 HS.S17745 AL359583 MRNA, cDNA DKFZp547L174 (from clone DKFZp547L174) 11.54 HS116462 AA639753 Transcribed sequences 11.28 HS28.391 BFS11741 CDNA FLJ37384 fis, clone BRAMY2026347 11.18 HS.382361 BCO34279 CDNA clone IMAGE:4824424, containing frame-shift errors 11.17 HS143258 BCO21687 Clone IMAGE: 3934974, mRNA 11.07 HS.110286 AA418.074 CDNA FLJ43404 fis, clone OCBBF2017516 10.97 HS194626 AA916568 Transcribed sequences 10.96 HS.S32249 AL833080 MRNA, cDNA DKFZp451G2119 (from clone DKFZp451G2119) 10.96 HS.383.399 AF113679 Cone FLB3107 10.87 HS108068 BF197757 Transcribed sequences 10.80 HS.374838 AB002318 Clone 24641 mRNA sequence 10.70 AKO24556 gb: AKO24556.1 /DB XREF = gi: 10436865 TID = Hs2.383622.1 CNT = 1 FEA = mRNA TIER = ConsEnd STK = 0 UG = HS.383622 fUG TITLE = Homo sapiens cDNA: FLJ20903 fis, clone ADSE00222. DEF = Homo sapiens cDNA: FLJ20903 fis, clone ADSEOO222. 10.67 HS.S31632 R43486 CDNA FLJ33443 fis, clone BRALZ10001.03 10.61 HS.163734 BG028463 Transcribed sequences 1054 HS1431.34 AW207243 CDNA FLJ38181 fis, clone FCBBF1000125 10.49 HS.290853 AW970985 Transcribed sequences 10.43 HS.447459 AU158588 CDNA FLJ13756 fis, clone PLACE3000365 10.37 HSS25111 AW1345.04 Transcribed sequence with moderate similarity to protein sp: P39194 (H. sapiens) ALU7 HUMAN Alu subfamily SQ sequence contamination warning entry 10.34 AKO96139 gb: AKO96139.1 /DB XREF = gi: 21755553 (TID = Hs2.376171.1 CNT = 1 FEA = mRNA TIER = ConsEnd STK = 0 UG = HS.376171 fUG TITLE = Homo sapiens cDNA FLJ38820 fis, clone LIVER2008473. DEF = Homo sapiens cDNA FLJ38820 fis, clone LIVER2OO8473. 10.2O HS.314414 BC040901 Clone IMAGE: 5743779, mRNA 10.15 HS.432615 AWO70877 Transcribed sequences 10.13 Hs.176872 AW297742 Transcribed sequences 1.O.OS AKO57525 gb: AKO57525.1/DB XREF = gi: 16553266 TID = Hs2.4O1310.1 CNT = 6 FEA = mRNA TIER = ConsEnd STK = S UG = HS.401.310 fUG TITLE = Homo sapiens cDNA FLJ32963 fis, clone TESTI2008.405. DEF = Homo sapiens cDNA FLJ32963 fis, clone TESTI2O084OS. 9.91 BCO16356 gb: BC016356.1 /DB XREF = gi: 18921441 TID = Hs2.382791.1 CNT = 1 FEA = mRNA TIER = ConsEnd STK = 0 UG = HS.382791 /UG TITLE = Homo sapiens, clone IMAGE: 4093039, mRNA /DEF = Homo sapiens, clone IMAGE: 4093039, mRNA. 9.90 HS283228 BC041378 CDNA clone IMAGE: 5274.693, partial cols 9.85 HS.S21517 AYO10114 Unknown mRNA sequence 9.74 HS.161.93 BG251521 MRNA, cDNA DKFZp586B211 (from clone DKFZp586B211) 9.64 HS116631 N63890 Transcribed sequences 9.59 HS.130526 AIOO4009 Transcribed sequence with moderate similarity to protein sp: P39194 (H. sapiens) ALU7 HUMAN Alu subfamily SQ sequence contamination warning entry US 2014/0134726 A1 May 15, 2014 46

TABLE 3-continued Up-regulated markers in definitive endodern

Raw Fold Locus SeqLDerived Gene Symbol Change Unigene Link OMIM From Gene Descriptor 9.SO S.176379 AA913O23 Transcribed sequences 9.45 S.278.177 AA809449 Transcribed sequence with weak similarity to protein sp: P39191 (H. sapiens) ALU4 HUMAN Alu subfamily SB2 sequence contamination warning entry 9.32 S.101428 AW628575 Transcribed sequences 9.25 S.373857 AA551114 Transcribed sequences 9.23 S.57773 AA449026 Transcribed sequences 9.09 S.46908 AI629O41 Transcribed sequences 9.08 S.364332 BCO42834 Clone IMAGE: 5314388, mRNA 9.00 S.221.37 AI3.93695 MRNA, cDNA DKFZp686O0849 (from clone DKFZp686O0849) 8.94 S.S.08368 AF3398OS Clone IMAGE:248602, mRNA sequence 8.91 S.12962O AI97O133 Transcribed sequences 8.84 S.481953 AU144883 CDNA FLJ11566 fis, clone HEMBA1003273 8.78 S.42522 2553.38 AIO27091 Clone IMAGE: 4827791, mRNA 8.78 S.418040 BF47608O CDNA clone IMAGE: 30367357, partial cols 8.73 S. 60797 AAO173O2 Transcribed sequences 8.70 S.114111 BMO21056 Full length insert cDNA clone YA75D10 8.68 S.469666 ALO41381 CDNA FLJ32512 fis, clone SMINT1000075 8.63 S.36958 AA496799 Transcribed sequences 8.59 S.212298 AW204033 Transcribed sequence with weak similarity to protein sp: P39189 (H. sapiens) ALU2 HUMAN Alu subfamily SB sequence contamination warning entry 8.57 S.365692 AL691.692 CDNA FLJ20833 fis, clone ADKAO2957 8.57 S.158209 BE1784.18 Transcribed sequence with weak similarity to protein ref: NP 062553.1 (H. sapiens) hypothetical protein FLJ11267 Homo sapiens 8.54 S.12533 Z39.557 Clone 23705 mRNA sequence 8. SO S.49932O AU147518 CDNA FLJ12203 fis, clone MAMMA1000914 8.48 S.1SO378 BFO 63236 Transcribed sequences 8.47 S.306704 AKO248OO CDNA: FLJ21147 fis, clone CASO9371 8.41 AL1373.13 Consensus includes gb: AL137313.1 (DEF = Homo sapiens mRNA: cDNA DKFZp761M10121 (from clone DKFZp761M10121). /FEA = mRNA /DB XREF = gi: 68.07798 (UG = HS.306449 Homo sapiens mRNA, cDNA DKFZp761M10121 (from clone DKFZp761M10121) 8.41 gb: AKO98258.1 /DB XREF = gi: 21758235 (TID = Hs2.3798.09.2 CNT = 1 FEA = mRNA TIER = ConsEnd STK = 0 UG = HS.379809 fUG TITLE = Homo sapiens cDNA FLJ40939 fis, clone UTERU2008077. DEF = Homo sapiens cDNA FLJ40939 fis, clone UTERU2008077. 8.39 S.136672 BCO43227 CDNA clone IMAGE: 5295530, partial cols 8.36 S.434253 BCO42892 Clone IMAGE: 4830182, mRNA 8.36 S.351262 AI950472 Similar to otoconin 90, clone IMAGE: 4278507, mRNA 8.32 S.S31627 BE1384.86 MRNA, cDNA DKFZp686K1836 (from clone DKFZp686K1836) 8.31 S.24994.6 AV7OOO86 Transcribed sequences 8.26 S.1OSSOO AAS21154 Transcribed sequences 8.24 S.444915 AIf34054 Transcribed sequences 8.21 S.444387 AIOS1967 Transcribed sequences 8.18 S.384618 AFO86084 Full length insert cDNA clone YZ84CO1 8.18 S.370704 AW104.813 CDNA FLJ36685 fis, clone UTERU2008018 8.11 S.192075 AA7024.09 Transcribed sequences 8.10 S.117474 R49146 CDNA FLJ34491 fis, clone HLUNG2004774 8.08 S.492671 AU147969 CDNA FLJ12341 fis, clone MAMMA1002269 8.07 S.421746 CA442689 CDNA FLJ13866 fis, clone THYRO100 1213 8. OS S.126024 ALO41122 Transcribed sequence with weak similarity to protein ref: NP 491607.1 (C. elegans) C09D4.4a.p Caenorhabditis elegans 8.03 S.527657 AL552727 Transcribed sequences 8.03 S.133319 BE67S108 Alu repeat mRNA sequence 8.O2 S.S1515 AAOS3967 MRNA, cDNA DKFZp564G112 (from clone DKFZp564G112) 8.O2 S.155814 N23O33 Transcribed sequence with strong similarity to protein sp: P00722 (E. coli) BGAL ECOLI Beta-galactosidase 7.98 S.98.314 ALSS3774 MRNA, cDNA DKFZp586LO 120 (from clone DKFZp586LO120) 7.93 S.408455 AA002022 MRNA, cDNA DKFZp686.J1595 (from clone DKFZp686.J1595) 7.91 S.488293 N74195 Transcribed sequence with weak similarity to protein ref: NP 079268.1 (H. sapiens) hypothetical protein FLJ12547 Homo sapiens

US 2014/0134726 A1 May 15, 2014 50

TABLE 3-continued Up-regulated markers in definitive endodern

Raw Fold Locus SeqLDerived Gene Symbol Change Unigene Link OMIM From Gene Descriptor S.61 AI990495 Consensus includes gb: AI990495 (FEA = EST /DB XREF = gi: 5837376 (DB XREF = est: ws40dO2.x1 fCLONE - IMAGE: 2499651 UG = HS.1268.09 ESTs 5.60 HS.130260 AI695695 Transcribed sequences 5.55 HS.107070 H93O43 Transcribed sequences S.S4 HSSO1925 AU144382 CDNA FLJ11476 fis, clone HEMBA1001745 S.S4 HS.98O73 AA490685 CDNA: FLJ22994 fis, clone KAT11918 5.53 HS126133 38.9324 AIf 9969S MRNA, cDNA DKFZp686D0374 (from clone DKFZp686D0374) 5.52 HS.262826 AI218551 Transcribed sequences 5.51 HS.373,571 AA3OSO27 CDNA FLJ39665 fis, clone SMINT2007294 5.50 HS.445122 AW291140 Transcribed sequences S.48 NM O251.16 gb: NM 025116.1 (DEF = Homo sapiens hypothetical protein FLJ12781 (FLJ12781), mRNA. FEA = mRNA (GEN = FLJ12781 /PROD = hypothetical protein FLJ12781 /DB XREF = gi: 13376692 /UG = Hs.288726 hypothetical protein FLJ12781 /FL =gb: NM 025 116.1 S.48 Hs.7309 AA995925 CDNA FLJ34019 fis, clone FCBBF2002898 S.48 S.2688.18 AF1474.04 Full length insert cDNA clone YPO1 H07 5.47 S.433,791 AW664964 Transcribed sequence with weak similarity to protein ref: NP 060312.1 (H. sapiens) hypothetical protein FLJ20489 Homo sapiens 5.46 S.432355 392,551 Similar to Ras-related protein Rab-28 (Rab-26) (LOC392551), mRNA 5.46 S.302631 AW952781 CDNA clone IMAGE: 5286843, partial cols 5.45 S.371751 BCO443OS Clone IMAGE: 5174069, mRNA 5.44 HS.434.569 BQ707702 Full length insert cDNA clone YB66H06 5.44 HSS23914 AIS65746 Full length insert cDNAYH73HO8 5.44 HS.1857O1 AL109696 MRNA full length insert cDNA clone EUROIMAGE 21920 S.42 S.16370 AW139789 CDNA FLJ11652 fis, clone HEMBA1004461 5.41 S.371436 AW418647 CDNA FLJ34002 fis, clone FCBBF1000206 5.41 S.160900 BIS6OO14 CDNA FLJ39420 fis, clone PLACE6O18769 5.41 S.96297 AIOS1769 Transcribed sequence with weak similarity to protein ref: NP 055301.1 (H. sapiens) neuronal thread protein Homo Sapiens 5.39 S.356457 AW197431 Transcribed sequence with moderate similarity to protein sp: P39194 (H. sapiens) ALU7 HUMAN Alu subfamily SQ sequence contamination warning entry 5.39 B EO66040 Consensus includes gb: BE066040 (FEA = EST /DB XREF = gi: 8410690 DB XREF = est: RC3-BTO319-240200-015 cO1 UG = HS.292358 ESTs 5.38 S.448887 A. F131798 Clone 25119 mRNA sequence 5.37 S.407471 B CO34815 C DNA FLJ25418 fis, clone TSTO3512 5.35 HS.4.194 A. KO24956 C DNA: FLJ21303 fis, clone COLO2107 5.35 S.432643 A. KOO1829 C DNA FLJ10967 fis, clone PLACE1000798 5.35 S.47O154 A. KO26778 C DNA: FLJ23125 fis, clone LNGO8217 S.34 HS28803 B ESO1976 Transcribed sequence with moderate similarity to protein sp: P39192 (H. sapiens) ALU5 HUMAN Alu subfamily SC sequence contamination warning entry S.34 HSS31894 CO36311 Clone IMAGE: 4819526, mRNA S.34 HS.105623 O38071 Transcribed sequences 5.33 S.190060 874267 Transcribed sequences 5.32 S452398 LO37998 CDNA FLJ30740 fis, clone FEBRA2000319 5.31 S.356888 L71371.4 MRNA, cDNA DKFZp667C0715 (from clone DKFZp667C0715) 5.31 SSO2632 KOOO995 CDNA FLJ10133 fis, clone HEMBA1003067 5.30 F118079 gb: AF 118079.1/DEF = Homo sapiens PRO1854 mRNA, complete cols. FEA = mRNA PROD = PRO1854/DB XREF = gi: 6650803 /UG = Hs. 136570 PRO1854 protein /FL =gb: AF 118079.1 5.30 S.306439 L122039 MRNA, cDNA DKFZp434E0572 (from clone DKFZp434E0572) 5.30 S.437104 W139588 Transcribed sequences 5.30 S.13818 E6754.86 Transcribed sequences S.29 S.SS378 123586 Transcribed sequences S.29 S.105.268 CO43176 CDNA clone IMAGE: 5287441, partial cols 5.27 S.133 160 401237 4782 Clone IMAGE: 5742072, mRNA 5.25 S.225986 LOSO145 MRNA, cDNA DKFZp586C2020 (from clone DKFZp586C2020) 5.25 S.416521 CO39533 Clone IMAGE: 5743964, mRNA 5.25 S.1788O3 AA913,383 Transcribed sequences 5.25 S.88156 B F207870 Transcribed sequences 5.25 S.406781 A. L1373.25 MRNA, cDNA DKFZp434MO835 (from clone DKFZp434MO835) 5.23 S3554.04 AWS90614 Transcribed sequences S.22 S.2S2588 A. L359626 MRNA, cDNA DKFZp564F172 (from clone DKFZp564F172) S.21 S.126895 Transcribed sequences US 2014/0134726 A1 May 15, 2014 51

TABLE 3-continued Up-regulated markers in definitive endodern

Raw Fold Locus- SeqLDerived Gene Symbol Change Unigene Link OMIM From Gene Descriptor S.21 HS.291564 BE46.7383 Transcribed sequence with moderate similarity to protein sp: P39194 (H. sapiens) ALU7 HUMAN Alu subfamily SQ sequence contamination warning entry S.21 HS.197745 AI66O254 Transcribed sequences S.21 HS.28S4O ALOSO2O4 MRNA, cDNA DKFZp586F1223 (from clone DKFZp586F1223) S.20 HS.40O2S6 AW162768 CDNA FLJ11478 fis, clone HEMBA1001781 S.18 HS.272198 AKOOO798 CDNA FLJ20791 fis, clone COLO1392 S.18 US4734 gb: U54734.1 /DB XREF = gi: 2724997 TID = Hs2.384853.1 (CNT = 1 FEA = mRNA TIER = ConsEnd STK = 1 UG = HS.384853 fUG TITLE = Human clone TMO29 mRNA sequence. DEF = Human clone TMO29 mRNA sequence. S.16 AF143866 gb: AF 143866.1 /DB XREF = gi:48.95008 (TID = Hs2.407308.1 CNT = 2 FEA = mRNA TIER = ConsEnd STK = 0 UG = HS.407308 fUG TITLE = Homo sapiens clone IMAGE: 110987 mRNA sequence /DEF = Homo sapiens clone IMAGE: 110987 mRNA sequence. S.1S HS.252924 AA8838.31 Transcribed sequences 5.14 HS.282340 AV6464.08 Transcribed sequences S.14 HS.174067 AI4O1627 Transcribed sequences S.13 HS.439326 AU145365 CDNA FLJ11662 fis, clone HEMBA1004629 5.12 HS.43744 AW29O940 CDNA FLJ35131 fis, clone PLACE60O8824 S.11 HS.308O17 AI125337 CDNA FLJ36973 fis, clone BRACE2006249 S.11 HS.37S788 AL713719 MRNA, cDNA DKFZp667K1916 (from clone DKFZp667K1916) S.09 X792OO Consensus includes gb: X792.00.1 (DEF = Homo Spaiens mRNA for SYT-SSX protein. FEA = mRNA PROD = SYT-SSX protein /DB XREF = gi: 531107 (UG = Hs.289105 synovial sarcoma, X breakpoint 2 5.08 HS.37681.1 AL833811 MRNA, cDNA DKFZp564G203 (from clone DKFZp564G203) 5.08 HS.S31959 BCO4O68O Clone IMAGE: 4817893, mRNA 5.08 AKO24527 gb: AKO24527.1 /DB XREF = gi: 10436829 TID = Hs2.306684.1 CNT = 1 FEA = mRNA TIER = ConsEnd STK = 0 UG = HS.306684 fUG TITLE = Homo sapiens cDNA: FLJ20874 fis, clone ADKAO2818. DEF = Homo sapiens cDNA: FLJ20874 fis, clone ADKAO2818. S.O7 HS.385644 BCO23608 CDNA clone IMAGE: 4639754, partial cols S.06 HS.493477 AW301393 CDNA FLJ10263 fis, clone HEMBB100.0991 S.O.S HS.445316 BFSO9230 Transcribed sequences 5.05 HS.434344 BCO41884 Clone IMAGE: 5298.352, mRNA S.O2 HS.274593 AL162O33 MRNA, cDNA DKFZp434F1872 (from clone DKFZp434F1872) S.O2 HSS31897 AV70O891 Transcribed sequences S.O1 HS.282377 AV647958 Transcribed sequences S.O1 HS.14451O BF110426 Transcribed sequences S.OO AL83428O gb: AL834280.1 /DB XREF = gi: 21739855 (TID = Hs2.407106.1 CNT = 1 FEA = mRNA TIER = ConsEnd STK = 0 UG = HS.407106 /UG TITLE = Homo sapiens mRNA, cDNA DKFZp547JO114 (from clone DKFZp547JO114) DEF = Homo sapiens mRNA, cDNA DKFZp547JO114 (from clone DKFZp547JO114).

0318 Increased expression of these cell Surface markers in SOX7 provides a reliable estimate of definitive endoderm definitive endoderm cells as compared to hESCs was con contribution to the SOX17 expression witnessed in the popu firmed with real time QPCR as previously described. FIGS. lation as a whole. The similarity of panels G-Land M to panel 34A-M show the results of QPCR for certain markers. Results C indicates that FGF17, VWF, CALCR, FOXO1, CMKOR1 are displayed for cell cultures analyzed 1, 3 and 5 days after and CRIP1 are likely markers of definitive endoderm and that the addition of 100 ng/ml activin A, CXCR4-expressing they are not significantly expressed in extra-embryonic endo definitive endoderm cells purified at the end of the five day derm cells. It will be appreciated that the Q-PCR results differentiation procedure (CXDE), and in purified human described herein can be further confirmed by ICC. embryonic stem cells (HESC). A comparison of FIGS. 34C 0319 Table 4 describes a subset of genes included in Table and G-M demonstrates that the six marker genes, FGF17, 3 that exhibit over 30-fold upward change in expression in VWF, CALCR, FOXO1, CMKOR1 and CRIP1, exhibit an definitive endoderm cells as compared to hESCs. Select expression pattern that is almost identical to each other and genes were assayed by Q-PCR, as described above, to verify which is also identical to the pattern of expression of CXCR4 the gene expression changes found on the gene chip and also and SOX17/SOX7. As described previously, SOX17 is to investigate the expression pattern of the genes during a expressed in both the definitive endoderm as well as in the during a time course of hESC differentiation as discussed SOX7-expressing extra-embryonic endoderm. Since SOX7 is below. Q-PCR data is presented in the column entitled not expressed in the definitive endoderm, the ratio of SOX17/ “QPCR Fold Change,” where applicable. US 2014/0134726 A1 May 15, 2014 52

TABLE 4 Highly up-regulated markers in definitive endodern Raw Fold QPCR Fold Gene Symbol Change Change Unigene LocusLink OMIM SeqLDerivedFrom Gene Descriptor AGPAT3 53.81 S.44.3657 56894 AI3373OO 1-acylglycerol-3-phosphate O-acyltransferase 3 APOA2 33.51 47 S.237658 336 107670 NM OO1643 apolipoprotein A-II C20orf56 36.O1 140828 AL121722 open reading frame 56 C21orf129 77.87 S.3506.79 15O135 NM 152506 chromosome 21 open reading frame 129 CALCR 38.31 S.640 799 114131 NM OO1742 calcitonin receptor CCL2 111.46 S.303649 6347 15810S S69738 hemokine (C-C motif) ligand 2 CER1 33.04 S.248.204 9350 603777 NM OO5454 erberus 1 homolog, cysteine knot superfamily (Xenopus laevis) CMKOR1 53.39 213 S.231853 57007 AI817041 hemokine orphan receptor 1 CRIP1 50.56 28 1396 123875 NM 001311 ysteine-rich protein 1 (intestinal) CXCR4 55.31 289 S.421986 7852 162643 AJ224869 chemokine (C-X-C motif) receptor 4 CXorf1 5435 S.106688 9142 NM 004709 chromosome X open reading frame 1 DIO3 33.07 S.49322 1735 601038 NM OO1362 deiodinase, iodothyronine, type III DIO3OS 59.10 S.406958 641SO 608523 AF3OS836 d eiodinase, iodothyronine, type III opposite S trand B-1 162.04 S.372732 56899 607815 AWOO5572 E2a-Pbx1-associated protein HHADH 85.52 S.432443 1962 6O7037 NM OO1966 enoyl-Coenzyme A, hydratase 3-hydroxyacyl Coenzyme A dehydrogenase E LOVL2 70.71 S.2461 O7 S4898 BFSO8639 elongation of very long chain fatty acids (FEN1/Elo2, SUR4/Elo3, yeast)-like 2 EPSTI1 34.25 S.3438OO 94240 6O7441 AA6332O3 epithelial stromal interaction 1 (breast) FGF17 107.07 166 S.248.192 8822 603725 NM OO3867 fibroblast growth factor 17 LJ10970 SO.86 S.173233 552.73 NM 018286 hypothetical protein FLJ10970 FL21,195 83.56 S.207407 64388 BFO64262 protein related to DAN and cerberus FLJ22471 34.33 S.387266 80212 NM O25140 limkain beta 2 FLJ23514 46.65 S.144913 60494 NM O21827 hypothetical protein FLJ23514 FOXA2 2S2.32 S.155651 3170 ABO28O21 forkhead box A2 FOXO1 43.02 85 S.297452 94234 AI676,059 forkhead box Q1 GATA4 44.12 140 S.243987 2626 AV7OO724 GATA binding protein 4 GPR37 33.44 S.406094 2861 U87460 G protein-coupled receptor 37 (endothelin receptor type B-like) GSC 524.27 596 S.440438 145258 138890 AY1774O7 goosecoid LOC283537 68.29 S.117167 283537 AKO26720 hypothetical protein LOC283537 MYL7 38.38 S.75636 S8498 NM 021223 myosin, light polypeptide 7, regulatory NPPB 35.43 S.219140 4879 60O295 NM OO2521 natriuretic peptide precursor B NTN4 57.13 S.1O2541 59.277 AF278532 netrin 4 PRSS2 2OS.O3 S.S11525 S645 NM 002771 protease, serine, 2 (trypsin 2) RTN4RL1 60.66 S.22917 146760 HO6251 reticulon 4 receptor-like 1 SEMA3E 135.96 S.S28721 9723 NM 012431 sema domain, immunoglobulin domain (Ig), short basic domain, Secreted, (semaphorin) 3E 161.93 7903 A422986 sialyltransferase 8D (alpha-2, 8-polysialyltransferase) 79.96 S.37890 2OOO10 AIf 67388 solute carrier family 5 (sodium glucose cotransporter), member 9 44.06 30061 AL136944 solute carrier family 40 (iron-regulated transporter), member 1 30.2O 61 S.98367 64321 NM O224.54 SRY (sex determining region Y)-box 17 62.58 S.159425 50859 607989 AI808090 sparcosteonectin, cwcv and kazal-like domains proteoglycan (testican) 3 74.09 S.374849 7111 190930 NM OO3275 tropomodulin 1 103.37 S.137674 8989 604775 AASO2609 transient receptor potential cation channel, subfamily A, member 1 76.16 S.434384 7273 18884O NM 003319 tiltin 116.51 S.11873 AW167727 Transcribed sequences 66.25 S.177968 AI821586 Transcribed sequence with weak similarity to protein ref: NP 066928.1 (H. sapiens) phospholipid scramblase 1 Homo Sapiens S4.91 S.160418 BF941609 Transcribed sequences 52.28 S.444751 AI916532 Transcribed sequences SO.21 S.390285 4.01221 BCO344O7 CDNA clone IMAGE: 5268504, partial cols 40.16 S.S.14745 N637O6 CDNA FLJ41084 fis, clone ADRGL2010974 3O.OS S. 7888 AW772192 CDNA FLJ44318 fis, clone TRACH3000780 US 2014/0134726 A1 May 15, 2014

0320 Gene products that appear to be localized to the cell AW772192 are high in cultures treated with 100 ng/ml activin surface are described in Table 5. Localization of markers at A and low in those which do not receive exogenous activin A. the cell Surface is useful for applications such as enrichment, Additionally, among activin A treated cultures, expression isolation and/or purification of definitive endoderm cells. levels of AGPAT3, APOA2, C20orf56, C21orf129, CALCR, Each of the markers described in Table 5 include over a CCL2, CER1, CMKOR1, CRIP1, CXCR4, CXorf1, DIO3, 30-fold increase in expression level in definitive endoderm as DIO3OS, EB-1, EHHADH, ELOVL2, EPSTI1, FGF17, compared to hESCs. FLJ10970, FLJ21195, FLJ22471, FLJ23514, FOXA2, TABLE 5 Up-regulated cell Surface markers in definitive endodern Raw Fold QPCR Fold Gene Symbol Change Change Unigene LocusLink OMIM SeqLDerivedFrom Gene Descriptor CALCR 38.31 HS.640 799 114131 NM OO1742 calcitonin receptor CMKOR1 53.39 213 HS.231853 57007 AI817041 chemokine orphan receptor 1 CXCR4 55.31 289 HS.421986 7852 162643 A.J224869 chemokine (C-X-C motif) receptor 4 GPR37 33.44 HS.406094 2861 6O2S83 U87460 G protein-coupled receptor 37 (endothelin receptor type B-like) NTN4 57.13 HS. 102541 59.277 AF278532 netrin 4 RTN4RL1 60.66 HS.22917 146760 HO6251 reticulon 4 receptor-like 1 SEMA3E 135.96 HS.528721 9723 NM 012431 Sema domain, immunoglobulin domain (Ig), short basic domain, secreted, (Semaphorin) 3E SLCSA9 79.96 HS.37890 2OOO10 AIf 67388 Solute carrier family 5 (sodium glucose cotransporter), member 9 SLC40A1 44.06 HS.409875 30061 604653 AL136944 Solute carrier family 40 (iron-regulated transporter), member 1 TRPA1 103.37 Hs.137674 8989 60477S AASO2609 transient receptor potential cation channel, Subfamily A, member 1

EXAMPLE 12 FOXO1, GATA4, GPR37, GSC, LOC283537, MYL7, NPPB, NTN4, PRSS2, RTN4RL1, SEMA3E, SIAT8D, Validation of Definitive Endoderm Markers SLC5A9, SLC40A1, SOX17, SPOCK3, TMOD1, TRPA1, TTN, AW166727, AI821586, BF941609, AI916532, 0321) Using the methods, such as those described in the BC034407, N63706 and AW772192 are highest when FBS above Examples, hESCs are differentiated to produce defini concentration is lowest. A decrease in AGPAT3, APOA2, tive endoderm. In particular, to increase the yield and purity C20orf56, C21orf129, CALCR, CCL2, CER1, CMKOR1, of definitive endoderm in differentiating cell cultures, the CRIP1, CXCR4, CXorf1, DIO3, DIO3OS, EB-1, EHHADH, serum concentration of the medium is controlled as follows: ELOVL2, EPSTI1, FGF17, FLJ10970, FLJ21195, 0% FBS on day 1, 0.2% FBS on day 2 and 2.0% FBS on days FLJ22471, FLJ23514, FOXA2, FOXQ1, GATA4, GPR37, 3-6. Differentiating cell populations are grown in the pres GSC, LOC283537, MYL7, NPPB, NTN4, PRSS2, ence of 100 ng/ml activin A, whereas non-differentiating RTN4RL1, SEMA3E, SIAT8D, SLC5A9, SLC40A1, control populations are grown in the absence of activin A. SOX17, SPOCK3, TMOD1, TRPA1, TTN, AW166727, Human embryonic cell cultures grown in the presence or AI821586, BF941609, AI916532, BC034407, N63706 and absence of activin A and in either 2% serum or 10% serum AW772192 expression levels is seen in the 10% FBS cultures over the entire 6 day differentiation period are also monitored. as compared to such expression levels in low serum FSB Daily samples of each culture are withdrawn and the amount cultures. of mRNA transcript produced for each of the markers 0323. Additionally, the expression profile of AGPAT3, AGPAT3, APOA2, C20orf56, C21orf129, CALCR, CCL2, APOA2, C20orf56, C21orf129, CALCR, CCL2, CER1, CER1, CMKOR1, CRIP1, CXCR4, CXorf1, DIO3, CMKOR1, CRIP1, CXorf1, DIO3, DIO3OS, EB-1, DIO3OS, EB-1, EHHADH, ELOVL2, EPSTI1, FGF17, EHHADH, ELOVL2, EPSTI1, FGF17, FLJ10970, FLJ10970, FLJ21195, FLJ22471, FLJ23514, FOXA2, FLJ21195, FLJ22471, FLJ23514, FOXA2, FOXQ1, GATA4, FOXO1, GATA4, GPR37, GSC, LOC283537, MYL7, GPR37, GSC, LOC283537, MYL7, NPPB, NTN4, PRSS2, NPPB, NTN4, PRSS2, RTN4RL1, SEMA3E, SIAT8D, RTN4RL1, SEMA3E, SIAT8D, SLC5A9, SLC40A1, SLC5A9, SLC40A1, SOX17, SPOCK3, TMOD1, TRPA1, SPOCK3, TMOD1, TRPA1, TTN, AW166727, AI821586, TTN, AW166727, AI821586, BF941609, AI916532, BF941609, AI916532, BC034407, N63706 and AW772192 BC034407, N63706 and AW772192 is measured by Q-PCR. is consistent with the expression profile of SOX17 and/or 0322. In general, expression levels of AGPAT3, APOA2, CXCR4 under similar conditions over the differentiation C20orf56, C21orf129, CALCR, CCL2, CER1, CMKOR1, period. CRIP1, CXCR4, CXorf1, DIO3, DIO3OS, EB-1, EHHADH, 0324. The markers AGPAT3, APOA2, C20orf56, ELOVL2, EPSTI1, FGF17, FLJ10970, FLJ21195, C21orf129, CALCR, CCL2, CER1, CMKOR1, CRIP1, FLJ22471, FLJ23514, FOXA2, FOXQ1, GATA4, GPR37, CXCR4, CXorf1, DIO3, DIO3OS, EB-1, EHHADH, GSC, LOC283537, MYL7, NPPB, NTN4, PRSS2, ELOVL2, EPSTI1, FGF17, FLJ10970, FLJ21195, RTN4RL1, SEMA3E, SIAT8D, SLC5A9, SLC40A1, FLJ22471, FLJ23514, FOXA2, FOXQ1, GATA4, GPR37, SOX17, SPOCK3, TMOD1, TRPA1, TTN, AW166727, GSC, LOC283537, MYL7, NPPB, NTN4, PRSS2, AI821586, BF941609, AI916532, BC034407, N63706 and RTN4RL1, SEMA3E, SIAT8D, SLC5A9, SLC40A1, US 2014/0134726 A1 May 15, 2014 54

SOX17, SPOCK3, TMOD1, TRPA1, TTN, AW166727, CXCR4, GPR37, NTN4, RTN4RL1, SEMA3E, SLC40A1, AI821586, BF941609, AI916532, BC034407, N63706 and SLC5A9 and TRPA1 and other markers of definitive endo AW772192 are not substantially expressed in cultures that are derm are analyzed by FACS. allowed to differentiate to mesoderm, ectoderm, or extra 0330. Using the methods such as those described in the embryonic endoderm. above Examples, hESCs are differentiated to produce defini tive endoderm. In particular, to increase yield and purity of EXAMPLE 13 definitive endoderm in differentiating cell cultures, the serum concentration of the medium is controlled as follows: 0% Isolation of Definitive Endoderm Cells using FBS on day 1, 0.2% FBS on day 2 and 2.0% FBS on days 3-6. Markers Selected from Table 5 Differentiated cultures are sorted by FACS using cell surface epitopes, E-Cadherin, Thrombomodulin and one of CALCR, 0325 This Example demonstrates the isolation of defini CMKOR1, CXCR4, GPR37, NTN4, RTN4RL1, SEMA3E, tive endoderm cells positive for AGPAT3, APOA2, C20orf56, SLC40A1 SLC5A9 and TRPA1 as described above. Sorted C21orf129, CALCR, CCL2, CER1, CMKOR1, CRIP1, cell populations are then analyzed by Q-PCR to determine CXCR4, CXorf1, DIO3, DIO3OS, EB-1, EHHADH, relative expression levels of markers for definitive and ELOVL2, EPSTI1, FGF17, FLJ10970, FLJ21195, extraembryonic endoderm as well as other cell types. Use of FLJ22471, FLJ23514, FOXA2, FOXQ1, GATA4, GPR37, any one of the above cell surface markers for isolation results GSC, LOC283537, MYL7, NPPB, NTN4, PRSS2, in populations of definitive endoderm cells that are >98% RTN4RL1, SEMA3E, SIAT8D, SLC5A9, SLC40A1, pure. SOX17, SPOCK3, TMOD1, TRPA1, TTN, AW166727, AI821586, BF941609, AI916532, BCO34407, N63706 and/ EXAMPLE 1.5 or AW772192 in a cell culture by using a cell surface marker selected from CALCR, CMKOR1, CXCR4, GPR37, NTN4, Transplantation of Human Definitive Endoderm RTN4RL1, SEMA3E, SLC40A1 SLC5A9 and TRPA1. Cells Under Mouse Kidney Capsule 0326 Using methods, such as those described in the above Examples, hESCs are differentiated to produce definitive 0331. To demonstrate that the human definitive endoderm endoderm. In particular, to increase yield and purity of defini cells produced using the methods described herein are tive endoderm in differentiating cell cultures, the serum con capable of responding to differentiation factors so as to pro centration of the medium is controlled as follows: 0% FBS on duce cells that are derived from the gut tube, such human day 1, 0.2% FBS on day 2 and 2.0% FBS on days 3-6. definitive endoderm cells were subjected to an in vivo differ Differentiating cell populations are grown in the presence of entiation protocol. 0332 Human definitive endoderm cells were produced as 100 ng/ml, 50 ng/ml or 25 ng/ml activin A, whereas non described in the foregoing Examples. Such cells were har differentiating control populations are grown in the absence Vested and transplanted under the kidney capsule of immu of activin A. nocompromised mice using standard procedures. After three 0327. An appropriate amount of labeled antibody that weeks, the mice were sacrificed and the transplanted tissue binds to one of the markers listed above is added to a sample was removed, sectioned and Subjected to histological and of each culture and labeling proceeds on ice. Cells are washed immunocytochemical analysis. by adding PBS and pelleted. A second wash with buffer is 0333 FIGS. 35A-D show that after three weeks post completed then cells are resuspended. Secondary antibody transplantation, the human definitive endoderm cells differ (FITC conjugated secondary antibody is added and allowed entiated into cells and cellular structures derived from the gut to label for about 30 minutes followed by two washes in buffer tube. In particular, FIG. 35A shows hematoxylin and eosin as above. Cells are resuspended and analyzed and Sorted stained sections of transplanted human definitive endoderm using a FACS. Cells are collected directly into RLT lysis tissue that has differentiated into gut-tube-like structures. buffer for subsequent isolation of total RNA for gene expres FIG. 35B shows a transplanted human definitive endoderm sion analysis by real-time quantitative PCR. section immunostained with antibody to hepatocyte specific 0328. The number of cells positive for AGPAT3, APOA2, antigen (HSA). This result indicates that the human definitive C20orf56, C21orf129, CALCR, CCL2, CER1, CMKOR1, endoderm cells are capable of differentiating into liver or liver CRIP1, CXCR4, CXorf1, DIO3, DIO3OS, EB-1, EHHADH, precursor cells. FIGS. 35C and 35D show a transplanted ELOVL2, EPSTI1, FGF17, FLJ10970, FLJ21195, human definitive endoderm section immunostained with anti FLJ22471, FLJ23514, FOXA2, FOXQ1, GATA4, GPR37, body to villin and antibody to caudal type homeobox tran GSC, LOC283537, MYL7, NPPB, NTN4, PRSS2, scription factor 2 (CDX2), respectively. These results indi RTN4RL1, SEMA3E, SIAT8D, SLC5A9, SLC40A1, cate that the human definitive endoderm cells are capable of SOX17, SPOCK3, TMOD1, TRPA1, TTN, AW166727, differentiating into intestinal cells or intestinal cell precur AI821586, BF941609, AI916532, BCO34407, N63706 and/ or AW772192 increase significantly as the dose of activin A is SOS. increased in the differentiation culture media. Isolated cell EXAMPLE 16 populations are analyzed as described in the Example below. Identification of Differentiation Factors Capable of EXAMPLE 1.4 Promoting the Differentiation of Human Definitive Endoderm Cells In Vitro Quantitation of Isolated Definitive Endoderm Cells 0334) To exemplify the differentiation factor screening 0329. To quantitate the proportion of definitive endoderm methods described herein, populations of human definitive cells present in a cell culture or cell population, cells express endoderm cells produced using the methods described herein ing a cell surface marker selected from CALCR, CMKOR1, were separately provided with several candidate differentia US 2014/0134726 A1 May 15, 2014

tion factors while determining the normalized expression lev REFERENCES els of certain marker gene products at various time points. 0339. Numerous literature and patent references have 0335 Human definitive endoderm cells were produced as been cited in the present patent application. Each and every described in the foregoing Examples. In brief, hESCs cells reference that is cited in this patent application is incorpo were grown in the presence of 100 ng/ml activin A in low rated by reference herein in its entirety. serum RPMI medium for four days, wherein the fetal bovine 0340 For some references, the complete citation is in the serum (FBS) concentration on day 1 was 0%, on day 2 was body of the text. For other references the citation in the body 0.2% and on days 3-4 was 2%. After formation of definitive of the text is by authorandyear, the complete citation being as endoderm, beginning on day 5 and ending on day 10, cell follows: populations maintained in individual plates in RPMI contain 0341 Alexander, J., Rothenberg, M., Henry, G. L., and ing 0.2% FBS were treated with one of Wnt3A at 20 ng/ml, Stainier, D.Y. (1999). Casanova plays an early and essential FGF2 at 5 ng/ml or FGF2 at 100 ng/ml. The expression of role in endoderm formation in Zebrafish. Dev Biol 215,343 marker gene products for albumin, PROX1 and TITF1 were 357. quantitated using Q-PCR. 0342 Alexander, J., and Stainier, D.Y. (1999). A molecu lar pathway leading to endoderm formation in Zebrafish. Curr 0336 FIG.36A shows that expression of the albumin gene Biol 9, 1147-1157. product (a marker for liverprecursors and liver cells) substan 0343 Aoki, T. O., Mathieu, J., Saint-Etienne, L., Reba tially increased on days 9 and 10 in response to FGF2 at 5 gliati, M. R. Peyrieras, N., and Rosa, F. M. (2002). Regula ng/ml as compared to expression in definitive endoderm cells tion of nodal signalling and mesendoderm formation by on day 4 prior to treatment with this differentiation factor. TARAM-A, a TGFbeta-related type I receptor. Dev Biol 241, Expression of the albumin gene product was also increased in 273-288. response to 20 ng/ml Wnt3A on days 9 and 10 as compared to (0344 Ausubel et al., (1997), Current Protocols of Molecu expression in untreated definitive endoderm cells, however, lar Biology, John Wiley and Sons, Hoboken, N.J. the increase was not as large as that observed for the 5 ng/ml (0345 Beck, S. Le Good, J. A., Guzman, M., Ben Haim, FGF2 treatment. Of particular significance is the observation N., Roy, K., Beermann, F., and Constam, D. B. (2002). Extra that the expression of the albumin gene product was not embryonic proteases regulate Nodal signalling during gastru increased on days 9 and 10 in response to FGF2 at 100 ng/ml lation. Nat Cell Biol 4, 981-985. 0346 Beddington, R. S., Rashbass, P., and Wilson, V. as compared to expression in definitive endoderm cells on day (1992). Brachyury-a gene affecting mouse gastrulation and 4. Similar results were seen with the PROX1 marker (a second early organogenesis. Dev Suppl. 157-165. marker for liver precursors and liver cells) as shown in FIG. 0347 Bongso, A., Fong, C. Y., Ng, S.C., and Ratnam, S. 36B. FIG. 36C shows that in cell populations provided with (1994). Isolation and culture of inner cell mass cells from 100 ng/ml FGF2, expression of the TITF1 marker gene sub human blastocysts. Hum Reprod 9, 2110-2117. stantially increased on days 7, 9 and 10 as compared to (0348 Chang, H., Brown, C. W., and Matzuk, M. M. expression in definitive endoderm cells on day 4 prior to (2002). Genetic analysis of the mammalian transforming treatment with this differentiation factor, but FGF2 at 5 ng/ml had very little effect on expression of this gene product as growth factor-beta Superfamily. Endocr Rev 23, 787-823. compared to untreated definitive endoderm. TITF1 is a 0349 Conlon, F. L., Lyons, K. M., Takaesu, N., Barth, K. marker expressed in developing lung and thyroid cells. Taken S., Kispert, A., Herrmann, B., and Robertson, E.J. (1994). A together, the results shown in FIGS. 36A-C indicate that the primary requirement for nodal in the formation and mainte concentration at which the candidate differentiation factor is nance of the primitive streak in the mouse. Development 120, provided to the cell population can affect the differentiation 1919-1928. fate of definitive endoderm cells in vitro. 0350 Dougan, S.T., Warga, R.M., Kane, D.A., Schier, A. F., and Talbot, W. S. (2003). The role of the Zebrafish nodal 0337 The methods, compositions, and devices described related genes Squint and cyclops in patterning of mesendo herein are presently representative of preferred embodiments derm. Development 130, 1837-1851. and are exemplary and are not intended as limitations on the 0351 Feldman, B., Gates, M.A., Egan, E. S., Dougan, S. Scope of the invention. Changes therein and other uses will T., Rennebeck, G., Sirotkin, H.I., Schier, A.F., and Talbot, W. occur to those skilled in the art which are encompassed within S. (1998). Zebrafish organizer development and germ-layer the spirit of the invention and are defined by the scope of the formation require nodal-related signals. Nature 395, 181 disclosure. Accordingly, it will be apparent to one skilled in 185. the art that varying Substitutions and modifications may be 0352 Feng, Y. Broder, C. C., Kennedy, P. E., and Berger, made to the invention disclosed herein without departing E.A. (1996). HIV-1 entry cofactor: functional clNA cloning from the scope and spirit of the invention. of a seven-transmembrane, G protein-coupled receptor. Sci 0338. As used in the claims below and throughout this ence 272, 872-877. disclosure, by the phrase “consisting essentially of is meant 0353. Futaki, S., Hayashi, Y. Yamashita, M. Yagi, K., including any elements listed after the phrase, and limited to Bono, H., Hayashizaki, Y., Okazaki, Y., and Sekiguchi, K. other elements that do not interfere with or contribute to the (2003). Molecular basis of constitutive production of base activity or action specified in the disclosure for the listed ment membrane components: Gene expression profiles of elements. Thus, the phrase “consisting essentially of indi engelbreth-holm-swarm tumor and F9 embryonal carcinoma cates that the listed elements are required or mandatory, but cells. J Biol Chem. that other elements are optional and may or may not be 0354 Grapin-Botton, A., and Melton, D.A. (2000). Endo present depending upon whether or not they affect the activity derm development: from patterning to organogenesis. Trends or action of the listed elements. Genet 16, 124-130. US 2014/0134726 A1 May 15, 2014 56

0355 Harris, T. M., and Childs, G. (2002). Global gene lineage and granulocytic precursors within the bone marrow expression patterns during differentiation of F9 embryonal microenvironment. Immunity 10, 463-471. carcinoma cells into parietal endoderm. Funct Integr Genom 0373) Malik V and Lilleho E (ed.), (1994), Antibody ics 2, 105-119. Techniques, Academic Press, Inc. Burlington, Mass. 0356. Hogan, B. L. (1996). Bone morphogenetic proteins 0374 McGrath KE, Koniski AD, Maltby KM, McGann in development. Curr Opin Genet Dev 6, 432-438. JK, Palis J. (1999) Embryonic expression and function of the 0357 Hogan, B. L. (1997). Pluripotent embryonic cells chemokine SDF-1 and its receptor, CXCR4. Dev Biol. 213, and methods of making same (U.S.A., Vanderbilt University). 442-56. 0358. Howe, C. C., Overton, G. C., Sawicki, J., Solter, D., Stein, P., and Strickland, S. (1988). Expression of SPARC/ 0375 Miyazono, K., Kusanagi, K., and Inoue, H. (2001). osteonectin transcript in murine embryos and gonads. Differ Divergence and convergence of TGF-beta/BMP signaling. J entiation 37, 20-25. Cell Physiol 187,265-276. 0359 Hudson, C., Clements, D., Friday, R. V., Stott, D., 0376 Nagasawa, T., Hirota, S., Tachibana, K., Takakura, and Woodland, H. R. (1997). Xsox 17alpha and -beta mediate N., Nishikawa, S., Kitamura, Y., Yoshida, N., Kikutani, H., endoderm formation in Xenopus. Cell 91, 397-405. and Kishimoto, T. (1996). Defects of B-cell lymphopoiesis 0360 Imada, M., Imada, S., Iwasaki, H., Kume, A., and bone-marrow myelopoiesis in mice lacking the CXC Yamaguchi, H., and Moore, E. E. (1987). Fetomodulin: chemokine PBSF/SDF-1. Nature 382, 635-638. marker surface protein of fetal development which is modu 0377 Niwa, H. (2001). Molecular mechanism to maintain latable by cyclic AMP. Dev Biol 122, 483-491. stem cell renewal of ES cells. Cell Struct Funct 26, 137-148. 0361 Kanai-AZuma, M., Kanai, Y., Gad, J. M., Tajima, Y., 0378 Ogura, H., Aruga, J., and Mikoshiba, K. (2001). Taya, C., Kurohmaru, M., Sanai, Y. Yonekawa, H., Yazaki, Behavioral abnormalities of Zic1 and Zic2 mutant mice: K., Tam, P. P., and Hayashi, Y. (2002). Depletion of definitive implications as models for human neurological disorders. gut endoderm in Sox 17-null mutant mice. Development 129, Behav Genet 31, 317-324. 2367-2379. 0379 Reubinoff, B. E., Pera, M. F., Fong, C.Y., Trounson, 0362 Katoh, M. (2002). Expression of human Sox17 in A., and Bongso, A. (2000). Embryonic stem cell lines from normal tissues and tumors. Int J Mol Med 9,363-368. human blastocysts: somatic differentiation in vitro. Nat Bio 0363 Kikuchi, Y. Agathon, A., Alexander, J., Thisse, C., technol 18, 399-404. Waldron, S., Yellon, D., Thisse, B., and Stainier, D.Y. (2001). 0380 Rodaway, A., and Patient, R. (2001). Mesendoderm. casanova encodes a novel Sox-related protein necessary and an ancient germ Layer? Cell 105, 169-172. sufficient for early endoderm formation in Zebrafish. Genes 0381 Rodaway, A., Takeda, H., Koshida, S., Broadbent, Dev 15, 1493-1505. J., Price, B., Smith, J. C., Patient, R., and Holder, N. (1999). 0364 Kim, C. H., and Broxmeyer, H. E. (1999). Chemok Induction of the mesendoderm in the Zebrafish germ ring by ines: signal lamps for trafficking of T and B cells for devel yolk cell-derived TGF-beta family signals and discrimination opment and effector function.J Leukoc Biol 65, 6-15. of mesoderm and endoderm by FGF. Development 126, 0365 Kimelinan, D., and Griffin, K.J. (2000). Vertebrate 3067-3078. mesendoderm induction and patterning. Curr Opin Genet 0382 Rohr, K. B., Schulte-Merker, S., and Tautz, D. Dev 10, 350-356. (1999). Zebrafish Zic 1 expression in brain and somites is 0366 Kubo A, Shinozaki K, Shannon J M. Kouskoff V. affected by BMP and hedgehog signalling. Mech Dev 85, Kennedy M, Woo S, Fehling HJ. Keller G. (2004) Develop 147-159. ment of definitive endoderm from embryonic stem cells in (0383 Schier, A. F. (2003). Nodal signaling in vertebrate culture. Development. 131, 1651-62. development. Annu Rev Cell Dev Biol 19, 589-621. 0367. Kumar, A., Novoselov, V., Celeste, A. J., Wolfman, N. M., ten Dijke, P., and Kuehn, M. R. (2001). Nodal signal (0384 Schoenwolf, G. C., and Smith, J. L. (2000). Gastru ing uses activin and transforming growth factor-beta recep lation and early mesodermal patterning in vertebrates. Meth tor-regulated Smads. J Biol Chem 276, 656-661. ods Mol Biol 135, 113-125. 0368 Labosky, P. A., Barlowo, D. P., and Hogan, B. L. 0385 Shamblott, M. J. Axelman, J. Wang, S., Bugg, E. (1994a). Embryonic germ cell lines and their derivation from M., Littlefield, J. W., Donovan, P. J., Blumenthal, P. D., Hug mouse primordial germ cells. Ciba Found Symp 182, 157 gins, G. R., and Gearhart, J. D. (1998). Derivation of pluri 168; discussion 168-178. potent stem cells from cultured human primordial germ cells. 0369 Labosky, P. A., Barlow, D. P., and Hogan, B. L. Proc Natl Acad Sci USA 95, 13726-13731. (1994b). Mouse embryonic germ (EG) cell lines: transmis 0386 Shapiro, A.M., Lakey, J.R., Ryan, E. A., Korbutt, G. sion through the germline and differences in the methylation S., Toth, E. Warnock, G. L., Kneteman, N.M., and Rajotte, R. imprint of insulin-like growth factor 2 receptor (Igf2r) gene V. (2000). Islet transplantation in seven patients with type 1 compared with embryonic stem (ES) cell lines. Development diabetes mellitus using a glucocorticoid-free immunosup 120, 31.97-3204. pressive regimen. N Engl J Med 343,230-238. 0370 Lickert, H., Kutsch, S., Kanzler, B., Tamai, Y., 0387 Shapiro, A. M., Ryan, E. A., and Lakey, J. R. Taketo, M.M., and Kemler, R. (2002). Formation of multiple (2001a). Pancreatic islet transplantation in the treatment of hearts in mice following deletion of beta-catenin in the diabetes mellitus. Best Pract Res Clin Endocrinol Metab 15, embryonic endoderm. Dev Cell 3, 171-181. 241-264. 0371 Lu, C. C., Brennan, J., and Robertson, E.J. (2001). 0388 Shapiro, J., Ryan, E. Warnock, G. L., Kneteman, N. From fertilization to gastrulation: axis formation in the mouse M. Lakey, J., Korbutt, G. S., and Rajotte, R. V. (2001b). embryo. Curr Opin Genet Dev 11, 384-392. Could fewer islet cells be transplanted in type 1 diabetes? 0372 Ma, Q., Jones, D., and Springer, T. A. (1999). The Insulin independence should be dominant force in islet trans chemokine receptor CXCR4 is required for the retention of B plantation. Bm322, 861. US 2014/0134726 A1 May 15, 2014 57

0389 Shiozawa, M., Hiraoka, Y., Komatsu, N., Ogawa, data by geometric averaging of multiple internal control M., Sakai, Y., and Aiso, S. (1996). Cloning and characteriza genes. Genome Biol 3, RESEARCH0034. tion of Xenopus laevis xSox7 cDNA. Biochim Biophys Acta 0398. Varlet, I., Collignon, J., and Robertson, E.J. (1997). 1309, 73-76. nodal expression in the primitive endoderm is required for 0390 Smith, J. (1997). Brachyury and the T-box genes. specification of the anterior axis during mouse gastrulation. Curr Opin Genet Dev 7, 474-480. Development 124, 1033-1044. 0391 Smith, J. C., Armes, N.A., Conlon, F. L., Tada, M., 0399. Vincent, S. D., Dunn, N. R. Hayashi, S., Norris, D. Umbhauer, M., and Weston, K. M. (1997). Upstream and P., and Robertson, E.J. (2003). Cell fate decisions within the downstream from Brachyury, a gene required for vertebrate mouse organizer are governed by graded Nodal signals. mesoderm formation. Cold Spring HarbSymp Quant Biol 62, Genes Dev 17, 1646-1662. 337-346. (0400 Weiler-Guettler, H., Aird, W. C., Rayburn, H., 0392 Takash, W., Canizares, J., Bonneaud, N., Poulat, F., Husain, M., and Rosenberg, R. D. (1996). Developmentally Mattei, M. G., Jay, P., and Berta, P. (2001). SOX7 transcrip regulated gene expression of thrombomodulin in postimplan tion factor: sequence, chromosomal localisation, expression, tation mouse embryos. Development 122, 2271-2281. transactivation and interference with Wnt signalling. Nucleic 0401 Weiler-Guettler, H., Yu, K., Soff, G., Gudas, L. J., Acids Res 29, 4274-4283. and Rosenberg, R. D. (1992). Thrombomodulin gene regula 0393 Taniguchi, K., Hiraoka, Y., Ogawa, M., Sakai, Y., tion by cAMP and retinoic acid in F9 embryonal carcinoma Kido, S., and Aiso, S. (1999). Isolation and characterization cells. Proceedings Of The National Academy Of Sciences Of of a mouse SRY-related cDNA, mSox7. Biochim Biophys The United States Of America 89, 2155-2159. Acta 1445, 225-231. (0402 Wells, J. M., and Melton, D. A. (1999). Vertebrate 0394 Technau, U. (2001). Brachyury, the blastopore and endoderm development. Annu Rev Cell Dev Biol 15, 393 the evolution of the mesoderm. Bioessays 23, 788-794. 410. 0395 Thomson, J. A. Itskovitz-Eldor, J., Shapiro, S. S., (0403 Wells, J. M., and Melton, D.A. (2000). Early mouse Waknitz, M.A., Swiergiel, J. J., Marshall, V. S., and Jones, J. endoderm is patterned by Soluble factors from adjacent germ M. (1998). Embryonic stem cell lines derived from human layers. Development 127, 1563-1572. blastocysts. Science 282, 1145-1147. 0404 Willison, K. (1990). The mouse Brachyury gene and 0396 Tremblay, K. D., Hoodless, P.A., Bikoff, E. K., and mesoderm formation. Trends Genet 6, 104-105. Robertson, E.J. (2000). Formation of the definitive endoderm 04.05 Zhao, G. Q. (2003). Consequences of knocking out in mouse is a Smad3-dependent process. Development 127. BMP signaling in the mouse. Genesis 35, 43-56. 3O79-3090. 0406. Zhou, X., Sasaki, H., Lowe, L., Hogan, B. L., and 0397 Vandesompele, J., De Preter, K., Pattyn, F. Poppe, Kuehn, M. R. (1993). Nodal is a novel TGF-beta-like gene B. Van Roy, N., De Paepe, A., and Speleman, F. (2002). expressed in the mouse node during gastrulation. Nature 361, Accurate normalization of real-time quantitative RT-PCR 543-547.

SEQUENCE LISTING

<16 Os NUMBER OF SEO ID NOS: 2

<21 Os SEQ ID NO 1 &211s LENGTH: 1245 &212s. TYPE: DNA <213> ORGANISM: Homo sapiens

<4 OOs SEQUENCE: 1 atgagcagcc cqgatgcggg at acgc.cagt gacgaccaga gccagaccca gagcgc.gctg 60

ccc.gcggtga tiggc.cgggct gggcc cctgc cc ctgggc.cg agtc.gctgag ccc.catcggg 12O gacatgaagg talaggg.cga gC9CC9gcg alacagcggag Caccgg.ccgg gcc.gcgggc 18O

cgagc.ca agg gcdagtc.ccg tatcc.ggcgg cc gatgaacg ctitt catggit gtgggctaag 24 O

gacgagcgca agcggctggc gcago agaat coagacctgc acaacgc.cga gttgagcaag 3 OO

atgctgggca agtcgtggala gC9ctgacg Ctgg.cggaga agcggCCCtt C9tggaggag 360 gcagagcggc tigcgc.gtgca gCacatgcag gaccaccc.ca act acaagta ccggcc.gcgg 42O

cgg.cgcaa.gc aggtgaa.gcg gctgaag.cgg gtggagggcg gCttCctgca C9gcCtggct 48O

Ct c cagttcc cc.gagcaggg ct tcc.ccgcc gg.ccc.gc.cgc tigctgcctcc gcacatgggc 6 OO

ggccact acc gcgactgcca gagtctgggc gcgc.ctic cqc togacggct a ccc.gttgc cc 660

acgc.ccgaca C9tc.ccc.gct ggacggcgtg gaccc.cgacc cqgctitt Ctt cqc.cgcc.ccg 72O US 2014/0134726 A1 May 15, 2014 58

- Continued atgc.ccgggg actgc.ccggc ggc.cggcacc tacagctacg cqc aggt ct C gract acgct 78O ggc.ccc.ccgg agcct C cc.gc cqgtcc catg Caccc.ccgac toggcc.caga gcc.cgcgggit 84 O

C cct cattC cqggcct cct ggcgccaccc agcgc.ccttic acgtgtact a C9gcgcgatg 9 OO ggct cqc.ccg gggcgggcgg C9ggcgcggc titccagatgc agcc.gcaa.ca C cagdaccag 96.O

Caccagcacc agcaccaccc ccc.gggcc cc gga cago.cgt. c9c cc cct cc ggaggcactg 1 O2O CCCtgc.cggg acggcacgga CCC cagticag ccc.gc.cgagc ticcitcgggga ggtggaccgc 108 O acggaatttgaac agitat cit gcactitcgtg tdaagcctg agatgggcct CCC ctaccag 114 O gggcatgact C cqgtgttgaa tict coccgac agccacgggg C catttic ct c gatggtgtcC 12 OO gacgc.cagct C cqcggtata t tact.gcaac tat cotgacg tdtga 1245

<210s, SEQ ID NO 2 &211s LENGTH: 414 212. TYPE: PRT <213> ORGANISM: Homo sapiens

<4 OOs, SEQUENCE: 2 Met Ser Ser Pro Asp Ala Gly Tyr Ala Ser Asp Asp Glin Ser Glin Thr 1. 5 1O 15 Gln Ser Ala Leu Pro Ala Val Met Ala Gly Lieu. Gly Pro Cys Pro Trp 2O 25 3O Ala Glu Ser Lieu. Ser Pro Ile Gly Asp Met Lys Val Lys Gly Glu Ala 35 4 O 45 Pro Ala Asn. Ser Gly Ala Pro Ala Gly Ala Ala Gly Arg Ala Lys Gly SO 55 6 O Glu Ser Arg Ile Arg Arg Pro Met Asn Ala Phe Met Val Trp Ala Lys 65 70 7s 8O Asp Glu Arg Lys Arg Lieu Ala Glin Glin Asn Pro Asp Lieu. His Asn Ala 85 90 95 Glu Lieu. Ser Lys Met Lieu. Gly Lys Ser Trp Lys Ala Lieu. Thir Lieu Ala 1OO 105 11 O Glu Lys Arg Pro Phe Val Glu Glu Ala Glu Arg Lieu. Arg Val Glin His 115 12 O 125 Met Glin Asp His Pro Asn Tyr Lys Tyr Arg Pro Arg Arg Arg Lys Glin 13 O 135 14 O Val Lys Arg Lieu Lys Arg Val Glu Gly Gly Phe Lieu. His Gly Lieu Ala 145 150 155 160 Glu Pro Glin Ala Ala Ala Lieu. Gly Pro Glu Gly Gly Arg Val Ala Met 1.65 17O 17s Asp Gly Lieu. Gly Lieu. Glin Phe Pro Glu Gln Gly Phe Pro Ala Gly Pro 18O 185 19 O Pro Leu Lleu Pro Pro His Met Gly Gly His Tyr Arg Asp Cys Glin Ser 195 2OO 2O5 Lieu. Gly Ala Pro Pro Leu Asp Gly Tyr Pro Leu Pro Thr Pro Asp Thr 21 O 215 22O Ser Pro Lieu. Asp Gly Val Asp Pro Asp Pro Ala Phe Phe Ala Ala Pro 225 23 O 235 24 O Met Pro Gly Asp Cys Pro Ala Ala Gly Thr Tyr Ser Tyr Ala Glin Val 245 250 255 Ser Asp Tyr Ala Gly Pro Pro Glu Pro Pro Ala Gly Pro Met His Pro 26 O 265 27 O US 2014/0134726 A1 May 15, 2014 59

- Continued

Arg Lieu. Gly Pro Glu Pro Ala Gly Pro Ser Ile Pro Gly Luell Lieu Ala 28O 285

Pro Pro Ser Ala Lieu. His Wall Tyr Gly Ala Met Gly Ser Gly 29 O 295 3 OO

Ala Gly Gly Gly Arg Gly Phe Glin Met Glin Pro Glin His Glin His Glin 3. OS 310 315 32O

His Glin His Glin His His Pro Pro Gly Pro Gly Glin Pro Ser Pro Pro 3.25 330 335

Pro Glu Ala Lieu. Pro Cys Arg Asp Gly Thir Asp Pro Ser Glin Ala 34 O 345 35. O

Glu Luell Luell Gly Glu Wall Asp Arg Thr Gilul Phe Glu Glin Lieu. His 355 360 365

Phe Wall Llys Pro Glu Met Gly Leu Pro Glin Gly His Asp Ser 37 O 375 38O

Gly Wall Asn Lieu Pro Asp Ser His Gly Ala Ile Ser Ser Wall Wall Ser 385 390 395 4 OO

Asp Ala Ser Ser Ala Wall Tyr Asn Pro Asp Wall 4 OS

1.-74. (canceled) 86. The composition of claim 85, wherein the bone mor 75. An in vitro composition comprising thyroid cells phogenetic protein is BMP4. derived from definitive endoderm. 87. The composition of claim 75, wherein the thyroid cells 76. The composition of claim 75, further comprising a are derived from human pluripotent cells. fibroblast growth factor or Wnt protein. 88. The composition of claim 87, wherein the human pluri 77. The composition of claim 76, wherein the fibroblast potent cells are human embryonic stem cells. growth factor is FGF2. 89. A method of producing thyroid cells from definitive 78. The composition of claim 77, wherein FGF2 is present endoderm cells comprising, exposing definitive endoderm at a concentration of greater than 2 ng/ml. cells to an effective amount offibroblast growth factor or Wnt 79. The composition of claim 76, wherein the Wnt protein protein thereby forming thyroid cells. is Wnt3O. 90. The method of claim 89, wherein the fibroblast growth 80. The composition of claim 75, wherein the thyroid cells factor is FGF2. express TITF1. 91. The method of claim 89, wherein the WNT protein is 81. The composition of claim 75, wherein the composition Wnt3A. comprises serum at low concentrations. 82. The composition of claim 75, wherein the composition 92. The method of claim 89, wherein the thyroid cells comprises less than 2% FBS. express TITF1. 83. The composition of claim 75, wherein the composition 93. The method of claim 89, wherein the composition comprises 0.2% FBS. comprises less than 2% FBS. 84. The composition of claim 75, further comprising a 94. An in vitro composition comprising cells that express TGFB family growth factor. TITF1, wherein the TITF1 expression is greater than the 85. The composition of claim84, wherein the TGFB family TITF1 expression in definitive endoderm cells. growth factor is a bone morphogenetic protein. k k k k k