© 2013 Nature America, Inc. All rights reserved. should should be addressed to A.R. ( members and affiliations appears at the end of the paper. Wisconsin, USA. of Pathology, University of Medical Massachusetts School, Worcester, USA. Massachusetts 1 context for of cell ties restricted densely combinations that human cell expressed controlled cies. of tions ferentiation Understanding modular pendium we regulatory to use immunologists The mouse cells are highly conserved, it is likely that many lessons learned fromunknown the regulators,mouse model we applyemphasize to humans.the role of ETV5 in the differentiation of 175 previously unknown candidate regulators, as well as their target genes andwe thefound celldifferentiation types in whichstage–specific they act.regulators Among theof mousepreviously hematopoiesis and identifiedOntogenet, many an knownalgorithm hematopoietic for reconstructing regulators lineage-specificand regulation from gene-expressionprofiles profilesacross mouseacross immunelineages. lineages Using allowedOntogenet, us to systematically analyze thesetranscriptional circuits. Tocircuitry analyzethat controlsthis data it setis westill developed only partially understood. Here,The differentiationthe Immunological of hematopoieticGenome Project stem gene-expression cells into cells of the immune system has been studied extensively in mammals, but the Daphne Koller Vladimir Jojic mouse immune system of Identification regulators transcriptional in the nature immunology nature Received 30 January; accepted 13 March; published online 28 April 2013; Computer Computer Science Department, Stanford University, Stanford, California, USA.
The exhaustively
the human
provide studying
populations
states of from
Immunological involves Most
for
ImmGen shared 246
hematopoiesis of inte
model the
2 peripheral
and
to a
.
studies in immunological networks
cell by
However,
rich the
r of human
study connected specific the
a 5 of and use relatively
types Howard Howard Hughes Medical Institute, Department of Biology, Institute Massachusetts of Technology, Cambridge, USA. Massachusetts,
cells
larger chart and
first the of
and of compendium
modules 1
regulatory 1 of
, a
the the rigorously 7 , of
8
computational , , Tal Shay
cells
regulatory new
of diverse hematopoiesis Genome in & the Genome Immunological Project Consortium or
comprehensive of
gene-expression development analysis lineages
3 number
regulatory immune
the the
the has cord
few circuits. that computational aDV
of
disorders immune mouse mouse suggested
lineage
coexpressed blood
‘master’ organization
could A
Project
controlled
of and
offer NCE ONLINE PUBLIC ONLINE NCE
of system. 2 mechanisms
,
gene
7 program However,
transcription
immune , , Katelyn Sylvia immune
act biologists and and view be
tree.
system an
analysis
and
(ImmGen)
transcription
a
expression profiles obtained
to
unprecedented more thus for
Because algorithm differentiation
data-generation
genes set
hematological
of
of
understanding system system has
that
could
that mouse and of who complex hematopoiesis ) ) or D.K. (
and
in and factors 7 important
the
These These authors contributed equally to this work.
the
human is
in
reinforce underlie sufficient
1
aim,
A
in
their not . to a
TION factors
are
ImmGen 38 In hematopoiesis.
transcriptional
3 consortium
the [email protected] reconstruct
, , Or Zuk
organization
access this
opportunity cell that through as
arranged
underlying
malignan
816 study
pipelines,
a
the
implica
context, types that the
distinct pr quanti control
in arrays
many o com basis
doi:10.1038/ni.258
cess
was dif the the are 2
in in of
, Xin , SunXin a 2 - - - - - Broad Broad Institute of MIT and Harvard, Cambridge, USA. Massachusetts,
method, identification active that age that particular, offers can physical profiles indicate direct robustness a its or can deep example, tion networks to lessons programs
putative
). ). Here Analysis activity humans. presumed
be tree) enhance
cells regulatory 4 factor
sequencing) Laboratory Laboratory of Genetics, University of Wisconsin-Madison, Madison,
unique 4 in evidence inferred
, , Joonsoo Kang g are learned
we
function and
5 d Ontogenet, that 7 another. probably from
are of of T cells. As the transcriptional programs of human and
regulator
published
of
highly or and Two used
target. confidence a observational
human
6
cells physical are opportunities of
a
transcription
chromatin from
relationships
cis
biological from
of true key
more 8
8
those
accessible
and Incorporating organized -regulatory These These authors jointly directed this work. Correspondence share biochemical
and In
a and
approaches
and models regulators and
the both significant
models
observational
closely
are 5– insights
many
a
mouse mouse 7
3 to immunoprecipitation
interpretability
module cases, , , Aviv Regev and models challenging
but
factor
that 3 in apply that
based have
element related
a
of
expand provide of interactions
known
have for correlation model analysis cells such to
exist
their
hematopoiesis.
(as
it not are
of
on develop the
to models
(according
protein not coregulated are
in information
the explicitly with complementary,
the
regulatory
only the to lineage, will identification
one of
6
2 been highly association
collect , scope
,
7
the 5 ImmGen
a . between
correlative
probably
,
but with a
sublineage 8 Physical or target’s
,
new relationship
e c r u o s e r leveraged
as
(ChIP) mRNA)
considered
do to of 9 conserved
, in which mechanisms targets
may
whereas
the discovery. computational
not
hematopoiesis,
the
6
promoter compendium of
be A A full list of of data
3
evidence.
known
Department Department using
followed a
may help
necessarily
abundance and regulatory
regulation
applicable
transcrip before. enhances
between
4
provide the mRNA ,
that
not
in many both
line
(for and fact the
As
by be In of 3 - -
© 2013 Nature America, Inc. All rights reserved. fetal liver. Stromal cells (box, bottom right) were also measured as part of ImmGen but are not part of the lineage tree. Color in bar beneath matches matches beneath bar in Color tree. lineage the of part not are but ImmGen of part as measured also were right) bottom (box, cells Stromal liver. fetal sorting, for markers and (nomenclature 1 Figure and often killer many natural cells, all system of The system immune mouse the of compendium Transcriptional RESULTS cell rithm hypotheses didate physical regulators, The coherent some two tors to e c r u o s e r ref. from Adapted macrophages-monocyte. MF-MO, granulocyte; GN, above. tree in branches of color SC.MEP.BM
Granulocytes SC.CMP.B
816 build URAC.PC THIO.PC Bl ARTH.SYNF ARTH.BM BM major GN
lineage. spleen. different with model
ImmGen
granulocytes,
SC.GMP.BM of
sampled T
types expression can
regulators.
( M
Putative edges Nonintersecting edges killer
cells which an Mouse cell populations in the ImmGen compendium. Lineage tree of the hematopoietic mouse cell types profiled by the ImmGen Consortium Consortium ImmGen the by profiled types cell mouse hematopoietic the of tree Lineage compendium. ImmGen the in populations cell Mouse Fig. Fig. Macrophages BM model
Thio5.II-480hi.P Thio5.II–480int.PC Thio5.II+480int.PC Thio5.II+480lo.P 11cloSer.Salm.S 11cloSer.SI 103-.11b+.Salm.S 103-.11b+.SI 103-.11b+.Lu Microglia.CNS II+480lo.PC II–480hi.PC RP.Sp Lu BM MF-M expression; modules hematopoietic
SC.MDP.BM
BI
be
Monocytes observational
was identified for of
LN
1 granularities, (NKT
O
(NK)
used 6C 6C– 6C– 6C+II– 6C+II+
consortium from we
C and
–I C α
I and II+ II– I Ii experimental supported nt
β profiles
further
Our Lu.L
SLN
monocytes, SC.CDP.B
of T
to Supplementary Table 1 Table Supplementary proposed cells) I N cells, several IIhilang–103–11b IIhilang–103–11b+.SLN IIhilang+103–11b+.SLN IIhilang+103+11b ML
Lv
T N cells,
this coexpressed many 103–.11b+24+.Lu
delineate M model C DC S h DCs
DC.103-11b+F4/80lo.Kd LC
SI
lineages, .S from p
model
B and refined k
8– 8– 8– 8+ 4+ pDC. pDC. S.SI
resulted
LO LO
data with
4– 4– through regulatory .SLN .SLN 103+.11b+ 103+.11b 103-.11b tissues,
cells 11b+ 11b– 8– 8+ of
dozens
provides γ 246
macrophages, studies, + –
the δ
81 set associating
regulation B1
and T Fe
b. into
PC
including
cell
Supplementary Table 1 Supplementary larger 1
ta B.FO.PC
genes. already B. in cells. such
l MZ.Sp the (April
B proB.FrBC.FL of
preB.FrD.F
T proB.FrA.F CT
SC.STSL.FL smaller 334 B.FO.LN SC.LTSL.FL B1a.Sp types
T B.FrE.FL
B cells B.1a.PC MLP.F CLP.F RL
a and
previously
cells.
use
as coarse-grained cells B.FrF.BM rich
The L L B.FO.MLN L We
SC.MPP34F.BM L preB.FrD.B
fine-grained SC.STSL.BM preB.FrC.B proB.FrA.B SC.LTSL.BM
proB.FrBC.BM B.FrE.BM known
CLP.B MLP B.Fo.Sp B.T3.Sp B.T2.Sp B.T1.Sp 2012
). bone in
578
of the dendritic of stem
.BM
B.GC.Sp defined resource M The ‘same’ The
modules (T the the M
M M
a SC.ST34F.BM candidate SC.LT34
reg Ontogenet NK.49CI–
release) marrow,
complementary
mouse and
NK.MCMV7.Sp T cell context
NK.MCMV1.S unknown hematopoietic F .Sp cells),
cell cells A
cells dult NK.49CI+.Sp modules
types progenitor ). Some samples of stem cells, progenitor cells and B cells were obtained from adult and and adult from obtained were B cells and cells progenitor cells, stem of samples Some ). of NK cells NK with
p
modules. modules,
type
NK.Sp NK.H+.MCMV1.Sp
immune
NK.H+.MCMV7.S
consists testable
thymus include regula
natural NK.49H+ (DCs), of
more algo
span
can
was .S any p
at NK.H– NK.49H-.S p - - -
.MCMV1.Sp p PreT.DN3-4.T preT.DN2-3.T preT.DN2.T preT.ETP.T To hematopoiesis in modules expression fine-grained and Coarse- genes of characteristic studying ent (S&P cells ous, especially cells, Conversely, sity being variable, cells, profiles populations ent Correlations ontogeny cell the trace profiles global in Similarities Fig. Fig. 2a modules T.DPsm.Th T.DPbl.Th T.DN4.T T.ISP.T T.8Nve.LN T.8Mem.L
the
characterize h 1 h with with T.8Mem.Sp.d100
T.8Mem.Sp.d45 h h T.8Eff.Sp.48hr T.8Eff.Sp.24hr T.8Eff.Sp.12hr
0 T.8Eff.Sp.d15 T.8Eff.Sp.d10 h h LisOva.OT1 which T.8Eff.Sp.d8 T.8Eff.Sp.d6 T.8SP24int.T
T. T.8SP24-.T were T. although
T. ) profiles
8SP69+.T group, N
DP69+.T obtained 4int8+.Th in
main , T.8Mem.S T.DP.T T.8Nve.Sp and b
were
the
all ). a each DCs of h h h
h h
CD8 gradual aDV most We
T.8Mem.Sp.d106 T p
T.8Mem.Sp.d45
partly T.8Eff.Sp.d15
other all
T.8Eff.Sp.d8 T.8Eff.Sp.d6 T.8Eff.Sp.d5
coexpressed lymphocytes T.8Nve.ML known were
.8Nve.Sp.OT1 eleven
Fig. Fig. in (Pearson
were
signatures lineage,
being first tightly A VSVOva.OT1
+ myeloid
from
global the
NCE ONLINE PUBLIC ONLINE NCE T N similar T
103-.BR in 2
lineages CD .8Nve.PP reflected
cell
cells ), loss constructed
similar
8
lineage T.8Mem.d20 lineages + key the
s
Vg2+24+.T Vg2+24–.Th
the Vg2+.act.S T 103-.S followed
diverse Vg2+.S correlated, 4
we profiles r
.
p of
lineage cell CD T.4FP3– and
103+.B = patterns cells
p to most
T.4Mem44h62l.Sp
genes h p of 4
had s
Tgd.Sp used + –0.71;
differentiation T.4SP24int.Th
T.4SP69+.T
( T.4SP24-.T T R .S T.4+8int.T
overall,
early
tree p Supplementary Table 2 Supplementary T.4Mem.S
over- Tgd.Vg1+Vd6+24–.Th the
NKT T.4Nve.Sp
compared
tissues Vg1+Vd6+24+.Th
Vg1+Vd6+.T larger being
by variable
h between
h h one-way
tree, at 81 p ( finer
T cell T
Fig. Fig. Vg2–.act.S T.4Mem44h62l.LN T.4Nve.LN Supplementary Fig. 1 Supplementary did T.4Mem.LN
myeloid
of Vg2–.S T.4FP3+25+.LN
pre-B
and cells. two
h coarse-grained
with
gene
s differences
p the weakly and A show
Tgd.Th p T.4Nve.ML 2 sampling
TION
Vg1+Vd6–24–.Th underexpressed Vg1+Vd6–24+.T ). granularities Vg1+Vd6–.Th
(consistent T
T.4FP3+25+.A
T cells
with samples more
analysis In re
granulocytes N their
g regulation,
potential. cells cell
T.4Nve.P and weaker
Stromal cell Stromal
general, Sk
s similar T A
h and nature immunology nature P CT
the Th Vg5+.act.IE T.4FP3+25+.Sp similar
T.4.PLN.BDC
Vg5+.IE T.4.Pa.BDC
known T.4.LN.BD RL were lymphoid Ep Fi
for between .MECh
were
of L pre-T expression
L
similarity γδ C modules s
i
Vg5–.act.IE
with this variance
the T cell T
Vg5–.IE As very NKT.4 NKT.4+.Sp ( ).
to
Supplementary Supplementary we their
inherent largely
being –.
SLN
L genes s
closer ). a Sp L stromal NKT.44+NK1.1– NKT.44 cells, lineage. NKT.44+NK1.1+.Th
DC next
lineages.
ML resource heterogene For NKT.4– progenitors NKT.4+Lv N
expression –N ST.31-38-44- BEC LE FRC
(C1–C81; K1.1 C
to
.L to
samples’
the
of v for myeloid
consist consist –. defined two
.T NKT cell NKT Th T
NKT.4+.Lu h define
diver
those
Stem cells. cells, each least
NK
cell for s - - - -
© 2013 Nature America, Inc. All rights reserved. weight’ (criterion It regulators’ tree modules, captured regulators sublineages, of cell accordance than each factors expressed A of activating expression aims latory We regulation lineage-sensitive reconstructing Ontogenet: repression into ( lineages 2de Figs. 2c Figs. genes) ModsRegs/modules.htm and modules which binding tors elements showed subset fication of modules ules modules further 2c Fig. Supplementary CD8 activated ACTT8, T cells; CD4 T4, cells; pro-B and cells pre-B PROB, cells; in colors matches beneath bar in (color beneath and margin left along labels to according lineages major delineate lines horizontal and vertical Black end. right or lower the at cells stromal ( tree the on search breadth-first by sorted are Samples samples. all of value s.d. highest the with genes) expressed unique 8,431 the (of genes 1,000 the for calculated types, cell profiled of pair each for none) white, negative; yellow, positive; (purple, coefficients correlation 2 Figure nature immunology nature that Supplementary Fig. 2g
may those Ontogenet Most genes then a
next
types
( spanning those lineage and
module downloaded
other in Supplementary TableSupplementary 6 to
circuits
the
activate
of that
‘distant’
identified showed fulfill
enrichment
for
associates of sites factors
coarse-grained developed
in
Related cells have very similar expression profiles. Pearson Pearson profiles. expression similar very have cells Related
helped and ( the
and (
, ( as
(
whereas Supplementary Figs. 2f 2f Figs. Supplementary these 4 4 categories
Supplementary Fig. 2a Fig. Supplementary in and
module’s Supplementary Table 5 Supplementary
1
distinct Supplementary Fig. 1 Fig. Supplementary was
(for in
of
one
(transcription each regulate
with shared
(for above),
both and 3 but
expression
in
for
receives the each
their
the
7,965 that
repressing
rare
) a
example,
lineage, may that
cells genes
us
5 factors
either module the or cell the example, lineage
following lineages); for
at
drive module ). the
regulatory
capture
associated a
each (only
regulators genes
for genes ‘pan-differentiation’
a
cell
( change subtypes greater (for new whereby In type
the
Supplementary Figs. 2h
each module as
in latter
coherent
)
whereas lineage-specific addition,
known
input hematopoietic
modules
(criterion –
profiles
and tree
l in
module example, in Foxp3 transcription ImmGen the
algorithm, ). h are (F1–F334;
aDV that
factors, coarse-grained of the GATA-3
one
C53 biological
criterion + in and similarity
regulate
19
T cells; gdT, T cells;
(for genes
other expressed
are mechanisms and the
can
enrichments
A each mechanisms of different indicates
to had
lineage NCE ONLINE PUBLIC ONLINE NCE
for functional (B
fine
Supplementary Table 3 Table Supplementary
of
a more example, );
gene-expression
6 with (48 act
Supplementary Note Supplementary 1 also ),
from
)
coarse 2 chromatin
cells)
is and portal
expression lineage(s).
a regulator modules T
and and
for which Ontogenet,
Supplementary Table 4
above). Fig. Fig.
modules reg 3, only
determined as predesignated
of
considerations:
in
have
factors;
similar cell
but
the
a
criterion cell (
regulators 6 T
two
cells).
binding its
Supplementary Note Supplementary 2 and induction 1 regulation 81 expression γδ
combination
module ), ). S&P, stem and progenitor S&P,). progenitor and stem cells) particular module
( from
annotations, that differentiation.
not identity resulted
T cells. can
that
http://www.immg activity
types
additional,
different
modules;
8 The C17
were
and
is
+
may patterns regulators in
criterion The
in were
Many T cells; T8, CD8 T8, T cells;
control be
coregulate
to the assigned are
of cells
4, regulator by (for another,
(stromal 7
module,
and searched, in
delineate a ‘mixed-use’
help
Fig. Fig. ).
and former master
transcription
as
A
in same
( active ( set a
profiles set
stromal
fine Supplementary Supplementary
the TION Supplementary Supplementary Lineage-specific
sublineages), of
that example,
combination
criterion
a 4,478 that 334
its
only 1
more
cis of activity
of 2, in of regulator
the ), with with ),
lineage(s)
and modules. ), sublineage)
an nested
)
-regulatory
the
nested are
even
Ontogenet cells)).
the the a regulators
did ‘candidate across
regulators fine
should activity including
and
a the larger
).
modules
browsed of of
‘activity
specific specific close smaller
activity
en.org/ so
li
Coarse identi not
+
across
closer
factor
n if of mod
1,
7,965 regu ).
then
eage
on). fine
fine fac A
the
the the All fall
for set
be
in
in to of is is - - - -
lators to cell simple variance goals: an rion grained regulators modules tory is activity weights program more us missing is that expressed), in transcription makes each linear sion gene is reconstruction Following at Stromal ACTT
PROB
achieved
PRET essential combined
the
Notably, Although B those examine ensemble S&P NKT
gdT MO
GN
types DC MF
NK
all cells,
T8 T4 program 4). of regulator’s B expression
8
susceptible each (for protein sum
GN predictions as the
genes
Collectively, across
some module
of of
in
derived
for possible;
in factors
example,
the
MF regulatory
and module’s
of each the the by
the for and subtler
Ontogenet
the in of
each
in the
between factors gene-specific
penalizing level approach
MO
consecutive approach
coarse statistical module
6
regulatory a activity
Module
activity ,
the
regulator, ontogeny;
(while in
regulators’
linear the A to
from
cell
regulatory
expression
each
such but
and same linear
noise. genes we
DC program
activity-weighted A modules
a
type, as
is
reconstructs
model possibly weight use
1 weights a C and
in
cell used
cell as
changes
robustness,
module is possible; inferred
coarse-grained
are
regression
programs
in the it stages
and
activated “in an
expression
S&P
type
expression B as type
is
a
before
no
explains programs PROB Lirnet
optimization
patterns and
to
given in
to geared pre reflected fine
that
gaining
longer
and
in generate in that which ( are
solely
Fig. Fig. the B
is
B
differentiation the
modules
to
a
that
by cell
method although regularized were
its −1.0 regulated repressed
cells,
NK cell
potentially
regulatory
as toward identify
patterns.
3
shared
in activity
B
expression active progenitor.
PRET
they ). are
−0.8
from additional type much
by
try
and
type. module
that In a
inferred
Module −0.6
approach consistent
prediction the to
T4
belong. this for have D.”
is
−0.4 (even transcript
maintaining cell
by it
combinations
of
As in achieve weights
by
approximated
The cell-specific e c r u o s e r
Correlation
comes regulatory-network
programs with −0.2 Our
different the
T8 fewer
model, the
a
type ‘inherit’ regulators
factor (criterion
of The regulators; for result,
1
if
fine 0 gene-expression ACTT8
same
model that
is the
the the
across
0.2
multiplied the of of genes their
at fine-grained activated
abundance. modules
the
C,
Elastic
NK the 0.4 a
the factors constructs regulatory regulators
the
remain following T
way. the the module’s
whereas
assumes
0.6 of
3).
activity
expres
coarse- gd
regula similar but related
cost
model T by
crite regu same same 0.8
Stroma This This
Net
the
are are
1.0
let
by by of as l
- - - -
© 2013 Nature America, Inc. All rights reserved. interactions tion Table 7 Supplementary tions in weights ing 6,151 regulatory modules controlled in ity interactions regulation ent ( Net previously regulatory regulator. tors Fig. 8 grained (1,091 regulators. modules, We hematopoiesis mouse for model regulatory Ontogenet in activity, the terms the independently rate allowing regulation This was expressed across ity penalty right). (orange-purple; activity and left) (blue-red; expression regulator of combination a linear as calculated (top) a module of ( bottom). regulator; specific (context- weights activity variable or middle) repressor; (constitutive weight activity negative a uniformly top), activator; (constitutive lineage the across weight activity positive a uniformly have may a regulator (top): module a of expression the ‘explain’ can (bottom) regulator of a type how demonstrate to below) key (orange-purple; profile activity and below) key (blue-red; profile expression regulator the 3 Figure e c r u o s e r immunological Supplementary Fig. 9 Fig. Supplementary
In others), a
which
weight (and 6 applied
more lineage
expected cell
per similar deemed ,
of
regulatory violates
most
a and
interactions and of
all method the activating,
6
coarse-grained
frequently
types modules such , gene Overview of Ontogenet. ( Ontogenet. of Overview
1 hence
the shared regulators
a
1 the
cell
Supplementary Supplementary Table 5
As ‘pan-differentiation’
similarly and mouse cases )
probably mostly model The
unknown in ( lineage
programs tree Ontogenet
way
Supplementary Fig. 10 Fig. Supplementary
assumed
whose
the and
to
algorithm
determined
regulation. finer that similarity complex types.
(‘frequently
is
expression) that
Ontogenet
be
activity when
contexts.
program in
(59%), known
impractical
regulators
and suggested
immune associating
the by
317
had active tree related changed regulatory
does
maximal
with in reflective Thus,
distinct
expression
480 inferring that module’s
that
to
repressing module, qualitatively
two consisting lineages.
a and context ).
to weight) between
not
Ontogenet the
regulator’s mixed to
by
unique This for
b
model
cell are system
construct
if changing’), regulatory
) Mean expression expression ) Mean was
different
in the with
Supplementary Note 3 Supplementary use new
and
81 cross-validation,
a
effect 334 regulators
program each
of types. some
strictly regulator
rich
a
and genes the
coarse-grained same specificity low, and
the
activity ) Use of ) Use
Conversely, both is also data
identified
regulatory
modules related fine regulators
). of and identified
the regulatory
(defined
regulatory three tree cell On
solves
similar modules activity 195
). we
lineage-specific
ignores a cells,
extent. are than
the modules
9 better activ
When
sepa reflective
same average,
(
and
obtained type (that ( Supplementary Fig. 10 Fig. Supplementary
cell mixed)
was
coarse-grained Fig. Fig. cell
more greater
this
of
in those it
1,417
- -
patterns,
has
weights ( many types
as interactions
Fig. Fig. and types at is, 5
the
we
connections problem Ontogenet model
), modules and likely
the
fixed there
predicting a
between
whereas
regulatory obtained
Expression Expression Expression Activity Activity Activity a number of pruned a remained
4 regulator’s other
and known
, product
sparser 554 in
).
context-specific Supplementary Supplementary varied
activity except to were for
modules
the 580
by
regulators
be and modules classes.
differentia
in mixed-use 81 constructs regulatory of regulatory
Repression leveraging
lineage 17 candidate by
regulated Context-specific regulator Global repressor Global activator Target modulemean Target moduleexpression and
network, in new constant
relations
of −2
for
334 interac specific weights
activity coarse-
regula
Elastic differ
activ
their were
Expression (fold)
hav
The fine and and
per
in in Samples Activity ------
None 0 test enrichment ciations by a and ChIP-based overlap DC with that the GATA-3, development tors the known T B For single Ciita cell response specific fine lates and myeloid of selected example, already Many interactions regulatory known of prediction Ontogenet
regulator
Ontogenet’s Furthermore, cell cell its Ontogenet
6 γ
expression
module
commitment many for
example, Activation of PLZF modules; δ ) the the
),
15
regulates module T of
regulators which module CD4
2
T modules with
two
known, between as
cell known modules
myeloid-specific
fine
the T-bet among module
cell myeloid aDV
(ZBTB16), a and
+ binding for C19
groups)
module regulator those
regulatory
supported
DCs, modules);
and (
MafB regulators
the A Supplementary Table Supplementary 8
was
predictions
of and b cis the
the
NCE ONLINE PUBLIC ONLINE NCE C18
role
and
C29, which individual 1 a the
-regulatory their C52; 2 C24, Pax5,
function;
+ + + =
regulator based B .
fine genes also are PLZF; antigen-presenting
profiles
of (encoded NKT C56 and (
cell combination
F128 Mean moduleexpression
upeetr Fg 8 Fig. Supplementary of
Sox13
regulated subset-selective
T-bet
C30 member modules;
the
supported consistent
C/EBP Trim14 interactions EBF1,
Gata3 the
on in is
Bcl-11B, P Tox
module Irf5 and modules cell
and
were idea
regulated
in < the regulators,
and enrichment expression expression and
the myeloid
and expression (encoded motifs expression
1
module the fine
by
F131; α
module
of POU2AF1
×
NKT C74, genes
also a
by C/EBP
(encoded
Id3, modules
Mafb direct
10
coarse modules with C33
the GATA-3, C25 of
A
IRF8 ( −5
TION
STAT1 and
P
supported by identified
the
F288; regulators all module
).
by
cell = PU.1 accuracy
) expression (permutation
by with
known is
β and
physical of For the
= regulates
2.6
module
B
also (but macrophage (encoded CD8
regulated ( Tbx21 cis Σ
module
cell and Supplementary Supplementary Tables 5 F150
by
× × × ×
nature immunology nature
regulates known example, and C30 × (encoded which Lef1, -regulatory
Regulator expression )
involved 10 not
+
F188 Cebpa
module regulatory
is
by Spi-B by DCs
interaction
CIITA
−5
and of ) and
were associated
regulated IRF4 IRF4), the regulates
TOX
(hypergeometric F136. GATA-3 activity TRIM14 activity
it Ontogenet their γ Regulator activity by
our
IRF5 activity TOX activity is
δ
by ) F152,
was the many is test)), 27
macrophage- T by supported module Cebpb regulates (
regulated
(encoded C25 in Fig. Fig. higher
and
model.
the
of
cell and
interferon-
motifs significant associated Sfpi1 consistent
relations. NKT
the
in
between
myeloid
(and IRF8 such the
regula known 4 ) by with
TCF7; which
);
regu
×
) asso were C29, than
and was NK For cell the the the
13 by by by
in as a - - -
© 2013 Nature America, Inc. All rights reserved. did were negative type other regulated some little (57%)), regulatory (for do false-positive sion-based 497 of do fine negative sites There motifs actions that significant, ing in such groups) ( 21 ChIP motif cell the changes. diagram) in (labeled regulators inferred selected of weight activity the which at steps) (differentiation ‘edges’ indicate arrowheads tree); that on populations (below, differentiated tree hematopoietic the onto C33 module of high) red, low; (blue, expression mean ( type. cell each from regulators the of each for Ontogenet by assigned key,right) bottom (orange-purple; ( C33. module to Ontogenet by assigned (rows) regulators the of bar, below color (red-blue expression centered ( including genes, B cell typical some contains module This module. in genes of examples margin, left lines; horizontal thin by delineated are C33 within nested margin) (right F175–F181 modules fine beneath bar color to correspond (which lines vertical dashed by delineated are lineages major (column); cell each in (rows) genes module’s the of right) key, bottom (red-blue; expression mean-centered ( C33. module coarse–grained 4 Figure nature immunology nature in b P ) Regulator expression, presented as mean- as presented expression, ) Regulator
HSCs, A Although
the
554) = find
not
GATA-2 of (HSC)
not regulator–coarse
modules.
of the
example) few
2.2 and not not as
or
cases cases, myeloid EBF1 profiles
are
or in
554).
have
enrichment
meet were the
predictions the
and or no
×
Ontogenet regulatory model for for model regulatory Ontogenet
known
result results; identified and
their
measured available
(a three
myeloid 10
module
method,
ChIP
interactions
binding correlation the
they
transcription
in
this P
motif those
a −5 real
Such not hence the
results Second,
< d
cell transcription characterized the enrichment
supported of c factor Cd19 (hypergeometric
) Projection of mean-centered mean-centered of ) Projection regulators 1
reasons c
) Regulator activity weight weight activity ) Regulator
this
is initial nevertheless supported
binding ; matches colors in in colors ; matches ‘false-negative’ Ontogenet). ×
module
B in overlaps C40,
regulators transcription
due
by
were
for 10 of
such cell of in of cell , ,
the is
Blnk
probably the the −5
most
module
C/EBP
our
a our
particularly filtering to in
and
module not (permutation
cis hematopoietic for (54%); module as
majority aDV
model factor , ,
a data
expression the C24
of
method.
-regulatory Ebf1 is a
Ontogenet, study were
regulatory by the factor
provided ) Module C33: C33: ) Module this.
factor
differentiation binding
a α
can A also
prediction enrichment
and
NCE ONLINE PUBLIC ONLINE NCE process in PU.1 controls
and and
and
associations
criteria,
(A) for factor–binding C33. 52
test statistically
result
First,
(and
be C25. any of that
indicated
the Fig. Fig.
Third, A likely various
C/EBP Cd79a of
and (SFPI1) regulators for
nominated
as motif
(absolute b
in
test)),
inter cell element
bind
and in is 90 1 assigning that hence
stem of
input.
these The
as two ); );
its module
not
to
.
of in c Ontogenet),
ChIP-based
β type they for should
- -
)
reasons.
is target (60% of
occur many
itself
targets
GFI1 is highly known the
or
chosen
Pearson were data,
(90% not
only a d c b
scores B binding
POU2AF1
of
immune Pou2af1
not module HDAC10 for Siglecg cases Pou2af1 Vpreb1 Vpreb3 A Hdac10 PRDM Fcer2
transcriptionally Cd79 Cd79
Prdm Bcl7 was Cd55 Cd22 Cd19
(300 Pax5 Fcrl Fcrl RFX5
EBF1 Il12
TION Spi-B Spi Rest Ebf1 Blnk Pax5 Pax5 Tcf4 Tal-1 Igll Spi Rfx5 Ebf1
regulators;
Bl
Il9r in expressed prone
and in
b b
a 5 1 a a 1 a k cis by b
much 9 9 be of by
but
Regulator activityweight Regulator expression
interactions not fact GN
another in
for
-regulatory r
considered
regulators; of
GN an Ontogenet
vice and profile
< which
(B) MF
system
in assigned
to
551 binding a 0.5). smaller expres MF MO
versa. BMI1 many MO false- false- show DC
only
334 s cis cell
for we In 1 3 - -
HDAC1 RFX5 Spi-B POU2AF1 EBF1 Pax5 NK lated Context-specific modules mixed-use underlies regulation Context-specific F287 was ated the was a ated dicted unknown or regulators previously in as general, was by mediate expression as DC
0 regulator POU2AF1
The a a T8 our ‘pan-differentiation’ Tcf3
much
regulator
not B
predicted not before by
and T4
by cell
to arrays reidentification
)
T-reg embryoid associated
in possibly specifically
was one be
gdT
F289, regulator lower in S&P
NK
that unknown stem of with was
a T
this
not and the set regulator in
by PREB
Ontogenet
highest a
the and stem number
context. macrophage-specific of because the previously regulation, identified
had
bodies before
regulation because regulators
B progenitor
expressed model
regulatory and
higher
of
of modules, in
of with
known
of Among the of progenitor NK
pre-T
associated
granulocyte-macrophage
to GATA-6-deficient as a it unknown
in NK expression
bad macrophages. of
had
PRET be
a in
in
cells
which interactions
NK regulators T cells
cell a
at those,
the
probe T
relatively
cell regulator
least cells. and cells
cells
T4
module
and context
modules
with
role the regulator,
for
set.
granulocytes. Repression in
175 GATA-6
and
or
was −2 NK
same
discussed lends
myeloid
example, lineage-specific
That
A T low
XBP1 in granulocytes of
C19
(37%)
had
of T8
only
mice the C31, Expression (fold) the
e c r u o s e r expression set
support is
one
perhaps but
was
was low model. Activity in
γ
sparse 1
CTT8 of cells. δ C50 were None
4
0 KLF12
below.
colonies
agreement . lineage was
T
predicted
E2A genes Finally, not expression
cell
and to
completely
not because
Of
because and
identified
GD
(encoded the
in
Activation modules modules was
C58 T is
the
and associ
gener B
ETV5 2 inter many regu
to
with F181 F180 F179 F178 F177 F176 F175 cells pre
475 but
by its be
in it - - - - -
© 2013 Nature America, Inc. All rights reserved. reflected Most differentiation during ‘rewiring’ and recruitment Regulatory specific lineages, Although F300 and module, lators populations. example, underlie of different in reported another regulates. it module(s) the to regulator each connect interactions) (regulatory Lines >1. × expression) weight activity (absolute effect a maximal with interactions regulatory with cream) circle; (outer regulators assigned Ontogenet their and (yellow) modules mixed-use and modules (gray) repressed and (red) induced ‘pan-differentiation’ purple)), (dark modules 5 Figure e c r u o s e r entiation. MYBL2 TEAD UHRF HSF2
DNMT1 PA2G ATAD
the T CBX5 2 1 GATA 4
2
cells INSM1 ACTL6
regulatory is 1 lineage HMGB in A
independently
(EGR2).
ASF1 set although Ontogenet regulatory model for coarse-grained modules. Lineage-specific modules (inner circle; colors as in in as colors circle; (inner modules Lineage-specific modules. coarse-grained for model regulatory Ontogenet 2
‘mixed-use’ by
others
our regulatory in and RUVBL1 module
This some B FOXM1
PHF5A the selected of
in
Each
model: by tree
CNBP
regulators
change change
are DCs.
of Pax5 THRA relations
C70 PLAG1 not can
activation MECOM B its
programs cases,
modules
cell Foxp3
In SOX7 in
expressed
regulators help in is provided RUVBL2
induced
B
induced another their
specific 15 23 in PHB2 12 55
HMGA 7
cells identified 14
HOXA such delineate
6 43 the in 2 SAP1
9 9
event expressed associated
11
for 1 8
CD4 HDAC2
3 5
a as context
in CHURC1
.
in
(ZFP318,
are both example,
‘bird’s-eye’ The
BUD31 47 40 the
the the
is 10 both +
NPM1 CREG
by the
themselves
MORF4L2 T same
associated 42
ability
regulation DC in 1 50 NR1H
RBBP7
Ontogenet in cells
activity of
regulatory SUV39H T mature 4 3 CEBPA
2 more subsets), reg
RFX5 another
BHLHE40 the module 8
view of ETV
(itself CHD3 cells
MEIS 1
Ontogenet 1
KLF9 fine-grained
29 weights EYA1 than
TCFE
and with ‘mixed-use’
of B B of 34
ID MAFB
and NFATC22
were 30 lineage, 54 cells and
mechanisms GM990 a
in Rag2 the
WHSC1L7
CIITA) one
59 member 1
ARID4A different
different some
CHD7 67
44 during ‘recruitment’ PIAS3,
FLI dynamic,
and 1
STAT 52 lineage. by 2 NFE2L1 AFF3 to 64
STAT
has 1
GATA-3 DTX3
myeloid MAF or identify module L in T IRF7
differ
AR
of
ZBTB16 HSF2
TRERF1 regu
T parts cells. been both 56 77
SOX1 RCOR2 that
3 For cell
the NPAS2
71 IRF8
as ID 58 ZEB1 3 25 - - RNF2 HDAC5
70 FHL2 HIVEP 32 PPARG
1 28 KLF12 SP 48 16 4
fied reported or negative differentiation. 49 61 differentiation activity expression disposal) its or and In the and ity cell and CEBPB PRDM 26
IRF6
MITF were this 1 CD8 regulators, regulatory progenitor) 27
GATA6 weight
unique type, strengthened ‘disposal’ its 41 the RARB
61 IRF5 24
EPAS1 33 way,
74 INSM
+
immediate AH
disposed
has 76 75
(CD4
involvement ) FOXP3
KLF6 R
1 31
65 involvement
BHLHE4 of we
T
18 FOXN2
aDV HIF1 19 changed 62
changes
NCOA7
we regulators regulators changed cell,
A RELB PIAS3 1
identified MXD4
only interactions −
TBX2 TRIM25 of
A
identified CD8 of at PBX1 In EGR2
NCE ONLINE PUBLIC ONLINE NCE 1
KLF8
HIC1
Ontogenet
regulators of
CD8
each CEBPE (activity particular, ETV3
15 BATF3 progenitor
PADI4 − ETS1 and but (increased NFE2
)
for FUBP1 of
of stage
AE
GATA +
of does and differentiation S
HMGN
that T
which TCFE all weakened
2 RCBTB1, ZFP318 WWTR
this SMAD IRF3 modules MXD4,
3 PLAGL1 cells ARID3 C
IRF4
were SREBF
RBPJ weight
MYCL NCOA3 1 the 4
modules
TEAD1 3
ZFP263
not RUNX2 ( independently the B
VD
for T NR4A2 Fig. 6 Fig. PRDM 1 2 set DACH1 R EBF1
KDM2
have
TRPS1 SPIB from ( cell RFX5 SPIC regulatory or
NFE2L2 9
Supplementary Table 9 Table Supplementary necessarily recruited, SOX5 JDP2 model the
B NR3C2 ZFP691 EYA2 POU2AF RUNX3 of STAT
HL EZH1 RXRA CREB3L1
MBD2 KCNIP3 EOME Batf
DTX1 NR4A3 BACH2 ZHX2 PAX5
BCL11B
greater F
decreased) and (activity to PIAS3
4 a
been targets.
differentiation 1 S A
). common Fig. Fig.
involved immature TION
step.
and
To regulators suggests
2 associated
than
, except myeloid induced induced myeloid , except characterize and
involving
interactions
NFIL3
Lineage-specific module(NKcell) (myeloid cell) Lineage-specific module Lineage-specific module(GN) Lineage-specific module(DC) Lineage-specific module(MF) Regulator Regulatory interaction Mixed-use modules differentiation Module upregulatedwith differentiation Module downregulatedwith Lineage-specific module(Bcell) Lineage-specific module(Tcell) mean weight Notably, identified For nature immunology nature
lymphoid between
in ITGB3BP
that single-positive
example, that
those that that
and lower step
of before
34 their recruitment
its were the the
newly
this, that from modules whose
interactions. progenitors,
), progenitor) than
(
during regulators’ previously regulatory
Fig. with
as
recruited
cell for
double-
well identi (CD4
that
activ 6b T
each type
and
cell the , (or
c
as of ). + - -
© 2013 Nature America, Inc. All rights reserved. when no been module The recruited tors cells, In in those to correspond colors type (cell ( >1. if edge, this on regulator the for changes weight activity indicate parentheses in Numbers T cells). in reported not black, T cells; in a role have to reported regulators (red, modules associated their and step differentiation each along recruited regulators right, CD8 the in recruited regulators CD8 the in recruited interactions activating those only for interactions changing’ ‘frequently ( regulation). 0 (no white, weight; activity (repression) negative purple, weight; activity (activation) positive in colors matches which below, 6 Figure nature immunology nature and b a SMARCD2 MORF4L2 ITGB3BP RCBTB1 RCBTB1 POU6F1 POU4F2
ZBTB16 ZBTB16 ZBTB38 TWIST1 WHSC ZFP367 ZFP367 TRIM14 TRIM14 SCMH SMAD7 CLOCK HOXD8 SMYD1 RUNX2
FOXE3 another AEBP1 BCL6B TRP73 STAT4 TBX21 TCF15 TAF11 KLF15 KLF15 KLF15 KLF15 KLF15 KLF15 MYNN parent RORA PIAS3 PIAS3 MXD4 MXD4
NFIL3 LMO4 Frequently changing regulatory interactions SOX4 SOX9 CUX1 HES1 CBX7 PHF1 ZHX2 BATF ETS2 AFF1 ATF6 ATF6
KLF3
GFI IRF9 MAF HOXA9 Repressio SP4 Eomes 800 700 600 500 400 300 200 100 differentiation
proposed the CLP.BM 1 1 1 they preT.ETP.Th
GN C70
Changes in activity weights across the hematopoietic lineage tree. ( tree. lineage hematopoietic the across weights activity in Changes NK
as in
n preT.DN2.Th
MF
example, are and
repressors
our
cell preT.DN2−3.Th and were MO
no as preT.DN3−4.Th T-bet
module
model,
its a
Activity
None longer no
T
step
known reg during
longer
as T.DN4.Th DC
at cell
regulators that activators. C19
used aDV
this T.ISP.Th
regulator
regulators
the +
used
Fig. Fig. leads T cell lineage, including the CD8 the including lineage, T cell was Activatio T.DP.Th
A differentiation
at S&P NCE ONLINE PUBLIC ONLINE NCE T.DPbl.Th differentiation
1
assigned later as
) for every ‘frequently changing’ regulatory interaction between a regulator and a coarse-grained module: orange, orange, module: a coarse-grained and a regulator between interaction regulatory changing’ ‘frequently every ) for n
to T.DPsm.Th active PROB
Both Fig. Fig.
1 activators
6 T T.DP69+.Th
). Foxp3 points reg
1 B Notably,
Eomes T.4int8+.Th ) with Ontogenet-inferred lineage regulators (red and black as in in as black and (red regulators lineage Ontogenet-inferred ) with in cells
the
T.8SP69+.Th HSCs NK
and step
(for known
recruited T.8SP24int.Th at step
PRET and
because
the
CREM
example, in will T.8SP24−.Th A
that
TION
T-bet
other mult NK
be T.8Nve.PP T4
leads the
(which cell HSCs
noted T.8Nve.MLN i
lymphoid were modules.
HOXA7 + T8 T
T cell lineage branch (squares, cell types; lines (‘edges’), differentiation steps); steps); differentiation (‘edges’), lines types; cell (squares, branch lineage T cell regula
reg T.8Nve.LN to
have
only
also
NK T.8Mem.LN cell has ACTT8 -
T.8Nve.Sp a Activated NK ) Activity weights in each cell type (columns; correspond to color bar bar color to correspond (columns; type cell each in weights ) Activity point The repressors and activators lineage of Ranking T silencing by MEIS1 ment progenitor T.8Mem.Sp
T cells
C42
αβ activity gd
T Cell
allowed is 1 T
7
is in . the
of
recruited
T LMO4, 44 stage).
c d weights the step
TCFE KLF6 NFE2 STAT NCOA XBP1 BACH1 C/EB PADI4 ENO1 GN + T.8Mem.L cells, T.8Nve.L T cell lineage. ( lineage. T cell POU6F1,
us P
6 3 4 gene β SC.COMP.BM
to that PreT.DN3-4.Th NCOA MA BACH1 CNB C/EB ENO1 LMO2 C/EB XBP1 CREG1 MF preT.DN2-3.T T.8SP24int.T
T.8SP24-.Th T.8SP69+.Th T.DP69+.Th preT.DN2.Th preT.ETP.T
The in N N T.4int8+.Th T.DPsm.Th T.DPbl.T
F T.DN4.Th T.ISP.Th
CLP.BM identify 4 P assigned P P to
4 β α
encoding agreement T.8Nve.S
T.8Mem.S leads
first LMO2 MEF2 ATF6 TRPS BACH1 SFPI IRF5 XBP1 ENO1 C/EB MO module h h 1 P h h A 1 β b
SC
STAT CUX1, HOXD8,SMARCD2,77;69;78 78; 34;18;71;75;81 MXD ZHX2, Sox9,18;37 KLF15(3), POU4F2,Foxe3,AEBP1, TWIST CLOC GFI1 KLF15, PHF1, HES1 ATF6, 78 ATF6, MYNN,KLF15,TRIM14,71;78;73;34
differentiation Activated CD NFE2L1 RE INSM CIIT XBP1 RBPJ IK KLF6 ETV3 RELB DC p ) Enlargement of boxed area in in area boxed of ) Enlargement
to and p B c
L for T c α T.8Nve.Sp.OT ) Known and previously unknown unknown previously and ) Known A
LMO2 PHB2, CNBP,TRIM28,RUVBL2, 4 , .8Nve.ML ). 1 4 multilymphoid , TRP73,
K (2), TBX2
MEIS1 1 , CLP.BM , ZFP367(2),AFF1,WHSC1,ZBTB38,TRIM14,SP4, C42.
, 68;71;52;56;35;62;6778;14;77;23;15;65;17;66 BCL6
ASF1 PA2G4 SPIB FUBP EBF1 UHRF CBX5 HMGB PAX5 POU2AF each rank
with BAT B , PA2G4,NPM1,RBBP7,PHF5 1 RORA, 19 , RCBTB1, 8 B B 1 1 F T N
2 + Smad7 , SMYD1,59;73;19 IRF9
, RCBTB1,PIAS3,ITGB3BP,
T
1 MEIS1 regulator
regulators d the
1 during .8Nve.PP ) Simplified ImmGen tree tree ImmGen ) Simplified , SCMH1,CBX7,TCF15, STAT SGM9907 ID KLF2 AE SMAD ENO1 ETS1 TBX2 EOMES NK , PIAS3,42;69;67;77;54;38;71;57;37;45
2 ZBTB16 step S reported 1 6 Not reportedinTcells Reported Tcellregulator 3
is MORF4L2, 20
differentiation
with progenitors,
at (2), TAF11,KLF15(2),KLF3, abT SP100 FUBP ETS1 NPM1 TRIM28 BCL11 LEF1 ENO1 TCF7 AE later
e c r u o s e r as
S each Sox4
HLF 1
lineage
B activator methylation
; 39;79;38;66;77 NKT LEF1 STAT TBX2 NPM1 SP100 ENO1 ID2 ZBTB16 AE GATA3 no A ,
differentiation S
NFIL Runx2
1 1 a longer of SREBF2 TCF7 GATA3 HIC1 ENO1 ETS1 DTX1 STAT AE MBD2 3 gd
activators
, S T at ETS2
recruit
4 toward
which
, used
MAF and , -
© 2013 Nature America, Inc. All rights reserved. differentiation cytes, NR4A3 in regulate stem previously addition, and T with example, known 10 Table Supplementary and ( experiments ( genotype per mice two of a minimum with litters independent three to one with experiments three of representative are V by ROR (CD24 mature by (left) IL-17 ( ages. different of mice for obtained were results Similar each. in cells percent indicate (bottom) quadrants in as mice CD2p-CreTg 7 Figure e c r u o s e r including tors the their lymphoid regulators including regulators increased tors most identify all presented stage. ing by it CD8 ELF4 higher
can
cells;
Finally, a the macrophages, regulators
3 5 a HSC
γ
γ Smad3 were T cells ( 10 ) T cells ( 10 ) 2 γδ × γδ × were +
10
stem t repressors and limited
substantially activity
+
+ deemphasize 0 5 0 3 6
POU2F2 (Gm9907; T model Those
cells (top; right number), or IL-17 or number), right (top; cells T cells from the lymph nodes (LN) of 4-week-old mice as in in as mice 4-week-old of (LN) nodes lymph the from T cells
expression
regulators
POU2AF1,
(shown cells those
Spleen Thymus cell-cycle ETV5 is a regulator of of a regulator is ETV5
regulators
progenitor
the our disposed
the a by recruited
progenitor above and from GATA-1,
Id3 were . Numbers above bracketed lines (top) indicate percent CD24 percent indicate (top) lines bracketed above . Numbers +
associated with Etv5
b
2
counting
window in
and
model are 3 known weight ) or three experiments ( experiments three ) or model
and ). 2 differentiation progenitor
(Oct2;
1 a
+/+ 0) Notably,
) to
recruited
NK ( on
more
shown captured
modules) of CREG1; Fig. Fig.
and
in
littermates (control (Ctrl)). ( (Ctrl)). (control littermates ( Sox13 progression
CKO Ctrl of be at
the
Supplementary Fig. 11 Supplementary Pax5,
HOXA7, each
to made
c-Myc cells, the monocytes); cells; associated T
diminished
stage, at
of
the KLF13 the
reported induced 6
with
nuanced coarse contribution cell
).
the
differentiation
to
lineage
although
( basis b
EBF1 changes
). HLF;
early cells; Supplementary Fig. 11a Supplementary
many
and lo
in to
regulator by
Events
at
control those double-negative-2 and four ) ) V In
γδ CD44 (% of max)
HOXA9
(a each DCs, coarse points modules the 100
10 10 10 10 γ
GATA-3
this
20 40 60 80 2
T cells. ( T cells.
in in
to
in
Bcl-11B, of way—for and thymocyte
0 0 regulator
c-Myc, 2 3 4 5
among + N-Myc predictions
regulators
thymocytes of mice as in in as mice of thymocytes
to
1 0
be in CD24 CD62L myeloid lineages, viral recruitment in granulocytes,
differentiation 16 20 30
this
the c 10
ATF6, way
the
0)
of Spi-B
). B GATA-3
activity upregulated modules at 2
10
and cells,
+ important
Ctrl by
‘pan-model’
infected but
(that cells (bottom). Right, expression of CCR6 and CD27 (top) and of IL-17 and interferon- and IL-17 of and (top) CD27 and CCR6 of expression Right, (bottom). cells entire the which proliferation N-Myc,
a and 3 we
(
) ) Total TCF7
example, Supplementary Fig. 11b Fig. Supplementary of
coarse 0
with ETV3,
cells HOXB3.
4
top-ranked ). 25 30 79 not
progenitor such were
correctly
10 ZFP318;
Thymus PLZF B weight
of
b is, For and
5
) Expression of CD24 (top) and of CD44 and CD62L (bottom) by V by (bottom) CD62L and CD44 of and (top) CD24 of ) Expression cells
regulation at T
model
lineage and 1 γδ and present
B
DACH1 8 GATA-2 their
DCs
21 2.5 2.8
cell
disposed as
),
modules,
T cells in the thymus and spleen of mice with T cell–specific ETV5 deficiency (CKO) and their their and (CKO) deficiency ETV5 T cell–specific with mice of spleen and thymus the in T cells during example, the
regulators step the cells;
SKIL,
with
as
Bach1
).
and that
analysis GATA-3 disposal the Eighteen
CKO
precursor 19 and
acting At known γ and captured
activity (‘edge’),
δ
, regulators
(
activators.
2 stage, in Eomes,
NKT T
i. 6 Fig. occur following:
the 0
T macrophage NR4A2
(reported
);
2.7 in
and
a 74 cells 96 is the and
homing
; numbers above bracketed lines indicate percent ROR percent indicate lines bracketed above ; numbers of 19 cell including in
only common NK ‘rewired’ captured is
γ with analysis
cells.
(that
mature regula regula
δ and
MEIS1 (across mono 2 weight we useful, d
NFE2;
a 2
stage, many stage, T-bet
T
; numbers in or adjacent to outlined areas indicate percent cells in each. Data Data each. in cells percent indicate areas outlined to adjacent or in ; numbers c cells,
with dur
and and can
For not cell
α
28 In is, Events Events in to of ). β lo - - - - (% of max) (% of max)
(mature) thymocytes (left) or CD24 or (left) thymocytes (mature) 100 100 20 40 60 80 20 40 60 80 0 0 est CD24 expression the in T development regulator are in and model expression of the manipulated tion latory To regulates ETV5 grained (terminally module the number effects. regulatory double-negative-2 in steps ated; with been than Overall, IL-17 ROR
1 0
9.7 cell To
the distinct T
small-intestine 10
test expression among differentiation
γ Ets
ETV5,
in 2
δ γ cells
no partly at t assess also with
lo
10 development
differentiation Ctrl T activators
assigned
that
one
family The
3 model),
lower cells, modules
other of ‘rewiring’ cell
of (CD2p-
0 called
aDV
the
regulatory the thymic
4
is
‘rewiring’ due
of are of
lineage. the
49 67 10
Thymus differentiated
regulatory γ antigen
in in vivo highly thus
they
δ
5
cells the (more-differentiated) in largest earliest A the
member
to
specific
regulatory T several NCE ONLINE PUBLIC ONLINE NCE in
32 ‘leaves γ gd of help
the
Cre
δ
model’s
cells,
far. was particular DCs, precursors,
T steps, acquire cell differentiating
thymocytes and lineage-specific T cell differentiation T cell A
in
restricted cell
CKO receptor
Tg diminished
changes of number
in
to
determinants
practical
more
a
regulators
surface
to but model +
in ETV5
γ thymus function
cell identify these precursor compared Etv5 δ
8.8
predictions role
42 the cells; the effector c T
ETV5
) Expression of ROR of ) Expression prominent type–restricted in
cells fl/fl
hi
transit
(a
of
which
to γ
cells, of tree’). (immature) thymocytes; numbers in numbers thymocytes; (immature) differentiation and
for
δ criterion marker
expressing
change γ
67% power
more activity
mice).
the IL-17 CCR6 ETV5
T δ in
10 10 10 10 10 10 10 10 to
from in
γ
has
fine
stage, 2 0 0 functions with its
T 2 3 4 5 2 3 4 5
A
cell
of γ mice
modules these
+
possibly levels, TION γ from
0 modules
t raised The IFN- CD27 thymocytes from 7-day-old 7-day-old from thymocytes
− cells,
δ versus
a in vivo in 45 42 the cell predicted cells (top; left number) or number) left (top; cells
not 10
at ; error bars, s.e.m.), five five s.e.m.), bars, ; error
to
modules them T in in lineage.
As CD24
γ 65%
2 weight
was
which
higher
detect lacking
cell
individual modules, lineage. 10
γ
activity immature type–specific
Ctrl
γ been
thymocytes δ liver
although the nature immunology nature -chain 3 manner.
2 , T 48%).
that 10 for steps (terminally
with due 9.8 lineage. 4
we γ F287
.
(CD24 γ t and production of production t and 4 cells,
possibility 43 suggests
(IFN-
changes ETV5
changes
Both 10 levels linked and the role
identified
centered ETV5
the 5 LN
weight
to
Sox13 leading variable no
only and Thus,
coarse-grained we 18 14 We
cells differentiation
tissue-specific lung γ this
as gene hi
are
; bottom) ; bottom) Although in known
has
)
that
to analyzed
regulation.
in
F289, specifically
a focused
two, substantial were the is CKO
to
differenti for
expressed
with may
to regulator
DCs the that cell γ
on could its a
region
δ a
express
mature lineage known ‘leaves’ 82%
18
Sox13
larger T
high func
those regu
types as
74 fine- have high
they
and cell the
on
be its γ of
δ 2 - - - -
© 2013 Nature America, Inc. All rights reserved. factors such have and view The DISCUSSION regulatory a enrichments cis to regulators The module’s heat present them expression, grained latory org/ModsRegs Regulators’ ing part model, To portal ImmGen the on model Ontogenet the Studying thymic model effector were induces mice effector deficiency the ETV5 portion bottom). sion T marker activation, ETV5 specific spleen higher, cytes ETV5 neonates, of in absence IL-17-producing one leukin γ (V nature immunology nature δ
table cell–specific
control ETV5,
γ regulate T
motifs 2) facilitate
and
residual
prediction
(arguably)
ImmGen
module
of
map),
of of cells
impaired
processes, used had
that or of
deficiency
deficiency deficiency
with model
17
and similar
the which
the
maturation of CD62L
we that
the
visually cells Il17 of loss cells of the the or
mean
the For
lower
lymphocyte (IL-17)-producing in expressed mice
total
and
that
mature ETV5.
as those ImmGen
program
transcriptional
provide the were
expression fine-grained data demonstrated
the
the
cell
T a
transcription),
of control
postnatal mature (
of overall page (
indicated
γ exploration
Fig. Fig. 7
resulted of Fig. 7 Fig. data ETV5 cell
gene
δ
in
/modules.htm
to
activity
of of binding expression mature assignment are the
(their
(a number
module,
expression
type from
inspecting browser thymocytes
the impaired γ
the ( data
that any
the was
was δ In
antigen Fig. Fig. 7 marker also
cells
set members
module.
c of effector the
c they
number
of deficiency
the ). mice thymocytes portal, generation them. ImmGen.
in ).
model
CD2p-CreTg
developmental global
~50% the
activation) provides in
in not to
Mature
of
mice. all
weights V
These links
by which or full
b
events
γ
of
of an γ control the modules , its analyze δ contain,
2
aDV IL-17-producing modules, the with receptor,
projected top). activated activity that of different
and
+
in of
T
the
the of thymocytes
and
cells The abundance Finally,
set
with genes
was
thymocytes of profiles of that
to
Most
regulators
of the
l cell
all γ results lower
) the the A they V
δ
and ImmGen
ETV5 results. γ
of conditional is normal, of
the a of NCE ONLINE PUBLIC ONLINE NCE testing This
It the
both
γ
of δ genes
would
effector unique that
2
naive mice
relevant
antigen list
expressed in the
on
the
transcription T
each generation the allows +
of
and modules
CCR6 by
a
V the
+
most are
we module, which thymocytes
expression cells and
cells from Etv5
mice on
V γ any
may
regulator supported to of
thymic the
cell–differentiation gateway
2 regulatory their was
module
γ
to
correspondingly
predicted
+
( induced. provide
Specifically, 2
be
regulator of state) the
Fig. 7 Fig. of modules
the
but in
suggested +
cells
the
portal mammalian +/+ generated features detailed + intrathymic coarse
in
cells in
the
have
receptor
γ
thymocytes
CD27 that particularly with tools
constitute essential
δ other
mice
lineage
T
mice number,
genes γ mice and other
littermates): the
Ontogenet T the and δ
user differentiate
on cell–specific a of to
in
cells
been effector
of genes,
). links
predicted of (
T
for and
from −
factor the frequency
http://www
For
regulatory the
regulators
IL-17-producing
with regulators the
those
peripheral
However, in to
with with circuits controls.
and
IL-17-producing predictions cell–specific CD44 of
V
in
was tree to
was by searching,
in
the each prediction
A
γ the
for due fine
Ontogenet each
for
Ontogenet thymus
the
nearly TION their
chains,
immune ETV5-deficient
development
browse
the comprehensive
and
from
impaired T enrichment T T
ROR
(as about
cell
higher cells
similar ‘Modules
proper transcription code.
downloading
(the
to
modules,
cell–specific cell–specific cell–specific in
module, cell,
process. thymus module,
underlying to
in
functional
deficiency into
inefficient
pattern
of
Critically, subset.
7-day-old
mice there predicted (each
γ .immgen.
half model γ
( 2
nominal t
Fig. Fig. 4 regulate the
δ Fig. 7 Fig. 4 twofold
and
thymo
coarse-
expres (which brows .
system
of
to T
of ETV5
intra Thus, inter in
regu
of
have
with
pro
cells
that
and was and
as
our our We the the the the
we
a
all γ γ of as of of b
). δ δ a ------,
tion ferentiation The regulators cal controlling the subset. unknown Among sequencing enrichment correspondence regulators tory are poiesis for regulators tiation allowed in by were regulators ‘mixed-use’ karyocyte, tains use’ humans coexpression shown dium tial program files that Another ment Those criteria their the functional, cell extensive as regulators arbitrary Ontogenet regulators steps izing directly the an mature
Ontogenet’s Our A As all
regulators entire
lineages
context
at recruited differentiation activation
mixed maintenance
states
are in (such ability
similar
function.
factors
across
at many expression, active with of lineages least suggests
may
stages the before,
Additional
analysis
and stage.
which
(for assigned regulate
those, us modules
and
challenge
lineage is
3
autoregulation
role resolution
and
were at , that
and as all use
to basophils
with 2
mouse
cannot
175 although unlikely
at supports
the
more
to 5 in patterns be multiple has for example, in windows
study
whose
the
mice
Rag1 to
identify suggest and
However, expression-based
of which
of patterns
regulatory
for in to
identify we other of global
Our their
rich control that captured only
expression
more cis shared
control
identified has similar the
appropriate
among
helps a to other
identify
the
modules
and of cell in
ETV5
to experimentally
4 -motifs
program
studies directly of and of module.
is .
modules
ability
most model expression expression
to
One cis lineage
correct automatically
one
the
follow
have regulation some
and
regulators, modules.
posed
myeloid
human
IL-17-producing regulators that of regulatory types, profiles, Tal1 helps some
to
hold
-regulatory functional across such Rag2
regulators
are a integrating
expression probably
in the
this
by
lineage identify eosinophils, global possible control
in specific
modules been should and
,
such
may of distinguish
by points the
to
allows
which
them
by
to true is may
otherwise our The
other ‘rewiring’. function,
identity it predicted
with
in developmental
hematopoiesis
genes
lineages. automatically ambiguity,
lineages,
modules
that
enrichment detect does
genes differentiation lineage-specific
of
For
observed module
be
regulatory is
occurs, principles methods
we
not complementary programs for
have
dense
in ‘rewires’
as
determine effector
high similar
transcription
gene
tested in regulators shut encodes us reason ‘false-positive’
mammalian
that
motifs
whereas
reidentified
are
example, can in across
not profiles
they some
those
associated
the with or ‘early’ to
the
γ
has
been quality
causal
In regulators,
γ lineage
program
δ
whereas
broadly act off C25, identify C5). to interconnected include function,
δ predict
and
lineage ‘unfold’
lineage–specific but
in
used molecules. particular,
cells
for Ontogenet
effector unique
and most suggested genes.
modules of
the
generalize by active interactions a when
only
programming 3 humans
filtered the
stage
whether
reidentify to
The
known the may C45
distinct confirmed
has ATF6
directionality. the ChIP
of of
model’s
binding for
to
specific.
be circuits signatures
factors
before substantially
mouse those
many
the erythrocytes, our the e c r u o s e r the conserved regulatory or during
at
results. at inferred and suggested diminished
analysis
expression cells predict and
not
and which such cells assigned known
lose
which
specific followed
was
3 but γ by specific
mouse
new
HSC
some
Finally, additional
.
expression ETV5
δ
broad ‘rewire’.
be
allows
differentiation Notably,
the C49
predictions
of
with important T transit
many of
our by
data
as that
are their
As were
circuits specific a
an
reflected many predictions. effector Conversely,
the
regulation,
regulators.
regulator). modules. significant
previously regulatory selectively of functions,
regulators those regulators and transcrip
but
program.
stringent compen
has probably regulates activator
differen
substan
To between hemato together roles controls
general set biologi
by
profiles
and enrich regula several known known
‘mixed shared mega
to
of many gene-
was avoid
been roles
deep
con
pro
that
and dif
cell
the the the
for for
by in
a ------
© 2013 Nature America, Inc. All rights reserved. E Gundula Adriana Ortiz- Tata Daphne Koller The complete list of authors is as follows: Adam J Best O.Z. input V.J., and (A.R.), (A.R.), U54-CA149145 to Supported A. and Analysis with We Note: Supplementary information is available in the code. Accession version Methods M in an browsing and dium, removed do used to methods of ate inherent of lineage) some ontogeny. interpretation. or each cell, stimulated as dictors data a T e c r u o s e r 1 AUTHOR CO A
0 ck
T
mmanuel
cell–specific
ethods the
Liberzon
Ontogenet
cell
in γ
knowledge revise conditions
invaluable
not
thank
into the layout T.S.,
δ cell–specific
provided data
regulatory n
high-dimensional
obtained
from
case, ImmGen to
Lirnet
T parts o N the the
National differentiation coarse-
correspond
wle are for
A.R.
refine cells
the of generation
ageswara ageswara Rao the
and
by Merkin Klarman
of
lineage would an T.S., without and
and
(Molecular the Although support
t
used
the
state, remain he National ImmGen
6 and M same dg
the N
), existing
or
Consortium),
A.R.
and
some is
resource
TRIBUTIO
ability
for with
the Science lineage L
can pape in-Oo of any
when
men motif-related repressor Notably,
and relations D.K.
Foundation
applicable other Gautier as be
Cell activator
flexibility, 149644.0103
or GEO: the downloading of
cell and
and topology
disconnecting
1
both
candidate L fetal
associated needed Institute
unstructured to r ImmGen;
designed core , , Tal Shay ts Signatures fine-grained for progenitors
.
much opez
to tree;
Observatory
Foundation
cells processing; D.K.; particular
one. type,
any innate-like and 14
for
phosphoproteomic
share
protein-expression microarray team
samples N Ontogenet
enhance
, Charlie , C Charlie Kim are
S. the in 12
S
changes
for T.S.
are
future 15 13 to In immunological
in of
code Hart is but
to
of the
, , Amy
whereby the L. to US
all (including
Allergy ,
, , Diane Stem known regulatory
Database) other regulators
the 16
analyzed other
construct
measured the Gaffney
V.J.
references
study;
reflects eBioscience, for and at
(DBI-0345474 National available
, Claudia , JakubzickClaudia
2 T (A.R.),
progenitor or
, , Aviv Regev
are
B (for studies
the
and the the Cell
modules cell ontogeny lymphocytes). initial
in participated
for differentiation
cell and now data, cases,
W
V.J.
about
not
D.K.),
ImmGen several regulatory lineage.
the Research for
power example,
M. precursor the
for cancer
M agers
Institutes Infectious in
module
layout developed
(for
an programs
help
Painter
depends data;
in for
of the Howard
known
athis GSE1590
part
Affymetrix are
Ontogenet’s
on the
mass disease. the ontogeny
the
both
and data
to
by positional
of interactive example, 14 with
and line version of the pape
cell
K.S.
of 12
at in
studies, Burroughs The V.J.
available
simply
, ,
hematopoietic
a and identifying
portal role
all of the
the
13
, , Tracy Heng writing; C71.
Diseases
L
2 Hughes
identified
module cytometry
programs
figure the (for
given did
the types and
help This (for Health , on
, , Christophe Benoist ewis ewis
5
DCs by lineage data Broad S. ImmGen 7
, ,
of
algorithm, and
. experiments;
Davis) N
D.K.).
a resting gene example,
automatically related
and
when
the
genetic
preparation gene example,
preconstructed with
reflects
X.S.
adia adia Cohen
cell
can Wellcome
Medical in sets,
Expression output (A.R.; (R24
searching L Institute in 15
tree; C57
sets
present
the will
L for , , Gwendalyn J Randolph
generated
‘edges’
and
the differenti
type. regulation regulators data
biological other
AI072073 anier including
compen state cell
for
and and with
help
myeloid
and variants
lineage, provide
Institute
in
single-
can
online can r
(A.R.) mouse. 13 2 Fund
.
those
types
6
New
part pre ). lack that , , Jeffrey and and was 14
be be or In 9
- - - , , Jennifer , , Jamie Knell 11
, , Patrick Brennan 7. 26. 25. 24. 23. 22. 21. 20. 19. 18. 17. 16. 15. 14. 13. 12. 10. 9. 8. 6. 5. 4. 3. 2. 1. reprints/index.htm R The all manuscript; a 11. CO
eprints and permissions information is available online at at online available is information permissions and eprints mouse
authors.
13
M Segal, E. Bendall, S.C. transcription. gene in dynamics temporal control: Impulse A. Regev, & N. Yosef, K. Narayan, TranscriptionH.D. ELF4 Lacorazza, factor & Yamada,M. T.,Mamonkin, C.S., Park, S.V. Outram, Neumann, M. Wang, T. differential T.T.-H.,Virus-induced Kino, Tailor,Chang, & S.S.M., P.,K. Ng, Ozato, J.-W. Lee, H. Ji, ICER/CREM-mediated S. Sakaguchi, & B. Diamond, Z., Fehervari, J., Bodor, H. Kishi, Yoder,IHH & and M. VEGF Richardson, M.C. L., Yoshimoto, Huang, M., M., Pierre, hematopoiesis. SnapShot: L.I. Zon, & S.H. Orkin, T. Tamura, ilr J.C. Miller, O. Ram, A.P.Capaldi, S.-I. Lee, circuits: regulatory Transcriptional A. Regev, & E.K. O’Shea, T.,Shay, H.D., Kim, Shay, T. N. Novershtern, stem hematopoietic the from commitment lineage Myeloid K. Akashi, & H. Iwasaki, Heng, T.S.P. Zou, H. & Hastie, T. Regularization and variable selection via the elastic net. elastic the via selection variable and T.Regularization Hastie, & H. Zou, lineage. cells. human in analysis location genome-wide Hog1. MAPK the data. expression gene from regulators specific Genet. alphabets. from numbers predicting mouseimmuneandsystems. hematopoiesis. human in states cell. cells. immune in acrosshumanahematopoietic continuum. Cell distinct KLF2. and KLF4 CD8 of homing and proliferation the controls development. theofgeneration andmaturation dendriticof cells. vaccine MINOR.efficacyblockade ofvia of production. interferon to Implication cells: dendritic in coregulators and receptors nuclear of expression 420 p21Cip1. and 4/6 Cdk D, cyclin of control the progenitors. haematopoietic Immunol. J. Eur. cells. T regulatory by suppression in role its and IL-2 of attenuation transcriptional cells. B in Pax-5 and cells T in murine embryoid bodies. andGata-6–deficient in Gata-4 hematopoiesis definitive rescue (2008). diversity.functionaltheir developmentand Methodol. Stat. B Series Soc. Stat.
E authors P , , ricson E
N , 91–95 (2012). 91–95 ,
144 Immunity model; TI M
atalie atalie A Bezman 5
et al. N Exp. Hematol. Exp. t al. et γ et al.
Nat. Immunol. Nat. , e1000358 (2009). e1000358 , , 886–896 (2011). 886–896 , and declare δ et al. aDV iller et al. et G G FI t al. et T cell subtypes. cell T t al. et et al. et 9 13 , , Ananda Goldrath
t al. et t al. et Conservation and divergence in the transcriptional programs of the human et al. et al. et al. et J.K.
Inhibition of activation-induced death of dendritic cells and enhancement et al. et al. et V.J., Cell Cycle Cell opeesv mtyoe a o lnae omtet from commitment lineage of map methylome Comprehensive Module networks: identifying regulatory modules and their condition-
t al. et N A Lineage-specific regulation of the murine RAG-2 promoter: GATA-3 promoter: RAG-2 murine the of regulation Lineage-specific l , , Katherine Rothamel 26 . obntra ptenn o crmtn euaos noee by uncovered regulators chromatin of patterning Combinatorial 15 15 DACH1 regulates cell cycle progression of myeloid cells through cells myeloid of progression cycle cell regulates DACH1 erig pir n euaoy oeta fo eT data. eQTL from potential regulatory on prior a Learning NCE ONLINE PUBLIC ONLINE NCE A
t al. et
no designed Single-cell Mass cytometry of differential immune and drug responses
The Immunological Genome Project: networks of gene expression 37 Nat. Immunol. Nat. Nat. Immunol. Nat. Differential expression of Rel/NF- Nat. Genet. Nat. T.S., eihrn te rncitoa ntok f h dnrtc cell dendritic the of network transcriptional the Deciphering Intrathymic programming of effector fates in three molecularly three in fates effector of programming Intrathymic F rgltr fco- ad 8 oen edii cl subset cell dendritic govern -8 and factor-4 regulatory IFN , 726–740 (2007). 726–740 , N Structure and function of a transcriptional network activated by activated network transcriptional a of function and Structure , , Brian Brown , 16
CIA competing KLF13 influences multiple stages of both B and Tcell B ofboth stages multiple influences KLF13 , 884–895 (2007). 884–895 , 11
esl itronce tasrpinl icis oto cell control circuits transcriptional interconnected Densely
FEBS Lett. FEBS , , Paul A.R. 37
13 , , L 7 , 1038–1053 (2009). 1038–1053 , M I , 2047–2055 (2008). 2047–2055 ,
, 888–899 (2012). 888–899 , the
N Nat. Immunol. Nat. and Nature Proc.Natl.Acad.USASci.
ichael ichael Brenner T 40
experiments E
financial M
14 Blood D.K. Cell R , 1300–1306 (2008). 1300–1306 , 9 10
585 , 1091–1094 (2008). 1091–1094 , E
, , Joseph C Sun onach
467 , 618–626 (2009). 618–626 , STS 67
Science 144
wrote
, 1331–1337 (2011). 1331–1337 , , 301–320 (2005). 301–320 , 95 , 338–342 (2010). 338–342 , 15
Science , 296–309 (2011). 296–309 , A interests. 9 , 3845–3852 (2000). 3845–3852 ,
J. Immunol. J. Blood TION , , Vladimir Jojic , , 13
and the
17 M 325 + , 511–518 (2012). 511–518 , T cells via the Kruppel-like factors Kruppel-like the via cells T Nat. Genet. Nat. Cell , , David A Blair
Biochem. Biophys. Res. Commun. Res. Biophys. Biochem.
manuscript iriam iriam
13 κ participated
113 Blood , 429–432 (2009). 429–432 ,
332 B and octamer factors is a hallmark Cell ,
nature immunology nature 147 11
, 2906–2913(2009)., , 687–696, (2011).
, , Francis Kim 132 174 95 110 , 1628–1639 (2011). 1628–1639 , 14 M http://www.nature.c , 277–285, (2000).
34 , 712.e711–712.e712 , , 2573–2581 (2005).2573–2581 , , 2946–2951, (2013).
, with erad
in , 166–176 (2003). 166–176 ,
writing
input 1 15 , 18 ,
from
the ,
12
PLoS J. R. J. om/ ,
© 2013 Nature America, Inc. All rights reserved. Children’s Hospital, Boston, USA. Massachusetts, 21 New York, USA. of Medicine, Boston University, Boston, USA. Massachusetts, Institute, Mount Sinai Hospital, New York, New York, USA. Boston, USA. Massachusetts, USA. of Technology, Cambridge, USA. Massachusetts, 9 Kutlu J Derrick Rossi M nature immunology nature Division Division of Biological Sciences, University of California San Diego, La Jolla, California, USA. Department Department of Biomedical Engineering, Howard Hughes Medical Institute, Boston University, Boston, USA. Massachusetts, ichael ichael 12 Joslin Joslin Diabetes Center, Boston, USA. Massachusetts, E lpek L Dustin 19 23 Fox Fox Chase Cancer Center, USA. Pennsylvania, Philadelphia, , , Angelique Bellemare-Pelletier 22 , , 18 N , , Susan A Shinton
idhi idhi 14 aDV Department Department of Microbiology & Immunology, University of California San Francisco, San Francisco, California, USA. A M NCE ONLINE PUBLIC ONLINE NCE alhotra 11 3 23 Division Division of Rheumatology, Immunology and Allergy, Brigham and Women’s Hospital, Boston, Massachusetts, , , Katelyn Sylvia Dana-Farber Dana-Farber Cancer Institute and Harvard Medical School, Boston, USA. Massachusetts, 19 , , Richard R Hardy 16 A Department Department of Pathology & Immunology, Washington University, St. Louis, Missouri, USA. 13 18 TION Division Division of Immunology, Department of Microbiology & Immunobiology, Harvard Medical School, Skirball Skirball Institute of Biomolecular Medicine, New York University School of Medicine, New York, 23
, Deepali , Deepali 3 , , Joonsoo Kang 20 Computer Computer Science Department, Brown University, Providence, Rhode Island, USA. M 19 alhotra , , David 10 Broad Broad Institute and Department of Biology, Institute Massachusetts 23 3 L , , Taras Kreslavsky & Shannon Turley aidlaw 20 , , Jim Collins 23 22 23 , , Anne Fletcher Program Program in Molecular Medicine, 21 , , Roi Gazit e c r u o s e r 15 22 Icahn Icahn Medical 23 , 17 ,
Department Department
11 © 2013 Nature America, Inc. All rights reserved. was The 3.9 genes, coarse was of measure. modules separate modules with although data. of cal (SPC) grained modules. of Definition 10% sion lineages. induced in one for lineages NKT locyte, signatures. Lineage-specific September which with set formed. pipeline Data preprocessing. continuously reconstruction, each lists ples (244 the clusters connected only 2012 of Report March The 2012) three ing, Protocols/ImmGe populations arrays set. Data METHODS ONLINE nature immunology nature grained
Clustering Each As the September
MATLAB
clusters
each with setting.
regulatory
fine-grained
lineage,
regulatory used
clusters are minimum 1 all in was
a to
default
cell sample release. cell,
The the
2 different
with
algorithm (Affymetrix
resulted s.d. because
(23 7
modules. macrophage,
2011 that papers
the
,
the
modules).
from of
presented if is the For coarse-grained
cluster
modules, a
A applied
by by
data C1–C80); The
types), because
Expression
γ
they it
clusters in
principled value
the
attempts coarse-grained
δ to 2011 gene
but data
SPC lineage samples
followed
had affinity highest
the genes
are
parameters, releases,
T and Supplementary Table 2 Supplementary
if
Ontogenet software).
updated.
the resulted
the ‘self-responsibility’ set
2010, that 6,997
program. cell it with program it
number
ImmGen instead
can
(C81).
available significantly
the was of above
release robust was
Affinity
to to must
excluding modules the
earlier was of hematopoietic
n
SPC
with refer
defined annotation on and
the
than clustering
the
form
Cell to propagation
7,965 mean the the genes presented
monocyte, have
considered with in
by
Expression
used
regulatory
approach
of chosen
used
the
maximize work in
0.5 676
and
stem analysis the
post-hoc
more
identifies ImmGen to and
multiarray
of
For
remaining mouse in module
prep reconstruction
analysis
was
The
which
a 334 propagation on
Modules
releases
744
unique
full
variable
expression with
for them.
because
new
fine were samples modules) all last by hierarchical in
control
with 8,431
each and
the higher
applied
We heat than
to
and fine-grained
the
other
samples, tree version generating SPC was
heat
an for ImmGen DC,
modules
resulted cluster) genes of
repressed ensure
data ImmGen
program nested Clustering
analysis backward clustering progenitor calculated
tree. a but was
entire genes
gradually
parameter
of sort
expression the in
variance
maps
(September
one
(210 choosing ‘sparsified’ it
were
degrees
done are expression
(195 average unclustered maps lineages.
B
samples.
the . the
were used to does
was and
One-way ing was further cell, 31). number
could Thus,
probe
stable consistency
with
647
was the data
cell in and
clustering
in presented
April
11
hematopoietic (functions nested defined
release
SOP.pd
by retained.
if
and the are
website normalized
the
measured
Sorting only NK not
2 compatibility 80 modules a
P
signatures
was
ImmGen grew 8 it lineages,
of cell.
set types).
of method. final stable
(which , a not
set we
A single
value values
set
superparamagnetic
with For from had
stable modules,
affinity
across
partitioned difference 2012
maximum tree. in cell, require
which
2010,
inherently
false-discovery analysis compactness.
some
at
done
were
in maintained
genes
on
Assignment be
that f (April
from
presentation by
). significantly 0.01. simplicity,
(
strategies
here
did clusters correlation above a
http://www.immgen CD4
Of
the anova1
release.
used The Supplementary Table 1 Table Supplementary the coarse-grained of with
indicates clusters (F1–F334).
coarse-grained
on
clustering.
a on March lineage used a
of release
Data
remained matrix.
as
for
all those, 2010
a not
were range gene
September
Affinity array, of
+ include
cell 2012) the
regulatory Affymetrix the
ImmGen
part as predefined
in
for
the log T genes. 11
was
variance
supported for
into
further
and much from
expression cell,
were
samples
ImmGen
types) to for 2
(coarse-grained
lineages:
clustering
membership was 2011 than
grouped other of of only
of (120) only
of
only the
the
SPC
and 11 as 2012,
were lower rate
multcompare
the the
March
fine-grained samples CD8 755
propagation On in
parameters,
For a
the as considered
log propensity
in
clustering, (7
probe
clustering hierarchi
website and
the
720
break that 2010 in states the
ImmGen was ImmGen ImmGen
(FDR)
program
possible. was
module.
Mogen1
done average,
all samples number 2
to
module cluster +
coarse- coarse- expres affinity
at release granu
by
into
trans
T
probe
in
2011,
other April April learn of sam done
were .org/
least
into cell,
and sets run
the the the
for
on
all
in of
is a ------
oue eltp seii vrac estimation. variance in specific cell-type Module (modules) much parameter weights to ate similarity activation a bone lines, knowledge lineage the building. tree Hematopoietic ( regulator ing sufficiently that ing and 2008) in mouse, genes used the regulators. candidate of Choice tory specific where member of linear be members, can ule. between are related possible aims approximates the regulator linear types. a into expression program. regulatory Module or population, macrophages). this Supplementary Table 12 Supplementary
predefined node
TRANSFAC have More Notably, a arbitrarily a assigned
constant inferred regulators’
members following were or
be
That hypothetical
genes regulator modules
program were given
as to 3
marrow,
Fig. Fig.
annotated
combination smaller combination It
0 each
ChIP assigned
cell
in
tree, a and
human specific candidate
to then explain
formally,
cell while of
includes
removed, common
in
in
that
1
the not but
with step, tying
a × types
profiles
cell
available all
module ).
ε
(s.d. experimentally
to
each
one
nodes regulatory
combination across and followed m,t
types (regulators)
set
constructs
(
There
(the
of
the than sources:
learned r regulators Supplementary Table 13 Supplementary regulators tree
expression it.
measured
and passed
remaining matrix
For
type or
to
published is in
time
regulators
of enforced
as the < in of with
population the cell
expression
a
we regulators coarse-grained
a
regulators rat;
0.5
a
candidate (
(square,
across progenitor the the of were
the
long-term much all unless Gaussian
the of Fig. Fig.
combination
m
are
cell at ImmGen
s sublineage
type.
(as model
regulators the
,
across by the
conditions.
by t m database genes
the 2 ontogeny. present,
nominal
mouse in edges
program populations two
, 3
a w X type
added
in
are that
microarray
gene-ontology
× ). Ontogenet on cell
multiplied many
by
expression regulatory of t i
is of
). data they A as , cutoff. that
determined (cell
the
Fig. Fig.
roots
was
the
considered estimated
module’s the that
for the a the
pattern regulators; the
= random in simple was t are type
orthologs
Consortium. stem or module
Ontogenet as although
are ∑
obtained
were
to activated The
(version
different size
a with
trees expression
types).
which
( model, not entire
1 gene-expression In of
array are members
common Supplementary TableSupplementary 13
a published
manually to and
). r This
Here,
group t its
r,t
Candidate
module,
a
as
cells of
Edges
that
hematopoietic
measured
the
by .
members of program as r m possibly is
regular highly
of
(
were
members We variable regulatory
fine-grained Supplementary Table 11 Table Supplementary
,
the
m data
represented position there or the from each
resulted
possible the
). the
, ,
tree: as
and
we
term
8.3) r t some of
cell connected
by
cell from a T
and model
regulatory
whose takes
potential
connected regulatory indicate
module.
of
all
regulators.
of allow effective
cells) regulator pruned, set)
ChIP
m t Each
correlated study 2
regulator’s
an linear
types;
is
long-term populations 9 the
distinct
the ‘transcription
+
a
for a
, with edges regulators the (for of
fetal
the a
ontogeny
and
cell
gene weight
e the
to
in
and
the
program
known the
expression module each a expression or
group
followed variance
genes
of
example model, be ,
JASPAR
a change clusters
t 0
578 tree in
Each type a
regulators following
liver.
to
regulators
are
either number a
expression partitioning module.
in mean human program
being and activity
The differentiation included
on program
general
terms
module We
more matrices
(>
activity
a
stem candidate hypothetical
( created
encoding DNA-binding
t
Fig. Fig.
the module regulatory tree
cell
. were
doi:
will Each
that 0.85) the is
). denote
of Hence
, module
the and described by in
database
consistent
factor granulocytes
Some
the
than
of
hematopoiesis basis
of of weights
activity 10.1038/ni.2587 type. cells did
Thus,
relating activity
1
not
as
the representation deep
assumption
that
consisting were
for less
in )
weight,
curated but
population variance
w parameters (PWMs)
with linear the ).
of
its was
input:
m of
not as
the
the
the necessarily
one Regulators
intermedi from each module activity’ of
Because
,
regulators molecules a
are r likely
for own
a
the , sequenc
module’s a assumed program
variance t
gene
(version
(dashed
another for the step,
built
module
weights
cluster weights activity activity above);
regula
(noisy) change the
sum
parent across not
which which
gene- mod genes motif adult
from
31 each s sub
best
and one
of cell
i
, t m
2
an
by its 3 ,
in as of of of
is is
3 2 , a a ------; ;
© 2013 Nature America, Inc. All rights reserved. D is (RT), across form written type between The of two type an eny An which known addition in correlated a sparsity Σ sion any cell another parts, where regression extension of the Regulatory program fitting as a penalized regression problem. are still pooled 10 ased member doi: point convergence lems. such value Optimization a of term the
n t
single regulatory
1 1 the computes m the
densely a We Combining | ‘over-parameterization’ edge members.
important
10.1038/ni.2587 typically
activity
regulatory
types. (differentiation) w specific one consistent
related key
∑
problems
t estimator, t
as
||
1
Alternative 2 as m,r,t where
method number difference is
one assume . . Dw we J
all
t i
as
as (
More variance
The
,
gradient list representative. observation
a
and module. w || that of genes regulator
2
Dw
write ‘Elastic |.
Σ Σ cell m ) non-smooth
interconnected s of regulators
that problem
squared
, ( ( weights ||. is
cell
However,
to t m t t
the similarity
2 R f program
smaller 2 1 is thus
promotes the m generally, ), For
of
, 3
a
Thus,
input types
is of that
4 all the is
regulatory relatively ||,
programs
across
we promotes
and .
( chosen compactly types
differences this
, ) of
n
the Different
w x estimator
∈ to
descent
edges
A fused 1 1 methods, these where the m t i Net’ prunes
cell
, make r f
activity
r
only
terms
introduce
standard with
w
of
∑ objective to
with number is concatenated
l − and than
tree
may
considerations m,r,t
(
all of
1
∑ type. this that the t | ∑ our of t i
1
1 in , Such penalty. modules w w the function consistency
Lasso ,
,
a
special slow. a
w
(
replicates
as r
2
module r m is this the
r
t
change ( many
takes 2 1
t sparsity
the
program the regulatory
all s
small ∑ ∑ module 2
form m
Supplementary Table 13 Supplementary choices
1 weights regulatory | as penalty
Σ Σ across
)
of total between , such not proposed The w w t m
l
∈ 2 r m , ,
t r m
is
of
t aggressive m t be
regulatory
approach 2
is
r m ,
2 1
term , variances || | | f tree. We an the )
framework
w
, , a , , can ∈ −
somewhat the w w regulators, (
feasible,
r t
highly considerations biologically
In
estimated
number r m w a
w x
difference as Σ
the f vector L
m m t and
throughout
t i therefore m 2 1
, , modules
m t we
problem, , and
1
of
m
in together
We our t as ) ( form tends of projected || be
throughout −
penalty, w t
) 1 − relevant of is
variance t r
circuits. 2 the above,
is
the
denote a
the before program hence + , ∑
+ assessed
the
+ + note selection correlated regulatory , , penalty case,
given
to k t r
in
pruning 2 k |
of
programs 2
as , , of
parameters two r l .
of t r to
cell || 3 complicated
We
promoting
∑ variances its
3 same
T ||
activity 2 these
a of
opted
with
, r m regulators
that and , the
be 6 |
relevant
it
the
direct
this
is regression
m entries
that programs, which
which r
, ,
gradients,
by parent = That type. will differentiation, the
estimate m
, , must
the ||
fitting r t for as overly will
0 differentiation.
complete
J a
|| 2 2
sum
multiplication
in D more of may
t work
if 1
penalty
cell regulatory to
write a ) ( predictors
programs
t number
the
w can for the
Although is
optimization
cell correlated weights yields sum promote
)
gives use
of ). in r m
coarse-grained
2 λ be
k
because
of 2
type a be
such
procedure
, ,
,
This
aggressive by a modules
+ regulatory
than
the are be only
a s
κ
||
this matrix
type Σ absolute problem, can ‘regularized’ a
t
particular inappropriate, of
fine-grained D w
t m
2 and
m r
the
is primal counteracted ) ( a objective
2
rise t vector , m
actively w
|
sparsity
1 the
of composite
tree
term
w w composed be ten on for
is
|| is programs t fact the
and between ,
δ we 2 of 2 2
cell
the
computed predictors
to
a used, , , absolute of
yield
and
smooth
+ t r
with by
all members is Estimation
‘redundancy’
parent
preservation
values
is in dual use that
w
by
g relationship in
a but selects 2 1 size types smaller encoded
the
m the module
regulating
regulators a || − penalized
for in its
inducing . module.
methods different
but less
an compact
absolute
The because with
module
interior
penalty
regres (RE) related
matrix ontog w parent r m
of
can
fitting
by of
as Σ and prob unbi value
m , , their
than
only
m by
two and
less cell but
the the the |
t
an
be . | Σ
in as of 1
× E | a - - - - r .
ing based hibitively However, λ yield combination and already plished to fine-grained and modules. coarse-grained between transfer program Regulatory Finally mean otherwise. predictor Bayesian values ule complexity regulatory our criterion. information Bayesian with selection Model point This the We have program-fitting problem. optimization prototypical the Solving modules. where coarse-to-fine regulatory y t ,
that To 2 1 κ
=
for
penalties:
reformulate from
optimization δ ||
and
reformulation s been
regulatory
;
method simplify
expression A y
on
we
between 1 1 the each mt
of
we learned by + −
the δ information
different the
obtain Σ This matrix
The
a
expensive.
last expressed
the can program
m i
Furthermore, is properties. ∈ of w w 2 1 search models
Bayesian coarse-grained 2 1 m || of
through
3
the
fine-grained
introduction term 4 version objective absorb
n the 2 1 A y
0.001 ||
. minimize regulatory
programs
that λ m
problems minimize a 2 2
A + −
profile , parameters
problem t model
x
0 choices y κ
0 for y enables discussion su
w of t i w ties of
,
as optimization l
in and w w and 0 bj
As
. −
criteria the
size information a || these m the of
We Hence
ec
the
the solutions, fine-grained
− can
of
||
by an our t t δ
20) 2 1
we m
2 2 term
w, of
A
k
scanned use .
for (RT) program
modules straightforward of
above the
following || o
programs
of solving || Different
d z alternative, I parameters
be different
module
A k 1
t X y λ and note ,
objective
these
we
l
(described both an
, I of + + w w of I
module’s + −
transformed k κ || 2 × m m
k
2 w w can and additional chose held-out the || y r
is
problem (T), that
w w m m
2 2 m
|| w
2 1 criterion sets with
− = three
the coarse-grained are of dependent
D w r r
m + +
|| in module
||
rewrite 2 2 general
δ optimization, quality. ′ combinations of 2 2
1
m
a
|| where g l
is
was convex ‘encouraged’ the + + of which the
+ +
+ + 2 2 coarse-grained with || genes the we
||
different then below). g l parameters
2 2 into parameters k g l Xw
g l 2
derivation
data optimal || optimal by geometric,
||
|| use penalty || coarse-grained (BIC)
into
the
that g , , form One cross-validation
a A D w d z
|| they adding
w z the and problem
1 r t
|| || on || m
or ,( = =
a 1 1 1 || objective
1 || we
first
nature immunology nature way + model + −
w w to
2 2 the
through we are
data-fitting 1 and set We
w to we , ) term. m of
|| t T D d wish
compare
and of in m variables
||
e || can Dw have ||
g nested. to of set term
these ||
−7
note 0 1
module above introduce Dw depends fine-grained a the ||
Dw
parameters identify − + The = , primal
selection
||
select m
to of We e as introduce w
t −6
a
2 || range w || and cross
as
m
t r learn 1 .. that parameters program 1 parameters
|| ,…,
m
formulation for will
.
This
|| . models would follows:
. |
that
/ 1
||
as the
s fine-grained
and dual 1
only -
e a the regulatory-
(the
mt validation. . as denote w 3
the
particular
is approach spanning decouple best 0
w
w with
variable
and optimal module interior accom and model- be
similar m sched on result m sparse
.
||
pro one.
λ
The will
2 2 = the the the the
,
of
κ 0 - - - -
© 2013 Nature America, Inc. All rights reserved. w w cse.ucsc.edu/do browser We scanning. Motif (although example, constant step model be act lineage activity regulators than active in ranked lineage lineage a in deemed regulators. lineage of Choice ferentiation’. lated name regulators. of specific functions known for query Systematic ship and of relationships tory programs. of regulatory Post-processing with being Trace( umns corresponding edge Given then construct gle particular but The likelihood on This nature immunology nature
r m lineage
Notably, Hence
5.5 which excess
their under-represented , ,
only degree
downloaded intuitively
the
computation
(described could program criterion
a were m t construct between with
2 1
of of
in on the B column in two
= the
regulators
repressor and
(
weight
target
module, during
website tradeoff active B
B
a the
the activity GATA-3) the LL of
the
we
activator
BI
its
′
number
regulators lineage they
that differentiation,
connected B of with for be
regulators
nodes this
all ) ( 9.0
w C
, ,
lineage(s).
+
window
w All can t r counterpart, for regulator
log freedom.
spurious.
) ( compares =
in our
module these κ
by
of are
modules
across simple:
too to
a a
on wnloads.htm
the
− = of
. ∑ above), procedure,
diag( PubMed
2 underexpressed
between terms
compute
limited we We
graph
The
weight
B. if promoter
having
and,
of
connected scale.
− =
identified model
the corresponding zeros
m
of the
and
are its if ∑
highest
the
automatically
two
∑ that scanned
in
c columns on 2 portion
its lineages. University matrix
average
m when ))
by log chosen may
had LL To
of thus t i an
which For
in
models, , degrees At
regulators was and
−1 ∑
window
the even
a
nodes average
is were
s queries
) (
data
this
2 BIC,
B. the make
w w which we nonzero
activity
B
t i
this
t m a scale. 1 2 each ,
very although expression
sequences be ′
basis
active,
the by positive.
We , ), l role. components
2
B
).
+ promoters
queried
BIC(
they
if of
s analysis as log
infrequently activity
( where
under-represented
of we
For the For cutoff,
will df
1
w x if
of t m
final here
their
this 2
regulators. of
use well
module
regulators the t i
(for
nodes
were For
activity , , A of ) (
likelihood to Hence, cell
queried
use freedom
act.
w
weight California with model. had activity
each
each (
−
have in
their )
A x more
differentiation
‘encoded’
A degrees be example, diag(
straightforward, lo a Analogously,
expression t i ∑ as
r,t its
type , weight
post-processing Second,
the t r done the
yet for typically given
g higher
,
correspond
high due to gene,
r
− of 1
T lineage,
that
regulators columns
average
the
weight in
Once
and
c formal, correlation that First, ∑
connected its would
mouse denote
mouse weight r m active ) We t is in
on 1
to and
the is
,
of lineage, r Santa baseline regulators was
somewhat
regulators was
is we
the
, , expression
a w
early
by because a r t
A placed noise
remains 30
w a freedom
we r m those a diagonal
were
graph. t r
degrees
is we
scanned
expression across
, regulatory
, or
model).
be upregulated in
negative. parent t July genes and we a (mm9)
2 that
a , , Cruz obtain
) very
, r t
in with
to column tree 2
had the
collected regulator
captured the and
we
consider
given component
a +
between lineage
expression differentiation)
will
significant
columns 2012. of
t
to are
low We
df
recruitment
that
) the
technically in
lineage for ( all low are
is
of deemed 2
‘hematopoietic graph
the were http://hgdown hence of matrix
remove the an For
) (
from
+ not
PubMed
counted
sums
freedom.
eliminate
We expression
cell enriched the
programs, given
cell same of
across expression. were post-processing cons optimal
region
regulators l
the
ranked was or oog
each in
matrix matrix reflect
the subsequently
highest will the of types
specific the
type with
of
T
downregu
t
a the
expression can regulatory frequently
associated by .
through regulators
matrix
. deemed
predictor
regulator cell
relation columns involved as
with
lineage- genome
have
analysis starting
The df(
regula
motifs.
all
overall t entries
higher have
all
would A.
in A 2 cutoff
a based types
, load.
total
w
that The sin col
(for
and and
dif
We the the the log
) an A.
= a a a ------
P Genomic PWM the PWM at resents by experimentally matrix element the site). ending from for old Motif-scoring threshold. region, genome. separately table each of in ing by targets. position from used unique original database 11 Table followed enrichment. events Binding fine-grained significant matrix ules i we expected background. ule k modules. in enrichment Motif LOD sidered quantile We
, in b
we (‘G’)
the enrichment. The the
P
the
computed
the transcription defined relative by the to
log k by
(
, whenever the module of position We computed
i,k
a
‘
versus clustering k
the using
i
of
mouse ‘
hypergeometric
database
a the defined module In matrix -th’ k , regulators). at = likelihood
P a
))
by
200
by
of
position
-th’ we publications
one-sided )
MAX represented We
P
repository,
‘hit’
LO values
background data 0.2, exceeded position for phylogenetic
were
probability ‘hits’. the values deep a the
position. to
computed
base
the
motif modules. PWM. i D
a An
then the
with
coarse-grained
–1,000
that
gene for
-
of which , ( sets
given PWM-specific PWM determined
available. LOD
as M
the
sequencing information
downloaded
(version
start of
The size P were enrichment pairs 1,000 k j
p
) ,
MAX the
and
is obtained
found ratio value
each of e
IC all +200
rank-sum P ( symbol the The We
scores M
random defined
4
For (base
FDR the LOD = value represents site) supplementary the ) (
‘
( modules downstream
considered of L k the k the × ,
base was
upeetr Tbe 11 Table Supplementary k -
P
r threshold compiled ∑ -th’
L
resource
target (LOD LOD
by )
target L =
Otherwise, k encountering We (base-pairs nucleotide each
+ = entire value the 1 8.3)
and
k
for
scores local
PWMs was ,
distribution pairs
for
comparing
for determined and pairs with where lo motif wherever
k k
automatically
content as
(
and
genomic threshold
2 gene of best ( g test.
i declared from
the 9
gene , enrichment, list.
score)
2
list
calculated
k)
, ChIP was
and For motif-matching
set
i ∑ The a
L
upstream r P human the
upstream the =
for
EnsemblCompara
k fine-grained 31 a
i k τ
=
1 genes
motif targets at L
We i motif
defined
k
of
set
An
∑ ,
j of k each 3 as .
max
calculated i the
all
4 =
at
the genes JASPAR 2
, ) , the
is of followed nucleotide
1 downstream
. the S
public
material
a for
S a genes
then i P
background:
the We of
position
all the
the
targets FDR ,
for
i,j each across
position
as one-to-one in
samples, r j GEO j
entire
in
, ( LOD . − + module ‘
instance 1,651 transcription
i
observing
P for We
separately scores
-th’
denote
computed of of the
this length
nucleotide that
P
in P j a used values p
)l 1
b
assigned
of motif.
e the the
(‘A’)
module
the represented data database (
(
by og
each
‘ (Gene
gene i all lists.
M
modules.
− set
k module; for PWMs ,
had 10% and j
j -th’
2 score , microarray
transcription
composition , MAX transcription an along lo
k gene (relative of
purpose k the of genes’
;
of = ) that
over the ) of
( g
k
.
360 sets The
original
a
The motif genes
FDR if the of 3 2
, (
designated ortholog
P
was
the
genes a 5 PWM j i for binding
Expression the P
. j b results to
symbols the
-
LOD(
size PWM-specific
the
start i b from passed (‘T’)
LOD (version (that
Only
the
motif ) to
the
information experiments
promoters. FDR
obtained
sequence coarse-grained S modules
doi:
used of
transcription as
each
M to capture best
motif
, of of assigned
promoter
r j of
site
5% ( ( entire
− + τ ,
publication i the
=
is,
Supplementary 10.1038/ni.2587 i,k
–1,000
, genes
the k and
in
j intersection and
event
was
the
this , start exists
,
of
start 0.3, cis for
were k
score the ) A, 1 were
on
), TRANSFAC
2008) higher )
in sites for
calculation
-regulatory
‘
the
each
P Omnibus)
k
defined this C,
calculated procedure serving
the
1 the
promoter given site the
by P -th’ k
included
reported
site)
We replaced all (
to bp
–
listed b accord
content
G i
( thresh j (‘C’)
,
mouse
of
in 3 MAX 2
j
effect,
entire
entire
to and mod mod genes motif ) 0 motif ChIP
−IC(k) or from
than con
start
rep
was and and and 109
the the the
T)
as as as of
= a ------
© 2013 Nature America, Inc. All rights reserved. option. child that repressor sor posal activity weight change weight poietic regulators. of weight activity in change a with steps differentiation of Identification different culated table module functional EnsemblCompara. the broa (version gene enrichment. Functional two score fer the modules regulators. tions each ChIP modules. dates interactions, estimation regulatory two cis P overlap. program regulatory of significance the Estimating doi:
-elements A values
in 10.1038/ni.2587 empirical recruitment
mouse
or groups,
dinstitute.org/gse of
hypergeometric
module.
ontology
of activity of for (parent for significance data
three
of
the is
tree,
(activity
the Those separately weight P
with
V.3)
have classes
Ontogenet three
0; the strengthening The values for
interactions
set, of group
child); gene The
regulators For
child
whereas the
groups
and
were weight the the Second, each ‘child’, P activity
more
each
ChIP
lineage-specific
(GO) and
groups
each is values
last
activity (parent
of weights symbol
of significance
Ontogenet). ‘universe positive members activity
repressor for downloaded
for functional
functional
the
all P
Only for
regulators module which overlap
is
interactions
and gene the
P
values we weight
in the
are coarse-grained a
0). explicitly modules
value including which Ontogenet )
weight activity
the
3
wherever (parent are calculated empirical
had
6
three-method For weight genes similar and Curated . sets
size’ resulted
for For
regulatory
and
the
were disposal was
is
of
of
ChIP
simplicity,
group. First, the
annotation is
(C5)
than
the positive
regulators of
from
the
is
each and
same); weight each included smaller
the takes
activity
calculated is
for the
are
‘universe calculated the
the
a
purpose
interactions
P gene
an positive);
information overlap
we in
others. from
one-to-one
all
the
the value three overlapping An edge
the
( group,
‘parent’
number empirical and
into
one
interactions,
calculated parent
activator and is functional
overlap, we overlap than
Broad sets weight FDR
enriched
the 0;
(C1,
in (differentiation that
fine-grained of
does account
of sizes’
The
regulation of omitted for child child
to
gene
the
(C2), the
activator Molecular
that ChIP was
the activity of
C2,
of are
the account Institute
are
P
hypergeometric is not.
multiplied of
ortholog because
recruitment
10% were following
clustering
activity
regulators regulators.
activity
value
the calculation compared
of
negative symbols
groups. C3 size assigned
modules each
motif and the
the including the
the
hypergeometric
was and the
weight
strengthening
of
overlap regulators from models
‘regulator Ontogenet for
modules,
two
website Signatures child);
the
weight
intersection step) gene weight
number exists
used C5).
The
the classifications:
and
by were
constant were methods
according that
For 10,000 with
hypergeometric
(parent the of is
We the of
fact
sets between
FDR
for activator
is
( negative (from
according enrichment.
P is example,
is
http://www. were
replaced the overlapping
weakening’
and the considered
larger
chosen report number of values
regulatory 0);
negative); that
the permuta Database (C3)
hemato
was possible
but activity activity activity (parent
repres of test
for
candi to entire
ChIP, some
than each each
dis
and and and two cal dif
the the
for for for
no
by to of ------
27. analyzed XX culture (LG.3A10; (UC3-10A6), (ebio17B7) CD24 described and Flow cytometry. sis ~80% starting strain). deficiency moter C57BL/6 Mice. analysis weight 37. 36. 35. 34. 33. 32. 31. 30. 29. 28.
of
Blatt, M., Wiseman, S. & Domany, E. Superparamagnetic clustering of data. of clustering Superparamagnetic Domany,E. & S. Wiseman, M., Blatt, hn, . Vredn JM, asl, .. Sn X FFrgltd t gns are genes Etv FGF-regulated X. Sun, & J.A. Hassell, J.M., Verheyden, Z., Zhang, A. Subramanian, A.J. Vilella, in L. Vandenberghe, & and S.P. Sparsity Boyd, K. Knight, & J. Zhu, S., Rosset, M., Saunders, R., Tibshirani, M.F. Berger, Badis, G. JASPAR: B. P.,W.W.Lenhard, Wasserman,W.,Engström, & Alkema, A., Sandelin, V. Matys, Frey, B.J. & Dueck, D. Clustering by passing messages between data points. Rev. Lett. Rev. (2009). essential for repressing Shh expression in mouse limb buds. (2005). 15545–15550 profiles. expression genome-wide interpreting for vertebrates. in trees phylogenetic 2004). UK, Cambridge, University, (2005). 91–108 lasso. fused the via smoothness preferences. sequence of analysis resolution Science Res. Acids an open-access database for eukaryotic transcription factor binding profiles. eukaryotes. in regulation 315 Labeling
intranuclear
Cre-activity-reporter
deletion (HSA, of
Mice
, 972–976 (2007). 972–976 ,
supernatant across
The but
in
the with mice 2
4
324 (CD2p-CreTg
all .
CD25
and
et al.
are
lox t al. et 76
gene Kit with The
M1/69), 32
anti-V FlowJo
all from , 1720–1723 (2009). 1720–1723 , efficiency with , 3251–3254 (1996). 3251–3254 ,
P-flanked
t al. et part t al. et
(suppl. 1), D91–D94 (2004). D91–D94 1), (suppl.
Diversity and complexity in DNA recognition by transcription factors. (Invitrogen). anti-ROR Intracellular
RNFC n is oue RNCme: rncitoa gene transcriptional TRANSCompel: module its and TRANSFAC + cell staining following encoding lox et al. et CD44
and a
of BD aito i hmooan N bnig eeld y high- by revealed binding DNA homeodomain in Variation δ
nebCmaa eere: opee duplication-aware complete, GeneTrees: EnsemblCompara
P-flanked transgene
software 6.3 types anti-CD44
the Gene set enrichment analysis: a knowledge-based approach knowledge-based a analysis: enrichment set Gene
was
Biosciences). −
in
+ Etv5 (8F4H7B7), CD3 Nucleic Acids Res. Acids Nucleic
γ Etv5
model. (FoxP3
mice t
γ
CD2
(such biotinylated antibodies δ
(AFKJS-9; staining
Data
thymocyte
− locus (Treestar). fl/fl
CD4 encoding
Etv5
(CD2p-CreTg
to Genome Res. Genome (IM7), ,
. . tt Sc Sre B tt Methodol. Stat. B Series Soc. Stat. R. J.
as were Staining backcrossed
generate
−
is ovx Optimization Convex
(Cytofix/Cytoperm GATA-3) CD8
alleles
anti-CCR6 Anti-V specifically
all were
acquired
with anti-CD62l
subsets, Cre −
from Cell
34
thymic Kit; mice
(
used: the , D108–D110 (2006). D108–D110 , recominase 19 γ Etv5
+ 1.1
Proc. Natl. Acad. Sci. USA Sci. Acad. Natl. Proc.
133 will Rosa-STOP
three
, 327–335 (2009). 327–335 ,
eBioscience) eBioscience);
FluoReporter on
as nature immunology nature
with
fl/fl (140706) (2.11) deleted , 1266–1276 (2008). 1266–1276 ,
anti-TCR
not
inferred precursors an (MEL-15),
) times 3
T C 1. (Cambridge 11.7 Ch , 7 LSRII
Dev. Cell
be Kit;
were cell–specific
was driven
from fl/fl captured
to BD and from
δ
(BD)
purified
-EYFP).
were
crossed and
the Mini-Biotin-
(GL3),
(DN3) Biosciences) anti-IL-17A 16 the
by anti-CD27
the
, 607–613
C57BL/6 and
anti-V the done genome
by Science
Nucleic
analy ETV5
from
anti- Phys. were
with with pro 102
this 67
as γ 2 - - , ,