RESEARCH ARTICLE

Figures and figure supplements

Cooperative interactions enable singular olfactory receptor expression in mouse olfactory neurons

Kevin Monahan et al

Monahan et al. eLife 2017;6:e28620. DOI: https://doi.org/10.7554/eLife.28620 1 of 30 Research article and

Lhx2 ChIP Ebf ChIP A 2 2 BCGenome Wide OR clusters Lhx2 Sites mOSN 1 1 Lhx2 Sites Ebf Sites bits 16,311 Ebf Sites 114 65 C C G C A AT T T A CGG C G G CC GTA G C T T G A CA G 0 TAATTA 0 CTT AAG 9,024 5 5 10 2 2 2 11,519 4,792 4,232 51 63 Known 1 1 bits

G T A A CG C GC C C AT CT GTGA CT CT AT GA GA 0 AAT 0 C C G G 5 5 10 p = 5.7e-09 p = 1.6e-15

D E ATAC ChIP Lhx2 ChIP Ebf chr2:36,107,000-37,507,000 Crete 300 kb 25 - 8 40 20 ATAC 0_ 4 - 20 10 90 Signal ChIP Lhx2 0 0 0 0_ -4kb Island 4 -4kb Island 4 -4kb Island 4 40 - ChIP Ebf 0_ 9 - ChIP H3K9me3 -2 _ Genes / ORs Greek Islands chr6:42,161,494-43,561,494 300 kb - Anafi Sfaktiria 35 0.5 6 0.5 27 0.5 14 ATAC _ 0 - 115 F ATAC ChIP Lhx2 ChIP Ebf ChIP Lhx2 8 40 20 0_ 85 - ChIP Ebf 4 20 10 0_ Signal - 9 0 0 0 ChIP H3K9me3 -2kbTSS TES 2 -2kbTSS TES 2 -2kbTSS TES 2 -2 _ Genes / ORs

chr2:36,107,000-37,507,000 300 kb Lipsi 20 - ATAC 0_ 55 - ORs ChIP Lhx2 0 _ 30 - ChIP Ebf 0_ 9- ChIP H3K9me3 -2 _ 0.5 6 0.5 27 0.5 14 Genes / ORs

Figure 1. Greek Islands represent Lhx2 and Ebf co-bound regions residing in heterochromatic OR clusters. (A) The top sequence motif identified for mOSN ChIP-seq peaks is shown above sequence motifs generated from previously reported Lhx2 (Folgueras et al., 2013) and Ebf (Lin et al., 2010) ChIP-seq data sets. mOSN ChIP-seq peaks were identified using HOMER and motif analysis was run on peaks present in both biological replicates. (B) Overlap between mOSN Lhx2 and Ebf bound sites genome-wide. See Figure 1—figure supplement 2 for analysis of ChIP-seq signal on Ebf and Lhx2 Co-bound sites within OR clusters. (C) Overlap between mOSN Lhx2 and Ebf bound sites within OR clusters. For each factor, co-bound sites are À À significantly more frequent within OR clusters than in the rest of the genome (p=5.702e 9 for Lhx2, p=1.6e 15 for Ebf, Binomial test). See Figure 1— figure supplement 2 for ontology analysis of peaks bound by Lhx2 and Ebf. (D) mOSN ATAC-seq and ChIP-seq signal tracks for three representative OR gene clusters. Values are reads per 10 million. Below the signal tracks, OR genes are depicted in red and non-OR genes are depicted in blue. Greek Island locations are marked. Anafi is a newly identified Greek Island, located in a small OR cluster upstream of the Sfaktiria cluster. See also Figure 1—figure supplement 3 and Supplementary file 1. For ATAC-seq, pooled data is shown from 4 biological replicates, for ChIP-seq, pooled data is shown from 2 biological replicates. For H3K9me3 ChIP-seq, input control signal is subtracted from ChIP signal prior to plotting. (E) mOSN ATAC-seq or ChIP-seq signal across 63 Greek Islands. Each row of the heatmap shows an 8 kb region centered on a Greek Island. Regions of high signal are shaded red. Mean signal across all elements is plotted above the heatmap, values are reads per 10 million. All heatmaps are Figure 1 continued on next page

Monahan et al. eLife 2017;6:e28620. DOI: https://doi.org/10.7554/eLife.28620 2 of 30 Research article Genes and Chromosomes

Figure 1 continued sorted in the same order, based upon ATAC-seq signal. See also Figure 1—figure supplement 3 and Supplementary file 1. For ATAC-seq, pooled data is shown from 4 biological replicates, for ChIP-seq, pooled data is shown from 2 biological replicates. See Figure 1—figure supplement 4 for a comparison of newly and previously identified Greek Islands, and Figure 1—figure supplement 5 for RNA-seq analysis of ORs with Greek Islands near the TSS. (F) mOSN ATAC-seq and ChIP-seq signal tracks on OR genes. Each row of the heatmap shows an OR gene scaled to 4 kb as well as the 2 kb regions upstream and downstream. Plots and heatmap are scaled the same as in Figure 1E. DOI: https://doi.org/10.7554/eLife.28620.002

Monahan et al. eLife 2017;6:e28620. DOI: https://doi.org/10.7554/eLife.28620 3 of 30 Research article Genes and Chromosomes

-log10(Binomial p value) GO Biological Process 0 5 10 15 positive regulation of axonogenesis positive regulation of axon extension regulation of calcium ion transport regulation of extent of cell growth regulation of cell size semaphorin-plexin signaling pathway regulation of mRNA processing regulation of axon extension

MSigDB Pathway Olfactory transduction Ca++/ Calmodulin-dependent Kinase Activation

Figure 1—figure supplement 1. terms associated with Ebf and Lhx2 co-bound sites. Top Gene Ontology terms from the Biological Process and MSigDB Pathway categories associated with genes proximal to sites bound by both Ebf and Lhx2. DOI: https://doi.org/10.7554/eLife.28620.003

Monahan et al. eLife 2017;6:e28620. DOI: https://doi.org/10.7554/eLife.28620 4 of 30 eas hr r nytopasi hsctgr.CI inli acltdb vrgn omlzdpa onsfo w ilgclreplicates. biological two from counts peak included normalized not averaging are by peaks calculated Ebf-only is OR-cluster signal peaks. ChIP of category. categories this different in for peaks peak) two each only in are reads there of because number (normalized strength peak ChIP-seq ed e 0mlin ohhamp r otdi h aeodr ae pnAA-e inl olddt ssonfo ilgclrpiae.( replicates. biological 2 from shown are is values data heatmap, Pooled the signal. above ATAC-seq plotted upon is based group order, each same for the signal in Mean sorted sites. are Lhx2) heatmaps or Both (Ebf million. bound 10 singly per OR-cluster reads to compared Islands) (Greek peaks est lto h itiuino ek vrLx hPsqpa tegh(omlzdnme frasi ahpa)frdfeetctgre fpeaks. of categories different for peak) ( each replicates. in biological reads two of from number counts (normalized peak strength normalized peak averaging ChIP-seq by Lhx2 calculated over is peaks signal of ChIP distribution the of plot Density O:https://doi.org/10.7554/eLife.28620.004 DOI: 2. supplement 1—figure Figure Monahan A OR Cluster Sites Ebf or Lhx2 or Ebf Ebf+Lhx2 tal et eerharticle Research Lf 2017;6:e28620. eLife .

Singly Bound Greek Islands 20 40

(Ebf or Lhx2) (Ebf+Lhx2) 0 -4kb . 27 0.5 ChIP Lhx2 ChIP Peak obnigo b n h2wti Rcutr.( clusters. OR within Lhx2 and Ebf of Co-binding 4 O:https://doi.org/10.7554/eLife.28620 DOI: 20 10 0 4bPa 4 Peak -4kb . 14 0.5 ChIP Ebf ChIP B Density 0.0 0.2 0.4 0.6 ChIP PeakStrengthby Type Log2(Lhx2 ChIPSignal) . . 10.0 7.5 5.0 Greek Islands Ebf+Lhx2 OR-cluster Lhx2-only Lhx2-only A ONLx n b hPsqsga nO lse b +Lhx2 Ebf Cluster OR on signal ChIP-seq Ebf and Lhx2 mOSN ) C est lto h itiuino ek vrEbf over peaks of distribution the of plot Density ) C Density 0.0 0.2 0.4 0.6 ChIP PeakStrengthby Type 012 10 8 6 4 Log2(Ebf ChIPSignal) Greek Islands Ebf-only Ebf+Lhx2 ee n Chromosomes and Genes f30 of 5 B ) Research article Genes and Chromosomes

chr6:42,865,500-42,874,499 chr11:74,031,930-74,040,929 chr2:37,123,665-37,132,664 A 2 kb 2 kb 2 kb Sfaktiria Crete Lipsi 9 - 9 - 9 -

ChIP H3K9me3 0 _ 0 _ 0 _ 11 - 11 - 10 -

ChIP H3K79me3 0 _ -0.5 _ 0 _ 1 - 1 - 1 -

ChIP H3K27ac 0 _ 0 _ 0 _ 35 - 25 - 20 -

ATAC

0 _ 0 _ 0 _ 115 - 90 - 55 -

ChIP Lhx2

0 _ 0 _ 0 _ 85 - 40 - 30 - ChIP Ebf

0 _ 0 _ 0 _

B ChIP H3K9me3 ChIP H3K79me3 ChIP H3K27ac 0.5 4 4 0.3 2 2 0.1 Signal 0 0 í0.1 -4 Island 4 -4 Island 4 -4kb Island 4 Greek Islands

0.2 6.4 0.2 7.6 0.1 0.7

Figure 1—figure supplement 3. Histone modifications proximal to Greek Islands. (A) ATAC-seq and ChIP-seq signal tracks for three Greek Islands, Sfaktiria, Crete and Lipsi. Greek Island position is highlighted in yellow. For heterochromatin modifications (H3K9me3 and H3K79me3), input control signal is subtracted from ChIP signal. Pooled data is shown from 4 biological replicates for ATAC-seq, 2 biological replicates for Lhx2, Ebf, H3K9me3, and H3K27ac, and one replicate for HeK79me3. (B) ChIP-seq signal for histone modifications associated with heterochromatin and active enhancers in the vicinity of Greek Islands. Pooled data is shown from 2 biological replicates for H3K9me3 and H3K27ac, and one replicate for HeK79me3. DOI: https://doi.org/10.7554/eLife.28620.005

Monahan et al. eLife 2017;6:e28620. DOI: https://doi.org/10.7554/eLife.28620 6 of 30 Research article Genes and Chromosomes

ATAC ChIP-Lhx2 ChIP-Ebf ChIP H3K9me3 ChIP H3K27ac ChIP H3K79me3 14 60 30 5 0.7 6 Original Islands 0.6 12 50 25 5 New Islands 4 10 0.5 40 20 4 8 3 0.4 30 15 3 0.3 Signal 6 2 20 10 2 4 0.2 1 1 2 10 5 0.1 0 0 0.0 -4kb Peak 4 -4 Peak 4 -4 Peak 4 -4 Peak 4 -4 Peak 4-4 Peak 4

Figure 1—figure supplement 4. Comparison of new and previously identified Greek Islands. Mean ATAC-seq or ChIP-seq signal for previously identified Greek Islands (Markenscoff-Papadimitriou et al., 2014) (blue shaded) that are bound by Ebf and Lhx2 compared to newly identified Ebf and Lhx2 bound islands (green shaded). Pooled data is shown from 4 biological replicates for ATAC-seq, 2 biological replicates for Lhx2, Ebf, H3K9me3, and H3K27ac, and one replicate for HeK79me3. DOI: https://doi.org/10.7554/eLife.28620.006

Monahan et al. eLife 2017;6:e28620. DOI: https://doi.org/10.7554/eLife.28620 7 of 30 Research article Genes and Chromosomes

ORs without ORs with Island Island



0 log2(fpkm)

í

Figure 1—figure supplement 5. Greek Islands represent Lhx2 and Ebf co-bound regions residing in heterochromatic OR clusters. Level of expression (FPKM) for OR genes in mOSNs determined by RNA- seq. ORs with a Greek Island within 500 bp of the annotated TSS are plotted separately and in red. FPKM is the mean of three biological replicates. DOI: https://doi.org/10.7554/eLife.28620.007

Monahan et al. eLife 2017;6:e28620. DOI: https://doi.org/10.7554/eLife.28620 8 of 30 iue2cniudo etpage next on continued 2 Figure iue2. Figure bearing Monahan D C A

Olfr17-IRES-GFP ATAC Signal

Signal 12 36 - 36 - 36 - 0 _ 0 _ 0 _ re sadacsiiiyi needn fO rmtrcoc.( choice. promoter OR of independent is accessibility island Greek Olfr714 tal et Greek Islands 0 4 8 4bPa 4 Peak -4kb chr14:52,461,399-52,521,399 chr9:37,698,000-37,758,000 chr7:107,066,538-107,126,538 Olfr17-IRES-GFP . 5.5 0.5 eerharticle Research 10 kb 10 kb 10 kb Lf 2017;6:e28620. eLife . mOSN Olfr874 * Olfr1507 Olfr17 , Olfr151-IRES-tauGFP Olfr17+ OSNs Olfr151 4Pa 4 Peak -4 . 5.5 0.5 Olfr151-IRES-tauGFP Gm10081 Olfr17+ OSNs Olfr17+ O:https://doi.org/10.7554/eLife.28620 DOI: Olfr1508 * Olfr160 or , 36 - 36 - 36 - Olfr714 0 _ 0 _ 0 _ Olfr1507-IRES-GFP 4Pa 4 Peak -4 . 5.5 0.5 Olfr151+ OSNs Olfr151+ Olfr874 Olfr1507 * Olfr1507-IRES-GFP Olfr17 Olfr151+ OSNs Olfr151 lee.Nce r tie ihDP bu) ( (blue). DAPI with stained are Nuclei alleles. 4Pa 4 Peak -4 . 5.5 0.5 Gm10081 A Olfr1508 F loecne(re)i O isescin rmautmice adult from sections tissue MOE in (green) fluorescence GFP ) Olfr1507+ OSNs Olfr1507+ Olfr160 * B 36 - 36 - 36 - Olfr714 0 _ 0 _ 0 _

Log2 Fold Change Log Fold2 Change Log Fold2 Change E

(Olfr1507+ - mOSN) (Olfr151+ - mOSN) (Olfr17+ - mOSN) Olfr874 Olfr1507 í 0 5 í 0 5 í 0 5 ATACseq PeaksinORSortedCellsvsmOSNs Olfr17 Log2(Normalized SignalinPeak) Olfr1507+ OSNs 01 14 12 10 8 6 4 01 14 12 10 8 6 4 01 14 12 10 8 6 4 / Olfr151 = GreekIslands * B Gm10081 ersnaieFC aafor data FACS Representative ) Olfr1508 Olfr160 ee n Chromosomes and Genes f30 of 9 Olfr- Research article Genes and Chromosomes

Figure 2 continued IRES-GFP mice. Data is shown from Olfr151-IRES-GFP mice. Viable (DAPI negative), GFP+ cells were collected for ATAC-seq. (C) ATAC-seq signal tracks from GFP+ cells sorted from Olfr17-IRES-GFP (red), Olfr151-IRES-GFP (blue), or Olfr1507-IRES-GFP (green) mice. Values are reads per 10 million. The region spanning each targeted OR is shown for all three lines. See also Figure 2—figure supplement 1. Pooled data is shown for 2 biological replicates. (D) ATAC-seq signal over Greek Islands is shown for mOSNs and each Olfr-IRES-GFP line. All samples are sorted by signal in mOSNs. A blue arrow marks the H Enhancer, which is the Greek Island proximal to Olfr1507. A blue asterisk marks Kimolos, the Greek Island proximal to Olfr151, which has the strongest change in signal relative to mOSNs. See also Figure 2—figure supplement 2. Pooled data is shown for 4 biological replicates for mOSNs, and 2 biological replicates for each Olfr-IRES-GFP sorted population. (E) MA-plots showing fold change in ATAC-seq signal for each sorted Olfr-IRES-GFP population compared to mOSNs. Peak strength (normalized reads in peak) and fold change are shown for all ATAC-seq peaks; peaks that are not significantly changed are black and peaks that are significantly changed (FDR < 0.001) are gold. Greek Islands are plotted as larger dots and are shown in red if significantly changed. Kimolos is marked with an asterisk in Olfr151 expressing cells, and H is marked with an arrow in Olfr1507 expressing cells. See also Figure 2—figure supplement 2. DOI: https://doi.org/10.7554/eLife.28620.013

Monahan et al. eLife 2017;6:e28620. DOI: https://doi.org/10.7554/eLife.28620 10 of 30 Research article Genes and Chromosomes

A All Genes - Grouped by Level of Expression 50 1st Quartile 40 2nd Quartile 3rd Quartile 4th Quartile 30

20

10 mOSN ATAC Signal ATAC mOSN 0 -5kb TSS TES 5

B Olfr17 30 mOSN 25 Olfr17+ 20

15

10 ATAC Signal ATAC 5

0 -5kb TSS TES 5

Olfr1507 C 14 12 mOSN Olfr1507+ 10 8 6 4

ATAC Signal ATAC 2 0 -5kb TSS TES 5

Figure 2—figure supplement 1. High Accessibility near the OR transcription end site. Signal plots are from pooled data from 4 biological replicates for mOSNs and 2 replicates each for Olfr17-IRES-GFP+, and Olfr1507-IRES-GFP+cells. (A) Profile of mean mOSN ATAC-seq signal over all genes. Genes are grouped into quartiles by level of expression in mOSNs. (B) Profile of ATAC-seq signal over Olfr17 in all mOSNs and Olfr17-IRES-GFP expressing OSNs. (C) Profile of ATAC-seq signal over Olfr1507 in all mOSNs and Olfr1507-IRES-GFP expressing OSNs. DOI: https://doi.org/10.7554/eLife.28620.014

Monahan et al. eLife 2017;6:e28620. DOI: https://doi.org/10.7554/eLife.28620 11 of 30 Research article Genes and Chromosomes

A 0 _ chr14:52,451,868-52,571,868 30 kb 20 - Olfr17+ 0 _ 20 - Olfr151+ ATAC 0 _ 20 - Olfr1507+ 0 _ UCSC Genes /ORs Olfr1507

B 0_ chr9:37,648,666-37,768,666 30 kb 40 - Olfr17+ 0_ 40 - Olfr151+ * ATAC 0_ 40 - Olfr1507+ 0_ UCSC Genes /ORs Olfr151

Figure 2—figure supplement 2. Greek island accessibility is independent of OR promoter choice. Signal plots are from 2 replicates each for Olfr17-IRES-GFP+, Olfr151-IRES-GFP+, and Olfr1507-IRES-GFP+cells. (A) ATAC-seq signal in the vicinity of Olfr1507 for each Olfr-IRES-GFP population. A blue arrow marks the location of H. (B) ATAC-seq signal in the vicinity of Olfr151 for each Olfr-IRES-GFP population. A blue asterisk marks Kimolos, the Greek Island with greatly increased signal in Olfr151-IRES-GFP expressing cells relative to mOSNs. DOI: https://doi.org/10.7554/eLife.28620.015

Monahan et al. eLife 2017;6:e28620. DOI: https://doi.org/10.7554/eLife.28620 12 of 30 Research article Genes and Chromosomes

AB C 1.00 Ebf 2 0.6 Ebf Motif 1 bits 0.75 p<2.2e-16 C C G 0.5 T A A C TG G G C

T T A CA G 0 CTT AAG Greek Islands 2 5 10 0.4 Co-bound Sites Composite Motif 1 0.50 bits G Genome Wide G T T GC C A G GG C T G T A A C p=1.2e-13 C T T G T TA A T G 0.3 TG C C AA TA TA 0 ATA GAAT CAC AT A5 10 15 20 OR promoters 2 0.25 0.2 Lhx2 Motif 1 bits C End Density Fragment ATAC AT CGG 0.1 0 CTCA GTGATA AT 5 0 Lhx2 Proportion of Sites with Motif Scoring < X 0.0 í í 0 5 10 15 -200 bpComposite Motif 200 Composite Motif Score

1500 D E F

T T GC C C G G T T GC C C G G T C G T C G A T A T A A A A T A T A A A AT T G T A AT T G T A T A G T A G T G T T G T ATT A AT A T C ATT A AT A T C A A A G A C A A A A G A C A C T C T A A GC C GT T C T C T A A GC C GT T GG G AA T C C A A T GG G AA T C C A A T GCA T A G T A GCA T A G T A T C G G T T T T T C G G T T T T C C G T T C C C G T T C C C C G A C C C C G A C T C A A G T C A A G ATA CG G TA T ATA CG G TA T T G G GC T CAAG GATATT T C G G T G G GC T CAAG GATATT T C G G A T A A T A AA AA C A AA AA C A 1000 p=8.8e-07

(score > 10) (score

Strong Composite Strong 500 A G A G T G CCC GG T G CCC GG T A A A T G C A T A A A T G C A C G C A TT A C G C A TT A A T T A T A A T A ATA A T T A T A A T A ATA T T C G T A T T C G T A T A T C A T T T C C T T T C T A A A T T T A A A T T G A G A A A T A G A A T A G T A T T A T T T G A A G A G A T G T A A T A A T C G G A T C C G G A T C A A A A A A T T T T T T G T G T G C T C G CC C T C C C G G C CC C G C CC C G TA T C T G TA T C T G T G CA T G CA G C A ATACAAAGAG AAT CATTC G G G C A ATACAAAGAG AAT CATTC G G 0 Distance from Ebf Motif to Lhx2 (bp) (score 5-10) (score Over 10 Under 10 Weak Composite Weak Greek Islands with OR promoters Genome Wide

Composite Score Co-bound Sites

Figure 3. Greek Islands have stereotypically proximal Lhx2 and Ebf motifs. (A) Sequence logo of the Greek Island composite motif (center). The mOSN ChIP-seq derived Lhx2 and Ebf motifs logos are positioned above and below the corresponding regions of the composite motif. (B) Cumulative distribution plot of the score of the best composite motif site found in each of the 63 Greek Islands. Also plotted are cumulative distributions for co- bound sites outside of OR clusters and OR gene promoters. A score of 10 was selected as a stringent threshold for motif identification, and a score of 5 was selected for permissive motif identification. This motif is significantly enriched in Greek Islands relative to co-bound sites outside of OR clusters at both of these score cut-offs (Binomial test). See also Supplementary file 2.(C) Plot of the density of ATAC-seq fragment ends in the vicinity of Greek Island composite motifs sites scoring over 10. Plot shows mean signal and standard error in 5 bp windows centered on 43 composite motif sites (yellow). (D) Multiple alignment of composite motif sequences from Greek Islands together with 20 bp of flanking sequence. Each base is shaded by nucleotide identity: A = green, C = blue, G = yellow, T = red. Top panel depicts composite with score over 10 and bottom panel depicts composites with score between 5 and 10, together with a sequence logo of the motif present in those sequences. See Figure 3—figure supplement 1 for sequences of strong and weak Greek Island composite motifs. (E) As in (D), except purines are shaded red and pyrimidines are shaded blue. (F) For each site, the distance (in base pairs) between the closest Ebf-Lhx2 motif pair was determined. For each set of sites, the distribution of distances is shown as a boxplot. Sets of sites comprising Greek Islands with a strong composite motif, Greek Islands without a strong composite motif, Ebf and Lhx2 co-bound sites genome-wide, and OR gene promoters are compared. Sites without an Ebf motif are excluded. The distribution of distances between Ebf and Lhx2 motifs was significantly smaller for Greek Islands without a composite motif than for Ebf and Lhx2 bound sites genome-wide (two-sample, one-sided Kolmogorov–Smirnov test) See also Supplementary file 2. n = 25 for Greek Islands with Composite Score greater than 10; n = 21 for Greek Islands with Composite Score less than 10; n = 3805 for Co-bound sites genome wide; n = 521 for OR promoters. DOI: https://doi.org/10.7554/eLife.28620.020

Monahan et al. eLife 2017;6:e28620. DOI: https://doi.org/10.7554/eLife.28620 13 of 30 Research article Genes and Chromosomes

A Strong Composite (Score >10) B Weak Composite ( 10 > Score >5) 10 10 chr1:173190756-173190776(-) CCT AACGAGGCCCCTGAGAT chr1:92545955-92545975(-) TCAAACAAAGT T AT AAGGAG chr1:173265886-173265906(+) CT T AATGAAGCCCAGGAGAC chr1:173190690-173190710(+) TCT AAAGAAT TCCCT TGAGA chr1:174035943-174035963(-) CCT AAT T AAGT TCATGGGAA chr1:173190833-173190853(+) CAAAAT AAACACACCGAAAA chr1:174342704-174342724(-) CT T AATGAAGCCCAGGAGAC chr1:173265949-173265969(+) TCAAAAAAAGATCCTCAGGA chr2:37128033-37128053(+) T T T AAT T AGGCCATCAGGAG chr2:90414535-90414555(+) TGT AACT AAGCACACAAT T A chr2:37128054-37128074(-) GCT AAT T AACCTCTCAAGT T chr2:112200932-112200952(+) CAAAATGAAGTCCTGAGAGT chr2:37128064-37128084(+) GT T AAT T AGCATCAAGAAAA chr3:106876154-106876174(-) TCAAATGAGCAT TCCCAGGA chr2:37128111-37128131(-) T T T AACGAGGCTCTGGGGAG chr3:106876235-106876255(-) GAAT ATGGGCCCCCAGGGAG chr2:90414471-90414491(+) GCAAAT T AATGCCT TGGAGG chr4:43716195-43716215(-) TCT AAT TGACCT TCTGGGAG chr2:111727083-111727103(+) GT T AATGGGTCACATGGGGA chr4:58640643-58640663(+) GCAAATGAATCT T T T AGT T T chr2:111790206-111790226(+) GT T AAT AAGTCACCCGGGAA chr4:58764039-58764059(-) T AT AATGAGGT ACTGAGGTC chr2:112200849-112200869(+) GCT AAT AGGCCCCCAAGGGG chr4:58764124-58764144(-) T AT AATGAAGCCTCT TGAAA chr3:97491426-97491446(-) T T AAAT T AGT T TCTCGGTGG chr4:118729780-118729800(-) CT T AACAAGTCCCCAAGAAA chr4:118642436-118642456(+) GT T AACAAGCCTCCCAGGAA chr6:42576885-42576905(+) AAT AATGAGGCCCCAGGAAC chr6:42576930-42576950(+) GAAAAT T AAGCCCTGGAGGT chr6:42870103-42870123(+) T T T AATGAGT TCTCTGAAGG chr6:42870022-42870042(-) T T T AAT TGATCTCCAAGGAA chr6:42869956-42869976(+) T T T AATGAACCCCGCAAGGA chr6:116614088-116614108(+) GAT AAT T AACCCCAT AGGGG chr7:6545295-6545315(+) GCT AATGAGT T T ATCGAGT A chr7:6650044-6650064(+) T AAAAAT AGTCTCAT AGGGA chr7:86295143-86295163(-) TCT AACAAGTCCCCTGAT AA chr7:6650090-6650110(+) T T T AAT TGGTCCCCTGATGA chr7:86295250-86295270(+) GAAAAAT AACCTCAGGGT AT chr7:99787973-99787993(+) T T T AAT TGGCCCCTGAAAGG chr7:99788153-99788173(+) GGT AAT T AGACCCAAGAGAG chr7:99788015-99788035(+) CCT AATGAATCCCT AGGAAT chr7:140187323-140187343(-) AAT AAT T AAT TCCTCGGTGA chr7:102513064-102513084(-) GCT AACGAGCCCCAGCGGAG chr9:19651357-19651377(+) GCT AATGAAT TCTCACGGGT chr7:108797627-108797647(-) GGAAAT T AGT TCCTCTGGAA chr9:37687836-37687856(+) T T T AATGAATCCCGGAGGAT chr7:140187164-140187184(-) T T T AATGGAGCCCCAGGGAA chr9:39952115-39952135(-) GAT AAT TGATCCCTCTGT TC chr9:37687882-37687902(-) TCAAAT AAGCCTCACAAGGC chr10:78618266-78618286(-) GCT AAAAAGGATCACAAGGG chr10:128979021-128979041(-) T T T AAT T AAT TCCCTGAGGT chr10:130098709-130098729(-) T T T AATCAGTCTCAGAGGGA chr11:50999391-50999411(+) CCT AAT T AGCCT T TGGGGAA chr10:130098872-130098892(+) GAAAAT T ATCT TCCTGAT AA chr11:58739376-58739396(+) GGAAATGAGGGCCATGAGAA chr11:49575514-49575534(+) T T AT ACT AGGTCCCAGGGAA chr11:59576086-59576106(+) T T T AAT T AGTGTCT AAGGGA chr11:50999415-50999435(-) T T T AAT TGGCACACCAAGAG chr11:74036185-74036205(-) GAAAACT AGCTCCT TGGAGA chr11:58810852-58810872(-) GGAAAT T AAGACT AAAGAGT chr11:74036333-74036353(-) TCT AAT T AGT TCCCAGATGA chr11:74036453-74036473(+) GT T AATGAAGCT T T TCAAT T chr11:87937436-87937456(+) GCT AAT AAGGCTCACTGGAA chr11:87897402-87897422(+) CAT AATGAGGAT T T T AAAAA chr14:50758860-50758880(-) TCT AAT T AGT TCTCAGGGGT chr13:21341290-21341310(+) AT AAAT T AACCCCAATGGAA chr14:52548071-52548091(-) T AT AATGAACCACT AGAGGC chr13:21341376-21341396(-) T T T AACT AGT TCCCT AGGCA chr14:54285025-54285045(-) GGT AATGAATCTCAAGGGAA chr13:21343665-21343685(-) T AAT AAT AGGCCCTGAGAGA chr15:98263532-98263552(+) CCT AACT AACCTCCCGAGAC chr14:52548145-52548165(+) GGAACTGAAT TCCTCAGGGA chr16:3781442-3781462(+) T T T AATGAGCCCCAT AGTGA chr14:52548155-52548175(-) T T T AAT AGGGTCCCTGAGGA chr17:37544113-37544133(-) CCT AATGAGCTCCCAAGGGA chr14:52548209-52548229(+) GT AAAT T AGTGT T ATCAGTG chr19:14096505-14096525(-) T T T AAT T AGCACACAGGGGA chr14:54304233-54304253(-) GAT AAT T AGATCCCAAAAGA chr19:14090645-14090665(+) CT T AATGAGCTCCCCTGGGA chr16:3781561-3781581(-) T AT ACTGAGCTCCTGGGGAC chrX:74559207-74559227(+) TCT AAT T AGT TCCCAAGTGA chr16:58957204-58957224(+) TCT AATGAAGTCTCAAGTGG chrX:74559321-74559341(+) CAT AATGAAGTCCCT AAAGT chr16:58957112-58957132(+) CCT AAT AAGGTCCACTGAGC 2.0 chr19:12830994-12831014(-) GAT AATGAACAAT T AAAAAG chr19:12831130-12831150(+) GGT AAT T AGACCCAGAGAGA 1.0 bits chr19:14096548-14096568(-) TGT AACT AGGGCCAT TGAGC T T GC C C GG G T C A G A G T A G C A A T T G T A G TA CA A C TA C TA TAGAAG CTT ATA chrX:74559250-74559270(-) T T T AATGAAT T AT ACAGGAT 0.0 AA5 10 15 20 2.0 WebLogo 3.5.0

1.0 bits A G T T G CCC GG C G C A A A T A T G A G GT T T A T T C T C T CA A ATCAAGAG AAT CATTC 0.0 AA 5 10 15 20 WebLogo 3.5.0

Figure 3—figure supplement 1. Greek Islands have stereotypically proximal Lhx2 and Ebf motifs. (A) Multiple alignment of composite motif sequences found in Greek Islands using a stringent cutoff (motif score >10). Positions with at least 50% identity are shaded by nucleotide. A motif logo of the included sequences is shown below the alignment. (B) Multiple alignment of weak composite motif sequences found in Greek Islands using a loose cutoff (10 > motif score > 5). Positions with at least 50% identity are shaded by nucleotide. A motif logo of the included sequences is shown below the alignment. DOI: https://doi.org/10.7554/eLife.28620.021

Monahan et al. eLife 2017;6:e28620. DOI: https://doi.org/10.7554/eLife.28620 14 of 30 Research article Genes and Chromosomes

A B Lhx2 DAPI Lhx2 / DAPI OR Transcript Levels in Control vs Lhx2 KO mOSNs  0 (Lhx2KO - Control) í Change in OR Transcript Level Transcript Change in OR Log2

1 10 100 1000 OR Transcript Level (Normalized Read Counts) OMP-IRES-Cre ; Lhx2 fl/fl Lhx2 fl/fl NE FKU C  Lipsi

Control mOSNs

0 _  ChIP Lhx2 ChIP Lhx2 KO mOSNs

0 _ 

Control mOSNs

0 _  ChIP Ebf ChIP Lhx2 KO mOSNs

0 _ 20 -

Control mOSNs

0 _ 20 - ATAC

Lhx2 KO mOSNs

0 _ UCSC Genes / ORs

Type of Ebf Site DEChIP Lhx2 ChIP Ebf F ATAC G Ebf Ebf & Greek Control mOSNs Lhx2 KO mOSNs Control mOSNs Lhx2 KO mOSNs Control mOSNs Lhx2 KO mOSNs 40 Only Lhx2 Islands  20 10 Signal

0 0 0 0 -4kb Peak 4-4 Peak 4 -4kb Peak 4-4 Peak 4 -4kb Peak 4-4 Peak 4 Ebf ChIP

í Log2 (Lhx2 KO - WT) KO (Lhx2 Log2

Greek Islands Greek Islands Greek Islands í

 27  27  14  14  10  10

Figure 4. Lhx2 is required for Ebf binding predominantly on Greek Islands. (A) Lhx2 immunofluorescence (IF) (green) in MOE sections from 3 week old control (Lhx2 fl/fl) and Lhx2 KO (Omp-IRES-Cre; Lhx2fl/fl) mice. Nuclei are stained with DAPI (blue). The Lhx2 immunoreactive cells on the basal layers of Figure 4 continued on next page

Monahan et al. eLife 2017;6:e28620. DOI: https://doi.org/10.7554/eLife.28620 15 of 30 Research article Genes and Chromosomes

Figure 4 continued the MOE represent immature OSNs and progenitors that have not yet turned on OMP (and thus Cre) expression. See also Figure 4—figure supplement 1 for demonstration of the Cre induced deletion at the mRNA level. (B) MA-plot of OR transcript levels in FAC-sorted Lhx2 KO mOSNs (Omp-IRES-Cre; Lhx2fl/fl; tdTomato) compared to FAC-sorted control mOSNs (Omp-IRES-GFP). Red dots correspond to OR genes with statistically significant transcriptional changes (adjusted p-value<0.05). Three biological replicates were included for control mOSNs and 2 biological replicates were included for Lhx2 KO mOSNs. (C) ChIP-seq and ATAC-seq signal tracks from FAC-sorted control mOSNs (Omp-IRES-GFP) and Lhx2 KO mOSNs (Omp-IRES-Cre; Lhx2fl/fl; tdTomato) for the OR cluster containing the Greek Island Lipsi. Values are reads per 10 million. For ATAC-seq, pooled data from 4 biological replicates for control mOSNs are compared to data from 2 biological replicates for Lhx2 KO mOSNs. For ChIP, pooled data is shown from 2 biological replicates. (D–F) Heatmaps depicting Lhx2 and Ebf ChIP-seq and ATAC-seq signal across Greek Islands for FAC-sorted control and Lhx2 KO mOSNs for the samples described in C. (G) Log2 fold change in normalized Ebf ChIP-seq signal in Lhx2 KO mOSNs relative to control mOSNs for Greek Islands (red), compared to sites genome-wide that are bound by Ebf-only or both Ebf and Lhx2 in wild-type mOSNs. Fold change was calculated using data from 2 biological replicates each of control mOSNs and Lhx2 KO mOSNs.. See also Figure 4—figure supplement 2 for MA-plot showing data for all peaks in each set and Figure 4—figure supplement 3 for RNA-seq analysis of the effect of Lhx2 KO on ORs with and without a promoter Lhx2 motif. DOI: https://doi.org/10.7554/eLife.28620.025

Monahan et al. eLife 2017;6:e28620. DOI: https://doi.org/10.7554/eLife.28620 16 of 30 Research article Genes and Chromosomes

chr2:38,338,990 38,349,250 38,359,511 38,369,772

Lhx2 Deleted in KO 1000 Control mOSN 210 71 144

0

139

1000 Lhx2KO mOSN 332 60

0 33

Figure 4—figure supplement 1. Effect of Lhx2 deletion on Lhx2 expression and splicing. Sashimi plot (Katz et al., 2010) of Lhx2 RNA-seq signal and splicing junctions in control and Lhx2 KO mOSNs. A schematic of Lhx2 and the region affected by the conditional knockout is shown at the top. Representative data is shown for one replicate from each condition. DOI: https://doi.org/10.7554/eLife.28620.026

Monahan et al. eLife 2017;6:e28620. DOI: https://doi.org/10.7554/eLife.28620 17 of 30 Monahan tal et eerharticle Research Lf 2017;6:e28620. eLife . O:https://doi.org/10.7554/eLife.28620.027 DOI: and mOSNs control biological of 2 each Fold from replicates red. data are using Islands calculated Greek was and change blue, overlap are that peak peaks Lhx2 black, an a are overlap peak not Lhx2 do mOSN that control are peaks Peaks type; peaks. by ChIP-seq coded Ebf color mOSN all change for fold shown and are peak) in reads Peak (normalized mOSNs. strength control to for compared signal mOSNs ChIP-seq KO Ebf in change MA-plots fold Islands. showing Greek on predominantly binding Ebf 2. supplement 4—figure Figure mOSNs. O:https://doi.org/10.7554/eLife.28620 DOI:

Ebf ChIP Log2 Fold Change

í í (Lhx2 KO - WT) 0  Ebf ChIP PeaksGroupedby Type Greek Islands Ebf +Lhx2 Ebf Only Log2 (NormalizedSignalinPeak) 4  h2i eurdfor required is Lhx2 Lhx2 8 KO Lhx2 ee n Chromosomes and Genes 8o 30 of 18 Research article Genes and Chromosomes

ORs Grouped by Motifs in Promoter Lhx2 No Lhx2

0.3

0.2 density 0.1

0.0 í 0 Log2 Change in OR Transcript Level (Lhx2KO - WT)

Figure 4—figure supplement 3. Lhx2 deletion downregulates ORs that do not have Lhx2 promoter motfis. Density plot of Log2 fold change in OR transcript levels in Lhx2 KO mOSNs compared to control mOSNs, with ORs grouped based upon the motifs present in the promoter region (À500 bp to the TSS). ORs with a very low level of expression (OR transcript level <5 in Figure 4B) are not included. Three biological replicates were included for control mOSNs and 2 biological replicates were included for Lhx2 KO mOSNs. DOI: https://doi.org/10.7554/eLife.28620.028

Monahan et al. eLife 2017;6:e28620. DOI: https://doi.org/10.7554/eLife.28620 19 of 30 Research article Genes and Chromosomes

AB Probe Composite C Probe Composite Probe Ebf Site Lhx2 Site CompositeComp osite Protein Fusion Lhx2 Ebf1 Protein Fusion

Protein Competitor Number of bases inserted

Competitor None

- - - - Fusion-5 Fusion-10 Fusion-20 Lhx2 Ebf1 Fusion-5 Fusion-5 Fusion-10 Fusion-10 Fusion-20 Fusion-20 Lhx2 Ebf1 Ebf1 Fusion-5 Fusion-10 Fusion-20 Lhx2 Ebf1 Ebf1 None Lhx2 Site Ebf Site Composite None Lhx2 Site Ebf Site Composite None Lhx2 Site Ebf Site Composite into composite motif 2 4 6 8 10 12 14 None Composite Lhx2 Site Ebf Site

Fusion Protein Linker

Ebf Lhx2

Free Probe

Free Probe Free Probe

chr2:36,107,000-37,507,000 300 kb D mOSN E 20 - Lipsi mOSN 0 _ 20 - ATAC mOSN + Fusion Ebf Lhx2 0 _ mOSN + Fusion UCSC Genes / ORs 4 - mOSN

0 _ Fusion Protein 4 - RNAseq mOSN + Fusion 0 _

Genes Grouped by Promoter Binding F G HI Genes Grouped by ATAC-seq OR Transcript Levels in Control vs No Ebf Ebf + Promoter Binding mOSN mOSN + Fusion Fusion Protein Expressing mOSNs or Lhx2 Ebf Lhx2 Lhx2 ORs ORs 12 10 Ebf+Lhx2 Lhx2 8 Ebf 4 Neither Signal 5 0 -4kb Peak 4-4kb Peak 4

0 5 0 5 (Fusion - Control)

(Fusion - Control) 0 í Greek Islands í Log2 Change in OR Transcript Change in OR Log2

(Fusion - Control) í Log2 Fold Change in Expression

í

0.5 10 0.5 10 Log2 Fold Change in Expression 1 5 50 500 5000 OR Transcript Level

Figure 5. Displacement of Lhx2 and Ebf from Greek Islands shuts off OR transcription. (A) Electrophoretic Mobility Shift Assay (EMSA) for binding of in vitro translated protein to DNA probes containing either an Ebf site, an Lhx2 site, or a composite site. Binding of three versions of the Fusion protein with either 5, 10, or 20 amino acid linker peptides were compared to full length Lhx2 or full length Ebf1. (B) EMSA for sequence selectivity of in vitro translated . Binding of Fusion protein (20aa linker), Ebf1, and Lhx2 to composite motif probe was competed with a 20-fold molar excess of unlabeled oligo containing either an Lhx2 site, Ebf site, or composite site. (C) EMSA for motif-spacing selectivity of in vitro translated proteins. Binding of Fusion protein (20aa linker) was competed with 100-fold molar excess of unlabeled oligo containing either wild type composite sequence or mutant composite generated by the insertion of 2–14 base pairs in two increments. In the last two lanes the competitors are either a single Lhx2 or a single Ebf site. (D) Schematic illustrating the proposed dominant-negative activity of the fusion protein for composite motif sites. See also Figure 5— figure supplement 1 for depiction of the genetic strategy for mOSN overexpression. (E) ATAC-seq and RNA-seq signal tracks from FAC-sorted control mOSNs and Fusion protein-expressing mOSNs for the OR cluster containing the Greek Island Lipsi. ATAC-seq values are reads per 10 million. RNA-seq values are reads per million. For ATAC-seq, pooled data from 4 biological replicates for control mOSNs are compared to data pooled from 2 Figure 5 continued on next page

Monahan et al. eLife 2017;6:e28620. DOI: https://doi.org/10.7554/eLife.28620 20 of 30 Research article Genes and Chromosomes

Figure 5 continued independent founders of the Fusion Protein transgene. For RNA-seq, representative tracks are shown for one of three biological replicates for control mOSNs and for one of 2 independent founders for the Fusion Protein transgene. (F) ATAC-seq signal across the Greek Islands for control mOSNs and Fusion protein-expressing mOSNs. Pooled data from 4 biological replicates for control mOSNs are compared to data pooled from 2 independent founders of the Fusion Protein transgene. See Figure 5—figure supplement 2 for the effect of Fusion Protein expression on Ebf and Lhx2 sites genome-wide. (G) MA-plot (Dudoit and Fridlyand, 2002) of OR transcript levels in FAC-sorted mOSNs expressing fusion protein (Omp-IRES-tTA; tetO-Fusion-2a-mcherry) compared to FAC-sorted control mOSNs (Omp-IRES-GFP). Red dots correspond to OR genes with statistical significant transcriptional changes (adjusted p-value<0.05). Three biological replicates were included for control mOSNs and data from 2 independent founders were included for the Fusion Protein transgene. See Figure 5—figure supplement 3 for analysis of effect of Fusion Protein expression on ORs grouped by the presence of Ebf and Lhx2 promoter motifs. (H) Violin plot of Log2 fold change in transcript levels of ORs (red) in mOSNs expressing fusion protein compared to control mOSN. ORs are compared to additional sets of genes: genes with Ebf and Lhx2 bound within 1 kb of the TSS, genes with Lhx2-only bound within 1 kb of the TSS, genes with Ebf-only bound within 1 kb of the TSS, and non-OR genes without Ebf or Lhx2 binding. (I) As in (H), with Log2 fold change in transcript levels shown as a heatmap for each set of genes. DOI: https://doi.org/10.7554/eLife.28620.035

Monahan et al. eLife 2017;6:e28620. DOI: https://doi.org/10.7554/eLife.28620 21 of 30 Research article Genes and Chromosomes

A

2A mCherry Ebf Lhx2

20aa Flexible 2A Linker Peptide TetO Ebf1-DBD Lhx2-DBD mCherry

IRES OMP tTA tetO-Fusion OMP-IRES-tTA

B DAPI mCherry mCherry DAPI OMP-IRES-tTA ; tetO-Fusion OMP-IRES-tTA

Figure 5—figure supplement 1. Genetic strategy for expression of Fusion Protein. (A) Schematic of OMP-IRES- tTA driven expression of Fusion protein and mCherry in mOSNs. (B) mCherry fluorescence (red) in MOE tissue sections from animals bearing an Omp-IRES-tTA; tetO-Fusion-2A-mCherry transgene. Nuclei are stained with DAPI (blue). DOI: https://doi.org/10.7554/eLife.28620.036

Monahan et al. eLife 2017;6:e28620. DOI: https://doi.org/10.7554/eLife.28620 22 of 30 Research article Genes and Chromosomes

ATACseq Peaks Grouped by Ebf and Lhx2 Binding

5

0 (mOSN+Fusion - Control mOSNs)

Log2 Fold Change in ATACseq Change in Signal Fold Log2 í

No Ebf Ebf Lhx2 Ebf + Greek or Lhx2 Lhx2 Islands

Figure 5—figure supplement 2. Fusion protein expression most strongly affects ATAC-seq signal on Greek Islands. Violin plot of Log2 fold change in normalized ATAC-seq signal in mOSNs expressing fusion protein compared to control mOSNs. ATAC-seq peaks on Greek Islands (red), are compared to ATAC-seq peaks genome-wide that are grouped by the presence of an overlapping Ebf and/or Lhx2 ChIP-seq peak in wild type OSNs. DOI: https://doi.org/10.7554/eLife.28620.037

Monahan et al. eLife 2017;6:e28620. DOI: https://doi.org/10.7554/eLife.28620 23 of 30 Research article Genes and Chromosomes

ORs Grouped by Motifs in Promoter 0.3 Ebf & Lhx2 0.2

density 0.1 0.0 í í í 0 3

0.3 Lhx2 0.2 0.1 density 0.0 í í í 0 3

0.3 Ebf 0.2 0.1 density 0.0 í í í 0 3

0.3 Neither 0.2

0.1 density 0.0 í í í 0 3 Log2 Fold Change in OR Transcript Level (mOSN+Fusion - Control mOSNs)

Figure 5—figure supplement 3. Fusion protein expression downregulates OR expression irrespective of presence of Lhx2 or Ebf promoter motifs. Density plots of Log2 fold change in OR transcript levels in Fusion protein expressing mOSNs compared to control mOSNs, with ORs grouped based upon the motifs present in the promoter region (À500 bp to the TSS). ORs with a very low level of expression (OR transcript level <5 in Figure 5G) are not included. Three biological replicates were included for control mOSNs and data from 2 independent founders were included for the Fusion Protein transgene. DOI: https://doi.org/10.7554/eLife.28620.038

Monahan et al. eLife 2017;6:e28620. DOI: https://doi.org/10.7554/eLife.28620 24 of 30 Research article Genes and Chromosomes

A OR Transcript Levels in Control B OR Transcript Levels in Control vs Lhx2 KO mOSNs vs Fusion mOSNs ORs with no Greek ORs with no Greek / / Island in Cluster Island in Cluster   (Fusion - WT) (Lhx2KO - WT) Log2 Change in OR Transcript Log2 Change in OR Log2 Change in OR Transcript Log2 Change in OR í í   í í             OR Transcript Level OR Transcript Level

C ORs Grouped by Motifs in Promoter D ORs Grouped by Motifs in Promoter Lhx2 No Lhx2 Lhx2 or Ebf Neither

 

í í (Lhx2KO - WT) (Fusion - WT) Log2 Change in OR Transcript Transcript Log2 Change in OR

í Transcript Log2 Change in OR í  H H H H  H H H H ORJ ES ORJ ES Distance from OR to Greek Island Distance from OR to Greek Island

Figure 6. Downregulation of OR expression over large genomic distances. (A) MA-plot of OR transcript levels in FAC-sorted Lhx2 KO (Omp-IRES-Cre; Lhx2fl/fl; tdTomato) mOSNs compared to FAC-sorted control mOSNs (Omp-IRES-GFP). Gold dots correspond to OR genes with statistical significant transcriptional changes. ORs in clusters without a Greek Island are shown as large dots, with significantly changed ORs in red. Three biological replicates were included for control mOSNs and 2 biological replicates were included for Lhx2 KO mOSNs. (B) MA-plot of OR transcript levels in FAC-sorted Fusion protein expressing (Omp-IRES-tTA; tetO-Fusion-2a-mcherry) mOSNs compared to FAC-sorted control mOSNs (Omp-IRES-GFP). Gold dots correspond to OR genes with statistical significant transcriptional changes. ORs in clusters without a Greek Island are shown as large dots, with significantly changed ORs in red. Three biological replicates were included for control mOSNs and data from 2 independent founders were included for the Fusion Protein transgene. See Figure 6—figure supplement 1 for an example OR cluster without a Greek Island. (C) Plot of OR distance from a Greek Island compared to Log2 Fold change in Lhx2 KO mOSNs. ORs overlapping a Greek Island have distance set to 1. ORs on a without a Greek Island have distance set to 1e + 08. (D) Plot of OR distance from a Greek Island compared to Log2 Fold change in Fusion Protein expressing mOSNs. ORs overlapping a Greek Island have distance set to 1. ORs on a chromosome without a Greek Island have distance set to 1e + 08. DOI: https://doi.org/10.7554/eLife.28620.045

Monahan et al. eLife 2017;6:e28620. DOI: https://doi.org/10.7554/eLife.28620 25 of 30 Research article Genes and Chromosomes

chr16:19,141,296-19,787,482 200 kb 20 - ATAC 0 _ 55 -

ChIP Lhx2 0 _ 30 -

ChIP Ebf 0 _ 6 - ChIP H3K9me3

-2 _ Genes / ORs 3 - No Lhx2/Ebf motif mOSN 0 _ 3 -

Lhx2 KO mOSNs

RNAseq 0 _ 3 -

mOSN + Fusion 0 _

Figure 6—figure supplement 1. Fusion Protein and Lhx2 KO downregulate ORs in a cluster without a Greek Island. mOSN ATAC-seq and ChIP-seq signal tracks are shown for an OR gene cluster without a Greek Island, scaled as in Figure 1A. Below the annotation, RNA-seq tracks show signal for control mOSNs, Lhx2 KO mOSNs, and mOSNs expressing fusion protein. RNA-seq values are reads per million. An OR without Ebf or Lhx2 motifs in its promoter is circled. For ATAC-seq, pooled data from 4 biological replicates of control mOSNs is shown. For ChIP-seq, pooled data from 2 biological replicates of control mOSNs is shown. For RNA-seq, representative tracks are shown for one biological replicate from each condition. DOI: https://doi.org/10.7554/eLife.28620.046

Monahan et al. eLife 2017;6:e28620. DOI: https://doi.org/10.7554/eLife.28620 26 of 30 oto O t2weso g.Nce r aee ihDP (blue). DAPI with labeled are Nuclei age. of weeks 2 at MOE control aee ihDP bu) ( (blue). DAPI with labeled N nst yrdzto ihpoefor probe with hybridization situ in RNA ie p00,*p00,totie tdn’ -et o idtp ien=3 o SH eeoyosadhmzgu ien=4 ( 4. = n mice homozygous and heterozygous LSCHR for 3, = n mice wild-type For t-test. student’s two-tailed **p<0.01, *p<0.05, mice. rue ypeec nieo usd h Rcutrcontaining cluster OR the outside or inside presence by grouped O:https://doi.org/10.7554/eLife.28620.053 DOI: 7. Figure See f3we l SH ieadwl-yeltemt otos rncitlvl r xrse sqatt eaieto relative quantity as expressed are levels Transcript controls. littermate wild-type and mice LSCHR old week 3 of Monahan iue7fgr upeet1 supplement 7—figure Figure C A targeting vector chr1 Olfr12 Olfr12 DAPI ut-nacrhb ciaeO rncito.( transcription. OR activate hubs Multi-enhancer tal et Olfr1416 eerharticle Research Lf 2017;6:e28620. eLife . Olfr1415

Rhodes +/+ Olfr1414

D 3x attP50 loecn N nst yrdzto ihpoefor probe with hybridization situ in RNA Fluorescent ) LoxP Lipsi (chr1:92,545,813-92,546,514)

Sfaktiria LSCHR Crete o hPqC nlsso h2bnigt h netdGekIlns ( Islands. Greek inserted the to binding Lhx2 of analysis qPCR ChIP for O:https://doi.org/10.7554/eLife.28620 DOI: H Olfr12 Rhodes Rhodes

Rhodes gen in (green)

Rhodes LSCHR/LSCHR Frt Neo

Frt LSCHR A LoxP Olfr1413 agtdisrino re sad ( Islands Greek 5 of insertion Targeted )

Olfr1412 ooyosadwl-yeltemt oto O t2weso g.Nce are Nuclei age. of weeks 2 at MOE control littermate wild-type and homozygous

Rhodes Olfr1411

Olfr1410

n ihnec ru R r ree ylvlo xrsini wild-type in expression of level by ordered are ORs group each within and , Olfr12 D

Olfr1410 Olfr1410 DAPI B Quantity Relative to Adcy3 0.05 0.15 0.25 0.35 0.45 0.1 0.2 0.3 0.4 Olfr1410 0 Olfr1339 Olfr444 gen in (green) Rhodes +/+ Olfr62 ORs OutsideRhodesCluster

Olfr1341 Rhodes +/+ Olfr1342

LSCHR Olfr1302

LSCHR Olfr1357 LSCHR Knock-In RT-qPCRLSCHR Olfr1305 hds+LCRRhodesLSCHR/LSCHR Rhodes +/LSCHR daetto adjacent ) Olfr878 B ooyosadwl-yelittermate wild-type and homozygous

TqC fO rncitlvl nMOEs in levels transcript OR of RT-qPCR ) Olfr1299 Olfr1340

Adcy3 Olfr1349 Olfr1297 Rhodes LSCHR/LSCHR Rhodes Olfr1301 ro asaeSM R are ORs SEM. are bars error , Olfr1507

** Olfr1410 ** ee n Chromosomes and Genes Rhodes ClusterORs oriae r mm10. are Coordinates . Olfr1413 ***

Olfr1415 **

Olfr1414 ** ** C

Fluorescent ) Olfr1411 * * ** Olfr1412 ** ** Olfr12 ** Olfr1416 7o 30 of 27 Research article Genes and Chromosomes

A Lhx2 ChIP from MOE Normalized to Input Control Non-LSCHR Greek Islands Greek Islands in LSCHR Sites 0.3%

0.2% Percent Input 0.1%

0.0% H Lipsi Tinos Ikaria Crete Chios Sifnos Rhodes Sfaktiria Keffalonia Lhx2 Positive Lhx2 Negative 5 Rhodes +/+ Rhodes LSCHR/LSCHR

B Input Chromatin for Lhx2 ChIP from MOE

Control Sites Non-LSCHR Greek Islands Greek Islands in LSCHR 3.5

3

2.5

2

1.5

1

Normalized Quantity of Input DNA 0.5

0 H Lipsi Tinos Ikaria Crete Chios Sifnos Rhodes Sfaktiria Keffalonia Lhx2 Positive Lhx2 Negative 1 Lhx2 Negative 2 Lhx2 Negative 3 Lhx2 Negative 4 Lhx2 Negative 5

Rhodes +/+ Rhodes LSCHR/LSCHR

C Lhx2 ChIP from MOE Normalized to External Controls Control Non-LSCHR Greek Islands Greek Islands in LSCHR Sites 0.6%

0.5%

0.4%

0.3%

0.2% Normalized Percent Input 0.1%

0.0% H Lipsi Tinos Ikaria Crete Chios Sifnos Rhodes Sfaktiria Keffalonia Lhx2 Positive Lhx2 Negative 5

Rhodes +/+ Rhodes LSCHR/LSCHR

Figure 7—figure supplement 1. Lhx2 binds to inserted Greek Islands. (A) Quantitative PCR analysis of Lhx2 ChIP performed on MOE chromatin from wild-type and Rhodes LSCHR/LSCHR mice. Data is shown for control Lhx2 Figure 7—figure supplement 1 continued on next page

Monahan et al. eLife 2017;6:e28620. DOI: https://doi.org/10.7554/eLife.28620 28 of 30 Research article Genes and Chromosomes

Figure 7—figure supplement 1 continued negative and Lhx2 positive sites located outside OR clusters, for 5 Greek Islands that were not included in the LSCHR knock-in, and for the 5 Greek Islands included in the knock-in. For four additional negative control primer sets ChIP signal was not detectable; these are not shown. Percent recovery of input DNA was calculated for each sample. Plots show the mean and error bars show the range for two biological replicates for each genotype. (B) Quantitative PCR analysis of input chromatin for MOE Lhx2 ChIP experiments. For each sample, signal observed with each primer set is normalized to the mean signal observed at 6 external control sites located outside OR clusters. Plots show the mean and error bars show the range for two biological replicates for each genotype. (C) Quantitative PCR analysis of MOE Lhx2 ChIP normalized to control sites located outside of OR clusters. For each site, percent recovery of DNA was calculated relative to the mean input signal observed at non-OR external control sites, rather than the input control for that site, which is shown in (B). DOI: https://doi.org/10.7554/eLife.28620.054

Monahan et al. eLife 2017;6:e28620. DOI: https://doi.org/10.7554/eLife.28620 29 of 30 Research article Genes and Chromosomes

A

Ebf Lhx2

H3K9me3 OR Greek Island

B

OR C

OR

Figure 8. A Hierarchical Model for OR gene choice. (A) Lhx2 and Ebf bind in a functionally cooperative fashion on the composite motifs of the Greek Islands. Because these motifs are not juxtaposed in most OR promoters, Lhx2 and Ebf cannot overcome the heterochromatic silencing of OR promoters, thus their binding is restricted to the OR enhancers. (B) Lhx2/Ebf bound OR enhancers are not strong enough to activate proximal OR alleles on their own and to facilitate stable transcription factor binding on their promoters. (C) Lhx2/Ebf bound Greek Islands form an interchromosomal, multi-enhancer hub that recruits coactivators essential for the de-silencing of OR promoters and robust transcriptional activation of the OR allele that would be recruited to this hub. DOI: https://doi.org/10.7554/eLife.28620.055

Monahan et al. eLife 2017;6:e28620. DOI: https://doi.org/10.7554/eLife.28620 30 of 30