ENCODE and modENCODE Consortia Meeting May 23-25, 2011

Crystal City Hyatt Regency Ballroom Crystal City, VA

AGENDA Monday, May 23, 2011

7:30 a.m. Continental Breakfast and Registration Location: Regency Ballroom Foyer

8:30 a.m. Welcome and Introduction Mark Guyer

8:35 a.m. Goals of the Meeting Elise Feingold

8:45 a.m. Summary of Production Activities Peter Good

9:00 a.m. modENCODE Accomplishments and Challenges Robert Waterston Manolis Kellis Susan Celniker Mark Gerstein

9:45 a.m. Coffee Break

10:00 a.m. ENCODE Accomplishments and Challenges Ian Dunham

10:45 a.m. Challenges in Data Coordination and Display

modENCODE Lincoln Stein
ENCODE James Kent

11:30 a.m. Consortia photo

11:45 a.m. Lunch on your own

1:15 p.m. Functional Annotation of GWAS SNPs Ross Hardison

1:30 p.m. Keynote Speaker: David Altshuler, Human Genetics and Disease

2:00 p.m. Discussion: Utility of Catalogs of Functional Elements in Human Health and Disease Studies Bradley Bernstein

2:30 p.m. Breakout Session #1: Maximizing the Utility of the mod/ENCODE Data to the Biomedical Research Community

Group 1: Utility of the modENCODE data to the model organism, human genetics, medical research communities Co-chairs: Jason Lieb, Susan Celniker Meeting room: Washington Room

Group 2: Utility of the ENCODE data to the general biomedical, human genetics and medical research communities Co-chairs: John Stamatoyannopoulos, Tom Gingeras Meeting Room: Regency Ballroom

3:15 p.m. Coffee Break

3:30 p.m. Lightning Talks Moderator: Michael Pazin 5 minute talks, no questions

Quality control and reproducibility measures for automatic threshold detection in ChIP-seq datasets Speaker: Qunhua Li

Analysis of DNA methylation in a large number of human cell lines and tissues Speaker: Jason Gertz

High Occupancy Target (HOT) regions in C. elegans Speaker: Eric Van Nostrand

Combinatorial patterning of chromatin regulators uncovered by genome-wide location analysis in human cells Speaker: Oren Ram

Integrative models of transcription factor binding profiles Speaker: Anshul Kundaje

A locus control region anchors a trans-genomic regulatory circuit Speaker: Hao Wang

ChIP-seq regulatory analysis using ChIA-PET Speaker: Ali Mortazavi

Novel antisense and intergenic spliced human transcripts and their enriched localization in nuclei Speaker: Alex Dobin

Genome-wide analysis of splicing regulation in Drosophila melanogaster by RNAi depletion of 58 RNA binding proteins Speaker: Steven Brenner

Comparison of the D. melanogaster and C. elegans developmental timecourse with RNA-Seq transcriptomics data Speaker: Jingyi Jessica Li

An Integrative pipeline to link disease associations with functional ENCODE data Speaker: Marc A. Schaub

Tissue-specific binding site profiling in vivo Speaker: Valerie Reinke

5:00 p.m. Dinner on your own

6:00 p.m. Joint ENCODE-modENCODE AWG Session Meeting Room: Regency Ballroom

9:00 p.m. Adjourn

Tuesday, May 24, 2011

Posters can be set up in the Independence Center beginning at 8:00 a.m.

7:30 a.m. Continental Breakfast Location: Regency Ballroom Foyer ECP and ENCODE PIs: Chesapeake Grill

8:30 a.m. Comparative Analyses

Human/Mouse Michael Snyder

Report back from Joint AWG session Ewan Birney Manolis Kellis

9:30 a.m. Report Back From Breakout Session #1 Breakout #1 Co-Chairs

modENCODE 10 minute presentation, 10 minute discussion
ENCODE 10 minute presentation, 10 minute discussion
General Discussion 20 minutes

10:30 a.m. Coffee Break

10:45 a.m. Implementation of NHGRI Strategic Plan Eric Green

11:15 a.m. Future of ENCODE and modENCODE Elise Feingold Peter Good

11:45 a.m. Lunch on your own

Executive session of ECP with NHGRI Staff: Tidewater I Mouse ENCODE: Cinnabar (hotel restaurant)

1:15 p.m. Hot Topics Moderator: Michael Pazin 12 minute talks, 3 minutes for questions

Scoring ENCODE data quality Speaker: Robert Thurman

Chromatin signatures of active and silent genes on Drosophila chromosome 4 – a unique regulatory system Speaker: Nicole Riddle

Long-range interaction networks in the ENCODE pilot regions Speaker: Amartya Sanyal

Duplicated sequences in non canonical introns: a way to enhance splicing? Speaker: Sarah Djebali

Elucidating the Regulatory Code: Cell type-specific transcription factor co-associations and gene expression Speaker: Manoj Hariharan

Profiling the subnucleosomal active chromatin landscape at single base-pair resolution Speaker: Steven Henikoff

2:45 p.m. Coffee Break Poster setup in Independence Center

3:15 p.m. Breakout Session #2: Future of Functional Genomics

Group 1: Application to disease Co-chairs: Kevin White, Michael Snyder Meeting room: Washington Room

Group 2: Comparative analysis and data integration Co-chairs: Lincoln Stein, Mark Gerstein Meeting room: Regency Ballroom

Group 3: Technology development Co-chairs: Steven Henikoff, Richard Myers, David MacAlpine Meeting room: Conference Theater

5:00 p.m. Poster Session Location: Independence Center

7:00 p.m. Dinner on your own PIs meet in lobby at 6:50 p.m. to take the shuttle to restaurant reservations.

Wednesday, May 25, 2011

8:00 a.m. Continental Breakfast Location: Regency Ballroom Foyer ECP members and modENCODE PIs: Chesapeake Grill

9:00 a.m. Report from Breakout Session #2 Breakout #2 Co-Chairs

Application to disease 10 minute presentation, 10 minute discussion
Comparative analysis 10 minute presentation, 10 minute discussion
Technology development 10 minute presentation, 10 minute discussion
General discussion 20 minutes

10:20 a.m. Coffee break and time for checkout

11:00 a.m. Updates from Related Projects and Discussion of Opportunities for Collaboration

Common Fund Update Peggy Farnham
Common Fund Genotype-Tissue Expression (GTEx) Update Jeff Struewing
29 Mammals Update Manolis Kellis
Additional Fly/Worm Species Sequence Annotation Update Manolis Kellis

12:00 noon Interactome/Networks Discussion Moderator: Marc Vidal

12:30 p.m. Planning for Fifth Year of Consortia Elise Feingold Peter Good

1:00 p.m. Feedback from ECP John Lis

1:15 p.m. Meeting Summary

1:30 p.m. Adjourn Lunch on your own

3:00 p.m. Begin ENCODE-Roadmap Epigenomics Joint Meeting

ENCODE and Roadmap Epigenomics Joint Session May 25, 2011

Crystal City Hyatt Tidewater II Crystal City, VA

AGENDA Wednesday, May 25, 2011

3:00 p.m. Welcome and charge for joint session

Objectives:
- Increase communication, coordination and transparency between the ENCODE and Roadmap Epigenomics Programs.
- Improve ability to use data from both programs for analyses.
- Obtain recommendations for maximizing accessibility and utility of data to research community.

3:05 p.m. Opportunities for Directed Data Generation to “Complete” Datasets Bradley Bernstein

- Is it important to have any data overlap? If so, how much overlap is required, what data types should we have overlap for, and what questions do we want to ask of this overlapping data?
- Are there cell types that are being analyzed only by ENCODE or REMC that should be analyzed by both?
- Can we determine what assays are most cost effective and most informative? Can we use such information to make future experiments more efficient? (e.g., for DNA methylation, Bisulfite-seq v. RRBS v. MeDIP-ChIP-seq; for open chromatin, DNase fragments v. DNase ends v. FAIRE; for enhancer marks, H3K4me1 v. H3K27ac v. p300 v. H3K18ac)
- Are there assays in ENCODE that should become part of the epigenome? Pol II, p300, RNA-seq?
- Are there Epigenomics assays that should become part of the ENCODE repertoire?
- Can we compare ENCODE and REMC data standards and data quality? If so, do we understand how they compare, or who should be performing the comparison? If comparisons are not possible now, should we work on this, and how should we do it?

3:55 p.m. Opportunities for Integrated Data Analysis Manolis Kellis

- What new comparisons or biological questions are possible by combining ENCODE and REMC data?
- How can we prioritize samples and marks across the two consortia? What is the best way to learn which marks are most informative, and what marks are closest to being redundant, to allow future work to focus only on the assays that are most informative? Does the informativeness of a mark depend on the cell/tissue being assayed?
- Are there fundamental differences between types of samples? For example, cell lines v. primary cells, and primary cells v. tissues? Are there fundamental differences between cell types; for example, stem cells v. differentiated cells, or neurons v. keratinocytes? Are there different rules for identifying chromatin states, novel chromatin states, different rules for location of individual marks, different abundance of marks, or differences in technical qualities such as signal to noise ratio?
- Cross-consortium analysis examples: Analysis of variation between individuals in the same cell type, what marks are most informative? Analysis of variation between cell types, what marks are most informative?

4:55 p.m. Maximizing Accessibility and Utility of Data to the Research Community John Stamatoyannopoulos

- How can new users learn to understand what data are available, where to find it, what it is useful for?
- Where can users find data from both groups together, for uniform display and direct comparison? What are the strengths and weaknesses of current data display options?
- How can we make our findings readily accessible? How can we prepare for the different levels of expertise/comfort with using browsers and other genomics databases?
- What mechanisms are needed to ensure that the data display meets the needs of the different user communities?
- What are the three most common questions a naïve user would have regarding applying epigenomics data to their research? How could the ENCODE/Epigenomics programs address these use cases rapidly and effectively?

6:00 p.m. Adjourn
