Single Cell Biology 11-13 March 2020 Abstract Book

@ACSCevents wellcomegenomecampus.org/coursesandconferences


Single Cell Biology

Wellcome Genome Campus Conference Centre, Hinxton, Cambridge, UK 11-13 March 2020

Scientific Programme Committee:

Ellen Rothenberg California Institute of Technology, USA

Sarah Teichmann Wellcome Sanger Institute, UK

Fabian Theis Helmholtz Zentrum München, Germany

Itai Yanai New York University, USA

Tweet about it: #SCB20

@ACSCevents /ACSCevents /c/WellcomeGenomeCampusCoursesandConferences


Wellcome Genome Campus Scientific Conferences Team:

Treasa Creavin, Scientific Programme Manager

Laura Wyatt, Conference and Events Manager

Dear colleague,

I would like to offer you a warm welcome to Single Cell Biology. I hope you will find the talks interesting and stimulating, and find opportunities for meeting colleagues, making new connections and forming new and exciting collaborations throughout your time here with us.

The conference is organised by Wellcome Genome Campus Advanced Courses and Scientific Conferences (ACSC), which is run on a not-for-profit basis, funded by the . ACSC funds, develops and delivers training and conferences that span basic research, cutting-edge biomedicine, and the application of genomics in healthcare. Our scientific programme committees, speakers and instructors are world-renowned scientists and clinicians. We run ~60 events each year, attracting up to 3,500 people from ~130 countries to the Campus.

Our programme includes a range of conferences and laboratory-, computational- and discussion-based courses, providing hands-on training in the latest biomedical techniques for research scientists, clinicians and healthcare professionals. We also organise invitation-only retreats for high-level discussion on emerging science, technologies and strategic direction for select groups and policy makers. To enable everyone to benefit from the revolution in genomic medicine, we have recently introduced an online courses programme to provide training across the globe for free. To find out more about our programme, please visit: https://coursesandconferences.wellcomegenomecampus.org/

We also have a strong commitment to equality, diversity and inclusion across the programme. We provide funding to support childcare, or extra costs for dependants, while attending a conference or course. There is also a family room for parents, to accommodate feeding and napping. Delegates can stay involved in the conference, as the talks will be live-streamed into this room. To further promote a culture of inclusion and equal representation at our conferences, we ensure that 50% of our programme committees, session chairs and invited speakers are women. We also work with our programme committees to invite speakers from a range of countries. To read more about our policies, please visit: https://coursesandconferences.wellcomegenomecampus.org/about-us/policies/

The conference team are here to help this meeting run smoothly, and at least one member will be at the registration desk between sessions, so please do come and speak with us if you have any queries.

Finally, enjoy the conference.

Best wishes,

Dr Rebecca Twells
Head of Advanced Courses and Scientific Conferences
[email protected]

General Information

Conference Badges Please wear your name badge at all times to promote networking and to assist staff in identifying you.

Scientific Session Protocol Photography, audio or video recording of the scientific sessions, including the poster sessions, is not permitted.

Social Media Policy To encourage the open communication of science, we would like to support the use of social media at this year’s conference. Please use the conference hashtag #SCB20. You will be notified at the start of a talk if a speaker does not wish their talk to be open. For posters, please check with the presenter to obtain permission.

Internet Access Wifi access instructions:
• Join the ‘ConferenceGuest’ network
• Enter your name and email address to register
• Click ‘continue’ – this will provide a few minutes of wifi access and send an email to the registered email address
• Open the registration email, follow the link ‘click here’ and confirm the address is valid
• Enjoy seven days’ free internet access!
• Repeat these steps on up to 5 devices to link them to your registered email address

Presentations Please provide an electronic copy of your talk to a member of the AV team who will be based in the meeting room.

Poster Sessions Posters will be displayed throughout the conference. Please display your poster in the Conference Centre on arrival. There will be two poster sessions during the conference.

Odd number poster assignments will be presenting in poster session 1, which takes place on Wednesday, 11 March at 18:00 – 19:30.

Even number poster assignments will be presenting in poster session 2, which takes place on Thursday, 12 March, at 18:00 – 19:30.

The page number of your abstract in the abstract book indicates your assigned poster board number. An index of poster numbers appears in the back of this book.

Conference Meals and Social Events Lunch and dinner will be served in the Hall, apart from on Wednesday, 11 March when it will be served in the Conference Centre. Please refer to the conference programme in this book as times will vary based on the daily scientific presentations. Please note there are no lunch or dinner facilities available outside of the conference times.

All conference meals and social events are for registered delegates. Please inform the conference organiser if you are unable to attend the conference dinner.

The Kitchen Garden Bar (cash bar) will be open from 19:00 – 23:00 each day.

Dietary Requirements If you have advised us of any dietary requirements, you will find a coloured dot on your badge. Please make yourself known to the catering team and they will assist you with your meal request.

If you have a gluten or nut allergy, we are unable to guarantee the non-presence of gluten or nuts in dishes, even if they are not used as a direct ingredient. This is due to gluten and nut ingredients being used in the kitchen.

For Wellcome Genome Campus Conference Centre Guests Check in If you are staying on site at the Wellcome Genome Campus Conference Centre, you may check into your bedroom from 14:00. The Conference Centre reception is open 24 hours.

Breakfast Your breakfast will be served in the Hall restaurant from 07:30 – 09:00.

Telephone If you are staying on-site and would like to use the telephone in your room, you will need to contact the Reception desk (Ext. 5000) to have your phone line activated – they will require your credit card details to do so.

Departures You must vacate your room by 10:00 on the day of your departure. Please ask at reception for assistance with luggage storage in the Conference Centre.

For Holiday Inn Express & Red Lion, Whittlesford Bridge Hotel Guests Check in If you are staying at the Holiday Inn Express you may check into your room from 14:00. Hotel staff are on hand 24 hours a day.

Breakfast and Dining Your breakfast will be served in the hotel from 06:30 – 09:30. The hotel also offers a relaxed licensed bar and lounge area.

Telephone and Internet A telephone and complimentary wifi internet access are available in your room.

Departures You must vacate your room by 12:00 on the day of your departure. A luggage store is available at the Conference Centre.

Wellcome Genome Campus Scientific Conferences guests receive a 15% discount on food at the Red Lion, Whittlesford Bridge Hotel. Please note there is a parking charge of £6 per day.

Transfers If you are staying off site, a complimentary shuttle service has been organised with Richmond’s Coaches. The shuttle service is as follows:

Wednesday, 11 March
Hills Road, Cambridge – Conference Centre 12:00
Holiday Inn Express & Red Lion, Whittlesford – Conference Centre 12:00
Conference Centre – Holiday Inn Express & Red Lion, Whittlesford 21:30

Thursday, 12 March
Holiday Inn Express & Red Lion, Whittlesford – Conference Centre 08:30
Conference Centre – Holiday Inn Express & Red Lion, Whittlesford 21:30

Friday, 13 March
Holiday Inn Express & Red Lion, Whittlesford – Conference Centre 08:30

Taxis Please find a list of local taxi numbers on our website. The conference centre reception will also be happy to book a taxi on your behalf.

Return Ground Transport Complimentary departure transport from the Conference Centre has been arranged on Friday, 13 March at the following times:

 13:30 to Heathrow airport  13:40 to Stansted airport  13:50 to Cambridge train station and city centre (Downing Street).

A sign-up sheet will be available at the conference registration desk from 15:00 on Wednesday, 11 March until lunchtime on Thursday, 12 March. Please note that places are limited and will be allocated on a first come first served basis.

Please allow a 30-40 minute journey time to both Cambridge and Stansted Airport, and two hours to Heathrow.

Messages and Miscellaneous Lockers are located outside the Conference Centre toilets and are free of charge.

All messages will be available for collection from the registration desk in the Conference Centre.

A variety of toiletry and stationery items are available for purchase at the Conference Centre reception. Cards for our self-service laundry are also available.

Certificate of Attendance A certificate of attendance can be provided. Please request one from the conference organiser based at the registration desk.

Contact numbers
Wellcome Genome Campus Conference Centre – 01223 495000 (or Ext. 5000)
Wellcome Genome Campus Conference Organiser (Laura) – 07733 338878

If you have any queries or comments, please do not hesitate to contact a member of staff who will be pleased to help you.

Conference Summary

Wednesday, 11 March

12:00-13:20 Registration with lunch
13:20-13:30 Welcome and Introductions
13:30-15:00 Session 1: Disease
15:00-15:30 Afternoon Tea
15:30-17:30 Session 2: Modeling development
17:30-18:00 Lightning talks
18:00-19:30 Poster Session 1 (odd numbers) with drinks reception
19:30 prompt Buffet Dinner
19:30 Cash Bar

Thursday, 12 March

09:00-10:30 Session 3: Dissecting Development
10:30-11:00 Morning Coffee
11:00-12:30 Session 4: Mapping development
12:30-14:00 Lunch
14:00-15:30 Session 5: Computational approaches
15:30-16:00 Afternoon Tea
16:00-17:30 Session 6: Cell dynamics
17:30-18:00 Lightning talks
18:00-19:30 Poster Session 2 (even numbers) with drinks reception
19:30 prompt Silver Service Conference Dinner
19:30 Cash Bar

Friday, 13 March

09:00-10:30 Session 7: Cell dynamics / New technology
10:30-11:00 Morning Coffee
11:00-12:45 Session 8: Spatial tech
12:45-13:30 Lunch
13:30 Coach departs to Heathrow Airport
13:40 Coach departs to Stansted Airport
13:50 Coach departs to Cambridge Train Station and City Centre

Conference Sponsors

www.10xgenomics.com www.takarabio.com

www.bio-techne.com www.cellmicrosystems.com

www.cytena.com www.dolomite-bio.com

www.opticalbiosystems.com www.partek.com

www.sptlabtech.com

Single Cell Biology

Wellcome Genome Campus Conference Centre, Hinxton, Cambridge

11-13 March 2020

Lectures to be held in the Auditorium
Lunch and dinner to be held in the Hall Restaurant
Poster sessions to be held in the Conference Centre

Spoken presentations - If you are an invited speaker, or your abstract has been selected for a spoken presentation, please give an electronic version of your talk to the AV technician.

Poster presentations – If your abstract has been selected for a poster, please display this in the Conference Centre on arrival.

Conference programme

Wednesday, 11 March

12:00-13:20 Registration with lunch

13:20-13:30 Welcome and Introductions Sarah Teichmann, Wellcome Sanger Institute, UK

13:30-15:00 Session 1: Disease Chair: Fabian Theis, Helmholtz Zentrum München GmbH, Germany

13:30 Putting cells into context with spatial transcriptomics and single-cell RNA-Seq Itai Yanai New York University, USA

14:00 Temporal & spatial dynamics of human endometrium one cell at a time Roser Vento Wellcome Sanger Institute, UK

14:30 Single-cell multi-omics reveals the epigenetic encoding of glioma cell states Federico Gaiti Weill Cornell Medicine, USA

14:45 Multi-modal single-cell profiling reveals colonic CD8+ topography in IBD Agne Antanaviciute University of Oxford, UK

15:00-15:30 Afternoon Tea

15:30-17:30 Session 2: Modeling development Chair: Fabian Theis, Helmholtz Zentrum München GmbH, Germany

15:30 Mapping mammalian cell landscapes by single cell mRNA-seq Guoji Guo Zhejiang University School of Medicine, China

16:00 Single-cell recording of lineage and transcriptional regulation in direct reprogramming Samantha Morris Washington University, USA

16:30 Human organ development through the lens of single cell genomics Barbara Treutlein ETH Zurich, Switzerland

17:00 Mutating autism risk in brain organoids grown with multiple differentiation protocols Amanda Kedaigle Broad Institute of MIT and Harvard, USA

17:15 The tempo and mode of host-pathogen interactions during bacterial infections Gal Avital NYU Langone Health, USA

17:30-18:00 Lightning talks

18:00-19:30 Poster Session 1 (odd numbers) with drinks reception

19:30 prompt Buffet Dinner

19:30 Cash Bar

Thursday, 12 March

09:00-10:30 Session 3: Dissecting Development Chair: Itai Yanai, New York University, USA

09:00 Deconvoluting mechanisms of intervertebral disc degeneration by single cell transcriptomics and lineage analyses Kathy Cheah The University of Hong Kong, China

09:30 Studying human liver development using single cell analyses and organoids Ludovic Vallier UK

10:00 Dissection of the global governing cardiac neural crest development Shashank Gandhi California Institute of Technology, USA

10:15 progression is not responsible for the transcriptional rewiring and loss of self-renewal capacity in ex vivo cultured human Haematopoietic Stem Cells (HSCs) Carys Johnson Cambridge Stem Cell Institute, UK

10:30-11:00 Morning Coffee

11:00-12:30 Session 4: Mapping development Chair: Ellen Rothenberg, California Institute of Technology, USA

11:00 Dissecting the regulatory switches driving cell fate trajectories during embryonic development one cell at a time Eileen Furlong EMBL, Germany

11:30 Decoding the developing human immune system Muzz Haniffa Newcastle University, UK

12:00 Using single-cell RNA sequencing to map lymphoid cell state and differentiation across human fetal haematopoietic organs Simone Webb Newcastle University, UK

12:15 Single-cell transcriptome of gastrulating human embryo Elmir Mahammadov Helmholtz Munich, Germany

12:30-14:00 Lunch

14:00-15:30 Session 5: Computational approaches Chair: Samantha Morris, Washington University, USA

14:00 Integrated analysis of single-cell data across modalities and technologies Rahul Satija New York University, USA

14:30 Modeling cellular response across perturbations and spatial context Fabian Theis Helmholtz Zentrum München, Germany

15:00 Revealing dynamics of gene expression variability in cell state space Dominic Grün Max Planck Institute of Immunobiology and Epigenetics, Germany

15:15 Joint modeling of transcriptome and surface proteome enhances single-cell data analysis Zoe Steier University of California, Berkeley, USA

15:30-16:00 Afternoon Tea

16:00-17:30 Session 6: Cell dynamics Chair: Ellen Rothenberg, California Institute of Technology, USA

16:00 Long-term single-cell quantification: New tools for old questions Timm Schroeder ETH Zurich, Switzerland

16:30 Regulating the rheostat in killer T cells Gillian Griffiths University of Cambridge, UK

17:00 Single-cell resolution of the human germinal centre reveals novel B cell states and antibody maturation dynamics Hamish King Queen Mary University of London, UK

17:15 Reconstructing foetal blood development at the single-cell level through somatic mutations Anna Maria Ranzoni University of Cambridge, UK

17:30-18:00 Lightning talks

18:00-19:30 Poster Session 2 (even numbers) with drinks reception

19:30 prompt Silver Service Conference Dinner

19:30 Cash Bar

Friday, 13 March

09:00-10:30 Session 7: Cell dynamics / New technology Chair: Itai Yanai, New York University, USA

09:00 Pairing droplet microfluidics with FACS for ultra-high-throughput single-cell analysis Polly Fordyce Stanford University, USA

09:30 Live-seq: Measuring transcriptomes of live cells Wanze Chen EPFL, Switzerland

09:45 Probing multicellular mechanisms of immune modulation with massively parallel single-cell mRNA-seq Sisi Chen Caltech, USA

10:00 Generalizing RNA velocity to transient cell states through dynamical modeling Volker Bergen Helmholtz Munich, Germany

10:15 Leveraging time-lapse and kinship information in single cell RNA Sequencing Arne Wehling ETH Zurich, Switzerland

10:30-11:00 Morning Coffee

11:00-12:45 Session 8: Spatial tech Chair: Sarah Teichmann, Wellcome Sanger Institute, UK

11:00 Imaging the transcriptomes in tissue and disease Joakim Lundeberg KTH Royal Institute of Technology, Sweden

11:30 Spatial maps of molecularly defined cell types by in situ sequencing Mats Nilsson Stockholm University, Sweden

12:00 Spatially resolved cell atlasing via integration of spatial and single nucleus transcriptomics Omer Bayraktar Wellcome Sanger Institute, UK

12:15 A human single-cell atlas of the Substantia nigra reveals novel cell-specific pathways associated with the genetic risk of Parkinson’s disease and neuropsychiatric disorders Viola Volpato DRI at Cardiff University, UK

12:30 Closing remarks Ellen Rothenberg, California Institute of Technology, USA

12:45-13:30 Lunch

13:30 Coach departs to Heathrow Airport
13:40 Coach departs to Stansted Airport
13:50 Coach departs to Cambridge Train Station and City Centre

These abstracts should not be cited in bibliographies. Materials contained herein should be treated as personal communication and should be cited as such only with consent of the author.

Spoken Presentations

Putting cells into context with spatial transcriptomics and single-cell RNA-Seq Itai Yanai

New York University Grossman School of Medicine, NYU Langone Health, 435 East 30th Street, Eighth Floor, Room 817, New York, NY 10016, USA

The integration of single-cell and spatial transcriptomics provides us with an opportunity for important insights into the architecture of tissues under healthy and diseased conditions. In this talk, I will describe our computational approach for integration, where an emerging principle is the study of distinct cell states according to gene modules. I will also present our analysis of tumorigenesis and bacterial infection using these approaches.

Temporal & spatial dynamics of human endometrium one cell at a time

Roser Vento

Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridge, UK

Abstract not available at time of printing

Single-cell multi-omics reveals the epigenetic encoding of glioma cell states

Federico Gaiti, Ronan Chaligne, Dana Silverbush, Joshua Schiffman, Sunil Deochand, Hannah Weisman, Alyssa Richman, Caroline Sheridan, Alicia Alonso, Mario L. Suvà, Dan-Avi Landau

New York Genome Center, New York, NY, USA Weill Cornell Medicine, New York, NY, USA Massachusetts General Hospital and Harvard Medical School, Boston, MA, USA Broad Institute of Harvard and MIT, Cambridge, MA, USA

Cancers harbor significant cell-state diversity, enabling progression and resistance to therapy. Single-cell RNAseq recently charted the cellular states of diffuse gliomas, such as IDH-mutant gliomas (IDH-G) and IDH-wildtype glioblastoma (GBM). These studies showed that malignant cells in these tumors partly recapitulate neural differentiation trajectories; however, their genetic and epigenetic determinants remain uncharted.

To address these questions, we generated multi-omics single-cell DNA methylome, transcriptome and mutation genotyping of 1,824 cells from IDH-G (n=5) and GBM (n=7) samples. Stochastic DNA methylation changes (epimutations) were higher in IDH-G cells compared with GBM cells, also linked to higher cell-to-cell transcriptional variation. Thus, IDH mutation impacts the epigenetic state of the cells, leading to higher transcriptional dysregulation. Epimutations, acting as a cellular molecular clock and correlating with the proliferative history of the cell, were higher in more differentiated malignant cells (- /mesenchymal-like cells) compared with progenitor-like cells (NPC-like/OPC-like).

To directly examine the epigenetic encoding of glioma cell states, we compared the methylomes of progenitor-like cells with differentiated glioma cells, and identified Polycomb Repressive Complex 2 (PRC2) targets as the most strongly differentially methylated. Thus, similar to neurodevelopmental processes, PRC2 is a key switch between stem-like and differentiated malignant glioma cells.

These results raised the question of the stability and inheritance of glioma cell states. We therefore utilized stochastic heritable DNA methylation changes to infer high-resolution lineage histories of glioma cells, orthogonally validated through genetic mutation data. The leaves of the trees were annotated by cell state, enabling us to interrogate the cell state similarity as a function of lineage distance. The mathematical modeling of cell state dynamics further allowed us to derive the rate of dedifferentiation to stem-like states directly in patient samples. Together, these data revealed greater cell state plasticity in GBM compared to more stable differentiation hierarchies in IDH-G.

Overall, our work offers direct integration of malignant cell programs, their plasticity, and their modulation by epigenetic drivers directly in glioma patients, providing unprecedented insight into the multi-dimensional underpinning of IDH-G and GBM evolution.

Multi-modal single-cell profiling reveals colonic CD8+ topography in IBD

Agne Antanaviciute1,3*, Daniele Corridoni1,2*, Tarun Gupta1,2*, David Fawkner-Corbett1,2, Marta Jagielowicz1,2, Kaushal Parikh1,2, Emmanouela Repapi4, Steve Taylor4, Dai Ishikawa5, Ryo Hatano6, Wei Xin7, Hubert Slawinki8, Rory Bowden8, Giorgio Napolitani1, Chikao Morimoto6, Hashem Koohy1,3§, Alison Simmons1,2§

1Medical Research Council (MRC) Human Unit, MRC Weatherall Institute of Molecular Medicine (WIMM), John Radcliffe Hospital, University of Oxford, Oxford, UK.
2Translational Gastroenterology Unit, John Radcliffe Hospital, Oxford, UK.
3MRC WIMM Centre For , MRC Weatherall Institute of Molecular Medicine, John Radcliffe Hospital, University of Oxford, Oxford, UK.
4Computational Biology Research Group, University of Oxford, John Radcliffe Hospital, Oxford, OX3 9DS, UK.
5Department of Gastroenterology, Juntendo University School of Medicine, Tokyo, Japan.
6Department of Therapy Development and Innovation for Immune Disorders and Cancers, Juntendo University, Tokyo, Japan.
7Department of Pathology, Case Western Reserve University, Cleveland, Ohio, of America
8Wellcome Trust Centre for Human Genetics, University of Oxford, Oxford OX3 7BN, UK.
*These authors contributed equally.
§Correspondence: [email protected] and [email protected]

The pathology of inflammatory bowel diseases (IBDs), such as ulcerative colitis (UC), involves immune-mediated tissue destruction secondary to a combination of barrier dysfunction, genetic risk and dysbiosis. Studies in murine models have extensively examined pathogenic immune infiltrates in IBD, where multiple components of the cellular immune response could drive disease pathology. Despite emerging evidence for a role of CD8+ T cells in contributing to IBD pathology, the extent of heterogeneity, transcriptional regulation and effector function of distinct populations has not been investigated in an unbiased manner. Furthermore, their connections, hierarchy and how they may dynamically remodel to influence inflammation in IBD are unclear. Here, we conducted multi-modal single cell profiling of more than 80,000 CD8+ cells from the human colon in health and UC using a combination of CITE-seq, scRNA-seq, single cell VDJ sequencing and mass cytometry. Single cell RNA and protein expression profiles, coupled with assembled T cell receptor sequencing, revealed 14 populations based on molecular properties and delineated their interrelationships. We further reconstructed gene regulatory networks defining these populations and explored putative cross-talk with epithelial cell subtypes in IBD. In UC, our analysis showed preferential enrichment and clonal expansion of effector GZMK+ cells and terminally differentiated Tc-17 like IL26+ CD8+ T cells. Characterised by a co-inhibitory "exhaustion" RNA and protein expression profile, these post-effector cells also adopted key innate signatures, such as induction of NCR3, an NKR that participates in antitumor responses, and FURIN, which regulates tissue repair. In UC, this may suggest a mechanism to overcome the effects of chronic stimulation and T-cell exhaustion, in which incorporating the innate program triggered by IL26+ cells may retain the initial response functions in the absence of antigen-specific cues.
We further investigated the functional role of CD8+ IL26+ cells in inflammation by inducing DSS colitis in a transgenic mouse model and found that IL26 over-expression protected against epithelial damage in acute inflammation. This was supported by RNA-Seq profiles from colonic tissue, which revealed downregulation of core T-cell and B-cell signatures in transgenic animals.

Mapping mammalian cell landscapes by single cell mRNA-seq

Xiaoping Han1 & Guoji Guo1*

1Center for Stem Cell and Regenerative Medicine, Zhejiang University School of Medicine, Hangzhou 310058, China

Single-cell analysis is a valuable tool to dissect cellular heterogeneity in complex systems. We used single-cell RNA sequencing to determine the cell-type composition of all major mouse and human organs to construct cell landscapes for the mammalian systems. We revealed single-cell hierarchies for many tissues that have not been well characterised. We established a cell mapping pipeline that helps to define mammalian cell identity. Finally, we performed a single-cell comparative analysis of landscapes from both human and mouse to reveal the conserved genetic networks. In the mammalian systems, stem and progenitor cells exhibit strong transcriptomic stochasticity, while the differentiated cells are more distinct.

S9

New single-cell genomic technologies to dissect and enhance cell fate reprogramming

Kenji Kamimoto1,2,3, Xue Yang1,2,3, Christy M. Hoffmann1,2,3, Wenjun Kong1,2,3, and Samantha A. Morris1,2,3,*.

1Department of Developmental Biology; 2Department of Genetics; 3Center of Regenerative Medicine. Washington University School of Medicine in St. Louis. 660 S. Euclid Avenue, Campus Box 8103, St. Louis, MO 63110, USA. *Correspondence: [email protected]

Direct lineage reprogramming involves the remarkable conversion of cellular identity. Single-cell technologies aid in deconstructing the considerable heterogeneity in transcriptional states that typically arises during lineage conversion. However, lineage relationships are lost during cell processing, limiting accurate trajectory reconstruction. We previously developed 'CellTagging', a combinatorial cell indexing methodology, permitting the parallel capture of clonal history and cell identity, where sequential rounds of cell labeling enable the construction of multi-level lineage trees. CellTagging and longitudinal tracking of fibroblast to induced endoderm progenitor (iEP) reprogramming reveal two distinct trajectories: one leading to successfully reprogrammed cells, and one leading to a dead-end state. Here, I present two new methods to enable the molecular mechanisms underlying reprogramming outcome to be dissected. The first is an experimental method, 'Calling Cards', enabling transcription factor binding to be recorded, in individual cells, in the earliest stages of reprogramming. The second method is a new computational platform, called 'CellOracle', that uses single-cell transcriptome and chromatin accessibility data to reconstruct changes in GRN configurations across the reprogramming process. Together, these tools provide new mechanistic insights into how transcription factors can drive changes in cell identity, and help reveal new factors to enhance the efficiency and fidelity of reprogramming.

S11

Human cerebral organoid development through the lens of single-cell genomics

Barbara Treutlein

Department of Biosystems Science and Engineering, ETH Zürich, Basel, Switzerland

I will present how we are applying single-cell genomic tools to illuminate uniquely human development, malformation, and evolution. First, we use single-cell transcriptomics (scRNA-seq) to analyze the cell composition and reconstruct differentiation trajectories over the entire course of human cerebral organoid development from pluripotency, through neuroectoderm and neuroepithelial stages, followed by divergence into neuronal fates within the dorsal and ventral forebrain, midbrain and hindbrain regions. We find that brain region composition varies in organoids from different iPSC lines, yet regional gene expression patterns are largely reproducible across individuals. We then use a combination of scRNA-seq and accessible chromatin profiling (scATAC-seq) to explore gene regulatory changes that are specific to humans. We analyze chimpanzee and macaque cerebral organoids and find that human neuronal development proceeds at a delayed pace relative to the other two primates. Through pseudotemporal alignment of differentiation paths, we identify human-specific gene expression resolved to distinct cell states along progenitor to neuron lineages in the cortex. We find that chromatin accessibility is dynamic during cortex development, and identify instances of accessibility divergence between human and chimpanzee that correlate with human-specific gene expression and genetic change. Finally, we analyze human cerebral organoids from patients with brain malformations to identify molecular mechanisms underlying these neurodevelopmental disorders.

S13

Mutating autism risk genes in brain organoids grown with multiple differentiation protocols

Amanda Kedaigle, Bruna Paulsen, Silvia Velasco, Giorgia Quadrato, JoJo Nguyen, Sean Simmons, Xian Adiconis, Catherine Abbate, Joshua Levin, and Paola Arlotta

Stanley Center for Psychiatric Research at the Broad Institute of MIT and Harvard, Department of Stem Cell & Regenerative Biology at Harvard University, and the CIRM Center for Regenerative Medicine and Stem Cell Research at University of Southern California

Several genetic loci and specific mutations have now been implicated in autism spectrum disorder (ASD), but the precise mechanism of most of these genes during neurodevelopment remains unknown. Recently, we released an optimized protocol for the generation of reproducible 3-D cerebral brain organoids, which contain many of the cell types found in the developing dorsal forebrain. This represents an opportunity to study the effect of ASD risk genes in a reproducible, stem-cell derived 3-D model of human neural cell development. To assess the effect of protein-truncating mutations in autism risk genes in this model, we introduced mutations in the genes CHD8 and SUV420H1 in human stem cell lines, and grew brain organoids from mutated lines and isogenic controls. Using large single cell RNA-seq datasets, we evaluated the cell types grown and differential gene expression at several timepoints in these organoids over 1-6 months in vitro. Importantly, we compared organoids grown from the same cell lines using various differentiation protocols. As expected, we find differences in cell types and levels of organoid-to-organoid reproducibility between the protocols. Some gene expression changes between mutated and wildtype organoids within specific cell types, however, are more consistent between protocols. We show for the first time a comprehensive atlas of single cells grown in cerebral organoids from the same stem cell lines across different protocols and at several time points, and evaluate the effect of mutated ASD risk genes in each of them.

S15

The tempo and mode of host-pathogen interactions during bacterial infections

Gal Avital, Felicia Kuperwaser, Itai Yanai

NYU Grossman School of Medicine

During an intracellular bacterial infection, the host cell and the infecting pathogen interact through a progressive series of events that may result in many distinct outcomes. To understand the specific strategies our immune system employs to manage attack by diverse pathogens, we sought to identify the unique and the core host and pathogen interactions that occur during infection. We compared in molecular detail the pathways induced across infection by seven diverse bacterial species that constitute many of the main human pathogens: Staphylococcus aureus, Listeria monocytogenes, Enterococcus faecalis, Group B Streptococcus, Yersinia pseudotuberculosis, Shigella flexneri and Salmonella enterica. We infected primary human macrophages with each species and used scRNA-Seq to generate a comprehensive dataset of gene expression profiles during bacterial infection. Examining the expression profiles of the infected macrophages across the pathogens, we discovered different modules of infection representing different states through which the infection progresses. The early module captures intracellular activity such as lysosome and degranulation, followed by type I IFN signaling, which results in a cell death module, with a last mode of inflammatory response through response to IL-1. Comparing these modules across the pathogens, we found that their dynamics differ, with some modules active in all species and others present in some, but not all, pathogens. Our work defines the hallmarks of host-pathogen interactions by identifying recurring properties of infection that can provide insight into diagnostics and therapeutic timing.

S17

Deconvoluting mechanisms of intervertebral disc degeneration by single cell transcriptomics and lineage analyses

Kathryn S.E. Cheah

School of Biomedical Sciences, The University of Hong Kong.

Intervertebral disc disease (IDD) and associated low back pain have a major impact on the quality of life of millions of people globally. In IDD, disability arises because of the impaired ability of the intervertebral disc (IVD) to provide the shock-absorption function that supports the spine intervertebral joint and protects the spine against compressive load. The underlying disease etiology is not well understood. Current treatments for severe IDD are primarily surgical, render the spine immobile and are not lasting. Biological restoration of IVD function would be preferred, for which knowledge of disc biology is essential: the constituent cell populations, changes in cell states and identities during development and growth, and the ageing process that leads to functional decline. In mice, all cells in the nucleus pulposus (NP) in the IVD core have been shown to be derived from the embryonic notochord, and persist as vacuolated notochordal-like cells (NCLs) up to 2 years of age. But knowledge of human notochord development, lineages in the NP and the molecular characteristics of NCLs is limited. In young humans, histological studies show that substantial numbers of NCLs are present in the NP after birth, but the presence of these cells reduces rapidly with age. To provide insights into the changes in cell populations in the human healthy and ageing disc, we analysed the transcriptomes of single NP cells isolated from non-degenerated and degenerated discs. These data provide detailed information on the molecular signatures of constituent cells in the mature human NP of different ages and degenerative states. We identified two major populations in the young healthy disc: the signature of one was consistent with an NCL identity, and the other expressed many genes in common with chondrocytes (CLCs). In degenerated discs, we found NCLs were absent. Instead, CLCs displayed further signature heterogeneity and other cells were found, expressing markers characteristic of fibroblasts, myofibroblasts and macrophages, which may underpin the fibrotic changes in IDD. These data and lineage tracing analyses in an injury-induced mouse model of disc degeneration support the concept of NCL decline and possible cell fate transformation associated with upregulation of the integrated stress response and a partial EMT process in IDD. The rich transcriptome data and insights gained provide ideas for maintaining NP cell identity, important for the development of cell-based therapies for IDD.

S19

Studying human liver development using stem cells and single cell analyses

Ludovic Vallier, Brandon Wesley1, Rute Tomaz1, Daniele Muraro1, Alexander Ross1, Sarah Teichmann2, Chichau Miau2, and Muzz Haniffa3

1 Wellcome and MRC Cambridge Stem cell Institute 2 Wellcome Sanger Institute, 3 Newcastle University

The liver is unique in the broad spectrum of its functions, which include drug detoxification, glycogen storage, lipid metabolism and secretion of proteins such as albumin. End-stage liver diseases are systematically life-threatening and orthotopic liver transplantation is the only treatment available. This situation will continue to worsen in the foreseeable future due to the increase in cirrhosis associated with obesity. Understanding liver development could help to develop alternative therapies by providing the knowledge necessary to control regenerative processes. However, the study of human liver development has been impaired by technical and ethical challenges. Consequently, little is known about the mechanisms directing the cellular composition of the human liver and how this cellular diversity results in tightly regulated hepatic activity. Here, we start to address this knowledge gap by performing single-cell transcriptomics on human livers from 5 weeks post conception to adulthood. These analyses reveal the cellular diversity of the human foetal liver and the developmental trajectory of its different cell types, including hepatocytes, cholangiocytes, liver sinusoidal endothelial cells, Kupffer macrophages and hepatic stellate cells. To further demonstrate the interest of our data, we compared the differentiation process of human pluripotent stem cells (hPSCs) at the single-cell level with that of their primary counterparts. Of particular interest, hepatocyte differentiation in vitro appears to deviate from normal development at an early stage, suggesting that the resulting cells could represent an artificial state. These analyses also reveal the factors potentially necessary to redirect differentiation of hPSCs toward a natural path of development, allowing the production of hepatocytes with improved functionality. In conclusion, our study provides a transcriptomic map of the developing fetal human liver. The resulting atlas reveals new developmental mechanisms which can then inform the process of differentiation in vitro for the production of cell types of clinical interest. We expect this approach to be broadly applicable to other organs and thus to open new areas of investigation in a diversity of fields.

S21

Dissection of the global gene regulatory network governing cardiac neural crest development

Shashank Gandhi, Fan Gao (1), Ruth M. Williams (1), Irving T. C. Ling (2), Tatjana Sauka-Spengler (2), Marianne E. Bronner (1)

(1)Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, CA 91125, USA (2) University of Oxford, MRC Weatherall Institute of Molecular Medicine, Radcliffe Department of Medicine, Oxford, OX3 9DS, UK

The neural crest (NC) is a multipotent, vertebrate-specific stem cell population that gives rise to diverse cell types in the developing embryo, including craniofacial cartilage, enteric ganglia, and cardiac septa. The cardiac NC sub-population originates in the dorsal hindbrain and migrates into the developing heart, where it forms the aorticopulmonary septum, cardiac ganglia, and part of the interventricular septum, amongst other cell types. A feed-forward gene regulatory network (GRN) comprised of transcription factors and signaling molecules underlies the formation of the NC from induction at the neural plate border to final differentiation into its multitude of derivatives. In this study, we combine single cell RNA-seq (scRNA-seq) with ATAC-seq to analyze the GRN underlying early cardiac NC specification. First, we characterize a novel cardiac NC enhancer that regulates the activity of the NC specifier gene FoxD3 in the dorsal hindbrain, allowing us to transcriptionally and epigenetically profile a pure population of premigratory cardiac NC cells. scRNA-seq of these cells revealed a previously overlooked mesenchymal signature in a subset of the cells, suggesting early fate restriction. We find that these signatures are retained as the cells migrate from the dorsal neural tube. Next, by comparing the epigenetic landscape of premigratory and migratory cardiac NC with that of cranial NC, we isolate cardiac NC-specific chromatin signatures and test their activity in vivo. Within chromatin regions that were uniquely accessible in cardiac NC cells, we utilize transcription factor binding motifs obtained from the JASPAR database to investigate the co-binding dynamics of transcription factors enriched in the cardiac NC scRNA-seq dataset. Finally, using CRISPR-Cas9-mediated genome engineering, we validate a subset of these linkages to establish the embryonic cardiac NC GRN.
Taken together, our work enables analysis of the cardiac neural crest GRN at a global level, thereby expanding our understanding of the regulatory network that governs the unique ability of this neural crest subpopulation to contribute to the heart.

S23

Cell cycle progression is not responsible for the transcriptional rewiring and loss of self-renewal capacity in ex vivo cultured human Haematopoietic Stem Cells (HSCs)

Serena Belluschi*, Carys Johnson* 1,2 , Kendig Sham* 1, Michael Drakopoulos 1, Winnie Lau 1, Evangelia Diamanti 1, Xiaonan Wang 1, Bertie Göttgens 1 and Elisa Laurenti 1 (*Equal contribution)

1. Wellcome-MRC Cambridge Stem Cell Institute, University of Cambridge, UK. 2. Process Research, Cell & Gene Therapy, GlaxoSmithKline, Stevenage, UK.

Molecular and functional heterogeneity in the human Haematopoietic Stem Cell (HSC) pool exists to meet the dynamic demand of blood production. To sustain lifelong haematopoiesis, HSCs divide infrequently and predominantly reside in a reversible state of "quiescence/G0". Upon sensing mitogenic signals, HSCs exit quiescence (from G0 to the end of early G1) and progress through the cell cycle (from late G1 to M) in a process termed "activation". While past studies have focused on the molecular networks that maintain HSC quiescence, little is known about quiescence exit and HSC activation. Upon ex vivo activation, HSCs partially lose self-renewal capacity by poorly understood mechanisms. However, it is suggested that cell cycle progression is largely responsible for such loss in repopulation potential. To better study quiescence exit and HSC activation at single cell resolution, we used an in vitro model system to block cell cycle progression (past early G1) by inhibition of CDK6, a master regulator of quiescence exit.

Time course scRNA-Sequencing of cord blood HSCs at 0, 6, 24 and 72 hours reveals that the majority of transcriptional changes occur within the first 6 hours of culture. Sudden downregulation of quiescent transcriptional networks and immune regulators is accompanied by fast upregulation of metabolic pathways. Following the first 6 hours in vitro, transcriptional changes are gradually accumulated dependent on the duration of the mitogenic signal. Importantly, transcriptional changes observed in HSCs arrested in early G1 by treatment with a CDK6 inhibitor (CDK6i) were strikingly similar to those observed in control cells progressing through the cell cycle. This indicates that cell cycle progression is not the main driver of the transcriptional rewiring observed in cultured HSCs.

To verify if the loss of self-renewal capacity during HSC culture is dependent on cell cycle progression, control or CDK6i HSCs were transplanted into immunocompromised mouse models after 24 or 72 hours in culture. Comparable long-term repopulation kinetics and self-renewal capacity were observed between CDK6i and control HSCs at both culture durations in primary and secondary animals.

Overall, our data identify quiescence exit, and not cell cycle progression, as the key phase of transcriptional remodelling during ex vivo culture of HSCs. We also demonstrate that, contrary to accepted knowledge, loss in self-renewal capacity during HSC ex vivo culture is independent of cell cycle progression. These results have significant relevance for the improvement of ex vivo HSC expansion approaches and gene therapy.

S25

Dissecting the regulatory switches driving cell fate trajectories during embryonic development one cell at a time

Eileen E. Furlong

Genome Biology Unit/Dept., European Molecular Biology Laboratory (EMBL) Heidelberg, Germany

Co-authors – Stefano Secchia, Gabriel Cavalheiro, Tim Pollex

Complex patterns of temporal and spatial gene expression are regulated by enhancers; cis-regulatory elements that recruit multiple transcription factors, leading to a very defined output of expression. Enhancers can be located in close proximity to, or at great distances from, their target gene. To better understand the relationship between enhancer usage and transcriptional regulation, we are integrating single cell genomic approaches with single cell imaging and genetic deletions to determine inherent properties of enhancers and regulatory networks during embryogenesis. Using single cell ATAC-seq we recently showed that this information can predict tissue-specific enhancers, identify cell types and follow their trajectories during embryogenesis (1). We are now extending this to study the specification of the mesoderm (one of the three germ layers) into different tissue primordia. This is being combined with natural sequence variation and transcription factor mutants to dissect the functional impact of perturbing the system in both cis and trans, as well as with single cell imaging of nuclear topology and nascent transcription. This information is being used to link enhancers to their target genes and build a regulatory network of one germ-layer's development as it becomes specified into different lineages, and through the subsequent tissue's differentiation.

1. Cusanovich DA, Reddington JP, Garfield DA, et al (2018). The cis-regulatory dynamics of embryonic development at single cell resolution. , Mar 22;555(7697):538-542

S27

Decoding the developing human immune system

Muzz Haniffa

Newcastle University, Leech Building, Medical School, Newcastle Upon Tyne NE2 4HH, UK

Muzlifah has used functional genomics, comparative biology and single cell genomics to study human mononuclear phagocytes. In this seminar, she will discuss the power and utility of single cell technologies to decode the developing human immune system.

S29

Using single-cell RNA sequencing to map lymphoid cell state and differentiation across human fetal haematopoietic organs

Simone Webb*1, Laura Jardine*1, Issac Goh1, Daniel Maunder1, David Dixon1, Hamish King2, Justin Engelbert1, Emily Stephenson1, Rachel Botting1, Peter Vegh1, Dorin Mirel-Popescu1, David McDonald1, Sarah Teichmann*3, Irene Roberts*4, *1

1. Institute of Cellular Medicine, Newcastle University, Newcastle upon Tyne, UK 2. Queen Mary University of London, UK 3. Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridge, UK 4. Department of Paediatrics, University of Oxford, Oxford, UK

The human blood and immune systems develop during early fetal development through a process called 'haematopoiesis', which is critical for survival and health. The earliest blood and immune cells originate from the yolk-sac between 2-3 post-conception weeks (PCW), followed by the aorta-gonad-mesonephros (AGM) at 3-4 PCW, the liver at 11-20 PCW and finally the bone marrow, which becomes the dominant site of haematopoiesis after 20 PCW. To date, our understanding of human haematopoietic development derives primarily from murine and in vitro model systems due to difficulties in human fetal tissue access.

Through analysis of single-cell RNA sequencing data from human fetal bone marrow (k = 8, n = 86,481), liver (k = 14, n = 113,063) and yolk sac (k = 3, n = 10,071), we identify over 30,000 lymphoid cells, annotate cell types and track fluctuations in gene expression across gestational age and haematopoietic organ. We model developmental trajectories in lymphoid lineages using computational approaches, including pseudotime analysis, force-directed graphs, approximated graph abstraction (AGA) and diffusion maps. In the absence of antigen, the function of lymphoid cells during development remains unclear, so we additionally interrogated VDJ (variable, diversity and joining) single-cell RNA sequencing data from fetal bone marrow (k = 2).

Clustering and differential gene expression analysis of the fetal bone marrow dataset revealed the presence of 'Early lymphoid progenitors' (ELP), 'Pre-pro B cells', 'Pro B cells', 'Pre-B cells' and 'Mature naïve B cells', as well as CD4+, CD8+ and regulatory T cells. A peak in expression of proliferation marker MKI67 was noted at the pre-B stage. Equivalent cell types in fetal yolk-sac and liver were inferred through AGA. Pseudotime inference uncovered a complex and branched B cell lineage in the fetal bone marrow, with a clear backbone differentiation trajectory from ELP to Mature naïve B cells. Analysis of VDJ data showed that productivity of B cell receptor chains increases later in B cell differentiation. Light chains were shown to dominate in early B cell differentiation, followed by heavy chains, then a mixture of heavy and light during late development. Few clonal CDR3 sequences were detected in the bone marrow, indicating that B cell expansion is limited in second trimester fetal bone marrow. This study provides a unique insight into the state and role of over 30,000 lymphoid cells across three human fetal haematopoietic organs and informs broader understanding of lymphoid compartment development.
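
As a rough illustration of how clonal CDR3 sequences might be tallied from single-cell VDJ data, the sketch below counts identical CDR3 strings across cells and reports the fraction of cells that share a sequence with at least one other cell. The function names, the `min_cells` threshold and the toy sequences are assumptions made for illustration; this is not the pipeline used in the study.

```python
from collections import Counter

def clonal_cdr3(cdr3_by_cell, min_cells=2):
    """Return CDR3 sequences found in at least `min_cells` cells, with their cell counts."""
    counts = Counter(cdr3_by_cell.values())
    return {seq: n for seq, n in counts.items() if n >= min_cells}

def clonal_fraction(cdr3_by_cell, min_cells=2):
    """Fraction of cells whose CDR3 belongs to a clone of size >= min_cells."""
    if not cdr3_by_cell:
        return 0.0
    clones = clonal_cdr3(cdr3_by_cell, min_cells)
    return sum(clones.values()) / len(cdr3_by_cell)

# Hypothetical toy input: cell barcode -> CDR3 amino-acid sequence.
cells = {"cell1": "CARDYW", "cell2": "CARGGW", "cell3": "CARDYW", "cell4": "CAKSSW"}
shared = clonal_cdr3(cells)
fraction = clonal_fraction(cells)
```

A low clonal fraction in such a tally would be in line with limited expansion, although real analyses typically also match V and J gene usage and tolerate sequencing errors.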

S31

Single-cell Transcriptome of Gastrulating Human Embryo

Elmir Mahammadov*, Richard Tyser**, Antonio Scialdone*, Shankar Srinivas**

*Helmholtz Zentrum Muenchen, Institute of Epigenetics and Stem Cells **Oxford University, Department of Physiology, Anatomy and Genetics

Gastrulation is a crucial stage of development in all animals, during which pluripotent cells give rise to the cell types that determine body pattern and organ formation. Ample data on the gastrula stage are available from model organisms such as mouse, enabling the study of mechanisms and the construction of cell atlases. However, full sequencing of the human embryo at this stage has not been done. Our study clearly shows endoderm, ectoderm and mesoderm layers, as well as their differentiation trajectory. We were also able to capture a rare and small primordial germ cell population. Surprisingly, our data show an early population of erythroid progenitor cells and primitive macrophage cells, which at this stage of development have not been observed in humans. Finally, we compared each of the cell types seen in our data to the single-cell data of mouse. This allowed us to draw parallels between human and mouse gastrulation, and assign an equivalent stage for each cell type in human. All in all, these data are a unique and valuable resource for scientists working on early embryonic development in humans.

S33

Integrated analysis of single-cell data across modalities and technologies

Rahul Satija

New York University, USA

Abstract not available at time of printing.

S35

Modeling cellular response across perturbations and spatial context

F.J. Theis

Institute of Computational Biology, Helmholtz Munich - http://comp.bio

Modeling cellular state changes e.g. during differentiation or in response to perturbations is a central goal of computational biology. Single-cell technologies now give us easy and large-scale access to state observations on the transcriptomic and more recently also epigenomic level. In particular, they allow resolving potential heterogeneities due to asynchronicity of differentiating or responding cells, and profiles across multiple conditions such as time points, replicates and more recently with spatial resolution are being generated. Latent space modeling and manifold learning have become a popular tool to learn overall variation in single cell gene expression. Here, I will first summarize recent approaches based on neural networks and describe an extension from the lab towards multiple, quantitative perturbation levels. I will then discuss how we can extend these models to include spatial proximity as observed from spatial genomics assays. We summarize spatial gene expression in a node-labeled proximity graph. We can then leverage graph convolutional neural networks to improve on modeling cellular state. We apply this to learn complex cell-cell communication in a protein expression data set across breast cancer patients.
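
To make the idea of a node-labeled proximity graph concrete, the following minimal sketch builds a graph that links cells lying within a fixed radius of each other and applies a single graph-convolution step (neighbourhood averaging followed by a linear map and ReLU). The radius rule, the mean aggregation, the function names and the toy numbers are all assumptions chosen for illustration; the models discussed in the talk may be constructed quite differently.

```python
import math

def proximity_graph(coords, radius):
    """Adjacency lists linking cells within `radius` of each other (self-loops included)."""
    n = len(coords)
    neighbours = [[i] for i in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if math.dist(coords[i], coords[j]) <= radius:
                neighbours[i].append(j)
                neighbours[j].append(i)
    return neighbours

def graph_conv(features, neighbours, weights):
    """One layer: average node labels over each neighbourhood, apply a linear map, then ReLU."""
    dim = len(features[0])
    out = []
    for nbrs in neighbours:
        mean = [sum(features[j][d] for j in nbrs) / len(nbrs) for d in range(dim)]
        hidden = [sum(w * x for w, x in zip(row, mean)) for row in weights]
        out.append([max(0.0, h) for h in hidden])
    return out

# Hypothetical toy data: three cells, two expression features each.
coords = [(0.0, 0.0), (1.0, 0.0), (5.0, 5.0)]
features = [[1.0, 0.0], [3.0, 2.0], [4.0, 4.0]]
weights = [[1.0, 1.0], [1.0, -1.0]]  # illustrative fixed weights, not learned

graph = proximity_graph(coords, radius=1.5)
embedded = graph_conv(features, graph, weights)
```

In this toy case the two adjacent cells receive the same embedding because they share a neighbourhood, while the isolated cell keeps only its own signal.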

S37

Revealing Dynamics of Gene Expression Variability in Cell State Space

Dominic Grün

Max-Planck-Institute of Immunobiology and Epigenetics, D-79108 Freiburg, Germany

The availability of high-throughput single-cell RNA-sequencing methods has facilitated the identification of cell types in organs and tissues at high resolution, allowing the creation of tissue cell type atlases. In these efforts typically tens of thousands of cells are sequenced, frequently at low sequencing depth, and cell type identification relies on computational methods sensitive to highly expressed genes. To understand cell differentiation and cell state transitions in general it is crucial to quantify potentially low stochastic expression of lineage determining factors, requiring computational methods sensitive to subtle changes in lowly expressed genes. We here introduce VarID, a computational method that identifies locally homogeneous neighborhoods in cell state space permitting the quantification of local gene expression variability. By controlling for the variance-mean dependence of transcript counts VarID delineates neighborhoods with differential variability and reveals pseudo-temporal dynamics of gene expression variability during differentiation. VarID recovers the stochastic activity of lineage-associated transcription factor networks in murine hematopoietic multipotent progenitors, and reveals stochastic activity of transcription factors associated with secretory lineages in mouse intestinal epithelial stem cells. In conclusion, our approach enables the investigation of stochastic gene activity, a previously understudied aspect, with the help of single-cell RNA-seq data and can provide novel insights into the regulation of cell fate decisions.
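
The notion of local variability corrected for the variance-mean dependence can be sketched with a much simpler stand-in than VarID itself. The example below forms a k-nearest-neighbour neighbourhood around each cell and reports the variance-to-mean ratio (Fano factor) of one gene's counts, which equals 1 in expectation for Poisson noise. The Poisson baseline, the function names and the toy data are assumptions for illustration only and do not reproduce VarID's actual noise model or neighbourhood pruning.

```python
import math

def knn(points, i, k):
    """Indices of the k cells closest to cell i in state space (cell i is at distance 0)."""
    order = sorted(range(len(points)), key=lambda j: math.dist(points[i], points[j]))
    return order[:k]

def local_fano(points, counts, k):
    """Variance-to-mean ratio of one gene's counts within each cell's neighbourhood (k >= 2)."""
    ratios = []
    for i in range(len(points)):
        vals = [counts[j] for j in knn(points, i, k)]
        mean = sum(vals) / len(vals)
        var = sum((v - mean) ** 2 for v in vals) / (len(vals) - 1)
        ratios.append(var / mean if mean > 0 else float("nan"))
    return ratios

# Hypothetical toy data: six cells on a 1-D state axis forming two groups.
points = [(0.0,), (1.0,), (2.0,), (10.0,), (11.0,), (12.0,)]
counts = [2, 2, 2, 0, 4, 8]  # same gene: uniform in group one, noisy in group two
ratios = local_fano(points, counts, k=3)
```

Neighbourhoods with a ratio well above the baseline would be candidates for elevated expression variability; a negative binomial baseline would be a more realistic choice for real transcript counts.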

S39

Joint modeling of transcriptome and surface proteome enhances single-cell data analysis

Adam Gayoso and Zoë Steier, Romain Lopez, Jeffrey Regier, Aaron Streets, Nir Yosef

Center for Computational Biology, University of California, Berkeley (AG, AS, NY); Department of Bioengineering, University of California, Berkeley (ZS, AS); Department of Electrical Engineering and Computer Sciences, University of California, Berkeley (RL, NY); Department of Statistics, University of Michigan, Ann Arbor (JR); Chan Zuckerberg Biohub, , California (AS, NY); Ragon Institute of MGH, MIT, and Harvard (NY)

CITE-seq, which enables simultaneous measurement of RNA and surface protein abundance in single cells, is a promising approach to connect transcriptome characteristics with cell phenotype and function. However, recent approaches to CITE-seq data analysis have often been limited to treating protein measurements as metadata labels for cell type classification in an RNA-centric pipeline. This strategy does not take full advantage of the distinctive information contained in the protein measurements and their relationships to other proteins and RNA, all of which contribute to an overall view of cell state. To learn a biologically meaningful low-dimensional representation of cell state, generative models have been successfully applied to single-cell RNA sequencing data, but it remains difficult to handle the added complexities of protein count data, which is subject to unique sources of technical bias. We aim to combine RNA and protein measurements into a single, more comprehensive representation of cell state while addressing the technical challenges of each modality.

Here we present Total Variational Inference (totalVI), a probabilistic deep-learning framework for end-to-end analysis of CITE-seq data. totalVI jointly models RNA and protein abundance data in a shared latent representation while addressing the previously unresolved complication of protein background signal. This joint model allows visualization and clustering of cell states, experimental batch correction, data denoising, and differential expression testing of RNA and proteins. totalVI is also capable of integrating data from CITE-seq experiments with different measured sets of proteins. We tested totalVI's performance on these analysis tasks using published datasets as well as our novel dataset of 40,000 cells from murine spleen and lymph nodes with over 100 measured proteins. We demonstrate that totalVI more accurately fits RNA and protein data than baseline factor analysis and is better able to distinguish protein foreground and background signal than cutoff-based methods like Gaussian mixture models. We use the uncertainty estimates from denoising to explore RNA and protein correlations. Finally, we apply totalVI to harness the power of paired single-cell transcriptome and proteome measurements in our study of the murine spleen and lymph nodes.
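
For context, the kind of cutoff-based baseline mentioned above can be sketched as a two-component Gaussian mixture fitted by expectation-maximisation to log-transformed counts of a single protein, with the higher-mean component read as foreground. This is an illustrative sketch of that baseline only, not of totalVI; the function names, the initialisation, the iteration count and the toy counts are assumptions.

```python
import math

def normal_pdf(x, mu, sigma):
    """Density of a normal distribution with mean mu and standard deviation sigma."""
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))

def posterior_foreground(x, weights, means, sigmas):
    """Posterior probability that x was drawn from the second (foreground) component."""
    p = [w * normal_pdf(x, m, s) for w, m, s in zip(weights, means, sigmas)]
    total = sum(p)
    return p[1] / total if total > 0 else 0.5

def fit_two_gaussians(values, n_iter=50):
    """EM for a 1-D mixture of two Gaussians; returns (weights, means, sigmas)."""
    lo, hi = min(values), max(values)
    weights, means = [0.5, 0.5], [lo, hi]
    sigmas = [(hi - lo) / 4 or 1.0] * 2
    for _ in range(n_iter):
        # E-step: responsibility of the foreground component for each value.
        resp = [posterior_foreground(x, weights, means, sigmas) for x in values]
        # M-step: update each component from its weighted values.
        for k in (0, 1):
            r = [rk if k == 1 else 1.0 - rk for rk in resp]
            nk = max(sum(r), 1e-12)
            weights[k] = nk / len(values)
            means[k] = sum(ri * x for ri, x in zip(r, values)) / nk
            var = sum(ri * (x - means[k]) ** 2 for ri, x in zip(r, values)) / nk
            sigmas[k] = max(math.sqrt(var), 1e-3)  # floor avoids a collapsed component
    return weights, means, sigmas

# Hypothetical toy counts for one protein: low background plus a stained population.
raw_counts = [1, 2, 1, 3, 2, 1, 2, 150, 200, 180, 220, 160]
log_counts = [math.log1p(c) for c in raw_counts]
weights, means, sigmas = fit_two_gaussians(log_counts)
```

A cell would then be called positive for the protein when `posterior_foreground` exceeds a chosen cutoff such as 0.5. Such per-protein thresholds ignore the RNA measurements and other proteins, which is the limitation a joint model is meant to address.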

S41

Long-term single-cell quantification: New tools for old questions

Timm Schroeder

Department of Biosystems Science and Engineering, ETH Zurich, Basel, Switzerland

Despite intensive research, surprisingly many long-standing questions in cell research remain disputed. One major reason is the fact that we usually analyze only populations of cells - rather than individual cells - and at very few time points of an experiment - rather than continuously. We therefore develop imaging systems and software to image, segment and track cells long-term, and to quantify e.g. divisional history, position, interaction, and protein expression or activity of all individual cells over many generations. Dedicated software, machine learning and computational modeling enable data acquisition, curation and analysis. Custom-made microfluidics devices improve cell observation, dynamic manipulation and molecular analysis. The resulting continuous single-cell data is used for analyzing the dynamics, interplay and functions of signaling pathway and transcription factor networks in controlling hematopoietic, pluripotent, skeletal and neural stem cell fate decisions.

S43

Regulating the rheostat in killer T cells

Claire Ma1, Gordon L Frazer1, Yukako Asano1, Jane C Stinchcombe1, Arianne C Richard1, 2, John Marioni2 and Gillian M Griffiths1

1University of Cambridge, Cambridge Institute for Medical Research, Cambridge, United Kingdom; 2 Cancer Research Institute, Cambridge, United Kingdom.

Recent advances in immunotherapy have highlighted the powerful anti-cancer potential of cytotoxic T lymphocytes (CTLs). Understanding the mechanisms that determine and regulate killing by these cells becomes more important as their role in immunotherapies grows. We have combined a number of different single-cell approaches to understand the cell biology of CTL killing, from single-cell gene analysis to high-resolution imaging, allowing us to determine how the strength of T cell receptor (TCR) signaling controls the strength of CTL killing.

We have varied the strength of TCR signal using OT-I transgenic CTL and altered peptide ligands to understand the mechanisms that fine-tune delivery of the lethal hit from CTL. Combining single-cell transcriptomics, CyTOF and high-resolution imaging, we have identified multiple mechanisms of regulation at each level that effectively expand the range of responses that can be generated as signal strength varies. Remarkably, common themes emerge. These results will be described, revealing how CTL can provide exquisite fine-tuning of killing.

S45

Single-cell resolution of the human germinal centre reveals novel B cell states and antibody maturation dynamics.

Hamish W King, Nara Orban (3), Sarah A Teichmann (2,4,5), Louisa K James (1)

1 Centre for Immunobiology, Blizard Institute, Queen Mary University of London, London, UK; 2 Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, UK; 3 Barts Health Ear, Nose and Throat Service, The Royal London Hospital, London, UK; 4 Theory of Condensed Matter, Department of Physics, University of Cambridge, Cambridge, UK; 5 European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, UK

B cells are an essential part of our adaptive immune system that undergo a maturation process in germinal centre structures within secondary lymphoid tissues to produce antigen-specific antibodies. However, this process is highly dynamic and heterogeneous, making it challenging to characterise the gene expression networks involved in regulating key B cell fate decisions outside of studies. To address this challenge and reveal novel insights into human B cell maturation in vivo, we have applied high-throughput droplet-based single-cell RNA-seq (scRNA-seq) to human tonsil-derived lymphocytes (n = 7; 32,502 cells). We identify all major stages of B cell activation and maturation, follicular helper and regulatory T cell subsets, and several myeloid cell populations. Furthermore, we describe several putatively novel B cell states within the germinal centre with unique gene expression patterns. We also characterise a subset of non-germinal centre B cells undergoing class-switch recombination prior to entry into the germinal centre and for the first time define their unique gene expression patterns. Subsequently, we integrate our single-cell transcriptomic datasets with single-cell and bulk VDJ repertoire analyses to provide evidence for isotype-specific dynamics of B cell maturation in the germinal centre, with expression of specific isotypes as the B cell receptor being linked with differential activity of key signalling pathways and expression of fate-specific transcription factors. Finally, we use single-cell chromatin accessibility (scATAC)-seq to define transcription factor networks that shape gene regulatory networks involved in B cell fate decisions. Together this work provides improved resolution and new insights into key dynamic processes involved in human B cell maturation in the germinal centre.

S47

Reconstructing foetal blood development at the single-cell level through somatic mutations

Anna Maria Ranzoni, Michael Spencer Chapman, Peter Campbell, Ana Cvejic

1 Wellcome Trust - MRC Stem Cell Institute, University of Cambridge, UK; 2 Department of Haematology, University of Cambridge, UK; 3 Wellcome Trust Sanger Institute, Hinxton, UK. AMR & MSC: joint first authors; PC & AC: joint senior authors

Blood production during foetal development involves separate waves of migration of rare stem/progenitor cells among different organs. In humans, a first, primitive wave begins in the yolk sac, giving rise to short-lived progenitors. Definitive haematopoiesis starts with the appearance of blood stem cells in the dorsal aorta. These stem cells colonise the foetal liver, where they proliferate, before seeding the bone marrow, where adult haematopoiesis is established. The exact dynamics of these haematopoietic waves in humans and the size of the initial pool of blood stem cells remain largely unexplored. Somatic mutations, acquired in the genome of cells and passed to their progeny through cell division, can be used as a tool to unravel clonal relationships in a population and to reconstruct the phylogeny of a tissue. In this study, we sequenced the genomes of more than 500 single-cell derived blood colonies isolated from the main haematopoietic sites of two foetuses and used population genetics tools to reconstruct the phylogenetic tree of human foetal blood. On average, we detected 24 somatic mutations per cell at 8 weeks post conception and 38 at 18 weeks. Interestingly, the reconstruction of the phylogenetic trees revealed an asymmetric branching at the level of the first cell division. Next, based on the mutations identified in the blood colonies, we designed a custom DNA bait set and performed targeted sequencing on eight laser-capture micro-dissected non-haematopoietic tissues from matched foetuses. In particular, we selected tissues originating from different embryonic germ layers (e.g. gut-endoderm, muscle-mesoderm, epidermis-ectoderm and placenta-trophoblast). The results from this study allowed us to pinpoint the chronological time of separation of different embryonic layers during development and gain insights into the origin of primitive blood in humans.
Current work is focused on the development of a statistical model to estimate the rate of acquisition of somatic mutations and to determine the approximate size of the initial stem cell pool sustaining blood production during early foetal development.
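The idea that shared somatic mutations reveal clonal relationships can be illustrated with a small sketch. It assumes each colony is represented as a set of mutation identifiers and that a mutation arises only once, so two colonies sharing more mutations have a more recent common ancestor. The colony names, mutation labels and helper functions are hypothetical, and this is not the population genetics pipeline used in the study.

```python
from itertools import combinations


def shared_mutation_counts(colonies):
    """Number of mutations shared by every pair of colonies."""
    return {
        (a, b): len(colonies[a] & colonies[b])
        for a, b in combinations(sorted(colonies), 2)
    }


def clade_fraction(colonies, mutation):
    """Fraction of colonies that carry a given mutation."""
    carriers = sum(1 for muts in colonies.values() if mutation in muts)
    return carriers / len(colonies)


# Toy genotypes: "m1" marks one daughter lineage of an early division.
colonies = {
    "c1": {"m1", "m2"},
    "c2": {"m1", "m3"},
    "c3": {"m4"},
    "c4": {"m1"},
}
shared = shared_mutation_counts(colonies)
fraction_m1 = clade_fraction(colonies, "m1")
```

In this toy example three of four colonies descend from the lineage carrying "m1", the kind of unequal contribution that an asymmetric first branching would produce.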

S49

Pairing droplet microfluidics with FACS for ultra-high-throughput single-cell analysis

Brower, K.K., Khariton, M., Calhoun, S.G.K., Suzuki, P., Still, C., Kim, G., Carswell-Crumpton, C., Nichols, L., and Fordyce, P.M.

Droplet microfluidics has made large impacts in diverse areas such as enzyme evolution, chemical product screening, polymer engineering, and single-cell analysis. However, while droplet reactions have become increasingly sophisticated, phenotyping droplets by a fluorescent signal and sorting them to isolate variants-of-interest remains a field-wide bottleneck. Here, we present a microfluidic platform, sdDE-FACS (single droplet Double Emulsion-FACS), that enables high-throughput phenotyping, selection, and sorting of droplets using standard flow cytometers. Using a 130 μm sort nozzle, we demonstrate robust post-sort recovery of intact droplets at high throughput (12-14 kHz sorting rates) with little to no droplet breakage. In addition, we demonstrate that individual double emulsion droplets can be isolated with >70% efficiency, and that all encapsulated nucleic acids can be recovered. Finally, we provide the first evidence that mammalian cells can be efficiently encapsulated within double emulsion droplets, analyzed via FACS to yield a phenotypic signal, and isolated for use in downstream single-cell sequencing assays. This work resolves several hurdles in the field of high-throughput droplet analysis and paves the way for a variety of new droplet assays, including "multi-omic" single-cell analyses, extending the full power of flow cytometry to current single-cell droplet techniques.

S51

Live-seq: Measuring transcriptomes of live cells

Wanze Chen, Orane Guillaume-Gentil3,*, Riccardo Dainese1,2, Pernille Yde Rainer1,2, Magda Zachara1,2, Christoph G. Gäbelein3, J.A. Vorholt3,$, B. Deplancke1,2,$

1Laboratory of Systems Biology and Genetics, Institute of Bio-engineering & Global Health Institute, School of Life Sciences, EPFL, CH-1015 Lausanne, Switzerland 2Swiss Institute of Bioinformatics, CH-1015 Lausanne, Switzerland 3Department of Biology, Institute of Microbiology, ETH Zurich, 8093 Zurich, Switzerland

Single-cell transcriptomics (scRNA-seq) has greatly advanced our ability to characterize cellular heterogeneity in physiological and pathological conditions. However, since cells are lysed and can thus only be profiled once, it is currently impossible to perform downstream functional assays on the same cells and to obtain a direct scRNA-seq-based read-out of transcriptional dynamics. Here, we present Live-seq, a novel single-cell transcriptomic biopsy approach based on fluidic force microscopy that differs from other scRNA-seq methods because the cells can be kept alive after extraction of the mRNA. We demonstrate that Live-seq allows the identification of both cell types and states, rendering it possible to complement single-cell transcriptomic data with additional functional analyses. These include sequential profiling of the same cell, which, as we demonstrate, captures both the past and the current state of the cell. While still limited in overall throughput, we believe that Live-seq breaks new ground by transforming scRNA-seq from an end-point to a real-time analysis approach.

S53

Probing multicellular mechanisms of immune modulation with massively parallel single-cell mRNA-seq

Sisi Chen [1,2], Tahmineh Khazaei [1], Tiffany Tsou [1,2], Paul Rivaud [1,2], Benjamin L. Hoscheit [1], Jong H. Park [1,2], Christopher S. McGinnis [3], Eric Chow [4,5], Zev J. Gartner [3,6,7,8], Matt Thomson [1,2]

1. Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, CA 2. Beckman Center for Single-Cell Profiling and Engineering, California Institute of Technology, Pasadena, CA 3. Department of Pharmaceutical Chemistry, University of California San Francisco, San Francisco, CA, USA 4. Department of Biochemistry and , University of California San Francisco, San Francisco, CA, USA 5. Center for Advanced Technology, University of California San Francisco, San Francisco, CA, USA 6. Helen Diller Family Comprehensive Cancer Center, San Francisco, CA, USA 7. Chan Zuckerberg BioHub, University of California San Francisco, San Francisco, CA, USA 8. Center for Cellular Construction, University of California San Francisco, San Francisco, CA, USA

Human diseases are fundamentally multicellular in nature with many different cell-types contributing to disease progression and treatment response. However, how drugs impact each cell-type in a complex population, and modify their interactions, remains poorly understood. High-throughput drug screens have been conventionally performed on single cell-types, ignoring population-level effects. Here, we applied highly multiplexed single cell mRNA-seq to study the impact of 300 immunomodulatory compounds on primary human peripheral blood mononuclear cells (PBMCs), a heterogeneous mixture of myeloid and lymphoid immune cell-types, under resting and activated (+CD3/CD28 for T cell driven inflammation) signaling conditions. We profiled over one million single cells using MULTI-seq to hash samples and used PopAlign, a probabilistic modeling platform, to discover cell-type specific responses for each compound in the library. Our results highlight the critical role that cell-cell communication plays in determining drug response. We find that immunomodulatory drugs can have fundamentally different effects in resting and activated conditions. By classifying cell-type specific drug response signatures across conditions, we identified two mechanistic types of immunomodulators: local and global inhibitors of immune activation. Local inhibitors suppress cell-type specific activation, while global inhibitors broadly suppress immune activation by additionally inhibiting cytokine production and blocking information flow between immune cell-types. Our analysis reveals novel local and global activity for previously poorly characterized molecules. Our findings include a previously undocumented myeloid-suppressing function of a group of compounds including NSAIDs and an artificial sweetener. Overall, our results highlight the importance of understanding cell-type specific therapeutic responses in the context of complex and physiologically relevant tissues.
Our platform can be broadly applied towards understanding heterogeneous cell populations in a wide range of therapeutic and disease conditions.

S55

Generalizing RNA velocity to transient cell states through dynamical modeling

Volker Bergen, Marius Lange, Stefan Peidli, F. Alexander Wolf, Fabian J. Theis

1 Institute of Computational Biology, Helmholtz Center Munich, Germany. 2 Department of Mathematics, TU Munich, Germany.

The introduction of RNA velocity in single cells has opened up new ways of studying cellular differentiation. The originally proposed framework obtains velocities as the deviation of the observed ratio of spliced and unspliced mRNA from an inferred steady state. Errors in velocity estimates arise if the central assumptions of a common splicing rate and the observation of the full splicing dynamics with steady-state mRNA levels are violated. With scVelo (https://scvelo.org), we address these restrictions by solving the full transcriptional dynamics of splicing kinetics using a likelihood-based dynamical model. This generalizes RNA velocity to a wide variety of systems comprising transient cell states, which are common in development and in response to perturbations. We infer gene-specific rates of transcription, splicing and degradation, and recover the latent time of the underlying cellular processes. This latent time represents the cell's internal clock and is based only on its transcriptional dynamics. Moreover, scVelo allows us to identify regimes of regulatory changes such as stages of cell fate commitment and, therein, systematically detects putative driver genes. We demonstrate that scVelo enables disentangling heterogeneous subpopulation kinetics with unprecedented resolution in hippocampal dentate gyrus neurogenesis and pancreatic endocrinogenesis. We anticipate that scVelo will greatly facilitate the study of lineage decisions, gene regulation, and pathway activity identification.
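The steady-state framework mentioned at the start of this abstract can be sketched for a single gene. With unspliced abundance u and spliced abundance s, and the splicing rate rescaled to 1, the spliced mRNA changes at rate u - gamma * s, so cells at steady state lie on the line u = gamma * s and the residual from that line is the velocity. The sketch below fits gamma by least squares through the origin over all cells, which is a simplifying assumption; it does not use the scVelo API and does not implement the likelihood-based dynamical model. Function names and values are illustrative.

```python
def steady_state_gamma(unspliced, spliced):
    """Least-squares slope through the origin of unspliced versus spliced counts."""
    den = sum(s * s for s in spliced)
    if den == 0:
        raise ValueError("spliced counts are all zero; gamma is undefined")
    return sum(u * s for u, s in zip(unspliced, spliced)) / den


def velocities(unspliced, spliced, gamma):
    """Per-cell velocity: deviation of unspliced counts from the steady-state line."""
    return [u - gamma * s for u, s in zip(unspliced, spliced)]


# Toy cells lying exactly on a steady-state line with gamma = 0.5.
spliced = [1.0, 2.0, 3.0, 4.0]
unspliced = [0.5, 1.0, 1.5, 2.0]
gamma = steady_state_gamma(unspliced, spliced)

# A cell with excess unspliced mRNA gets a positive velocity (gene being induced).
induced = velocities([2.0], [2.0], gamma)
```

When most observed cells are in a transient state rather than at steady state, the fitted slope no longer reflects the true degradation rate, which is the failure mode the dynamical model is described as addressing.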

S57

Leveraging time-lapse and kinship information in single cell RNA Sequencing

Arne Wehling, Prof. Dr. Timm Schroeder

Cell Systems Dynamics Group, D-BSSE, ETH Zurich, Mattenstrasse 26, 4058 Basel, Switzerland

Trajectory inference tries to capture developmental dynamics in single cell transcriptomic datasets by assigning each cell a pseudo-temporal progression. Due to the lack of information on history and kinship, we cannot test these methods for accuracy, nor can they describe actual kinetics.

Here we present a novel experimental workflow linking quantitative long-term time-lapse imaging to single cell RNA sequencing (scRNA-Seq). A novel computational tool performs automated cell tracking during time-lapse imaging of cell cultures and identifies user-specified events to be isolated with a robotic picker for subsequent scRNA-Seq. Throughout this entire pipeline, time-lapse information including absolute time and cellular kinship is preserved and can thus be mapped to pseudotime models to test their accuracy.

We recently reported that murine hematopoietic stem cells undergo asymmetric cell division. Asymmetric inheritance of lysosomes predicts the metabolic activation of daughter cells receiving fewer lysosomes. Here, we now investigate the transcriptional programs following this asymmetric cell division by stratifying single cells based on their kinship, lysosomal inheritance and time since division. Our new insights are surprising and highlight the significance of time-lapse information in scRNA-Seq analysis.

The presented workflow has broad implications. It is suited for different cell types and for many questions that require linking cellular or molecular dynamics (e.g. of signaling and transcription factor networks) to high-dimensional endpoint transcriptome analysis.
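One simple way to test a pseudotime ordering against preserved time-lapse information is a rank correlation between inferred pseudotime and measured time since division. The sketch below computes a Spearman correlation with average ranks for ties; the choice of this statistic, the function names and the example values are assumptions, not the analysis used by the authors.

```python
import math


def average_ranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman rank correlation between two equally long sequences."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    if vx == 0 or vy == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(vx * vy)


# Hours since division (from imaging) and inferred pseudotime for four cells.
hours_since_division = [0.5, 2.0, 4.5, 9.0]
pseudotime = [0.1, 0.3, 0.6, 0.9]
agreement = spearman(hours_since_division, pseudotime)
```

A value near 1 means the pseudotime ordering matches the observed temporal order; it says nothing about actual kinetics, which require the absolute times themselves.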

S59

Transcriptional landscapes in health and disease using tissue images, single cell sequencing and in situ transcriptomes

Joakim Lundeberg

Cells are the fundamental units of life, yet we know surprisingly little about them. Specific types of cells exist in every organ, and serve specialized functions defined by the specific genes and proteins active in each cell type. Comprehensive maps of molecularly defined human cell types are underway through the Human Cell Atlas (https://www.humancellatlas.org) effort using primarily single cell RNA sequencing. The technologies to assemble spatiotemporal maps that will describe and define the cellular basis of health and disease are less clear. We have developed and established the Spatial Transcriptomics technology, in which tissue imaging is merged with spatial RNA sequencing and resolved by computational means. Spatial Transcriptomics technology was the first method to provide unbiased whole transcriptome analysis with spatial information from tissue using barcoded array surfaces and has, since the initial publication, been used in multiple biological systems in health and disease. Biological applications, benchmarking, and future drivers of the technology combined with single cell RNA sequencing will be described.

S61

High resolution anatomical mapping of gene expression using targeted in situ sequencing

Mats Nilsson

Department of Biochemistry and Biophysics, Stockholm University, Box 1031, SE-171 21 Solna, Sweden

Single-cell RNA-seq (scRNAseq) is a powerful tool to classify cells into molecularly defined cell types. However, information about spatial location within the original tissue is lost, and these questions often arise: How are the cell clusters spatially organized in the tissue? Is there a difference in spatial organization of cell clusters in healthy vs disease tissue? I will present work on developing and applying targeted in situ sequencing (ISS) to build spatial maps of scRNAseq-defined cell types in cm² sections of human and mouse tissues. Together with Kenneth Harris (UCL) we have developed a probabilistic approach to assign identity to individual cells in the tissue sections based on comparison of the ISS data with the profiles of scRNAseq clusters. We validated the approach by comparing our maps with the well understood organization of interneurons in mouse hippocampus, and excitatory neurons in iso-cortex. We mapped the activity of 99 marker genes across 14 sections of mouse brains, each comprising about 120 000 cells (Qian et al. (2020) Nature Methods, 17, 101-106). We have applied the method to draw spatial cell maps of human developmental heart tissue, where marker genes were selected from both Spatial Transcriptomics and scRNAseq data (Asp et al. (2019) Cell, 179, 1647-1660). We have improved the ISS chemistry to increase signal-to-noise ratio and detection efficiency, which has allowed us to map expression of 160 genes in human brain cortex, samples that are challenging to analyze due to high autofluorescence (Gyllborg et al. (2020) bioRxiv). We are also using our targeted in situ sequencing to map expression and mutational heterogeneity in tumors. By targeting mutations identified by deep sequencing, we create maps of clones of subtypes of cancer cells across tissue sections.
We then overlay these maps with in situ expression profiles of tumor marker genes, as well as immune cell type and activity markers, to create oncomaps with which we aim to predict treatment responses for different sub-clones of the tumor.
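The probabilistic assignment of cell identity from ISS reads can be illustrated with a deliberately simplified model: each gene's read count in a cell is treated as Poisson-distributed around the mean expression of a scRNAseq cluster, clusters have equal prior probability, and the posterior is obtained by normalising the likelihoods. This is a sketch under those assumptions and not the published method; the cluster names, profile values and the pseudo-count `eps` are hypothetical.

```python
import math


def log_poisson(k, lam):
    """Log-probability of observing k counts under a Poisson with mean lam."""
    return k * math.log(lam) - lam - math.lgamma(k + 1)


def assign_cell(counts, profiles, eps=1e-3):
    """Posterior probability of each cluster for one cell.

    counts:   dict gene -> observed ISS reads in the cell
    profiles: dict cluster -> dict gene -> expected reads per cell
    eps:      small pseudo-count so that zero means stay valid
    """
    log_like = {
        cluster: sum(log_poisson(k, profile.get(gene, 0.0) + eps)
                     for gene, k in counts.items())
        for cluster, profile in profiles.items()
    }
    top = max(log_like.values())  # subtract the maximum for numerical stability
    norm = sum(math.exp(v - top) for v in log_like.values())
    return {c: math.exp(v - top) / norm for c, v in log_like.items()}


# Illustrative two-gene profiles for two neuronal classes.
profiles = {
    "interneuron": {"Gad1": 5.0, "Slc17a7": 0.1},
    "pyramidal": {"Gad1": 0.1, "Slc17a7": 5.0},
}
posterior = assign_cell({"Gad1": 4, "Slc17a7": 0}, profiles)
```

Returning a posterior rather than a single label keeps ambiguous cells visible, which matters when closely related clusters differ in only a few of the targeted genes.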

S63

Spatially resolved cell atlasing via integration of spatial and single nucleus transcriptomics

Omer Bayraktar, Vitalii Kleshchevnikov, Artem Lomakin, Emma Dann, Artem Shmatko, Moritz Gerstung, Omer Bayraktar

Wellcome Sanger Institute, European Bioinformatics Institute

Multi-modal single cell genomics can reveal tissue architecture at an unprecedented level. Here, we present a probabilistic model that integrates single nucleus genomics and spatial transcriptomics to spatially map cell types, gene expression programs and cellular interactions at scale. We generated single-nucleus RNA-sequencing and Visium array-based spatial transcriptomics data from anatomically adjacent tissue sections in the mouse brain. We used this rich multi-modal dataset to derive reference cell signatures from single nucleus transcriptomics, then estimated the mRNA contribution of cell types to spatial tissue locations on the Visium data via a Bayesian model. We accurately located dozens of neuronal and glial cell types across major brain regions. Furthermore, we could distinguish neural subtypes across fine anatomical structures such as distinct pyramidal neurons within each layer of the cerebral cortex. We validated these predictions using single molecule fluorescent in situ hybridization as well as independently generated single cell transcriptomic references. We also distinguished cell states in the spatial data that were unobserved in dissociated single nucleus transcriptomics. Finally, we inferred tissue regions in an unbiased manner using spatially resolved cell types and prioritized cell-cell interactions identified from single nucleus transcriptomic data based on inferred cellular co-locations. Thus, we propose a simplified experimental and computational workflow to spatially resolve cell types across complex tissues at scale.
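Estimating how much mRNA each cell type contributes to a spatial location can be sketched as a non-negative regression of the spot's expression on the reference signatures. The example below substitutes multiplicative updates for non-negative least squares in place of the Bayesian model described in the abstract, so it yields point estimates without uncertainty. The function name, the cell type labels and all values are assumptions, and all inputs are expected to be non-negative.

```python
def deconvolve(spot, signatures, n_iter=500):
    """Non-negative weights w such that sum_k w[k] * signatures[k] approximates spot.

    spot:       list of expression values, one per gene
    signatures: dict cell type -> list of reference expression values (same gene order)
    """
    names = sorted(signatures)
    # Precompute S^T y and S^T S for the multiplicative update rule.
    sty = [sum(s * y for s, y in zip(signatures[k], spot)) for k in names]
    sts = [[sum(a * b for a, b in zip(signatures[j], signatures[k])) for k in names]
           for j in names]
    w = [1.0] * len(names)
    for _ in range(n_iter):
        updated = []
        for j in range(len(names)):
            denom = sum(sts[j][k] * w[k] for k in range(len(names)))
            # Multiplying by a non-negative ratio keeps every weight non-negative.
            updated.append(w[j] * sty[j] / denom if denom > 0 else w[j])
        w = updated
    return dict(zip(names, w))


# Toy signatures over three genes; the spot is 2 x neuron + 1 x astrocyte.
signatures = {"neuron": [10.0, 0.0, 1.0], "astrocyte": [0.0, 10.0, 1.0]}
spot = [20.0, 10.0, 3.0]
contributions = deconvolve(spot, signatures)
```

With well-separated signatures the weights converge within a few iterations; closely related cell types make the problem ill-conditioned, which is one reason a model with explicit priors can be preferable.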

S65

A human single-cell atlas of the Substantia nigra reveals novel cell-specific pathways associated with the genetic risk of Parkinson’s disease and neuropsychiatric disorders.

Viola Volpato, Devika Agarwal 1,2, Cynthia Sandor 3, Tara Caffrey 2, Jimena Monzon-Sandoval 3, Javier Alegre-Abarrategui 2,5, Richard Wade-Martins 2, Caleb Webber 2,3

1 Weatherall Institute of Molecular Medicine, John Radcliffe Hospital, University of Oxford, Oxford, UK 2 Department of Physiology, Anatomy, Genetics, University of Oxford, Oxford, UK 3 UK Dementia Research Institute, Cardiff University, Cardiff, UK 4 Wellcome Trust Centre for Human Genetics, Roosevelt Drive Oxford OX2 7BN 5 Department of Neuropathology, University of Oxford, Oxford, UK

We sequenced ~17,000 nuclei from both the Cortex and Substantia Nigra (SN) of five human brains and generated the first human single-nuclei transcriptomic atlas of the SN, a region playing important roles in reward and movement. By mapping genetic variants associated with different human traits to SN cell-type-specific gene expression patterns, we demonstrate for the first time that Parkinson's disease (PD) genetic risk, for which the symptoms are caused by loss of SN dopaminergic neurons (DaNs), is indeed associated with DaN-specific gene expression affecting pathways such as mitochondrial organisation and functioning, protein ubiquitination and vesicle transport. We also identify a distinct cell type association between PD risk and oligodendrocyte-specific gene expression. Unlike Alzheimer's disease (AD), we find no association between PD risk and microglia or astrocytes, suggesting that neuroinflammation plays a less causal role in PD than AD. Beyond PD, we find other neuropsychiatric disorders, particularly schizophrenia, to be associated with SN DaNs and GABAergic neurons. Nevertheless, we find that each neuropsychiatric disorder is associated with a distinct set of genes within those neuronal types. In contrast, we find an overlapping component of risk between neuropsychiatric disorders in the association with glial cells, particularly oligo-precursor cells (OPCs). This atlas provides the first objective associations between genetic risk of multiple disorders and the midbrain cell types these risks likely manifest through, thereby directing our aetiological understanding.

S67

Poster Presentations

Neural stem cell differentiation trajectories in the developing human brain revealed by whole-transcriptome in situ spatial profiling

Alexander Aivazidis, Kenny Roberts, Robin Fropf, Michael Rhodes, Joseph Beechem, Omer Bayraktar

Wellcome Sanger Institute, Nanostring Technologies

The emerging field of spatial genomics promises highly-multiplexed transcriptomic analysis of cells within intact tissues. Here, we present the application of a whole transcriptome wide in situ profiling method, the Nanostring GeoMx technology, to mapping the differentiation trajectories of neural stem cells in the fetal human brain. The newly developed GeoMx technology targets an mRNA panel of 18,000+ human protein-coding genes in situ. The transcriptome wide probe panel is hybridized to FFPE human brain tissue sections where each probe carries a photocleavable oligonucleotide tag. The probe tags are collected from cells and areas of interest (AOI) using the GeoMx workflow and prepared into sequencing libraries. After sequencing, expression profiles are mapped back to each AOI. We applied the GeoMx workflow to distinguish the transcriptomic profiles of neural stem cells, intermediate progenitors and neurons in the fetal human cerebral cortex at 14 and 19 post conception weeks. Using reference single cell RNA-sequencing datasets, we validated that the GeoMx technology distinguished cell-type specific expression profiles in situ. We examined cell type specific gene expression programs throughout the cortical germinal zones, subplate and the maturing cortical plate. We successfully identified spatiotemporal gene expression programs that correlate with neural stem cell self-renewal and differentiation. Finally, we demonstrated the heterogeneity of neural stem cells across different germinal zones as well as within each germinal zone. Hence, whole transcriptome in situ profiling reveals cell type specific gene expression programs with unprecedented spatial resolution.

P1

Epigenetic Profiling of Single-Cell Peripheral Blood Mononuclear Cells in response to P. falciparum infection

Bana Alamad1, Massar Dieng1, Wael Abdrabou1, Aissatou Diawara1, Manikandan Vinu1, Sodiomon B. Sirima 2, Issiaka Soulama2 and Youssef Idaghdour1.

1. Department of Biology, New York University Abu Dhabi, UAE [Lead Presenter Bana Alamad] 2. Centre National de Recherche et Formation sur le Paludisme (CNRFP), Ouagadougou, Burkina Faso

Malaria is a blood-borne, life-threatening disease that affects 300 to 500 million people annually; it is transmitted by Anopheles mosquitoes and caused by parasites of the genus Plasmodium. Upon infection, transcriptional activation of host immune cells is required to mount an efficient immune response. Both genetic and epigenetic mechanisms are involved in the regulation of transcription, but little is known about these effects in the host response to P. falciparum infection. Here we focus on documenting epigenetic and transcriptomic changes taking place in single Peripheral Blood Mononuclear Cells (PBMCs) collected from children in Burkina Faso upon P. falciparum infection. We performed gene expression profiling and 10X Genomics Chromium Single-cell Assay for Transposase-Accessible Chromatin (ATAC-seq) on PBMCs collected from children of two sympatric ethnic groups. scATAC-seq libraries were sequenced on a NovaSeq6000 instrument and analyzed using multiple bioinformatic pipelines. This analysis identified significant cell-type specific changes in chromatin accessibility as well as the genes and genomic regions underlying these changes. Pathway enrichment analysis implicates key adaptive immune pathways in these changes. These results provide valuable insights and resources that will assist in a deeper understanding of host in vivo immune response to malaria infection.

P2

AutoGeneS: Automatic Gene Selection using Multi-Objective Optimization for Bulk RNA-seq Deconvolution

Hananeh Aliee, Fabian J. Theis

Institute of Computational Biology, Helmholtz Centre, Munich, Germany Department of Mathematics, Technische Universität München, Munich, Germany

Bulk RNA samples are routinely collected and profiled for clinical purposes and biological research to study gene expression patterns in various conditions such as disease states. Such samples reflect the averaged gene expression across thousands of cells and so mask cellular heterogeneity in complex tissues. However, knowledge of cell type composition and cell type fractions gives enhanced insight into a tissue, for example, to identify the cell types which are targeted by a disease. Characterizing the variation of cell type composition also opens new avenues in analyzing the tremendous, yet underexplored, quantity of biomedical data already collected in clinics. Therefore, several computational methods have been proposed to infer the proportion of cell types from bulk RNA samples. The performance of these methods on real bulk RNA samples highly depends on the set of genes on which the deconvolution is applied. These genes are often selected based on prior knowledge or a single-criterion test that might not be suitable to dissect closely correlated cell types. Moreover, due to the advances in single-cell RNA-sequencing (scRNA-seq), many cell subtypes are still being discovered whose specific markers might not be known. Therefore, automatic techniques are desired to specify genes that can be employed for deconvolution. In this study, we introduce AutoGeneS, an automatic approach for extracting informative genes that can be used to reveal the cellular heterogeneity of bulk RNA samples. AutoGeneS requires no prior knowledge about marker genes and, for the first time, selects genes by simultaneously optimizing multiple criteria: minimizing the correlation and maximizing the distance between cell types. Moreover, AutoGeneS can be applied to reference profiles from various sources like single-cell experiments or sorted cell populations.
Results on human samples of peripheral blood illustrate that AutoGeneS outperforms other methods for the analysis of bulk RNA samples with noise and closely correlated cell types. Ground truth cell proportions by flow cytometry confirmed the accuracy of AutoGeneS's predictions in identifying cell type proportions. AutoGeneS is available for use via a standalone Python package (https://github.com/theislab/AutoGeneS).
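The two selection criteria named in the abstract, low correlation and large distance between cell type profiles, can be made concrete with a small sketch. It scores a candidate gene set by the summed pairwise Pearson correlation and summed pairwise Euclidean distance of the reference profiles restricted to those genes, and checks whether one candidate Pareto-dominates another. It does not use the AutoGeneS package or its search procedure; the function names and toy profiles are assumptions.

```python
import math
from itertools import combinations


def pearson(x, y):
    """Pearson correlation; returns 0.0 when either vector is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / math.sqrt(vx * vy)


def euclidean(x, y):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def objectives(profiles, gene_idx):
    """(summed pairwise correlation, summed pairwise distance) for a gene subset."""
    sub = [[profile[g] for g in gene_idx] for profile in profiles.values()]
    corr = sum(pearson(a, b) for a, b in combinations(sub, 2))
    dist = sum(euclidean(a, b) for a, b in combinations(sub, 2))
    return corr, dist


def dominates(a, b):
    """True if a is no worse than b on both objectives and better on at least one."""
    return a[0] <= b[0] and a[1] >= b[1] and (a[0] < b[0] or a[1] > b[1])


# Genes 0 and 1 separate the two cell types; genes 2 and 3 do not.
profiles = {
    "cell_type_A": [9.0, 1.0, 5.0, 6.0],
    "cell_type_B": [1.0, 9.0, 5.0, 6.0],
}
markers = objectives(profiles, [0, 1, 2])
mixed = objectives(profiles, [0, 2, 3])
```

Because the two objectives can conflict, a multi-objective search returns a set of non-dominated gene sets rather than a single optimum, and a final choice among them still has to be made.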

P3

Single cell transcriptomic analysis uncovers the potential of induced pluripotent stem cell differentiation into myeloid cells

Clara Alsinet, Maria Primo, Daniel Gaffney

Genomics of Gene Regulation, Wellcome Sanger Institute, Hinxton, UK

Induced pluripotent stem cells (iPSCs) can be differentiated into a number of cell types; however, differentiation processes are not fully understood, and their full potential and limitations are not clear. Here, we present a comprehensive single cell transcriptomic analysis of an embryoid body (EB)-based differentiation protocol from iPSCs to macrophages and dendritic cells (DCs). We collected samples every 2-3 days for 38 days. iPSCs express a mesodermal commitment profile when they form EBs and are cultured with BMP4, SCF and VEGF. These EBs express markers of primitive streak and mesoderm progenitors (MIXL1, LHX1, NODAL, FGF8, WNT3). Upon plating of the EBs in gelatin and stimulation with IL3 and M-CSF (macrophages) or FLT3L and GM-CSF (DCs), mesoderm progenitors give rise to two main cell compartments. First, hematopoietic myeloid lineage cells (PTPRC, CD38, RUNX1) were identified using gene signatures from a fetal liver hematopoiesis single cell dataset, including hematopoietic stem cells (HSCs), mast cells, megakaryocytes, erythrocyte progenitors, neutrophil-myeloid progenitors, monocytes and macrophages or DCs. Second, a stromal cell compartment shows some heterogeneity but consistently expresses VIM and fibroblast markers (IFITM3, DCN, SPARC, IGF2, IGFBP3). Besides these two larger classes, a number of side populations were observed. A distinct group of cells within the iPSCs does not respond to the cytokines and remains transcriptomically stable; its main characteristic is low expression of mitochondrial genes. Another cluster expresses mesendoderm markers (FOXA2, GSC, WNT3, FGF8) and progresses to definitive endoderm (AFP, FOXA3, IHH). A further population shows markers of the hemangioblast (CD34, ETV2, KDR, TAL1), a controversial embryonic cell type that could be a progenitor of extraembryonic primitive HSCs. 
Finally, along the stromal cells we unexpectedly find a cluster of cells expressing epithelial markers (EPCAM, CDH1, KRT8, KRT19) which eventually undergoes an epithelial to mesenchymal transition. Of note, all main and side populations described are found in 3 independent cell lines and, also, in both the macrophage and DC protocols despite the distinct cytokines used in each protocol.

Overall, we find clusters of cells covering the full myeloid hematopoiesis thus showing the applicability of this protocol to study cell types beyond macrophages or DCs. Importantly, these populations resemble cell types from embryonic origin. The unexpected populations identified highlight that a number of cell-fate decisions occur during the differentiation independently from the cytokines used. This further expands the range of biological questions that could be studied using single cell transcriptomics and differentiation protocols.

P4

Cell type classification with minimal marker gene sets

Felix Frauhammer and Simon Anders, Isabel Pötzsch and Margot Chazotte

Center for Molecular Biology, University of Heidelberg

To date, most data analysis workflows for single-cell RNA-Seq determine cell types by first assigning cells to clusters and then clusters to cell types, relying on the questionable assumption that all cells in a cluster are of the same type. Often, one knows which cell types to expect and also knows a few marker genes for each cell type. In that case, classification is a more natural approach, but methods for it have become available only very recently (namely, "Garnett" by Pliner et al. and "CellAssign" by Zhang et al.).

We present a classification method that performs reliable cell classification with minimal input. The user specifies a list of expected cell types and, for each cell type, a few marker genes or perhaps even just one. We then classify cells by first averaging marker gene expression over neighbouring cells, and then using the EM algorithm to fit a mixture of multivariate lognormal or Gamma-Poisson distributions to the resulting data. We ensure that the smoothing does not create artifacts by checking the Poisson deviance of the raw UMI counts from the neighbourhood averages.
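The smoothing step and the deviance check described above can be sketched as follows. This is a hedged illustration rather than the authors' code: the neighbour lists are assumed to be precomputed (for instance from a k-nearest-neighbour search), the pooled-rate estimator and all function names are hypothetical, and the mixture-model fit by EM is left out.

```python
from math import log


def smooth_rates(counts, totals, neighbours):
    """For each cell, pool the marker UMI counts and total UMI counts of its
    neighbourhood (which includes the cell itself) and return the pooled rate."""
    rates = []
    for nb in neighbours:
        rates.append(sum(counts[j] for j in nb) / sum(totals[j] for j in nb))
    return rates


def poisson_deviance(k, mu):
    """Unit Poisson deviance of an observed count k from an expected count mu."""
    if mu <= 0:
        return 0.0 if k == 0 else float("inf")
    term = k * log(k / mu) if k > 0 else 0.0
    return 2.0 * (term - (k - mu))


# Toy data: one marker gene in six cells, two neighbourhoods of three cells each.
counts = [3, 4, 2, 0, 0, 1]           # raw UMI counts of the marker gene
totals = [1000] * 6                   # total UMIs per cell
neighbours = [[0, 1, 2]] * 3 + [[3, 4, 5]] * 3

rates = smooth_rates(counts, totals, neighbours)
deviances = [poisson_deviance(k, r * t) for k, r, t in zip(counts, rates, totals)]
print(rates)
print(deviances)
```

Cells whose raw counts show a large deviance from their neighbourhood average would be candidates for having been over-smoothed; what counts as large is a choice this sketch does not make.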

We successfully applied this strategy to classify cell types in various data sets, including simple designs (PBMCs from one donor) and complex cohorts (diseased and healthy brain tissue from dozens of human donors). In a cohort study with human brain samples, we show that a single marker per cell type is sufficient for robust assignment of major cell types, even if this marker gene is lowly expressed and detected in only a fraction of the respective cells. Adding one marker per subtype to that panel resolves cortical neurons into their inhibitory and excitatory subtypes. Notably, we find that the same minimal set of markers, established on the above cohort of autism patients, was immediately applicable to another study, on multiple sclerosis, from another lab. This transferability is crucial for large projects aiming for objective and reproducible cell type annotations across studies, tissues and platforms, and for performing comparative inference across sample groups. Lastly, we compare the performance of different smoothing methods (nearest neighbor averaging, local Poisson regression on principal components, and others) and show preliminary evidence that our approach might, by design, exclude doublets and outlier cells. Our smoothing-classification approach will soon be available as an R package.

P5

Identification of mesenchymal stromal cells in a spermatogenic niche using bioinformatic approaches

M.S. Arbatsky, G.D. Sagaradze1,2, N.A. Basalova1,2, A.Yu. Efimenko1,2

1Department of Biochemistry and Molecular Biology, Faculty of Medicine, M.V. Lomonosov Moscow State University, Moscow, Russia, 2Institute for Regenerative Medicine, Medical Research and Education Center, Lomonosov Moscow State University, Moscow, Russia

Male infertility, manifested by impaired spermatogenesis, is a serious social and medical problem. For most idiopathic forms of male infertility, there is no effective treatment. Since the viability and function of spermatogonial stem cells (SSCs) that give rise to spermatozoa is regulated by a specific microenvironment, or stem cell niche, impaired function of the niche can cause infertility. An important contribution to this process can be made by stromal cells, in particular, mesenchymal stromal cells (MSCs). Thus, the participation of MSCs in the restoration of some stem cell niches was demonstrated, in particular, due to the secretion of the spectrum of bioactive factors by these cells. However, the localization and origin of MSCs in the spermatogenic niche and, therefore, the possibility of mobilization of the MSC pool in case of damage remain unexplored.

Recently, new methods for studying the cellular composition of tissues at the level of single cells have made it possible to establish various subpopulations of cells with a specific transcriptome profile. Therefore, the aim of this work was to use bioinformatic analysis to identify subpopulations of testicular stromal cells whose transcriptomes are similar to those of MSCs isolated from other tissues, in order to establish their localization and their relationship to known populations of SSC niche cells. To do this, we used single-cell sequencing datasets from human adipose tissue, as well as from testicular tissue samples obtained from healthy donors. To process the sequencing data, we used the cellranger count program (Cellranger-3.0.1), and the clustering results were combined using cellranger aggr (Cellranger-3.0.1). It was found that clusters enriched with genes overexpressed in MSCs are closest to clusters enriched with genes characteristic of Leydig cells and pericytes. Moreover, these groups are distant from clusters in which spermatogenic epithelial genes are overexpressed. Thus, it can be assumed that MSCs are localized in the interstitium of the testis, possibly in the perivascular zone, and are closest to Leydig cells in terms of the spectrum of expressed genes. Further comparison of MSCs with populations of testicular interstitial cells and the search for mechanisms that activate or enlarge the MSC pool by acting on individual components of the testicular interstitium will allow us to develop more effective approaches to the restoration of the SSC niche and, ultimately, the treatment of male infertility.

This work was supported by the Russian Science Foundation (project No. 19-75-30007).

P6

Epigenetic alterations and their functional consequences in naïve-to-memory B cell transition in Common Variable Immunodeficiency (CVID) one cell at a time

Javier Rodríguez-Ubreva (1)*, Anna Arutyunyan (2)*, Marc J. Bonder (3), Stephen J. Clark (4), Lucía Del Pino-Molina (5), Laura Ciudad (1), Luz Garcia-Alonso (2), Felix Krueger (3), Krzysztof Polanski (2), Lira Mamanova (2), Francesc Català-Moll (1), Virginia C. Rodríguez-Cortez (1), Maite Saavedra (6), Holger Heyn (7), Carlos Rodríguez-Gallego (6), Eduardo López-Granados (5), Sarah A. Teichmann (2), Gavin Kelsey (4), Oliver Stegle (3), Roser Vento-Tormo (2), Esteban Ballestar (1) * these authors contributed equally to this work

1 - Josep Carreras Leukaemia Research Institute (IJC), Badalona, Barcelona, Spain; 2 - Wellcome Sanger Institute, Cambridge, Hinxton, UK; 3 - European Bioinformatics Institute, Hinxton, Cambridge, UK; 4 - Babraham Institute, Cambridge, UK; 5 - Department of Clinical Immunology, IdiPAZ Institute for Health Research, University Hospital La Paz, Madrid, Spain; 6 - Hospital Universitario de Gran Canaria Dr. Negrín, Las Palmas de Gran Canaria, Canary Islands, Spain; 7 - CNAG-CRG, Center for Genomic Regulation (CRG), Barcelona Institute of Science and Technology (BIST), Barcelona, Spain; Universitat Pompeu Fabra, Barcelona, Spain.

Common variable immunodeficiency (CVID), the most prevalent symptomatic primary immunodeficiency, is characterised by impaired terminal B cell differentiation and defective antibody responses. CVID pathogenesis remains largely unresolved: only 25% of CVID cases can be explained by monogenic defects. Incomplete disease penetrance and the wide range of phenotypic expressivity in CVID suggest additional mechanisms of pathogenesis. Monozygotic (MZ) twins share almost all of their genetic variants as well as their environment before birth. Therefore, MZ twins discordant for a specific disorder are ideal models to characterise the contribution of epigenetics to disease development. We performed DNA methylomics, transcriptomics and chromatin accessibility analysis at the single-cell level on three B cell subsets (naive, non-switched and switched memory B cells) from a pair of CVID-discordant MZ twins. Our results show heterogeneous DNA methylation defects in CVID memory B cells in genomic regions associated with upregulated genes and active enhancers during B cell activation. Additionally, we identified changes in chromatin accessibility in the memory B cell compartment and very small changes in gene expression in these steady state cells. Furthermore, we activated PBMCs and performed scRNA-seq in order to investigate how the identified epigenetic defects in CVID influence both the establishment of transcriptional programs and the crosstalk between different immune compartments. We observed alterations in the expression of genes related to B cell differentiation and function, as well as crucial defects in the crosstalk between B cells and other immune subsets. Altogether, our findings point towards a CVID-intrinsic malfunction of the epigenetic machinery in B cells, as well as alterations in other immune cell compartments that will ultimately give rise to an altered phenotype. 
These findings have implications for understanding epigenetic regulation of naive-memory B cell transition, and the epigenetic regulatory layers underlying CVID.

P7

Fate Mapping of Vascular Smooth Muscle Cells in Atherosclerosis by Single-Cell RNA-Seq

Benjamin Auerbach (1), Huize Pan (2), Chenyi Xue (2), Mingyao Li (3), Muredach Reilly (2)

(1):Graduate Group in Genomics and Computational Biology, University of Pennsylvania Perelman School of Medicine, Philadelphia, PA 19104, USA; (2): Division of Cardiology, Department of Medicine, Columbia University Irving Medical Center, New York, NY 10032, USA; (3): Department of Biostatistics, Epidemiology and Informatics, University of Pennsylvania Perelman School of Medicine, Philadelphia, PA 19104, USA.

Vascular smooth muscle cells (VSMCs) play an important role in both maintaining the structural integrity and regulating the diameter of blood vessels. In health, VSMCs reside in the medial layer of the vessel wall. During atherogenesis, VSMCs have been observed to migrate from the medial layer to the atherosclerotic lesion. Recent studies suggest that VSMCs residing in the lesion phenotypically resemble fibroblast-endothelial intermediates, referred to as modulated VSMCs, which likely promote plaque stability. However, the causes of VSMC migration and the possible fates of VSMC-derived cells in atherosclerotic lesions remain unknown. To uncover VSMC-derived cell types during atherogenesis and reveal the underlying mechanisms of VSMC transdifferentiation, we combined lineage tracing with single-cell RNA-seq (scRNA-seq) in an LDLR-/- mouse model at four time points during atherogenesis (0, 8, 16, and 24 weeks of high fat diet). In total, we generated 27292 high-quality scRNA-seq cells with a median of 1869 detected genes per cell. Using an unsupervised deep-learning based clustering algorithm that is robust to batch effects, we identified 12 cell types, including VSMCs, modulated VSMCs, fibroblasts, endothelial cells, T cells, and macrophages. To identify candidate cues that provoke VSMC differentiation, we studied altered signaling during atherogenesis from cell types within the blood vessel. Using a database of receptor-ligand pairs, we identified candidate cell signals promoting VSMC migration from endothelial cells, macrophages, T cells, and fibroblasts. Cells positive for the VSMC lineage tracing reporter suggest that VSMCs can differentiate into three main phenotypes during atherogenesis: 1) modulated VSMCs, which have previously been observed, 2) traditional fibroblasts, and 3) CD68+ macrophages appearing late during atherogenesis. 
Through trajectory reconstruction and RNA velocity analyses, we characterized the transcriptional changes associated with these differentiation processes. We identified a set of genes associated with the decision of VSMC-derived cells to commit toward either modulated VSMCs or traditional fibroblasts. Further, using an in vitro stimulation protocol, we demonstrate that modulated VSMCs expressing pluripotent markers Ly6a and Ly6c1 are capable of differentiating into CD68+ macrophages. We are currently pursuing scRNA-seq analysis in human samples, which will be important for clarifying the applicability of mouse models as a means to study VSMC migration and fate during human atherosclerosis. Similarities in VSMC fates between human and mouse would underscore a need to characterize the role of VSMC-derived cells in both lesion formation and lesion stability. Ultimately, our results may point to therapeutic opportunities to prevent lesion formation and/or improve lesion stability in atherosclerosis.
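As a rough illustration of how a receptor-ligand database can be used to nominate candidate signals to VSMCs, the sketch below scores each sender cell type and ligand-receptor pair by the product of mean ligand expression in the sender and mean receptor expression in the receiver. The scoring rule, the expression threshold, the gene names (LigA, RecA and so on) and all values are hypothetical; the abstract does not specify the database or the statistic that was used.

```python
def rank_signals(mean_expr, pairs, receiver, min_expr=0.1):
    """Rank (sender, ligand, receptor) triples by ligand expression in the sender
    multiplied by receptor expression in the receiver cell type."""
    ranked = []
    for sender, profile in mean_expr.items():
        if sender == receiver:
            continue
        for ligand, receptor in pairs:
            lig = profile.get(ligand, 0.0)
            rec = mean_expr[receiver].get(receptor, 0.0)
            # Require both partners to be detectably expressed.
            if lig >= min_expr and rec >= min_expr:
                ranked.append((sender, ligand, receptor, lig * rec))
    return sorted(ranked, key=lambda item: item[3], reverse=True)


# Hypothetical mean expression per cell type (placeholder gene names and values).
mean_expr = {
    "endothelial": {"LigA": 4.0, "LigB": 0.5},
    "macrophage":  {"LigA": 0.2, "LigB": 3.0},
    "VSMC":        {"RecA": 2.0, "RecB": 1.0},
}
pairs = [("LigA", "RecA"), ("LigB", "RecB")]  # stand-in for a curated pair database

signals = rank_signals(mean_expr, pairs, receiver="VSMC")
for sender, ligand, receptor, score in signals:
    print(sender, ligand, receptor, score)
```

Comparing such scores between time points (for example 0 versus 16 weeks of high fat diet) is one way altered signaling could be flagged, again as an assumption rather than a description of the study's analysis.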

P8

BCL11A enables aberrant differentiation of TNBC tumour initiating cells

Karsten Bach*1,2,8, Sara Pensa*1,8, Kyren A. Lazarus*1,8, Mona Shehata7, Alasdair Russell2, Song Choon-Lee3, Pentao Liu3,4, Carlos Caldas2,8, John C. Marioni2,3,5,8, Walid T. Khaled1,6,8

1. University of Cambridge, Department of Pharmacology, Cambridge, CB2 1PD, UK 2. Department of Oncology and Cancer Research UK Cambridge Institute, Li Ka Shing Centre, University of Cambridge, Cambridge, CB2 0RE, UK 3. Wellcome Trust Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1HH, UK 4. Present address: School of Biomedical Sciences, University of Hong Kong, Hong Kong 5. European Bioinformatics Institute, European Molecular Biology Laboratory, Hinxton CB10 1SD, UK 6. Wellcome-MRC Cambridge Stem Cell Institute, Cambridge, CB2 0SZ 7. Medical Research Council Cancer Unit, University of Cambridge, Cambridge, CB2 0XZ, UK 8. Cancer Research UK, Cambridge Cancer Centre, Cambridge, CB2 0RE, UK *These authors contributed equally

Much of our understanding of cancer biology is based on analysis of late stage disease. Little is known about how subtype-specific genetic aberrations impact the fate of tumour initiating cells at the early stages. To shed light on this, here we examine the interplay between the triple negative breast cancer (TNBC) oncogene, BCL11A, and the putative cell of origin, luminal progenitors. We demonstrate that the deletion of Bcl11a completely protects the Brca1/p53 TNBC mouse model from developing tumours. Single-cell RNA-sequencing analysis of pre-cancerous tissue from the Brca1/p53 mice revealed an aberrant differentiation of luminal progenitors towards secretory luminal cells, which phenotypically resemble those found during gestation. We show that this aberrant differentiation is dependent on Bcl11a, as its deletion reverses the phenotype. In humans, analysis of luminal progenitor cells from BRCA1 carriers who had undergone prophylactic mastectomies also detected this aberrant differentiation phenotype in some of the patients. As such, our data demonstrate that this aberrant differentiation process is one of the earliest steps in TNBC tumour initiation and provide a potential avenue for stratifying women at high risk of TNBC.

P9

A gene expression atlas of the Drosophila midgut in intestinal homeostasis and bacteria-feeding conditions

Josephine Bageritz1, Felix Frauhammer2, Michaela Wölk3, Schayan Yousefian1, Svenja Leible1, Erica Valentini1, Simon Anders2, Michael Boutros1

1German Cancer Research Center (DKFZ), Heidelberg, Germany; 2 Center for Molecular Biology (ZMBH), University of Heidelberg, Germany; 3Friedrich Miescher Institute for Biomedical Research, Basel, Switzerland

Intestinal homeostasis is maintained by continuous renewal of the epithelium. In Drosophila, a population of proliferative intestinal stem cells (ISCs) gives rise to an immature progenitor, called the enteroblast (EB), which further differentiates into an absorptive enterocyte (EC) or a secretory enteroendocrine (EE) cell type. To accomplish the different tasks in the digestive process, the intestine consists of distinct regions, which differ morphologically and functionally in both the epithelium and the intestinal stem/progenitor cell population. Those regional differences are additionally reflected in the transcriptomic landscape of the intestinal tissue. Great advances in single cell technologies and computational methods enable us to further elucidate the transcriptional heterogeneity within the Drosophila midgut in homeostatic and bacteria-challenged conditions. We provide an interactive visualization tool that allows users to explore our single-cell transcriptome data for their specific interest. While ECs and EEs can be unambiguously identified by scRNA-seq, ISCs and EBs were found to have a similar transcriptional profile in the unperturbed condition. More in-depth data analysis allowed us to separate ISCs from their immediate progeny and consequently to better characterize both cell types. Novel marker genes could be identified and validated. While regional information was not apparent in the stem/progenitor cell pool, the population of profiled EEs and ECs clearly shows regional heterogeneity. An elevation of pathogen-response gene expression in ECs and the appearance of a proliferative ISC cluster are seen under bacteria-feeding. Interestingly, our single-cell transcriptome data suggest that the EB population expresses a high number of mRNA transcripts and directly responds to the pathogenic bacteria. 
In sum, our single-cell study explores the differences in ISC/EB transcriptome and offers a user-friendly tool to explore gene expression in the intestinal cells upon bacteria-feeding as well as intestinal homeostasis.

P10

Single cell RNA-seq is plagued by transcriptome-wide expression loss

Brad Balderson, Mikael Boden

School of Chemistry and Molecular Bioscience, The University of Queensland

Single cell RNA-sequencing (scRNA-seq) quantifies the RNA content in thousands of individual cells. These data are used to discover rare cell types, identify regulatory genes, and detect changes in gene expression that occur as cells differentiate. Lowly expressed genes in scRNA-seq are inadequately measured due to insufficient RNA in a single cell, resulting in a technical phenomenon known as the dropout effect. The dropout effect occurs when a gene is measured as having no expression although it is actually expressed. Using scRNA-seq and RNA-seq performed on equivalent cells at different dilutions, we show that missing gene expression is not limited to dropout. Rather, there is a transcriptome-wide expression loss that we term the dilution effect. The dilution effect results in lowly expressed genes losing more units of expression than is available, therefore resulting in the commonly observed dropout effect. Hence the dropout effect is a symptom of a much larger problem, and statistical solutions should focus on correcting expression loss across the transcriptome to address dropout events.
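A toy simulation can make the link between transcriptome-wide loss and dropout concrete. The sketch below assumes, purely for illustration, that each RNA molecule is captured independently with a fixed probability (binomial thinning); this is not necessarily the model the authors fit, and the gene names, counts and capture probability are made up. Under this assumption every gene loses the same fraction of its expression, but only lowly expressed genes frequently end up at zero.

```python
import random


def dilute(true_count, capture_prob, rng):
    """Number of molecules captured when each is kept with probability capture_prob."""
    return sum(1 for _ in range(true_count) if rng.random() < capture_prob)


def simulate(true_counts, capture_prob, n_cells, seed=0):
    """Simulate n_cells identical cells and summarise observed expression per gene."""
    rng = random.Random(seed)
    summary = {}
    for gene, true_count in true_counts.items():
        observed = [dilute(true_count, capture_prob, rng) for _ in range(n_cells)]
        summary[gene] = {
            "mean_observed": sum(observed) / n_cells,
            "dropout_rate": sum(1 for x in observed if x == 0) / n_cells,
        }
    return summary


# Hypothetical true molecule counts per cell for a lowly and a highly expressed gene.
summary = simulate({"low_gene": 2, "high_gene": 200}, capture_prob=0.1, n_cells=500)
for gene, stats in summary.items():
    print(gene, stats)
```

In this setting, correcting only the zeros would leave the proportional loss in the highly expressed gene untouched, which is the point the abstract makes about addressing expression loss across the whole transcriptome.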

P11

Single-cell RNA-Seq reveals recurrent and unique gene modules in diverse cancer types

Dalia Barkley, Reuben Moncada, Itai Yanai

NYU Grossman School of Medicine

While intratumoral heterogeneity has been extensively studied in terms of genetic alterations and phenotypic properties such as drug resistance, only recently has single-cell RNA-seq begun to expose the remarkable transcriptional diversity within tumors. Using this technology, independent reports over the past five years have found evidence for distinct transcriptional states among cancer cells, exposing the complex transcriptional architecture of tumors. To search for commonalities between cell states in diverse cancer types, we collected 17 primary untreated tumors spanning 11 cancer types and processed them for single cell RNA-Seq analysis. To augment our dataset, we also included 9 tumors from 2 previous studies of glioblastoma and oligodendroglioma. Analyzing the transcriptomic data, we identified the malignant cells and performed non-negative matrix factorization (NMF) on these to identify groups of genes with similar expression. To search for commonalities across the gene groups, we adopted a graph theoretic approach and identified 15 conserved gene modules found across multiple cancer types. All gene modules were enriched in several terms, suggesting that rather than innovating a cancer-specific attribute, tumors co-opt existing gene modules. Specifically, the modules included cell cycle, antigen presentation, interferon presentation, stress response, and hypoxia. Seeking to understand how these gene modules are assembled at the level of individual cells, we scored the expression of each module in each cell, and found that individual gene modules are expressed at similar cell frequencies across tumors. We conclude that cancer cell states are likely an important factor in the tumor dynamic system, with potentially significant clinical impact. Further studies of the relationships between cancer cell states will be critical to understand how malignant cells form organized systems capable of immune evasion, drug resistance, invasion, and metastasis.
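The per-cell module scoring mentioned above can be sketched in a few lines. The scoring rule used here (mean expression of the module's genes in each cell), the threshold for calling a cell positive, and the gene names and values are assumptions for illustration only; the abstract does not state how modules were scored.

```python
def module_scores(cells, module_genes):
    """Score each cell as the mean expression of the module's genes (missing genes count as 0)."""
    return [sum(cell.get(g, 0.0) for g in module_genes) / len(module_genes)
            for cell in cells]


def module_frequency(cells, module_genes, threshold):
    """Fraction of cells whose module score exceeds the threshold."""
    scores = module_scores(cells, module_genes)
    return sum(1 for s in scores if s > threshold) / len(scores)


# Hypothetical normalised expression for four malignant cells from one tumor.
tumor = [
    {"G1": 2.0, "G2": 3.0, "G3": 0.0},
    {"G1": 0.0, "G2": 0.0, "G3": 1.0},
    {"G1": 1.0, "G2": 1.0, "G3": 0.0},
    {"G1": 0.0, "G2": 0.5, "G3": 2.0},
]
module = ["G1", "G2"]  # placeholder gene module

scores = module_scores(tumor, module)
frequency = module_frequency(tumor, module, threshold=0.5)
print(scores, frequency)
```

Repeating the frequency calculation for each tumor would give the per-tumor cell frequencies that the abstract compares across samples.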

P12

Low-input, deterministic profiling of single-cell transcriptomes reveals individual intestinal organoid subtypes comprised of single, dominant cell types

Johannes Bues 1, Marjan Biocanin 1, Joern Petzold 1, Riccardo Dainese 1, Saba Rezakhani 2, Revant Gupta 3, Esther Amstad 4, Manfred Claassen 3, Matthias Lutolf 2, Bart Deplancke 1

1 - Laboratory of Systems Biology and Genetics, Institute of Bioengineering, School of Life Sciences, Ecole Polytechnique Fédérale de Lausanne (EPFL), Lausanne, Switzerland 2 - Laboratory of Stem Cell Bioengineering, Institute of Bioengineering, School of Life Sciences, Ecole Polytechnique Fédérale de Lausanne (EPFL), Lausanne, Switzerland 3 - Institute of Molecular Systems Biology, Department of Biology, Eidgenössische Technische Hochschule Zürich (ETH Zürich), Zürich, Switzerland 4 - Soft Materials Laboratory, Institute of Materials, École Polytechnique Fédérale de Lausanne (EPFL), Lausanne, Switzerland

High-throughput single-cell RNA-sequencing (scRNA-seq) has become a transformative tool for exploring the cellular components of tissues. An essential development was the introduction of microdroplet-based systems, which vastly increased throughput and permitted routine experimentation on samples, but which are limited to relatively high cell numbers (> 1000). Despite these improvements, efficient processing of rare cells or small tissues remains a bottleneck of current approaches and poses a substantial limit on addressing key questions in research and diagnostics. In this study, we developed a deterministic droplet system that overcomes these limitations and employed it to map heterogeneity in the context of an intestinal organoid model system. By combining machine vision and multilayer microfluidics, we achieve position control of particles, droplet formation, and droplet sorting on a single microfluidics chip, effectively permitting deterministic co-encapsulation of mRNA-capture beads with single cells. We utilized this system to profile minuscule numbers of cells (< 100 cells) through effective capture and sequencing library conversion. As a case study, to underscore the unique capabilities of the system, we mapped the developmental process of individual intestinal organoids at the single cell level. We identified four organoid subtypes and two stem-cell populations, underscoring the heterogeneity across organoids under conventional intestinal organoid culture conditions. Furthermore, we used these data to investigate the cause of heterogeneity in one of the organoid subtypes. Overall, our technological platform enables us to cost-efficiently probe the heterogeneity of ultra-rare samples by deterministically processing each input cell.

P13

Next generation In Situ Sequencing for spatial single cell RNA sequencing in intact tissues

Anton Björninen, Iván Hernández, Malte Kühnemund, Xiaoyan Qian, Toon Verheyen

CartaNA AB, Stockholm, Sweden

Single cell RNAseq methods lose the spatial organization of cells inside tissues. To overcome this, a range of spatial genomics/transcriptomics techniques have been devised over the past years. In Situ Sequencing (ISS) takes place in morphologically preserved tissue sections and can be used to sequence clonally amplified barcode sequences that are introduced by ligation of gene-specific probes (Ke et al., Nat Methods, 2013). ISS analyses hundreds of genes inside intact tissues with single cell resolution. In contrast to spatial sequencing methods (ST, Slideseq), ISS has single cell resolution and is targeted at the genes of interest, eliminating reads from non-informative housekeeping RNA. In comparison to multiplexed FISH methods, ISS has a higher throughput (~10 cm2 tissue/week/standard fluorescent microscope) and detects mutations and splice variants with high specificity. At CARTANA, we have developed the next generation of in situ sequencing (NGISS) with increased sensitivity. It omits cDNA synthesis by using a padlock probe derivative that directly targets RNA with high sensitivity and retained specificity. Today, CARTANA serves customers in neuroscience, embryonic development and disease research and will soon expand into oncology. CARTANA is contributing to the Human Cell Atlas initiative as a partner in 3 European HCA projects, with the main task of spatially mapping scRNAseq-defined cell types in a range of tissue types.

P14

Genotyping and demultiplexing bulk and single-cell RNA-seq with kmuxlet

A. Sina Booeshaghi [1], Páll Melsted [2], Nicolas Bray, Vasilis Ntranos [3,4,5,6], and [7,8]

1. Department of Mechanical Engineering California Institute of Technology, Pasadena, California 2. Faculty of Industrial Engineering, Mechanical Engineering and Computer Science, University of Iceland, Reykjavik, Iceland 3. Department of Epidemiology and Biostatistics, UCSF 4. Department of Bioengineering and Therapeutic Sciences, UCSF 5. Bakar Computational Health Sciences Institute, UCSF Diabetes Center, UCSF 6. Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, California 7. Department of Computing and Mathematical Sciences, California Institute of Technology, Pasadena, California

We present a workflow for genotyping individual scRNA-seq samples, and for demultiplexing pooled samples without a reference genotype. We show theoretically and via simulations that our method, which relies on fast matrix completion, can demultiplex hundreds of individuals, and we quantify the coverage required. Our approach is based on the kallisto and bustools single-cell RNA-seq pre-processing tools, and is extremely fast and memory efficient. We also demonstrate the advantages of our method, called kmuxlet, on experimental data from Kanton et al., Nature, 2019, consisting of several pooled human organoids.

P15

NicheNet: Modeling Intercellular Communication by Linking Ligands to Target Genes

Robin Browaeys[1,2], Johnny Bonnardel[3,4], Wouter T'Jonck[3,4], Liesbet Martens[3,4], Charlotte Scott[3,4], Martin Guilliams[3,4], Wouter Saelens[1,2]*, Yvan Saeys[1,2]*

[1]: Data Mining and Modeling for Biomedicine, VIB Center for Inflammation Research, Ghent, Belgium; [2]: Department of Applied Mathematics, Computer Science and Statistics, Ghent University, Ghent, Belgium; [3]: Laboratory of Myeloid Cell Ontogeny and Functional Specialization, VIB Center for Inflammation Research, Ghent, Belgium; [4]: Department of Biomedical Molecular Biology, Ghent University, Ghent, Belgium; *: These authors contributed equally.

Spatial and single-cell transcriptomics allow the study of intercellular communication in unprecedented depth. Current computational methods study intercellular communication from these data types by linking ligands expressed by sender cells to their corresponding receptors expressed by receiver cells. However, these methods fall short in providing functional understanding of an intercellular communication process because they do not cover the influence of these ligand-receptor interactions on intracellular signaling pathway activation and target gene expression. To address this need, we developed a computational method called NicheNet (https://github.com/saeyslab/nichenetr). NicheNet prioritizes ligand-receptor interactions based on their gene regulatory effects, and predicts their specific downstream target genes. NicheNet can thus infer which ligands produced by one cell influence the expression of which target genes in an interacting cell. Ligand-target links are predicted by integrating transcriptome data of interacting cells with existing knowledge on ligand-receptor, signaling, and transcriptional regulatory networks. The existing knowledge is represented by a prior knowledge model that was calculated by network-based data-integration of several complementary publicly available data sources. In this study, we first performed an exhaustive computational validation of NicheNet. As gold standard, we collected transcriptome data of cells before and after they were stimulated by a ligand in culture. We assessed the accuracy, popularity and cell-type bias of the model in predicting the transcriptional response, but also in predicting the ligand of interest given this response. For the latter prediction task, NicheNet outperforms existing methods for upstream regulator analysis while being less popularity biased. 
Together, these results support the proposed data-integration methodology and demonstrate that NicheNet could be generally applicable to a broad set of different biological systems. As a first case study, we applied NicheNet to scRNAseq data of malignant cells and cancer-associated fibroblasts to study tumor-microenvironment interactions in head and neck squamous cell carcinoma. We demonstrate how NicheNet can prioritize fibroblast ligands with a role in promoting a partial EMT program in malignant cells. As a second case study, we applied NicheNet to bulk transcriptomics data of mouse Kupffer cells (KCs) and interacting niche cells. NicheNet predicted that a combination of ligands from the Delta-Notch and BMP families could drive the expression of crucial KC-identity target genes. This hypothesis was experimentally validated in vitro, highlighting the hypothesis-generating potential of NicheNet. In conclusion, we expect that NicheNet will be a useful tool to better study the functional effects of cell-cell communication processes in health and disease states.
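The idea of prioritizing ligands by their gene regulatory effect can be illustrated with a small sketch. It assumes that each ligand comes with prior ligand-target scores and ranks ligands by the Pearson correlation between those scores and a 0/1 indicator of membership in the gene set of interest in the receiver cell. The scoring rule, the function names and all numbers are assumptions made for this illustration and are not taken from the nichenetr package.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length lists (0.0 if either is constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def rank_ligands(prior, genes, gene_set):
    """Rank ligands by how well their prior target scores match the gene set."""
    indicator = [1.0 if g in gene_set else 0.0 for g in genes]
    scores = {
        ligand: pearson([targets.get(g, 0.0) for g in genes], indicator)
        for ligand, targets in prior.items()
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical toy data: two ligands, four background genes.
genes = ["G1", "G2", "G3", "G4"]
prior = {
    "LIG_A": {"G1": 0.9, "G2": 0.8, "G3": 0.1, "G4": 0.0},
    "LIG_B": {"G1": 0.1, "G2": 0.0, "G3": 0.7, "G4": 0.9},
}
# Genes assumed to respond in the receiver cell.
ranking = rank_ligands(prior, genes, gene_set={"G1", "G2"})
print(ranking)
```

In this toy case LIG_A ranks first because its high-scoring targets coincide with the responding genes.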

P16

Deep learning for cell-type annotation tasks does not outperform classical machine learning

Mrs Maren Buettner, Niklas Koehler, Niry Adriamanga, Fabian J. Theis

Institute of Computational Biology, Helmholtz Center Munich for Environmental Health, Neuherberg, Germany; Department of Mathematics, TU Munich, Germany

Deep learning has revolutionized image analysis and natural language processing with remarkable accuracies in prediction tasks, such as image labeling or word identification. The origin of this revolution was arguably the deep learning approach by the Hinton lab in 2012, which halved the error rate of existing classifiers on the then 2-year-old ImageNet database. In hindsight, the combination of algorithmic and hardware advances with the appearance of large and well-labeled datasets led up to this seminal contribution. The emergence of large amounts of data from single-cell RNA-seq and the recent global effort to chart all cell types in the Human Cell Atlas have attracted interest in deep-learning applications. However, all current approaches are unsupervised, i.e., they learn latent spaces without using any cell labels, even though supervised learning approaches are often more powerful in feature learning. Therefore, we investigate whether the increasingly large datasets and cell-type labels can be a playground for supervised deep learning in order to learn cell-type identity in new single-cell datasets. Notably, deep learning does not outperform classical machine-learning methods in this task. Cell-type prediction based on gene-signature derived cell-type labels is potentially a simplistic task for complex non-linear methods, and better labels of functional single-cell readouts are ideally required. We, therefore, are still waiting for the "ImageNet moment" in single-cell genomics.

P17

Building on the Seq-Well single cell platform

Lia Chappell1, Lauren Deighton1, Thierry Voet1,2

1. Wellcome Trust Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridge, UK 2. Dept. of Human Genetics, KU Leuven, Leuven, Belgium

Recently developed methods for sequencing thousands of cells in parallel are beginning to revolutionise the scale of single cell experiments. We wish to develop new protocols at this scale, and have explored droplet-based methods including Drop-seq and the 10x Chromium platform. However, these approaches have limitations in either throughput or cost, and rely on custom instrumentation, which limits their application across a range of labs.

At a previous single cell conference, we were excited to hear about the "Seq-Well" platform (Gierahn et al., 2017; PMID:28192419), presented by the Shalek group (MIT); this approach has the potential to address some of the limitations we found with droplet-based methods. We have been testing the performance of this platform in our hands at the Sanger Institute (Cambridge, UK) on a range of cell types and with a number of operators. We will present a summary of our key results from experiments conducted using the Seq-Well platform. We will also present data from our modifications to the platform, including variations on the barcoded beads that are a key component.

P18

BatchBench: a systematic comparison of scRNA-Seq data integration methods

Ruben Chazarra-Gil (1), Vladimir Kiselev (2), Martin Hemberg (3)

Polytechnical University of Valencia (UPV) (1), Sanger Institute (2,3)

Single-cell RNA sequencing (scRNA-Seq) has become an immensely powerful technology capable of genome-wide characterization and quantification of the mRNA transcripts of thousands of individual cells. Existing scRNA-Seq protocols differ considerably in molecule capture efficiency, transcript read length and other factors, and these differences shape the resulting data. Hence, data integration from different sources has to be assessed carefully to avoid misleading results. Additionally, in a context where large-scale efforts are attempting to produce cellular maps of entire cell lineages, organs and organisms, the data generated contain both biological and technical variability, which contributes to strong batch effects. These effects must be overcome to meaningfully analyse datasets of different origins. In this study we benchmark some of the most popular batch effect removal tools for scRNA-Seq data. We highlight their methodological differences and assess their performance and computational requirements through a compendium of well-studied datasets. This systematic comparison, integrated into a workflow, serves as an introductory exploration of scRNA-Seq integrative methods, providing users with better guidance in the choice of tools for aligning their data, and potentially helping in the future development of new methods.

P19

A weighted support vector machine method for identifying marker-tagged sub-populations in scRNAseq data

PEIKAI CHEN, RON WU, PETER J N LU, TIFFANY AU, P C SHAM, KATHRYN S E CHEAH

1, School Of Biomedical Sciences, The University Of Hong Kong, Hong Kong; 2, Centre For Panoromic Sciences (Cpos), The University Of Hong Kong, Hong Kong

Single-cell transcriptomics by sequencing (scRNAseq) is a fast-developing RNA profiling technology that offers genome-wide transcriptomic information at a per-cell resolution. Numerous techniques have been developed to address issues in batch-effect correction, dimension reduction, clustering and population identification, detection of differentially expressed genes (DEGs), and co-expression and regulatory network analyses. Nonetheless, due to the inherent difficulty of capturing all transcript molecules, conventional protocols, including smartseq2, 10X, and DropSeq, still suffer from dropout effects. As a result, neighbouring cells of the same cluster often demonstrate inconsistent expression for certain marker genes, particularly for transcription factors, which tend to have lower expression levels. This causes problems in assigning the correct labels to cells, and reduces the sensitivity and accuracy of detecting DEGs. Here, we present a weighted support vector machine (wSVM) with radial kernels to draw decision boundaries for given markers. This approach, semi-supervised in principle, regards the cells tagged by certain numbers of markers as the designated cell-type, and all other cells as the other cell-type. It then fits the boundary between the two. All cells within the boundary that defines the target cell-type are designated as that cell-type, regardless of their marker profiles. We iterated this for a number of marker sets to obtain labels and boundaries for the whole dataset, from which we can detect the population signature and derive novel markers. We applied this technique to two of our in-house data sets: 1) a scRNAseq data set (10X Genomics) of differentiating cells derived from mouse embryonic stem cells (ESCs), and 2) a scRNAseq data set of enteric neural crest cells (ENCCs) in the developing mouse gut.
Results show that our technique was effective in predicting sub-populations within an otherwise connected big cluster of cells, particularly in the earlier stage of differentiation.
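The marker-tagging step that precedes the boundary fit can be sketched as follows. This is a minimal illustration under stated assumptions: the detection threshold, the minimum number of markers, the inverse-frequency class weights and all names and values are hypothetical, and the weighted SVM fit itself is not shown.

```python
def tag_cells(expression, markers, min_markers=2, detect_threshold=0.0):
    """Label a cell 1 if it expresses at least `min_markers` of `markers`, else 0."""
    labels = {}
    for cell, profile in expression.items():
        n_detected = sum(
            1 for marker in markers if profile.get(marker, 0.0) > detect_threshold
        )
        labels[cell] = 1 if n_detected >= min_markers else 0
    return labels


def balanced_class_weights(labels):
    """Inverse-frequency class weights (an assumed weighting scheme)."""
    counts = {}
    for label in labels.values():
        counts[label] = counts.get(label, 0) + 1
    n_cells, n_classes = len(labels), len(counts)
    return {label: n_cells / (n_classes * count) for label, count in counts.items()}


# Hypothetical expression values for three markers M1-M3 in five cells.
expression = {
    "cell1": {"M1": 3.0, "M2": 1.0},
    "cell2": {"M1": 2.0},             # M2 possibly lost to dropout
    "cell3": {},
    "cell4": {"M1": 1.0, "M2": 2.0, "M3": 4.0},
    "cell5": {},
}
labels = tag_cells(expression, markers=["M1", "M2", "M3"])
weights = balanced_class_weights(labels)
print(labels, weights)
```

Here cell2 is tagged as "other" because only one marker is detected; a cell like this, lying among tagged neighbours, is the kind of case the fitted boundary is intended to reassign.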

P20

Robust Representations for Single-cell Data Using DenseFly

Sijie Chen, Yixin Chen, Wenchang Chen, Haoxiang Gao, Xuegong Zhang

Department of Automation, MOE Key Laboratory of Bioinformatics; Bioinformatics Division and Center for Synthetic & Systems Biology, BNRist, Tsinghua University, Beijing 100084, China.

Finding proper representations for the molecular profiles of the individual cells is the foundation of single-cell data integration and interpretation. A good representation should capture the biological variations beneath the data, alleviate the technical variations present in the data, and be tolerant of unwanted noise. We studied a fly-neuron-inspired algorithm, DenseFly, for embedding cells into a locality sensitive hashing (LSH) space. The simulation and real-data experiments show that the DenseFly representation of cells is robust to dropout events and batch effects. In PBMC datasets, the DenseFly representation aggregates cells of the same type together across multiple technical origins without explicit batch-removal transformations. DenseFly also aligns cell identities/states across different omics datasets (scATAC-seq and scRNA-seq) and across different organisms (human and mouse, with curated expressional features). As an LSH-based representation, DenseFly is flexible and adaptable for robustly mapping cells across different scRNA-seq datasets and establishing links among different omics datasets. DenseFly takes linear time to embed or query a cell; thus, it can be an efficient tool for cell atlas searching.
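To make the notion of a locality sensitive hashing space concrete, the sketch below hashes expression vectors with generic sign random projections and compares cells by Hamming distance. This is a textbook LSH scheme used here as an assumption for illustration; it is not the DenseFly algorithm, and the function names and toy values are hypothetical.

```python
import random


def make_projections(n_features, n_bits, seed=0):
    """Draw one Gaussian random direction per hash bit."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(n_features)] for _ in range(n_bits)]


def lsh_hash(vector, projections):
    """One bit per projection: 1 if the projected value is positive, else 0."""
    return tuple(
        1 if sum(w * v for w, v in zip(row, vector)) > 0 else 0
        for row in projections
    )


def hamming(a, b):
    """Number of positions at which two hashes differ."""
    return sum(x != y for x, y in zip(a, b))


projections = make_projections(n_features=4, n_bits=16)

cell = [5.0, 0.0, 2.0, 1.0]           # hypothetical expression profile
deeper = [2.0 * v for v in cell]      # same profile at twice the depth
opposite = [-v for v in cell]         # artificial, maximally dissimilar direction

h_cell = lsh_hash(cell, projections)
print(hamming(h_cell, lsh_hash(deeper, projections)))    # identical hash
print(hamming(h_cell, lsh_hash(opposite, projections)))  # every bit differs
```

Hashing a cell costs one pass over its features per bit, which is the sense in which such a representation can be computed in time linear in the profile length.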

P21

Single-cell RNA sequencing of human cardiac fibrosis

Sonia Chothani, Konstantinos Vanezis, Elaine Yiqun Cao, Sebastian Schäfer, Owen J.L. Rackham, Stuart A. Cook

Program in Cardiovascular and Metabolic Disorders, Duke–National University of Singapore Medical School, Singapore (S.C, E.Y.C, S.S, O.JL.R, S.A.C). National Heart Centre Singapore, Singapore (S.S, S.A.C). National Heart and Lung Institute, Imperial College London, London, UK (K.V, S.A.C)

Cardiovascular diseases are one of the most prominent causes of death worldwide and cause nearly 1 in every 3 deaths. In most cases, these diseases culminate in tissue fibrosis driven by the transition of resident fibroblasts to myofibroblasts, resulting in a remodelling of the myocardium. Recently, we have shown that RNA-binding proteins (RBPs) are key regulators of cardiac fibrosis and that a reduction in their expression can blunt the fibroblast-to-myofibroblast transition. In order to further understand the role of these RBPs within different fibroblast sub-populations, we carried out single-cell RNA-sequencing of primary human fibroblasts from two donors. These cells were stimulated with transforming growth factor beta 1 (TGFB1), a known driver of the fibroblast-to-myofibroblast transition, over three time points (0h, 6h, 24h). Subsequent analysis of the 56,536 cells in the data showed that the transition of fibroblasts to myofibroblasts was marked by the upregulation of alpha-smooth muscle actin expression. To better understand the dynamics of the system we carried out a number of pseudotime analyses. Based on these, we identified genes that were switching during the transition and found that actin-binding pathways were activated first, followed by genes involved in extracellular matrix organization. Furthermore, we identify several sub-populations during this transition in which our previously identified RNA-binding proteins were specifically expressed, providing a clearer understanding of their regulation. With current technological advances in single-cell RNA sequencing, fibroblasts have been shown to have different sub-populations with roles including inflammation and ECM production, and this study can aid in disentangling the different roles of these fibrotic drivers.

P22

A functional characterisation of the role of DNA methylation in cell fate specification during early mammalian embryogenesis

Dr Stephen Clark, Tim Lohoff, Ricard Argelaguet, Jennifer Nichols, Oliver Stegle, John C. Marioni, Wolf Reik

1. Epigenetics Programme, Babraham Institute, Cambridge CB22 3AT, UK; 2. Wellcome-MRC Cambridge Stem Cell Institute, University of Cambridge, CB2 0AW, UK; 3. European Bioinformatics Institute (EMBL-EBI), Hinxton, CB10 1SD, UK; 4. Cancer Research UK Cambridge Institute, University of Cambridge, CB2 0RE, UK

The earliest lineage decisions in mammalian development are accompanied by global epigenetic reprogramming, and in vivo knockouts of DNA methylation enzymes result in embryonic lethality shortly after gastrulation, highlighting the essential role of this epigenetic mark. We recently developed a novel single-cell sequencing method to profile multiple molecular layers from the same cell (single-cell nucleosome, methylation and transcription sequencing, scNMT-seq). By applying scNMT-seq to cells from gastrulating mouse embryos we characterised the lineage-specific epigenomic profiles as they arise in development and found a temporal asymmetry between germ layers, with ectoderm methylation profiles being established in the early epiblast and mesoderm and endoderm profiles arising only once cells are committed to these fates. However, the functional role of the epigenome in regulating these early cell fate choices is still not clear. We now ask whether particular DNA methylation states are functionally required for certain cell types by combining multiple genetic perturbations with high-throughput scRNA-seq and scNMT-seq in a developmental time series in mouse embryos. Using transcriptional profiles, we can precisely map the distributions of cell types in each knockout and link epigenomic alterations to differentiation defects. As an example, we find that Tet-dependent de-methylation of lineage-specific regulatory regions is necessary for the formation of certain mature lineages, in particular blood precursors. In summary, this on-going work will provide a detailed study of the causal relationship between DNA methylation and cell fate in early mammalian embryogenesis.

P23

Single-cell RNAseq study of the intrahepatic immune cell landscape of primary sclerosing cholangitis

Fabiola Curion1, Kate Lynch2, Victor Yeung1, Eve Fryer3, Helen Ferry2, Moustafa Attar1, Hubert Slawinski1, Roger W. Chapman2, Rory Bowden4, Satish Keshav2, Calliope Dendrou1, Paul Klenerman2

1. Wellcome Centre for Human Genetics, University of Oxford, Oxford, United Kingdom. 2. Translational Gastroenterology Unit, University of Oxford, Oxford, Oxfordshire, United Kingdom. 3. Nuffield Division of Clinical and Laboratory Sciences, University of Oxford, Oxford, United Kingdom. 4. The Walter and Eliza Hall Institute of Medical Research, Melbourne, Australia.

Primary sclerosing cholangitis (PSC) is a chronic immune-mediated liver disease characterized by progressive cholestasis with inflammation and fibrosis of the intrahepatic and extrahepatic bile ducts. The disease is most prevalent in young to middle-aged males and prognosis is often poor. Co-morbidity with other immune-mediated diseases, particularly inflammatory bowel disease, is typical; patients have a high lifetime risk of cancer development, and the 5-year mortality rate is ~30%. Treatment options are limited: there are no approved therapeutics and the most definitive treatment is liver transplantation, which can be associated with post-transplant complications and PSC recurrence. Despite PSC being recognised as an autoimmune disease, the precise role of different infiltrating and liver-resident immune cells in its development has not been well characterised, partly due to the difficulty of obtaining liver samples from these patients. Fine needle aspiration (FNA) of the liver has emerged as a safe, minimally invasive technique for intrahepatic sampling for immunophenotyping. We have performed single-cell RNA sequencing on 35,000 CD45+ mononuclear cells obtained by FNA and on 73,000 matched peripheral blood mononuclear cells from PSC patients and patients with non-autoimmune liver pathology. We can identify >40 different specific immune cell subsets sampled by FNA, including liver-resident natural killer (NK) cells and Kupffer cells. We find that NK subsets show increased chemokine expression and that gut-derived memory B cells have increased IgG expression specifically in the PSC liver. We hypothesize a mechanism of cell-to-cell interactions that explains the complex network of pathways involved in the disease, with implications for new therapeutic avenues.

P24

Single-Cell Proteomics Defines the Cellular Heterogeneity of Localized Prostate Cancer

Laura De Vargas Roditi1, Andrea Jacobs2, Jan H. Rueschoff1, Hartland W. Jackson2, Pete Bankhead3, Stephane Chevrier2, Thomas Hermanns4, Christian D. Fankhauser4, Cedric Poyet4, Niels J. Rupp1, Bernd Bodenmiller2,*, and Peter J. Wild1,5,*

1 Department of Pathology and Molecular Pathology, University Hospital Zürich, University of Zürich, Zürich, Switzerland 2 Department of Quantitative Biomedicine, University of Zürich, Zürich, Switzerland 3 Pathology and Institute of Genetics and Molecular Medicine, , Edinburgh, United Kingdom 4 Department of Urology, University Hospital Zurich, University of Zürich, Zürich, Switzerland 5 Dr. Senckenberg Institute of Pathology, University Hospital Frankfurt, Frankfurt, Germany

Prognostic biomarkers are needed to better manage treatment of patients with localized prostate cancer. Prostate cancer is characterized by multiple genomic alterations, but heterogeneity at the proteomic level has not been evaluated. The aim of this study was to identify prospective biomarkers that can be targeted to prevent disease progression by simultaneously quantifying 36 proteins in prostate samples from 48 patients with localized prostate cancer using single-cell mass cytometry analysis of over 1 million cells. To perform this task, we developed a novel computational method, Franken, for high-dimensional clustering. We compared Franken to state-of-the-art methods for single-cell analysis on multiple datasets and demonstrated its unprecedented combination of performance, sensitivity and scalability. By applying Franken to localized hormone-naïve prostate tumor samples analyzed with mass cytometry, we identified expected subpopulations of immune, stromal, and prostate cells and characterized hitherto unknown rare prostate tumor-specific cellular phenotypes, which were confirmed through imaging. By identifying single-cell phenotypes unique to tumor tissue that specifically occur in high-grade disease, we provide insights into the coordinated progression of prostate cancer. In some cases in our cohort, we observed subpopulations typically seen in castration-resistant disease that could serve as biomarkers to predict adverse oncological outcomes and influence treatment decisions.

P25

Comparing patient trajectories using trajectory alignment methods: prospects and guidelines.

Louise Deconinck, Wouter Saelens, Robrecht Cannoodt, Yvan Saeys

Data Mining and Modeling for Biomedicine, VIB Center for Inflammation Research, Ghent, Belgium and Department of Applied Mathematics, Computer Science and Statistics, Ghent University, Ghent, Belgium

Trajectory inference methods have become a major novel class of methods to unravel cellular dynamics from single-cell data. One of the next questions to ask is how to compare different trajectories, e.g. healthy versus diseased patients, or different genotypes, as many diseases may cause a change in the developmental trajectory or in the system's dynamics. Such changes can produce several kinds of variation between patient trajectories, such as differential branching events in the trajectory structure, or premature ending of the trajectory. While research on trajectory alignment is still in its infancy, some use cases have already been described, e.g. to compare human, chimpanzee and macaque neuronal development, or to find differences in gene regulation in the presence of certain growth factors.

While a few algorithms for trajectory alignment have already been developed, no real metrics exist that benchmark the performance of a trajectory alignment technique, nor do any guidelines exist on which kinds of preprocessing steps can be useful and which configurations of the specific technique make most sense. Currently, the most common technique to align two linear trajectories is Dynamic Time Warping (DTW). DTW is a proven technique that contracts and dilates a temporal sequence to best match the other sequence. In some cases, data is preprocessed before an alignment is calculated. This is done by constructing pseudocells at fixed time points in the developmental trajectory and interpolating the gene expression at this point along the trajectory. Sometimes the gene expression data is scaled, using min-max normalization or cosine normalization.
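For readers unfamiliar with DTW, a minimal sketch of the classic dynamic-programming formulation is given below for two linear trajectories, each represented by the expression of a single gene along pseudotime. The absolute-difference cost, the function name and the toy values are assumptions for illustration; this is not the evaluation code used in the study.

```python
def dtw(a, b, dist=lambda x, y: abs(x - y)):
    """Return the DTW distance between sequences a and b and the warping path."""
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j]: minimal cumulative cost of aligning a[:i] with b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = dist(a[i - 1], b[j - 1])
            cost[i][j] = step + min(
                cost[i - 1][j - 1],  # advance both sequences
                cost[i - 1][j],      # advance a only (b is stretched)
                cost[i][j - 1],      # advance b only (a is stretched)
            )
    # Backtrack from the end to recover which points were matched.
    path = []
    i, j = n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        _, i, j = min(
            (cost[i - 1][j - 1], i - 1, j - 1),
            (cost[i - 1][j], i - 1, j),
            (cost[i][j - 1], i, j - 1),
        )
    path.reverse()
    return cost[n][m], path


# Hypothetical example: the second trajectory lingers at its first and third states.
reference = [0.0, 1.0, 2.0, 3.0]
delayed = [0.0, 0.0, 1.0, 2.0, 2.0, 3.0]
distance, path = dtw(reference, delayed)
print(distance, path)
```

For multi-gene pseudocells, `dist` could be replaced by a distance between expression vectors, for example a Euclidean or cosine-based distance.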

We evaluated the performance of DTW on 20 different synthetic datasets. We compared the influence of different preprocessing steps and parameters (such as the smoothing method and the number of pseudocells) used in those steps on the alignment of the cells. We provide several guidelines on how to best employ DTW to compare trajectories between healthy and diseased patients.
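The preprocessing options mentioned above can be sketched in a few lines. The example builds pseudocells at evenly spaced pseudotime points by linear interpolation between neighbouring cells and then applies min-max scaling. Linear interpolation is only one possible smoothing choice and is an assumption here, as are the function names and toy values.

```python
def pseudocells(pseudotime, expression, n_points):
    """Interpolate one gene's expression at n_points evenly spaced pseudotimes.

    Assumes at least two cells and n_points >= 2.
    """
    pairs = sorted(zip(pseudotime, expression))
    times = [t for t, _ in pairs]
    values = [v for _, v in pairs]
    t_min, t_max = times[0], times[-1]
    grid = [t_min + k * (t_max - t_min) / (n_points - 1) for k in range(n_points)]
    result = []
    j = 0  # index of the cell just before the current grid point
    for t in grid:
        while j < len(times) - 2 and times[j + 1] < t:
            j += 1
        t0, t1 = times[j], times[j + 1]
        weight = 0.0 if t1 == t0 else (t - t0) / (t1 - t0)
        weight = min(max(weight, 0.0), 1.0)
        result.append(values[j] + weight * (values[j + 1] - values[j]))
    return grid, result


def min_max(values):
    """Scale values to the range [0, 1]; constant input maps to zeros."""
    low, high = min(values), max(values)
    if high == low:
        return [0.0 for _ in values]
    return [(v - low) / (high - low) for v in values]


# Hypothetical cells with unordered pseudotime and expression of one gene.
grid, smoothed = pseudocells([0.0, 1.0, 0.5], [0.0, 10.0, 5.0], n_points=5)
scaled = min_max(smoothed)
print(grid, smoothed, scaled)
```

The number of pseudocells (`n_points`) sets the resolution at which two trajectories are subsequently compared.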

P26

Single-cell atlas of the first intra-mammalian developmental stage of the human parasite Schistosoma mansoni

Carmen Lidia Diaz Soria, Jayhun Lee, Tracy Chong, Avril Coghlan, Alan Tracey, Matthew D Young, Tallulah Andrews, Christopher Hall, Bee Ling Ng, Kate Rawlinson, Stephen R. Doyle, Steven Leonard, Zhigang Lu, Hayley M Bennett, Gabriel Rinaldi, Phillip A. Newmark, Matthew Berriman.

-Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridgeshire, UK -Regenerative Biology, Morgridge Institute for Research, Madison, WI, USA -Howard Hughes Medical Institute, Department of Integrative Biology, University of Wisconsin-Madison, Madison, WI, USA

Over 250 million people suffer from schistosomiasis, a tropical disease caused by parasitic flatworms known as schistosomes. Humans become infected by free-swimming, water-borne larvae, which penetrate the skin. The earliest intra-mammalian stage, called the schistosomulum, undergoes a series of developmental transitions. These changes are critical for the parasite to adapt to its new environment as it navigates through host tissues to reach its niche, where it will grow to reproductive maturity. Unravelling the mechanisms that drive intra-mammalian development requires knowledge of the spatial organisation and transcriptional dynamics of the different cell types that comprise the schistosomulum body. To fill these important knowledge gaps, we performed single-cell RNA sequencing on two-day-old schistosomula of Schistosoma mansoni. We identified likely gene expression profiles for muscle, nervous system, tegument, parenchymal/primordial gut cells, and stem cells. In addition, we validated cell markers for all these clusters by in situ hybridisation in schistosomula and adult parasites. Taken together, this study provides a comprehensive cell-type atlas for the early intra-mammalian stage of this devastating metazoan parasite.

P27

The sfaira repository accelerates representation learning and model reuse in single cell genomics

Leander Dony1,2,3,*, David Fischer1,2,*; Hananeh Aliee1; Olle Holmberg1,2; Sophie Tritschler1,2; Fabian Theis1,2,4,+

1Institute of Computational Biology, Helmholtz Zentrum München, 85764 Neuherberg, Germany 2TUM School of Life Sciences Weihenstephan, Technical University of Munich, 85354 Freising, Germany 3Department of Translational Psychiatry, Max Planck Institute of Psychiatry, 80804 München, Germany 4Department of Mathematics, Technical University of Munich, 85748 Garching bei München, Germany + Corresponding author * Equal Contribution

Exploratory analysis of single-cell RNA-seq data sets is currently based on embeddings tailored to each data set and on cluster-based manual cell type annotation. Both clustering and embedding require feature engineering, which is a bottleneck in many single-cell RNA-seq projects. Here, we provide a zoo of pre-trained models that can be directly used or easily adapted to a new data set to produce an embedding and cell type labels without requiring feature engineering. Moreover, the abstract representation of the data learned by the neural networks can integrate data from different sources and also allows interpretation of sparse data with low signal that are otherwise hard to interpret. We propose a model ontology motivated by the anatomy of the organisms sampled, which directly complements the efforts to assemble an atlas of cell types for a given organism.

P28

Giotto, a toolbox for integrative analysis and visualization of single-cell spatial data

Ruben Dries 1, Qian Zhu 1, Rui Dong 1, Chee-Huat Linus Eng 2, Kan Liu 1, Yuntian Fu 1, Tianxiao Zhao 1, Arpan Sarkar 1, Feng Bao 1, Rani George 1, Nico Pierson 2, Long Cai 2, Guo-Cheng Yuan 1

1. Department of Pediatric Hematology/Oncology, Dana-Farber Cancer Institute and Boston Children’s Hospital, Boston, MA, USA. 2. Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, CA, USA.

While single-cell RNAseq is powerful for identifying cell types and states, a major limitation is the loss of spatial information due to tissue dissociation and cell isolation. Spatial transcriptomics (seqFISH+, merFISH, osmFISH, 10X Visium, Slide-seq, ...) and other protein-based multiplexing methods (CODEX, MIBI, t-cyCIF, ...) are a group of technologies that profile expression information in situ at the resolution of either single cells or a small number of cells, thereby filling an important gap by allowing the interrogation of single cells within their native microenvironment. However, effective use of such technologies requires the development of innovative computational algorithms and easy-to-use tools. Here we present Giotto, a comprehensive and flexible open-source toolbox for spatial single-cell data analysis in R and swift visualization within your browser. Giotto provides a common framework that is applicable to virtually all currently available technologies, regardless of the profiled molecular feature (RNA or protein), resolution, or data size. Giotto requires only a count matrix and the corresponding cell centroid coordinates to perform a wide array of single-cell and spatial analyses, such as cell type identification, detection of spatially coherent genes or co-expression patterns, cell neighborhood analysis, and exploration of cell-cell interaction effects and context-specific ligand-receptor binding. Giotto can facilitate integration of spatial transcriptomic and single-cell RNAseq information to identify the spatial distribution of cell types. In addition, Giotto provides a visualization tool that can be used to simultaneously visualize multiple modalities of information and interactively explore the relationship between different cellular features. Altogether, Giotto provides a powerful computational toolbox for spatial transcriptomic and proteomic data analysis and visualization.
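As an illustration of how cell centroid coordinates alone can seed a cell neighborhood analysis, the sketch below builds a k-nearest-neighbour network from 2D centroids. It is written in Python for illustration only and does not use or reproduce the Giotto R interface; the function name, the choice of k and the coordinates are assumptions.

```python
from math import dist


def knn_network(centroids, k=2):
    """Map each cell to its k nearest cells by Euclidean centroid distance."""
    network = {}
    for cell, position in centroids.items():
        ranked = sorted(
            (dist(position, other_position), other)
            for other, other_position in centroids.items()
            if other != cell
        )
        network[cell] = [name for _, name in ranked[:k]]
    return network


# Hypothetical centroids (arbitrary units) for four cells.
centroids = {
    "a": (0.0, 0.0),
    "b": (1.0, 0.0),
    "c": (0.0, 1.5),
    "d": (10.0, 10.0),
}
network = knn_network(centroids, k=2)
print(network)
```

Combined with cell type labels, such a network allows counting which cell types neighbour each other more often than expected, which is one way to approach cell-cell interaction questions. This brute-force version scales quadratically with the number of cells, so large datasets would need a spatial index.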

P29

Employing automated cell imaging on the CellRaft AIR System to sort and isolate single cells for cloning and rare cell workflows

Jacquelyn DuVall, Steven Gebhart, Jessica Hartman

Cell Microsystems, Inc

Contemporary single cell biology experiments typically rely on isolating single cells by flow cytometry, encapsulation in a droplet, or micro-manipulation. These approaches limit the ability to maintain viability, actively select individual cells of interest, and discern more meaningful information from each cell. The CellRaft AIR system addresses these limitations and enables streamlined workflows in CRISPR gene editing, stem cell biology, 3D cell culture, and single cell multi-modal analysis. The CellRaft AIR™ System is a bench-top instrument capable of imaging, sorting, and isolating single cells for downstream molecular analysis. The system is based on the CellRaft technology and employs CytoSort Arrays - a proprietary cell culture dish designed with an array of embedded, releasable microwells - to culture and phenotype cells prior to isolation. These consumable arrays closely replicate standard in vitro conditions, rendering the array an ideal substrate for time-course phenotyping, drug sensitivity assays, and other imaging-based evaluation prior to isolation of cells for single cell molecular analysis. The system has been shown to be effective in isolating single cells from cell lines, primary samples, non-adherent cells, and stem cells such as iPSCs and human embryonic stem cells. For 3D cell culture applications, growth of organoids in Matrigel and subsequent isolation after weeks of culture has been demonstrated on the system. The software and imaging components discern unique and subtle phenotypes - which can then be leveraged as sorting criteria. Imaging also provides validation for single cell isolation (i.e. elimination of empties and multiplets) and confirmation of other phenotypes which can be linked to downstream sequencing data. Genomic and transcriptomic workflows have been optimized and published using the system to analyze single cells. 
Further multi-modal and functional genomics assays are enabled by depositing a live cell in a collection tube and subjecting it to any number of downstream analysis methods. Compared to flow sorting of cells post-CRISPR, the AIR system workflow has demonstrated at least a 2.5X improvement in viability, a 90% reduction in time, and a 50% reduction in reagent consumption. These advances establish a comprehensive and automated workflow to eliminate key limitations associated with single cell isolation, while empowering the user with invaluable imaging data. Further development is underway to provide even higher resolution imaging for deeper sub-cellular understanding.

P30

Convergent cell fate specification in the developing vertebrate skeleton

Christian Feregrino, Chloé Moreau, Patrick Tschopp

Zoology - DUW. University of Basel

During development, the process of cell type specification increasingly subdivides progenitor populations into diverging lineages of different cell types. For most cell types, cell fate specification can be linearly traced back to their earliest progenitors in a developmental kinship lineage model. However, certain cell types can originate in a convergent fashion, that is, different progenitor lineages give rise to similar cellular phenotypes. Depending on anatomical location, cells of the vertebrate skeleton (e.g. chondrocytes) arise from three distinct developmental lineages: the somitic sclerotome, the lateral plate mesoderm and the cranial neural crest. These lineages form the skeleton in the trunk, the limbs and the head, respectively. In order to study the gene expression logic underlying this process, we conducted bulk transcriptional analyses of FACS-isolated chondrocytes from the three different populations. We found that, despite the differences in their embryonic origins, they share - already early on - a highly similar expression of the core chondrocyte transcriptional program. To follow the temporal dynamics of this apparent transcriptomic convergence, we used the chicken embryo as a model and 10x Genomics to generate over 26,000 single-cell transcriptomes from the cervical, nasal and limb areas at three different stages of chondrogenesis. Amongst several cell populations identified, we find putative early chondrocytes and the undifferentiated cells that give rise to them at all three anatomical sampling sites. Moreover, performing co-expression analysis on the different samples, we found shared sets of genes across some of the co-expression modules, which further supports our hypothesis of a similar fate.
Using pseudotime analyses, we aim to recapitulate the differentiation process in silico, and to define the convergence points of the different lineages and the underlying changes in transcription factor expression profiles happening around such points. We plan to complement these data sets with corresponding ATAC-seq chromatin accessibility maps, in an effort to understand the regulatory dynamics that drive transcriptomic changes towards a functional convergence of vertebrate skeletal cell types.

P31

A flexible microfluidic system for single-cell transcriptome profiling elucidates phased transcriptional regulators of cell cycle

Heike Fiegler, Karen Davey, Christopher R. Sibley

Imperial College London

Single cell transcriptome profiling has emerged as a breakthrough technology for the high-resolution understanding of complex cellular systems. Here we report a flexible, cost-effective and user-friendly droplet-based microfluidics system, called the Nadia Instrument, that allows 3' mRNA capture of ~50,000 single cells or individual nuclei in a single run. The precise pressure-based system demonstrates highly reproducible droplet size, low doublet rates and high mRNA capture efficiencies that compare favourably with others in the field. Moreover, when combined with the Nadia Innovate, the system can be transformed into an adaptable setup that enables use of different buffers and barcoded bead configurations to facilitate diverse applications. Using the Nadia Innovate and Nadia Instrument, 3' mRNA profiling of asynchronous human and mouse cells at different phases of the cell cycle was performed. Firstly, the experiment demonstrated the system's ability to readily produce data from which distinct cell populations could be distinguished. Secondly, the use of a data analysis approach based on the expression of transcription factor target networks, rather than direct gene expression profiles, helped infer transcriptional regulators active in various cell cycle phases. Notably, we identified multiple transcription factors that had little or no known link to the cell cycle (e.g. DRAP1, ZKSCAN1 and CEBPZ). In summary, our work showed that the Nadia platform represents an exciting and flexible technology for future single-cell transcriptomic studies. It also exemplifies how single-cell transcriptome profiling can be used to further the mechanistic understanding of basic cellular biology.

P32

Making heads or tails of mouse embryos

Jonathan Fiorentino 1,2,3, Shifaan Thowfeequ 4, Shankar Srinivas 4, Antonio Scialdone 1,2,3

1 Institute of Epigenetics and Stem Cells, Helmholtz Zentrum München, Germany, 2 Institute of Functional Epigenetics, Helmholtz Zentrum München, Germany, 3 Institute of Computational Biology, Helmholtz Zentrum München, Germany, 4 Department of Physiology, Anatomy and Genetics, University of Oxford, Oxford, UK

The establishment of an anterior-posterior axis in mouse embryos is crucial to start organogenesis; this begins with the migration of a subpopulation of Visceral Endoderm (VE) cells called the AVE. Although several early molecular asymmetries that could initiate this symmetry breaking event have been found in recent years, a transcriptome-wide analysis is still missing. We investigate this issue by combining single-cell transcriptomics data and mathematical modelling. We analyse a novel Smart-seq2 single-cell RNA-seq dataset of mouse embryos, around the stage where the AVE migrates. Employing diffusion maps and ordering VE cells in diffusion pseudotime, we identify novel putative markers of anterior and posterior asymmetries. We then integrate our results with publicly available 10X single-cell RNA-seq data, in order to identify cell subpopulations in the VE. Finally, we validate the spatial domains of expression for some of the candidate genes through hybridisation chain reaction (HCR), which also allows us to feed back the spatial information in the transcriptomics dataset. Furthermore, we study the active cell signalling pathways within the AVE and between the AVE and the other cell types, identifying putative regulators of the migratory process. We propose a 2-dimensional mathematical model of multicellular gradient sensing that links the biochemical information on cell-cell communication to the geometrical arrangement of AVE cells. Taken together, these results shed light on the spatial asymmetries in gene expression in the VE and their role in anterior-posterior axis specification.

P33

Inferring cell-cell communication with graph-based neural networks from spatial single-cell data

David S. Fischer 1,2, Sabrina Richter 1, Anna Schaar 1, Bernd Bodenmiller 3, Fabian J. Theis 1,2

1 Institute of Computational Biology, Helmholtz Zentrum München, 85764 Neuherberg, Germany 2 TUM School of Life Sciences Weihenstephan, Technical University of Munich, 85354 Freising, Germany 3 Department of Quantitative Biomedicine, University of Zurich, Switzerland

Spatial graphs can be assembled from single-cell molecular profiling data on tissue slices such that nodes are cells represented by their molecular state and edges represent spatial proximity between cells. These graphs characterise heterogeneity in the population of cells but also capture the spatial organisation of the tissue. Accordingly, these graphs encode information related to cell-cell communication: the dependence of the cellular state of a target cell on other cells in close proximity. We used neural networks on these graphs to infer complex cell-cell communication in a cancer data set. We leverage these algorithms to gain insights that can neither be easily obtained from neighbour-frequency statistics nor from non-spatial data.
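The graph construction described above can be illustrated with a minimal sketch. It is not the authors' implementation: the distance cutoff `radius`, the toy coordinates and the function names are assumptions made for illustration, and the neighbourhood average stands in for a single, untrained message-passing step rather than a full graph neural network.

```python
import math


def build_spatial_graph(coords, radius):
    """Link every pair of cells whose centres lie within `radius` of each other."""
    neighbours = {i: [] for i in range(len(coords))}
    for i in range(len(coords)):
        for j in range(i + 1, len(coords)):
            if math.dist(coords[i], coords[j]) <= radius:
                neighbours[i].append(j)
                neighbours[j].append(i)
    return neighbours


def neighbourhood_mean(features, neighbours):
    """Average the molecular state of each cell's spatial neighbours."""
    context = []
    for i, feats in enumerate(features):
        nbrs = neighbours[i]
        if not nbrs:
            # Isolated cell: no neighbourhood signal to aggregate.
            context.append([0.0] * len(feats))
            continue
        context.append(
            [sum(features[j][k] for j in nbrs) / len(nbrs) for k in range(len(feats))]
        )
    return context


# Hypothetical toy data: four cells with 2D positions and two markers each.
coords = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (10.0, 10.0)]
features = [[1.0, 0.0], [0.0, 1.0], [0.0, 3.0], [5.0, 5.0]]
graph = build_spatial_graph(coords, radius=1.5)
context = neighbourhood_mean(features, graph)
```

A model of cell-cell communication would then predict the state of each target cell from its own features together with such a neighbourhood summary.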

P34

Early fate decisions and differences in cell-cycle speed shape CD8 T-cell memory

Michael Flossdorf, Jonas Mir (1), Atefeh Kazeroonian (1), Albulena Toska (1), Veit Buchholz (1), Lorenz Kretschmer (1), Marten Plambeck (1), Thomas Höfer (2,3), Dirk Busch (1)

1 Technical University of Munich (TUM), Munich, Germany 2 German Cancer Research Center, Heidelberg, Germany 3 BioQuant Center, University of Heidelberg, Heidelberg, Germany

Adaptive immune responses to infection or cancer rely on coordinated programs of cell proliferation and differentiation. Upon stimulation, naive, antigen-specific T cells expand vigorously and give rise to short-lived effector and long-lived memory cells. The dynamics of this process have been difficult to resolve using population-based analyses. In particular, the timing of fate divergence and its interplay with cell division speed has remained elusive. Here we combine single-cell RNA sequencing with single-cell fate mapping data interrogated by stochastic population modeling to show that cells progressively differentiate from memory precursor cells to terminal effectors with significant differences in cell cycle speed already early during infection. To test our model predictions, we developed a flow cytometry-based method that enables us to efficiently quantify cell proliferation in vivo. Indeed, we find differences in cell cycle length between the subsets, with central memory precursors dividing approximately every eight and effector cells every five hours. We further find that memory precursors selectively elongated their cell cycle as soon as antigenic stimuli were removed. To further resolve early T cell fate decisions, we analyzed continuous live-cell imaging data under various stimulus conditions in vitro using Hidden Markov Models. Also here we find early fate decisions that result in inheritable differences in cell cycle speed close to what we observe in vivo. Taken together, our mathematical models begin to provide a quantitative picture of the developmental program of T cells during an immune response. Improvements in the quantitative understanding of this process will have implications for immunotherapy and the design of effective vaccines.
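To give a sense of scale for the reported cell cycle lengths, the following sketch computes the fold expansion of a population that divides at a constant rate. It assumes purely exponential growth without cell death or differentiation, which is a strong simplification and not the stochastic population model used in the study; the 24-hour window and the function name are chosen here for illustration only.

```python
def fold_expansion(hours, cycle_length_hours):
    """Fold increase in cell number after `hours` of uninterrupted division."""
    return 2 ** (hours / cycle_length_hours)


# Cycle lengths taken from the abstract: about 8 h for central memory
# precursors and about 5 h for effector cells.
memory_precursor_fold = fold_expansion(24, 8)
effector_fold = fold_expansion(24, 5)
```

Under these assumptions, a three-hour difference in cycle length already yields a more than threefold difference in population size within a single day.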

P35

Using single-nuclei sequencing in human postmortem brain to identify mechanisms of early life adversity-associated risk for psychiatric disorders

Anna S. Fröhlich 1,2, Natalie Matosin 3, Miriam Gagliardi 1, Leander Dony 1,2,4, Simone Röh 1, Michael J. Ziller 1, Elisabeth B. Binder 1,5

1 Department of Translational Research in Psychiatry, Max Planck Institute of Psychiatry, Munich, Germany 2 International Max Planck Research School for Translational Psychiatry, Munich, Germany 3 School of Medicine, Faculty of Science, Medicine and Health, University of Wollongong, Wollongong, NSW, Australia 4 Institute of Computational Biology, Helmholtz Zentrum München, Neuherberg, Germany 5 Department of Psychiatry and Behavioral Sciences, Emory University School of Medicine, USA

Early life adversity such as physical or sexual abuse in childhood increases the risk for developing psychiatric disorders such as major depressive disorder, schizophrenia and bipolar disorder in adulthood, as well as increasing the risk for suicidal ideation and suicide. Moreover, having experienced early adversity is associated with a more severe course of illness and poorer treatment response. In fact, early life adversity is one of the strongest predictors for psychopathology. However, the molecular underpinnings of the mediating mechanisms are still poorly understood. The study cohort consists of three matched groups: one group of individuals with psychopathology, one group of individuals with psychopathology and a history of early life adversity, and one psychiatrically healthy control group (n=8/group). We are performing single-nuclei RNA and ATAC-sequencing in adult postmortem brain samples of the prefrontal cortex, hippocampus and amygdala, three brain regions involved in stress response and highly implicated in psychiatric disorders. Moreover, individuals are being genotyped to evaluate the underlying genetic contribution to observed transcriptional and epigenetic changes. Preliminary data showed clusters of all main brain cell types such as excitatory and inhibitory neurons, oligodendrocytes and precursors, astrocytes, microglia, and endothelial cells in all three brain regions. Neuronal vs. non-neuronal cell types were present in a ratio of 1.5:1, 1:5.3, and 1:2.5 in the prefrontal cortex, hippocampus and amygdala, respectively. Differences in the contribution to some cell populations between individuals were observed, especially in samples of the hippocampus and amygdala. Our findings suggest that there are differences between brain regions in terms of cell-type distribution as well as a certain degree of inter-individual variability.
Future analyses will focus on whether a history of early life adversity and/or of psychopathology has an impact on the cell-type distribution suggesting effects on cell fate or survival. Additionally, cell-type specific gene expression and chromatin accessibility will be explored to examine possibly dysregulated brain function within and across brain regions.
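The neuronal to non-neuronal ratios reported above are easier to compare across regions when expressed as the fraction of neuronal nuclei. The short sketch below performs that conversion; the function and variable names are illustrative, and only the three ratios come from the abstract.

```python
def neuronal_fraction(neuronal, non_neuronal):
    """Convert a neuronal:non-neuronal ratio into the neuronal share of all nuclei."""
    return neuronal / (neuronal + non_neuronal)


# Ratios as reported: 1.5:1, 1:5.3 and 1:2.5.
ratios = {
    "prefrontal cortex": (1.5, 1.0),
    "hippocampus": (1.0, 5.3),
    "amygdala": (1.0, 2.5),
}
fractions = {region: neuronal_fraction(*ratio) for region, ratio in ratios.items()}
```

This corresponds to roughly 60% neuronal nuclei in the prefrontal cortex, 16% in the hippocampus and 29% in the amygdala.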

P36

Anatomical and Transcriptional Characterization of Breast Tumor Heterogeneity Using Spatial RNA Sequencing

Solongo Ziraldo, Cedric Uytingco, Stephen Williams, Alvaro J. Gonzalez, Jennifer Chew, Meghan Frey, YiFeng Yin, Francesca Meschi, Andrej Hartnett, Eswar Iyer, Stefania Giacomello, Aleksandra Jurek, James Chell, Erik Borgstrom, Nigel Delany, Neil Weisenfeld, Zachary Bent

10x Genomics Inc.

The tumor microenvironment is composed of highly heterogeneous cellular components that dynamically interact and communicate with each other. Significant advancements in single-cell RNA sequencing allow capture of thousands of cells and have revealed many cellular subpopulations in tumor tissues. However, tissue dissociation into single cells results in the loss of its important anatomical information. The recently introduced spatial transcriptomics technology resolves spatial localization of cells within a tissue section. Here we present an improved version of this spatial gene expression technology with increased tissue coverage, higher spatial resolution, and improved sensitivity. We applied our enhanced technology to tumor tissue sections from human breast tumors and analyzed tissue-wide transcriptomic profiles to locate cancer-related genes within spatial context, reveal intra-tumor heterogeneity within a tissue section, and shed light on the differences between breast cancer types in terms of cell type populations and, importantly, their spatial arrangement in the tumor microenvironment. Elucidation of the spatial heterogeneity of tumor cells can inform on disease state and progression, aiding treatment decisions.

P37

ieCS: interactive explorer of single cell cluster similarity

Ling Hai, Matthias Schlesner

Bioinformatics and Omics Data Analytics, German Cancer Research Center (DKFZ), Heidelberg, Germany

Single-cell transcriptomics has great potential to characterize disease-related cell subpopulations. However, it remains a challenge to compare cell subpopulations across multiple individuals and different conditions. We created ieCS, an R package with an interactive graphical user interface that helps users explore the similarity of cell clusters among heterogeneous datasets. We introduce a similarity quantification method based on cluster-specific marker genes. Based on the obtained similarity scores, ieCS then provides three methods to identify superclusters of similar cell clusters within a sample and also across different samples: hierarchical clustering of cell clusters, Louvain community detection on a similarity network of cell clusters, and a greedy method to aggregate cell clusters into a tree structure. ieCS allows users to interactively explore superclusters and to visualize dynamic supercluster composition. In addition, ieCS accepts custom cell type markers as reference to annotate cell clusters. In the case studies, we demonstrate that ieCS can quickly and accurately identify superclusters which match the annotation of cell types across individuals, conditions and scRNA-seq technologies, thereby showing the robustness of our method.
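The idea of scoring cluster similarity from marker genes and then grouping similar clusters can be sketched as follows. This is not the ieCS method or its API: the abstract does not specify the similarity measure, so the Jaccard index is assumed here, and a simple threshold with connected components replaces the three grouping methods named above. Cluster names, gene lists and the threshold are hypothetical.

```python
def jaccard(markers_a, markers_b):
    """Jaccard index between two marker gene lists."""
    a, b = set(markers_a), set(markers_b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def superclusters(markers, threshold):
    """Group clusters whose marker similarity reaches `threshold` (connected components)."""
    names = sorted(markers)
    parent = {name: name for name in names}

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for i, a in enumerate(names):
        for b in names[i + 1:]:
            if jaccard(markers[a], markers[b]) >= threshold:
                parent[find(a)] = find(b)

    groups = {}
    for name in names:
        groups.setdefault(find(name), []).append(name)
    return sorted(groups.values())


# Hypothetical clusters from two samples (s1, s2) with illustrative markers.
markers = {
    "s1_c0": ["CD3E", "CD3D", "IL7R"],
    "s1_c1": ["MS4A1", "CD79A"],
    "s2_c0": ["MS4A1", "CD79A", "CD79B"],
    "s2_c1": ["CD3E", "CD3D", "CD2"],
}
groups = superclusters(markers, threshold=0.4)
```

In this toy example the T cell-like clusters from the two samples fall into one group and the B cell-like clusters into another.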

P38

Accounting for repetitive elements within intronic regions for whole-gene transcriptome quantification

Mr Louis-François Handfield, Martin Hemberg

Sanger Institute

Single-cell transcriptomics allows the discrimination of cell types in silico, which betters our understanding of biological processes occurring within a population of interacting cells. The inclusion of reads of intronic origin in the transcriptome quantification allows the detection of a higher number of genes and effectively increases the sequencing depth from single cells. This can improve the cell-type assessment within complex systems such as adult brains or iPSC-derived cultures; however, we noted that a fraction of the introduced intronic reads are unrelated to their associated gene, since they can be explained by extrinsic factors instead. Many intronic regions contain repetitive DNA motifs; the incorporation of intronic reads by the pipeline proposed by 10X Genomics produces new counts that originate from reads containing poly-A and other repetitive RNA fragments. While this contamination typically does not hinder the cell-type assessment by downstream analysis, it is a bias that makes specific genes more prone to appear differentially expressed owing to sample-specific effects, and may lead a study to delve into further experimental validations while unaware of a possible association with degraded RNA. We propose an alteration to the pipeline that aims to exclude reads that contain repetitive or "low-complexity" fragments and instead quantify that expression for all repeat mask families. This allowed us to quantify the DNA transposon activity in a colony of iPSC-derived neurons, and to note a clear bifurcation in single-cell data that is unrelated to cell type, which might discriminate whole cells from cell fragments.
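One way to flag reads with repetitive or low-complexity content is sketched below. The abstract does not describe how the proposed pipeline detects such reads, so the two criteria used here (Shannon entropy of the base composition and the longest homopolymer run) and their cutoffs are assumptions for illustration only.

```python
import math
from collections import Counter


def base_entropy(seq):
    """Shannon entropy (bits) of the nucleotide composition of a read."""
    counts = Counter(seq)
    n = len(seq)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def longest_homopolymer(seq):
    """Length of the longest run of a single repeated base."""
    best = run = 0
    prev = None
    for base in seq:
        run = run + 1 if base == prev else 1
        prev = base
        best = max(best, run)
    return best


def is_low_complexity(seq, min_entropy=1.5, max_run=10):
    """Flag reads with a skewed base composition or a long homopolymer stretch.

    Both thresholds are hypothetical and would need tuning on real data.
    """
    if not seq:
        return True
    return base_entropy(seq) < min_entropy or longest_homopolymer(seq) > max_run


reads = ["A" * 30, "ACGT" * 8, "ACGT" * 3 + "A" * 12]
kept = [read for read in reads if not is_low_complexity(read)]
```

Reads flagged in this way could be counted separately, for example per repeat family, rather than being assigned to the gene whose intron they overlap.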

P39

Single-cell analysis of neuro-immune crosstalk in an in vitro model of monogenic Alzheimer’s disease

Dr Moritz Haneklaus (1,2,3), Louis-François Handfield (2,3), Phil Brownjohn (1), Martin Hemberg (2,3), Rick Livesey (1,3)

1 UCL Great Ormond Street Institute of Child Health, University College London, UK 2 Wellcome Sanger Institute, Hinxton, UK 3 Open Targets, Wellcome Genome Campus, Hinxton, UK

Alzheimer's disease (AD) is the most common form of dementia and an increasing public health problem in ageing populations. Apart from neuronal dysfunction and degeneration, recent genetic association studies have highlighted a role for microglia in the pathogenesis of AD. To study neuro-immune interactions in a human system, we have established and characterized a 3D co-culture system of induced pluripotent stem cell (iPSC)-derived cortical neuronal organoids and iPSC-derived microglia. To model AD using this system, we engineered isogenic stem cell lines using CRISPR-Cas9 to generate allelic series of mutations in either PSEN1 (M146I and Intron4) or APP (V717I), which cause familial forms of early onset AD. 3D neuron-microglia models using each mutation were generated, with neurons, microglia or both cell types carrying each individual mutation. We then used single-cell RNA sequencing to profile over 90,000 cells from replicate cultures. We are now using this dataset to identify cell type-specific gene expression changes, such as microglial activation states and adaptive responses in neurons. To further understand the interplay of immune signalling and AD mutations, we are also exogenously activating microglia in 3D co-cultures. Finally, we are comparing our findings to a single-nucleus sequencing dataset of post-mortem cortex of individuals with monogenic AD mutations and matched controls. This comparison enables us to determine how the neuron-microglia interactions that can be modelled in vitro relate to the phenotypes of the same cell types in end-stage disease. Ultimately, our study will provide a better understanding of cellular interactions in AD and how they can contribute to disease initiation and progression.

P40

Single-cell transcriptional landscape of human embryonic limb development

Bao Zhang, Shuaiyu Wang, Peng He, Eirini S. Fasouli, Yixi Fu, Kenny Roberts, Hao Yang, Xiaoling He, Krzysztof Polański, Yongjiang Zheng, Xi Chen, Jongeun Park, Sid Lawrence, David R. FitzPatrick, Helen Firth, Hui Zhang, Sam Behjati, Roger A. Barker, Kerstin B. Meyer, Hongbo Zhang, Sarah A. Teichmann

Key Laboratory for Stem Cells and Tissue Engineering, Ministry of Education and Department of Histology and Embryology of Zhongshan School of Medicine, Sun Yat-Sen University, Guangzhou 510080, China; Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SA, UK; WT-MRC Cambridge Stem Cell Institute and Department of Clinical Neurosciences, University of Cambridge, Cambridge CB2 0QQ, UK; Department of Hematology and Institute of Hematology, Third Affiliated Hospital of Sun Yat-Sen University, Guangzhou 510630, China; Academic Clinical Fellow, Division of Trauma & Orthopaedic Surgery, Addenbrookes Hospital, Cambridge CB2 0QQ, UK; MRC Human Genetics Unit, MRC IGMM, University of Edinburgh, WGH, Edinburgh, EH4 2SP, UK; Institute of Human Virology, Zhongshan School of Medicine, Sun Yat-Sen University, Guangzhou 510080, China; Cavendish Laboratory, Department of Physics, University of Cambridge, JJ Thomson Ave, Cambridge CB3 0EH, UK

How the limb bud gives rise to the limb is a classic paradigm of developmental biology, which is well studied in model organisms but not in humans. Here, we analyze the transcriptomes of 27,426 cells from 5 embryonic limb samples between 5 and 8 weeks post-conception to trace the developmental trajectories of the major mesodermal lineages. With a new visualization tool to project transcription factors onto cell states, we systematically identify regulatory factors guiding cell fate decisions across development. For skeletal muscle, we show that PAX3+ cells differentiate towards embryonic myocytes, and also give rise to the PAX7+ stem cell reservoir. In the osteoblast lineage, we reveal unexpected development out of the perichondrium. Based on a global cell-cell signaling map, we identified an interaction between endothelia and muscle progenitors, which directs myocyte differentiation. Finally, we highlight the importance of the cell signaling circuitry to mechanisms underlying human developmental diseases.

P41

Combination of scRNA-seq strategies to untangle complex cell populations

Emilio Yángüez, Ralph Schlapbach, Klaus Hentrich

Functional Genomics Center Zurich (ETH/UZH), SPT Labtech

Single cell sequencing methods let us define the gene expression profile of thousands of cells, allowing in-depth characterization of heterogeneous cell populations. However, the integration of these fast-evolving technologies and the increasing complexity of biological samples pose a challenge to research core facilities. In the Functional Genomics Center Zurich (FGCZ), we implemented both plate-based and droplet-based methods, and use them either as standalone or in combination, to provide our users with access to the latest scRNA-seq technologies. For samples with a high number of cells, 10X Genomics is primarily used. This approach allows the identification of 1,000-4,000 different transcripts per cell and, due to the high throughput, this is the method of choice for the initial characterization of complex cell populations such as tumor samples or whole organs. Based on the initial results or if higher resolution is needed, an automated low-volume version of the Smart-seq2 protocol, using a TTP Labtech mosquito HV robot, is used to provide in-depth characterization of selected cell populations. In this approach, single cells are sorted into 384-well plates and individually analyzed to get full transcriptomic coverage. 2,000-7,000 different transcripts per cell are routinely identified with this method, which is the ideal solution for in-depth transcriptomic characterization of relatively homogeneous populations or for the analysis of single cell transcriptome changes during development or in response to different treatments. The results from both approaches can be combined in our analysis pipeline, allowing us to explore highly heterogeneous samples with low proportions of relevant cells.

P42

Single Cell Genomic Sequencing brings unprecedented insights to aneuploidy mosaicism in clonal populations of Leishmania donovani.

Gabriel Heringer Negreira 1, Hideo Imamura 1, Pieter Monsieurs 1, Yvon Sterkers 2, Jean-Claude Dujardin 1, Malgorzata A. Domagalska 1

1 Institute of Tropical Medicine Antwerp, BE 2 Université de Montpellier, FR

Maintenance of stable ploidy over continuous mitotic events is a paradigm for most higher eukaryotes. Defects in segregation and/or under- and over-replication can lead to aneuploidy, a deleterious condition for most organisms. Surprisingly, in Leishmania, a protozoan parasite, aneuploidy is a ubiquitous feature, where variations of chromosome copy number (CCN) in the cell population represent a mechanism of gene expression adaptation, possibly impacting phenotypes. Moreover, even in clonal populations, individual CCN varies significantly between single cells, a phenomenon named mosaic aneuploidy (MA). At the population level, this can act as an important adaptation potential in early response to environmental stresses, such as host immune system activity and drug pressure. Until recently, the only technique available to study MA in Leishmania was FISH which, despite its single-cell resolution, only allows somy assessment of a few chromosomes and does not provide information about the complete karyotype of single cells. To overcome these limitations, in this work we used for the first time high-throughput 10X© Single-Cell Genomic Sequencing (SCGS) to estimate individual ploidy of 1590 promastigote cells in a clonal population of Leishmania donovani. We identified 134 different karyotypes, with the most common one occurring in 438 cells. The 14 most frequent karyotypes, representing 76.1% of the population, diverge from each other by gain or loss of somy in one or two chromosomes. Interestingly, aneuploidy patterns that were previously described by Bulk Genome Sequencing as emerging during early phases of drug resistance selection are already present in single karyotypes in the SCGS data, suggesting that somy changes observed in Leishmania populations in early stages of adaptation to environmental stresses are led by positive selection of pre-existing karyotypes.
Additionally, although all chromosomes displayed different levels of intercellular somy variation, high degrees of cell-to-cell variability were restricted to a specific group of chromosomes. For instance, Chromosome 35 was di-, tri- and tetrasomic in 26.2%, 71.8%, and 2% of cells, respectively. Conversely, Chromosome 18 had the most uniform somy distribution, with 99.5% of disomic cells. The SCGS also revealed a small fraction of cells where one or more chromosomes were absent. To validate our data, we performed FISH analysis on 3 chromosomes, where similar somy profiles were observed. Together, these results demonstrate the power of SCGS to resolve complex CCN variations amid individual Leishmania cells and illustrate for the first time the whole picture of the in vitro mosaic aneuploidy scenario for these parasites.
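Once a somy value has been called for every chromosome in every cell, tallying karyotypes reduces to counting identical somy vectors. The sketch below shows this counting step and a simple distance between karyotypes; it assumes the per-cell somy calls are already available and uses a hypothetical three-chromosome toy dataset rather than the study's data.

```python
from collections import Counter


def karyotype_frequencies(somy_per_cell):
    """Count identical somy vectors and return (karyotype, cells, fraction), most common first."""
    counts = Counter(tuple(cell) for cell in somy_per_cell)
    total = len(somy_per_cell)
    return [(karyotype, n, n / total) for karyotype, n in counts.most_common()]


def somy_distance(karyotype_a, karyotype_b):
    """Number of chromosomes whose somy differs between two karyotypes."""
    return sum(1 for a, b in zip(karyotype_a, karyotype_b) if a != b)


# Hypothetical toy population: ten cells, three chromosomes each.
cells = [(2, 2, 3)] * 5 + [(2, 3, 3)] * 3 + [(2, 2, 4)] * 2
frequencies = karyotype_frequencies(cells)

# Share of the most common karyotype reported in the abstract (438 of 1590 cells).
top_karyotype_share = 438 / 1590
```

With the figures from the abstract, the most common karyotype accounts for about 27.5% of the 1590 cells.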

P43

Single Cell RNA Seq Analysis of Kidneys of Inducible Mouse Model to Investigate Early ccRCC Development

S Hirosue, P Rodrigues, D Bihary, S Samarajiwa, S Vanharanta

MRC Cancer Unit, University of Cambridge

Renal cell carcinoma (RCC) is among the 10 most common cancers in the world, causing 143,000 deaths per year worldwide (Hsieh et al. 2017). Clear cell renal cell carcinoma (ccRCC), which develops from proximal tubule (PT) cells, is the most common subtype of RCC. Inactivation of VHL and PBRM1 is considered an early critical event in ccRCC, as they are the most commonly inactivated, as well as highly clonal, tumour suppressor genes (Turajlic et al. 2018).

Despite its high clonality, a recent paper suggested that there is a long latency between VHL mutation and the formation of ccRCC (Mitchell et al. 2018). It is important to understand the molecular and evolutionary mechanisms at work during this latency period as it may give us insights for the early detection and treatment of ccRCC.

To study early tumour progression, the timing of initial oncogenic insults and cancer formation needs to be known. Hence, conditional genetic mouse models which recapitulate the genetics and phenotypes of the corresponding human disease are required. Espana-Agusti et al. developed a Pax8-CreER transgenic mouse which allows highly specific and inducible deletion of Vhl and Pbrm1 in kidney epithelial cells. The system displays a ~20 month latency between Vhl/Pbrm1 loss and the development of ccRCC.

The aims of our project are as follows: Aim 1: Identify the acute effect of the inactivation of VHL and PBRM1 in kidney epithelial cells in vivo. scRNA-seq and bulk RNA-seq of kidney cells of the transgenic mice will be performed 2 weeks after tamoxifen or vehicle injection. scRNA-seq analysis will enable us to investigate any new population of cells in treatment samples, or the cell type specific changes between treatment and control.

Aim 2: Understand how the phenotype of VHL and PBRM1 inactivated epithelial cells changes over time, from the initial oncogenic insult until tumor formation. Kidney samples will be collected regularly at 4 different time points between induction and tumor formation and scRNA-seq and bulk RNA-seq will be performed. The transcriptional changes over time will be monitored.

Samples from the first time point were collected and analysed. Bulk RNA-seq showed the upregulation of some Hif target genes, which is a marker for VHL deletion. All known kidney cell types were identified in scRNA-seq analysis and differentially expressed genes between treatment and control were identified. A PT cell population originating only from treatment samples was found, in which HIF transcription factor network genes are upregulated.

P44

Single-cell RNA-seq of normal cell-of-origin reveals non-genetic heterogeneity of serous ovarian cancer

Zhiyuan Hu, Mara Artibani, Abdulkhaliq Alsaadi, Christopher Yau, Ahmed Ashour Ahmed

MRC Weatherall Institute of Molecular Medicine, University of Oxford, Oxford, UK

Serous ovarian cancer is the deadliest gynaecological malignancy. The development of stratified diagnosis and more efficient treatments has been impeded by the lack of a well-defined molecular classification system. To address this issue, we first looked into the cell-of-origin of serous ovarian cancer, the fallopian tube epithelium, by profiling ~6,000 single-cell transcriptomes with Smart-seq2. This led to the discovery of six novel cell subtypes in the normal fallopian tube epithelium. Based on the transcriptomic signatures of these epithelial cell subtypes, we decomposed the expression data of ~1,700 bulk ovarian tumours into the proportions of individual cell states, which unravelled extensive non-genetic heterogeneity in serous ovarian tumours. We then found that the proportion of a mesenchymal-like cell state was robustly associated with patient survival. This work shows that the heterogeneity seen in cancer can start from the different subclasses of the normal cells-of-origin. It provides insight into the molecular subtyping of ovarian cancer with important clinical implications.

P45

DropCleaner: a quality control pipeline for droplet-based sequencing data

Kui Hua, Haoxiang Gao, Sijie Chen, Xuegong Zhang

Tsinghua University

The droplet-based single-cell sequencing technique provides powerful approaches for understanding cells. Due to the high noise in the droplet-based sequencing data, it is essential to conduct stringent quality control of the data before extracting information from it. The excessive number of droplets and the random capture procedure in droplet-based sequencing techniques lead to three kinds of droplets: a huge number of empty droplets that contain no cells, thousands of droplets with only one cell (referred to as cell droplets) and hundreds of doublets/multiplets that capture more than one cell (referred to as doublets since most of the multiplets contain two cells). Empty droplets and doublets are unwanted droplets that should be removed, otherwise they may form artificial cell clusters that mislead downstream analysis. Due to the existence of floating RNAs in the cell suspension, empty droplets are not all 'empty'. They can capture tens (in 10X v2 data) or even hundreds (in 10X v3 data) of floating RNAs. This sometimes makes it difficult to distinguish empty droplets from cell droplets, especially when low-UMI cells are present. Floating RNAs in the cell suspension can also be captured by cell droplets and bias the expression profile. Here we introduce DropCleaner, a quality control pipeline for droplet-based sequencing data which is able to classify the three kinds of droplets as well as to correct the contamination of floating RNAs in the cell droplets. We introduced data-driven ways to calculate suspension scores for identifying empty droplets and doublets, respectively, and developed a statistical model for correcting the background contamination. We tested DropCleaner on several public datasets and our own single-nuclei RNA-seq (snRNA-seq) data collected from the human adult heart. Experiments showed that quality control with DropCleaner improves downstream analysis and leads to more biologically meaningful results.
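The three droplet classes and the background correction can be illustrated with a deliberately naive sketch. It is not the DropCleaner algorithm, whose scores and statistical model are not described in the abstract: here droplets are classified by fixed total-UMI cutoffs, the floating-RNA profile is pooled from the presumed-empty droplets, and a fixed contamination fraction is subtracted. All thresholds, the contamination fraction and the toy counts are hypothetical.

```python
def classify(counts, empty_max_umis, doublet_min_umis):
    """Label a droplet from its total UMI count using two fixed cutoffs."""
    total = sum(counts.values())
    if total <= empty_max_umis:
        return "empty"
    if total >= doublet_min_umis:
        return "doublet"
    return "cell"


def ambient_profile(droplets, empty_max_umis):
    """Pool counts from presumed-empty droplets into per-gene fractions."""
    pooled = {}
    for counts in droplets:
        if sum(counts.values()) <= empty_max_umis:
            for gene, c in counts.items():
                pooled[gene] = pooled.get(gene, 0) + c
    total = sum(pooled.values())
    return {gene: c / total for gene, c in pooled.items()} if total else {}


def subtract_ambient(counts, profile, contamination_fraction):
    """Remove the expected floating-RNA counts from a cell droplet, clipping at zero."""
    total = sum(counts.values())
    corrected = {}
    for gene, c in counts.items():
        expected = contamination_fraction * total * profile.get(gene, 0.0)
        corrected[gene] = max(0.0, c - expected)
    return corrected


# Hypothetical toy droplets (gene -> UMI count).
droplets = [
    {"HBB": 8, "ACTB": 2},
    {"HBB": 6, "ACTB": 4},
    {"ACTB": 300, "MYH7": 190, "HBB": 10},
    {"ACTB": 900, "MYH7": 600},
]
labels = [classify(d, empty_max_umis=50, doublet_min_umis=1200) for d in droplets]
profile = ambient_profile(droplets, empty_max_umis=50)
cleaned = subtract_ambient(droplets[2], profile, contamination_fraction=0.02)
```

Fixed cutoffs of this kind are exactly what becomes unreliable when low-UMI cells overlap with RNA-rich empty droplets, which is the motivation for data-driven scoring.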

P46

The intestinal Th17 population and its role in extra-intestinal autoimmune disease

Linglin Huang [1], Alexandra Schnell [2], Meromit Singer [3,4,5], Anvita Singaraju [2], Rafael Irizarry [1,3], [5,6], Vijay K. Kuchroo [2,5]

[1] Department of Biostatistics Harvard T.H. Chan School of Public Health, Boston, MA 02215, USA [2] Evergrande Centre for Immunologic Diseases, Harvard Medical School and Brigham and Women’s Hospital and Ann Romney Centre for Neurologic Diseases, Brigham and Women’s Hospital, Boston, MA 02215, USA [3] Department of Data Sciences, Dana-Farber Cancer Institute, Boston, MA 02215, USA [4] Department of Immunology, Harvard Medical School, Boston, MA 02115, USA [5] Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA [6] Howard Hughes Medical Institute, Department of Biology and Koch Institute of Integrative Cancer Research, Massachusetts Institute of Technology, Cambridge, MA 02138, USA

The Il17-producing T helper subset (Th17) is a well-established driver of multiple autoimmune diseases. At homeostasis most Th17 cells are found in the lamina propria of the intestine where they act as major contributors to tissue homeostasis. Interestingly, recent studies strongly suggest an involvement of the intestinal Th17 population in extra-intestinal autoimmune diseases. However, the mechanism by which intestinal Th17 cells can drive autoimmune tissue inflammation in peripheral sites remains to be elucidated. We performed single-cell RNA sequencing of splenic, lymph node-derived, intestinal and -infiltrating Th17 cells at homeostasis and during experimental autoimmune encephalomyelitis (EAE). Accordingly, we built an analysis pipeline for a comprehensive characterization of the Th17 population. In particular, we identified tissue-specific Th17 signatures and characterized the heterogeneity of Th17 cells in each tissue at homeostasis. We also described the changes in transcriptional profiles upon EAE in different tissues. Furthermore, we used single-cell TCR sequencing with cell hashing to obtain clonotype information of Th17 cells from different tissues in the same mouse. This allowed us to characterize Th17 cell clonal expansion activity as well as across-tissue migration at homeostasis and during EAE. Taken together, our study provides an extensive single-cell survey of tissue Th17 cells during homeostasis and EAE.

P47

Modelling foam cell formation in-vitro

Maria Imaz(1,2), Andrew Knights(1), Nikos Panousis(1), Helle Jørgensen(2), Dirk Paul(1,3), Daniel J Gaffney(1)

1. Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridge, UK 2. Cardiovascular Medicine Division, Department of Medicine, University of Cambridge, Cambridge, UK 3. BHF Cardiovascular Epidemiology Unit, Department of Public Health and Primary Care, University of Cambridge, Cambridge, UK

Background: Macrophages are central to the pathology and progression of atherosclerosis. Uptake and retention of modified lipoproteins by tissue-resident macrophages leads to the formation of lipid-laden 'foam cells'. Subsequent induction of endothelial and smooth muscle cells further promotes monocytic recruitment and chronic inflammation of the vessel wall. This can lead to plaque formation and eventual rupture, giving rise to thrombotic events such as stroke. Aim: To further understand macrophage involvement in atherosclerosis through development of a cellular model for foam cell formation. Exploitation of this model will aid understanding of the molecular mechanisms and genetic risk variants associated with atherosclerosis. Methods: 189 iPSC lines from the HipSci Project were differentiated into macrophages. Macrophages were first polarised to a proinflammatory (IFNγ/LPS) subtype or an anti-inflammatory (IL-4) subtype for 18 hours, followed by stimulation with oxidised LDL (oxLDL) for 72 hours. Lines were then subjected to RNA- and ATAC-sequencing as well as analysis of the secreted proteome. Results: In an initial pilot data set of 36 lines, we identified distinct transcriptional markers corresponding to the different pro-/anti-inflammatory macrophage subtypes and atherogenic stimulus. Gene ontology analysis revealed enrichment of inflammatory and metabolic processes, providing evidence of the validity of our cellular model. Next steps: We will next analyse samples taken directly from areas containing arterial plaques to establish a reference dataset of macrophage phenotype through morphological assessment (histology/immunohistochemistry) and transcriptomic analysis (single-nuclei RNA-seq). 
This data will also be used to evaluate our current in-vitro model of foam cell formation including, but not limited to, assessment of how well lipid-loading with modified LDL recapitulates the pathophysiology of atherosclerosis and how this model compares to published atherosclerotic plaque data.

P48

A Platform for Automated and Highly-Multiplexed In Situ Gene Detection with Single-Cell Resolution

Noel Jee, Irene Oh, Raymund Yin, Matthew Mo, Hojin Lee, Yang Hyo Kim, Zenjoe Green, Josh Ryu, Brett Cook, Noel Jee

Optical Biosystems, Inc

Single-cell RNA sequencing (scRNA-seq) has fundamentally expanded our understanding of tissue composition and heterogeneity. Various scRNA-seq approaches enable comprehensive screening of complete tissues but require tissue dissociation and the removal of individual cells from their biological context. Validation of differential gene expression in situ is, therefore, of high interest. Emerging spatial technologies focus on genome-wide expression data in situ and are fantastic screening tools but lack the depth, resolution, scale, or speed necessary to interrogate spatial gene expression with true single-cell resolution. Optical Biosystems (OBI) has developed an automated platform that combines Synthetic Aperture Optics (SAO), fluidics engineering, and indirect single-molecule RNA Fluorescence In Situ Hybridization (smRNA FISH) chemistry to probe 30+ gene targets across large tissue sections with sub-micron resolution. Here, we describe a demonstration of the technology platform. Multiple target probes and corresponding fluorescent readout probes for each of 30 mouse genes were designed. A fresh frozen mouse brain tissue section was attached to a glass coverslip and fixed. A flow cell was then assembled and loaded onto the instrument. Target probes for all 30 genes were initially hybridized. Readout probes for 3 genes were then hybridized, imaged, and removed. 30 genes were detected over 10 cycles with full automation and no human intervention. SAO enabled imaging of the entire tissue section with a lateral resolution comparable to a 100x oil immersion lens while achieving more than an order of magnitude improvement in throughput. A full experiment, from flow cell assembly to the probing of 30 genes in 10 cycles, was completed within two days for a whole mouse brain (~15mm x 5mm). The raw images were reconstructed and target RNA molecules were counted with software algorithms for further analysis and comparison with independent, published data. 
Single RNA molecules were consistently detected for all 30 genes with good signal strength, sensitivity, and specificity. These data exhibit a powerful and robust automated platform that can translate biological insights from transcriptomic screening experiments into a spatial context and provide a foundation for expansion into long non-coding RNAs, mitochondrial RNAs, introns, splice variants, chromatin structure, and proteins.

P49

Targeting tumour - macrophage interaction in pancreatic ductal adenocarcinoma

Tony Wu, Michael Gill, Oliver Cast, Zeynep Kalender-Atak, Martin Miller

Cancer Research UK Cambridge Institute, University of Cambridge, Cambridge UK

Pancreatic ductal adenocarcinoma (PDAC) is currently the fourth leading cause of cancer-related deaths in the Western world, and is projected to rise to the second position in the next ten years. Therapeutic targeting of PDAC has not generated a significant improvement to date, with the current first-line chemotherapy, gemcitabine, only extending life by 11 months. Hence, novel therapeutic interventions are acutely needed.

The tumour microenvironment of PDAC is characterized by dense fibrotic stroma and extensive infiltration by myeloid cells, with macrophages being one of the most abundant populations. Macrophages are critical for tumour growth and metastasis, and higher macrophage infiltration in PDAC patients is associated with poor survival.

To understand the contribution of macrophages in PDAC, we conducted a multi-omics study with tumour cell lines derived from the KPC tumour mouse model of human PDAC and primary macrophages derived from the bone marrow of wildtype mice. First, we profiled the proteome, secretome and transcriptome of PDAC cell lines and primary macrophages (alone or in co-culture) with mass spectrometry and bulk RNA-seq. This analysis revealed CCR1, a G-protein-coupled receptor, as the most highly differentially expressed protein during prolonged physical contact. Next, we assessed the phenotype of PDAC cell lines during co-culture with macrophages using a 3D in vitro invasion assay, and found that macrophages imparted an invasive phenotype on tumour cells, which could be blocked using small molecule inhibitors against CCR1. Finally, we profiled PDAC cell lines and macrophages with scRNA-seq (alone or in co-culture) in combination with CCR1 inhibition. Using 10X Chromium single-cell 3' technology, we sequenced more than 50,000 single cells across eleven conditions (2 time points, mono and co-culture condition; CCR1 inhibition or baseline). Currently we are delineating the effect of co-culture conditions and CCR1 inhibition on cell states of PDAC cells, macrophages and CAFs using gene regulatory network analysis, receptor-ligand interactions and trajectory inference methods.

Our work reflects that the traditional tumour cell-centric view for drug discovery is being replaced with a more comprehensive view that takes the tumour microenvironment into account. This multi-omics dataset we present will provide insights into macrophage-tumour interactions in an in vitro PDAC model system and reveal downstream effects of a novel therapeutic regimen to better inform treatment options.

P50

Bayesian modelling of single cell epigenetic heterogeneity

Chantriolnt-Andreas Kapourani & Ricard Argelaguet, Catalina A. Vallejos, Oliver Stegle & Guido Sanguinetti

CAK: MRC Human Genetics Unit, University of Edinburgh, UK, School of Informatics, University of Edinburgh, UK. RA: European Bioinformatics Institute (EMBL-EBI), Hinxton, UK. CAV: MRC Human Genetics Unit, University of Edinburgh, UK. OS: European Molecular Biology Laboratory (EMBL), Heidelberg, Germany. GS: School of Informatics, University of Edinburgh, UK

High throughput measurements of DNA methylomes across single cells (e.g. single cell bisulfite sequencing; scBS-seq) are a promising resource to uncover the heterogeneity and dynamics of DNA methylation. This could revolutionise our understanding of the regulatory landscape underlying complex biological processes. However, limitations of the technology result in sparse CpG coverage, effectively posing challenges when robustly quantifying genuine epigenetic heterogeneity.

We introduce a Bayesian hierarchical model to disentangle technical from biological heterogeneity while also sharing information across similar genomic features (e.g. enhancers and promoters) to overcome data sparsity. The statistical method combines a beta-binomial specification (to capture biological overdispersion) with a generalised linear model framework. The latter allows covariates that may explain methylation rates, such as CpG density, to be incorporated, and corrects for the mean-overdispersion relationship that is typically observed in such assays. This modelling framework supports diverse downstream analysis tasks, including: (i) feature selection, identifying highly (or lowly) variable features that drive the epigenetic heterogeneity within a cell population and (ii) differential methylation testing, to identify features that show differences in mean methylation and/or methylation variability between groups of cells, e.g. different developmental stages.

We show on both simulated and real data sets that our model can robustly and accurately quantify epigenetic heterogeneity. We illustrate the usage of our method on scBS-seq data, but it could also be applied to other types of assays whose output is measured in terms of proportions. To the best of our knowledge, this is the first computational method that jointly performs differential mean methylation and differential variability analysis for single cell DNA methylation data.
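
The beta-binomial specification mentioned in this abstract can be illustrated with a minimal sketch. The mean/overdispersion parameterisation (`mu`, `gamma`) and the logistic link in `mean_from_covariates` are assumptions chosen for illustration; the abstract does not state the exact parameterisation, priors or inference scheme of the authors' model, and none of them are reproduced here.

```python
import math


def log_beta(a, b):
    """Logarithm of the Beta function B(a, b)."""
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)


def beta_binomial_logpmf(k, n, mu, gamma):
    """Log-probability of k methylated CpGs out of n covered CpGs.

    mu    : mean methylation rate, 0 < mu < 1 (assumed parameterisation)
    gamma : overdispersion, 0 < gamma < 1; gamma -> 0 approaches a binomial
    """
    alpha = mu * (1.0 - gamma) / gamma
    beta = (1.0 - mu) * (1.0 - gamma) / gamma
    log_choose = math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
    return log_choose + log_beta(k + alpha, n - k + beta) - log_beta(alpha, beta)


def mean_from_covariates(x, w):
    """Hypothetical GLM-style link: logistic function of a linear predictor,
    where x could hold covariates such as CpG density and w their weights."""
    eta = sum(xi * wi for xi, wi in zip(x, w))
    return 1.0 / (1.0 + math.exp(-eta))


# Toy check: distribution over 0..10 methylated CpGs for one feature in one cell.
n, mu, gamma = 10, 0.3, 0.2
pmf = [math.exp(beta_binomial_logpmf(k, n, mu, gamma)) for k in range(n + 1)]
mean = sum(k * p for k, p in enumerate(pmf))
variance = sum((k - mean) ** 2 * p for k, p in enumerate(pmf))
```

Under this parameterisation the variance is n·mu·(1 − mu)·(1 + (n − 1)·gamma), so `gamma` directly measures the excess variability over a binomial with the same mean, which is the quantity a variability analysis would compare between features or groups of cells.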

P51

Locating cells and expression programmes using spatial data with single cell reference

Vitalii Kleshchevnikov^, Artem Lomakin^, Emma Dann^, Artem Shmatko*°, Moritz Gerstung*, Omer Bayraktar^

^ Wellcome Sanger Institute, * EMBL-EBI, ° Moscow State University

According to the current model of cell type identity, cell types respond to distinct environments by switching on new expression programmes, resulting in distinct cell states. Thus gathering information about the cell environment is crucial for understanding the function of cell states that we detect as clusters using single cell transcriptome sequencing. Gene expression can be quantified in situ at single cell resolution using microscopy and hybridisation-based detection of individual mRNA molecules. However, it is currently challenging to locate all cell states within the tissue using this approach because it requires pre-defining a large set of marker genes, imaging of which requires custom microscopy setups. Here we propose a simplified workflow to both identify and locate cell states within tissue regions using a combination of single nucleus and spatial RNA-sequencing from adjacent tissue sections and a probabilistic modelling approach. Using 10X Visium technology we measure combined genome-wide expression profiles of several cells at a grid of 50 µm circular locations (1-10 to > 50 cells depending on cell density). Using single nucleus and single cell RNA-seq we derive cell cluster signatures and the individual gene expression programmes they use. To locate cell states, we developed a Bayesian model of Visium mRNA count data that estimates mRNA contributions of both cell clusters and expression programmes to each location. We applied our model to tissue sections from the mouse brain and human thymus. We showed it can accurately recapitulate spatial data, accounting for differences between transcriptomic technologies. Inferred cell state locations correspond to known brain regions, including cortical layers. We demonstrate the value of our probabilistic approach for locating both observed and unobserved cell states in spatial data and for studying cell-cell interactions in situ.
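
The idea of estimating how much each cell cluster contributes to the mRNA counts at one location can be sketched with a much simpler stand-in than the Bayesian model described above: non-negative least squares solved by projected gradient descent. The function name `decompose_spot`, the toy signatures and the toy counts are all hypothetical, and this sketch ignores count noise, expression programmes and technology differences that the authors' model accounts for.

```python
def decompose_spot(signatures, spot_counts, n_iter=2000):
    """Estimate non-negative contributions of each cell cluster to one location.

    signatures  : list of per-cluster reference expression vectors (one value per gene)
    spot_counts : observed expression vector at one spatial location
    Minimises 0.5 * ||sum_c w_c * signature_c - spot_counts||^2 subject to w_c >= 0.
    """
    n_types = len(signatures)
    # Gram matrix S^T S and right-hand side S^T y
    gram = [[sum(a * b for a, b in zip(si, sj)) for sj in signatures]
            for si in signatures]
    target = [sum(s * y for s, y in zip(si, spot_counts)) for si in signatures]
    # 1 / trace is a safe step size because trace >= largest eigenvalue
    step = 1.0 / sum(gram[i][i] for i in range(n_types))
    w = [0.0] * n_types
    for _ in range(n_iter):
        grad = [sum(gram[i][j] * w[j] for j in range(n_types)) - target[i]
                for i in range(n_types)]
        # gradient step followed by projection onto w >= 0
        w = [max(0.0, wi - step * gi) for wi, gi in zip(w, grad)]
    return w


# Toy example: three cluster signatures over four genes; the location contains
# 2 units of cluster 0, none of cluster 1 and 3 units of cluster 2.
toy_signatures = [
    [10.0, 1.0, 0.0, 5.0],
    [1.0, 8.0, 2.0, 5.0],
    [0.0, 1.0, 9.0, 5.0],
]
toy_spot = [20.0, 5.0, 27.0, 25.0]
weights = decompose_spot(toy_signatures, toy_spot)
```

A probabilistic model of the kind the abstract describes would replace the squared-error objective with a count likelihood and place priors on the contributions, which also yields uncertainty estimates that this point estimate lacks.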

P52

Functional module detection through integration of single-cell RNA sequencing data with protein–protein interaction networks

Florian Klimm, Enrique M. Toledo, Thomas Monfeuga, Fang Zhang, Charlotte M. Deane, and Gesine Reinert

Department of Mathematics, Imperial College; Mitochondrial Biology Unit, University of Cambridge; Department of Statistics, University of Oxford; Novo Nordisk Research Centre Oxford;

Recent advances in single-cell RNA sequencing (scRNA-seq) have allowed researchers to explore transcriptional function at a cellular level. In this study, we present scPPIN, a method for integrating single-cell RNA sequencing data with protein-protein interaction networks (PPINs) that detects active modules in cells of different transcriptional states. We achieve this by clustering RNA-sequencing data, identifying differentially expressed genes, constructing node-weighted PPINs, and finding the maximum-weight connected subgraphs with an exact Steiner-tree approach. As a case study, we investigate RNA-sequencing data from human liver spheroids, but the techniques described here are applicable to other organisms and tissues. In particular, scPPIN allows us to expand the output of differentially expressed gene analysis with information from protein interactions. We find that different transcriptional states have different significantly enriched subnetworks of the PPIN, which represent biological pathways. In these pathways, scPPIN also identifies proteins that are not differentially expressed but have a crucial biological function (e.g., as receptors) and therefore reveals biology beyond a standard differentially expressed gene analysis. The scPPIN method is available as an R library and as an interactive online tool: https://github.com/floklimm/scPPIN A preprint is available under https://www.biorxiv.org/content/10.1101/698647v2
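
To make the maximum-weight connected subgraph step concrete, here is a toy sketch that solves it by brute-force enumeration. This is not the exact Steiner-tree approach used by scPPIN and only works for very small graphs; the protein names, node weights and edges are hypothetical.

```python
from itertools import combinations


def is_connected(nodes, adjacency):
    """Check whether the subgraph induced by `nodes` is connected."""
    nodes = set(nodes)
    start = next(iter(nodes))
    seen = {start}
    stack = [start]
    while stack:
        u = stack.pop()
        for v in adjacency.get(u, ()):
            if v in nodes and v not in seen:
                seen.add(v)
                stack.append(v)
    return seen == nodes


def max_weight_connected_subgraph(weights, adjacency):
    """Enumerate all node subsets and keep the heaviest connected one.

    Exponential in the number of nodes, so only suitable for toy graphs.
    """
    best_nodes, best_weight = None, float("-inf")
    names = sorted(weights)
    for size in range(1, len(names) + 1):
        for subset in combinations(names, size):
            total = sum(weights[n] for n in subset)
            if total > best_weight and is_connected(subset, adjacency):
                best_nodes, best_weight = set(subset), total
    return best_nodes, best_weight


# Toy PPIN: a path A - B - C - D - E. Positive weights stand for strongly
# differentially expressed genes, negative weights for genes that are not.
toy_weights = {"A": 3.0, "B": -1.0, "C": 2.5, "D": -4.0, "E": 1.0}
toy_adjacency = {
    "A": ["B"],
    "B": ["A", "C"],
    "C": ["B", "D"],
    "D": ["C", "E"],
    "E": ["D"],
}
module, module_weight = max_weight_connected_subgraph(toy_weights, toy_adjacency)
```

In this toy graph the best module is {A, B, C}: node B has a negative weight but is retained because it connects two strongly weighted neighbours, which mirrors the abstract's point that a module can include proteins that are not themselves differentially expressed.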

P53

Poincaré Maps for Analyzing Complex Hierarchies in Single-Cell Data

Anna Klimovskaia, David Lopez-Paz, Léon Bottou, Maximilian Nickel

Facebook AI

The need to understand cell developmental processes has spawned a plethora of computational methods for discovering hierarchies from scRNAseq data. However, existing techniques are based on Euclidean geometry which is a poor choice for modeling complex cell trajectories with multiple branches. To overcome this fundamental representation issue we propose Poincaré maps, a method harnessing the power of hyperbolic geometry for the realm of single-cell data analysis. Often understood as a continuous extension of trees, hyperbolic geometry enables the embedding of complex hierarchical data into as few as two dimensions, and faithfully preserves distances between points in the hierarchy. This enables direct exploratory analysis and the use of our embeddings in a wide variety of downstream data analysis tasks, such as visualization, clustering, lineage detection and pseudotime inference. In contrast to existing methods - which are not able to achieve all those important aspects in a single embedding - we show that Poincaré maps produce state-of-the-art two-dimensional representations of cell trajectories on multiple scRNAseq datasets. We quantitatively and qualitatively compared the Poincaré maps approach developed in our work to seven commonly used embedding approaches (tSNE, UMAP, diffusion maps, PCA, SIMLR, PHATE, SAUCIE) on several publicly available single-cell RNAseq datasets. Poincaré maps not only achieve state-of-the-art results on all the datasets, but significantly outperform other methods on datasets of very large complexity such as embryogenesis of C. elegans. We demonstrate that Poincaré maps allow us to straightforwardly formulate new hypotheses about biological processes, which were not visible with the methods introduced before. With Poincaré maps, we hope to raise interest in hyperbolic embeddings within the biology community, as these could be applied to a wide variety of problems. 
These could include the study of transcriptional heterogeneity and lineage development in cancer from single-cell RNA and DNA sequencing data, reconstructing the developmental hierarchy of blood development, and reconstructing embryogenesis branching trajectories.
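
As a small illustration of the hyperbolic geometry underlying such embeddings, the sketch below computes the standard distance between two points in the Poincaré disk model. It shows only the distance function; how Poincaré maps construct and optimise the embedding is not described in this abstract and is not reproduced here.

```python
import math


def poincare_distance(u, v):
    """Hyperbolic distance between two points strictly inside the unit disk.

    d(u, v) = arcosh(1 + 2 * |u - v|^2 / ((1 - |u|^2) * (1 - |v|^2)))
    """
    sq_norm_u = sum(c * c for c in u)
    sq_norm_v = sum(c * c for c in v)
    sq_diff = sum((a - b) ** 2 for a, b in zip(u, v))
    return math.acosh(1.0 + 2.0 * sq_diff / ((1.0 - sq_norm_u) * (1.0 - sq_norm_v)))


# The same Euclidean gap (0.2) corresponds to a much larger hyperbolic
# distance near the boundary of the disk than near its centre.
near_centre = poincare_distance((0.0, 0.0), (0.2, 0.0))
near_boundary = poincare_distance((0.7, 0.0), (0.9, 0.0))
```

Distances grow rapidly towards the boundary of the disk, which is what gives a two-dimensional hyperbolic embedding enough room to place the many leaves of a branching hierarchy far apart from one another.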

P54

Spatiotemporal Dynamics of Placental Group B Streptococcus Infection During Pregnancy

Felicia Kuperwaser, Gal Avital, Tara M. Randis, Michelle Vaz, Allison Dammann, Adam J Ratner, Itai Yanai

NYU Grossman School of Medicine

Group B Streptococcus (GBS) is a major cause of infection during pregnancy and in the first weeks of an infant's life. GBS asymptomatically colonizes 25% of adults, but during pregnancy, ascending GBS infection can cause chorioamnionitis, which may subsequently induce premature birth or neonatal sepsis. In this context, the placenta acts as the initial interface between pathogen virulence and host defense mechanisms. Specifically, prior work has implicated placental macrophages in the host response to GBS, but the full extent of their role in this process has not been clearly defined. To investigate the effect of GBS infection on the placenta, we used an ascending model of GBS infection in pregnant mice and analyzed the accompanying transcriptomic changes in infected placental tissue. Pregnant mice were vaginally colonized with GBS on day E13 of pregnancy and placentas were harvested at different time points for scRNA-Seq and Spatial Transcriptomic (ST) analysis, in order to capture the dynamic changes that accompany this infection process. The ST method produces a high-resolution transcriptomic map of infected placenta, and this data highlights enrichment in spatial regions for processes involved in immune defense and response to GBS-specific mechanisms of pathogenesis, allowing us to localize the phenotypic manifestations of this infection to functional regions of the organ. Cell state analysis reveals a dysregulation in phenotypically discrete macrophage subpopulations following infection. Correlated with these changes is a distinct temporal pattern of neutrophil activation and abundance, suggesting that the cellular landscape in the GBS-infected placenta is altered and potentially permissive to GBS growth conditions. 
Further studies promise to elucidate the effect of the immune response on other placental cell populations, in turn affecting normal placental function and contributing to the devastating clinical manifestations of GBS infection in pregnancy with implications for therapeutic intervention.

P55

Improving the data quality of low performing scRNA-seq protocols by learning from high quality equivalent datasets

Atefeh Lafzi, Holger Heyn, Fabian J. Theis

CNAG-CRG, Centre for Genomic Regulation (CRG), Barcelona Institute of Science and Technology (BIST), Barcelona, Spain

Recent scRNA-seq benchmarking studies have demonstrated that scRNA-seq protocols generate data of different quality. Low-quality datasets usually suffer from lower library complexity and resolution, which causes problems in downstream scRNA-seq data analysis pipelines. To address such variance in output quality, we developed a computational approach that enhances the data quality of low-performing protocols by learning from the best-performing protocols. To this end, we calculated the latent space of the joint high- and low-quality datasets using the bottleneck layer of a variational autoencoder (VAE), and applied vector arithmetic to generate a "transformation vector" that represents the average of the differences between the two datasets in the latent space. Then, by encoding low-quality datasets into the latent space, adding the transformation vector and decoding back to the high-dimensional space, we were able to enhance the quality of the dataset. As an application example, we trained our model on data from the Mouse Cell Atlas (Tabula Muris consortium) and demonstrated the improvement in data quality of thymus cells. We decreased the level of dropout events, enhanced the expression level of known markers and reduced the variance and noise in the expression of expected markers in the low-quality dataset.
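
The latent-space vector arithmetic described above can be sketched as follows. The sketch assumes the cells have already been encoded into latent vectors; the VAE encoder and decoder themselves are omitted, and the function names and toy latent coordinates are hypothetical.

```python
def mean_vector(rows):
    """Coordinate-wise mean of a list of equal-length latent vectors."""
    n = len(rows)
    return [sum(column) / n for column in zip(*rows)]


def transformation_vector(z_low, z_high):
    """Difference between the mean latent position of the high-quality cells
    and that of the low-quality cells."""
    mean_low = mean_vector(z_low)
    mean_high = mean_vector(z_high)
    return [h - l for h, l in zip(mean_high, mean_low)]


def shift_latent(z, delta):
    """Add the transformation vector to every latent vector."""
    return [[value + d for value, d in zip(row, delta)] for row in z]


# Toy 2-D latent codes (in practice these would come from the VAE encoder).
z_low = [[0.0, 0.0], [2.0, 2.0]]
z_high = [[3.0, 1.0], [5.0, 3.0]]
delta = transformation_vector(z_low, z_high)
z_low_shifted = shift_latent(z_low, delta)
# z_low_shifted would then be passed through the decoder to obtain
# enhanced expression profiles.
```

After the shift, the low-quality cells keep their positions relative to each other but their centroid coincides with that of the high-quality cells, which is the sense in which the transformation vector captures the average difference between the two datasets.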

P56

Investigating the role of Sca1 in selective VSMC expansion using trajectory analysis and functional assays

Jordi Lambert and Sebnem Oc, Annabel L Taylor, Lina Dobnikar, Joel Chappell, Jennifer L Harman, Martin R Bennett, Mikhail Spivakov, Helle F Jørgensen

1. Nuclear Dynamics Programme, Babraham Institute, Babraham Research Campus, Cambridge, CB22 3AT, UK Lina Dobnikar & Mikhail Spivakov 2. Division of Cardiovascular Medicine, University of Cambridge, Cambridge Biomedical Campus, Cambridge, CB2 0QQ, UK Jordi Lambert, Sebnem Oc, Lina Dobnikar, Annabel L. Taylor, Joel Chappell, Jennifer L. Harman, Martin R. Bennett & Helle F. Jørgensen 3. Functional Gene Control Group, Epigenetics Section, MRC London Institute of Medical Sciences, Du Cane Road, London, W12 0NN, UK Mikhail Spivakov 4. Institute of Clinical Sciences, Faculty of Medicine, Imperial College, Du Cane Road, London, W12 0NN, UK Mikhail Spivakov

Atherosclerosis and vessel injury are conditions characterised by the accumulation and proliferation of vascular smooth muscle cells (VSMCs), which can lead to serious cardiovascular events. Previously, we showed that the VSMC expansion in atherosclerotic lesions is oligoclonal in nature (Chappell et al., 2016). However, the mechanisms behind this selective activation of VSMC proliferation are yet to be elucidated. Using single cell RNA-sequencing (scRNA-seq) we identified a small population of Sca1+ VSMCs in healthy vasculature (Dobnikar et al., 2018), which we hypothesize may be 'primed' for proliferation, or represent a dedicated progenitor population. Here, we investigated a potential role of Sca1 in selective VSMC proliferation through scRNA-seq of a mouse carotid ligation model of vessel injury, which induces rapid VSMC proliferation, and through in vitro proliferation assays. Sca1+ cells were found in higher numbers in injured carotid arteries, but maintained similar transcriptional signatures. Pseudotemporal ordering of cells placed Sca1+ cells prior to proliferating cells, supporting the hypothesis that Sca1 expression represents a primed cell stage. Consistent with this, cells with upregulated Sca1 were found in a cluster that was significantly associated with stress response, cell migration, and proliferation gene ontology terms. Functional examination supported the idea that Sca1 marks a primed population rather than a specific progenitor population. VSMCs isolated from the healthy mouse aorta were cultured in a 2D clonal proliferation assay which demonstrated that, whereas proliferation is not restricted to Sca1+ cells, attachment and clonal expansion of Sca1+ cells is increased and temporally advanced compared to Sca1- VSMCs. These findings enabled characterization of a disease-relevant primed population of VSMCs that could be targeted in disease. 
Further insight into the mechanism controlling VSMC proliferation will be gained from functional testing of potential regulators identified using the axis defined by the trajectory analysis of scRNA-seq data.

P57

CellRank infers probabilistic lineages based on RNA Velocity

Marius Lange (1,2), Volker Bergen (1,2), Michal Klein (1), Manu Setty (3), Dana Pe’er (3), Fabian J. Theis (1,2)

1 Institute of Computational Biology, Helmholtz Zentrum München - German Research Center for Environmental Health, 85764 Neuherberg, Germany 2 Department of Mathematics, Technische Universität München, 85748 Munich, Germany 3 Program for Computational and Systems Biology, Sloan Kettering Institute, Memorial Sloan Kettering Cancer Center, New York, NY, USA.

Single-cell transcriptomics has enabled the unbiased study of cellular differentiation and lineage choice at single cell resolution. A multitude of computational trajectory inference methods have been developed to infer cellular dynamics. A central challenge is that single-cell RNA-seq only reveals static snapshots of gene expression. With few exceptions, these methods are based on transcriptomic similarities between cells, i.e. cells with similar gene expression profiles are likely to transition into one another. This can lead to false conclusions as transcriptomic similarity does not necessarily imply developmental relatedness.

Recently, RNA velocity has been introduced, recovering directed dynamic information from splicing kinetics. It yields a time derivative for each individual cell indicating the future cellular state. Here, we propose CellRank, a probabilistic model based on Markov chains which makes use of both transcriptomic similarities as well as RNA velocity to model cellular dynamics. CellRank infers developmental start- and endpoints and assigns lineages in a probabilistic manner.

We demonstrate CellRank's capabilities on cell lineages in pancreatic endocrinogenesis, hippocampal dentate gyrus neurogenesis as well as in lung regeneration. CellRank yields insights into the timing of endocrine lineage commitment and recapitulates gene expression trends towards developmental endpoints. CellRank scales to large cell numbers and is fully compatible with the analysis toolkit scanpy.
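
The notion of assigning lineages probabilistically with a Markov chain can be illustrated by computing absorption probabilities on a toy transition matrix. This is a generic Markov chain calculation and not CellRank's API or implementation; how CellRank builds the transition matrix from RNA velocity and transcriptomic similarity is not shown, and the four-state matrix below is hypothetical.

```python
def absorption_probabilities(P, terminal, n_iter=1000):
    """Probability that a chain started in each state ends in each terminal state.

    P        : row-stochastic transition matrix as a list of lists
    terminal : set of indices of absorbing (terminal) states
    Solves a_i = sum_j P[i][j] * a_j for non-terminal i by fixed-point iteration.
    """
    n = len(P)
    result = {}
    for t in sorted(terminal):
        a = [1.0 if i == t else 0.0 for i in range(n)]
        for _ in range(n_iter):
            a = [a[i] if i in terminal
                 else sum(P[i][j] * a[j] for j in range(n))
                 for i in range(n)]
        result[t] = a
    return result


# Toy chain: state 0 = progenitor, state 3 = intermediate,
# states 1 and 2 = two terminal fates (absorbing).
toy_P = [
    [0.5, 0.0, 0.2, 0.3],  # progenitor: stays, commits to fate 2, or moves on
    [0.0, 1.0, 0.0, 0.0],  # terminal fate 1
    [0.0, 0.0, 1.0, 0.0],  # terminal fate 2
    [0.0, 0.6, 0.0, 0.4],  # intermediate: stays or commits to fate 1
]
fate_probs = absorption_probabilities(toy_P, {1, 2})
```

In this toy chain a progenitor cell reaches fate 1 with probability 0.6 and fate 2 with probability 0.4, so each cell receives a soft lineage assignment rather than a single hard label.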

P58

RECONSTRUCTING ENTERIC NERVOUS SYSTEM LINEAGES AT SINGLE-CELL RESOLUTION

Anna Laddach and Reena Lasrado, Jens Kleinjung, Franze Progatzky, Michael Shapiro, Vassilis Pachnis

The , 1 Midland Road, London NW1 1AT, United Kingdom

The Enteric Nervous System (ENS) encompasses the intrinsic neuro-glial networks of the gut and is essential for digestive functions and intestinal homeostasis. The majority of enteric neurons and glia originate from a small population of neural crest cells that invade the foregut during embryogenesis and colonise the entire organ giving rise to the neuronal and glial lineages. Despite considerable progress in understanding the cellular mechanisms underpinning ENS development, the molecular mechanisms that control fate choice in individual progenitors and the ensuing steps leading to the generation of mature enteric neurons and glial cells remain obscure. Furthermore, the molecular mechanisms promoting the anatomical and functional integration of intestinal neuroglia lineages into the dynamic tissue and luminal microenvironment of the gut are unclear. We have previously characterised at single-cell resolution the transcriptome of Sox10-expressing ENS progenitors isolated from the small intestine of embryonic day (E) 12.5 mouse embryos and described cellular trajectories and candidate regulators of enteric neuron and glia differentiation. To gain insight into the regulatory mechanisms that shape the cellular and molecular landscape of ENS progenitors across the developmental framework of gut organogenesis and functional maturation, we have extended our single-cell transcriptomic analysis of Sox10+ intestinal cells at key developmental stages (E12.5, E14.5, E16.5, P0, P25 and P60). Computational analysis using clustering algorithms and visualizations for dimensionality reduction reveals a landscape that encompasses both progenitor and mature ENS states. Gene module analysis shows a change of molecular players between early (E12) and late progenitors (E14-P0) suggesting a shift in the intrinsic character of ENS progenitors in the temporal axis and their adaptation to the developing gut environment. 
Using pseudotime trajectory algorithms, we observe the clear emergence of the neurogenic lineage from both early and late time points. In addition, stochastic RNA velocity analysis reveals the emergence of an adult glial trajectory from late progenitors that appear to mature around weaning (P25). Adult neuronal and glial clusters were identified using known markers (Elavl4-neuron, S100b-glia). In the adult neuronal cluster, we find two sub-clusters that define the cardinal excitatory (Chat) and inhibitory (Nos1) subtypes. Interestingly, while both glial sub-clusters are enriched in genes associated with immune function, one cluster shows an increased expression of Gfap. Together, our study reconstructs the lineage landscapes along the developmental timeline of ENS progenitors and disentangles gene regulatory networks that contribute to the spatial and functional features of the ENS.

P59

Single cell profiling of immature human postnatal thymocytes resolves the complexity of intra-thymic lineage differentiation and thymus seeding precursors

Marieke Lavaert 1, Kai Ling Liang 1, Niels Vandamme 2,3, Jong-Eun Park 4, Juliette Roels 1,5, Monica S. Kowalczyk 6, Bo Li 6,7, Orr Ashenberg 6, Marcin Tabaka 6, Danielle Dionne 6, Timothy L. Tickle 6,8, Michal Slyper 6, Orit Rozenblatt-Rosen 6, Bart Vandekerckhove 1,3, Georges Leclercq 1,3, Aviv Regev 6,9, Pieter Van Vlierberghe 3,5, Martin Guilliams 10,11, Sarah A. Teichmann 4,12, Yvan Saeys 2,3, Tom Taghon 1,3

1Faculty of Medicine and Health Sciences, Department of Diagnostic Sciences, Ghent University, C.Heymanslaan 10, MRB2, Entrance 38, 9000 Ghent, Belgium 2Data Mining and Modeling for Biomedicine, VIB Center for Inflammation Research, Ghent, Belgium 3Cancer Research Institute Ghent (CRIG), Ghent University, Ghent, Belgium 4Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SA, UK 5Department of Biomolecular Medicine, Ghent University, Ghent, Belgium 6Klarman Cell Observatory, Broad Institute of Harvard and MIT, Cambridge, MA, USA. 7Data Sciences Platform, Broad Institute of Harvard and MIT, Cambridge, MA, USA. 8Haematology Department, Royal Victoria Infirmary, Newcastle-upon-Tyne Hospitals NHS Foundation Trust, Newcastle upon Tyne, UK. 9Howard Hughes Medical Institute, Koch Institute of Integrative Cancer Research, Department of Biology, Massachusetts Institute of Technology, Cambridge, MA, USA 10 Laboratory of Myeloid Cell Ontogeny and Functional Specialization, VIB Center for Inflammation Research, Ghent, Belgium 11 Faculty of Sciences, Department of Biomedical Molecular Biology, Ghent University, Ghent, Belgium 12 Theory of Condensed Matter Group, Cavendish Laboratory/Department of Physics, University of Cambridge, Cambridge CB3 0HE, UK

During postnatal life, thymopoiesis depends on the continuous colonization of the thymus by bone marrow-derived hematopoietic progenitors that migrate through the bloodstream. In humans, the nature of these thymus immigrants has remained unclear. Here, we employ single-cell RNA sequencing on approximately 70,000 CD34+ thymocytes to unravel the heterogeneity of human immature postnatal thymocytes. Integration of bone marrow and peripheral blood precursor datasets identifies several putative thymus seeding precursors that display heterogeneity for currently used surface markers, as revealed by CITE-seq. Besides T cell precursors, we discover branches of intrathymically developing dendritic cells, predominantly plasmacytoid DCs. Through trajectory inference, we delineate the transcriptional dynamics underlying early human T-lineage development, from which we predict transcription factor modules that drive stage-specific steps of human T cell development. Thus, our work resolves the heterogeneity of thymus seeding precursors in humans and reveals the molecular mechanisms that drive their in vivo cell fate.

P60

Single-cell molecular and cellular architecture of the mouse neurohypophysis

Dena Leshkowitz [1], Qiyu Chen [2], Refeal Kohen [1], Janna Blechman [2] and Gil Levkowitz [2]

Department of Life Sciences Core Facilities, Bioinformatics Unit [1] and Department of Molecular Cell Biology[2], Weizmann Institute of Science, Israel

The neurohypophysis (NH), located at the posterior lobe of the pituitary, is a major neuroendocrine tissue that mediates osmotic balance, blood pressure, reproduction, and lactation by releasing the neurohormones oxytocin and arginine-vasopressin from the brain into the peripheral blood circulation. However, despite the physiological importance of the NH, the exact molecular signature defining neurohypophyseal cell types, and in particular the pituicytes, remains unclear. Using 10x Chromium single-cell RNA sequencing, we captured seven distinct cell types in the NH and intermediate lobe (IL) of the adult male mouse. We revealed novel pituicyte markers showing higher specificity than those previously reported. Bioinformatics analysis demonstrated that the pituicyte is an astrocytic cell type whose transcriptome resembles that of the tanycyte. Single-molecule in situ hybridization revealed a spatial organization of the major cell types that implies intercellular communication. We present a comprehensive molecular and cellular characterization of neurohypophyseal cell types, serving as a valuable resource for further functional research [1]. In addition, we demonstrate that the sequence data can be used to identify single-nucleotide variation that can originate from RNA editing or rare DNA somatic mutations, thus complementing gene expression information. Towards this aim, we applied a novel bioinformatics pipeline to detect rare base substitution events. We found several candidates, among them several in the Malat1 gene. [1] Qiyu Chen, Dena Leshkowitz, Janna Blechman and Gil Levkowitz, Single-cell molecular and cellular architecture of the mouse neurohypophysis, Accepted to eNEURO

P61

Simultaneous electrophysiological and transcriptomic study of cell states in dopaminergic neurons

Marcela Lipovsek 1, Lorcan Browne 1, Darren Byrne 1, James Lipscombe 2, Iain Macaulay 2, Jonathan Mill 3 and Matthew Grubb 1

1, King's College London, 2, Earlham Institute, 3, University of Exeter

Dopaminergic (DA) neurons in the olfactory bulb regulate the transmission of information at the earliest stages of sensory processing, and are one of the few neuronal types in the mammalian brain that are continually generated throughout postnatal life. Here, we performed simultaneous electrophysiological recordings and single-cell RNA sequencing (Patch-Seq), coupled with immunohistochemical and birth-dating approaches, to ask whether this continuous neuronal production results in a gradient of cell states within the resident population. Birth-dating in 4-week-old DAT-IRES-Cre/Floxed-tdT mice revealed that resident DA neurons span an age range of at least 3 weeks. We next collected individual DA neurons by either manual sorting of tdT-positive DA neurons, or aspiration after patch-clamp recordings in acute slices. We performed deep single-cell RNA sequencing using the Smart-Seq2 protocol. Consensus clustering identified 3 putative subpopulations of DA neurons, while cell trajectory analysis identified a single, unbranched trajectory that closely matched the clusters. Differential gene expression analysis revealed 680 differentially expressed genes, significantly enriched for GO terms related to neuronal and synaptic function, indicating that the identified trajectory may reflect a transcriptional maturational gradient. Ongoing analysis of electrophysiological properties along the identified trajectory will reveal whether it describes a gradient of functional states. In summary, we are exploring a hitherto unanticipated gradient of cell state within a specific neuronal subtype that could underpin the functional maturation of DA cells in the postnatal brain.

P62

Meiosis and recombination in wheat using G&T Seq

Ashleigh Lister, Ned Peel, Graham Etherington, Azahara Martin, María-Dolores Rey, Graham Moore, and Iain Macaulay

Earlham Institute, Norwich Research Park, Colney Lane, Norwich, NR4 7UZ; Norwich Research Park, Colney Lane, Norwich, NR4 7UH; Plant Biochemistry and Proteomics Research Group, Department of Biochemistry and Molecular Biology, University of Cordoba, Andalusia, Spain

Single-cell sequencing holds enormous potential in plant biology - notably in the study of meiotic recombination that underlies the generation of genetic diversity in crop breeding. In many plant species, including crops such as wheat, cross-overs are commonly sub-telomeric, limiting the probability of certain genetic traits combining. Wheat is of major nutritional, agricultural, economic and academic importance and there are significant efforts underway to increase recombination rates and unlock potentially beneficial traits. We have applied single-cell genome and transcriptome sequencing (G&T-seq) to individual pollen precursor cells (meiocytes) to assess the viability of a single-cell recombination readout. In order to map natural recombination patterns as well as the effects of mutation and other environmental stimulation, there is a requirement for a high-throughput sequencing assay using individual plants. Separation of single meiocytes was carried out by Fluorescence-Activated Cell Sorting (FACS) followed by G&T-seq. This involves physical separation of the mRNA and DNA from each sorted cell, followed by amplification using a modified Smart-seq2 protocol (mRNA) and multiple displacement amplification (DNA), and sequencing on Illumina platforms. Genomic reads are then used to assess genome coverage and subsequently recombination rates, while the transcriptomic data can be used to link genomic recombination patterns to meiotic progression. We demonstrate the feasibility of the G&T-seq method in plant single-cell analysis and the potential to analyse recombination rates in important crops at high throughput, by applying the method to the wheat cultivar Chinese Spring. The approaches we are developing may also be applicable across a wide spectrum of living systems, including other agriculturally important species.

P63

Landscape and Dynamics of Single Cells in Nasopharyngeal Carcinoma

Yang Liu, Xiliang Wang, Shuai He, Wan Peng, Jinxin Bei.

Sun Yat-sen University Cancer Center

Cancer immunotherapies have shown promising activity in early phase clinical trials in nasopharyngeal carcinoma (NPC), yet detailed knowledge of immune cell phenotypes in the tumor microenvironment (TME) remains limited. To characterize the TME of NPC, we profiled 185,368 immune cells and 2,787 epithelial cells from ten NPC tissues and matched peripheral blood, using single-cell RNA sequencing (scRNA-seq) along with paired T cell receptor (TCR) sequencing. We identified multiple immune cell phenotypes for T (CD4 and CD8) and B lymphocytes, natural killer (NK) cells, and myeloid cells. By TCR tracing, we found that regulatory T (Treg) cells migrated from blood to tumor and then proliferated locally to maintain the population of tumor Treg cells. Tumor Treg cells showed large diversity in gene profiles, activation stages and immunosuppressive functions. We also found a novel cluster of LAMP3+ DC cells with upregulation of HLA-class II genes, downregulation of HLA-class Ⅰ genes, and high expression of immunosuppressive genes such as PDL1. We also predicted interactions between cell populations via specific ligand-receptor binding and generated a potential cellular communication network in the peripheral blood and TME of NPC patients, especially among EBV-infected epithelial cells, myeloid cells, and T cells. In summary, we have leveraged scRNA-seq to refine our understanding of the relative abundance, diversity, and complexity of the immune landscape of NPC. This report represents the first characterization of the NPC immune landscape using scRNA-seq. With further characterization and functional validation, these findings may identify novel subpopulations of immune cells amenable to therapeutic intervention.

P64

A single-cell resolution spatial roadmap of early mouse development using seqFISH

Tim Lohoff1,2, Shila Ghazanfar3,4, Noushin Koulena5, Nico Pierson5, Jonathan Griffiths4, Bertie Göttgens1, Long Cai5, John Marioni3,4,6, Jenny Nichols1 and Wolf Reik1,2,6

1 Wellcome-MRC Cambridge Stem Cell Institute, University of Cambridge, CB2 0AW, UK 2 Epigenetics Programme, Babraham Institute, Cambridge CB22 3AT, UK 3 European Bioinformatics Institute (EMBL-EBI), Hinxton, CB10 1SD, UK 4 Cancer Research UK Cambridge Institute, University of Cambridge, CB2 0RE, UK 5 Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, CA, USA 6 Wellcome Trust Sanger Institute, Hinxton, Cambridge CB10 1SA, UK

Following implantation, the mammalian embryo specifies the epiblast precursors required for the formation of the tissues of the foetus. At the exit from pluripotency, global epigenetic and transcriptional remodelling occurs. These changes are highly dynamic and known to be essential for gastrulation, the process by which all three germ layers - ectoderm, mesoderm, and endoderm - are specified. Research to date suggests that signals from the surrounding tissues, mechanical constraints, and transcriptional and epigenetic changes are all factors potentially involved in lineage priming and cell fate specification. However, the precise molecular mechanisms that control cell fate decisions are poorly understood. Recent advances in single-cell sequencing technologies have allowed the characterisation of the transcriptional and epigenetic changes during mouse gastrulation and early organogenesis. However, a major limitation of these approaches is that spatial information is difficult to reconstruct. To characterise the roles of both the intrinsic regulatory network and the spatial environment in cell fate specification, we apply a novel image-based single-cell transcriptomics method, seqFISH, to precisely measure mRNA abundance of about 400 selected target genes in early mouse embryo tissue sections. We aim to integrate the imaging-based single-cell transcriptomic profiles with matching single-cell nucleosome, methylation and transcriptome sequencing (scNMT-seq) and 10x scRNA-seq data to combine whole-transcriptome, DNA methylation and DNA accessibility data with spatial information. The joint inference will allow us to answer biological questions related to cellular position, signalling gradients and cell-cell contact, and thereby powerfully combine the benefits of these technologies. In summary, this work provides a detailed study of the relationship between spatial positioning and cell fate in early mammalian embryogenesis.

P65

Conditional out-of-sample generation for unpaired data using trVAE

Mohammad Lotfollahi, Mohsen Naghipourfar, Fabian J. Theis, F. Alexander Wolf

Institute of Computational Biology, Helmholtz Center Munich, Neuherberg, Germany.

While generative models have shown great success in generating high-dimensional samples conditional on low-dimensional descriptors (learning e.g. stroke thickness in MNIST, hair color in CelebA, or speaker identity in Wavenet), their generation out-of-sample poses fundamental problems. The conditional variational autoencoder (CVAE), as a simple conditional generative model, does not explicitly relate conditions during training and, hence, has no incentive to learn a compact joint distribution across conditions. We overcome this limitation by matching the distributions across conditions using maximum mean discrepancy (MMD) in the decoder layer that follows the bottleneck. This introduces a strong regularization both for reconstructing samples within the same condition and for transforming samples across conditions, resulting in much-improved generalization. We refer to the architecture as transformer VAE (trVAE). Benchmarking trVAE on high-dimensional image and tabular data, we demonstrate higher robustness and higher accuracy than existing approaches. In particular, we show qualitatively improved predictions for cellular perturbation response to treatment and disease based on high-dimensional single-cell gene expression data, by tackling previously problematic minority classes and multiple conditions. For generic tasks, we improve Pearson correlations of high-dimensional estimated means and variances with their ground truths from 0.89 to 0.97 and 0.75 to 0.87, respectively.
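
As an illustration of the MMD term mentioned above, the following is a minimal sketch of a squared-MMD estimate between two sets of samples, using a Gaussian (RBF) kernel. It is not the trVAE implementation: the kernel choice, the bandwidth parameter `gamma`, the function names and the toy Gaussian data are all assumptions made for this example. In the setting described in the abstract, the two inputs would be decoder-layer activations for cells from two conditions.

```python
import numpy as np

def rbf_kernel(x, y, gamma=1.0):
    """Gaussian kernel matrix k(x_i, y_j) = exp(-gamma * ||x_i - y_j||^2)."""
    sq_dists = (
        np.sum(x**2, axis=1)[:, None]
        + np.sum(y**2, axis=1)[None, :]
        - 2.0 * x @ y.T
    )
    # Clip tiny negative values caused by floating-point round-off.
    return np.exp(-gamma * np.maximum(sq_dists, 0.0))

def mmd2(x, y, gamma=1.0):
    """Biased estimate of the squared MMD between samples x and y."""
    return (
        rbf_kernel(x, x, gamma).mean()
        + rbf_kernel(y, y, gamma).mean()
        - 2.0 * rbf_kernel(x, y, gamma).mean()
    )

# Toy data (hypothetical): two draws from the same distribution, one shifted.
rng = np.random.default_rng(0)
control = rng.normal(0.0, 1.0, size=(200, 5))
same = rng.normal(0.0, 1.0, size=(200, 5))
shifted = rng.normal(2.0, 1.0, size=(200, 5))

mmd_same = mmd2(control, same, gamma=0.1)
mmd_shifted = mmd2(control, shifted, gamma=0.1)
print(f"same distribution: {mmd_same:.4f}, shifted: {mmd_shifted:.4f}")
```

The estimate is close to zero when both samples come from the same distribution and grows as they diverge, which is why penalising it can encourage representations that are shared across conditions.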

P66

Cell Competition in mouse embryo

Gabriele Lubatti (1,2,3), Ana Lima (4), Di Hu (5), Shankar Srinivas (5), Tristan Rodriguez (4), Antonio Scialdone (1,2,3)

(1) Institute of Epigenetics and Stem Cells, Helmholtz Zentrum Munich, Munich, Germany (2) Institute of Computational Biology, Helmholtz Zentrum Munich, Munich, Germany (3) Institute of Functional Epigenetics, Helmholtz Zentrum Munich, Munich, Germany (4) National Heart and Lung Institute, Imperial College London, Hammersmith Hospital Campus, London, UK (5) Department of Physiology Anatomy & Genetics, University of Oxford, Oxford, UK

Cell competition is a biological process whereby cells eliminate their less fit neighbours [1] [2]. It has myriad positive roles in the organism: it selects against mutant cells in developing tissues, prevents the propagation of oncogenic cells and eliminates damaged cells during ageing. Although it was first characterized in Drosophila [3], the transcriptional features of cells eliminated through competition and the roles of cell competition during mammalian development are currently unclear. We analysed single-cell transcriptomic data from mouse embryos around the time gastrulation starts (stage E6.5) in which apoptosis was inhibited. We show that in these embryos a new population of epiblast cells emerges, expressing previously characterized markers of cell competition [4]. Our analysis also identifies additional features of eliminated cells, including disrupted mitochondrial activity, which we validate in vivo. Moreover, by using physical modelling, we show that cell competition might play a role in the regulation of embryo size, which could be particularly important around gastrulation [5]. [1] A. Di Gregorio et al., Developmental Cell, Volume 38, 621-634 (2016). [2] S. Bowling et al., Development, (2019). [3] G. Morata et al., Dev. Biol, Volume 42, 211-221, (1975). [4] S. Bowling et al., Nature Communications, (2018). [5] Y. Kojima et al., Seminars in cell and developmental biology, (2014).

P67

Benchmarking single-cell genomics data integration

Malte D Luecken, Maren Büttner, Kridsadakorn Chaichoompu, Anna Danese, Marta Interlandi, Michaela Mueller, Daniel Strobl, Maria Colomé-Tatché, Fabian J Theis

Helmholtz Center Munich

Broad commercial availability has enabled single-cell genomics to move from generating cell maps to cell atlases. Atlases include multiple samples, generated across conditions, often involving multiple labs. These experimental designs lead to complex, nested batch effects in the data, which motivated the development of data integration methods to enable the joint analysis of atlas datasets.

Comparing data integration methods is a challenge due to varying output data representations, and the difficulty of defining success. We have benchmarked 8 data integration methods on 60 batches of gene expression and chromatin accessibility data taken from 21 publications, which we distribute among six integration tasks. Data from each publication was individually curated to ensure we can define successful batch effect removal. Our integration tasks span a variety of batch effect contributors such as individuals, species, protocol, data modality, organ, and experimental lab, and each task revolves around a tissue or organ. Data integration methods are evaluated on scalability, usability, and their ability to remove batch effects while retaining biological variation. As data integration methods differ in the expected pre-processing of the input data, we also assess the contribution of pre-processing decisions to the performance of these methods.
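
To make the trade-off between batch effect removal and retention of biological variation concrete, the sketch below scores a low-dimensional embedding with two simple nearest-neighbour measures. This is not the scib API and these are not necessarily the metrics used in the benchmark: the function names, the choice of k and the toy data are assumptions for illustration only.

```python
import numpy as np

def knn_indices(embedding, k):
    """Indices of the k nearest neighbours of each row, excluding the row itself."""
    sq = np.sum(embedding**2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * embedding @ embedding.T
    np.fill_diagonal(d2, np.inf)
    return np.argsort(d2, axis=1)[:, :k]

def neighbour_agreement(embedding, labels, k=15):
    """Mean fraction of each cell's k neighbours that share its label."""
    nn = knn_indices(embedding, k)
    return float(np.mean(labels[nn] == labels[:, None]))

# Toy data (hypothetical): two cell types, two batches, 100 cells per combination.
rng = np.random.default_rng(0)
n = 100
celltype = np.repeat([0, 1, 0, 1], n)
batch = np.repeat([0, 0, 1, 1], n)
centres = np.array([[0.0, 0.0], [10.0, 0.0]])[celltype]
batch_shift = np.array([[0.0, 0.0], [0.0, 5.0]])[batch]
noise = rng.normal(0.0, 1.0, size=(4 * n, 2))

uncorrected = centres + batch_shift + noise
corrected = centres + noise  # stands in for an idealised integration result

# Batch mixing: high when neighbours come from different batches.
mixing_uncorrected = 1.0 - neighbour_agreement(uncorrected, batch)
mixing_corrected = 1.0 - neighbour_agreement(corrected, batch)
# Biological conservation: high when neighbours share the cell-type label.
bio_uncorrected = neighbour_agreement(uncorrected, celltype)
bio_corrected = neighbour_agreement(corrected, celltype)
print(mixing_uncorrected, mixing_corrected, bio_uncorrected, bio_corrected)
```

A method that simply scrambled the embedding would also score well on batch mixing, so a mixing score is only informative when reported alongside a conservation score.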

We find that highly variable gene selection improves the performance of data integration methods. Furthermore, integration methods that output corrected low-dimensional embeddings tend to perform better than methods that output corrected expression matrices. These methods also have an advantage in scaling to large cell numbers. Compositional imbalance between batches particularly affects integration of open chromatin data, where selecting overlapping features becomes challenging. Our analysis code is available as a reproducible Python module that can be used to benchmark future data integration methods. The module is freely available at https://github.com/theislab/scib and will contribute to improved tool development for data integration. We envision that our results will guide users in choosing the optimal integration method and pre-processing for their own data.

P68

Single-cell Multiomics with Partek® Flow®

Ivan K. Lukic, Alison Hargreaves, Simit Patel

Partek, Inc.

Over the past few years the popularity of single-cell based next-generation sequencing has been expanding at an exponential rate, as it provides a unique insight into omics at the level of an individual cell. At the same time, the increase in the number of samples, cells and their biological features, as well as assays, has been mirrored by the growing complexity of data analysis and the number of available computational tools. Our solution to that challenge is Partek Flow, an all-in-one, versatile and flexible software platform for the analysis of multiomics data sets, which makes data analysis simple and straightforward. Partek Flow supports different types of data files, generated by all the major vendors, and does not require knowledge of programming or a background in bioinformatics. All the steps are performed using a point-and-click interface with a context-sensitive toolbox that guides the user down the pipeline. The analysis is performed in a visual way, enabling biologists to fully interrogate their data and get answers to their questions. The presentation will be based on a live demo of Partek Flow, and will go over different stages of single-cell pipelines, such as quality control, data processing, dimensionality reduction and data visualisation, identification of cell populations, detection of transcriptomic signatures, pathway interpretation, trajectory analysis, and tissue transcriptomics. Moreover, it will also illustrate joint analysis of various tiers of omics.

P69

Stimulation strength controls the rate of initiation but not the molecular organization of the TCR-induced signalling network

Claire Y. Ma (1), John C. Marioni (2,3,4)*, Gillian M. Griffiths (1)*, Arianne C. Richard (1,2)*

1. Cambridge Institute for Medical Research, University of Cambridge, Cambridge, UK 2. Cancer Research UK Cambridge Institute, University of Cambridge, Cambridge, UK 3. EMBL-European Bioinformatics Institute, Wellcome Genome Campus, Cambridge, UK 4. Wellcome Sanger Institute, Wellcome Genome Campus, Cambridge, UK * Co-corresponding authors

Cytotoxic T lymphocytes (CTLs) play a key role in the cell-mediated immune response against virally-infected and tumourigenic cells, killing their targets through the release of cytolytic granules. T cell receptor (TCR) recognition of foreign peptides presented by Class I MHC molecules stimulates naïve CD8+ T cell differentiation to effector cells and triggers the cytolytic activity of effector CTLs. Signal transduction downstream of the TCR is a highly diverse and coordinated network of post-translational protein modifications that ultimately drive transcriptional, translational, metabolic and cytoskeletal changes in the cell. How cytotoxic T cells coordinate their signalling machinery in response to strong versus weak stimuli remains unknown. In naïve T cells, we previously showed that ligand strength controls the rate of transcriptional activation of CD8+ T cells, but all activated cells achieved cytolytic capacity. Here we studied how ligand strength affects the coordination of proximal T cell signalling responses, through the use of multi-dimensional measurements of phospho- and total-protein signalling molecules in single cells.

We used mass cytometry to simultaneously study multiple signalling events at a single-cell resolution. We designed a custom panel of 22 metal-conjugated antibodies, which probed surface receptors and key elements of major signalling pathways. We used this panel to profile naïve transgenic CD8+ T cells stimulated with ligands of differing potencies. Our data revealed that stimulation strength exerted a differential effect on the rate with which cells commenced activation of key signalling molecules, but the programme of signalling events was tightly conserved. Through simultaneous phosphoprotein and RNA flow cytometry, we also observed conservation of the relationship between activation of transcriptional and translational pathways. These data indicate that TCR-induced signalling results in a single coordinated activation program, modulated in rate but not organization by stimulation strength.

P70

Robust multi-sample, multi-celltype, multivariate quality control for single cell RNA-seq

Will Macnair, Mark Robinson

Institute of Molecular Life Sciences, University of Zürich

Quality control (QC) is a critical component of single cell RNA-seq processing pipelines. As the size and sophistication of such experiments increase, assessing the quality of samples, in addition to the quality of cells, also becomes important. We present SampleQC, a method for quality control at both cell and sample levels. SampleQC is based on robust fitting of mixture models to quality control metrics for the cells in each sample, then fitting a likelihood function to the statistics learned for each sample within an experiment. This allows researchers to remove both cells and samples of low quality. At the cell level, SampleQC provides improved sensitivity and reduced bias relative to current industry standard approaches, such as scater, via a robust Gaussian mixture model (robust in the sense that it is less sensitive to outliers). The industry standard approaches to single cell QC may introduce biases, where celltypes with extremely large or small QC metric distributions are preferentially excluded (for example, those with naturally smaller library sizes). By fitting a distribution to each celltype individually, SampleQC avoids this bias, permitting better characterization of all cell subpopulations in downstream analysis. At the sample level, SampleQC identifies samples with unusual QC statistics, such as higher mean library size than observed in other samples, or unusually low correlation between library size and number of features. SampleQC estimates a likelihood for each sample based on the calculated statistics, visualizes the sample statistics, and combines these to flag abnormal samples for experimenters to review and potentially exclude. We demonstrate SampleQC on several complex datasets comprising up to 1M cells with multiple celltypes. Single cell RNA-seq datasets now commonly comprise hundreds of thousands of cells and dozens of samples. SampleQC provides a sensitive and unbiased approach to extracting the best quality data from such large and important experiments.
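
The bias described above can be reproduced on toy data. The sketch below is not SampleQC and does not fit a mixture model: it applies a simple median-absolute-deviation (MAD) outlier rule to log library sizes, once across all cells and once within each celltype, with celltype labels assumed to be known. The threshold of 3 MADs, the function name and the simulated values are assumptions for this example.

```python
import numpy as np

def mad_outliers(values, n_mads=3.0):
    """Flag values more than n_mads scaled MADs below the median."""
    med = np.median(values)
    mad = 1.4826 * np.median(np.abs(values - med))
    return values < med - n_mads * mad

# Toy log10 library sizes (hypothetical): an abundant celltype with large
# libraries and a rarer celltype with naturally smaller libraries.
rng = np.random.default_rng(1)
big = rng.normal(4.0, 0.1, size=900)
small = rng.normal(3.2, 0.1, size=100)
log_lib = np.concatenate([big, small])
celltype = np.array(["big"] * 900 + ["small"] * 100)

# One global threshold for all cells.
global_flag = mad_outliers(log_lib)

# One threshold per celltype.
per_type_flag = np.zeros_like(global_flag)
for ct in np.unique(celltype):
    mask = celltype == ct
    per_type_flag[mask] = mad_outliers(log_lib[mask])

is_small = celltype == "small"
frac_small_lost_global = global_flag[is_small].mean()
frac_small_lost_per_type = per_type_flag[is_small].mean()
print(frac_small_lost_global, frac_small_lost_per_type)
```

With a single global threshold, nearly all cells of the small-library celltype are discarded even though they are healthy by construction, whereas the per-celltype rule retains almost all of them.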

P71

No more paywalls: cost-benefit analysis across scRNA-seq platforms reveals biological insight is reproducible at low sequencing depths

Kathryn S. McClelland, Oswaldo A. Lozoya*, Suzanne N. Martos, Brian N. Papas, Jian-Liang Li, Douglas A. Bell

National Institute of Diabetes and Digestive and Kidney Diseases (NIDDK/NIH), National Institute of Environmental and Health Sciences (NIEHS/NIH)

The major hurdle that single-cell RNA-seq (scRNA-seq) technologies face in reaching mainstream status is cost - including money, computational footprint, and statistical effort. In trying to disseminate these technologies, the usual concern pertains to representation: how many reads must be produced to capture an informative picture of single-cell transcriptomes? Instead, we approach scRNA-seq optimization from a different question: does deeper sequencing improve data sparsity? Here, we show the answer is no. First, we produced matched scRNA-seq data at various sequencing depths in different systems (Illumina and IonTorrent), from scRNA-seq libraries assembled with different technologies (10X Genomics and sci-RNA-seq), across biological models (human PBMCs and mouse embryonic kidney stroma), and each with at least three independent replicates of over 10,000 cells per specimen. Then, we implemented expression matrix focusing and SALSA (doi: 10.1101/551762) to compare gene detection rates, barcode re-incidence, and clustering reproducibility within each specimen at increasing sequencing depths. Overall, high-depth sequencing (NovaSeq, NextSeq) detected over 4x more barcodes than low-depth (MiSeq, Ion 530) for droplet-based technologies (10X Genomics), and over 3x more compounded barcodes for combinatorial-indexing techniques (sci-RNA-seq). Yet most barcodes added at high depths represented ambient RNA or gDNA debris; in contrast, all barcodes scoring as single cells at low sequencing depths were rescored as such at higher ones. Newly detected UMIs at high-depth sequencing aligned to constitutive genes detected in all barcodes or rare transcripts from discarded barcodes. Also, markers for single-cell clusters lost statistical support during differential expression analyses at high sequencing depths. These results confirm that over-sequencing of scRNA-seq libraries provides no benefit regarding data sparsity, and instead admits higher rates of "false" single-cell barcodes and transcripts as more UMI-appended debris is read out. In sum, statistical insight from scRNA-seq data tracks with library complexity regardless of scRNA-seq technique or sequencing platform. Our findings posit a new paradigm to extract reproducible biological insight from scRNA-seq experiments in which minimal (and inexpensive) sequencing depths, with as many cells supplied per assay as possible, are always best. (* KSM and OAL joint first author).
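
The link between sequencing depth, library complexity and gene detection can be illustrated with a toy saturation model. The sketch below is not the authors' pipeline (it does not use expression matrix focusing or SALSA) and makes a strong simplifying assumption: every captured molecule receives a Poisson-distributed number of reads, so the probability of observing it at least once is 1 - exp(-depth). The function names and simulated library are hypothetical.

```python
import numpy as np

def detect_molecules(molecules, reads_per_molecule, rng):
    """Sample how many captured molecules are seen at least once at a given depth."""
    p_seen = 1.0 - np.exp(-reads_per_molecule)
    return rng.binomial(molecules, p_seen)

def genes_detected(counts):
    """Number of genes with at least one observed molecule in each cell (rows are cells)."""
    return (counts > 0).sum(axis=1)

# Toy library (hypothetical): 50 cells by 2000 genes with a skewed expression profile.
rng = np.random.default_rng(2)
gene_means = rng.gamma(shape=0.3, scale=2.0, size=2000)
molecules = rng.poisson(gene_means, size=(50, 2000))

# Mean genes detected per cell at increasing average reads per molecule.
detected = {}
for depth in (0.5, 1.0, 2.0, 4.0, 8.0):
    observed = detect_molecules(molecules, depth, rng)
    detected[depth] = genes_detected(observed).mean()
    print(f"{depth:>4} reads/molecule: {detected[depth]:.1f} genes per cell")

ceiling = genes_detected(molecules).mean()
print(f"ceiling set by library complexity: {ceiling:.1f} genes per cell")
```

Under this model, doubling depth from 4 to 8 reads per molecule adds far fewer detected genes than doubling from 0.5 to 1, and detection can never exceed what was captured in the library.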

P72

The effects of interleukin-36 signalling on circulating immune populations

Dan McCluskey, Christian Wohnhaas, Meera Ramanujam, Sudha Visvanathan, Catherine H Smith, Patrick Baum, Francesca Capon

St John’s Institute of Dermatology, Guy’s Hospital, King’s College London, UK. Translational Medicine and Clinical Pharmacology, Boehringer Ingelheim, Biberach, Germany.

The IL-36 cytokine family consists of three pro-inflammatory agonists (IL-36α, IL-36β, IL-36γ) and an antagonist (IL-36Ra). While activation of the IL-36 receptor up-regulates innate immune defences, abnormal IL-36 signalling causes generalised pustular psoriasis, a severe disease presenting with systemic upset. IL-36 expression is also elevated in other conditions manifesting with epithelial and systemic involvement, such as systemic lupus erythematosus and inflammatory bowel disease. While these observations suggest that IL-36 cytokines contribute to systemic immunity, most studies have focused on their function at barrier organs. To address this research gap, we investigated the effects of IL-36γ on circulating leukocytes obtained from healthy volunteers. Specifically, we carried out single-cell RNA sequencing of peripheral blood mononuclear cells (n=5 donors) stimulated with the cytokine. Clustering and annotation of cells that passed quality control identified all major immune subsets. Importantly, all were present in equal proportions in treated and untreated cells. Differential expression analysis showed that the largest number of IL-36 responsive genes was found within CD14+ monocytes (n=150), followed by natural killer cells (n=15) and CD16+ monocytes (n=10). Further analysis of the genes up-regulated in CD14+ monocytes revealed the enrichment of multiple pathways related to IL-17, IL-8 and IL-10 signalling. These results demonstrate a potent activation of the main monocyte subset by IL-36γ, implicating this population as a key player in IL-36-mediated systemic inflammation.

P73

Advances and Applications of Single-Cell Genomics Sample Multiplexing Technology

Chris McGinnis, Sisi Chen, Danny Conrad, Hikaru Miyazaki, Tiffany Tsou, Tahmineh Khazaei, Paul Rivaud, Benjamin Hoscheit, Jong Park, Eric Chow, Matt Thomson, Zev Gartner

1. Department of Pharmaceutical Chemistry, UCSF, San Francisco, CA 2. Division of Biology and Biological Engineering, CalTech, Pasadena, CA 3. Beckman Center for Single-Cell Profiling and Engineering, CalTech, Pasadena, CA 4. Department of Biochemistry and Biophysics, UCSF, San Francisco, CA 5. Center for Advanced Technology, UCSF, San Francisco, CA 6. Helen Diller Family Comprehensive Cancer Center, San Francisco, CA 7. Chan Zuckerberg BioHub, UCSF, San Francisco, CA 8. Center for Cellular Construction, UCSF, San Francisco, CA

Sample multiplexing technologies are actively expanding the boundaries of experimental feasibility for single-cell genomics. We recently described a method for single-cell RNA-sequencing sample multiplexing using lipid-tagged indices (MULTI-seq), which enables distinct samples to be processed in a pooled format by labeling cell and/or nuclear membranes with sample-specific DNA barcodes prior to single-cell isolation. To date, only sample multiplexing approaches for scRNA-seq have been documented, and these methods have largely been applied in a limited capacity to simple in vitro systems.
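
Once cells carry sample-specific barcodes, each cell must be assigned back to its sample of origin from its barcode counts. The sketch below shows one naive way this could be done; it is not the MULTI-seq classification workflow, which is not described in this abstract. The function name, the `min_count` and `doublet_ratio` thresholds and the example counts are all hypothetical.

```python
import numpy as np

def classify_cells(barcode_counts, min_count=20, doublet_ratio=0.5):
    """Assign each cell (row) to a sample (column) from its sample-barcode counts.

    Returns the sample index, -1 for 'negative' (too few barcode counts) or
    -2 for 'doublet' (second-best barcode nearly as abundant as the best).
    """
    order = np.argsort(barcode_counts, axis=1)
    rows = np.arange(barcode_counts.shape[0])
    best = order[:, -1]
    top = barcode_counts[rows, best]
    second = barcode_counts[rows, order[:, -2]]
    calls = best.copy()
    calls[second > doublet_ratio * top] = -2
    calls[top < min_count] = -1  # applied last so weak cells are never called doublets
    return calls

# Toy cells-by-barcodes count matrix (hypothetical values).
counts = np.array([
    [250, 3, 5],     # one dominant barcode: sample 0
    [4, 2, 310],     # one dominant barcode: sample 2
    [180, 160, 6],   # two strong barcodes: likely doublet
    [2, 1, 3],       # almost no barcode signal: negative
])
calls = classify_cells(counts)
print(calls)
```

Fixed thresholds such as these would need tuning for each experiment, since barcode labelling efficiency can differ between samples.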

Here, we describe new advances and applications of the MULTI-seq technology, organized into three vignettes. First, we outline the development and benchmarking of the first multiplexing approach for scATAC-seq assays using the 10x Genomics system: MULTI-ATAC-seq. Second, we used MULTI-seq to perform the largest ever scRNA-seq-coupled drug screen, wherein we studied the impact of 300 immunomodulatory compounds on human peripheral blood mononuclear cells (PBMCs) under resting and activated (e.g., with CD3/CD28 agonists) conditions. Our PBMC drug screen data illustrate how single-cell read-outs improve sensitivity for hit identification relative to bulk assays, while identifying previously unknown influences of drugs on PBMC system composition and cell-cell interaction networks. Finally, we applied MULTI-seq for meso-scale (i.e., mm-cm) spatial transcriptomic analysis of the developing murine small intestine, through which we uncovered previously unknown regulators of gut villification.

P74

Adult human haematopoietic stem and progenitor cell landscapes differ between medullary and extramedullary sites.

Nicole Mende1* and Hugo P Bastos1*, Antonella Santoro1, Krishnaa Mahbubani2, Abbie Curd2, Nicola K Wilson1, Bertie Göttgens1, Kourosh Saeb-Parsy2 and Elisa Laurenti1

1 Department of Haematology and Wellcome-MRC Cambridge Stem Cell Institute, University of Cambridge, Cambridge, UK. 2 Department of Surgery and Cambridge NIHR Biomedical Research Centre, Biomedical Campus, University of Cambridge, Cambridge, UK. *equal contribution

In adults, most hematopoietic stem and progenitor cells (HSPCs) reside within the bone marrow (BM), giving rise to all mature blood cells. It is known that other anatomical sites can contribute significantly to blood production under stress conditions. However, the cellular, molecular and functional composition of extramedullary HSPC pools remains unexplored, in particular at steady state. Here, we comprehensively characterized the single-cell transcriptome of the adult human HSPC pool within paired BM, spleen and peripheral blood (PB) from two organ donors. Using 10x scRNA-seq of 30,000 HSPCs, we identified the hematopoietic landscape in all three tissues, containing transcriptionally distinct cell clusters corresponding to quiescent hematopoietic stem cells and multipotent progenitors (HSC/MPPs) and precursors of all haematopoietic lineage branches. Interestingly, although most clusters were present in all organs, their proportions and molecular regulation differed significantly between tissues. For example, megakaryocyte-committed progenitors were almost exclusively found in BM, and BM-specific megakaryocyte priming was detected early on during hematopoietic differentiation. In contrast, early progenitors within the megakaryocyte-erythroid branch were more abundant at extramedullary sites, but cycled significantly less than their BM equivalents. Furthermore, we show that splenic HSPCs are able to engraft in xenograft mouse models, and identify HSC/MPP clusters unique to extramedullary sites. In line with this, using single-cell differentiation assays, we demonstrate that phenotypic HSC/MPPs show distinct differentiation patterns between organs, such as an increased erythroid output of single HSC/MPPs from spleen compared to BM. Altogether, our data provide evidence that different organs host HSC subsets with distinct molecular make-ups and lineage differentiation capacities, giving rise to distinct topologies of the hematopoietic hierarchy.

P75

Profiling human lung development using scRNA-seq and ATAC-seq

Kerstin B Meyer1, Peng He1, Dawei Sun2, Kyungtae Lim2, Lira Mamanova1, Liam Bolt1, Emma Rawlins2

1. Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, Saffron Walden CB10 1SA, UK 2. , Cambridge, Tennis Court Rd, Cambridge CB2 1QN, UK

Single cell sequencing (scRNA-seq) has allowed the dissection of the cellular landscape of tissues and developing organs at an unprecedented resolution. Here we present preliminary results of our analysis of human foetal lungs taken at 5-22 pcw (post conception weeks). We have generated both ATAC-seq and 10X 5' scRNAseq data. To date our data set comprises approximately 75,000 cells, representing a range of cell types. Mesenchymal/fibroblast cells are especially abundant at early developmental time points. We identify a range of epithelial cells including ciliated, secretory and neuroendocrine cells from the airway, as well as alveolar cells. Endothelial cell populations from both lymphatic and vascular vessels are detected as well as different populations of muscle cells. Amongst the immune cell populations, macrophages are the most prominent cell type at the early stages, monocytes increase with time and from week 15 we also detect B and T lymphocytes and dendritic cells. Further sub-clustering and immune repertoire analyses together with integration of ATAC-seq data are under way.

P76

A lineage resolved kidney development atlas at the transcriptional and epigenetic level

Zhen Miao1, Michael Balzer2, Junnan Wu2, Ziyuan Ma2, Tamas Aranyi2, Hongbo Liu2, Mingyao Li3, Katalin Susztak2, †

1Graduate Group in Genomics and Computational Biology, University of Pennsylvania Perelman School of Medicine, Philadelphia, PA 19104, USA; 2Department of Medicine and Genetics, University of Pennsylvania, Philadelphia, PA 19104, USA; 3Department of Biostatistics, Epidemiology and Informatics, University of Pennsylvania Perelman School of Medicine, Philadelphia, PA 19104, USA. †Corresponding author

There are more than 30 different cell types in the kidney and their delicate balance is critical for metabolite and electrolyte balance. However, our understanding of cell-type specification during kidney development at single-cell resolution is still limited. To characterize the epigenetic and transcriptomic dynamics of kidney development, we conducted single-cell RNA sequencing (scRNA-seq) and single-nucleus ATAC sequencing (snATAC-seq) of kidneys from developing and adult mice. After quality control, we obtained 66,254 scRNA-seq and 28,316 snATAC-seq profiles. Through clustering analysis with batch effect removal, we identified all major cell types in the kidney, including more than 10 renal epithelial cell types differentiated from a common nephron progenitor. To study cell-type specification of renal epithelial cells, we conducted trajectory analysis using scRNA-seq and snATAC-seq data, and resolved the developmental pseudotime along cell differentiation from nephron progenitors to podocytes, proximal tubules, and other epithelial cells. We found that progenitor cells have more open chromatin than differentiated cells. In addition, trajectory patterns based on scRNA-seq data and snATAC-seq data showed high consistency, with podocytes being close to nephron progenitors in both trajectories. To study how the cell differentiation program is driven by transcription factors (TFs) and secreted factors, we conducted differential expression and accessibility analysis and identified key TFs and secreted factors that change along cell specification. For example, along podocyte differentiation, regions containing the Wt1 motif gain accessibility. Interestingly, although TF expression levels and the accessibility of their corresponding motifs showed consistent trends, we observed discordant cell state dynamics between gene transcription and chromatin accessibility, in which some genes that are no longer expressed after a certain developmental time remain accessible.
For example, Jag1 is not expressed in differentiated proximal tubules, but regions around the transcription start site of Jag1 remain accessible. Such discordance indicates asynchrony between transcription and epigenetic state at the single-cell level. In addition, we observed that some kidney disease-associated loci, such as those in the vicinity of Uncx, are only transcribed in developing cells but are accessible in both developing and adult states, indicating that some disease-associated genes only affect certain developmental stages. In summary, single-cell studies of the kidney provide a promising way to understand the mechanism of gene regulation during development and its relationship with complex human diseases.

P77

Integrated short and long read single cell RNA-sequencing reveals alternative splicing events in hematopoietic stem and progenitor cells

Laura Mincarelli, Vladimir Uzun, Anita Scoones, Stuart Rushworth, Wilfried Haerty, Iain C Macaulay

Earlham Institute, Norwich Bioscience Institutes, Norwich Research Park, Norwich; Norwich Medical School, University of East Anglia

Single-cell RNA sequencing analysis has recently provided snapshots of gene expression of stem cells and progenitors across the haematopoietic hierarchy and of the alterations they undergo during ageing. As well as transcriptional changes, alternative splicing events and modifications of components of the splicing machinery actively contribute to the ageing process. Current high-throughput single-cell RNA sequencing methods are based on short-read (Illumina) counting of unique 3' tag sequences. This enables identification of cell types within a complex population of cells but has limitations in terms of resolving transcriptional heterogeneity in closely related cell types and lacks any information on cell-specific isoform expression. Therefore, these methods are missing key aspects of stem and progenitor cell biology. In the present work we introduce a novel approach using the 10X Genomics Chromium to generate short-read (Illumina) and long-read (Pacific Biosciences Sequel II) RNA-sequencing libraries from the same single cells. Short-read-based gene counting enabled identification of 22 cell types, novel marker genes whose expression is restricted to the stem cell population, and age-dependent cell type distributions. Integrated analysis of long reads from the same samples revealed transcripts encoding functional isoforms in known regulators of haematopoiesis. This included 2,157 previously undescribed exons, detected in 952 genes. This approach produced parallel single-cell transcriptional and splicing profiles that demonstrate, for the first time, age-dependent changes in sub-populations of cells, cell-type specific isoform expression and splicing alterations associated with ageing in haematopoietic stem and progenitor cells.

P78

Clustering of single-cell RNA sequencing data improved by random matrix theory

Maria Mircea, Mazéne Hochane, Diego Garlaschelli, Stefan Semrau

Leiden University

Single cell RNA-sequencing (scRNA-seq) has allowed unprecedented insight into biological processes by measuring the whole transcriptomes of thousands of single cells. However, compared to population-level methods, scRNA-seq is riddled with noise, which makes data analysis challenging. One essential analysis step is the identification of cell types. Many unsupervised clustering methods have been employed for this purpose, but it is often unclear if the identified clusters reflect only real biological variability or are also driven by noise. In addition, the number of clusters is often sensitive to user-provided input parameters. Here, we present an algorithm that determines the variability in the data exceeding what is expected from random noise and identifies clusters in a more sensitive way than existing methods. First, we leverage a result from random matrix theory to separate the contribution of random noise from the single-cell correlation matrix. Then, we apply the Leiden algorithm, a graph-based clustering method that we adapted for correlation matrices and to be free of any input parameters. This results in clusters which are internally positively correlated and mutually negatively correlated. By applying this algorithm to simulated and real datasets with varying cluster sizes and varying amounts of noise, we show that our algorithm compares favorably to well-established clustering methods. In addition, the whole procedure can be repeated on the resulting clusters to uncover more subtle differences and subtypes of cells. We have thus developed a systematic way to reveal real biological variability in noisy scRNA-seq data to identify and discover cell types.
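The noise-separation step can be sketched as follows. This is a minimal illustration and not the authors' implementation: it assumes that the random matrix theory result in question is the Marchenko-Pastur upper bound on the eigenvalues of a pure-noise correlation matrix, and the function name `split_signal_from_noise` is hypothetical. Eigenvalues above the bound are treated as signal and the remainder is discarded.

```python
import numpy as np


def split_signal_from_noise(X):
    """Keep only the part of the cell-cell correlation matrix that exceeds
    the Marchenko-Pastur noise bound (an assumed choice of RMT result).

    X is a genes x cells expression matrix (already normalised).
    Returns (signal_correlation, number_of_signal_eigenvalues).
    """
    n_genes, n_cells = X.shape

    # Standardise every cell (column) across genes; assumes no cell is constant.
    Z = (X - X.mean(axis=0)) / X.std(axis=0)

    # Cell-cell Pearson correlation matrix.
    C = Z.T @ Z / n_genes

    # Eigen-decomposition of the symmetric correlation matrix.
    eigvals, eigvecs = np.linalg.eigh(C)

    # Upper edge of the Marchenko-Pastur distribution for unit-variance noise.
    lam_max = (1.0 + np.sqrt(n_cells / n_genes)) ** 2

    # Eigenvalues above the edge are not explained by random noise alone.
    keep = eigvals > lam_max
    C_signal = (eigvecs[:, keep] * eigvals[keep]) @ eigvecs[:, keep].T
    return C_signal, int(keep.sum())
```

The returned matrix has positive entries for cells that share a retained component and negative entries for cells on opposite sides of it, which is the kind of input a correlation-based graph clustering step could then partition.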

P79

Single cell evolution of the multicellular environment of clear cell renal cell carcinoma

Tom Mitchell, Kevin Loudon, John Ferdinand, Georgie Bowyer, Lira Mamanova, Matthew Young, Liam Bolt, Eirini Fasouli, Joana Neves, Maxine Tran, Anne Warren, Grant Stewart, Menna Clatworthy, Peter Campbell, Sam Behjati, Sarah Teichmann

Wellcome Sanger Institute, MRC Laboratory of Molecular Biology, Specialist centre for kidney cancer at Royal Free Hospital, Addenbrooke's Hospital.

Clear cell renal cell carcinoma is the most common form of kidney cancer, accounting for over 70% of adult renal tumours. It affects over 10,000 people per year in the UK with an ever increasing incidence. The initiating genomic event appears to occur many decades prior to diagnosis. When the tumours are small (<3-4cm), their growth rate is low (1-2mm per year) and they seldom possess the ability to metastasise or invade surrounding tissues. Conversely, larger tumours often exhibit highly aggressive behaviour. Almost half of tumours that appear localised on pre-surgical imaging recur at distant sites some time after surgery, presumably from undetected micro-metastatic sites. Sadly, despite advances in targeted therapies and immune checkpoint inhibition, metastatic disease remains incurable. Multi-regional interrogation of somatic mutations has revealed varying degrees of genetic intra-tumoural heterogeneity with often striking patterns of selection. As yet, these mutations do not strongly inform prognosis, and the forces sculpting the varying patterns of genomic tumour evolution are unknown. Prognostic scores have also been developed based on the transcriptomes of large cohorts of patients, but are not recommended for use in clinical practice due to weak correlations and the inability to predict response to therapy. In this study, we aim to understand the multi-cellular and spatial dynamics of clear cell renal cell carcinomas through an in-depth analysis of their somatic mutations and single cell transcriptomes. In a cohort of high and low risk clear cell renal cell carcinomas we prospectively sample multiple tumour regions, the tumour-normal interface, peripheral blood, perinephric fat and normal kidney tissue. Each region undergoes droplet based 5' single cell RNA sequencing and focussed laser capture micro-dissected whole exome DNA sequencing. We present the results of an interim analysis of the transcriptomes of ~180,000 single cells from 8 patients.
We show consistent and marked clonal expansion of cytotoxic T cells across all tumours. We demonstrate high sensitivity of detection of these T cell clones in the peripheral blood for those that are expanded across multiple tumour regions, in comparison to those that are locally expanded. We highlight issues surrounding batch effects of non-normal tissues, particularly in the ischaemic tumour environment. We finally chart the evolution of immune cell types from the source in the peripheral blood through to effector status in the tumour.

P80

Multi-omic single-cell approach reveals transcriptional and epigenetic plasticity in hormone driven cancers

Hisham Mohammed, Aysegul Ors1, Alex Chitsazan1, Aaron Doe1, Ashely Woodfin1, Yahong Wen1, Mithila Handu1 and Hisham Mohammed1,2

1. Cancer Early Detection Advanced Research Center, Knight Cancer Institute, Portland, OR, USA 2. Department of Molecular and Medical Genetics, Oregon Health & Science University, Portland, OR 97239, USA

Breast cancer is known to be driven by steroid hormones such as estrogen, progesterone and androgen. Using a combination of scRNA-seq and scNMT-seq, we aim to understand how underlying genetic and epigenetic tumor heterogeneity impacts hormone response in cancer. We apply single-cell RNA-seq on hormone responsive cell line models after multiple timepoint hormone treatments. Our results suggest that response to hormone signalling is highly heterogeneous at the single-cell level. Surprisingly, cells transition from one cell cluster to another as treatments progress. This suggests that the heterogeneity observed is highly plastic and responsive to external signalling cues. We then applied a novel machine learning 'topic modelling' based approach to study these transitions and identify known and novel signalling patterns driving these states, allowing us to better characterize plasticity. These gene networks can be seen evolving across the treatment time course. Analysis of these networks indicates that both basal and luminal cell signatures drive heterogeneity. By correlating luminal and basal networks with 1000+ ChIP-sequencing datasets, we identified potential transcription factors regulating these states. Silencing of one such transcription factor led to a dramatic reduction in estrogen signalling and a shift in underlying cell 'state'. To probe underlying epigenetic patterns driving these networks, we performed single-cell NMT-seq analysis in both cell line and patient tumor cells, a method that allows joint analysis of the DNA methylome, chromatin accessibility and transcriptome from a single cell. NMT-seq analysis identified epigenetic differences at enhancer and promoter regions between cell clusters/states. Together, our data suggest that hormone responsive cancer cells are highly plastic at the epigenetic and transcriptional level. This plasticity may explain treatment responses and therapeutic resistance in patients.

P81

Characterising the embryonic origin of hemangioblasts, their trajectories and lineage decisions at single cell resolution

Gi Fay Mok [1, 2], Anita Scoones [1], Eirini Maniou [2], Victor Martinez-Heredia [2], Shannon Weldon [2], Wilfried Haerty [1], Andrea Munsterberg [2], Iain Macaulay [1]

[1] Earlham Institute, Norwich Research Park, Norwich NR4 7UZ United Kingdom [2] University of East Anglia, School of Biological Sciences, Norwich Research Park, Norwich NR4 7TJ United Kingdom

Hemangioblasts generate cells of the hematopoietic and vascular systems. Understanding the molecular, cellular and developmental biology of these cells is of fundamental importance. The major anatomical sites of hematopoiesis change during an organism's lifetime and new blood vessels can form in response to injury and disease. However, the developmental origin of hemangioblast progenitors is incompletely understood. Furthermore, the gene regulatory network (GRN) involved in cell specification is also unresolved. Using the chick embryo as our model organism, we performed ATAC-sequencing on embryonic mesoderm tissue to identify genome-wide accessible chromatin. Using these data, we identified a novel cis-regulatory element (CRE) that is specifically active in early hemangioblasts and labelled it with a fluorescent tag. Time-lapse imaging reveals specified hemangioblasts during early gastrulation prior to ingression into the primitive streak, and their migration trajectories into the dorsal aorta and extra-embryonic regions. Co-labelling with hematopoietic (Tal1) and endothelial (LMO2) markers confirms hemato-endothelial specific labelling. To obtain a complete GRN and pseudo-developmental trajectories of hemangioblasts, single-cell RNA sequencing was performed. Using hierarchical clustering analysis of differentially expressed genes, we are able to identify novel regulators for early hemangioblasts such as FBN2 and HOXB4, for hematopoietic lineages such as CTSZ and TFR2, and for endothelial lineages such as TLL2 and SCARF1.

P82

Integrating microarray-based spatial transcriptomics and single-cell RNA-seq reveals tissue architecture in pancreatic ductal adenocarcinomas

Reuben Moncada, Dalia Barkley, Itai Yanai

NYU Grossman School of Medicine

Single-cell RNA sequencing (scRNA-seq) enables the systematic identification of cell populations in a tissue, but characterizing their spatial organization remains challenging. We combine a microarray-based spatial transcriptomics method that reveals spatial patterns of gene expression using an array of spots, each capturing the transcriptomes of multiple adjacent cells, with scRNA-Seq generated from the same sample. To annotate the precise cellular composition of distinct tissue regions, we introduce a method for multimodal intersection analysis. Applying multimodal intersection analysis to primary pancreatic tumors, we find that subpopulations of ductal cells, macrophages, dendritic cells and cancer cells have spatially restricted enrichments, as well as distinct coenrichments with other cell types. Furthermore, we identify colocalization of inflammatory fibroblasts and cancer cells expressing a stress-response gene module. Our approach for mapping the architecture of scRNA-seq-defined subpopulations can be applied to reveal the interactions inherent to complex tissues.

P83

Enabling interactive single cell RNA-Seq data analyses on data from Single Cell Expression Atlas and the Human Cell Atlas using Galaxy

Pablo Moreno, Ni Huang, Jonathan Manning, Suhaib Mohammed, Carlos Talavera-Lopez, Anja Füllgrabe, Silvie Korena Fexova, Nancy George, Matthew Green, Haider Iqbal, Alfonso Muñoz-Pomer Fuentes, Andrej Solovyev, Lingyun Zhao, Kerstin Meyer, Irene Papatheodorou

EMBL-EBI & Wellcome Sanger Institute

We present a new, open source, flexible and interactive analysis environment for scRNA-Seq analyses based in Galaxy. Its main aim is to offer interoperable access to a number of analysis methods from different established analysis software suites (such as Scanpy, Seurat, Monocle3, SC3 and SCMap among others) through a single user interface. The environment can directly download datasets for re-analysis from the EBI Single Cell Expression Atlas (SCXA; >130 datasets) and the Human Cell Atlas Data Coordination Platform, and provides visualisation tools to assist with the interpretation of results. Using those analysis software suites together brings software engineering complications in terms of exchange formats, the need to write scripts in different languages, dealing with software dependencies, reproducibility and versioning, to name a few. Most of these issues are circumvented by this analysis environment. Currently it exposes >65 modules covering functionalities in quality control, normalisation, scaling, batch correction, dimensionality reduction methods, clustering, marker gene calling, projections and trajectory analysis among others. It includes the UCSC CellBrowser viewer, which enables the interactive exploration of cells, clusters and metadata overlaid on different dimensionality reductions (tSNE, UMAP, PCA) produced by the modules in the setup. SCXA uses this same environment in a programmatic manner to process all the datasets it displays on https://www.ebi.ac.uk/gxa/sc. Together with EBI Training, we have run two successful training courses for scRNA-Seq analysis using this environment. The fact that both SCXA production pipelines and entire training courses can be run with this environment shows its maturity and flexibility. Towards interoperability of analysis tools, many of the modules can read from and write to different exchange formats (LOOM, AnnData, SingleCellExperiment, 10x) and an interconversion tool is available as well.
It can be installed on different cloud providers, on local machines or on compute clusters, truly bringing compute to the data if needed for larger data analysis demands. All tools and workflows are publicly available for use on https://humancellatlas.usegalaxy.eu/ and tools can be individually installed on any existing Galaxy instance through https://toolshed.g2.bx.psu.edu/view/ebi-gxa. Besides Galaxy, the modules are also available for direct command-line use or through other workflow environments through the Bioconda initiative and as software containers in Biocontainers, further improving access to these analysis tools.

P84

Single-cell reconstruction of follicular remodeling in the human adult ovary

X. Fan, M. Bialecka, I. Moustakas, E. Lam, V. Torrens-Juaneda, N.V. Borggreven, L. Trouw, L.A. Louwe, G.S.K. Pilgram, H. Mei, L. van der Westerlaken & S.M. Chuva de Sousa Lopes

Department of Anatomy and Embryology, Leiden University Medical Center, 2333 ZC Leiden, Netherlands. Sequencing Analysis Support Core, Leiden University Medical Center, 2333 ZC Leiden, Netherlands. Department of Immunohematology and Blood Transfusion, Leiden University Medical Center,2333 ZA Leiden, Netherlands. Department of Gynaecology, Division of Reproductive Medicine, Leiden University Medical Center, 2333 ZA Leiden,Netherlands. Department for Reproductive Medicine, Ghent University Hospital, 9000 Ghent, Belgium

The ovary is perhaps the most dynamic organ in the human body, but the molecular mechanisms that regulate follicular growth and regression remain elusive. In this work, we aim to identify the somatic cell types and associated signals that regulate tissue remodeling in the adult human ovary. We processed ovarian tissue from 5 adult women using the 10x Genomics platform, which yielded data for about 56k cells. We retained 20k cells for further analysis after excluding cells expressing high levels of dissociation-related genes. Analysis was based on a standard Seurat-based workflow. In this workflow, the cell-cycle effects and the batch (patient) effect were removed using regression and Mutual Nearest Neighbors, respectively. Moreover, Monocle and Diffusion Maps algorithms were used to infer cell trajectories. Nineteen clusters of human ovarian somatic cells were identified, which in turn were grouped into 5 major cell types using hierarchical clustering. The 5 major cell types were identified as granulosa, theca and stroma cells, endothelial, immune and smooth muscle cells. Differentially expressed marker genes and Gene Ontology (GO) terms calculated for each of the clusters facilitated cluster identification and characterization. We found novel markers for the two most important follicular somatic cell types: granulosa cells (GC) and theca cells (TC). We further identified 4 types of GC, namely common progenitor GC, mural GC, cumulus GC and atretic GC, and 3 types of TC, namely common progenitor TC, externa TC and interna TC. The common progenitor GC and TC identities were also confirmed by trajectory analysis. We also detected several types of endothelial and smooth muscle cells that relate to either angiogenesis or apoptosis, and a mixture of adaptive and innate immune cells which could play a role in the remodeling process.
Interestingly, we found pronounced expression of several components of the complement system in stroma and atretic TC clusters, which suggests the involvement of the complement system in growth and degeneration of ovarian follicles. Our study is a major step towards filling the gap in knowledge regarding the characterization of the somatic cell types present in the human adult ovary. It also demonstrates an integrated approach of combining several computational methods to reveal various cell populations in the ovary and their interaction mechanisms.

P85

Introducing the CIRM Stem Cell Hub

Parisa Nejad, William Sullivan, Matthew Speir, Chris Villarreal, Clay Fischer, and Jim Kent

UC Santa Cruz Genomics Institute, California Institute for Regenerative Medicine

The California Institute for Regenerative Medicine (CIRM) Center of Excellence in Stem Cell Genomics (CESCG) is spearheading the investigation into how stem cells can be used to treat disease. The CESCG funds projects at universities and research institutes across California. These projects were intended to interrogate a few broad subjects such as neurobiology, cardiac biology, blood stem cells and therapeutics, and the molecular regulators of stem cells. The Stem Cell Hub is the data warehouse for the data produced by the CESCG. It houses primary data files such as DNA reads in fastq format, as well as many types of files derived from mapping and other analysis of the primary data, and PDF and other document files describing protocols. It has a small but flexible system for associating metadata tags with a file. Any CIRM genomics-associated lab can submit data. Once submitted, data are treated as prepublication human sequence data, and access is only allowed to authorized users. The CIRM Stem Cell Hub is set to contain many terabytes of data that cover a large variety of sequencing assays, including a vast amount of single-cell data. Researchers can compare experiments with our visualization tools, using metadata terms to color and arrange figures as a way of understanding which genes are driving stem cell actions.

P86

Searching for enrichment of neuropsychiatric disease risk in genes with cell type-specific temporal expression trajectories in prenatal human cerebral cortex

Mari Niemi (1), Marta Florio (2), Steve McCarroll (2), Mark Daly (1), Andrea Ganna (1)

(1) Institute for Molecular Medicine Finland (FIMM), (2) Department of Genetics, Harvard Medical School

Genome wide association studies have shown that hundreds of loci across the genome harbor variation contributing to risk of common neuropsychiatric diseases. For example, 24% of schizophrenia risk is explained by common variants, and 145 independent loci have been associated with the disease. However, most biological effects underlying these genetic associations are still unknown. In this study, we want to shed light on whether some disease heritability could be explained by genes with cell-type specific expression patterns during fetal brain development.

Our dataset is composed of 1.3 million single cell transcriptomes obtained using DropSeq on 17 fetal brain donors, spanning developmental ages 11-22 post conception weeks (pcw). Independent component analysis allowed us to group cells into 30 transcriptional clusters, representing all major distinct cell types in the fetal neocortex. This includes neural progenitor cells, neurons and other non-neural cell types, at different stages of maturation.

First, we assessed concordance of our data with published studies. We compared mean expression of 1,519 genes differentially expressed across our whole dataset to bulk RNA data from the Brainspan study (169 donors from developmental ages 12-21 pcw). We found highest Spearman's rank correlation of gene expression with Brainspan cortical regions, and lowest correlation with subcortical areas. We also found that clusters composed of postmitotic neurons were more strongly correlated with Brainspan tissues than neural progenitor cell clusters and non-neural cell clusters, consistent with dominance of neuronal cell types in bulk data.

We then wanted to characterize temporal developmental gene expression patterns in our single cell data; whether gene expression changed linearly across development, or whether some genes had more complex expression patterns. For each of the 30 cell clusters, we calculated mean expression for 1,519 differentially expressed genes at six developmental timepoints (11pcw, 13/14pcw, 15/16pcw, 17/18pcw, 19/20pcw, 21/22pcw). We then fitted a linear and a quadratic polynomial model to assess temporal changes. We found that for 7.6% (115/1519) of the genes, the non-linear model fit the data better when penalizing for model complexity using Bayesian information criterion.
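The model comparison described above can be sketched as follows. This is an illustrative sketch rather than the analysis code used in the study: it assumes ordinary least-squares polynomial fits with Gaussian errors, the usual BIC form n ln(RSS/n) + k ln(n), and pooled timepoints coded by a single numeric value. The function names are hypothetical.

```python
import numpy as np


def polynomial_bic(t, y, degree):
    """BIC of a least-squares polynomial fit, assuming Gaussian residuals."""
    t = np.asarray(t, dtype=float)
    y = np.asarray(y, dtype=float)
    coeffs = np.polyfit(t, y, degree)
    rss = float(np.sum((np.polyval(coeffs, t) - y) ** 2))
    n = len(y)
    k = degree + 1  # number of fitted polynomial coefficients
    # Note: a perfect fit (rss == 0) makes the log term diverge.
    return n * np.log(rss / n) + k * np.log(n)


def prefers_quadratic(t, y):
    """True if the quadratic model has the lower (better) BIC."""
    return polynomial_bic(t, y, 2) < polynomial_bic(t, y, 1)
```

For one gene in one cluster, `t` would hold the six developmental timepoints and `y` the mean expression at each. With only six points, the extra coefficient costs ln(6), so the quadratic model is preferred only when it reduces the residual sum of squares by a factor of more than about 1.35.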

We will extend these analyses to investigate the different expression trajectories, and enrichment of neuropsychiatric disease risk in groups of genes that have cell-type specific, differential trajectories during development. The size of this dataset will also provide better power to potentially find transiently existing cell types; this would present an opportunity to investigate whether some GWAS effects could be missed when conducting analyses in adult brain.

P87

Scalable Bayesian analysis of multi-donor scRNAseq data

Alan O'Callaghan, Catalina A Vallejos

MRC HGU, The Alan Turing Institute

As single cell transcriptomics matures as a field, population studies -- i.e., studies using multiple donor organisms -- are becoming increasingly important and common. These studies afford greater confidence that the differences observed between cell populations are genuine and reproducible. In this context, we present a Bayesian model for inferring changes in gene expression profiles from multiple donors using scRNAseq data. This model extends BASiCS, a Bayesian negative binomial model for molecule counts that performs joint inference of cell-specific normalisation factors, and gene-specific mean expression and expression variability parameters. BASiCS allows for comparison of the expression profiles of two cell populations; we extend the model using a more flexible approach, allowing inference of the effect of donor-level or population-level covariates on mean and variability in gene expression. To facilitate inference in this framework using large-scale scRNAseq datasets across many donors, we also explore scalable Bayesian inference strategies.

P88

Investigating the role of Sca1 in selective VSMC expansion using trajectory analysis and functional assays

Jordi Lambert and Sebnem Oc, Annabel L Taylor, Lina Dobnikar, Joel Chappell, Jennifer L Harman, Martin R Bennett, Mikhail Spivakov, Helle F Jørgensen

1. Nuclear Dynamics Programme, Babraham Institute, Babraham Research Campus, Cambridge, CB22 3AT, UK

Lina Dobnikar & Mikhail Spivakov

2. Division of Cardiovascular Medicine, University of Cambridge, Cambridge Biomedical Campus, Cambridge, CB2 0QQ, UK

Jordi Lambert, Sebnem Oc, Lina Dobnikar, Annabel L. Taylor, Joel Chappell, Jennifer L. Harman, Martin R. Bennett & Helle F. Jørgensen

3. Functional Gene Control Group, Epigenetics Section, MRC London Institute of Medical Sciences, Du Cane Road, London, W12 0NN, UK

Mikhail Spivakov

4. Institute of Clinical Sciences, Faculty of Medicine, Imperial College, Du Cane Road, London, W12 0NN, UK

Atherosclerosis and vessel injury are conditions characterised by the accumulation and proliferation of vascular smooth muscle cells (VSMCs), which can lead to serious cardiovascular events. Previously, we showed that the VSMC expansion in atherosclerotic lesions is oligoclonal in nature (Chappell et al., 2016). However, the mechanisms behind this selective activation of VSMC proliferation are yet to be elucidated. Using single cell RNA-sequencing (scRNA-seq) we identified a small population of Sca1+ VSMCs in healthy vasculature (Dobnikar et al., 2018), which we hypothesize may be 'primed' for proliferation, or represent a dedicated progenitor population. Here, we investigated a potential role of Sca1 in selective VSMC proliferation through scRNA-seq of a mouse carotid ligation model of vessel injury, which induces rapid VSMC proliferation, and through in vitro proliferation assays. Sca1+ cells were found in higher numbers in injured carotid arteries, but maintained similar transcriptional signatures. Pseudotemporal ordering placed Sca1+ cells prior to proliferating cells, supporting the hypothesis that Sca1 expression represents a primed cell stage. Consistent with this, cells with upregulated Sca1 were found in a cluster that was significantly associated with stress response, cell migration, and proliferation gene ontology terms. Functional examination supported the idea that Sca1 marks a primed population rather than a specific progenitor population. VSMCs isolated from the healthy mouse aorta were cultured in a 2D clonal proliferation assay, which demonstrated that, whereas proliferation is not restricted to Sca1+ cells, attachment and clonal expansion of Sca1+ cells is increased and temporally advanced compared to Sca1- VSMCs. These findings enabled characterization of a disease-relevant primed population of VSMCs that could be targeted in disease.
Further insight into the mechanism controlling VSMC proliferation will be gained from functional testing of potential regulators identified using the axis defined by the trajectory analysis of scRNA-seq data.

P89

Single-Cell Multiomic Analysis of SNV, CNV, and Protein Expression

Aik Ooi, Pedro Mendez, Dalia Dhingra, Nigel Beard, and David Ruff

Mission Bio, Inc., South San Francisco, CA, USA.

Recent advancements in precision medicine, while highly promising, present a major technical challenge to researchers due to disease heterogeneity. The emergence of single-cell technologies has greatly refined the resolution at which sample diversity can be investigated, enhancing the efficiency of selecting appropriate molecular targets. Additionally, applying multiomic analysis to single cells would further improve the understanding of cell-to-cell heterogeneity by providing unique insights into cellular and genetic composition. Using a two-step droplet microfluidic technology, the Mission Bio Tapestri Platform enables multiplex-PCR based high-throughput targeted DNA sequencing in single cells to obtain single-nucleotide variation (SNV) and copy number variation (CNV) information. By leveraging this technology, a new workflow was developed to detect protein expression in addition to DNA genotype in the same single cells. In this approach, cells are labeled with a pool of oligonucleotide-conjugated antibodies prior to loading the cells into the Tapestri Instrument for targeted DNA analysis. Sequencing libraries are then prepared from both antibody oligonucleotides and the amplified DNA sequences, followed by identification of single-cell DNA genotypes and protein signatures from the sequencing readout. In a mixed population of four cell lines, single-cell SNV and CNV information from 127 targeted amplicons and the protein data from 10 antibodies independently classified the cells into appropriate clusters. This method has been successfully performed on clinical samples with myeloid malignancies. In an acute myeloid leukemia (AML) sample, combined single-cell SNV, CNV, and protein expression data illustrated the heterogeneity within the sample. The data clearly identified CD3+ T cells and CD19+ B cells without pathogenic SNVs and CNVs. CD34+CD11b- and CD34-CD11b+ subpopulations were also identified within the cells carrying the same pathogenic SNVs and CNVs.
We believe that this novel multiomic technology will facilitate new discoveries in the complex relationship between genotype and phenotype, enable a better understanding of disease biology, and subsequently improve the design of diagnostics and therapies.

P90

Disease map of cell-types using single-cell RNA sequencing

Subarna Palit, Lukas Simon, Fabian J Theis

Institute of Computational Biology

Genome-wide association studies (GWAS) have identified thousands of associations between genomic variants and a plethora of human phenotypes. However, determining the "cell-type of action" for any given phenotype is impossible without the integration of additional information. Recent advances in RNA sequencing have enabled transcriptomic profiling at cellular resolution, leading to the first fully sequenced and annotated Mouse Cell Atlas (MCA). Here, we trained a hierarchical neural network to predict tissue and cell-type based on the transcriptomic profiles of 76,613 cells from adult tissues reported in MCA. We show that our network outperforms existing methods in terms of cell-type classification accuracy when applied to an independent mouse cell atlas. Moreover, we use our model to predict the "cell-type of action" across 1137 GWAS phenotypes. Our analyses reveal an association between myelinating oligodendrocytes in the brain and autism spectrum disorder, a neuropsychiatric disease; this association has also been reported in the literature. The resulting predictions represent a valuable resource for the interpretation of GWAS results and the study of human phenotypes including disease.

P91

Using latent Dirichlet allocation for detecting doublets in scRNA-seq

Alexandrina Pancheva (1), Guido Sanguinetti (2), Helen Wheadon (3), Simon Rogers (4), Thomas Otto (1)

(1): Institute of Infection, Immunity & Inflammation, University of Glasgow (2): School of Informatics, University of Edinburgh (3): Paul O'Gorman Leukaemia Research Centre, University of Glasgow (4): School of Computing Science, University of Glasgow

Since the first published study in 2009, scRNA-seq technology has attracted attention, and the development of new protocols and decrease in sequencing costs have boosted its popularity. But with increasing demand, there are still challenges to be addressed. Errors in cell sorting and capture, particularly in droplet-based methods that process high numbers of cells, produce doublets: libraries that combine the profiles of different cell types. This can cause false signal in the downstream analysis when identifying cell types and constructing trajectories to understand an underlying process. While we can filter doublets if we assume they have higher counts, having an unbiased approach for identification of doublets is vital. We propose an approach based on latent Dirichlet allocation (LDA), an algorithm originally developed for text, for extracting cell type profiles, detecting doublets, and deconvoluting them. Results are based on both simulated and real datasets. Real single cell datasets include a 10x experiment of the P. berghei lifecycle with a high number of doublets, quantified by the presence of P. falciparum in the sequencing. Additionally, we used a peripheral blood mononuclear cell (PBMC) dataset, where doublets were defined via Cell Hashing or donor SNP information with Demuxlet. We benchmark our topic modelling approach against other tools for doublet detection. Specifically, we find that the doublets identified with our method not only include those identified by DoubletDecon, but also include additional doublets, as confirmed by ground truth in the case of our dataset of two Plasmodium strains. In conclusion, we present a novel method that uses LDA to detect more doublets than existing tools.
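The abstract does not state how a topic mixture is turned into a doublet call, so the following is only a sketch of one plausible rule. It assumes that per-cell topic proportions have already been inferred by an LDA model, and it flags a cell when its second-largest topic share is high; the threshold of 0.3, the function name, and the example proportions are all hypothetical.

```python
def flag_doublets(topic_proportions, min_second_share=0.3):
    """Return the IDs of cells whose second-largest topic share reaches the threshold.

    topic_proportions maps a cell ID to its list of topic proportions
    (assumed to come from an already fitted LDA model and to sum to 1).
    """
    flagged = []
    for cell_id, shares in topic_proportions.items():
        top_two = sorted(shares, reverse=True)[:2]
        # A singlet should be dominated by one cell-type topic; a doublet
        # is expected to split its counts between two topics.
        if len(top_two) == 2 and top_two[1] >= min_second_share:
            flagged.append(cell_id)
    return flagged

cells = {
    "cell_A": [0.95, 0.03, 0.02],  # dominated by topic 0
    "cell_B": [0.50, 0.45, 0.05],  # split between topics 0 and 1
    "cell_C": [0.10, 0.85, 0.05],  # dominated by topic 1
}
print(flag_doublets(cells))
```

A rule of this kind also exposes the two contributing profiles, which is the starting point for deconvoluting a flagged doublet.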

P92

Cell segmentation-free inference of cell types from in situ transcriptomics data

Jeongbin Park, Wonyl Choi, Sebastian Tiesmeyer, Brian Long, Lars E. Borm, Emma Garren, Thuc Nghi Nguyen, Simone Codeluppi, Matthias Schlesner, Bosiljka Tasic, Roland Eils, Naveed Ishaque

Digital Health Center, Berlin Institute of Health (BIH) and Charité Universitätsmedizin, Berlin, Germany; Faculty of Biosciences, Heidelberg University, Heidelberg, Germany; Department of Computer Science, Boston University, Boston, the United States of America; Division of molecular neurobiology, Department of medical biochemistry and biophysics, Karolinska Institutet, Stockholm, Sweden; Science for life laboratory, Stockholm, Sweden; Allen Institute for Brain Science, Seattle, WA, USA; Bioinformatics and Omics Data Analytics, German Cancer Research Center (DKFZ), Heidelberg, Germany; Health Data Science Unit, Heidelberg University Hospital, Heidelberg, Germany

Multiplexed fluorescence in situ hybridization techniques have enabled cell-type identification, linking transcriptional heterogeneity with spatial heterogeneity of cells. However, inaccurate cell segmentation reduces efficacy of cell-type identification and tissue characterization. Here, we present a novel method called Spot-based Spatial cell-type Analysis by Multidimensional mRNA density estimation (SSAM), a robust cell segmentation-free computational framework performing de novo cell-type and tissue domain identification in 2D and 3D. SSAM is applicable to a variety of in situ transcriptomics techniques and capable of integrating prior knowledge of cell types (e.g. from single cell RNA-seq) if required.

We apply SSAM to three mouse brain tissue images obtained by different techniques: the somatosensory cortex by osmFISH, the hypothalamic preoptic region by MERFISH, and the visual cortex by multiplexed smFISH. With all three datasets, we demonstrate the robustness of SSAM in identifying 1) cell types in situ, 2) spatial distribution of cell types, 3) spatial relationships between cell types, and 4) tissue domains (e.g., cortical layers) based on the local composition of cell types without fine-tuning of required parameters. We demonstrate that 1) SSAM correctly identifies the spatial distribution of known cell types in regions where pre-existing methods for osmFISH data fail; 2) it can analyse MERFISH 3D data in the same way as 2D data, without any adjustment of the settings used for osmFISH data; and 3) it can be used to identify new and rare cell types in tissue based on multiplexed smFISH data.

SSAM is implemented as a highly performant Python package available on GitHub and is accompanied by an easy-to-use Jupyter notebook.
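The core idea of segmentation-free analysis, estimating mRNA density directly from spot coordinates, can be sketched as follows. This is not SSAM's implementation or API: it assumes a 2D Gaussian kernel with a hypothetical bandwidth, evaluates one gene at a time, and uses invented spot coordinates.

```python
import math

def gene_density(spots, grid_points, bandwidth=2.5):
    """Gaussian kernel density of one gene's mRNA spots, evaluated at grid points.

    spots and grid_points are lists of (x, y) coordinates in the same units
    as bandwidth. No cell outlines are needed at any step.
    """
    norm = 1.0 / (2.0 * math.pi * bandwidth ** 2)
    densities = []
    for gx, gy in grid_points:
        total = 0.0
        for sx, sy in spots:
            squared_distance = (gx - sx) ** 2 + (gy - sy) ** 2
            total += norm * math.exp(-squared_distance / (2.0 * bandwidth ** 2))
        densities.append(total)
    return densities

# Hypothetical spots for a single gene, clustered near the origin.
spots = [(0.0, 0.0), (1.0, 0.5), (0.5, 1.0), (1.5, 1.5)]
densities = gene_density(spots, [(0.5, 0.5), (10.0, 10.0)])
print(densities)
```

Stacking such density values across genes gives an expression-like vector at every grid position, which could then be clustered or compared with reference cell-type profiles.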

P93

Computational Classification of a Harmonised Atlas of Spinal Cord Cell Types

Ryan B. Patterson-Cross 1, *, Daniel E. Russ 2, *, Stephanie C. Koch 3, Kaya J.E. Matson 1, Vilas Menon 4, Ariel J. Levine 1

1 Spinal Circuits and Plasticity Unit, National Institute of Neurological Disorders and Stroke, NIH, Bethesda, MD, USA 2 Division of Cancer Epidemiology and Genetics, Data Science Research Group, National Cancer Institute, NIH, Rockville, MD, USA 3 Division of Biosciences, University College London, London, UK 4 Department of Neurology, Columbia University, New York, NY, USA * Equal Contribution

To take full advantage of the recent revolution in single cell transcriptomics, it is critical to establish a standard atlas of cell types for each tissue and to develop tools for classifying these cells. We sought to streamline this process by developing an artificial intelligence algorithm capable of consistently classifying data from publicly available single cell RNA sequencing into consensus cell types. Recently, we performed an integrated analysis on six different, publicly available datasets that were independently validated. We profiled over one hundred thousand cells from the mouse spinal cord and characterized fifteen non-neuronal and sixty-nine neuronal cell types. This characterisation surpassed all prior studies in cell type resolution. Now, we are developing artificial intelligence algorithms to automate cell type classification of spinal cord cells using the cell types from the integrated data as a "ground truth." To this end, we compared the performance of three different algorithms: label transfer, a support vector machine, and a fully-connected neural network. Models were cross-validated on five random train-validate splits, using 80% of the data for training. Though all models performed well at classifying very distinct cell types, the neural network displayed higher accuracy when classifying highly related neuronal subtypes. When tested on a novel dataset, the neural network replicated its performance and accurately identified complex neuronal cell types. In addition to superior performance in classifying spinal cord cell types, this neural network algorithm has several advantages. First, by operating on raw data, it can easily integrate datasets from a variety of sequencing platforms with no pre-processing. Second, it will streamline and standardize future analysis by providing consistent cell type clusters across studies. Third, it may help reveal novel features that distinguish cell types.
Ultimately, we aim to apply these tools towards the establishment of consistent cell types in a human cell atlas.
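The cross-validation scheme described above (five random train-validate splits with 80% of the data used for training) can be sketched as below. The function name, the fixed seed, and the use of cell indices instead of expression data are assumptions made for illustration; the abstract does not say whether the splits were stratified by cell type.

```python
import random

def random_splits(n_cells, n_splits=5, train_fraction=0.8, seed=0):
    """Return n_splits independent (train_indices, validate_indices) pairs."""
    rng = random.Random(seed)  # fixed seed so the splits are reproducible
    n_train = int(round(n_cells * train_fraction))
    splits = []
    for _ in range(n_splits):
        indices = list(range(n_cells))
        rng.shuffle(indices)
        splits.append((sorted(indices[:n_train]), sorted(indices[n_train:])))
    return splits

splits = random_splits(100)
for train, validate in splits:
    print(len(train), len(validate))
```

Because each split is drawn independently, a given cell can appear in the validation set of several splits, unlike in k-fold cross-validation.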

P94

Unprecedented sensitivity with SMART-Seq® single cell technology

1. Andrew Farmer, 1. Nathalie Bolduc, 1. Tommy Duong, 1. Magnolia Bostick, 1. Nidhanjali Bansal, 1. Suvarna Gandlur. Presented by 2. Matthieu Pesant

1. Takara Bio USA, Inc., Mountain View, CA, USA. 2. Takara Bio Europe

Since the emergence of next-generation sequencing (NGS), the importance of and demand for single-cell analysis have risen rapidly. Extracting meaningful biological information from the small amount of mRNA present in each cell requires an RNA-seq preparation method with exceptional sensitivity and reproducibility. To date, the SMART-Seq v4 kit (SSv4) has been the most sensitive commercial single-cell RNA-seq method, in part due to its incomparable capability to retrieve information from full-length mRNA and not just the 3′ end. However, there is still room for improvement for extremely challenging samples such as cells with very low RNA content or nuclei. To address this need, we have further modified our core technology to create a new chemistry with higher sensitivity, the SMART-Seq Single Cell Kit, that outperforms all current commercial and noncommercial full-length methods, particularly with as little as 2 pg of total RNA. When validating with a B lymphocyte cell line or peripheral blood mononuclear cells from a healthy donor, we were able to detect 50-60% more genes with the new chemistry compared to current methods. The improvement in sensitivity was associated with a clear reduction of the dropout rate as well as an increase in reproducibility. In addition, the new SMART-Seq single cell (SSsc) chemistry generates a high yield of cDNA, which is extremely useful when dealing with difficult cells such as clinical samples that tend to carry very low RNA content.

P95

Dissecting the heterogeneity of HIV-elicited innate immune responses across distinct macrophage phenotypes

Maria Nascimento Primo, Clara Alsinet-Armengol, Isaac Garcia and Daniel Gaffney

Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, United Kingdom

An important problem in HIV-1 infections is the existence of latent viral reservoirs that allow replication-competent virions to persist over extended periods of time. Memory CD4 T cells were thought to constitute the only HIV-1 reservoirs, but recent studies suggest that tissue-resident macrophages possess major features that allow them to also function as HIV-1 reservoirs. Because tissue-resident macrophages display a variety of phenotypes depending on their microenvironment, systematic studies are required to provide a complete picture of macrophage cellular heterogeneity and how these cell populations affect virus-host interactions and establish dynamic innate immune responses. Here, we performed single cell RNA sequencing in uninfected and HIV-1-derived infected human macrophages cultured in either homeostatic or pro-inflammatory environments over multiple timepoints. In parallel, we also cultured macrophages towards a microglia-like phenotype, as brain inflammation is a common feature of HIV-infected patients. Preliminary analysis shows that at 8 hours post infection only a very small fraction of cells across the entire macrophage population senses the viral infection, leading to induction of type I IFN. Notably, these rare early-responding cells are sufficient to initiate a rapid differential expression of distinct antiviral gene modules across uninfected cells from different culture environments. Likewise, we also find common antiviral gene modules characterised by different temporal heterogeneity profiles not only across different culture environments but also within the same environment. Our study provides new insights into the nature and source of single cell variability during viral infection in macrophages, highlighting the importance of characterising tissue-resident cells in vivo to identify the key cellular features that establish HIV-1 reservoirs in humans.

P96

Somatic CNV detection by single cell whole genome sequencing in multiple system atrophy brains

Christos Proukakis, Diego Perez-Rodriguez, Maria Kalyva, Melissa Leija-Salazar, Tammaryn Lashley, Maxime Tarabichi*, Anthony H. V. Schapira, Thomas T. Warner, Janice L. Holton, Zane Jaunmuktane

Queen Square Institute of Neurology, University College London; * The Francis Crick Institute, London

Synucleinopathies are mostly sporadic neurodegenerative disorders of partly unexplained aetiology. They include Parkinson's disease (PD), with predominantly neuronal pathology, and multiple system atrophy (MSA), with mixed but predominantly glial pathology. Inherited CNVs (gains) and other mutations in SNCA (α-synuclein) are among rare causes of familial disease. There is increasing evidence for the presence of mosaicism due to somatic mutations in the human brain, and we have hypothesized a role for somatic mutations in synucleinopathies. We have detected somatic SNCA (α-synuclein) CNVs, specifically gains, in PD and MSA brains using FISH. Our data suggest correlations with younger onset, and with the presence of α-synuclein inclusions in the same cells. To obtain the first data on genome-wide somatic CNVs in a synucleinopathy, we performed single-cell whole genome sequencing (WGS) in MSA brain. We performed nuclear fraction isolation, followed by immunohistochemistry, visual selection of single nuclei using the CellRaft (Cell Microsystems) or QiaScout (Qiagen) device on an inverted microscope, whole genome amplification, and low coverage WGS (single or paired end, ~1-10 million reads), with QC and initial CNV calling by Ginkgo. We first compared neurons from a control substantia nigra (SN) amplified by Picoplex Gold (Takara) or RepliG Advanced (Qiagen). As the "noise" (median absolute pairwise deviation) was significantly better with Picoplex Gold, we used this for subsequent work. We validated detection of large SNCA CNVs in single cells from germline cases. We then obtained high quality data from 169 MSA brain cells in total. These comprised neurons and glia from the SN of two cases, and the pons and putamen of one. We detected somatic CNVs > 1 Mb in ~30% of cells, with gains in neurons and glia, often clustered together, and losses largely limited to neurons.
We noted possible clonality in a few, and enrichment of CNV boundary regions for segmental duplications and telomeric regions, but not fragile sites. CNVs included a 10 Mb gain encompassing SNCA in a pontine neuron with multiple gains, a small gain over the nearby GRID2 gene in a pontine neuron with a small nuclear inclusion, and multiple losses, including SNCA, in a putaminal neuron with prominent nuclear inclusions. Gene ontology analysis revealed different patterns in SN neurons and non-neurons. We propose that somatic SNCA CNVs contribute to the aetiology and pathogenesis of synucleinopathies, and that genome-wide somatic CNVs require further detailed investigation.
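The "noise" metric used above to compare amplification kits can be illustrated with a short sketch. The abstract does not define the calculation, so this assumes one common formulation: the median of the absolute differences between the log2 copy-number ratios of adjacent genomic bins. The function name and the bin counts are hypothetical.

```python
import math
import statistics

def mapd(bin_counts):
    """Median absolute pairwise deviation of adjacent-bin log2 ratios.

    bin_counts holds the read counts of consecutive genomic bins for one cell;
    all counts must be positive. Lower values indicate a smoother profile.
    """
    mean_count = sum(bin_counts) / len(bin_counts)
    log_ratios = [math.log2(count / mean_count) for count in bin_counts]
    differences = [abs(b - a) for a, b in zip(log_ratios, log_ratios[1:])]
    return statistics.median(differences)

smooth_cell = [100, 102, 98, 101, 99, 100]   # illustrative counts only
noisy_cell = [100, 150, 60, 130, 70, 120]    # illustrative counts only
print(mapd(smooth_cell), mapd(noisy_cell))
```

Using the median makes the metric insensitive to the few large jumps that occur at genuine CNV breakpoints, so it mainly reflects bin-to-bin amplification noise.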

P97

Single-cell RNA analysis deciphers tumor heterogeneity and the immune microenvironment

Franziska Singer, Michael Prummer, Anne Bertolini, Daniel J. Stekhoven

Nexus Personalized Health Technologies, ETH Zurich, Zurich, Switzerland, and SIB Swiss Institute of Bioinformatics, Zurich, Switzerland

Single-cell RNA sequencing (scRNA-seq) based tumor biopsy analysis is an emerging technique that allows profiling of tumor cells and infiltrating immune cells in unprecedented detail. Based on RNA expression, distinct tumor sub-clones can be identified, informing on tumor heterogeneity and potential treatment-resistant sub-clones. Moreover, the cell type composition of the tumor microenvironment, in particular the presence and variability of immune cell subpopulations, can strongly influence treatment response. Although multiple methods for scRNA-seq analyses exist, their application in a clinical setting demands standardized and reproducible analysis workflows, targeted to display the clinically relevant information. To this end, we designed a workflow to characterize the cell type composition of tumor biopsies based on scRNA-seq data from the 10x Genomics platform. The raw reads are assigned to genes and cells and subsequently filtered and normalized in several quality control steps. Based on the gene expression profile of each cell, we inform on tumor heterogeneity and immune cell subpopulations, and highlight the expression of clinically relevant genes and pathways. We applied our workflow in the analysis of diverse tumor samples, informed on tumor and immune composition and monitored treatment response.

P98

Identification of differentiation outliers in iPSC-derived neuronal cell types from pooled single-cell data

Pau Puigdevall-Costa1, Julie Jerber2,3, Daniel Seaton4, Matthew Hurles2, Florian Merkle5, Oliver Stegle5,6, Daniel Gaffney2, Sergi Castellano1, Helena Kilpinen1,2

1. UCL Great Ormond Street Institute of Child Health, London, UK 2. Wellcome Sanger Institute, Cambridge, UK 3. Open Targets, Cambridge, UK 4. European Molecular Biology Laboratory, EMBL-EBI, Cambridge, UK 5. University of Cambridge, Wellcome Trust-Medical Research Council Institute of Metabolic Science, Cambridge, UK 6. Deutsches Krebsforschungszentrum (DKFZ), Heidelberg, Germany

Induced pluripotent stem cells (iPSC) are an important tool for disease modelling. In particular, they offer unique opportunities to study the cellular basis of developmental disorders, as cell types and lineages relevant to human neurogenesis can be accessed through differentiation. Combining iPSC-based experiments with single-cell sequencing has helped tackle the heterogeneity that is inherent to all cellular differentiation systems. Further, pooled differentiation allows combining multiple samples into a single experiment, which is a powerful study design to compare wild-type and mutant cells.

We leveraged this study design to explore how iPSC lines with engineered knock-outs (KO) of developmental disorder (DD) genes compare to wild-type cells during neuronal differentiation. Since mutations causing DDs have previously been linked with impaired commitment of stem cells to a neuronal fate, we hypothesized that the effects of the gene KOs may manifest as abnormal differentiation patterns already in early neurogenesis, rather than in mature neurons. We characterized cellular heterogeneity and the developmental trajectories of cell lines during different stages of neuronal development with special emphasis on outlier behaviour. Specifically, we analysed single-cell RNA-seq data collected from seven pools (83 individuals, ~120K cells) across three time points (days 11, 30, 52) during the differentiation of neurons towards a dopaminergic fate. Each pool included a single KO-line. We normalised the dataset across all pools and time points, and identified eight distinct cell type clusters, including the expected population of dopaminergic neurons at day 52.

To identify cell lines that show abnormal patterns of differentiation, we focused on outlier identification. A cell line was considered an outlier if any of its cell-type fractions at a particular time point deviated significantly from the other cell line fractions (|z-score|>2). Notably, while some of the included KO lines displayed an outlier phenotype, they were indistinguishable from wild type outlier lines. Since variability in differentiation outcomes is explained by both biological and technical factors, we are currently exploring whether outlier behaviour could be explained by acquired genetic variation in wild-type iPSCs that may confer growth advantage or disadvantage during differentiation. Initial analysis of exome-sequencing data suggests outlier lines present a higher burden of acquired somatic mutations than non-outlier lines. We are also defining outliers relative to differentiation trajectories and analysing gene expression differences between outlier and wild-type lines. Our study constitutes the first attempt to assess the feasibility and constraints to detecting effects of pathogenic genetic variants in iPSC-based, pooled single-cell experiments.
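The outlier rule (|z-score|>2) can be illustrated for a single cell type at a single time point. The sketch assumes the z-score is computed against the mean and sample standard deviation of all lines in the comparison, which the abstract does not specify; the line names and fractions are invented.

```python
import statistics

def outlier_lines(fractions, threshold=2.0):
    """Return the cell lines whose cell-type fraction has |z-score| > threshold.

    fractions maps a cell line to the fraction of its cells assigned to one
    cell type at one time point.
    """
    values = list(fractions.values())
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # sample standard deviation
    if sd == 0:
        return []
    return [line for line, fraction in fractions.items()
            if abs((fraction - mean) / sd) > threshold]

# Hypothetical fractions of one cell type at one time point.
fractions = {
    "line_01": 0.30, "line_02": 0.31, "line_03": 0.29, "line_04": 0.32,
    "line_05": 0.28, "line_06": 0.30, "line_07": 0.31, "line_08": 0.29,
    "line_09": 0.30, "line_10": 0.80,
}
print(outlier_lines(fractions))
```

Repeating the check over every cell type and time point, and calling a line an outlier if any check fails, matches the "any of its cell-type fractions" wording above. With few lines per comparison, a |z| above 2 is hard to reach because the extreme line inflates the standard deviation.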

P99

Understanding molecular dynamics of enteric neuron subtype generation during gut organogenesis

Maryam Rahim1, Reena Lasrado1, Anna Laddach1, Reimer Kuehn2, Vassilis Pachnis1

1 The Francis Crick Institute, 1 Midland Road, London NW1 1AT, UK 2 Department of Mathematics, King's College London, The Strand, London WC2R 2LS, UK

One of the largest subdivisions of the peripheral nervous system is the enteric nervous system (ENS), which is organised into ganglionated plexi within the gut wall. The ENS plays an essential gut-intrinsic role contributing to intestinal physiology and host defence. Despite recent progress in understanding cellular development in the ENS, molecular cascades that govern the assembly and function of neural circuits remain obscure. Previous work at single-cell resolution defined neuro-glial trajectories and candidate regulators of enteric neuron and glial differentiation in the small intestine. However, this study did not include the analysis of the ENS in the large intestine. Compared with the small intestine, the large intestine has distinct anatomical and functional features. It plays a key role in waste removal and water homeostasis. It is primarily affected in genetic diseases such as Hirschsprung disease and inflammatory bowel syndromes such as enteric colitis and colon cancer. Neural crest-derived Sox10-positive progenitors from both vagal and sacral origins contribute to the ENS of the large intestine. To gain insight into the molecular mechanisms that underpin the development and generation of diversity of the neural component of the large intestine, we combined single-cell transcriptomic and mathematical modelling approaches. Using the 10X Genomics platform, we captured and sequenced individual Sox10-labelled ENS populations at key developmental stages of the large intestine (E14.5, E16.5, E18.5 and P2). Using Louvain clustering, we identified progenitors, neurons and glial cells at all stages in the clusters as defined by known markers (Elavl4, S100b). Slingshot analysis performed to infer developmental pseudotime reveals the emergence of neuronal and glial trajectories from cycling progenitors, similar to that observed in the small intestine.
To unravel molecular players of neuronal subtype specification, we selected neurogenic cells from our dataset and integrated them with the dataset of adult enteric neurons from the Linnarsson lab, using Seurat v3. Using Slingshot on the integrated dataset allowed us to visualise the emergence of two cardinal neuronal subtypes: an inhibitory Nos1+ subtype and an excitatory Chat+ subtype. In addition, we have used a mathematical modelling approach that allows us to gain further insight into the generation of neuronal diversity at the level of variance of gene expression over pseudotime. Our study identifies candidate regulators that define neuronal subtypes in the ENS, and further development of our modelling techniques will help us unearth the regulatory networks that underpin the development of enteric neuronal diversity and their integration into functional neural circuits.

P100

Single cell transcriptomics of transdifferentiating mouse ovaries

Chris Rands1*, Yasmine Neirijnck1, Moira Rossitto2, Françoise Kühne1, Chloé Mayere1, Brigitte Boizet2, Francis Poulat2, Serge Nef1

1 University of Geneva, 2 University of Montpellier

The molecular mechanisms underlying sex determination and maintenance are relevant for understanding diseases of sexual development and fertility. In mammalian primary sex determination, the bipotential gonads develop into either testes or ovaries. The gonads differentiate in the embryo, but to preserve the testicular or ovarian cell state, genetic programs continue throughout life as the male and female antagonistic pathways 'fight' for control of the cell fate. Immunofluorescence staining of mutant mice samples indicated that Trim28 is an important gene in the maintenance of adult ovarian cell identity. Here we characterised, at single cell resolution, transcriptional changes upon knockout of Trim28 in mouse ovaries.

We obtained Trim28-/- Sf1 Cre female 8-week-old transgenic mice and corresponding wildtype female and male mice. We dissected their gonads, isolated the single cells, and sequenced the RNA using the 10X platform. We pre-processed the sequencing data with CellRanger to generate a gene expression vs. cells matrix. Then, we used Scanpy, plus additional tools, for downstream analyses and visualisation, including filtering cells and genes, removing doublets, normalizing and log transforming counts, batch correction, identifying highly variable genes, dimensionality reduction, clustering, marker gene exploration, and lineage tracing.

We found that Trim28 mutant ovarian cells show transcriptional changes consistent with transdifferentiation of the ovaries towards the testes. Specifically, in the mutant, the somatic supporting cells, which are critical for the production of the sperm or eggs, include intermediate cell populations in-between the normal female Granulosa cells and male Sertoli cells. Investigation of the steroidogenic cell lineage is ongoing. Our results reveal the transitionary cell states between ovarian and testicular cells, and patterns of cell lineage plasticity.

P101

DUBStepR: correlation-based feature selection for clustering scRNA-seq and scATAC-seq data

Bobby Ranjan, Wenjie Sun, Jinyu Park, Vipul Singhal, Nirmala Arul Rayan, Fatemeh Alipour, Ronald Xie and Shyam Prabhakar

Computational and Systems Biology, Genome Institute of Singapore, 60 Biopolis Street, Singapore, 138672, Singapore.

A critical step in unsupervised clustering of single-cell RNA sequencing data is feature selection, i.e. identification of a subset of genes that can separate cells into distinct clusters. Current methods for feature selection test each gene individually, thus ignoring expression correlations between genes. This is a major limitation, since cell-type-specific marker genes tend to be highly correlated with each other. We therefore developed DUBStepR (Determining the Underlying Basis using Stepwise Regression), a method that selects a basis set of strongly correlated genes that maximally explain variation in gene-correlation space. DUBStepR then expands this basis set to identify the features (genes) that optimize cluster separation.
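The intuition that marker genes of a cell type are strongly correlated with each other can be sketched in a few lines of Python. The example below is not the DUBStepR algorithm (it performs no stepwise regression or basis-set expansion); it only ranks genes by their strongest absolute Pearson correlation with any other gene. The gene names and expression values are invented for illustration.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length lists; 0.0 if either is constant."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)

def rank_by_max_correlation(expression):
    """Rank genes by their strongest absolute correlation with any other gene.

    expression: dict mapping gene name -> list of values across cells.
    Returns (genes sorted from most to least correlated, score per gene).
    """
    genes = list(expression)
    scores = {
        g: max(abs(pearson(expression[g], expression[h])) for h in genes if h != g)
        for g in genes
    }
    ranked = sorted(genes, key=lambda g: scores[g], reverse=True)
    return ranked, scores

# Hypothetical toy data: six cells, the first three belong to one cell type.
toy_expression = {
    "markerA": [5, 6, 5, 0, 0, 1],
    "markerB": [4, 5, 5, 0, 1, 0],
    "noise":   [1, 2, 3, 1, 2, 3],
}
ranked_genes, correlation_scores = rank_by_max_correlation(toy_expression)
```

In this toy case the two co-expressed markers score highly while the unrelated gene scores close to zero, which a per-gene test that ignores correlations could not exploit.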

We benchmarked DUBStepR on 8 datasets spanning 4 single-cell RNA sequencing protocols (10X, Drop-Seq, CEL-Seq2 and Smart-Seq2) and found that DUBStepR yielded greater cluster separation than 6 widely-used feature selection algorithms. We applied DUBStepR to multiple PBMC scRNA-seq datasets and identified low-frequency contaminating cell populations in each case, thus demonstrating the sensitivity of the method. Moreover, DUBStepR detected marker genes with consistently greater accuracy than the other methods. Since the logic of feature correlations is not specific to scRNA-seq data, we reasoned that DUBStepR could also be applied to other data types, such as scATAC-seq. Indeed, we found that DUBStepR improved clustering of scATAC-seq data from human bone marrow.

DUBStepR is extensively documented, freely available as an R package on GitHub (https://github.com/bbbranjan/DUBStepR), and can directly be incorporated into existing scRNA-seq and scATAC-seq workflows. In summary, DUBStepR provides a general-purpose feature selection solution for accurately clustering single-cell data.

P102

Defining mesenchymal stem/stromal cells

Pedro Raposo, Elsa Abranches, Orla O’Shea, Charlotte Chapman, Leo Perfect

No affiliations

Mesenchymal stem/stromal cell (MSC) is a contentious term that covers a plethora of cell types with different phenotypes and therapeutic potentials. MSCs are currently defined by three criteria set by the International Society for Cellular Therapy (ISCT): a limited set of protein markers, the capability to adhere to plastic in culture and the ability to differentiate into osteoblasts, chondrocytes and adipocytes. However, these criteria do not capture the broad heterogeneity of MSCs. This classification problem within the scientific community has led to inconsistent clinical trials and ultimately restricted the advancement of translational MSC-based therapies. To address the issue of MSC heterogeneity, we have compared the diversity between and within MSC populations from different tissue sources, using single cell RNA sequencing. So far, we have analysed MSCs from three adult tissues, three human embryonic stem cell (hESC) lines and one induced pluripotent stem cell (iPSC) line, to shed light on how gene expression differs between and within these groups of cells. Our analysis shows that all analysed MSC populations generally express the positive ISCT markers (CD105, CD73 and CD90) and lack the negative ones (CD45, CD34, CD14 or CD11b, CD79a or CD19 and HLA class II). However, cells from the different sources were also found to have distinct gene expression profiles, highlighting the need for more detailed and consistent analysis of MSC identity before their full clinical potential can be realised. In the future, the introduction of additional datasets will increase the robustness of our findings and help untangle the cellular heterogeneity bottleneck that has hampered the clinical exploitation of MSCs. In addition, analyses with different bioinformatic tools, strategies, and pipelines will allow us to strengthen our conclusions. 
Ultimately, this study will help define the different cell types currently described as MSCs, with the aim of developing specific reference materials and aiding the development of MSC-based therapies.

P103

Transcriptional dynamics of hepatic sinusoid-associated cells after liver injury

Mike K. Terkelsen, Sofie M. Bendixen, Emma A. H. Scott, Andreas F. Moeller, Ronni Nielsen, Susanne Mandrup, Kedar N. Natarajan, Sönke Detlefsen, Henrik Dimke, and Kim Ravnskjaer

1 Department of Biochemistry and Molecular Biology, University of Southern Denmark, Denmark (MKT, SMB, EAHS, AFM, RN, SM, KNN, KR). 2 Department of Pathology, Odense University Hospital, Denmark (SD) 3 Department of Cardiovascular and Renal Research, Institute of Molecular Medicine, University of Southern Denmark, Denmark. (HD) 4 Department of Nephrology, Odense University Hospital, Denmark. (HD) 5 Center for Functional Genomics and Tissue Plasticity (ATLAS), University of Southern Denmark, Denmark. (MKT, SMB, RN, SM, KR)

Background & Aims: Hepatic sinusoidal cells are known actors in the fibrogenic response to injury. Activated hepatic stellate cells (HSCs), liver sinusoidal endothelial cells, and Kupffer cells are responsible for sinusoidal capillarization and perisinusoidal matrix deposition, impairing vascular exchange and heightening the risk of advanced fibrosis. While the overall pathogenesis is well understood, functional relations between cellular transitions during fibrogenesis are only beginning to be resolved. Here, at single-cell resolution, we explored the heterogeneity of individual cell types and dissected their transitions and crosstalk during fibrogenesis. Approach & Results: We applied single-cell transcriptomics to map the heterogeneity of sinusoid-associated cells in healthy and injured livers and reconstructed the single-lineage HSC trajectory from pericyte to myofibroblast. Stratifying each sinusoidal cell population by activation state, we projected shifts in sinusoidal communication upon injury. Weighted Gene Co-Expression Network Analysis of the HSC trajectory led to the identification of core genes, whose expression proved highly predictive of advanced fibrosis in NASH patients. Among the core members of the injury-repressed module, we identified Plasmalemma vesicle-associated protein (PLVAP) as a protein amply expressed by mouse and human HSCs. PLVAP was suppressed in HSCs upon injury and may define hitherto unknown roles for HSCs in the regulation of microcirculatory exchange and its breakdown in chronic liver disease. Conclusions: Our study offers a single-cell resolved account of drug-induced injury of the mammalian liver and identifies key genes that may serve important roles in sinusoidal integrity and as markers of advanced fibrosis in NASH patients.

P104

Impaired Differentiation: Understanding a Single Cell Fate Transition

Merrit Romeike 1,2, Celine Sin 3, Joerg Menche 3, Christa Buecker 1,2

1 Max Perutz Labs Vienna, Austria. 2 University of Vienna, Austria. 3 CeMM Research Center for Molecular Medicine of the Austrian Academy of Sciences, Vienna, Austria.

Development is characterized by distinct, coordinated cell fate transitions. During each transition, an existing gene expression program is dismantled and a new cellular identity has to be established. This progression can be challenged by genetic manipulations, which therefore elucidate requirements for faithful execution of the transition.

The exit from naïve pluripotency is a highly accessible and controllable model system for a single cell fate transition: mouse embryonic stem cells cultivated under defined conditions are naïve pluripotent and act as a homogenous starting population. The core gene regulatory network maintaining naïve pluripotency is well established. Upon change of culture conditions, this network is rapidly dismantled and cells irreversibly commit to differentiation into formative pluripotency, a less characterized cell state.

Failure of differentiation is mostly described as prolonged expression of pluripotency markers or clustering of bulk transcription profiles with naïve wildtype. Despite extensive screening for factors required for exiting naïve pluripotency, so far not a single factor has been identified which completely abrogates differentiation ability. Differentiation impaired mutants rather exhibit a phenotype which appears as delay of progression and commitment.

Single cell methods open up a window into understanding this apparent delay. We are investigating how robustness of the cell fate transition is ensured, even in the absence of key regulators, through fine-tuned differentiation time courses of wildtype and differentiation-impaired mutants. We infer differences in the gene regulatory network progression across trajectories and genotypes. As an entry point, we investigate the transcriptional repressor Tcf7l1, a key factor in the shutdown of the pluripotency network. We are following two hypotheses. First, the phenotype could be explained by delays in overcoming roadblocks of differentiation, with mutants in principle employing the same gene regulatory network changes as wildtype cells. Alternatively, networks not dominant under wildtype conditions could serve as a rescue mechanism under impaired conditions, ensuring proper differentiation. Taken together, our work will elucidate the mechanism of a single cell fate transition and its intrinsic robustness.

P105

Deconstructing the spatial organization of the main olfactory epithelium by spatial transcriptomics

Mayra L. Ruiz Tejada Segura 1, Melanie Maklouf 4, Bettina Malnic 5, Tiago Saike 5, Luis Saraiva 4, Antonio Scialdone 1,2,3

1 Inst. of Epigenetics and Stem Cells, Helmholtz Zentrum München, Germany, 2 Inst. of Computational Biology, Helmholtz Zentrum München, Germany, 3 Inst. of Functional Epigenetics, Helmholtz Zentrum München, Germany, 4 Sidra Medical and Research Center, Qatar, 5 Universidade de São Paulo, Brazil

Our sense of smell provides us with information about our environment. This information is conveyed by odorants, which are detected by olfactory sensory neurons (OSNs) through proteins encoded by olfactory receptor genes (ORs). Each OSN expresses only one OR randomly chosen out of thousands.

The position of olfactory sensory neurons in the main olfactory epithelium (MOE) has been shown to influence olfactory receptor genes' probability of activation, with some of them having a higher probability of being expressed in specific regions of the MOE called "zones" [1]. The role that ORs' spatial expression might have in olfaction is unknown; and until now, the detailed spatial pattern of expression in the MOE remains largely unknown. The sheer number of OR genes and the presence of still unknown molecular regulators are major challenges that hamper a detailed analysis of this system.

I will present the analysis of a spatial transcriptomic dataset from mouse MOE that was collected with a recently developed technique called TOMO-seq. Our analysis provides the first comprehensive 3D map of gene expression patterns in the MOE, which we used to identify new zonal markers and to computationally redefine the zones of OR gene expression, using 3D information from hundreds of OR genes.

Moreover, I will show how such a 3D map can be combined with single-cell RNA-seq datasets to get further insights into, e.g., the spatial gene expression patterns in other cell types present in the MOE.

Key words: Spatial transcriptomics, Olfactory receptor genes, Olfactory sensory neurons.

[1] Miyamichi, K. Continuous and Overlapping Expression Domains of Odorant Receptor Genes in the Olfactory Epithelium Determine the Dorsal/Ventral Positioning of Glomeruli in the Olfactory Bulb. J. Neurosci. 25, 3586-3592 (2005).

P106

Inferring emergent dynamics in the cell nucleus from single-cell multi-omics experiments

Fabrizio Olmeda 1, Tim Lohoff 2, Stephen Clark 2, Heather Lee 3, Wolf Reik 2,4, Steffen Rulands 1,5

1 Max Planck Institute for the Physics of Complex Systems, Dresden, Germany 2 Epigenetics Programme, Babraham Institute, Cambridge, UK 3 School of Biomedical Sciences and Pharmacy, Faculty of Health and Medicine, The University of Newcastle, Callaghan NSW, Australia 4 Department of Physiology, Development and Neuroscience, University of Cambridge, UK 5 Center for Systems Biology Dresden, Dresden, Germany

Methods from single-cell multi-omics allow measuring several layers of regulation along the one-dimensional sequence of the DNA. The biological function of these processes relies, however, on emergent processes in the three-dimensional space of the nucleus, such as droplet formation through phase separation. How can measurements along the sequence of the DNA be translated into an understanding of emergent dynamics in nuclear space? Here, we combine single-cell NMT-sequencing experiments with a theoretical and computational approach to rigorously map measurements along the DNA sequence to a description of the emergent spatial dynamics in the nucleus.

Drawing on scNMT-seq experiments in vitro and in vivo, we demonstrate our approach in the context of early development. We show how epigenetic modifications of the DNA, specifically DNA methylation, are established through the interplay between chemical and topological modifications of the DNA, leading to the formation of condensates of methylated DNA in the nucleus. Finally, we demonstrate how our findings can be used to identify specifically regulated genomic regions during differentiation.

Our work sheds new light on epigenetic mechanisms involved in cellular decision making. It also highlights how mechanistic insights into the spatio-temporal processes governing cell-fate decisions can be gained by the combination of methods from single-cell multi-genomics, computational biology and theoretical physics.

P107

Single-cell Analysis of Immune Response to Infection in Malaria

Jian Ryou, Massar Dieng, Manikandan Vinu, and Youssef Idaghdour

1. Department of Biology, New York University Abu Dhabi, UAE 2. Centre National de Recherche et Formation sur le Paludisme (CNRFP), Ouagadougou, Burkina Faso

Despite many decades of intensive research for a cure, malaria remains one of the world's leading causes of infection-related deaths, with 80% of global malaria deaths in 2016 occurring in sub-Saharan Africa. Considerable variation in parasite load and clinical outcome in malaria patients makes the development of an effective vaccine difficult, and this is one of the major obstacles to overcoming this disease. The reasons behind this inter-individual variation are still poorly understood but likely involve the complex interactions between the host immune system and the parasite. One of the areas that needs better understanding is the nature of the host immune response to P. falciparum infection, as there are still controversies regarding which components of the immune system play which role during the course of infection. Here we address some of these gaps in knowledge by using 10X Genomics single-cell transcriptional profiling of PBMCs collected from African children with malaria. We profiled four children before and after P. falciparum infection and generated 26,110 single-cell transcriptomes with an average of 2,000 genes per cell. Using quantitative and qualitative analysis of the data, we document PBMC composition and the changes taking place before and after P. falciparum infection, and report the innate and adaptive cell types implicated as well as trends and inter-individual variations observed. These results provide novel insights for the malaria field and are the foundation for future research that will focus on expanding this work to profile more individuals and ethnic groups in Africa.

P108

Dissecting the cellular heterogeneity of the human thymus

Mario Saare, Siri T. Flåm, Marte Heimli, Eglė Stankevičiūtė, Hanne S. Hjorthaug, Teodora Ribarska, and Benedicte A. Lie

Department of Medical Genetics, University of Oslo and Oslo University Hospital, Oslo, Norway

The cellular interactions between the developing thymocytes and antigen-presenting cells (APCs) in the thymus form a major basis of immune tolerance to self-antigens. Thymocytes whose T cell receptor does not recognize the major histocompatibility complex (MHC) molecules on APCs, or, in contrast, reacts too strongly with the self-peptides loaded onto the MHC, will go into apoptosis. These selection processes prevent potentially autoreactive T cells from entering the circulation and thereby prevent an immune attack against normal peripheral tissues. In recent years, single cell analyses have revealed a much broader diversity of specialized cell populations in the mouse thymus, which has greatly improved our understanding of how the thymic tissue environment helps to establish immune tolerance. However, the cellular diversity and its implications in the human thymus remain to be explored. We have set out to uncover the cellular and molecular networks in the human thymus by exploiting the state-of-the-art single-cell RNA-seq and feature barcoding approaches provided by 10X Genomics. We will collect thymic tissue from children who undergo cardiac surgery but are otherwise healthy. We will analyse transcriptomic profiles together with cell surface marker information from the same cells, which allows a more precise cell type classification. We aim to characterise common and rare populations among the thymocytes and APCs with a special focus on thymic epithelial cells, but also including dendritic cells, B cells and macrophages. Additionally, we will use target enrichment strategies to map the B and T cell receptor repertoire. We will complement the gene and protein expression profiles of CD45-negative populations with genome-wide chromatin accessibility data from single-cell ATAC-seq experiments. 
We expect to determine developmental trajectories of maturing thymocytes, identify transcription factor regulatory networks that govern the gene expression of all detected cell populations and predict the ligand-receptor pairs that modulate cell-cell interactions in the thymus. We believe that our results will be a key resource to gain a deeper knowledge of the processes required for the establishment of immune tolerance and use it to develop novel therapeutic strategies.

P109

Studying plant-microbe interactions by applying Spatial Transcriptomics

Sami Saarenpää (1), Or Shalev (2), Detlef Weigel (2), Stefania Giacomello (1)

1: Department of Gene Technology, KTH Royal Institute of Technology, SciLifeLab, Stockholm, Sweden, 2: Department of Molecular Biology, Max Planck Institute for Developmental Biology, Tübingen, Germany

Bacterial infection affects the plant host's gene expression by activating the innate immune system and signaling pathways, such as phytohormone pathways. In plant infection biology, several advances have been made by localizing the bacterial infection sites and the spread of the bacteria. However, we lack understanding of how the bacterial quantity and spread from infection sites affect the host tissues' spatial gene expression per location. To fill this gap, we applied an innovative, high-throughput technology originally developed for mammalian tissues, Spatial Transcriptomics, which enables the simultaneous quantification and visualization of transcriptional profiles in thin tissues at 100-μm resolution, to Pseudomonas droplet-infected Arabidopsis thaliana leaves. We developed several advancements to the original method in order to study the concerted bacterial infection process and plant response in whole Arabidopsis thaliana leaves. First, we demonstrated that we are able to capture the bacterial RNA while preserving the complete morphology of the plant tissue. Second, we showed that the tissue treatments we introduced allow us to also spatially capture the plant mRNAs. In conclusion, our results indicate the feasibility of studying combined gene expression profiles of two different organisms, opening up the possibility of extending our approach to different plant systems, such as crop species, to elucidate complex infection processes where the spatial component is key to their understanding.

P110

dyno: Inferring, visualising and interpreting single-cell trajectories and RNA velocity

Wouter Saelens, Robrecht Cannoodt, Helena Todorov, Louise Deconinck, Yvan Saeys

VIB-UGent Inflammation Research Center

Since 2014, at least 71 tools for detecting trajectories in single-cell data have been developed. For end-users with a dataset of interest, these methods are difficult to execute and compare, mainly due to high variability in input/output data structures, software requirements, and programming interfaces. Moreover, RNA velocity provides an interesting alternative for investigating single-cell dynamics, but is not yet integrated with current trajectory inference methods. We developed dyno, a set of software tools that allow a user to easily infer, visualise and interpret single-cell trajectories (dyno.dynverse.org). First, the user can select the most suitable set of methods based on the size of the dataset, the prior knowledge of the trajectories' topology and other user preferences. Each selected method can then be easily run within a common interface. Because the outputs of these methods are all converted into a common, consistent format, the trajectories can be easily interpreted. For this, we provide a complete toolkit: adding a root and labelling the different cell stages, detecting genes which are differentially expressed at different stages of the trajectory, and plotting the trajectory within any dimensionality reduction or heatmap. We also include several ways to integrate RNA velocity with trajectory inference methods: (1) using RNA velocity as an input for trajectory inference algorithms, (2) using RNA velocity to postprocess trajectories, (3) calculating quality metrics for a trajectory using RNA velocity, and (4) visualizing trajectories and RNA velocity on the same figures. To conclude, dyno is a user-friendly and powerful toolkit for unraveling single-cell dynamics. Given its modularity, it can also be easily extended with new methods and visualisation tools.
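To make the idea of a common output format concrete, the Python sketch below converts the simplest kind of method output, a per-cell pseudotime, into a small structure with a milestone network and per-cell progressions. dyno itself is an R toolkit; the function and field names here are hypothetical and chosen for illustration only, and are not dyno's actual interface.

```python
def linear_pseudotime_to_common(pseudotime):
    """Convert a linear pseudotime into a toy common trajectory format.

    pseudotime: dict mapping cell id -> pseudotime value (any numeric scale).
    Returns a dict with a single start -> end edge and, for each cell,
    how far along that edge it sits (0.0 at the start, 1.0 at the end).
    """
    lowest = min(pseudotime.values())
    highest = max(pseudotime.values())
    span = highest - lowest
    progressions = []
    for cell_id, value in sorted(pseudotime.items()):
        # If all cells share one pseudotime value, place them at the start.
        percentage = (value - lowest) / span if span > 0 else 0.0
        progressions.append(
            {"cell_id": cell_id, "from": "start", "to": "end", "percentage": percentage}
        )
    return {
        "milestone_network": [{"from": "start", "to": "end", "length": span}],
        "progressions": progressions,
    }

# Hypothetical output of a trajectory method that only reports pseudotime.
toy_pseudotime = {"cell1": 2.0, "cell2": 4.0, "cell3": 6.0}
toy_trajectory = linear_pseudotime_to_common(toy_pseudotime)
```

Once every method's output is expressed in one such structure, downstream steps such as rooting, plotting or differential expression along the trajectory only need to be written once.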

P111

A single-cell integrative analysis to understand cell-cell signaling networks in pancreatic islets associated with western diet-induced beta cell dysfunction

Somesh Sai$, Ibrahim Omar§, Kerstin Mühle*$, Han Zhu§, Fenfen Liu§, Ileana Matta§, Ramon Vidal$, Birgit Sawitzki*$, Sascha Sauer$, Maike Sander§

$Scientific Genomics Platform, Berlin Institute for Medical Systems Biology, Max-Delbrück Center for Molecular Medicine in the Helmholtz Association, Berlin, Germany, * Institute of Medical Immunology, Charite – Universitätsmedizin, Berlin, Germany, $Berlin-Brandenburg Center for Regenerative Therapies (BCRT), Charite – Universitätsmedizin, Berlin, Germany, § Departments of Pediatrics and Cellular & Molecular Medicine, Pediatric Diabetes Research Center, University of California, San Diego, La Jolla, CA, USA

Pancreatic endocrine cells regulate blood glucose homeostasis by secreting insulin, glucagon, somatostatin, or pancreatic polypeptide. Chronic nutrient excess causes progressive dysfunction of insulin-secreting beta cells, which results in glucose intolerance and eventually type 2 diabetes (T2D). It is known that T2D is associated with chronic inflammation of metabolically relevant tissues, including the pancreatic islets where endocrine cells reside. Thus, to develop strategies for therapeutic intervention in metabolic disease and T2D, it is critical to understand how tissue cells and immune cells interact. However, mechanisms of altered immune cell-tissue interactions are poorly understood, and a comprehensive analysis of how nutrient excess impacts immune cell dynamics and molecular features of relevant cell types has not been performed. The goal of this project is to conduct in-depth and integrative analysis of data from multi-parametric technologies such as single-cell RNA sequencing (scRNA-seq, 10x Genomics) and imaging mass cytometry (IMC-CyToF, Hyperion-Fluidigm) to assess how a time course of western diet feeding affects immune cells and endocrine cells in mouse islets. Male C57BL/6J mice were fed a western diet (WD - 42% fat; 42.7% carbohydrates) or a normal chow diet (ND) for 1 week or 12 weeks. At both time points, islets were isolated and immune cells enriched by fluorescence-activated cell sorting based on CD45. WD-fed mice showed increased body weight and developed glucose intolerance, insulin resistance, and islet inflammation after 12 weeks of feeding. Clustering of RNA profiles from single cells revealed individual clusters for all major endocrine cell types as well as populations of myeloid and lymphoid cells in islets from ND-fed mice. While the composition of myeloid cells in islets was relatively stable during the course of WD feeding, their overall numbers significantly increased. 
In addition, we also observed infiltration of lymphocytes into islets of mice fed a WD for 12 weeks. To comprehensively profile immune cells, IMC-CyToF will be applied on pancreatic sections of mice fed a WD and ND. For this purpose, a comprehensive panel consisting of 34 markers has been established. These markers target a vast array of immune cell types and islet beta cells, allowing for their identification at the protein level and on a spatial scale. Ultimately, integrative analysis of the scRNA-seq and IMC data will allow us to infer local signaling networks between different immune cell populations and endocrine cells to identify drivers of beta cell dysfunction in metabolic disease.

P112

RCA2 - An improved framework for reference-based clustering of single cell transcriptomes

Florian Schmidt, Bobby Ranjan, Mohammad Amin Honardoost, Joanna Tan Hui Juan, Nirmala Arul Rayan, Shyam Prabhakar

Computational & Systems Biology, Genome Institute of Singapore (GIS), 60 Biopolis St, Singapore 138672

Reference component analysis (RCA) is an approach to cluster single cell transcriptomes guided by a reference panel currently derived from bulk data. In contrast to unsupervised (de novo) clustering, RCA is less susceptible to batch effects and technical variation (Li et al., 2017). However, the current implementation of RCA has limited functionality and does not easily scale to the size of current single cell data sets. In contrast to other reference-based methods, such as SingleR (Aran et al., 2019), scMatch (Hou et al., 2019) or scmap (Kiselev et al., 2019), the main purpose of RCA is not cell type identification but supervised clustering of single cell transcriptomes leveraging similarities between cells. With RCA2, we present a novel implementation of RCA that matches the state of the art in terms of scalability and functionality. RCA2 offers significantly faster data processing and requires less memory. Aside from hierarchical clustering, RCA2 also offers an option for graph-based clustering using shared nearest neighbor networks, which enables clustering of hundreds of thousands of cells due to the lower memory requirement (O(nk) vs. O(n^2)). For data interpretation, RCA2 offers functions to automatically generate customizable figures such as heatmaps illustrating clustering results, bar plots of cluster composition and 2D and 3D UMAP representations of the reference projection. Inspired by the aforementioned cell type annotation tools, RCA2 annotates each single cell cluster with its closest cell type in the reference panel. 
RCA2 provides, in addition to the original panel containing 65 cell types and 84 tissues extracted from microarray data (Li et al., 2017), multiple manually curated reference panels for 15 hematopoietic cell types (Novershtern et al., 2011), 28 immune cell types (Monaco et al., 2019), as well as human and mouse primary cell type panels derived from uniformly processed ENCODE RNA-seq data, containing 97 and 22 different primary cell types, respectively. In addition, RCA2 offers means to generate custom panels from user-provided transcriptome data. RCA2 can directly be applied to the output of CellRanger for single cell data from the 10X Genomics platform. Additionally, any other form of count data can be used. Importantly, single cell data processed using the widely adopted Seurat pipeline can be easily analyzed as well. With RCA2, we provide the single cell community with flexible, scalable, and easy-to-use software that can be easily integrated into existing workflows to benefit from the features of supervised clustering. RCA2 is available online at https://github.com/prabhakarlab/RCAv2.
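The memory argument for graph-based clustering can be made concrete with a short back-of-the-envelope calculation in Python. The sketch compares the number of values stored for a full n-by-n cell-to-cell matrix with that of a k-nearest-neighbour graph. The cell count, the value of k and the 8 bytes per value are assumptions for illustration and are not taken from RCA2; real sparse formats also store indices, so the graph figure is a lower bound.

```python
def dense_matrix_bytes(n_cells, bytes_per_value=8):
    """Memory for a full n x n cell-to-cell matrix: grows as O(n^2)."""
    return n_cells * n_cells * bytes_per_value

def neighbour_graph_bytes(n_cells, k, bytes_per_value=8):
    """Memory for k neighbour weights per cell: grows as O(nk)."""
    return n_cells * k * bytes_per_value

# Assumed example sizes (not taken from the abstract).
n_cells = 200_000
k = 20

dense = dense_matrix_bytes(n_cells)
graph = neighbour_graph_bytes(n_cells, k)

print(f"full matrix:     {dense / 1e9:.0f} GB")
print(f"neighbour graph: {graph / 1e6:.0f} MB")
print(f"ratio:           {dense // graph}x")
```

Under these assumptions the full matrix needs about 320 GB while the neighbour graph needs about 32 MB, a ratio of n/k = 10,000.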

P113

Single cell resolution implicates TGF-β1/ALK5 signaling as a regulator of venous smooth muscle cell contractility

Julian Tristan Schwartze 1, Simon Fisher 1, Emma Louise Low 1, Daniel Kelly 1, Menzo Havenga 2, Wilfried Bakker 2, Andrew Baker 3, Martin McBride 1, Stuart A Nicklin 1, Angela Bradshaw 1

1 Institute of Cardiovascular and Medical Sciences, University of Glasgow, Glasgow, UK; 2 Batavia Biosciences B.V., Leiden, Netherlands; 3 Centre for Cardiovascular Science, University of Edinburgh, Edinburgh, UK

Background. Vein graft neointima formation following coronary artery bypass graft (CABG) surgery entails smooth muscle cell (SMC) phenotype switching, including loss of contractile function and gain in proliferative capacity. This study explored potential cross-talk between transforming growth factor (TGF)-β1 and bone morphogenetic protein (BMP)-9 signaling and subsequent regulation of saphenous vein (SV) SMC phenotype. Results. Immunohistochemistry localized TGF-β1, BMP-9 and their respective receptors, activin receptor-like kinase (ALK)5 and ALK1, to SMCs in SV samples from CABG patients. To characterize SVSMC phenotype switching in vitro, we developed a contractile differentiation protocol using SMCs isolated from donated SV samples from CABG patients. Smooth muscle differentiation supplement (SMDS)-treated SVSMCs demonstrated increased αSMA, CNN1, SM22α, total myosin light chain (MYL)9 and phospho-MYL9 protein levels, suggesting enhanced contractility. Similarly, TGF-β1-treated SVSMCs demonstrated an increase in αSMA, CNN1 and SM22α mRNA levels. Pharmacological inhibition of the TGF-β receptor ALK5 with SB525334 prevented the induction of contractile marker expression by TGF-β1/ALK5, also blunting the ability of SMDS to induce contractile differentiation. To characterize the contractile SMC phenotype in more detail, angiotensin II-induced intracellular Ca2+ release served as a surrogate readout for SMC contraction. Fluorescent live cell microscope imaging of SVSMCs loaded with a Ca2+-sensitive dye revealed enhanced Ca2+ release in TGF-β1-treated cells, reflecting increased contractility. SB525334 and/or BMP-9 inhibited the TGF-β1/ALK5-driven increase in Ca2+ release, indicating BMP-9/ALK1 antagonism. Single cell analysis revealed a bimodal distribution of Ca2+ responses in all treatment groups. TGF-β1 treatment caused a reduction of non-responding cells and an increase in cells demonstrating enhanced Ca2+ release, generating a more homogeneous response. 
To identify potential Ca2+ handling mechanisms and to capture cell heterogeneity in more detail, we employed single cell RNA sequencing using the 10x Chromium platform to characterize single cell transcriptome changes. UMAP dimensionality reduction, Louvain clustering and RNA velocity analysis identified distinct heterogeneous cell populations in each treatment group. In TGF-β1-treated cells, Ingenuity Pathway Analysis detected induction of genes in distinct clusters linked to nuclear factor of activated T-cell, vascular endothelial growth factor and thrombin signaling, which are known to regulate Ca2+ handling via phospholipase C signaling. Clusters in BMP-9/TGF-β1-co-treated cells expressed fewer genes associated with direct Ca2+ handling and more genes involved in cell proliferation and extracellular matrix/cell communication, potentially indicating distinct SMC phenotypes. Conclusion. TGF-β1/ALK5 signaling causes a more homogeneous contractile response in venous SMCs and is partially antagonized by BMP-9/ALK1 signaling. Selective ALK5 agonism may protect SMCs from phenotype switching following CABG surgery.

P114

From stem cell to megakaryocyte: Delineating lineage commitment in murine megakaryopoiesis using single-cell RNA-seq

Anita Scoones1, Laura Mincarelli1, Vladimir Uzun1, Matthew Madgwick1,3, Ashley Lister1, Stuart Rushworth2, Wilfried Haerty1, Iain Macaulay1,2*

1 Earlham Institute, 2 University of East Anglia, 3 Quadram Institute

The blood system is sustained by a pool of rare, self-renewing and multipotent haematopoietic stem cells (HSCs) through the carefully coordinated process of haematopoiesis. Megakaryocytes (MK), the source of circulating platelets, are thought to emerge from a bipotent progenitor with MK and Erythroid (E) restricted potential. However, with the discovery of a distinct HSC subset that exhibits platelet-biased differentiation potential, it has become evident that MK differentiation pathway(s) require further understanding: the underlying mechanisms by which stem cell lineage bias arises in the first place, how this affects commitment to the MK lineage, and how this changes with ageing and stress remain to be elucidated at the molecular and functional level.

Technological advances in single-cell 'omic' approaches such as high-throughput single-cell RNA sequencing (scRNA-seq) have played a huge part in the discovery of novel cell types, states and transitions in the haematopoietic system. In the present work we performed scRNA-seq using the plate-based Smart-Seq2 protocol on 384 single CD150high LSK cells derived from mouse bone marrow to investigate transcriptional heterogeneity within the HSC and early MK/E progenitor compartment and to delineate the hierarchy of changes in gene expression from HSC to MK progenitor.

Our analysis so far has identified differences in the transcriptional profiles of HSCs and early progenitor cells, in both young and aged mice, and enabled us to delineate a trajectory of commitment to the MK lineage. These data will advance our understanding of the mechanisms and regulatory networks that underlie the first steps by which platelet-primed HSCs commit to the MK lineage.

P115

A multi-modal investigation into human adrenocortical tumors

Ali Kerim Secener*, Somesh Sai*, Barbara Altieri**, Panagiota Arampatzi**, Silviu Sbiera**, Sarah N. Vitcetz*, Caroline Brauening*, Cristina L. Ronchi**/***, Cornelius Fischer*, Martin Fassnacht**, Sascha Sauer*

Max Delbrueck Center for Molecular Medicine (MDC) - Berlin Institute for Medical Systems Biology (BIMSB)* - University Hospital of Wuerzburg** - University of Birmingham***

Adrenocortical tumors can be divided into two categories: frequent benign adenomas (ACA) and rare aggressive carcinomas (ACC). Many studies in endocrinology have applied bulk approaches such as microarray and RNA sequencing. These studies revealed the role of the cAMP/PKA pathway in the pathogenesis of cortisol-producing ACAs, the presence of CTNNB1 mutations in endocrinologically inactive ACAs and the occurrence of molecular alterations of the Wnt/β-catenin pathway in a large proportion of ACCs. To complement these findings and to resolve the underlying cellular and molecular complexity, we have adopted a multi-modal approach comprising single-nucleus RNA sequencing (snRNAseq), spatial transcriptomics and single-cell DNA SNP sequencing. The study cohort consisted of healthy normal adrenal glands (NAGs), ACAs and ACCs. Our initial approach was to generate a comprehensive single-cell atlas of the healthy tissue, which would serve as a complete reference in the downstream part of the project. This revealed the overall cellular composition of the adrenal gland and enabled the discovery of novel markers in relation to the zonal organization of the tissue. Subsequently, ACAs and ACCs were processed, and an integrative analysis was initiated to define characteristics for both ACAs and ACCs, functional (secreting) and nonfunctional (non-secreting), with the ultimate goal of understanding what drives adenoma or carcinoma behavior of the tumors. With this approach, we first aim to discover similarities and differences between our reference and both tumor types/subtypes. Once their characteristics are uncovered, we intend to carry out a comparative analysis of the tumor types to evaluate hypotheses such as ACCs originating from ACAs. With this pioneering study, we are contributing to a better understanding of adrenal gland biology, as well as providing a comprehensive atlas to promote endocrinology towards a single-cell understanding.
Keywords: ACA, ACC, snRNAseq, spatial RNA, single cell DNA SNP, NAG, single-cell atlas

P116

Comparing heterogeneous populations in high dimensional space: Machine learning methods to quantify macrophage polarisation using high content imaging data

Julia Sero, Kaiyu Li 2, Sarah Filippi 2, Tim Keane 3, Marina Evangelou 2, Seth Flaxman 2, Lena Zhu 3, Isaac Pence 3, 3

1) Bath University, Department of Biology and Biochemistry; 2) Imperial College, Department of Mathematics; 3) Imperial College, Departments of Materials and Bioengineering

Guiding the behavior of immune cells, such as macrophages, to promote regeneration and prevent fibrosis is an important goal of tissue engineering. Macrophages perform different functions during wound healing via polarization into M1 ("acute inflammatory") or M2 ("chronic inflammatory") phenotypes. Macrophage polarization is generally quantified by the expression of cell surface markers by FACS. However, these states are also characterized by dramatic changes in cell morphology: M1 cells tend to be large and well-spread with ruffling edges, whereas M2 cells tend to be small and/or spindle-shaped. Highly polarized M1 and M2 cells can be consistently distinguished by multi-parametric quantitative morphometry, but populations are extremely heterogeneous. The degree of heterogeneity, high dimensionality, and collinearity of features in these datasets mean that comparing intermediate or mixed-state cell populations presents a considerable statistical challenge. We set out to determine the impact of animal-derived decellularized extracellular matrices (dECMs) on macrophage polarization in order to assess their utility as tissue regeneration adjuvants. We used a machine learning method called Maximum Mean Discrepancy (MMD) to make multivariate comparisons across a matrix of treatment conditions comprised of various combinations of dECM and cytokine exposure. This method has been used for two-sample tests of high-dimensional data. Here, we applied MMD as a series of three-sample tests to determine relative similarities in multivariate space, using the MMD statistics with estimated confidence intervals in place of p-values. This method allowed us to meaningfully compare intermediate and mixed phenotypes against reference populations and quantify the degree of M1- or M2-ness induced by dECMs alone or in combination with inflammatory cytokines.
This method is useful not only for inferring macrophage function in culture, but could be applied to other morphologically diverse populations, such as tumor cells, as well as other kinds of noisy data, such as single cell sequencing.
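
As a rough illustration of the MMD comparison described above, the following Python sketch estimates the squared MMD between a test population and two reference populations. It is a minimal sketch under stated assumptions, not the authors' implementation: the Gaussian kernel, the fixed bandwidth, the unbiased estimator and the simulated feature matrices (`m1_reference`, `m2_reference`, `test_population`) are all illustrative choices that do not come from the abstract.

```python
import numpy as np


def rbf_kernel(a, b, bandwidth):
    """Gaussian (RBF) kernel matrix between the rows of a and the rows of b."""
    sq_dists = (
        np.sum(a**2, axis=1)[:, None]
        + np.sum(b**2, axis=1)[None, :]
        - 2.0 * a @ b.T
    )
    # Clip tiny negative values caused by floating-point round-off.
    return np.exp(-np.maximum(sq_dists, 0.0) / (2.0 * bandwidth**2))


def mmd2_unbiased(x, y, bandwidth=2.0):
    """Unbiased estimate of the squared MMD between samples x and y."""
    m, n = len(x), len(y)
    k_xx = rbf_kernel(x, x, bandwidth)
    k_yy = rbf_kernel(y, y, bandwidth)
    k_xy = rbf_kernel(x, y, bandwidth)
    # Drop the diagonal (self-similarity) terms for the within-sample averages.
    within_x = (k_xx.sum() - np.trace(k_xx)) / (m * (m - 1))
    within_y = (k_yy.sum() - np.trace(k_yy)) / (n * (n - 1))
    return within_x + within_y - 2.0 * k_xy.mean()


# Hypothetical morphometric features: 200 cells x 5 features per population.
rng = np.random.default_rng(0)
m1_reference = rng.normal(0.0, 1.0, size=(200, 5))
m2_reference = rng.normal(3.0, 1.0, size=(200, 5))
test_population = rng.normal(0.5, 1.0, size=(200, 5))  # simulated to sit near M1

mmd_to_m1 = mmd2_unbiased(test_population, m1_reference)
mmd_to_m2 = mmd2_unbiased(test_population, m2_reference)
closer_to = "M1" if mmd_to_m1 < mmd_to_m2 else "M2"
print(f"MMD^2 to M1: {mmd_to_m1:.4f}, to M2: {mmd_to_m2:.4f}, closer to {closer_to}")
```

Comparing the two statistics gives a relative "three-sample" reading of where the test population sits between the references; confidence intervals such as those mentioned in the abstract could be approximated by bootstrapping cells, which this sketch omits.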

P117

Engaging the Single Cell Community to define the Human Cell Atlas Metadata Standard

Marion Shadbolt, Ami Day, Javier Ferrer, Mallory Freeberg, Enrique Sapena-Ventura, Ray Stefancsik, Zina Perova, Danielle Welter, Norman Morrison, Laura Clarke, Tony Burdett

EMBL-EBI

The field of single-cell biology has expanded dramatically in recent years with vast amounts of data being produced to describe millions of individual cells generated by labs all around the world. The Human Cell Atlas (HCA) aims to leverage these data to characterise the type, state and position of all cells in the human body. Key to this pursuit is a rich, descriptive, well-defined language to describe the samples, protocols, analyses and results of the experiments used to generate these data.

The HCA metadata standard aims to provide this common language to enable both discoverability and interoperability of single-cell datasets. This language unlocks the power of combining multiple studies to discover new insights over and above what can be derived from individual studies. As single-cell technologies develop and evolve, ensuring the standard stays relevant requires a strong connection to the single-cell community. Here we present the ways in which the HCA metadata team engages with the community to create and evolve the standard and thus facilitate the creation of the most comprehensive reference map of the fundamental units of life.

P118

Single-cell transcriptome atlas of 15 adult human organs from one donor reveals inter-cell heterogeneity

Shuai He, Lin-he Wang, Yang Liu, Yi-qi Li, Haitian Chen, Jinghong Xu, Dan Wang, Jinxin Bei, Xiaoshun He, Zhiyong Guo

Sun Yat-sen University Cancer Centre, State Key Laboratory of Oncology in South China, Collaborative Innovation Centre for Cancer Medicine, Guangdong Key Laboratory of Nasopharyngeal Carcinoma Diagnosis and Therapy, Guangzhou 510060, P. R. China.

The transcriptome diversity of cell types within and across tissues is of great importance for the biological function of human beings. To comprehensively delineate the transcriptomic similarities and differences among cell types, we performed single-cell RNA, T cell receptor (TCR) and B cell receptor (BCR) sequencing of 91,393 cells from 15 organs of one adult male. Firstly, we made a detailed annotation of cell subpopulations by unsupervised clustering in each organ. Secondly, by analyzing the TCR and BCR repertoire, we elucidated the clone structure in different tissues and sub-clusters of T and B cells. Then, we revealed the similarities and differences of major cell types within and across organs in multiple aspects, including gene profiles, regulon activity and potential biological functions of different cell subpopulations. Finally, we illustrated the potential communications between different cell types within tissues. Overall, our dataset provides a resource for transcriptome characteristics of multiple organs from healthy adults at the single-cell level.

P119

SMART-Seq Stranded Kit performance with ovarian cancer cells

1.Luke Sherlin, 1.Anastasia Potts, 1.Nathalie Bolduc, 1.Simon Lee, 1.Magnolia Bostick, 1.Andrew Farmer 2.Presented by Francois-Xavier Sicot

1.Takara Bio USA, Inc., Mountain View, CA 94043, USA. 2.Takara Bio Europe

Single-cell RNA sequencing (scRNA-seq) approaches are increasingly being used to characterize the abundance and functional state of tumor-associated cell types, and have provided unprecedented detail into cellular heterogeneity. Extracting meaningful biological information from the small amount of RNA in single cells requires a library preparation method with exceptional sensitivity and reproducibility. The SMART-Seq® v4 Ultra® Low Input RNA Kit for Sequencing (SMART-Seq v4) is an extremely sensitive scRNA-seq library preparation method in part due to its capability to retrieve information from full-length mRNA and not just the 3' end. However, this method can only capture polyadenylated mRNA. To address this, we have modified our SMART® RNA-seq technology to create the SMART-Seq Stranded Kit, a single-cell RNA-seq library preparation method that relies on random priming instead of oligo dT priming. The SMART-Seq Stranded Kit captures any RNA regardless of polyadenylation status and preserves strand-of-origin information, making it more amenable to distinguishing overlapping genes and to comprehensive annotation and quantification of long noncoding RNAs (lncRNAs). To show the applicability of the SMART-Seq Stranded Kit in characterizing tumor heterogeneity, we analyzed single cells dissociated from a solid tumor in stage IV ovarian cancer (serous carcinoma). We sorted CD45+ leukocytes and EpCAM+ tumor cells in 96-well plates. After library preparation, sequencing, and analysis, we detected an average of 4,717 genes in the CD45+ cells and 8,039 genes in the EpCAM+ tumor cells. This analysis enabled identification of well-accepted markers of tumor-infiltrating lymphocytes (TILs) associated with ovarian carcinoma.

P120

Single-cell RNA-seq identifies hormone-producing cell types in the teleost fish pituitary

Khadeeja Siddique, Eirill Ager-Wick, Finn-Arne Weltzien, Christiaan Henkel

Norwegian University of Life Sciences, Faculty of Veterinary Medicine, Oslo, Norway

The pituitary is a master endocrine gland in vertebrates, which controls a variety of physiological functions including growth, metabolism, homeostasis, reproduction, and response to stress. These functions are modulated by the secretion of several protein hormones, e.g. growth hormone (Gh), luteinizing hormone (Lh) and follicle-stimulating hormone (Fsh). In teleost fish, each hormone is presumably produced by a specific cell type. Unfortunately, key details of hormone production and its regulation are still poorly understood. Therefore, in order to improve the experimental control of these physiological processes, we decided to study the mechanisms of pituitary hormone production in fish at the single-cell level.

Recent advances in genomics technology provide new avenues to address these important questions. We have used the 10x Genomics platform to perform scRNA-seq on the pituitary of the model species medaka (Oryzias latipes). Using Cell Ranger (10x) and Seurat for quality control, normalization and downstream analysis (cell clustering), we profiled 2258 pituitary cells.

Our single-cell data reveal eight hormone-producing cell types, demonstrating a strict division of labour - each hormone is produced by a dedicated cell type. This is in contrast to the tetrapod pituitary, in which a single cell type can produce, for example, both Lh and Fsh. Many of these cells also show extreme specialization: for instance, cells producing proopiomelanocortin (POMC, a precursor for hormones controlling metabolism, pigmentation, and stress) devote more than 50% of their transcript pool to the production of this protein. Finally, we identified two novel populations of prolactin-producing cells with different developmental origins. In fish, this hormone is involved in osmoregulation.

In conclusion, we present the first scRNA-seq study to characterize the teleost pituitary in detail. We will use these findings for detailed functional studies of hormone-producing cell types and their biological significance in fish reproduction.
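
The observation that POMC-producing cells devote more than 50% of their transcript pool to one gene can be checked per cell as the gene's share of that cell's total counts. The sketch below is a minimal illustration, assuming a cells x genes count matrix; the function name, the gene labels and the toy counts are hypothetical and are not taken from the study.

```python
import numpy as np


def transcript_fraction(counts, gene_names, gene):
    """Per-cell fraction of all counts that belong to one gene.

    counts: cells x genes matrix of raw (e.g. UMI) counts.
    """
    counts = np.asarray(counts, dtype=float)
    gene_index = list(gene_names).index(gene)
    totals = counts.sum(axis=1)
    # Cells with zero total counts get a fraction of 0 instead of NaN.
    return np.divide(
        counts[:, gene_index],
        totals,
        out=np.zeros_like(totals),
        where=totals > 0,
    )


# Toy data (hypothetical): three cells, three genes.
gene_names = ["pomc", "gh1", "actb"]
counts = [
    [600, 10, 390],  # a highly specialized cell
    [5, 700, 295],   # a cell dominated by another hormone transcript
    [0, 0, 0],       # an empty barcode
]

fractions = transcript_fraction(counts, gene_names, "pomc")
specialized = fractions > 0.5  # cells above the 50% threshold
print(fractions, specialized)
```

Working from raw counts rather than log-normalized values keeps the fraction interpretable as a share of the transcript pool.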

P121

Cell-cell communication analysis in malignant and healthy bone marrow using COMUNET algorithm on scRNAseq data

Maria Solovey 1, Frank Ziemann 2, Klaus H. Metzeler 2, Maria Colomé-Tatché 1, Antonio Scialdone 1,3,4

1 Institute of Computational Biology, Helmholtz Zentrum München, Munich, Germany 2 Laboratory for Leukemia Diagnostics, Department of Medicine III, University Hospital, LMU Munich, Munich, Germany 3 Institute of Epigenetics and Stem Cells, Helmholtz Zentrum München, Munich, Germany 4 Institute of Functional Epigenetics, Helmholtz Zentrum München, Munich, Germany

Intercellular communication plays an essential role in maintaining blood homeostasis in the bone marrow, and its alteration might contribute to the development and maintenance of malignant neoplasms. In our project, we developed COMUNET [1] to assess cell-cell communication between distinct cell types using single-cell RNAseq data. With COMUNET, we assess differential communication between two biological conditions and use new visualisation and classification tools based on multiplex networks and network clustering strategies. Here we apply COMUNET to a publicly available single-cell RNAseq data set (van Galen et al., 2019 [2]). We analysed bone marrow samples from an AML patient at diagnosis and in remission, as well as a healthy bone marrow control. In this way we identified new lines of intercellular communication that change during malignant transformation and can be reestablished by treatment of the AML. These findings will serve as a basis for further biological validation of the identified communication lines in vivo and show how existing data sets can be used to establish new data-driven hypotheses for a deeper understanding of hematopoietic neoplasms.

1. Solovey, M. & Scialdone, A. COMUNET: a tool to explore and visualize intercellular communication. bioRxiv 864686 (2019) doi:10.1101/864686.
2. van Galen, P. et al. Single-Cell RNA-Seq Reveals AML Hierarchies Relevant to Disease Progression and Immunity. Cell 176, 1265-1281.e24 (2019).

P122

Characterizing Hospital Adaptation of MRSA Using Host and Pathogen Transcriptomics

Jacob B. Swanson, Gal Avital, Felicia Kuperwaser, Erin Zwack, Magdalena Podkowik, Victor J. Torres, Bo Shopsin, and Itai Yanai

NYU Grossman School of Medicine

Methicillin-resistant Staphylococcus aureus (MRSA) is a highly contagious and deadly bacterium that is difficult to treat due to its resistance to β-lactam antimicrobials. Distinct MRSA isolates are specialized to infect hosts in community and hospital environments, known as community-associated MRSA (CA-MRSA) and hospital-associated MRSA (HA-MRSA), respectively. CA-MRSA is marked by higher cytotoxicity compared with HA-MRSA, yet HA-MRSA can be more lethal in hospital settings. These observations suggest that MRSA isolates utilize distinct molecular mechanisms as they adapt from community to hospital environments. Interestingly, we have identified transitional CA-MRSA (tCA-MRSA) isolates that seem to be adapting to the hospital environment. The molecular signatures associated with these three forms of MRSA during infection remain poorly understood. We hypothesized that there are transcriptomic differences in both host and pathogen cells during MRSA infection, and that these differences will elucidate distinct molecular pathways that different MRSA isolates use during hospital adaptation. To this end, we have begun to study the interaction between MRSA and human macrophages. Macrophages were treated with medium as a control or infected with either CA-MRSA, tCA-MRSA, or HA-MRSA, each tagged with GFP, at MOI 1. The infected cells were GFP-sorted by FACS. Bulk populations of 200 infected cells were analyzed using CEL-Seq2 for host transcriptomics at 10 timepoints ranging from 30 min to 5 h post-infection, and bulk populations of 2,000 infected cells were analyzed using scDual-Seq for simultaneous host and bacterial transcriptomics at 1, 3, and 5 h post-infection. Our CEL-Seq2 analysis revealed that host transcripts associated with inflammation such as NFKBIA, CCL3, CXCL1, SMAGP, TNF, and IL1A are enriched in host cells infected with tCA- and HA-MRSA, but not CA-MRSA.
scDual-Seq analysis further revealed that bacterial transcripts involved in metabolism are enriched in cells infected with CA-MRSA compared with tCA- and HA-MRSA. In addition, cells infected with CA- and tCA-MRSA have upregulated agr, a master virulence regulator, compared with HA-MRSA. The pro-inflammatory profiles of host cells infected with tCA- and HA-MRSA are paradoxical given their relatively low virulence in laboratory assays compared to CA-MRSA. These results suggest that CA-MRSA suppresses the inflammatory response of the host to replicate and bolster virulence. The upregulation of metabolic pathways in CA-MRSA further suggests that metabolic adaptation is a critical virulence strategy. Our results start to point to targetable molecular pathways utilized by both host and pathogen, which could potentially lead to more effective treatments for patients infected with diverse strains of MRSA.

P123

SingCellaR, a Novel Single-cell Analysis Pipeline, Reveals Cellular Architecture of Human Haematopoietic Stem and Progenitor Cells Throughout Fetal and Adult Life

Supat Thongjuea 1, Guanlin Wang 1, Sorcha O’Byrne 2, Natalina Elliott 2, Peng Hua 2, Elisabeth Heuston 3, Deena Iskander 4, Anastasios Karadimitris 4, David M. Bodine 3, Adam Mead 2, Irene Roberts 2, Anindita Roy 2, Bethan Psaila 2

(1) MRC WIMM Centre for Computational Biology, MRC WIMM, University of Oxford, Oxford OX3 9DS, U.K. (2) MRC Molecular Haematology Unit, WIMM, University of Oxford, Oxford, OX3 9DS, U.K. (3) Hematopoiesis Section, National Research Institute, National Institutes of Health, Bethesda, MD 20892-4442, U.S.A. (4) Centre for Haematology, Hammersmith Hospital, Imperial College Medicine, London W12 OHS, U.K.

Haematopoietic stem/progenitor cells (HSPCs) constantly replenish all mature blood cells throughout life. In humans, definitive haematopoiesis in the fetal liver (FL), which remains the main site of haematopoiesis throughout fetal life, begins at around 5 post-conceptional weeks (pcw). Haematopoiesis in the bone marrow (BM) starts around 11-12 pcw, but does not take over as the primary site of haematopoiesis until just after birth. We know very little about how HSPC subsets change through ontogeny and whether they do so in a site- and stage-specific manner. To explore HSPC heterogeneity and differentiation trajectories during fetal and adult life, we analysed 57,489 individual lineage-negative, CD34+ cells by RNA-seq from first trimester FL (n=2, 13,864 cells; 7-8 pcw); matched second trimester FL and fetal BM (FBM) (n=2, 15,141 cells from FL and 12,073 cells from FBM; 18-19 pcw); paediatric BM (n=2, 10,769 cells); and healthy adult donors (n=2, 5,642 cells) using the 10x Genomics Chromium platform. We have developed SingCellaR, a novel single-cell analysis pipeline in R that allows us to robustly integrate single-cell RNA-seq data from fetal and adult life. We used SingCellaR to identify 22 sub-clusters. Using the semi-automated cell type annotation system implemented in SingCellaR, we classified clusters into 6 main distinct cell types (HSPCs, myeloid, lymphoid, erythroid, megakaryocyte, and eosinophil/basophil/mast progenitor cells) with differing compositions of cell cycle stages. We observed that the composition of the Lin-CD34+ cell compartment was distinct in different stages of ontogeny. The early FL samples showed a substantially higher proportion of megakaryocyte and erythroid progenitors, which decreased during development. HSPC composition also varied in a site-specific manner as evidenced by differences seen in matched second trimester FL and FBM samples from the same fetus.
Second trimester FBM showed a higher proportion of myeloid and lymphoid lineage specific progenitors compared to all other tissues including the matched FL samples. SingCellaR identified signature genes and transcription factors, which highlighted transcriptional differences over the differentiation trajectories across ontogeny. In conclusion, we analysed the Lin-CD34+ HSPC compartment throughout human ontogeny using our novel computational analysis tool that provides the essential functionality for the analyses and visualisation of single-cell RNA-seq. There is clear evidence that CD34+ HSPC compartment varies in its composition and differentiation potential in a site and developmental stage-specific manner and may be dependent on the physiological demands of that particular developmental stage or occur in response to specific microenvironmental cues.

P124

Generating Synthetic Single-Cell RNA-Sequencing Data from Small Pilot Studies using Deep Learning

Martin Treppner, Adrián Salas, Stefan Lenz, Bismark Appiah, Tanja Vogel, and Harald Binder

Institute of Medical Biometry and Statistics, Faculty of Medicine and Medical Center - University of Freiburg, Stefan-Meier-Str. 26, 79104, Freiburg, Germany

When researchers design experiments, they commonly start with a small pilot study to determine the sample size for a full-scale investigation. Also, biologists increasingly use deep learning models for working with single-cell RNA-sequencing data, which learn a low-dimensional representation of expression patterns within cells. Here, we examine the ability of these methods to aid the design of experiments, by learning the structure of data from a pilot study and subsequently generating expression patterns for planning full-scale experiments.

We investigate two deep generative models. The first is single-cell variational inference (scVI), a frequently used method for scRNA-seq data analysis. We generate samples from the posterior and the prior distribution, which ensures that samples come from a diverse region of the input space. As the second, we propose single-cell deep Boltzmann machines (scDBM), which are particularly suitable for small datasets. We take subsamples from the PBMC4K dataset from 10x Genomics, train the models on these subsamples and generate synthetic expression patterns in the size of the original study. Then, we apply downstream analyses and use the Davies-Bouldin Index to evaluate the clustering performance of the original and synthetic data, respectively. To investigate heterogeneity, we compare the relative frequencies of generated cells per cluster. Lastly, we examine whether the synthetic data resemble univariate and bivariate structures of the original data on marker genes.

We found that for clustering, scVI_posterior exhibits high variability, whereas expression patterns generated from scVI_prior and scDBM perform better in clustering tasks. Besides, the models show mixed results regarding similarity to cellular heterogeneity within samples, sometimes overestimating highly abundant cell types, whereas less abundant cell types are not detected. Furthermore, we observe that all models properly learn the univariate distribution of marker genes, but have difficulties with capturing complex correlations between genes.

We conclude that for making inferences from a small-scale study to a larger experiment it is advantageous to use scVI_prior or scDBM, since the commonly used scVI_posterior produces expression patterns that are too close to the input. Also, the scDBM shows an additional advantage for small datasets, potentially due to its reduced complexity. We show that deep learning models can improve experimental design and therefore advance the replicability of scRNA-seq experiments.
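
The Davies-Bouldin Index used above to score clustering can be written in a few lines. The following is a minimal NumPy sketch of the standard definition (lower values indicate tighter, better separated clusters), not the authors' evaluation code; the simulated `tight` and `diffuse` datasets are hypothetical stand-ins for embeddings of original or synthetic cells.

```python
import numpy as np


def davies_bouldin_index(points, labels):
    """Davies-Bouldin Index: mean over clusters of the worst-case ratio
    (scatter_i + scatter_j) / distance(centroid_i, centroid_j)."""
    points = np.asarray(points, dtype=float)
    labels = np.asarray(labels)
    cluster_ids = np.unique(labels)
    centroids = np.array(
        [points[labels == c].mean(axis=0) for c in cluster_ids]
    )
    # Scatter: mean Euclidean distance of cluster members to their centroid.
    scatter = np.array([
        np.linalg.norm(points[labels == c] - centroids[i], axis=1).mean()
        for i, c in enumerate(cluster_ids)
    ])
    k = len(cluster_ids)
    worst_ratio = np.zeros(k)
    for i in range(k):
        for j in range(k):
            if i != j:
                separation = np.linalg.norm(centroids[i] - centroids[j])
                ratio = (scatter[i] + scatter[j]) / separation
                worst_ratio[i] = max(worst_ratio[i], ratio)
    return worst_ratio.mean()


# Hypothetical 2-D embeddings: the same three cluster centres, two noise levels.
rng = np.random.default_rng(1)
labels = np.repeat([0, 1, 2], 100)
centres = np.array([[0.0, 0.0], [5.0, 0.0], [0.0, 5.0]])
tight = centres[labels] + rng.normal(0.0, 0.3, size=(300, 2))
diffuse = centres[labels] + rng.normal(0.0, 2.0, size=(300, 2))

db_tight = davies_bouldin_index(tight, labels)
db_diffuse = davies_bouldin_index(diffuse, labels)
print(f"tight: {db_tight:.3f}, diffuse: {db_diffuse:.3f}")
```

In a comparison like the one described, the index would be computed once on the clustered original data and once on each clustered synthetic dataset, with similar values suggesting that the generated cells preserve cluster structure.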

P125

Cross-regional single-cell multilineage diversity in multiple sclerosis

Amel Zulji1,2, Tim Trobisch1, Dmitry Velmeshev3,4, Maximilian Haeussler5, Michael Platten1,6, Arnold Kriegstein3,4, Simon Anders7, David Rowitch3,8,9, Lucas Schirmer1,10

1Department of Neurology, Medical Faculty Mannheim, University of Heidelberg, 68167 Mannheim, Germany; 2Heidelberg Biosciences International Graduate School (HBIGS), University of Heidelberg, 69120 Heidelberg, Germany; 3Eli and Edythe Broad Center of Regeneration Medicine and Stem Cell Research, University of California, San Francisco, San Francisco, CA 94143, USA; 4Department of Neurology, University of California, San Francisco, San Francisco, CA 94158, USA, 5Genomics Institute, University of California, Santa Cruz, CA, 95064, USA; 6DKTK Clinical Cooperation Unit Neuroimmunology and Brain Tumor Immunology, German Cancer Research Center (DKFZ) and National Center for Tumor Diseases (NCT), 69120 Heidelberg, Germany. 7Center for Molecular Biology of Heidelberg University (ZMBH), 69120 Heidelberg, Germany. 8Department of Paediatrics and Wellcome-MRC Cambridge Stem Cell Institute, University of Cambridge, Cambridge, CB2 0QQ, UK; 9Department of Pediatrics, University of California, San Francisco, San Francisco, CA 94158, USA; 10Interdisciplinary Center for Neurosciences (IZN), University of Heidelberg, 69120 Heidelberg, Germany.

Multiple sclerosis (MS) is a chronic inflammatory disease of the central nervous system (CNS) affecting at least 2.3 million people worldwide. The disease is assumed to be triggered by autoreactive lymphocytes, an event that ultimately leads to a disturbed tissue micro-environment driven by reactive glial cells resulting in loss of myelin and progressive neurodegeneration. Permanent damage is manifested through various symptoms and strongly correlates with the affected anatomical region. However, the precise mechanism of the disease initiation and its progression as well as susceptibility of different anatomical regions to the disease are not well understood. Here, we utilized single-nucleus RNA-sequencing (snRNA-seq) and performed an integrative analysis of human MS pathology as compared to control tissue comprising neocortex, cerebellum and spinal cord. In a first step, we identified highly specific marker genes for all cell populations. Further, we aimed at assessing similarities and differences of disease-induced cell-type specific transcriptomic changes throughout the different CNS regions. By performing differential gene expression analysis, we identified MS specific gene dysregulations in all major cell types, e.g. suggesting variable degrees of astrocyte and microglia activation in MS. In summary, integration of different regional snRNA-seq data sets from MS patients allows us to generate a comprehensive cross-regional single-cell transcriptomic map of MS lesion pathology. The findings help identify novel cell-type specific disease biomarkers and potential targets to better characterize MS and tailor therapeutic interventions.

P126

The human retrovirus HTLV-1: simultaneous single-cell sequencing of the provirus, integration site & transcriptome

Jocelyn Turpin (1,2), Anat Melamed (1), Ashleigh Lister (3), Iain Macaulay (3), Yorifumi Satou (2,4) , Charles R M Bangham (1)

1. Department of Medicine, Imperial College London, UK. 2. Division of Genomics and Transcriptomics, Joint Research Center for Human Retrovirus Infection, Kumamoto University, Kumamoto, Japan. 3. Earlham Institute, Norwich Research Park, Norwich NR4 7UH, UK. 4. International Research Center for Medical Sciences (IRCMS), Kumamoto University, Kumamoto, Japan

The Human T-cell leukaemia virus (HTLV-1) infects 5 to 10 million persons worldwide. Like other retroviruses, HTLV-1 integrates a double-stranded DNA copy of its genome (the provirus) into the host cell genome. A typical infected individual carries between 10^4 and 10^5 infected T-cell clones, each clone containing a single-copy provirus in a unique genomic site. Integration upstream of certain oncogenes has been associated with adult T cell leukaemia, one of the two diseases associated with HTLV-1. A fraction of proviruses are defective, containing either point mutations or deletions.

There is recent evidence that the HTLV-1 provirus is reactivated from latency in intense intermittent transcriptional bursts, and we wish to identify the factors that regulate this reactivation. We hypothesize that two factors that influence the transcriptional bursting are the proviral sequence and the site in which it is integrated in the host genome. We have developed a method based on the combined genome and transcriptome sequencing of single cells (G&T-Seq; Macaulay et al., Nature Methods, 2015). Infected single cells are first sorted into multi-well plates by flow cytometry. After physical separation of the mRNA and genomic DNA, full-length cDNA libraries are synthesized using the Smart-Seq2 protocol. The genomic DNA is amplified by multiple displacement amplification. The HTLV-1 provirus is less than 10 kb long, so a single-copy provirus represents approximately 0.0003% of the human genome. To increase the sensitivity of detection of the provirus, biotinylated oligoprobes covering the HTLV-1 genome are used to capture and enrich the proviral DNA. This DNA capture increases the sensitivity of detection of the proviral sequences by over 1000-fold, yielding reads aligning to the HTLV-1 provirus that make up between 0.8% and 60% of total mapped reads. Virus-host chimeric fragments containing the junction between the provirus and the flanking genomic DNA are also captured by the probes, enabling the detection of the integration site.

The results will be used to analyse the impact of the proviral sequence and genomic integration site on viral and host-cell transcription at the single-cell level. This approach will directly contribute to an understanding of what controls HTLV-1 transcription in vivo and can be applied to other retroviruses and transposable elements.
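
The proviral fraction and the enrichment quoted above follow from simple arithmetic, sketched here in Python. The haploid genome size of roughly 3.1 Gb is an assumption introduced for this illustration (the abstract does not state which figure was used), and 10 kb is the upper bound on provirus length given in the text.

```python
# Upper bound on provirus length from the abstract (less than 10 kb).
PROVIRUS_LENGTH_BP = 10_000
# Assumption for illustration: approximate haploid human genome size.
HUMAN_GENOME_BP = 3.1e9

# Share of genomic sequence contributed by a single-copy provirus, in percent.
percent_before_capture = 100 * PROVIRUS_LENGTH_BP / HUMAN_GENOME_BP


def fold_enrichment(observed_percent, baseline_percent):
    """How many times more abundant proviral reads are than expected by chance."""
    return observed_percent / baseline_percent


# Reported range of proviral reads after probe capture: 0.8% to 60%.
lowest_fold = fold_enrichment(0.8, percent_before_capture)
highest_fold = fold_enrichment(60.0, percent_before_capture)

print(f"expected without capture: {percent_before_capture:.5f}%")
print(f"enrichment range: {lowest_fold:,.0f}x to {highest_fold:,.0f}x")
```

Under these assumptions even the lowest reported read fraction corresponds to an enrichment well above 1000-fold, consistent with the statement in the abstract.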

P127

Characterising cells found in human milk using single cell transcriptomics

Alecia-Jane Twigger1,2, Lisa K. Meixner2, Karsten Bach1, Isabel Schultz-Pernice2, Stefania Petricca2, Christina H. Scheel2, Walid T. Khaled1

1.Department of Pharmacology, University of Cambridge, United Kingdom 2.Institute for Stem Cell Research, Helmholtz Zentrum München, Germany

Breast cancer affects 1 in 8 women, leading to 1000 deaths every month in the UK alone. Epidemiological data suggest that age is the greatest risk factor for breast cancer. The age-dependent risk of tumorigenesis is modulated by parity, which attenuates the risk, and by predisposing germline mutations, which increase it. Currently there is little understanding of the cellular changes induced by parity in the human breast. Major changes occurring during pregnancy prime the gland to fulfil its purpose of milk synthesis and secretion during lactation. The glandular tissue, rendered functional through this process, is composed of contractile basal myoepithelial cells and secretory luminal cells. Thus far, limited access to suitable human epithelial cells has meant that our knowledge of the pathways governing maturation and milk production in the breast is incomplete. In this study we set out to chart the cellular changes of normal mammary cells by comprehensively characterizing cells found in human milk and comparing them to resting (non-pregnant, non-lactating) mammary epithelial cells derived from breast reduction tissue. For this purpose, 29,078 human milk cells (from n=4 participants) together with 25,145 cells from resting breast tissue (from n=4 participants) were analysed using the 10x Genomics single-cell RNA-sequencing platform. Our scRNA-seq analysis revealed that human milk contained viable progenitor and differentiated luminal cells as well as immune cells, which was corroborated by FACS analysis. We did not detect stromal or basal cells in human milk; however, secretory luminal cells, arguably the cell type of greatest interest, were present in high numbers across all participants and contributed to multiple cell clusters. Initial analysis comparing differentiated luminal cells derived from either milk or resting tissue revealed a large number of differentially expressed genes, many of which were associated with the synthesis and transport of various milk components.
Further examination of the luminal progenitor cells found in the milk will be of great interest given the recent reports implicating this cell population as the cell of origin of breast cancer. Understanding changes in this cell population could help explain the impact of parity on breast cancer risk.

P128

Visualization and integration of single-cell long-read and short-read data for isoform expression analysis

Vladimir Uzun, Laura Mincarelli, Wilfried Haerty, Iain C. Macaulay

Earlham Institute, Norwich Bioscience Institutes, Norwich Research Park, Norwich

Rapidly improving single-cell technologies are leading to multiple data types available from the same cell. Integrating these different types is crucial for better characterisation of the cell and to understand the relationship between different modalities.

With additional data types and integration approaches, broad and meaningful data exploration is becoming more of a challenge. Easy-to-use inspection tools that do not require much computational background can be especially valuable to streamline discovery and the adoption of new approaches.

We produced parallel single-cell short-read (Illumina) and long-read (PacBio) RNA-seq data from 10x Genomics cDNA libraries generated from mouse blood stem and progenitor cells. Short-read data, with its high number of reads, enables clustering of the cells and the discrimination or identification of different cell types, while long-read data supplements this with cluster-specific isoform information. Integration of the two data sets allows us to observe alternative splicing events and changes in isoform expression across cell types and biological states.

To explore this data more effectively, we have developed a Shiny application in R ("scigec" - single cell isoform and gene expression counts) that enables visual (bar charts and dimensionality reduction plots) and tabular inspection of isoform counts just by specifying the gene name. This allows analysis of isoform numbers within and between biological states, experimental conditions or cell types, as well as inspection of custom metadata and of plots with clustering based on short-read data, with the option to see both long-read (isoform) and short-read data simultaneously.
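The kind of per-gene tabulation described above can be sketched in a few lines. This is not the scigec implementation (which is an R Shiny application); it is a minimal Python illustration under assumed inputs. The record layout, the cell barcodes, the cluster labels and the gene and isoform names are all hypothetical.

```python
from collections import defaultdict

# Hypothetical long-read records: (cell barcode, gene, isoform ID).
long_reads = [
    ("AAAC", "GeneA", "GeneA-201"),
    ("AAAC", "GeneA", "GeneA-202"),
    ("AAAG", "GeneA", "GeneA-201"),
    ("CCCT", "GeneA", "GeneA-202"),
    ("CCCT", "GeneA", "GeneA-202"),
    ("CCCT", "GeneB", "GeneB-201"),
    ("TTTT", "GeneA", "GeneA-201"),  # barcode absent from the clustering
]

# Hypothetical cluster labels derived from the short-read data.
clusters = {"AAAC": "cluster_1", "AAAG": "cluster_1", "CCCT": "cluster_2"}


def isoform_counts(gene, reads, cell_to_cluster):
    """Count long reads per (cluster, isoform) for a single gene."""
    table = defaultdict(lambda: defaultdict(int))
    for barcode, read_gene, isoform in reads:
        cluster = cell_to_cluster.get(barcode)
        # Skip other genes and cells that were not assigned to a cluster.
        if read_gene == gene and cluster is not None:
            table[cluster][isoform] += 1
    return {cluster: dict(counts) for cluster, counts in table.items()}


counts = isoform_counts("GeneA", long_reads, clusters)
for cluster, per_isoform in sorted(counts.items()):
    print(cluster, per_isoform)
```

Keying the table by cluster first mirrors the workflow in the text, where cell types come from the short-read data and the long reads add isoform detail within each cluster.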

P129

Epigenome-based inference of transcription factor activity from single cell expression data

Simon J. van Heeringen

Radboud University, Department of Molecular Developmental Biology, Faculty of Science, Radboud Institute for Molecular Life Sciences, Nijmegen, The Netherlands.

Single-cell expression profiling (scRNA-seq) is rapidly becoming an essential technique to assay heterogeneous cell populations. ScRNA-seq experiments often result in a descriptive atlas of cell types or cellular trajectories inferred by computational models. However, it remains challenging to identify the molecular processes underlying the different cell states. Here, we present SCEPIA (Single Cell EPigenome-based Inference of Activity), a method that uses computationally inferred epigenomes of single cells to identify transcription factors that determine cellular states. We created a reference compendium of genome-wide profiles of the enhancer-associated histone mark H3K27ac. Using this reference, we infer enhancer activity profiles of single cells with their expression state as input. Based on the transcription factor motifs underlying the inferred enhancers, SCEPIA identifies transcription factor activities for single cells. We show that the majority of regulatory logic driving cell type identity is determined by enhancers and that our method significantly improves on promoter-based motif analysis. We use the 10x peripheral blood mononuclear cell (PBMC) scRNA-seq reference data to illustrate that SCEPIA successfully retrieves the transcription factors that drive the different hematopoietic lineages. Furthermore, we demonstrate that SCEPIA identifies relevant transcription factors even when using scRNA-seq data of complex tissues, or for cell types for which there is no exact reference match available. SCEPIA adds a layer of regulatory information to scRNA-seq data and provides a critical step towards the elucidation of the gene regulatory networks that determine cell fate.

P130

Cancer cells driving peritoneal metastasis and therapy resistance in high-grade serous ovarian cancer

Niels Vandamme (1,2,3), Jordy De Coninck (1,2), Nele Loret (1,2,4), Ruth Seurinck (1,2,3), Kevin Verstaen (2,3), Sam Dupont (3,5), Jana Roels (2,3), Robin Browaeys (2,3), Wouter Saelens (2,3), Liesbet Martens (2,3), Philippe Tummers (2,4), Gillian Blancke (1,2), Eva De Smedt (1,2), Yvan Saeys (2,3) and Geert Berx (1,2)

(1) Molecular and Cellular Oncology Lab, Department of Biomedical Molecular Biology, Ghent University, Technologiepark 927, 9052 Ghent, Belgium (2) Cancer Research Institute Ghent (CRIG), Ghent, Belgium (3) Data Mining and Modeling for Biomedicine, VIB Inflammation Research Center, Ghent, Belgium (4) Universitary hospital, Corneel Heymanslaan 10, 9000 Ghent, Belgium (5) Center for Inflammation Research, Laboratory of Immunoregulation, VIB, Ghent, Belgium.

With an overall five-year survival of only 40%, epithelial ovarian cancer is the most lethal of all gynecologic malignancies. This poor prognosis is mainly due to delayed diagnosis and frequent therapy-resistant relapses. In this project we focus on the high-grade serous (HGS) histological subtype of epithelial ovarian cancer, which is responsible for the largest number of ovarian cancer deaths. The aim of this project is to investigate each of the steps in the formation of peritoneal metastases to reveal the subpopulation of cancer cells driving peritoneal metastasis. A second goal is to identify the cancer cells contributing to therapy resistance. For this purpose, fresh tumor specimens in different stages of tumor progression (primary tumor > ascites > peritoneal metastasis) were collected from patients with high-grade serous ovarian cancer before and after chemotherapy. After enzymatic digestion and depletion of immune cells, single-cell libraries were constructed, allowing RNA-sequencing of individual cancer cells from each sample. Subpopulations and single cells are characterized via gene expression analysis, principal component analysis (PCA), clustering, and trajectory inference. In these patients, we identified distinct and overlapping gene expression signatures in primary, ascites-derived, metastatic, minimal residual disease (MRD) and/or resistant relapse samples before and after therapy exposure. Further gene expression analysis will allow us to characterize the different cancer cell states in detail and to discover novel biomarkers for ovarian cancer progression and therapy resistance.

P131

Developing a standardized computational workflow for mass spectrometry-based single cell proteomics

Christophe Vanderaa, Laurent Gatto

Computational biology and bioinformatics, UCLouvain

Recent advances in sample preparation, processing and mass spectrometry (MS) have allowed the emergence of MS-based single-cell proteomics (SCP). Two main procedures are currently emerging: multiplexed labeled quantification using tandem mass tags (TMT) and label-free quantification (LFQ). Both methods are developed by separate groups and so are the computational tools. We present a package that provides a robust and standardized workflow for analyzing such MS-SCP data. The implementation uses well-defined classes from Bioconductor which already provides powerful tools for single-cell RNA sequencing. We demonstrate that our pipeline can reproduce and standardize TMT and LFQ data analyses using only a few lines of code.

P132

Functional heterogeneity within the developing zebrafish epicardium

Michael Weinberger1,2, Filipa C. Simões1,2, Tatjana Sauka-Spengler2, and Paul R. Riley1

1 University of Oxford, Department of Physiology, Anatomy and Genetics, Oxford OX1 3PT, UK 2 University of Oxford, Weatherall Institute of Molecular Medicine, Radcliffe Department of Medicine, Oxford OX3 9DS, UK

The epicardium is a sheet of cells enveloping the heart muscle and is essential during cardiac development, homeostasis and repair. Yet fundamental insights into epicardium formation, lineage heterogeneity and functional cross-talk with other cell types in the heart are currently lacking. Here, we investigated epicardial heterogeneity and the functional diversity of discrete epicardial subpopulations in the developing zebrafish heart. Smart-seq2 based single-cell RNA-sequencing uncovered three epicardial subpopulations (Epi1-3) with specific genetic programmes and distinctive spatial distribution in the developing heart. Functional perturbation identified tgm2b, a transglutaminase gene highly enriched in Epi1, as necessary for the proper development of the epicardial cell sheet. Epi2 was spatially localised in the cardiac outflow tract and expressed the chemokine sema3fb. Loss of sema3fb increased the number of tbx18+ cells in the outflow tract, suggesting it controls the spatiotemporal access of epicardial cells to this tissue. Epi3 was enriched for cell guidance cues such as cxcl12a, loss of which decreased the number of ptprc/CD45+ leukocytes on the epicardial surface. Understanding which mechanisms cells employ to establish a functional epicardium and to communicate with other cardiovascular cell types during development will bring us closer to repairing cellular relationships that are disrupted during cardiovascular disease.

P133

Single-Cell Transcriptomics Uncovers Zonation of Function in the Mesenchyme during Liver Fibrosis

Ross Dobie 1, John R. Wilson-Kanamori 1, Beth E.P. Henderson 1, James R. Smith 1, Kylie P. Matchett 1, Jordan R. Portman 1, Prakash Ramachandran 1, Chris P. Ponting 2,3, Sarah A. Teichmann 3,4,5, John C. Marioni 3,4,6, and Neil C. Henderson 1

1 Centre for Inflammation Research, The Queen’s Medical Research Institute, Edinburgh BioQuarter, University of Edinburgh, Edinburgh EH16 4TJ, UK 2 MRC Human Genetics Unit, MRC Institute of Genetics and Molecular Medicine at the University of Edinburgh, Edinburgh EH4 2XU, UK 3 Wellcome Trust Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SA, UK 4 European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Hinxton, Cambridge CB10 1SD, UK 5 Theory of Condensed Matter Group, The Cavendish Laboratory, University of Cambridge, Cambridge CB3 0HE, UK 6 Cancer Research UK Cambridge Institute, University of Cambridge, Li Ka Shing Centre, Cambridge CB2 0RE, UK

Iterative liver injury results in progressive fibrosis, which disrupts hepatic architecture, regeneration potential, and function. An ideal anti-fibrotic therapy would specifically target the pathogenic collagen-producing cell population without perturbing homeostatic mesenchymal function. Therefore, increasing our understanding of the precise cellular and molecular mechanisms regulating liver fibrosis is fundamental to the rational design and development of effective anti-fibrotic therapies for patients with chronic liver disease.

We used single-cell RNA sequencing and spatial mapping to deconvolve the hepatic mesenchyme in healthy and fibrotic mouse liver, revealing three distinct mesenchymal subpopulations: hepatic stellate cells (HSCs), fibroblasts (FBs), and vascular smooth muscle cells (VSMCs). We showed that HSCs partition into topographically diametric lobule regions, designated portal vein-associated HSCs (PaHSCs) and central vein-associated HSCs (CaHSCs). Further, we uncovered functional zonation identifying CaHSCs as the dominant pathogenic collagen-producing cells in a mouse model of centrilobular fibrosis, and that CaHSCs, but not PaHSCs, differentiate into pathogenic collagen-producing cells following acute centrilobular liver injury.

P134

Direct correlation of phenotype and deep targeted gene expression profiling in single cells

Johannes B. Woehrstein, Heinrich Grabmayr, Ryan Sherrard, Philip Boehm, Markus Jobst, Moritz Weck, Thomas Eisenreich, Reiner Dunkl, Patrick Grossmann

Fakultät für Physik, LMU Munich

The current emergence of single-cell (SC) technologies and thus the possibility to investigate thousands of individual cells with molecular detail is a paradigm shift in life sciences. As living systems are typically complex communities comprised of thousands of distinctive cells which dynamically interact and evolve, their comprehensive understanding poses a unique challenge. Today, SC technologies allow the classification of sizeable heterogeneous cell populations within a sample, and with increasingly high definition, one will be able to differentiate more and more subtle variations. Detecting these subtle changes in isolated cells also promises unprecedented insights into the onset of biological mechanisms.

Of particular interest is how molecular dynamics within cells interact with their functional phenotypes, including morphology and cell type. These interactions can be studied by associating genetic activity with cell imaging. However, this correlation of mRNA expression levels and high-definition phenotypes has not been achieved in a quantitative and high-throughput way so far.

Here, we present a novel method that combines microwell-based cell compartmentalization with multiplexed single-molecule mRNA detection using fluorescently labeled DNA origami nanostructures. Both phenotyping and subsequent mRNA expression analysis of a single cell can be performed within the same microwell compartment, therefore allowing their correlation. By incorporating the microwells into a fluidic chip, we have built assays able to investigate > 5,000 cells. The isolated cells are imaged using a high-performance fluorescence microscope and thus stratified for morphology, CD markers, and viability. Additionally, compartments with more than one cell are excluded from the SC analysis. After phenotyping, the cells are lysed, and mRNA targets are specifically captured in the compartments. We have previously demonstrated the robust design and detection of > 100 uniquely colored fluorescent DNA origami nanostructures and their applicability in quantitative nucleic acid detection. Here, we use these markers for the multiplexed identification of > 100 mRNA target species from the lysed cells. By hybridizing one nanostructure to one mRNA target, a precise, digital, and amplification-free quantification is achieved.

We envision numerous applications in fields with inherently heterogeneous samples like immunology and oncology.

P135

Identification of platelet cell of origin using novel RNA-based molecular and clonal approaches

Edyta E. Wojtowicz1,2,3, Jayna Mistry2; James Lipscombe1; Ashleigh Lister1; Sten Eirik W. Jacobsen3,4,5; Stuart A. Rushworth2; Kristian M. Bowles2,6; Iain C. Macaulay1,2

1 Earlham Institute, Norwich Research Park, Norwich, UK 2 University of East Anglia, Norwich, UK 3 Department of Cell and Molecular Biology, Wallenberg Institute for Regenerative Medicine and Department of Medicine, Center for Hematology and Regenerative Medicine, Karolinska Institutet and Karolinska University Hospital, Stockholm, Sweden 4 MRC Molecular Hematology Unit, MRC Weatherall Institute of Molecular Medicine, Radcliffe Department of Medicine, University of Oxford, Oxford, UK 5 Hematopoietic Stem Cell Laboratory, MRC Weatherall Institute of Molecular Medicine, Radcliffe Department of Medicine, University of Oxford, Oxford, UK 6 Department of Haematology, School of Pharmacy, Norfolk and Norwich University Hospitals National Health Service (NHS) Foundation Trust, Norwich, UK

Platelets are anuclear cells formed through the cytoplasmic fragmentation of their direct ancestors, megakaryocytes (MKs). Although these cells arise from hematopoietic stem cells (HSC), the number and molecular identity of the HSC and intermediate progenitors supporting platelet production are unknown. Such studies are hampered by the absence of reliable tracing markers in platelets, which (if available) could pinpoint the cell of origin for each individual circulating platelet. In this context we have developed a novel, RNA-based cellular barcoding approach to reveal the HSC and progenitors most efficiently producing platelets in vivo. We hypothesised that, given RNA is more abundant in cells than genomic DNA, the sensitivity of an RNA-based cell fate tracking method should be higher than that of classical DNA-based barcoding methods. To validate the quantitative capability of RNA-based clonal studies, we transduced BaF3 murine cells with a lentiviral barcoded library containing an eGFP reporter. Single transduced cells were FACS-purified and expanded to grow monoclonal barcoded cultures. Cells from these cultures were mixed in equal ratios (5 to 500,000 cells per monoclonal culture). Genomic DNA (gDNA) and RNA isolated from these samples were processed according to a modified G&T-seq protocol, which enables parallel analysis of DNA and RNA-derived cDNA from the same sample. We then sequenced the resulting libraries and compared the barcodes retrieved from gDNA and cDNA. Since all previously published studies using cellular barcoding methods rely on gDNA, gDNA served as the reference for validation. Equally mixed samples representing a range of input cell numbers per barcode were analyzed. In the cDNA samples, detection of barcodes was possible from as few as 5 cells, which was superior to gDNA samples. We repeated this analysis using cultures mixed at highly unequal ratios.
Our results demonstrate that cDNA allows for the reliable detection of the smallest clones (contributing as little as 0.09%), whereas detection of the 3 least abundant clones was not possible with gDNA. Next, we transduced murine HSC with the barcoded viruses and transplanted these cells into myeloablated recipients. Preliminary results show that we can stably detect clones supporting platelets. Finally, we report that these clones also support erythroid, myeloid, and to a lesser extent lymphoid lineage development. Through barcode retrieval from RNA in platelets, progenitors and HSC, and single-cell RNA sequencing of stem and platelet progenitors, we aim to identify key pathways driving thrombopoiesis in vivo, to improve understanding of stress thrombopoiesis and to enable improved therapies for thrombocytopenia.
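The comparison of barcode detection between cDNA and gDNA libraries comes down to counting barcode reads and asking which clones pass a detection threshold. The sketch below illustrates that idea on synthetic data only. The read counts, the barcode names and the `min_reads` threshold are hypothetical and are not taken from the study.

```python
from collections import Counter


def clone_fractions(barcode_reads, min_reads=2):
    """Return the read fraction of each barcode seen at least min_reads times.

    Fractions are relative to all reads in the library, including reads from
    barcodes that fall below the detection threshold.
    """
    counts = Counter(barcode_reads)
    total = sum(counts.values())
    return {bc: n / total for bc, n in counts.items() if n >= min_reads}


# Synthetic libraries: one small clone (BC3) next to two large ones.
cdna_reads = ["BC1"] * 9000 + ["BC2"] * 991 + ["BC3"] * 9  # deeper sampling
gdna_reads = ["BC1"] * 900 + ["BC2"] * 99 + ["BC3"] * 1    # shallower sampling

cdna_fracs = clone_fractions(cdna_reads)
gdna_fracs = clone_fractions(gdna_reads)

print("cDNA:", {bc: f"{100 * f:.2f} %" for bc, f in cdna_fracs.items()})
print("gDNA:", {bc: f"{100 * f:.2f} %" for bc, f in gdna_fracs.items()})
```

In this toy example the rare clone passes the threshold only in the more deeply sampled library, which is the kind of sensitivity difference the abstract attributes to the higher abundance of RNA relative to genomic DNA.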

P136

Investigating the mechanisms underlying the heterogeneous VSMC injury response

Matt Worssam, Joel Chappell, Jenny Harman, Annabel Taylor, Lina Dobnikar, Jordi Lambert, Martin Bennett, Helle Jørgensen

Cardiovascular Division, Department of Medicine, University of Cambridge, Addenbrooke’s Hospital, Hills Road, CB2 0QQ

In healthy blood vessels, vascular smooth muscle cells (VSMCs) exist in a contractile, quiescent state but can switch phenotype to activate proliferation, migration and remodelling of the extracellular matrix. Phenotypically switched VSMCs contribute the majority of cells within neointimal lesions, characteristic of atherosclerosis and in-stent restenosis, diseases that underlie heart attack and stroke. Using multicolour "Confetti" VSMC-specific genetic lineage tracing in animal models of vascular disease, we showed that the extensive VSMC contribution to these lesions results from clonal expansion of few cells. To understand how the oligoclonality of the VSMC contribution to neointimal lesions arises and to understand the mechanisms activating VSMC proliferation in vivo, we quantified VSMC proliferation and clonal development over time after acute vascular injury using confocal microscopy of whole-mounted arteries. We observed that, from day 5 post-injury, proliferation is induced in a small number of VSMCs that clonally expand to form patches in the medial layer. The number and size of medial patches increase continuously during the first two weeks post-injury, suggesting that the oligoclonality observed in lesions results from selective activation of proliferation in a small subset of VSMCs. Expanding VSMC clones are restricted to arterial bulges which display VSMC death and recruitment of immune cells to the vessel wall, implicating these events as potential sources of proliferation-inducing cues. Interestingly, only a subset of medial patches traverses the internal elastic lamina and gives rise to neointimal patches, suggesting that a second selective event contributes at least partially to the observed oligoclonality.
Profiling of chromatin accessibility in contractile and activated VSMCs from healthy and injured vessels respectively revealed activation-induced opening of chromatin at loci associated with phenotypic switching (Spp1, Ly6a/Sca1) and cell cycle (Mki67) and implicated the AP-1 factors as potential drivers of these changes. Newly identified genes and pathways associated with activation-specific open chromatin could represent novel therapeutic targets to reduce or prevent neointimal lesion formation.

P137

Incorporation of spatial mapping and confirmation of gene signatures by a multiplex in situ hybridization technology into single cell RNA sequencing workflows

Morgane Rouault, Jyoti Phatak, Han Lu, Li Wang, Hailing Zong, Claudia May, Sara Wrobel, Xiao-Jun Ma, Courtney Anderson

Advanced Cell Diagnostics - Bio-Techne, 7707 Gateway Blvd, Newark, CA 94560

Complex and highly heterogeneous tissues such as the brain are comprised of multiple cell types and states with exquisite spatial organization. Single-cell RNA sequencing (scRNA-seq) is now being widely used as a universal tool for classifying and characterizing known and novel cell populations within these heterogeneous tissues, ushering in a new era of single cell biology. However, scRNA-seq has limitations because it relies on dissociated cells, which results in the loss of the spatial context of the cell populations being analyzed. Incorporating a multiplexed spatial approach that can interrogate gene expression with single cell resolution in the tissue context is a powerful addition to the scRNA-seq workflow. In this study, we used the RNAscope Multiplex Fluorescent and RNAscope HiPlex in situ hybridization (ISH) assays to confirm and spatially map the diverse striatal neurons that have been previously identified by scRNA-seq in the mouse brain (Gokce et al, Cell Rep, 16(4):1126-1137, 2016). We confirmed the gene signatures of two discrete D1 and D2 subtypes of medium spiny neurons (MSN): Drd1a/Foxp1, Drd1a/Pcdh8, Drd2/Htr7, and Drd2/Synpr. The heterogeneous MSN subpopulations were marked by a transcriptional gradient, which we could spatially resolve with RNA ISH. Numerous striatal non-neuronal cell populations identified by scRNA-seq, including vascular cells, immune cells, and oligodendrocytes, were also confirmed with the multiplex ISH assay. Finally, the spatial relationship between the D1 and D2 MSN subtypes identified by Gokce et al. was visualized using the RNAscope HiPlex assay, which allows for detection of up to 12 RNA targets simultaneously in intact tissues. In conclusion, we have demonstrated the utility of two multiplexed RNAscope ISH assays for the confirmation and spatial mapping of scRNA-seq transcriptomic results in the highly complex and heterogeneous mouse striatum at the single cell level.
Incorporating spatial mapping by the RNAscope technology into single cell transcriptomic workflows complements scRNA-seq results and provides additional biological insights into the cellular organization and functional states of diverse cell types in healthy and diseased tissues.

P138

Dach1 downregulation marks lymphoid-primed progenitors early in haematopoiesis

Daniela Zalcenstein1,2*, Luyi Tian1,3,4*, Jaring Schreuder1*, Sara Tomei1,3, Dawn S. Lin1,3, Kirsten Fairfax5,6,7, Jess Bolden5,6, Mark McKenzie1,5, Andrew Jarratt5,6, Adrienne Hilton5,6, Jacob T. Jackson5,8, Ladina Di Rago6, Matthew P. McCormack9, Carolyn A. de Graaf5,6, Warren S. Alexander5,6, Doug Hilton5,6, Stephen L. Nutt1,5, Matthew E. Ritchie4,5, Ashley Ng5,6,10, Shalin H. Naik1,2,5

1. Immunology Division, The Walter & Eliza Hall Institute of Medical Research, Parkville, Victoria, Australia, 3010 2. Single Cell Open Research Endeavour (SCORE), The Walter & Eliza Hall Institute of Medical Research, Parkville, Victoria, Australia, 3010 3. Faculty of Medicine, Dentistry & Health Sciences, University of Melbourne, Parkville, Victoria, Australia, 3010 4. Epigenetics and Development Division, The Walter & Eliza Hall Institute of Medical Research, Parkville, Victoria, Australia, 3010 5. Department of Medical Biology, The University of Melbourne, Parkville, Australia, 3010. 6. Blood Cells and Blood Cancer Division, The Walter & Eliza Hall Institute of Medical Research, Parkville, Victoria, Australia, 3010 7. Menzies Institute for Medical Research, University of Tasmania, Tasmania, Australia 8. Inflammation Division, The Walter & Eliza Hall Institute of Medical Research, Parkville, Victoria, Australia, 3010 9. Australian Centre for Blood Disorders, Monash University, 99 Commercial Road, Melbourne VIC 3004 10. Clinical Haematology, Peter MacCallum Cancer Centre & The Royal Melbourne Hospital, 3052, Parkville.

A classical view of blood cell development is that multipotent haematopoietic stem and progenitor cells (HSPCs) become lineage-restricted at defined stages. Lin-c-kit+Sca1+Flt3+ cells, termed lymphoid-primed multipotent progenitors (LMPPs), have lost megakaryocyte and erythroid potential but are heterogeneous in their fate. Here, through single cell RNA-sequencing, we identify the expression of Dach1 and associated genes in this fraction as being co-expressed with myeloid/stem genes but inversely correlated with lymphoid genes. Through generation of Dach1-GFP reporter mice, we identify a transcriptionally and functionally unique Dach1- subpopulation within LMPPs with lymphoid potential but devoid of myeloid potential. We term these 'lymphoid-primed progenitors', or LPPs. These findings define the earliest branch point of lymphoid development in haematopoiesis and a means for their prospective isolation.

P139

Single-cell RNA sequencing of human, macaque, and mouse testes uncovers primate-specific features of spermatogenesis

Xianing Zheng, Adrienne Niederriter Shami, Qianyi Ma, Saher Sue Hammoud, Jun Z. Li

Department of Human Genetics, University of Michigan, Ann Arbor, MI, USA

Sperm are highly specialized terminally differentiated cells that carry the genetic information from father to offspring, thus providing a continuous link between the past and future of a species. Although extensively studied in mice, our current understanding of primate spermatogenesis is limited to cell populations defined by state-specific markers identified from rodent data. While between-species differences have been reported in the process duration and cellular differentiation hierarchy, it remains unclear how molecular markers and cell states are conserved or have diverged from mice to man. As a result, there has been limited success in applying knowledge from mice to inform the study of spermatogenesis and fertility in higher primates. Using the power of single cell transcriptomics, we sought to better discern molecularly analogous populations in order to identify shared and divergent properties of spermatogenesis between species, within both the germline and the somatic cells. We analyzed ~14K adult human and ~21K macaque testicular cells, and combined these new datasets with our previously published mouse single cell data (~35K, PMID: 30146481). This allowed us to directly define analogous cell types across species, and we uncovered both known and underrepresented cell types, including transient cell states too rare to be detected with low-throughput approaches. As a result, we produced a high-resolution three-species atlas with aligned analogous somatic and germ cell types/states. First, we identified six molecularly defined consensus spermatogonia states, which led to the discovery of a primate-enriched undifferentiated spermatogonial cell state, as well as a differentiating Type-A-like spermatogonial state in primates, analogous to the mouse Type A spermatogonial state, which has not previously been reported.
Second, we described the heterochrony of the germ cell developmental trajectories between species and generated the first universal pseudo-timescale for mammalian spermatogenesis. Based on this universal pseudo-timescale, we identified genes that show conserved and diverged expression patterns along the trajectory, and related molecular pathways operating within or across species. Finally, we described between-species differences in potential somatic cell-germline communications. Overall, this study provides the first single-cell comparative analysis of the spermatogenesis program between primates and rodents. Such a new resource is expected to improve our knowledge base for future studies of germ cell development in primates, and ultimately improve our understanding of the intrinsic and extrinsic evolutionary changes of the gametogenesis program. Knowledge gained from these data will inform fertility restoration efforts, including SSC culture and in vitro gametogenesis.

P140

Cross-regional single-cell multilineage diversity in multiple sclerosis

Amel Zulji1,2, Tim Trobisch1, Dmitry Velmeshev3,4, Maximilian Haeussler5, Michael Platten1,6, Arnold Kriegstein3,4, Simon Anders7, David Rowitch3,8,9, Lucas Schirmer1,10

1Department of Neurology, Medical Faculty Mannheim, University of Heidelberg, 68167 Mannheim, Germany; 2Heidelberg Biosciences International Graduate School (HBIGS), University of Heidelberg, 69120 Heidelberg, Germany; 3Eli and Edythe Broad Center of Regeneration Medicine and Stem Cell Research, University of California, San Francisco, San Francisco, CA 94143, USA; 4Department of Neurology, University of California, San Francisco, San Francisco, CA 94158, USA, 5Genomics Institute, University of California, Santa Cruz, CA, 95064, USA; 6DKTK Clinical Cooperation Unit Neuroimmunology and Brain Tumor Immunology, German Cancer Research Center (DKFZ) and National Center for Tumor Diseases (NCT), 69120 Heidelberg, Germany. 7Center for Molecular Biology of Heidelberg University (ZMBH), 69120 Heidelberg, Germany. 8Department of Paediatrics and Wellcome-MRC Cambridge Stem Cell Institute, University of Cambridge, Cambridge, CB2 0QQ, UK; 9Department of Pediatrics, University of California, San Francisco, San Francisco, CA 94158, USA; 10Interdisciplinary Center for Neurosciences (IZN), University of Heidelberg, 69120 Heidelberg, Germany.

Multiple sclerosis (MS) is a chronic inflammatory disease of the central nervous system (CNS) affecting at least 2.3 million people worldwide. The disease is thought to be triggered by autoreactive lymphocytes, an event that ultimately leads to a disturbed tissue microenvironment driven by reactive glial cells, resulting in loss of myelin and progressive neurodegeneration. Permanent damage manifests through various symptoms and correlates strongly with the affected anatomical region. However, the precise mechanisms of disease initiation and progression, as well as the susceptibility of different anatomical regions to the disease, are not well understood. Here, we used single-nucleus RNA-sequencing (snRNA-seq) to perform an integrative analysis of human MS pathology compared with control tissue from neocortex, cerebellum and spinal cord. As a first step, we identified highly specific marker genes for all cell populations. We then assessed similarities and differences in disease-induced, cell-type-specific transcriptomic changes across the different CNS regions. Differential gene expression analysis identified MS-specific gene dysregulation in all major cell types, suggesting, for example, variable degrees of astrocyte and microglia activation in MS. In summary, integrating regional snRNA-seq data sets from MS patients allows us to generate a comprehensive cross-regional single-cell transcriptomic map of MS lesion pathology. The findings help identify novel cell-type-specific disease biomarkers and potential targets to better characterize MS and tailor therapeutic interventions.

P141


Speaker & Delegate List

Alexander Aivazidis, Sanger Institute
Bana Alamad, New York University Abu Dhabi
Hananeh Aliee, Helmholtz Center
Clara Alsinet Armengol, Wellcome Sanger Institute
Simon Anders, University of Heidelberg
Artemiy Andriyanov, First Moscow State Medical University
Agne Antanaviciute, University of Oxford
Mikhail Arbatsky, Lomonosov Moscow State University
Anna Arutyunyan, Wellcome Sanger Institute
Benjamin Auerbach, University of Pennsylvania
Gal Avital, NYU Langone Health
Karsten Bach, University of Cambridge
Wendi Bacon, EMBL-EBI
Josephine Bageritz, German Cancer Research Center
Brad Balderson, Boden Lab
Maria Ban, University of Cambridge
Jeanette Baran-Gale, University of Edinburgh
Dalia Barkley, NYU Langone Health
Florian Baumgartner, 10x Genomics
Omer Bayraktar, WTSI
Volker Bergen, Helmholtz Munich
Marjan Biocanin, EPFL
Anton Bjorninen, Cartana
Ali Sina Booeshaghi, California Institute of Technology
Wolfgang Breitwieser, CRUK Manchester Institute
Robin Browaeys, VIB-Ghent University
Maren Buettner, Helmholtz Centre Munich
Tony Burdett, EMBL-EBI
Darren Burgess, Nature Reviews Genetics
Louis Cammarata, Harvard University
Stefania Carobbio, Wellcome Sanger Institute
Sergi Castellano, UCL
Vered Chalifa Caspi, Ben-Gurion University
Lia Chappell, Wellcome Trust Sanger Institute
Ruben Chazarra Gil, Wellcome Sanger Institute
Kathy Cheah, University of Hong Kong
Sijie Chen, Tsinghua University
Peikai Chen, The University of Hong Kong
Wanze Chen, EPFL
Sisi Chen, Caltech
Mariya Chhatriwala, Wellcome Sanger Institute
Sonia Pankaj Chothani, Duke-NUS school of medicine
Jonathan Chubb, MRC LMCB
Manfred Claassen, ETH Zurich
Stephen Clark, Babraham Institute
Brett Cook, Optical Biosystems
Jonah Cool, Chan Zuckerberg Initiative
Justin Cooper, 10x Genomics
Fabiola Curion, University of Oxford
Ana Cvejic, University of Cambridge
Jordy De Coninck, University Gent
Laura De Vargas Roditi, University of Zurich

Louise Deconinck, Ghent University
Lauren Deighton, Wellcome Sanger Institute
Amanda Demeter, Earlham Institute
Sarah Dewery, Bio-Techne
Carmen Diaz Soria, Sanger Institute
Athanasios Dimitriadis, University College London
Leander Dony, Max-Planck-Institut für Psychiatrie
Ruben Dries, Dana-Farber Cancer Center
Richard Duerr, University of Pittsburgh
Jean-Claude Dujardin, Institute of Tropical Medicine
Sam Dupont, VIB
Jacquelyn DuVall, Cell Microsystems
Sabine Eckert, Wellcome Sanger Institute
Marc Elosua Bayes, CNAG-CRG
Charles Farber, University of Virginia
David Fawkner Corbett, University of Oxford
Christian Feregrino, University of Basel
Javier Ferrer, EMBL-EBI
Heike Fiegler, Dolomite Bio
Jonathan Fiorentino, Helmholtz Zentrum Muenchen
David Fischer, Helmholtz Zentrum Munich
Siri Flaam, OUS Ullevål
Michael Flossdorf, Technical University of Munich
Polly Fordyce, Stanford University
Felix Frauhammer, Heidelberg University
Anna Sophie Fröhlich, Max Planck Institute of Psychiatry

Eileen Furlong, EMBL
Federico Gaiti, Weill Cornell Medicine
Shashank Gandhi, California Institute of Technology
Alvaro Gonzalez, 10x Genomics
David Grainger, University of Oxford
Gillian Griffiths, CIMR
Dominic Grün, Max Planck Institute of Immunobiology and Epigenetics
Guoji Guo, Zhejiang University
Maximilian Haeussler, Genomics Institute
Ling Hai, German Cancer Research Center (DKFZ)
Louis-Francois Handfield, Wellcome Sanger Institute
Moritz Haneklaus, Wellcome Sanger Institute
Muzlifah Haniffa, Newcastle University
Alison Hargreaves, Partek Inc
Peng He, EBI
Martin Hemberg, Wellcome Trust
Neil Henderson, Centre for Inflammation
Christiaan Henkel, Norwegian University of Life Sciences
Klaus Hentrich, SPT Labtech
Jacob Hepkema, Wellcome Sanger Institute
Gabriel Heringer Negreira, Institute of Tropical Medicine Antwerp
Shoko Hirosue, MRC Cancer Unit
Wei Wen Vivien Ho, WIMM - University of Oxford
Zhiyuan Hu, University of Oxford
Kui Hua, Tsinghua University
Linglin Huang, Harvard University

Maria Imaz, University of Cambridge
Ivan Imaz Rosshandler, Stem Cell Institute
Naveed Ishaque, BIH and Charité, Berlin
Yoh Isogai, University College London
Noel Jee, Optical Biosystems
Julie Jerber, Wellcome Sanger Institute
Carys Johnson, Cambridge Stem Cell Institute
Michael Johnson, Imperial College London
Nick Jones, Imperial College
Helle Jorgensen, University of Cambridge
Zeynep Kalender Atak, CRUK, University of Cambridge
James Kane, Cell Microsystems
Chantriolnt-Andreas Kapourani, MRC Human Genetics Unit
Amanda Kedaigle, Broad Institute of MIT and Harvard
Tahmineh Khazaei, Caltech
Helen Kiik, Imperial College London
Helena Kilpinen, University College London
Hamish King, Queen Mary University of London
Kristina Kirschner, University of Glasgow
Vladimir Kiselev, Wellcome Sanger Institute
Laura Kitto, University of Edinburgh
Vitalii Kleshchevnikov, Wellcome Sanger Institute
Florian Klimm, Imperial College
Anna Klimovskaia, Facebook AI
Viktoria Klunder, Miltenyi Biotec B.V. & Co. KG
Hashem Koohy, Oxford University

Laura Nadine Kuester, Miltenyi Biotec B.V. & Co. KG
Felicia Kuperwaser, NYU Langone Health
Anna Laddach, Francis Crick Institute
Atefeh Lafzi, CRG-CNAG
Jordan Lambert, University of Cambridge
Marius Lange, Helmholtz Zentrum München
Reena Lasrado, Francis Crick Institute
Marieke Lavaert, University of Ghent
Jimmy Tsz Hang Lee, Wellcome Sanger Institute
HyunJung Lee, University of Oxford
Dena Leshkowitz, Weizmann Institute of Science
Benedicte Alexandra Lie, University of Oslo
Daniel Lieber, Takara Bio Europe
Marcela Lipovsek, King's College London
Ashleigh Lister, Earlham Institute
Tim Lohoff, Babraham Institute
Mohammad Lotfollahi, Helmholtz Zentrum München, mohammad.lotfollahi@helmholtz-muenchen.de
Gabriele Lubatti, Helmholtz Zentrum München
Malte Luecken, Helmholtz Center Munich
Ivan Lukic, Partek, Inc.
Joakim Lundeberg, SciLifeLab
Claire Ma, Cambridge Institute for Medical Research
Iain Macaulay, Earlham Institute
Will Macnair, University of Zürich
Elo Madissoon, EMBL-EBI / Sanger Institute
Elmir Mahammadov, Helmholtz Munich

Dyana Markose, University of Edinburgh
Liesbet Martens, VIB
Kylie Matchett, University of Edinburgh
Kathryn McClelland, National Institutes of Health
Dan McCluskey, King's College London
Emily McGibbon, CELLINK
Chris McGinnis, University of California San Francisco
Anat Melamed, Imperial College London
Nicole Mende, Wellcome - MRC Cambridge Stem Cell Institute
Kerstin Meyer, Sanger Institute
Zhen Miao, University of Pennsylvania
Laura Mincarelli, Earlham Institute
Maria Mircea, Leiden University
Tom Mitchell, Wellcome Sanger Institute
Hisham Mohammed, OHSU
Gi Fay Mok, Earlham Institute
Reuben Moncada, NYU Langone Health
Thomas Monfeuga, Novo Nordisk Oxford
Pablo Moreno, EMBL-EBI
Samantha Morris, Washington University in St Louis
Ioannis Moustakas, Leiden University Medical Center
Klaas Mulder, Radboud University
Ana Munoz, Karolinska Institutet
Parisa Nejad, UC Santa Cruz
Mari Niemi, Institute for Molecular Medicine Finland
Paula Nieto, CNAG

Mats Nilsson, Stockholms universitet
Alan O'Callaghan, MRC Human Genetics Unit
Paul Oakley, Dolomite Bio
Sebnem Oc, University of Cambridge
Silvia Ogbeide, Earlham Institute
Donald Ogg, SPT Labtech
Alexandros Onoufriadis, King's College London
Aik Ooi, Mission Bio
Prasad Palani Velu, University of Edinburgh
Subarna Palit, ICB
Alexandrina Pancheva, University of Glasgow
Eleni Papachristoforou, The University of Edinburgh
Jeongbin Park, Charité – Universitätsmedizin Berlin
Ryan Patterson, National Institutes of Health
Matthieu Pesant, Takara Bio Europe
Emma Pettengale, Portland Press
Maria Polychronidou, Molecular Systems Biology
Catherine Porcher, University of Oxford
Jordan Portman, The University of Edinburgh
Maayan Pour, NYU Langone Health
Maria Primo, Wellcome Sanger Institute
Christos Proukakis, University College London
Michael Prummer, ETH Zurich
Pau Puigdevall Costa, UCL Institute of Child Health
Fu Xiang Quah, University of Cambridge
Rachel Queen, Newcastle University

Mashiat Rabbani, University of Michigan
Maryam Rahim, Francis Crick Institute
Prakash Ramachandran, University of Edinburgh
Chris Rands, University of Geneva
Bobby Ranjan, Genome Institute of Singapore
Anna Maria Ranzoni, University of Cambridge
Anjali Rao, NYU Langone Health
Pedro Raposo, NIBSC
Kim Ravnskjaer, University of Southern Denmark
Emmanouela Repapi, Oxford University
Jan Rhode, Takara Bio Europe
Merrit Romeike, Max F. Perutz Laboratories
Jean Rossier, Sorbonne University
Ellen Rothenberg, California Institute of Technology
George Royal, GlaxoSmithKline
Mayra Luisa Ruiz Tejada Segura, Helmholtz Zentrum München
Steffen Rulands, Max Planck Institute for the Physics of Complex Systems
Jian Ryou, New York University Abu Dhabi
Mario Saare, University of Oslo
Sami Saarenpää, KTH Royal Institute of Technology
Wouter Saelens, VIB - Ghent University
Sudhakar Sahoo, CRUK MI
Somesh Sai, Max-Delbrück Centrum für Molekulare Medizin
Carlo Sala Frigerio, UK Dementia Research Institute
Rahul Satija, New York Genome Center
Stephen Sawcer, University of Cambridge

Florian Schmidt, Genome Institute of Singapore
Michael Schneider, Uni Cambridge - CRUK-CI
Timm Schroeder, ETH Zurich (in Basel)
Julian Tristan Schwartze, University of Glasgow
Anita Scoones, Earlham Institute
Ali Kerim Secener, Max Delbrück Center for Molecular Medicine
Julia Sero, Bath University
Ruth Seurinck, VIB/UGent
Marion Shadbolt, EMBL-EBI
Yen Chi Kendig Sham, University of Cambridge
He Shuai, Sun Yat-sen University Cancer Center
Francois Xavier Sicot, Takara Bio Europe
Khadeeja Siddique, Norwegian University of Life Sciences
Elly Sinkala, cytena GmbH
Maria Solovey, Helmholtz Zentrum München
Zoe Steier, University of California, Berkeley
Bilyana Stoilova, University of Oxford
Stephanie Strohbuecker, The Francis Crick Institute
Mike Stubbington, 10x Genomics
Jacob Swanson, NYU Langone Health
Tom Taghon, Ghent University
Norbert Tavares, Chan Zuckerberg Initiative
Sarah Teichmann, Wellcome Sanger Institute
Fabian Theis, Helmholtz Zentrum Munich
Tinne Thone, VIB
Supat Thongjuea, University of Oxford

Enrique Toledo, Novo Nordisk Oxford
Martin Treppner, University of Freiburg
Barbara Treutlein, ETH Zürich
Tim Trobisch, Medical Faculty Mannheim, University of Heidelberg
Jocelyn Turpin, Imperial College
Alecia Jane Twigger, University of Cambridge
Vladimir Uzun, Earlham Institute
Ludovic Vallier, Cambridge Stem Cell Institute
Bram Van de Sande, UCB S.A.
Simon van Heeringen, Radboud University
Niels Vandamme, VIB/UGent
Christophe Vanderaa, UCLouvain
Roser Vento Tormo, Wellcome Sanger Institute
Valda Vinson, Science
Viola Volpato, DRI at Cardiff University
Sebastian Wallace, University of Edinburgh
Simone Webb, Newcastle University
Arne Wehling, ETH Zurich
Michael Weinberger, University of Oxford
Léa Wenger, University of Cambridge
Hans Wils, Janssen
John Wilson Kanamori, University of Edinburgh
Theresa Wirtz, Uniklinik RWTH Aachen
Johannes B Woehrstein, LMU Munich
Edyta Wojtowicz, Earlham Institute
Jeongmin Woo, University of Oxford

Matt Worssam, University of Cambridge
Sara Wrobel, Bio-Techne - ACD
Itai Yanai, NYU Langone Health
Yifei Yang, Imperial College
Chengwei Yuan, Gurdon Institute
Daniela Zalcenstein, Walter and Eliza Hall Institute
Cheng Zhao, Karolinska Institute
Xianing Zheng, University of Michigan
Amel Zulji, Medical Faculty Mannheim, University of Heidelberg

Index

Aivazidis P1
Alamad P2
Aliee P3
Alsinet Armengol P4
Anders P5
Antanaviciute S7
Arbatsky P6
Arutyunyan P7
Auerbach P8
Avital S17

Bach P9
Bageritz P10
Balderson P11
Barkley P12
Bayraktar S65
Bergen S57
Biocanin P13
Bjorninen P14
Booeshaghi P15
Browaeys P16
Buettner P17

Chappell P18
Chazarra Gil P19
Cheah S19
Chen, P P20
Chen, Sijie P21
Chen, Sisi S55
Chen, W S53
Chothani P22
Clark P23
Curion P24

De Vargas Roditi P25
Deconinck P26
Diaz Soria P27
Dony P28
Dries P29
DuVall P30

Feregrino P31
Fiegler P32
Fiorentino P33
Fischer P34
Flossdorf P35
Fordyce S51
Fröhlich P36
Furlong S27

Gaiti S5
Gandhi S23
Gonzalez P37
Griffiths S45
Grün S39
Guo S9

Hai P38
Handfield P39
Haneklaus P40
Haniffa S29
He P41
Hentrich P42
Heringer Negreira P43
Hirosue P44
Hu P45
Hua P46
Huang P47

Imaz P48

Jee P49
Johnson S25

Kalender Atak P50
Kapourani P51
Kedaigle S15
King S47
Kleshchevnikov P52
Klimm P53
Klimovskaia P54
Kuperwaser P55

Lafzi P56
Lambert P57
Lange P58
Lasrado P59
Lavaert P60
Leshkowitz P61
Lipovsek P62
Lister P63
Liu P64
Lohoff P65
Lotfollahi P66
Lubatti P67
Luecken P68
Lukic P69
Lundeberg S61

Ma P70
Macnair P71
Mahammadov S33
McClelland P72
McCluskey P73
McGinnis P74
Mende P75
Meyer P76
Miao P77
Mincarelli P78
Mircea P79
Mitchell P80
Mohammed P81
Mok P82
Moncada P83
Moreno P84
Morris S11
Moustakas P85

Nejad P86
Niemi P87
Nilsson S63

O'Callaghan P88
Oc P89
Ooi P90

Palit P91
Pancheva P92
Park P93
Patterson P94
Pesant P95
Primo P96
Proukakis P97
Prummer P98
Puigdevall Costa P99

Rahim P100
Rands P101
Ranjan P102
Ranzoni S49
Raposo P103
Ravnskjaer P104
Romeike P105
Ruiz Tejada Segura P106
Rulands P107
Ryou P108

Saare P109
Saarenpää P110
Saelens P111
Sai P112
Satija S35
Schmidt P113
Schroeder S43
Schwartze P114
Scoones P115
Secener P116
Sero P117
Shadbolt P118
Shuai P119
Sicot P120
Siddique P121
Solovey P122
Steier S41
Swanson P123

Theis S37
Thongjuea P124
Treppner P125
Treutlein S13
Trobisch P126
Turpin P127
Twigger P128

Uzun P129

Vallier S21
van Heeringen P130
Vandamme P131
Vanderaa P132
Vento S3
Volpato S67

Webb S31
Wehling S59
Weinberger P133
Wilson Kanamori P134
Woehrstein P135
Wojtowicz P136
Worssam P137
Wrobel P138

Yanai S1

Zalcenstein P139
Zheng P140
Zulji P141

[Site map: Wellcome Genome Campus Conference Centre, showing Hinxton Hall, the Conference Centre (Francis Crick Auditorium, James Watson Pavilion, Rosalind Franklin Pavilion), bedrooms in Willow Court and Mulberry Court, EMBL-EBI, the Sulston Laboratories and the Morgan Building]

Wellcome Genome Campus Courses and Conferences
wellcomegenomecampus.org/coursesandconferences
@ACSCevents