Using a Trillion Points of Data to Deliver Precision Medicine

Atul Butte, MD, PhD
Director, Institute for Computational Health Sciences
Professor of Pediatrics
University of California, San Francisco, CA

Dr. Atul Butte is the Director of the new Institute for Computational Health Sciences (ICHS) at the University of California, San Francisco, and a Professor of Pediatrics. Dr. Butte trained in Computer Science at Brown University, worked as a software engineer at Apple and Microsoft, received his MD at Brown University, trained in Pediatrics and Pediatric Endocrinology at Children's Hospital Boston, then received his PhD from Harvard Medical School and MIT. Dr. Butte has authored nearly 200 publications, with research repeatedly featured in Wired Magazine, the New York Times, and the Wall Street Journal. Dr. Butte is also the principal investigator of ImmPort, the archival and dissemination repository for clinical and molecular datasets funded by the National Institute of Allergy and Infectious Diseases. In 2013, Dr. Butte was recognized by the White House as an Open Science Champion of Change for promoting science through publicly available data. Other recent awards include the 2014 E. Mead Johnson Award for Research in Pediatrics, 2013 induction into the American Society for Clinical Investigation, the 2012 FierceBiotech IT “Top 10 Biotech Techies”, and the 2011 National Human Genome Research Institute Genomic Advance of the Month. Dr. Butte is also a founder of three investor-backed, data-driven companies: Personalis, providing clinical interpretation of whole genome sequences; Carmenta, discovering diagnostics for pregnancy complications; and NuMedii, finding new uses for drugs through open molecular data.

Annual Quality Congress, Plenary Session, Sunday, October 4, 2015
Objective: Reflect on the power of “big data” and how it will drive healthcare in the 21st century and beyond.

Conflicts of Interest
• Scientific founder and advisory board membership: Genstruct, NuMedii, Personalis, Carmenta
• Honoraria for talks: Lilly, Pfizer, Siemens, Bristol Myers Squibb, AstraZeneca, Roche, Genentech, Warburg Pincus
• Past or present consultancy: Lilly, Johnson and Johnson, Roche, NuMedii, Genstruct, Tercica, Ecoeos, AnshLabs, Prevendia, Samsung, Assay Depot, Regeneron, Verinata, Pathway Diagnostics, GeisingerHealth, Covance, Wilson Sonsini Goodrich & Rosati, 10X Genomics, Medgenics, GNS Healthcare, Gerson Lehrman Group, Coatue Management
• Corporate Relationships: Northrop Grumman, Aptalis, Thomson Reuters, Intel, SAP, SV Angel
• Speakers’ bureau: None
• Companies started by students: Carmenta, Serendipity, NuMedii, Stimulomics, NunaHealth, Praedicat, MyTime, Flipora

Contact: [email protected], @atulbutte, @ImmPortDB

Kilo → Mega → Giga → Tera → Peta → Exa → Zetta

Already nearly 1.6 million microarrays are publicly available! The total doubles every 2-3 years!

Butte AJ. Translational bioinformatics: coming of age. JAMIA, 2008.
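To get a feel for what a 2-3 year doubling time implies, here is a minimal Python sketch of plain exponential growth applied to the slide's figure of about 1.6 million microarrays. The function name `projected_count` and the 6-year horizon are illustrative assumptions, not figures from the talk, and the sketch assumes the doubling rate stays constant.

```python
def projected_count(current: float, years_ahead: float, doubling_years: float) -> float:
    """Project a count forward, assuming it doubles once every `doubling_years`."""
    return current * 2 ** (years_ahead / doubling_years)


# Figure quoted on the slide: nearly 1.6 million public microarrays.
CURRENT_MICROARRAYS = 1.6e6

# Hypothetical horizon chosen only for illustration.
YEARS_AHEAD = 6.0

if __name__ == "__main__":
    # Try both ends of the slide's "2-3 years" doubling range.
    for doubling_years in (2.0, 3.0):
        estimate = projected_count(CURRENT_MICROARRAYS, YEARS_AHEAD, doubling_years)
        print(f"doubling every {doubling_years:.0f} years: "
              f"about {estimate / 1e6:.1f} million after {YEARS_AHEAD:.0f} years")
```

Under these assumptions the two ends of the range differ by a factor of two after six years, which is why the doubling time matters more than the starting count.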


Yes, even a high‐school student can use public data to design a new diagnostic!

Preeclampsia: a large cause of maternal and fetal death
• Incidence
– 5-8% of all pregnancies in the U.S. and worldwide
– 4.1 million births in the U.S. in 2009
– Up to 300K cases of preeclampsia annually in the U.S.
• Mortality
– Responsible for 18% of all maternal deaths in the U.S.
– Maternal death in 56 out of every 100,000 live births in the U.S.
– Neonatal death in 71 out of every 100,000 live births in the U.S.
• Cost
– $20 billion in direct costs in the U.S. annually
– Average hospital stay of 3.5 days
Linda Liu, Matt Cooper, Marina Sirota, Bruce Ling
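The incidence figures above can be cross-checked with simple arithmetic. The Python sketch below multiplies the quoted 5-8% incidence by the 4.1 million U.S. births in 2009. It assumes, for illustration only, that the number of births approximates the number of pregnancies; the names `expected_cases`, `US_BIRTHS_2009`, `INCIDENCE_LOW`, and `INCIDENCE_HIGH` are hypothetical.

```python
# Figures quoted on the slide.
US_BIRTHS_2009 = 4_100_000
INCIDENCE_LOW = 0.05   # 5% of pregnancies
INCIDENCE_HIGH = 0.08  # 8% of pregnancies


def expected_cases(births: int, incidence: float) -> int:
    """Estimate annual cases, treating births as a stand-in for pregnancies (an assumption)."""
    return round(births * incidence)


low = expected_cases(US_BIRTHS_2009, INCIDENCE_LOW)
high = expected_cases(US_BIRTHS_2009, INCIDENCE_HIGH)

if __name__ == "__main__":
    print(f"Estimated U.S. cases per year: {low:,} to {high:,}")
```

The resulting range of roughly 205,000 to 328,000 cases is of the same order as the "up to 300K" figure on the slide.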

New markers for preeclampsia
Linda Liu, Bruce Ling, Matt Cooper

Liu LY, …, XB Ling, Butte AJ. BMC Medicine, 2013.


Need a diagnostic for preeclampsia → Public big data available → Data analyzed, diagnostic designed → SPARK grant ($50k) → Life Science Angels, other seed investors → March of Dimes Center for Prematurity Research ($2 million)

Anti‐seizure drug works against a rat model of inflammatory bowel disease

Rat colonoscopy panels: Rat with Inflammatory Bowel Disease | Inflammatory Bowel Disease After Anti-seizure Drug

Dudley JT, Sirota M, ..., Pasricha J, Butte AJ. Science Translational Medicine, 2011.

Psychiatric Drug Imipramine Shows Significant Activity Against Small Cell Lung Cancer

p53/Rb/p130 triple knockout model of SCLC
Mice dosed after tumor formation
Panels: Vehicle control | Imipramine
Joel Dudley, Nadine Jahchan, Julien Sage, Alejandro Sweet-Cordero, Joel Neal, Mazen Nasrallah, Peter Marinkovich, Mårten Winge
NuMedii; Unpublished


Need more drugs for more diseases → Public big data available → Data analyzed, method designed → NIH funding → Company launched, ARRA, StartX, Stanford license, first deal → Claremont Creek, Lightspeed ($3.5 million)

Unpublished

Credit: Whitehead Institute and MIT Credit: Oxford Nanopore Technologies and Wired

Credit: Euan Ashley, Russ Altman, Steve Quake, Lancet


Human genome sequence can be used to predict drug adverse events
Credit: Russ Altman and team

Important genome differences “locked up” in publications
Credit: Rong Chen, Optra Systems, and Personalis, Inc.

Collect the “big data” of findings across publications to analyze the “big data” of the genome

Credit: Rong Chen, Optra Systems, and Personalis, Inc.

Credit: Rong Chen, Optra Systems, and Personalis, Inc. Credit: Rong Chen and Alex Morgan, Lancet


Maybe the genome can be used to suggest (promote?) preventative health strategies?

Credit: Rong Chen, Alex Morgan, Joel Dudley, Lancet

Nicholas Volker

Credit: Rong Chen, Alex Morgan, Joel Dudley, Lancet

Future blood tests will be performed in non-traditional outlets. Where will the data live?

With daily weight and intake measures, I have lost 50 pounds (22 kg) in the past 2 years!


Medical devices can be funded and designed by internet technologists! Where will the data live?

Public data can drive health-defining mobile apps

The cost of delivered care is becoming public. How will the public respond?

The next big open data? Raw clinical trials data
immport.niaid.nih.gov

Download 100+ studies today

Jeff Wiser, Patrick Dunn, Sanchita Bhattacharya


Institute for Computational Health Sciences

How can we expect health care professionals to review 6 billion pieces of data in a 15-minute encounter?

We already ask health care professionals to review 1 GB of data in 15 minutes…

What is Big Data in Biomedicine?

We already ask health care professionals to review 1 GB of data in 15 minutes… … but we give them tools to help them do this!


Big Data in Biomedicine is…
• Algorithms?
• Programmers?
• Databases?
• High-performance computers?
• Mobile?

Big Data in Biomedicine is…
• Predicting the disease before it strikes
• Explaining the rare disease that defies experts
• Finding drugs for diseases lacking attention
• Making sure we do the right thing for patients
• An amazing platform for biomedical innovation

Big Data in Biomedicine is Hope

Collaborators
• Jeff Wiser, Patrick Dunn, Mike Atassi / Northrop Grumman
• Ashley Xia and Quan Chen / NIAID
• Takashi Kadowaki, Momoko Horikoshi, Kazuo Hara, Hiroshi Ohtsu / U Tokyo
• Kyoko Toda, Satoru Yamada, Junichiro Irie / Kitasato Univ and Hospital
• Shiro Maeda / RIKEN
• Alejandro Sweet-Cordero, Julien Sage / Pediatric Oncology
• Mark Davis, C. Garrison Fathman /
• Russ Altman, Steve Quake / Bioengineering
• Euan Ashley, Joseph Wu, Tom Quertermous / Cardiology
• Mike Snyder, Carlos Bustamante, Anne Brunet / Genetics
• Jay Pasricha / Gastroenterology
• Rob Tibshirani, Brad Efron / Statistics
• Hannah Valantine, Kiran Khush / Cardiology
• Ken Weinberg / Pediatric Stem Cell Therapeutics
• Mark Musen, Nigam Shah / National Center for Biomedical Ontology
• Minnie Sarwal / Nephrology
• David Miklos / Oncology

Support
• Lucile Packard Foundation for Children's Health
• NIH: NIAID, NLM, NIGMS, NCI, NIDDK, NHGRI, NIA, NHLBI, NCATS
• March of Dimes
• Hewlett Packard
• Howard Hughes Medical Institute
• California Institute for Regenerative Medicine
• Luke Evnin and Deann Wright (Scleroderma Research Foundation)
• Clayville Research Fund
• PhRMA Foundation
• Stanford Cancer Center, Bio-X, SPARK

Admin and Tech Staff
• Mary Lyall
• Mounira Kenaani
• Kevin Kaier
• Tarangini Deshpande
• Boris Oskotsky
• Sam Hawgood
• Keith Yamamoto
• Isaac Kohane
