Hierarchical Neural Architectures for Classifying Cancer Pathology Reports

University of Tennessee, Knoxville
TRACE: Tennessee Research and Creative Exchange
Doctoral Dissertations, Graduate School
12-2019

Hierarchical Neural Architectures for Classifying Cancer Pathology Reports
Shang Gao
University of Tennessee, [email protected]

Follow this and additional works at: https://trace.tennessee.edu/utk_graddiss

Recommended Citation
Gao, Shang, "Hierarchical Neural Architectures for Classifying Cancer Pathology Reports." PhD diss., University of Tennessee, 2019. https://trace.tennessee.edu/utk_graddiss/5722

This Dissertation is brought to you for free and open access by the Graduate School at TRACE: Tennessee Research and Creative Exchange. It has been accepted for inclusion in Doctoral Dissertations by an authorized administrator of TRACE: Tennessee Research and Creative Exchange. For more information, please contact [email protected].

To the Graduate Council:

I am submitting herewith a dissertation written by Shang Gao entitled "Hierarchical Neural Architectures for Classifying Cancer Pathology Reports." I have examined the final electronic copy of this dissertation for form and content and recommend that it be accepted in partial fulfillment of the requirements for the degree of Doctor of Philosophy, with a major in Computer Science.

Georgia Tourassi, Major Professor

We have read this dissertation and recommend its acceptance:
Arvind Ramanathan, Hairong Qi, Russell Zaretzki

Accepted for the Council:
Dixie L. Thompson
Vice Provost and Dean of the Graduate School

(Original signatures are on file with official student records.)

Hierarchical Neural Architectures for Classifying Cancer Pathology Reports

A Dissertation Presented for the Doctor of Philosophy Degree
The University of Tennessee, Knoxville

Shang Gao
December 2019

© by Shang Gao, 2019
All Rights Reserved.

This work is dedicated to Nala, my landlord's cat, who thinks she is the queen of the universe.

Acknowledgments

This work was made possible thanks to the help of my esteemed advisors, Arvind Ramanathan and Georgia Tourassi, and my committee members, Hairong Qi and Russell Zaretzki. I would also like to thank the members of the Biomedical Sciences, Engineering, and Computing group at Oak Ridge National Laboratory for their help with this work: Mohammed Alawad, John Qiu, Jacob Hinkle, Noah Schaefferkoetter, Hong-Jun Yoon, and Blair Christian. Lastly, I would like to thank my wife Dasha Herrmannova for her support and for proofreading this work.

This work has been supported in part by the Joint Design of Advanced Computing Solutions for Cancer (JDACS4C) program established by the U.S. Department of Energy (DOE) and the National Cancer Institute (NCI) of the National Institutes of Health. This work was performed under the auspices of the U.S. Department of Energy by Argonne National Laboratory under Contract DE-AC02-06-CH11357, Lawrence Livermore National Laboratory under Contract DE-AC52-07NA27344, Los Alamos National Laboratory under Contract DE-AC52-06NA25396, and Oak Ridge National Laboratory under Contract DE-AC05-00OR22725. This research used resources of the Oak Ridge Leadership Computing Facility at the Oak Ridge National Laboratory, which is supported by the Office of Science of the U.S. Department of Energy under Contract No. DE-AC05-00OR22725.

Abstract

Electronic health records (EHRs) are the primary method for documenting and storing patient outcomes in modern healthcare; data mining and machine learning approaches utilize the information stored in EHRs to assist in clinical decision support and other critical healthcare applications. Important information in EHRs is often stored in the form of unstructured clinical text.
Unfortunately, the state-of-the-art methods used to automatically extract useful information from unstructured clinical text lag significantly behind the state-of-the-art methods used in the general natural language processing (NLP) community for other tasks such as machine translation, question answering, and sentiment analysis. In this work, we attempt to bridge this gap by applying and developing hierarchical neural approaches to classify key data elements in cancer pathology reports, such as cancer site, histology, grade, and behavior. We (1) show that a hierarchical attention network (HAN), which has strong performance on classifying general text such as Yelp reviews and news snippets, achieves better classification accuracy and macro F-score on identifying cancer site and grade than previous state-of-the-art approaches, (2) develop a novel hierarchical self-attention network (HiSAN) which not only achieves better accuracy and macro F-score in cancer pathology report classification than the HAN but also trains over 10x faster, and (3) introduce a hierarchical framework for incorporating case-level context when classifying cancer pathology reports and show that it gives a significant boost in accuracy and macro F-score.

Table of Contents

1 Introduction
  1.1 State of the Art
    1.1.1 General NLP
    1.1.2 Clinical NLP
    1.1.3 Classifying Cancer Pathology Reports
  1.2 Research Objectives and Approach
    1.2.1 Hierarchical Attention Networks
    1.2.2 Hierarchical Self-Attention Networks
    1.2.3 Capturing Case-Level Context
  1.3 Dissertation Outline

2 Hierarchical Attention Networks for Information Extraction from Cancer Pathology Reports
  2.1 Abstract
  2.2 Objective
  2.3 Background
  2.4 Materials and Methods
    2.4.1 Dataset Description and Preprocessing
    2.4.2 Word Embeddings
    2.4.3 Hierarchical Attention Network
    2.4.4 Hyperparameter Optimization
    2.4.5 Unsupervised Pretraining Step
    2.4.6 Experimental Setup and Evaluation Metrics
    2.4.7 Model Baseline Comparisons
  2.5 Experimental Results
    2.5.1 Primary Site Classification
    2.5.2 Histological Grade Classification
    2.5.3 Model Visualizations
    2.5.4 Error Analysis
  2.6 Discussion and Conclusions

3 Classifying Cancer Pathology Reports with Hierarchical Self-Attention Networks
  3.1 Abstract
  3.2 Introduction
  3.3 Background
  3.4 Materials and Methods
    3.4.1 Dataset Details
    3.4.2 Hierarchical Self-Attention Networks
  3.5 Results
    3.5.1 Baselines, Hyperparameters, and Setup Details
    3.5.2 Evaluation Metrics
    3.5.3 Experimental Results
  3.6 Discussion
    3.6.1 Error Analysis
    3.6.2 Visualizing Document Embeddings
    3.6.3 Visualizing Word Importance
  3.7 Conclusion

4 Using Case-Level Context to Classify Cancer Pathology Reports
  4.1 Abstract
  4.2 Introduction
  4.3 Background
  4.4 Materials and Methods
    4.4.1 Problem Description
    4.4.2 Capturing Case-Level Context
    4.4.3 Dataset
    4.4.4 Baseline Models
  4.5 Experiments
    4.5.1 Setup Details
    4.5.2 Evaluation Metrics
    4.5.3 Results
  4.6 Discussion
  4.7 Conclusion

5 Conclusion
  5.1 Summary of Results
  5.2 Impact and Implications
  5.3 Transfer Learning
  5.4 Applications to General Text
  5.5 A Note on ELMo and BERT
  5.6 Additional Future Work

Bibliography

Appendices
  A Chapter 2 Supporting Information
    A.1 Preprocessing Pipeline
    A.2 Preprocessing Procedures
    A.3 Corpus Histogram
    A.4 LSTMs and GRUs
    A.5 Attention Mechanism
    A.6 Hyperparameter Optimization Setup
    A.7 TF-IDF Weighted Word Embeddings
    A.8 Bootstrapping Procedures for Metrics
    A.9 Doc2Vec Embeddings
    A.10 CNN Embeddings
    A.11 Most Important Words for Each Classification Task
    A.12 Model Prediction Time
    A.13 Confusion Matrices