2016 AIPR: Keynote and Invited Talks

Schedule

Tuesday, October 18 - Deep Learning & Artificial Intelligence
9:00 AM - 9:45 AM - Jason Matheny, Director, IARPA
11:25 AM - Noon - Trevor Darrell, EECS, University of California-Berkeley
1:30 PM - 2:15 PM - Christopher Rigano, Office of Science and Technology, National Institute of Justice
3:30 PM - 4:05 PM - John Kaufhold, Deep Learning Analytics

Wednesday, October 19 - HPC & Biomedical
8:45 AM - 9:30 AM - Richard Linderman, SES, Office of the Assistant Secretary of Defense, Research and Engineering
11:45 AM - 12:30 PM - Vijayakumar Bhagavatula, Associate Dean, Carnegie Mellon University
2:00 PM - 2:45 PM - Patricia Brennan, Director, National Library of Medicine, National Institutes of Health
4:00 PM - 4:45 PM - Nick Petrick, Center for Devices and Radiological Health, Food and Drug Administration
7:15 PM - 9:15 PM - Terry Sejnowski, Salk Institute & UCSD, Banquet Speaker - Deep Learning II

Thursday, October 20 - Big 3D Data & Image Quality
8:45 AM - 9:30 AM - Joe Mundy, Brown University & Vision Systems
1:00 PM - 1:45 PM - Steven Brumby, Descartes Labs

Biographies and Abstracts (where available)

Banquet Talk: Deep Learning II Deep Learning is based on the architecture of the primate visual system, which has a hierarchy of visual maps with increasingly abstract representations. This architecture is based on what we knew about the properties of neurons in the visual cortex in the 1960s. The next generation of deep learning based on more recent understanding of the visual system will be more energy efficient and have much higher temporal resolution. Speaker Bio: Dr. Terrence Sejnowski received his Ph.D. in physics from Princeton University. He was a postdoctoral fellow at Princeton University and the Harvard Medical School. He served on the faculty of Johns Hopkins University and was a Wiersma Visiting Professor of Neurobiology and a Sherman Fairchild Distinguished Scholar at Caltech. He is now an Investigator with the Howard Hughes Medical Institute and holds the Francis Crick Chair at The Salk Institute for Biological Studies. He is also a Professor of Biology at the University of California, San Diego, where he is co-director of the Institute for Neural Computation and co-director of the NSF Temporal Dynamics of Learning Center. He is a pioneer in computational neuroscience and his goal is to understand the principles that link brain to behavior. His laboratory uses both experimental and modeling techniques to study the biophysical properties of synapses and neurons and the population dynamics of large networks of neurons. New computational models and new analytical tools have been developed to understand how the brain represents the world and how new representations are formed through learning algorithms for changing the synaptic strengths of connections between neurons. He has published over 500 scientific papers and 12 books, including The Computational Brain, with Patricia Churchland. 
Sejnowski is the President of the Neural Information Processing Systems (NIPS) Foundation, which organizes an annual conference attended by over 2000 researchers in machine learning and neural computation, and is the founding editor-in-chief of Neural Computation, published by the MIT Press. He is a member of the Institute of Medicine, National Academy of Sciences and the National Academy of Engineering, one of only ten current scientists elected to all three national academies.

Keynote Talk: Value of Machine Learning for National Security, Jason Matheny, Director IARPA: This talk will describe how machine learning has affected a range of national security missions, the value of past research, and priorities for future research. Speaker Bio: Dr. Jason Matheny is Director of the Intelligence Advanced Research Projects Activity (IARPA), a U.S. Government organization that invests in high-risk, high-payoff research in support of national intelligence. Before IARPA, he worked at Oxford University, the World Bank, the Applied Physics Laboratory, the Center for Biosecurity and Princeton University, and is the co-founder of two biotechnology companies. His research has been published in Nature, Nature Biotechnology, Biosecurity and Bioterrorism, Clinical Pharmacology and Therapeutics, Risk Analysis, Tissue Engineering, and the World Health Organization’s Disease Control Priorities, among others. Dr. Matheny holds a PhD in applied economics from Johns Hopkins, an MPH from Johns Hopkins, an MBA from Duke, and a BA from the University of Chicago. He received the Intelligence Community’s Award for Individual Achievement in Science and Technology.

Invited Talk: Perceptual representation learning across diverse modalities and domains, Trevor Darrell, University of California, Berkeley: Learning of layered or "deep" representations has provided significant advances in recent years, but has traditionally been limited to fully supervised settings with very large amounts of training data. New results show that such methods can also excel when learning in sparse/weakly labeled settings across modalities and domains. I'll review state-of-the-art models for fully convolutional pixel-dense segmentation from weakly labeled input, and will discuss new methods for adapting deep recognition models to new domains with few or no target labels for categories of interest. As time permits, I'll present recent results on long-term recurrent network models that can learn cross-modal descriptions and explanations. Speaker Bio: Prof. Darrell is on the faculty of the CS Division of the EECS Department at UC Berkeley and he is also appointed at the UC-affiliated International Computer Science Institute (ICSI). Darrell’s group develops algorithms for large-scale perceptual learning, including object and activity recognition and detection, for a variety of applications including multimodal interaction with robots and mobile devices. His interests include computer vision, machine learning, computer graphics, and perception-based human-computer interfaces. Prof. Darrell was previously on the faculty of the MIT EECS department from 1999-2008, where he directed the Vision Interface Group. He was a member of the research staff at Interval Research Corporation from 1996-1999, and received the S.M. and Ph.D. degrees from MIT in 1992 and 1996, respectively. He obtained the B.S.E. degree from the University of Pennsylvania in 1988, having started his career in computer vision as an undergraduate researcher in Ruzena Bajcsy's GRASP lab.

Speaker Bio: Chris Rigano serves as a senior computer scientist at the National Institute of Justice, Office of Justice Programs, U.S. Department of Justice and is responsible for advancing the science of person-based analytics for the federal, state, local, and tribal criminal justice communities. His work includes research in image and video analytics, biometrics and social network analysis. Prior to this assignment, Mr. Rigano worked on exploratory analytics as a contractor for the intelligence community. Mr. Rigano holds MS degrees in Computer Science from North Carolina State University and in Software Engineering from Monmouth University. He also holds a BS degree in Computer Science from Monmouth University and a BA degree in Criminal Justice from Iona College. He served in the US Army, reaching the rank of Captain. His experience includes working as technical direction agent at the Office of Naval Research for the MITRE Corporation. His career spans multiple research areas in computer communications, cyberspace, and social network and media analytics.

Invited Talk: Deep Learning Past, Present and Near Future. In the past 5 years, deep learning has become one of the hottest topics at the intersection of data science, society, and business. Google, Facebook, Microsoft, Baidu and other companies have embraced the technology and in domain after domain, deep learning is outperforming both people and competing algorithms at practical tasks. ImageNet Hit@5 object recognition error rates have fallen >85% since 2011, and networks can now recognize 1,000 different objects in photos faster and better than you can. All major speech recognition engines (Google’s, Baidu’s, Apple Siri, etc.) now use deep learning. In real time, deep learning can automatically translate a speaker’s voice in one language to the same voice speaking another language. Deep learning can now beat you at Atari and Go. These breakthroughs are visible both as product offerings and as competitive results on international open benchmarks. This recent disruptive history of deep learning has led to a student and startup stampede to master key elements of the technology—and this landscape is evolving rapidly. The abundance of open data, Moore’s law, Koomey’s law, Dennard scaling, an open culture of innovation, a number of key algorithmic breakthroughs in deep learning, and a unique investment at the intersection of hardware and software have all converged as factors contributing to deep learning’s recent disruptive successes. And continued miniaturization in the direction of internet-connected devices in the form of the “Internet of Things” promises to flood sensor data across new problem domains to an already large, innovative, furiously active, and well-resourced community of practice.
But with disruptive AI technologies comes apprehension—we now enjoy deep learning benefits like Siri every day, but privacy concerns, economic dislocation, anxieties about self-driving cars, and military drones all loom on the horizon, and our legal system has struggled to keep pace with technology. Speaker Bio: Dr. Kaufhold is a data scientist and managing partner of Deep Learning Analytics, a data science company named one of the four fastest growing companies in Arlington, Virginia in 2015. Dr. Kaufhold also serves as Secretary of the Washington Academy of Sciences and is a regular contributor to the DC Data Community, where he moderates the DC2 Deep Learning Discussion list. Prior to founding Deep Learning Analytics, Dr. Kaufhold investigated deep learning algorithms as a staff scientist at NIH. Prior to NIH, Dr. Kaufhold was the youngest member of the Technical Fellow Council at SAIC. Over 7 years at SAIC, Dr. Kaufhold served as principal investigator or technical lead on a number of large government contracts funded by NIH, DARPA and IARPA, among others, taking a sabbatical at MIT to study deep learning in 2010. Prior to joining SAIC, Dr. Kaufhold investigated machine learning algorithms for medical image analysis and image and video processing at GE's Global Research Center. On a Whitaker fellowship, Dr. Kaufhold earned his Ph.D. from Boston University's biomedical engineering department in 2001. Dr. Kaufhold is named inventor on >10 issued patents in image analysis, and author/coauthor on >40 publications in the fields of machine learning, image understanding and neuroscience.

Speaker Bio: Dr. Richard W. Linderman, a member of the Scientific and Professional Cadre of Senior Executives, is the Deputy Director for Information Systems and Cyber Technologies in the Office of the Assistant Secretary of Defense, Research and Engineering. In this position he provides the senior leadership and oversight of all Department of Defense science and technology programs in the areas of information technology; cyber security; autonomy; high performance computing and advanced computing; software and embedded systems; networks and communications; and large data and data-enabled decision-making tools. He also has cognizance over the complete spectrum of efforts in computers and software technology, communications, information assurance, and information management and distribution. Dr. Linderman was commissioned as a second lieutenant in May 1980. Upon completing four years of graduate studies, he entered active-duty, teaching computer architecture courses and leading related research at the Air Force Institute of Technology. He was assigned to Rome Air Development Center in 1988, where he led surveillance signal processing architecture activities. In 1991, he transitioned from active-duty to civil service as a senior electronics engineer at Rome Laboratory, becoming a principal engineer in 1997. During these years, he pioneered three-dimensional packaging of embedded architectures and led the Department of Defense community exploring signal and image processing applications of high performance computers. He conceived and demonstrated the use of PS3 gaming consoles to architect supercomputers with outstanding affordability and power efficiency. Dr. Linderman holds seven U.S. patents with three pending U.S. Patent Applications and has published more than 90 journal, conference and technical papers.

Invited Talk: Innovations in Correlation Filter Design Architectures for Robust Object Recognition In many computer vision problems, the main task is to match two appearances of an object (e.g., face, iris, vehicle, etc.) that may exhibit appearance differences due to factors such as translation, rotation, scale change, occlusion, illumination variation and others. One class of methods to achieve accurate object recognition in the presence of such appearance variations is one where features computed in a sliding window in the target image are compared to features computed in a stationary window of the reference image. Correlation filters are an efficient frequency-domain method to implement such sliding window matching. They also offer benefits such as shift-invariance (i.e., the object of interest doesn’t have to be centered), no need for segmentation, graceful degradation and closed-form solutions. While the origins of correlation filters go back more than thirty years, there have been some very interesting and useful recent advances in correlation filter designs and their applications. For example, the new maximum margin correlation filters (MMCFs) show how the superior localization capabilities of correlation filters can be combined with the generalization capabilities of support vector machines (SVMs). Another major research advance is the development of vector correlation filters that use features (e.g., HOG) extracted from the input image rather than just input image pixel values. While past application of correlation filters focused mainly on automatic target recognition, more recent applications include face recognition, iris recognition, palmprint recognition and visual tracking. This talk will provide an overview of correlation filter designs and applications, with particular emphasis on these more recent advances.
Speaker Bio: Prof. Vijayakumar (“Kumar”) Bhagavatula received his Ph.D. in Electrical Engineering from Carnegie Mellon University (CMU), Pittsburgh, and since 1982 he has been a faculty member in the Electrical and Computer Engineering (ECE) Department at CMU, where he is now the U.A. & Helen Whitaker Professor of ECE and the Associate Dean for the College of Engineering. He served as the Associate Head of the ECE Department and also as its Acting Department Head. Professor Kumar's research interests include Pattern Recognition and Coding and Signal Processing for Data Storage Systems and for Digital Communications. He has authored or co-authored over 600 technical papers, twenty book chapters and one book entitled Correlation Pattern Recognition. He served as a Topical Editor for Applied Optics and as an Associate Editor of IEEE Trans. Information Forensics and Security. Professor Kumar is a Fellow of IEEE, a Fellow of SPIE, a Fellow of the Optical Society of America (OSA) and a Fellow of the International Association of Pattern Recognition (IAPR).
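The frequency-domain sliding-window matching behind correlation filters can be illustrated with a minimal sketch: plain matched filtering via the FFT in NumPy, on synthetic data. (The MMCF and vector-feature designs discussed in the talk refine this basic recipe; everything below, including the sizes and the zero-mean trick, is an illustrative assumption, not the speaker's implementation.)

```python
import numpy as np

def correlation_plane(scene, template):
    """Match a template against a scene by FFT cross-correlation.

    Multiplying the scene spectrum by the conjugate spectrum of the
    zero-padded, zero-mean template and inverting yields the full
    sliding-window correlation plane in O(N log N); the peak marks
    the best match, giving shift-invariance for free.
    """
    t = np.zeros_like(scene, dtype=float)
    th, tw = template.shape
    t[:th, :tw] = template - template.mean()   # zero-mean sharpens the peak
    return np.fft.ifft2(np.fft.fft2(scene) * np.conj(np.fft.fft2(t))).real

# Plant the template in a noisy scene and recover its location.
rng = np.random.default_rng(0)
template = rng.random((8, 8))
scene = 0.1 * rng.random((64, 64))
scene[20:28, 30:38] += template
plane = correlation_plane(scene, template)
peak = np.unravel_index(np.argmax(plane), plane.shape)
print(peak)  # row/column of the correlation peak
```

The same product-of-spectra structure is what makes the closed-form filter designs mentioned in the abstract computationally attractive: the entire sliding-window search costs two FFTs and one inverse FFT.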

Keynote Talk: NLM Anticipating the Third Century: Image Informatics. The US National Library of Medicine is one of the 27 institutes and centers of the National Institutes of Health. NLM is the world’s largest research library in medicine and the life sciences, and is also home to research and development centers focusing on large databases in genomics, imagery and bibliographic data, as well as analytical tools that enable the use of this ‘big’ data by the biomedical and computational communities, and the general public. This talk will address NLM’s current work and future directions in image informatics – the confluence of image processing, machine learning and natural language processing. Among the many motivators for this work are automation in disease screening in resource-poor countries (for malaria and tuberculosis), automated extraction of bibliographic data from biomedical articles to build MEDLINE/PubMed citations, public access to photorealistic versions of rare historical volumes in biomedicine, mitigation of effects of large scale disasters (family reunification), and other goals within the NLM mission. The talk will conclude with an overview of the strategic vision of the NLM and its role in fostering efficient use of large data.
Speaker Bio: Patricia Flatley Brennan, RN, PhD, is the Director of the National Library of Medicine (NLM). The NLM is the world’s largest biomedical library and the producer of digital information services used by scientists, health professionals and members of the public worldwide. She assumed the directorship in August 2016. Dr. Brennan came to NIH from the University of Wisconsin-Madison, where she was the Lillian L. Moehlman Bascom Professor at the School of Nursing and College of Engineering. She also led the Living Environments Laboratory at the Wisconsin Institutes for Discovery, which develops new ways for effective visualization of high dimensional data. Dr. Brennan is a pioneer in the development of information systems for patients. She developed ComputerLink, an electronic network designed to reduce isolation and improve self-care among home care patients. She directed HeartCare, a web-based information and communication service that helps home-dwelling cardiac patients recover faster, and with fewer symptoms. She also directed Project HealthDesign, an initiative designed to stimulate the next generation of personal health records. Dr. Brennan has also conducted external evaluations of health information technology architectures and worked to repurpose engineering methods for health care. She received a master of science in nursing from the University of Pennsylvania and a PhD in industrial engineering from the University of Wisconsin-Madison. Following seven years of clinical practice in critical care nursing and psychiatric nursing, Dr. Brennan held several academic positions at Marquette University, Milwaukee; Case Western Reserve University, Cleveland; and the University of Wisconsin-Madison. A past president of the American Medical Informatics Association, Dr. Brennan was elected to the Institute of Medicine of the National Academy of Sciences (now the National Academy of Medicine) in 2001. She is a fellow of the American Academy of Nursing, the American College of Medical Informatics, and the New York Academy of Medicine.

Invited Talk: Validation of Quantitative Imaging and Computer-aided Diagnosis Tools, Nicholas Petrick, Division of Imaging, Diagnostics and Software Reliability, Office of Science and Engineering Laboratories, Center for Devices and Radiological Health, US Food and Drug Administration, Silver Spring, MD 20993, USA Advances in biology are improving our understanding of the mechanisms underlying diseases and response to therapeutic interventions. These advances provide new opportunities to match patients with diagnostics and therapies that are more likely to be safe and effective, and enable “precision medicine.” We are also seeing advances in medical imaging and computer technologies allowing the extraction of biologically relevant information from medical images giving rise to a wide range of quantitative imaging (QI) and computer-aided diagnosis (CAD) tools for use in clinical medicine. QI tools extract specific features or metrics from medical images relevant to, for example, disease state and treatment response. Radiological CAD tools are computerized algorithms that incorporate pattern recognition and data analysis capabilities (i.e., combine values, measurements, or features extracted from patient radiological data to discover patterns in the data). CAD systems typically merge information from multiple QI features, either directly (e.g., support vector machine classifier) or indirectly (e.g., deep learning methods). A distinction between QI and CAD is that in QI, the emphasis is on establishing a specific imaging metric’s association with a disease condition while the emphasis in CAD is on how the output aids the clinician in decision-making. In order to advance QI and CAD tools into clinical use, there is a strong need to establish appropriate and widely accepted performance assessment methods. This talk will provide an overview of our latest ongoing regulatory research related to technical and clinical performance assessment methods for QI and CAD tools.
These assessment techniques are not specific to medical image analysis but generalize to a wide range of pattern recognition and machine learning tools. Speaker Bio: Nicholas Petrick is Acting Division Director for the Division of Imaging, Diagnostics and Software Reliability within the U.S. Food and Drug Administration, Center for Devices and Radiological Health and was appointed to the FDA Senior Biomedical Research Service. He earned his B.S. degree from Rochester Institute of Technology in Electrical Engineering and his M.S. and Ph.D. degrees from the University of Michigan in Electrical Engineering Systems. His interests include imaging biomarkers, CAD, image processing, assessment methods and x-ray imaging physics.
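The idea that a CAD system merges multiple QI features into one decision score can be sketched in a few lines. The sketch below uses entirely synthetic "lesion" features and a least-squares linear classifier as a simple stand-in for the SVM or deep models named in the abstract; the feature names, numbers, and separability are illustrative assumptions, not clinical data.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200
# Toy stand-ins for QI features per lesion, e.g. (volume, mean
# density, texture score), drawn from two overlapping populations.
benign = rng.normal([10.0, 40.0, 0.30], [2.0, 5.0, 0.10], size=(n, 3))
malignant = rng.normal([14.0, 55.0, 0.50], [2.0, 5.0, 0.10], size=(n, 3))
X = np.vstack([benign, malignant])
y = np.array([-1.0] * n + [1.0] * n)

# CAD-style fusion: combine the QI features into a single decision
# score with a linear classifier fit by least squares.
A = np.hstack([X, np.ones((2 * n, 1))])        # append a bias column
w, *_ = np.linalg.lstsq(A, y, rcond=None)
accuracy = np.mean(np.sign(A @ w) == y)
print(accuracy)
```

Performance assessment of such a tool, the subject of the talk, would then examine this score against ground truth with held-out data, ROC analysis, and reader studies rather than a single resubstitution accuracy as printed here.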

Invited Talk: 3-D Reasoning from an AI Perspective: History and Future The field of artificial intelligence (AI) has from its very inception recognized the importance of grounding reasoning about the world in 3-d space. Scene understanding cannot be addressed without an underlying representation of 3-d space and its properties. In the first few decades of research, the use of 3-d geometric models was prominent in describing scene content. At the same time, a broad base of logical reasoning was developed as the key approach to knowledge representation. The two research threads intersected in an area of research called geometric reasoning, as will be described in the presentation. Starting in the mid-90s, the role of 3-d representation began to be deemphasized in favor of relying on massive datasets of 2-d images for training classifiers without any particular internal representation of the scene. This trend was further accelerated in the last decade by the resurgence of neural networks for the third time in AI’s existence. In this case, significant success was achieved because of the virtually exhaustive datasets and fast GPU-based hardware to support adapting millions of network parameters. In spite of this success in data-driven methods there is still a huge gap in understanding the underlying nature of scenes, and to express this understanding in an ontological/logical form. It is argued that such understanding is essential to deal with scene content and events that are not often visually observed but that can be characterized by generalization and abstract reasoning. The talk will provide examples and a conclusion about the way forward. Speaker Bio: Dr. Mundy received his B.E.E. (1963) and M.Eng. (1966) and Ph.D. (1969) from Rensselaer Polytechnic Institute. He has published over 100 papers and articles in computer vision and solid state electronics. Dr. Mundy joined General Electric’s Research and Development Center (CRD) in 1963.
His early projects at CRD include: High power microwave tube design, superconductive computer memory devices, the design of high density integrated circuit associative memory arrays, and the application of transform coding to image data compression. He is the co-inventor of varactor bootstrapping, a key technique still widely used today in the design of CMOS integrated circuits. From 1972 until 2002, Dr. Mundy led a group involved in the research and development of image understanding and computer vision systems. In the early 1970’s his group developed one of the first major applications of computer vision to industrial inspection – inspecting incandescent lamp filaments at the rate of 15 parts/sec. and achieved classification performance of less than one error per thousand. His more recent research themes at CRD included: industrial photogrammetry for machine control, theory of geometric invariance, change detection in satellite imagery and CT classification of lung cancer lesions. In 1988, Dr. Mundy was named a Coolidge Fellow, GE’s highest technical honor. He applied the fellowship to a sabbatical at Oxford University, working with Sir Michael Brady and Prof. Andrew Zisserman on applications of invariant theory to computer vision. This work led to the Marr Prize award in 1993. In 2002 Dr. Mundy joined the School of Engineering at Brown as Prof. of Engineering (research). His research at Brown, under sponsorship of DARPA and NGA, includes video and image analysis, with emphasis on change detection and 3-d volumetric modeling. In 2011, Dr. Mundy co-founded Vision Systems Inc. and is President and CEO. At VSI, Dr. Mundy has managed the DARPA Tailwind and Visual Media Reasoning projects that are aimed at aerial video processing and scene analysis. Dr. Mundy also provides general project management of current VSI efforts in 3-d modeling from satellite imagery, geo-location of ground-level imagery and facial recognition.

Invited Talk: From Pixels to Answers: Cloud-based forecasting of global-scale systems using a living atlas of the world Decades of Earth observation missions have accumulated eight petabytes of science-grade imagery, which are now available in commercial cloud storage. These cloud platforms enable far more than just efficient storage and dissemination – adjacent massive computing resources constitute on-demand supercomputing at a scale that was previously only available to national laboratories, in principle enabling automated real-time analysis of all this imagery. The remaining limitations are the need for remote sensing science and machine learning algorithms and software that can exploit this data-rich environment, and for ground-truth data to train and validate models. While good sources of ground data remain elusive, recent advances in deep learning algorithms and scientific software tools have reached a critical threshold of new capability - we can now combine all the available satellite imagery into a living atlas of the world, and use that atlas to build a forecasting platform for global-scale systems. We describe processing petabytes of compressed raw image data acquired by the NASA/USGS Landsat, NASA MODIS, and ESA Sentinel programs over the past 40 years. Using commodity cloud computing resources, we convert the imagery to a calibrated, georeferenced, multi-resolution format suited for machine learning analysis. We use this multi-sensor, multi-constellation dataset for global-scale agricultural forecasting, environmental monitoring and disaster analysis. We apply remote sensing and deep learning algorithms to detect and classify agricultural crops and then estimate crop yields. In regions with poor ground data, still the case for most of the world, we explore simpler indicators of crop health for famine early warning. Our dataset and approach support monitoring water resources and forestry resources, and characterization of land use within cities.
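The "simpler indicators of crop health" mentioned in the abstract are typically per-pixel band ratios; one widely used example is the Normalized Difference Vegetation Index (NDVI), computed from red and near-infrared reflectance. The sketch below uses small synthetic reflectance tiles, not actual Landsat or Sentinel data, and is an illustrative assumption rather than the Descartes Labs pipeline.

```python
import numpy as np

def ndvi(nir, red, eps=1e-9):
    """Normalized Difference Vegetation Index, in [-1, 1].

    Healthy vegetation reflects strongly in the near-infrared and
    absorbs red light, so NDVI rises with canopy vigor; bare soil
    and water sit near or below zero.
    """
    nir = nir.astype(float)
    red = red.astype(float)
    return (nir - red) / (nir + red + eps)

# Synthetic reflectance tiles: a vigorous field vs. bare soil.
crop_nir, crop_red = np.full((2, 2), 0.50), np.full((2, 2), 0.08)
soil_nir, soil_red = np.full((2, 2), 0.30), np.full((2, 2), 0.25)
healthy = ndvi(crop_nir, crop_red).mean()
bare = ndvi(soil_nir, soil_red).mean()
print(healthy, bare)
```

Because the index is a simple pointwise array operation, it scales naturally to the tiled, cloud-resident imagery the talk describes: each tile can be processed independently and in parallel.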
Speaker Bio: Dr. Steven P. Brumby is Co-Founder and Chief Strategy Officer of Descartes Labs, a venture-backed start-up spun out of Los Alamos National Laboratory focused on understanding agriculture, natural resources and human geography using machine learning and satellite imagery in the cloud. Previously, Steven was a Senior Research Scientist at Los Alamos National Laboratory working on image, video and signals analysis for astronomy, planetary science and earth observation missions. He received his Ph.D. in Theoretical Physics at the University of Melbourne (Australia) in 1997.