Towards a Sustainable Ecosystem for Data Driven Research and Innovation

Total Page:16

File Type:pdf, Size:1020Kb

Towards a Sustainable Ecosystem for Data Driven Research and Innovation Towards a Sustainable Ecosystem for Data Driven Research and Innovation Dr. Francine Berman Chair, Research Data Alliance / US Edward P. Hamilton Distinguished Professor of Computer Science, Rensselaer Polytechnic Institute Fran Berman 1 Innovation opportunity: Research data driving solutions to scientific and societal challenges Who is most at risk to contract asthma? How can we increase wheat yields? Image: Lucas Taylor How can we best address energy needs and sustain the environment? How accurate is the Standard Model of Image: Ceinturion, Wikipedia Physics? Fran Berman 2 Infrastructure reality: Access, use and re-use of data now and in the future presupposes data sustainable stewardship and preservation today • Stewardship and Preservation critical: “Homeless” data cease to exist • Sustainable data infrastructure necessary to support – Data management plans – Public access to research data – Use and re-use of data – Reproducibility of results Fran Berman 3 How can we usefully think about sustainability? Sustainable development: "development that meets the needs of the present without compromising the ability of future generations to meet their own needs.“ Our Common Future, U.N. Brundtland Commission • Key components – Ecological sustainability – Cultural / institutional sustainability – Economic sustainability – Political sustainability Planet image: NASA; Quote from “Our Common Future” http://www.unFran Berman- 4 documents.net/our-common-future.pdf How can we measure sustainability? • Circles of Sustainability developed to assess and understand sustainability. Used – For managing projects directedAssessment Action towards socially sustainable outcomes Sustainability analysis helps identify strategic – To assess the sustainability of areas for policy, cities and urban settlements practice and • Used by global organizations investment including the United Nations Global Compact Cities Programme, The World Association of Metropolises, World Vision, and others. Fran Berman 5 Circles of sustainability provide a useful lens with which to consider sustainability of digital research data How can we What create a viable infrastructure is support model for needed to digital access and support digital preservation? stewardship and preservation? Why is digital How do we preservation and maximize the access such a hard access, sharing sell? and exchange of digital research data? Fran Berman 6 What infrastructure is needed to support digital stewardship and preservation? Fran Berman 7 Data-driven Geoscience: How can we respond to large-scale earthquakes? Earthquake simulations enable • Enhanced scientific understanding of the physical world • More strategic plans for bridge, building and other physical infrastructure reinforcements to increase safety • Better disaster response planning for police, fire fighters, ER teams in high- risk areas to increase their effectiveness Simulation courtesy of Amit Chourasia, SDSC, Table information courtesy of 8 Southern California Earthquake Center Fran Berman TeraShake simulation of 7.7 earthquake on the lower San Andreas fault Earthquake simulations enable • Enhanced scientific understanding of the physical world • More strategic plans for bridge, building and other physical infrastructure reinforcements to increase safety • Better disaster response planning for police, fire fighters, ER teams in high- risk areas to increase their effectiveness Simulation courtesy of Amit Chourasia, SDSC, Table information courtesy of 9 Southern California Earthquake Center Fran Berman Terashake data TeraShake Resources infrastructure* Computers and Systems • 80,000 hours on IBM Power 4 (DataStar) • Data Management • 256 GB memory p690 used for testing, p655s used for production run, – 10 Terabytes moved per day during TeraGrid used for porting execution over 5 days • 30 TB Global Parallel file GPFS – Derived data products registered into SCEC • Run-time 100 MB/s data transfer from digital library (total SCEC library has 168 TB) GPFS to SAM-QFS • Data Post-processing: • 27,000 hours post-processing for high resolution rendering – Movies of seismic wave propagation – Seismogram formatting for interactive on- People line analysis • 20+ people for IT support – Derived data: • 20+ people in domain research • Velocity magnitude Storage • Displacement vector field • SAM-QFS archival storage • Cumulative peak maps • HPSS backup • Statistics used in visualizations • Storage Resource Broker collection with 1,000,000 files * circa ~2007 Fran Berman 10 Technical infrastructure critical part of the ecosystem for data-driven innovation Data access via portals, science gateways, etc. Database and data collection systems Data services to support use and re-use Data analysis algorithms, data-driven models and simulations Data visualization tools Semantic frameworks Data management systems Data storage Fran Berman 11 Social, organizational, and human infrastructure for data-driven results equally important Policy Common Systems Standards Interoperability Sustainable Economics Workforce and Community Practice training Traffic Image: Fran Berman 12 Mike Gonzalez Data-driven ecosystem requires multiple kinds of infrastructure Data-driven Innovation Data Infrastructure Technical Infrastructure Social Infrastructure SW and systems, Tools and Policy, Practice, Standards, algorithms, Hardware and Rights, Community facilities culture Fran Berman 13 How can we create a viable support model for digital access and preservation? Fran Berman 14 Data economics: Responsible data stewardship requires a viable business model for sustaining its underlying infrastructure Data infrastructure costs increase with More mgt., stewardship usage, stewardship required and access More Long-lived curation requirements, data perceived value required “Locally Manageable” Greater costs at the Data extremes (including Access control, “big” data) … Big data broad access Coupled data services Fran Berman 15 It’s not just about the cost of storage Data Infrastructure costs include Resources and • Maintenance and upkeep Resource Refresh • Software tools and packages Model A (8-yr,15.2-mo 2X) TB Stored Planned Capacity 100000.0 • Utilities (power, cooling) • Space 10000.0 • Networking • Security and failover systems 1000.0 • People (expertise, help, infrastructure (TB) Storage Archival management, development) 100.0 • Training, documentation 10.0 June-97 June-98 June-99 June-00 June-01 June-02 June-03 June-04 June-05 June-06 June-07 June-08 June-09 • Monitoring, auditing Date • Reporting costs SDSC Data Storage Growth ‘97-’09 • Costs of compliance with regulation, • Most valuable data replicated policy, etc. … • As research collections increase, storage capacity must stay ahead of demand Information courtesyFran Berman of Richard Moore,16 SDSC Increased requirements for access mean increased need for data infrastructure • Increasing U.S. federal R&D agency requirements for managed and accessible research data • February 2013 OSTP Memo: New policies for public access to research data and publications – Strategy for capitalizing on what exists and fostering public-private partnerships with scientific journals – Strategy for increasing / enhancing discoverability, access, dissemination, stewardship, preservation – No new money: “Identification of resources within the existing agency budget to implement the plan” Fran Berman 17 Economics of public access: Who pays the data bill? Article: Science Magazine, August 9, 2013. Free public access link at http:/www.cs.rpi.edu/~bermaf/ Fran Berman 18 Op-ed recommendations: Cultivate / coordinate preservation and stewardship options in every sector Private Public Sector Sector Academia Individuals Fran Berman 19 Op-ed recommendations: Cultivate / coordinate preservation and stewardship options in every sector Private Sector • Facilitate private sector stewardship of public access research data as a public Not govt. ?? Govt. good supported supported Public Sector • Clarify public sector stewardship commitments: articulate what data will / won’t be supported Charleston Ballet blog: http://allianceblog.org/tag/charleston-ballet/ ; corporateFran and Berman collection logos 20 Op-ed recommendations: Cultivate / coordinate preservation and stewardship options in every sector Academic Sector • Create sustainable university library and repository stewardship solutions Individuals • Evolve research culture to take advantage of what works in the private sector Fran Berman 21 No magic economic bullet. Coordination between approaches can provide even more robust options for stewardship Private Public Sector Sector Academia Individuals Fran Berman 22 Why is digital preservation and access such a hard sell? Fran Berman 23 Academic and public sector infrastructure challenges Research Infrastructure What is New discoveries and breakthroughs Failure of systems newsworthy? What is the value Domain and national leadership Enabler of innovation proposition? and competitiveness What is the Fixed-term funding Continuous long-term support funding model? Who is Various govt. R&D agencies, NGOs, No-one’s major priority responsible? etc. Stephanie A. Miner, the Syracuse mayor, said [infrastructure is] too often overlooked when politicians want to spend money on economic development. “You don’t cut ribbons for new water mains, but that’s really what matters.” NY Times, Feburary 15, 2014 Fran Berman 24 Systemic challenges to sustainable stewardship (from the Blue Ribbon Task Force Interim Report [at brtf.sdsc.edu]) • Poor alignment between stakeholders
Recommended publications
  • Measuring the Impact of Digital Repositories
    Measuring the Impact of Digital Repositories February 28-March 1, 2017 Participant Introductions Bruce Ambacher’s career spans four decades at NARA, George Mason University, and the University of Maryland. His responsibilities included service as acting chief of NARA’s digital preservation unit and as court-appointed preservation officer for the PROFS, Iran-Contra, and Clinton email collections. He represented NARA on the Federal Geographic Data Committee and helped develop federal and international geospatial standards. He was NARA’s representative for the OAIS Reference Model. He co-chaired the development of TRAC. He helped develop the trustworthy digital repositories standards ISO 16363 and ISO 16919. He began teaching courses in archives and digital preservation at George Mason University in 1984, became an adjunct professor at the University of Maryland in 2000 and a Visiting Professor between 2007 and 2013. He has consulted on digital preservation for industry and cultural humanities institutions. He is a Research Affiliate of the Digital Curation Innovation Center at the University of Maryland Mary Barlow is head of the Strategic Project Management Office at the European Bioinformatics Institute (EMBL-EBI), and has oversight of impact reporting and the development of infrastructure business cases. Prior to taking on this role, Mary served as a Programme Manager for the multi-million pound investment in EMBL-EBI by the UK government's Large Facilities Capital Fund. This programme included the construction of new office space and the on-going public procurement of ICT infrastructure to support EMBL-EBI's growing public databases. Mary's work prior to EMBL-EBI focused on ICT integration and intelligent buildings.
    [Show full text]
  • The Institutional Challenges of Cyberinfrastructure and E-Research
    E-Research and E-Scholarship The Institutional Challenges Cyberinfrastructureof and E-Research cholarly practices across an astoundingly wide range of dis- ciplines have become profoundly and irrevocably changed By Clifford Lynch by the application of advanced information technology. This collection of new and emergent scholarly practices was first widely recognized in the science and engineering disciplines. In the late 1990s, the term e-science (or occasionally, particularly in Asia, cyber-science) began to be used as a shorthand for these new methods and approaches. The United Kingdom launched Sits formal e-science program in 2001.1 In the United States, a multi-year inquiry, having its roots in supercomputing support for the portfolio of science and engineering disciplines funded by the National Science Foundation, culminated in the production of the “Atkins Report” in 2003, though there was considerable delay before NSF began to act program- matically on the report.2 The quantitative social sciences—which are largely part of NSF’s funding purview and which have long traditions of data curation and sharing, as well as the use of high-end statistical com- putation—received more detailed examination in a 2005 NSF report.3 Key leaders in the humanities and qualitative social sciences recognized that IT-driven innovation in those disciplines was also well advanced, though less uniformly adopted (and indeed sometimes controversial). In fact, the humanities continue to showcase some of the most creative and transformative examples of the use of information technology to create new scholarship.4 Based on this recognition of the immense disciplinary scope of the impact of information technology, the more inclusive term e-research (occasionally, e-scholarship) has come into common use, at least in North America and Europe.
    [Show full text]
  • The OCLC Research Survey of Special Collections and Archives
    Taking Our Pulse: The OCLC Research Survey of Special Collections and Archives Jackie M. Dooley Program Officer Katherine Luce Research Intern OCLC Research A publication of OCLC Research Taking Our Pulse: The OCLC Research Survey of Special Collections and Archives Taking Our Pulse: The OCLC Research Survey of Special Collections and Archives Jackie M. Dooley and Katherine Luce, for OCLC Research © 2010 OCLC Online Computer Library Center, Inc. Reuse of this document is permitted as long as it is consistent with the terms of the Creative Commons Attribution-Noncommercial-Share Alike 3.0 (USA) license (CC-BY-NC- SA): http://creativecommons.org/licenses/by-nc-sa/3.0/. October 2010 Updates: 15 November 2010, p. 75: corrected percentage in final sentence. 17 November 2010, p. 2: added Creative Commons license statement. 28 January 2011, p. 25, penultimate para., line 3: deleted “or more” following “300%”; p. 26, final para., 5th line: changed 89 million to 90 million; p. 30, final para.: changed 2009-10 to 2010-11; p. 75, final para.: changed 400 to 80; p. 76, 2nd para.: corrected funding figures; p. 90, final line: changed 67% to 75%. OCLC Research Dublin, Ohio 43017 USA www.oclc.org ISBN: 1-55653-387-X (978-1-55653-387-7) OCLC (WorldCat): 651793026 Please direct correspondence to: Jackie Dooley Program Officer [email protected] Suggested citation: Dooley, Jackie M., and Katherine Luce. 2010. Taking our pulse: The OCLC Research survey of special collections and archives. Dublin, Ohio: OCLC Research. http://www.oclc.org/research/publications/library/2010/2010-11.pdf. http://www.oclc.org/research/publications/library/2010/2010-11.pdf October 2010 Jackie M.
    [Show full text]
  • 2018 Annual Report Alfred P
    2018 Annual Report Alfred P. Sloan Foundation $ 2018 Annual Report Contents Preface II Mission Statement III From the President IV The Year in Discovery VI About the Grants Listing 1 2018 Grants by Program 2 2018 Financial Review 101 Audited Financial Statements and Schedules 103 Board of Trustees 133 Officers and Staff 134 Index of 2018 Grant Recipients 135 Cover: The Sloan Foundation Telescope at Apache Point Observatory, New Mexico as it appeared in May 1998, when it achieved first light as the primary instrument of the Sloan Digital Sky Survey. An early set of images is shown superimposed on the sky behind it. (CREDIT: DAN LONG, APACHE POINT OBSERVATORY) I Alfred P. Sloan Foundation $ 2018 Annual Report Preface The ALFRED P. SLOAN FOUNDATION administers a private fund for the benefit of the public. It accordingly recognizes the responsibility of making periodic reports to the public on the management of this fund. The Foundation therefore submits this public report for the year 2018. II Alfred P. Sloan Foundation $ 2018 Annual Report Mission Statement The ALFRED P. SLOAN FOUNDATION makes grants primarily to support original research and education related to science, technology, engineering, mathematics, and economics. The Foundation believes that these fields—and the scholars and practitioners who work in them—are chief drivers of the nation’s health and prosperity. The Foundation also believes that a reasoned, systematic understanding of the forces of nature and society, when applied inventively and wisely, can lead to a better world for all. III Alfred P. Sloan Foundation $ 2018 Annual Report From the President ADAM F.
    [Show full text]
  • Istec Distinguished Lectures, Colorado State
    Colorado State University’s Information Science and Technology Center (ISTeC) presents two lectures by Dr. Fran Berman Director, San Diego Supercomputer Center Professor and HPC Endowed Chair, UC San Diego ISTeC Distinguished Lecture in conjunction with the Electrical and Computer Engineering Department and Computer Science Department Seminar Series “100 Years of Digital Data” Monday, March 31, 2008 Reception: 10:30 a.m. Lecture: 11:00 – 12:00 noon Location: Lory Student Center Room 214 • Joint Electrical and Computer Engineering Department and Computer Science Department Special Seminar sponsored by ISTeC “Cyberinfrastructure Challenges in Computer Science” Monday, March 31, 2008 Lecture: 4:00 – 5:00 p.m. Location: Wagar Room 231 ABSTRACTS “100 Years of Digital Data” The Information Age has brought with it a deluge of digital data. Current estimates are that in 2006, 161 exabytes (10^18 bytes) of digital data were created from cell phones, computers, iPods, DVDs, sensors, satellites, scientific instruments, and other sources, providing a foundation for our digital world. Migrating digital content through new generations of storage media, making sense of its content, and ensuring that needed information is accessible now and for the foreseeable future constitute some of the most critical challenges of the Information Age. The San Diego Supercomputer Center (SDSC) is a national Center leading the development and deployment of a comprehensive infrastructure for managing, storing, preserving, and using digital data. Leveraging ongoing collaborations with the research community (National Science Foundation, Department of Energy, etc.), data preservation and archival communities (Library of Congress, National Archives and Records Administration) and other partners, SDSC is providing innovative leadership in the emerging area of Data Cyberinfrastructure.
    [Show full text]
  • Rensselaer Professor Francine Berman Appointed by President Obama To
    Rensselaer Professor Francine Berman Appointed by President Obama to... http://news.rpi.edu/content/2016/01/11/francine-berman-appointed-presi... For the Media For the Community Archive Home > News > Faculty > Francine Berman Appointed by P... Top 100 Science Stories of 2015 - #59 A Wider, Groovier January 11, 2016 Milky Way Galaxy By Mary L. Martialay Rensselaer Cybersecurity Francine Berman has been appointed by President Obama and Expert Appointed to confirmed by the U.S. Senate to serve on the National Council on the Homeland Security Science Humanities (NCH), a board of 26 distinguished individuals who advise and Technology Advisory the chairman of the National Endowment of the Humanities (NEH). Committee Dr. Berman is the Edward P. Hamilton Distinguished Professor in Computer Science at Rensselaer Polytechnic Institute (RPI). Rensselaer Receives $9.44 Million Grant From U.S. Berman is an international leader in data science whose work focuses Department of Energy on the development of sustainable infrastructure for digital stewardship and preservation. Such infrastructure is critical to Modeling, Design, and support data-driven research and new discovery in a broad variety of Controls Expert B. Wayne fields, including the digital humanities. Berman has worked closely Bequette Named Fellow of with the library and museum community and co-chaired the Blue IEEE Ribbon Task Force on Sustainable Digital Preservation and Access and Popular Memories the National Research Council Board on Research Data and Information. She currently serves as the U.S. chair of the Research Data Alliance, an international community-driven organization founded to accelerate research data sharing and data-driven innovation worldwide.
    [Show full text]
  • Update on the Research Data Alliance, Etc
    Update on the Research Data Alliance, etc. Dr. Francine Berman Chair, Research Data Alliance / US Edward P. Hamilton Distinguished Professor of Computer Science, Rensselaer Polytechnic Institute Fran Berman, Research Data Alliance Data Sharing Driving New Discovery and Advances Fran Berman, Research Data Alliance Both Technical and Social Infrastructure Needed to support Data Sharing Adopted Policy Systems Common Types, Interoperability Standards, Metadata Adopted Sustainable Economics Training, Education, Community Practice Workforce Fran Berman, Research Data Alliance Traffic Image: Mike Gonzalez Prioritizing Infrastructure for Effort and Investment Challenging Stephanie A. Miner, the Syracuse mayor, said [infrastructure is] too often overlooked when politicians want to spend money on economic development. “You don’t cut ribbons for new water mains, but that’s really what matters.” NY Times, Feburary 15, 2014 Fran Berman, Research Data Alliance Getting the World Involved in Building / Coordinating Data Sharing Infrastructure: the Research Data Alliance . Research Data Alliance (RDA): Global community-driven organization whose mission is to build the social and technical bridges (infrastructure) that enable data sharing. Research Data Alliance Vision: Researchers and innovators openly share data across technologies, disciplines, and countries to address the grand challenges of society. Fran Berman, Research Data Alliance RDA: Accelerate Data Sharing and Interoperability Across Cultures, Communities, Scales, Technologies . Technical parts
    [Show full text]
  • Wednesday October 4
    We offer 20 tracks to help you navigate the schedule. Like tracks are color coded for even easier exploration. CAREER ARTIFICIAL WEDNESDAY INTELLIGENCE COMMUNITY OCTOBER 4 COMPUTER SYSTEMS CRA-W ENGINEERING STUDENT DATA SCIENCE OPPORTUNITY LAB HUMAN COMPUTER ACM RESEARCH INTERACTION COMPETITION INTERACTIVE MEDIA GENERAL POSTER SESSION SECURITY/PRIVACY GENERAL SESSION SOFTWARE ENGINEERING LUNCHES & RECEPTIONS OPEN SOURCE SPECIAL SESSIONS ORGANIZATION IOT / WEARABLE TECH TRANSFORMATION PRODUCTS A TO Z //////////////////////////////////////////////// TUESDAY, OCTOBER 3 / 5 - 6 p.m. GENERAL SESSION PRESENTATION First Timers Orientation OCCC W230C All NOTE: OCCC stands for the Orange County Convention Center #GHC17 DAY 1: WEDNESDAY #GHC17 DAY 1: WEDNESDAY //////////////////////////////////////////////// //////////////////////////////////////////////// 9 - 10:30 a.m. 11 a.m. - 6:30 p.m. GENERAL SESSION CAREER Wednesday Keynote Interviews OCCC WA2 Melinda Gates OCCC WB3/4 (Bill & Melinda Gates Foundation), Fei-Fei Li //////////////////////////////////////////////// (Stanford University; Google Cloud) 11:30 a.m. - 12:30 p.m. ARTIFICIAL INTELLIGENCE PRESENTATIONS //////////////////////////////////////////////// 10:30 a.m. - 5:30 p.m. Presentations: AI for Social Good OCCC W304C Jennifer Marsman (Microsoft), CAREER EXPO Neelima Kumar (Oracle) Beginner/Intermediate Career Fair CAREER PANEL OCCC WA3/4 & WB1/2 All For Good and For Profit: Exploring Careers in SPECIAL SESSIONS Social Enterprise Speaker Lounge OCCC W305 All Hyatt Regency Ballroom V Hannah SPECIAL SESSIONS Calhoon (Blue Ridge Labs@Robin Hood), Kamla Kasichainula (Change.org), Erin Mote Faculty Lounge OCCC W209C Faculty (InnovateEDU), Morgan Berman (MilkCrate), Donnovan Andrews (Overture) All //////////////////////////////////////////////// 10:30 a.m. - 6 p.m. CAREER PANEL Women in Tech: Get a Seat @ the Table! SPECIAL SESSIONS Hyatt Regency Ballroom S Monique Student Lounge sponsored by D.E.
    [Show full text]
  • Envisioning the Future: Research Advances 2007
    SAN DIEGO SUPERCOMPUTER CENTER at UC SAN DIEGO ENVISIONING TH E FUTUR E Research Advances 2007 With SDSC Data Cyberinfrastructure You Can Support Emergency Responders Design a Bird Flu Vaccine Preserve Historical Data on a Humanities Grid Travel Back to the Early Universe Meet the Ethanol Challenge Harvest Iceberg Data to Explore Climate Change and more... SAN DIEGO SUPERCOMPUTER CENTER For more than two decades, the San Diego Supercomputer Center (SDSC) has enabled breakthrough data-driven and computational science and engineering discoveries through the innovation and provision of high-performance computing resources, information infrastructure, technologies and interdisciplinary expertise. A key resource to academia and industry, SDSC is an international leader in data cyberinfrastruc- Fran Berman ture and computational science, and serves as a Director national data repository to nearly 100 public and San Diego Supercomputer Center private data collections. SDSC is an Organized Envisioning Research Unit and integral part of the University of California, San Diego and a founding site of the Cyberinfrastructure’s NSF TeraGrid. Future at SDSC As SDSC turns 21 we begin “envisioning the future” of what best captures the essence SDSC INFORMATION of the Center—data cyberinfrastructure Dr. Francine Berman, Director Page 2 San Diego Supercomputer Center University of California, San Diego 9500 Gilman Drive MC 0505 La Jolla, CA 92093-0505 Phone: 858-534-5000 Fax: 858-534-5152 [email protected] Arthur Ellis www.sdsc.edu Vice Chancellor for Research UC San Diego Warren R. Froelich Director of Communications and Public Relations SDSC: Driving Innovation at [email protected] UCSD and Beyond 858-822-3622 SDSC data and computational technolo- gies are poised to power new collaborative SDSC RESEARCH ADVANCES discoveries Published annually by SDSC and presenting leading-edge research in data cyberinfrastructure Page 3 and computational science.
    [Show full text]
  • The Research Data Alliance Building Community & Infrastructure For
    The Research Data Alliance Building Community & Infrastructure for Data Sharing World-wide Presentation to Pitt and CMU library faculty & staff on data sharing and the Research Data Alliance Thursday, October 22, 11 am-noon Mellon Institute Conference Room (MI 348), CMU Francine Berman, Research Data Alliance Three years ago, the Research Data Alliance (rd-alliance.org) was launched as a community-driven international organization with a mission to develop social and technical infrastructure that supports and accelerates data sharing world-wide. Since its launch, the RDA has grown precipitously and now has over 3100 members from 100+ countries. U.S. Chair Francine Berman discusses the opportunities and challenges of sharing research data, and describes RDA’s global efforts to build an agile and functional organization, a broad and cohesive community, and a pipeline of impact-focused infrastructure adopted by individuals, projects and organizations to solve problems. Dr. Berman is the Edward P. Hamilton Distinguished Professor in Computer Science at Rensselaer Polytechnic Institute. She is a Fellow of the Association of Computing Machinery (ACM), a Fellow of the Institute of Electrical and Electronics Engineers (IEEE), and a Fellow of the American Association for the Advancement of Science (AAAS). In 2009, she was the inaugural recipient of the ACM/IEEE-CS Ken Kennedy Award for “influential leadership in the design, development, and deployment of national-scale cyberinfrastructure.” Dr. Berman is U.S. lead of the Research Data Alliance (RDA). She also serves as Chair of the Anita Borg Institute Board of Trustees, as co-Chair of the NSF Advisory Committee for the Computer and Information Science and Engineering (CISE) Directorate, as a member of the Board of Trustees of the Sloan Foundation, and as a member of the Board of Directors of the Monterey Bay Aquarium Research Institute (MBARI).
    [Show full text]
  • A Grid-Based Middleware for Processing Distributed Data Streams
    A GRID-BASED MIDDLEWARE FOR PROCESSING DISTRIBUTED DATA STREAMS DISSERTATION Presented in Partial Fulfillment of the Requirements for the Degree Doctor of Philosophy in the Graduate School of The Ohio State University By Liang Chen, M.S. * * * * * The Ohio State University 2006 Dissertation Committee: Approved by Gagan Agrawal, Adviser Hakan Ferhatosmanoglu Adviser Han-Wei Shen Graduate Program in Computer Science and Engineering ABSTRACT Increasingly, a number of applications rely on, or can potentially benefit from, analysis and monitoring of data streams. Moreover, many of these applications involve high volume data streams and require real-time and distributed processing of data arising from a distributed set of sources. We believe that a grid environment is well suited for flexible and adaptive analysis of these streams. Thus, our research focuses on designing and evaluating a middleware to support the processing of these streams. Our system is referred to as GATES (Grid-based AdapTive Execution on Streams). It flexibly achieves the best accuracy that is possible while maintaining the real-time constraint on the analysis. We have developed self-adaptation algorithms for this purpose. Further, we have addressed the problem of resource allocation in the GATES system. We design a static resource allocation algorithm to generate sub-optimal allocations and the experimental evaluation shows these allocations are very close to the optimal one. We also develop an infrastructure to support resource monitoring and dynamic resource allocation. Moreover, we implement efficient and low-cost dynamic migration for pipelined applications in a grid environment by using the notion of Light-weight Summary Structure (LSS). ii We have studied various data stream applications with support of GATES to intensively evaluate our middleware.
    [Show full text]
  • Making Research and Education Cyberinfrastructure Real
    Making Research and Education Cyberinfrastructure By Francine Berman REALhe Industrial Age transformed the world through the application of technology to research, practice, and everyday life. Technology revolutionized manufacturing, transportation, agriculture, and communications and created deep social and eco- nomic changes that continue today. In the last century, the explosion of information technologies ushered in an Information Age with similar transformative poten- tial. In 2008, it is hard to imagine modern life and work without the ability to access, manipulate, organize, and understand a sea of digital information on almost every conceivable topic. Francine Berman is Director of the San Diego Supercomputer Center and is Professor and High Performance Computing Endowed Chair in the Department of Computer Science and Engineering at the University of California, San Diego. 18 EDUCAUSE review July/August 2008 © 2008 Francine Berman Illustration by Dung Hoang, © 2008 Comprised of dynamically The driving engine for the Informa- evolving information tion Age is cyberinfrastructure (CI): the orga- nized aggregate of information technolo- technologies, CI is both a gies (computers, storage, data, networks, scientific instruments) that can be coordi- continuous work-in-progress and nated to address problems in science and society. Fundamental to modern research, a stable infrastructure driver for education, work, and life, CI has the po- tential to overcome the barriers of geog- invention and innovation. raphy, time, and individual capability to create new paradigms and approaches, to Technology, Leadership Under Challenge: progression of Parkinson’s is an active catalyze invention, innovation,1 and dis- Information Technology R&D in a Competitive area of research. Modern efforts to under- covery, and to deepen our understanding World, and by many other assessments of stand the behavior of Parkinson’s involve of the world around us.
    [Show full text]