Tetherless World Linked Data and on the Web #TWCHack11

Deborah L. McGuinness

Tetherless World Constellation Chair and Professor of Computer Science and Cognitive Science

Joanne S. Luciano

Research Associate Professor Tetherless World Constellation and Computer Science

Rensselaer Polytechnic Institute

RPI Tech Park – Pats Barn

June 27, 2011

Welcome!

Welcome to RPI / Tetherless World and the Elsevier / Tetherless World Health and Life Sciences Hackathon.

• Agenda and info at: http://tw.rpi.edu/web/event/TWCElsevierHackathonJune2011
• Twitter: #TWCHack11

RPI Tetherless World Constellation (TWC)

Themes:
• Semantic Foundations
• Knowledge Provenance
• Ontology Environments
• Inference
• Trust
• Linked Data
• Xinformatics
• Semantic eScience
• Data Science
• eHealth
• eEnvironment
• Future Web
• Web Science
• Policy
• Social

http://tw.rpi.edu

Chaired Professors: McGuinness, Fox, Hendler
Research Prof: Luciano

Tetherless World Constellation Linked Open Government Data Portal

http://logd.tw.rpi.edu

Real World Data
• 8.6+ billion triples, 400+ datasets, 11+ sources
• Wide range of domains
• 50+ demos

Semantic Web Technology
• Completely open source
• Demos/tutorials/videos
• Tools to help build demos

Community & Press
• US government partner
• NY Times (and other)
• Sem Web Challenge

Health on the Web

• Next generation health representation and reasoning tools, visualizations
• Leveraging semantics along with sensor information – exercise/walking, diabetic, cardiac, monitoring, aging, …
• Access to personal data
• Using context
• Health policy
• PCAST Health Information Technology report – http://www.whitehouse.gov/sites/default/files/microsites/ostp/pcast-health-it-report.pdf

http://health.tw.rpi.edu
http://logd.tw.rpi.edu/health

PopSciGrid Example State - Hawaii

Extensible Mashups via Linked Data
• Diverse datasets from NIH
• Potentially linking to other content (e.g. “unemployment rate”)

Accountable Mashups via Provenance
• Annotate datasets used in demos
• Feed users’ comments back to the gov contact (e.g. %)
• Annotation capabilities coming (and more)

PopSciGrid in Action

PopSciGrid II

GeoNames (geonames.org)

Community Health Status Indicators Report (communityhealth.hhs.gov)

Census (rdfabout.com/demo/census)

Next Steps

• Joanne Luciano (TWC, WSTNet): "Carpe Diem"
• Dominic DiFranzo (TWC): "Intro to Linked Data Mashups"
• Remko Caprio (Developer Evangelist): "Elsevier Content APIs and programming SciVerse Apps and OpenSocial"
• Elizabeth Brooks (UHI UK): "Health Mashup Use Cases"
• April Oh/Frank Perna (NIH): "How to Get and Use C.L.A.S.S. (Classification of Laws Associated with School Students)"
• John Erickson and Remko Caprio: "Challenge Guidelines"
• Start hacking! (Elsevier and TWC experts will be available for consultation 24 hours)

6 pm: Dinner
9:30 pm: Snacks and desserts

Carpe Diem

The shifts over the past 15 years: Technology and Society meet
– WWW: Internet-wide distributed information retrieval system
» The address system (URI)
» A network protocol (HTTP)
» A markup language (HTML)
» Data + Deep Data
• Aging population
• End of Era of “Blockbuster Drugs” (“one size fits all”)
• Current Health System not sustainable – (increasing costs, health)
• Web Phenomenon
• Patients are the experts – now part of our every day life

Need + technology = one possible approach?

“A perfect storm”

Dr Grant Cumming, MBChB, MD, MRCOG. Consultant Obstetrician and Gynaecologist, NHS

P4 from Leroy Hood

Challenge

• Can the web
– Engage individuals in services provided via the web?
– Personalize Health Applications and data input via the web?
– Engage communities in services and as communities on the web?
– How do we evaluate Health Web Services?
– What are the future directions of Health services using the Web?


Extra

Population Sciences Grid Goals

• Convey complex health-related information to consumer and public health decision makers for community health impact
• Leverage the growing evidence base for communicating health information on the Internet
• Inform the development of future research opportunities effectively utilizing cyberinfrastructure for cancer prevention and control

Computer Science Perspective on PopSciGrid Goals

• How can semantic technologies be used to integrate, present, and analyze data for a wide range of users?
• Can tools allow lay people to build their own demos and support public usage and accurate interpretation?
• How do we facilitate collaboration and “viral” applications?
• Within PopSciGrid:
– Which policies (taxation, smoking bans, etc.) impact health and health care costs?
– What data should be displayed to help scientists and lay people evaluate related questions?
– What data might be presented so that people choose to make (positive) behavior changes?
– What does the data show? Why should someone believe that?
– What are appropriate follow ups?
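One way to picture the first question (using semantic technologies to integrate data) is the small Python sketch below. It is only an illustration under stated assumptions, not the PopSciGrid or CSV2RDF4LOD implementation: the `http://example.org/` namespace, the column names, the helper functions, and every number in the two CSV snippets are hypothetical placeholders, not real statistics. The idea it shows is that once rows from separate tables are turned into triples that share a subject URI per state, combining the datasets reduces to grouping by subject.

```python
import csv
import io

# Hypothetical namespace; not a real PopSciGrid or LOGD URI scheme.
BASE = "http://example.org/"

# Placeholder values for illustration only; these are NOT real statistics.
PREVALENCE_CSV = """state,year,smoking_prevalence
Hawaii,2009,11.1
New York,2009,22.2
"""

POLICY_CSV = """state,year,cigarette_tax
Hawaii,2009,1.11
New York,2009,2.22
"""


def state_uri(name):
    """Mint one shared URI per state so separate datasets can link up."""
    return BASE + "state/" + name.replace(" ", "_")


def csv_to_triples(text, dataset):
    """Turn each non-key CSV cell into a (subject, predicate, object) triple."""
    triples = []
    for row in csv.DictReader(io.StringIO(text)):
        subject = state_uri(row["state"])
        for column, value in row.items():
            if column == "state":
                continue  # the state name is already encoded in the subject
            triples.append((subject, BASE + dataset + "/" + column, value))
    return triples


def mashup(triples):
    """Group triples by subject: the shared URI is what joins the datasets."""
    merged = {}
    for subject, predicate, obj in triples:
        merged.setdefault(subject, {})[predicate] = obj
    return merged


triples = (csv_to_triples(PREVALENCE_CSV, "prevalence")
           + csv_to_triples(POLICY_CSV, "policy"))
merged = mashup(triples)

for subject, properties in merged.items():
    print(subject)
    for predicate, value in sorted(properties.items()):
        print("   ", predicate, "=", value)
```

Adding a third table (for example an unemployment rate per state) would only require another `csv_to_triples` call, provided it uses the same state URIs; that is the sense in which such mashups are extensible.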

PopSciGrid Workflow

[Workflow diagram. Recoverable labels: Publish; Ban coverage; CSV2RDF4LOD; Direct; visualize; derive; archive; Archive; SemDiff; Enhance]

Inference Web: Making Data Transparent and Actionable Using Semantic Technologies

• How and when does it make sense to use smart system results & how do we interact with them?

• (Mobile) Intelligent Agents
• Knowledge Provenance in Virtual Observatories
• NSF Interops: SONET SSIII – Sea Ice
• Intelligence Analyst Tools
• Hypothesis Investigation / Policy Advisors

Tetherless World Constellation Linked Open Government Data Portal

End User Portal Applications
• Fast, low-cost mashups built by CS AND domain experts
• Attracting Press: NYT, …

Diverse, Real World Data
• US, UK, China, …
• Health, energy, economy

Semantic Web in Gov Domain
• Major partner of Data.gov
• 8.6+ billion triples in LOD

Enabling Exploration and Hypothesis Generation

• Should we focus on prevalence, packs sold, other?
• What is prevalence (definition)? And how is it measured (overall / in this data set)?
• What are the conditions under which the data was obtained (date, sample set, extenuating conditions, …)?
• What data is in these demos? What other data could/should we include? (e.g., http://tw.rpi.edu/web/project/PopSciGrid/datasets-and-key-publications-related-to-smoking )

Tetherless Faceted Browsing