
Introducing the AAS Working Group on Astroinformatics and Astrostatistics
Željko Ivezić
The 223rd AAS meeting, Washington, D.C., Jan 7, 2014

WISE, SDSS, Pan-STARRS 1 and 2, Gaia

DES

LSST

Outline
• History, motivation and organization
• The main strategic goals
• WGAA accomplishments and related developments
• Astrostatistics and Astroinformatics Portal
• Poster Announcement

Motivation
• Ever increasing data volume and complexity
  - SDSS is ~30 TB; LSST will be one SDSS per night, or a total of >100 PB of data
• Sophisticated cross-disciplinary analysis
  - with the increasing data complexity, analysis becomes more complex, too (high-D spaces, truncated and censored heteroscedastic measurements, time series, unknown distributions and error behavior, etc.)
• Need for collaborative efforts
  - we are not data starved any more!
  - the bottleneck for new results is in human resources and analysis tools
  - nobody has an unlimited budget; we are a small field and we should collaborate and share!

History, motivation and organization
• In response to two White Papers submitted to the Astro2010 Decadal Survey (Borne et al. 2009, arXiv:0909.3892; Loredo et al. 2009, not on arXiv but pdf linked on ADS), a new AAS Working Group on Astroinformatics and Astrostatistics (WGAA) was approved by the AAS Council at the 220th Meeting, June 2012, in Anchorage.
• The motivation for this WG is the growing importance of the interface between astronomy and various branches of applied mathematics, and the emerging field of data science.
• With the avalanche of new data-intensive projects, the need for advice derived from the focused attention of a group of AAS members who work in these areas is bound to increase.
• This Working Group is charged with spreading awareness of:
  - rapidly advancing computational techniques,
  - sophisticated statistical methods, and
  - highly capable software
  to further the goals of astronomical and astrophysical research.
• We want all AAS members interested in astrostatistics and astroinformatics to benefit directly from this Working Group, but it must be acknowledged that real work can plausibly be expected only from a much smaller group.
• This smaller group, “the steering committee”, includes Kirk Borne, George Djorgovski, Eric Feigelson, Eric Ford, Alyssa Goodman, Joe Hilbe, Željko Ivezić (chair), Ashish Mahabal, Aneta Siemiginowska, Alex Szalay, Rick White, and Padma Yanamandra-Fisher.
• The steering committee archives its discussions using Google Groups and the [email protected] exploder; members serve staggered three-year terms
• A much larger list, [email protected], includes 83 members; any AAS member can join (send email to Željko)
• The main online repository is the Astrostatistics and Astroinformatics Portal (discussed later)

The main strategic goals
• Methods: develop, organize and maintain methodological resources (such as software tools, papers, books, and lectures)
• People: enhance human resources (such as fostering the creation of career paths, establishing a Speakers' Bureau, establishing and maintaining an archived discussion forum, and enabling periodic news distribution)
• Meetings: organize topical meetings

WGAA 1st-year accomplishments and related developments
• This Special Session is the first topical meeting (“to stimulate broad interest”)
• Started archived discussions, by now with close to 100 members
• WGAA contributions to the Astrostatistics and Astroinformatics Portal
• New related organizations
• A lot of new books and tools, and also the first IAU Symposium on Astrostatistics (#306)

New related organizations
• See Feigelson et al. (2013, arXiv:1301.3069)
• The IAU Commission 5 formed a Working Group in Astrostatistics and Astroinformatics.
• The International Statistical Institute (ISI, sister organization to the IAU) formed an Astrostatistics Committee in 2009, and in 2010 an Astrostatistics Network. It is now being reorganized as an independent International Astrostatistics Association.
• LSST Informatics and Statistics Science Collaboration
• (and an ACM interest group under discussion)

New textbooks and tools

• Advances in Machine Learning and Data Mining for Astronomy (Way et al. 2012)
• Astrostatistical Challenges for the New Astronomy (Hilbe 2012)
• Astrostatistics and Data Mining (Sarro et al. 2012)
• Modern Statistical Methods for Astronomy With R Applications (Feigelson and Babu 2012)
• Statistics, Data Mining, and Machine Learning in Astronomy: A Practical Python Guide for the Analysis of Survey Data (Ivezić, Connolly, Gray and VanderPlas 2014)
• AstroPy, astroML, esutil, PyMC, healpy, and many more…

Princeton Series in Modern Observational Astronomy

Statistics, Data Mining, and Machine Learning in Astronomy: A Practical Python Guide for the Analysis of Survey Data
Željko Ivezić, Andrew J. Connolly, Jacob T. VanderPlas & Alexander Gray
Princeton University Press, 2014

A “randomly” chosen example of a new textbook, with supporting non-trivial data sets and open-source Python code
Open source! www.astroML.org

Example: You can make this plot from scratch in <1 hour!
SDSS asteroids: visualization of 4-dimensional correlations

Comparing Knuth’s rule and Scargle’s Bayesian Blocks
• Make this plot by running %run fig_bayes_blocks.py
Note that Knuth’s method does not find the narrow peak in the middle for the smaller dataset!
The Bayesian Blocks method gives you the best step function that describes your data. It is excellent for low-count data and for time-series analysis!
One of >100 examples
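To make the "best step function" idea concrete, here is a minimal pure-Python sketch of the Bayesian Blocks dynamic programme for event data. It is not the astroML code behind fig_bayes_blocks.py: the function name, the penalty value ncp_prior=4.0, and the synthetic event list are all assumptions chosen for illustration, and the sketch assumes at least two distinct event times.

```python
import math


def bayesian_blocks(times, ncp_prior=4.0):
    """Return block edges for event times (illustrative sketch).

    Uses the event-data fitness N * (ln N - ln T) per block, minus a
    constant penalty ncp_prior per block (an arbitrary value here).
    Assumes at least two events and no repeated times.
    """
    t = sorted(times)
    n = len(t)
    # Candidate edges: the data end points plus midpoints between events.
    edges = [t[0]] + [0.5 * (a + b) for a, b in zip(t[:-1], t[1:])] + [t[-1]]
    best = [0.0] * n  # best[k]: optimal total fitness for events 0..k
    last = [0] * n    # last[k]: first event index of the final block
    for k in range(n):
        best_fit, best_start = -math.inf, 0
        for r in range(k + 1):
            count = k - r + 1
            width = edges[k + 1] - edges[r]
            fit = count * (math.log(count) - math.log(width)) - ncp_prior
            if r > 0:
                fit += best[r - 1]
            if fit > best_fit:
                best_fit, best_start = fit, r
        best[k], last[k] = best_fit, best_start
    # Walk back through the stored block starts to recover the edges.
    change_points = [n]
    while change_points[-1] > 0:
        change_points.append(last[change_points[-1] - 1])
    change_points.reverse()
    return [edges[i] for i in change_points]


# Hypothetical data: a flat background plus a narrow, dense peak near t = 5.
background = [0.1 * i + 0.013 for i in range(100)]
peak = [5.0005 + 0.001 * j for j in range(100)]
events = background + peak
edges = bayesian_blocks(events)
print(len(edges) - 1, "blocks with edges", [round(e, 3) for e in edges])
```

Because the block widths adapt to the data, a narrow peak gets its own narrow block instead of being averaged into a fixed-width bin. The double loop is O(N^2), which is fine for a few hundred events but slow for large samples.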

1000 Milky Way Extreme Deconvolution in high-D (XD) Bovy, Hogg & Roweis 2011 (arXiv:0905.2979)

• A mixture of high-D Gaussian components, with heteroscedastic errors (and missing data)

SVM classification of variable stars from Palaversa+ 2013
• Based on 7 attributes (4 SDSS-2MASS colors and 3 LINEAR light curve parameters: period, amplitude, skewness) and 5 input classes (data also in astroML)

Time series analysis: find a low-SNR burst (wavelet analysis)

Signal

Found it!
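The burst search can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions, not the astroML example: the data are synthetic (a Gaussian burst with peak amplitude equal to the noise standard deviation), the Mexican-hat wavelet is one possible choice, and a single wavelet scale matched to the (here known) burst width is used instead of a full multi-scale transform. All names and numbers below are hypothetical.

```python
import math
import random


def mexican_hat(scale):
    """Mexican-hat wavelet sampled on integer lags out to 4 * scale."""
    half = 4 * scale
    return [(1.0 - (j / scale) ** 2) * math.exp(-0.5 * (j / scale) ** 2)
            for j in range(-half, half + 1)]


def wavelet_response(series, scale):
    """Correlate the series with the wavelet; return {index: response}.

    The kernel is scaled to unit L2 norm, so for unit-variance white
    noise the response is in units of the noise standard deviation.
    Only positions where the kernel fits entirely are evaluated.
    """
    kernel = mexican_hat(scale)
    norm = math.sqrt(sum(k * k for k in kernel))
    half = len(kernel) // 2
    response = {}
    for i in range(half, len(series) - half):
        window = series[i - half:i + half + 1]
        response[i] = sum(k * x for k, x in zip(kernel, window)) / norm
    return response


# Synthetic series: unit-variance white noise plus a weak Gaussian burst.
rng = random.Random(42)
n_samples, true_center, burst_width = 3000, 1800, 100
series = [math.exp(-0.5 * ((i - true_center) / burst_width) ** 2)
          + rng.gauss(0.0, 1.0)
          for i in range(n_samples)]

response = wavelet_response(series, scale=burst_width)
found = max(response, key=response.get)
print("strongest response at sample", found, "(burst injected at", true_center, ")")
```

The burst is hard to see sample by sample, but summing over a few hundred samples with a matched wavelet raises it well above the noise. In practice the burst width is unknown, so one would scan a range of scales and take the scale and position with the largest response.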

Open source: www.astroML.org

Astrostatistics and Astroinformatics Portal (ASAIP, https://asaip.psu.edu)
• ASAIP is edited by Eric Feigelson (Penn State University) and Joseph Hilbe (Arizona State University)
• ASAIP is hosted by the Eberly College of Science of the Pennsylvania State University.
• No guarantee is given for the validity or usefulness of information or advice provided on ASAIP.
• ASAIP is a new Web site serving the cross-disciplinary communities of astronomers, statisticians and computer scientists
• Intended to foster research into advanced methodologies for astronomical research, and to promulgate such methods into the broader astronomy community.
• Provides searchable abstracts of recent papers in the field, several discussion forums, various resources for researchers, brief articles by experts, lists of meetings, and access to various web resources such as on-line courses, books, jobs and blogs.

Check it out!

Poster Announcement
• Filtergraph: A fast, intuitive, online data visualization system for large astronomy datasets (Stassun et al.)
• NED in the Era of Very Large Extragalactic Surveys (Fadda et al.)
• Spectroscopic and Photometric Variability in the A0 Supergiant HR 1040 (Corliss)
• Managing the Big Data Avalanche in Astronomy - Data Mining the Classification (Borne)
• AstroML: Python-powered Machine Learning for Astronomy (VanderPlas, Connolly & Ivezić)
• The Astrostatistics and Astroinformatics Portal (Feigelson & Hilbe)
• Adventures in Modern Time Series Analysis: From the Sun to the Crab Nebula and Beyond (Scargle)
• The Virtual Observatory for the Python Programmer (Plante et al.)