Responding to the Event Deluge
Total Page:16
File Type:pdf, Size:1020Kb
Responding to the Event Deluge Roy D. Williams·, Scott D. Barthelmyb, Robert B. Dennyc, Matthew J. Graham', and John Swinbankd 'California Institute of Technology "NASA Goddard Space Flight Center 'DC-3 Dreams SP d University of Amsterdam ABSTRACT We present the VOEventNet infrastructure for large-scale rapid follow-up of astronomical events, including selection, annotation, machine intelligence, and coordination of observations. The VOEvent.standard is central to this vision, with distributed and replicated services rather than centralized facilities. We also describe some of the event brokers, services, and software that .are connected to the network. These technologies will become more important in the coming years, with new event streams from Gaia, LOFAR, LIGO, LSST, and many others. Keywords: VOEvent, GCN, TAN, Skyalert, transients 1. Deluge In the late 199Os, the gamma-ray burst (GRB) community ignited the current excitement over transient astronontical events. Gamma-Ray Bursts (GRBs) were a real enigma until ultra-fast event dissemination allowed optical identification of afterglOWS, leading to rich data and rich ·science. The events back then were both valuable and infrequent: every new ORB could make a career for a young astronomer, and they were only detected every few days. However, in the next few years, surveys carried out by telescopes sucb as Gaia, LOFAR, Pan-STARRS, LSST and SICA will produce a flood of hundreds of events every 24 hours, with the scientific jewels surrounded by dross, and so both scalability and discrimination will be increasingly important. We should also point out that a deluge of event metadata is also coming, as more transient surveys come online, and each of those spawns its own streams offollow-up events and annotations. Follow-up observation will be in short supply in the era of the event deluge. Faint objects can only be observed with the largest telescopes - that are already over-subscribed, and objects with uncertain position require deep wide-field imaging to look for possible counteIparts. But selection of "interesting" events is subject to the same quality measures of any selection process, being the false-positive and false-negative rates. False positives are uninteresting things that waste valuable follow-up resources, and false negatives are exciting objects that were not identified as such. It will be important to bring together all possible information, as quickly as possible, and to build new ways of automated information fusion, in order to ntinimize both false positives and false negatives. There bave been several ways to represent and communicate astronomical transient notices. The Central Bureau of Astronomical Telegrams[l] bas been sending transient notices since 1882, and the Astronomer's Telegram website[2] has been running for several years. However these are naturallaoguage text, requiring a human in the loop for decision and action. Given the event deluge, and the importance of rapid follow-up, human readable text must give way to communications that can be understood and acted on by machines. The original GCN[3] used formally-defmed 160-byte packets, which enabled the blossoming of GRB science as noted above. However, in the last few years the VOEvent [4][5] syntax, from the International Virtual Observatory Alliance[6], bas become the common language for rapidly disseminating machine-readable astronomical transients. Once events are being disseminated, there may be other capabilities that can increase the science harvest from an event stream. Annotation is the process of adding information to a first detection, based on follow-up observation, arcbival search, or intelligence assessment. Consumers of events may wish to select events based on many kinds of criteria, depending on the event itself, or th.e associated annotations, or on the ability of their follow-up telescope, or other. queries. Those building machine-intelligence codes or mining events will want a repository (database) of events and their annotations, to build training sets, test their software, or decide on selection criteria. 1 In this paper, we introduce VOEventNet, a networl< architecture for broadcasting VOEvents, and we discuss the requirements of astronomers using the network, and some prototype nodes of the VOEventNet that can satisfY these needs. 2. Technology Figure I shows the components of the VOEventNet network, which we shall describe below. First we define a Broker to be a service that broadcasts VOEvents from a given domain narne and port number. The events may have been received from another broker, or through a publishing system attacbed to the VTP node (see below). Publication, in the sense of VOEvent, is the assignment of an identifier, and validation of the VOEvent packet. Just as with publishing books, the Publisher interfaces with the Author, which is the entity responsible for the scientific content of the VOEvent; then a closely-coupled broker will actually broadcast the events via VTP. A broker opens a VTP socket that allows subscribers to connect, and the connection remains open. VOEvents are broadcast to all the subscribers that are currently connected. The Registry (see section 5) will be used to advertise what is available, which brokers are broadcasting which event streams, and documentation of what the data in the events means. subscriber VOEvent broker registry Figure 1: The basic VOEventNet infrastructure. The transport infrastructure is used by brokers to brnadcast to subseribers. A global registrywill .a1lows discovery of resources such as publishers, brokers. event-stream metad.ta. 2.1 VOEvent In the past, notices of astronomical transients have been communicated by natural language, but the onset of the deluge meanS that macbines must take mucb of the burden of decision-making and pointing the telescopes. This realization led to the VOEvent[4)[5] standard, which gives authors flexibility in data representation, yet enough structore that subscribers' machines can understand and respond. VOEvent is XML-based, evolved over years, a recommendation of the International Virtual Observatory Alliance[6] (WOA). The results of astronomical observ.ations using real telescopes are expressed with VOEven!, to be published and transmitted, and then be captured and filtered by subscribers. Each event that survives rigorous fIltering can then be passed to other telescopes to acquire real follow-up observations - or passed to computational or archival software for 'virtual follow-ups'. This must happen quickly (often within seconds of the original VOEvent) and·must minimize unnecessary expenditures of either real Or virtual resources. A VOEvent packet provides a general purpose mechanism for representing transient astronomical events. The XML schema[7] is as simple as practical to allow the mininialrepresentation of scientifically meaningful, time critical, events. VOEvent also incorporates other standard VO and astronomical schema, specifically STC(8] for space-time coordinates and UCDs[9] to characterize the semantics of the data. Each event has a unique identifier (or WORN) that is composed of a stream identifier and a 'local' identifier within that stream. The registry is being built so that these identifiers can be resolved: from event identifier to its stream, then froding a repository that has events frnm that stream, then a query to that repository to resolve the event itself. In the XML representation ofVOEvent, there may be at most one of each of the following optional sub-elements: 2 Who: This is the contact infonnation for the organization that is responsible for the content of the message. • What: A structured description of the observed parameters, wbich may Include both key-value pairs and tabular data, together with appropriate metadata. WhereWhen: Space-time coordinates of the event. The data model here is a point in the sky with circular error, and a point in time with an interval of uncertainty. While a wide variety of coordinate systems are pOssible here, most event authors use ICRS and UTC coordinates. • How: Instrument configuration for the observation. • Why: This is for an initial scientific assessment; a classification and probability. • Citations: This element contains the identifier (!vORN) of other VOEvents that are relevant to the same astrophysical event. A sequence of instrumental observations would each cite one of the others in the chain; follow-up reports cite the !VORN of the discovery event. • Description: This is the place for natural language, a label that can be attached to any part of the rest of the event, describing, for example: one of the parameters, the author organization, etc. • Reference: References should be used for the URLs of related websites. Only those elements required to convey the event being described need be present. The intent ofVOEvent is to represent data ..sociated with an astronomical transient. The Who section of VOEvent allows an event author great flexibility in describing the data that is the heart of the event. While other elements specify facets of the event common to all astronomical transients (spacetime, classifications, provenance, etc.), it is the Who section that allows the event author to create a data model: there are parameters ('Parun') that can be organized into Groups, and a Table can be specified with a set of Fields (i.e. table columns). Each such quantity is a number or string, with descriptive metodata, including units, data type, and semantic classification[9]. When the semantic classification is meta. ref. urI, for example, then the subscriber can assume that it is a URL link. VOEvents are split into classes called Streams, generally based on the instrument or other source of the event. Each event in a stream will have similarly structured Params and Tables (in the What section of the VOEvent), with the same meaning, units, data types, etc. In this way, a subscriber can rely on using one or more specific parameters for building selection criteria. It is considered the responsibility of a' Publisher to make sure that each event of • given stream is valid, in terms of the previous metadat.