IVOA Identifiers Version
Total Page:16
File Type:pdf, Size:1020Kb
International Virtual Observatory Alliance IVOA Identifiers Version 2.0 IVOA Recommendation 2016-05-23 Working group Resource Registry This version http://www.ivoa.net/documents/Identifiers/20160523 Latest version http://www.ivoa.net/documents/Identifiers Previous versions REC-1.12 PR-20050302 PR-20040621 WD-20040209.html WD-20031031 WD-20030930 PR-20030830.html Author(s) Markus Demleitner, Raymond Plante, Tony Linde, Roy Williams, Keith Noddle, and the IVOA Registry Working Group Editor(s) Markus Demleitner Version Control Revision 3158, 2015-11-20 09:04:09 +0100 (Fri, 20 Nov 2015) https://volute.g-vo.org/svn/trunk/projects/registry/Identifiers/Identifiers.tex arXiv:1605.07501v1 [astro-ph.IM] 24 May 2016 Abstract An IVOA Identifier is a globally unique name for a resource within the Vir- tual Observatory. This name can be used to retrieve a unique description of the resource from an IVOA-compliant registry or to identify an entity like a dataset or a protocol without dereferencing the identifier. This document describes the syntax for IVOA Identifiers as well as how they are created. The syntax has been defined to encourage global-uniqueness naturally and to maximize the freedom of resource providers to control the character content of an identifier. Status of This Document This document has been reviewed by IVOA Members and other interested parties, and has been endorsed by the IVOA Executive Committee as an IVOA Recommendation. It is a stable document and may be used as reference material or cited as a normative reference from another document. IVOA’s role in making the Recommendation is to draw attention to the specification and to promote its widespread deployment. This enhances the functionality and interoperability inside the Astronomical Community. A list of current IVOA Recommendations and other technical documents can be found at http://www.ivoa.net/Documents/. Contents 1 Introduction3 1.1 Definitions . .4 1.2 Selected Requirements . .6 1.3 Rationale for Version 2 . .6 1.4 IVOA Identifiers within the VO Architecture . .7 2 Specification8 2.1 Overview (non-normative) . .8 2.2 Characters . .9 2.3 Syntax Components . .9 2.3.1 Scheme . .9 2.3.2 Authority . 10 2.3.3 Resource Key . 11 2.3.4 Query . 12 2.3.5 Fragment . 13 2.4 Usage . 13 2.5 Reference Resolution . 14 2.6 Normalization and Comparison . 14 3 Creating Identifiers 16 2 4 Special Identifier Types 16 4.1 Dataset Identifiers . 16 4.2 Standard Identifiers . 17 A Changes from Previous Versions 18 A.1 Changes from PR-2015-07-09 . 19 A.2 Changes from 1.12 . 19 A.3 Changes from v1.10 . 19 A.4 Changes from v1.0 . 20 A.5 Changes from v0.1 . 20 Acknowledgments This document builds on the concept of a Uniform Resource Identifier as de- scribed in RFC 3986 (Berners-Lee et al., 2005) and its predecessors. This document has been developed with support from the National Sci- ence Foundation’s Information Technology Research Program under Coopera- tive Agreement AST0122449 with The Johns Hopkins University, from theUK Particle Physics and Astronomy Research Council (PPARC), from the European Commission’s Sixth Framework Program via the Optical Infrared Coordination Network (OPTICON), and from the German Astrophyiscal Virtual Observatory GAVO, BMBF grant 05A14VHA. Conformance-related definitions The words “MUST”, “SHALL”, “SHOULD”, “MAY”, “RECOMMENDED”, and “OPTIONAL” (in upper or lower case) used in this document are to be inter- preted as described in RFC 2119 (Bradner, 1997). Usage of ABNF This specification uses ABNF (Crocker and Overell, 1997) to specify grammar rules. The rules from RFC 3986 are assumed throughout. Where both this specification and RFC 3986 define a nonterminal, the rule in this specification overrides the corresponding rule from RFC 3986. For explicitness, we write ABNF nonterminals in angle brackets (hlike thisi ) throughout. 1 Introduction Virtual Observatory applications frequently need to unambiguously refer to some resource or concept which is described elsewhere. It is therefore necessary 3 to define global, potentially dereferenceable identifiers. In the VO, these are called IVOA identifiers (IVOIDs). An unambiguous reference within the entire Virtual Observatory requires that the identifier is globally unique. Ensuring this uniqueness inevitably requires oversight by a moderating authority; however, a flexible framework can minimize the opportunity for duplicated identifiers. Many data providers in the VO were creating and using identifiers long be- fore this specification was developed. Their choices of identifiers were made presumably to best fit the needs of the data. In order to minimize the cost of adoption of the IVOA identifier framework the design specified here maxi- mizes the control providers have, thus allowing the reuse of the identifiers data providers already have in place as well as the creation of new idenfiers that are consistent with their overall organization. Identifiers are crucial to the operation of registries that aid users in dis- covering data and services (Benson et al., 2009). In general, a registry stores descriptions of data and services in a searchable form, and it distinguishes them by the unique identifier defined here. It thus serves as a primary key for the VO Registry, and thus allows dereferencing identifiers to metadata about a resource (the resource record). IVOA identifiers with query or fragment parts can furthermore reference essentially arbitrary entities like datasets or protocols, based on this primary mechanism of dereferencing. We recognize that resources do not always remain in the control of a single organization forever. This necessitates a form of referencing that is location- independent – or more precisely, organization-independent. Apart from enabling seamless transfers of data curation, an attractive use case for such identifiers is when several copies of a dataset exist at several locations around the VO and one could refer to all of them collectively, deferring the choice of a particular instance until it is actually needed. Such references thus serve as persistent pointers to data that can be flexibly resolved. This is very important to journal publishers that wish to refer to data in publications (whose useful life might be measured in decades) without worry that the references will become obsolete. This specification, in contrast, defines organization-dependent identifiers. Persistent, organization- and location-independent identifiers are not (directly) defined here. Referencing resources is addressed by the IETF standard for URIs, RFC 3986 (Berners-Lee et al., 2005). Thus, the framework proposed in this document builds directly on this standard. Essentially, this standard sets the parameters left open for application use by RFC 3986. 1.1 Definitions A Uniform Resource Identifier (URI) is defined by RFC 3986 as “a compact sequence of characters that identifies an abstract or physical resource” which complies with the syntax specification of that document (Berners-Lee et al., 4 2005). It can point to an actual retrievable resource, but there is no requirement for it to be dereferenceable at all, let alone by a stock web browser. An IVOA identifier, or IVOID, is a special sort of URI complying with all parts of this specification. Historically, these have also been known as IVOA Resource Names (IVORNs), in parallel to the Uniform Resource Names (URNs) that formulated extra requirements on persistency and location-independence. As plain IVOIDs do not fulfill those requirements and the term URN has been deprecated by RFC 3986, we now deprecate the term IVORN, too. A full IVOID can thus be split into a Registry part (schema, authority, and path) and a possibly empty local part consisting of query and fragment component, again using RFC 3986 nomenclature. An IVOID with an empty local part is also known as a Registry reference. In VO practice, the term resource is somewhat ambiguous. The IVOA Rec- ommendation on Resource Metadata (Hanisch et al., 2007), from here on re- ferred to as RM, defines it as “a VO element that can be described in terms of who curates or maintains it and which can be given a name and a unique identifier.” It then goes on to define the relevant pieces of metadata, which later provided the foundations of the data model behind the IVOA Registry. This might lead to the expectation that there is a 1:1 relationship between Registry records, “VO resources”, and IVOA identifiers, and version 1 of this document essentially implied as much. In this version, we only require Registry references to resolve in the Registry. IVOIDs having a nonempty local part do not dereference to Registry records. Since we want to maintain the notion that a resource is whatever a URI points to, “resource” as used here does not correspond to the usage of the term in VORe- source (Plante et al., 2008). To maintain the distinction, we call resources in the sense of VOResource Registry records. These form a subset of the resources (in the URI sense) that can be referenced by IVOIDs. We refer to organizations and providers in the sense that they are defined in RM: An organization is a specific type of resource that brings people together to pursue participation in VO applications. Organizations can be hierarchical and range greatly in size and scope. At a high level, it could be a university, observatory, or government agency. At a finer level, it could be a specific scientific project, space mission, or individual researcher. A provider is an organization that makes data and/or services available to users over the network. Definitions of other types of resources, including data collection and service, are also provided in RM, and are assumed by this document. 5 1.2 Selected Requirements This proposal is the result of various requirement studies for VO identifiers and registries in general (e.g. NVO ID requirements1). This section highlights a few of the important ones that guided the design of the ID framework.