SNS Data Analysis Systems Requirements SNS IS-107020000-TD0001-R00 Draft September 28, 2004
SNS DATA ANALYSIS SYSTEMS
FUNCTIONAL REQUIREMENTS AND DESIRED CAPABILITIES
INSTRUMENT SYSTEMS
Draft Date September 2004
Spallation Neutron Source
SNS Data Analysis Systems
Requirements and Desired Capabilities
Instrument Systems
Instrument Systems Senior Team Leader R. K. Crawford
Experimental Facilities Division Director I. S. Anderson
TABLE OF CONTENTS

DICTIONARY AND ACRONYMS
1.0 OVERVIEW AND SYSTEM DESCRIPTION
1.1 Overview
1.2 Relevant Concepts and Equations
1.3 Schedule and Priorities
2.0 GENERAL SNS DATA ANALYSIS REQUIREMENTS
2.1 Extensibility and Upgradeability
2.2 Common "Look and Feel" for All Instruments
2.3 Web-Based Access to Services
2.4 Appropriate Security Protections
2.5 SNS User Access and Data Properties
3.0 INTERFACES
3.1 Facility-Level Interfaces
3.2 Standard Use Cases
3.2.1 Desktop
3.2.2 Using Application Tools Remotely
3.2.3 On-Site Software Use Case
4.0 CLIENT-SIDE USER INTERFACES
4.1 Common Look and Feel
4.2 Command Line Interface
4.3 Batch Processing
4.4 Collaboration Portal
5.0 SECURITY INTERFACE
6.0 ANALYSIS INTERFACE
6.1 Analysis Portal
6.1.1 Data Reduction
6.1.1.1 Data Quality
6.1.1.2 Normalization
6.1.1.3 Background Subtraction
6.1.1.4 Transformation of Variables
6.1.1.5 Corrections
6.1.1.6 Statistics
6.1.1.7 Certified Software and Pedigree of Data
6.1.1.8 Associated Metadata
6.1.1.9 Event Data Mode
6.1.1.10 Other Preliminary Data Treatment
6.1.2 Simulation
6.1.2.1 Instrument Simulation
6.1.2.2 Scattering Kernels
6.2 Software Repository
7.0 DATA INTERFACE
7.1 Data Portal
7.1.1 Data Access
7.2 Data Storage and Access
7.2.1 Archival Requirements
7.2.2 Fresh Data Requirements
7.2.3 Ephemeral Data Requirements
7.2.4 Audit Trail
7.2.5 Storage, Distributed Data Access and Lineage
8.0 COMPUTER INTERFACE
8.1 Grid Access
8.2 Cluster Computing and Distributed Processing
9.1 Visual Programming Interface
9.2 Visualization Steering Interface
9.2.2 General Visualization
9.2.3 Remote Visualization Support
10.0 EXPERIMENT CONTROL AND MONITORING
10.1 Control Portal
10.1.1 Data Acquisition
11.0 COMPREHENSIVE FACILITY OPERATIONS
11.1 User Web Services
12.0 SOFTWARE DEVELOPMENT
12.1 Infrastructure
12.2 Documentation
12.3 Testing
12.4 Feedback Mechanisms
DICTIONARY AND ACRONYMS
Abbreviations, acronyms, and symbols:
· DAS: Data Acquisition System
· GUI: Graphical User Interface
· HTTP: Hypertext Transfer Protocol
· NAS: Network Attached Storage
· NAT: Network Address Translation
· NeSSI: Neutron Science Software Initiative
· QC: Quality Control
· SNS: Spallation Neutron Source
· SQA: Software Quality Assurance
· SSL: Secure Sockets Layer
· TBD: To Be Determined
· TOF: Time-Of-Flight
· URL: Uniform Resource Locator
· XML: Extensible Markup Language
Definitions and conventions
Satellite Computer: A computer that responds to control messages from a master computer (the control computer). The satellite computer communicates with controllers to effect their actions.

Control Computer: The computer responsible for the coordination of an experiment. It sends control messages to the various satellite computers that are directly responsible for control of hardware.

Firewall Computer: The satellite computer responsible for securing the data acquisition private network. It provides certain web-based applications that can be used to monitor and control neutron scattering experiments.

User Computer: The satellite computer responsible for interfacing with specialized user equipment, or a user-supplied computer that controls such equipment.

NeXus file: A binary file stored in the HDF4 or HDF5 format. In this document, a NeXus file refers to one stored in HDF5.

SNS Environment: The interface and architecture for SNS data analysis software.

Module: A piece of code that performs a specific function in the SNS Environment, and which may be independent of other modules or may depend on one or more other modules. If the SNS Environment can be downloaded and installed on a user's home computer, each module does not need to be independent.
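For illustration, the short sketch below reads the group hierarchy of a NeXus file using the standard h5py HDF5 bindings; the file name is a hypothetical placeholder.

```python
import h5py  # NeXus files are HDF5 containers, so standard HDF5 bindings can read them

# "run0001.nxs" is a hypothetical file name used for illustration.
with h5py.File("run0001.nxs", "r") as nxfile:
    # Walk the group/dataset hierarchy and print the path of each item.
    nxfile.visititems(lambda path, item: print(path, item))
```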
SNS DATA ANALYSIS SYSTEMS REQUIREMENTS AND DESIRED CAPABILITIES
1.0 OVERVIEW AND SYSTEM DESCRIPTION

1.1 Overview
The different SNS instruments are designed and optimized to answer very different types of scientific questions, but conceptually all are very similar when it comes to defining their data acquisition and analysis needs. All neutron scattering instruments must determine the energies or wavelengths of the neutrons being measured, and sometimes must also measure the changes in neutron wavelength or energy during the scattering process. At SNS, most instruments use time-of-flight (TOF) measurements to determine the energies of the neutrons. Figure 1.1 provides a schematic representation of a typical neutron scattering instrument at a pulsed neutron source. In other cases energies can be determined by Bragg diffraction from crystal monochromators, and the crystal angle is changed to select different energies.

The basic data produced are events corresponding to neutrons scattered by the sample and detected in particular detector elements at particular energy or wavelength values. At a minimum, each such event requires a position identifier and an energy or wavelength value to describe it, but other identifiers may be attached to describe particular experiment conditions. These data must be stored as they are produced, either in their raw digital form or after some type of on-the-fly data compression (e.g., collection into a detector position-TOF histogram or detector position-monochromator angle histogram). In most cases a number of correction factors will be applied to these events, and the corrected events will eventually be used to build up a representation of the differential scattering cross-section of the sample or the related static or dynamic structure factors [S(Q) or S(Q,ω)] as functions of the wave-vector transfer Q and the energy transfer ħω. This differential scattering cross-section or structure factor will then be related to theory or models.

The data acquisition and analysis systems for each instrument are the primary user interfaces to that instrument and serve as the user's window into that instrument and the data it produces. Data flows through these systems, from acquisition through processing to the publishable results. The users participating in such an experiment will have the use of the instrument assigned to them for a period ranging from a few hours to a few days, depending on the experiment proposed. In addition, these users may require some facility-supplied experiment-simulation software to assist them in planning the experiment prior to their use of the instrument, and will need access to the data and to reduction and analysis software in some form subsequent to the experiment.

During the experiment the data acquisition system for that instrument must be able to accept each of the events from the detector system for that instrument, collect these events in appropriately organized data sets, and then handle the storage, manipulation, visualization, and transport of the resulting data sets. It must also provide all the control functions necessary to establish the conditions for data collection, as well as to establish and monitor the sample environment and instrument conditions during the data collection period. Because each instrument can have a detector system that is unique to that instrument, the data acquisition systems must be capable of accepting the detected events in a variety of different analog or digital formats according to how the interface is defined with that particular detector system.
Similarly, the data acquisition systems must be capable of controlling and monitoring a wide variety of sample environment and instrument operating parameters. The data analysis systems (software and related infrastructure) are responsible for the conversion of the raw data to S(Q) or S(Q,ω). These systems are also responsible for maintaining the raw and reduced data in a form that can be readily retrieved and utilized by the users. These systems, in addition, will need to supply some of the higher-order operations such as simulations, visualization, or model fitting necessary to extract the ultimately desired information from the data. These needs are described in somewhat greater detail in the sections that follow.
Figure 1.1 Schematic representation of a typical pulsed neutron source neutron scattering instrument. Timing starts with a pulse of neutrons from the moderator. Neutron choppers are controlled in phase with this pulse to select a range of neutron speeds or wavelengths. Neutrons scattered by the sample are detected in individual spatial pixels of the detector array at particular times relative to the moderator pulse, producing a set of space-time events which are stored as the data from the experiment. Sample conditions are controlled and can be varied to modify the scattering process.
1.2 Relevant concepts and equations
Incident neutrons are described by a wavevector $\mathbf{k}_0$ and scattered neutrons by a wavevector $\mathbf{k}_1$, as shown in Fig. 1.2. Each wavevector has a magnitude given by $k = 2\pi/\lambda$, where $\lambda = h/(mv)$ is the neutron de Broglie wavelength, $h$ is Planck's constant, $m$ the neutron mass, and $v$ the neutron speed. The direction of $\mathbf{k}$ is the direction the neutron is traveling. The neutron energy $E$ and neutron wavelength $\lambda$ are related by $E = h^2/(2m\lambda^2)$.
Figure 1.2 Neutron scattering process (schematic: incident neutrons of wavevector $\mathbf{k}_0$ strike the sample; scattered neutrons of wavevector $\mathbf{k}_1$ reach the detector). Typically about 10% of the neutron beam is scattered and the rest proceeds on through the sample. A particular scattering event is described by the wavevectors $\mathbf{k}_0$ and $\mathbf{k}_1$ and the angle between them.
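As a worked illustration of these relations, the sketch below converts a measured time of flight over a known flight path to wavelength, wavevector magnitude, and energy via $\lambda = ht/(mL)$, $k = 2\pi/\lambda$, and $E = h^2/(2m\lambda^2)$. The flight-path and timing values are illustrative only.

```python
import math

H = 6.62607e-34    # Planck's constant (J s)
M_N = 1.67493e-27  # neutron mass (kg)

def wavelength_from_tof(t, L):
    """de Broglie wavelength (m) from flight time t (s) over path length L (m)."""
    return H * t / (M_N * L)

def wavevector(lam):
    """Wavevector magnitude k = 2*pi/lambda (1/m)."""
    return 2.0 * math.pi / lam

def energy(lam):
    """Neutron energy E = h^2 / (2 m lambda^2), in joules."""
    return H**2 / (2.0 * M_N * lam**2)

# Example: a neutron taking 4.55 ms over a 10 m flight path has lambda ~ 1.8 Angstrom
lam = wavelength_from_tof(4.55e-3, 10.0)
print(lam * 1e10, "Angstrom;", energy(lam) / 1.602e-19 * 1e3, "meV")  # ~1.80 A, ~25 meV
```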
Scattering processes are either elastic or inelastic.
Elastic scattering
The neutron energy or wavelength does not change; only its direction of motion changes, so $|\mathbf{k}_1| = |\mathbf{k}_0|$. The variable of interest is the scattering vector or wavevector transfer $\mathbf{Q} = \mathbf{k}_1 - \mathbf{k}_0$. The quantity of interest is the static structure factor $S(Q)$, which is related to the properly normalized and corrected measured intensity $I(Q)$ by

$$I(Q) = b_{\mathrm{coh}}^2\,N\,S(Q)$$

Here $N$ is the number of scattering centers contributing to the scattering and $b_{\mathrm{coh}}$ is the coherent scattering length of these scattering centers.
Inelastic scattering
Both the direction of motion and the energy of the neutron change, so the neutron energy must be determined both before and after the scattering event. The variables of interest are the wavevector transfer $\mathbf{Q} = \mathbf{k}_1 - \mathbf{k}_0$ and the energy transfer $\hbar\omega = E_1 - E_0$. The quantity of interest is the dynamic structure factor $S(Q,\omega)$, which is related to the properly normalized and corrected measured intensity $I(Q,\omega)$ by

$$I(Q,\omega) = b_{\mathrm{coh}}^2\,\frac{k_1}{k_0}\,N\,S(Q,\omega)$$
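A minimal sketch of how these two relations could be inverted in software to recover the structure factors from properly normalized and corrected intensities; the function names are illustrative, not part of any defined SNS interface.

```python
def static_structure_factor(I_Q, b_coh, N):
    """Invert the elastic relation I(Q) = b_coh^2 * N * S(Q)."""
    return I_Q / (b_coh**2 * N)

def dynamic_structure_factor(I_Qw, b_coh, N, k1, k0):
    """Invert the inelastic relation I(Q,w) = b_coh^2 * (k1/k0) * N * S(Q,w)."""
    return I_Qw * k0 / (b_coh**2 * N * k1)
```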
1.3 Schedule and priorities
The SNS instruments will start accepting neutrons and collecting data during commissioning in April 2006. Some analysis and visualization software must be available at that time to support the commissioning activities. By early 2007 it is expected that "friendly users" will begin carrying out experiments on some of the instruments. A much larger fraction of the analysis, visualization, and data-handling software infrastructure must be in place by that time. This infrastructure must be a well-thought-out design capable of accommodating, or expanding to accommodate, all of the requirements outlined in this document.

By mid-2007 to early 2008 it is anticipated that some of the SNS instruments will be made available to the general user community through the proposal system. By that time, SNS should have primary software functionality in place; subsequent software development can focus on providing more advanced functionality and capability. Primary functionality consists of software to handle the data from the first instruments. This software must perform the needed data treatments and carry out analysis operations. The data must be viewable, and the user must be allowed some level of interaction with the images. This software must allow the instrument scientists and initial users to work with data to debug and certify instruments, and it must be capable of handling the data from the first experiments performed on these instruments.

In addition to handling initial SNS data, software will be required to provide a means for managing data and for production and use of new and legacy software. The SNS facility must maintain appropriate data context (also known as metadata), relationships among files within an experiment, and proper storage and archival for all data produced. When the NeXus standards are appropriately mature, SNS shall utilize the NeXus data format. Should NeXus fail to mature prior to SNS becoming operational, SNS must produce its own data format. Note that context information (metadata) not contained within a NeXus file must also be stored.

Advanced functionality is targeted to support users as SNS instruments begin to mature, when having only a desktop application may not satisfy growing needs. In addition to basic treatment, analysis, and visualization, functionality needs to exist to support:
· Data management – data schemas, databases, and data archive
· Remote user access via portals
· Easy access to high performance computing
· High speed data transfer
· Support for both legacy and new analysis software
· Incorporation of simulation software
· Real time feedback and possibly control during acquisition

The purpose of this document is to provide a comprehensive overview of the software required for full SNS operation. Where possible, SNS software development should take advantage of operating instruments at other facilities to validate software prior to instrument availability at SNS. Any new software developed that provides advances over the current state of the art should find users willing to test it with data from those facilities, perhaps sooner than would be possible at SNS. In addition, SNS should look for opportunities to collaborate and provide software to sister facilities, thus improving return on investment by mutually leveraging work products.

2.0 GENERAL SNS DATA ANALYSIS REQUIREMENTS
When operational, SNS expects to support a large user base comprising more than 1,000 researchers annually.
Anticipating that a large number of these users will be experts in their fields, but not necessarily in neutron science or computer science, the software will need to be easy for these researchers to use. At the same time, the software must facilitate scientific research by enabling researchers to pursue their investigations. The software must handle issues such as networking, data management, and computation transparently, so that researchers are freed from computer-related issues and can focus on their scientific research.
However, the software architecture must also accommodate researchers who find it necessary to extend the software in order to advance their scientific research. This requirement must be balanced against the need to produce formally certified, easy-to-use software. The software framework must allow users to produce their own software to work with data. Users who wish to develop software shall be provided a separate workspace area maintained within the SNS software framework. User-developed software must undergo the same software quality procedures as any SNS-validated software (documentation, review, testing, etc.) before becoming available to all SNS users. To ensure data integrity and access control, all infrastructure development must be authorized and developed in close cooperation with the SNS software development team. Though users are generally not anticipated to play a significant role in developing infrastructure-related software, such as that needed for data management or for portals, such cooperation is necessary.
Portal software has recently evolved within the computer science community as the method for structuring access to a network-based computing environment to support related user needs. Portals are a key element of enterprise-class software built to support remote users accessing facility resources. SNS in particular needs enterprise-class software that can both handle a large number of users and process large volumes of data. For instance, portals will need to allow users to search for and visualize data and perform compute-intensive operations. Behind the scenes, software will need to access data and perform these computations efficiently. An architecture is required that supports these use cases.
Further examination of the use cases reveals a fundamental structure intrinsic to performing neutron science research via software. The architectural elements can be listed as:
· Acquisition – data creation
· Treatment and analysis
· Simulation
· Visualization
· Data management
· Resources – high speed networking, high performance computing, data storage
· Software repository
These elements are implemented by software components, interconnected via data paths. Collectively these components constitute a network of services to be accessed directly or indirectly by users. The relationships among services and associated data stores are illustrated in Figure 2.1. Note that the service framework is separate from the science software that resides within it. The science software shall be produced in library form not dependent upon any particular service framework.
An additional requirement for SNS is to determine how to integrate existing and future software contributed by others without compromising usability and reliability. As a rule of thumb, the more SNS depends upon a function, the more integrally SNS should be involved in its development. For example, SNS must produce experimental data and has therefore undertaken the production of the acquisition software. SNS must also produce treatment software corresponding to the instrument geometries, but some post-treatment analysis software may be available elsewhere in the user community. At the other end of the spectrum, SNS will initially not be heavily involved in producing simulation software and looks to the user community to provide it. In the area of data management, SNS may need to look to experts for how to construct its data management system, but it must stay involved and manage its development via rigorous system requirements specifications and thorough verification and validation. Note that Figure 2.1, in addition to illustrating the major architectural components, also indicates general development responsibilities within them.

Figure 2.1 – Software architecture illustrating components and their interconnections. Notice the heavy dependency upon data access and storage.
2.1 Extensibility and upgradeability
The SNS data analysis system must be extensible to handle widely differing data needs for different instruments and to meet future requirements. The system should also be piecewise upgradeable, so that improved performance capabilities can be incorporated on specific instruments at a later date without affecting other parts of the data analysis system. Performance improvements include not only improved algorithms and methodology but also changes, such as parallelization, to exploit advanced computational resources.

2.2 Common "look and feel" for all instruments
Where applicable, the user interface shall provide the same "look and feel" to the users on each of the instruments. The user interface shall, however, be easily customized to meet the specific objectives of individual instruments and experiments. It is desirable to use the same interface, or at least the same look and feel, for both data acquisition and data analysis operations. To support a common look and feel, SNS shall produce a software development Style Guide for developers to reference.
2.3 Web-based access to services
There should be a web-based user interface which supports accessing data (upload and download) and performing analysis and visualization. The web interface shall be designed and implemented to provide responsive interaction. In those cases where the user finds the conventional browser-based web interface too slow, facilities must be provided to optimize use of SNS and the user's local computer resources. This optimization implies standard, well-defined, published, yet secure interfaces to all SNS software services, as well as the ability of users to download software and data to run on their local computers. The web and service-oriented interfaces shall allow user access to SNS software either on-site or off-site.
2.4 Appropriate security protections
The SNS systems must provide appropriate security features to protect the systems from external interference and to control access to data and resources.
2.5 SNS user access and data properties
The SNS facility will have the following data and user properties:
· The user base of the facility will be geographically dispersed throughout the world.
· Several hundred terabytes of data will have to be archived and made accessible from the SNS over its useful lifetime. Subsets of this data may be mirrored for fault tolerance or performance reasons.
· Most datasets will have to be maintained or archived for the entire lifespan of the facility.
· A rich set of metadata will be kept to aid in locating and doing further analysis on datasets.
· The data archiving infrastructure must be able to support a wide range of data policies and sharing dynamics, including an initial period of restricted access and later open access to the data and metadata.
· A software repository will be maintained by the SNS allowing users to download analysis software and run it on compatible local systems.
· Real-time analysis feedback while an experiment is running will require on-demand high performance computing.
· Instrument simulations and material modeling will require high performance computers with both interactive and batch job execution.
· SNS will facilitate access to high performance computing (HPC) capabilities for its users, subject to availability and access criteria established by HPC providers.
3.0 INTERFACES
As shown in Figure 2.1, the facility-level SNS software architecture can be viewed in terms of the following elements:
· Acquisition
· Data management
· Data treatment, reduction, and analysis
· Visualization
· Simulation
· High performance computing access
· Web portals
3.1 Facility-level Interfaces
Each architectural element in Figure 2.1 contains boxes representing specific hardware, software, and data components that must ultimately be implemented to satisfy SNS operational and user requirements. To support rigorous specification of software and data components, the facility-level operational and user requirements must be defined in terms of the interfaces associated with and among the seven architectural elements listed above. Since these interfaces are the access points for all system services, they are the most appropriate starting point for defining consistent and complete component requirements. The interfaces can be categorized as follows:
· User – may be via Web Browser, Desktop Client, or other User Application
· Security – provides Access and Authorization Control for SNS resources
· Function – includes a Data Portal for direct access to data stores, an Analysis Portal for access to all analysis software, and a Control Portal for real-time monitoring and control of acquisition
· Data – provides Data Management services via databases and data middleware for direct user access and analysis programs
· Computation – provides a Computer Interface for high performance computing resources
· Visualization – provides data visualization via a Data Portal or Analysis Portal
· Development – allows production and ad hoc software development via a Software Repository portal

These interfaces and their relationships are illustrated in Figure 3.1. The underlying services are assumed to be sufficiently modular to allow independent development of each interface. This requirement in turn promotes scalability. For instance, access control could initially be site-specific but eventually become a single sign-on across multiple facilities. Another illustration would be to have the computer interface initially limited to on-site computers but eventually linked to ORNL computing and grid computing resources. Utilizing TeraGrid capability at some point in the future is another illustration of scalability.

The interfaces shall also provide a means for utilizing legacy code and commercial software. Over time, the user community and other neutron science facilities have developed a significant code base in support of neutron scattering science. The SNS data analysis environment should maintain and/or provide users access to certified copies of these applications via a portal interface, though such software may not be included in the desktop software.
3.2 Standard Use Cases
The architecture and interfaces outlined above shall support many different use cases, thus providing significant flexibility to the user community. These uses may include, for example, working independently on a user computer, working remotely via an access portal, or working on-site while performing experiments. Sections 3.2.1-3.2.3 describe typical use cases as a starting point for comprehensive system requirements definition.
3.2.1 Desktop
Desktop software shall enable a user to independently view and process data. The user should be able to download this software from the SNS software repository via a web or portal interface. This software may be a subset of SNS-hosted software and may have limited portability or restricted licensing outside the SNS environment. The desktop software will also have to integrate with NeXus files, and possibly with associated file headers that group a series of NeXus files together. A user may wish to retrieve data from an SNS data repository but process the data locally on their own computer.

Figure 3.1 Interface relationships illustrating resource access and data pathways.
3.2.2 Using application tools remotely
While copies of the datasets and analysis software are at the SNS facility, users may be located anywhere in the world. In some cases the user may wish to run the analysis on his local machines with the data being remote. Other times he may wish to use remote computational resources and data. Or he may have local data (from simulation or another facility) that he wishes to analyze with tools available at SNS. All these scenarios point to the requirement that the SNS data infrastructure must accommodate data, tools, and users potentially being remote from each other in various combinations. This requires:
· A software repository from which users can download and run individual software modules. Note that this requires that the portability of software components be well defined and documented.
· A data archive that allows users to download datasets to local computers.
· A secure means to upload local data to the SNS analysis environment (see the sketch following this list).
· A secure means to upload data and metadata to the SNS archive to complement or augment existing datasets stored there.
· The ability to run tools remotely through web portals. This implies web service access to these tools. A web service is the ability to access and launch operations through the World Wide Web using standard interfaces (such as WSDL) and access protocols (such as SOAP). Building web services capability into the tools will allow them to be invoked easily through portals from a variety of client types, including web browsers and task-oriented client applications.
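A minimal sketch of the secure-upload use case over HTTPS; the URL, proposal identifier, and token scheme are hypothetical placeholders, not an actual SNS interface, and the eventual service may well use SOAP rather than plain HTTP POST.

```python
import requests  # generic HTTPS client; the endpoint and auth scheme below are hypothetical

# Upload a locally produced dataset to a (hypothetical) SNS upload service over TLS/SSL.
with open("local_simulation.nxs", "rb") as f:
    resp = requests.post(
        "https://data.sns.example/upload",            # placeholder URL, not a real endpoint
        files={"dataset": f},
        data={"proposal": "PROPOSAL-0000", "notes": "simulated S(Q,w) for comparison"},
        headers={"Authorization": "Bearer <token>"},  # stand-in for whatever auth SNS adopts
        timeout=300,
    )
resp.raise_for_status()  # surface upload failures to the user
```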
3.2.3 On-Site Software Use Case
The same interface should be used for both remote and on-site access. The essential difference is that on-site network topology and bandwidth can be controlled within the limits of available hardware. It should be a primary network design requirement that topology and bandwidth be optimized to maximize throughput and minimize response time as perceived by the user.
4.0 CLIENT-SIDE USER INTERFACES
To provide a common interface to SNS data analysis services for both local and remote users, a client-server architecture based on web services is required. As mentioned above and shown in Figure 3.1, client-side user interfaces may be accessed via a web browser, implemented as a task-specific desktop client, or embedded in a user-supplied application. This architecture implies three categories of support software to be provided to users by SNS:
· Server-side proxies, or "wrappers", to supply content to web browser interfaces for each SNS service
· Dedicated client-side applications designed to cooperate with one or more specific SNS services or components
· A client-side library for accessing SNS services from user-developed applications
4.1 Common look and feel
The following are guidelines for browser-based and dedicated client-side interfaces. The library provided for user-developed applications should support and encourage these objectives to the maximum extent possible.
· All instruments will have a similar interface in terms of accessing the available experiments, tool lookup, etc.
· All software services (components, tools, etc.) will have a similar interface for accessing, configuring, and invoking them.
· Standard mechanisms are required for accessing and configuring both local and remote services.
· Context-sensitive menus: standard and uniform context-sensitive menus that aid in easily accessing and invoking a tool's features, such as "help", "input and output", and other tool-specific functionality.
· Control panels: a control panel which can be invoked in a standard way for any software tool, comprising means to set and configure the tool (its functionality) for a specific run.
· Widgets: a standard set of widely used GUI objects that fit into the control panel. These objects are combo boxes, text boxes, sliders, dials, etc., with which users can change or select parameters.
· Style Guide: a style guide document shall be produced containing guidelines for producing software with a common look and feel. This guide will be complete with illustrations and examples for developers to reference. The guide will contain information on tool layout, usage flow patterns, window management policies, widget and control feature formats, color schemes, and data visualization methods and tools. It will describe methods for monitoring tool usage for QC and will identify any facility emblems, copyrights, certification status, and licensing information a user must see.
4.2 Command line interface
As a special case of the user-developed client interface, there should be a user-friendly command-line scripting language (similar to a UNIX shell) with the ability to:
· Invoke software tools or launch user application scripts with run-specific parameters.
· Invoke help regarding a tool.
· Navigate back and forth through the command history.
· Query metadata to locate datasets of interest.
There should be scripting capabilities to compose applications for accessing SNS services, wiring them together, etc. Users should be able to compose ad hoc scripts at the command line or write them in specialized editors so they can be saved for future use. Command-line invocation could be handled via operating system shell scripts or via a scripting language such as Groovy, Python, IDL, or Matlab.
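A sketch of the kind of session such a scripting capability should allow; every module and function name here (sns.shell, load_run, rebin, etc.) is hypothetical, illustrating the intended flow rather than a defined API.

```python
# Hypothetical module and function names, sketching a typical scripted session.
from sns.shell import load_run, rebin, plot, save

run = load_run(7231)               # locate a dataset via the metadata catalog
run = rebin(run, dE=0.1)           # rebin to 0.1 meV energy-transfer bins
plot(run)                          # quick-look visualization
save(run, "run7231_rebinned.nxs")  # write an intermediate NeXus file for later steps
```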
4.3 Batch processing
Batch processing shall be supported via either the command-line script interface or any of the three client-side interface mechanisms described above. The command-line interface will use the same syntax and language as described in the previous section. When using GUI-oriented clients the user will be able to (a minimal batch-handling sketch follows this list):
· Graphically select multiple files via a file-browser-type interface.
· Select standard processing operations to be performed on the batch of measurements, such as data reduction, rebinning, normalization, or arithmetic (+, -, *, /), with the same operation performed on all files.
· Select a group of files that all have the same script or function applied. Users can create scripts or use SNS-supplied scripts.
· Receive feedback when they have not properly set up a batch job (for example, selected the wrong number or type of files) or the operation to be performed on the specified files.
· Have processing errors handled gracefully, so that program execution stops without crashing the main application and provides user feedback.
· Check a chosen script for syntactic errors before execution, informing the user of the errors encountered.
· Where possible, run batch jobs in parallel when utilizing SNS computational resources.
· In the case of parametric experiments, append processed output from a number of measurements into one file, maintaining consistent parameter relationships (for example, increasing or decreasing parameter values such as temperature, pressure, or magnetic field).
· Extract parameter information from a collection of measurements for use in further processing.
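A minimal sketch of graceful batch handling, in which one bad file produces user feedback without aborting the rest of the batch; operation() stands in for any SNS-supplied processing step.

```python
import glob

def run_batch(pattern, operation):
    """Apply the same operation to every file matching pattern; report failures without aborting."""
    failures = []
    for path in sorted(glob.glob(pattern)):
        try:
            operation(path)
        except Exception as err:   # a bad file stops that job only, not the whole batch
            failures.append((path, str(err)))
    for path, msg in failures:     # feedback to the user, as the requirements call for
        print(f"FAILED: {path}: {msg}")
    return failures

run_batch("runs/run72*.nxs", operation=print)  # trivial stand-in operation
```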
4.4 Collaboration Portal
Remote collaboration represents a new way of doing neutron scattering experiments. There are many advantages that remote collaboration will provide to users, to SNS, and to the scientific community at large.
· It allows complementary expertise to be used to plan, execute, and analyze experiments. With the internet, remote collaboration has world-wide reach, to both theorists and experimenters. It will enable better experiment planning and permit off-site users (e.g., professors or students) to participate during experiments. The goal is that synergies generated through remote collaboration will ultimately reduce the time to publication.
· Remote collaboration is also an educational tool. Live or archived distance classes and demonstrations will be offered, allowing new users to consult others who have worked on similar problems and to share know-how. This will broaden the user base of SNS as well as its impact.

The requirements for remote collaboration can be met to a large extent with existing tools. The Collaborative, User Migration, User Library for Visualization and Steering (CUMULVS), for example, provides run-time visualization for multiple, geographically distributed users of large-scale computation projects. The Extensible Computational Chemistry Environment (ECCE) is another example. None of the existing tools is native to the neutron scattering data analysis environment, and therefore some development effort will be required to adapt these tools for neutron scattering users. There are also acute security issues that will need to be fully and carefully addressed. Some of the desirable features for remote collaboration include:
· Desktop availability (perhaps with enhanced capability in specific locations, e.g., special labs at universities)
· Webcam and voice
· On-line data manipulation and sharing with remote viewing and control
· Replicated data rooms (for teaching at off-site locations)
· Electronic notebook to record interaction sessions
· Training sessions provided by the facility in the use of the technology
5.0 Security Interface
The security interface is required to authenticate users and control access to data and resources. The security interface must identify and authorize individual users and validate their permissions to access or modify data and resources they request. As illustrated in Figure 3.1, the security interface is layered above the functional interface and enables control of the portals – which in turn control access to all SNS services. Required security features include:
· Both authentication and authorization of remote users to modify an experiment.
· Initial support for facility sign-on, with eventual multi-facility single sign-on.
· A user manager tool that gives an administrator easy access to configuring user privileges and privilege duration.
· A user monitoring tool that allows an administrator to quickly survey login history and current login status, as well as disk usage and running processes. The administrator will also be able to reassign the resources used.
6.0 Analysis Interface
The Analysis Portal provides the server-side access path from the client-side user interfaces described above to analysis software for all SNS instruments.

6.1 Analysis Portal
The SNS analysis portal should support two kinds of analysis/treatment functionality:
· Applications developed at SNS, primarily to support SNS instruments, and
· Legacy and commercial packages that scientists currently use.
The analysis portal is required to have the following functionality:
o Interfaces to all of the analysis operations/tools identified below.
o Common execution environments so that both tools and legacy packages can be executed.
o A well-defined input/output interface so that analysis tools can communicate with each other.
o The ability to construct sequential operation of analysis tools, passing the outputs of one analysis into another.
o The capability to upload users' local tools and make them accessible via the analysis portal. These tools may be used privately by the user or, upon appropriate SNS review, also made generally available as public unsupported tools along with any certification information supplied by the contributor.
The analysis portal provides an interface for working with software which supports specific instruments being developed at SNS. Generally all the SNS instruments can be categorized as one of the following:
o Spectrometers
o Reflectometers
o Powder and Engineering Diffractometers
o Small Angle Diffractometers
o Single Crystal Diffractometers
It is a primary requirement to support each of these instrument types; however, the specific requirements for the complete set of analysis programs to be developed to support SNS instruments are left to subsequent documents defining each analysis application individually. What follows are the general requirements for supporting the instruments by defining common components, methods, and superstructure to be provided via the analysis portal. It is expected that instrument-specific documentation and software will follow the guidelines established within this document.
6.1.1 Data Reduction
Throughout this section the word "file" is used in a generic sense to denote a data set or other collection of related information that may be in an actual file or in another form of storage (e.g., RAM). This section represents a sample of what is currently possible and will be desired. It is a baseline requirement for the data analysis software.

6.1.1.1 Data Quality
The quality of the data will be determined and appropriate corrective steps will be taken. Examples include masking out bad detectors and pixels, applying dead-time corrections, and calibrating detector positions. Other aspects of data quality, such as background and statistics, are discussed below.

6.1.1.2 Normalization
Each data set must be given a relative normalization, typically by comparison to a beam monitor, in order to reference measured spectra to the total number of neutrons incident on the sample. This type of normalization may be done on a wavelength-by-wavelength basis to simultaneously adjust for fluctuations in the incident wavelength spectrum.

6.1.1.3 Background subtraction
Frequently the determination of the scattering from the sample requires that the background be subtracted from the total scattering measured with the sample in the beam. This is typically done by repeating the measurements with no sample in the beam to obtain a background data file. In order to obtain just the scattering from the sample, the two measurements must be normalized in the same manner (relative or absolute) and then the background data subtracted from the data obtained with the sample in the beam. In some cases it may be necessary to carry out other measurements in order to obtain data corrections needed to determine the absolute structure factor S(Q) or S(Q,E). This may involve determination of the scattering from the empty container, determination of the background generated by the sample (e.g., diffuse scattering, multiple scattering, incoherent scattering), or other scattering measurements. Such measurements may also need to have a background subtracted (e.g., subtraction of sample-independent background), and they will have to be normalized on the same basis as the other scattering data. Incorporation of such measurements into the final reduced data or into the data analysis is discussed in the section on corrections. A minimal sketch of monitor normalization with background subtraction follows.
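The sketch assumes Poisson counting statistics (uncertainty of sqrt(N) on each raw bin) and propagates uncertainties in quadrature; the function name and the numeric values are illustrative.

```python
import numpy as np

def subtract_background(sample, monitor_s, background, monitor_b):
    """Monitor-normalize sample and background histograms, then subtract.

    Counts are assumed Poisson-distributed, so each raw bin carries an
    uncertainty of sqrt(N); uncertainties add in quadrature on subtraction.
    """
    s = np.asarray(sample, dtype=float)
    b = np.asarray(background, dtype=float)
    result = s / monitor_s - b / monitor_b
    # var(N/m) = N/m^2 for Poisson counts N and fixed monitor factor m
    err = np.sqrt(s / monitor_s**2 + b / monitor_b**2)
    return result, err

I, sigma = subtract_background([1200, 950], 1.0e6, [300, 310], 0.9e6)
```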
6.1.1.4 Transformation of variables
The neutron scattering histogram data will generally be in the form of intensity as a function of neutron TOF (or other information from which the neutron energy can be derived) and the position of the detector element recording the event. Detector element position can have as few as one coordinate (e.g., scattering angle) or as many as three (e.g., the full three-dimensional location of the element). However, experimenters are generally interested in the scattering as a function of other variables, such as neutron wavelength, energy, momentum transfer, or energy transfer. To express the data as intensities as a function of some set of these more interesting variables requires a coordinate transformation. The software should contain standard certified coordinate transformation packages to carry out any such desired coordinate transformations in a mathematically correct manner.
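As one concrete illustration, the sketch below maps TOF to elastic momentum transfer Q for a detector element at scattering angle 2θ, via λ = ht/(mL) and Q = 4π sin θ / λ; the numeric inputs are invented for the example.

```python
import numpy as np

H, M_N = 6.62607e-34, 1.67493e-27  # Planck's constant (J s), neutron mass (kg)

def tof_to_q(tof, L, two_theta):
    """Map TOF (s) at total flight path L (m) and scattering angle 2-theta (rad)
    to elastic momentum transfer Q in inverse Angstroms."""
    lam = H * np.asarray(tof) / (M_N * L)            # de Broglie wavelength (m)
    q = 4.0 * np.pi * np.sin(two_theta / 2.0) / lam  # Q in 1/m
    return q * 1e-10                                 # 1/m -> 1/Angstrom

print(tof_to_q([4.55e-3], L=10.0, two_theta=np.pi / 2))  # ~4.9 1/Angstrom
```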
6.1.1.5 Corrections
A variety of different corrections may need to be applied to the data, depending on the problem at hand. These include corrections for absorption, container scattering, multiple scattering, extinction, inelasticity, and perhaps others. Some of these may need to be specific to the particular experiment and geometry. Furthermore, the order in which corrections are made may be quite important; some may need to be made before any coordinate transformations, while others may be best done on the transformed data. In general this becomes an instrument-specific process, or even a process dependent on the type of experiment for a given instrument. The SNS analysis software should contain built-in capabilities for some of the standard types of corrections as well as the capability to incorporate other types of corrections as needed. One important general correction is related to the time-of-flight to wavelength conversion: the delay time between neutron pulse creation in the target and emission of a particular wavelength needs to be determined for the individual moderators. There should be a standard software routine that corrects the data accordingly.
6.1.1.6 Statistics
Each step in the data reduction, including transformation of variables, corrections, etc., must propagate the statistical uncertainties, calculated according to accepted practices with well-vetted algorithms. At any stage in this process the intermediate data file must contain the correct statistical uncertainties.
6.1.1.7 Certified software and pedigree of data
The pedigree, or history followed to obtain an intermediate data file, must be maintained embedded in that file. This includes unambiguous reference to each file and software package version utilized at every prior step of the data reduction, as well as any other input information associated with each of those steps. For references to the software used to be meaningful, all software in this sequence must have version numbers and be under strict version control, and the pedigree must contain the version number and other identifiers for each software module applied in the reduction. The data management software must provide the capability to extract and display this pedigree at any stage of the process.
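One way such a pedigree might be recorded is as one structured record per reduction step; all field names below are illustrative, not a fixed SNS schema.

```python
# A minimal sketch of one pedigree record; every field name here is hypothetical.
pedigree_step = {
    "operation": "background_subtraction",
    "software": {"module": "sns.reduction.background", "version": "0.3.1"},  # hypothetical
    "inputs": ["run7231_norm.nxs", "run7232_empty.nxs"],
    "parameters": {"monitor_normalized": True},
    "output": "run7231_bkgsub.nxs",
    "timestamp": "2004-09-28T14:03:00Z",
}
# Appending one such record per reduction step to the file's metadata yields the
# full, extractable history this section requires.
```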
6.1.1.8 Associated metadata
In addition to the various items indicated above that track the pedigree of the reduction process, the metadata associated with each of these intermediate data files should include, but not be limited to:

Proposal information
A universal reference to the proposal, and a subset of the information from it where appropriate, should be included as part of the metadata. Some portions of this, such as proposal number, user name, etc., will be embedded automatically in the raw data file.

Electronic notebook
An electronic notebook should be used and included as part of the metadata to provide a view of top-level comments (it should be possible to use the electronic notepad used for many or most notebook entries). The notebook will contain a record of the entire experiment, and the user can add annotations as desired throughout. The information contained in the notebook will make repeating an experiment possible.
Command Logging
During a user session commands executed shall be logged as part of the metadata. In this way, one will be able to reconstruct work performed.
Sample Environment and Sample Safety
Metadata shall also include information pertaining to the sample environment – specific information about the sample, such as position, environmental manipulation (temperature, pressure, etc.), sample container, etc. Sample safety metadata will include health physics reviews of samples to be placed within the beam and radiation survey information once a sample has been placed within the neutron beam.

6.1.1.9 Event data mode
If event data are first binned into a histogram of some form, then suitable versions of each of the steps above can be applied to this histogram just as to a histogram formed in the more usual manner on-the-fly. In this mode, the initial histogramming step would have to be performed as the first step in the data reduction sequence. Other methods of reducing event data may become clear as time passes; the architecture described here will not limit the reduction of event data to histogramming followed by the standard reduction. It will be especially important to associate neutron detection data (intensity, TOF bin, detector bin) with the appropriate corresponding sample environment data (temperature, magnetic field, etc.). In most cases the neutron counting data will be "live" but the sample environment data will arrive delayed.
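A minimal sketch of the initial histogramming step for event-mode data, binning (pixel, TOF) event pairs into a detector-position versus TOF histogram; the event values are invented for illustration.

```python
import numpy as np

# Hypothetical event stream: one (pixel_id, tof) pair per detected neutron.
pixel_id = np.array([3, 3, 7, 7, 7, 12])
tof_us = np.array([410.2, 955.0, 402.8, 873.3, 873.9, 1100.4])  # microseconds

# Bin events into a detector-position vs. TOF histogram as the first reduction step.
hist, pixel_edges, tof_edges = np.histogram2d(
    pixel_id, tof_us,
    bins=[np.arange(0, 16), np.arange(0.0, 1600.0, 100.0)],
)
```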
6.1.1.10 Other Preliminary Data Treatment
The SNS software needs to provide the functionality of currently available tools that are used by a majority of users to perform the analysis of their data. To this end, the software should provide a seamless interface to existing packages. A priority should be placed on software that is used by the majority of the user community. If possible, the interface should allow the existing software to have the standard look and feel of other SNS software. Many aspects of the analysis performed on neutron scattering data involve fitting data to a model. Therefore the SNS software should include a robust and tested fitting routine as a module. One requirement will be the ability to take an arbitrary function and convolve it with the instrument resolution. The output should provide the parameters of the fit and all the relevant quantities needed to determine the quality of a fit (e.g., reduced χ², correlation matrix, χ² test, etc.).
It will be possible to perform refinements using data from multiple sources, such as x-ray measurements. It will also be necessary to visualize these data from other sources, ideally side-by-side with neutron scattering data and, where appropriate, fused (overlaid) into a single view. The capability to view fused data is desired, but not required.
6.1.2 Simulation
The simulations discussed here encompass both instrument and materials simulations. The former calculate neutron scattering intensities at the detectors for a given instrument with a "standard" sample; the latter are computer predictions of the structure and dynamics of materials. Today computer codes for both types of simulations are readily available. Instrument simulation, in particular, has been used extensively in the design of neutron scattering instruments at new and upgraded sources.

To date, instrument and materials simulations have usually been performed separately. Such separate simulation capabilities will continue to be important. In addition, however, there will be a need for the ability to integrate instrument and materials simulations to enable predictions of neutron scattering intensities for real samples under real sample environments. This will become increasingly important as simulations and analyses become more sophisticated. A direct comparison of simulation (modeling) results with experimental data allows users to gain scientific insights from their data that would otherwise be difficult or impossible to obtain. As today's research is extended to materials of increasing complexity, a sample sometimes must be characterized with multiple instruments and under different sample environments to determine its structure and/or dynamics. With integrated simulations, it will be possible for users to simultaneously analyze multiple data sets from different instruments and different sample environments with a single model.

Secondly, computer simulations provide a tool for planning experiments. By performing a simulation before or during the experiment, users will be able to optimize experimental configurations and measurement strategies, which should greatly improve the success rate of their experiments. In addition, integrated simulation will serve as an educational tool, enabling novice users to learn the basics of neutron scattering by conducting a neutron scattering experiment on a computer. One important goal for SNS is to draw many more such novices into the ranks of occasional neutron scattering users, and tools such as this will play an important role in that endeavor. The following describes several requirements applicable to the simulation tools.
6.1.2.1 Instrument simulation
The SNS analysis software must provide access to the well-established instrument simulation packages, including McStas, VITESS, and IDEAS. The modular approach, in which a neutron optical component is represented by a self-contained subroutine or function call, is a natural way of conducting instrument simulation. The precompiled modules are loaded dynamically at run time, allowing rapid prototyping of an instrument or experiment. The modular approach also facilitates reuse of existing or legacy codes and provides great flexibility for incorporating new components. In the modular approach, the scattering sample is simply one of the modules in the simulation pipeline.
6.1.2.2 Scattering kernels
To simulate real experiments, considerable effort must be devoted to developing sample modules, or scattering kernels. One or more standard sample modules should be provided for each instrument. The scattering properties of a given module should be readily modifiable by users to simulate their own experiments. Although generally specified in terms of S(Q,ω), the scattering properties can be expressed in much simpler forms for certain scattering applications. In powder diffraction, for example, the scattering properties are specified by the lattice spacings and the associated structure factors of the diffraction peaks; a minimal sketch of such a kernel follows below. In a stress mapping experiment, the scattering properties may be specified by a stress map. It may prove useful to archive various types of scattering kernels for future reuse. Not all materials simulations produce results that can be readily incorporated into the scattering kernels. Phase-field modeling, for example, produces a microstructure. Tools will be needed to convert simulation results to the scattering properties specified by the standard sample modules. In the case of phase-field modeling, the simulation results will have to be first converted to S(Q) for comparison with data from small angle neutron scattering experiments.
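As a minimal illustration of a powder kernel, the sketch below returns the scattering angles at which a powder with given lattice d-spacings Bragg-scatters a fixed incident wavelength (λ = 2d sin θ); the d-spacings are hypothetical.

```python
import numpy as np

def powder_bragg_angles(d_spacings, lam):
    """Scattering angles 2-theta (rad) at which a powder with the given
    d-spacings (m) Bragg-scatters neutrons of wavelength lam (m): lam = 2 d sin(theta)."""
    d = np.asarray(d_spacings, dtype=float)
    s = lam / (2.0 * d)
    s = s[s <= 1.0]              # keep only reflections this wavelength can reach
    return 2.0 * np.arcsin(s)

# Example: first few d-spacings of a hypothetical sample, 1.8 Angstrom neutrons.
print(np.degrees(powder_bragg_angles([2.3e-10, 1.6e-10, 1.1e-10], 1.8e-10)))
```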
6.2 Software Repository
To avoid duplicating previous efforts, a survey should be conducted of existing analysis codes that could be usefully incorporated. In this regard, it may be desirable to establish a central software repository, where a library of common, as well as instrument-specific, modules will be stored for reuse. Several newly developed data analysis packages were highlighted at the NeSSI workshop: ISAW by IPNS for data visualization and some analysis, DAVE by NIST for analysis of inelastic neutron scattering data, and SScannSS by Open University/ISIS (UK) for stress mapping experiments. ISAW is based on Java, whereas DAVE and SScannSS are developed in the commercial IDL programming environment. All three packages offer many useful features that meet the demands of existing data and are gaining popularity beyond their birthplaces.
7.0 DATA INTERFACE
A facility such as SNS, with its large potential user base and intended longevity of at least a few decades, is faced with several different kinds of data storage and access requirements. Further, both these requirements and available storage technologies are likely to evolve and mature over the course of usage. SNS will be expected to provide 24/7 service to the user community and must provide an infrastructure capable of supporting this. At the very least, SNS should support three different kinds of data storage and associated access needs: new, intermediate, and archival data. SNS will also be challenged to identify appropriate database hardware and software to support data management needs.
7.1 Data Portal
The SNS data portal will support the following user operations on data:
o Browsing through raw, intermediate, scientific, and simulation datasets that a user is allowed to access.
o Performing standard file operations on these datasets, such as view, download, upload, etc.
o Cross-referencing scientific datasets along with their metadata and resulting publications.
7.1.1 Data Access
· Metadata are required not just for raw data, but also for intermediate and scientific data. This means any intermediate processing performed by the user will need to be annotated to include, but not be limited to, an audit trail for reproducibility.
· There is a need to efficiently catalog the datasets by creating mappings between datasets and searchable attributes that describe them. Such information would be stored in "Metadata Catalogs".
· The metadata infrastructure must be robust enough for users to modify and update information concerning datasets, add metadata regarding intermediate results and data products, etc.
· Where meaningful, the data browser should be capable of providing a "thumbnail" image of the data to facilitate visual searching. Providing "thumbnail" views is an optional requirement.
7.2 Data Storage and Access
New (fresh) data are likely to generate more accesses and should be cataloged and maintained in reliable, high speed storage with high performance data movement tools at its disposal. Intermediate (ephemeral) results are likely to be moved just once to their eventual destinations and need not be cataloged except as archival data. Archival data are required to be maintained persistently and are not likely to generate very many accesses, but they need to be cataloged in the event of future access. The SNS facility may decide to move and manage these datasets automatically (with very minimal administrator intervention) between high speed and cold archival storage, depending on user data access patterns and data aging.
7.2.1 Archival requirements
The SNS facility is likely to generate several hundred terabytes of data during its lifetime. In addition, user data reduction and transformations will also generate new datasets and insights into neutron science. Over the years datasets are likely to "age", generating fewer and fewer requests for the data as time passes. However, there is a growing demand among the user community to archive all of the data generated at SNS and not discard any of it. Thus:
· "Cold" or "old" data will need to be identified and archived continually so there is room for "fresh" data coming out of the facility.
· There is a need to identify a "threshold" age after which datasets will be archived. This threshold can be decided statically by the administrator or dynamically based on temporal access patterns (a minimal sketch of the static policy follows this subsection).
· Archival data – along with its metadata – will either need to be moved to offline media or lower down in the storage hierarchy.
· Archived data should be easily located and delivered in the event of a request.
Data archiving and retrieval will be performed automatically, with appropriate access control, without intervention of the instrument scientist.
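A minimal sketch of the static threshold policy described above; the threshold value, data layout, and function names are hypothetical:

    import time

    ARCHIVE_THRESHOLD_DAYS = 180   # hypothetical static threshold set by the administrator

    def select_for_archive(datasets, now=None):
        """Return datasets older than the threshold: candidates for cold storage.

        `datasets` is an iterable of (dataset_id, last_access_epoch) pairs;
        a dynamic policy could instead rank by recent access frequency.
        """
        now = now if now is not None else time.time()
        cutoff = now - ARCHIVE_THRESHOLD_DAYS * 86400
        return [ds_id for ds_id, last_access in datasets if last_access < cutoff]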
7.2.2 Fresh data requirements
Fresh data produced by the SNS facility have the following storage requirements:
· New data generated by the facility are likely to receive frequent user accesses and should be maintained in mainstream, high-speed storage systems. These datasets must also be backed up periodically for resilience against disk crashes and failures.
· These storage systems should support secure, reliable, high-performance data movement tools to move terabytes of data to their respective destinations quickly.
· In addition to data produced by the facility, users may wish to upload datasets resulting from their own transformations for others to share.
· All of these datasets will need to be cataloged for future access.
· Several storage access policies must be supported, covering both facility data and user experiment collaboration data. Access control mechanisms restricting particular users to particular datasets are required.
· Often in scientific environments, datasets are grouped into collections belonging to certain events. SNS may need to authorize data access for communities of users in terms of collections of data, as opposed to granting access to individual datasets (see the sketch following this list).
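A minimal sketch of collection-level authorization as described in the last bullet above; group, collection, and dataset names are hypothetical:

    # Permissions are granted on collections of datasets, not individual files.
    collection_contents = {
        "experiment-7/raw": {"run-0042", "run-0043"},
        "experiment-7/results": {"fit-0001"},
    }
    group_grants = {
        "experiment-7-team": {"experiment-7/raw", "experiment-7/results"},
    }
    user_groups = {"alice": {"experiment-7-team"}, "bob": set()}

    def may_access(user, dataset_id):
        """True if any of the user's groups holds a collection containing the dataset."""
        for group in user_groups.get(user, ()):
            for coll in group_grants.get(group, ()):
                if dataset_id in collection_contents.get(coll, ()):
                    return True
        return False

    assert may_access("alice", "run-0042")
    assert not may_access("bob", "run-0042")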
7.2.3 Ephemeral data requirements
In addition to archival and fresh datasets, the SNS facility must also handle intermediate result data. Users will typically run their computations on SNS resources, shipping only the results to their local sites for visualization. These result datasets will need to be stored temporarily at the facility until they can be transferred to the user, or until the user decides to upload them for others to use, in which case they must be cataloged and stored permanently.
7.2.4 Audit Trail
· Data and metadata lookups should be linked with the above infrastructure so that the dataset needed for a particular software tool to conduct a specified experiment on an instrument can be easily located.
· Parameters and files used for processing, modeling, and simulation will be stored.
· User experiment history, collaborations, processing software versions, and intermediate data products must also be identifiable from this interface.
· Software performing data treatment and analysis must be capable of incorporating data provenance and pedigree information as part of the metadata (a sketch follows this list).
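A minimal sketch of what one audit-trail entry might capture, using only the Python standard library; the schema and names shown are illustrative, not prescribed:

    import json, time

    def record_step(trail, operation, software, version, parameters, inputs, outputs):
        """Append one processing step to an audit trail (illustrative schema only)."""
        trail.append({
            "timestamp": time.time(),
            "operation": operation,
            "software": software,        # which tool performed the treatment
            "version": version,          # software version for pedigree
            "parameters": parameters,    # parameters used for processing
            "inputs": inputs,            # input dataset IDs (data lineage)
            "outputs": outputs,          # products generated by this step
        })

    trail = []
    record_step(trail, "normalization", "sns-reduce", "0.3",
                {"monitor": "m1"}, ["run-0042"], ["run-0042-norm"])
    print(json.dumps(trail, indent=2))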
7.2.5 Storage, distributed data access and lineage
Data generated at the SNS facility will be stored on-site. SNS may also mirror datasets generated at other neutron facilities, and they in turn may mirror the SNS datasets. SNS users may generate modeling, simulation, or analysis data at their home institutions that they wish to share with their whole team, and they might wish to upload these results to the facility for others to access. Thus, SNS has a potentially complex data storage, access, mirroring, and replication environment, which poses several stringent requirements on the intended software solution:
· SNS will need techniques for active mirroring of data and metadata, both its own and other sites'.
· As intermediate sites get involved, SNS will need a data replication hierarchy to maintain subsets of data at different levels and versions.
· Mirroring and replication introduce cataloging issues in terms of maintaining the metadata associations between datasets and their physical locations (a sketch follows below).
· Large datasets must be moved rapidly and efficiently to users around the world.
· Sharing datasets from intermediate computations for use in subsequent transformations requires metadata mechanisms for cataloging intermediate data products, tracing data lineage, associating data products with operations and users, etc.
· Software tools will be required to give SNS data administration personnel the ability to monitor storage capacities, data access and traffic volume, data location, back-up and archival histories, along with other standard data and enterprise management functions.
Grids address some of these issues by providing solutions for secure data access, replica cataloging, data-lineage tracing through virtual data products, and efficient data movement.
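A minimal sketch of a replica catalog maintaining the dataset-to-location associations noted above; the site names and URL scheme are hypothetical:

    # One logical dataset name maps to the physical locations holding copies.
    replicas = {
        "run-0042": ["sns://archive/run-0042", "isis://mirror/run-0042"],
    }

    def nearest_replica(dataset_id, preferred_sites):
        """Pick the first replica hosted at a preferred site, else any replica."""
        locations = replicas.get(dataset_id, [])
        for site in preferred_sites:
            for url in locations:
                if url.startswith(site + "://"):
                    return url
        return locations[0] if locations else None

    print(nearest_replica("run-0042", ["isis"]))   # -> isis://mirror/run-0042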
8.0 COMPUTER INTERFACE
8.1 Grid access
The neutron scattering community around the world will require access to both raw and processed data as well as metadata. To process data, they will also require access to high-performance computational resources, either at SNS or at their home institutions. Because of the potentially dispersed user base, datasets, and computational resources, the Grid can aid in real-time data analysis, real-time visualization, and experiment control. Grids address several of the access and data requirements posed by the SNS community. Grid computing also provides a convenient environment for geographically dispersed researchers to collaborate and build strong research teams. The South Eastern TeraGrid Extension to Neutron Science (SETENS) initiative aims to bring massive neutron source data to the TeraGrid, a project that will integrate computing and communication infrastructure from eight partner sites. The premise is that bringing neutron source data to the TeraGrid will aid in harnessing the Grid testbed for storage, real-time analysis, and instrument control. The software infrastructure for SNS can harness Grid technologies in the following areas, which also fit within the scope of the SETENS work:
· Storage, distributed data access and lineage
· Enabling neutron application tools
· Visualization support
· Intelligent scheduling of experiments
8.2 Cluster Computing and Distributed Processing
Easy-to-use tools such as NetSolve/GridSolve shall be incorporated, allowing users to take advantage of distributed computing resources with little effort. These tools shall be enabled via function calls from within software environments such as C/C++, Python, IDL, and Matlab (see the sketch following this list). Access to the following distributed computing resources (when available) shall be implemented:
· SNS internal computer clusters – centralized and per instrument
· ORNL high performance computers
· Computers available via the internet
· Computers available via the TeraGrid
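The call pattern such tools enable resembles a remote procedure call: the user invokes a named function, and the middleware selects a resource and returns the result. The following is a rough stand-in using only Python's standard library; it illustrates the pattern, not the actual NetSolve/GridSolve API, and the function and names are hypothetical:

    from concurrent.futures import ProcessPoolExecutor

    def reduce_run(run_id, bins):
        # Placeholder for a compute-heavy reduction kernel that grid
        # middleware would execute on a remote resource.
        return "reduced %s into %d bins" % (run_id, bins)

    if __name__ == "__main__":
        # The executor stands in for the middleware's resource scheduler.
        with ProcessPoolExecutor() as pool:
            future = pool.submit(reduce_run, "run-0042", 1024)
            print(future.result())    # blocks until the "remote" call returns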
Computing resources shall be allocated such that users currently at the facility performing experiments have access to the "fastest" computing resources available, allowing users to make the best use of their time while at SNS.
Instrument monitoring and control tools shall have an interface to manage distributed computing resources. In this way a system administrator can monitor and control job status, assess resource utilization, and control execution prioritization among SNS users. A subset of the monitoring and control tools shall be provided to the user to oversee and trace job execution, with limited control over that execution. Tools provided to the user shall also give feedback in the event of errors occurring during distributed processing.
9.0 VISUALIZATION INTERFACE GUIDELINES
Visualization is the visual representation of a complex, multidimensional set of data. To be useful, this representation must be presented in a physically or mathematically meaningful way. Data visualization will be important at all steps in the analysis of neutron scattering experimental data. It will be used as a subjective gauge of raw data quality and correct instrument operation. Visualization will act as a guide to data reduction and statistical quality. The final reduced datasets will be compared to models and simulations and/or fit with various algorithms. Finally, the results may be displayed as reciprocal-space results or even real-space atomic and molecular representations. All of these functions will be needed in an integrated data visualization package. The key to success in such a package for SNS will be striking the correct balance between the economy of developing common tools for all of the instruments and the flexibility for scientists to use those tools to develop analysis routines unique to the various applications.

A common interface should be developed for all of the instruments. This interface may incorporate all aspects of an experiment, from data acquisition through final data analysis. It should offer menu-driven commands as well as command-line input. Visualization may be used to look at an intermediate dataset after an operation has occurred, or may be an integral part of the operation itself. An example of the latter case would be an iterative analysis routine in which one saves the data after visual inspection or chooses to perform a different operation.

In general, a set of visualization routines should be available to users to enable a predetermined basic set of operations needed to perform experiments, reduce the data, and analyze the data. While more complicated routines and analyses may be developed by the user, this basic set should be developed by SNS and should be the minimum that most users will need. These visualization routines must be able to render the appropriate image on a time scale of a few seconds or less (depending on the size of the dataset). There should be options to choose the appropriate reductions and transformations on the data and then to choose the "cuts" to be taken through the results. These cuts should be able to be arbitrarily oriented. Two-dimensional, three-dimensional (contours, color plots, and perspective plots), and animated versions of data presentation may be selected as necessary.

Finally, the quality and options of the visualization output should be controlled by the user. If the user is to rely on these packages to produce the final output after data analysis, then he or she must be able to tailor the look of that output so that it is of publication quality appropriate to the journal to which it will be submitted. This includes selection of color schemes, fonts, line widths, etc. Some of the visualization will be interactive (point at a peak and get information), and some can be rendered through a web portal. In addition, there will be a way to plot two datasets on top of each other, for viewing data before and after a correction or for comparing measurement to model or simulation (see the sketch following this section). For portal use, the user shall be able to select visualization resolution (fine, medium, coarse, etc.) to adapt to variations in connection bandwidth when rendering images.
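As a concrete illustration of the overlay requirement above, a minimal sketch assuming numpy and matplotlib are available; the data here is synthetic and the correction model is a placeholder:

    import numpy as np
    import matplotlib.pyplot as plt

    q = np.linspace(0.1, 5.0, 200)
    raw = 100.0 * np.exp(-q) + np.random.poisson(5, q.size)   # synthetic "before"
    corrected = 100.0 * np.exp(-q)                            # synthetic "after"

    fig, ax = plt.subplots()
    ax.plot(q, raw, label="before correction", linewidth=1)
    ax.plot(q, corrected, label="after correction", linewidth=2)
    ax.set_yscale("log")                  # lin/log scaling under user control
    ax.set_xlabel("Q (inverse Angstroms)")
    ax.set_ylabel("Intensity (arb. units)")
    ax.legend()
    plt.show()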
Visualization software should be separate from processing software to the maximum extent possible. Keeping the two separate allows both to evolve at their own rates and supports modular replacement as necessary. In addition, a visualization interface definition shall be developed to facilitate creation of visualization software using any capable visualization tools. Practically speaking, a limited number of tools should be identified and supported, to reduce facility development and maintenance overhead.
9.1 General visualization
The user needs to be able to visualize the raw data and the various stages of processed data in some meaningful form, from the early stages of data collection (up to approximately one hour after starting collection) through the completion of data processing. The display mode should be very flexible, yet very intuitive. Displays should be similar to well-developed display software (for example, ORIGIN). Automatic scaling, manual scaling, and lin/log scaling should all be easily possible.

Source of data
It should be possible to produce visualizations of raw data or partially reduced data, from data in memory or in a data file. Data may also consist of many files to visualize, as when viewing multiple files from a parametric study. In this case, the graphics tools must provide a means for selecting and viewing a number of datasets. Advanced visualization may utilize multiple displays, providing a larger viewing canvas. The design of multiple-dataset visualization must take into consideration the bandwidth requirements for remote visualization. Such applications must optimize rendering speed by choosing whether to send data or images to the client machine.

Speed
Such visualizations should be updated in essentially real time (where a quantitative operational definition of "real time" is application dependent). The speed at which data displays should be updated is basically limited by the time required for the user to interpret what is going on (or whether something important has happened). To support fast visualization of large datasets, it may be necessary to support scalable visualization, in which visualization computation is performed separately from the rendering. For those applications it will be necessary to utilize some form of high performance computing with a high-speed connection between the computation and rendering engines.

Quantity of data
Only the amount of data that can be meaningfully represented on a video display will be visualized at any given time. However, the user will need to be able to switch quickly (in less than 10 seconds) to display a different cut through the data (see the sketch following this subsection).

Location
These displays should be viewable at a local SNS console and also remotely over the network.
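A minimal sketch of extracting an arbitrarily oriented one-dimensional cut through a three-dimensional dataset, assuming numpy and scipy are available; the array size and cut endpoints are placeholders:

    import numpy as np
    from scipy.ndimage import map_coordinates

    volume = np.random.rand(64, 64, 64)    # placeholder for reduced 3D data
    start = np.array([5.0, 5.0, 5.0])      # cut endpoints in pixel coordinates
    end = np.array([60.0, 40.0, 20.0])
    n = 200                                # number of samples along the cut

    t = np.linspace(0.0, 1.0, n)
    coords = start[:, None] * (1 - t) + end[:, None] * t   # shape (3, n): one row per axis
    cut = map_coordinates(volume, coords, order=1)         # linear interpolation along the cut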
Standardization across instruments
Instrument visualization software shall follow the Style Guidelines to achieve standardization of instrument interfaces and data visualization.
3D enabled
3D visualization shall be capable of taking advantage of graphics hardware accelerators, supporting standards such as OpenGL or DirectX.
Animation of data sets
These cases must be supported for visualizing animated data:
1. Parametric data visualization – data changing with respect to parameter changes (see the sketch below).
2. Simulation results updated with each iteration, allowing a user to visually analyze fitting convergence.
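A minimal sketch of the parametric case using matplotlib's FuncAnimation, assuming matplotlib and numpy are available; the model and the parameter scan are placeholders:

    import numpy as np
    import matplotlib.pyplot as plt
    from matplotlib.animation import FuncAnimation

    x = np.linspace(0, 10, 300)
    temperatures = np.linspace(10, 300, 30)    # hypothetical parameter scan

    fig, ax = plt.subplots()
    line, = ax.plot(x, np.sin(x))
    ax.set_ylim(-1.5, 1.5)

    def update(i):
        # Each frame shows the data at one parameter value.
        t = temperatures[i]
        line.set_ydata(np.sin(x) * np.exp(-x * t / 3000.0))   # placeholder model
        ax.set_title("T = %.0f K" % t)
        return line,

    anim = FuncAnimation(fig, update, frames=len(temperatures), interval=100)
    plt.show()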
Support for multiple visualization tools
A visualization interface definition is required that provides an abstraction layer between the data to be visualized and the visualization software. Such an interface definition provides development modularity and future development pathways. In addition, it gives the facility more independence in providing graphics/visualization functionality. Both commercial and open source software shall be supported. Visualization software shall implement the methods it is capable of supporting and gracefully "ignore" commands or methods it cannot execute (see the sketch following this paragraph). Where possible, the visualization software will utilize existing standards for graphics and visualization.
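A minimal sketch, using only the Python standard library, of how a backend behind such an interface definition can execute the methods it supports and gracefully ignore the rest; all class and method names are hypothetical:

    class VisualizationBackend:
        """Abstraction layer between the data and any capable visualization tool."""
        def dispatch(self, command, *args, **kwargs):
            handler = getattr(self, command, None)
            if callable(handler):
                return handler(*args, **kwargs)
            # Graceful fallback: unsupported commands are ignored, not fatal.
            print("note: backend cannot execute '%s'; ignoring" % command)

    class SimplePlotter(VisualizationBackend):
        def plot_1d(self, x, y):
            print("plotting %d points" % len(x))
        # No 'render_3d' method: 3D commands fall through to the fallback.

    backend = SimplePlotter()
    backend.dispatch("plot_1d", [1, 2, 3], [4, 5, 6])
    backend.dispatch("render_3d", None)    # gracefully ignored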
Graphical Interaction
As applicable, visualization applications shall provide feedback to other applications to perform a function – such as providing coordinates or data values to analysis routines.
Tool Types
Visualization tools shall include plotting, surface, contour, imaging, and data tabulation tools. These tools shall be accompanied by the appropriate controls to manipulate visual data. As appropriate, these tools shall support 1D, 2D, 3D, and 4D data. These tools shall be easy to program, allowing a user to develop GUI and visualization applications – thus visualization applications shall be built using higher-level tools rather than expecting users to build applications from low-level (perhaps hardware-oriented) tools.
9.2 Remote visualization support
Users visualize the end results of their computations so they can derive insights about their models and analysis. These computations will most likely be performed at the SNS facility, with results shipped to the user's home location to be visualized. Several challenges arise here that the Grid or distributed visualization tools such as CUMULVS can address:
· Streaming visualization is a desired feature, wherein intermediate results are dispatched to remote visualization clients while the parent application and/or experiment is still executing.
· Remote visualization of moving data requires intelligent high-speed data transfer to produce jitter-free visualization of results.
· Researchers working in geographically dispersed groups might desire to visualize data collaboratively. This is a challenging problem, bringing to the forefront several issues in collaborative control, multicast data transfers, etc.

9.3 Visual programming interface
There can be multiple ways to connect software components. One way is to use a visual programming interface. A visual programming interface is desired, but not necessarily required. If provided, it should include (see the sketch following this section):
· A visual environment for composing applications, with drag-and-drop features for selecting, connecting, and wiring software tools together.
· An application composition editor with the ability to represent tools and their connections, to delete tools, to save the composed application for future access, etc.
· A library listing of tools, organized by functionality, so they can be identified and invoked easily.

9.4 Visualization steering interface
The visualization interface will be used to render analysis results as an image after a number of computational steps. Being able to perform the following steering operations on the image adds to the flexibility of the visualization process. Definitive requirements for visualization steering shall be captured in subsequent documents. As a guide, the interface should support:
· Standard, commonly used viewing modes, including fly-bys and walkthroughs.
· Color panels to change the color table, lighting, intensity, hue, and saturation of the visualized object. Each visualization window is to have its own separate image manipulation controls.
· The ability to change the position of the "camera" and axes to view the object from various angles.
· The ability to select, pick, and magnify portions of the image, view cross sections, etc.
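A minimal back-end sketch for the application composition described in 9.3: tools are nodes and connections wire one tool's output to the next tool's input. The tool names are hypothetical and the "wiring" here is purely linear; a visual editor would build a richer graph:

    def background(data):
        # Placeholder tool: subtract a constant background.
        return [v - min(data) for v in data]

    def normalize(data):
        # Placeholder tool: scale to a maximum of one.
        return [v / max(data) for v in data]

    class Pipeline:
        """Ordered composition of tools; a composition editor would build this."""
        def __init__(self):
            self.stages = []
        def connect(self, tool):
            self.stages.append(tool)
            return self                    # allows chained wiring
        def run(self, data):
            for tool in self.stages:
                data = tool(data)
            return data

    result = Pipeline().connect(background).connect(normalize).run([3.0, 5.0, 9.0])
    print(result)                          # -> [0.0, 0.333..., 1.0]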
10.0 EXPERIMENT CONTROL AND MONITORING
10.1 Control Portal
The Control Portal will provide users and instrument scientists the capability to monitor the progress of an experiment without continuous presence. This capability must include two essential features:
· Web service interfaces to experiment controls for the different instruments, so they can be steered remotely.
· Co-scheduling of computer resources and beam time for coupling simulation and experiment.
10.1.1 Data Acquisition
Interactive control of the data acquisition process implies some form of decision making that affects operation of the data acquisition system. This process may be partially or entirely automated (e.g., count until a certain statistical accuracy is reached and then go on to the next planned operation) or may require decisions and/or actions by the user. Current neutron data acquisition systems typically count neutrons for a given configuration of the sample/instrument for a fixed length of time or a fixed number of proton pulses. The purpose of intelligent interactive control is to optimize use of the instrument beam time, maximizing the information content and science that can be obtained, and providing more flexibility in designing and implementing a data acquisition plan. Provision will be made for "simple" tests of the data (e.g., statistics in a specified interval of pixels and time) within the data acquisition system itself to allow rapid sampling. Control of the data acquisition strategy can then occur in nearly "real" time.
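To make the statistical-accuracy example concrete: for Poisson counting statistics, the relative error of N accumulated counts is approximately 1/sqrt(N), so an automated rule can stop counting once a region of interest reaches the requested precision. A minimal sketch, with a simulated acquisition source standing in for the real data stream:

    import math, random

    def count_until(target_relative_error, acquire):
        """Accumulate counts until 1/sqrt(N) falls below the target."""
        total = 0
        while total == 0 or 1.0 / math.sqrt(total) > target_relative_error:
            total += acquire()             # counts from one pulse/frame
        return total

    # Simulated acquisition: ~50 counts per frame in the region of interest.
    n = count_until(0.01, lambda: random.randint(40, 60))
    print(n, "counts collected")           # roughly 10,000 counts for 1% accuracy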
11.0 COMPREHENSIVE FACILITY OPERATIONS
With the paradigm shift from stand-alone desktop applications to portal-based software come new responsibilities for a facility: it is now responsible to users working with facility resources. To meet this responsibility effectively, a facility must equip itself with the appropriate management tools. These include:
· Process monitoring tools
· Data management and database tools
· Network management tools
· User management tools
Having these tools becomes more important as the facility supports more users. Initially, much of this functionality can be handled manually with relatively low-level tools; however, it is anticipated that more robust tools will be required in the future. Tool needs are to be defined in subsequent documents, and it is expected that existing tools will be evaluated to perform these tasks, hopefully limiting facility development of such tools.
11.1 User Web Services
Portal login also gives the facility the opportunity to provide additional web-based services to assist users. The services provided to the user will be defined in more detail in subsequent documents. Again, existing tools should be examined to see whether they provide the necessary functionality. Basic services and resources include:
· Tutorials
· User's guides and reference manuals
· Software repositories
· Facility news
· Facility and instrument calendars
· Facility and instrument status
· Contact information
· Help
12.0 SOFTWARE QUALITY ASSURANCE
To produce the enterprise-grade software necessary to support the 1000+ annual users of SNS, software quality assurance procedures based on software engineering best practices will need to be defined and applied. SNS users will require 24/7 access to software and data. A quality engineering process is needed to ensure that quality is designed and built into the software and support infrastructure. Separate from this document, software quality assurance procedures and policies will be defined for the following:
· GUI style guides and coding standards
· Software configuration management
· Software build procedures
· Software testing requirements and methods
· Exception handling requirements and methodologies
· Problem/issue/change tracking tools and procedures
· Software release criteria
· Software archive and repository management
· Data archive and data integrity assurance

12.1 Infrastructure
Infrastructure is needed in the form of tools for:
· Software version control – allows developers to manage software change.
· Issue tracking – recording and tracking failure reports and change requests from user feedback and testing results. For example, a software "bug report" can be initiated, have its status changed to "reviewed," be linked to code modifications (i.e., corrections made), and have its disposition recorded (corrected, deferred, etc.).
· Software building – extracting code from the software archive to build executable software according to repeatable procedures.
· Software testing – demonstrating with statistical confidence that expected or intended actual use will happen failure-free.
· Web-based communication – enabling communication among various groups, such as users, developers, administrators, and management, via web-based discussion groups and forums.
12.2 Software Development Work Products
This is the root document for producing SNS-developed software. These requirements and guidelines will be elaborated in more detailed requirements definitions for each of the interfaces discussed above and the classes of instruments being developed at SNS.
In general, requirements and specifications must be defined for each software component from which software is developed for SNS operational use. Verification that SNS software implements the desired functionality and capability is achieved by peer review of work products and by testing. From the specifications, test plans will be derived to guide software testing. Testing results may indicate needed modifications to requirements, specifications, code, or test plans. These relationships are captured in Figure 12.1.

Figure 12.1: Relationship among work products. Requirements initiate the document flow, and specifications support both code and test plan development. Code design information may be captured in a document or as embedded annotation. Testing outcomes may indicate necessary changes to any of these work products. Verification ensures that, overall, the desired functionality is produced.
When considering enhancement or repair of existing code, a software impact evaluation shall be performed. This evaluation estimates the level of effort to make the change, scope of the change within the software, and the effect it may have on other functionality or the architecture. Undertaking the modifications requires updates, as appropriate, to the existing software work products.
Peer reviews are important to ensure correctness and completeness. For requirements, it is important to involve all stakeholders to ensure that their needs are considered. Peer reviews involving developers, software testers, and appropriate scientific domain experts (instrument scientists, for example) are necessary for specifications. Code review by qualified staff is the primary means to ensure that as-built code accurately reflects the specifications.
Producing user documentation is also necessary. Software with an interface to human users requires publication of User’s Guides and/or Reference Manuals. In addition, tutorials are needed to train users on software use. This information should be made available on-line so that users can readily access the most recent information.
Software development work products summary:
· Requirements
· Specifications
· Design
· Code
· Test Plans
· Test and verification outcomes
· User's Guides
· Reference Manuals
· Tutorials
12.3 Testing
Developing test plans and performing testing are required components of producing SNS analysis-related software. New software will be developed incrementally. Each successive increment will be thoroughly tested to establish confidence that expected or intended actual use will occur without failure. For modifications to previously released software, thorough testing is required to ensure that errors have been corrected and that code modifications do not cause failures in previously certified functionality.
Software must be tested on each platform where it is expected to execute. For instance, an application expected to run under both Linux and Windows is really two different applications, and it must be tested on both platforms to ensure that it operates properly on each.
It is expected that legacy software will be incorporated and made available to users. However, this legacy software will have been developed outside the process for SNS software and hence will not, in general, be readily testable. As a minimum, limited testing is required to ensure that the software runs from the provided interface. Additionally, the user must be informed that the legacy software is provided "as-is," and the user must be provided references to any applicable certification information.

12.4 Feedback Mechanisms
This software development effort will identify, implement, and utilize the appropriate feedback mechanisms to support the production of high quality software and to continuously improve each succeeding software product. These mechanisms can be categorized as:
· User – provides input regarding problems in released software or desired feature changes or additions.
· Review records – provide a record of defects corrected and changes initiated during each phase of the development process.
· Formal testing – provides feedback on the number, type, and severity of defects resulting from design and coding activities.
· Metrics collection and analysis – provides objective productivity and quality measurements from which targeted process improvement actions can be defined.