University of Witwatersrand

Thesis

Doctor of Philosophy

Real-Time Protocol Strategies for Mission-Critical, Distributed Systems

R.M. Young

1996

Author : Richard Young Pr Eng, MSc(Eng)

Issue : 1

Date : 1996-07-08

A thesis submitted to the Faculty of Engineering, University of Witwatersrand, Johannesburg, Republic of South Africa, in fulfilment of the requirements of the degree of Doctor of Philosophy.

Declaration

I declare that this thesis is my own work. Where there has been collaboration with other parties, this is indicated by acknowledgement or reference.

It is being submitted for the degree of Doctor of Philosophy at the University of Witwatersrand, Johannesburg. It has not been submitted before for any degree or examination at any other University.

______R.M. Young

This 8th day of July, 1996.

Issue : 1 1996-07-08 Revision : 2 2006-05-31 Page ii of 214 ydthsm2.wpd

Abstract

This thesis addresses system-level issues applicable to real-time, mission-critical, distributed systems. In particular, it addresses the requirements for, and attributes of, data communication protocols to support the integration of data services into complex, real-time, distributed systems as well as the strategies applicable to the implementation of such systems.

The objectives of the work underlying this thesis are the analysis of the information management requirements of typical next-generation management and control systems and the synthesis of an optimal solution (in terms of performance, dependability, transparency and flexibility) using distributed computing elements and local area networks (LANs). Of particular significance is that the system solution should exhibit a high degree of integration across all its functional areas as well as an open systems architecture.

As the successful integration of distributed systems and the maximisation of interoperability rely on the employment of standards, a major objective is to critically analyze all currently available protocol standards in terms of their suitability for real-time, mission-critical, distributed systems and then synthesize an optimal solution using the most appropriate of these, with augmentation where necessary. As most of these standards were not necessarily developed for the applications of concern, innovative ways of optimising the solution without major deviation from accepted international standards are sought. Where off-the-shelf products are found to be unsuitable to implement specific elements of the proposed system solution, restricted design and development is proposed.

A system solution synthesized from the allocated and derived functional and performance requirements is proposed in terms of a data communications paradigm which meets these requirements and is practical in terms of available technology and affordability. The result is an implementable system catering for all physical and functional layers, i.e. from the physical cabling, up to the interface with the user's application software. All the layers are functionally decoupled to the maximum extent possible in order to provide for flexibility and obsolescence management.

While a systems solution considered appropriate for the present timeframe is identified, a methodology is also proposed which will systematically enable requirements of next generation systems to be matched to the capabilities and characteristics of technologies of the future.

By matching appropriate technologies and techniques, the proposed network solution is capable of supporting a critical virtual circuit to provide dependable, closed-loop, real-time control of critical sensor/actuator sub-systems using local area networks. It is also capable of providing full performance and protocol functionality in internetwork topologies without omitting the network and transport layers.

In order to verify the validity of the proposed solution, an experimental testbed is designed to support prototyping of the various elements of the system solution as well as integration of these elements into a concept demonstrator of a complete system. This prototyping falls into both the rapid and evolutionary types. The former is used to validate concepts and support performance measurements, while the latter is used to develop a number of robust, reusable software products, i.e. implementations of the Xpress Transport Protocol, Network Time Services and Network Management Services, as well as a novel Application Interface Services protocol.

Index Terms

Real-time computing, real-time protocols, mission-critical computing, distributed systems, local area networks, fibre optic networks, high performance networks, information management, information technology, system architecture, system integration, fault-tolerant systems, survivability, application interface services, network time protocol, network management services.

Preface

Revision 1 facilitates an Adobe Acrobat Portable Document Format (PDF) version.

Acknowledgements

The work in respect of the Architecture Demonstration Model was performed at CCII Systems (Pty) Ltd by a project team of six engineers under the leadership of the author. The overall concept, as well as that of the Architecture Demonstration Model, was that of the author.

The author wishes to acknowledge and thank the contributors to this exciting and rewarding project. His colleagues at C²I² Systems collaborated in the following areas :

Gerhard Krüger : APIS Protocol Conceptual Design

Etienne de Villiers : APIS Detailed Design and Implementation for Multibus II under iRMK; Protocol Performance Measurements

Michael Evans : APIS V1.0 Detailed Design and Implementation for PC

Søren Aalto : XTP Implementation for Multibus II under iRMK

Wouter de Waal : NTP Analysis and Implementation

Manie Steyn : APIS V2.0 Implementation

Thanks are also due to Jane Wolfe-Coote for her diligent proofreading of the final drafts of this text.

Acknowledgement is made to Dr Alf Weaver and James McNabb of Network Xpress, Inc. for their generic XTP implementation for the Intel 80x86 platform as well as a specific implementation for the Schneider and Koch FDDI PC platform.

The financial support provided by Messrs Pierre Meiring and Anton Jordaan of Armscor as well as Capt Johnny Kamerman and Lt Cdr Jacques Pienaar of the SA Navy is gratefully acknowledged. Their confidence and enthusiasm were instrumental in facilitating the work underlying this thesis.

Finally, the author wishes to thank his supervisor, Prof. Ian MacLeod, without whose wise and patient counsel this thesis could not have been accomplished.

Contents

1. Scope
1.1 Scope
1.2 Introduction
1.3 Contribution of the Study
1.4 Document Overview

2. Evolution of Distributed Control Systems
2.1 Introduction
2.2 History
2.2.1 Pneumatic Systems
2.2.2 Hydraulic Systems
2.2.3 Current Loop Systems
2.2.4 More Complex Systems
2.2.5 Local Area Networks
2.2.6 Implications of LANs
2.3 Real-Time Communication
2.3.1 Determinism
2.3.2 Latency
2.3.3 System Responsiveness
2.3.4 Priority
2.3.5 Precedence
2.4 Derived Requirements
2.4.1 Data-driven vs Address-driven Approach
2.4.2 Multicast
2.5 Future
2.5.1 Short Term - The Next 5 Years
2.5.1.1 Asynchronous Transfer Mode
2.5.1.2
2.5.2 Medium Term - 5 to 15 Years
2.5.2.1 Scalable Coherent Interface
2.5.2.2
2.5.3 Upper Bound of Network Performance Requirements
2.6 Chapter Summary

3. Contextual Definitions
3.1 Real-Time
3.1.1 Definition
3.1.2 Communications Implications
3.2 Protocol
3.2.1 Definition
3.2.2 Communications Implications
3.3 Strategy
3.3.1 Definition
3.3.2 Communications Implications
3.4 Mission-Critical
3.4.1 Definition
3.4.2 Communications Implications
3.5 Distributed
3.5.1 Definition
3.5.2 Communications Implications
3.6 System
3.6.1 Definition
3.6.2 Communications Implications
3.7 Open System
3.7.1 Definition
3.7.2 Communications Implications
3.8
3.8.1 Definition
3.8.2 Communications Implications
3.9 LAN Profile
3.9.1 Definition
3.9.2 Communications Implications
3.10 Fault-Tolerant
3.10.1 Definition
3.10.2 Communications Implications
3.11 Packet
3.11.1 Definition
3.11.2 Communications Implications
3.12 Message
3.12.1 Definition
3.12.2 Communications Implications
3.13 Cell
3.13.1 Definition
3.13.2 Communications Implications
3.14 Precedence
3.14.1 Definition
3.14.2 Communications Implications
3.15 Priority
3.15.1 Definition
3.15.2 Communications Implications
3.16 Chapter Summary

4. Real-Time Distributed System Applications
4.1 Scope
4.2 Introduction
4.3 Industrial Process Control Systems
4.3.1 Functional Performance Requirements
4.3.2 Network Performance Requirements
4.3.3 Data Distribution Requirements
4.3.4 Example
4.4 Aerodynamic Control Systems
4.4.1 Functional Performance Requirements
4.4.2 Network Performance Requirements
4.4.3 Data Distribution Requirements
4.4.4 Example
4.5 Space Launch Vehicle
4.5.1 Functional Performance Requirements
4.5.2 Example
4.6 Next Generation Vetronics System
4.6.1 Functional Performance Requirements
4.6.2 Real-Time Data Transfer Requirements
4.6.2.1 Critical Functions
4.6.2.2 Critical Data
4.6.3 Example
4.7 Naval Surface Combat System
4.7.1 Functional Performance Requirements
4.7.2 Data Distribution Requirements
4.7.2.1 Critical Functions
4.7.2.2 Critical Data
4.7.2.3 Synchronisation
4.7.3 Network Architectures
4.7.3.1 Present Generation USN Combat System (US Navy)
4.7.3.1.1 Description
4.7.3.1.2 Advantages
4.7.3.1.3 Disadvantages
4.7.3.1.4 Topology
4.7.3.2 Single LAN Topology Architecture
4.7.3.2.1 Description
4.7.3.2.2 Advantages
4.7.3.2.3 Disadvantages
4.7.3.2.4 Topology
4.7.3.3 Next Generation USN Combat System (US Navy)
4.7.3.3.1 Description
4.7.3.3.2 Advantages
4.7.3.3.3 Disadvantages
4.7.3.3.4 Topology
4.7.3.4 Federated Backbone-Topology Integrated System
4.7.3.4.1 Description
4.7.3.4.2 Advantages
4.7.3.4.3 Disadvantages
4.7.3.4.4 Topology
4.7.3.5 Integrated Hub-based Architecture
4.7.3.5.1 Description
4.7.3.5.2 Advantages
4.7.3.5.3 Disadvantages
4.7.3.5.4 Topology
4.7.3.6 Client-Server Architecture
4.7.3.6.1 Description
4.7.3.6.2 Server Integrity Techniques
4.7.3.6.3 Advantages
4.7.3.6.4 Disadvantages
4.7.3.6.5 Topology
4.8 Multimedia Information System
4.8.1 Functional Performance Requirements
4.8.2 Requirements Implications
4.8.3 Topology
4.9 Chapter Summary


5. System Requirements
5.1 Allocated Requirements
5.2 Derived Requirements
5.2.1 Interconnectivity
5.2.2 Scalability
5.2.3 Security
5.2.4 Determinism
5.2.5 Priority
5.2.6 Precedence
5.2.7 Class of Service
5.2.8 Quality of Service
5.2.9 Service Models
5.2.9.1 Connection
5.2.9.2 Datagram
5.2.9.3 Reliable Datagram
5.2.9.4 Transaction
5.2.9.5 Broadcast
5.2.9.6 Multicast
5.2.9.7 Flash Message
5.2.9.8 Critical Virtual Circuit
5.2.9.9 Safety Virtual Circuit
5.2.10 Dataflow Control
5.2.10.1 Latency Control
5.2.10.2 Error Control
5.2.10.3 Flow Control
5.2.10.4 Rate Control
5.2.10.5 Burst Control
5.3 Chapter Summary

6. System Architecture
6.1 Architecture Derivation
6.2 Architecture Models
6.3 Cable Layer
6.4
6.5 Data Link Layer Protocol
6.5.1 MAC Sub-Layer Protocol
6.5.1.1 Data Transfer Modes
6.5.1.1.1 Synchronous Mode
6.5.1.1.2 Asynchronous Mode
6.5.1.1.3 Isochronous Data
6.5.1.2 Station Management
6.5.2 Summary of FDDI Features
6.5.3 Logical Link Control Sub-Layer Protocol
6.5.3.1 LLC Type 1
6.5.3.2 LLC Type 4
6.5.4 Sub-Network Access Protocol
6.5.5 Choice of Data Link Layer Protocol
6.6 Network Layer Protocol
6.6.1 Connection-Oriented vs Connectionless Data Transfer
6.6.1.1 Connection-Oriented Approach
6.6.1.2 Connectionless Approach
6.6.2 Network Layer Requirements
6.6.3 OSI Network Layer Protocol
6.6.3.1 Connection-Oriented Network Protocol
6.6.3.2 Connectionless Network Protocol
6.6.4 Internet Protocol (IP)
6.6.5 XTP-Aware IP Routing
6.6.6 Choice of Network Layer Protocol
6.7 Transport Layer Protocol
6.7.1 Transport Layer Requirements
6.7.2 TCP
6.7.2.1 TCP Priority Scheme
6.7.2.2 TCP Deficiencies
6.7.2.3 TCP Suitability
6.7.3 ISO TP4
6.7.3.1 TP4 Priority Scheme
6.7.3.2 TP4 Deficiencies
6.7.3.3 TP4 Suitability
6.7.4 XTP
6.7.4.1 XTP History
6.7.4.2 XTP Capabilities
6.7.4.3 XTP's Orthogonal Approach
6.7.4.4 XTP Suitability for Real-Time LAN Profile
6.7.4.5 XTP Deficiencies
6.7.5 Priority Management
6.7.6 Error Management
6.7.7 Quality of Service
6.7.8 Flow Control
6.7.8.1 Sliding Window Scheme
6.7.8.2 Credit Scheme
6.7.9 Latency Control
6.7.10 Transport Layer Protocol for Real-Time LAN Profile
6.8 Extended Profile Services
6.8.1 Application Interface Services
6.8.1.1 Capabilities
6.8.1.2 APIS Overview
6.8.1.3 Principles of Operation
6.8.1.4 Services
6.8.1.5 Implications of the Data-Driven Approach
6.8.1.6 APIS Dataflow Control
6.8.1.7 APIS Architecture
6.8.1.8 Off-host Architecture
6.8.2 Network Time Services
6.8.2.1 Description
6.8.2.2 Capabilities
6.8.2.3 Network Time Protocol
6.8.2.4 NTP using FDDI as a Medium
6.8.2.5 Fast Initialisation by Synchronisation Seed
6.8.2.6 Timestamping
6.8.2.7 CPU/NIC Synchronisation
6.8.2.8 XTP Support for Timestamping
6.8.3 Network Management Services
6.8.3.1 Network Management Capabilities
6.8.3.2 Network Management Standards
6.8.3.3 Network Management Principles
6.8.3.3.1 Managed Objects
6.8.3.3.2 Management Agents
6.8.3.3.3 Managing Applications
6.8.3.3.4 Network Management Protocol
6.8.3.4 Network Management Products
6.8.3.5 Management Information Bases
6.8.3.5.1 Parameter Management Frames
6.8.3.5.2 SNMP MIBs
6.8.3.6 Network Management Station
6.8.4 Network Security Services
6.8.4.1 Confidentiality
6.8.4.2 Integrity
6.8.4.3 Accountability
6.8.4.4 Access Control
6.8.4.5 Availability
6.8.5 Summary of Extended Profile Services
6.9 Chapter Summary

7. System Implementation
7.1 Functional Integration
7.1.1 Vertical Approach
7.1.2 Horizontal Approach
7.1.3 Practical Implementation Approach
7.2 Topology
7.2.1 LAN Connectivity
7.2.2 Interconnectivity
7.2.2.1 Coupling and Bypass Devices
7.2.2.2 Repeaters
7.2.2.3 Bridges
7.2.2.4 Routers
7.2.2.5 Gateways
7.2.2.6 Switches
7.2.2.6.1 Circuit Switching
7.2.2.6.2 Packet Switching
7.2.2.6.3 Virtual Circuits
7.2.2.6.4 Critical Virtual Circuits
7.2.2.7 Concentrators
7.2.3 Wide Area Connectivity
7.2.4 System Dependability
7.2.4.1 Availability
7.2.4.2 Fault-Tolerance
7.2.4.3 Reconfigurability
7.2.4.4 Enabling Technology
7.3 Multimedia Networking
7.3.1 Video and Image
7.3.2 Multiplexed Video and Image
7.3.3 Video Data Compression
7.3.3.1 Lossless Video Data Compression
7.3.3.2 Lossy Video Data Compression
7.3.3.3 JPEG Compression
7.3.3.4 MPEG Compression
7.3.3.5 Motion JPEG Compression
7.3.3.6 H.261 Compression
7.3.3.7 Compression Ratio
7.3.4 Networking Implications of Digital Video
7.3.5 Audio
7.3.6 Voice
7.3.7 Continuous Media Services
7.3.8 Summary of Network Implications of Continuous Media
7.4 Connectivity Issues
7.4.1 Cable Plant
7.4.2 Network Interface Cards
7.4.2.1 Intelligent NICs
7.4.2.2 Non-Intelligent NICs
7.4.3 Real-Time Operating Systems
7.4.4 POSIX
7.5 Chapter Summary

8. System Prototyping and Modelling
8.1 Scope
8.2 Objectives
8.3 Experimental Testbed Description
8.3.1 Local Area Network
8.3.2 Protocol Engineering
8.3.2.1 XTP Development
8.3.2.2 IP Implementation
8.3.2.3 APIS Development
8.4 Performance Measurements
8.4.1 Measurement Setup
8.4.2 Test Equipment
8.4.3 Test Scenarios
8.4.4 Protocol Overheads
8.4.5 Parallel Backplane Latencies
8.4.5.1 Test Setup
8.4.5.1.1 Fixed Length Unsolicited Transfers
8.4.5.1.2 Fixed Length Solicited Transfers
8.4.5.2 Test Results
8.4.5.2.1 Unsolicited Transfers
8.4.5.2.2 Solicited Transfers
8.4.5.2.3 Variable Length Solicited Inter-Host Transfers
8.4.6 Logical Link Control (LLC1) Latencies
8.4.7 XTP Latency
8.4.7.1 Vendor Measurements
8.4.7.2 Own Measurements
8.4.8 XTP Throughput
8.4.9 Complete Stack under No Load
8.4.9.1 Contribution per Layer
8.4.10 Complete Stack under FDDI Load
8.5 NTP Development
8.6 Simulator Development
8.7 Technical Conclusions
8.7.1 Prototyping and Modelling
8.7.1.1 Transport Layer/Network Layer Coupling
8.7.1.2 Off-the-Shelf Products
8.7.1.3 Software Language Compilers
8.7.1.4 XTP
8.7.1.5 Throughput Problem with MBII FDDI NIC
8.7.1.6 APIS Development
8.7.1.7 NTP Development
8.7.1.8 Simulator Development
8.7.2 Protocol Issues
8.7.2.1 FDDI LAN Standard
8.7.2.2 XTP
8.7.2.3 Multicast
8.7.2.4 Interconnectivity
8.7.2.5 Determinism
8.7.2.6 Multiprotocol Operation
8.7.2.7 Digitised Continuous Media Services
8.7.2.8 Network Management Services
8.7.3 Implementation Issues
8.7.3.1 Prototyping
8.7.3.1.1 Rapid Prototyping
8.7.3.1.2 Evolutionary Prototyping
8.7.3.2 Protocol Optimisation
8.8 Technical Recommendations
8.8.1 Image and Video LANs
8.8.2 Software Language Compilers
8.8.3 Internet Protocol
8.8.4 Real-Time Operating System
8.8.5 FDDI Synchronous Bandwidth Allocator

9. Conclusions
9.1 Architecture Concept
9.1.1 Solution Derivation
9.1.2 Time Validity of Proposed Solution
9.1.3 Networking Requirements for Next Generation Real-Time Systems
9.1.4 Implications of High-Speed Networks
9.1.5 Asynchronous Transfer Mode
9.1.6 Scalability
9.1.7 Spare Capacity
9.1.8 Standards and Standard Building Blocks
9.1.9 Relationship of Implementation and Real-Time LAN Profile
9.1.10 Interoperability and Performance
9.1.11 System Effectiveness
9.1.12 Application Interface Services
9.1.13 Real-Time Network Protocols
9.1.14 Continuous Media Distribution
9.1.15 Matching System Requirements with LAN Technologies
9.1.16 Open Systems Architecture
9.1.17 LAN Profile
9.1.18 Building Blocks
9.2 Significance of the Study
9.3 Limitations of the Study
9.4 Final Conclusion

References

Bibliography

List of Appendices

Appendix A - Physical Layer LAN Media and Protocols

Appendix B - Data Link Layer LAN Protocols

Appendix C - Network Layer LAN Protocols

Appendix D - Transport Layer LAN Protocols

Appendix E - Application Interface Services

Appendix F - Network Time Services

Appendix G - LAN Profiles

Appendix H - Error Analysis and Modelling

Appendix I - Dataflow Interface Management

Appendix J - Recommended Standards and Products

List of Figures

Figure 1 : Integrated Mining Control Network Using FDDI and XTP
Figure 2 : Fly-By-Wire Aerodynamic Control System based on FDDI
Figure 3 : Space Launch Vehicle with Bus-based Distributed Control System
Figure 4 : Next Generation Vetronics Architecture
Figure 5 : Aegis Combat System Architecture
Figure 6 : Combat System Architecture based on Single FDDI LAN Topology
Figure 7 : Next Generation USN Combat System Architecture
Figure 8 : Combat System Architecture based on Backbone Topology
Figure 9 : Integrated Hub-based Combat System Architecture
Figure 10 : Combat System Architecture based on Client-Server Topology
Figure 11 : Multimedia Information System based on ATM
Figure 12 : Real-Time Protocol Architecture
Figure 13 : APIS Architecture
Figure 14 : Typical Network Management Man-Machine Interface
Figure 15 : Vertically Integrated System
Figure 16 : Horizontally Integrated System
Figure 17 : Typical System Fibre Cable Plant
Figure 18 : Trunk Coupling Unit
Figure 19 : Experimental Testbed Topology
Figure 20 : Setup for Performance Measurements

List of Tables

Table I : Example of System-Level Error Control Requirements
Table II : ISO OSI Basic Reference Model
Table III : Real-Time LAN Profile
Table IV : XTP Implementer Options
Table V : Example of System-Level Message Priority Scheme
Table VI : Example of System-Level Message Error Control Policy
Table VII : Capabilities Supporting Critical Virtual Circuits
Table VIII : Data Rates for Uncompressed Video
Table IX : Video Quality for Various Lossy Compression Ratios
Table X : Multibus II Host Processor Boards and FDDI NICs used for Latency Tests
Table XI : Protocol Stack Layers
Table XII : Protocol Layer Header Overheads per Packet
Table XIII : Unsolicited Transfer Latencies
Table XIV : Solicited Transfer Latencies
Table XV : Solicited Inter-Host Variable Length Transfer Latencies
Table XVI : Logical Link Control Latencies
Table XVII : XTP Latency Performance Results
Table XVIII : XTP Transport Latencies over Multibus II
Table XIX : XTP Throughput Performance Results
Table XX : Complete Stack Performance Measurements under No Load
Table XXI : Protocol Layer Transfer Latencies
Table XXII : Complete Stack Performance Measurements under Load

Abbreviations and Acronyms

A/D Analogue to Digital
ADM Architecture Demonstration Model
APIS Application Interface Services
ANSI American National Standards Institute
ARQ Automatic Repeat Request
ASAP Application Service Access Point
ASCII American Standard Code for Information Interchange
ATM Asynchronous Transfer Mode
ATS APIS Test Shell
BER Bit Error Rate
BIT Built-In Test
BITS Built-In Test Services
BSRF Basic System Reference Frequency
CAM Contents Addressable Memory
CATNIP Common Architecture Technology for Next Generation IP
CCS Command and Control System
CDDI Copper Distributed Data Interface
CD Compact Disc
CLA Comsoft LAN Architecture
CLNP Connectionless Network Protocol
CONP Connection-Oriented Network Protocol
COTS Commercial Off-the-Shelf
CPU
CSMA/CD Carrier Sense Multiple Access with Collision Detect
CVC Critical Virtual Circuit
D/A Digital to Analogue
DAS Dual-Attachment Station
DCT Discrete Cosine Transform
DOD Department of Defense
DQDB Dual Queue Dual Bus
EISA Extended Industry Standard Architecture
EMC Electromagnetic Compatibility
EPROM Erasable Programmable Read Only Memory
ETS Extended Transport Service
FC Fibre Channel
FDDI Fibre Distributed Data Interface
FDVDI Fibre Distributed Voice, Video and Data Interface
FFOL FDDI Follow-on LAN
FMECA Failure Modes, Effects and Criticality Analysis
FOM Figure Of Merit
FTA Fault Tree Analysis
FTP File Transfer Protocol
FTS File Transfer Services
GPS Global Positioning System
GRMS Generalised Rate Monotonic Scheduling
HPN High Performance Network
HPNWG High Performance Network Working Group
HRC Hybrid Ring Control
HSDB High-Speed Data Bus
IC Integrated Circuit
ID Identifier
IEEE Institute of Electrical and Electronic Engineers
IETF Internet Engineering Task Force
IFD Information Flow Database
I/O Input/Output
IP Internet Protocol
IPng Internet Protocol - New Generation
IPX Internetwork Packet Exchange
iRMX Intel Real-time Multitasking Executive
iRMK Intel Real-time Multitasking Kernel
ISA Industry Standard Architecture
ISDN Integrated Services Digital Network
ISO International Standards Organisation
IT Information Technology
ITU International Telecommunications Union
IVIT Interface Verification and Integration Test
JPEG Joint Photographic Experts Group
KRM Kernel Reference Model
LAN Local Area Network
LSAP Link Service Access Point
LSS Lightweight Support Services
MAN Metropolitan Area Network
MAC Media Access Control
MAP Manufacturing Automation Protocol
MIB Management Information Base
MCA Microchannel Architecture
MIPS Mega-Instruction Per Second
MFC Multi-Function Console
MLT-3 Multi-Level Threshold 3
MMI Man-Machine Interface
MOD(N) Ministry of Defence - Navy (UK)
MPD Medium Propagation Delay
MPEG Motion Picture Experts Group
NATO North Atlantic Treaty Organisation
NETBLT Network Block Transfer
NIC Network Interface Card
NIMROD New Internet Routing and Addressing Architecture
NMEA National Marine Electronics Association
NMS Network Management Services
NPU Numeric Processing Unit
NSAP Network Service Access Point
NTDS Navy Tactical Data System
NTP Network Time Protocol
NTS Network Time Services
OBS Optical Bypass Switch
OS Operating System
OSI Open Systems Interconnect
PAL Phase Alternating Line
PBB Parallel Backplane Bus
PCB Printed Circuit Board
PHY Physical Layer Protocol
PL PHY Latency
PMD Physical Medium Dependent
PMF Parameter Management Frames
PC Personal Computer
POSIX Portable Operating System Interface Extension
psi pound per square inch
QoS Quality of Service
RAM Random-Access Memory
RF Radio Frequency
RFI Radio Frequency Interference
RGB Red, Green, Blue
RINA Royal Institute of Naval Architects
RISC Reduced Instruction Set Computer
RLT Ring Latency Time
RN Royal Navy
RSVP Resource Reservation Protocol
RTP Real-Time Transport Protocol
RTCS Real-Time Control System
R-T Real-Time
SABS South African Bureau of Standards
SAE Society of Automotive Engineers
SAFENET Survivable Adaptable Fibre Optic Embedded Network
SANDF South African National Defence Force
SAP Service Access Point
SAS Single-Attachment Station
SBA Synchronous Bandwidth Allocator
SCADA Supervisory Control and Data Acquisition
SFCP System Fibre Cable Plant
SDH Synchronous Digital Hierarchy
SIA System Integration Authority
SIPP Simple Internet Protocol Plus
SMT Station Management
SNAP Sub-Network Access Protocol
SNMP Simple Network Management Protocol
SONET Synchronous Optical Network
SPX Sequenced Packet Exchange
STANAG Standard NATO Agreement
STP Shielded
STS SAFENET Time Services
TBD To Be Determined
TCP Transmission Control Protocol
TCU Trunk Coupling Unit
TOP Technical Office Protocol
TP Transport Protocol
TP4 Transport Protocol Class 4
TRT Token Rotation Time
TSAP Transport Service Access Point
TTP Time-Triggered Protocol
TTRT Target Token Rotation Time
TUBA TCP and UDP with Bigger Addresses
USN United States Navy
WAN Wide Area Network
WBC Wideband Channel
WORM Write Once Read Many
UDP User Datagram Protocol
UPS Uninterruptible Power Supply
UTP Unshielded Twisted Pair
VLSI Very Large Scale Integration
WER Word Error Rate
VME Versa Module Europe
VMTP Versatile Message Transaction Protocol
XTP Xpress Transport Protocol


Chapter 1

Scope


1. Scope

1.1 Scope

This thesis addresses system-level issues applicable to real-time, mission-critical, distributed systems. In particular, it addresses the requirements for and attributes of data communication protocols to support the integration of data services into complex, real-time, distributed systems as well as the strategies applicable to the implementation of such systems.

The objectives of the work underlying this thesis are the analysis of the information management requirements of typical next generation, distributed control systems and the synthesis of an optimal solution (in terms of performance, dependability, transparency and flexibility) using distributed computing elements and local area networks (LANs). Of particular significance is that the system solution should exhibit a high degree of integration across all its functional areas as well as an open systems architecture.

As the successful integration of distributed systems and the maximisation of interoperability rely on the employment of standards, a major objective is to critically analyze all currently available protocol standards in terms of their suitability for real-time, mission-critical, distributed systems and then synthesize an optimal solution using the most appropriate of these, with augmentation where necessary. As most of these standards were not necessarily developed for the applications of concern, innovative ways of optimising the solution without major deviation from accepted international standards are sought.

For this reason, the work underlying this thesis is not aimed at highly focused research intended to break new ground in terms of technology, but at the broader system perspective of implementing real-life, real-time, mission-critical, distributed system solutions.

1.2 Introduction

In order to meet the ever-increasing demands for performance, effectiveness, flexibility and upgradeability, systems are being designed to rely to a greater and greater extent on computers. Originally computers were employed as high-speed computational aids rather than as controllers of complex systems. However, their use has advanced to a stage where their capabilities are much more intimately and interactively involved with our lives and environment. As Stankovic observes in his paper Real-Time Computing Systems: The Next Generation[123] :

"Real-time systems play a vital role in our society. Examples of current real-time systems are command and control systems, nuclear power plants, process control plants, flight control systems, space shuttle and aircraft avionics, and robotics."

The first computers were cumbersome, bulky and expensive to acquire, operate and maintain. Hence they were used in special facilities for specialised applications. Since its beginnings in the late 1940s and early 1950s, computer technology has made dramatic advances in terms of performance, cost and size. These advances have created opportunities for ubiquitous and dependable computing which is the basis of the information revolution which is changing our lives and the nature of our society. Stankovic continues :


".... hardware (and software) technology has made distributed computing and multiprocessing a reality, and soon there will be many networks of multiprocessors."

While computer hardware provides the computing platform for the required functionality, it is normally computer software which effects the actual functionality. As Kopetz and Veríssimo observe in their paper Real Time and Dependability Concepts[94] :

"The functionality of an intelligent product is determined by the integrated software."

In systems with a high number of operators, personnel costs (salaries, benefits, training, support, etc.) are a major determinant of the lifecycle cost of ownership. Dramatic operating cost savings can be achieved by reduced manning, with human operator functions being automated by computers and robotics. Such approaches also have added benefits in that fewer humans need to be exposed to life-threatening risk, e.g. in war or in hazardous chemical and nuclear process plants. In their paper Autonomic Ship Concept[69], Ditizio et al predict that aboard a naval surface warship, a reduction of required crew from 365 to 100 is possible using such automation technologies. If it is considered that a number of support personnel are required for each operational sailor, the cost savings are indeed substantial.

Significant progress in the area of computer hardware technology (processors, memory, data storage, etc.) has been made in recent years with the result that relatively inexpensive computing power is readily available. Although current computer hardware can offer considerable computing power, machines are susceptible to failure, with onerous implications in mission and safety-critical applications, especially in centralised computer architectures. Examples of such applications are fly-by-wire aircraft, medical life-support systems[80, 140] and air defence systems. Also, despite the considerable power of current computers, the demands of software (especially those of modern high-level language implementations) are rapidly beginning to outstrip the capabilities of the most powerful computers. For example, Geary and Masters[76] predict that the computing power of the US Navy Aegis combat system will have to expand from five standard units at present (Baseline 5) to 280 units at Baseline 7 in the year 2004.

A solution to these apparent dilemmas is the implementation of distributed computer architectures where a number of distributed, but connected, computers share the processing load. Halsall concurs in his book Data Communications, Computer Networks and OSI[82] :

"The dramatic advances in computer technology over the past decade, however, have resulted in an increasing number of information processing systems now being implemented as a linked set of computer-based equipment."

Distributed systems are those characterised by containing interconnections between multiple, independent computing nodes which co-operate to maintain some shared state. Such architectures allow sharing of information and resources over a wide geographical and organisational range. They can exploit small, inexpensive computing elements thereby achieving system reliability, expandability and affordability.

Certain applications are intrinsically inclined towards distributed architectures, e.g. military combat systems which must feature high survivability[74, 76, 130]. Such architectures have an advantage when their critical resources are geographically distributed. They may be able to employ redundant connection paths or self-healing nodes[86] in order to maintain system-wide communication, or they may be able to gracefully fall back to a local, albeit degraded, mode of operation in order to effect self-defence or damage control[73].

While the implementation of any distributed computer architecture is non-trivial, the requirements for such systems become complex when applied to critically real-time applications[122]. Real-time systems are required to preserve both timeliness and order[134]. Maintaining timeliness and order is complex in distributed systems because of uncertainties intrinsic within the interconnection fabric connecting the distributed elements. These uncertainties arise from physical phenomena (such as noise), due to competition for the resources of the shared medium (i.e. bandwidth and time), as well as finite processing capabilities and buffer memory of the I/O processes. In internetwork topologies, the ordering of data packets can also be modified by packets travelling different routes to the end nodes.

In general, multi-functional, real-time, distributed systems demand the capability of handling high data rates and vast data volumes with low latency times in a reliable, deterministic and secure manner. Of particular significance are the issues concerning dependable, closed-loop, real-time control of critical sensor/actuator sub-systems using real-time protocols over local area networks. Examples are fly-by-wire and fly-by-light[81] aircraft where these high performance systems would be dynamically unstable without such intimate, real-time control of the aerodynamic surfaces. Civilian examples of these systems are the Airbus A320 and Boeing 777.

While various technologies have been developed to support distributed systems [viz. high performance processors (such as the Intel and DEC Alpha), high-speed fibre-optic LANs (such as FDDI[68, 107]) and WANs (such as ATM[66]), high performance high-level software languages (such as C++ and Ada[27]) and operating systems (such as iRMX, VxWorks[145] and LynxOS[96])], communications protocols to support mission-critical, real-time, distributed systems have yet to be fully defined and developed. As Mullender observes in his paper Interprocess Communication[109] :

"But now a new generation of much faster networks is emerging and, at the same time, new applications place new demands on transport protocols. Building local area networks operating at 100 Mbps has long ago ceased to be a technological challenge and networks operating at one Gbps now work in various laboratories. Building communication protocols that exploit these speeds is still a challenge."

While elegance and performance optimisation of the individual protocol layers is necessary to achieve an optimised real-time distributed system, there are many other requirements and factors in achieving a practical and coherent system. Most significant of these are interfacing between protocol layers, integration with the operating system and hardware platform as well as dataflow integration between user applications. Where the contributing components are designed and developed within one organisation these are non-trivial problems. Where they are developed by disparate organisations, the integration problem can acquire a new dimension of complexity. Stankovic concurs in his paper Real-Time Computing Systems: The Next Generation[122] :


"...However, integration of the low-level protocols with the functioning of the operating system kernel, I/O modules, and application modules, as well as the inclusion of fault tolerant features, are issues that remain to be solved."

It is proposed that one leg of the solution to this problem is the employment, wherever possible, of accepted standards for all components of the system. Such components need to have been rigorously qualified and their interfaces fully documented. As Veríssimo observes in his paper Real-Time Communication[134] :

"In the domain of hard real-time distributed systems one often finds proprietary approaches with replicated hardware. .... Non-replicated networks, from a cost and system complexity viewpoint, would be a preferable solution: hardware saving and bound to use standard, off-the-shelf components."

The other leg to the solution is to engineer the system in such a way as to provide transparency between the user applications and the communication infrastructure. Such transparency implies providing a simple interface between the user application and the communication infrastructure which isolates the details and complexity of the latter without detracting from performance or flexibility. Such an approach is supported by Weihl in his paper Specifications of Concurrent and Distributed Systems[144] :

"The specifications of the individual modules provide another vital advantage: they allow us to understand the system without having to understand all the details of its construction. In other words they provide abstraction: they hide implementation details of one module that are not relevant to the workings of other modules. This is perhaps the most powerful technique we have available to us to reduce the complexity of large systems."

While Weihl makes this observation in the context of software modularity, it is contended that the concept is equally applicable to the interaction between a set of application users and the communication sub-system of the real-time distributed system. As Stankovic observes[123] :

"It is certainly true that the state-of-the-art in real-time system design is mostly ad-hoc. That does not mean, however, that a scientific approach is not possible."

Observation

For complex real-time, mission-critical, distributed systems it is contended that it is imperative to use scientific or system engineering methodologies to derive an optimal system architecture.

Traditionally designers of real-time systems were forced to sacrifice transparency and abstraction in order to meet performance requirements. As Strayer and Weaver point out in their paper Is XTP suitable for Distributed Real-Time Systems?[124] :

"As a consequence many communications subsystems supporting real-time applications are based on MAC layer services".


Basing communications on MAC-layer services implies leaving out the network and transport layers; this results in proprietary solutions.

Furthermore, in the real world where the integration and support of sub-systems and products from numerous manufacturers over extended periods of time are unavoidable requirements, the issues of open systems and obsolescence management are important factors. As Sha observes in his paper Industrial Computing - A Grand Challenge[119] :

"The resulting new industrial computing infrastructure must have built-in support for system upgrading so that new electromechanical equipment, computers, networks, and software can easily be introduced."

Integration of such heterogeneous equipment requires that it be able to communicate in an unconstrained or open way. Sha continues :

"Finally, truly significant new technologies should be supported by open standards so that they can be used cost-effectively."

Thus the real-time protocols need not only to exhibit functionality and performance, but also to conform to international standards, be portable to different platforms as well as provide graceful upgrade paths to technologies of the future.

A major goal of this thesis is, therefore, to identify and develop a system solution, based on standard components, which meets the real-time performance requirements of a generic, mission-critical, distributed system. It should do so in a way which supports integrateability and expandability by offering transparency between the user applications and the communication infrastructure.

1.3 Contribution of the Study

The thesis proposes a system solution to real-time, mission-critical, distributed applications using building blocks conforming to international standards. The result is an implementable system catering for all physical and functional layers, i.e. from the physical cabling, up to the interface with the user's application software. All the layers are functionally decoupled to the maximum extent possible in order to provide for obsolescence management.

While a systems solution considered appropriate for the present timeframe is identified, a methodology is also proposed which will systematically enable requirements of next generation systems to be matched to the capabilities and characteristics of technologies of the future.

By matching of appropriate technologies and techniques, the proposed network solution is capable of supporting a critical virtual circuit to provide dependable, closed-loop, real-time control of critical sensor/actuator sub-systems using local area networks. It is also capable of providing full performance and protocol functionality in internetwork topologies without omitting the network and transport layers.

A claim of this thesis is, therefore, that it derives a set of generic, real-time communication requirements from the user requirements of a typical, multi-function, distributed system. An appropriate layered Real-Time LAN Profile is derived to match these requirements with an optimal set of standard components for each of the layers being identified and proposed. These proposed candidates are optimal in terms of availability, performance, flexibility and expandability.

The communication system must exhibit high dependability and low latency in respect of performance and transparency in respect of integrateability. To meet the latency and transparency requirements, a unique, data-driven, Application Interface Services (APIS) layer is proposed to provide such an abstract interface between the communications infrastructure and the real-time application user. This approach allows the exclusion of the session, presentation and application layers of the ISO model as these are known to impact negatively on the performance of the protocol stack.

Where the specific system solution is inappropriate for a particular application, a methodology for the derivation of an optimal solution is proposed.

An implementation of APIS is designed, developed, tested and integrated using an evolutionary prototyping methodology in an experimental testbed.

End-to-end latency of the complete protocol stack, including APIS, is measured with worst-case, guaranteed values being in the order of 3 milliseconds (under 50% network load conditions) which approaches the lower bound of currently available technology. Analysis of a typical real-time, mission-critical, distributed system shows that such latencies are too large to meet the requirements of certain real-time systems, but that recovery of timeliness is possible by means of accurate synchronisation of distributed processes and timestamping of inter-process data entities. In order to support synchronisation and timestamping, the provision of Network Time Services is proposed. Such Network Time Services are adapted for an FDDI LAN from a public domain Network Time Protocol. It is proved both analytically and experimentally that NTS is capable of providing inter-process synchronisation via the communication infrastructure to an accuracy of better than 200 microseconds under normal conditions and better than 250 microseconds in the presence of one LAN fault. It is contended that such synchronisation accuracy suffices for the class of real-time, mission-critical, distributed systems under consideration.
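As an illustration of the kind of calculation on which a Network Time Protocol style service is based, the following minimal sketch computes a clock offset and round-trip delay from the four timestamps of a single request/reply exchange. It uses the generic, publicly documented NTP on-wire formulae and is not a description of the NTS implementation in this thesis; the function name, the microsecond units and the example values are assumptions made for illustration only.

```python
def offset_and_delay(t0, t1, t2, t3):
    """Estimate clock offset and round-trip delay from one exchange.

    t0: request sent      (client clock)
    t1: request received  (server clock)
    t2: reply sent        (server clock)
    t3: reply received    (client clock)
    All values are in the same unit (microseconds assumed here).
    """
    # Offset of the server clock relative to the client clock, assuming
    # the outbound and return path delays are symmetric.
    offset = ((t1 - t0) + (t2 - t3)) / 2
    # Time spent on the network, excluding server processing time.
    delay = (t3 - t0) - (t2 - t1)
    return offset, delay


# Hypothetical exchange: the server clock is 150 us ahead of the client,
# each one-way transfer takes 1000 us, and the server takes 200 us to reply.
result = offset_and_delay(0, 1150, 1350, 2200)
```

The estimate is exact only when the two path delays are equal; any asymmetry appears directly as offset error, which is one reason the bounded, predictable access delays of a token-passing LAN are attractive for this purpose.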

It is also concluded that significant inter-process latencies arise outside of the communications infrastructure, i.e. due to latencies inherent within the parallel backplane buses of multiprocessor systems. It is further proposed that the Network Time Services can be extended to operate between such multiple processors and that a derivative of the Network Time Protocol is appropriate to provide such extended timing functionality.

To enhance communications reliability, it is proposed that an attribute of precedence is allocated to each dataflow which describes its functional, rather than time, criticality. A user-managed, error control scheme can be formulated using the attribute of precedence as well as the protocol-provided attribute of priority.


1.4 Document Overview

The thesis is divided into a main section and a set of appendices. The main section consolidates the overall requirements and characteristics of real-time data communication protocols and describes these in terms of a proposed paradigm for the implementation of a real-time, mission-critical, distributed system. The main section also provides overall conclusions resulting from the research.

The appendices provide detailed analyses of each particular aspect of the important issues involved in the implementation of real-time, mission-critical, distributed systems. Each appendix also provides intermediate conclusions and recommendations resulting from that particular aspect of research. The implications of these are carried through to the main section of the thesis. The appendices are intended to be as self-standing as possible within their specific context, i.e. they are meaningful beyond the scope of the main section of the thesis.

Chapter 2 of the main section describes the evolution of the distributed control system and provides an overview and rationale of the state-of-the-art in real-time, mission-critical, distributed systems. An analysis of current and anticipated trends in the field is also provided.

Chapter 3 proceeds to provide the important contextual definitions applicable to the subject matter as well as the implications of these definitions for data communications.

Chapter 4 provides an overview of the typical requirements and architectures of specific examples of real-time, mission-critical, distributed systems from the areas of industrial process control, aerospace, military command and control systems and multimedia information systems.

Based on the problem space and relevant definitions given in Chapters 2 and 3, as well as the application requirements provided by Chapter 4, Chapter 5 proceeds with a distillation of the generic requirements of real-time, mission-critical, distributed systems. The common requirements in terms of data transfer are then determined and characterised, specifically with regard to dataflow determinism, precedence, priority and throughput. The derived requirements of dataflow control, synchronisation, multicast capability, addressing mechanisms and network management are then determined.

With Chapter 5 providing the system requirements as input, Chapter 6 describes the proposed system architecture in terms of a LAN Profile concluded to be optimal for real-time, mission-critical, distributed systems. At each layer of the Real-Time LAN Profile, optimal protocols are proposed to formulate an integrated, but flexible networked system. Wherever appropriate, standard components are proposed as candidates for the layers of the Real-Time LAN Profile. These are described in terms of their most important characteristics, with more comprehensive descriptions being provided in the appendices along with the rationale for their choice. Where deficiencies are identified, unique extended profile services are proposed to meet the requirements of the class of real-time, mission-critical, distributed system under consideration. In particular, an Application Interface Services (APIS) layer is proposed and described in terms of a set of simple service calls. A set of auxiliary time services, the Network Time Services (NTS), is also proposed in order to nullify the effects of the latencies inherent in standard network technologies. Finally, as comprehensive management will be required to achieve flexibility, availability and security of the complex network sub-system, an extended profile Network Management Services (NMS) is proposed and described.

Based on the proposed system architecture developed in Chapter 6, Chapter 7 addresses issues applicable to the actual implementation of a typical system. Architecture models, topology and interconnectivity are discussed, while more prosaic issues such as physical network interfacing and connectivity, as well as the implications of supporting continuous media, are addressed.

Chapter 8 describes development of the extended profile services as well as performance characterisation thereof and integration of these with the standard components into a complete system. Such development and integration is undertaken using rapid and evolutionary prototyping. Chapter 8 also describes an experimental testbed, i.e. the Architecture Demonstration Model (ADM), used to verify the validity of the proposed solution, as well as to test concepts and support performance measurements. Where applicable, detailed results of these investigations are provided in the appendices, while an overview of the ADM, a summary of important test results, as well as some conclusions and recommendations in this particular context, are provided in the main section.

Chapter 9 concludes the main section of the thesis with a summary of the main findings as well as some general conclusions in the context of the study.

To identify optimal candidates for each of the lower four layers, detailed analyses of the characteristics of various standard networking protocols at each of these protocol layers, as well as their suitability for real-time control, are addressed in Appendices A to D.

Detailed requirements and attributes of extended profile protocols and their suitability for real-time control are addressed in Appendices E and F, while appropriate LAN Profiles are analysed in Appendix G.

In the context of real-time, mission-critical, distributed systems employing local area networks, of particular significance is the issue concerning latency of the critical sensor/actuator sub-systems. In this regard, an analysis is undertaken in Appendix H where the system errors due to the distributed nature of real-time systems (i.e. errors due to network latency) are quantified in relation to other system errors (i.e. spatial and quantisation errors). It is shown that even with protocol performance optimisation, the typical latencies that can be achieved using standard networking technologies are too high to meet the requirements of some of the applications under consideration. It is therefore concluded that other methods have to be employed to recover the timeliness of dataflows between distributed sub-systems.
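The relationship between latency error and quantisation error, and the way in which a timestamp can be used to recover timeliness, can be sketched as follows. This is an illustrative first-order model only and not the analysis of Appendix H; the function names, the constant-rate assumption and the numerical values are all hypothetical.

```python
def latency_error(rate_of_change, latency_s):
    """Error caused by using a value that is latency_s seconds old,
    assuming the quantity changes at a constant rate (units per second)."""
    return abs(rate_of_change) * latency_s


def quantisation_error(full_scale, bits):
    """Worst-case error of an ideal converter: half of one quantisation step."""
    return full_scale / (2 ** bits) / 2


def extrapolate(value, rate_of_change, timestamp_s, now_s):
    """Recover timeliness: project a timestamped sample forward to the
    present using the synchronised clock (first-order extrapolation)."""
    return value + rate_of_change * (now_s - timestamp_s)


# Hypothetical bearing sensor: 360 degree range, 12-bit resolution,
# target bearing changing at 100 degrees per second, 3 ms network latency.
err_latency = latency_error(100.0, 0.003)
err_quant = quantisation_error(360.0, 12)
```

With these assumed figures the latency error (about 0.3 degrees) is several times larger than the quantisation error (about 0.044 degrees), which illustrates why timestamping and extrapolation at the receiver can be more effective than further protocol optimisation.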

A systemic method for data interface management is described in Appendix I while Appendix J provides Recommended Standards and Products as identified by the study.


Chapter 2

Evolution of Distributed Control Systems


2. Evolution of Distributed Control Systems

This chapter provides an overview of the art and science of Distributed Control Systems. It does so by means of a review of the history and current state-of-the-art of this subject, while also proposing some likely trends for the future. It thereby establishes a rationale for the technologies and architectures that are proposed in the following chapters.

2.1 Introduction

Historically, plant automation was the first field for the application of real-time digital control. This came about due to the substantial cost benefits that could be gained from such automation. Certain advanced processes were also too complex for humans and simple machines to control.

Initially, process plants operated without automatic control; they relied on open loop or man-in-the-loop control. To effect such control, operators had to be stationed in the vicinity of the points of control. With the refinement of plant instrumentation and the development of remote automatic controllers, high-level plant control migrated to centralised monitoring facilities. Later, the monitoring facilities gave way to centralised control computers, initially analogue then digital, with a period where hybrid computers also found some application. These computers implement automatic control by means of feedback control whereby actual process conditions, measured by remote sensors, are compared to required conditions (e.g. setup conditions within the computer or positioning requirements determined by other sensors). Amplified difference signals are then fed back to remote actuators via the connecting media to force conditions to those that are required.
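The feedback principle described above can be sketched as a simple proportional control loop. This is a minimal illustration only: the function names, the gain value and the plant model (a process whose value changes by exactly the amount of the actuator drive in each cycle) are assumptions and do not represent any particular controller.

```python
def actuator_drive(required, measured, gain):
    """Amplified difference between the required and measured conditions."""
    return gain * (required - measured)


def run_loop(required, initial, gain, cycles):
    """Repeatedly measure, compare and drive a hypothetical process."""
    value = initial
    history = []
    for _ in range(cycles):
        drive = actuator_drive(required, value, gain)
        # Assumed plant: the process value shifts by the drive each cycle.
        value += drive
        history.append(value)
    return history


# Required condition 100.0, starting from 20.0, with a gain of 0.5:
# the remaining error halves on every cycle.
history = run_loop(100.0, 20.0, 0.5, 4)
```

In a distributed implementation the measurement and the drive signal each cross the connecting medium, so any transport delay adds directly to the loop delay and reduces the stability margin of the control loop.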

The central computers had to be fairly capable to control sizeable plants and were therefore expensive. They also constituted single points of failure in the system which was unacceptable for mission and safety-critical applications. This, together with technology advances reducing the cost of computers, paved the way for distributed computer control.

In distributed control systems a proportion of data processing functionality is allocated to nodes remote from the central point of control. If a central point of control exists, control processing can be shared amongst a set of collaborating computers, either sharing the processing load, or in replicated configurations where redundant devices resume processing in case of failure of the primary device.

2.2 History

Initially the plant sensors were provided with no intelligence; they merely measured process values such as temperature, flow or position and transmitted these values to the central computer by means of analogue signals. These analogue signals were transported by pneumatic, hydraulic or electrical media.


2.2.1 Pneumatic Systems

Up until the early 1960s pneumatic systems were used to convey logic as well as analogue signals in process plants. Typically these systems employed 3 to 15 psi compressed air as the transport medium. The compressor equipment was bulky, required substantial amounts of power and was acoustically noisy.

Observation

Such pneumatic systems, while they may have exhibited a high degree of electromagnetic compatibility, suffered from leaks, punctures and the unreliability of compressor equipment. Due to pneumatic noise caused by pressure problems, the resolution of analogue pneumatic signals is severely limited. Also, due to the intrinsic inertia of the transport medium (air), the bandwidth of a digital pneumatic transmission channel is limited to a few hertz.

2.2.2 Hydraulic Systems

Rather than compressed air, hydraulic systems convey logic signals by means of pressurised fluids. In the 1960s and 70s such systems also found extensive application in process control plants and even in avionic control systems.

Observation

Like pneumatic systems, hydraulic systems suffered from many problems. In particular, leaks and punctures could have especially severe consequences because of the damaging effects of the escaping fluid.

2.2.3 Current Loop Systems

In the mid-1960s pneumatic and hydraulic systems began to be replaced by electrical data transmission systems. Such systems were based on the 4-20 mA current loop standard, whereby the value of the measured quantity is transmitted in the form of an analogue current. 4-20 mA current loop systems are still in use today, as are related families of 1-5 V, 0-5 V, -5 to +5 V and 0-10 V analogue sensors.
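The way in which a 4-20 mA loop current represents a measured quantity can be sketched as a linear scaling, with 4 mA corresponding to the bottom of the measurement range and 20 mA to the top. The function name, the range values and the treatment of out-of-range currents as faults are assumptions made for illustration.

```python
def current_to_value(current_ma, range_min, range_max):
    """Convert a 4-20 mA loop current to an engineering value.

    4 mA maps to range_min and 20 mA maps to range_max. A current
    outside 4-20 mA is treated here as a fault (for example a broken
    loop reads 0 mA), which the 4 mA "live zero" makes detectable.
    """
    if not 4.0 <= current_ma <= 20.0:
        raise ValueError(f"loop current {current_ma} mA outside 4-20 mA")
    fraction = (current_ma - 4.0) / 16.0
    return range_min + fraction * (range_max - range_min)


# Hypothetical temperature transmitter spanning 0 to 200 degrees Celsius:
# a mid-scale current of 12 mA corresponds to 100 degrees.
mid_scale = current_to_value(12.0, 0.0, 200.0)
```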

These current loop signals had to maintain their integrity in severe electromagnetic environments. Invariably it was impossible to adequately protect these signals from this electromagnetic interference with the result that they became severely contaminated by noise, thus negating the performance of high resolution transducers. This resulted in the degradation of overall system performance.

With the advent of electronics in the 1960s and 1970s and especially VLSI integrated circuits, it became feasible to convert the transducer analogue outputs into digital format by means of A/D converters. However, with the analogue 4-20 mA standard prevailing at the time, it was commonplace to reconvert the digital signal to analogue for transmission and then back to digital on reception.


In nearly all applications the potential advantages that could have resulted from this development were entirely negated by the persistence of analogue 4-20 mA as a means of transmission. A/Ds fed into D/As and vice versa, and the repeated requantisation of signals led to systems with practical dynamic ranges even less than some of their predecessors.

Later standards emerged to digitally modulate the 4-20 mA signals to carry the digitally sampled transducer signals directly to the digital central control computer.

Observation

With some levels of digital processing now occurring remotely from the central computer, i.e. at the plant interface itself, distributed control was born.

Initially local processing was modest, typically providing sensor identification, bidirectional data flow and auxiliary measurements, such as temperature at the sensor head, for compensation by the central controller.

Some technical limitations still remained. Although most systems used existing cabling, each sensor required its own cable to the central measurement or data logging device. Sensors could not be multi-dropped, so there were no cabling cost savings over conventional 4-20 mA data transmission. Furthermore, the divorce from analogue transmission was not complete, and the various hybrid modulation schemes suffered from RFI problems and relatively low data rates, which significantly limited information throughput.

Observation

As process control began its migration to the periphery of the system, the complexity of the interconnect cabling and the achievement of electromagnetic compatibility became critical factors affecting system architecture and design. Overall data throughput of the media and I/O controllers also became a major design factor.

2.2.4 More Complex Systems

Military command and control systems are unusual in that they are normally required to exhibit the highest levels of capability throughout their operational lives. Cost is normally a secondary consideration and therefore such systems often represent the state-of-the-art in terms of the application of technology in order to meet operational requirements. Naval combat systems in particular are a good representation of this state-of-the-art and the evolution of such systems provides a good example of the evolution of general real-time, mission-critical, distributed systems.

Observation

Certain systems need to meet operational requirements, including those which safeguard our liberty. For these systems, performance factors generally outweigh cost factors.

MOD(N) recounts the origin of Royal Navy combat systems in their document Principles of Combat System Highway Engineering on the Type 23 Frigate[130] :

"Virtually all major warships at sea today carry some form of central computer system to enable the total combat system to maximise its fighting capabilities. In 1963 the HMS Eagle was the first ship in the Royal Navy to be fitted with a digital computer. The computer known as Poseidon was used mainly for tracking and control of aircraft which was carried out in association with onboard sensors."

Observation

When the digital computer was first introduced into Royal Navy ships, the current hardware and software technology of the time led to the system configuration taking the form of a central digital computer interfaced through mainly analogue signals directly to the sensors, weapons and displays. At the time, these central computers were powerful, yet expensive.

These computers were part of a largely analogue combat system and were the hub of a star-connected network which was built with many different standards applied to the individual point-to-point links.

Although these warships have acquitted themselves in real combat situations, their performance has not always been acceptable; the self and consort defence record of Royal Navy frigates in the Falklands War is a case in point. The loss of HMS Sheffield and the Atlantic Conveyor to Argentine anti-ship missiles is a vivid example of such failure to perform adequately.

Centralised digital computer architectures suffer from a number of disadvantages :

• The systems are difficult and expensive to modify or enhance with new weapon or sensor systems because the interfaces are so different and varied.

• There is very limited scope for any form of fallback mode. Once the central computer system has become non-operational, the combat system ceases to function except in some very limited manual modes.

• The central computer system has to carry out all the electronic interface conversion functions, which is time-consuming. Such functions also require considerable hardware which is expensive, consumes power and generates heat.

• The central computer system has to convert all the various types of data into a standard format that is suitable for use within itself and then re-convert this data prior to re-transmission to other systems.

However, in the new generation of warships, this type of centralised architecture is becoming obsolete. New design philosophies and the availability of powerful mini and micro-computers have encouraged weapon and sensor manufacturers to build much more processing power into their own equipments. This has the advantage that a large amount of processing that was originally carried out in the central computer is now carried out in the individual weapon or sensor, thus allowing these main computers to carry out their true tasks associated with command and control.

Observation

This distribution of processing power and its associated redistribution of the many tasks that were carried out by the central computer system indicated clearly the need to introduce some form of digital inter-computer communication system into the total combat system.

The US Navy Aegis naval combat system was one of the first systems to apply automated control, firstly with a centralised architecture and later with a distributed control architecture.

Zitsman et al recount the origins of the Aegis combat system architecture in their paper Computer System Architecture Concepts for Future Combat Systems[148] :

"The Aegis computer system architecture is the result of an evolutionary process that was initiated decades ago. “Modern” combat systems began with the introduction of the Talos, Terrier, and Tartar (3T) systems in the 1950s and 1960 in an anti-air warfare (AAW) role, initially to counter the threat of the kamikaze and ultimately, the anti-ship cruise missile (ASCM)."

Observation

In order to meet mission-critical performance requirements, system architectures were derived and optimised to support a critical operational mission.

Zitsman et al also make the following observation :

"The performance of the combat system depends heavily on the integrity and performance of its computer system interconnections."

Currently, interconnection of Aegis computer elements is accomplished via dedicated point-to-point copper wire cables. These interconnections are NTDS (Naval Tactical Data System) channels. Data paths in the cables are unidirectional; two cables are needed for duplex transmission. For system reliability reasons, all data channels are redundant. Consequently, for each point-to-point connection between computers, four cables are required: two for duplex transmission and two to provide for redundancy. Point-to-point channels provide clearly defined system delays and, on an individual channel basis, are easily tested for timing and accurate data transmission.
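
The cabling arithmetic implied above can be sketched as follows. The figure of four cables per connection is taken from the text; the two topologies compared (a star around one central computer, and a full mesh in which every computer is connected to every other) and all function names are assumptions introduced purely to show how quickly the cable count grows.

```python
CABLES_PER_LINK = 4  # two for duplex transmission, two for redundancy

def star_links(computers):
    """Links needed when every computer connects only to one central computer."""
    return max(computers - 1, 0)

def full_mesh_links(computers):
    """Links needed when every computer connects directly to every other."""
    return computers * (computers - 1) // 2

def cables_required(links):
    """Total cables for a given number of point-to-point connections."""
    return links * CABLES_PER_LINK

# Hypothetical system of ten computers.
star_cables = cables_required(star_links(10))       # 9 links
mesh_cables = cables_required(full_mesh_links(10))  # 45 links
```

The quadratic growth of the fully connected case is one way of quantifying the wiring problem that the following quotation describes.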

Simoncic et al observe in their paper Shipnet: A Real-Time Local Area Network for Ships[117] :

"Historically, electronic communication aboard ships has been accomplished using point-to-point wiring. Due to the recent rapid increase in the number and type of shipboard electronic devices, wiring aboard a ship has now become a logistics nightmare."

Observation

The centralised architecture using point-to-point data communications infrastructure proved to be a major inhibiting factor for further expansion of the Aegis combat suite, thus rendering it unable to meet enhanced operational requirements.

Defensive engagement times on a modern surface combatant are in the order of seconds[44]. A large number of iterative and collaborative processes must be completed before engagement can commence and be sustained. Response times are beyond the capabilities of human operators, hence automation of these tasks is a necessity. There is also no time for manual intervention in the case of equipment failure. The defensive systems need to be completely dependable. Such dependability can only be achieved by tolerance of equipment faults. Such fault-tolerance is normally best achieved by use of replication and reconfiguration. Communications channels are definite candidates for redundancy. Other critical systems are also normally replicated, such replication having specific implications on the functionality of the communication system.

Zitsman et al sum up the requirements of naval combat systems :

"Fault-tolerance and reconfigurability are crucial factors in the performance of a combat system."

The solution to facilitating these divergent requirements was the employment of the local area network.

2.2.5 Local Area Networks

MOD(N) continues :

"This increased processing capability of the sensors and weapons that are being developed today for the first Type 23 Frigate enabled the Chief Naval Weapons System Engineer (CNWSE) to make a very subtle, fundamental change in the way in which the total combat system is connected. In October 1982 it was decided that the sensors, weapons and computer assisted command system (CACS4) that constitutes the Type 23 combat system would be interconnected by means of a LAN."

Observation

This apparently simple decision has caused a major re-assessment of the methods, mechanisms and techniques used for the design, production, integration and testing of the combat system and its individual systems.

MOD(N) continues :

"The new Type 23 Frigate has utilised the experience gained in all these systems (i.e. those evolved from the Poseidon) and has also taken one more major step forward, it has introduced a Local Area Network into the total combat system. This is considered to be one of the most significant changes in combat system architecture since the introduction of the Poseidon computer in HMS Eagle in those early days."

Observation

With digital processing occurring remotely from the central computer, more sophisticated interconnect technologies were required, both to transmit the data reliably and to interconnect multiple remote devices. As these process control plants operated in real-time, albeit with modest performance initially, the requirement arose for real-time communication protocols on the one hand and for multi-access interconnect technologies such as local area networks (LANs) on the other. In process control applications, such LANs, being intimately involved with plant sensors and actuators and optimised for that purpose, were termed Fieldbuses. Fieldbuses are often associated with Supervisory Control and Data Acquisition (SCADA) systems.

Hence LANs came to be a feature of distributed systems and started to exert their influence over modern control system design and implementation.

2.2.6 Implications of LANs

One of the early applications of LANs for real-time control was military combat systems, initially for tactical combat system control and in time, for strategic system command and control. Introduction of LAN topologies into combat system architectures had technical as well as acquisition implications.

MOD(N) summarises the implications of the use of LANs :

"In the perfect world a combat system would be designed as a total system to meet a very precise requirement. This requirement would define what the combat system should be capable of achieving and be directly related to the fighting characteristics that the ship must be able to exhibit. The design of individual member systems would thus emerge from the total combat system design process.

In the real world this is not how combat systems are designed. Industry in general has expertise in particular fields such as radar, sonar or missile systems and there is a tendency to update current systems or design and build new systems in the same fields anticipating that a system will be selected for inclusion in the next or even the current generation of warship.

The introduction of the LAN highlighted this problem very clearly and provided a unique opportunity to introduce standards that cover much more of the integration problem than just observing the correct electrical connections and correct signal levels as has been the case in the past."

Observation

The result of using this type of logical rather than physical interface is that many totally different systems were capable of exhibiting a large number of common software functions. This resulted in substantial cost and complexity reductions.

Apart from achieving connectivity within a system, a LAN is capable of greatly simplifying the normally onerous task of the logical integration of numerous and diverse sub-systems into a coherent system. However, this requires a system-wide approach and management effort.

MOD(N) continues :

"There is a need to introduce into MOD(N) a Data Manager with suitably qualified support personnel to control the data management aspects of the combat system design and integration."

Observation

While managing the data interfaces of a point-to-point system is an onerous responsibility, managing these in a LAN environment requires special considerations. While the task can be simplified by the homogeneity of the data interfaces, there can be a tendency for sub-system designers to view the LAN as an infinite resource, both in terms of bandwidth and performance. In reality, these resources are limited and therefore have to be carefully managed by the system design authority.

MOD(N) summarises the advantages of the Combat System Highway :

"(It provides a) controlled, well documented approach to interface design and specification production resulting in many errors being found early in the design process."

"It will be relatively straightforward to replace or update the LAN."

"It will be easier to add new member systems to the CSH."

"For the first time in a combat system total data flows will be fully documented."

"Member system interaction will be known and be more understandable."

"There will be a reduction in ship's cables."

Observation

Practical experience with real-life, real-time, mission-critical, distributed systems has proven the extensive benefits of LANs.

J.H. Fraser makes the following observations in his paper Design Techniques Used to Enhance the Combat System Effectiveness of the T23 Frigate[74] :

"As with all large projects, the T23 Frigate suffered from the appearance of anomalies in the grey areas between the boundaries of responsibilities; I will refer to these anomalies as black holes. Sources of black holes are incomplete definition of an individual's scope of responsibility or incomplete specification of a requirement. However the most important source is invariably failure to communicate."

Observation

While Fraser's observations are made in a wider context than that of data communications, they reinforce the requirement for robust communication protocols and interface control specifications which precisely specify the interaction between sub-systems connected via a network.

Fraser also makes the following specific observation :

"The widespread use of fibre optic data cable would have resolved many T23 EMC and space problems."

Observation

The use of optical fibre networks, or the lack thereof, has far-reaching implications for the design, reliability and effectiveness of real-time, mission-critical, distributed systems.

2.3 Real-Time Communication

Communications systems require a number of unique attributes and capabilities in order to qualify them to support real-time systems.

While LANs offered enhanced flexibility in terms of reconfigurability, upgradeability, etc., they spawned a new set of requirements. These were dependable communication protocols, capable of operating in real-time over multi-access media in a deterministic manner.

2.3.1 Determinism

Real-time systems are required to exhibit deterministic behaviour. As Stankovic observes in his paper Real-Time Communication Systems: The Next Generation[123] :

"In the design of distributed real-time systems, there is a need for communication protocols that provide for deterministic behaviour of the communicating components."

Observation

The intrinsic characteristics of LAN technologies are critical in their suitability to support real-time systems. Certain data transfer protocols are inherently unable to support deterministic responses. For example, media access protocols such as Carrier Sense Multiple Access with Collision Detection (CSMA/CD) do not offer deterministic behaviour. It is contended, therefore, that, in general, such protocols and their associated network technologies, such as Ethernet, are not appropriate for real-time, mission-critical, distributed systems.
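
The non-deterministic nature of CSMA/CD can be illustrated by its collision recovery rule. The sketch below follows the truncated binary exponential backoff of the IEEE 802.3 family: after the n-th successive collision a station waits a random number of slot times between 0 and 2^min(n, 10) - 1, and abandons the frame after 16 attempts. The 51.2 microsecond slot time is the 10 Mbit/s value; the function names are illustrative assumptions.

```python
import random

SLOT_TIME_US = 51.2   # slot time at 10 Mbit/s
MAX_ATTEMPTS = 16     # the frame is discarded after the 16th failed attempt
BACKOFF_LIMIT = 10    # the exponent stops growing after 10 collisions

def max_backoff_slots(collision_count):
    """Largest backoff, in slot times, permitted after the n-th collision."""
    return 2 ** min(collision_count, BACKOFF_LIMIT) - 1

def backoff_slots(collision_count, rng):
    """Randomly chosen backoff, in slot times, after the n-th collision."""
    return rng.randrange(max_backoff_slots(collision_count) + 1)

def worst_case_total_backoff_slots():
    """Sum of the largest backoffs over collisions 1 to 15.

    A 16th collision causes the frame to be discarded, so even this
    figure is not a bound on delivery time: delivery is not guaranteed.
    """
    return sum(max_backoff_slots(n) for n in range(1, MAX_ATTEMPTS))

worst_case_us = worst_case_total_backoff_slots() * SLOT_TIME_US
delay = backoff_slots(3, random.Random(1))  # somewhere between 0 and 7 slots
```

The worst-case backoff alone amounts to 7151 slot times, after which the frame may still be lost. Both the randomness of each delay and the possibility of discard are what make the access delay unbounded in the sense required by hard real-time systems.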

Deterministic behaviour is achievable by the employment of controlled media access. Such controlled access may be centralised or distributed. Centralised schemes suffer from single points of failure and are therefore generally not suitable for mission-critical systems. Distributed access schemes fall into two main categories, collision avoidance and token passing. The fundamental nature of the former limits the geographic range of the communication medium, leaving token passing as the optimum media access method for other than small systems.

Deterministic behaviour is required not only from the communication protocols, but also the logic controlling the system. This logic is implemented by software running under real-time operating systems which also control the scheduling of the communications processes. As Stankovic and Ramamritham observe in their Tutorial - Hard Real-Time Systems[122] :

"The objective of real-time computing is to meet the individual timing requirement of each of the tasks. Rather than being fast (which is a relative term anyway), the most important property that the real-time system should have is predictability, i.e. its functional and timing behaviour should be as deterministic as is necessary to satisfy system specifications."

Observation

Operating systems supporting the class of applications under consideration are required to provide hard real-time performance, i.e. exhibit bounded and guaranteed worst-case task responses.

For real-time operating systems, this translates into guaranteeing worst-case context switch and interrupt latency times. Traditionally, operating systems have fallen into two main categories: general-purpose, capable of handling multiple processes and events, but not in real-time, or special-purpose, capable of real-time response, but proprietary and incapable of multiprocessing and scalability.

It is a major technological challenge to build real-time operating systems capable of hard real-time performance and open systems functionality.

2.3.2 Latency

Critically real-time systems require low latency. As Stankovic points out :

"In particular, this requires protocols that result in bounded message communication delays."

Observation

Media access layer protocols must offer low latency by employing efficient media access control schemes. These fall into two main categories: token bus and token ring schemes. Token bus implementations are limited in that only a single packet of data can be transmitted at each token access opportunity and in that there are significant time delays following bus faults and reconfiguration[77]. This leaves token ring as the optimal media access scheme. Token ring schemes fall into two categories of implementation, i.e. a simple token passing scheme such as that employed by IBM Token Ring[9] and a timed-token protocol as employed by FDDI[31]. The latter exhibits superior timing performance (refer Appendix B) in respect of both latency and jitter, as well as efficient bandwidth utilisation, and is therefore contended to be the optimal media access scheme.
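
The bounded behaviour of a timed-token protocol can be sketched using the property commonly cited in the literature: provided the synchronous allocations plus ring overheads do not exceed the Target Token Rotation Time (TTRT), the time between successive token arrivals at a station never exceeds twice the TTRT. The sketch below is a simplified illustration under that assumption and is not the analysis of Appendix B; all function and parameter names are hypothetical.

```python
def allocation_is_feasible(sync_allocations_ms, ring_overhead_ms, ttrt_ms):
    """Check that synchronous bandwidth allocations fit within the TTRT."""
    return sum(sync_allocations_ms) + ring_overhead_ms <= ttrt_ms

def worst_case_token_rotation_ms(ttrt_ms):
    """Upper bound on the interval between token arrivals at a station."""
    return 2.0 * ttrt_ms

def largest_ttrt_for_deadline_ms(deadline_ms):
    """Largest TTRT for which a token is still guaranteed within the deadline."""
    return deadline_ms / 2.0

# Hypothetical ring: three stations with synchronous allocations, 8 ms TTRT.
feasible = allocation_is_feasible([2.0, 1.5, 3.0], 1.0, 8.0)
bound_ms = worst_case_token_rotation_ms(8.0)
```

Unlike the random backoff of a contention protocol, the bound here is fixed at configuration time, which is the sense in which the access scheme is deterministic.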

Transport and network layer protocols must also offer low latency by allowing fast protocol processing, both within end nodes as well as at intermediate nodes. Such fast protocol processing can only be achieved by optimal design of the protocol in terms of algorithmic simplicity as well as parameter positioning and alignment[125, 137]. On this issue Stankovic and Ramamritham observe :

"The communication media for next generation distributed real-time systems will be required to form the backbone upon which predictable, stable, extensible system solutions will be built. To be successful the real-time communication subsystem must be able to predictably satisfy individual message timing requirements. The timing requirements are driven not only by applications' inter-process communication, but also by time-constrained operating system functions invoked on behalf of application processes."

Observation

The scheduling and timing of data transfer services and application process tasks will normally be under the control of a real-time operating system which controls the resources of the entire system. Therefore these will be closely-coupled issues that require specific consideration in system design.

Stankovic and Ramamritham continue :

"In a nonreal-time setting, it is sufficient to verify the logical correctness of a communications solution; however in a real-time setting it is also necessary to verify timing correctness. Software engineering practices have helped in determining the logical correctness of system solutions, but have not addressed the timing correctness. Timing correctness includes insuring the schedulability of synchronous and sporadic messages as well as insuring that the response time requirements of asynchronous messages are met. Insuring the timing correctness for static real-time communications systems using current techniques is difficult. Insuring the timing correctness in the next generation's dynamic environment will be a substantial research challenge."

Observation

Real-time computer systems can be employed to solve a wide variety of complex control problems. Applications range from the control of all the war-fighting resources of a naval ship, to the aerodynamic control of an aircraft, to the automatic braking of a motor car. While these can all be classified as hard real-time applications, they differ fundamentally in their external environments. In the latter two cases, the environment can be modelled to cater for all eventualities within the system's operating envelope. In the former case, the environment is completely random. Different approaches are appropriate in applying real-time solutions to these problems[93].

The requirements of determinism and low latency determine the characteristics of the media access control layer.

2.3.3 System Responsiveness

Where the external environment can be reasonably modelled, a time-triggered approach is best suited as the real-time control strategy. A time-triggered system is one that reacts to significant external events at pre-specified instants.

Where the external environment is essentially random, an event-triggered approach is best suited as the real-time control strategy. An event-triggered system is one that reacts to significant external events directly and immediately.

Time-triggered systems are simpler to test and prove for correctness. They are appropriate for small closed-loop systems controlling repetitive processes in static environments. Specific real-time transport protocols, such as the Time-Triggered Protocol[93], have been especially developed to support these types of systems. Typical target applications include fly-by-wire control, automotive control and small-scale process control. However, unless pre-allocation of bandwidth is made, time-triggered systems cannot cope with overload[94]. Overload in real-time systems typically results from event showers caused by a critical failure in a nuclear power plant, damage control alarms following a missile strike in a naval ship or a saturation attack in an air defence system. Such pre-allocation of bandwidth is wasteful of resources and severely limits flexibility and upgradeability.

Event-triggered systems are better suited to handle aperiodic or sporadic events. Their behaviour is therefore not rigid and although they are entirely suitable for supporting a certain class of hard, real-time systems, they can be considered to be best-effort systems. In this regard, they are able to deal with external events immediately and can handle further simultaneous unprioritised events with decreasing attention. They do not suddenly reach saturation and then collapse, in other words their performance degrades proportionately with increasing stress. By proper engineering, such degradation can be made to be graceful, i.e. events can be prioritised with the system handling highest priority events first and lowest last (or even discarding the latter completely).

It is contended that event-triggered, real-time systems are the general case and this study will concentrate on the real-time protocol strategies to meet the requirements of such systems.

Observation

In real-time systems, especially where the external environment cannot be precisely modelled, the concept and employment of a degradation manager, implemented by system logic, is useful to enhance system dependability under conditions of overload.
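
A degradation manager of the kind envisaged above might, in its simplest form, behave as in the following sketch. It assumes that in each processing cycle the system can handle only a fixed number of events, handles them in priority order and sheds the remainder. The function name, the event names and the convention that a smaller number means a higher priority are all hypothetical.

```python
def degrade_gracefully(events, capacity):
    """Split pending events into those handled this cycle and those shed.

    `events` is a list of (priority, name) pairs; a smaller priority number
    is more important. Events of equal priority keep their arrival order.
    """
    ordered = sorted(events, key=lambda event: event[0])
    handled = [name for _, name in ordered[:capacity]]
    shed = [name for _, name in ordered[capacity:]]
    return handled, shed

# Hypothetical event shower with capacity for only three events per cycle.
pending = [(3, "status report"), (1, "missile alert"), (2, "fire alarm"),
           (3, "log entry"), (1, "flood alarm")]
handled, shed = degrade_gracefully(pending, 3)
```

As the shower grows, the number of shed events grows with it, but the most important events continue to be handled, which is the proportionate rather than abrupt degradation described in the text.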

2.3.4 Priority

The requirement for graceful system degradation by means of priority event handling gives rise to a communications requirement: priority information must be conveyed within the data messages transmitted within the distributed system. Thus the transfer services should provide a priority message service to the application user.

Traditional transport protocols such as TCP and TP4 offer only a simple priority service, whereas more modern protocols such as XTP offer a more sophisticated priority service. In these priority schemes, higher priority messages pre-empt the scheduling of lower priority messages in order that they are transmitted first. However, in all these cases the priority scheme is static, i.e. messages are allocated a priority that is retained until the message is consumed. A more flexible approach would be a dynamic priority allocation where the priority could change with the passage of time. In certain cases the priority could increase (e.g. where an aperiodic message with long deadline was approaching this deadline), or could decrease (e.g. where a periodic message was approaching a time when it was about to be superseded anyway). However, dynamic priority schemes are complex and difficult to implement.
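
A dynamic priority allocation of the kind outlined above could take many forms. The sketch below shows one deliberately simple possibility, in which the priority of an aperiodic message is boosted once most of its deadline has elapsed, and the priority of a periodic message is reduced once it is about to be superseded. The class, the field names, the boost value and the 20% threshold are all assumptions made for illustration and do not correspond to any of the protocols named.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Message:
    name: str
    base_priority: int                     # larger number = more urgent
    created: float                         # time of creation, in seconds
    deadline: Optional[float] = None       # aperiodic: latest useful delivery
    superseded_at: Optional[float] = None  # periodic: when the next sample is due

def dynamic_priority(msg, now, boost=5, threshold=0.2):
    """Priority at time `now`, adjusted as the message ages."""
    priority = msg.base_priority
    if msg.deadline is not None:
        lifetime = msg.deadline - msg.created
        if msg.deadline - now <= threshold * lifetime:
            priority += boost   # deadline approaching: raise urgency
    if msg.superseded_at is not None:
        lifetime = msg.superseded_at - msg.created
        if msg.superseded_at - now <= threshold * lifetime:
            priority -= boost   # about to be replaced: lower urgency
    return priority

alert = Message("alert", base_priority=3, created=0.0, deadline=1.0)
sample = Message("sample", base_priority=6, created=0.0, superseded_at=0.1)
```

Even this simple rule requires every queued message to be re-evaluated as time passes, which gives some indication of why dynamic schemes are harder to implement than static ones.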

2.3.5 Precedence

In mission-critical systems, of equal significance to time-related priority is the characteristic of functional criticality, i.e. the importance of a message in relation to system functionality. For example, an aperiodic message containing a "torpedo alert" from a sonar system has far greater significance than a high priority periodic digital video sample, even though the torpedo alert can wait for some hundreds of milliseconds without compromising the system's ability to deal with the threat. What is important is that the torpedo alert message must reach its destination whatever happens, while the digital video sample can be discarded even though this may corrupt an MPEG-compressed video frame sequence.

To cater for such functional criticality, it is proposed that an attribute of precedence is defined and that the real-time, mission-critical communication system offers a corresponding service.

Ideally, the transport protocol should offer such a precedence service. However, one of the fundamental proposals of this thesis is that the entire communication system be constructed from standard building blocks. No standard transport protocol features such a service. It is therefore proposed that user applications access such a service by means of an application layer protocol.
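
The distinction between priority and precedence can be made concrete with a small sketch of a bounded transmit queue, such as might sit behind an application layer service of this kind. Priority decides which message is sent next; precedence decides which message is sacrificed when the queue overflows. The class names, the numeric scales and the queue itself are hypothetical and are not part of any standard protocol.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Msg:
    name: str
    priority: int    # larger = should be transmitted sooner
    precedence: int  # larger = more important that it is never lost

class PrecedenceQueue:
    """Bounded queue: send order by priority, discard order by precedence."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.items = []

    def offer(self, msg):
        """Queue a message; return the message discarded to make room, if any."""
        self.items.append(msg)
        if len(self.items) <= self.capacity:
            return None
        victim = min(self.items, key=lambda m: m.precedence)
        self.items.remove(victim)
        return victim

    def next_to_send(self):
        """Remove and return the queued message with the highest priority."""
        msg = max(self.items, key=lambda m: m.priority)
        self.items.remove(msg)
        return msg

queue = PrecedenceQueue(capacity=2)
first_drop = queue.offer(Msg("video sample 1", priority=9, precedence=1))
second_drop = queue.offer(Msg("video sample 2", priority=9, precedence=1))
third_drop = queue.offer(Msg("torpedo alert", priority=5, precedence=9))
```

In this example the torpedo alert displaces a video sample when the queue overflows, yet the remaining video sample is still transmitted ahead of it, mirroring the example given in the text.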

2.4 Derived Requirements

With the local area network established as the most effective method of logical integration of real-time, mission-critical, distributed systems, techniques were sought in order to enhance performance, minimise integration risks and minimise the implications of upgrade.

MOD(N) recognised the requirement for a new approach in order to meet these objectives :

"The introduction of the CSH brought with it some new techniques which caused a different approach to be taken to the way in which the combat system design was considered.

The first change was that all data is available in broadcast mode where only a single transmission of a unit of data is needed to service all users of that data. The value of this change is fully maximised if some form of standard data approach is taken as in NES 1028 and NES 1026 which are described later.

The second change was the replacement of destination oriented point-to-point addressing of data by the use of broadcast messages with a Message Type Number (MTN) mechanism.

Data messages are now classified by source of origin and the 'type' of data they contain. This classification is reflected by the different MTNs that are appended to the message by the transmitting member system. Receiving member systems can then select required messages from the CSH by simply filtering on the MTN."

Observation

The above technique allows future integration and upgrading of sub-systems to be more easily achieved because the member system sending the data does not have to be aware of what member systems are receiving the data and therefore no hardware, or more importantly, no software changes are reflected back into the sending member system when a new member system is connected to the LAN.
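
The decoupling achieved by the MTN mechanism can be sketched as follows. The sender labels each message with an MTN and broadcasts it; each receiving member system filters on the MTNs it requires. The class names and MTN values are hypothetical and do not reflect the actual CSH message definitions.

```python
class MemberSystem:
    """A receiving member system that filters broadcast traffic by MTN."""

    def __init__(self, name, wanted_mtns):
        self.name = name
        self.wanted_mtns = set(wanted_mtns)
        self.inbox = []

    def receive(self, mtn, payload):
        if mtn in self.wanted_mtns:   # simple filter on Message Type Number
            self.inbox.append((mtn, payload))

class Highway:
    """Broadcast medium: every attached member sees every message."""

    def __init__(self):
        self.members = []

    def attach(self, member):
        self.members.append(member)

    def broadcast(self, mtn, payload):
        # The sender supplies only the MTN; it names no destination.
        for member in self.members:
            member.receive(mtn, payload)

TRACK_UPDATE, NAV_DATA = 101, 202   # hypothetical MTN values

highway = Highway()
command = MemberSystem("command system", [TRACK_UPDATE, NAV_DATA])
highway.attach(command)
highway.broadcast(TRACK_UPDATE, "track 7")

# A new member system is attached later; the sending code is unchanged.
display = MemberSystem("display", [TRACK_UPDATE])
highway.attach(display)
highway.broadcast(TRACK_UPDATE, "track 8")
highway.broadcast(NAV_DATA, "heading 090")
```

Note that attaching the second member system required no change to the broadcasting code, which is the property the observation above relies upon.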

2.4.1 Data-driven vs Address-driven Approach

The data-driven rather than address-driven approach is an effective method of functionally decoupling the system user-level applications from the detailed implementation of the network. This allows the network to be considered as a virtual backplane and enhances flexibility and upgradeability. Importantly, it also allows application users to transparently interface to the network without requiring detailed knowledge of the latter's technical implementation. Thus the application user can set data transmission policy without being involved in mechanism.

Observation

While the CSH's data-driven approach is commendable, it relies on the broadcast of messages. This has a number of serious negative implications. The first of these is that broadcast normally implies no error control. While error control could be performed, this would effectively imply that each consumer would have to acknowledge each message. This would be wasteful of bandwidth, have timing implications and require the producer to be aware of all consumers (otherwise it could not implement complete error control). Broadcast would also have a further, possibly more onerous, implication for critically real-time consumers; i.e. each consumer would have to respond to each and every message, most of which would not be destined for itself. Significant processing resources would be required to handle the interrupts accompanying each broadcast message.

Such a data-driven approach is much better implemented using a multicast capability supported by sophisticated multicast group management. This allows full multicast with varying degrees of error control being applied by the communication protocol and not by the application layer where such error control is contended to be inappropriate. Applications should not involve themselves in data transfer mechanism. Also, only consumers to which messages are destined are burdened with processing corresponding to the message interrupts.
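The processing burden argued above can be made concrete with a small counting sketch. The group names and consumer names are illustrative assumptions; each delivered message is taken to cost the receiving node one interrupt.

```python
from collections import Counter
from typing import Dict, Iterable, List, Set


def broadcast_load(message_groups: List[str], consumers: Iterable[str]) -> Counter:
    """Interrupts per consumer when every message is broadcast to all nodes."""
    consumers = list(consumers)
    load: Counter = Counter()
    for _ in message_groups:
        load.update(consumers)        # every consumer is interrupted every time
    return load


def multicast_load(message_groups: List[str],
                   membership: Dict[str, Set[str]]) -> Counter:
    """Interrupts per consumer when each message goes to its group only."""
    load: Counter = Counter()
    for group in message_groups:
        load.update(membership.get(group, set()))   # group members only
    return load
```

With three messages (two for a group "nav", one for a group "radar") and a consumer that belongs to neither group, the broadcast case charges that consumer three interrupts while the multicast case charges it none.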

2.4.2 Multicast

Apart from supporting a data-driven approach, there are many applications that inherently require multicast rather than unicast data distribution. Examples of such applications are distributed multiprocessing, distributed database management, digital telephony with conferencing facilities, video conferencing and advanced collaborative simulation.

Laniewski, in his thesis Multipoint Communication in Local Area Networks[95], was among the first to recognise the value of a multicast capability. He defines and develops a multipoint communication protocol as an extension to the IEEE 802.2 Logical Link Control sub-layer standard.

Laniewski describes his multipoint protocol :

"In the multipoint protocol, a source multicasts messages to all destinations in a group and collects acknowledgments from these destinations. It implements a sliding window and dynamic window size flow control. Reliability is achieved by a go-back-N scheme. The main goals for such a protocol are taken to be compatibility with an existing standard, simplicity and practical effectiveness."

Laniewski reports the performance of his multipoint protocol over equivalent point-to-point protocols :

"Modelling results show an improved performance by a factor of eight to twelve over equivalent point-to-point communication protocols."

The IEEE has also recently recognised the requirement for multicast and has taken steps to include such a facility in IEEE Logical Link Control Type 4 (LLC4).

The following is quoted from an IEEE report (IEEE 802.2-94/139) Functional Requirements for LLC Type 4[88] :

"Multicast: This has been agreed as a requirement to develop further but it is unclear as to what exactly this entails."

More recently, large military organisations have come to realise the importance of multicast.

The High Performance Network Working Group (HPNWG), in their report Available Technologies Final Report[85], makes the following observation concerning this issue :

"Studies by DOD and NATO have identified the lack of a multicast capability as a key issue in the use of OSI standards for military applications."

Observation

While Laniewski's early anticipation of the requirement for multicast is commendable, it is proposed that the most appropriate protocol layer for the implementation of supporting mechanisms such as error and flow control is the transport layer rather than the data link layer. This allows normal high performance data transmission to be effected by the lower three layers and then normal transport layer features to be applied at that layer. Again, the reliability of the multicast data transmission should be policy-driven, with this policy being determined by the application user and implemented at the transport layer. This is reinforced by the fact that modern physical media are inherently reliable, thus error policy should be left to the application entity which is the only entity capable of knowing the quality of service requirements for a particular data stream. The application user interfaces to the transport layer and therefore it is at the transport layer that dataflow control should be performed.

In addition, the multicast facility should be supported by a sophisticated multicast group management facility also provided by the transport layer.
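To make the error control mechanism under discussion concrete, the following is a minimal sketch of go-back-N applied to a multicast group, as named in Laniewski's description quoted earlier. It is not Laniewski's protocol: the window size is fixed rather than dynamic, acknowledgements are assumed never to be lost, and the `drops` argument is a hypothetical device for injecting losses. The sketch is independent of the protocol layer at which the mechanism is placed.

```python
from typing import Dict, Iterable, Set, Tuple


def go_back_n_multicast(num_frames: int,
                        window: int,
                        receivers: Iterable[str],
                        drops: Set[Tuple[str, int, int]] = frozenset()) -> int:
    """Return the number of frame transmissions needed to deliver all frames.

    drops holds (receiver, sequence number, attempt) triples that are lost;
    attempts are counted from 1 for each sequence number.
    """
    expected: Dict[str, int] = {r: 0 for r in receivers}  # next in-order frame wanted
    attempts: Dict[int, int] = {}                         # multicasts so far per frame
    base = 0                                              # oldest unacknowledged frame
    sent = 0
    while base < num_frames:
        for seq in range(base, min(base + window, num_frames)):
            attempts[seq] = attempts.get(seq, 0) + 1
            sent += 1
            for r in expected:
                lost = (r, seq, attempts[seq]) in drops
                if not lost and seq == expected[r]:
                    expected[r] += 1      # accept in order, discard anything else
        # Cumulative acknowledgements: the window can only slide as far as
        # the slowest receiver in the group has progressed.
        base = min(expected.values())
    return sent
```

The last step shows why the producer must know every member of the group: a single receiver that misses one frame forces retransmission to all of them.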

2.5 Future

2.5.1 Short Term - The Next 5 Years

In the short term future, real-time, mission-critical, distributed systems will increasingly feature the integration of control data as well as continuous media services (multimedia) on the same local area networks. Generally, multimedia transmission requires high throughput in real-time. Initially, these multimedia services may not be mission-critical; however, when data fusion and knowledge-based systems become needed in order to meet operational requirements, these services will be afforded the same precedence as control data. Eventually, technology will support such applications over internetwork topologies. Only new technologies such as ATM and FDDI will be capable of supporting the next generation of applications and architectures. It is also contended that event-triggered protocols are better equipped to handle these diverse data types.

2.5.1.1 Asynchronous Transfer Mode

Asynchronous Transfer Mode (ATM) is undoubtedly a LAN and WAN technology that will revolutionise the architecture and implementation of distributed systems in the future (refer also to Appendix B).

ATM offers very high levels of performance in terms of throughput. For example, SONET/SDH offers 155 Mbit/s per link. With high performance switches, the aggregate throughput of an ATM LAN can be several Gbit/s (a 6 Gbit/s switch backplane bandwidth is typical of currently available equipment, with effective throughput being some 50% of that). This makes ATM eminently suitable for system applications involving multimedia. The technology offers great flexibility in terms of bandwidth, range and physical media options as well as potential affordability due to the large number of players that have embraced the technology.

However, ATM's immediate applicability to real-time, mission-critical, distributed systems is questionable.

The first issue of concern is ATM's data transfer efficiency. ATM transfers data in 53-byte cells, of which only 48 bytes carry payload data. This implies a maximum efficiency of approximately 90% (48/53) before any higher layer protocol overhead. While this may not be important at high ATM speeds (i.e. > 600 Mbit/s), it is a significant limitation at lower speeds (i.e. 25 to 155 Mbit/s).
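The efficiency figure follows directly from the cell format and can be checked with a short calculation. The sketch below considers the cell header only; physical layer framing and higher layer protocol overhead, which reduce the figure further, are deliberately left out.

```python
CELL_BYTES = 53      # total ATM cell length
PAYLOAD_BYTES = 48   # bytes of each cell that carry payload


def payload_efficiency() -> float:
    """Fraction of each cell that carries payload."""
    return PAYLOAD_BYTES / CELL_BYTES


def payload_rate_mbit_s(cell_rate_mbit_s: float) -> float:
    """Upper bound on payload throughput for a given cell stream rate."""
    return cell_rate_mbit_s * payload_efficiency()


if __name__ == "__main__":
    print(f"efficiency: {payload_efficiency():.1%}")
    print(f"payload at 155 Mbit/s: {payload_rate_mbit_s(155):.1f} Mbit/s")
```

For a 155 Mbit/s cell stream this bound is roughly 140 Mbit/s of payload.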

The second major concern is ATM's lack of fault-tolerance. Firstly, ATM only performs error control on the address portion of the cell and not on the payload data.

The HPNWG makes the following observation concerning this issue in their report Available Technologies Final Report[85] :

"To date, ATM in and of itself has no ability to recover from data loss and errors. If and when the mechanisms underlying Type 3/4 and Type 5 assured service are defined, this capability will be available."

Observation

ATM leaves error control of payload data to higher protocol layers. This could have latency implications in certain high-speed applications. Secondly, the standard topologies employed by ATM do not support fault-tolerance.

The HPNWG continue :

"The 'star' topology that is natural to ATM LANs is intrinsically tolerant of equipment faults in the terminals or connecting links; if one terminal/link fails, it should have little or no effect on the switch or other terminals. On the other hand, the star topology makes ATM intrinsically intolerant of switch failure; a single switch failure could have severe consequences for a network. Hot standby switches with backup virtual circuits could of course be designed into an ATM network, but this has little to do with ATM technology itself: it's simply system engineering for system reliability/availability. The arbitrary inter-switch topologies that ATM supports does make this type of system engineering relatively straightforward, however."

Observation

ATM's essentially connection-oriented design makes it difficult to maintain the virtual channels when a switch fails. Switch-over to standby units is more complex than it would be for a connectionless service. Switch replication is an expensive solution.

Networks for real-time, mission-critical, distributed systems should contain no single points of failure. ATM technology is based on switches which constitute such single points of failure. By comparison, FDDI's dual-redundant ring topology essentially offers distributed LAN control and therefore exhibits no single points of failure.

Networks for real-time, mission-critical, distributed systems should also feature self-healing capabilities. Again ATM is deficient in this regard. The HPNWG point out :

"ATM has no self-healing capabilities, though there is a concept of a self-healing SONET ring that could prove useful if ATM were deployed over a SONET infrastructure. As mentioned above, however, the applicability of SONET ring concepts (which are really intended for large systems or public carriers) to the local platform environment is questionable."

2.5.1.2 Fibre Distributed Data Interface

The Fibre Distributed Data Interface (FDDI) is a relatively new, high-speed 100 Mbit/s LAN and MAN technology now internationally standardised.

FDDI features high dependability by employing a number of levels of self-healing :

• If a link fails, the connected nodes wrap and reconstitute the ring, maintaining the entire system operational. However, dual-redundancy is then lost.

• If a node fails, the node's neighbours wrap and reconstitute the ring, thereby excising the failed node. Again, dual-redundancy is, however, then lost.

• If multiple nodes or links fail, the relevant nodes wrap and form multiple sub-rings, thereby excising the failed elements. With appropriate system design, a degraded mode of system operation could be possible in most instances of such failures.

• If optical bypass switches (OBSs) are employed and a node fails, the OBS excises the failed node while still maintaining dual-redundancy.

FDDI's self-healing capabilities make it an essentially fault-tolerant LAN technology. FDDI's intrinsic fault-tolerance makes it well suited for networked real-time, mission-critical, distributed systems.
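The wrap behaviour listed above can be sketched as a simple model of a dual ring. The function name and the numbering convention (link i joins node i to node (i + 1) mod n) are assumptions made for illustration; optical bypass switches are not modelled.

```python
from typing import FrozenSet, List, Tuple


def wrap_ring(n: int,
              failed_links: FrozenSet[int] = frozenset(),
              failed_nodes: FrozenSet[int] = frozenset()
              ) -> Tuple[List[List[int]], bool]:
    """Return (surviving sub-rings, whether dual redundancy is still intact)."""
    broken = set(failed_links)
    for node in failed_nodes:
        broken.add(node)              # link leaving the failed node
        broken.add((node - 1) % n)    # link arriving at the failed node
    if not broken:
        return [list(range(n))], True     # intact ring, dual redundancy kept
    segments: List[List[int]] = []
    current: List[int] = []
    start = (min(broken) + 1) % n         # begin just past a broken link
    for step in range(n):
        node = (start + step) % n
        if node not in failed_nodes:
            current.append(node)
        if node in broken:                # neighbours wrap here: segment ends
            if current:
                segments.append(current)
            current = []
    return segments, False
```

A single link or node failure leaves one wrapped ring containing all surviving nodes; two separated failures yield two sub-rings, which corresponds to the degraded mode of operation mentioned in the third item.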

FDDI is, however, limited in throughput performance. While FDDI offers 100 Mbit/s, this is an aggregate throughput which must be shared by tens to hundreds of nodes. This may be sufficient for many control applications. However, should the system require networked continuous media services, especially digital video, FDDI does not have the capacity to support more than a very small number of high quality digital video virtual circuits (even with compression).

Observation

When matching available technologies to complex system requirements, no technology offers all of the performance and dependability features. For real-time, mission-critical, distributed control, FDDI appears to provide an optimum solution. For multimedia systems, ATM appears to offer an optimum solution. In systems that contain an element of both real-time control as well as continuous media services, a hybrid solution employing an FDDI control network and ATM multimedia network may be appropriate.

2.5.2 Medium Term - 5 to 15 Years

In systems involving real-time, mission-critical, distributed multimedia, affordable, off-the-shelf standard solutions are not yet available. However, such systems solutions are envisioned in the medium term future. An example of such an application is the integrated digital battlefield.

Geary and Masters, in their paper Investigating New Computing Technologies for Shipboard Combat Systems[76], make the following observations regarding the next generation of naval combat systems :

"In the future, other commercial computer and information transfer technologies will be considered, including Asynchronous Transfer Mode, Fibre Channel and Myrinet™ switching technologies. Some of the new switching technologies support transfer of small messages with extremely low overhead per message. Older commercial networks and communication protocols are often optimized for large data block transfers such as those encountered during file transfers in routine interoffice computing. Such designs can produce substantial overhead in message transmission and receipt, which increases latency. As stated earlier, message latencies on Ethernet can vary from 1 to 10 milliseconds depending on circumstances. In short, low latency is a critical requirement for the Navy, whose applications rely on timeliness vice bulk volume. We anticipate that application-to-application message delivery times of less than 100 microseconds are achievable with some switch-based LANs."

Observation

Such latencies are an order of magnitude smaller than what most commercial networks and communication protocols deliver in practice today. These latencies are of the same order of magnitude as context switch times inside computers, i.e. the time it takes the operating system to switch between tasks in the computer. This means that in the future, communication between computers may only be a little more time consuming than communication between tasks within a computer now. This is one of the key technological barriers that must be overcome to make true real-time distributed computing a reality. Distributed scheduling is another key to successful distributed computing because it allows processes in several computers to act in a coordinated fashion while working on an overall computing problem.

The High Performance Network Working Group (HPNWG), in their report High Performance Network: Architecture, Services and Requirements[86], state the following in terms of next generation latency, synchronisation and bit error rate requirements :

"However, the requirements for latency and bit error rates are much more stringent. Latencies in the order of 10 microseconds are needed to support low-latency interrupts. Undetected bit error rates less than 10E-14 are often required to ensure proper system operation."

"Time synchronisation with an accuracy on (sic) the order of +/-5 microseconds is required for both (classes of sensor data distribution)."

Observation

Current LAN technologies are incapable of providing such levels of latency and synchronisation performance. FDDI is probably the highest performance standard LAN technology available at present. With multimode fibre, media propagation delay is 5,1 µs/km while node latency is 0,6 µs per node. In a ring configuration, which FDDI employs even when using concentrators, such low latencies cannot be achieved. Even when using a Network Time Protocol (refer Paragraph 6.8.2.3), such synchronisation accuracies cannot be achieved in ring topologies due to the fact that the token traverses the ring in one direction only, thereby causing a skew of approximately half the token walk time around the ring. Because optical buses cannot support LANs of much more than a hundred metres in range, networks providing the required levels of latency and synchronisation will probably have to be based on star-type topologies using switches. Synchronisation mechanisms will also have to be built into the LAN hardware, as software implementations will be unable to meet the timing requirements.
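The ring latency argument can be checked numerically from the two figures given above. The ring length and node count in the example are assumptions chosen for illustration; queueing and token holding times are ignored, so the result is a lower bound on one circulation of the ring.

```python
PROPAGATION_US_PER_KM = 5.1   # multimode fibre propagation delay, from the text
NODE_LATENCY_US = 0.6         # latency added by each node, from the text


def ring_circulation_us(ring_km: float, nodes: int) -> float:
    """Lower bound on the time for one circulation of the ring."""
    return ring_km * PROPAGATION_US_PER_KM + nodes * NODE_LATENCY_US


def approximate_skew_us(ring_km: float, nodes: int) -> float:
    """Skew of roughly half the circulation time, as argued in the text."""
    return ring_circulation_us(ring_km, nodes) / 2


if __name__ == "__main__":
    # Assumed example: a 2 km ring with 50 nodes.
    print(f"circulation: {ring_circulation_us(2, 50):.1f} us")
    print(f"skew:        {approximate_skew_us(2, 50):.1f} us")
```

For the assumed 2 km, 50 node ring, one circulation takes about 40 µs and the skew is about 20 µs, both well outside the 10 µs latency and ±5 µs synchronisation figures quoted by the HPNWG.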

To meet the error rate requirements, fibre optic media will almost certainly be a prerequisite; such media offer error rates in the order of 10⁻¹². Error detection and correction will have to be employed in the protocol layers to achieve rates of 10⁻¹⁴. However, this is quite achievable, even at present.
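To give a sense of scale for these error rates, the mean time between errored bits can be computed for a given line rate. The 100 Mbit/s line rate used in the example is an assumption (it matches FDDI), and errors are taken to be independent.

```python
def mean_seconds_between_errors(bit_rate_bit_s: float, bit_error_rate: float) -> float:
    """Mean interval between errored bits on a continuously loaded link."""
    return 1.0 / (bit_rate_bit_s * bit_error_rate)


if __name__ == "__main__":
    for ber in (1e-12, 1e-14):
        seconds = mean_seconds_between_errors(100e6, ber)
        print(f"BER {ber:g}: one error every {seconds:,.0f} s")
```

At 100 Mbit/s, an error rate of 10⁻¹² corresponds to one errored bit roughly every 10 000 s (just under three hours), while 10⁻¹⁴ corresponds to one roughly every 1 000 000 s (about eleven and a half days).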

The US Navy is currently formally investigating next generation user requirements and attempting to derive applicable network requirements therefrom, as well as match high performance network technologies to these requirements. Two technologies identified to date are the Scalable Coherent Interface (SCI) and Fibre Channel (FC).

2.5.2.1 Scalable Coherent Interface

SCI is an approved IEEE standard (IEEE 1596)[13] intended to be the next generation high-speed backplane for interconnections in multiprocessor machines. SCI was designed to use point-to-point links in order to avoid the physics problems of using a backplane transmission line at very high data rates, e.g. distributed capacitances.

The stated purpose of SCI is "to define an interface standard for very high performance multiprocessor systems that supports a coherent shared-memory model scalable to systems with up to 64K nodes. This Scalable Coherent Interface (SCI) standard is to facilitate assembly of processor, memory, I/O, and bus adapters from multiple vendors into massively parallel systems with throughputs ranging up to more than 10¹² operations per second."

There are a number of official IEEE follow-on SCI efforts underway. Of particular interest is SCI/RT (SCI/Real-Time), a follow-on group that is investigating the use of SCI protocols for real-time applications which require guaranteed latencies. SCI/RT also provides some increased fault-tolerance and error handling capabilities.

Observation

While SCI may offer extremely high performance, it is inherently designed for distributed multiprocessing. It is not necessarily optimally suitable for networked computer systems. The standard, as well as the technology, are not mature enough even to start designing real systems using SCI for the immediate future. Depending on the support of the development and user community, the SCI standard may reach such maturity and become a candidate as a network standard for future real-time, mission-critical, distributed systems.

2.5.2.2 Fibre Channel

Fibre Channel (FC)[36] refers to a set of standards under development by the ANSI Fibre Channel committee, X3T9.3. Fibre Channel specifies a high-speed serial data channel that can connect nodes point-to-point or through a switch or switch network (switch fabric). FC was initially conceived as a peripheral interconnect channel, but its definition has developed such that it could support the construction of high performance local area networks.

FC supports a number of different link options, from shielded twisted pair supporting 200 Mbit/s over 50 m up to singlemode fibre supporting 800 Mbit/s over 10 km.

Observation

Fibre Channel is a very flexible and high performance network technology. It is a definite contender for next generation systems. However, it is contended that FC is not mature enough yet for immediate implementation.

2.5.3 Upper Bound of Network Performance Requirements

As technology and applications advance, system applications based on networks require greater and greater performance in terms of throughput, while real-time systems require minimised latency. It is considered relevant to speculate on the ultimate network performance requirements.

In terms of throughput, the application requiring the highest performance is high quality, real-time, digital imagery. The highest quality image is related to the image-capture capability of the human eye and image processing capability of the human brain.

According to Shostak in his article The Human Eye as an Imaging System[120], the human eye can be described in the following digital terms :

• 36 000 x 28 000 pixels

• 3 colours per pixel

• 17 bits of dynamic range

• 60 Hz update rate

• 2 channels

In terms of data throughput, such visual capabilities would require 750 Gbyte/s or 6 000 Gbit/s.
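The throughput figure can be reproduced by multiplying out the parameters listed above. The sketch assumes that the 17 bits of dynamic range apply to each colour of each pixel, which is an interpretation rather than something stated explicitly in the list.

```python
def eye_throughput_bit_s(width: int = 36_000,
                         height: int = 28_000,
                         colours: int = 3,
                         bits_per_colour: int = 17,   # assumed to apply per colour
                         update_hz: int = 60,
                         channels: int = 2) -> int:
    """Raw, uncompressed data rate implied by the listed parameters."""
    return width * height * colours * bits_per_colour * update_hz * channels


if __name__ == "__main__":
    bits = eye_throughput_bit_s()
    print(f"{bits / 1e9:,.0f} Gbit/s = {bits / 8 / 1e9:,.0f} Gbyte/s")
```

Under that assumption the product is about 6 170 Gbit/s (about 770 Gbyte/s) for both channels together, which agrees with the rounded figures given above.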

Observation

The ultimate single channel bandwidth requirement can be considered to be bounded above by the figure of 6 000 Gbit/s. This is an enormous figure and would require the ultimate in optical technology (image capture and data transmission media) to achieve such capability.

Shostak concludes that it will be at least 15 years before the technologies to support such an ultimate electronic imaging system are developed.

Recently, in early 1996, researchers in Japan demonstrated in the laboratory fibre optic data transmission at 1 000 Gbit/s over a point-to-point link.

2.6 Chapter Summary

In Chapter 2, the evolution of distributed control systems has been traced from its early beginnings in the 1950s and 60s to the next generation of systems that will be fielded in the first decade of the 21st Century. In particular, the shortcomings of the early centralised and point-to-point architectures have been analysed in terms of the requirements of present and next generation real-time, mission-critical, distributed systems. These overall requirements have consequently been translated into a generic system architecture based on local area networks.

Chapter 3

Contextual Definitions

3. Contextual Definitions

The following definitions are made within the context of this thesis :

3.1 Real-Time

Real-Time systems are characterised by the requirement to execute multiple, concurrent tasks with hard deadlines; i.e. exhibit bounded and deterministic responses to external events[122]. Compromising these deadlines may have catastrophic results, including loss of life, loss of platform or mission failure.

3.1.1 Definition

An action which must be accomplished within an allotted amount of time, failing which such accomplishment has no, diminishing or negative value[122].
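The three outcomes named in the definition (no, diminishing or negative value once the allotted time has passed) can be expressed as a value function of completion time. The enumeration names, the linear decay and the numeric defaults are illustrative assumptions only.

```python
from enum import Enum


class LateValue(Enum):
    NONE = "no value"
    DIMINISHING = "diminishing value"
    NEGATIVE = "negative value"


def result_value(completed_s: float,
                 deadline_s: float,
                 late: LateValue,
                 full_value: float = 1.0,
                 decay_per_s: float = 0.5,    # assumed linear decay rate
                 penalty: float = -1.0) -> float:
    """Value of an action as a function of when it was accomplished."""
    if completed_s <= deadline_s:
        return full_value                     # accomplished within the allotted time
    if late is LateValue.NONE:
        return 0.0
    if late is LateValue.DIMINISHING:
        lateness = completed_s - deadline_s
        return max(0.0, full_value - decay_per_s * lateness)
    return penalty                            # a late result does active harm
```

An action completing one second after its deadline is therefore worth 0.0, 0.5 or -1.0 under the three cases with the default parameters.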

3.1.2 Communications Implications

With respect to data communications, real-time performance implies the requirement to transmit data and synchronisation information between distributed processes within strict deadlines via the communications media, or provide other means of recovering critical timing information.

Veríssimo defines a reliable real-time network as follows[134] :

A reliable real-time network displays bounded and known message delivery delay, in the presence of disturbing factors such as overload or faults.

As user application tasks will share processing resources with data communications tasks, multitasking will be required. This has operating system implications, specifically with regard to task switching and scheduling with hard deadlines.

3.2 Protocol

3.2.1 Definition

A protocol is a formal understanding (usually written) between co-operating entities which describes orderly interaction between them.

3.2.2 Communications Implications

Protocols have to be defined and developed such as to allow precise description of data communications in terms of control of data flow and data interpretation, including the maintenance of timeliness and order. They also have to be flexible in allowing many users with disparate communications requirements to employ standard protocols.

In the networking context, protocols of primary significance are usually taken to be the network and transport layer protocols. This is because protocols up to and including the transport layer constitute the minimum requirement to implement a LAN. The physical layer and data link layer protocols are normally bundled together with the network interface hardware leaving network implementers with the problem of the next two layers.

However, in the distributed system context, interactions occur at all layers. Such interactions, from physical signalling to application data interface management, are governed by protocols. This is especially important in real-time, mission-critical systems due to the criticality of dependability and performance as well as the correct interpretation at the system level of the data and control actions.

In safety-critical applications, especially where the implications of faulty operation are extraordinarily onerous (such as munitions release, particularly nuclear devices), a further level of protocol is mandatory, possibly statutory. Such protocols normally involve application-application and/or operator-operator interactions as well as invocation of authentication and access control capabilities of the integral network security protocols. Such protocols can be considered as logical interlocks that were normally implemented in hardware in older generation systems.

3.3 Strategy

3.3.1 Definition

A strategy is an overall plan in order to achieve a national, business or technological goal.

3.3.2 Communications Implications

As the communications requirements of real-time, mission-critical, distributed systems are very complex, a coherent strategy is required to identify these requirements and match these to available and emerging technologies such as to optimise the system's functionality, performance and cost over its complete lifecycle.

3.4 Mission-Critical

Mission-critical systems have differing definitions in the military, industrial process control and business environments. The definition provided in Paragraph 3.4.1 applies primarily to military or process control systems. In business environments, information systems managers would consider systems where failure could lead to loss of money (e.g. banking), serious inability to conduct business (e.g. online investment systems or accounting systems), or serious operational chaos (e.g. electronic data interchange systems), as being mission-critical.

In certain applications such as medical teleradiology networks, the unavailability of the system may not be intrinsically life threatening. However, the users (i.e. medical practitioners) may come to rely on the network to provide the necessary information to make critical medical decisions, with the result that the information system may well become mission-critical, although this may not have been the original intention.

Where the function of the network is to transport digital video and/or audio, e.g. video conferencing, the network can be considered to be performance-critical.

3.4.1 Definition

Mission-critical systems are those where failure of execution, or faulty execution, may have catastrophic results, including loss of life, serious injury, loss or serious damage to plant or platform or mission failure.

3.4.2 Communications Implications

The system must be able to depend on a reliable communications infrastructure. The latter should not be susceptible to electromagnetic radiation or produce electromagnetic interference which affects the system. The network operation should not be affected by temporary disturbances such as inserting or removing a network device, or by the failure of a network node or network interconnect segment. In other words, the network should be fault-tolerant. Normally replication or self-healing provides for fault-tolerance.

In military systems where there must also be tolerance of a certain level of battle damage, i.e. survivability, the network system should provide for multiple levels of replication (e.g. quad-redundancy) or provision of a completely separate communications capability for safety-critical functions, thereby allowing a degraded mode of operation.

3.5 Distributed

3.5.1 Definition

Entities are distributed if they are separated in space. Computing nodes are distributed if they are coupled such that they fail independently.

3.5.2 Communications Implications

If distributed systems are to participate in a collaborative processing effort, they need to be connected by a communications link. Control and data need to be passed across this link. An inevitable implication of communication between distributed entities is time delay, re-ordering and the possibility of data errors. Protocols and interface specifications are required to manage this communication.

Data communication between distributed sub-systems can take place in either a parallel or a serial fashion. Parallel systems are those consisting of multiple physical data links while serial systems involve a single data link (simplex or half duplex) or single data link in each direction (full duplex). Serial systems may have further links for the passing of control information.

3.6 System

3.6.1 Definition

A system is a composite of items, assemblies, skills and techniques capable of performing and/or supporting an operational mission[121].

A complete system includes related facilities, items, matériel, services and personnel required for its operation to a degree that it can be considered as self-sufficient in its intended operational and/or support environment.

An item is part of the system if it can be controlled by the system, otherwise it is part of the environment.

A system can be considered as being greater than the sum of its parts. The added value can be considered as the system's emergent properties.

3.6.2 Communications Implications

If a system is distributed, then the system's sub-systems are required to be connected by a communication infrastructure in order to maintain a shared state. This communication infrastructure provides a significant contribution to the system's emergent properties.

3.7 Open System

3.7.1 Definition

An open system is one that implements sufficient open (public) specifications for interfaces, services and supporting formats to allow properly engineered application software to :

• be ported with minimal changes across a wide variety of systems

• interoperate with other applications on local and remote systems

• interact with users in a style that facilitates user portability.

3.7.2 Communications Implications

The elements (layers) making up the data communication system must be non-proprietary, i.e. in the public domain (or at least the interfaces between layers should be so). These interfaces should be consistent and clearly specified. The layers should be organised into a communication framework or paradigm (e.g. ISO OSI Basic Reference Model, Internet, SAFENET, GOSIP or HPN Model).

3.8 Local Area Network

3.8.1 Definition

A Local Area Network (LAN) is a shared, serial digital communication fabric, facilitating the communication of information among the processes in, or users of, a set of distributed computers.

3.8.2 Communications Implications

LANs can be considered to have four basic abstract properties[134] :

broadcast - destinations receiving an uncorrupted packet transmission, receive the same packet

error detection - destinations detect any corruptions in the network in a locally received packet

network order - any two packets indicated in two different destination access points, are indicated in the same order

full duplex - indication, at the destination access point, of reception of the packets transmitted by the local source access point, may be provided on request

Real-Time LANs have a further four abstract properties :

tightness - destinations receiving an uncorrupted message transmission, receive it at real-time values that differ, at most, by a known interval

bounded transmission delay - every packet queued at a source access point, is transmitted by the network within a bounded delay

bounded omission degree - in a known interval, omission errors affect a finite number of transmissions

bounded inaccessibility - in a known interval, the network may be inaccessible a finite number of times and for a finite duration
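Two of the real-time properties above, tightness and bounded transmission delay, lend themselves to a direct check against recorded timestamps. The following sketch assumes timestamps in whole microseconds taken from a common time reference; the function names are hypothetical.

```python
from typing import Iterable


def is_tight(receive_times_us: Iterable[int], known_interval_us: int) -> bool:
    """Tightness: all destinations received the message within a known interval."""
    times = list(receive_times_us)
    if not times:
        return True                   # nothing delivered, nothing to violate
    return max(times) - min(times) <= known_interval_us


def delay_is_bounded(queued_us: int, transmitted_us: int, bound_us: int) -> bool:
    """Bounded transmission delay for one packet queued at a source access point."""
    return 0 <= transmitted_us - queued_us <= bound_us
```

The remaining two properties (bounded omission degree and bounded inaccessibility) would need a record of omissions and outages over an observation interval and are not covered by this sketch.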

To be effective in supplying a communications infrastructure, networks must have attributes such as dependability, timeliness and interoperability. To support the latter requirement, data transfer via networks is required to be in accordance with standard protocols and interface specifications.

3.9 LAN Profile

3.9.1 Definition

A LAN Profile is a grouping or set of standards along with additional implementation agreements required for interoperability.

3.9.2 Communications Implications

Most communication standards are not sufficiently precise to guarantee communication between nodes. LAN Profiles augment these standards so as to provide the level of dependability and performance required by mission-critical systems.

LAN Profiles are required in order to achieve reliable interconnectivity.

3.10 Fault-Tolerant

3.10.1 Definition

Fault-Tolerance is the ability of a system to continue the correct performance of functions in the presence of faults.

3.10.2 Communications Implications

Fault-tolerant data communication systems can have no single points of failure. This implies that components of the data communication system need to be self-healing or replicated. Self-healing implies that, in the presence of local failure, nodes or segments not critical for the particular mode of operation may be automatically excised from the network until maintenance actions have occurred. Replication implies redundancy.
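The absence of single points of failure can be checked mechanically on a candidate topology. The sketch below removes each node in turn and tests whether the remaining nodes can still reach one another. The node names and both example topologies are hypothetical, and only node failures (not link failures) are considered.

```python
from typing import Dict, List, Set


def reachable(adj: Dict[str, Set[str]], start: str, removed: str) -> Set[str]:
    """Return all nodes reachable from start when `removed` has failed."""
    seen = {start}
    stack = [start]
    while stack:
        node = stack.pop()
        for nxt in adj[node]:
            if nxt != removed and nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen


def single_points_of_failure(adj: Dict[str, Set[str]]) -> List[str]:
    """List nodes whose failure disconnects the surviving nodes."""
    nodes = sorted(adj)
    spofs = []
    for candidate in nodes:
        others = [n for n in nodes if n != candidate]
        if others and reachable(adj, others[0], candidate) != set(others):
            spofs.append(candidate)
    return spofs


# Hypothetical star topology: every station hangs off one hub.
star = {"hub": {"a", "b", "c"}, "a": {"hub"}, "b": {"hub"}, "c": {"hub"}}
# Hypothetical ring topology: each station has two neighbours.
ring = {"a": {"b", "d"}, "b": {"a", "c"}, "c": {"b", "d"}, "d": {"c", "a"}}

print(single_points_of_failure(star))
print(single_points_of_failure(ring))
```

In the star, the hub is reported; in the ring, no single node failure partitions the survivors.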

3.11 Packet

3.11.1 Definition

A packet is a bounded-length protocol data unit that contains all of the mechanisms for transferring data and state information from one endpoint of a communication transaction, through all the intermediate switches, to the other endpoint or group of endpoints.

3.11.2 Communications Implications

A packet on a network will normally be a transport layer protocol data unit. It will be constructed from the user data plus increments of control data (state information) added at each protocol layer.

Different physical and media access network technologies support different maximum packet sizes; e.g. ATM supports packets (cells) of 53 bytes, Ethernet supports packets of 1 500 bytes and FDDI supports packets of 4 500 bytes.

3.12 Message

3.12.1 Definition

A message is an arbitrarily long number of packets containing an arbitrarily large amount of user data.

3.12.2 Communications Implications

If a message consists of more than one packet, higher layer protocols must packetise the message, transmit the packets via the underlying layers and correctly re-order the packets on reception. This is especially important in internetwork topologies where different packets can be transmitted through different routers. This has specific implications for error, flow and rate control.
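The packetise and re-order steps described above can be sketched as follows. This is an illustrative assumption rather than any particular protocol: the packet layout (sequence number, packet count, payload) and the function names are hypothetical, and no error, flow or rate control is attempted.

```python
from typing import List, Tuple

# Hypothetical packet layout: (sequence number, total packets, payload).
Packet = Tuple[int, int, bytes]


def packetise(message: bytes, max_payload: int) -> List[Packet]:
    """Split a message into numbered packets of at most max_payload bytes."""
    chunks = [message[i:i + max_payload]
              for i in range(0, len(message), max_payload)] or [b""]
    return [(seq, len(chunks), chunk) for seq, chunk in enumerate(chunks)]


def reassemble(packets: List[Packet]) -> bytes:
    """Rebuild the message from packets received in any order."""
    if not packets:
        raise ValueError("no packets received")
    total = packets[0][1]
    by_seq = {seq: payload for seq, _, payload in packets}
    if set(by_seq) != set(range(total)):
        raise ValueError("message incomplete")
    return b"".join(by_seq[seq] for seq in range(total))


message = bytes(range(256)) * 20          # 5 120 bytes of user data
packets = packetise(message, 1500)        # Ethernet-sized payloads
received = list(reversed(packets))        # simulate out-of-order arrival
print(len(packets), reassemble(received) == message)
```

A missing packet is reported as an error rather than silently ignored, which is where an error control scheme would take over.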

3.13 Cell

3.13.1 Definition

A cell is a short packet of fixed size.

3.13.2 Communications Implications

Being short, the cell can traverse the network with low latency. Being of fixed size, routing can be optimally performed by hardware rather than software.

However, cells can only carry a small amount of payload data; therefore a higher layer protocol is needed to decompose packets of user data into cells before transmission and recompose packets after reception. This may have timing implications for large packets.
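The cost of decomposing packets into cells can be estimated with simple arithmetic. The sketch below assumes the standard ATM cell layout of a 5-byte header and a 48-byte payload within the 53-byte cell (a detail not stated in the text above) and ignores any adaptation-layer overhead, so the results are lower bounds.

```python
ATM_CELL_BYTES = 53      # total cell size
ATM_PAYLOAD_BYTES = 48   # payload per cell (assumed: 53 bytes less a 5-byte header)


def cells_needed(packet_bytes: int) -> int:
    """Number of cells required to carry a packet (ceiling division)."""
    return -(-packet_bytes // ATM_PAYLOAD_BYTES)


def bytes_on_wire(packet_bytes: int) -> int:
    """Total bytes transmitted, including cell headers and padding."""
    return cells_needed(packet_bytes) * ATM_CELL_BYTES


for size in (1500, 4500):    # Ethernet- and FDDI-sized packets
    print(size, cells_needed(size), bytes_on_wire(size))
```

Under these assumptions, a 1 500 byte packet occupies 32 cells and a 4 500 byte packet occupies 94 cells.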

3.14 Precedence

3.14.1 Definition

Rank in terms of importance.

3.14.2 Communications Implications

Messages are ranked (and possibly scheduled) in the order of criticality of their functionality.

3.15 Priority

3.15.1 Definition

Rank in terms of time.

3.15.2 Communications Implications

Messages are ranked and scheduled in the order of criticality of their deadline.
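The distinction between priority (rank in time) and precedence (rank in importance) can be illustrated with a small transmit queue. The policy shown, earliest deadline first with precedence used only to break ties, is one possible combination chosen for illustration; the class, message names and values are hypothetical.

```python
import heapq
from typing import List, Tuple


class MessageQueue:
    """Dequeue by earliest deadline; break ties by precedence (1 = most important)."""

    def __init__(self) -> None:
        self._heap: List[Tuple[int, int, int, str]] = []
        self._counter = 0    # preserves insertion order among exact ties

    def push(self, name: str, deadline_us: int, precedence: int) -> None:
        heapq.heappush(self._heap, (deadline_us, precedence, self._counter, name))
        self._counter += 1

    def pop(self) -> str:
        return heapq.heappop(self._heap)[3]

    def __len__(self) -> int:
        return len(self._heap)


queue = MessageQueue()
queue.push("track_update", deadline_us=5_000, precedence=2)
queue.push("fire_command", deadline_us=5_000, precedence=1)
queue.push("status_report", deadline_us=20_000, precedence=3)
queue.push("stabilisation", deadline_us=2_000, precedence=2)

order = [queue.pop() for _ in range(len(queue))]
print(order)
```

The two messages sharing a 5 000 microsecond deadline are released in order of precedence, so the more important one leaves first.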

3.16 Chapter Summary

Chapter 3 has provided a set of definitions which are applicable in the domain of the thesis. Additional terms have also been defined, especially where these often have fairly loose definitions in the applicable literature, as well as where precise definitions are important in the context of this thesis.

Chapter 4

Real-Time Distributed System Applications

4. Real-Time Distributed System Applications

4.1 Scope

This chapter provides an overview of typical examples of real-time, mission-critical, distributed systems from the areas of industrial process control, aerospace, military command and control systems and multimedia information systems. Such systems are proposed to be candidates for the application of LAN technologies and, as such, provide input for the real-time protocols under investigation.

This chapter also addresses various architecture concepts to meet the different system requirements, specifically those implementing real-time, distributed topologies using new concepts in computer networking technology. For each application example, a concept based on a LAN-based architecture, contended to be appropriate for the specific system, is proposed. The architecture is also analysed in terms of its relative advantages and disadvantages.

4.2 Introduction

Real-time, distributed, system architectures will find increasing application in all areas of computer-based systems. Initially these applications will be appropriate more at the high-end of such application domains. Many such systems are presently based on centralised computer architectures which their owners are beginning to find extremely limiting in terms of upgradeability, maintenance, etc. Distributed system architectures will also provide for new and innovative applications that were never feasible in the past.

As users and system integrators begin to trust smaller, less expensive computers and local area networks to a greater extent, they will allow designers and implementers to exploit distributed system architectures for real-time, mission-critical systems. Such architectures will become increasingly appropriate where the complexity and reaction times of system interactions exceed the capability of human operators.

4.3 Industrial Process Control Systems

4.3.1 Functional Performance Requirements

The overall generic functional requirements of an Industrial Process Control System (IPCS) can be summarised as follows :

! Sensing plant conditions and environment from a variety of sensors (discrete, video and audio).

! Communication via datalinks of measurement data and surveillance information.

! Local generation of situation displays identifying normal conditions as well as alert and/or alarm conditions.

! Transfer of this information to higher levels of control (i.e. control and management centres) via datalinks.

! Situation analysis.

! Option synthesis.

! Simulation of potential problems and solutions as an aid to the decision making process of the operators, supervisors and managers.

! Automatic and man-in-the-loop control of plant actuators.

! Built-in test (BIT) and the operation of built-in test equipment (BITE).

! System Management.

From these allocated requirements, it follows that information is required to be transmitted between sub-systems of the plant and between remote management and control centres.

The single most important consideration when partitioning an industrial control system is the performance of critical control loops.

4.3.2 Network Performance Requirements

The performance requirements of an Integrated Industrial Process Control System will vary widely according to the type and number of processes being controlled. Typically, advantages are to be gained by automation where the required reaction times are beyond the capabilities of a human operator. Other good candidates for automation are those where the operating environment is hazardous, e.g. mines, nuclear process plants and materials handling systems.

Process control plants in these categories can benefit from remote automatic control and robotics. Such systems require reaction times in the order of milliseconds. State and event-type dataflows are typical with certain dataflows requiring high integrity. Where collaborative or distributed processes occur, a high degree of synchronisation is required between sub-systems, e.g. robotic manufacturing.

In large, complex plants, sub-systems from many vendors are normally required to be integrated together. This implies that the networking architecture needs to support open systems interconnectivity by offering multiprotocol support, diverse real-time operating systems and internetworking capability.

4.3.3 Data Distribution Requirements

The overall generic functional requirements of an industrial process control system network can be summarised as follows :

! Collect sensor data

! Distribute control orders

! Distribute control data

! Distribute synchronisation data

! Distribute management data

! Distribute multimedia data

4.3.4 Example

An example of an IPCS, namely an Integrated Mining Control Network based on FDDI and XTP, is proposed in terms of a conceptual topology described in Figure 1 overleaf :

Figure 1 : Integrated Mining Control Network using FDDI, CDDI and XTP (file : yimcn1.cdr)

[Diagram labels: Above Ground; Underground; FDDI Backbone; FDDI Concentrator; Optical Bypass Switch; Single Attachment Stations (MLT-3/STP); PC Workstations; CCTV; PABX; Mass Storage Server; Network Management Station; Commercial Broadcast Radio; Pagers; Telephones.]

4.4 Aerodynamic Control Systems

4.4.1 Functional Performance Requirements

The overall generic functional requirements of an aerodynamic control system can be summarised as follows :

! Sensing of aerodynamic surface conditions.

! Sensing of environmental conditions.

! Automatic closed-loop control of aerodynamic surfaces.

! Guaranteed response times from all sensors and actuators.

! Extremely high dependability.

4.4.2 Network Performance Requirements

The single most important consideration in an aerodynamic control system is the performance of critical control loops. For such time-sensitive and safety-critical functionality, the network must provide guaranteed response. The operating environment can be characterised a priori, while the required operating envelope in the environment can be specified. In such an application, a time-triggered protocol is appropriate. To achieve the high dependability requirements, a high degree of replication of sensors, actuators, transmission paths and data entities is required. Typically a quad-redundant (quadruplex) system would be required.

4.4.3 Data Distribution Requirements

Hall and Stigall[81] have analyzed the aerodynamic characteristics of an F/A-18 airframe. Their analysis indicates that digital servo loop closure would require 5 000 Hz bandwidth. They estimate that about 250 16-bit words per update would be required to drive a quadruplex electro-hydraulic servo system. This results in a throughput requirement of 80 Mbit/s.
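The quoted figures can be combined in a short calculation. If the 5 000 Hz figure is taken directly as the update rate (an assumption, since the text states it as a bandwidth), 250 words of 16 bits per update amount to 20 Mbit/s of raw data. The quoted 80 Mbit/s is four times this value; the text does not itemise that factor, so the sketch below exposes it as a hypothetical `scale` parameter rather than attributing it to a particular cause.

```python
def raw_throughput_bps(update_rate_hz: int, words_per_update: int,
                       bits_per_word: int = 16) -> int:
    """Raw data rate: updates per second x words per update x bits per word."""
    return update_rate_hz * words_per_update * bits_per_word


def required_throughput_bps(update_rate_hz: int, words_per_update: int,
                            scale: int = 1) -> int:
    """Raw rate multiplied by a hypothetical scale factor (not itemised in the text)."""
    return scale * raw_throughput_bps(update_rate_hz, words_per_update)


raw = raw_throughput_bps(5_000, 250)
scaled = required_throughput_bps(5_000, 250, scale=4)
print(raw / 1e6, "Mbit/s raw")
print(scaled / 1e6, "Mbit/s with scale = 4")
```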

4.4.4 Example

An example of an aerodynamic control system for an F/A-18 air superiority fighter, based on FDDI, is proposed by Hall and Stigall. The conceptual topology of the system is described in Figure 2 overleaf :

Figure 2 : Fly-by-Wire Aerodynamic Control System based on FDDI (file : yacsl1.cdr)

[Diagram labels: Flight Control Computers. Each line represents a dual-redundant ring.]

4.5 Space Launch Vehicle

4.5.1 Functional Performance Requirements

The RSA-4 Space Launch Vehicle is a conceptual four-stage launch system capable of launching a satellite payload in low earth orbit. The first two stages of the launch vehicle provide primary lift while Stage 3 effects transitional orbit.

The Stage 4 Control Computer (SCC) controls launch and final injection and parking of the payload into a circular earth orbit [typically of altitude 1 400 km ± 1 km and inclination 55° ± 1° (2σ)]. The SCC is required to achieve this final orbit after Stage 3 transitional orbit resulting in a state vector of 7 800 m/s ± 67 m/s (2σ).

The SCC also controls the firing logic of the Stage 4 hydrazine (N2H4) acceleration thrusters. Payload parking has to be achieved within 6 orbits (≈ 112 minutes per orbit) with a maximum of 200 kg of hydrazine.

The SCC is required to provide a data interface between itself and other sub-systems (Stage 3 and below) of the launch vehicle as well as a data interface between itself and the payload.

Within Stage 4, the SCC interfaces to the Stage 4 Navigation Sub-System, Telemetry Sub-System, TeleCommand Sub-System and Power Switching Unit.

The Stage 4 control software implements the following functions :

! Pyrotechnic release of lower stages.

! Orbit Injection Control Logic.

! Fairing Ejection Control.

! Attitude Control.

! Spin Control.

! Active Nutation Control.

! Payload Parking Logic.

! Communications, Telemetry and TeleCommand Control.

! Co-ordination of System and Sub-System Self-test.

The overall generic requirements of a space launch vehicle can be summarised as follows :

! Automatic closed-loop control of all sensors and actuators.

! Extremely high dependability.

! Low size and mass.

! Low power consumption.

! Low susceptibility to space radiation.

! Advanced thermal management.

4.5.2 Example

An example of a Space Launch Vehicle using a bus-based distributed control system is described in Figure 3 overleaf :

Figure 3 (file : yslv01.cdr)

[Diagram labels: RSA-4 Stages 1 to 4; Payload; Stage 4 Control System; Stage 4 Mission Computer; Power Switching Unit; Payload Release Unit; Thruster Control Unit; Navigation Sub-System; Telemetry Sub-System; TeleCommand Sub-System; Telemetry Links; Long Range Links; Launch Vehicle; Local Ground Control System; Remote Ground Control System.]

4.6 Next Generation Vetronics System

4.6.1 Functional Performance Requirements

The overall functional requirements of a mobile vehicle combat system can be summarised as follows :

! Reception of surveillance information from a variety of sensors, both onboard and offboard, as well as the rest of the battle group via data links.

! Generation and display of situation plots identifying friend or foe.

! Transfer of this information to other combat vehicles, units and command and control centres via data links.

! Threat analysis of the situation.

! Simulation of potential engagements as an aid to the decision making process of the gunner and commander.

! Target designation, target weapon assignment and the control of counter-measures against the enemy's weapons.

! Operation and control of weapons and the release of munitions.

4.6.2 Real-Time Data Transfer Requirements

4.6.2.1 Critical Functions

The following real-time critical functions of the vehicle combat system determine LAN throughput requirements :

! Tracking of contacts from all sensor sources including Gunner's Sight, Commander's Sight, Fire Control Radar (FCR), Air Search Radar (ASR), Electronic Warfare (EW) suite and off-platform via Data Links.

! Providing closed-loop control between sensors, ballistics computer and gun drives.

4.6.2.2 Critical Data

The applicable fire-control algorithms involve the transmission of the following time- and mission-critical data :

! Target Track Data (≈ 32 bytes) from a Tracker Sub-System (TSS) to a Gun Control Unit (GCU) every 5 ms.

! Platform Stabilisation Data (≈ 16 bytes) from the Navigation Sub-System (NSS) to the Gun Control Units every 2 ms.
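The payload rates implied by these two periodic dataflows follow directly from the message sizes and periods. The sketch below counts user data only; protocol headers and media access overhead are deliberately excluded, and the function and variable names are illustrative.

```python
def payload_rate_bps(payload_bytes: int, period_us: int) -> float:
    """Sustained payload rate of a periodic message, in bits per second."""
    return payload_bytes * 8 * 1_000_000 / period_us


track_bps = payload_rate_bps(32, 5_000)           # Target Track Data, every 5 ms
stabilisation_bps = payload_rate_bps(16, 2_000)   # Platform Stabilisation Data, every 2 ms

print(track_bps, stabilisation_bps, track_bps + stabilisation_bps)
```

Both flows are modest in bandwidth terms (51.2 kbit/s and 64 kbit/s of payload); the demanding aspect is the short period of each, which constrains latency rather than throughput.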

4.6.3 Example

An example of a vehicle combat system using a LAN-based distributed control system is described in Figure 4 overleaf :

Figure 4 : Next Generation Vetronics Architecture (file : yavt02.cdr)

[Diagram labels: Turret LAN; Bridge; Vehicle LAN.]

4.7 Naval Surface Combat System

Modern naval combat systems are characterised by the requirement to defend against high performance threats from the air and from under the sea, at the same time as fulfilling their own strategic or tactical role. They thus need to exhibit high performance as well as be able to optimise this performance for a particular role. Warships are also designed to go in harm's way and therefore need to be able not only to survive a strike from lethal ordnance, but also to fight hurt, either to safely disengage from combat or to complete their mission.

Meeting all these requirements requires special considerations and techniques to be applied throughout the design of the platform and combat systems. As both systems will rely to a greater and greater extent on computers for automation in order to improve response times, target location and weapon delivery accuracy as well as execute damage control, these computer systems will require special design techniques in terms of architecture, application software and interconnectivity.

4.7.1 Functional Performance Requirements

The overall generic functional requirements of an Integrated Naval Surface Combat System (INSCS) can be summarised as follows :

! Reception of surveillance data from a variety of sensors, both onboard and offboard, as well as the rest of the task force via datalinks.

! Generation and display of air, surface and sub-surface situation plots and tracks identifying friend and foe.

! Transfer of this data to other units and vessels in the task force via datalinks.

! Threat analysis of the situation.

! Response option synthesis.

! Pre-engagement system confidence checking by means of system built-in test.

! Simulation of potential engagements as an aid to the decision making process of the operators, directors and commanders.

! Target designation, weapon assignment and the control of counter-measures against the enemy's weapons.

! Operation and control of weapons and the release of munitions.

From these allocated requirements, it follows that information is required to be transmitted between sub-systems on the same platform and other platforms. The transmission of data can be considered as a derived requirement as it is transparent to the user. The manner in which data and information flow is achieved has significant implications on system architecture, design and implementation and should therefore be effected in such a manner as to optimise the design of the INSCS in areas such as functional performance, availability, survivability, flexibility, maintainability, upgradeability, training, cost, etc.

Such optimisation should be achieved by effective allocation of functions to physical equipment in order to achieve minimum data flow between equipments without compromising flexibility, survivability and upgradeability.

The transfer of information must be performed in real-time which places the following requirements on the data transfer system :

! Latency Control

! Jitter Control

! Quality of Service Guarantees

4.7.2 Data Distribution Requirements

The single most important consideration when partitioning a combat system is the time criticality of the various sensor/weapon/controller data paths.

4.7.2.1 Critical Functions

The following real-time critical functions are typical of an INSCS and determine LAN bandwidth and latency requirements :

! Tracking of up to 400 contacts from all sensor sources including Fire Control Radar (FCR), Primary Search Radar (PSR), Long Range Search Radar (LRSR), Anti-Submarine Warfare (ASW) suite, Electronic Warfare System (EWS) and off-ship via datalinks.

! Providing fire-control quality tracks for up to 16 air threats of speed Mach 3 (800 m/s).

! Simultaneous engagement of up to 10 air threats with own ship's defence missile systems, capable of speed Mach 5 (1 500 m/s).

! Simultaneous engagement of up to 4 air threats with ship's gun systems (close-in weapon systems).

! Simultaneous engagement of the balance of the air threats with ship's EW "weapons", i.e. jammers, chaff and decoys.

4.7.2.2 Critical Data

The applicable fire-control algorithms involve the transmission of the following time- and mission-critical data :

! Target Track Data (≈ 520 bytes) from a Tracker Sub-System (TSS) to a Gun Control Unit (GCU) every 20 ms with jitter of less than 5 ms.

! Platform Stabilisation Data (≈ 56 bytes) from the Navigation Sub-System (NSS) to the Weapon Control Units (WCUs) every 10 ms with jitter of less than 5 ms.

! Target Track Data (≈ 40 bytes) from a Tracker Sub-System (TSS) to a Missile Control Unit (MCU) with a maximum latency of 20 ms.

! Implementation of safety-critical commands such as fire command between Tracker and Weapon over critical virtual circuits. Such commands consist of occasional, but continuous, fully error controlled digital signals with repetition every 20 ms.
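The jitter limits above can be verified against recorded arrival times. The sketch below takes jitter to mean the deviation of each inter-arrival interval from the nominal period; this is one common definition and is an assumption here, as the text does not define the term at this point. The arrival traces are hypothetical.

```python
from typing import List


def max_jitter_us(arrivals_us: List[int], period_us: int) -> int:
    """Largest deviation of an inter-arrival interval from the nominal period."""
    intervals = [later - earlier
                 for earlier, later in zip(arrivals_us, arrivals_us[1:])]
    return max(abs(interval - period_us) for interval in intervals)


def meets_jitter_limit(arrivals_us: List[int], period_us: int, limit_us: int) -> bool:
    """True if every interval stays strictly within limit_us of the period."""
    return max_jitter_us(arrivals_us, period_us) < limit_us


# Hypothetical arrival times (microseconds) for the 20 ms Target Track Data flow.
good_trace = [0, 20_000, 41_000, 59_500, 80_000]
bad_trace = [0, 20_000, 46_000]

print(max_jitter_us(good_trace, 20_000), meets_jitter_limit(good_trace, 20_000, 5_000))
print(max_jitter_us(bad_trace, 20_000), meets_jitter_limit(bad_trace, 20_000, 5_000))
```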

4.7.2.3 Synchronisation

Collaborative, distributed application processes have a further requirement for absolute and relative synchronisation :

! Absolute (calendar time) synchronisation to a resolution and accuracy of 1 millisecond.

! Relative (between local clocks) synchronisation to a resolution and accuracy of 250 microseconds.

4.7.3 Network Architectures

The following existing network architecture is described and analyzed :

! Present Generation US Combat System

The following next generation network architectures are described and analyzed :

! Next Generation US Combat System

! Single LAN System Architecture

! Federated Network System Topology

! Internetworked Integrated System Architecture

! Client-Server System Architecture

4.7.3.1 Present Generation USN Combat System (US Navy)

4.7.3.1.1 Description

One of the latest operational combat systems in the US Navy is the Aegis Combat System[148]. The system was, however, designed some twenty years ago and features a centralised computer system architecture. The central processing elements are the Command and Decision (C&D) unit and the Weapon Control System (WCS). A complex system of dual-redundant, point-to-point, copper-wire interconnections exists, based on US Navy standard NTDS channels.

4.7.3.1.2 Advantages

The advantages of this topology are that it is a tried and proven system. It is also possible to characterise throughput and latency on a per link basis.

4.7.3.1.3 Disadvantages

Although Aegis systems are currently in use, they cannot be considered as examples of modern combat system architecture. The centralised computer architecture and point-to-point data distribution topology are severely limiting in terms of availability, flexibility, reconfigurability and upgradeability. Replication is very complex. Management of the physical and functional links is also complex and expensive, making upgrade difficult. The NTDS links also have very modest performance.

4.7.3.1.4 Topology

The architecture of the Aegis Combat System, based on a centralised computer controller and dedicated point-to-point serial links, is described in Figure 5 overleaf :

Figure 5 : Aegis Combat System Architecture (file : yaegisa02.cdr)

[Diagram: sub-systems grouped under the headings Sensors, Control and Weapons, with the C&D and WCS units at the centre. Labels include: ORTS; Link 11; Link 4A; Tomahawk Weapon System; LAMPS Helicopter System; Electronic Warfare System; Gun Fire Control System; Identification Systems; Phalanx Weapon System; Harpoon Missile System; Navigation Systems; Sonar Systems; Underwater Fire Control System; Vertical Launch System; Acoustic Countermeasures.]

4.7.3.2 Single LAN Topology Architecture

4.7.3.2.1 Description

This topology consists of a single LAN only. All the sensors, weapons and watchstations are connected to the same LAN. The LAN may be dual-redundant to increase the availability of the system.

4.7.3.2.2 Advantages

The advantages of this topology are that it is affordable and simple to conceptualise and implement. It also directly supports a horizontal functional architecture.

4.7.3.2.3 Disadvantages

This topology is limiting in its support of the required attributes of availability and survivability. Throughput problems are also likely to result from the requirement to integrate image, graphics and shared databases on a single LAN.

4.7.3.2.4 Topology

An example of a single LAN-topology combat system architecture, based on FDDI, is proposed in terms of a conceptual topology described in Figure 6 overleaf :

Figure 6 (file : ysflt02.cdr)

[Diagram labels: FDDI LAN; Principal Warfare Areas (ASuW, OW, AAW, EW, ASW); Detect, Control, Engage; Specialised Interface Units; Gateway; Modem; Shared Resources; Dual-Redundant Database and Fileservers; System and Network Management Console.]

4.7.3.3 Next Generation USN Combat System (US Navy)

4.7.3.3.1 Description

A group of combat system engineers and academics in the United States are presently giving consideration to the next generation of US Navy combat system requirements and architectures[148]. The work is being performed by the Aegis Computer Architecture, Data Bus and Fibre Optics Working Group sponsored by the US Naval Sea Systems Command (NavSea). This next generation of surface combatants is destined for operational use in the 2010 to 2030 timeframe; many of the concepts are therefore somewhat advanced for present consideration, while the full spectrum of operational requirements can only be speculated. However, the group recognises the requirement for transition and has formulated many relevant principles and philosophies, some of which are implementable in the short to medium term.

In terms of system architecture, NavSea propose a considerably segmented approach according to the principal warfare area and detect/control/engage principle. The systems constituting each of these segments are arranged horizontally with a corresponding matrix of dual-redundant fibre optic LAN segments providing interconnection. This amounts to a system of seven LAN segments arranged orthogonally with twelve LAN interconnects which they term gateways (but are probably more correctly termed bridges or routers).
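The counts of seven segments and twelve interconnects are consistent with a matrix of four warfare-area LANs crossed by three detect/control/engage LANs (4 + 3 = 7 segments, 4 x 3 = 12 crossings). The sketch below models that reading; it is an interpretation, not a statement of the NavSea design. Segment names follow the labels of Figure 7, and the failure case is hypothetical.

```python
from collections import deque
from itertools import product
from typing import FrozenSet, Optional, Set, Tuple

WARFARE = ["EW", "ASW", "AAW", "Strike"]      # warfare-area LANs
FUNCTION = ["Sensor", "Control", "Weapons"]   # detect / control / engage LANs

Bridge = Tuple[str, str]


def build_bridges(failed: FrozenSet[Bridge] = frozenset()) -> Set[Bridge]:
    """One bridge/router at every crossing, less any that have failed."""
    return {pair for pair in product(WARFARE, FUNCTION) if pair not in failed}


def bridges_traversed(bridges: Set[Bridge], src: str, dst: str) -> Optional[int]:
    """Fewest bridges a message crosses between two segments (breadth-first search)."""
    adj = {segment: set() for segment in WARFARE + FUNCTION}
    for warfare, function in bridges:
        adj[warfare].add(function)
        adj[function].add(warfare)
    dist = {src: 0}
    queue = deque([src])
    while queue:
        segment = queue.popleft()
        if segment == dst:
            return dist[segment]
        for nxt in adj[segment]:
            if nxt not in dist:
                dist[nxt] = dist[segment] + 1
                queue.append(nxt)
    return None


healthy = build_bridges()
degraded = build_bridges(frozenset({("EW", "Sensor")}))
print(len(WARFARE + FUNCTION), len(healthy))
print(bridges_traversed(healthy, "EW", "Sensor"),
      bridges_traversed(degraded, "EW", "Sensor"))
```

In this model, the loss of a single interconnect lengthens the affected path from one bridge to three, which illustrates why reconfiguration has latency implications.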

4.7.3.3.2 Advantages

The system architecture and topology provides for a very flexible, reconfigurable and upgradeable combat system. LAN segmentation supports enclaving which enhances survivability. With the employment of high performance bridges or routers, the topology also supports horizontal functional integration. In terms of combat system engineering, it may represent the ultimate goal in this regard.

4.7.3.3.3 Disadvantages

Of consideration is that under certain conditions of system reconfiguration (e.g. after equipment failure) data messages may have to be routed through two or more intermediate switching nodes to reach their destinations. This would have latency implications, although this may well be insignificant with next generation routers. It is therefore necessary that network and transport layer protocols have enhanced capabilities, such as those of XTP and XTP-aware IP routers.

The other drawback, especially for smaller vessels (such as frigates or corvettes) is that the solution could tend to be expensive.

4.7.3.3.4 Topology

The architecture and topology of the US Navy Next Generation Combat System, employing internetworking technologies, is described in Figure 7 overleaf :

Figure 7 : Next Generation USN Combat System Architecture (file : yusnnga02.cdr)

[Diagram labels: Detect, Control, Engage; Sensor LAN, Control LAN, Weapons LAN; EW LAN, AAW LAN, ASW LAN, Strike LAN; Track Correlators; Weapon Control Systems; Multi-Warfare Control System; sensors (2D and 3D radar, IFF, hull-mounted and towed array sonar, EW, I&W, data links); weapons (guns, missiles, torpedoes, vertical launch system, jammers, chaff and decoys); Navigation, Power Management and Damage Control. Legend: Bridge/Router; Warfare Areas EW, AAW, ASW, Strike (OW), ASuW.]

4.7.3.4 Federated Backbone-Topology Integrated System

4.7.3.4.1 Description

This topology employs a two-level hierarchy of LANs, i.e. local LANs for each functional or geographic area and a backbone LAN to interconnect the lower level LANs. Bridges or routers would provide the interface between the local LANs and backbone LAN.

4.7.3.4.2 Advantages

The advantage of this topology is that it is fairly non-complex and hence simple to design, develop and manage. Organisations developing specific segments of the combat system may feel "more comfortable" with this arrangement as interfacing and qualification are less complex. The topology also supports survivability through enclaving.

4.7.3.4.3 Disadvantages

Essentially this topology implies a vertical approach to system integration. While the transfer of local data may be optimised, data between segments will have to traverse two bridges or routers. This may have latency implications. The topology would also require a considerable number of bridges and routers; this would have cost implications especially if these devices were dual-redundant.

A further disadvantage of this topology is that reconfigurability is severely restricted. The backbone may also become a data throughput bottleneck if its bandwidth is equal to that of the LANs.

4.7.3.4.4 Topology

An example of a federated backbone-topology architecture, using FDDI, is proposed in terms of a conceptual topology described in Figure 8 overleaf :

Figure 8 : Combat System Architecture based on Backbone Topology (file : yfbbt02.cdr)


4.7.3.5 Integrated Hub-based Architecture

4.7.3.5.1 Description

This topology consists of a number of essentially independent LANs connected by a hub or router. Each LAN serves an independent functional area, such that dataflow is largely confined to the local LAN and traffic through the hub is minimised.

4.7.3.5.2 Advantages

The advantages of this topology are that it provides functional separation of segments while not detracting from intersegment connectivity. It thereby supports less complex integration. Data between any segments only has to traverse one router. This will minimise latency.

The topology provides a good balance between horizontal functional integration, federalism and cost-effectiveness.

4.7.3.5.3 Disadvantages

The disadvantages are that the hub constitutes a single point of failure. This can be circumvented by installing dual-redundant hubs. However, these multi-LAN hubs are expensive. If these hubs were situated remotely from each other, there would be cabling complexity implications. Special switch-over software would also be required to allow the hot standby hub to transparently take over from the primary.

4.7.3.5.4 Topology

An example of a hub-based architecture, using FDDI, is proposed in terms of a conceptual topology described in Figure 9 overleaf :

Figure 9 : Integrated Hub-based Combat System Architecture (file : ygenarc2.cdr)


4.7.3.6 Client-Server Architecture

Client-server techniques are finding increasing implementation in a wide variety of applications. In many organisations, expensive, centralised mainframe computers are being replaced by client-server systems consisting of distributed file and computation servers connected to networked workstations. This has been made possible by the affordability of modern, high-performance computer processors, inexpensive data storage media and peripheral devices, as well as reliable computer networks.

Such architectures offer enhanced dependability as they can be designed to contain no single points of failure, enhanced performance due to distributed multiprocessing as well as increased user access to the system resources.

4.7.3.6.1 Description

It is proposed that client-server techniques, using high-performance, fault-tolerant computer networks, are eminently suitable for achieving the requirements of next-generation, real-time, mission-critical, distributed systems.

The architecture is based on generic computing resources, e.g. servers and multifunction consoles (MFCs), i.e. the clients. The servers include application servers (fileservers), database servers and computation servers. The servers are distributed throughout the platform, normally in server farms. All critical servers are replicated, either in server pairs or server pools.

Application servers store application software programmes and download these to MFCs at startup, either at system startup, or following reconfiguration or MFC failure.

The particular application software downloaded to an MFC characterises the latter, whereby it becomes a task-oriented console, for example a sonar console or air defence console. MFCs become the clients of the system. These MFCs are standard in all respects, having identical characteristics and hardware configuration. They are also distributed throughout the platform, primarily in operations rooms, but also in such a manner as to enhance survivability in the case of significant battle damage such as a missile or torpedo strike. MFCs may be fitted with local mass storage, preferably robust storage such as read/write optical disk, which may be used for local programme load in the event of network communication failure. They may also be fitted with high-speed storage such as Flash EPROM, which may be used for default application characterisation in the event of temporary power failure or other system disturbances. MFCs should in any event be supplied with an uninterruptible power supply (UPS), preferably on an individual basis, but at least in local groups.

Database servers store application data using suitable storage techniques, i.e. high-speed technologies such as NOVRAM for real-time databases and optical disk for mass storage and local backup of real-time databases. All database servers are mirrored, either in mirrored server pairs or mirrored server pools. Typically the database servers and MFCs will interact using Structured Query Language (SQL).

Computation servers are optimised, high-performance numeric processing units and signal processors. Normally these would have to be provided with input data in the appropriate format by dedicated front-end processors. These computation servers would operate under a multiprocessing operating system and would provide multiprocessing services to the system clients. Typically clients (MFCs) would determine the requirement for computation resources in terms of volume and speed as well as availability of processors. Relevant data streams would then be directed via the network to the computation servers from the front-end processors under control of the client. This network would be required to support a transaction-oriented service model (i.e. client-server model), reliable multicast and multicast group management. Competition between clients for processing resources would be arbitrated by a replicated system management unit. After processing, data would be transferred to the clients for further task-specific processing such as display, target labelling, designation, engagement, etc.

4.7.3.6.2 Server Integrity Techniques

Database management requires special techniques in terms of integrity and concurrency. A technique that is appropriate to support the maintenance of server integrity is server mirroring.

Server mirroring involves enhanced network operating system capabilities that allow a hot-standby server to have its data and configuration updated online over a local datalink, i.e. synchronised with the primary. The performance of such a datalink needs to be considerable if the mirrored servers are to maintain real-time concurrency.


4.7.3.6.3 Advantages

The advantages of this architecture are that it offers the highest level of performance, flexibility, reconfigurability and accessibility. Relatively simple techniques can be employed to enhance dependability and survivability.

4.7.3.6.4 Disadvantages

The disadvantages are that many of the fundamental building blocks required to implement this architecture do not yet exist in proven form. These include reliable real-time, distributed operating systems, real-time distributed databases and database access languages. However, development of these technologies is being undertaken.

Complexity and cost of such architectures are also problems at present.

4.7.3.6.5 Topology

An example of a client-server architecture, based on FDDI, is proposed in terms of a conceptual topology described in Figure 10 overleaf :

Figure 10 : Combat System based on Client-Server Topology (file : ycst02.cdr)

[Diagram labels: FDDI LAN; servers (file, database, application, computation, communications); front-end processors; specialised interface units; multifunction consoles; reconfigurable console; system and network management; modem gateway; principal warfare areas (ASuW, OW, AAW, EW, ASW); Detect, Control, Engage.]

4.8 Multimedia Information System

Distributed multimedia is typically required in systems requiring remote surveillance (such as security systems), as well as collaborative simulation systems (such as team training systems).

In the entertainment industry, advanced animation techniques are becoming extensively used in modern film production, both in the production of purely animated films and in the animation of real actors. Such techniques save substantial amounts of money and production time, and increase safety in situations that were previously dangerous and required specialised personnel (stuntmen and special effects operators).

4.8.1 Functional Performance Requirements

The overall generic functional requirements of a multimedia information system can be summarised as follows :

• Provide a minimum of 64 physical circuits.

• Each physical circuit supporting a minimum of 16 virtual circuits.

• Each virtual circuit supporting up to :

  ◦ 8 broadcast quality digital video circuits, or
  ◦ 16 professional quality digital video circuits, or
  ◦ 32 multimedia quality digital video circuits.

• Dynamic Compression Control.

• Geographic range > 1 000 m.

• Video Server integration.

• Fault-Tolerant Switches and Inter-Switch Links.

4.8.2 Requirements Implications

Network technologies such as FDDI do not have the throughput to support such systems. There are therefore requirements for very high-speed technologies such as ATM. Achieving full-performance connectivity across network segments may require ultra-high-speed technologies such as Fibre Channel and SCI.

4.8.3 Topology

An example of a Multimedia Information System, based on ATM, is proposed in terms of a conceptual topology described in Figure 11 overleaf :

Figure 11 : Multimedia Information System based on ATM (file : ygvss02.cdr)

[Diagram labels: ATM switches joined by redundant high-speed ATM, Fibre Channel or SCI links; video producers and consumers; interface units (A/D, real-time compression, LAN I/F); ATM local area networks over multimode fibre; video servers.]

4.9 Chapter Summary

Chapter 4 has provided a set of typical application examples, drawn from a diverse range of application areas, which fall within the domain of the thesis.

With the domain and context established, the problem space can be addressed in terms of a set of formal requirements for the system.


Chapter 5

System Requirements


5. System Requirements

The technical requirements of any system should be derived from the user requirements, these being both operational and environmental. The user requirements will be expressed in terms of mission descriptions and conditions, as well as the probability of success in completion of those missions. The user will normally also wish to complete these missions safely (cf. kamikaze), i.e. with respect to both the personnel operating the plant or platform and the survival of the plant itself.

These become the allocated requirements.

5.1 Allocated Requirements

An appropriate way of quantitatively expressing the user's mission requirements is in terms of system effectiveness, i.e. the product of the probability of successfully completing the mission and the probability that the equipment will be available when called into use, i.e. its dependability. The probability of successfully completing a mission relates closely to the system's functional performance while the system's dependability relates directly to its reliability and maintainability.
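As a minimal illustration of the product form described above, the following sketch computes system effectiveness from the two probabilities. The function name and the numerical values are hypothetical and are chosen purely for illustration; they are not drawn from any particular system.

```python
def system_effectiveness(p_mission_success: float, dependability: float) -> float:
    """Return effectiveness as the product of the two probabilities."""
    for name, p in (("p_mission_success", p_mission_success),
                    ("dependability", dependability)):
        if not 0.0 <= p <= 1.0:
            raise ValueError(f"{name} must be a probability in [0, 1], got {p}")
    return p_mission_success * dependability


# Hypothetical figures: 95 % chance of mission success, 90 % dependability.
effectiveness = system_effectiveness(0.95, 0.90)
print(f"System effectiveness: {effectiveness:.3f}")
```

Because the measure is a product, the lower of the two factors bounds the result: a functionally excellent system that is seldom available remains an ineffective one.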

The user's functional requirements for a highly effective system may be qualitatively summarised as follows :

• Having fast access to accurate and meaningful information

• in order to provide the ability to make quick and well-founded strategic and tactical decisions

• to respond as the operational requirement may demand.

From this it can be derived that the user requires a coherent, dependable and operable system.

To provide system coherency, sensors, decision support systems and actuators need to be fully integrated. For the system to be dependable, it must be fault-tolerant. Operability requires that the system be user-friendly (ergonomically optimal) and supportable (operators can be trained, etc.).

The user also requires the system to be secure, survivable and flexible.

Deriving lower-level requirements from those above means that the system must be reliable (fail seldom), maintainable (faults can be corrected with available resources in a reasonable time), reconfigurable (system resources reallocated according to mission mode or if onboard maintenance is not possible) and electromagnetically compatible (neither being susceptible to nor producing electromagnetic interference), as well as small in size and low in mass (to minimise volume, propulsion, fuel usage, radar cross-section, etc.).

Apart from these, the user organisation has further requirements of any operational system. It must be expandable (to facilitate growth), upgradeable (to facilitate new technologies) and affordable.


To be affordable generally implies that the components of the system should be commercial off-the-shelf (COTS) items. This in turn normally requires that these items conform to accepted standards, i.e. they are non-proprietary.

The functional performance of the system can be extensively enhanced by the use of shared databases and multimedia. In the medium-term future, knowledge-based (expert) systems will be required in order to give the user a competitive advantage.

Apart from all the above requirements, the system must operate in a critically real-time environment. This demands the capability of handling high data rates and vast data volumes with low latency times in a deterministic manner.

5.2 Derived Requirements

It is proposed that the extensive array of requirements determined in Paragraph 5.1 focuses the resulting system architecture towards one with the following attributes :

A distributed system architecture integrated by means of a system of local area networks (LANs) [115, 146, 147]

The LAN implements sub-system connectivity, supports sub-system replication and provides the required data throughput. A system of interconnected LANs supports a high level of system integration across the major functional areas and a reduction of individual LAN throughput requirements, while promoting survivability in the case of localised damage.

Such an architecture also facilitates co-ordination and development of a coherent, multi-dimensional scenario visualisation supporting collaborative decision making by the system command or management team.

For real-time, mission-critical systems the specific system architecture should be one to :

• Minimise latency of end-to-end critical data transfers.

• Provide simultaneous data transfer between multiple sub-systems (i.e. broadcast and multicast).

• Maximise dependability and survivability.

• Maximise functional integration.

In cases where direct LAN communication is not possible, multi-platform data communication can be supported by datalinks, e.g. packet radio modems or satellite communications, effectively forming a Wide Area Network (WAN). In this regard the HPNWG offers the following observation[86] :

"The generic internetwork scenario represents the challenges presented by a global communication environment and off-platform communications."


The main characteristics of the network should be fault-tolerance, electromagnetic compatibility and transparency. The main attributes of the LAN should be intrinsic redundancy and fibre optics technology. Together these provide the capability of high bandwidth, dependability and survivability.

In essence the LAN provides networking services by providing interconnectivity, operating system services and interfaces to shared resources such as input and output devices.

The LAN also provides an infrastructure for data services. Data services encompass data access services (database management services), data interchange services (gateways, routers, bridges and switches) and data storage services.

Together the networking services and data services provide an information management infrastructure. These become the derived requirements of the system.

The following derived requirements are addressed more specifically :

5.2.1 Interconnectivity

The requirement for interconnectivity implies the LANs should be internetworkable, i.e. transparently share control and user data across inter-LAN boundaries. Such transparency applies in terms of functionality as well as in timeliness and order.

The capability of internetworking implies the requirement for bridging, routing or switching services within one of the protocol layers. As packets may travel on different routes in an internetwork topology, they may be re-ordered at the destination and therefore provision must be made within one of the layers for correct re-ordering.
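The re-ordering provision described above can be sketched as a resequencing buffer at the destination. This is an illustrative sketch only: it assumes each packet carries a monotonically increasing sequence number, and the class name `Resequencer` is hypothetical. Sequence number wrap-around and timeouts for packets that never arrive are deliberately omitted.

```python
from typing import Dict, List


class Resequencer:
    """Deliver packets to the application in sequence-number order."""

    def __init__(self, first_seq: int = 0) -> None:
        self.next_seq = first_seq            # next sequence number to deliver
        self.pending: Dict[int, bytes] = {}  # early arrivals awaiting a gap fill

    def receive(self, seq: int, payload: bytes) -> List[bytes]:
        """Accept one packet; return payloads that are now deliverable in order."""
        if seq < self.next_seq or seq in self.pending:
            return []                        # duplicate: already delivered or held
        self.pending[seq] = payload
        deliverable = []
        while self.next_seq in self.pending:
            deliverable.append(self.pending.pop(self.next_seq))
            self.next_seq += 1
        return deliverable


rx = Resequencer()
delivered = []
# Packets arrive out of order after taking different routes.
for seq, payload in [(1, b"B"), (0, b"A"), (3, b"D"), (2, b"C")]:
    delivered.extend(rx.receive(seq, payload))
print(delivered)
```

It should be noted that holding early arrivals adds delay to them, so in a real-time system the buffer depth and holding time would themselves need to be bounded.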

While message delays can result in single LAN topologies, these can be exacerbated when crossing LAN boundaries, especially in the case of routers where software performs this task. It is a requirement in real-time systems that provision be made to minimise transmission delays and/or provide mechanisms to recover timing information.

5.2.2 Scalability

Scalability is a measure of the appropriateness of a general solution for a range of applications from small to large, as well as the ability of an initial solution to be more or less linearly extended (in terms of effort, cost, etc.).

The requirement for scalability implies that the solution should not include components that are inordinately expensive, i.e. a minimum configuration should not require such items. Preferably the cost of a network infrastructure should also scale proportionately with size.


5.2.3 Security

With such a high degree of data and information integration, especially in military and certain commercial (e.g. banking) applications, there are critical requirements for sophisticated security mechanisms to be implemented throughout the information management system.

The network infrastructure must be able to offer security services to the system and support the implementation of the user's security policy.

5.2.4 Determinism

A major requirement for the transfer of data in a real-time system is that it be deterministic, i.e. exhibit bounded transmission delay. This ensures that time-critical data messages are transferred within guaranteed time 'windows', which in turn allows control algorithms implemented by collaborative, distributed sub-systems to converge in real time and control orders to be executed within hard deadlines.

Determinism is also a highly desirable attribute in test, evaluation and qualification as these processes are difficult and time consuming in non-deterministic systems.

Essentially, non-deterministic systems offer "best-effort" service. As Kopetz and Veríssimo[94] observe :

"It is difficult to systematically validate the adequacy of a best-effort design, particularly for rare event situations."

Definition and employment of priority and precedence attributes can provide guaranteed response from event-triggered systems which would otherwise offer only a best-effort service.

5.2.5 Priority

Priority is an attribute defining the time criticality of a data message, i.e. the time period after sampling for which the data is valid.

Real-time protocols are required to offer a priority message service. This service ensures processing of the highest priority messages before those with lower priority and assists in guaranteeing end-to-end delivery latencies. The service has the capability to discriminate between messages, thus allowing time critical messages to be scheduled ahead of others, thus assisting them in meeting their deadlines.

The granularity of the priority service defines the resolution with which the system can implement a system-wide message priority scheme, i.e. differentiate between each message or class of message.
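A priority message service of the kind described above can be sketched with a binary heap. The sketch assumes that a lower number denotes a more time-critical message and that messages of equal priority are served in arrival order; both conventions, and the class name `PriorityMessageQueue`, are assumptions made for illustration and are not taken from any particular protocol.

```python
import heapq
import itertools
from typing import List, Tuple


class PriorityMessageQueue:
    """Serve the highest-priority message first; FIFO within a priority level."""

    def __init__(self) -> None:
        self._heap: List[Tuple[int, int, str]] = []
        self._arrival = itertools.count()   # tie-breaker preserving arrival order

    def enqueue(self, priority: int, message: str) -> None:
        # Assumption: a lower number means a more time-critical message.
        heapq.heappush(self._heap, (priority, next(self._arrival), message))

    def dequeue(self) -> str:
        _, _, message = heapq.heappop(self._heap)
        return message

    def __len__(self) -> int:
        return len(self._heap)


queue = PriorityMessageQueue()
queue.enqueue(5, "status report")
queue.enqueue(0, "weapon control order")
queue.enqueue(5, "log entry")
queue.enqueue(2, "track update")

service_order = [queue.dequeue() for _ in range(len(queue))]
print(service_order)
```

The time-critical order is served first although it was queued second, and the two messages sharing priority 5 retain their arrival order.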

Certain protocols offer a very large number of priority levels (e.g. 2^16). It may not be appropriate for the system or user application to define so many priority levels in the message scheduling policy. It may be more appropriate to categorise priorities into bands for this purpose.

In fact, Sha and Sathaye[118] conclude that in multitasking systems, using Generalised Rate Monotonic Scheduling (GRMS), schedulability loss is negligible with 8 encoded priority bits, corresponding to 256 priority levels. That is, the worst case schedulability obtained with 8 priority bits is close to that obtained with an unlimited number of priority bits. It is postulated that this also applies in the case of priority message scheduling.
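One simple way of categorising fine-grained priorities into bands is to retain only the most significant bits of the priority value. The sketch below assumes a 16-bit native priority field in which a larger or smaller number consistently denotes greater urgency, and collapses it to the 8 bits (256 levels) quoted above. The field width and the function name are assumptions for illustration.

```python
NATIVE_BITS = 16   # assumed width of the protocol's priority field
BAND_BITS = 8      # 256 bands, the figure quoted from Sha and Sathaye


def priority_band(native_priority: int) -> int:
    """Collapse a fine-grained priority into one of 2**BAND_BITS bands."""
    if not 0 <= native_priority < 2 ** NATIVE_BITS:
        raise ValueError("priority outside the native range")
    # Keeping the most significant bits preserves the relative ordering.
    return native_priority >> (NATIVE_BITS - BAND_BITS)


for p in (0, 255, 256, 40000, 65535):
    print(p, "->", priority_band(p))
```

The mapping is monotonic, so a message never overtakes one of higher native priority merely because of banding; messages whose native priorities fall within the same band are simply treated alike.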

5.2.6 Precedence

Precedence is the relative importance of a data message, i.e. the functional criticality of the message.

In a real-time, mission-critical, distributed system not all characteristics can or will be accorded equal importance. In any system, the system capabilities should be ranked according to a classification of characteristics. Often this is done by specifying capabilities and characteristics as critical, major or minor.

Critical characteristics are those significant to the performance of the system's primary mission or to the safety of the operating personnel or platform itself. Major characteristics are those significant to the performance of the system's secondary mission(s). Minor characteristics make up the balance and normally relate to the aesthetic and/or engineering elegance of the system solution.

With regard to data messages, a precedence should be allocated to each message corresponding to the classification of the system function with which it is directly involved.

With event-triggered protocols, which provide a best-effort service, precedence allows graceful degradation in the case of transient overload, i.e. the most functionally critical events are handled first.
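The graceful degradation described above can be sketched as follows. The sketch assumes the three-level classification given earlier (critical, major, minor) and a hypothetical processing capacity expressed as a number of messages; the message names are invented for illustration.

```python
from enum import IntEnum
from typing import List, NamedTuple


class Precedence(IntEnum):
    CRITICAL = 0   # primary mission or safety
    MAJOR = 1      # secondary mission(s)
    MINOR = 2      # the balance


class Message(NamedTuple):
    precedence: Precedence
    name: str


def admit_under_overload(messages: List[Message], capacity: int) -> List[Message]:
    """Keep at most `capacity` messages, shedding the least critical first."""
    # sorted() is stable, so arrival order is kept within each precedence class.
    ranked = sorted(messages, key=lambda m: m.precedence)
    return ranked[:capacity]


backlog = [
    Message(Precedence.MINOR, "display theme change"),
    Message(Precedence.CRITICAL, "missile threat alert"),
    Message(Precedence.MAJOR, "navigation waypoint update"),
    Message(Precedence.CRITICAL, "fire control order"),
    Message(Precedence.MINOR, "maintenance log entry"),
]
handled = admit_under_overload(backlog, capacity=3)
print([m.name for m in handled])
```

With capacity for only three of the five queued messages, both critical messages and the major message are handled, while the two minor messages are shed.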

The significance of allocating precedences to data messages is that it allows the system designer to implement an overall error control policy at the system level and allocate quality of service attributes to such messages.

5.2.7 Class of Service

The Class of Service is a mechanism by which the application user can provide an indication to the transfer infrastructure of the traffic profile, i.e. a description of the payload data type. Examples are state (synchronous or periodic), event (asynchronous or aperiodic), isochronous (fixed sample size and period) and continuous media (e.g. compressed or uncompressed video, audio or voice). The transfer infrastructure can then use Class of Service to optimise dataflow for each particular data stream.


5.2.8 Quality of Service

The Quality of Service (QoS) is a mechanism by which the application user can provide an indication to the transfer infrastructure of particular attributes or requirements of the payload data. QoS can be used to negotiate with end nodes as to the type of service requested (by the transmitter) and that which can be provided (by the receiver). Certain parameters may be designated for intermediate nodes while others may be destined for action at endpoints.

Typically such special processing requirements are exemption from flow control (e.g. for continuous media), remote priority processing or resource allocation such as buffer allocation or processor time allocation.

De Rezende et al. propose the following QoS parameters for multimedia applications[67]: throughput, end-to-end delay, delay jitter and error rate.

The HPNWG proposes a more comprehensive set of QoS parameters for general next-generation applications[86]: throughput, transport delay, transport jitter, protection (security), residual error rate, transport establishment connection delay, active group integrity (for multicast groups) and flow control.

By providing Class of Service and Quality of Service descriptors, the user application can invoke data transfer policy, implemented by the intermediate and destination nodes, based on discrimination of payload data type. Such a discriminatory capability allows user policy to invoke protocol mechanisms, thereby optimising LAN dataflow in terms of timeliness, reliability and efficiency.
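The two descriptors can be sketched as simple data types. The Class of Service values follow the examples given in Paragraph 5.2.7 and the four QoS fields follow the multimedia parameter set quoted above; the field names, the units and the negotiation rule (each parameter settles on the weaker of the requested and offered values) are assumptions made for illustration only.

```python
from dataclasses import dataclass
from enum import Enum


class ClassOfService(Enum):
    STATE = "state"                  # synchronous or periodic
    EVENT = "event"                  # asynchronous or aperiodic
    ISOCHRONOUS = "isochronous"      # fixed sample size and period
    CONTINUOUS_MEDIA = "continuous"  # video, audio or voice


@dataclass(frozen=True)
class QoS:
    throughput_kbps: int      # higher is better
    max_delay_ms: float       # lower is better
    max_jitter_ms: float      # lower is better
    max_error_rate: float     # lower is better


def negotiate(requested: QoS, offered: QoS) -> QoS:
    """Return the service both ends can honour (the weaker value per parameter)."""
    return QoS(
        throughput_kbps=min(requested.throughput_kbps, offered.throughput_kbps),
        max_delay_ms=max(requested.max_delay_ms, offered.max_delay_ms),
        max_jitter_ms=max(requested.max_jitter_ms, offered.max_jitter_ms),
        max_error_rate=max(requested.max_error_rate, offered.max_error_rate),
    )


def exempt_from_flow_control(cos: ClassOfService) -> bool:
    """Continuous media is the example given for flow-control exemption."""
    return cos is ClassOfService.CONTINUOUS_MEDIA


# Hypothetical request from a transmitter and offer from a receiver.
requested = QoS(throughput_kbps=20_000, max_delay_ms=10.0,
                max_jitter_ms=1.0, max_error_rate=1e-6)
offered = QoS(throughput_kbps=15_000, max_delay_ms=25.0,
              max_jitter_ms=2.0, max_error_rate=1e-6)
agreed = negotiate(requested, offered)
print(agreed, exempt_from_flow_control(ClassOfService.CONTINUOUS_MEDIA))
```

In practice the transmitter would also have to decide whether the agreed service is still acceptable for the data stream concerned; that decision is a matter of user policy and is not shown.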

5.2.9 Service Models

To support the data transfer requirements of the generic real-time, distributed system, the network infrastructure should offer a rich set of service models :

5.2.9.1 Connection

A connection is an association between endpoints that is established by protocol control data. User data is transmitted between the endpoints without further path establishment until the association is disestablished.

Connection-type services are useful for long lived associations, especially for the transfer of state-type data as well as continuous media. As messages may consist of a number of packets, it is often optimal to open a connection for the message, transmit the requisite number of packets and then close the connection.

Connection establishment does have a certain overhead, however, and this should be taken into account when allocating a connection-type service to a dataflow. For example, TCP and TP4 require six and five interactions respectively to establish and disestablish a connection. By comparison, XTP requires only three interactions.


A connection also provides the basic service for a critical virtual circuit.

5.2.9.2 Datagram

A datagram is a self-contained data entity providing a connectionless mode of data transfer between endpoints. Each packet requires path establishment and contains sufficient information within the packet to establish the correct path. A datagram is not necessarily acknowledged and is therefore an unreliable service. A datagram consists of a single packet only.

Datagrams can be used to transmit single packets with very low latency. Such a service is useful where the overhead of a connection cannot be afforded or justified and transport level error control is not required, e.g. state-type messages where samples are updated fast enough to nullify the effects of the occasional lost message.

5.2.9.3 Reliable Datagram

Datagrams can be made to be reliable by forcing the receiver to acknowledge receipt of the packet. A reliable datagram consists of just two packets.

Reliable datagrams can be used to transmit single packets with low latency. Such a service is useful where the overhead of a connection cannot be afforded or justified, but transport level error control is required, e.g. control or event messages where no lost messages can be tolerated.
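The two-packet exchange can be sketched as a data packet followed by an acknowledgement. The retransmission on a missing acknowledgement shown here is an assumption about how a sender might react to loss and is not prescribed by the text; `transmit` is a hypothetical channel function and `make_lossy_channel` is a test double, not part of any real protocol stack.

```python
from typing import Callable, Optional

Channel = Callable[[bytes], Optional[bytes]]


def send_reliable_datagram(payload: bytes, transmit: Channel,
                           max_attempts: int = 3) -> int:
    """Send one packet and await its acknowledgement, retrying on loss.

    `transmit` returns the receiver's acknowledgement packet, or None if
    the data packet or the acknowledgement was lost.
    Returns the number of attempts used.
    """
    for attempt in range(1, max_attempts + 1):
        if transmit(payload) == b"ACK":
            return attempt
    raise TimeoutError(f"no acknowledgement after {max_attempts} attempts")


def make_lossy_channel(losses_before_success: int) -> Channel:
    """Test double: drop the first N transmissions, then acknowledge."""
    remaining = [losses_before_success]

    def channel(_payload: bytes) -> Optional[bytes]:
        if remaining[0] > 0:
            remaining[0] -= 1
            return None
        return b"ACK"

    return channel


clean_attempts = send_reliable_datagram(b"control order", make_lossy_channel(0))
lossy_attempts = send_reliable_datagram(b"control order", make_lossy_channel(2))
print(clean_attempts, lossy_attempts)
```

On a loss-free channel the exchange completes in a single attempt, i.e. just the two packets referred to above. Each retry adds latency, so the retry limit would have to be chosen with the message deadline in mind.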

5.2.9.4 Transaction

The transaction mode of service is a request/response interaction between a client process and one or more server processes. Transactions are characterised by high priority short length requests with longer length responses from single or multiple servers. When the transaction completes, control returns to the client. An unreliable transaction consists of just two packets, while a fully reliable transaction consists of three packets.

The transaction service is the basis of the remote procedure call (RPC). According to Mullender[109] :

"One of the easiest and simplest communication interfaces is the remote procedure call. It has already become one of the most popular in distributed systems research and appears to make its way slowly into commercial and networked distributed systems."

There appears to be a strong trend in distributed system design to follow a client-server architecture. The advent of powerful, yet inexpensive computers capable of taking the form of both node processors (i.e. servers) and workstations supports this trend well. The architecture itself supports parallel processing, dependability, availability and survivability. Special languages such as Structured Query Language (SQL) have been developed especially for client-server database applications, and distributed operating systems such as Mach[76] are also under development.

As client-server architectures establish themselves as a fundamental computer architecture of the future, remote procedure calls and transaction-type message services increase in importance. As Davids and Karakbek[63] observe :

"Transaction-oriented applications are of increasing importance with the increasing demand of data-bank applications and information services like for instance world wide web. Characteristic for such scenarios is the high number of transactions, that is request-response associations (i.e. data-bank queries), per time unit. Such transactions are usually extremely short-lived associations which are established at request time and are torn down after reception of the response."

Mullender concurs as he points out :

"An overwhelming proportion of interactions between processes in a distributed system are remote operations - one process sends a message to another with a request of some kind and the other process returns a reply or an acknowledgement."

It is contended that client-server architectures are equally appropriate for real-time, mission-critical applications as these have the same intrinsic requirements, albeit more stringent. In particular, communication between remote processes must be achieved reliably and timeously.

A highly desirable attribute of transactions is that they exhibit atomic commitment, i.e. they complete successfully (commit) or fail completely (abort), with no partial state. Some transaction protocols are designed to implement this intrinsically; otherwise the real-time operating system must fulfil this requirement.
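The all-or-nothing behaviour can be sketched for a single process as follows. The operations are applied to a private copy of the state, which is published only if every operation succeeds. This is a local illustration under stated assumptions: the function names and the resource figures are hypothetical, and atomic commitment across several servers requires a distributed commit protocol that is not shown.

```python
import copy
from typing import Callable, Dict, List

Operation = Callable[[Dict[str, int]], None]


def run_atomically(state: Dict[str, int], operations: List[Operation]) -> bool:
    """Apply every operation or none; return True on commit, False on abort."""
    working = copy.deepcopy(state)      # operate on a private copy
    try:
        for operation in operations:
            operation(working)
    except Exception:
        return False                    # abort: `state` was never touched
    state.clear()
    state.update(working)               # commit: publish all changes together
    return True


def allocate(amount: int) -> Operation:
    """Build an operation that reserves `amount` processors."""
    def op(s: Dict[str, int]) -> None:
        if s["free_processors"] < amount:
            raise RuntimeError("insufficient processors")
        s["free_processors"] -= amount
        s["allocated"] += amount
    return op


resources = {"free_processors": 4, "allocated": 0}
first_ok = run_atomically(resources, [allocate(3)])
second_ok = run_atomically(resources, [allocate(1), allocate(2)])
print(first_ok, second_ok, resources)
```

The second transaction fails at its second step, and the partial allocation made by its first step does not appear in `resources`.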

It can be concluded that reliable transactions capable of low latency are extremely useful in the implementation of real-time client-server systems.

5.2.9.5 Broadcast

Broadcast service is where all nodes on the network more or less simultaneously receive the same packet. Broadcast service requires network media and media access control methods that support simultaneous packet reception. Broadcast service is normally unreliable as receiver acknowledgments are not normally implemented.

Multi-access media (such as copper and fibre) and multi-access MAC-layers (such as CSMA and token access) offer basic broadcast services. This allows a transmitter to transmit a packet simultaneously (disregarding media delay) to all active nodes connected to the medium.

In bus and star-type topologies tightness (reception time differential) is a function of media propagation delay (MPD) which is small. In ring-type topologies there are further, much more substantial, latencies due to retransmission in each node. In fibre media, optical-to-electrical-to-optical conversion must take place; for example, in FDDI such latency is termed PHY Latency and is specified as 0,6 µs, which may not be insignificant (considering that there may be up to 1 000 PHYs in a LAN). MPD in fibre media can also be a factor; for example in FDDI this is 5,1 µs/km, which is significant (considering that a LAN may have a circumference of 200 km in a wrapped state).
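The two FDDI figures quoted above can be combined into a rough upper bound on broadcast tightness in a ring. The sketch below assumes, pessimistically, that the packet passes through every PHY and travels the full circumference before reaching the last receiver; station delays other than PHY latency are ignored. Integer nanoseconds are used to avoid rounding.

```python
PHY_LATENCY_NS = 600            # 0,6 µs per PHY (figure quoted in the text)
FIBRE_DELAY_NS_PER_KM = 5100    # 5,1 µs/km (figure quoted in the text)


def ring_reception_skew_ns(phy_count, ring_length_km):
    """Upper bound on the delay between the first and last receiver."""
    repeat_delay = phy_count * PHY_LATENCY_NS
    propagation_delay = ring_length_km * FIBRE_DELAY_NS_PER_KM
    return repeat_delay + propagation_delay


worst_case = ring_reception_skew_ns(phy_count=1000, ring_length_km=200)
print(f"{worst_case / 1000:.0f} µs")    # prints "1620 µs"
```

For the extreme configuration mentioned (1 000 PHYs, 200 km wrapped ring) the bound is 600 µs of PHY latency plus 1 020 µs of propagation delay, i.e. about 1,6 ms.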

Broadcast is a useful transmission mode in a number of circumstances; for example, broadcast of calendar time or synchronisation packets. It can also be used for global commands from a central network manager (e.g. shutdown).

However, broadcast mode should be used sparingly as it is not a reliable form of communication. More importantly, the reception of each broadcast at a node causes a message interrupt to the host processor. Handling of non-useful interrupts is extremely wasteful of processing power.

5.2.9.6 Multicast

Multicast is a special case of broadcast where all receivers physically receive the packet, but only act thereon if addressed. Multicast can be unreliable, partly reliable or completely reliable, depending on the extent of acknowledgement by the members of the multicast receiver group.
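The dependence of multicast reliability on the extent of acknowledgement can be sketched as follows. The function names and the three class labels are illustrative assumptions; no particular multicast protocol is implied.

```python
def reliability_class(group, must_acknowledge):
    """Classify a multicast by which group members have to acknowledge it."""
    group, must = set(group), set(must_acknowledge)
    if not must <= group:
        raise ValueError("acknowledging nodes must belong to the group")
    if not must:
        return "unreliable"
    return "completely reliable" if must == group else "partly reliable"


def outstanding(must_acknowledge, acknowledged):
    """Members whose acknowledgement is still awaited."""
    return set(must_acknowledge) - set(acknowledged)
```

A sender would typically retransmit to the members returned by `outstanding` until the set is empty or the validity period of the data expires.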

Multicast is an extremely useful communication service. As Geary and Masters[76] observe :

"Isis, ... , provides support for a number of functions critical to distributed computing. The most basic service is atomic group multicast."

Multicast is especially useful for distributed parallel processing, distributed database access, sensor data distribution, collaborative simulation, digital telephony and digital video distribution.

Multicast is also useful (if not necessary) for efficient implementation of address-independent application interface protocols.

5.2.9.7 Flash Message

Flash messages (packets) are urgent, critical messages that must take highest priority and highest precedence over all others, even at the expense of disturbing other dataflows. Flash messages would normally be event-type messages.

Flash service is appropriate in mission or safety-critical circumstances such as communication of urgent alarms and deployment of time-critical counter-measures (such as electronic jammers).

Flash messages can be constructed from reliable datagrams with highest levels of priority, precedence and error control set. Acknowledgement from the remote host would also normally be required to effect the highest level of reliability.

5.2.9.8 Critical Virtual Circuit

A virtual circuit is a connection established by a switched network. A critical virtual circuit requires low latency and no residual error. Error and flow control is therefore normally required at both the link and transport layers.

Critical Virtual Circuits (CVCs) are required to provide real-time, closed-loop control (critical control loops) over a network[86]. Critical Control Loops (CCLs) are processing paths which react to external events and control mission-critical equipment. These are superimposed on the normal traffic of the system and, due to their critical nature, take precedence over all other traffic (except flash messages). Criticality also imposes a requirement of continuous availability (fault-tolerance) and minimum recovery time. Thus a network offering CVC service must provide end-to-end priority and continuous availability.

Typically CVC traffic will have very low latency, low to moderate throughput and high quality of service requirements (i.e. priority, error, latency and jitter control). Traffic may be periodic or aperiodic. Normally CVCs will be point-to-point (unicast), but multicast service may be required in certain instances. In this case, reliable multicast will normally be required.

A CVC can be constructed from a normal transport connection with priority, precedence and error control capabilities invoked.

CVCs are an important mechanism in achieving reliable, real-time, closed-loop control via networks in that they provide dependable links between sensor and controller as well as controller and actuator. They are also flexible in that they are virtual and can therefore be of arbitrary duration and can be switched in real-time to provide a re-configured control loop.
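The precedence ordering described in the preceding two sections (flash messages first, then CVC traffic, then all other traffic) can be sketched with a priority queue. The class names and rank values are illustrative assumptions only; a real transport protocol would carry priority and precedence in its own header fields.

```python
import heapq
import itertools

# Lower rank is served first. The three classes are illustrative.
RANK = {"flash": 0, "cvc": 1, "normal": 2}


class PrecedenceQueue:
    """Transmit queue that serves flash, then CVC, then normal traffic."""

    def __init__(self):
        self._heap = []
        self._sequence = itertools.count()    # keeps FIFO order within a class

    def put(self, traffic_class, message):
        entry = (RANK[traffic_class], next(self._sequence), message)
        heapq.heappush(self._heap, entry)

    def get(self):
        return heapq.heappop(self._heap)[2]

    def __len__(self):
        return len(self._heap)
```

Messages queued in the order normal, CVC, flash, CVC are dequeued as flash, CVC, CVC, normal, so a late flash message overtakes everything already waiting.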

5.2.9.9 Safety Virtual Circuit

Some connections, such as weapon firing circuits, may require a higher level of integrity control. This would normally be provided by the user application layer software.

In certain applications, especially in military systems where munition release is involved, policy requires that a user application layer transaction (i.e. host acknowledge) has to occur before munition release is authorized. Typically this would require a three message transaction e.g. fire command, fire command confirm request, fire command confirm.

Each transaction should be fully error controlled and may involve other security features such as authentication and authorization codes. Applications would typically also implement local safety interlocking.
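The three-message exchange can be pictured as a small state machine in which release is authorized only once all three messages have been observed in order. The message names and the `FireTransaction` class are hypothetical; message direction, authentication and error control are deliberately left out of the sketch. Each message is logged with a caller-supplied timestamp.

```python
EXPECTED = [
    "FIRE_COMMAND",
    "FIRE_COMMAND_CONFIRM_REQUEST",
    "FIRE_COMMAND_CONFIRM",
]


class FireTransaction:
    """Tracks the three-message exchange required before release."""

    def __init__(self):
        self._step = 0
        self.log = []                 # (timestamp, message) pairs

    def receive(self, message, timestamp):
        self.log.append((timestamp, message))
        if self._step < len(EXPECTED) and message == EXPECTED[self._step]:
            self._step += 1
        else:
            self._step = 0            # anything out of sequence resets the exchange
        return self.authorized

    @property
    def authorized(self):
        return self._step == len(EXPECTED)
```

Any message arriving out of sequence returns the exchange to its initial state, so a lost or duplicated message can never result in authorization.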

Transactions on Safety Virtual Circuits should be timestamped for recording purposes to facilitate later offline analysis or enquiry.

5.2.10 Dataflow Control

To support the data transfer requirements of the generic real-time, distributed system, the network infrastructure should offer various control mechanisms by which to ensure timeliness, order and integrity of dataflows :

5.2.10.1 Latency Control

Latency Control is the policy and mechanisms provided by the system and the transfer protocols to control network latency thereby ensuring that time-critical data messages are transmitted within their validity periods.

Latency control is important for real-time systems in order to ensure that the deadlines of time-critical messages are met.

5.2.10.2 Error Control

Error Control is the policy and mechanisms provided by the system and the transfer protocols to detect and recover from data transmission errors thereby ensuring that data messages are transmitted correctly or superseded by correct data within the validity period.

Error control is performed at various layers of the LAN profile, with each layer performing error control on a higher level data entity. At the lower levels, error detection and correction are normally transparent to the user application.

Table I describes error detection and correction at each protocol layer.

Layer         Error Detection               Error Correction

Physical      bit errors                    Control Data - Mandatory
                                            User Data - Optional

Datalink      packet errors                 Control Data - Mandatory
                                            User Data - Optional

Network       node-to-node packet errors    Control Data - Mandatory
                                            User Data - Optional

Transport     end-to-end message errors     Control Data - Mandatory
                                            User Data - Optional

Application   user defined                  User Data - Optional

Table I : Example of System-Level Error Control Requirements

Error detection and correction have latency implications, which become more severe the higher the layer at which they are performed. Different data types qualify for different error handling mechanisms. For example, it does not make sense to compromise the timeliness of high precedence, high priority messages in order to correct low precedence data. Another example is where data has such a high repetition rate that it would be invalid by the time it was retransmitted in the case of an error.
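The choice of error handling per data type can be expressed as a simple decision rule. The sketch below is one possible policy under stated assumptions: all times are in milliseconds, `retransmit_ms` is the time a retransmission would take, and `period_ms` is supplied only for periodic (state) data. The function and its return labels are hypothetical.

```python
def error_action(age_ms, validity_ms, retransmit_ms, period_ms=None):
    """Choose what to do with a message received in error."""
    if period_ms is not None and period_ms <= retransmit_ms:
        # A fresh sample will arrive before a retransmission could.
        return "await next sample"
    remaining = validity_ms - age_ms
    if retransmit_ms <= remaining:
        return "retransmit"
    return "discard"                  # data would be stale on arrival
```

For example, a sensor value repeated every 5 ms is not worth an 8 ms retransmission, whereas an aperiodic command that remains valid for a further 18 ms is.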

Whatever the application, the system design authority of any real-time, mission-critical, distributed system needs to determine and implement a comprehensive error control policy. This will normally be done by invoking and providing parameters to the options offered by the lower level protocols, especially the transport protocol.

Error control is necessary in real-time, mission-critical, distributed systems because errors in data transfer can result from numerous causes. Different dataflows are candidates for different error control policies due to their differing payloads, i.e. from critical discrete signals to continuous media with various degrees of information redundancy. Therefore a flexible error control policy with a wide range of error control mechanisms is required to support a general purpose networked system.

5.2.10.3 Flow Control

Flow Control is the ability of a receiver to constrain the volume of data that a transmitter may transmit to the buffer space available at the receiver.

Flow control allows the receiver of information to inform the sender about the current state of its receiving data buffers. Because buffer memory is normally a finite resource in the I/O sub-system, flow control is a necessary capability in a general purpose, real-time, networked system. Flexible flow control mechanisms offer more scope for system optimization, while the ability to disable flow control altogether is often necessary for continuous media.
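A minimal sketch of receiver-driven flow control follows, assuming a simple credit scheme in which the receiver advertises its free buffer space in bytes. The `CreditWindow` class is hypothetical and is not the mechanism of any specific transport protocol.

```python
class CreditWindow:
    """Sender-side view of the buffer space advertised by the receiver."""

    def __init__(self, credit_bytes=0):
        self.credit = credit_bytes

    def grant(self, free_buffer_bytes):
        # The receiver reports the current state of its receive buffers.
        self.credit = free_buffer_bytes

    def try_send(self, size):
        if size > self.credit:
            return False              # would overrun the receiver; hold the data
        self.credit -= size
        return True
```

The sender transmits only while it holds credit and otherwise waits for the next grant, so the receiver's buffers cannot be overrun.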

5.2.10.4 Rate Control

Rate Control is the ability of a receiver to restrict the rate (amount per unit time) of data that a transmitter may transmit to the rate at which the receiver can process packets.

Rate control allows the restriction of the size and time spacing of data from a sender in order that the ability of a data receiver (or intermediate routers) to decipher and queue data is not overwhelmed.

Even when provided with infinite buffer resources, flow control is unable to provide effective dataflow control when sub-system I/O processing resources are finite. In such circumstances, rate control is a necessary capability in a general purpose, real-time, networked system.

5.2.10.5 Burst Control

Burst Control is the ability of a receiver to restrict the size (amount per transmission) of data that a transmitter may transmit in a single burst.

While rate control can effectively throttle a high throughput transmitter where data transfer is evenly spread over time, it may be ineffective where data transmission is sporadic but occurs in large bursts, e.g. FDDI can transmit back-to-back packets of 4 500 bytes. In such circumstances, burst control is an effective supplementary dataflow control capability in a general purpose, real-time, networked system.
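The interplay of rate and burst control can be sketched with a token bucket, a technique chosen here purely for illustration and not taken from the protocols discussed. The bucket refills at the permitted rate and its capacity is the permitted burst size. Times are in integer milliseconds, the rate is in bytes per millisecond, and all parameter values are hypothetical.

```python
class RateBurstLimiter:
    """Token bucket: `rate` limits the average, `burst` limits the peak."""

    def __init__(self, rate_bytes_per_ms, burst_bytes):
        self.rate = rate_bytes_per_ms
        self.burst = burst_bytes
        self.tokens = burst_bytes     # start with a full burst allowance
        self.last_ms = 0

    def allow(self, now_ms, size):
        elapsed = now_ms - self.last_ms
        self.tokens = min(self.burst, self.tokens + elapsed * self.rate)
        self.last_ms = now_ms
        if size > self.tokens:
            return False              # exceeds the rate or burst allowance
        self.tokens -= size
        return True
```

With a rate of 1 000 bytes/ms and a burst allowance of 9 000 bytes, two back-to-back 4 500 byte packets are admitted, a third is refused, and a further packet is admitted only once enough time has elapsed for the allowance to recover.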

5.3 Chapter Summary

While users have comprehensive requirements for integrated systems, these are normally abstract in the context of the data communication infrastructure. This is because the latter should be transparent to the operation of the system. The requirements for the network sub-system therefore need to be derived from the user's abstract system requirements.

Chapter 5 has identified the appropriate system requirements and derived a set of technical requirements for the network sub-system. With this accomplished, an architecture, based on components capable of satisfying these requirements, can be synthesized.

Chapter 6

System Architecture

6. System Architecture

Once all the allocated and derived requirements of the integrated, real-time, distributed system have been determined, as well as an appropriate communication model or Real-Time LAN Profile defined, synthesis of a generic system architecture can proceed.

6.1 Architecture Derivation

The proposed approach to synthesizing an appropriate and optimal system architecture is to allocate optimal options to each of the layers of the Real-Time LAN Profile. This is possible after review and trade-off of the characteristics of all the potential candidates for these layers and matching these to the system requirements. This is addressed in greater detail in Appendices A to D where the specific requirements for each layer are determined in detail. Candidate technologies are then analysed in terms of these requirements and qualitatively traded-off against each other. The most appropriate technology for each layer is then proposed for inclusion in the Real-Time LAN Profile.

Of specific significance to the choice of options is their ability to support dependable, real-time performance.

6.2 Architecture Models

Considering that the purpose of a local area network is to facilitate communication between computers and that these computers may consist of hardware and software components from different vendors, local area networks are driven to standardisation. Appropriate models thus require broad, preferably international, or open acceptance.

6.2.1 Reference Model

With this objective, the International Organization for Standardization (ISO) defined, and in 1984 published, a framework for communications standards which partitions the functions required for communication into seven layers. This was termed the Open Systems Interconnection (OSI) Basic Reference Model. The ISO has since published a set of standards conforming to this framework.

The ISO OSI Basic Reference Model is a seven-layer architecture for data communication protocol suites. While insufficient for the complete specification of a local area network, the OSI Basic Reference Model has proved valuable in its role as a conceptual and functional framework for co-ordinating the development of protocol standards[39].

The OSI Basic Reference Model encapsulates a set of communications functions within each layer. Each layer provides a set of services to the next higher layer which requests a set of services from the next lower layer. Layer interaction takes place on well defined boundaries by a small number of service primitives. These primitives abstract the details of the more basic tasks being performed in the service-providing layer.
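The layered encapsulation can be pictured with a toy sketch in which each layer adds its own header on the way down and removes it on the way up. Only three layers are modelled, and the bracketed text headers are purely illustrative; real protocol headers are binary structures defined by the respective standards.

```python
LAYERS = ["transport", "network", "datalink"]    # subset, for brevity


def send_down(payload: bytes) -> bytes:
    """Each layer wraps what it receives from the layer above."""
    for layer in LAYERS:
        payload = f"[{layer}]".encode() + payload
    return payload


def pass_up(frame: bytes) -> bytes:
    """Each layer strips its own header and passes the rest upward."""
    for layer in reversed(LAYERS):
        header = f"[{layer}]".encode()
        if not frame.startswith(header):
            raise ValueError(f"missing {layer} header")
        frame = frame[len(header):]
    return frame
```

Each layer sees only its own header and treats everything handed down from above as opaque data, which is what allows a protocol at one layer to be replaced without disturbing the others.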

The layered structure of the OSI Basic Reference Model is depicted in Table II.

No.  Layer         Functional Description

7    Application   Interfaces to user programs by translating user application syntax into abstract syntax, as well as providing specific and common services for applications

6    Presentation  Negotiates appropriate transfer syntax (format) used within the network and translates abstract syntax into transfer syntax

5    Session       Sets up and controls a logical communication path (session connection) including logical synchronisation

4    Transport     Enhances reliability of the network by providing end-to-end dataflow control and optimises use of the network by providing special services such as timestamping and priority message scheduling

3    Network       Provides internetwork message routing, global addressing, congestion control, intermediate error control as well as packet fragmentation and reassembly

2    Data Link     Controls access to the physical medium, formats/disassembles packets, implements link-level flow and error control as well as low-level network addressing

1    Physical      Encodes and physically transfers packets bit-wise onto the physical medium

Table II : ISO OSI Basic Reference Model

6.2.2 LAN Profiles

The standards of the OSI Reference Model are presented very broadly, however, and are subject to various interpretations. The result is that either a further set of documented restrictions must be imposed upon the LAN, or there is a danger that different equipment, ostensibly developed according to the same standards, will not be able to communicate.

A number of organisations have addressed this issue and defined a number of groupings or sets of standards, along with additional implementation agreements required for interoperability. These are known as LAN Profiles.

6.2.3 Real-Time LAN Profile

To support real-time, mission-critical systems, the LAN profile is required to include protocols capable of real-time performance at each layer. Commonly used LAN profiles are analysed in Appendix G. There it is determined that a derivative of the SAFENET model[29] is appropriate for generic real-time systems. This derivative is termed the Real-Time LAN Profile.

The Survivable Adaptable Fibre Optic Embedded Network (SAFENET) model is prescribed for the new generation of US Navy shipborne, airborne and ground facility applications[78]. The SAFENET standard and guidebook incorporate a wealth of collective knowledge and expertise, including that obtained in real implementations.

It is contended that the Real-Time LAN Profile is appropriate in that it is a flexible, practical and achievable implementation of the ISO OSI Model that is capable of cost-effective, real-time performance as well as maximum interconnectivity. It provides for all of the layers required for a complete system and is achievable because implementations (hardware and software) for all of the layers exist or are implementable without resorting to proprietary solutions.

Another reason for its applicability is that, while SAFENET is a US Navy standard, it can be followed as a guideline without necessarily requiring full adherence. This is significant in the South African context (or even in many commercial or industrial contexts) as full adherence would have extensive and possibly prohibitive cost implications.

Table III shows the Real-Time LAN Profile with the various options at each appropriate level.

Layer No.  ISO OSI Layers                Real-Time LAN Profile

9          Application Process *         Software Application Tasks; Network Management Services; Timestamping Services

8          Operating System Extension *  POSIX Real-Time Operating System; Real-Time Operating System

7          Application                   Built-in Test Services; Network Time Services; File Transfer Services; Application Interface Services (APIS)

6          Presentation                  Null; Network Time Protocol

5          Session                       Null; Null

4          Transport                     UDP; XTP

3          Network                       IP

2          Data Link                     SNAP; IEEE LLC Type I Protocol; ANSI FDDI SMT Protocol; ANSI FDDI MAC Protocol

1          Physical                      ANSI FDDI PHY Protocol

0          Cable Layer *                 Multimode Fibre Cable Plant

Note: Layers marked with an asterisk (*) fall outside the ISO OSI 7-layer Model.

Table III : Real-Time LAN Profile

6.2.4 Real-Time Protocol Stack

The Real-Time LAN Profile is implemented as a real-time protocol stack. Normally an application resides on a Central Processing Unit (CPU) which communicates with a Network Interface Card (NIC) via a Parallel Backplane Bus (PBB).

An example of a complete Real-Time Protocol Stack is shown in Figure 12 overleaf :

[Figure 12 : Real-Time Protocol Architecture (File : ygips02.cdr). Elements shown: NIC and CPU; Applications; Application Interface Layer (APIS, BITS, NTS, FTS; SNMP, NTP, FTP); UDP, TCP, XTP; IP; LLC, MAC, PHY, PMD; PBB Transport Layer, PBB Data Link Layer, PBB Message Passing Control; LAN; Parallel Backplane Bus.]

6.3 Cable Layer

The Cable Layer is responsible for physically transporting data packets between collaborating system nodes. While, theoretically, physical media other than cable (e.g. the electromagnetic ether) could perform this, only physical interconnects (i.e. cable) presently have the dependability to perform data transfer for mission-critical systems.

The functions of the Cable Layer are fundamental to the suitability of local area networks for real-time, mission-critical, distributed systems as they determine such capabilities as data throughput, electromagnetic compatibility and the maximum physical size of the LAN (i.e. geographic coverage) as well as size, mass and cost of the media.

Cable Layer issues include media type and characteristics. While media type issues are not directly protocol issues, they fundamentally affect LAN considerations such as reliability and topology, which in turn have extensive protocol implications for real-time, mission-critical, distributed systems.

A detailed analysis of physical media standards appropriate for the Cable Layer is provided in Appendix A, along with conclusions and recommendations as to the most appropriate of these for the applications of interest.

The chief capabilities that are sought of the Cable Layer are that it :

• supports present and future functional (i.e. bandwidth) requirements

• meets the electromagnetic compatibility requirements

• meets the physical range (i.e. geographic coverage) requirements

• meets the physical (i.e. mass and size) requirements

• meets the cost-effectiveness requirements

• meets the reliability requirements

• meets the supportability requirements

While copper-based cable plant can meet the latter three requirements, it falls short in meeting the first four. Fibre optic media is the only technology that can support all the requirements. These media fall into two categories, multimode and singlemode. Of these, multimode fibre optic cable is reliable, affordable and maintainable while offering adequate bandwidth for most applications. Singlemode fibre media provide higher performance, but are more costly and are therefore appropriate for more specialised applications.

While in the past, fibre optic components such as cable, connectors, splices and interconnection systems have not exhibited adequate reliability, supportability or affordability for mission-critical systems in harsh environments, this situation has now been overcome through research and product development. A range of products has recently undergone qualification and is available for the next generation of naval warships[92].

Products are also being developed for the next generation of military fly-by-light aircraft[90]. This class of application probably represents the upper bound of robustness and dependability requirements.

Thus the identified Cable Layer option satisfies the system requirements of bandwidth, dependability, supportability, affordability and electromagnetic compatibility as well as low size and mass.

6.4 Physical Layer

Digital data transmission requires binary signals representing the digital data to be modulated and encoded onto the physical media of the Cable Layer. Modulation is required to convert the binary signal to a form compatible with the medium, whether it be metallic, an RF carrier or an optical fibre. Typically, amplitude, frequency or phase modulation techniques are used with metallic and RF carriers while optical fibre carriers employ binary optical power level modulation.

Encoding is required in order to provide for bit-level error detection and correction and bit-level synchronisation between the transmitting and receiving nodes.

Encoding schemes differ in complexity with increasing complexity achieving increased efficiencies of use of the medium's potential bandwidth. This modulation and encoding is performed by the Physical Layer Protocols.

The chief attributes that are sought of the Physical Layer encoding scheme are that it :

• ensures a suitable number of transitions to reliably extract signalling synchronisation, i.e. clock recovery

• ensures that over a reasonable period of time an equal number of HIGH and LOW logic states occur so that the average is zero and power is not transmitted down the line (in wire media)

• provides for low level error detection

• provides efficient usage of the physical medium's bandwidth capability

• minimises resultant electromagnetic radiation from the media

A detailed analysis of Physical Layer standards is provided in Appendix A.

It is concluded that simple encoding schemes give rise to inefficient use of the medium's bandwidth and electromagnetic interference in the case of metallic media. Advanced encoding schemes such as those employed by FDDI (4B/5B), Fibre Channel (8B/10B) and Scalable Coherent Interface (17B/20B) are capable of efficient use of the media as well as bit-level error detection.
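The transition-density attribute can be illustrated with the 4B/5B data symbol table used by FDDI. The sketch below maps each 4-bit nibble to its 5-bit code group and measures the longest run of zeros, which matters because, with NRZI signalling, only a one produces a transition. The nibble ordering (high nibble first) is an assumption made for illustration, and control symbols are omitted.

```python
FOUR_B_FIVE_B = {
    0x0: "11110", 0x1: "01001", 0x2: "10100", 0x3: "10101",
    0x4: "01010", 0x5: "01011", 0x6: "01110", 0x7: "01111",
    0x8: "10010", 0x9: "10011", 0xA: "10110", 0xB: "10111",
    0xC: "11010", 0xD: "11011", 0xE: "11100", 0xF: "11101",
}


def encode(data: bytes) -> str:
    """Encode each byte as two 5-bit code groups, high nibble first."""
    return "".join(
        FOUR_B_FIVE_B[byte >> 4] + FOUR_B_FIVE_B[byte & 0x0F] for byte in data
    )


def longest_zero_run(bits: str) -> int:
    """Length of the longest run of consecutive zeros in a bit string."""
    return max(len(run) for run in bits.split("1"))


worst = max(longest_zero_run(encode(bytes([b]))) for b in range(256))
print(worst)    # prints 3
```

No sequence of data symbols contains more than three consecutive zeros, so clock recovery is assured, while four of every five transmitted bits carry data (80 % efficiency).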

6.5 Data Link Layer Protocol

Layer 2 of the OSI model is termed the Data Link Layer. The Data Link Layer (DLL) is responsible for formatting and disassembly of data packets as well as flow and error control across a datalink. The DLL also provides the basic network addressing mechanism.

The functions of the DLL protocol are fundamental to the suitability of local area networks for real-time, mission-critical, distributed systems as they determine such capabilities as data transfer latency, network topology, intrinsic redundancy and multi-access capability. The media access mechanism also influences the maximum physical size of the LAN.

The Data Link Layer is normally considered as two sub-layers, the Media Access Control (MAC) sub-layer and Logical Link Control (LLC) sub-layer. The characteristics of each are described below :

6.5.1 MAC Sub-Layer Protocol

Because the network nodes of any LAN share the medium's transmission capacity, some means of controlling access to the transmission medium is needed so that any two particular nodes can exchange data. This means of access is fundamental to the LAN's ability to support real-time, deterministic transfer and to provide appropriate topologies for mission-critical networks. The MAC Layer is also responsible for low-level error detection and network addressing.

To meet all of the requirements of the next generation of real-time, mission-critical, distributed systems, it is highly desirable to utilise a commercial MAC standard that :

• supports present and future performance (i.e. timing) requirements

• supports the data transfer determinism requirements

• provides high throughput by making efficient use of physical layer bandwidth

• supports present and future functional (i.e. addressing, including multi-access) requirements

• supports appropriate network topologies

• meets the cost-effectiveness requirements

• represents a stable technology

A detailed analysis of MAC sub-layer standards is provided in Appendix B along with conclusions and recommendations as to the most appropriate of these for the applications of interest.

It is concluded that while commonly used technologies such as Ethernet and IBM Token Ring can meet the latter four requirements, they fall short in meeting the first three. While new technologies such as ATM show promise, they do not meet the last three requirements.

It is specifically concluded that FDDI (Fibre Distributed Data Interface) is an international standard which meets all these requirements.

As is indicated in Table III, the ANSI FDDI standard is recommended at Layers 1 and 2 for the Real-Time LAN Profile. FDDI[64, 99] is an optical fibre network offering high speed, efficient, reliable and fault-tolerant data transfer at 100 Mbit s⁻¹. FDDI is a commercial networking standard designed to support data-intensive applications such as image processing, image and real-time distributed databases and graphics in LAN (Local Area Network) and MAN (Metropolitan Area Network) topologies.

While FDDI does not strictly support deterministic data transfer (cf. MIL-STD-1553), it is contended that a combination of its high speed, timed-token protocol and synchronous/asynchronous message modes allows it to support reliable networking for most real-time, mission-critical, distributed systems. FDDI also offers a low bit error rate (BER) of 2,5 × 10⁻¹⁰.

6.5.1.1 Data Transfer Modes

Generally, real-time systems are characterised by periodic as well as aperiodic processes. If these systems are distributed, these processes give rise to state and event-type communications between collaborating sub-systems.

It is important that data communications networks are capable of offering both types of service. They are then able to support time- as well as event-triggered systems. While event-type transfers can be effected with state-type communications, this is normally extremely wasteful of available throughput if low latency is required. This is because message repetition cycles need to be less than or equal to the required latency.
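The throughput penalty of signalling events by means of state-type messages can be quantified with a short calculation. The message size, latency and event rate used below are hypothetical values chosen only to show the order of magnitude involved.

```python
def state_mode_load_bps(message_bits, required_latency_ms):
    """Minimum load when an event is conveyed by repeating a state message.

    The repetition period may not exceed the required latency, so the
    lowest possible load arises when the two are equal.
    """
    messages_per_second = 1000 / required_latency_ms
    return message_bits * messages_per_second


def event_mode_load_bps(message_bits, events_per_second):
    """Load when a message is sent only when the event actually occurs."""
    return message_bits * events_per_second


state_load = state_mode_load_bps(message_bits=512, required_latency_ms=1)
event_load = event_mode_load_bps(message_bits=512, events_per_second=2)
print(state_load / event_load)    # prints 500.0
```

For a 512-bit message with a 1 ms latency requirement and two events per second, the state-type approach consumes 512 kbit/s against about 1 kbit/s for a true event-type service, a factor of 500.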

The FDDI standard offers both synchronous as well as asynchronous modes of data transmission. Synchronous mode corresponds with state-type messages while asynchronous corresponds with event-type messages.

6.5.1.1.1 Synchronous Mode

Synchronous data is that produced repetitively by a producer and is synonymous with state data.

FDDI synchronous mode is where a producer of data is guaranteed a certain proportion of LAN bandwidth. Synchronous data has priority (at the data link level) over asynchronous data. FDDI synchronous mode provides Quality of Service guarantees required by time critical applications (such as critical virtual circuits), multimedia applications and high throughput applications (such as server mirroring).

While the original FDDI standard specified synchronous transmission services, provision of this was optional. Up until very recently manufacturers of commercial network interface cards (NICs) did not provide this option with their products. However, with the increased requirement for networked multimedia, NIC manufacturers have identified the importance of this service.

A group of companies, led by IBM Corp., has formed the FDDI Synchronous Forum in order to define a common and interoperable approach to FDDI synchronous data services[87]. The Forum has recently published the FDDI Synchronous Forum "Implementer's Agreement"[135] which amounts to a proposal and request for comment (RFC) in this regard. Manufacturers of FDDI network interfaces are beginning to provide synchronous drivers, conforming to this RFC, with their hardware.

A central issue pertaining to FDDI synchronous transmission is that regarding synchronous bandwidth allocation (SBA), particularly how the host applications bargain for synchronous bandwidth and how this bandwidth is dynamically managed by the network.

Two SBA schemes have been proposed, i.e. a dynamic scheme and a static scheme[87].

! Dynamic SBA Scheme

The Dynamic SBA scheme involves all network nodes negotiating bandwidth allocation with the Synchronous Bandwidth Allocator and then maintaining a dynamic table describing the resulting allocation. This scheme is the most flexible and provides the most efficient use of the FDDI bandwidth. As bandwidth allocation is dynamic, implementation is more complex especially if a redundant SBA scheme is required.

• Static SBA Scheme

A static SBA scheme involves all network nodes negotiating bandwidth allocation at system startup with the Synchronous Bandwidth Allocator and then maintaining a fixed table describing the resulting allocation. This scheme is the simplest to implement. It is, however, somewhat inflexible and provides less efficient use of the FDDI bandwidth.

Unused synchronous bandwidth is not wasted: if stations do not use their allocation, it can be used by other stations. The SBA only ensures that there is no a priori over-allocation of bandwidth.
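The admission rule described above (requests are granted only while the total allocation stays within the available synchronous bandwidth) can be illustrated with a minimal sketch. The class name, the method names and the 80 000 kbit/s limit are hypothetical choices made for illustration only; they are not taken from the FDDI standard or from the Implementer's Agreement.

```python
class SyncBandwidthAllocator:
    """Toy SBA: grants a request only if the ring is not over-allocated."""

    def __init__(self, sync_limit_kbit_s=80_000):
        # Assumed ceiling on the total synchronous allocation (illustrative value).
        self.sync_limit_kbit_s = sync_limit_kbit_s
        self.table = {}  # station id -> granted synchronous bandwidth in kbit/s

    def allocated(self):
        """Total synchronous bandwidth currently granted."""
        return sum(self.table.values())

    def request(self, station, kbit_s):
        """Grant or change a station's allocation; refuse if it would over-allocate."""
        already_held = self.table.get(station, 0)
        if self.allocated() - already_held + kbit_s > self.sync_limit_kbit_s:
            return False
        self.table[station] = kbit_s
        return True

    def release(self, station):
        """Return a station's allocation to the pool."""
        self.table.pop(station, None)
```

In a static scheme `request` would be called once per station at system startup; in a dynamic scheme `request` and `release` would be called throughout operation, which is where the additional implementation complexity arises.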

Typically a network management station would be a suitable candidate as an SBA master. Consideration has to be given to replication of the SBA for fault-tolerant applications.

6.5.1.1.2 Asynchronous Mode

Asynchronous data is that produced non-repetitively by a producer and is synonymous with event data. Asynchronous bandwidth is allocated from the pool of remaining ring bandwidth. The allocation of bandwidth is controlled by two classes of special tokens, the restricted and non-restricted tokens. The restricted token gives the right to restrict transmission to nodes within a dialog. Other asynchronous traffic is deferred for the duration of the dialog. Implementation of Restricted Mode is optional in the FDDI standard and few, if any, vendors have implemented this option.

Non-Restricted Mode is used for all other asynchronous traffic. This mode supports up to eight levels of priority (at the MAC Layer) of which only one level is mandatory.

Generally FDDI asynchronous mode is applicable where response time is not critical as well as where the amount or occurrence of data is not predictable.

All commercially-available FDDI NICs are provided with asynchronous mode capability.

6.5.1.1.3 Isochronous Data

Isochronous data is that produced repetitively and in fixed amounts by a producer, e.g. that produced by sampled voice, video or sensor sources. In FDDI II (a still to be fully developed extension of FDDI)[64, 99], isochronous data has higher priority (at the FDDI level) than both synchronous and asynchronous data. FDDI II provides isochronous services by implementing circuit switching by means of hardware support. Such hardware is in the form of an extra chip in the standard FDDI chipset.

While FDDI I cannot offer true isochronous service, higher protocol layers such as XTP can provide a level of isochronous data transmission service by utilising FDDI synchronous transfer mode and higher layer protocol features (such as priority and quality of service)[65, 139].

6.5.1.2 Station Management

The Station Management (SMT) layer can be considered as a vertical layer which can access all FDDI sub-layers specified in terms of the OSI model. SMT provides the fundamental services required for built-in test of the FDDI LAN infrastructure, as well as providing the built-in test results and performance statistics to the Network Management Services. It thereby significantly enhances the dependability and maintainability of the network, as well as allowing performance optimisation.

The SMT function permanently monitors the FDDI ring, coordinates the ring during network start-up and produces a status table of ring and station states. The SMT manages the station's PHYs, MACs, PMDs, optical bypasses, timers, as well as management objects such as counters, parameters and statistics. The functionality of SMT can be grouped into connection management (CMT) and ring management (RMT).

CMT is responsible for ring configuration, reconfiguration after malfunction, network statistics and diagnostics. It consists of :

• Entity Co-ordination Management (ECM)
• Configuration Management (CFM)
• Physical Connection Management (PCM)

ECM provides for controlling bypass switches and signalling to PCM that the medium is available as well as for co-ordination of trace functions. CFM provides for configuring PHY and MAC entities within a node. PCM is responsible for bit signalling, line state identification during link construction and signal monitoring, as well as performing link confidence tests when the ring is started up. During operation, PCM regularly calculates the bit error rate and other error statistics which are represented as link error monitor data.

RMT receives status information from Media Access Control and CMT. RMT then reports this status to Station Management. Services provided by ring management include :

• stuck beacon detection
• resolution of problems through the trace process
• determination of MAC availability for transmission
• detection of duplicate addresses

6.5.2 Summary of FDDI Features

The dual-attached FDDI topology offers excellent scalability in that each extra node requires only one set of network interface, cable segment and optical bypass. By comparison, technologies such as and ATM require expensive switches even in a minimum configuration.

The detailed features and capabilities of FDDI, as well as an analysis of FDDI Ring Latency Time, are provided in Appendices B and F respectively. The latter shows that latencies in the sub-millisecond range can be achieved for small networks of fewer than 40 nodes and less than 10 km in length.
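The order of magnitude of this result can be checked with a rough estimate of the idle-ring latency, i.e. the time a token takes to circulate an unloaded ring. The sketch below is not the analysis of Appendix F; it assumes a propagation delay of about 5 µs per km of fibre and a delay of about 1 µs per station, both of which are illustrative assumptions.

```python
def ring_latency_us(stations, ring_length_km,
                    station_delay_us=1.0, fibre_delay_us_per_km=5.0):
    """Rough idle-ring latency: per-station delay plus fibre propagation delay.

    Both default delays are assumed, illustrative values.
    """
    return stations * station_delay_us + ring_length_km * fibre_delay_us_per_km


# 40 stations on a 10 km ring: 40 * 1.0 + 10 * 5.0 microseconds
latency = ring_latency_us(40, 10)
```

Under these assumptions the estimate is well below one millisecond. Latency under load also depends on the token holding behaviour of each station, which this sketch ignores.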

It is concluded that FDDI offers excellent dependability due to its self-healing capability and its intrinsic dual-redundant design, as well as inherent electromagnetic compatibility using fibre optic media.

6.5.3 Logical Link Control Sub-Layer Protocol

Logical Link Control (LLC) defines the transmission of a packet of data between two stations. It thereby provides an underlying data delivery service to the layers above it. LLC is responsible for formatting and disassembly of packets, establishing and disestablishing links between nodes and implementing link-level flow control.

Two types of services are provided by the LLC Layer: an unacknowledged connectionless service and a set of connection-oriented services. The former provides a means of initiating transfer of service data units with a minimum of overhead. This service is appropriate when functions such as error recovery and sequencing are provided by a higher level protocol and hence do not need to be replicated by the LLC Layer. LLC connection-oriented services provide the means to first establish a link-level logical connection before initiating the transfer of service data units and then implementing error control and sequencing over the connection. These services would be appropriate when high levels of network dependability are required in the absence of network and transport layers. Typically these would be ultra-high performance networks over single LAN topologies where internetworking capability was not required.

The IEEE specifies the functionality of LLC in the IEEE 802.2 standard. The level of functionality and reliability of service is classified into four types of increasing functionality and reliability. LLC Type 1, offering the most basic service and LLC Type 4, offering the most reliable service, are described below.

6.5.3.1 LLC Type 1

Type 1 (LLC1) is an Unacknowledged Connectionless Service and provides datagram packet delivery over the local network segment and any other network segments reachable via the link layer switches. Higher layer packets must be placed in the protocol data units of the underlying data delivery service. This is termed encapsulation.

LLC packets carry user information between nodes on an FDDI or other network. For example, each FDDI packet containing user data also contains LLC information for the end node. These packets may cross bridges and can be transmitted to nodes on the extended LAN. FDDI assumes implementation of the IEEE 802.2 LLC1 standard.

XTP, providing a reliable protocol, only requires basic LLC services, i.e. LLC1. While XTP does not require reliable LLC service and provides other services such as multicast, other communication profiles do require enhanced LLC services. IEEE has therefore initiated the definition of LLC Type 4 (LLC4).

6.5.3.2 LLC Type 4

LLC Type 4 has been proposed to offer the following services :

• Reliable Sequenced Delivery
• Reliable Non-sequenced Delivery
• Non-reliable Sequenced Delivery
• Segmentation/Re-assembly
• Connection Setup with Embedded User Data
• Connection Release with Embedded User Data
• Multiple Logical Connections between LSAP Pairs
• Quality of Service
• Protection of Bit Errors through Bridges
• Ability to Deliver Corrupted Data
• Multicast

Analysis of these capabilities indicates a close resemblance to services offered by XTP at the Transport Layer (refer Paragraph 6.7.4.2). They reflect the basic requirements to support current and emerging service models for the delivery of real-time and continuous media data.

6.5.4 Sub-Network Access Protocol

The Sub-Network Access Protocol (SNAP) is a lower level Internet family protocol. IP and ARP (Address Resolution Protocol) use the SNAP header to identify the protocol carried in each LLC packet, so that a receiver can distinguish IP packets from ARP packets.

SNAP is required by SAFENET-conformant implementations offering an Internet profile.
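The encapsulation can be illustrated with a short sketch that builds and parses the eight-byte LLC/SNAP header. The field values used (SAP 0xAA, control 0x03, a zero organisation code and the protocol types 0x0800 for IP and 0x0806 for ARP) follow the common Internet convention for IEEE 802 networks and are not quoted from this thesis; the function names are hypothetical.

```python
import struct

LLC_SNAP_SAP = 0xAA    # DSAP and SSAP value indicating that a SNAP header follows
LLC_UI_CONTROL = 0x03  # unnumbered information, i.e. LLC Type 1
TYPE_IP = 0x0800
TYPE_ARP = 0x0806


def snap_encapsulate(payload, protocol_type, oui=b"\x00\x00\x00"):
    """Prefix a higher layer packet with a 3-byte LLC header and a 5-byte SNAP header."""
    llc = struct.pack("!BBB", LLC_SNAP_SAP, LLC_SNAP_SAP, LLC_UI_CONTROL)
    snap = oui + struct.pack("!H", protocol_type)
    return llc + snap + payload


def snap_protocol_type(frame):
    """Return the protocol type carried in an LLC/SNAP header."""
    if frame[:3] != bytes([LLC_SNAP_SAP, LLC_SNAP_SAP, LLC_UI_CONTROL]):
        raise ValueError("not an LLC Type 1 / SNAP packet")
    return struct.unpack("!H", frame[6:8])[0]
```

A receiver would call `snap_protocol_type` on the LLC portion of each received packet and hand the remaining bytes to IP or ARP accordingly.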

6.5.5 Choice of Data Link Layer Protocol

It is contended that, where a network and transport layer are provided, the LLC Type 1 protocol is appropriate and satisfies the system requirements of timeliness, determinism, logical connectivity and technological stability.

Where a network and transport layer are not provided, the LLC Type 4 protocol is appropriate and satisfies the added requirement of reliability.

6.6 Network Layer Protocol

Layer 3 of the OSI model is termed the Network Layer (NL). The functions of the NL protocol are to provide message routing (or switching) in internetwork topologies as well as internetwork flow and congestion control.

The functions of the NL protocol are significant to the suitability of local area networks for real-time, mission-critical, distributed systems as they determine such capabilities as global addressing, congestion control, intermediate error control, packet fragmentation and reassembly. Most importantly, as the HPNWG point out[85], "It is vital to internetworking."

The Network Layer is responsible for providing interconnection of network layer users (e.g. transport layer entities) such that users are shielded from the details of the number and characteristics of sub-networks separating them. The NL provides transparent interconnection of all network layer service users attached to the internetwork.

A fundamental consideration in the determination of the functionality of the network layer is whether it should provide an end-to-end, reliable data transfer service or whether this should be the responsibility of the transport layer.

An end-to-end, reliable data transfer service can be considered as a connection-oriented approach whereas a non-reliable service can be considered as a connectionless approach. Academics and implementers in the network field are sharply divided as to the merits and appropriateness of each approach.

6.6.1 Connection-Oriented vs Connectionless Data Transfer

Two models exist for data transfer across a network, i.e. connection-oriented and connectionless. There is significant difference of opinion within the networking community as to the most appropriate model and at which layer connection establishment should be implemented. Connection-oriented service requires greater management overhead in the case of short transfers, but is more efficient for long-lived connections. Connections also provide for better sequencing of data on reception.

The most appropriate model for a particular application should be determined by the type of data delivery service required, i.e. the service models. In general, connections should be implemented by the highest data transfer layer implemented within that particular profile. For instance, if an application functions in a request/reply mode with a transfer size of only one packet, then the connectionless mode should be used to take advantage of connectionless response times. On the other hand, if most of the transfers are simplex or consist of large numbers of packets, the transfers should use the connection-oriented mode in order to ensure packet delivery and integrity of data.

6.6.1.1 Connection-Oriented Approach

A connection-oriented approach involves a set of end-to-end transactions whereby a connection is established between nodes until the connection is no longer required. Data transfer occurs while the connection is valid, followed by a final packet or transaction to disestablish the connection.

A connection-oriented approach is appropriate for state-type data, long-lived messages, continuous media and critical virtual circuits.

6.6.1.2 Connectionless Approach

A connectionless approach involves the transmission of independent messages on an individual basis. Each message carries the full destination address. Message-level error control is not performed and this is left to the transport layer.

A connectionless approach is appropriate for event-type data and control messages (e.g. reliable datagrams) as well as transactions (e.g. client-server type applications).

6.6.2 Network Layer Requirements

In summary, the optimal candidate as Network Layer Protocol for the Real-Time LAN Profile is a commercial standard that :

• meets present and future functional and performance requirements (routing, congestion control, global addressing)

• provides a connectionless service to the Transport Layer

• supports special services (such as priority routing)

• is available as a production quality software product

• minimises implementation risk by having widespread application

A detailed analysis of Network Layer Protocol standards is provided in Appendix C along with conclusions and recommendations as to the most appropriate of these for the applications of interest. It is concluded there that two of the current commonly used standards will meet most of the requirements of the next generation of real-time, mission-critical, distributed systems. These two standard Network Layer protocols are the Internet Protocol (IP) and the OSI Connectionless Network Protocol (CLNP).

6.6.3 OSI Network Layer Protocol

The OSI Profile provides services from Layer 3 to Layer 7. At Layer 3, the OSI Profile offers the OSI Network Protocol. Two options are available, i.e. the Connection-Oriented Network Protocol (CONP) and the Connectionless Network Protocol (CLNP).

Originally the OSI Basic Reference Model followed the connection-oriented approach. Later a connectionless option was also adopted[124]. The reason behind this was that, at the time, networking practitioners were unsure whether to implement connection establishment at the network or transport layers. Modern thinking is that it is most appropriate at the transport layer as this supports maximum flexibility in the implementation of the user's data transfer policy.

Various commercial companies have developed protocol products (usually software) to conform to the OSI NL standard. These are available as "shrink wrap" off-the-shelf products which a system implementer binds together with his other protocol software to form a software system. It is also possible, but normally more difficult, to obtain OEM (Original Equipment Manufacturer) type products which an implementer can tailor for incorporation into his system. Often the NL is bundled together with the transport layer and an access interface to the NL is not available. This implies that such products cannot be used to supply a network layer to any transport layer other than the incorporated one.

6.6.3.1 Connection-Oriented Network Protocol

The Connection-Oriented Network Protocol (CONP) is an OSI conformant protocol described by ISO 8348[16]. At this time, only a means for adapting the X.25 network protocol is provided. This has not found wide acceptance, especially in the USA.

6.6.3.2 Connectionless Network Protocol

The Connectionless Network Protocol (CLNP) is an OSI conformant protocol specified by ISO 8473[17]. CLNP is prescribed as the OSI family option at the Network Layer of the SAFENET LAN Profile. CLNP is also an option for the SAFENET real-time profile running under XTP.

CLNP provides for passing data units from one End System to another End System (through Intermediate Systems where required). An End System is a user station which does not route data to other users. An Intermediate System is a station that routes data between the initial sending and final receiving of data. CLNP provides two primary services: relaying of data units and route determination. Relaying of data units involves the passing of a data unit from one End or Intermediate System to another. Route determination is where the paths for the relaying of data units are dynamically determined. In most typical implementations, there are many more End Systems than Intermediate Systems. The OSI routing architecture is intended to result in a few complex Intermediate Systems balanced by many simple End Systems.

Two types of protocol data units (PDUs) are defined in CLNP, the Data PDU and the Error PDU. The Data PDU carries data from the Transport Layer to the lower layers of the network, and the Error PDU is used to report to the source node any packets that had to be dropped. Several functions are defined in CLNP in order to build these PDUs and control the forwarding of them to other nodes. These functions are used by the other OSI protocols as building blocks for more sophisticated routing functions.

CLNP uses three routing protocols: End System - Intermediate System Routing Protocol (ES-IS), Intermediate System - Intermediate System Routing Protocol (IS-IS) and Inter-Domain Routing Protocol (IDRP). The addresses used by CLNP identify nodes instead of interfaces (cf. IP).

CLNP is more modern than IP and was in fact developed from IP. It therefore overcomes some of the problems of the latter and is even being considered as an IPng contender, i.e. TUBA (refer Appendix C). However, unless IPng offers CLNP a new lease of life (which is considered unlikely), it is contended that IP will spell the demise of CLNP.

6.6.4 Internet Protocol (IP)

The Internet Protocol (IP) is the primary network layer protocol of the Internet family. In the Internet Profile, the Internet Layer is concerned with routing data between two hosts attached to different or multiple networks. An internet is an interconnected set of networks. The Internet Protocol (IP) is used at this layer. IP was developed by the US Department of Defense and is specified in MIL-STD-1777[25].

IP is specified as the Internet family option at the Network Layer of the SAFENET LAN Profile. IP is also an option for the SAFENET real-time profile running under XTP.

The Internet Protocol is a connectionless network layer protocol that is the foundation of the Internet Protocol Suite. Packets sent via IP are independent, i.e. they may travel on independent paths to reach their destination. Routers running IP are not required to maintain state information describing the streams of packets passing them. Consequently, reserving bandwidth and guaranteeing end-to-end latencies is difficult.

There are a number of additional protocols that are used in conjunction with IP. The Address Resolution Protocol (ARP) is used by hosts to determine the mapping between an IP address and a MAC address. IP datagrams must be encapsulated in a MAC packet before being transmitted on a LAN.

The Internet Control Message Protocol (ICMP) is used for a number of purposes. A simple echo facility aids in debugging IP networks. Another ICMP function is to redirect hosts to use a more optimal route to reach a destination. Requests for reports on the address mask used on a network can also be made via ICMP.

The Routing Information Protocol (RIP) and the Open Shortest Path First (OSPF) protocol are two popular routing protocols used by IP. Both protocols are used to compute and distribute routing information throughout an IP network. However, they use different algorithms to achieve this: RIP is a distance-vector protocol, whereas OSPF is a link-state protocol.

Despite the wide acceptance of IP, it has some fundamental design problems which will render it ineffective in the medium term. The first result of these problems is that classical IP addresses will be expended in the short term. The second result is that IP routing capability will become saturated. This is covered in greater detail in Appendix C. However, these problems will only be significant in wide area networks and therefore do not affect most real-time applications.

6.6.5 XTP-Aware IP Routing

While maximising interoperability, the employment of IP can have latency and throughput implications in internetwork topologies because IP is not optimised for routing packets across a network boundary.

Functional extensions to an IP Router can overcome this problem, however. XTP features a traffic descriptor which will allow an XTP-aware IP router to prioritise and expedite XTP packets. An extended IP implementation recognizes XTP packets, accomplishes resource reservation and admission control, guarantees quality of service, etc.
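The prioritising behaviour described above can be sketched as a priority queue in the forwarding path. This is an illustration only: the class, the numeric priority levels and the rule that packets without a traffic descriptor receive the lowest priority are all hypothetical, and the sketch does not reproduce the XTP header format or any existing router implementation.

```python
import heapq
import itertools

BEST_EFFORT = 255  # assumed level for packets that carry no traffic descriptor


class PriorityForwarder:
    """Toy forwarding queue: the most urgent packet is always sent first."""

    def __init__(self):
        self._heap = []
        self._arrival = itertools.count()  # keeps FIFO order within one priority level

    def enqueue(self, packet, priority=None):
        # A lower number means a more urgent packet.
        level = BEST_EFFORT if priority is None else priority
        heapq.heappush(self._heap, (level, next(self._arrival), packet))

    def dequeue(self):
        """Remove and return the most urgent queued packet."""
        return heapq.heappop(self._heap)[2]
```

With such a queue, a packet whose descriptor indicates high urgency overtakes ordinary IP traffic that arrived earlier, while traffic of equal priority is still forwarded in arrival order.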

While such products are still in the research stage, the University of Virginia has developed a prototype XTP-Aware IP Router capable of 50 to 60 Mbit s⁻¹ throughput over an FDDI network[51, 141].

It is contended that this is a major requirement for real-time, mission-critical systems which employ an internetwork topology and IP.

6.6.6 Choice of Network Layer Protocol

It is concluded that CLNP and IP have equivalent functionality and performance and are therefore, theoretically, both suitable candidates for real-time, mission-critical, distributed systems.

To reduce the candidates to only one, two approaches can be considered: either to adopt a multiprotocol approach or to discard one option in favour of the other. In the latter case, considerations other than technical ones apply. It is contended that, despite the elegance and technical appropriateness of the OSI network layer protocol, IP will prevail in the real world due to its extensive installed Internet (ARPAnet) base and consequent legitimacy amongst many network users, championed by the world's largest organisation, the US Department of Defense.

Real-world realities have also resulted in commercial networking companies abandoning their OSI protocol products. This can only point to the ultimate demise of the OSI product profile.

The recommended option for the network layer protocol of the Real-Time LAN Profile is therefore the Internet Protocol (IP).

It is thus contended that the identified Network Layer option satisfies the system requirements of internetworking (i.e. routing) and logical interconnectivity (i.e. global addressing).

6.7 Transport Layer Protocol

Layer 4 of the OSI model is termed the Transport Layer (TL). The functions of the TL protocol are to enhance data communication reliability by providing end-to-end dataflow control and to optimise use of the network by providing special services. The transport layer provides an interface between the higher application-oriented layers and the underlying network-dependent layers. It provides the higher layers with a reliable message transfer service that is independent of the underlying network type. The transport layer masks the detailed operation of the underlying network from the higher layers and provides the latter with a defined set of message transfer facilities.

The functions of the TL protocol are critical to the suitability of local area networks for real-time, mission-critical, distributed systems as they determine such capabilities as end-to-end error, flow and rate control as well as special services which are of specific significance to mission-critical, real-time systems. Examples of such special services are multicast, timestamping, priority message scheduling and security.

A detailed analysis of Transport Layer protocols is provided in Appendix D along with conclusions and recommendations as to the most appropriate of these for the applications of interest. It is concluded there that none of the current, commonly used protocols will meet all of the requirements of the next generation of real-time, mission-critical, distributed systems. However, certain new protocols do meet these requirements. It is also concluded that current, commonly used protocols such as TCP will have such widespread use for the medium term that a multiprotocol approach is appropriate at the Transport Layer in order to maintain maximum interconnectivity. In addition, commonly used higher layer protocols such as the File Transfer Protocol (FTP) rely on the services provided by TCP.

In order to support real-time, mission-critical, distributed systems, networks are required to provide enhanced transport services. Traditional transport layer protocols are deficient in certain of these capabilities. Important deficiencies include the lack of effective priority message scheduling, reliable datagrams, efficient transaction processing and advanced dataflow control. Real-time, mission-critical, distributed systems also require that the user of the network services is able to tailor the services and performance of these network services in order to optimise the performance of the system. To do this, the user must be able to define and invoke data transfer policy, with the transfer mechanism being transparently effected by network services.

6.7.1 Transport Layer Requirements

It is therefore proposed that one or more standard commercial TL protocols be selected that :

• meet present and future functional requirements (e.g. dataflow control)

• meet present and future performance requirements (in terms of latency, jitter and throughput)

• provide a sophisticated, multilevel, dynamic message priority scheme

• provide special services (such as service discrimination and out-of-band signalling)

• support special services (such as timestamping)

• provide a reliable multicast service

• support transfer of real-time continuous media

• provide interconnectivity between diverse systems

• allow the real-time system implementer to set his own data transfer policy and not be constrained by protocol mechanism

• are available as production quality software products

• minimise implementation risk by having or finding widespread application

6.7.2 TCP

TCP (Transmission Control Protocol) is the primary transport protocol of the Internet Profile. It was developed by the US Department of Defense and is specified in MIL-STD-1778[26].

TCP offers services that communicate between, and provide control of, incompatible computers and networks. It is sometimes referred to as "the Rosetta Stone of Internetworking". It provides point-to-point, guaranteed-delivery communication between networked nodes and was originally designed for packet switching communications. It was designed to work over the unreliable network data delivery service provided by the Internet Protocol (IP).

6.7.2.1 TCP Priority Scheme

TCP provides a priority message scheduling scheme, but it is somewhat primitive. It consists of two priority mechanisms, the push flag and the urgent pointer. Priority data is inserted into a packet with the push flag set. Such packets are then delivered to the user in an expedited, but still-in-sequence fashion. The transmission buffers for the connection are flushed immediately when the push flag is set. The urgent pointer indicates the place in the data stream where data requiring special attention begins. Urgent data may travel in packets carrying ordinary data.

TCP does not specify what actions a receiver must take to expedite processing of urgent data.

6.7.2.2 TCP Deficiencies

While TCP has found extensive implementation in large and sophisticated networks, it was designed in the era of 56 kbit s⁻¹ data links[125] and is intrinsically unable to support data rates much above a few Mbit s⁻¹[70]. Although extensive in its internetworking features, it has not been designed for deterministic and real-time data transfer. By only providing point-to-point communication it cannot support multicast, which is required by many real-time, distributed systems.

TCP was not engineered for the type of high-speed networks that are beginning to find application (e.g. FDDI and ATM). TCP's sequence spaces and window sizes are too small to support gigabit performance.

TCP uses a 16-bit window size that allows up to 64 kbytes of data to be sent before acknowledgement is required. With high data transmission speeds, a transport user will be forced to wait for such an acknowledgement every 64 kbytes. At 1 Gbit s⁻¹, this would occur every 490 µs. This would seriously reduce throughput. Proposals have, however, been made to provide a scaling factor for the window size to allow it to extend beyond its current limit[85].
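The arithmetic behind the window limitation can be sketched as follows. The function names are hypothetical. Taking 64 kbytes as 65 536 bytes, the window drains in about 524 µs if 1 Gbit is taken as 10⁹ bits, or about 488 µs if it is taken as 2³⁰ bits; the latter convention appears to be consistent with the figure of 490 µs, although the text does not state which convention is used.

```python
WINDOW_BYTES = 64 * 1024  # largest window expressible in 16 bits, taken as 64 kbytes


def window_drain_time_us(rate_bit_s, window_bytes=WINDOW_BYTES):
    """Time in microseconds needed to transmit one full window at the given rate."""
    return window_bytes * 8 / rate_bit_s * 1e6


def throughput_ceiling_bit_s(round_trip_s, window_bytes=WINDOW_BYTES):
    """Upper bound on throughput when one window is sent per round trip."""
    return window_bytes * 8 / round_trip_s


decimal_gigabit = window_drain_time_us(10**9)  # about 524 microseconds
binary_gigabit = window_drain_time_us(2**30)   # about 488 microseconds
```

The second function shows why the window matters: once the round-trip time exceeds the drain time, the sender sits idle waiting for an acknowledgement and throughput is bounded by the window size divided by the round-trip time, whatever the link rate.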

TCP's 32-bit sequence space size is also a problem. At gigabit speeds, the sequence space is capable of wrapping every 17 s. A wrapped sequence space introduces an ambiguity about whether a received packet is for an active connection or is intended for a previously closed connection. Two methods have been proposed to prevent the sequence space wrapping problem. The first extends the sequence space to 64 bits and the second uses TCP timestamps to protect against old duplicate packets.
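The wrap interval can be reproduced with a short calculation. The sketch assumes that TCP numbers every byte and that ambiguity arises once half of the 32-bit space (2³¹ bytes) has been consumed, which is the convention that yields approximately 17 s at 1 Gbit s⁻¹; counting the full 2³² bytes gives about 34 s. The function name and parameter are hypothetical.

```python
def wrap_time_s(rate_bit_s, usable_sequence_bits=31):
    """Seconds until the usable part of the byte sequence space is consumed."""
    return (2 ** usable_sequence_bits) * 8 / rate_bit_s


half_space = wrap_time_s(10**9)      # about 17 s
full_space = wrap_time_s(10**9, 32)  # about 34 s
```

Either figure is shorter than the time for which an old duplicate packet might plausibly survive in an internetwork, which is why a larger sequence space or a timestamp check is needed at these speeds.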

Because of TCP's intrinsic connection-oriented design, it cannot efficiently provide request/response type interactions without an excessive number (six) of connection management packet exchanges. While TCP can transmit user data in the first packet of connection establishment, this is only passed to the application when the connection is confirmed by the third packet.

When TCP is used for multiple simultaneous transactions it exhibits an even more serious problem: it can occupy the host processor almost entirely with protocol execution. The reason for this is the so-called TIME-WAIT state of the TCP state machine: after a TCP connection is terminated, it enters a "zombie" state for up to four minutes before the individual instance of the state machine is finally destroyed. This is to ensure that duplicated connection establishment packets that arrive after a connection has been terminated do not get interpreted as connection establishment requests for a new connection. By contrast, XTP associations are terminated by using an explicit handshake, augmented where necessary with a timer; state information is de-allocated upon completing the handshake, although the entry in the context lookup database is kept until no hazard due to re-appearing packets is possible. Consequently, XTP avoids the overhead of managing "zombie" contexts[125].
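The scale of the problem can be estimated with a steady-state argument: if each transaction uses its own connection, the number of lingering state machine instances is approximately the transaction rate multiplied by the time each instance lingers. This relationship is an assumption introduced here for illustration; the function name and the example rate are hypothetical.

```python
def lingering_contexts(transactions_per_s, time_wait_s=240):
    """Approximate number of terminated connections still held in TIME-WAIT.

    Assumes one connection per transaction and the worst-case lingering
    time of four minutes (240 s) quoted in the text.
    """
    return transactions_per_s * time_wait_s


# A host completing 100 transactions per second would hold this many contexts.
example = lingering_contexts(100)
```

Each such context must be stored and searched when packets arrive, which suggests why the protocol overhead grows with the transaction rate.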

TCP transaction times also increase significantly under load while the maximum transaction rate is low. Davids and Karakbek[63] report that on a Sun workstation they found TCP's maximum transaction rate was about 25 per second and durations extended to over 30 ms, while the CPU usage was in the order of 86%. Generally, this would have unacceptable implications for application processing. By comparison, XTP's maximum transaction rate was about 110 per second and durations were always less than 10 ms, while the CPU usage was less than 5%.

TCP's error control only allows a go-back-n error recovery scheme. This is inefficient, especially if only a few bytes in an otherwise error-free transmission are corrupt.

As described in Paragraph 6.7.2.1 above, TCP's priority scheme is too primitive for many real-time applications.

6.7.2.3 TCP Suitability

Despite being somewhat dated in terms of performance and capability, there are many qualified implementations of TCP for many operating systems. TCP's wide implementation also maximises interconnectivity between diverse systems. For these reasons, TCP is a good supplementary protocol for the short to medium term.

6.7.3 ISO TP4

TP4 is the ISO Class 4 Transport Protocol which was designed in 1982. Layer 4 of the OSI model, i.e. the Transport Layer[39], consists of five classes of increasing capability with respect to retransmission of lost data, flow control and reordering of packets. TP4 is specified in ISO 8073[14].

TP4 offers reliable end-to-end data transfer service over an unreliable network connection. TP4 provides the capability to establish a connection and transfer Transport Protocol Data Units (TPDUs) on that connection. It includes mechanisms for segmentation, flow control and multiplexing of several transport connections over a single network connection. TP4 also provides the capability to detect and recover from errors which occur at the network layer and below. TP4 provides a message-type service in contrast to TCP's stream-oriented service.

6.7.3.1 TP4 Priority Scheme

TP4 also provides a priority message scheduling scheme, but again this is somewhat primitive. It does so by providing two sequence spaces, one for ordinary data and one for expedited data. After the transmission of expedited data, no other packet may be transmitted until the data packet is acknowledged. Urgent data may not travel in packets carrying ordinary data.

6.7.3.2 TP4 Deficiencies

TP4 also cannot efficiently provide request/response type interactions without an excessive number of connection management packet exchanges (at least 5 in this case). As TP4 assumes that aborts will be handled by a higher protocol layer, it does not offer a graceful close mechanism. While TP4 can transmit user data in the first packet of connection establishment, this is limited to just 32 bytes.

As described in Paragraph 6.7.3.1 above, TP4's priority scheme is too primitive for many real-time applications.

6.7.3.3 TP4 Suitability

While not providing the highest levels of performance required for LANs and internetworks of the future, the ISO OSI protocols are amongst the most modern of which software implementations exist and that have fairly widespread application. Furthermore, the ISO OSI model does currently provide the most widely accepted communications model. For these reasons, ISO TP4 may be an appropriate supplementary protocol for some applications in the short term. It is predicted, however, that TP4 (as well as other OSI protocols) will disappear in the medium to long term (5 to 10 years) due to predation by the Internet Protocols and ATM.

6.7.4 XTP

The XTP (Xpress Transport Protocol)[38, 141] is a new standard which meets most of the requirements of real-time systems.

XTP is prescribed as the real-time option at Layer 4 for the SAFENET LAN Profile.

6.7.4.1 XTP History

XTP's current status is Revision 4.0 where it is termed the Xpress Transport Protocol and conforms to a true ISO OSI transport layer protocol[37, 125, 138]. Previously (as recently as mid-1994), it was termed the Xpress Transfer Protocol and was defined to encompass both transport layer and network layer. This was done in order to enhance performance as it was recognised that bottlenecks invariably resulted at the interface between these two protocol layers.

Originally XTP was designed to be a "silicon" protocol, i.e. implemented in hardware, and was termed a "Protocol Engine"[48, 49]. This concept proved to be unsuccessful (at least commercially) due to the dynamic nature of transport and network layer protocol definitions, with each new definition obsoleting all previous silicon versions. Another influencing factor is that processing power is now an abundant commodity and general purpose processors can easily supply the required protocol performance (at least for current data rates in the order of 100 Mbit/s) without requiring special-to-type devices such as RISC processors.

6.7.4.2 XTP Capabilities

The major features of XTP are detailed in Appendix D. A summary of the most important functional features of XTP is as follows :

! Reliable Multicast
! Multicast Group Management
! Reliable Datagram
! Multilevel Priority Message Scheduling
! Rate and Burst Control
! Efficient Connection Management (requiring only 3 packets)
! Selectable Error Control
! Selectable Flow Control
! Selective Retransmission
! Selective Acknowledgement
! Maximum Transmission Unit Detection
! Out-of-Band Data
! Alignment
! Traffic Descriptors

6.7.4.3 XTP's Orthogonal Approach

XTP's orthogonal approach to policy and mechanism is one of its most important features as a real-time protocol. What is meant by this orthogonal approach is that the protocol definition and implementation differentiate between policy regarding real-time LAN issues such as dataflow control and the mechanisms by which these are actually implemented, as well as how they interface to the user application. XTP provides the implementer with options for almost every protocol feature. Some of the options that are of specific significance to real-time systems are summarised in Table IV below.

Real-Time Feature            XTP Options

Addressing Scheme            Parametric (e.g. Internet, ISO)
Retransmission               Go-back-n, Selective, Automatic Repeat Request (ARQ)
Flow Control                 Credit-based Sliding Window
                             Full, None, Reservation Mode
Error Control                Full, None
Rate Control                 Full, None
Burst Control                Full, None
Maximum Transmission Unit    Parametric
Acknowledge                  On Demand, Fast Negative Acknowledge (FASTNACK)
Priority                     Multiple Levels (fine granularity)
                             2^16 for Transmit, 2^16 for Receive
Multicast                    Various Multicast Address Schemes
                             Unreliable, Semi-Reliable, Reliable Multicast
                             Multicast Group Management

Table IV : XTP Implementer Options

6.7.4.4 XTP Suitability for Real-Time LAN Profile

XTP's flexible transfer capabilities, in conjunction with FDDI's synchronous and asynchronous transfer modes, provide good support for the implementation of event-type, real-time distributed systems, in both LAN and internetwork topologies, while retaining guaranteed response service.

6.7.4.5 XTP Deficiencies

While XTP provides many of the capabilities required to support real-time, mission-critical, distributed systems, it is nevertheless deficient in some areas.

It is concluded that XTP is specifically deficient in the following areas :

! Lack of parameterised amount of error that can be tolerated in a specific period of time (which is important for continuous media).

! Does not support a precedence feature.

! Does not provide performance guarantees (e.g. latency control).

! Provides a static rather than dynamic priority scheme.

! Does not provide network timing services.

6.7.5 Priority Management

Real-time networks are required to provide priority message services in order to ensure that time-critical messages meet their deadlines.

An important consideration in priority message scheduling is priority inversion. This can arise from a number of causes. For example, the IBM Token Ring and FDDI MAC-layers offer eight levels of priority (for asynchronous messages in the case of FDDI) while Token Bus has four. Packets tagged with different system priorities may be mapped onto the same MAC priority and therefore treated as equal, potentially causing message priority inversion. Also, message priorities can only be resolved within a node and not within the complete network. Therefore a downstream node with a high priority packet may have to wait for upstream nodes to transmit their lower priority packets. Depending on the I/O hardware architecture, priority inversion can also occur within a node (e.g. in IBM Token Ring due to the availability of only a single request queue).

It is therefore important that appropriate mapping of user-level priorities (or urgency) to transport layer priorities and finally to MAC-layer priorities be undertaken. It is also important to realise that priorities cannot guarantee packet transmission order. It is proposed that the task of priority mapping should be allocated to a software-implemented Priority Manager.
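The mapping problem can be illustrated with a minimal sketch of one possible Priority Manager rule, namely proportional scaling of a finer priority range onto a coarser one. The function name, the scaling rule and the choice of 16 transport-layer levels are assumptions made for illustration; only the eight MAC-layer levels come from the discussion above.

```python
def map_priority(level, source_levels, target_levels):
    """Scale a priority level (0 = lowest) from a finer range onto a coarser one."""
    if not 0 <= level < source_levels:
        raise ValueError("priority level out of range")
    return level * target_levels // source_levels


# Assumed: 16 transport-layer levels mapped onto 8 MAC-layer levels.
mac_levels = [map_priority(p, 16, 8) for p in range(16)]

# Pairs of distinct transport priorities that collapse onto the same MAC level.
collisions = [(a, b) for a in range(16) for b in range(a + 1, 16)
              if mac_levels[a] == mac_levels[b]]
```

Every pair in `collisions` is treated as equal by the MAC layer, which is precisely the condition under which priority inversion may arise. A practical Priority Manager would therefore reserve the distinct MAC levels for the distinctions that matter most to the system.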

Table V provides an example of a priority band allocation of four levels.

Message Type                 Priority Band

State (Repetition Cycle)
    5 milliseconds           Band 1
    20 milliseconds          Band 2
    100 milliseconds         Band 3
    1 second                 Band 4

Event (Deadline)
    5 milliseconds           Band 1
    20 milliseconds          Band 2
    100 milliseconds         Band 3
    1 second                 Band 4

Table V : Example of System-Level Message Priority Scheme
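A band allocation such as that of Table V could be applied in software along the following lines. The table itself only defines four nominal values; the rule used here for intermediate values (a message is placed in the most relaxed band whose nominal time does not exceed its repetition cycle or deadline) is an assumption, as are the names.

```python
# (nominal time in milliseconds, priority band), taken from Table V
BANDS_MS = [(5, 1), (20, 2), (100, 3), (1000, 4)]


def band_for(time_ms):
    """Return the priority band for a repetition cycle or deadline in milliseconds.

    Assumed rule: choose the most relaxed band whose nominal time is not
    longer than the requirement; anything tighter than 5 ms falls in Band 1.
    """
    for nominal_ms, band in reversed(BANDS_MS):
        if time_ms >= nominal_ms:
            return band
    return BANDS_MS[0][1]
```

With this rule a 10 millisecond deadline is allocated to Band 1 rather than Band 2, which errs on the side of urgency.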

6.7.6 Error Management

Similarly to priority management, a system-level error control policy is required to be defined and implemented. Such a policy can be derived from message precedences, priorities and transmission modes. In the case of detected errors, various actions such as retries, local and remote alerts and alarms can be implemented, depending on the nature of the particular real-time, mission-critical system.

Table VI provides an example of a system-level error control policy.

                          Error Control Policy/Action

Precedence  Priority  Transmission   Acknowledge             Retry   Action on Fail
                      Mode                                   Count   Sub-System Management

Critical    Band 1    Synchronous    FASTNACK, Acknowledge   ≤ 2     Alarm, Alert, Record
            Band 2    Synchronous    FASTNACK, Acknowledge   ≤ 4     Alarm, Alert, Record
            Band 3    Synchronous    Acknowledge             ≤ 8     Alarm, Alert, Record
            Band 4    Asynchronous   Acknowledge             ≤ 16    Alarm, Alert, Record

Major       Band 1    Synchronous    FASTNACK                ≤ 2     Alert, Indicate, Record
            Band 2    Synchronous    FASTNACK                ≤ 6     Alert, Indicate, Record
            Band 3    Asynchronous   Acknowledge             ≤ 8     Alert, Indicate, Record
            Band 4    Asynchronous   Acknowledge             ≤ 12    Alert, Indicate, Record

Minor       Band 1    Synchronous    NOERROR                 none    None, Record
            Band 2    Asynchronous   FASTNACK                ≤ 4     None, Record
            Band 3    Asynchronous   Acknowledge             ≤ 6     None, Record
            Band 4    Asynchronous   Acknowledge             ≤ 8     None, Record

Table VI : Example of System-Level Message Error Control Policy
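A policy of this kind lends itself to a table-driven implementation. The following sketch transcribes a few rows of Table VI into a lookup keyed by precedence and priority band and derives the next action from the number of retries already made. The retry count is read here as an upper bound, and the type, field and function names are hypothetical.

```python
from typing import NamedTuple, Optional, Tuple


class ErrorPolicy(NamedTuple):
    transmission_mode: str
    acknowledge: Tuple[str, ...]
    max_retries: Optional[int]   # None where Table VI lists "none"
    on_fail: Tuple[str, ...]


# A few rows transcribed from Table VI, keyed by (precedence, priority band).
POLICY = {
    ("Critical", 1): ErrorPolicy("Synchronous", ("FASTNACK", "Acknowledge"), 2,
                                 ("Alarm", "Alert", "Record")),
    ("Critical", 4): ErrorPolicy("Asynchronous", ("Acknowledge",), 16,
                                 ("Alarm", "Alert", "Record")),
    ("Major", 2): ErrorPolicy("Synchronous", ("FASTNACK",), 6,
                              ("Alert", "Indicate", "Record")),
    # The "None" entry of the Minor rows is omitted; only "Record" is an action.
    ("Minor", 1): ErrorPolicy("Synchronous", ("NOERROR",), None, ("Record",)),
}


def next_action(precedence, band, retries_done):
    """Return ("Retry",) while retries remain, else the configured fail actions."""
    policy = POLICY[(precedence, band)]
    if policy.max_retries is not None and retries_done < policy.max_retries:
        return ("Retry",)
    return policy.on_fail
```

Keeping the policy in data rather than in control flow allows the system integrator to adjust it without modifying the error handling code.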

6.7.7 Quality of Service

While current transport protocols do not act on QoS parameters themselves, the next generation of protocols is being designed to do so. Such protocols include the Resource Reservation Protocol (RSVP) and Q.93B, while the ATM Forum is considering QoS approaches for implementation in the ATM Adaptation Layer. General-purpose transport protocols should in any event be capable of passing QoS parameters to next-generation protocols.

6.7.8 Flow Control

Because buffers are a finite resource in the I/O sub-system, flow control is necessary in order for a receiver to constrain the volume of data that a transmitter may send to it.

Two basic techniques exist to achieve flow control, i.e. a fixed sliding window scheme and a credit scheme. Both schemes rely on feedback from the receiver. A combination of both is possible and provides the most flexible solution.

6.7.8.1 Sliding Window Scheme

A sliding window of packet-based sequence numbers is maintained at both the transmitter and receiver. At the receiver this window represents the sequence numbers of those packets that may be accepted; any data packet arriving with a sequence number outside the window is discarded. The window is of fixed size and simply slides across the sequence space as packets are acknowledged.
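The receiver side of the fixed window scheme can be sketched as follows. The class name, the 16-bit sequence space and the simplification that in-order delivery stands in for acknowledgement are assumptions made for illustration; the sketch does not represent any particular protocol.

```python
class FixedWindowReceiver:
    """Receiver accepting only sequence numbers inside a fixed-size window."""

    def __init__(self, window_size, modulus=2 ** 16):
        self.window_size = window_size
        self.modulus = modulus      # size of the sequence space (assumed)
        self.lower = 0              # lowest sequence number still expected
        self.buffer = {}            # accepted packets awaiting in-order delivery

    def in_window(self, seq):
        return (seq - self.lower) % self.modulus < self.window_size

    def receive(self, seq, payload):
        """Accept or discard a packet; return the payloads now deliverable in order."""
        if not self.in_window(seq):
            return []               # outside the window: discarded
        self.buffer[seq] = payload
        delivered = []
        # Slide the window past every contiguous packet (acknowledged in effect).
        while self.lower in self.buffer:
            delivered.append(self.buffer.pop(self.lower))
            self.lower = (self.lower + 1) % self.modulus
        return delivered
```

The modulo arithmetic in `in_window` keeps the test valid when the sequence numbers wrap around the end of the sequence space.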

6.7.8.2 Credit Scheme

The credit scheme also uses a sliding window, but the window is of variable size. The window size is controlled by a credit parameter in the acknowledgement packets. The credit value is used to determine the upper edge of the window while the sequence number of the acknowledgement packet is used to set the lower edge.
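The transmitter side of the credit scheme can be sketched in the same manner. It is assumed here that the acknowledgement carries the next sequence number expected by the receiver together with a credit expressed in packets, and that sequence numbers do not wrap; the names are hypothetical.

```python
class CreditSender:
    """Transmitter whose window edges are set by acknowledgement and credit."""

    def __init__(self, initial_credit):
        self.lower = 0                  # lower edge: lowest unacknowledged number
        self.upper = initial_credit     # upper edge: first number NOT yet permitted
        self.next_seq = 0

    def can_send(self):
        return self.next_seq < self.upper

    def send(self):
        if not self.can_send():
            raise RuntimeError("window closed: no credit available")
        seq = self.next_seq
        self.next_seq += 1
        return seq

    def on_ack(self, ack_seq, credit):
        """The acknowledgement sets the lower edge, the credit the upper edge."""
        self.lower = ack_seq
        self.upper = ack_seq + credit
```

A receiver that is short of buffers can close the window entirely by returning a credit of zero, which a fixed window cannot express.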

6.7.9 Latency Control

While a new class of real-time protocols under development (e.g. ST-II and RSVP) has latency control attributes built-in, current transport protocols do not[137]. Where this service is not directly available from the data transfer services, other approaches need to be taken to ensure that latency is at least bounded.

One such approach is appropriate in FDDI LANs. FDDI offers both synchronous and asynchronous service (at the MAC-layer). It also guarantees transmission of synchronous packets within twice the Target Token Rotation Time (2*TTRT)[116] if the following conditions are met :

! Total synchronous transmission is known a priori

! Asynchronous transmission is bounded to a finite proportion of total bandwidth utilisation

Thus, since a synchronous packet may wait up to 2*TTRT before transmission, the TTRT must be designed to be at most half the shortest message deadline.

The SAFENET Network Development Guidance[103] provides a quantitative method of performing synchronous bandwidth allocation.
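The TTRT design rule amounts to a single inequality, 2*TTRT not exceeding the shortest deadline, which can be checked as shown below. The deadline values are hypothetical and the sketch ignores any limits that the FDDI standard or a particular implementation places on the TTRT itself.

```python
def max_ttrt_ms(deadlines_ms):
    """Largest TTRT for which the 2*TTRT bound stays within the shortest deadline."""
    return min(deadlines_ms) / 2


def meets_deadline(ttrt_ms, deadline_ms):
    """True if the worst-case synchronous access time 2*TTRT is within the deadline."""
    return 2 * ttrt_ms <= deadline_ms


# Hypothetical synchronous message deadlines, in milliseconds.
deadlines = [20, 100, 1000]
ttrt = max_ttrt_ms(deadlines)
```

The check covers medium access only; the synchronous bandwidth allocated to each node must still be sufficient for its traffic.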

6.7.10 Transport Layer Protocol for Real-Time LAN Profile

Considering the requirements of the transport layer and the capabilities of the available options, the recommended real-time option for the Real-Time LAN Profile is therefore the Xpress Transport Protocol (XTP), while the recommended maximum interconnectivity option is the Transmission Control Protocol (TCP). A multiprotocol approach is therefore proposed.

Thus the identified Transport Layer options satisfy the derived requirements of flexible dataflow control, priority message scheduling and interconnectivity. They also provide a rich set of service models and service discrimination to cater for the various payload types of a diverse set of real-time distributed applications. Together these capabilities are contended to satisfy the system requirement of error-free, as well as timely and orderly data transfer. With a multiprotocol approach, real-time performance is achievable, while maximum interconnectivity can be retained.

Where the transfer layers are unable to provide the required capabilities (such as precedence discrimination and synchronisation), the Real-Time LAN Profile must make provision for these deficiencies within the Extended Profile Services.

6.8 Extended Profile Services

Extended Profile Services are those provided by the protocol layers above the transport layer. In the ISO OSI Profile, these services are provided in the Session, Presentation and Application Layers. It is well known that these higher layers of the ISO Profile are not well suited to real-time performance, while certain services at this level are also not well developed. As the SAFENET Network Development Guidance[103] observes :

"First the area of network management services is not adequately defined. Second, time efficient extended profiles, tailored to current Navy practice in inter-process message passing, do not exist in the OSI protocol set."

The concept of extended services is therefore necessary to allow the definition of application-specific services and protocols, especially to meet real-time performance and implementation-specific requirements. This can, however, represent a direct trade-off between performance and interoperability.

The proposed Real-Time LAN Profile provides the following Extended Profile Services: Application Interface Services (APIS), Network Time Services (NTS) and Network Management Services (NMS). NTS is based on a Network Time Protocol (NTP) while NMS is supported by Built-in Test Services (BITS) and the Simple Network Management Protocol (SNMP).

6.8.1 Application Interface Services

Generic application interface services are those services required by certain applications, but are not catered for by the lower protocol layers. In real-time systems such services must ensure that latency is minimised. Provision of such services allows implementers maximum flexibility in tailoring communication services at the upper layers to meet the needs of a specific application (e.g. passing messages between tasks executed by distributed processors).

Access to user services at the Application Layer can be greatly facilitated through the use of standard application programming interfaces (APIs). These provide well defined accessibility for application programs to obtain services or information from the underlying service provider, while hiding the complexity of that service provider from the application programmer. Thus the API provides service abstraction of the transfer layer to the application user.

It is proposed that an extended profile protocol with functionality spanning the session, presentation and application layers be defined and developed to meet the needs of the specific system and provide the generic, as well as specific features and capabilities to support the real-life, real-time system.

6.8.1.1 Capabilities

It is therefore contended that a higher-layer protocol (or set of protocols) with the following generic capabilities is required :

! Allows system applications to dynamically set up and manage system dataflow, i.e. system dataflow is data-based and not address-based

! Provides a transparent interface to each user application

! Does not detract from the real-time capabilities of the transfer layers beneath it

! Does not impact on the real-time performance of the user system above it

! Provides a virtual backplane to the application user

! Provides multiprotocol support, specifically for XTP, ISO TP4/CLNP and TCP/IP

! Provides an adequate tellback of status and error conditions to the user application without detracting from the latter's own processing performance

! Supports network management by providing Network Management Services

! Allows the real-time system implementer to set his own data transfer policy

! Provides the system integrator the maximum support to integrate and qualify the system with minimum effort

! Supports the system integrator in managing the system dataflow throughout the lifecycle of the system

! Meets the specific needs of the system, including :

" provision of an interface to the real-time operating system

" provision of a device driver for the supported network interface cards

! Supports an off-host network interface architecture

! Provides an interface to the host CPU

! Is implemented in the C++ or Ada high-level language

An example of such an extended protocol is APIS (Application Interface Services)[58] developed by the author and colleagues between 1993 and 1996.

A more detailed description of the APIS extended profile service is provided in Appendix E. An overview, the principles of operation and summary of APIS services are provided below.

6.8.1.2 APIS Overview

Application Interface Services (APIS) is a network communications protocol designed for the exchange of information between functionally independent applications incorporated in a distributed real-time system.

APIS conceptually encompasses Layers 5 to 7 of the ISO OSI Reference Model and so interfaces below to Layer 4, the Transport Layer and above to the APIS Service User (ASU) which will normally be a collaborative, networked, software application.

The ASU is a producer and/or consumer of data of different types. Data types are pre-defined by an ASU administration authority, i.e. the System Data Manager (e.g. the system integration authority) as part of the network system design and each data type is ascribed a unique identification code or Message Identifier. The APIS protocol establishes the necessary communication channels between ASUs by registering and matching their producers and consumers. LAN dataflow will therefore be determined by the data type of ASU messages and not by predefined ASU addresses.

This data driven approach to dataflow management provides a higher level of flexibility than the traditional addressed-point-to-addressed-point facilities provided by general purpose LAN protocols. The objective of this approach is to simplify ASU communication and configuration logic, thereby decoupling system design from network design.

6.8.1.3 Principles of Operation

The intrinsic operation of APIS requires a number of specific services from the lower layer protocols. These are broadcast, reliable multicast and multicast group management.

The requirement for broadcast implies the provision of media and MAC-layer protocols capable of multiple access. The requirement for reliable multicast implies the provision of this service by the transport layer, e.g. XTP, or possibly by LLC Type 4 (refer to Paragraph 6.5.3). The requirement for multicast group management implies the provision of this service by the transport layer, e.g. XTP, or implementation of this functionality within APIS itself.

Apart from lower layer services, APIS relies on two fundamental mechanisms to support the data-driven approach, i.e. message classification and regular expression or wildcarding.

Message classification characterises each message by type, sub-type and identifier (ID). Messages are grouped by type and sub-type and uniquely identified by Message ID. Such classification is made on the basis of origin (i.e. producer) and application category (e.g. contact, target, navigation data, etc.).

Producers and consumers register with their own (local) APIS which identifies them to the system by means of an APIS broadcast. The status of all producers and consumers is then maintained in a local status table which is updated periodically as well as at each significant status event. The aggregate of this process provides a distributed, real-time, dataflow management agent.

When a consumer requires a message it registers its requirement (i.e. demands the message) and APIS sets up all necessary internal mechanisms and control messages to ensure that the producer provides this message.
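The matching of producers and consumers by message type rather than by address can be illustrated with a small in-memory registry. This is a conceptual sketch only: the class, its method names and the example identifiers are hypothetical and do not represent the APIS implementation or its status table format.

```python
from collections import defaultdict


class DataflowRegistry:
    """Matches producers and consumers by Message ID rather than by address."""

    def __init__(self):
        self.producers = defaultdict(set)   # Message ID -> producing ASUs
        self.consumers = defaultdict(set)   # Message ID -> consuming ASUs

    def produce(self, asu, message_id):
        self.producers[message_id].add(asu)

    def demand(self, asu, message_id):
        self.consumers[message_id].add(asu)

    def channels(self):
        """Return sorted (producer, consumer, message_id) triples to be set up."""
        matched = self.producers.keys() & self.consumers.keys()
        return sorted((p, c, m)
                      for m in matched
                      for p in self.producers[m]
                      for c in self.consumers[m])


registry = DataflowRegistry()
registry.produce("radar", 0x0101)
registry.demand("display", 0x0101)
registry.demand("logger", 0x0202)      # no producer yet, so no channel results
```

Neither ASU names the other: a channel exists only because both refer to the same Message ID, and the unmatched demand simply waits until a producer registers.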

The Message Identification scheme is designed in such a way as to support wildcarding. Wildcarding is defined as group addressing by means of address subsets. Thus groups of producers and consumers can be accessed using a generic addressing scheme. APIS employs wildcarding to manage production and consumption of messages, both individually as well as by group (type) and sub-group (sub-type). Wildcard produce and wildcard demand are both supported as APIS service options.
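Wildcarding over the type, sub-type and identifier classification can be sketched as a pattern in which any field may be left unspecified. The representation chosen here (a tuple with `None` as the wildcard) and all names are assumptions for illustration and do not reflect the actual APIS Message Identifier encoding.

```python
from typing import NamedTuple, Optional


class MessageKey(NamedTuple):
    msg_type: int
    sub_type: int
    msg_id: int


class Pattern(NamedTuple):
    """A demand or produce pattern; None in a field acts as a wildcard."""
    msg_type: Optional[int] = None
    sub_type: Optional[int] = None
    msg_id: Optional[int] = None

    def matches(self, key):
        # Each field must either be a wildcard or equal the message's field.
        return all(want is None or want == have
                   for want, have in zip(self, key))


track = MessageKey(msg_type=1, sub_type=2, msg_id=3)
whole_type = Pattern(msg_type=1)            # every message of type 1
one_message = Pattern(1, 2, 3)              # exactly one message
```

A wildcard demand for a whole type thus subscribes a consumer to every present and future message of that type without further configuration.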

APIS employs multicast and multicast group management to manage production and consumption of messages. A Producer uses reliable multicast to transmit to all Consumers requiring a particular message. Consumer groups, including the joining and leaving of groups after startup, are managed by multicast group management facilities provided by XTP. For security reasons, joining a consumer group requires authentication by APIS.

6.8.1.4 Services

APIS services are message and dataflow control services supplied to the APIS Service User. A short description of each APIS service follows. A more detailed description of the APIS protocol is provided in Appendix E.

! Initialise

APIS_INIT()

This service command assists APIS with table administration. It allows for the removal of all ASU information linked to the application host that issued the command, as well as freeing of associated unused memory buffers. This command should only be issued once per application host after start-up. Multiple hosts are allowed.

! Register

APIS_OPEN()

This service command provides for the registration of the user application with APIS. This specifies the Application Service Access Point (ASAP) and the communication parameters between the Application User Interface and the user application.

! Deregister

APIS_CLOSE()

This service command provides for the closure of the ASAP.

! Produce Data

APIS_PRODUCE()

This service command registers a data message with the Application User Interface for transmission.

! Consume Data

APIS_DEMAND()

This service command registers a data message with the Application User Interface for reception.

! Deregister Production of Data

APIS_REMOVE_PRODUCE()

This service command provides for the removal of a stream data message from the Application User Interface.

! Deregister Consumption of Data

APIS_REMOVE_DEMAND()

This service command provides for the removal of a stream data message from the Application User Interface.

! Send Message

APIS_SEND_MSG()

This service command allows for the transmission of a stream data message from the user application to peer applications on the network that are registered as requiring that particular message.

! Receive Message

APIS_RECEIVE_MSG()

This service command passes a stream data message received from a peer application on to the user application. This is not a request made by the ASU, but an event generated by APIS.

6.8.1.5 Implications of the Data-Driven Approach

While APIS's data-driven approach abstracts the implementation details from the application user, it has certain implications which may be significant for real-time distributed systems. In particular, if applications are completely decoupled from the transfer infrastructure they are unable to effect control of dataflow. Essentially, a producer produces a message at the highest repetition rate required by all the consumers requiring that message. Should a particular consumer be unable to accept messages at that rate, messages may be missed. This may be due to that particular consumer being less capable than its peers, or it might have a permanent or temporary heavier application processing load. A producer could invoke transport level flow or rate control, but then this would reduce the rate of production which might negatively affect other consumers. Various methods have to be provided to overcome such a situation. The following methods are possible.

6.8.1.6 APIS Dataflow Control

! Duplicate, but Lower Rate Message

The affected consumer could request another message with identical contents, but at a different rate. This may increase the load on the affected producer, or it could increase the load on the LAN.

! Message Filtering

If the affected consumer requires the message at a lower rate, but cannot be interrupted at each message reception, the Network Interface Card (NIC) could filter the incoming message stream and thereby retain only a certain proportion of messages (e.g. every second or every tenth message). The NIC would be required to have reasonable onboard processing capability to effect this filtering and buffering.

! Extended Message Buffering

Certain applications might be capable of processing all the incoming messages, but not at the instant they arrive. This would typically be due to other, higher application priorities. In this case the NIC could buffer a number of incoming messages (e.g. 10 or 100) and only pass the messages to the application host either on request or when the buffer is full. To provide this functionality the NIC would be required to be fitted with extended buffer memory. Again, the NIC would be required to have some onboard processing capability to effect this functionality.
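The filtering and buffering methods described above can be sketched as two small components of the kind that might run on the NIC's onboard processor. The class names, and the choice to deliver a batch automatically when the buffer fills, are illustrative assumptions.

```python
class Decimator:
    """Message filtering: retain one message in every `keep_every`."""

    def __init__(self, keep_every):
        self.keep_every = keep_every
        self.count = 0

    def accept(self, message):
        keep = self.count % self.keep_every == 0
        self.count += 1
        return keep


class BatchBuffer:
    """Extended buffering: hold messages until the buffer is full or flushed."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.items = []

    def push(self, message):
        """Store a message; return the whole batch once capacity is reached."""
        self.items.append(message)
        if len(self.items) >= self.capacity:
            return self.flush()
        return None

    def flush(self):
        """Hand over everything buffered so far (the "on request" case)."""
        batch, self.items = self.items, []
        return batch
```

In both cases the host is interrupted less often than messages arrive, at the cost of either discarded samples or additional delivery delay.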

6.8.1.7 APIS Architecture

Figure 13 depicts the APIS Architecture.

Figure 13 : APIS Architecture

[Figure: block diagram spanning the Application, APIS and LAN domains. Labelled elements include the APIS Service User (ASU), the ASU Service Access Point (ASAP), the Parallel Backplane Bus (PBB) and Interface, the APIS Interpreter, the APIS Message Database, TCEP Control, the Command, Producer, Status, ASU_Rx and Consumer sockets, and the Transport Protocol (XTP) with source and destination TCEPs. TCEP = Transport Connection Endpoint.]

6.8.1.8 Off-host Architecture

As all software-implemented protocols, including APIS, require a greater or lesser degree of protocol processing, system performance can be significantly enhanced by relieving the sub-system host of such processing. This implies the requirement for the network interface to be provided with its own processing capability, i.e. an off-host architecture. While this is more costly, it leads to much greater data transfer performance which is more often than not justified in critically real-time systems. Where APIS dataflow control is required, such as that described in Paragraph 6.8.1.5 above, it would be imperative for the network interface to be provided with onboard processing and buffering capability.

6.8.2 Network Time Services

Despite the raw performance available at the lower layers of the LAN profile, networked systems can still suffer from transfer latency and jitter. Latency and jitter are problems as they may lead to instability in distributed control algorithms. Appendix H provides a fuller treatment of these issues from a system perspective. In Appendix H it is proved that in certain real-time applications stringent timing requirements cannot be directly met by the transfer infrastructure even with minimised latency. However, the data repetition rates required by the collaborative distributed algorithms are sufficient if accurate timing of the data samples can be recovered.

Real-time systems therefore require services from the network which circumvent the problems of latency and jitter. The network must also provide synchronisation services, both with respect to calendar time (i.e. absolute time) as well as relative time, between distributed clocks. For many distributed applications only relative time synchronisation is required. For certain applications, e.g. synchronisation of remote encryption devices, calendar time synchronisation is required. A Network Time Services extended profile protocol is therefore required to provide these services.

6.8.2.1 Description

Network Time Services (NTS) is an extended profile protocol consisting of a set of timing primitives and functions which can, for example, run over XTP and provide generic timing functionality to the SAFENET Time Services which provide application-specific functionality.

The SAFENET-defined Global Time Service[29, 78] provides to processes within a node a precise Calendar Time (time of day) which is consistent over the network. A precision of one binary millisecond over a timespan of several hundred years is mandatory with provision for a precision of better than a nanosecond being optional. It also provides a means to co-ordinate this time with an external time reference such as that which may be obtained from a navigation system and to provide a timestamp in a variety of formats including Greenwich Mean Time (GMT) and Co-ordinated Universal Time (UTC). There is no existing or proposed ISO standard for such a service over a network.

6.8.2.2 Capabilities

The overall capabilities of NTS are the following :

! effects system synchronisation by means of a Network Time Protocol

! provides timestamping of messages

Refer to Appendix F for a full analysis and description of the Network Time Protocol (NTP) as tailored to operate over an FDDI LAN. An overview of NTP is provided below.

6.8.2.3 Network Time Protocol

The Network Time Protocol implements timing mechanisms between all participating sub-systems over the network and provides basic functionality such as synchronisation and timestamping to NTS. NTS in turn provides user-level timing services to the application.

NTP implements an exchange of timestamps with one or more peers sharing a synchronisation network and calculates the time offsets between the peer clocks and the local clock. These offsets are processed by several algorithms within each node which refine and combine the offsets to produce an ensemble average. This can be used to adjust the local clock time and frequency.
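The offset calculation described above can be illustrated with a minimal sketch. It uses the conventional four-timestamp exchange of NTP; the names `Exchange`, `offset_and_delay` and `combine` are hypothetical, and the plain average in `combine` is only a stand-in for the filtering and combining algorithms, which are not reproduced here.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Exchange:
    """Four timestamps from one request/response exchange (seconds)."""
    t1: float  # request sent, read from the local clock
    t2: float  # request received, read from the peer clock
    t3: float  # response sent, read from the peer clock
    t4: float  # response received, read from the local clock


def offset_and_delay(x: Exchange) -> Tuple[float, float]:
    """Return (offset of peer clock relative to local clock, round-trip delay)."""
    offset = ((x.t2 - x.t1) + (x.t3 - x.t4)) / 2.0
    delay = (x.t4 - x.t1) - (x.t3 - x.t2)
    return offset, delay


def combine(offsets: List[float]) -> float:
    """Assumption: a plain mean stands in for the real ensemble average."""
    return sum(offsets) / len(offsets)
```

The offset formula is exact only if the outbound and return transit times are equal, which is the assumption examined in Paragraph 6.8.2.4.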

NTP was designed to function in the Internet environment and to run under the Unix operating system[104, 105]. It makes provision for using an external radio clock, but does not rely on such a clock for accurate timekeeping. The protocol does not rely on the accuracy of the clock of a single peer, but rather attempts to find the most accurate time source available to it.

A standard nomenclature is used in respect of the Network Time Protocol[105]. The stability of a clock is how well it can maintain a constant frequency, the accuracy is how well its frequency and time compare with national standards and the precision is the resolution of the clock, i.e. the smallest time increment that the clock can measure usefully. The offset of two clocks is the time difference between them, while the skew is the frequency difference (first derivative of offset with time) and the drift is the variation in skew with time (second derivative of offset with time).

When NTP is implemented on a wide area packet-switched network such as the Internet, the accuracy of the clock is affected mostly by errors due to the random network delays. Even in this environment, where message delays are in the order of seconds, a clock precision of a millisecond a day can be achieved by clock synchronisation using NTP.

6.8.2.4 NTP using FDDI as a Medium

Different factors become important when implementing a clock synchronisation algorithm on a fast network such as FDDI. The network delay errors no longer dominate, so that any delay or jitter introduced by the clock reading and related algorithms, as well as the protocol stacks, become proportionately significant. Also, an error is introduced as a result of the network transmit and receive paths not being the same length (i.e. due to the ring topology).

NTP relies on the exchange of timestamps to synchronise to a peer. One of the basic assumptions is that message times to and from the peer are statistically equal. If the message transmit and receive times are not equal, the NTP algorithm introduces an error of half the difference between the two times.

When using the token ring protocol, messages move in a constant direction around the ring. Normally the path between two nodes on the ring would consist of a short path in one direction and a long path in the other, resulting in unequal transmit and receive times. The delay for the stations in an FDDI network is 0,6 µs/station and the media propagation delay is 5,1 µs/km[147]. Therefore, for a network of 500 stations, with 200 km of optical fibre and with two stations next to one another communicating, the roundtrip delay, which is also the worst case difference between transmit and receive times, would be about 1,62 ms.

If a more typical control network of 50 stations and 2,5 km of optical fibre is considered, the worst case difference in transmit and receive times becomes 43 µs. This would introduce an error (offset) of some 22 µs in the local time.
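The arithmetic behind these figures can be checked with a short sketch, using the per-station and per-kilometre delays quoted above. The function names are hypothetical. Note that the 1,62 ms figure for the 500-station ring is reproduced only if each station is assumed to contribute two 0,6 µs delay elements (1000 in total); that reading is an assumption made here and is not stated in the text.

```python
STATION_DELAY_US = 0.6        # delay per station quoted in the text (µs)
FIBRE_DELAY_US_PER_KM = 5.1   # media propagation delay quoted in the text (µs/km)


def ring_latency_us(delay_elements: int, fibre_km: float) -> float:
    """Delay once around the ring: station delays plus fibre propagation."""
    return delay_elements * STATION_DELAY_US + fibre_km * FIBRE_DELAY_US_PER_KM


def ntp_offset_error_us(path_asymmetry_us: float) -> float:
    """NTP's offset error is half the transmit/receive time difference."""
    return path_asymmetry_us / 2.0


# Typical control network: 50 stations, 2,5 km of fibre.
typical = ring_latency_us(50, 2.5)
print(f"asymmetry {typical:.2f} us, offset error {ntp_offset_error_us(typical):.2f} us")

# Large ring: 200 km of fibre, assuming 1000 delay elements (2 per station).
print(f"large ring {ring_latency_us(1000, 200.0) / 1000.0:.2f} ms")
```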

In the light of this factor, as well as other sources of error, it is concluded that an algorithm based on NTP over a 50-node, 2,5 km FDDI LAN is capable of maintaining synchronisation with a worst case accuracy of 220 µs. Appendix F provides a fuller treatment of this determination.

6.8.2.5 Fast Initialisation by Synchronisation Seed

Due to the nature of the Internet, NTP takes some time to settle to its final precision. Measurements have shown that it can typically take 24 hours to settle to 1 ms. This would not be sufficient for most real-time networks. However, in closed networks employing technologies such as FDDI, techniques are possible to circumvent this.

Using a high-speed network such as FDDI has a distinct advantage when a clock needs to be initialised, as would happen at system startup. Since the seed timestamp "ages" by a maximum of 1,62 ms while being transmitted over the network, the local node clocks can be set by a seed value. Allocating this seed time to the network protocol and the clock setting algorithm would guarantee that a local node clock settles to within one to two milliseconds of the correct system time within one token rotation after startup.

6.8.2.6 Timestamping

While FDDI can guarantee certain latency requirements (typically 5 ms to 10 ms over a 50-node LAN), FDDI networks exhibit intrinsic jitter, i.e. uncertainty of the arrival time of packets. This is due to the timed token protocol as well as the stochastic nature of sub-system asynchronous data transfer requirements. Timestamping of data by the producer, using accurate network time derived from NTP, allows a consumer to reconstruct critical timing information by determining the age of the data.
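As a minimal sketch of how a consumer might use such a timestamp, the following assumes that producer and consumer share a synchronised network time base. The `Sample` type, its `rate` field and the first-order extrapolation are illustrative assumptions and are not prescribed by the profile.

```python
from dataclasses import dataclass


@dataclass
class Sample:
    value: float      # measured quantity, e.g. a bearing in degrees
    rate: float       # assumed rate of change of the quantity (units per second)
    timestamp: float  # network time at which the producer took the sample (s)


def age(sample: Sample, now: float) -> float:
    """Age of the sample in seconds, independent of transfer latency or jitter."""
    return now - sample.timestamp


def extrapolate(sample: Sample, now: float) -> float:
    """First-order estimate of the present value from the aged sample."""
    return sample.value + sample.rate * age(sample, now)
```

Because the age is computed from the producer's timestamp and not from the arrival time, variation in the transfer delay does not enter the result; only the residual synchronisation error between the two clocks does.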

Typically the network interface cards (NICs) within a network would be synchronised by NTP. This implies a requirement that they possess both their own clock and processing capability. However, in real-time systems it is the applications which need to be synchronised.

6.8.2.7 CPU/NIC Synchronisation

Applications normally execute on one or more host CPUs within a sub-system. These CPUs normally communicate with each other and other peripherals, including the LAN interface, via a parallel backplane bus (PBB). Such buses invariably exhibit latencies and non-deterministic data transfer behaviour of their own. For these reasons, the host CPUs would then have to synchronise with their NIC and possibly with each other. A synchronisation protocol, similar to NTP, can then be implemented between co-operating CPUs and NIC via the PBB. Refer to Paragraph 8.4.5 for typical measured latencies on the Multibus II parallel backplane bus.

6.8.2.8 XTP Support for Timestamping

The XTP protocol offers a useful feature to support timestamping. This is termed out-of-band data. This allows a user to append eight bytes of tagged data to an information packet to describe the packet in a special way without embedding the tagged data within the packet itself. Typically it can be used to timestamp the time-critical data.

Out-of-band data can also be used to pass other control information between nodes (e.g. the state of an end-user process).
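The eight-byte tag is large enough to carry a 64-bit timestamp. The sketch below shows one possible encoding; the nanosecond unit, the big-endian byte order and the function names are assumptions made for illustration, and the code does not represent the XTP programming interface.

```python
import struct

TAG_FORMAT = ">Q"  # one unsigned 64-bit integer, big-endian: exactly eight bytes


def encode_timestamp_tag(time_ns: int) -> bytes:
    """Pack a network time in nanoseconds into an eight-byte tag."""
    return struct.pack(TAG_FORMAT, time_ns)


def decode_timestamp_tag(tag: bytes) -> int:
    """Recover the network time in nanoseconds from an eight-byte tag."""
    (time_ns,) = struct.unpack(TAG_FORMAT, tag)
    return time_ns
```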

6.8.3 Network Management Services

In order to support dependability, flexibility and maintainability for network-based real-time, mission-critical distributed systems, such systems should be provided with effective network management services. The importance of these services is underlined by the SAFENET Network Development Guidance[103] which makes the following observation :

"The use of network management in a network claiming conformance to SAFENET is optional, but it is expected that if fault tolerance or dynamic system reconfiguration is required in a system then network management will be needed in all but the simplest of systems."

6.8.3.1 Network Management Capabilities

The Network Management function should provide real-time monitoring (integrity checking) and dynamic configuration control of the entire networked system, event and alarm reporting, statistical performance measurement, comprehensive diagnostics facilities and computer-assisted troubleshooting and maintenance, and should also support network security.

Network Management is required to provide the mechanisms to effect the following functions :

! bring up, enroll and/or alter network resources

! keep network resources operational

! fine tune the resources and/or plan for their expansion

! manage the accounting of their use

! manage their protection from unauthorised use

Network Management should be effected at two levels; the lower level implementing network management functions and the higher level implementing system health monitoring.

6.8.3.2 Network Management Standards

The ISO OSI[21], IEEE[5] and Internet[112] network profiles define network management standards.

6.8.3.3 Network Management Principles

6.8.3.3.1 Managed Objects

A managed resource is any network resource such as a layer entity, connection, or an item of physical communication equipment within the managed system that is subject to management.

A managed object is an abstraction of such a resource that represents its properties as seen by, and for the purposes of, management. The definition of a managed object includes its attributes (e.g. the parameters that can be read or set), control actions which can be taken on the real entity it represents and notification of important events which can be transmitted to a managing application.

6.8.3.3.2 Management Agents

A management agent is an application program which performs management operations on managed objects and produces notification of events on behalf of managed objects within the node in which it resides. It acts as an intermediary between the managing application and the entities which have managed objects within the node.

Standard network management agents provide only basic functionality for monitoring the network component of a sub-system, and not the other sub-system elements. Monitoring of those elements is extremely important in complex real-time, mission-critical distributed systems. Extensible network management agents should therefore be employed which extend monitoring functionality beyond the network interface to the sub-system elements. This extended functionality will enhance user confidence, dependability and maintainability.
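The relationship between managed objects and an extensible agent can be sketched as follows. The class and method names are hypothetical and do not correspond to any particular management product or protocol; the sketch shows only that an agent holding a registry of managed objects can be extended to sub-system elements by registering further objects.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class ManagedObject:
    """Management view of a resource: attributes and control actions."""
    name: str
    attributes: Dict[str, float] = field(default_factory=dict)
    actions: Dict[str, Callable[[], None]] = field(default_factory=dict)


class Agent:
    """Intermediary between a managing application and local managed objects."""

    def __init__(self) -> None:
        self.objects: Dict[str, ManagedObject] = {}
        self.notifications: List[str] = []

    def register(self, obj: ManagedObject) -> None:
        # Extension point: sub-system elements register here like network ones.
        self.objects[obj.name] = obj

    def get(self, obj_name: str, attr: str) -> float:
        return self.objects[obj_name].attributes[attr]

    def set(self, obj_name: str, attr: str, value: float) -> None:
        self.objects[obj_name].attributes[attr] = value

    def act(self, obj_name: str, action: str) -> None:
        self.objects[obj_name].actions[action]()

    def notify(self, obj_name: str, event: str) -> None:
        # Events are queued here for later transfer to the managing application.
        self.notifications.append(f"{obj_name}: {event}")
```

A standard agent would register only the network interface; an extended agent would, under this sketch, also register objects such as a power supply or processor board of the sub-system.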

6.8.3.3.3 Managing Applications

A managing application manages managed objects by means of management agents. It derives data from the managed objects by means of a LAN-based network management protocol and stores this data in a Management Information Base (MIB). This data in the MIBs is then processed to provide user level management information by means of a man-machine interface.

6.8.3.3.4 Network Management Protocol

The Simple Network Management Protocol (SNMP)[45, 46] is an Internet Profile network management protocol which provides basic management information exchange capability to systems.

SNMP is also recommended for the extended SAFENET profile. Due to its flexibility and widespread deployment, SNMP is also recommended for the Real-Time LAN Profile.

6.8.3.4 Network Management Products

A number of commercial off-the-shelf network management products exist. The de-facto standard is widely accepted to be HP OpenView[84] from Hewlett-Packard Corp. which, with its derivatives, is available on a number of platforms including Unix (HP-UX), MS-Windows and Windows NT. OpenView is a powerful, flexible system that can be configured (by programming) to provide for almost any functionality. OpenView is a network management platform that supports intrinsic network management functionality as well as application-level network management software from a wide variety of other vendors. It also supports extensible network management agents.

Less powerful network management products are available for the MS-DOS platform. However, network management is essentially a multitasking application and DOS is inherently unsuited to hosting such applications due to its primitive tasking model.

6.8.3.5 Management Information Bases

The FDDI standard provides strong intrinsic support for network management by means of its Station Management Layer (SMT)[35]. This, however, offers only a kernel for network management, and some higher-level functionality must be provided in order to implement useful network management.

Two main options are available, Parameter Management Frames (PMF) and SNMP Management Information Base (MIB)[45], each having its shortfalls[35].

6.8.3.5.1 Parameter Management Frames

Parameter Management Frames (PMF) are appropriate in a homogeneous FDDI environment and where maximised coupling is required between FDDI SMT and the network management agent. PMF is also appropriate where multiple backplane FDDI hubs are employed and the state of these backplanes is required to be monitored.

6.8.3.5.2 SNMP MIBs

SNMP MIBs are appropriate in heterogeneous network environments where bridges or routers are employed.

In general, it is concluded that in most larger real-time, mission-critical, distributed systems, heterogeneous architectures will be employed in order to take advantage of the functionality and cost benefits of the various technologies.

PMF technology is less mature than SNMP MIB technology while the latter is also developing as a de-facto standard. In general therefore, it is recommended that SNMP MIB (MIB 2)[46] is employed for network management.

XTP V4.0 supports network management by means of an SNMP MIB[101].

6.8.3.6 Network Management Station

It is proposed that where applicable, real-time, mission-critical, distributed systems be provided with a Network Management Station (NMS). The NMS should be dedicated if possible, but may be hosted in any suitable workstation. The NMS should provide online monitoring and control of the entire network system, reconfiguration management and high-level diagnostics (BIT) facilities. The NMS is also a good candidate for static Synchronous Bandwidth Allocator (SBA) functionality.

The precise configuration of the NMS will depend on the nature of the system. If the system is not manned by maintenance crew (e.g. aircraft or spacecraft), it would be inappropriate for the NMS to have extensive man-machine interface functionality. In such applications, the NMS should be essentially for monitoring and recording purposes, the latter for later offline analysis which can be implemented by special test equipment.

If the system is manned by maintenance crew (e.g. process control plant or ship), it is essential for the NMS to have extensive man-machine interface functionality. In this case it should consist of an operator's console (preferably dedicated) which provides graphics-based, diagrammatic visualisation of the system and its network components. The display should provide high resolution colour graphics and interact with pointing devices such as mouse, trackball or joystick.

Typically the NMS will consist of hardware and computer software components. The hardware component provides physical connectivity to the LAN as well as providing a processing platform for the NMS software component. The latter provides a logical connection to the LAN and implements functions such as network monitoring, network management, network testing, re-configuration control, etc.

The complexity of the NMS would be dependent on the sophistication and complexity of the real-time distributed system.

Figure 14 provides an example of a typical network management man-machine interface.

[Figure 14 screen panels: Compartment Connection Diagram; Element Connection Diagram; Cable Information Display; TCU Connection Diagram. Legend: Trunk Active, Trunk Failed, Redundant Element, Zoomed Element.]

Figure 14 : Typical Network Management Man-Machine Interface

6.8.4 Network Security Services

To provide the necessary level of security required by certain network systems, the design and implementation of the network must offer a range of security services :

6.8.4.1 Confidentiality

This service ensures that information is not made available or disclosed to unauthorised individuals, entities or processes.

Multi-access media are vulnerable to promiscuous mode terminal access. Wire-type media are especially vulnerable. Access to fibre optic media is more difficult to achieve without detection due to the detectability of absorption of optical transmission power, but front-end fibre optic receivers with extremely low power requirements are available that would make even this type of intrusion difficult to detect. For this reason, encryption of network data is required by certain systems.

Where multicast services (such as APIS) are used, multicast group management should verify a consumer's credentials before admission to the multicast group. Network Management Services should be kept continuously aware of user status and alerts or alarms should be generated when anomalous access is requested or detected.

6.8.4.2 Integrity

This service verifies that data has not been altered or destroyed in an unauthorised or accidental manner.

The network system, as well as applications and operating systems, must offer secure, but readily available, backup capability. Centralised backup has extensive implications for the required network throughput. However, backup data can be assigned low levels of priority so as not to affect real-time traffic. In the case of server mirroring via the network, the throughput implications are somewhat more extensive, depending on the rate of change of data stored on the primary devices and how quickly the secondary devices are required to reflect the updates.

To ensure full integrity, backups have to be replicated, i.e. multiple copies made on different media and stored in geographically separate locations (locally and remotely). Magnetic media should be protected from destructive magnetic fields and other environmental hazards (e.g. dust, salt spray and blast). All media need to be protected from fire.

Optical media (e.g. optical read/write disk and WORM disk) are very effective media for the storage of backups and archival data due to their large capacity and immunity to magnetic fields, dust and moisture.

6.8.4.3 Accountability

This service enables security-relevant activities on a system to be known and traced. The Network Operating System should record and trace such activities and generate alerts and/or alarms in the event of anomalous conditions.

6.8.4.4 Access Control

This service prevents the unauthorised use of resources, including the prevention of use of resources in an unauthorised manner. Nodes storing confidential information should be physically protected.

Applications and operating systems must offer multilevel authentication codes (passwords) and key control features. The higher layer network protocols must encrypt such passwords before transmitting them on the network.

6.8.4.5 Availability

This service is the ability of the system to detect, recover from and resist denial of service conditions.

Applications, operating systems and network protocols must offer data encryption where required, i.e. for data storage and/or transmission. Ultra high security data such as that involving electronic funds transfer, distribution of RF communications frequencies or the identity of military targets should be afforded the highest levels of encryption. The management and security of encryption keys requires special attention.

6.8.5 Summary of Extended Profile Services

Where the transfer layers have been found to be deficient in certain requirements of real-time, mission-critical, distributed systems, extended profile options are proposed to satisfy these deficiencies. These services cover the provision of application interface services, network time services and network management services. These then meet the system requirements of transparency, flexibility, timeliness, manageability, availability and security.

6.9 Chapter Summary

In order to provide for real-time performance, a layered Real-Time LAN Profile has been identified. The requirements of each layer are derived and suitable candidates proposed from the pool of currently available options. Together with the extended profile protocols, these satisfy the allocated system requirements.

Chapter 7

System Implementation

7. System Implementation

While it is contended that the proposed protocol architecture described in Chapter 6 will meet the system requirements in Paragraph 5, there are a number of other considerations that need to be addressed in the implementation of a complete, coherent real-time system. These are addressed below :

7.1 Functional Integration

In order to meet the operational requirement, the system needs to be functionally integrated, i.e. functional coherency between the system's constituent elements must be achieved. Such functional coherency needs to be maintained over the life of the system.

There are two fundamental and opposing approaches to functional integration[108]. These are the vertical and horizontal approaches. While it is possible that the same system functional performance can initially be achieved using either approach, there are wider system performance, upgradeability and survivability implications.

7.1.1 Vertical Approach

The vertical approach amounts to decomposing the system into functional segments and functionally integrating each of these segments, especially with regard to network topology and dataflow. The segments are then integrated by a higher level communication fabric with only segment to segment data entities flowing across the higher level interfaces.

While this approach may appear elegant in that lower level dataflow is restricted to segment LANs, thereby reducing overall LAN traffic, it may limit the overall system in that segments are unable to directly access data entities existing on other segments. There can be considerable advantages in making useful data available across segment boundaries when an existing system is upgraded during its lifecycle (which invariably is the case), or when a system is reconfigured online to optimise for a particular mission or following damage, e.g. battle damage in the case of military systems or tempest (e.g. fire or earthquake) in the case of distributed process control systems.

The functional limitation of a vertical approach is well illustrated by way of an example from a multi-function naval combat system application. In this case a vertically-integrated weapon system will drop tracks derived from its own sensor that are of no interest to its own weapons. In addition, weapon systems often combine or correlate tracks in order to minimise the amount of track monitoring required by operators. However, these tracks may be of interest to other segments of the combat system. If this track data cannot be supplied across the external segment interface due to a vertical design approach, the other segments cannot take advantage of this potentially useful data. Also if this data is preprocessed (e.g. running raw tracks through track filters), it may be rendered unusable to other potential users.

The above example illustrates the considerations required for planning maximum flexibility into the system design.

Figure 15 depicts a vertically integrated system.

Figure 15 : Vertically Integrated System

7.1.2 Horizontal Approach

The horizontal approach requires that all data be available to all segments, whatever the level of functional independence between these segments.

A single LAN would facilitate a horizontal approach; however this does not imply the recommendation of a single LAN.

Use of segment LANs also promotes enclaving which is a powerful technique for the maximisation of survivability[44, 69]. Enclaving is the physically secure grouping of sub-systems collaborating in the implementation of major system functions along with all required support services (power, cooling, etc.).

The first implication of this from a data communications perspective is that if all this data were to be transferred on a single LAN, the LAN's bandwidth could be saturated and the latency of critical dataflows compromised. Clearly this is not acceptable and innovative techniques must be sought to overcome this problem.

The solution to this problem lies in the employment of internetworking topologies, using devices such as routers or network switches between otherwise separate LANs. The system is still partitioned into logical segments in such a manner as to restrict dataflow to within segments consisting of functionally dependent sub-systems, but also to allow high performance routers or switches to transfer data entities between segment LANs without compromising real-time performance.

While this is a functionally-optimal topology solution, it is not without cost implications. High performance routers or switches are expensive items and most systems would require a number of these. Therefore the cost-effectiveness of this approach requires careful consideration before implementation.

Figure 16 depicts a horizontally integrated system.

Figure 16 : Horizontally Integrated System

7.1.3 Practical Implementation Approach

Two approaches for implementing either a vertically or a horizontally integrated system are proposed. Should the system performance requirements and lifecycle issues clearly point towards a fully horizontal solution, then this approach should be followed immediately.

Should system performance requirements such as survivability and mode optimisation or lifecycle issues be unclear, or should system constraints (e.g. cost) not allow a fully horizontal approach, then a vertical approach should be followed. However, horizontal integration requirements should also be considered during sub-system design and development in such a way as to minimise upgrade implications. This is especially important in sub-system software and interface design.

It is contended that this approach will satisfy the system requirements to maximise functional integrateability, flexibility, reconfigurability and affordability.

7.2 Topology

The topology of a particular system reflects the functionality and complexity of the particular requirement and implementation. It should also reflect the approach considered appropriate in terms of vertical and horizontal integration issues. In order to maintain coherence and traceability with the allocated requirements, the topology of a system should be derived from a formal process such as the System Engineering Process[131].

Veríssimo[134] observes in this regard :

"Standard LANs, ... , have been used in control and automation, but ad-hoc, ´plug-in´ approaches have, not seldom, failed to meet the desired timeliness and reliability goals. Real- time behaviour of real-time systems is obtained through a systematic approach, that is, by establishing a model (traffic patterns, reliability and timeliness requirements, failure assumptions, etc.), a service and interface definition, and dressing the elementary LAN with the necessary hardware and software in order to comply with those requirements."

The possibilities are extensive, but one approach that is recommended by Zitsman et al[148] is a topology based on logical system segmentation into principal functional areas, each with its own LAN. This approach offers flexibility in terms of reconfigurability and survivability. It also provides for management of bandwidth allocation and the achievement of acceptable levels of bandwidth utilisation. Interconnectivity between the LANs would be provided by high performance bridges, routers or switches.

Appendix B provides more detailed descriptions of topology options while Chapter 4 describes some typical applications using selected topologies. It is contended that the dual-redundant ring topology offers distinct advantages in terms of fault-tolerance, self-healing, range, etc.

7.2.1 LAN Connectivity

All critical sub-systems on the LANs should be FDDI dual-attachment stations (DASs). Non-critical sub-systems could be FDDI single-attachment stations (SASs) connected by FDDI concentrators. These can be connected dual-attached in a ring or tree topology. Certain applications that do not require dual-attachment for all nodes can use dual-homing for critical nodes and single-attachment for non-critical nodes.

Non-critical or bought-out sub-systems without FDDI interfaces would be connected by routers or gateways. Routers would typically support Ethernet and IBM Token Ring LAN standards while gateways would support MIL-STD-1553B, RS-232, RS-422, HDLC/SDLC, synchro/digital and WAN communications interfaces.

7.2.2 Interconnectivity

Interconnectivity should be implemented where a multi-LAN or internetwork topology is appropriate. This would be where the system dataflow is extensive and the provision of multiple LANs is required to reduce LAN dataflow to acceptable levels, especially so that heavy LAN traffic will not compromise timeliness requirements.

LAN segmentation has to be particularly carefully undertaken when considering timing implications, especially concerning latency and throughput. Routers and bridges are limited in their capability in forwarding data between LANs and LAN segments. Their performance is also a direct function of cost (or vice versa) and then affordability becomes an important criterion in system design.

Being critical points of interconnection, dependability of the interconnectivity devices is also an important consideration. To enhance availability of the devices, their maintainability should be optimised. In support thereof, a desirable capability of bridges and routers is that known as live insertion (or hot swapping). Such a capability would allow online maintenance of the network by swapping of the appropriate modules within a device while it remains powered-up. This would negate the otherwise significant downtime that could result if the equipment had to be switched off and then re-initialised after reconfiguration.

It is important in mission-critical applications that failure of a node or node interconnect does not cause failure of the network. With wire- or optical fibre-type interconnects, a short or open circuit would cause such failure. Interconnect methods are required which circumvent failure due to such occurrences.

7.2.2.1 Coupling and Bypass Devices

With wire-type systems, the solution to the problem of node disconnect is the transformer-coupled or optically-coupled interconnect. Using such techniques localises faults to the node side of the connection. With optical fibre-type systems, the solution is interconnection using optical bypass switches. These devices re-route the optical signals around a failed node.

Apart from interconnect failure, operational disconnect may be intentional, e.g. release of booster stages in space launch vehicles or smart munition release on military platforms. Such intentional disconnect has to be catered for in an entirely reliable manner without any possibility of corrective maintenance action.

7.2.2.2 Repeaters

A repeater is an interconnect device which connects two segments of the same LAN without protocol conversion. In fibre optic systems the devices are sometimes termed fibre amplifiers. Repeaters are used where the length of a LAN segment would otherwise exceed the drive capability of the physical components.

In fibre optic FDDI LANs, a fibre amplifier can be used to extend an inter-node segment from 2 km to over 20 km. Another important application for fibre amplifiers is where optical bypass switches (OBSs) are employed. OBSs, being passive devices, attenuate the optical signal in both the normal and especially the switched paths. This implies that only a certain number of OBSs (in the region of three to four) can be switched before the power of the signal degrades below the specified minimum, thereby disturbing the ring. In mission-critical systems it is normally imperative that such circumstances do not affect critical system performance.

In many applications, power-down conditions that can cause multiple node outages are common, whether unintentional or intentional, for example the failure or switch-over of a power supply to a bank of equipment. In submarines (and even surface vessels) it is operational practice to switch off all equipment not required in certain modes, especially in acoustic surveillance mode or to save power. It is therefore necessary to design for these conditions by using OBSs and fibre amplifiers. The latter need to be supplied with uninterruptible power (UPS). Fibre amplifiers, being active electronic devices, can also fail (although they are fairly simple devices and therefore have a high mean time between failures). Fibre amplifiers should not be fitted with OBSs; otherwise, if they do fail, the failure cannot be detected by network management (without complex diagnostics), which precludes immediate maintenance action.
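The limit on the number of consecutively bypassed nodes follows from a simple optical power budget. The following is a minimal sketch of that calculation; the link budget, per-switch loss and per-hop cable loss used in the example call are illustrative assumptions and are not taken from this thesis or from any particular product datasheet.

```python
def max_bypassed_nodes(budget_db, obs_loss_db, hop_loss_db):
    """Largest number of consecutive bypassed nodes that keeps the
    accumulated optical loss within the link budget.

    budget_db   -- allowable loss between two active nodes (assumed value)
    obs_loss_db -- loss of one OBS in its switched (bypass) path (assumed value)
    hop_loss_db -- cable and connector loss of one inter-node hop (assumed value)
    """
    if obs_loss_db <= 0:
        raise ValueError("obs_loss_db must be positive")
    count = 0
    # k bypassed nodes join k + 1 cable hops into a single optical path,
    # so test whether one more bypassed node would still fit the budget.
    while (count + 1) * obs_loss_db + (count + 2) * hop_loss_db <= budget_db:
        count += 1
    return count


# Illustrative figures only: 11 dB budget, 2.5 dB per switched OBS, 0.5 dB per hop.
print(max_bypassed_nodes(11.0, 2.5, 0.5))
```

With these assumed figures the budget is exhausted after three bypassed nodes, which is why a fibre amplifier would be placed at intervals around the ring.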

In Ethernet LANs, the application of repeaters is limited due to the fact that Ethernet relies on receiving transmission echoes within a certain time of reception in order to sense collisions. If an echo is not received within this time the node considers the transmission corrupt and retransmits, with obvious results. Ethernet LANs are therefore limited in range to a maximum of 2,8 km without bridges.
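The range limit can be related to the collision window of CSMA/CD: a station must still be transmitting when evidence of a collision at the far end of the LAN returns to it. The following is a minimal sketch of that relationship, assuming the 512-bit slot time of 10 Mbit/s Ethernet. The propagation velocity and the allowance for repeater and transceiver delay are illustrative assumptions, not values from this thesis.

```python
def max_collision_domain_m(bit_rate_bps, slot_bits, velocity_m_per_s, equipment_delay_s):
    """Upper bound on the one-way extent of a collision domain.

    The round trip (out and back) plus all equipment delay must fit
    inside one slot time, so the one-way distance is half of what remains.
    """
    slot_time_s = slot_bits / bit_rate_bps
    usable_round_trip_s = slot_time_s - equipment_delay_s
    if usable_round_trip_s <= 0:
        return 0.0
    return usable_round_trip_s * velocity_m_per_s / 2


# Assumed values: signal velocity 2e8 m/s, no equipment delay (best case).
best_case_m = max_collision_domain_m(10e6, 512, 2e8, 0.0)
# Assumed values: 20 microseconds lost in repeaters and transceivers.
with_delay_m = max_collision_domain_m(10e6, 512, 2e8, 20e-6)
print(best_case_m, with_delay_m)
```

Every repeater adds to the equipment delay and so shortens the permissible cable run, which is why repeaters cannot extend an Ethernet LAN indefinitely.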

7.2.2.3 Bridges

A bridge is an interconnect device which connects multiple LANs with protocol conversion at the Data Link Layer.

Bridges are appropriate where the network consists of multiple homogeneous LAN segments.

7.2.2.4 Routers

A router is an interconnect device which connects multiple LANs with protocol conversion at the Network Layer.

Routers are appropriate where the internetwork consists of multiple LANs featuring homogeneous or heterogeneous LAN technologies. Routers normally employ software to perform routing; therefore there can be significant latencies in LAN-to-LAN data transmission[129], the implications of which can be important in real-time systems.

7.2.2.5 Gateways

A gateway is an interconnect device which connects multiple networks with protocol conversion at the Application Layer.

Gateways are appropriate where internetwork and wide area connectivity is required and where such multiple networks consist of heterogeneous communication technologies over a wide geographic area.

7.2.2.6 Switches

A switch is an interconnect device which redirects (switches) data streams according to address information contained either within the data stream or outside the data stream, e.g. some a priori agreement mechanism such as a timeslot mechanism.

Switching can be categorised into two types, circuit switching and packet switching.

7.2.2.6.1 Circuit Switching

Circuit switching establishes a communication channel through a network which remains until the circuit is disestablished. Data can traverse the circuit without further control action. Circuit-switched connections essentially offer a fixed data rate service. These are useful for long-lived connections. Circuits may be physical or virtual.

Physical circuits are established by physical switching devices and physical communication channels. Physical circuits are typically used by public switched telephone networks which often provide the infrastructure for wide area networks. Older type switching devices were electro-mechanical which provided very slow circuit setup times, in the order of tens of seconds. Modern switches employ semiconductor logic controlled by software which can provide fairly fast circuit setup times, in the order of tens of milliseconds.

7.2.2.6.2 Packet Switching

A packet switch is an interconnect device which redirects (switches) packets according to address information contained within the packet. Switching may be performed by hardware, e.g. content-addressable memory (CAM), or by software, i.e. protocols. Packet-switched connections essentially offer a variable data rate service or what is often termed bandwidth on demand. This is useful for short-lived connections or the transfer of control and status data.
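The address-driven redirection described above can be sketched in a few lines. This is only an illustration of the lookup principle: a Python dictionary stands in for the content-addressable memory, and the class name, packet layout and port numbers are hypothetical.

```python
class PacketSwitch:
    """Toy packet switch: maps a destination address to an output port."""

    def __init__(self):
        # Stand-in for a CAM: the address is the key, the port is the content.
        self._table = {}

    def learn(self, address, port):
        """Record which output port leads to the given address."""
        self._table[address] = port

    def forward(self, packet):
        """Return the output port for the packet, or None if the address is unknown."""
        return self._table.get(packet["dst"])


switch = PacketSwitch()
switch.learn("console-1", 3)
print(switch.forward({"dst": "console-1", "payload": b"status"}))
```

Each packet is switched independently on its own address, which is what allows the capacity of a link to be shared on demand rather than reserved as in a circuit.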

Switches are used in packet-switched networks such as ATM and Fast Ethernet.

Circuit and packet switching offer different types of data transfer service to the application user. General-purpose networks require both types of service to support different traffic profiles. Physical circuit switching is often inappropriate due to long circuit setup times as well as being inefficient in the use of bandwidth if both transmit and receive channels do not require equal throughput or when data rates vary during the life of the connection. Virtual circuits can be constructed from packet-switched services.

7.2.2.6.3 Virtual Circuits

Virtual circuits are long-lived, logical connections established by digital logic. While there is a physical medium, many virtual channels may exist on the same physical medium. Virtual circuits can provide fairly fast channel setup times, in the order of hundreds of microseconds. This level of performance can provide real-time capability.

7.2.2.6.4 Critical Virtual Circuits

A Critical Virtual Circuit (CVC) is a virtual circuit supporting a critical function in a real-time, mission-critical, distributed system. As such they have to offer enhanced levels of reliability and real-time performance. Typically a distributed, digital control loop would employ a CVC.

Critical Virtual Circuits can be implemented within the proposed Real-Time LAN Profile by utilising specific features and capabilities of the different layers outlined in Table VII below.

Layer         Option   Capability
Data Link     FDDI     high speed; low bit error rate; synchronous mode
Network       IP       XTP-aware IP routing
Transport     XTP      transport layer priority (set high); connection-oriented service; full dataflow control
Application   APIS     precedence (set high)

Table VII : Capabilities Supporting Critical Virtual Circuits

7.2.2.7 Concentrators

Concentrators are network devices which connect nodes of a star-connected network. They are typically found in 10BaseT Ethernet and single-attached FDDI networks.

Concentrators differ from switches in that the latter aggregate the bandwidth of each link. For example, an ATM switch can support multiple high-speed (e.g. 155 Mbit/s) links to each connected node. A concentrator, on the other hand, shares the bandwidth amongst the nodes, e.g. an FDDI concentrator shares the 100 Mbit/s of the FDDI LAN among, say, 16 nodes, each getting some 6 Mbit/s. This is very significant in instances of high-bandwidth, continuous media applications.
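The contrast between shared and aggregated bandwidth can be made concrete with a short calculation. The sketch below assumes an equal split of the shared medium among all nodes and a non-blocking switch fabric; both are simplifying assumptions made for illustration.

```python
def concentrator_share_mbit_s(lan_rate_mbit_s, node_count):
    """Average bandwidth per node when one LAN is shared equally."""
    if node_count < 1:
        raise ValueError("node_count must be at least 1")
    return lan_rate_mbit_s / node_count


def switch_aggregate_mbit_s(link_rate_mbit_s, node_count):
    """Total bandwidth of a switch with one dedicated link per node
    (assumes a non-blocking fabric)."""
    if node_count < 1:
        raise ValueError("node_count must be at least 1")
    return link_rate_mbit_s * node_count


print(concentrator_share_mbit_s(100, 16))  # FDDI concentrator, 16 nodes
print(switch_aggregate_mbit_s(155, 16))    # ATM switch, 16 links
```

An equal split of 100 Mbit/s among 16 nodes gives 6.25 Mbit/s per node, whereas each node on the switch retains its full 155 Mbit/s link.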

7.2.3 Wide Area Connectivity

While most real-time, mission-critical systems would involve LAN or MAN types of topologies, enhanced system effectiveness can be achieved by wide area connectivity. Such connectivity would typically not exhibit true real-time performance as the operation of the underlying communication infrastructure or medium (i.e. the public switched network, satellites or electromagnetic ether) would normally be beyond the control of the real-time system implementer.

Extended connectivity may be achieved by employing gateways to WAN-type communications infrastructures such as ISDN or RF communications systems using packet radio modems. In the latter case, soft real-time performance can be achieved, albeit with low data rates and non-deterministic availability.

7.2.4 System Dependability

The requirement for the dependability of a system is derived from two perspectives; firstly the system must be available when called into use and secondly the system must be fault-tolerant whilst in use.

7.2.4.1 Availability

The factors of reliability and maintainability contribute to the availability of a system.

In mission-critical systems, therefore, it is normally a design goal to maximise both to achieve maximum availability. In network systems, however, especially fibre optic networks, reliability and maintainability can be conflicting. Networks normally employ physical connectors to achieve connectivity; however, it is well known that connectors are one of the most common sources of system failure, especially when deployed in harsh environments. Nevertheless, where maintenance action is possible (e.g. ships and land-based plant), the use of connectors offers extensive scope for enhanced maintainability as well as reconfigurability. Analyses and trade-offs need to be performed to optimise system availability.
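The trade-off can be explored with the standard steady-state relation, availability = MTBF / (MTBF + MTTR), where MTBF is the mean time between failures and MTTR the mean time to repair. The sketch below is illustrative only: the MTBF and MTTR figures for the two links are invented for the example and do not come from this thesis.

```python
def availability(mtbf_h, mttr_h):
    """Steady-state availability from mean time between failures and mean time to repair."""
    return mtbf_h / (mtbf_h + mttr_h)


def series_availability(availabilities):
    """Availability of items that must all work (independent failures assumed)."""
    result = 1.0
    for value in availabilities:
        result *= value
    return result


# Hypothetical figures: a spliced link fails rarely but is slow to repair,
# a connectorised link fails more often but is quick to repair.
spliced = availability(mtbf_h=50_000, mttr_h=24)
connectorised = availability(mtbf_h=20_000, mttr_h=1)
print(spliced, connectorised)
```

In this invented case the connectorised link comes out ahead despite its lower reliability, because its repair time is so much shorter. With other figures the result could reverse, which is why the trade-off has to be analysed per system.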

7.2.4.2 Fault-Tolerance

In order for a system to be fault-tolerant, it must exhibit no single points of failure. These are identified by performing failure modes, effects and criticality analyses (FMECAs) as well as fault tree analyses (FTAs) during the system engineering process.

Fault-tolerance can be achieved by the implementation of replication or by self-healing.

In mission-critical systems it is often appropriate to apply redundancy at multiple levels; i.e. from integrated circuit (IC) level, through card, sub-assembly and assembly level to sub-system level[148].
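The benefit of replication at any one of these levels can be estimated with the usual parallel-redundancy relation. The sketch below assumes that the replicated units fail independently and that any one surviving unit is sufficient; common-mode failures, which FMECA and FTA are intended to expose, are not modelled.

```python
def redundant_availability(unit_availability, copies):
    """Availability of a group of identical units where one working unit suffices.

    Assumes independent failures: the group is down only when every copy is down.
    """
    if copies < 1:
        raise ValueError("copies must be at least 1")
    return 1.0 - (1.0 - unit_availability) ** copies


# Illustrative unit availability of 0.99 at one, two and four copies.
for copies in (1, 2, 4):
    print(copies, redundant_availability(0.99, copies))
```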

Self-healing can be implemented by the system or sub-system excising the damaged section or component until it is repaired by maintenance action. This implies that a particular node or function will not be available which may imply degraded functionality; however this is normally preferable to the loss of the complete system. An example of self-healing is that implemented by FDDI where the internal node wrap function excises a failed cable section or failed upstream or downstream neighbour.

Optical Bypass Switches (OBSs) can offer an extra level of self-healing over and above that offered by FDDI itself.

7.2.4.3 Reconfigurability

Online reconfiguration and System BIT (Built-in Test) are important mechanisms by which fault-tolerance can be achieved. Such capabilities need to be supported by effective network management services[103, 86].

7.2.4.4 Enabling Technology

Employment of technologies such as Flash EPROM offers significant advantages in achieving online reconfigurability. Flash EPROM, in particular the capability of In-System Write (ISW), allows for the online download of code from a central data storage facility such as a fileserver while still maintaining code integrity in the case of power failure or LAN failure. This capability also provides for enhanced software upgradeability as computer boards do not have to be removed from equipment racks, thus providing for less downtime and requalification time, as well as increased user confidence following set-to-work.

7.3 Multimedia Networking

Multimedia is the generic term for integrated digital text, graphics, video, image and audio.

While present-day real-time systems may not widely employ multimedia, next-generation real-time systems will certainly do so in order to meet operational requirements. It is contended that multimedia capability will enhance a system's capability to perform data fusion, thereby enhancing an operator's ability to make complex decisions with improved reaction times.

7.3.1 Video and Image

In typical real-time, mission-critical, distributed systems, digital video and image can arise from a number of different sources, e.g. surveillance cameras, FLIRs (Forward-Looking InfraRed devices), optical tracking devices, scan-converted radar and from various image databases. Such data sources are used for the detection, tracking and correlation of targets in real-time (e.g. surveillance radar). The attribute which characterises all these sources is the extensive throughput that they require over the networks.

Video is normally considered to be a continuous signal that is visible to a human when displayed. Image is normally considered to be either a continuous signal that is meaningful only to a signal processor or a discontinuous (frame-by-frame) signal that is visible to a human when displayed.

7.3.2 Multiplexed Video and Image

Another capability which may be required within the system is to display multiple images simultaneously at different control consoles. There is thus the requirement to multiplex and multicast video and image data.

Video and continuous image transmission must cater for the necessary resolution of the digitised signal. In certain applications, e.g. monitoring, only adequate visible rendition may be required, in which case varying degrees of lossy compression may be employed. In other applications, e.g. target recognition or remote photographic reconnaissance, no information can be lost and therefore only lossless compression can be employed.

Data rates for uncompressed video are indicated in Table VIII :

Source Type   Quality     Resolution       Frame Rate   Bits per   Data Rate
                          (Pixels)         (Hz)         Pixel      (Mbit/s)
Mono NTSC     Low         640 x 480        30           8          74
RGB PAL       Medium      768 x 576        25           24         265
SVGA          High        640 x 480        72           24         531
HDTV          Very High   1 024 x 1 024    72           32         2 416

Table VIII : Data Rates for Uncompressed Video

Cursory analysis of Table VIII shows that the transmission of video with even modest resolution would saturate the highest performance networks. It can be concluded that for networks to transport digital video, especially from multiple sources simultaneously, substantial compression is required. For certain classes of digital video signals, such compression needs to be performed in real-time. This requires extensive processing power.
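The figures in Table VIII follow directly from the product of resolution, frame rate and bits per pixel. The short sketch below reproduces them; it takes 1 Mbit as 10^6 bits, and the function and variable names are chosen here for illustration.

```python
def uncompressed_rate_mbit_s(width, height, frame_rate_hz, bits_per_pixel):
    """Raw video data rate in Mbit/s (1 Mbit = 10**6 bits)."""
    return width * height * frame_rate_hz * bits_per_pixel / 1e6


# Parameters as listed in Table VIII: width, height, frame rate, bits per pixel.
SOURCES = {
    "Mono NTSC": (640, 480, 30, 8),
    "RGB PAL": (768, 576, 25, 24),
    "SVGA": (640, 480, 72, 24),
    "HDTV": (1024, 1024, 72, 32),
}

for name, parameters in SOURCES.items():
    print(f"{name:10s} {uncompressed_rate_mbit_s(*parameters):8.0f} Mbit/s")
```

Even the lowest-quality source occupies most of a 100 Mbit/s FDDI LAN on its own, which is the basis of the conclusion that compression is required.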

7.3.3 Video Data Compression

All video compression schemes share a common objective, that is to identify and eliminate redundancy in an image. This reduces the number of bytes required to transmit the image. Compression algorithms can be lossless or lossy.

7.3.3.1 Lossless Video Data Compression

Lossless compression schemes guarantee that the decompressed image will be identical to the original, with no degradation in signal quality. Unfortunately, lossless compression schemes can only achieve modest compression ratios, in the order of 2:1 to 4:1.

7.3.3.2 Lossy Video Data Compression

An image compressed with a lossy compression scheme and later decompressed, is only an approximation of the original image. It is a debatable question as to how much compression can be used before the signal degrades to unacceptable limits. There are three types of widely accepted lossy compression schemes, namely JPEG, MPEG and H.261.

7.3.3.3 JPEG Compression

The Joint Photographic Experts Group (JPEG) has developed an international standard for compression of still images. Based on discrete cosine transforms (DCTs), JPEG compresses an image into a representation that is a fraction of the size of the original image that, when decompressed, yields an approximation of the original image.

The JPEG algorithm is based on first sub-sampling and then compressing the image. The human eye is less sensitive to colour information than to black and white information and sub-sampling exploits this characteristic by reducing the colour information without measurably affecting picture quality.

The actual compression ratio of the JPEG algorithm cannot be set to a fixed value. Instead, the quality factor for the transformation is specified. The compression ratio is determined by the complexity of the source image. In this way a fixed picture quality can be maintained, while the throughput required to transfer that picture will vary with picture complexity. This will have implications for the data transfer infrastructure, especially the requirement to provide for class of service discrimination.

7.3.3.4 MPEG Compression

The Moving Picture Experts Group devised MPEG to take advantage of the fact that sequential frames of a moving image contain significant redundant information. MPEG encodes an image into a base frame and n additional frames that contain only the information that has changed from that contained in the base frame, followed by another base frame. The value of n is adjustable and MPEG with n = 0 is similar to JPEG. For the same series of images, MPEG should produce the same quality as JPEG, but with a lower data rate.

The disadvantage of using MPEG as opposed to JPEG in a packet-switched network is that every frame of a JPEG image is complete, while all n update frames in an MPEG image are dependent on the preceding base frame. If the base frame is lost, n+1 frames are lost, so if there is no error control, the value of n must remain low in packet-switched networks.

Flexible error control is a requirement for optimised transmission of digital video in networks. Where MPEG is used, the base frame could be fully error controlled with the subsequent frames provided with less error control or none at all.
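The effect of base-frame loss, and the value of protecting only the base frame, can be estimated with a simple model. The sketch below follows the simplified description given above (each update frame depends only on its base frame) and additionally assumes that frames are lost independently with a fixed probability. The loss probabilities in the example are illustrative, not measured values.

```python
def displayable_fraction(n_update_frames, update_loss_prob, base_loss_prob=None):
    """Expected fraction of frames that can be displayed.

    A group consists of one base frame and n update frames. An update
    frame is usable only if both it and its base frame arrive.
    If base_loss_prob is omitted, base frames are lost as often as
    update frames (no selective error control).
    """
    if base_loss_prob is None:
        base_loss_prob = update_loss_prob
    base_ok = 1.0 - base_loss_prob
    update_ok = 1.0 - update_loss_prob
    usable = base_ok + n_update_frames * base_ok * update_ok
    return usable / (n_update_frames + 1)


# Illustrative 10 % frame loss.
print(displayable_fraction(0, 0.1))                      # n = 0, JPEG-like
print(displayable_fraction(9, 0.1))                      # n = 9, no error control
print(displayable_fraction(9, 0.1, base_loss_prob=0.0))  # n = 9, base frame protected
```

Under these assumptions, fully protecting the base frame recovers the loss penalty of a large n without applying error control to every frame.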

Currently, MPEG is only available in software, which is still too slow for real-time use. Should dedicated MPEG compression engines become available, MPEG would be superior to JPEG for moving images.

7.3.3.5 Motion JPEG Compression

Motion JPEG is identical to standard JPEG, as described in Paragraph 7.3.3.3. Each video frame is grabbed and compressed in real-time and the resultant information is distributed to the relevant user. The decompression algorithm is also executed in real-time, so that full motion video is regenerated.

When JPEG is used to compress moving images, it treats each frame as an independent still image. This means that redundant information in an image is reduced, but information that is redundant from frame to frame remains. From this it can be seen that MPEG would provide a greater overall reduction in bandwidth, while JPEG would provide more graceful degradation in the presence of network errors.

7.3.3.6 H.261 Compression

H.261 is an ITU compression standard[128] also known as p*64. It is used for transmitting video images over digital networks at data rates from 64 kbit/s to 2,048 Mbit/s. H.261 is similar to MPEG and is also based on the DCT technique, but allows much faster transmission speeds, although not without some trade-off in image size and quality.

H.261 is one of a complete family of ITU standards for the compression (H.261), framing (H.221), control (H.230 and H.242), encryption (H.233), video conferencing (H.231 and H.243) and transmission (H.320) of digital video.

7.3.3.7 Compression Ratio

The Compression Ratio depends on the video source characteristics and on the acceptable degree of image degradation. Tests performed using JPEG on live video at different compression ratios have produced the guidelines provided by Table IX:

Compression Ratio   Resultant Quality
1:1 to 6:1          Broadcast Quality
6:1 to 12:1         Professional Quality
12:1 to 20:1        Consumer Quality
20:1 to 40:1        Multimedia Quality

Table IX : Video Quality for Various Lossy Compression Ratios

At a compression ratio of 1:1 (sub-sampling, but no compression), a real-time framegrabber will convert an RGB PAL signal of 768 x 576 pixels (25 frames/second) to digital information with a data rate of approximately 265 Mbit/s. This data rate can be compressed to between 22 and 44 Mbit/s (depending on image complexity) while still maintaining professional image quality.
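The relationship between compression ratio, resultant quality and data rate can be sketched as follows. The quality bands are those of Table IX. Because adjacent bands in the table share a boundary value, the sketch assigns a boundary ratio to the higher-quality band; that choice is an assumption made here.

```python
# Upper compression ratio of each band in Table IX, best quality first.
QUALITY_BANDS = [
    (6.0, "Broadcast"),
    (12.0, "Professional"),
    (20.0, "Consumer"),
    (40.0, "Multimedia"),
]


def quality_band(ratio):
    """Quality label for a compression ratio, or None beyond 40:1."""
    if ratio < 1.0:
        raise ValueError("compression ratio must be at least 1")
    for upper, label in QUALITY_BANDS:
        if ratio <= upper:  # boundary values fall in the higher-quality band
            return label
    return None


def compressed_rate_mbit_s(uncompressed_mbit_s, ratio):
    """Data rate after compression by the given ratio."""
    return uncompressed_mbit_s / ratio


# RGB PAL at 265 Mbit/s, at both ends of the 6:1 to 12:1 range.
for ratio in (6, 12):
    print(ratio, round(compressed_rate_mbit_s(265, ratio)))
```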

7.3.4 Networking Implications of Digital Video

Networking of digital video requires high-speed technologies (such as FDDI or ATM). FDDI can accommodate a small number of digital video circuits, but standard FDDI shares bandwidth between all nodes. ATM is well suited to digital video applications (e.g. 155 Mbit/s SONET). Each link can accommodate a number of virtual video circuits while the ATM switch can aggregate the combined bandwidth of a number of individual links (Refer Paragraph 7.2.2.7).

7.3.5 Audio

Audio transmission is characterised by the requirement to cater for the complete audio spectrum of the human ear, effectively 0 to 20 kHz. Audio is normally simplex, i.e. it flows in one direction from producer to consumer(s).

Examples of audio are sonar and broadcast quality music. In a real-time, mission-critical, distributed system such as an anti-submarine warfare (ASW) frigate, sonar information would be absolutely critical and therefore allocated a high precedence.

Due to the relatively low bandwidth of sonar and dynamics of the underwater environment, it may not be accorded the highest priority. However, an incoming torpedo alert from a sonar system may be accorded the highest precedence and high, but not highest, priority.

Video or audio for crew or workforce entertainment would be the first signals to be discarded in a multimedia, real-time, mission-critical, distributed system and would therefore be assigned the lowest priority when an action state was entered or when available bandwidth became scarce.

Full bandwidth audio is sampled at 50 kHz and at 16 bits per sample giving rise to some 800 kbit/s.

7.3.6 Voice

Despite extensive electronic integration of functions within modern systems, it is likely that there will still be an extensive reliance on voice communications between operators. This is partly a matter of doctrine and partly because human operators are presently much more capable than machines at performing data fusion.

The most important difference between voice and video is that the former requires some hundreds of kbit/s while the latter requires some hundreds of Mbit/s (without compression).

Voice transmission is characterised by the requirement to transmit the primary power band of intelligible human speech (effectively 300 to 4 000 Hz) while providing speaker recognition. Voice can therefore normally be band-limited in the analogue domain and compressed in the digital domain. Normal quality voice is usually sampled at 8 kHz and at 8 bits per sample giving rise to 64 kbit/s. High quality voice is sampled at 16 kHz and at 12 bits per sample giving rise to 192 kbit/s. Due to the significant redundancy in human speech, voice can be considerably compressed without significant degradation of quality, thus reducing the required network throughput considerably. Voice is also normally duplex, i.e. it flows in two directions where each node is a producer and consumer.
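The voice rates quoted above, and the full-bandwidth audio rate given in Paragraph 7.3.5, are simply the product of sampling rate and sample size. A minimal sketch of the calculation for uncompressed, single-channel signals follows; the function name is chosen here for illustration.

```python
def pcm_rate_kbit_s(sample_rate_hz, bits_per_sample):
    """Uncompressed single-channel data rate in kbit/s."""
    return sample_rate_hz * bits_per_sample / 1000


print(pcm_rate_kbit_s(8_000, 8))    # normal quality voice
print(pcm_rate_kbit_s(16_000, 12))  # high quality voice
print(pcm_rate_kbit_s(50_000, 16))  # full bandwidth audio
```

For a duplex voice connection the same rate applies in each direction.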

7.3.7 Continuous Media Services

A special requirement of real-time digital video and audio is that continuous media services are required to be provided by the network.

Continuous media require specific protocol support from the transfer protocols. This includes flexible error control, the option to dispense with flow control and class of service discrimination.

Standard FDDI does not directly provide this support; however the following augmentation schemes or alternative options are possible :

• Use of real-time network protocols, e.g. XTP, with some extensions in the areas of flexible error control (such as Slack ARQ[65]) and use of the Class of Service and Quality of Service options.

• Adoption of ATM LAN standard (when sufficiently mature).

• Adoption of FDDI II LAN standard (when available).

7.3.8 Summary of Network Implications of Continuous Media

Analysis of the distribution of digitised continuous media underscores the derived system requirements of networks capable of high throughput, low latency, deterministic data transfer and flexible dataflow control.

7.4 Connectivity Issues

7.4.1 Cable Plant

The Cable Plant consists of all the hardware items to provide physical connection between sub-systems and thereby support the sharing of data. Its characteristics are therefore important for real-time, mission-critical systems, especially in the area of dependability.

It is proposed that, in general, fibre optic cable should be used for the System Cable Plant. This is for reasons of electromagnetic compatibility, specifically the reduction of cross-talk: reduced electromagnetic susceptibility (especially to radiation from high-power radiators such as radars and electromotive equipment) and reduced radiation into other equipment (especially highly sensitive equipment such as RF communication systems and sonars) and into communications cables. Use of fibre optic cables also circumvents the ubiquitous problems of bonding and grounding.

The System Fibre Cable Plant (SFCP) therefore provides the physical medium for data transfer by propagation of optical signals. The SFCP typically consists of a segmented main trunk, segment interconnection equipment and local connection equipment.

It is recommended that the SFCP should consist of a dual- or quad-redundant fibre trunk main employing a multicore multimode fibre (50/125 µm or 62,5/125 µm) cable (4- to 12-core) which is routed throughout the system. To support FDDI, the SFCP will physically and logically have a counter-rotating ring topology. The precise physical cable routing should be such as to maximise system survivability. Typically aboard a mobile platform, the routing will be according to the starboard/high - port/low principle (or vice versa). In fixed installations, the cabling should be routed for maximum separation and security (i.e. restricted physical access).

The trunk segments interconnect system compartments and run between pairs of Trunk Coupling Units (TCUs), i.e. one TCU per compartment for each trunk. Coupling between sub-systems and the LAN cable trunk should be accomplished by TCUs incorporating pairs of 2x2 Optical Bypass Switches (OBSs) per sub-system. TCUs are provided in each system compartment where network services are required. Fibre optic patchcords connect each sub-system requiring network services to a TCU. These mate with suitable feedthrough connectors provided on each sub-system equipment. Internal patchcords provide connection with the FDDI Network Interface Cards (NICs) within the sub-system equipment.

OBSs re-route the FDDI LAN signals in the case of sub-system power-off or failure. Due to internal optical losses, only a certain number of OBSs can be active in series at one time, therefore Fibre Amplifiers need to be provided to regenerate the FDDI LAN signals as appropriate.

Refer to Figure 18 for a representation of a typical System Fibre Cable Plant arrangement and to Figure 19 for a representation of a TCU.

Issue : 1 1996-07-08 Revision : 2 2006-05-31 Page 159 of 214 ydthsm2.wpd TCU U-D2 TCU TCU TCU TCU TCU U-23 TCU TCUU-63 TCU TCU U-D3 U-82 U-83 U-22 U-62 U-A2 U-A3

NSS IFU SBS SBR (SRX) PSR MGW IFU (76) xxx-xxx 452-220 711-1A0 SSM IFU (1/2) EOT ORT 481-260 481-160 721-170 OD IFU MRL IFU 482-330 MDR IFU (TPR) EW IFU TFC IFU * 474-120 452-170 xxx-xxx 483-120 FCR FRIFU DPG IFU (35) 482-2B0 711-220 FCR FCIFU 482-2C0

TCU TCU TCU TCU TCU TCU TCU TCU TCU TCU U-E2 U-E3 U-72 U-73 U-C2 U-C3 U-52 U-53 U-92 U-93

SSM IFU (2/2) TED IFU* SSM MCIFU MGW IFU (76) 721-170 471-240 482-120 711-1A0

SSM GCIFU 482-160

CSM SMC (1/2) 491-110

MRL xxx-xxx

TCU TCU TCU TCU TCU TCU TCU TCU TCU U-42 U-43 U-36 U-B2 U-B3 U-32 U-34 U-35 U-33

ORT CON TRX COC (1/2) WCS WCU (2/2) ESC 481-150 435-520 411-610 xxx-xxx Main Trunk HMS RXC EOT CON TRX COC (2/2) AIS ACA1 ECC 461-140 481-250 Redundant Trunk 435-520 411-110 xxx-xxx Breakout Cable WCS WCU (1/2) TCS FCU AIS ACA2 FCR 411-610 Fibre Patchchords 411-210 411-120 xxx-xxx * MMI SRC HCU MSCU Future Expansion HMS OPR (OPC) 452-310 xxx-xxx xxx-xxx 461-151

Figure 1717 :: Typical Typical SystemSystem FibreFibre CableCable PlantPlant ysfcp02.cdr Page 160 of 214 Input

Fibre Amplifier Power Output

Optical Bypass Switches

Spare

Splicing Post

Dustcaps

Dustcaps

Spare Spare

Optical Bypass Switch Control

Fibre Amplifier Power

Main Fibre Cable Trunk

Fibre Optic Patchcords

Figure 18 : Trunk Coupling Unit ytcu02.cdr Page 161 of 214 System Implementation

7.4.2 Network Interface Cards

Network Interface Cards (NICs) provide connection of the host to the LAN. As such, their performance, capabilities and architecture are critical to the real-time, distributed system[102]. NICs are normally found in two main categories, i.e. intelligent NICs featuring their own processing and other resources and non-intelligent NICs which share the processing resources (memory, timers, etc.) of the host CPU. Each type has its own advantages and disadvantages.

7.4.2.1 Intelligent NICs

The main advantage of intelligent NICs is that they can support LAN operations without detracting from the host's application processing; in particular, they avoid generating a high number of host interrupts. High-speed LANs are capable of generating vast amounts of data for a host CPU to consume and thus an off-host multiprocessor (host and LAN processor) architecture is often more appropriate in high performance systems.

The main disadvantages of intelligent NICs are that they are more expensive than non-intelligent types and that, due to rapid advances in CPU technology, they tend to become obsolete more rapidly. Another important consideration is that host CPUs and NICs communicate by a non-deterministic parallel backplane bus. There is therefore a further source of data communication latency and jitter between NIC and host, thus requiring a secondary level of time synchronisation (refer also to Paragraph 6.8.2.6).

7.4.2.2 Non-Intelligent NICs

The main advantages of non-intelligent NICs are that they are less expensive and smaller than intelligent types.

The main disadvantage of non-intelligent NICs is that they can considerably detract from the host's application processing capabilities, especially by generating a high number of interrupts. NTP processing may exacerbate this.

7.4.3 Real-Time Operating Systems

Most real-time, mission-critical, distributed systems require very high processing speeds, with the operating system providing task response times (task switch times) that are as short as possible. Typical maximum figures for such times are in the order of 50 µs to 250 µs.

In complex software systems where the functionality is implemented by many software tasks, real-time operation is most effectively implemented under control of a real-time operating system (R-TOS). The R-TOS manages all system resources including task scheduling, inter-task communication, synchronisation, interrupt handling, memory management, general I/O and LAN I/O.

In order to support the real-time system as well as real-time networking, an R-TOS must have the following capabilities[118] :

• Predictably fast response to urgent events
• A high degree of schedulability
• Stability under transient overload

Among the most useful theoretical approaches to the management of real-time systems, especially distributed real-time systems, is Generalised Rate Monotonic Scheduling (GRMS)[118]. GRMS provides a model that allows developers of real-time systems to meet the above requirements by managing system concurrency and timing constraints at the level of tasking and message passing.
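As an illustration of the kind of schedulability analysis on which rate monotonic theory rests, the following minimal sketch applies the classic Liu and Layland utilisation bound, n(2^(1/n) - 1), to a set of periodic tasks. It is not taken from the GRMS reference and covers only the simplest case (independent periodic tasks, deadlines equal to periods). The task set and the function names are hypothetical.

```python
# Illustrative only: the task set below is hypothetical, not thesis data.

def rm_utilisation_bound(n: int) -> float:
    """Liu and Layland least upper bound for n periodic tasks: n(2^(1/n) - 1)."""
    return n * (2.0 ** (1.0 / n) - 1.0)

def passes_rm_bound(tasks):
    """tasks: list of (execution_time, period) pairs in the same time unit.

    Returns (utilisation, bound, ok). The test is sufficient but not
    necessary: failing it does not prove the task set is unschedulable.
    """
    utilisation = sum(c / t for c, t in tasks)
    bound = rm_utilisation_bound(len(tasks))
    return utilisation, bound, utilisation <= bound

# Hypothetical task set: (execution time, period) in milliseconds.
tasks = [(1.0, 5.0), (2.0, 10.0), (5.0, 50.0)]
u, b, ok = passes_rm_bound(tasks)
print(f"U = {u:.3f}, bound = {b:.3f}, schedulable by this test: {ok}")
```

A task set whose total utilisation stays below the bound is guaranteed to meet its deadlines under rate monotonic priorities; a set above the bound needs a more exact analysis.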

Because of the onerous performance requirements of critically real-time systems, real-time operating systems were traditionally programmed using proprietary techniques. This has normally resulted in the corresponding real-time computing system being a closed system, i.e. unable to interface to any entity not especially developed for the purpose. Recent advances in hardware and software technology, however, have made it possible to apply the concept of open systems to demanding real-time systems. The attributes of such operating systems are that they should exhibit fast and deterministic real-time performance, be compatible with industry standards and be available on a variety of hardware platforms. They should support a variety of standard features, including networking, graphical user interfaces and high-level programming languages, as well as memory and mass storage management[132].

7.4.4 POSIX

In order to further the objective of open systems, the IEEE is developing standards that will enhance the portability of operating systems across different computer environments. In particular, the IEEE is presently developing or refining nine POSIX (Portable Operating System Interface) standards in specific areas.

Of specific importance to real-time systems are the POSIX 1003.1 Compliance Test[2], POSIX 1003.4 Real-Time Extensions[3] and POSIX 1003.4a Threads Interface[4] standards.

7.5 Chapter Summary

Protocols supporting real-time, mission-critical, distributed systems must operate in a complex hardware and software environment. While it should be a goal to achieve decoupling between the entities of the system in order to maximise flexibility and upgradeability, complex interactions are inevitable within the system. The design and implementation of real-time protocols therefore have to be considered within the contexts of system integration, dependability, topology and interconnectivity. Furthermore, the real-time protocols have to co-exist with real-time operating systems, as well as share critical computer resources such as processor time, memory and I/O. The lower level protocols have to interact very closely with network interface hardware, which in turn has to communicate with host processors via a parallel backplane bus. Optimisation of the complete system, in order for it to support critically real-time, mission-critical, distributed applications, is therefore of major significance.

8. System Prototyping and Modelling

8.1 Scope

This chapter describes an experimental testbed developed to investigate real-time protocols, especially their performance, ease of integration and robustness. The testbed facilitates detailed performance measurement and the correlation of required performance specifications with achievable results. It also proves useful for feeding these achievable results back into the system design, either initiating alternative designs or setting more realistic system requirements.

Apart from being a protocol development and measurement testbed, the infrastructure represents a skeletal framework for a LAN-based, distributed Integrated Naval Surface Combat System (INSCS). This is termed the Architecture Demonstration Model (ADM). As such, the ADM also proves useful for evaluating system concepts other than those purely network-related.

The experimental approach followed in this study was to identify certain areas of the Real-Time LAN Profile which required further investigation in terms of performance measurement, implementation details and interface design. Support of such an investigation required extensive software design and development, integration of bought-out and developed hardware and software products, as well as the setup and execution of detailed performance tests.

8.2 Objectives

The objectives of the project can be summarised as follows :

• Acquisition and integration of commercially-available, industry-standard, hardware and software building blocks into a skeletal system representing the INSCS's LAN-based, distributed processing architecture.

• Protocol engineering and development of an Application Interface Services (APIS) protocol meeting the generic and specific requirements described in Paragraph 6.8.1.

• Implementation of the Xpress Transport Protocol (XTP) for the Intel 80x86 platform (specifically for the Concurrent Technologies (CCT) CL486/DAS Multibus II and SysKonnect SK-NET FDDI-FE PC/EISA FDDI NICs) and interfacing to the APIS protocol.

• Implementation of the IP and CLNP network layer protocols interfacing to the XTP, TP4 and TCP transport layer protocols.

• Adaptation of the Internet Network Time Protocol (NTP) (specifically for the CCT CL486/DAS Multibus II and SK-NET FDDI-FE PC/EISA FDDI NICs) to operate over an FDDI LAN.

• Derivation of a Network Management philosophy.

• Implementation of Network Management Services (NMS) with appropriate man-machine interface.

• Development of software simulators to exercise the system in order to gain insight into system behaviour.

• Characterisation of the following requirements :

  - Data Transfer Latency
  - System timing and synchronisation
  - Closed-loop control over the LAN
  - Reliable data multicast
  - Multicast group management
  - Multiprotocol data transfer
  - Data transfer determinism
  - File transfer

• Documentation of the complete system and all its components, including :

  - System Specification[56]
  - System Design Document[57]
  - Host/APIS Interface Specification[58]
  - Host/NTS Interface Specification[59]
  - Host/BITS Interface Specification[60]
  - Host/FTS Interface Specification[61]

8.3 Experimental Testbed Description

The Experimental Testbed consists of an FDDI local area network connecting a number of protocol development stations, simulators and test stations. Refer to Figure 19 Experimental Testbed Topology for a diagrammatic representation of the experimental system and applicable technologies.

Figure 19 : Experimental Testbed Topology (FDDI LAN with trunk coupling units connecting the "NSS" and "SRS" simulators, the radar target simulator, the Multibus II stations, the XTP and APIS development stations, and the LAN and protocol analysers)

8.3.1 Local Area Network

The LAN consists of nine dual-attached FDDI nodes, seven being IBM PC-compatibles (Intel 80486DX2-66 CPUs using the EISA bus) and two being Multibus II development workstations (90 MHz Pentium® CPUs).

The PCs host SysKonnect FDDI EISA NICs (designated SK-NET FDDI-FE) which do not employ an onboard CPU; they thus require the processing resources of the host. The SK-NET FDDI-FE NICs use the Universal Portable Protocol Stack (UPPS) which is S&K's common multiprotocol interface platform. UPPS supports all of the most common LAN protocols including TCP/IP, TP4 and NetWare IPX/SPX as well as most of the popular operating systems such as Microsoft MS-DOS, Novell NetWare, Microsoft Windows NT, SCO Unix and IBM OS/2.

The Multibus II workstations host Concurrent Technologies CL486/DAS FDDI NICs. These employ an onboard 80486DX4-100 CPU which executes protocol processing as well as the onboard APIS, running under an iRMK real-time kernel. The CL486/DAS uses the CLA (Comsoft LAN Architecture)[54] which provides a tightly-coupled TP4/CLNP protocol implementation running under iRMK. CCT also offer a TCP/IP option running under iRMX III, Intel Corporation's full-featured, proprietary, real-time operating system.

Both types of FDDI NIC use the Advanced Micro Devices Supernet II FDDI chipset. This is an important consideration as the Supernet II chipset supports FDDI synchronous mode whereas the main alternatives, the and Motorola FDDI chipsets, do not.

Two PCs primarily support XTP and APIS development as well as protocol performance measurement.

The third PC simulates a Navigation Sub-System ("NSS") which multicasts simulated Navigation Data to the whole network at a nominal repetition interval of 5 ms. The "NSS" receives Calendar Time information and (static) Own Position from a Global Positioning System (GPS) receiver via a serial RS-232C link supporting the NMEA 0183 protocol.

The GPS Calendar Time provides global time synchronisation, by means of a broadcast message from the GPS host (i.e. the "NSS") to the system via the LAN every 10 ms.

The fourth PC simulates a Search Radar Sub-System ("SRS") which receives simulated Target Data over the FDDI LAN from the fifth PC implementing a radar data simulator.

The simulator PCs also act as development platforms for APIS consumers and multicast group management strategies.

The sixth PC acts as a host for a high-level FDDI protocol analyzer, FDDI LANWatch (an ftpSystems software product OEMed by S&K who have developed promiscuous mode packet drivers for the SK-NET FDDI-FE NIC). LANWatch can capture all FDDI packets in real-time or filter these by means of a flexible filtering capability. The captured packets can then be analysed or displayed with all protocol, timing information and data being available.

The seventh PC acts as a host for an APIS protocol analyzer, IVIT (Interface Test and Verification Tool) intended to assist development and integration of APIS. IVIT allows a LAN interface developer to interactively set up messages for reception and transmission according to system dataflow management directives. These messages can be captured, analysed or displayed in real-time with data being converted to engineering units. The APIS protocol information can also be displayed. In essence, IVIT can provide a static scenario simulator. It can also, however, provide a user application interface whereby application code can access IVIT services thereby supporting a fully dynamic scenario simulator.

8.3.2 Protocol Engineering

8.3.2.1 XTP Development

The organisation responsible for promoting XTP and maintaining the standard is known as the XTP Forum. Apart from the XTP specification documents[37, 38], they also provide (to members) an XTP Kernel Reference Model (KRM) in Sun and Sparcstation formats (for V3.6). For V4.0, Sandia National Laboratories (California, USA) have a public-domain, object-oriented implementation (written in C++).

Other commercial organisations have ported XTP to specific platforms. In particular, these include Network Xpress, Inc. (NXI) and Mentat, Inc. The former have ported XTP to the Intel 80x86 family of processors while the latter have implemented a Streams version, primarily for Unix environments.

In line with the concept of the maximum use of commercial off-the-shelf (COTS) components, the most practical option was to acquire such an implementation of XTP. As the primary computing platform in both the Multibus II and PC cases is based on the Intel 80486 or Pentium CPU, the NXI XTP was purchased as a "qualified" software source code product.

In order to expedite development, it was also decided to contract NXI to port their XTP to the SK-NET FDDI-FE NIC. NXI consequently provided a ported XTP implementation with 16-bit device drivers for this FDDI NIC.

Four versions of NXI XTP were supplied over a period of eighteen months as the XTP definition was being finalised: NXI XTP V3.6, V3.7, V4.0 and V4.01. The first two are fundamentally different from the latter two, corresponding with XTP's migration from a transfer protocol to a transport protocol.

The project team successfully ported NXI XTP to the Multibus II/iRMK target.

8.3.2.2 IP Implementation

Initially XTP (V3.6) did not require a network layer protocol. However, when XTP V4.0 was released and NXI XTP V4.0 was delivered, a network layer protocol was again required for the real-time option. It was decided to approach the issue from two perspectives and in two phases.

Initially NXI could provide XTP with two lower-layer interface options, i.e. directly to the data link layer or with their own stripped-down version of IP where an encapsulated IP service is offered. This service includes ARP (Address Resolution Protocol), but excludes full-featured routing capabilities as well as a broadcast capability. As APIS requires a broadcast service in order to establish a consumer group, it had to access the LLC layer for this service.

The second phase would be to incorporate a standard, full-featured IP into the profile. This would include a broadcast capability as well as full routing capabilities. Network Xpress are still to complete this version of XTP/IP.

8.3.2.3 APIS Development

APIS, being a newly proposed concept as well as being intended to interface directly to a wide spectrum of users, required attention from three perspectives: formal requirements specification, prototyping and then development using a formal methodology. As the end result is to be a product, evolutionary prototyping was considered to be the most appropriate approach.

Detailed requirements were derived, documented and formally reviewed. Once an acceptable baseline was determined, prototyping was undertaken before the software was demonstrated and integrated with representative user sub-systems. This process will be iterated until a production-quality product is achieved.

Such development was undertaken and the following achieved :

• Advanced prototypes of APIS V1.0 for both Multibus II and PC/EISA.

• Integration with own application software (i.e. interfacing to sub-system simulations).

• Integration with another organisation's MBII APIS implementation. In this case APIS was provided to the user as source code, application programming documentation and an interface specification.

• APIS V2.0 design using formal object-oriented methodology (Booch notation) and CASE (Rational Rose).

• APIS V2.0 development using C under iRMK on Multibus II.

• Latency and throughput performance measurements.

8.4 Performance Measurements

A major objective of the experimental testbed was to test the capabilities of the complete protocol stack as a transparent, data-driven protocol profile which does not compromise underlying real-time performance. Of specific interest to real-time, distributed systems are throughput and latency.

8.4.1 Measurement Setup

The measurement setup consisted essentially of two peer FDDI stations, a protocol analyzer and an oscilloscope. The latter had one input connected to the parallel I/O port of the FDDI transmitter's host processor and the other input connected to the parallel I/O port of the FDDI receiver's host processor. The software of the APIS Test Shell (ATS) would set a bit of the I/O port as it submitted a packet to APIS for transmission, while ATS would do likewise at the receiver as it received the packet from APIS. The time difference between the two events therefore represents the end-to-end latency between Application Services Users (ASUs). Various combinations of the above would show the latencies at each layer in the protocol stack as well as that contributed by the parallel backplane bus.

To measure throughput, a continuous stream of equal-size packets would be transmitted for a fixed period, with the number of transmitted packets being counted and the bit transfer rate computed.

Measurements were performed under no load and at various levels of load up to 68% of FDDI bandwidth (i.e. 68 Mbit·s⁻¹). Load was achieved by two PC-based stations transmitting simultaneously during the measurement.
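The arithmetic behind the two measurements can be sketched as follows. This is a minimal illustration only: the function names are hypothetical, and the edge times and packet count are invented values chosen to be of the same order as the results reported later in this chapter, not recorded data.

```python
# Sketch of the arithmetic behind the oscilloscope and packet-count measurements.
# All numeric inputs below are hypothetical illustrations, not thesis data.

def latency_ms(tx_edge_s: float, rx_edge_s: float) -> float:
    """End-to-end latency from the two I/O-port edges seen on the oscilloscope."""
    return (rx_edge_s - tx_edge_s) * 1e3

def throughput_mbit_per_s(packets: int, packet_bytes: int, period_s: float) -> float:
    """Bit transfer rate for a stream of equal-size packets sent over a fixed period."""
    return packets * packet_bytes * 8 / period_s / 1e6

# Transmit-side bit set at 12,500 ms and receive-side bit set at 15,050 ms.
example_latency = latency_ms(tx_edge_s=0.012500, rx_edge_s=0.015050)

# 3 438 packets of 4 000 bytes counted during a 10 s transmission period.
example_rate = throughput_mbit_per_s(packets=3438, packet_bytes=4000, period_s=10.0)

print(f"latency    = {example_latency:.2f} ms")
print(f"throughput = {example_rate:.2f} Mbit/s")
```

The throughput figure obtained in this way counts user data only; header bytes added by the protocol layers are not included.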

Figure 20 depicts the equipment setup for the measurement of protocol performance :

Figure 20 : Setup for Performance Measurements

8.4.2 Test Equipment

Table X lists the processor boards used for latency tests :

Processor Boards Used in Tests

No.  Model      Manufacturer                 CPU             Host Processor
1    PSBCP5090  Intel Corporation            Pentium 90 MHz  Multibus II with PC Features
2    486/DX2    Intel Corporation            80486 66 MHz    Multibus II with PC Features
3    486/DX4    Intel Corporation            80486 100 MHz   Multibus II with PC Features
4    CCT ETC    Concurrent Technologies Ltd  80486 100 MHz   Multibus II with PC Features
5    CL486/DAS  Concurrent Technologies Ltd  80486 66 MHz    Standard Multibus II FDDI NIC
6    CL486/DA4  Concurrent Technologies Ltd  80486 100 MHz   Standard Multibus II FDDI NIC

Table X : Multibus II Host Processor Boards and FDDI NICs used for Latency Tests

8.4.3 Test Scenarios

Performance measurements were made across the different layers of the protocol stack, as indicated in Table XI.

Protocol Layers

ISO Layer      Protocol Layer   Implementer     Description
7 Application  APIS             C²I² Systems    Producer/Consumer based Application Interface Services (APIS)
4 Transport    Multibus II TP   Intel           Transport Message passing protocol based on IEEE 1296
4 Transport    XTP 4.0          Network Xpress  Xpress Transport Protocol
3 Network      Encapsulated IP  Network Xpress  Internet Protocol
2 Data Link    LLC1             Comsoft         Logical Link Control Type 1 ISO 8802-2 (IEEE 802.2)

Table XI : Protocol Stack Layers

8.4.4 Protocol Overheads

There are overheads associated with the transmission of FDDI packets over the network. These are listed in Table XII below:

Protocol Layer  Header Overhead per Packet
MAC             16 bytes
LLC1            8 bytes
IP              20 bytes
XTP             32 bytes
APIS            10 bytes
Total           86 bytes

Table XII : Protocol Layer Header Overheads per Packet
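The effect of these header overheads on small packets can be illustrated with a short calculation. The sketch below takes the header sizes from Table XII and computes the fraction of each transmitted packet that is user data. It counts headers only; FDDI preamble, delimiters, frame check sequence and token handling are deliberately ignored, and the payload sizes are illustrative choices.

```python
# Header sizes in bytes, taken from Table XII.
HEADER_BYTES = {"MAC": 16, "LLC1": 8, "IP": 20, "XTP": 32, "APIS": 10}
TOTAL_HEADER_BYTES = sum(HEADER_BYTES.values())

def payload_efficiency(payload_bytes: int) -> float:
    """Fraction of transmitted bytes that is user data, counting headers only."""
    return payload_bytes / (payload_bytes + TOTAL_HEADER_BYTES)

for size in (10, 50, 1000, 4000):  # illustrative payload sizes in bytes
    print(f"{size:>5} byte payload: {payload_efficiency(size):.1%} user data")
```

For the short messages typical of control traffic, the headers outweigh the payload, whereas for 4 000 byte packets they amount to roughly two percent of the bytes sent.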

8.4.5 Parallel Backplane Bus Latencies

8.4.5.1 Test Setup

8.4.5.1.1 Fixed Length Unsolicited Transfers

This test involves sending a Multibus Transport Protocol unsolicited message of fixed length (28 bytes) repeatedly (with full handshake) between any two processor boards via the Multibus II PBB.

8.4.5.1.2 Fixed Length Solicited Transfers

This test involves sending a Multibus Transport Protocol solicited message of fixed length (28 bytes) repeatedly between any two processor boards via the Multibus II PBB.

8.4.5.2 Test Results

8.4.5.2.1 Unsolicited Transfers

Table XIII below provides the latency test results for unsolicited transfers of 28 bytes :

Unsolicited (Single Way) Latency Delays (in µs)

                           Intel        Intel        Intel    CCT CL486/DAS  CCT CL486/DA4  CCT ETH
                           (80486-DX2)  (80486-DX4)  Pentium  (80486-DX2)    (80486-DX4)    (80486-DX4)
Intel 80486-DX2            -
Intel 80486-DX4            -            -
Intel Pentium              143          -            63
CCT CL486/DAS (80486-DX2)  150          133          77       78
CCT CL486/DA4 (80486-DX4)  148          113          55       65             50
CCT ETH 80486-DX4          140          -            55       64             51             -

Table XIII : Unsolicited Transfer Latencies

Discussion

Table XIII indicates fairly low Multibus II latencies for unsolicited messages of short length. In all cases, these do not exceed 150 µs, with a minimum of 50 µs. These latencies are generally somewhat below achievable NTP synchronisation accuracies over a typical FDDI control LAN; hence they will not affect system performance substantially. Latencies improve with increased CPU performance, i.e. from the 66 MHz 80486DX-2 through the 100 MHz 80486DX-4 to the 90 MHz Pentium. Thus time-critical systems are better supported by higher performance (and more expensive) CPUs.

8.4.5.2.2 Solicited Transfers

Table XIV shows the test results for solicited transfers of 50 bytes :

Solicited (Single Way) Latency Delays (in µs)

                           Intel        Intel        Intel    CCT CL486/DAS  CCT CL486/DA4  CCT ETH      Starred values (*)
                           (80486-DX2)  (80486-DX4)  Pentium  (80486-DX2)    (80486-DX4)    (80486-DX4)  for this row
Intel (80486-DX2)          -
Intel (80486-DX4)          -            -
Intel Pentium              488          -            228
CCT CL486/DAS (80486-DX2)  570          500          336      ~                                          (*363) (*488)
CCT CL486/DA4 (80486-DX4)  500          390          245      ~              ~                           (*543) (*278) (*395) (*297)
CCT ETH (80486-DX4)        485          -            225      350            265            -            (*275) (*395) (*302)

(* Includes release_buffer() call)

Table XIV : Solicited Transfer Latencies

Discussion

Table XIV indicates significantly higher Multibus II latencies for solicited messages of short length. These range from 225 µs up to 570 µs. These latencies are one to two times NTP synchronisation accuracies over a typical FDDI control LAN; hence they will affect system performance substantially. Again, latencies improve significantly with increased CPU performance, with 90 MHz Pentium latencies typically being half those of the 66 MHz 80486DX-2. Thus higher performance CPUs are imperative for time-critical systems employing solicited messages over Multibus II.

8.4.5.2.3 Variable Length Solicited Inter-Host Transfers

Table XV shows the test results for solicited inter-host transfers of variable length :

Solicited (Single Way) Latency Delays between Pentium Processors

No. of Bytes  Time (ms)
1 000         0,48
2 000         0,78
3 000         1,08
4 000         1,36
5 000         1,66
6 000         1,96

Table XV : Solicited Inter-Host Variable Length Transfer Latencies

Discussion

Table XV indicates increasing Multibus II latencies for solicited messages of increasing length. These range from 480 µs (0,48 ms) up to 1,96 ms. These latencies are almost an order of magnitude greater than NTP synchronisation accuracies over a typical FDDI control LAN; hence they may seriously affect system performance. Where messages are time-critical, they should be kept short. Non-critical long messages between other Multibus II hosts can also affect the latencies of short messages to the NIC; therefore the primary host processor and NIC should be allocated the highest PBB priorities. Where PBB traffic is not deterministic and cannot be accurately characterised, a time protocol between the host processors and NIC should be implemented.
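The dependence of latency on message length in Table XV can be made explicit with a straight-line fit. The sketch below is an illustration added here rather than part of the original analysis: it applies an ordinary least-squares fit to the six tabulated points and separates the latency into a fixed cost per transfer and a cost per byte. The helper name is hypothetical.

```python
# Table XV: solicited transfer latency between Pentium hosts.
sizes_bytes = [1000, 2000, 3000, 4000, 5000, 6000]
latency_ms = [0.48, 0.78, 1.08, 1.36, 1.66, 1.96]

def least_squares_line(xs, ys):
    """Return (slope, intercept) of the ordinary least-squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

slope_ms_per_byte, fixed_ms = least_squares_line(sizes_bytes, latency_ms)
print(f"fixed cost per transfer = {fixed_ms * 1000:.0f} µs")
print(f"cost per 1 000 bytes    = {slope_ms_per_byte * 1e6:.0f} µs")
```

The fit gives a fixed cost of roughly 0,19 ms per transfer plus roughly 0,29 ms per 1 000 bytes, which supports the recommendation that time-critical messages be kept short.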

8.4.6 Logical Link Control (LLC1) Latencies

Table XVI shows the test results for Logical Link Control latencies :

LLC1 Performance Measurements

Packet Size  Latency  Implementer's Throughput Measurement
50 bytes     0,63 ms  8,4 Mbit·s⁻¹ @ 4 475 byte packet size
4 000 bytes  1,66 ms

Table XVI : Logical Link Control Latencies

Discussion

These latency measurements provide the time between the send-side XTP protocol code queuing a MAC-level packet to be sent and the receive-side XTP protocol code receiving notification of a received packet. What is noteworthy about these measurements is that the developers of the original XTP code claim an end-to-end latency of only 0,35 ms for a PC-based implementation of their protocol stack over FDDI. In the Multibus II implementation, the latency at the MAC level alone (0,63 ms for 50 byte packets) is almost twice as large. This implies either a non-optimum Multibus II implementation, a very highly tuned PC implementation, questionable implementer's test results or substantial Multibus II PBB latencies.

The throughput of the LLC1 layer could not be estimated directly as these measurements were taken for packets sent through the entire Multibus II TP/APIS/XTP/LLC protocol stack. However, for the entire protocol stack, a typical single station throughput of approximately 8,4 Mbit·s⁻¹ was observed. Given that the documented throughput for the LLC layer is approximately 4 times greater, it seems unlikely that the LLC layer presents a bottleneck in this implementation.

8.4.7 XTP Latency

8.4.7.1 Vendor Measurements

Table XVII provides XTP latency results for tests performed over FDDI and Ethernet by Network Xpress, Inc.

Conditions : end-to-end (user memory to user memory), 100 byte messages, no load

                            Computing Platform
Result  Network Technology  CPU Type       Bus Type  CPU Speed  Operating System
350 µs  FDDI                Intel 80486DX  EISA      50 MHz     MS-DOS
700 µs  Ethernet            IBM RISC       MCA                  AIX

Table XVII : XTP Latency Performance Results

Discussion

XTP latency tests with FDDI on a PC-type machine show latencies of 350 µs, which is low considering the 50 MHz PC with its EISA parallel backplane bus (PBB) and MS-DOS operating system. The IBM RISC machine with its Microchannel Architecture (MCA) PBB and AIX operating system is a much more capable platform. The 700 µs achieved with this platform over Ethernet therefore shows the considerably higher capability of FDDI compared with Ethernet.

8.4.7.2 Own Measurements

Table XVIII shows own test results for XTP transport latencies :

Own XTP 4.0 Performance Measurements

Packet Size  Latency
50 bytes     1,7 ms
4 000 bytes  3,5 ms

Table XVIII : XTP Transport Latencies over Multibus II

Discussion

These latencies measure the delay between APIS calling XTP_send on the send side and the XTP protocol code performing an upcall to APIS on the receive side in order to indicate that new data has been received. These figures are surprisingly high when compared to the 0,350 ms claimed by the implementer for their PC/DOS based implementation over FDDI. This difference should be investigated in more detail.

It is postulated that a reason that an iRMK implementation could be slower than an MS-DOS version of the same code is that iRMK uses 48-bit pointers, whereas the MS-DOS code uses a mixture of 32-bit and 16-bit pointers. Additionally, it is more time-consuming dereferencing far pointers in protected mode (as would be the case under iRMK) than it is dereferencing far pointers in real mode under MS-DOS. The XTP code makes extensive use of pointers, so that the overall speed of the protocol implementation will depend heavily on the speed of pointer operations under the operating system and software language used for the implementation. This aspect of iRMK and the current implementation of XTP under iRMK should be investigated more completely.

8.4.8 XTP Throughput

Table XIX provides XTP throughput results for tests performed over FDDI by Network Xpress, Inc.

Conditions : end-to-end (user memory to user memory), 64 Kbyte messages

                                  Computing Platform
Result        Network Technology  CPU Type       Bus Type  CPU Speed  Operating System
55 Mbit·s⁻¹   FDDI                Intel 80486DX  EISA      50 MHz     MS-DOS
92 Mbit·s⁻¹   FDDI                IBM RISC       MCA                  AIX

Table XIX : XTP Throughput Performance Results

Discussion

XTP throughput tests over FDDI on a PC-type machine show throughputs of 55 Mbit·s⁻¹, which is commendable considering the modest performance of the PC EISA parallel backplane bus (PBB) and MS-DOS operating system. The superior performance of the IBM RISC machine with its MCA PBB and AIX operating system allows 92 Mbit·s⁻¹, showing that data transfer is directly related to protocol processing performance.

8.4.9 Complete Stack under No Load

Tables XX and XXI show the test results for latency and throughput of the complete protocol stack under no load :

Complete Stack Performance Measurements

Packet Size  Latency  Throughput
10 bytes     2,55 ms
4 000 bytes  5,5 ms   ≈ 11 Mbit·s⁻¹

Table XX : Complete Stack Performance Measurements under No Load

Discussion

These measurements firstly indicate the delay between APIS receiving a complete APIS_send message via a Multibus II solicited transfer from the host processor on the send side and starting a solicited transfer of the received message to the host processor on the receive side. Secondly, they indicate available throughput of the complete protocol stack.

Best case latency of 2,55 ms is achieved with small packet sizes of 10 bytes, increasing to 5,5 ms with packet sizes of 4 000 bytes. Many classes of real-time systems will not be able to tolerate these levels of latency. Other methods are therefore required to recover timeliness of the distributed system.

In respect of throughput, only approximately 11 Mbit·s⁻¹ is achieved, even with packet sizes of 4 000 bytes (throughput will be worse for smaller packet sizes due to increased aggregate overhead). This level of performance illustrates that protocol processing requires substantial CPU performance. It is important to note, however, that this throughput is per FDDI link. The LAN will support the aggregate of the per-node throughputs (up to 100 Mbit·s⁻¹).

8.4.9.1 Contribution per Layer

Table XXI below shows the contribution that each layer of the protocol stack makes to the overall latency.

Protocol Layer Transfer Latencies

                Transmitting Side      Receiving Side
Protocol Layer  50 bytes  4 000 bytes  50 bytes  4 000 bytes
Multibus II TP  0,38 ms   1,04 ms      0,28 ms   1,13 ms
APIS            0,06 ms   0,06 ms      0,12 ms   0,12 ms
XTP             0,29 ms   0,66 ms      0,78 ms   1,16 ms
LLC1 (*)        0,63 ms   1,66 ms

(* over FDDI)

Table XXI : Protocol Layer Transfer Latencies

Discussion

Several observations concerning these measurements can be made :

• For 50-byte packets, the XTP latency on the receiving side (0,78 ms) is over twice that of the transmitting side (0,29 ms). This is to be expected, however, as XTP uses a reliable multicast algorithm. This means that the XTP protocol engine must acknowledge each message when it is received. As a result, the receive engine must process a received packet and build and send an acknowledgement, whereas the transmit engine only sends the original packet (its processing of the acknowledgement falls outside these measurements). When the XTP engine was changed to use unreliable multicast, the receive-side processing latency was found to be approximately the same as the send-side processing latency. Although it would be sensible to delay sending the acknowledgement until after the received packet had been delivered to APIS, it is not simple to make this change in the Network Xpress XTP implementation.

• The Multibus II TP latencies are surprisingly high. Furthermore, they depend strongly on message length.

• The LLC1 latencies are also surprisingly high and are similarly affected by message length.

• The APIS latencies are very low, only 60 µs on transmit and 120 µs on receive. This indicates the efficiency and optimisation of the APIS implementation.
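As a cross-check on Table XXI, the per-layer contributions can be summed along the path a message takes (transmit-side layers, LLC1 over FDDI, then receive-side layers). The Python sketch below does this. It assumes that each LLC1 figure covers the complete link crossing for the stated packet size, which is an interpretation of the table rather than something stated in the text. Values are held in microseconds, and the names used are illustrative only.

```python
# Per-layer latencies from Table XXI, in microseconds.
# Assumption: each LLC1 figure covers the whole FDDI link crossing (both sides).
TABLE_XXI_US = {
    50: {
        "transmit": {"Multibus II TP": 380, "APIS": 60, "XTP": 290},
        "llc1_over_fddi": 630,
        "receive": {"XTP": 780, "APIS": 120, "Multibus II TP": 280},
    },
    4000: {
        "transmit": {"Multibus II TP": 1040, "APIS": 60, "XTP": 660},
        "llc1_over_fddi": 1660,
        "receive": {"XTP": 1160, "APIS": 120, "Multibus II TP": 1130},
    },
}


def end_to_end_us(packet_bytes):
    """Sum every layer that a packet of the given size passes through."""
    row = TABLE_XXI_US[packet_bytes]
    return (sum(row["transmit"].values())
            + row["llc1_over_fddi"]
            + sum(row["receive"].values()))


if __name__ == "__main__":
    for size in (50, 4000):
        print(f"{size} bytes: {end_to_end_us(size) / 1000:.2f} ms")
```

Under that assumption the sums come to 2,54 ms for 50-byte packets and 5,83 ms for 4 000-byte packets, broadly comparable with the end-to-end figures of 2,55 ms (10-byte packets) and 5,5 ms (4 000-byte packets) quoted for the complete stack.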

8.4.10 Complete Stack under FDDI Load

Table XXII shows the test results for latencies of the complete protocol stack under various conditions of FDDI LAN loading :

End-to-end Latencies under FDDI Load

FDDI LAN Load       Latency with 80486DX2    Latency with 80486DX4
                    66 MHz CPU               100 MHz CPU
0                   2,86 ms                  2,59 ms
32% (32 Mbit/s)     3,24 ms                  2,80 ms
35% (35 Mbit/s)     3,29 ms                  2,84 ms
54% (54 Mbit/s)     3,27 ms                  2,86 ms
68% (68 Mbit/s)     3,67 ms                  2,96 ms

Table XXII : Complete Stack Performance Measurements under Load

Discussion

The effects of FDDI LAN loading on the end-to-end latency of the system were measured. To do this, a small MS-DOS program was written that sent 1 000 byte packets to a non-existent MAC address as rapidly as possible. The SysKonnect LAN Analyzer software was used to determine the actual LAN load. A single PC would load the FDDI LAN to between 32% (32 Mbit/s) and 54% (54 Mbit/s), depending on the CPU being used. Two PCs could be used together to obtain a higher aggregate loading. However, beyond 80% load, the LAN Analyzer software was found to behave unreliably, as the host processor running the software seemed to become overloaded.

The overall latency measurements did not seem to be substantially affected by the presence of LAN load. The measured latency appears to vary linearly as a function of LAN load in the range measured. Extrapolation gives a worst-case latency of approximately 4,0 ms when the LAN is saturated.
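The extrapolation can be illustrated with an ordinary least-squares line through the Table XXII measurements. The thesis does not state which fitting method was used, so the Python sketch below is only one plausible reconstruction; the function and variable names are illustrative.

```python
# Table XXII: FDDI LAN load (%) against end-to-end latency (ms).
LOAD_PERCENT = [0, 32, 35, 54, 68]
LATENCY_MS = {
    "80486DX2 66 MHz": [2.86, 3.24, 3.29, 3.27, 3.67],
    "80486DX4 100 MHz": [2.59, 2.80, 2.84, 2.86, 2.96],
}


def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def latency_at_load(cpu, load_percent):
    """Extrapolate the fitted line for one CPU to the given LAN load."""
    slope, intercept = fit_line(LOAD_PERCENT, LATENCY_MS[cpu])
    return intercept + slope * load_percent


if __name__ == "__main__":
    for cpu in LATENCY_MS:
        print(f"{cpu}: {latency_at_load(cpu, 100):.2f} ms at 100% load")
```

With the Table XXII figures, such a fit gives roughly 3,9 ms at 100% load for the 66 MHz CPU, consistent with the figure of about 4,0 ms, and a little over 3,1 ms for the 100 MHz CPU.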

It is noteworthy that a 65% LAN load corresponds to 8 NICs transmitting at 8,0 Mbit/s, which is nearly the full bandwidth that has been observed in practice with the test setup. The maximum throughput required for a single application would be sending a 4 Kbyte message every 20 ms. This would amount to a throughput of 1,6 Mbit/s. Therefore, a 65% LAN load would accommodate 40 applications transmitting at their maximum throughput requirements. The current implementation of APIS/XTP functions quite comfortably under this LAN loading.
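The arithmetic behind these figures can be reproduced as follows. This Python sketch assumes the nominal FDDI rate of 100 Mbit/s and takes "4 Kbyte" to mean 4 000 bytes, which is the value that yields the quoted 1,6 Mbit/s; the function names are illustrative.

```python
FDDI_RATE_BPS = 100_000_000  # nominal FDDI data rate, bits per second


def load_percent(num_nics, per_nic_bps):
    """LAN load produced by a number of NICs each sending at per_nic_bps."""
    return 100 * num_nics * per_nic_bps / FDDI_RATE_BPS


def app_throughput_bps(message_bytes, period_ms):
    """Throughput of an application sending one message per period."""
    return message_bytes * 8 * 1000 // period_ms


def apps_supported(load_pct, per_app_bps):
    """Whole number of applications that fit within the given LAN load."""
    return (FDDI_RATE_BPS * load_pct // 100) // per_app_bps


if __name__ == "__main__":
    per_app = app_throughput_bps(4000, 20)
    print(load_percent(8, 8_000_000))   # load from 8 NICs at 8,0 Mbit/s
    print(per_app)                      # bits per second per application
    print(apps_supported(65, per_app))  # applications within a 65% load
```

Eight NICs at 8,0 Mbit/s give a 64% load, and a 65% load divided by 1,6 Mbit/s per application gives 40 whole applications.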

The fact that the latency measurements are affected by CPU speed seems to suggest that the protocol implementation on the NIC, particularly the XTP implementation, demands a considerable proportion of CPU resources. This suggests that the XTP implementation could become a throughput bottleneck. However, it was not possible to saturate the NIC protocol engine with the iRMX test program. Although packets could be sent from the test program quickly enough for the protocol stages to begin overlapping (i.e. with APIS processing the next send request while XTP was processing the previous one), the NIC code never ran out of buffers at the APIS layer. This indicates that the 8,4 Mbit/s throughput that was observed with the test setup was limited by the speed at which the iRMX test application could send messages to APIS. This amounted to sending approximately 250 packets of 4 000 bytes per second, or approximately 500 packets of 50 bytes per second.

Finally, although some degradation in overall stack latency with LAN loading was observed, in all cases measurements were made using FDDI asynchronous mode. Measurements using synchronous mode were not possible due to the unavailability of synchronous mode drivers, both for end station support and synchronous bandwidth allocation, for the FDDI network interface cards being used. It is postulated that with the FDDI LAN being loaded with asynchronous traffic and the test packets using synchronous mode, significantly less degradation would have been observed.

8.5 NTP Development

A prototype of NTP, running over both Ethernet and FDDI LANs, was developed. Two main problems were experienced in porting the public domain NTP code to the development environment. These problems affected the performance measurements.

The first problem revolved around the fact that NTP was developed for the Unix environment. As such, it requires Unix's adjust time system call to set a node's local clock correctly. It appears that with both SCO Unix and LynxOS this call does not function correctly (both operating systems use generic Unix code). The result of this problem is that while NTP works accurately enough in the WAN-type environment of the Internet, where synchronisation precisions of tens to hundreds of milliseconds are expected, it cannot provide sub-millisecond precision in a small FDDI LAN. With the Linux public-domain operating system, where adjust time was observed to work correctly, such sub-millisecond accuracies could be obtained over Ethernet (Linux providing suitable Ethernet device drivers). As Linux did not provide FDDI drivers, it could not be used over FDDI. However, with SCO Unix over FDDI, NTP was observed to work correctly for short periods of time until the incorrect operation of adjust time disturbed the local clock.

Refer to Appendix F for a full description of the NTP implementation over Ethernet and FDDI as well as performance test results.

8.6 Simulator Development

A number of sophisticated graphics-based simulators were developed using rapid prototyping and an extensive software library of re-useable objects.

This included a realistic Search Radar Simulator ("SRS") with 360° plan position indication of synthetic radar target plots and tracks. The latter are produced by a target generator and transmitted via the FDDI LAN.

In order to demonstrate the concept of a critical virtual circuit and closed-loop control, the SRS graphical user interface was integrated with a remote control unit (RCU) and its manual input device (in this case a mouse). All mouse positions on the RCU were transmitted to the SRS via a virtual circuit over the FDDI LAN, with interactive application responses being returned to the RCU on the duplex virtual channel.

A Navigation System Simulator derives Calendar Time and Own Position from a GPS and multicasts this information to the "SRS" along with synthetic Platform Motion Data representing platform roll, pitch and yaw, updated every 10 ms.

8.7 Technical Conclusions

8.7.1 Prototyping and Modelling

8.7.1.1 Transport Layer/Network Layer Coupling

Despite the well-defined, layered approach of OSI, in the real world available off-the-shelf products do not always offer the flexibility promoted by the OSI model. This is because vendors usually provide combinations of TP4/CLNP or TCP/IP in which only the transport layer is accessible, and not the network layer, which is what XTP requires.

8.7.1.2 Off-the-Shelf Products

Despite the fact that TP4 and TCP/IP specifications have been stable for some years, this does not necessarily imply that products are widely available in stable, fully-conformant software implementations.

Probably owing to the influence of the Internet, TCP/IP is available in many different forms from many different vendors and it is possible to find good implementations. However, the same cannot be said of TP4. Options and development tools are limited, software implementations often exhibit bugs, documentation is often deficient and vendor support is generally weak.

8.7.1.3 Software Language Compilers

While software compilers are not strictly a communication protocol issue, they can have extensive implications for protocol development and system implementation. Compatibility problems arise when protocol source code is provided for integration, but the developer and integrator use different compilers.

8.7.1.4 XTP

NXI XTP has been used to run extended performance measurements over the PC-based FDDI LAN to determine such characteristics as latency, throughput and multicast capabilities.

Exercising the NXI XTP has been extremely beneficial in that, apart from learning curve benefits, certain deficiencies in the protocol implementation have been identified. These include capabilities specified by the Kernel Reference Model (KRM) not being "exposed" by the particular implementation, checksum algorithms not being correctly implemented, bugs in the supplied software, as well as deficiencies and errors in the NXI XTP documentation.

Analysis of the performance results has provided the basis for some important conclusions regarding the effective throughput of PC/EISA FDDI NICs. By comparison, the results also point to what appear to be fundamental deficiencies in the performance of the CL486/DAS MBII FDDI NIC.

8.7.1.5 Throughput Problem with MBII FDDI NIC

It appears that the CCT MBII FDDI NIC (CL486/DAS) suffers from a throughput deficiency due to fundamental hardware design constraints: incoming data needs to be copied through a series of buffers, which is in itself an inefficient process, but more so if the data is not word-aligned. In that case the onboard CPU (an Intel 80486) is required to use its COPY instruction, which copies unaligned data on a byte-by-byte basis.

On this issue Svobodova[127] observes :

"Copying data to and from buffers between protocol layers represents considerable overhead; it has been one of the most widely publicised don'ts in the design of multi-layer communication systems."

8.7.1.6 APIS Development

APIS integration and performance achievements indicate that the APIS concept is sound, i.e. that transparent, data-driven communication logic is valid and that a robust, high performance, integrateable protocol can be achieved using evolutionary prototyping.

8.7.1.7 NTP Development

Despite problems with Unix calls, NTP has proven capable of providing synchronisation between NICs on a non-deterministic Ethernet LAN with accuracy better than 250 µs. With the development of suitable Unix device drivers, it is expected that significantly better results will be achievable on a small FDDI LAN.

8.7.1.8 Simulator Development

Simulator development has assisted in the definition and validation of the APIS interface as well as provided valuable insight into the complex operation of the complete system, especially the closed-loop, real-time nature of the problem.

Simulators have also generated strong external interest by supporting motivating and interesting demonstrations, and have demonstrated the efficacy of rapid prototyping and software re-use.

8.7.2 Protocol Issues

8.7.2.1 FDDI LAN Standard

The FDDI LAN standard offers intrinsic redundancy, determinism, low error rates and high electromagnetic compatibility while supporting high data throughput at affordable cost.

The ADM project demonstrated that FDDI is reliable, affordable, effective and simple to use.

FDDI is the LAN technology chosen for the US Navy's SAFENET LAN standards suite, the Royal Navy's nuclear submarine LAN standard[41] and the European Air Traffic Control authority's (European Organisation for the Safety of Air Navigation) recommended LAN standard[1].

The SAFENET standard has progressed through the development phase and is now in the deployment stage. One example of deployment is aboard the DDG 51 Arleigh Burke class of destroyer where it forms the basis for the Data Multiplex System (DMS) for the platform management system[91]. The DMS is a general purpose network servicing over 1 650 user interfaces.

The FF-21 Multi-Mission Frigate, one of the US Navy's latest warship designs, also proposes the employment of SAFENET and FDDI to integrate its state-of-the-art combat system[40] as does the Regional Deterrence Ship 2010[44].

8.7.2.2 XTP

XTP supports real-time, mission-critical, data communications by providing a range of flexible real-time services. These are error, flow, rate and burst control, pre-emptive priority scheduling, optimized inter-network addressing mechanisms and reliable multicast support.

XTP provides an orthogonal approach to policy and mechanism in that the protocol definition and implementation differentiate between policy regarding real-time network issues such as dataflow control and the mechanisms of how these are actually implemented and how they interface to the user application.

XTP is concluded to be eminently capable of supporting closed-loop, real-time control of critical sensor/actuator control loops.

XTP can, with protocol augmentation if necessary, also support networking of animation-quality, digital video and audio.

Using XTP, XTP-aware IP routing and IP, full real-time functionality can be preserved within the Network Profile without leaving out the transport and network layers in LAN and internetwork topologies.

8.7.2.3 Multicast

Multicast is an increasingly important service required by real-time distributed systems. It is also a requirement for address-independent, application layer protocols such as APIS. A reliable multicast service is required by mission-critical systems. XTP offers a reliable transport-level multicast service.

8.7.2.4 Interconnectivity

Gateways to support a variety of data communication standards are readily implemented and thereby achieve maximum flexibility and affordability in providing for maximum functional integration and interoperability.

In many cases, while gateways can provide interconnectivity, real-time performance is compromised due to the processing limitations of the gateway or the throughput and/or latency limitations of the technology of the connected network.

8.7.2.5 Determinism

FDDI is sufficiently deterministic to guarantee latency to less than 2 ms (where LANs are geographically small to medium in size i.e. less than 50 nodes and 2,5 km in diameter).

By employing timestamping and the Network Time Protocol, synchronisation of distributed, application-layer processes to within 220 µs can be achieved.
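Synchronisation of this kind rests on the standard NTP on-wire calculation, in which four timestamps from one request/reply exchange yield the clock offset and the round-trip delay. The Python sketch below illustrates that calculation; the function name and the timestamp values are illustrative assumptions, not measurements from the prototype.

```python
def ntp_offset_and_delay(t1, t2, t3, t4):
    """Standard NTP on-wire calculation.

    t1: request sent     (client clock)
    t2: request received (server clock)
    t3: reply sent       (server clock)
    t4: reply received   (client clock)
    Returns (offset of server relative to client, round-trip delay).
    """
    offset = ((t2 - t1) + (t3 - t4)) / 2
    delay = (t4 - t1) - (t3 - t2)
    return offset, delay


if __name__ == "__main__":
    # Hypothetical exchange, in microseconds: the client clock is 150 us
    # behind the server, each direction takes 400 us on the wire, and the
    # server spends 100 us processing the request.
    offset, delay = ntp_offset_and_delay(t1=1000, t2=1550, t3=1650, t4=1900)
    print(offset, delay)
```

The calculation assumes equal path delay in each direction; any asymmetry appears as an offset error of half the difference, which is one reason a small, lightly loaded LAN permits tighter synchronisation than a WAN.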

8.7.2.6 Multiprotocol Operation

Multiprotocol operation is possible. Protocols such as TCP/IP, TP4/CLNP, XTP and NetWare SPX/IPX can co-exist peacefully on a network, while multiprotocol stacks are implementable on LAN interfaces.

Multiprotocol operation is desirable as it enhances interoperability and open systems interconnectivity without compromising the real-time component of the LAN traffic.

8.7.2.7 Digitised Continuous Media Services

FDDI only supplies packet-switched services. Digitised continuous media signals such as video and audio require circuit-switched type services. There is a requirement for multimedia networks to support these services.

8.7.2.8 Network Management Services

The implementation of network management in SAFENET-conformant LANs is optional. However it is concluded that if fault-tolerance or dynamic system reconfiguration is required in a real-time, mission-critical, distributed system, then network management services will be needed.

While HP OpenView is recognised as a de-facto standard and is available as a commercial off-the-shelf product, it cannot necessarily be used in all applications. The reason for this is that OpenView requires special hardware support in the form of a specific RISC processor. OpenView is therefore limited in its portability.

8.7.3 Implementation Issues

8.7.3.1 Prototyping

Judicious use of rapid and evolutionary prototyping reduces implementation timescales as well as technical risks without necessarily compromising product quality.

8.7.3.1.1 Rapid Prototyping

Rapid prototyping is the technique whereby software is rapidly developed in order to test concepts or conduct specific investigations without formal methodologies or documentation. Once the investigations are complete, the software source code should be discarded.

Rapid prototyping is appropriate for demonstrating and testing concepts and algorithms, as well as building simulators.

Rapid prototyping also provides for performance benchmarking which is important when evaluating commercial off-the-shelf building blocks. Specific areas where this is critical are timing, synchronisation, throughput and interfacing capabilities of COTS products.

8.7.3.1.2 Evolutionary Prototyping

Evolutionary prototyping is the technique whereby software products are constructed using loose adherence to formal methodologies and with minimal initial documentation. Once the complete software system concept is proven, reverse engineering is employed to produce a robust and maintainable software product.

Evolutionary prototyping is appropriate where software engineering costs and timescales are paramount, basic requirements and algorithms are already known and possibly an extensive re-useable software library already exists.

8.7.3.2 Protocol Optimisation

The discrepancy between the XTP latency results measured by the vendor, Network Xpress, and those of the version implemented during the course of the project leads to the conclusion that performance is implementation-dependent. Optimisation of protocol performance requires intimate insight into its software design and implementation as well as fine tuning of the code and algorithms. In the case of the NXI XTP V4.0 product, this insight was only available from commented source code, with the level of commenting being largely inadequate. Full insight into the software can only be properly provided by formal methods, specifically software design documents using formal methodologies and notation.

8.8 Technical Recommendations

8.8.1 Image and Video LANs

It is recommended that image and digital video signals be allocated their own LAN (or LANs) where they cannot saturate more critical data transmissions.

8.8.2 Software Language Compilers

Only standard software compilers should be used for the development and integration of communication protocols. Non-standard compiler directives must be completely avoided. This also applies to programming techniques.

8.8.3 Internet Protocol

A full-featured Internet Protocol implementation is required for the network layer. This needs to offer full portability and performance.

8.8.4 Real-Time Operating System

A real-time operating system offering Multibus II support, FDDI drivers and a correctly working adjust time function call is required. Wind River Systems' VxWorks[145] is recommended for investigation.

8.8.5 FDDI Synchronous Bandwidth Allocator

A portable FDDI synchronous bandwidth allocator implementation offering at least static bandwidth allocation is required to support FDDI synchronous mode. IBM's Unix implementation is recommended for investigation.

Chapter 9

Conclusions

9. Conclusions

9.1 Architecture Concept

In Chapter 2 it has been contended that in general, analogue and dedicated digital control techniques are almost entirely obsolete due to their modest performance, high maintainability requirements and almost complete lack of flexibility. Investigation of business, industrial process control and military command and control systems indicates that digital computer-based systems have superseded these systems, at least in the West.

Distributed system solutions have been proposed to be superior to centralised architectures for the reasons of survivability, dependability, affordability, reconfigurability and upgradeability. Point-to-point connection schemes in distributed architectures have been shown to be severely limiting, especially with regard to flexibility and upgradeability. Moreover, with the application of appropriate technologies and techniques, as identified and expanded in this thesis, local area networks have been shown to be appropriate for a broad range of real-time, mission-critical, distributed systems.

This thesis has argued that computer networks are appropriate for distributed architectures in that they can provide an optimum system solution in terms of the derived system requirements.

The proposed internetwork solutions offer further advantages in that a higher degree of integration is possible without the inappropriate mixing of diverse data types on the same LAN. The problems associated with this, e.g. responsibility for integration, qualification, etc., have, up until recently, contributed largely to the very conservative approach adopted by many designers of large, complex systems.

The proposed architectures are thus contended to support a good compromise of federalism, performance and cost-effectiveness. They are also very flexible in terms of system design options, scalability and online reconfiguration, as well as system upgrade.

However, the design of real-time, mission-critical, distributed systems using computer networks requires special attention to a number of critical issues, especially timing, synchronisation, fault-tolerance techniques, system engineering management and dataflow management.

Central to these issues is that of network protocols. These must offer real-time performance and support the requirements of dependability and fault-tolerance in order to support mission-critical applications. These protocols span the entire network system, or profile, i.e. from the physical layer up to the application interface layer. The performance of each layer is critical in order not to create bottlenecks which would compromise the real-time performance of the system. Also critical are the interfaces between each layer of the profile, both in terms of performance and robustness. Of specific significance is the interface to the application user, where system-specific requirements exist.

9.1.1 Solution Derivation

Considering the extent of the problem space as well as the spectrum of implementation options that are available, derivation of an optimum, network-based, system solution for distributed systems can be complex.

However, for real-time, mission-critical systems, a straightforward process of elimination considerably restricts the solution space when the fundamental issues are considered.

A primary consideration should normally be electromagnetic compatibility; this immediately points to a fibre optic LAN solution.

The next consideration is fault-tolerance. The network system should not exhibit any single point of failure either due to node failure, cable segment failure, or node insertion or extraction from the LAN. This immediately implies a replicated topology.

A further consideration is survivability following damage or component failure. This implies a self-healing capability and the maintenance of at least critical functionality and/or degraded functionality in such circumstances.
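The single-point-of-failure and self-healing requirements can be illustrated with a simple connectivity model. The Python sketch below assumes a dual counter-rotating ring (as in FDDI) in which the stations on either side of a failed cable segment wrap the ring, so that connectivity is lost only between stations separated by failed segments. The function name and the six-station ring are hypothetical, and node failures and optical bypass are not modelled.

```python
def surviving_groups(num_nodes, failed_segments):
    """Return the groups of stations that can still reach one another.

    Segment i joins station i to station (i + 1) % num_nodes. A failed
    segment is assumed to break both rings at that point, after which the
    adjacent stations wrap the ring.
    """
    failed = set(failed_segments)
    if not failed:
        return [list(range(num_nodes))]
    groups = []
    current = []
    # Start just after a failed segment so each group is a contiguous run.
    start = (min(failed) + 1) % num_nodes
    for step in range(num_nodes):
        node = (start + step) % num_nodes
        current.append(node)
        if node in failed:  # the segment leaving this station is broken
            groups.append(current)
            current = []
    return groups


if __name__ == "__main__":
    print(len(surviving_groups(6, [])))      # healthy ring
    print(len(surviving_groups(6, [2])))     # one failed segment
    print(len(surviving_groups(6, [2, 4])))  # two failed segments
```

In this model a single segment failure leaves all stations in one group, whereas a second failure splits the ring into two isolated groups, which is why higher levels of replication may be considered for the most critical systems.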

The final technical consideration is that the fundamental LAN technology, topology and intrinsic protocols should support real-time performance. This primarily implies low latency, determinism, dataflow control and high throughput.

Finally, an important, but non-technical factor is affordability.

Once these requirements and capabilities are established, it is concluded that, in the present timeframe, the system network solution reduces to the employment of the Fibre Distributed Data Interface (FDDI) in a dual- or quad-redundant topology operating over multimode fibre optic media. FDDI in a dual-redundant configuration exhibits excellent scalability from small implementations to large, making its potential application domain extensive.

9.1.2 Time Validity of Proposed Solution

It is contended that the proposed solution will remain appropriate and thereby meet the system performance requirements for the next 5 to 20 years.

9.1.3 Networking Requirements for Next Generation Real-Time Systems

While FDDI will support present real-time system designs, those of the future will require much higher performance networks.

In the medium term, throughputs of some 600 Mbit/s are likely to be required by real-time control systems. In the long term, throughputs of some 3 Gbit/s are likely to be required. In the very long term, throughputs of up to 10 Gbit/s will possibly be required.

In the medium to long term (5 to 30 years) the performance and functionality of the High Performance Network (HPN)[85, 86] will be required for next generation systems, especially if multimedia networking is involved.

Latencies of less than 100 µs and synchronisation accuracies of less than 5 µs are predicted to be required by next generation real-time systems[86].

9.1.4 Implications of High-Speed Networks

While the Protocol Engine[48, 49] was not a commercial success, it is proposed that the technical concept behind the project was indeed valid. While dedicated and optimised protocol processors may not be required for data transfer rates of up to 100 Mbit/s, general purpose processors will almost certainly be unable to cope with gigabit data rates. It is predicted that when such high data rates start finding application outside of the network backbone, i.e. at the LAN nodes, the protocol engine project will gain a new lease of life.

9.1.5 Asynchronous Transfer Mode

Asynchronous Transfer Mode (ATM) is a technology being supported by many players, many with their own vested interests, as the standard next generation high-speed network. It is contended that this will not be realised in the short to medium term for three reasons: the complexity of the technology, the cost of ATM switching equipment and the lack of major progress to date in the standardisation effort.

Due to the extensive interest in ATM, both academic and financial, it will become the standard high-speed network in the medium to long term (3 to 10 years). It will initially be appropriate and find application in WAN and MAN topologies, mainly in public networks. Only later will it find extensive application in LAN applications and this will initially be at the lower speeds (i.e. 25 to 155 Mbit/s).

ATM also does not intrinsically provide fault-tolerance (cf. the FDDI dual counter-rotating ring). Special techniques will have to be employed to achieve fault-tolerance. This is not likely to be without considerable cost. ATM also provides no error control over payload data. This may be a limiting factor for very low latency transfers which require complete integrity.

The current maturity of ATM standards and technology does not support the present implementation of mission-critical, real-time systems.

Despite the above limitations, ATM has definite advantages in multimedia applications due to its high bandwidth and low latency.

9.1.6 Scalability

The architecture, technologies and topologies that have been proposed are scalable to provide value-added, cost-effective solutions to any class of system; military, industrial or commercial.

The proposed systems are applicable to both new constructions as well as refits to existing systems.

It is contended that a modest investment in an integrated system will provide a force multiplication, which is affordable even to organisations of limited means.

9.1.7 Spare Capacity

The information processing capability of a real-time, mission-critical, distributed system will normally have to be continually upgraded throughout its life. This will place a heavy burden on the information management infrastructure. This implies that the network system should be designed with considerable spare capacity (typically 50% to 500%).

9.1.8 Standards and Standard Building Blocks

The solution should be strictly standards-based. Only where no existing, useable standards exist should proprietary standards or products be developed. Where this is appropriate, such products (especially software) should be developed and documented using formal methodologies and documentation standards.

There are many hardware and software products available supporting specific areas of real-time, distributed systems. There are also many organisations offering partial, proprietary solutions. When these products become obsolete, so will the systems that they support. Non-standard solutions have limited lifecycles.

Refer to Appendix J for a list of recommended products and standards.

9.1.9 Relationship of Implementation and Real-Time LAN Profile

It is contended that suitable models for the management of data communications have been identified, modified and adopted. These are the ISO OSI, SAFENET and Real-Time LAN Profiles respectively.

The Real-Time LAN Profile is achievable, flexible and affordable, as well as having considerable scope for upgradeability without major implications to the system.

Technological solutions for each of the appropriate layers of the Real-Time LAN Profile have been considered (in Appendices A to F), with at least one achievable solution at each layer being proposed. Where certain layers are found to have deficiencies (e.g. XTP's lack of latency control), techniques have been proposed to overcome these limitations (e.g. use of FDDI's deterministic mode of synchronous data transfer and timestamping using Network Time Services).

Thus a complete information management infrastructure has been synthesized in terms of these LAN profiles, as well as the allocated and derived requirements and available technologies.

9.1.10 Interoperability and Performance

In traditional real-time, distributed systems it was the norm to implement a communication system in a proprietary fashion. While this may have optimised the system in terms of performance, this would have been at the expense of interoperability; in other words it would have provided for a closed system.

When the OSI layered approach started to find practical implementation and high performance network interface hardware became available off-the-shelf, network system implementers attempted to embrace OSI. However, they invariably found the performance of the network and transport layers to be limiting factors. They tended to implement interfaces directly from the application to the MAC sub-layer. Again, while representing an improvement in terms of performance, this had implications on internetworking and interoperability.

Recent advances in protocol engineering, as well as the performance of protocol processing platforms, have led to the development of reliable, high performance standard network and transport layer protocols. Of particular significance are IP (Internet Protocol) at the network layer and XTP (Xpress Transport Protocol) at the transport layer. Thus internetwork topologies are achievable for real-time systems without omitting the network and transport layers.

Furthermore, while IP and XTP provide the required real-time performance, LANs have been developed to support multiple protocols concurrently. Thus a situation has been reached where maximum interoperability and performance can be achieved simultaneously.

9.1.11 System Effectiveness

Applying Information Technology (IT) to real-time, mission-critical, distributed systems greatly enhances their effectiveness in terms of functional performance and survivability. The proposed IT solution also offers the owner investment protection by providing clear opportunities and strategies for upgradeability and the management of obsolescence.

Information Technology will play an increasingly important role in providing the critical advantage in competitive environments.

9.1.12 Application Interface Services

Despite the elegance of the OSI 7-layer Basic Reference Model, it is nevertheless only a paradigm for data communication and not an absolute implementation objective in itself. Effectively, the lower four layers of the model provide all the requirements of the network system. The upper three layers are appropriate only to specific systems and, in fact, standard OSI implementations of these three layers detract substantially from the required performance of real-time, mission-critical, distributed systems.

Implementers of these systems therefore need to dispense with strict adherence to the provision of the upper three layers. In their place, appropriate lightweight support services need to be provided. Lightweight implies real-time performance.

The APIS protocol, defined and developed as described in Paragraph 6.8.1, meets the requirements of the particular class of real-time, mission-critical, distributed systems under consideration. It also supports programme constraints such as providing standard high performance interfaces to specified real-time operating systems and network interface hardware. Furthermore, it provides complete abstraction of the network system to the application user by means of a small set of software service calls. In fact, this abstraction of the underlying network infrastructure allows the application user to consider the LAN as a virtual backplane: the application user can communicate with a device on the same parallel backplane bus in the same manner as with a remote device connected via a LAN or WAN. Thus the user requirement of transparency is achieved. The only capability which remains non-transparent is the quality of service provided by the network. This depends on the performance of the network, which is determined by the choice of technology and topology and is therefore a system issue.

This transparency also allows any part of the network infrastructure to be replaced or upgraded entirely without implication for the application user. Transfer layer protocols and the network cable plant can be upgraded without affecting the performance of the system.

Thus the user requirement of obsolescence management is achieved.
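The virtual backplane idea can be sketched as follows. This is an illustration only and does not reproduce the APIS service calls, which are defined in the interface control documents; the class and method names (`Transport`, `QueueTransport`, `MessageService`, `send`, `receive`, `rebind`) are hypothetical, and in-memory queues stand in for the real media.

```python
from abc import ABC, abstractmethod
from collections import defaultdict, deque
from typing import Dict, Optional


class Transport(ABC):
    """Whatever carries the data: backplane bus, LAN or WAN (hypothetical)."""

    @abstractmethod
    def deliver(self, endpoint: str, data: bytes) -> None: ...

    @abstractmethod
    def fetch(self, endpoint: str) -> Optional[bytes]: ...


class QueueTransport(Transport):
    """In-memory stand-in for a real medium, used only for illustration."""

    def __init__(self, medium: str) -> None:
        self.medium = medium
        self._queues = defaultdict(deque)

    def deliver(self, endpoint: str, data: bytes) -> None:
        self._queues[endpoint].append(data)

    def fetch(self, endpoint: str) -> Optional[bytes]:
        queue = self._queues[endpoint]
        return queue.popleft() if queue else None


class MessageService:
    """The small set of calls seen by the application user."""

    def __init__(self, routes: Dict[str, Transport]) -> None:
        self._routes = dict(routes)  # endpoint name -> transport in use

    def send(self, endpoint: str, data: bytes) -> None:
        self._routes[endpoint].deliver(endpoint, data)

    def receive(self, endpoint: str) -> Optional[bytes]:
        return self._routes[endpoint].fetch(endpoint)

    def rebind(self, endpoint: str, transport: Transport) -> None:
        """Swap the underlying medium; application calls are unchanged."""
        self._routes[endpoint] = transport
```

The application addresses a local card and a remote console with the same `send` and `receive` calls, and `rebind` models an infrastructure upgrade that leaves the application code untouched. Quality of service is deliberately absent from the interface, mirroring the one capability that remains non-transparent.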

9.1.13 Real-Time Network Protocols

In order for event-triggered, mission-critical, distributed systems to exhibit real-time performance, they require networks offering real-time protocols such as XTP. Such protocols offer flexible, yet dependable message scheduling and dataflow control, while degrading gracefully in cases of transient overload.

9.1.14 Continuous Media Distribution

Integrated distribution of data, image and audio, in both LAN and internetwork topologies using FDDI and ATM with XTP, is possible without compromising the hard deadlines of the critical real-time control data.

Real-time multiplexing of a small number (up to eight) of animation-quality digital video signals over an FDDI LAN is possible if compression techniques are employed. Real-time multiplexing of a larger number (typically up to 64) of animation-quality digital video signals over an ATM LAN is possible with compression.

Sophisticated networked solutions such as digital telephony are possible using high-speed LANs supporting multicast and real-time protocols.
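The need for compression when multiplexing video over FDDI can be seen from a simple bandwidth budget. The sketch below is illustrative only: the 100 Mbit/s figure is the nominal FDDI data rate, while the frame format (640 x 480 pixels, 24 bits per pixel, 25 frames per second) and the 20% of bandwidth reserved for control data are assumptions chosen for the example and are not figures from this study. Protocol overhead is ignored.

```python
FDDI_RATE_BPS = 100_000_000  # nominal FDDI data rate, bits per second


def raw_video_rate_bps(width: int, height: int,
                       bits_per_pixel: int, frames_per_second: int) -> int:
    """Bit rate of one uncompressed video stream."""
    return width * height * bits_per_pixel * frames_per_second


def per_stream_budget_bps(link_rate_bps: float, streams: int,
                          reserved_fraction: float = 0.0) -> float:
    """Bandwidth available to each stream after reserving a fraction of
    the link for other traffic (e.g. critical control data)."""
    usable = link_rate_bps * (1.0 - reserved_fraction)
    return usable / streams


def required_compression_ratio(raw_bps: float, budget_bps: float) -> float:
    """Minimum compression ratio for a stream to fit within its budget."""
    return raw_bps / budget_bps


# Assumed example: eight streams sharing FDDI with 20% reserved.
raw = raw_video_rate_bps(640, 480, 24, 25)
budget = per_stream_budget_bps(FDDI_RATE_BPS, streams=8, reserved_fraction=0.2)
ratio = required_compression_ratio(raw, budget)
```

Under these assumptions a single uncompressed stream needs about 184 Mbit/s, which already exceeds the link rate, whereas each of eight streams has a budget of about 10 Mbit/s, so a compression ratio of roughly 18:1 would be required.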

9.1.15 Matching System Requirements with LAN Technologies

Only rigorous system engineering, including cost/benefit analysis, can determine precisely the appropriate choice of LAN technology and topology for the particular application.

9.1.16 Open Systems Architecture

The solution should be based on an open-systems architecture providing for product obsolescence management, flexibility, upgradeability and lifecycle support.

9.1.17 LAN Profile

While full SAFENET compliance may be considered the ultimate technological objective in terms of network implementation, this should be tailored to meet the specific user requirements in terms of functional performance, dependability and affordability.

The proposed Real-Time LAN Profile offers a good compromise of performance, interconnectivity and affordability.

9.1.18 Building Blocks

As far as possible, the system should be constructed from available, commercial off-the-shelf building blocks, ruggedised if necessary.

One exception to this applies in the area of the Fibre Optic Cable Plant. It is recommended that only fully qualified components, such as cables, connectors and splices, are used unless rigorous analysis shows lower quality components to be appropriate. It is concluded that the extra initial outlay in this area will prove cost-effective in terms of the system lifecycle.

9.2 Significance of the Study

The study has identified and proposed a system solution to real-time, mission-critical, distributed applications using network building blocks conforming to international standards. The result is an implementable system catering for all physical and functional layers, i.e. from the physical cabling, up to the interface with the user's application software. All the layers are functionally decoupled to the maximum extent possible in order to provide for obsolescence management and enhanced system flexibility. The system solution derived from the allocated and derived functional and performance requirements is proposed in terms of a data communications paradigm which meets these requirements and is practical in terms of available technology and affordability.

By matching appropriate technologies and techniques, the proposed network solution is capable of supporting a critical virtual circuit capability to provide dependable, closed-loop, real-time control of critical sensor/actuator sub-systems using local area networks. It is also capable of providing full performance and protocol functionality in internetwork topologies without omitting the network and transport layers.

In order to verify the validity of the proposed solution, an experimental testbed has been constructed to support prototyping of the various elements of the system solution as well as integration of these elements into a concept demonstrator of a complete system, representative of the complex applications of interest. This prototyping falls into both the rapid and evolutionary types. The former is used to validate concepts and support performance measurements, while the latter is used to develop a number of robust, re-useable software products, i.e. implementations of the Xpress Transport Protocol, a Network Time Protocol and application-specific Network Management Services as well as a novel Application Interface Services protocol.

While at the date of writing the concepts have not been operationally qualified, they have been verified in the Experimental Testbed. However, in the 1997/8 timeframe they will be set-to-work and fully functionally qualified in the Ashore Integration Test Facility of a real-life, real-time, mission-critical, distributed system, i.e. a state-of-the-art naval combat suite for the South African Navy. In the 1998/9 timeframe they will be installed, set-to-work and fully qualified at sea in a naval combat surface vessel.

9.3 Limitations of the Study

The study underlying this thesis has approached the problem from a system level, with an attempt having been made to define a complete system with appropriate standard components at each level. The technology of networking is, however, extremely broad, with major defining role players in the military, government, national standardisation organisations and international corporations. These role players are also spread across the world, making culture and geographic separation significant hurdles to true standardisation and the achievement of true open systems (viz. European-dominated ISO and ITU vs US-based ANSI and IEEE). All the players, especially the large corporates, have vested interests in the definition of standards, these interests being either financial or matters of national pride. These realities introduce three factors of uncertainty into the analysis of optimal standards and synthesis of an optimal solution. These are :

• the validity of standards over time
• the validity of technology over time
• the race to market for new technologies and products

The global networking market is so large that the large corporates especially attempt to either define or influence standards by designing and releasing products which they hope will first become the de facto standard and then the accepted standard (viz. IBM with token ring and 62,5 µm optical fibre technologies).

The consequence of these realities for this study is that over its three-year period, many baselines have changed (viz. SAFENET's use of IP and XTP's metamorphosis from a transfer protocol to a transport protocol). While some of the changes have been identified in this thesis, there may be some that have not. There are also many "standards" in the laboratories and boardrooms of the corporates and committee rooms of standards organisations that today are confidential or RFCs (requests for comment) and tomorrow may be standards (true or de facto). Cases in point are CDDI, ATM, Fast Ethernet and the High-Speed Data Bus (HSDB).

In terms of technical limitations, some difficulties were experienced which prevented complete integration and testing of the full protocol profile within the timespan of the project. These included problems with NTP, specifically faulty operation of off-the-shelf versions of Unix, and the unavailability of FDDI device drivers for Unix. Off-the-shelf operating systems offering true open systems functionality and real-time performance were found to be deficient in many areas, which again inhibited progress. Due to the unavailability of suitable off-the-shelf synchronous bandwidth allocators, protocol performance tests using FDDI synchronous mode were never undertaken. Despite these problems, however, theoretical analyses and extrapolation of measured results show that the protocol concept will support the classes of real-time, mission-critical, distributed systems under consideration.

A further limitation of the study was that despite very generous financial support from the project sponsors, the study was by nature very broad and only a limited number of standard off-the-shelf components could be acquired, integrated and tested in each functional area. This limitation is exacerbated by the fact that networking, especially the requirement to perform peer-to-peer and multicast performance testing, requires replication of identical or similar hardware.

9.4 Final Conclusion

As modern society relies to an ever greater extent on industrialisation, automation and communication, the applicability and impact of real-time, mission-critical, distributed systems are gaining in importance. The capabilities and performance of real-time protocols are fundamental to the operation of these systems.

This thesis has identified and described a complete set of protocols, tools and techniques, within the framework of a coherent implementation paradigm, that can be used to synthesize these complex, network-based, real-time, mission-critical distributed systems.

While a systems solution for the present timeframe is identified, a methodology is also proposed which will systematically enable the requirements of next generation systems to be matched to the capabilities and characteristics of technologies of the future.

Issue : 1 1996-07-08 Revision : 2 2006-05-31 Page 201 of 214 ydthsm2.wpd References

References

Issue : 1 1996-07-08 Revision : 2 2006-05-31 Page 202 of 214 ydthsm2.wpd References

References

Standards

[1] European Organisation for the Safety of Air Navigation Common Operational Performance Specifications (COPS) for the Future Controller Operating Environment, Report Version 5, 1991-04-15.

[2] IEEE P1003.1 - POSIX Compliance Test.

[3] IEEE P1003.4 - POSIX Real-Time Extensions.

[4] IEEE P1003.4a - POSIX Threads Interface.

[5] IEEE 802 - IEEE Standards for Local and Metropolitan Area Networks - Overview and Architecture, 1990.

[6] IEEE 802.2 - Logical Link Control, November 1982.

[7] IEEE 802.3 - Carrier Sense Multiple Access with Collision Detect Protocol, 1982.

[8] IEEE 802.4 - Token-Passing Bus LAN, 1985.

[9] IEEE 802.5 - Token-Passing Ring LAN, 1985.

[10] IEEE 896.3 - + Recommended Practice, 1993.

[11] IEEE P1386 - Draft Standard for a Computer Mezzanine Card Family: CMC, Draft 2.0, 1995-04-04.

[12] IEEE P1386.1 - Draft Standard Physical and Environmental Layers for PCI Mezzanine Cards: PMC, Draft 2.0, 1995-04-04.

[13] IEEE 1596 - Scalable Coherent Interface (SCI), 1992.

[14] ISO 8073 - Connection-Oriented Transport Protocol, International Standards Organisation, September 1989.

[15] ISO 8327 - Basic Connection-Oriented Session Protocol, International Standards Organisation, 1985.

[16] ISO 8348 - Connection-Oriented Network Protocol, International Standards Organisation, March 1988.

[17] ISO 8473 - Connectionless Network Protocol, International Standards Organisation, December 1988.

Issue : 1 1996-07-08 Revision : 2 2006-05-31 Page 203 of 214 ydthsm2.wpd References

[18] ISO 8602 - Connectionless Transport Protocol, International Standards Organisation, December 1987.

[19] ISO 8823 - Presentation Protocol, International Standards Organisation, 1985.

[20] ISO 9542 - ES-IS Routing Exchange Protocol, International Standards Organisation.

[21] ISO 10040 - System Management Overview, International Standards Organisation.

[22] FIPS146-1 - U.S. Government Open Systems Interconnection Profile (GOSIP), National Institute of Standards and Technology, 1991-04-03 .

[23] Manufacturing Automation Protocol, Revision 3.0, General Motors Corporation, April 1987.

[24] MIL-STD-1553B - Digital Time Division Command/Response Multiplex Data Bus, Notice II, 1978.

[25] MIL-STD-1777 - Internet Protocol, 1983-08-12.

[26] MIL-STD-1778 - Transmission Control Protocol, 1983-08-12.

[27] MIL-STD-1815A - Ada Language Reference Manual (1983-02-17).

[28] MIL-STD-2204 - Survivable Adaptable Fiber Optic Embedded Network, 1992-10-31.

[29] MIL-STD-2204A - Survivable Adaptable Fiber Optic Embedded Network, SAFENET, 1994-09-30.

[30] PCI Special Interest Group, PCI Local Bus Specification Revision 2.1, 1995-06-01.

[31] X3.139-1987 - ANSI FDDI Media Access Control, 1986-11-05.

[32] X3.148-1988 - ANSI FDDI Physical Layer Protocol, 1988.

[33] X3.166-1988 - ANSI FDDI Physical Media Dependent Protocol, Rev. 9, 1989-03-01.

[34] X3T12 - ANSI FDDI Twisted Pair - Physical Media Dependent.

[35] X3T9/92-067 - ANSI FDDI Station Management, Rev. 7.3, 1994.

[36] X3T9.3 - ANSI Fibre Channel Specification.

[37] Xpress Transfer Protocol (XTP) Definition, Revision 3.6, Protocol Engines, Inc., January 1992.

Issue : 1 1996-07-08 Revision : 2 2006-05-31 Page 204 of 214 ydthsm2.wpd References

[38] Xpress Transport Protocol (XTP) Definition, Revision 4.0, XTP Forum, XTP 95-20, 1995-03-01.

[39] 7498-1984(E) - ISO Opens Systems Interconnection - Basic Reference Model, October 1983.

Reference Documents

[40] Afanasieff L. and Mabry J.P., The Design of the FF-21 Multi-Mission Frigate, Naval Engineers Journal, May 1994.

[41] British Navy to use FDDI Networks, Electronic Design Magazine, March 19, 1992.

[42] Burt A.M., Freestone G. and Jones G., Analysis of Current Communication Protocols and an Assessment of How Well They Meet the Requirements of Platform Management Systems, Proceedings - Tenth Ship Control Symposium, October 1993.

[43] Bux W., Local area networks: A performance comparison, IEEE Transactions on Communications, October 1981.

[44] Calvano C.N. and Riedel J.S., The Regional Deterrence Ship (RDS 2010), Naval Engineers Journal, January 1996.

[45] Case J., Fedor M., Schoffstall M. and Davin J., A Simple Network Management Protocol, Network Working Group Request for Comment No. 1067, August 1988.

[46] Case J., McCloghrie K. and Waldbusser S., Structure of Management Information for Version 2 of the Simple Network Management Protocol, Network Working Group Request for Comment No. 1442, April 1993.

[47] Chang, High Performance Networking: XTP or TCP?, Microelectronics Centre North Carolina, XTP Forum Research Affiliate Annual Report, 1993.

[48] Chesson G., The Evolution of XTP, Procedures of the Third Conference on High Speed Networking, North-Holland, 1991.

[49] Chesson G., The Protocol Engine Chipset, Protocol Engines, Inc. Report No. PEI-92-49, 1991.

[50] Cheriton D.R., VMTP: The Versatile Message Transaction Protocol, Protocol Specification, RFC 1045, Network Information Centre, SRI International, February 1988.

[51] Christie R.W. and Weaver A.C., Supporting Multimedia Traffic Via an XTP-Aware IP Router, Computer Networks Laboratory, Department of Computer Science, University of Virginia, Q4 1994.

[52] Clark D.D., Lamber M.L. and Zhang L., NETBLT: A Bulk Data Transfer Protocol, RFC 998, Network Information Centre, SRI International, March 1987.

Issue : 1 1996-07-08 Revision : 2 2006-05-31 Page 205 of 214 ydthsm2.wpd References

[53] Cohn M., Guidelines for Enhanced Communications Functions for OSI, Transfer Magazine, Protocol Engines, September/October, 1991.

[54] Comsoft, Comsoft LAN Architecture Programmer's Guide, Version 2.3, 1994-03-23.

[55] Comsoft, CLA Performance Figures, Version 1.1, 1994-11-28.

[56] C²I² Systems, Prime Item Development Specification for the Information Management System, Document No. CCII/A500/IMS/6-PIDS, Issue 1.1 dated 1996-01-22.

[57] C²I² Systems, Sub-System Design Document for the Information Management System, Document No. CCII/A500/IMS/6-SDD, Issue 0.1 dated 1995-12-08.

[58] C²I² Systems, Interface Control Document for the Information Management System Application Interface Services, Document No. CCII/A500/IMS/6-ICD/1, Issue 0.2, dated 1996-03-28.

[59] C²I² Systems, Interface Control Document for the Information Management System Network Time Services, Document No. CCII/A500/IMS/6-ICD/3, Issue 0.1, dated 1996-03-14.

[60] C²I² Systems, Interface Control Document for the Information Management System Built-in Test Services, Document No. CCII/A500/IMS/6-ICD/2, Issue 0.1, dated 1996-03-28.

[61] C²I² Systems, Interface Control Document for the Information Management System File Transfer Services, Document No. CCII/A500/IMS/6-ICD/2, Issue 0.2, dated 1996-03-28.

[62] C²I² Systems, Operator Interface Control Document for the Information Management System, Document No. CCII/A500/IMS/6-ICD/6, Issue 0.1, dated 1996-03-28.

[63] Davids P. and Karakbek R., Transport Protocols and Client Server Applications, Transfer Magazine, XTP Forum, Volume 8 Number 1, January/February 1995.

[64] Davids P., Meueser T. and Spaniol O., FDDI: status and perspectives, Computer Networks and ISDN Systems, 1994.

[65] Dempsey B.J., Liebeherr J. and Weaver A.C., A Delay-Sensitive Error Control Scheme for Continuous Media Communications, Proceedings - High Performance Communication Sub- Systems, Williamsburg, VA, September 1-3 1993.

[66] de Pryker M., Asynchronous Transfer Mode - Solution for Broadband ISDN, Ellis Horwood Limited, 1993.

[67] de Rezende J.F., Mauthe A., Hutchinson D. and Fdida S., M-Connection Service: A Multicast Service for Distributed Multimedia Applications, Transfer Magazine, XTP Forum, Volume 8 Number 4, July/August 1995.

[68] Digital Equipment Corporation, A Primer to FDDI: Fiber Distributed Data Interface, 1991.

Issue : 1 1996-07-08 Revision : 2 2006-05-31 Page 206 of 214 ydthsm2.wpd References

[69] Ditizio F.B., Hoyle S.B. and Pruitt H.L., Autonomic Ship Concept, Naval Engineers Journal, September 1995.

[70] Emerging PC LAN Technologies Report, Computer Technology Research Corporation, January 1992.

[71] Eitelberg E., Modelling for Combat System Defensive Error Budgets and Hit Probabilities, NOY Business, 1994-11-08.

[72] Fan C., Luckenbach T. and Xu X., Performance Comparison and Analysis of XTP and TCP/IP over BERKOM Broadband ISDN Network, German National Research Corporation for Computer Science, 1992-10-14.

[73] Ferrerio L., Offboard Command Casualty Launch, Proceedings - RINA International Conference on Interaction between Naval Weapon Systems and Warship Design, December 1990.

[74] Fraser J.H., Design Techniques to Optimise the Combat System Effectiveness of the T23 Frigate, Proceedings - RINA International Conference on Interaction between Naval Weapon Systems and Warship Design, December 1990.

[75] French Ministry of Defence, GAM-T-103 Military Real-Time Local Area Network Reference Model (Transfer Layer), 1987-02-07.

[76] Geary J.W. and Masters M.W., Investigating New Computing Technologies for Shipboard Combat Systems, Naval Engineers Journal, May 1995.

[77] Granum-Jensen M. and Hansen T.N., Analysis of Basic Transmission Networks for Integrated Ship Control Systems, Proceedings - Tenth Ship Control Symposium, October 1993.

[78] Green D.T. and Marlow D.T., Application of LAN Standards to the Navy's Combat Systems, Naval Engineers Journal, May 1990.

[79] Germuska M. and Morgan G., High Performance Networking: XTP or TCP?, Defence Research Agency - Maritime, UK, XTP Forum Research Affiliate Annual Report, 1992.

[80] Gorry G.A., Enterprises Medical Applications of High Speed Networking, Transfer Magazine, XTP Forum, May/June 1993.

[81] Hall O.K. and Stigall P.D., Distributed Flight Control System Using Fiber Distributed Data Interface, FDDI, IEEE AES Magazine, June 1992.

[82] Halsall F., Data Communications, Computer Networks and OSI, Addison-Wesley Publishing Company, 1988.

[83] Hedrick C.L., Introduction to Internet Protocols, Rutgers State University of New Jersey, 1987-07-03.

[84] Hewlett-Packard, HP OpenView Node Manager, Doc. No. 5962-9587E, October 1994.

Issue : 1 1996-07-08 Revision : 2 2006-05-31 Page 207 of 214 ydthsm2.wpd References

[85] High Performance Network Working Group - Available Technologies Subgroup, Available Technologies Final Report, US Navy, 1994-06-02.

[86] High Performance Network Working Group, High Performance Network: Architecture, Services and Requirements, US Navy, 1994-12-07.

[87] IBM Corporation, FDDI I Synchronous Forum White Paper, Version 2.0, 1994-04-09.

[88] IEEE 802.2-94/139, Functional Requirements for LLC Type 4, Draft 6, September 1994.

[89] Irey P.M. and Marlow D.T., Integrating XTP with Existing Protocols, XTP Forum Research Affiliate Annual Report, 1993.

[90] Jordan A.F., On the Brink: Fiber Optic LANs for Avionics and Space, Defense Electronics Magazine, October 1993.

[91] Kahn R. and Mitchell B., Upgrading a General Purpose Shipboard Data Network, Proceedings - Tenth Ship Control Symposium, October 1993.

[92] Knudsen D.R., Brown G.D., Ingold J.P. and Spence S.E., A Ship-Wide System Engineering Approach for Fiber Optics for Surface Combatants, Naval Engineers Journal, May 1990.

[93] Kopetz H. and Grüsteidl G., TTP - A Protocol for Fault-Tolerant Real-Time Systems, Distributed Systems, IEEE Computer, Vol. 27 No. 1, January 1994.

[94] Kopetz H. and Veríssimo P., Real Time and Dependability Concepts, Distributed Systems, Second Edition, ACM Press, 1993.

[95] Laniewski G., Multipoint Communication in Local Area Networks, PhD Thesis, University of Witwatersrand, 1991.

[96] Lynx Real-Time Systems, A General Overview of LynxOS, Revision 1.7, 1994-06-03.

[97] Mapp G., Preliminary Performance Evaluation of SandiaXTP on ATM at ORL, Olivetti Research Limited, 1995.

[98] Malcolm N. and Zhao W., A Timed-Token Protocol for Real-Time Communications, IEEE Computer, Volume 27 Number 1, January 1994.

[99] Mazzaferro J.F. and Dell'Acqua A.A., FDDI Technology Report, Computer Technology Research Corporation, April 1991.

[100] McCloghrie K. and Rose M., Management Information Base for Network Management of TCP/IP-based Internets, Network Working Group Request for Comment No. 1066, August 1988.

[101] McNabb J., XTP 4.0 SNMP MIB, Transfer Magazine, XTP Forum, Volume 8 Number 3, May/June 1995.

Issue : 1 1996-07-08 Revision : 2 2006-05-31 Page 208 of 214 ydthsm2.wpd References

[102] Michel J.R., Waterman A.S. and Weaver A.C., Performance Evaluation of an Off-Host Communications Architecture, Proceedings - High Performance Communication Sub-Systems, Williamsburg, VA, September 1-3 1993.

[103] MIL-HDBK-818A - Survivable Adaptable Fiber Optic Embedded Network, Network Development Guidance, 1994-09-30.

[104] Mills D.L., Improved Algorithms for Synchronising Computer Network Clocks, IEEE Transactions on Networking, June 1995.

[105] Mills D.L., Network Time Protocol, Version 3 Specification, Implementation and Analysis, RFC-1305, March 1992.

[106] MIL-STD-1553B and the Next Generation - Conference Volume, ERA Technology, 1989.

[107] Mirchandani S. and Khana R., FDDI Technology and Applications, John Wiley and Sons, Inc., 1993.

[108] Mittura A. and Karp M.S., Combat System Engineering: A Return to Fundamentals, Naval Engineers Journal, May 1993.

[109] Mullender S., Interprocess Communication, Distributed Systems, Second Edition, ACM Press, 1993.

[110] Oerlikon Pocket-Book, Werkzeugmaschinenfabrik Oerlikon-Bührle AG, 1981.

[111] Rodd M.G. and Deravi F., Communications Systems for Factory Automation, University of Wales, 1987.

[112] Rose M. and McCloghrie K., Structure and Identification of Management Information for TCP/IP-based Internets, Network Working Group Request for Comment No. 1155, May 1990.

[113] Ross F.E., FDDI - A Perspective, Fiber Optics Sourcebook, (extract undated).

[114] Saunders R.M. and Weaver A.C., The Xpress Transfer Protocol - A Tutorial, Computer Networks Laboratory, Department of Computer Science, University of Virginia, undated.

[115] Schroeder M.D., A State-of-the-Art Distributed System: Computing with BOB, Distributed Systems, Second Edition, ACM Press, 1993.

[116] Sevcik K.C. and Johnson M.J., Cycle Time Properties of the FDDI Token Ring Protocol, IEEE Transactions on Software Engineering, Vol. SE-13, March 1987.

[117] Simoncic R., Weaver A.C., Cain B. and Colvin M.A., Shipnet: A Real-Time Local Area Network for Ships, Proceedings - 13th Annual Conference on Local Computer Networks, Minneapolis, MN, October 10-12 1988.

[118] Sha L. and Sathaye S.S., A Systematic Approach to Designing Distributed Real-Time Systems, IEEE Computer, Volume 26 Number 9, September 1993.

Issue : 1 1996-07-08 Revision : 2 2006-05-31 Page 209 of 214 ydthsm2.wpd References

[119] Sha L., Industrial Computing - A Grand Challenge, IEEE Computer, Volume 27 Number 1, January 1994.

[120] Shostak S., The Human Eye as an Imaging System, Advanced Imaging Magazine, August 1992.

[121] Sparrius A., The System Acquisition Process, Course Notes, 1986.

[122] Stankovic J.A. and Ramamritham K., Tutorial - Hard Real-Time Systems, IEEE Computer Society, 1988.

[123] Stankovic J.A., Real-Time Computing Systems: The Next Generation, IEEE Computer Society, February 19, 1988.

[124] Strayer W.T. and Weaver A.C., Is XTP Suitable for Distributed Real-Time Systems?, Proceedings - International Workshop on Advanced Communications and Applications for High Speed Networks, Munich, Germany, March 16-19, 1992.

[125] Strayer W.T., Dempsey B.J. and Weaver A.C., XTP: The Xpress Transfer Protocol, Addison Wesley Publishing Company, 1992.

[126] Strayer W.T., Protocol Changes for XTP, Transfer Magazine, XTP Forum, May/June 1994.

[127] Svobodova L., Implementing OSI Systems, Selected Areas in Communication, Volume 7 Number 7, September 1989.

[128] Szuprowicz B.O., Multimedia Networking and Communications, Computer Technology Research Corporation, 1994.

[129] Trewitt G., Local-Area Internetworks: Measurement and Analysis, PhD Thesis, Stanford University, March 1990.

[130] UK Ministry of Defence, Principles of Combat System Highway Engineering on the Type 23 Frigate, Issue 3, 1986-08-14.

[131] US Army Management Engineering Training Agency, The System Engineering Process, undated.

[132] van Tyle S., Bringing Standards to Embedded Systems Design, Electronic Design Magazine, May 1, 1996.

[133] van Zyl D., Report on Simulation Study Done to Establish Acceptable Limits on Timing Errors between Tracker and Guns (classified), Issue : A, Reutech Systems, 1996-07-18.

[134] Veríssimo P., Real-Time Communication, Distributed Systems, Second Edition, ACM Press, 1993.

[135] Warren J.R. and Raimbault F., FDDI Synchronous Forum "Implementer's Agreement", Rev. 3, IBM Corporation, 1995-07-21.

Issue : 1 1996-07-08 Revision : 2 2006-05-31 Page 210 of 214 ydthsm2.wpd References

[136] Watson R.W., Delta-t Protocols Specification, Lawrence Livermore Laboratory, 1983-04-15.

[137] Weaver A.C., XTP: A Communications Protocol for Real-Time Distributed Systems, Proceedings - IECON '93, Maui, Hawaii, November 1993.

[138] Weaver A.C., XTP: A New Communications Protocol for Factory Automation, Proceedings - International Workshop on Emerging Technologies and Factory Automation, Cairns, Australia, August 17-19, 1992.

[139] Weaver A.C. and McNabb J., Digitised Voice Distribution using XTP and FDDI, Proceedings - 17th Local Computer Networks Conference, Minneapolis, MN, September 13-16, 1992.

[140] Weaver A.C., Teleradiology Using XTP over ATM, Transfer Magazine, XTP Forum, Volume 6 Number 3, May/June 1993.

[141] Weaver A.C., Xpress Transport Protocol Version 4, Transfer Magazine, XTP Forum, Volume 7 Number 6, November/December 1994.

[142] Weaver A.C., Network Xpress Delivers XTP 4.0, Transfer Magazine, XTP Forum, Volume 8 Number 1, January/February 1995.

[143] Weaver A.C. and Butler D.W., A Fault-Tolerant Network Protocol for Real-Time Communications, IEEE Transactions on Industrial Electronics, Volume IE-33, Number 3, August 1986.

[144] Weihl W.E., Specifications of Concurrent and Distributed Systems, Distributed Systems, Second Edition, ACM Press, 1993.

[145] Wind River Systems, VxWorks, Doc. No. MCL-DST-VX-52, 1995.

[146] Young R.M., Information Management Infrastructure for an Integrated Combat Suite Architecture, Proceedings - RINA International Conference on Information Technology and Warships, December 1991.

[147] Young R.M., Real-Time Distributed System Architecture using Local Area Networks, MSc(Eng) Dissertation, University of Cape Town, 1992-10-16.

[148] Zitzman L.H., Falatko S.M. and Papach J.L., Computer System Architecture Concepts for Future Combat Systems, Naval Engineers Journal, May 1990.

Appendices

Appendix A

Physical Layer LAN Media and Protocols

Appendix A

1. Scope
   1.1 Scope
   1.2 Introduction
   1.3 Appendix Layout

2. Cable Layer Technologies
   2.1 Media
       2.1.1 Pneumatic
       2.1.2 Hydraulic Systems
       2.1.3 Radio Frequency
       2.1.4 Optical Beam
       2.1.5 Metallic Media
             2.1.5.1 Two-Wire Open Line
             2.1.5.2 Unshielded Twisted Pair
             2.1.5.3 Shielded Twisted Pair
             2.1.5.4 Screened UTP
             2.1.5.5 Coaxial Cable
       2.1.6 Optical Fibre Cable
             2.1.6.1 Multimode
             2.1.6.2 Singlemode
   2.2 Cable Layer Technologies - Performance Summary

3. Physical Layer Protocols
   3.1 Encoding Schemes
       3.1.1 RZ
       3.1.2 Manchester Biphase
       3.1.3 NRZI
       3.1.4 4B/5B
       3.1.5 8B/10B
       3.1.6 17B/20B
       3.1.7 MLT-3

4. Physical Layer Technologies - Implementations
   4.1 MIL-STD-1553
   4.2 Ethernet
   4.3 Token Ring Schemes
       4.3.1 IBM Token Ring (IEEE 802.5)
       4.3.2 FDDI
       4.3.3 CDDI
   4.4 LAN Technologies - Performance Summary

5. Conclusions
   5.1 Performance
   5.2 Cost Effectiveness
   5.3 Media Bulk and Mass
   5.4 Electromagnetic Compatibility
   5.5 50µ/125µ vs 62,5µ/125µ Multimode Fibre

6. Recommendations
   6.1 Cable Layer
       6.1.1 Fibre Optic Media
       6.1.2 Copper Media
       6.1.3 Upgradeability
       6.1.4 Optical Power Analysis
   6.2 Physical Layer Protocols
       6.2.1 Encoding Complexity
       6.2.2 MLT-3 Encoding

List of Tables

Table AI : Cable Layer Technologies - Performance Summary
Table AII : LAN Technologies - Performance Summary

1. Scope

1.1 Scope

This appendix describes the detailed characteristics of Physical Layer (i.e. Layer 1) LAN media and protocols, specifically in terms of their performance and their suitability or otherwise for the implementation of real-time, mission-critical, distributed systems.

1.2 Introduction

The International Organization for Standardization (ISO) defined, and in 1984 published, a framework for communications standards which partitions the functions required for communication into seven layers. This was termed the Open Systems Interconnection (OSI) Basic Reference Model. Layer 1 of the OSI model is termed the Physical Layer (PL). The functions of the PL protocol are to encode data packets and transfer them bit-wise over the physical medium.

The OSI Physical Layer excludes the physical media. However, there is close interaction between the physical media and the physical layer protocols. The physical media characteristics also impact on system design considerations, e.g. size, mass and electromagnetic compatibility. LAN Profiles such as SAFENET therefore include the physical media. In this thesis the physical media are defined as the Cable Layer.

The functions of the PL protocol are fundamental to the suitability of local area networks for real-time, mission-critical, distributed systems as they determine such capabilities as data throughput, electromagnetic compatibility and the maximum physical size of the LAN (i.e. geographic range).

Physical Layer issues include media type and characteristics as well as encoding mechanisms. While media type issues are not directly protocol issues, they fundamentally affect LAN considerations such as reliability and topology which in turn have extensive protocol implications for real-time, mission-critical, distributed systems.

1.3 Appendix Layout

The appendix commences with an overview of the characteristics of available Physical Layer media and protocols and proceeds with descriptions of specific implementations thereof. Important implications of these characteristics and implementations are then analysed within the context of real-time, mission-critical, distributed systems.

Specific conclusions and recommendations are then made, with the most significant of these being adopted in the main section. In particular, specific options are recommended for inclusion in the Real-Time LAN Profile described in the main section.

2. Cable Layer Technologies

While the physical media are outside of the scope of the ISO Physical Layer, their characteristics are important to the Physical Layer encoding schemes and their suitability for the implementation of real-time, distributed systems.

2.1 Media

2.1.1 Pneumatic

In the past (up until the early 1960s) pneumatic systems were used to convey logic and even analogue signals in industrial process control plants. Typically these systems employed 3-15 psi (pounds per square inch) air pressure. These systems suffer from low data rates, poor resolution (due to pneumatic "noise") as well as reliability problems due to punctures, leaks in the distribution piping and compressor problems. Pneumatic systems are, however, immune to electromagnetic interference and do not create radio frequency interference.

2.1.2 Hydraulic Systems

Rather than compressed air, hydraulic systems convey logic signals by means of pressurised fluids. In the 1960s and 70s such systems also found extensive application in process control plants and even in avionic control systems.

Like pneumatic systems, hydraulic systems suffer from many problems. However, there are especially severe consequences of leaks and punctures due to the damaging effects of the escaping fluid.

2.1.3 Radio Frequency

Transmission of digital data is possible using electromagnetic radio frequency (RF) energy. Local area networks can use this method for data transmission to achieve modest data rates over limited geographic ranges. Such systems suffer from modest performance and low reliability due to the vagaries of the electromagnetic environment, and are susceptible to intercept and intentional denial (jamming). Modern spread-spectrum techniques are, however, now being employed to reduce these effects. RF LANs are nevertheless useful and highly flexible for certain applications such as office LANs, but rarely for real-time, mission-critical, distributed systems.

2.1.4 Optical Beam

High-speed laser links operate using optical beams, converting signals from fibre or wire media to a laser beam and thereby allowing the transmission of data through the atmosphere or free space.

Such transmission techniques are capable of high bandwidth due to the intrinsically high bandwidth of optical frequencies. Typically, transfer rates of 100 Mbit s⁻¹ (e.g. FDDI) or 155 Mbit s⁻¹ (e.g. ATM) over distances of up to 2 000 m are possible under most weather conditions, while distances of 5 000 m are achievable, albeit with lower availability and higher error rates.

LED links convert signals from fibre or wire media to a narrowband light beam, allowing the transmission of signals of up to 50 Mbit s⁻¹ over distances of up to 5 000 m through the atmosphere.

Both types of links require focused and aligned optical equipment as well as atmospheric stability. They thus suffer from reliability problems due to misalignment as well as the vagaries of prevailing atmospheric conditions such as rain, fog and snow. A further limitation is that only point-to-point links are possible.

These types of links are capable of very high performance and protocol independence. This, as well as providing the capability to circumvent the requirement for the time-consuming and expensive laying of cables, makes these types of links very useful in certain circumstances.

2.1.5 Metallic Media

Metallic media for data applications normally consist of fine gauge copper or, occasionally, aluminium wire. Copper media feature low cost and moderate bandwidth (up to 200 Mbit s⁻¹ over 100 m) and are easy to work with (i.e. soldering or wrapping).

Metallic media suffer from poor electromagnetic compatibility, being susceptible to and emitting radio frequency interference. They are also susceptible to intercept as signal power losses are difficult to detect and unauthorised connection is simple. Wire media can conduct electrical power, making wire interconnects a safety hazard due to electrical shock in cases of power equipment fault. Wire media also conduct power surges emanating from lightning and atmospheric electromagnetic pulse (e.g. nuclear explosion). These factors are serious considerations for platform vulnerability, either from natural or deliberate causes. As wire media carry data by means of electrical power, electrical discharges can occur under cable fault conditions. Such discharges can be catastrophic in explosive environments such as those for the storage of highly volatile ordnance, petro-chemical plants or explosives manufacturing plants. Wire media offer no electrical isolation and therefore suffer from the perennial and ubiquitous ground loop problem. Wire media are also prone to corrosion, especially where joined or bonded. Such problems are especially severe when metallic media are used for the transmission of high frequency data. For such applications soldering is inappropriate and wire wrapping techniques must be employed. While the integrity of soldered joints may initially appear good, small amounts of corrosion soon render the joint inoperative or intermittent.

Metallic media have high mass to length and data rate ratios; this is an extremely important consideration in mass-critical applications such as space avionics, vetronics and even aboard ships.

2.1.5.1 Two-Wire Open Line

Two-Wire Open Line is the simplest type of transmission medium. In such media, each wire is insulated from the other and both are open to free space.

Such media are capable of only modest transmission distances, typically up to 50 m, and modest transmission rates, typically up to 20 kbit s⁻¹. They suffer from electromagnetic interference, specifically crosstalk from adjacent pairs due to capacitive coupling between the two wires. They also emit electromagnetic radiation. This type of medium is especially susceptible because interference can be picked up in just one of the wires, leading to erroneous signal interpretation by the receiver, i.e. such media are poor at rejecting common-mode interference.

2.1.5.2 Unshielded Twisted Pair

Unshielded Twisted Pair (UTP) consists of a pair of wires twisted together in order that the two wires are always in very close proximity to one another, thus ensuring pickup of spurious signals in both conductors and therefore effecting common-mode rejection.
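
The difference between the open-line and twisted-pair cases can be illustrated numerically. The following is a minimal sketch, assuming an idealised differential receiver that simply takes the difference of the two wire voltages. The function name and the sample signal and interference values are illustrative assumptions only and are not taken from any standard.

```python
def differential_receive(wire_a, wire_b):
    """Idealised differential receiver: output is the voltage difference."""
    return [a - b for a, b in zip(wire_a, wire_b)]

# Hypothetical transmitted symbols and interference samples (arbitrary units).
signal = [1.0, -1.0, 1.0, 1.0, -1.0]
noise = [0.25, -0.5, 0.75, 0.125, -0.25]

# Each wire carries half of the differential signal with opposite polarity.
# Twisted pair: both wires are assumed to pick up the same interference.
twisted_a = [+s / 2 + n for s, n in zip(signal, noise)]
twisted_b = [-s / 2 + n for s, n in zip(signal, noise)]

# Open line: interference is picked up in just one of the wires.
open_a = [+s / 2 + n for s, n in zip(signal, noise)]
open_b = [-s / 2 for s in signal]

twisted_out = differential_receive(twisted_a, twisted_b)
open_out = differential_receive(open_a, open_b)

print(twisted_out)  # common-mode interference cancels in the difference
print(open_out)     # interference on one wire remains in the output
```

In practice the pickup in the two conductors of a twisted pair is only approximately equal, so the cancellation is never as complete as in this idealised case.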

UTP media feature very low cost, moderate bandwidth (up to 50 Mbit s⁻¹ over 100 m using signal processing techniques), low volume and are easy to work with. A major consideration for the use of UTP for office LANs is that many buildings have multi-pair internal telephone cabling laid when being built. As a major proportion of the installation cost of office LANs is cabling, this factor makes it very attractive to use UTP. These days (mid-1990s) most office LANs employ Ethernet over UTP (i.e. 10BaseT) while a large number of IBM Token Ring sites are also switching from STP to UTP.

UTP is suited to moderate bandwidth applications for workstation connection in office LAN environments where costs are a critical consideration and electromagnetic compatibility considerations are unimportant. Generally, UTP is inappropriate for mission-critical applications.

Copper as a transmission medium shows low-pass frequency characteristics. This means that higher frequencies are attenuated to a greater extent than lower frequencies. For example, with NRZI encoding, the main frequency of a 100 Mbit s⁻¹ data signal is about 62,5 MHz. The signal attenuation is relatively high if UTP cabling is used with such a high frequency. Besides this, a greater level of electromagnetic radiation is emitted. Shielding the cables is the best method to reduce this.

2.1.5.3 Shielded Twisted Pair

Shielded Twisted Pair (STP) consists of a twisted wire pair together with an outer shield to protect the internal wire pair from interference effects and to attenuate radiation. The outer shield consists of a braided jacket of multi-stranded fine copper conductors. STP thus offers reasonable levels of electromagnetic compatibility.

STP media feature moderate cost and moderate bandwidth (up to 200 Mbit s⁻¹ over 100 m using signal processing techniques). The shield increases the volume of STP, often making it comparatively difficult to work with.

STP was originally the specified medium for IBM Token Ring; however, its cost and awkwardness, for little advantage in the cost-sensitive, non-critical environment of office LANs, are causing a switch from STP to UTP.

STP is suited to low to moderate bandwidth applications in factory LAN environments where costs are not the primary consideration, optical fibre cannot be justified and electromagnetic compatibility considerations are important.

2.1.5.4 Screened UTP

The low cost and ease of installation of UTP make it attractive for office LANs. However, normal UTP has modest bandwidth capability and is inappropriate for high-speed technologies such as 100 Mbit s⁻¹ FDDI or 155 Mbit s⁻¹ ATM. High quality UTP featuring a screen has therefore been developed to support high-speed LAN technologies while retaining low to moderate cost. Usually this screen is a simple metallic foil wrapped around each twisted pair in the bundle as well as around the bundle of pairs. Screened UTP is often loosely referred to as Category 5 UTP, although Category 5 strictly denotes a transmission performance grade rather than the presence of a screen.

Screened UTP is suited to office LAN applications where costs are an important consideration, where electromagnetic compatibility considerations are unimportant and higher speed networking is or will be required. Screened UTP is suitable for real-time applications where connectivity ranges are short and electromagnetic compatibility requirements are not important.

2.1.5.5 Coaxial Cable

Coaxial cable consists of a solid central conductor running concentrically inside a solid or braided outer conductor of circular cross-section. The space between the conductors is filled with a dielectric insulating material.

Due to its geometry, the centre conductor is effectively shielded from external interference as well as producing minimal skin effect and radiation losses.

Before the advent of 10BaseT, the primary medium for Ethernet applications was coaxial cable. Initially 10Base5 (Thicknet) was the standard and later 10Base2 (Thinnet or Cheapernet) began to be used more widely due to the latter's significantly lower cost and the ease with which it can be worked.

Coaxial wire media are suited to moderate bandwidth applications in office LAN environments where costs are an important consideration and electromagnetic compatibility considerations are a factor.

10Base5 media are sometimes used for mission-critical applications where fibre cannot be justified. 10Base2 media are normally inappropriate for such applications as the connectorising method, consisting of bayonet-type T-pieces and cable plugs, is extremely unreliable, with any one connector fault causing the entire LAN to fail.

2.1.6 Optical Fibre Cable

Optical fibre cable consists of optically transparent fibres constructed concentrically with strengthening members and protective jacketing. The optical fibres transmit data by means of a discrete (digital) or continuous (analogue) light beam. The optical fibres are normally manufactured from purified and processed glass (silica), but plastic fibre can also be used. The cable may contain from one to several hundred fibres, each capable of supporting several hundreds of Mbit s⁻¹ over distances of up to several kilometres (for glass fibre). Due to the relative optical opacity of plastic, only moderate data rates (tens of Mbit s⁻¹) and distances (tens of metres) can be achieved with plastic fibre.

While optical fibres can transmit vast volumes of data, they are ineffective for transmitting power, although laser beams carrying certain levels of optical power can be transmitted by optical fibre.

Optical fibre technology is fairly recent and consequently cost is still a factor. Optical fibre cable is also fairly difficult to work with, requiring special terminating and splicing techniques involving specialised training and expensive equipment. Fibre optic connectors are also orders of magnitude more expensive than wire-type media connectors.

Optical fibre media exhibit excellent electromagnetic compatibility by being neither susceptible to nor emitting radio frequency interference. They are also not easily susceptible to intercept as signal power losses are comparatively easy to detect and unauthorised connection is difficult. As optical fibre media cannot conduct electrical power, they are intrinsically safe from electrical shock hazard as well as that due to lightning or atmospheric electromagnetic pulse (e.g. nuclear explosion). Optical fibre can, however, darken due to nuclear radiation effects with a resultant degradation of transmission effectiveness. Optical fibre media do not suffer from ground loop problems or corrosion.

Optical fibre media have low mass to length and data rate ratios making them ideal for mass-critical applications such as space avionics and mobile platforms.

Optical fibres may be categorised by their mode of light propagation, the internal and external diameters of the fibre as well as the wavelength of the fundamental frequency of light transmission. Typically the wavelength of the fundamental frequency is 1 300 nm.

2.1.6.1 Multimode

Multimode fibre is so termed due to the multiple modes of light propagation within the optical media. Multimode fibre is formed by chemically doping pure silica in such a way as to modify the internal refractive index of the fibre. This effectively forms an optical waveguide with the light signals being constricted to the fibre by the refractive barrier.

Multimode fibre for LAN applications typically has internal fibre diameters of 50; 62,5; 85 and 100 µm and external diameters of 125 and 140 µm.

Multimode optical fibre transmission normally uses light emitting diodes (LEDs) as light sources. These devices represent a significant cost proportion of fibre network interfaces which normally cost considerably more than non-fibre types. This has been a major factor in the somewhat limited deployment of fibre media for LAN applications.

Typical bandwidths for multimode fibre are 800 Mbit s⁻¹ over distances of 2 000 m for 50 µm fibre and 600 Mbit s⁻¹ for 62,5 µm fibre.

2.1.6.2 Singlemode

Singlemode fibre is so termed due to the single mode of light propagation within the optical media. Singlemode fibre is formed from ultra-pure silica.

Singlemode fibre for LAN applications typically has internal fibre diameters of 7 and 9 µm and an external diameter of 125 µm.

Singlemode optical fibre transmission normally uses lasers as light sources. These devices are expensive; this has inhibited development of singlemode fibre network interfaces except for specialised applications. Such applications include longhaul network segments (2 to 40 km), ultra high-speed links and public carrier networks.

Typical bandwidths for singlemode fibre are 10 Gbit s⁻¹ over distances of several tens of kilometres.

Despite their higher cost, optical fibre media offer optimal performance for real-time, mission-critical networks, especially in harsh electromagnetic environments, where safety factors are imperative and where high speed over long distances is required.

2.2 Cable Layer Technologies - Performance Summary

Table AI below provides typical performance capabilities of common cable layer LAN technologies :

Type                Bandwidth       Distance    EMC         Cost
2-Wire Open Line    20 kbit s⁻¹     50 m        Very Poor   Very Low
UTP                 50 Mbit s⁻¹     100 m       Poor        Very Low
Screened UTP        155 Mbit s⁻¹    100 m       Moderate    Low
STP                 200 Mbit s⁻¹    100 m       Good        Moderate
Coaxial             50 Mbit s⁻¹     50 m        Good        Moderate
Multimode Fibre     800 Mbit s⁻¹    2 000 m     Excellent   Moderate to High
Singlemode Fibre    10 Gbit s⁻¹     10 000 m    Excellent   High

Table AI : Cable Layer Technologies - Performance Summary
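
The figures in Table AI lend themselves to a simple screening of candidate media against a set of requirements. The following is a minimal sketch only: the `Medium` record, the `candidates` function and the ranking of the qualitative EMC ratings are illustrative assumptions, while the numeric values are transcribed from the table.

```python
from typing import NamedTuple

class Medium(NamedTuple):
    name: str
    bandwidth_mbit_s: float
    distance_m: int
    emc: str
    cost: str

# Typical figures transcribed from Table AI.
MEDIA = [
    Medium("2-Wire Open Line", 0.02, 50, "Very Poor", "Very Low"),
    Medium("UTP", 50, 100, "Poor", "Very Low"),
    Medium("Screened UTP", 155, 100, "Moderate", "Low"),
    Medium("STP", 200, 100, "Good", "Moderate"),
    Medium("Coaxial", 50, 50, "Good", "Moderate"),
    Medium("Multimode Fibre", 800, 2000, "Excellent", "Moderate to High"),
    Medium("Singlemode Fibre", 10000, 10000, "Excellent", "High"),
]

# Assumed ordering of the qualitative EMC ratings, worst to best.
EMC_RANK = ["Very Poor", "Poor", "Moderate", "Good", "Excellent"]

def candidates(min_mbit_s, min_distance_m, min_emc):
    """Names of media meeting all three minimum requirements."""
    floor = EMC_RANK.index(min_emc)
    return [m.name for m in MEDIA
            if m.bandwidth_mbit_s >= min_mbit_s
            and m.distance_m >= min_distance_m
            and EMC_RANK.index(m.emc) >= floor]

short_run = candidates(100, 100, "Good")
long_run = candidates(100, 1000, "Good")
print(short_run)
print(long_run)
```

Under these assumptions only the fibre media remain once the required distance exceeds the 100 m typical of the copper media.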

3. Physical Layer Protocols

Physical layer protocols are required to encode data for transmission over the physical media of the Cable Layer.

3.1 Encoding Schemes

Digital data transmission networks require data to be encoded for two reasons :

• To ensure a suitable number of transitions to reliably extract signalling synchronisation, i.e. clock recovery.

• To ensure that over a reasonable period of time an equal number of HIGH and LOW logic states occur so that the average is zero and power is not transmitted down the line (in wire media).

Encoding schemes differ in complexity with increasing complexity achieving increased efficiencies of use of the medium's potential bandwidth.
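
The two requirements above can be quantified for any sequence of line levels. The following is a minimal sketch, assuming bipolar line levels of +1 and -1; the function names and the sample bit patterns are illustrative assumptions.

```python
def count_transitions(levels):
    """Number of level changes between consecutive signal elements."""
    return sum(1 for a, b in zip(levels, levels[1:]) if a != b)

def dc_offset(levels):
    """Mean signal level; zero means no net DC component on the line."""
    return sum(levels) / len(levels)

def to_levels(bits):
    """Unencoded mapping: one -> +1, zero -> -1."""
    return [+1 if bit else -1 for bit in bits]

# A long run of like bits: few transitions and a large DC offset.
run = to_levels([1, 1, 1, 1, 1, 1, 1, 0])
# An alternating pattern: many transitions and no DC offset.
alternating = to_levels([1, 0, 1, 0, 1, 0, 1, 0])

print(count_transitions(run), dc_offset(run))
print(count_transitions(alternating), dc_offset(alternating))
```

An encoding scheme aims to make every data pattern behave more like the second case than the first, at the cost of some of the medium's bandwidth.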

Common encoding schemes in order of increasing complexity are :

3.1.1 RZ

RZ (Return to Zero) is a 3-state, bipolar encoding scheme whereby a positive pulse represents a one and a negative pulse represents a zero. A drawback of this scheme is that consecutive like bits of data can cause a voltage offset on the physical media.
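
The scheme and its drawback can be sketched as follows. This is an illustration only, assuming that each bit period is divided into two halves with the pulse in the first half and the return to zero in the second; the function names are hypothetical.

```python
def rz_encode(bits):
    """Bipolar RZ: +1 pulse for a one, -1 pulse for a zero, then zero."""
    levels = []
    for bit in bits:
        levels.append(+1 if bit else -1)  # pulse in first half of bit period
        levels.append(0)                  # return to zero in second half
    return levels

def mean_level(levels):
    """Average line level; non-zero indicates a voltage offset."""
    return sum(levels) / len(levels)

mixed = rz_encode([1, 0, 1, 0])
like_bits = rz_encode([1, 1, 1, 1])

print(mixed, mean_level(mixed))
print(like_bits, mean_level(like_bits))
```

With alternating data the positive and negative pulses cancel, whereas a run of like bits leaves a net offset on the medium.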

3.1.2 Manchester Biphase

Manchester Biphase encoding is a form of Non-Return-to-Zero (NRZ) encoding used with asynchronous signalling schemes. Every bit period contains a mid-bit transition, so decoding of the bit stream also extracts the clock information. As up to two signal transitions are required per bit, the scheme is 50% efficient in terms of bandwidth utilisation.

The MIL-STD-1553 Multiplex Databus uses Manchester Biphase encoding. This means that the 1 Mbit s⁻¹ data stream uses 2 MHz bandwidth.
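
A minimal sketch of Manchester encoding and decoding follows. It assumes the polarity convention in which a one is sent as a high half-bit followed by a low half-bit and a zero as the reverse; other systems use the opposite convention, and the function names are illustrative assumptions.

```python
def manchester_encode(bits):
    """Two signal elements per bit, with a transition in mid-bit."""
    levels = []
    for bit in bits:
        levels.extend([+1, -1] if bit else [-1, +1])
    return levels

def manchester_decode(levels):
    """Recover bits from the direction of each mid-bit transition."""
    pairs = zip(levels[0::2], levels[1::2])
    return [1 if first > second else 0 for first, second in pairs]

data = [1, 0, 0, 1, 1, 1]
line = manchester_encode(data)

# Each data bit occupies two signal elements, hence 50% efficiency.
elements_per_bit = len(line) / len(data)
signalling_rate_mbaud = 1 * elements_per_bit  # for a 1 Mbit/s data stream

print(line)
print(sum(line))               # equal numbers of high and low elements
print(signalling_rate_mbaud)
```

Because every bit contributes one high and one low element, the encoded stream is DC balanced regardless of the data content.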

3.1.3 NRZI

NRZI (Non-Return to Zero Invert ones) is a 2-state encoding scheme whereby a transition between high and low states represents a one; no transition represents a zero. NRZI minimises bandwidth required by reducing the number of transitions in the data stream, thus allowing the use of less expensive optical components.

FDDI employs NRZI.
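
The NRZI rule can be sketched in a few lines. This is an illustration only; the function names and the assumed initial line level of 0 are hypothetical choices.

```python
def nrzi_encode(bits, initial_level=0):
    """Toggle the line level for a one; hold it for a zero."""
    level = initial_level
    levels = []
    for bit in bits:
        if bit:
            level ^= 1
        levels.append(level)
    return levels

def nrzi_decode(levels, initial_level=0):
    """A change of level decodes as a one; no change decodes as a zero."""
    bits = []
    previous = initial_level
    for level in levels:
        bits.append(1 if level != previous else 0)
        previous = level
    return bits

data = [1, 0, 1, 1, 0, 0, 1]
line = nrzi_encode(data)
print(line)
print(nrzi_decode(line))
```

Note that a run of zeroes produces no transitions at all, which is why NRZI is combined with a block code when clock recovery must be guaranteed.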

3.1.4 4B/5B

4B/5B is the encoding process by which 4-bit symbols are converted to 5-bit symbols by the addition of a fifth bit. The extra bit is added for clocking purposes. Each 5-bit code group is designed not to contain more than 3 zeroes in a row in order that synchronisation is never lost.

As there are 4 useful data bits for every 5 transmitted bits, the encoding scheme is 80% efficient.

FDDI employs 4B/5B encoding. This means that the 100 Mbit s⁻¹ data stream is transmitted at a line rate of 125 Mbaud. Due to NRZI encoding, the primary frequency is half of this, i.e. 62,5 MHz.
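
The encoding and the associated arithmetic can be sketched as follows. The code table is the commonly published set of 4B/5B data code groups; the function names are illustrative assumptions, and control symbols are omitted.

```python
# Commonly published 4B/5B data code groups, indexed by 4-bit value.
CODE_GROUPS = [
    "11110", "01001", "10100", "10101", "01010", "01011", "01110", "01111",
    "10010", "10011", "10110", "10111", "11010", "11011", "11100", "11101",
]

def encode_4b5b(data: bytes) -> str:
    """Encode each byte as two 5-bit code groups, high nibble first."""
    groups = []
    for byte in data:
        groups.append(CODE_GROUPS[byte >> 4])
        groups.append(CODE_GROUPS[byte & 0x0F])
    return "".join(groups)

def longest_zero_run(bits: str) -> int:
    """Length of the longest run of consecutive zeroes."""
    return max(len(run) for run in bits.split("1"))

encoded = encode_4b5b(bytes([0x24]))
print(encoded, longest_zero_run(encoded))

# Rate arithmetic for a 100 Mbit/s data stream.
line_rate_mbaud = 100 * 5 / 4           # five code bits per four data bits
fundamental_mhz = line_rate_mbaud / 2   # NRZI: at most one transition per bit
print(line_rate_mbaud, fundamental_mhz)
```

The worst case shown, a code group ending in two zeroes followed by one starting with a zero, gives a run of three zeroes, so the limit on consecutive zeroes holds across code group boundaries as well as within them.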

3.1.5 8B/10B

8B/10B is the encoding process by which 8-bit symbols are converted to 10-bit symbols by the addition of two extra bits. In addition to supporting all 256 8-bit data combinations, some of the remaining characters in the 10-bit transmission code have special meaning. The transmission codes, referred to as special characters, are used to identify frame boundaries, transmit primitive function requests and maintain proper link transmission characteristics during periods of inactivity (i.e. idle). Internal tables are used both to generate valid transmission characters and to check the validity of received transmission characters.

The 8B/10B encoding scheme is a byte-oriented scheme which is used to integrate data and clock, maintain DC balance and provide word alignment.

As there are 8 useful data bits for every 10 transmitted bits, the encoding scheme is 80% efficient.

Fibre Channel employs 8B/10B encoding.

3.1.6 17B/20B

17B/20B is the encoding process by which 16-bit data words and one packet-delimiter flag bit are concatenated with 3 signalling bits to form a 20-bit transmission code.

SCI employs 17B/20B encoding.
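
The bit-wise efficiencies of the block codes described above can be compared directly. The following is a minimal sketch; the dictionary and function names are illustrative, and the two figures given for 17B/20B depend on whether the packet-delimiter flag bit is counted as useful, which is an assumption made here for comparison.

```python
from fractions import Fraction

# (useful bits, transmitted bits) per code group, as described in the text.
SCHEMES = {
    "Manchester Biphase": (1, 2),
    "4B/5B": (4, 5),
    "8B/10B": (8, 10),
    "17B/20B (data and flag bit)": (17, 20),
    "17B/20B (data bits only)": (16, 20),
}

def efficiency(useful_bits, transmitted_bits):
    """Fraction of transmitted bits that carry useful information."""
    return Fraction(useful_bits, transmitted_bits)

for name, (useful, sent) in SCHEMES.items():
    print(f"{name}: {float(efficiency(useful, sent)):.0%}")
```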

3.1.7 MLT-3

MLT-3 (Multilevel Threshold-3) is a 3-state encoding scheme where logical ones are represented by voltage level transitions and zeroes are represented by lack of transitions. Transitions are always between two adjacent levels. Compared with NRZI, the MLT-3 emission power spectrum is two-thirds lower.

MLT-3 is specified by ANSI for FDDI over copper. It is also an option for ATM over copper media.
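
The MLT-3 rule can be sketched as a walk around the level sequence 0, +1, 0, -1. This is an illustration only; the function name and the assumed starting level of 0 are hypothetical choices.

```python
def mlt3_encode(bits):
    """Step to the next level in the cycle for a one; hold for a zero."""
    cycle = [0, +1, 0, -1]
    position = 0  # assumed initial line level of 0
    levels = []
    for bit in bits:
        if bit:
            position = (position + 1) % len(cycle)
        levels.append(cycle[position])
    return levels

line = mlt3_encode([1, 1, 1, 1, 0, 1])
print(line)

# Every step is between adjacent levels, never directly from +1 to -1.
steps = [abs(b - a) for a, b in zip(line, line[1:])]
print(max(steps))
```

Four consecutive ones are needed to complete one full cycle of the waveform, compared with two for NRZI.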

4. Physical Layer Technologies - Implementations

4.1 MIL-STD-1553

MIL-STD-1553[24, 106] is the document number for the US military avionics databus standard fully entitled Digital Time Division Multiplex Command/Response Databus. The standard was first published in 1973; the current revision, MIL-STD-1553B, was published in 1978 and is now at Notice II.

The standard has found extensive application in a wide variety of airborne platforms including aircraft, missiles, satellites, spacecraft, space launch vehicles and intelligent munitions. It has also found application in land vehicles and smaller naval platforms such as submarines and small surface ships.

MIL-STD-1553 employs serial asynchronous communications over a shielded, twisted wire pair. The throughput is 1 Mbit s⁻¹ using Manchester II biphase signalling, resulting in a signalling rate of 2 Mbaud (i.e. 50% efficiency). The throughput is very modest by current standards.

MIL-STD-1553 does not follow the layered ISO OSI approach. As such, it essentially provides a closely-coupled LAN architecture which restricts its flexibility.

4.2 Ethernet

Ethernet LAN technology was developed by Xerox, Intel and DEC Corporations. It was initially developed as a proprietary technology, but was later standardised by the IEEE and designated IEEE 802.3. Ethernet employs Manchester-encoded, baseband digital signalling.

The primary network media for Ethernet are thin co-axial cable (10Base2) and UTP (10BaseT), although thick Ethernet (10Base5) is also used. 10Base2 and 10Base5 employ bus-type topologies while 10BaseT employs a star-type topology. Ethernet can also be transmitted over optical fibre cables by use of special converters.

4.3 Token Ring Schemes

4.3.1 IBM Token Ring (IEEE 802.5)

IBM Token Ring is the specific token ring LAN technology developed by International Business Machines (IBM). It was initially developed as a proprietary technology, but was later standardised by the IEEE and designated IEEE 802.5.

The primary network medium for IBM Token Ring is Shielded Twisted Pair (STP). More recently, UTP has found Token Ring application. In both cases, star-type topologies employing Multistation Access Units (MAUs) are used. Token Ring can also be transmitted over optical fibre cables by use of special converters.

4.3.2 FDDI

The Fibre Distributed Data Interface (FDDI) is a high-speed LAN standard developed under co-ordination of ANSI.

FDDI over fibre has been ratified as a standard since 1989 by the ANSI X3T9.5 standards committee.

The primary medium of communication is multimode fibre in a ring topology. While single connection station attachment is optional, FDDI has been designed to support a dual-redundant counter-rotating ring topology. Further protection against node failure can be achieved using optional optical bypass switches which opto-mechanically switch the optical signals around a failed node.

FDDI supports a data rate of 100 Mbit s⁻¹ using 4B/5B encoding, resulting in a (bit-wise) transmission efficiency of 80%. FDDI allows for a total fibre path length of 200 km; this results in an effective ring circumference of 100 km in the dual-redundant configuration. A total of 1 000 attachments is allowed, again resulting in an effective 500 dual attachments in the dual-redundant configuration. Standard FDDI allows for intersegment lengths of up to 2 km between stations, but up to 40 km can be attained at present using singlemode fibre. FDDI also specifies a low bit error rate of < 2,5 × 10⁻¹⁰.

The FDDI Physical Layer Standard is specified in two ANSI documents :

• FDDI Physical Media Dependent (PMD) Protocol (Rev. 9)[33]. This defines the characteristics and parameters of the physical media.

• FDDI Physical Layer (PHY) Protocol[32]. This defines the encoding, decoding and synchronisation of the station-to-station bit streams.

FDDI, while conforming to internationally-accepted standards, also offers considerable flexibility in terms of media and station attachment.

4.3.3 CDDI

FDDI was primarily designed to operate over multimode fibre optic cabling, but advances in signal processing now allow FDDI to operate over copper media. 'FDDI' over Shielded Twisted Pair (STP) and over Unshielded Twisted Pair (UTP) are both options. These options are sometimes termed CDDI (Copper Distributed Data Interface). While CDDI supports the data rates of FDDI, intersegment lengths are considerably reduced to 100 m (cf. 2 000 m for multimode fibre) and error rates are also increased.

For commercial LANs, especially where copper media cabling was already in existence, it made commercial sense to use FDDI over copper media, i.e. CDDI. This was especially true where the workstation applications were not mission-critical and single attachment was appropriate.

Most commercial organisations saw the requirement to upgrade from either Ethernet or IBM Token Ring LANs. In the former case, 10BaseT is the predominant technology. 10BaseT operates on a star-type topology using Ethernet hubs or concentrators and Unshielded Twisted Pair (UTP) cabling. IBM Token Ring also uses a star-type topology based on Multistation Access Units (MAUs) and Shielded Twisted Pair (STP) cabling. Thus the primary topology for CDDI is star-based.

Another consideration was that the CDDI standard had to conform to specific electromagnetic compatibility requirements. These two requirements, i.e. operation over existing star-wired twisted pair cabling and electromagnetic compatibility, led to a solution in which a different physical layer encoding scheme is used. This resulted in the 3-state MLT-3 (Multilevel Threshold-3) encoding standard being chosen over the 2-state NRZI (Non-Return to Zero, Invert on Ones).

With UTP cabling, which does not provide a shield, attenuation can only be reduced by lowering the transmission frequency. With MLT-3, the main frequency of transmission is some 31 MHz; MLT-3 can therefore offer reduced EMI and an improved signal-to-noise ratio.
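
The 31 MHz figure can be related to the encoding rule itself. The sketch below is illustrative only. It assumes the usual description of MLT-3, in which a 1 bit advances the line level one step around the cycle 0, +1, 0, -1 and a 0 bit holds the level. It also assumes a 125 Mbaud code-bit rate (100 Mbit/s after 4B/5B encoding). The function names and the starting level are hypothetical.

```python
def mlt3_encode(bits):
    """Map a sequence of 0/1 code bits to MLT-3 line levels (-1, 0, +1).

    A 1 advances the output one step around the cycle 0, +1, 0, -1;
    a 0 leaves the output level unchanged.
    """
    cycle = [0, 1, 0, -1]
    position = 0  # assumed starting level: 0
    levels = []
    for bit in bits:
        if bit:
            position = (position + 1) % 4
        levels.append(cycle[position])
    return levels


def fundamental_mhz(line_rate_mbaud, bits_per_cycle):
    """Worst-case fundamental frequency for a code that needs
    bits_per_cycle code bits to complete one full signal cycle."""
    return line_rate_mbaud / bits_per_cycle


LINE_RATE_MBAUD = 125.0  # assumed: 100 Mbit/s after 4B/5B encoding

# NRZI toggles on every 1, so a run of 1s completes a cycle every 2 bits.
nrzi_mhz = fundamental_mhz(LINE_RATE_MBAUD, 2)
# MLT-3 needs 4 successive 1s to return to its starting point.
mlt3_mhz = fundamental_mhz(LINE_RATE_MBAUD, 4)
```

Under these assumptions the worst-case fundamental falls from 62,5 MHz for NRZI to 31,25 MHz for MLT-3, which is consistent with the figure quoted above.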

The MLT-3 standard for FDDI over copper media[34] has now been ratified by ANSI and manufacturers are producing concentrators and NICs conforming to this standard.

4.4 LAN Technologies - Performance Summary

Table AII below provides typical performance capabilities of common LAN technologies :

Type             Bandwidth     Distance   Range     Nodes   EMC         Cost
MIL-STD-1553     1 Mbit/s      0,1 km     0,1 km    32      Modest      High
Ethernet         10 Mbit/s     2,8 km     2,8 km    1 024   Poor        Very Low
IBM Token Ring   16 Mbit/s     100 m      2,5 km    260     Low         Low
FDDI             100 Mbit/s    2,0 km     100 km    500     Excellent   Moderate to High
CDDI             100 Mbit/s    0,1 km     0,1 km    500     Poor        Moderate

Table AII : LAN Technologies - Performance Summary


5. Conclusions

5.1 Performance

It can be concluded that fibre optic media are superior to other LAN media in their support for real-time, mission-critical, distributed systems.

5.2 Cost Effectiveness

While copper media have drawbacks in terms of EMC, they presently have significant advantages in terms of cost as well as the ease with which they can be worked.

The cost of fibre media as well as associated parts such as connectors and terminating equipment is, however, rapidly decreasing. With expected widespread use in order to support consumer video on demand as well as the Information Superhighway, these costs will decrease even further. Optical fibre is intrinsically less expensive than copper and it is therefore expected that in the long term, copper media will be more expensive than optical fibre media.

5.3 Media Bulk and Mass

The high mass to length and mass to data rate ratios of metallic media are extremely important considerations in mass-critical applications such as space avionics, vetronics and even aboard ships.

It is said that a typical 1970s frigate contained a greater mass of copper cabling than structural steel. This has profound implications for propulsion and fuel requirements, which in turn affect endurance and ship size. Ship size in turn has major tactical implications for radar cross-section and hence for vulnerability.

Fibre optic media in LAN topologies can reduce the mass of typical ship data cabling by two orders of magnitude.

5.4 Electromagnetic Compatibility

Fibre optic cables offer clear advantages over wire-type media in respect of electromagnetic compatibility.

5.5 50 µm/125 µm vs 62,5 µm/125 µm Multimode Fibre

50 µm multimode fibre has cost and performance advantages over 62,5 µm fibre. Despite the fact that 62,5 µm fibre is specified as the default standard for FDDI, 50 µm fibre is the preferred standard throughout the world except the USA. This was due to the vested interests of one of the world's major role players in communications and computer technology, i.e. IBM Corporation. When fibre optics began to become a major communications technology, IBM attempted to control the market by monopolistic practices with an IBM-aligned, optical fibre manufacturer, Siecor.

If 600 Mbit/s 62,5 µm multimode fibre is installed for FDDI, its bandwidth will be marginal to support, for example, 622 Mbit/s ATM. Therefore, if an upgrade capability is required, 800 Mbit/s 50 µm optical fibre should be installed.

However, there are other considerations in the optimal choice of fibre type for a specific application. While 62,5 µm fibre exhibits lower launch losses than 50 µm fibre due to its larger diameter, it also exhibits higher attenuation per unit length.


6. Recommendations

6.1 Cable Layer

6.1.1 Fibre Optic Media

Fibre optic media should be adopted as the Cable Layer LAN standard for all mission-critical applications where EMC is a factor, where LANs are logically or geographically extensive, or where high bandwidth is required.

6.1.2 Copper Media

Where EMC is not a factor, or LANs are not geographically extensive, copper media using shielded twisted pair or screened unshielded twisted pair are appropriate.

CDDI over screened UTP, using a fibre backbone, is recommended in FDDI applications where costs are a factor and fault-tolerance is not required from each and every LAN connection.

STP should be preferred over UTP in factory environments unless other factors militate against this.

6.1.3 Upgradeability

Should an upgrade for a LAN infrastructure be envisioned where data throughput exceeds 155 Mbit/s, fibre media should be used in preference to copper media.

Should an upgrade for a LAN infrastructure be envisioned where data throughput exceeds 800 Mbit/s, singlemode fibre media should be used in preference to multimode fibre media.

6.1.4 Optical Power Analysis

Trade-off studies as well as optical power analyses must be performed to optimise overall system performance, especially in complex fibre cable plant designs where optical bypass switches and optical connectors are used to maximise survivability and maintainability.

The SAFENET Network Development Guidance[103] provides a good basis on which to perform such optical analyses for mission-critical applications.


6.2 Physical Layer Protocols

6.2.1 Encoding Complexity

In order to maximise the utilisation of the medium's raw data transfer capacity, sophisticated encoding protocols should be employed.

6.2.2 MLT-3 Encoding

When FDDI or ATM over copper media is appropriate, the MLT-3 encoding scheme is recommended.

Appendix B

Data Link Layer LAN Protocols


1. Scope
1.1 Scope
1.2 Introduction
1.3 Appendix Layout

2. Datalink Layer Protocols - Characteristics
2.1 Media Access Control Protocols
2.1.1 Centralised Access Schemes
2.1.2 Distributed Access Schemes
2.1.2.1 Collision Detect Schemes
2.1.2.2 Collision Avoidance Schemes
2.1.2.3 Token Ring Schemes
2.1.2.4 Token Bus Schemes
2.1.3 Packet-Switched Schemes
2.2 Logical Link Control Protocol

3. Datalink Layer Protocols - Implementations
3.1 Centralised Access Schemes
3.1.1 MIL-STD-1553
3.1.2 HDLC and SDLC
3.1.2.1 Unbalanced Normal Response Mode
3.1.2.2 Asynchronous Balanced Mode
3.2 Collision Detect Schemes
3.2.1 Ethernet
3.2.2 Fast Ethernet
3.3 Token Ring Schemes
3.3.1 IBM Token Ring (IEEE 802.5)
3.3.2 Dedicated Token Ring
3.3.3 FDDI
3.3.3.1 FDDI Station Management Layer
3.3.3.2 FDDI Transmission Modes
3.4 Token Bus Schemes
3.4.1 Arcnet
3.4.2 Profibus
3.4.3 Proway C
3.5 Packet-Switched Schemes
3.5.1 ATM
3.6 New Technologies
3.6.1 Fast Arcnet
3.6.2 FDDI II
3.6.3 FFOL
3.6.4 FDVDI
3.6.5 DQDB
3.6.6 Scalable Coherent Interface
3.6.7 Fibre Channel

4. Network Topologies
4.1 Basic Topologies
4.1.1 Bus Topology
4.1.2 Star Topology
4.1.3 Ring Topology
4.1.4 Tree Topology
4.2 Complex Topologies
4.3 Interconnect Devices
4.3.1 Repeaters
4.3.2 Concentrators
4.3.3 Bridges
4.3.4 Routers
4.3.5 Bridge/Routers
4.3.6 Gateways
4.3.7 Switches

5. Data Link Layer Protocols - Performance Summary

6. Conclusions
6.1 LAN Technologies
6.2 Performance
6.3 Cost Effectiveness
6.4 Ethernet
6.5 Token Bus
6.6 IBM Token Ring
6.7 FDDI
6.8 CDDI
6.9 ATM
6.10 Scalable Coherent Interface
6.11 Fibre Channel

7. Recommendations
7.1 FDDI
7.2 CDDI
7.3 ATM
7.4 Ethernet


List of Figures

Figure 1 : Bus Topology

Figure 2 : Star Topology

Figure 3 : Ring Topology

Figure 4 : Tree Topology


List of Tables

Table BI : FDVDI Options

Table BII : Fibre Channel Link Options

Table BIII : P- and CP-Factors for Common LAN Technologies


1. Scope

1.1 Scope

This appendix describes the detailed characteristics of Data Link Layer (i.e. Layer 2) LAN protocols, specifically in terms of their performance and their suitability or otherwise for the implementation of real-time, mission-critical, distributed systems. It also provides a quantitative trade-off between possible contenders as the data link layer protocol options for the Real-Time LAN Profile.

1.2 Introduction

The International Organization for Standardization (ISO) defined, and in 1984 published, a framework for communications standards which partitions the functions required for communication into seven layers. This was termed the Open Systems Interconnection (OSI) Basic Reference Model. Layer 2 of the OSI model is termed the Data Link Layer (DLL).

The DLL is responsible for formatting and disassembly of data packets as well as flow and error control across a datalink. The DLL also provides the basic network addressing mechanism. The DLL is normally divided into two sub-layers, the Media Access Control (MAC) sub-layer and Logical Link Control (LLC) sub-layer.

The functions of the DLL protocol are fundamental to the suitability of local area networks for real-time, mission-critical, distributed systems as they determine such capabilities as data transfer latency, network topology, intrinsic redundancy and multi-access capability. The media access mechanism also influences the maximum physical size of the LAN.

1.3 Appendix Layout

The appendix commences with an overview of the characteristics of available Data Link Protocols and proceeds with descriptions of specific implementations thereof. Important implications of these characteristics and implementations are then analysed within the context of real-time, mission-critical, distributed systems.

Specific conclusions and recommendations are then made, with the most significant of these being adopted in the main section. In particular, specific options are recommended for inclusion in the Real-Time LAN Profile described in the main section.


2. Datalink Layer Protocols - Characteristics

The Data Link Layer (DLL) provides the means by which the network medium (i.e. Physical Layer) is accessed by the higher-level protocols. As such, the characteristics of the DLL are critical to the performance of the network.

The functions typically associated with the DLL are the assembly of data into frames with addresses and Cyclic Redundancy Check (CRC) fields. The latter are a means of detecting errors within the transmitted data. Other functions include disassembly of received frames and execution of address recognition and CRC validation. The overall function of the DLL is the management of communication over the link. In the IEEE 802 standard for local area networks, these functions are divided into two groups. The groups correspond to two sub-layers, i.e. the Logical Link Control (LLC) sub-layer and the Media Access Control (MAC) sub-layer.
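
The frame assembly and disassembly functions described above can be sketched as follows. This is a minimal illustration and not the frame format of any particular LAN standard. The field layout (destination address, source address, payload, 4-byte CRC), the use of CRC-32 from the Python standard library and the function names are all assumptions.

```python
import struct
import zlib


def build_frame(dst: bytes, src: bytes, payload: bytes) -> bytes:
    """Assemble destination address, source address, payload and CRC field."""
    body = dst + src + payload
    crc = zlib.crc32(body) & 0xFFFFFFFF
    return body + struct.pack(">I", crc)


def parse_frame(frame: bytes, my_address: bytes):
    """Return the payload if the CRC is valid and the frame is for this node.

    Returns None if CRC validation or address recognition fails.
    Assumption: source and destination addresses have the same length.
    """
    body = frame[:-4]
    (crc,) = struct.unpack(">I", frame[-4:])
    if zlib.crc32(body) & 0xFFFFFFFF != crc:
        return None  # CRC validation failed: frame corrupted in transit
    if body[:len(my_address)] != my_address:
        return None  # address recognition failed: frame is for another node
    return body[2 * len(my_address):]


frame = build_frame(b"\x02" * 6, b"\x01" * 6, b"hello")
```

A receiver with address `b"\x02" * 6` would recover the payload from `frame`, while a node with any other address, or any node receiving a corrupted copy, would discard it.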

2.1 Media Access Control Protocols

Because the nodes of a LAN share the medium's transmission capacity, some means of controlling access to the transmission medium is needed so that any two nodes can exchange data. The access method generally follows one of two schemes, i.e. centralised or distributed.

2.1.1 Centralised Access Schemes

Centralised access schemes are those where one particular node acts as a master and controls access to the LAN by other nodes, e.g. the command/response scheme of the MIL-STD-1553 Multiplex Data Bus. A centralised access mechanism can provide greater LAN control over such issues as priorities and overrides, as well as guaranteeing LAN bandwidth to each node. However, this scheme may have negative implications in that the centralised controller may act as a bottleneck and reduce the efficiency of the LAN. Centralised controllers are also single points of failure within a system, with consequent reliability implications.

In a centralised access scheme there is always a controller controlling the scheduling on the LAN. A station can only transmit if it is given the authority to do so by the controller. The controller verifies all bus traffic for integrity and will reschedule messages if any data was lost during transfer.

The following centralised access implementations are identified :

• MIL-STD-1553B

• HDLC/SDLC.


2.1.2 Distributed Access Schemes

Distributed access schemes are those where all nodes have equal access to the communication medium. Access may be controlled or it may be random.

Random access schemes include Carrier Sense Multiple Access (CSMA), Carrier Sense Multiple Access with Collision Detect (CSMA/CD) and Slotted Ring (Cambridge Ring).

Controlled access schemes include Token control and Carrier Sense Multiple Access with Collision Avoidance (CSMA/CA).

2.1.2.1 Collision Detect Schemes

In a Carrier Sense Multiple Access with Collision Detect (CSMA/CD) scheme, a station that needs to transmit data first "listens" for any bus activity. If no activity is sensed, the station is free to start transmitting. It may happen that two stations transmit at the same time. In this case a collision is said to occur. The simplest form of collision detection requires a higher level of protocol to detect that data has been lost. This could be accomplished by waiting for an acknowledgement from the receiving station or by detection of corrupt returned packets.

In IEEE 802.3[7], a CSMA/CD scheme is specified. In this scheme collision detection is performed while the station is transmitting. This requires that the frame has a certain minimum length and that after a collision is detected a few additional bytes be transferred for the collision to propagate throughout the system. The minimum frame length and number of additional bytes are dependent on the physical length of the bus. After a collision, the station will retransmit its frame after a random time delay.

If activity is detected on the bus, the station "backs off" for a predefined time (1 µs) and "listens" again. This is repeated until no activity is detected and the station is then free to transmit.
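
The random retransmission delay after a collision can be sketched as below. The text above states only that the delay is random. The truncated binary exponential backoff rule and the constants used here (a slot time of 512 bit times, a limit of 16 attempts, and a backoff range that stops doubling after 10 collisions) are the commonly cited parameters for 10 Mbit/s IEEE 802.3 and are given as assumptions, as are the function names.

```python
import random

SLOT_BITS = 512      # assumed slot time in bit times (also the minimum frame length)
MAX_ATTEMPTS = 16    # assumed: the frame is discarded at the 16th collision
BACKOFF_LIMIT = 10   # assumed: the random range stops doubling after 10 collisions


def slot_time_us(bit_rate_mbit_s: float) -> float:
    """Duration of one slot in microseconds at the given bit rate."""
    return SLOT_BITS / bit_rate_mbit_s


def backoff_slots(collision_count: int, rng=random) -> int:
    """Number of slot times to wait after the n-th successive collision."""
    if collision_count >= MAX_ATTEMPTS:
        raise RuntimeError("excessive collisions: frame discarded")
    k = min(collision_count, BACKOFF_LIMIT)
    return rng.randrange(2 ** k)  # uniform over 0 .. 2**k - 1
```

Because the range of possible delays doubles with each successive collision, the expected waiting time grows rapidly under load. This is the mechanism behind the unbounded access latency of CSMA/CD schemes.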

The following collision detect implementations are identified :

• Ethernet

• Fast Ethernet.

2.1.2.2 Collision Avoidance Schemes

Collision Avoidance is a mechanism which uses a bitwise arbitration algorithm. This requires that each bit transmitted on the LAN must be able to propagate to the full extent of the physical LAN before the next bit is transmitted. This has the implication that the maximum length of the bus is very limited.

The following collision avoidance implementation is identified :

• CAN-bus (Controller Area Network).
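
Bitwise arbitration, and the length limit it implies, can be sketched as follows. This is an illustrative model only. It assumes the convention used by CAN, in which a 0 is the dominant bus level and a 1 is recessive, with identifiers sent most significant bit first. The 11-bit identifier width, the propagation velocity of 2 × 10⁸ m/s and the function names are assumptions.

```python
def arbitrate(identifiers, width=11):
    """Return the identifier that wins bitwise arbitration.

    Assumptions: 0 is dominant, 1 is recessive, identifiers are unique
    and are transmitted most significant bit first.
    """
    contenders = set(identifiers)
    for bit in range(width - 1, -1, -1):
        sent = {ident: (ident >> bit) & 1 for ident in contenders}
        bus = min(sent.values())  # any dominant 0 overrides a recessive 1
        # A node that sent recessive but reads dominant has lost and withdraws.
        contenders = {ident for ident in contenders if sent[ident] == bus}
    (winner,) = contenders
    return winner


def max_bus_length_m(bit_rate_bit_s, velocity_m_s=2.0e8):
    """Upper bound on bus length if each bit must reach the far end of
    the bus within one bit time (one-way propagation only)."""
    return velocity_m_s / bit_rate_bit_s
```

Under these assumptions, `arbitrate([0x65, 0x64, 0x7F])` returns `0x64`, i.e. the lowest identifier wins without any frame being destroyed. At 1 Mbit/s the one-way bound is some 200 m. Practical limits are considerably shorter, since each node must also see the other nodes' bits returned within the same bit time and transceiver delays must be allowed for.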

2.1.2.3 Token Ring Schemes

In a token ring scheme, a station may only transmit if it has possession of the token. A token is a special type of frame. When a station requires the LAN, it will take possession of the token and hold it if the station has a higher priority than the priority specified in the token. If not, the station will request the token by placing its priority in the reservation bits of the token. After the station has removed the token from the LAN, it then transmits its information.

The following token ring implementations are identified :

• IBM Token Ring.

• Slotted Ring.

• FDDI.
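
The capture-or-reserve decision described for token ring schemes can be sketched as follows. The `Token` class and `on_token` function are hypothetical names. The sketch assumes that a station may capture a token whose priority does not exceed that of its waiting frame (the text above speaks of a higher priority), and that a reservation is only recorded if it is higher than the one already present.

```python
from dataclasses import dataclass


@dataclass
class Token:
    priority: int = 0     # priority currently carried by the token
    reservation: int = 0  # highest priority requested so far


def on_token(token: Token, frame_priority: int) -> bool:
    """Return True if the station may capture the token and transmit.

    Otherwise record a reservation, if it is higher than the one already
    present, and let the token pass.
    """
    if frame_priority >= token.priority:
        return True
    if frame_priority > token.reservation:
        token.reservation = frame_priority
    return False


token = Token(priority=4)
captured_by_high = on_token(token, 6)  # priority 6 may use a priority-4 token
captured_by_low = on_token(token, 2)   # priority 2 must reserve instead
```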

2.1.2.4 Token Bus Schemes

A token bus scheme as defined by IEEE 802.4 is a physical bus, but a logical ring. When a node receives the token, it acquires the right to use the LAN for a specified time before the token must be passed on to the next node in the logical ring.

The token bus invokes a somewhat complex algorithm when tokens are lost. This has the implication that LAN recovery takes a considerable amount of time[77].

IEEE 802.4 is specified for three different physical layers :

• Single-Channel, Phase-Continuous FSK (frequency shift keying); (1 Mbit/s).

• Single-Channel, Phase-Coherent FSK; (5 or 10 Mbit/s).

• Broadband; (1, 5 or 10 Mbit/s).


The following token bus implementations are identified (with limiting implementation details) :

• Arcnet (2,5 Mbit/s).

• Profibus (500 kbit/s).

• Proway C (1 Mbit/s).

2.1.3 Packet-Switched Schemes

Packet-switched schemes are those where data traffic is relayed through the network using a label that is contained in a packet's header. In the case of ATM, the header contains information about the virtual channel and virtual path in use, payload type and cell loss priority. Its Header Error Control field supports correction of single bit errors and detection of multiple bit errors.

The following packet switched schemes are identified :

• X.25.

• Frame Relay.

• ATM.
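
The header fields named above are those of the ATM cell. The sketch below packs a 5-octet cell header and computes its Header Error Control octet. The bit layout (GFC 4 bits, VPI 8 bits, VCI 16 bits, payload type 3 bits, cell loss priority 1 bit) is the user-network interface format as generally described; the generic flow control field is not mentioned in the text above. The HEC is assumed to be a CRC-8 with generator x⁸ + x² + x + 1 over the first four octets, with the pattern 01010101 added. The function names are hypothetical.

```python
def crc8_atm(data: bytes) -> int:
    """CRC-8 with generator x^8 + x^2 + x + 1 and initial value 0."""
    crc = 0
    for byte in data:
        crc ^= byte
        for _ in range(8):
            if crc & 0x80:
                crc = ((crc << 1) ^ 0x07) & 0xFF
            else:
                crc = (crc << 1) & 0xFF
    return crc


def build_uni_header(gfc, vpi, vci, payload_type, clp) -> bytes:
    """Pack GFC(4) VPI(8) VCI(16) PT(3) CLP(1) and append the HEC octet."""
    word = (gfc << 28) | (vpi << 20) | (vci << 4) | (payload_type << 1) | clp
    first_four = word.to_bytes(4, "big")
    hec = crc8_atm(first_four) ^ 0x55  # assumed coset pattern 01010101
    return first_four + bytes([hec])


def header_ok(header: bytes) -> bool:
    """Check a received 5-octet header against its HEC octet."""
    return (crc8_atm(header[:4]) ^ 0x55) == header[4]
```

This sketch only detects header errors. The single-bit correction capability mentioned above requires an additional syndrome look-up, which is omitted here.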

2.2 Logical Link Control Protocol

The LLC sub-layer multiplexes several logical links onto the one physical network and enhances the frame delivery service of the MAC Layer by performing error and flow control between pairs of network nodes. The LLC can provide end-to-end error control and acknowledgement of packets and so guarantee error-free transmission of packets between nodes.

The IEEE specifies the functionality of LLC in the IEEE 802.2 standard. The level of functionality and reliability of service is classified into types.

2.2.1 LLC Type 1

Type 1 (LLC1) is an Unacknowledged Connectionless Service and provides datagram frame delivery over the local network segment and any other network segments reachable via the link layer switches. Higher layer packets must be placed in the protocol data units of the underlying data delivery service. This is termed encapsulation. FDDI assumes implementation of the IEEE 802.2 LLC standard.
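
Encapsulation under LLC1 can be sketched as follows. An LLC1 datagram is carried in an unnumbered information (UI) frame whose header consists of a destination service access point, a source service access point and a control octet of 0x03. The SNAP extension shown (service access points of 0xAA followed by a 3-octet organisation code and a 2-octet protocol type) is the convention commonly used to carry Internet protocols over IEEE 802 networks. It is included as an assumption and is not described in this appendix. The function names are hypothetical.

```python
import struct

UI_CONTROL = 0x03  # unnumbered information: no sequencing, no acknowledgement


def llc1_encapsulate(dsap: int, ssap: int, packet: bytes) -> bytes:
    """Prefix a higher layer packet with a 3-octet LLC1 (UI) header."""
    return bytes([dsap, ssap, UI_CONTROL]) + packet


def snap_encapsulate(protocol_type: int, packet: bytes,
                     oui: bytes = b"\x00\x00\x00") -> bytes:
    """LLC1 header with the SNAP service access points, followed by the
    organisation code and a 2-octet protocol type (assumed convention)."""
    snap = oui + struct.pack(">H", protocol_type)
    return llc1_encapsulate(0xAA, 0xAA, snap + packet)


pdu = snap_encapsulate(0x0800, b"ip")  # 0x0800 is the type value used for IP
```

The resulting protocol data unit is what the MAC sub-layer then places in the information field of its own frame.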

LLC frames carry user information between nodes on an FDDI or other network. Each FDDI frame containing user data contains LLC information for the end node. These frames do cross bridges and can be transmitted to nodes on the extended LAN.

XTP, being itself a reliable protocol, only requires basic LLC services, i.e. LLC1. While XTP does not require a reliable LLC service and itself provides other services such as multicast, other communication profiles do require enhanced LLC services. The IEEE has therefore initiated the definition of LLC Type 4 (LLC4).

2.2.2 LLC Type 4

LLC Type 4 has been proposed by the IEEE 802.2 Logical Link Control working group as an extended functionality, high-performance alternative to LLC Types 1 and 2. The target application for LLC Type 4 is bridged LAN and MAN networks.

LLC Type 4 has been proposed to offer the following services :

• Reliable Sequenced Delivery

• Reliable Non-sequenced Delivery

• Non-reliable Sequenced Delivery

• Segmentation/Re-assembly

• Connection Setup with Embedded User Data

• Connection Release with Embedded User Data

• Multiple Logical Connections between LSAP Pairs

• Quality of Service

• Protection of Bit Errors through Bridges

• Ability to Deliver Corrupted Data

• Multicast

LLC4 presently exists as a draft proposal and final definition awaits IEEE committee ratification.


3. Datalink Layer Protocols - Implementations

3.1 Centralised Access Schemes

3.1.1 MIL-STD-1553

The MIL-STD-1553 communication protocol specifies a command/response mechanism, i.e. a centralised access control mechanism whereby a bus controller schedules all messages on the bus. It allows for up to 31 remote terminals plus a broadcast option. The number of data words per message is limited to 32 16-bit words.

Both stationary and dynamic bus control options are allowed. The latter overcomes single point of failure problems. In real-life systems, stationary bus controller schemes are often employed, usually to circumvent implementation complexity and integration responsibility problems. In such cases, backup bus controllers are employed for mission-critical systems. The problem then becomes one of transfer of control to the backup controller when the primary fails. Further problems can arise due to state ambiguities should the primary recover. This is because not all failure modes are simple and deterministic. Discrete connections are often employed between bus controllers whereby the discrete signal acts as "go/no-go" control ("dead man's hand"). In safety-critical systems such as fly-by-wire and manned space vehicle applications, triple redundancy is normally employed for bus control. In such cases, complex handover mechanisms are required, e.g. voting or arbitration schemes.

MIL-STD-1553 offers both synchronous (state) and asynchronous (event) data transmission modes. The former are scheduled according to pre-determined message tables held by the (current) bus controller. The latter are scheduled in response to flags set in the status words of synchronous messages or detected by means of polling.

Traffic on a MIL-STD-1553 bus is normally organised into minor cycles and major cycles. The former are derived from the repetition rate of the most frequent synchronous messages while the latter are derived from the repetition rate of the least frequent synchronous messages. Essentially MIL-STD-1553 implements a time-triggered protocol.

As the transmission time for the longest (32-word) message is some 670 µs (without allowing for any retries), minor cycle times are lower bounded by this figure. Practical minimum minor cycle times are in the order of 5 ms. In such cases, bus bandwidth is not efficiently used due to the high overhead involved. Typical major cycle times are in the order of 1 s.
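
The organisation of synchronous traffic into minor and major cycles can be sketched as follows. This is a simplified illustration, not the scheduling method of any particular bus controller. The message names and periods are hypothetical, every period is assumed to be a whole multiple of the shortest one and a whole divisor of the longest, and every message is pessimistically assumed to take the 670 µs quoted above for the longest message.

```python
def build_schedule(periods_ms):
    """Build a cyclic message table.

    periods_ms maps a message name to its repetition period in ms.
    Returns (minor_cycle_ms, major_cycle_ms, frames), where frames[i]
    lists the messages transmitted in minor frame i of the major cycle.
    """
    minor = min(periods_ms.values())  # set by the most frequent message
    major = max(periods_ms.values())  # set by the least frequent message
    frames = [[] for _ in range(major // minor)]
    for name, period in sorted(periods_ms.items()):
        if period % minor or major % period:
            raise ValueError(f"{name}: period {period} ms does not fit the cycle")
        step = period // minor
        for index in range(0, len(frames), step):
            frames[index].append(name)
    return minor, major, frames


def worst_frame_time_us(frames, message_time_us=670):
    """Bus time needed by the busiest minor frame (pessimistic estimate)."""
    return max(len(frame) for frame in frames) * message_time_us


# Hypothetical message set: names and periods are for illustration only.
minor, major, frames = build_schedule({"attitude": 5, "nav": 10, "status": 20})
```

For this hypothetical message set the major cycle contains four 5 ms minor frames, and the busiest of these carries three messages. This requires some 2 ms of bus time, which fits within the 5 ms minor cycle.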

Normally, system applications interface to the MIL-STD-1553 bus infrastructure by means of proprietary application interface protocols implemented above the data link layer. Such architectures do not conform to the ISO OSI layered model and, as such, do not provide an open system approach to system design. Such approaches often lead to closely-coupled systems which are difficult to expand and upgrade.

3.1.2 HDLC and SDLC

The High-Level Data Link Control (HDLC) protocol is a link-level protocol that has been defined by ISO for use on both point-to-point as well as multipoint (multidrop) data links. It supports full-duplex transparent mode operation and is now extensively used in both terminal-based networks and computer networks.

Although the acronym HDLC is now widely accepted, a number of large manufacturers and other standards bodies still use their own acronyms. These include SDLC (Synchronous Data Link Control) by IBM and ADCCP (Advanced Data Communications Control Procedure) which is used by ANSI.

HDLC is a bit-oriented protocol, which is more efficient than a character-oriented protocol. With HDLC, both data and control messages are carried in a standard format that is referred to as a frame. HDLC employs three different classes of frame, i.e. Unnumbered Frames (for link control), Information Frames (to carry data) and Supervisory Frames (for error and flow control). HDLC employs a Frame Check Sequence, which is a 16-bit cyclic redundancy check for the complete frame contents enclosed between the two frame delimiters.
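
Because HDLC is bit-oriented, the frame delimiter is a bit pattern rather than a character, and the frame contents must be prevented from imitating it. The usual mechanism, which is not described in the text above, is zero-bit insertion: the delimiter is the pattern 01111110 and the sender inserts a 0 after any five consecutive 1s in the frame contents. The following is a minimal sketch of this technique in which bit streams are represented as strings of '0' and '1' for clarity. The function names are hypothetical.

```python
FLAG = "01111110"  # frame delimiter


def stuff(bits: str) -> str:
    """Insert a 0 after every run of five consecutive 1s."""
    out, run = [], 0
    for bit in bits:
        out.append(bit)
        run = run + 1 if bit == "1" else 0
        if run == 5:
            out.append("0")
            run = 0
    return "".join(out)


def unstuff(bits: str) -> str:
    """Remove the 0 that follows every run of five consecutive 1s."""
    out, run, i = [], 0, 0
    while i < len(bits):
        bit = bits[i]
        out.append(bit)
        run = run + 1 if bit == "1" else 0
        if run == 5:
            i += 1  # skip the inserted 0
            run = 0
        i += 1
    return "".join(out)


def delimit(bits: str) -> str:
    """Enclose the stuffed frame contents between two delimiters."""
    return FLAG + stuff(bits) + FLAG
```

With this scheme any bit pattern, including one identical to the delimiter, can be carried transparently. The cost is a small, data-dependent increase in the frame length.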

Because HDLC has been defined as a general purpose data link control protocol, a specific mode of operation is selected when the data link is first set up. The two most prevalent modes are :

3.1.2.1 Unbalanced Normal Response Mode

Unbalanced Normal Response Mode (NRM) is mainly used in terminal-based networks since, in this mode, slave stations (or secondaries) can only transmit when specifically instructed by the master (primary) station. The link may be point-to-point or multidrop, but in the case of the latter, only one primary station is allowed.

3.1.2.2 Asynchronous Balanced Mode

Asynchronous Balanced Mode (ABM) is mainly used on point-to-point links for computer-to-computer communications or for connections between other devices such as statistical multiplexers. In this mode, each station has equal status and performs both primary and secondary functions.

While HDLC and SDLC are widely used protocols for the connection of computers and peripherals, they are only appropriate in point-to-point and small multidrop (LAN) applications. As such, they are inappropriate for the sophisticated real-time, mission-critical, distributed systems under consideration.


3.2 Collision Detect Schemes

3.2.1 Ethernet

Ethernet is a LAN technology developed by Xerox Corporation. It was initially developed as a proprietary technology, but was later standardised by the IEEE and designated IEEE 802.3[7] as well as by ISO (designated as ISO 8802-3).

The media access scheme of Ethernet has a number of important implications as to its performance. The first is that the backoff and retry process of the CSMA/CD protocol results in an inherent inefficiency in the use of the bandwidth of the underlying physical layer. In fact, in moderately to heavily loaded Ethernet LANs, this efficiency may be as low as 30% to 40%, implying that the intrinsic 10 Mbit/s is reduced to some 3 to 4 Mbit/s [43, 111].

The second implication is that concerning data transfer determinism. Due both to a node finding the medium busy, as well as due to the random nature of the retry method, it cannot be guaranteed that a node will gain access to the medium within any specific time period. In heavily loaded Ethernet LANs, such latencies may be in the order of hundreds of milliseconds[76]. This is clearly unacceptable in most classes of real-time, distributed systems. Furthermore, in the case of periodic type data transfer, there will be significant jitter from sample to sample. Such jitter may lead to instability in distributed control algorithms.

The media access scheme also has implications in the maximum length of the LAN. Because Ethernet gains access to the media by issuing data packets and detecting integrity of the reflected packet, Ethernet LANs are restricted in length due to the timing involved in the return of these packets. Thus even if long haul media such as optical fibre are employed, repeaters are ineffective in extending the useful LAN length beyond some 2 800 m. Bridges or routers are required in such cases. These compound the latency problem.

Ethernet also has a maximum protocol data unit of 1 514 bytes which can be limiting in the case of certain traffic profiles.

3.2.2 Fast Ethernet

Fast Ethernet is a new high-speed LAN technology developed by various companies. Two versions exist, one designated 100BaseT and the other 100BaseVG. The former is promoted by the Fast Ethernet Alliance (FEA), consisting of 60 of the world's largest LAN vendors. The latter is promoted primarily by Hewlett-Packard and AT&T. 100BaseT has been ratified by the IEEE as IEEE 802.3u, while ratification is being sought for 100BaseVG.

While both offer raw throughput of 100 Mbit/s, they use different signalling techniques. 100BaseT uses Ethernet's CSMA/CD access method and frame format over two-pair twisted pair (UTP and STP) or fibre media. Full duplex operation is possible, giving an aggregate of 200 Mbit/s. Because Fast Ethernet uses CSMA/CD, it is only capable of 35% to 45% usage of the 100 Mbit/s raw bandwidth and, like standard Ethernet, also suffers from latency and jitter problems.

100BaseVG, on the other hand, multiplexes four 25 Mbit/s channels over four pairs of twisted pair cabling using a signalling protocol termed demand priority. Consideration is also being given to two-pair as well as fibre operation. Full duplex capability is not supported.

3.3 Token Ring Schemes

3.3.1 IBM Token Ring (IEEE 802.5)

IBM Token Ring is the specific token ring LAN technology developed by International Business Machines (IBM). It was initially developed as a proprietary technology, but was later standardised by the IEEE and designated IEEE 802.5[9] as well as by ISO (designated as ISO 8802-5). The token ring scheme as defined by IEEE 802.5 is a logical ring, but a physical star.

Initially IBM Token Ring was developed to run at 4 Mbit/s, with the later version running at 16 Mbit/s.

In IEEE 802.5, three types of frames are defined. These are the Token Frame, Data Frame and Abort Frame. Each station receives the frame and determines whether the frame is destined for itself. In this case, the data contained in the frame is copied to buffer memory. The station updates the frame status bits and transmits the frame back to the source station. When the source station receives the frame and has determined that the destination station has received the frame successfully, it removes the frame from the LAN and releases the token.

Due to IBM Token Ring's simple token handling scheme (cf. FDDI's timed, early release token), this LAN technology exhibits modest efficiency in moderately to heavily loaded LANs, i.e. in the order of 50% to 60%. A further implication of this token passing scheme is that periodic transfers are subject to jitter, i.e. uncertainty in the precise repetition intervals between data samples.
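
The practical consequence of the efficiency figures quoted in this appendix can be seen by converting them to usable throughput. The short calculation below uses only the raw data rates and efficiency ranges stated in the text for moderately to heavily loaded LANs. The FDDI figure is the "up to" value quoted for its timed token protocol, and the function and table names are hypothetical.

```python
# (raw rate in Mbit/s, (lowest, highest) quoted efficiency)
QUOTED = {
    "Ethernet":       (10.0,  (0.30, 0.40)),
    "Fast Ethernet":  (100.0, (0.35, 0.45)),
    "IBM Token Ring": (16.0,  (0.50, 0.60)),
    "FDDI":           (100.0, (0.95, 0.95)),  # quoted as "up to" 95%
}


def usable_throughput(raw_mbit_s, efficiency_range):
    """Return the (lowest, highest) usable throughput in Mbit/s."""
    low, high = efficiency_range
    return raw_mbit_s * low, raw_mbit_s * high


for name, (raw, efficiency) in QUOTED.items():
    low, high = usable_throughput(raw, efficiency)
    print(f"{name:15s} {low:5.1f} to {high:5.1f} Mbit/s")
```

On these figures 16 Mbit/s IBM Token Ring delivers some 8 to 9,6 Mbit/s under load, i.e. more than twice the usable throughput of 10 Mbit/s Ethernet, while FDDI delivers more than twice that of Fast Ethernet from the same raw data rate.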

16 Mbit/s IBM Token Ring has a maximum protocol data unit of 16 000 bytes, which is very large (cf. 4 500 bytes per packet for FDDI).

3.3.2 Dedicated Token Ring

Dedicated Token Ring (DTR), or Switched Token Ring, is a new standard being proposed and developed by the IEEE 802.5 committee. DTR will employ Token Ring switches rather than passive Multi-Access Units (MAUs) and will provide full bandwidth between the node and the switch (i.e. 16 Mbit/s to the node).

The attraction of such an approach is that a substantial performance improvement is achieved without an organisation's extensive investment in network interface cards being immediately rendered obsolete by completely new technology.

3.3.3 FDDI

The Fibre Distributed Data Interface (FDDI) is a high-speed LAN standard developed under the co-ordination of the ANSI X3T9.5 committee. FDDI over fibre has been a ratified ANSI standard since 1989.

FDDI's token ring scheme distinguishes itself from IBM Token Ring in terms of two features :

! It uses a timed token, i.e. timing fields within the token ensure token release within specified delays.

! It is an early release token scheme, i.e. the token is released immediately after a station has transmitted its data, unlike IBM Token Ring where the token is only released after the transmitting station has received its own data back.

FDDI's timed token protocol offers efficient, deterministic, collision-free access to the network, regardless of the number of connected stations. Such a protocol results in an overall transmission efficiency of up to 95% of the physical medium's bandwidth.

Possession of the token allows a station to transmit one or more data packets of up to 4 500 bytes per packet.

The FDDI DLL standard is specified in two ANSI documents :

! ANSI FDDI Media Access Control[31] standard which specifies frame and token construction as well as frame transmission and reception.

! ANSI FDDI Station Management (Rev. 7.3)[35] standard which specifies the necessary services at the station level to monitor and control FDDI nodes.

3.3.3.1 FDDI Station Management Layer

The Station Management (SMT) layer can be considered as a vertical layer which can access all FDDI sub-layers specified in terms of the OSI model. The SMT function permanently monitors the FDDI ring, coordinates the ring during network start-up and produces a status table of the ring's and station's state. The SMT manages the station's PHYs, MACs, PMDs, optical bypasses, timers, as well as management objects such as counters, parameters and statistics. The functionality of SMT can be divided into connection management (CMT) and ring management (RMT).

CMT is responsible for ring configuration, reconfiguration after malfunction, network statistics and diagnostics. It consists of :

! Entity Co-ordination Management (ECM)

! Configuration Management (CFM)

! Physical Connection Management (PCM)

PCM is responsible for bit signalling, line state identification during link construction and signal monitoring, as well as performing link confidence tests when the ring is started up. During operation, PCM regularly calculates the bit error rate and other error statistics which are represented as link error monitor data.

RMT receives status information from Media Access Control and connection management. RMT then reports this status to Station Management. Services provided by ring management include :

! stuck beacon detection

! resolution of problems through the trace process

! determination of MAC availability for transmission

! detection of duplicate addresses

3.3.3.2 FDDI Transmission Modes

The FDDI standard offers both synchronous and asynchronous modes of data transmission. Synchronous mode corresponds with state-type messages while asynchronous mode corresponds with event-type messages.

3.3.3.2.1 Synchronous Mode

In FDDI synchronous mode, a producer of data is guaranteed a certain proportion of LAN bandwidth. Synchronous data has priority (at the data link level) over asynchronous data. FDDI synchronous mode provides the Quality of Service guarantees required by time-critical applications.

While the original FDDI standard specified synchronous transmission services, provision of these services is optional.

In order to support synchronous transmission, an FDDI synchronous bandwidth allocator (SBA) is required. Two SBA schemes have been proposed, i.e. a dynamic scheme and a static scheme[87].

3.3.3.2.2 Asynchronous Mode

Asynchronous bandwidth is allocated from the pool of remaining ring bandwidth. The allocation of bandwidth is controlled by two classes of special tokens, the restricted and non-restricted tokens. The restricted token gives the right to restrict transmission to nodes within a dialog. Other asynchronous traffic is deferred for the duration of the dialog.

All commercially-available FDDI NICs are provided with asynchronous mode capability.

3.4 Token Bus Schemes

3.4.1 Arcnet

Arcnet is a simple LAN technology (developed by Datapoint Corp.) based on token passing. When a node has the token it is only allowed to transmit one message before passing the token to the next node in the logical ring. Startup, adding a new node to the LAN and token loss are all handled by re-initialisation.

3.4.2 Profibus

Profibus is a hybrid of the token bus and master/slave access schemes and allows a multi-master access scheme. The protocol covers Layers 1, 2 and 3 of the 7-layer model. At the MAC-layer, Profibus is derived from the IEEE 802.4 standard. There are a number of differences, however; for example, Profibus allows only two priorities while IEEE 802.4 specifies four.

3.4.3 Proway C

Proway C covers Layers 1 and 2 of the 7-layer model, including the MAC and LLC sub-layers at Layer 2. It conforms quite closely, but not fully, to the IEEE 802.2 (LLC) and IEEE 802.4 Token Passing Bus standards. The main difference is the restriction that Proway C only allows one option at the physical layer, i.e. 1 Mbit/s phase-continuous FSK.

3.5 Packet-Switched Schemes

3.5.1 ATM

Asynchronous Transfer Mode (ATM) is a network technology under development by many IT organisations throughout the world. At present it is envisaged for use with LANs up to 155 Mbit/s and WANs up to 622 Mbit/s, but will be capable of supporting bandwidths of up to several Gbit/s. ATM is also being developed for silicon implementation and supports packet and circuit-switched data transfer in LAN, MAN and WAN network topologies.

ATM offers very high levels of throughput performance. For example, SONET/SDH offers 155 Mbit/s per link. With high performance switches the aggregate throughput of an ATM LAN can be several Gbit/s (a 6 Gbit/s switch backplane throughput is typical of currently available equipment, with effective LAN throughput being some 50% of that).

ATM uses short, fixed length packets called cells for transmission of all data. Each cell is 53 bytes long; 48 bytes for data and 5 bytes for a preceding header.
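The fixed cell size has a direct bearing on overhead, which the following Python sketch works through. It is a simplification that considers only the 5-byte cell header and padding of the final cell; any additional overhead introduced by higher layers is ignored. The function names are hypothetical.

```python
CELL_BYTES = 53
HEADER_BYTES = 5
PAYLOAD_BYTES = CELL_BYTES - HEADER_BYTES  # 48 bytes of data per cell


def cells_needed(message_bytes):
    """Number of cells needed to carry a message (the last cell is padded)."""
    if message_bytes <= 0:
        raise ValueError("message must contain at least one byte")
    return -(-message_bytes // PAYLOAD_BYTES)  # ceiling division


def wire_efficiency(message_bytes):
    """Fraction of the transmitted bytes that are message data."""
    return message_bytes / (cells_needed(message_bytes) * CELL_BYTES)


for size in (48, 49, 4500):
    print(size, cells_needed(size), round(wire_efficiency(size), 3))
```

The best case is 48/53, i.e. roughly 90%, and a message that just overflows a cell (49 bytes in the example) carries the cost of a second, almost empty cell.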

ATM has been designed as a completely heterogeneous network technology. As such it supports numerous media, physical layers, protocols and services within the same infrastructure.

3.6 New Technologies

3.6.1 Fast Arcnet

A new high-speed 20 Mbit/s option is being proposed; however, this is in the specification stage. Another detraction from Arcnet is that it is essentially a proprietary technology promoted by a single company, Datapoint, Inc. There is, however, an 80-member Arcnet Trade Association (ATA) which may open Arcnet into the public domain.

3.6.2 FDDI II

FDDI II is an enhancement of the basic FDDI LAN protocol[99]. While FDDI (I) offers a packet (switched) service, FDDI II offers a circuit-switched service. A packet service provides for the delivery of variable length, delimited data packets to network stations on the basis of an address within the packet. A circuit-switched service provides a continuous connection between two stations or between one station and multiple stations. Instead of using addresses, the connection is established between the stations based on some prior agreement mechanism such as a timeslot mechanism.

A typical timeslot mechanism is implemented using a standard timing marker such as the Basic System Reference Frequency (BSRF) which is a 125 µs clock used by public networks. In the context of FDDI II, this is termed the cycle clock.

FDDI II is capable of supporting isochronous data, i.e. data which occurs in precise amounts on a precise time basis. Typical examples of isochronous data are digital samples from sensors, voice data and video data.

The implementation of FDDI II constitutes a superset of FDDI (I) with the addition of one further function known as Hybrid Ring Control (HRC).

FDDI II supports dynamic bandwidth partitioning between packet and circuit-switched services to allow both modes of operation. Allocation of bandwidth is effected by means of 8 kbit/s sub-channels up to a 6,144 Mbit/s Wideband Channel (WBC). Up to 16 WBCs are assignable to isochronous services. It is possible to allocate any or all of the WBCs to one virtual service, thus satisfying the requirements for high-resolution video.

If all 16 WBCs are allocated to circuit-switched service (i.e. 16 x 6,144 Mbit/s = 98,304 Mbit/s), a residual 1 Mbit/s channel for packet traffic remains.
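The bandwidth arithmetic above can be checked with a short Python sketch. Rates are held as integers in kbit/s to avoid rounding (6,144 Mbit/s with a decimal comma is 6 144 kbit/s). The function names and the 20 Mbit/s example stream are hypothetical.

```python
SUBCHANNEL_KBIT_S = 8   # granularity of allocation
WBC_KBIT_S = 6144       # one Wideband Channel
MAX_WBCS = 16


def subchannels_per_wbc():
    """Number of 8 kbit/s sub-channels that make up one WBC."""
    return WBC_KBIT_S // SUBCHANNEL_KBIT_S


def isochronous_kbit_s(wbcs):
    """Circuit-switched bandwidth when the given number of WBCs is allocated."""
    if not 0 <= wbcs <= MAX_WBCS:
        raise ValueError("between 0 and 16 WBCs can be allocated")
    return wbcs * WBC_KBIT_S


def wbcs_needed(stream_kbit_s):
    """Smallest number of whole WBCs that can carry a stream of the given rate."""
    wbcs = -(-stream_kbit_s // WBC_KBIT_S)  # ceiling division
    if wbcs > MAX_WBCS:
        raise ValueError("stream exceeds the total isochronous capacity")
    return wbcs


print(subchannels_per_wbc())         # 768 sub-channels per WBC
print(isochronous_kbit_s(MAX_WBCS))  # 98304 kbit/s, i.e. 98,304 Mbit/s
print(wbcs_needed(20000))            # a hypothetical 20 Mbit/s stream needs 4 WBCs
```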

While FDDI II has many of the required characteristics as a next generation LAN technology, especially for multimedia applications, it is unlikely to reach commercial maturity due to supersedence by ATM.

3.6.3 FFOL

The FDDI Follow-On LAN (FFOL)[99] is a very high performance fibre optic LAN proposed by the ANSI X3T9.5 committee for future application. Important proposed requirements for FFOL include the following :

! Ability to provide a backbone for multiple FDDI networks.

! Data Rates > 600 Mbit/s, but < 1,25 Gbit/s.

! Support for singlemode fibre.

! Ability to use existing FDDI cable plant.

While FFOL has all the required characteristics as a next generation LAN technology, it is unlikely to reach commercial maturity due to the predominance of ATM.

3.6.4 FDVDI

Fibre Distributed Voice, Video, Data Interface (FDVDI)[99] is a proposed new standard for simultaneous packet and circuit-switched services. As its name suggests, it is targeted to support fully integrated data, voice and video communications. FDVDI is a competing standard to FDDI II and it is likely that only one standard will emerge as dominant in this segment.

FDVDI development is taking a three-phased approach, each phase corresponding to a particular data rate. Table BI summarises the relevant FDVDI options.

| Phase | Data Rate | Chip Technology |
| I | 35/45 Mbit/s | CMOS |
| II | 155 Mbit/s | BiCMOS |
| III | 565 Mbit/s | As yet undetermined |

Table BI : FDVDI Options

While FDVDI has all the required characteristics as a next generation LAN technology, it is also unlikely to reach commercial maturity due to the predominance of ATM.

3.6.5 DQDB

Distributed Queue Dual Bus (DQDB) was standardised as IEEE 802.6. It is basically a shared media LAN technology which exchanges fixed length data cells of 53 bytes over two counter-directional buses. Despite its simplicity and potential, it has never established itself as a mainstream LAN technology since system vendors have essentially ignored it. DQDB has found limited application in European switched, multi-megabit MAN-type networks, typically operating in the 34 to 155 Mbit/s range.

3.6.6 Scalable Coherent Interface

The Scalable Coherent Interface (SCI) is an approved IEEE standard (IEEE 1596) intended to be the next generation high-speed backplane for interconnections in multiprocessor machines. SCI was designed to use point-to-point links in order to avoid the physics problems in using a backplane transmission line at very high data rates, e.g. distributed capacitances.

The stated purpose of SCI is "to define an interface standard for very high performance multiprocessor systems that supports a coherent shared-memory model scalable to systems with up to 64K nodes. This Scalable Coherent Interface (SCI) standard is to facilitate assembly of processor, memory, I/O, and bus adapters from multiple vendors into massively parallel systems with throughputs ranging up to more than 10^12 operations per second."

SCI uses point-to-point signalling to simulate a bus without actually using one. This results in higher speeds, since the electrical transmission problems are greatly simplified. In order to keep track of how the data is being used (e.g. has it been successfully received, whose turn is it to transmit, etc.) protocols have to be used which are different to those used with buses.

The SCI standard defines two levels of interfaces. The physical layer specifies electrical, mechanical and thermal characteristics of connectors and cards. The logical level describes the address space, data transfer protocols, cache coherence mechanisms, synchronisation primitives, control and status registers as well as initialization and error recovery facilities.

Three physical layer interfaces are defined for SCI: an electrical parallel interface designed for short distances of less than 10 m with a data rate of 8 Gbit/s, an electrical serial interface used for distances in the order of tens of metres at a data rate of 1 Gbit/s and a serial optical interface used for up to ten kilometres at a data rate of 1 Gbit/s.

Although SCI is typically described in terms of a simple ringlet, a variety of interconnection topologies can be used. The standard does not specify a particular topology. The use of more elaborate interconnection schemes requires additional expense as well as complexity of design and operation.

There are currently six official IEEE follow-on efforts to SCI underway. Of particular interest is SCI/RT (SCI/Real-Time), a follow-on group investigating the use of SCI protocols for real-time applications which require guaranteed latencies. SCI/RT also provides some increased fault-tolerance and error handling capabilities.

3.6.7 Fibre Channel

Fibre Channel (FC) refers to a set of standards under development by the ANSI Fibre Channel committee, X3T9.3. Fibre Channel specifies a high-speed serial data channel that can connect nodes point-to-point or through a switch or switch network (switch fabric). FC was initially conceived as a peripheral interconnect channel, but its definition has developed such that it could support the construction of high performance local area networks.

The requirements set for Fibre Channel are the following :

! Small footprint (implies a serial channel).

! 2 to 10 kilometre operating distance.

! Up to 800 Mbit/s in payload.

! Efficient operation over long distances.

! Greater connectivity than existing multidrop channels (e.g. SCSI).

! Efficient multiplexing of multiple streams into a single port.

! Support multiple existing interface command sets without modifications.

! Support multiple cost and performance levels.

FC offers a number of different link options, from shielded twisted pair supporting 200 Mbit/s over 50 m, up to singlemode fibre supporting 800 Mbit/s over 10 km.

Table BII lists the Fibre Channel link options :

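| Media Type | Transmitter Type | Data Rate (Mbit/s) | Line Rate (Mbit/s) | Maximum Distance (m) | Nomenclature |
| Optical Fibre | | | | | |
| Singlemode (9 µm) | 1 300 nm Laser | 800 | 1 062,50 | 10 000 | 100-SM-LL-L |
| Singlemode (9 µm) | 1 300 nm Laser | 800 | 1 062,50 | 2 000 | 100-SM-LL-I |
| Singlemode (9 µm) | 1 300 nm Laser | 400 | 531,25 | 10 000 | 50-SM-LL-L |
| Singlemode (9 µm) | 1 300 nm Laser | 200 | 265,63 | 10 000 | 25-SM-LL-L |
| Singlemode (9 µm) | 1 300 nm Laser | 200 | 265,63 | 2 000 | 25-SM-LL-I |
| Multimode (50 µm) | 780 nm Laser | 800 | 1 062,50 | 500 | 100-M5-SL-S |
| Multimode (50 µm) | 780 nm Laser | 400 | 531,25 | 1 500 | 50-M5-SL-I |
| Multimode (50 µm) | 780 nm Laser | 200 | 265,63 | 2 000 | 25-M5-SL-I |
| Multimode (50 µm) | 1 300 nm LED | 200 | 265,63 | | 25-SM-LE-I |
| Multimode (50 µm) | 1 300 nm LED | 100 | 132,81 | | 12-SM-LE-I |
| Multimode (62,5 µm) | 780 nm Laser | 800 | 1 062,50 | | 100-M6-SL-S |
| Multimode (62,5 µm) | 780 nm Laser | 400 | 531,25 | 350 | 50-M6-SL-I |
| Multimode (62,5 µm) | 780 nm Laser | 200 | 265,63 | 350 | 25-M6-SL-I |
| Multimode (62,5 µm) | 1 300 nm LED | 200 | 265,63 | 1 500 | 25-M6-LE-I |
| Multimode (62,5 µm) | 1 300 nm LED | 100 | 132,81 | 1 500 | 12-M6-LE-I |
| Copper Media | | | | | |
| Video Coax (75 Ohm) | ECL | 800 | 1 062,50 | 25 | 100-TV-EL-S |
| Video Coax (75 Ohm) | ECL | 400 | 531,25 | 50 | 50-TV-EL-S |
| Video Coax (75 Ohm) | ECL | 200 | 265,63 | 75 | 25-TV-EL-S |
| Video Coax (75 Ohm) | ECL | 100 | 132,81 | 100 | 12-TV-EL-S |
| Mini Coax (75 Ohm) | ECL | 800 | 1 062,50 | 10 | 100-MI-EL-S |
| Mini Coax (75 Ohm) | ECL | 400 | 531,25 | 20 | 50-MI-EL-S |
| Mini Coax (75 Ohm) | ECL | 200 | 265,63 | 30 | 25-MI-EL-S |
| Mini Coax (75 Ohm) | ECL | 100 | 132,81 | 40 | 12-MI-EL-S |
| STP (150 Ohm) | ECL | 200 | 265,63 | 50 | 25-TP-EL-S |
| STP (150 Ohm) | ECL | 100 | 132,81 | 100 | 12-TP-EL-S |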

Table BII : Fibre Channel Link Options
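The nomenclature column of the table encodes the other columns in four hyphen-separated fields. The Python sketch below decodes it using lookup tables read off the table itself; the reading of the final field as long, intermediate or short distance is an assumption inferred from the distance column, and the function name `decode` is hypothetical.

```python
# Lookup tables inferred from the columns of the link options table.
DATA_RATE_MBIT_S = {"100": 800, "50": 400, "25": 200, "12": 100}
MEDIA = {
    "SM": "singlemode fibre (9 µm)",
    "M5": "multimode fibre (50 µm)",
    "M6": "multimode fibre (62,5 µm)",
    "TV": "video coax (75 Ohm)",
    "MI": "mini coax (75 Ohm)",
    "TP": "shielded twisted pair (150 Ohm)",
}
TRANSMITTER = {
    "LL": "1 300 nm laser",
    "SL": "780 nm laser",
    "LE": "1 300 nm LED",
    "EL": "ECL",
}
DISTANCE_CLASS = {"L": "long", "I": "intermediate", "S": "short"}  # assumed meaning


def decode(nomenclature):
    """Split a link option name such as '100-SM-LL-L' into its four fields."""
    speed, media, transmitter, distance = nomenclature.split("-")
    return {
        "data_rate_mbit_s": DATA_RATE_MBIT_S[speed],
        "media": MEDIA[media],
        "transmitter": TRANSMITTER[transmitter],
        "distance_class": DISTANCE_CLASS[distance],
    }


print(decode("100-SM-LL-L"))
print(decode("12-TP-EL-S"))
```

An unknown field raises `KeyError`, which is a deliberate choice in this sketch so that a mistyped name is not silently accepted.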

4. Network Topologies

While network topology is not strictly speaking a protocol issue, it has profound implications on the implementation of real-time, mission-critical, distributed systems. The topology influences network capabilities such as fault-tolerance, reliability, dependability and survivability.

The interconnect elements of most network technologies are critical components of the network, without which the network will not operate correctly or at all. Thus failures of these elements amount to single points of failure. In certain applications, the system requires that the network be disconnected, e.g. the jettisoning of a stage of a multistage missile or space launch vehicle or the release of an intelligent payload such as a space satellite or smart munition. Innovative methods are required to circumvent system failure due to these intended LAN disconnects. Certain topologies are eminently better suited to support these types of applications than others.

4.1 Basic Topologies

4.1.1 Bus Topology

A bus topology is a linear connection scheme where all network nodes are coupled to the physical medium. The extremities are normally connected to termination devices to prevent reflection of energy back into the network. Such reflections would interfere with valid transmissions.

A bus topology has no physical restrictions as to the number of nodes that can simultaneously access the network except for the amount of electrical or optical power transmitted.

The terminators of most bus technologies are critical components of the LAN without which the LAN will not operate correctly or at all. Thus failure of the terminators (which is rare as they are simple passive devices) or any cable break amounts to a single point of failure. In certain LAN technologies, such as thin Ethernet (10Base2), such failures would also occur during the addition or extraction of a LAN node. Such occurrences are unacceptable in real-time, mission-critical, distributed systems.

Bus topologies are the least costly and least complex topologies to implement.

Figure 1 below illustrates a generic bus topology.

Figure 1 : Bus Topology

4.1.2 Star Topology

A star topology is a connection scheme where there is a central control or switching element to which all network nodes couple individually by means of the physical medium.

The central control or switching element of the star topology is a vulnerable component of the LAN and amounts to a single point of failure.

Star topologies have an advantage should LAN segments or nodes fail as such failures are localised to the failed segment and do not affect the rest of the LAN. This is advantageous in networks such as office LANs where users often intentionally or unintentionally disconnect their workstations. This is one of the reasons for the popularity of the star-based 10BaseT and 100BaseT Ethernet technologies as well as the IBM Token "Ring" (which is a physical star).

More cabling is required to implement a star topology than bus or ring topologies. This is significant from the system affordability perspective. Star-type topologies can be implemented as dual-redundant; however, this would have extensive implications in terms of complexity and cost as well as system upgradeability.

Figure 2 below illustrates a generic star topology.

Figure 2 : Star Topology

4.1.3 Ring Topology

A ring topology is a connection scheme where each node connects to the next in a ring fashion. The signal is propagated from one node to the next with each node responsible for retransmission of the signal. In certain cases, e.g. fibre optic rings, the node is also responsible for amplification of the signal, without which the LAN could not operate due to optical signal attenuation over fairly modest distances.

Thus ring topologies contain active elements which amount to single points of failure. Such failures would also occur with the extraction or addition of a node to a LAN. Such occurrences are unacceptable in real-time, mission-critical, distributed systems. Innovative methods are required to circumvent such failures.

IBM Token Ring uses a physical star to interconnect the LAN. This interconnect element is termed a Multi-Access Unit (MAU) which is a passive device. The MAU is a single point of failure; however, the whole LAN will not fail due to the failure of a single node interconnect.

FDDI circumvents single points of failure by employing a dual counter-rotating ring. The second ring is only a backup for the primary (in standard FDDI implementations). Should an interconnect or node fail, the upstream and downstream neighbours of the failed node redirect the optical signal from the primary to the secondary ring thereby maintaining a ring. Optical Bypass Switches (OBSs) can be employed to maintain the dual ring by switching the optical signal paths around a failed or temporarily extracted node. Such devices are characterised by significant optical signal attenuation and only a certain number (in the region of three to five) of OBSs can be active in series. To circumvent this problem, fibre amplifiers can be employed to reconstitute the optical signal. High bandwidth fibre amplifiers are not inexpensive devices, however, and cost/dependability trade-offs are appropriate.
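The wrap behaviour described above, in which the neighbours of a failed node fold the primary ring onto the secondary ring, can be sketched in Python as follows. This is an illustrative model only: the station names and the function `wrapped_path` are hypothetical, and optical bypass switches and station management signalling are not modelled.

```python
def wrapped_path(ring, failed):
    """Order in which stations are visited on the single ring formed by a wrap.

    `ring` lists the stations in primary-ring order. The returned path starts
    at the downstream neighbour of the failed station and closes back on it.
    """
    i = ring.index(failed)
    n = len(ring)
    # Primary ring: from the downstream neighbour round to the upstream one.
    primary_leg = [ring[(i + k) % n] for k in range(1, n)]
    # The upstream neighbour wraps the signal onto the secondary ring, which
    # runs in the opposite direction back towards the downstream neighbour.
    secondary_leg = primary_leg[-2:0:-1]
    return primary_leg + secondary_leg


ring = ["A", "B", "C", "D", "E"]  # hypothetical stations in primary-ring order
print(wrapped_path(ring, "C"))    # ['D', 'E', 'A', 'B', 'A', 'E']
```

In the example, stations B and D perform the wrap, and the intermediate stations A and E are traversed once on each ring, so that all surviving stations remain on one closed ring that excludes the failed station C.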

Ring topologies are only marginally more complex in terms of cabling than bus topologies. For applications where the LAN must traverse extensive length, ring closure implies double LAN length. For dual-redundant implementations, the return leg should be geographically distant from the outgoing leg in order to maximise survivability. This has definite cost implications when considering multiple physical cables, trenching etc.

For certain applications such as real-time, mission-critical, distributed systems which are normally geographically constrained, ring topologies are most appropriate. Examples are shipboard systems and the operations centres of process control plants.

Figure 3 below illustrates a generic ring topology.

Figure 3 : Ring Topology

4.1.4 Tree Topology

A tree topology is one where the network trunks interconnect network switching nodes or concentrators which in turn connect to network end nodes.

The tree topology has advantages where the trunk is required to exhibit high reliability, but the end nodes do not. In such cases, high reliability techniques can be applied to the trunks (which are normally few in number), while the nodes (which are normally many in number) can employ standard, less expensive interconnections.

Figure 4 below illustrates a generic tree topology.

Figure 4 : Tree Topology

4.2 Complex Topologies

Complex topologies are multiple instances of basic topologies with the sub-networks being either homogeneous or heterogeneous. They are appropriate in the following cases :

! Where traffic profiles are different in the sub-networks.

! The network is very large, either in terms of connected nodes or geographically.

! Considerations such as survivability or security require sub-network isolation.

! Legacy LANs, i.e. where organisations merge and different LAN technologies are inherited, but interconnectivity is required.

Complex topologies require the use of interconnect devices.

4.3 Interconnect Devices

4.3.1 Repeaters

Repeaters connect two segments of the same homogeneous LAN with connectivity being performed at the physical layer.

4.3.2 Concentrators

Concentrators are network devices which connect nodes of a star-connected network. They are typically found in 10BaseT Ethernet and single-attached FDDI networks.

4.3.3 Bridges

Bridges connect multiple homogeneous LANs with connectivity being performed at the data link layer.

4.3.4 Routers

Routers connect multiple heterogeneous LANs with connectivity being performed at the network layer.

4.3.5 Bridge/Routers

Bridge/Routers are devices incorporating the functionality of bridges and routers.

4.3.6 Gateways

Gateways connect multiple heterogeneous networks with connectivity being performed at the application layer.

4.3.7 Switches

A switch is an interconnect device which redirects (switches) data streams according to address information contained either within the data stream or outside the data stream, e.g. some a priori agreement mechanism such as a timeslot mechanism.

5. Data Link Layer Protocols - Performance Summary

LAN performance can be quantified in terms of a Figure of Merit (FOM). Two such FOMs are the P-Factor[113] and CP-Factor.

The P-Factor (Performance Factor) is the product of the maximum length, maximum number of stations and raw throughput of a network :

$$P = \text{Length} \times \text{Size} \times \text{Raw Throughput} \quad \text{[station-km-bit/s]}$$

The CP-Factor (Cost-Performance Factor) is the quotient of the P-Factor and the product of the Bit Error Rate (BER) and connection cost (in US dollars) for a network. The CP-Factor can be considered as a price/performance index for a LAN technology.

$$CP = \frac{\text{Length} \times \text{Size} \times \text{Raw Throughput}}{\text{Bit Error Rate} \times \text{Cost}} \quad \text{[station-km-bit/s per error per dollar]}$$
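As a worked check of the two definitions, the following Python sketch evaluates them for the Ethernet and FDDI figures given in Table III. The function names are hypothetical; the table quotes the results rounded to one significant figure.

```python
def p_factor(length_km, nodes, rate_mbit_s):
    """P-Factor in station-km-bit/s."""
    return length_km * nodes * rate_mbit_s * 1e6


def cp_factor(length_km, nodes, rate_mbit_s, bit_error_rate, cost_per_node):
    """CP-Factor: P-Factor divided by (bit error rate x connection cost)."""
    return p_factor(length_km, nodes, rate_mbit_s) / (bit_error_rate * cost_per_node)


# Figures taken from the Ethernet and FDDI rows of Table III.
ethernet_p = p_factor(0.5, 1024, 10)
ethernet_cp = cp_factor(0.5, 1024, 10, 1.0e-8, 200)
fddi_p = p_factor(100.0, 1000, 100)
fddi_cp = cp_factor(100.0, 1000, 100, 2.5e-10, 1500)

for name, p, cp in [("Ethernet", ethernet_p, ethernet_cp), ("FDDI", fddi_p, fddi_cp)]:
    print(f"{name}: P = {p:.1e}, CP = {cp:.1e}")
```

The results, approximately 5,1E+009 and 2,6E+015 for Ethernet and 1,0E+013 and 2,7E+019 for FDDI, agree with the tabulated values after rounding.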

Table III provides P- and CP-Factors for MIL-STD-1553, Ethernet (10BaseT/IEEE 802.3), IBM Token Ring (IEEE 802.5), Fast Ethernet (100BaseT/IEEE 802.3u), Arcnet, Fast Arcnet, FDDI, ATM and Fibre Channel technologies.

| LAN Standard | Backplane Bus | Length (km) | Size (Nodes) | Data Rate (Mbit/s) | P-Factor | Bit Error Rate | Cost per Node ($) | CP-Factor | Comments |
| MIL-STD-1553B | ISA | 0,1 | 32 | 1 | 3E+006 | 1,0E-012 | 5 000 | 6E+014 | |
| Ethernet (10BaseT/IEEE 802.3) | PCI | 0,5 | 1 024 | 10 | 5E+009 | 1,0E-008 | 200 | 3E+015 | |
| IBM Token Ring (IEEE 802.5) | PCI | 0,3 | 260 | 16 | 1E+009 | 1,0E-008 | 400 | 3E+014 | |
| Fast Ethernet (100BaseT/IEEE 802.3u) | PCI | 0,1 | 1 024 | 100 | 1E+010 | 1,0E-008 | 300 | 3E+015 | |
| Arcnet | ISA | 0,1 | 1 024 | 2,5 | 3E+008 | 1,0E-008 | 200 | 1E+014 | |
| Fast Arcnet | PCI | 0,1 | 1 024 | 20 | 2E+009 | 1,0E-008 | 500 | 4E+014 | Estimate Cost |
| FDDI | PCI | 100,0 | 1 000 | 100 | 1E+013 | 2,5E-010 | 1 500 | 3E+019 | |
| ATM | PCI | 100,0 | 1 000 | 155 | 2E+013 | 2,5E-010 | 2 500 | 2E+019 | |
| Fibre Channel | EISA | 10,0 | 1 000 | 1 062 | 1E+013 | 2,5E-010 | 7 500 | 6E+018 | |

Table III : P- and CP-Factors for Common LAN Technologies

Formulae

$$P = \text{Length} \times \text{Size} \times \text{Raw Throughput} \quad \text{[station-kilometre-bits-per-second]} \tag{1}$$

$$CP = \frac{P}{\text{Bit Error Rate} \times \text{Cost}} \quad \text{[station-kilometre-bits-per-second per error per dollar]} \tag{2}$$

6. Conclusions

6.1 LAN Technologies

It is concluded that only the most recent of LAN technologies are appropriate for the next generation of real-time, mission-critical, distributed systems. High throughput, in the order of tens to hundreds of Mbit/s, will be required to support high performance sensors, real-time video and image transfer, graphics and shared databases. This clearly excludes from contention standards such as MIL-STD-1553B (1 Mbit/s). Even Ethernet (10 Mbit/s) and IBM Token Ring (16 Mbit/s) are likely to find only short-term usability in such applications.

Apart from throughput, LANs should ideally offer other attributes such as determinism, synchronous and asynchronous transfer modes and fibre optic media. Only the new generation of LAN standards, including FDDI, FDDI II, ATM, SCI and Fibre Channel supports these capabilities. Of these, only FDDI is currently readily available as affordable, standard, off-the-shelf equipment.

6.2 Performance

It can be concluded from Table III that the FDDI, ATM and Fibre Channel standards offer significantly higher performance factors than all the other LAN standards.

6.3 Cost Effectiveness

FDDI and ATM offer significantly higher CP-Factors than the other LAN technologies, indicating that these technologies have very advantageous price/performance indices.

6.4 Ethernet

The major characteristics of Ethernet are that it offers modest data rates, poor efficiency, non-deterministic data transfer, limited range, but at low cost, as well as very flexible physical media options and low implementation risk.

Due to greater commercial application and cheaper cabling cost, standard Ethernet is much less expensive than standard FDDI. Ethernet can be deployed over optical fibre media and dual redundancy can be implemented. However, this would raise the cost of an Ethernet solution above that of an FDDI solution. Also, substantial software development may be required to fully implement redundancy. Ethernet would still only offer some 5% of an FDDI LAN's throughput.

While Ethernet is considered to be an excellent LAN standard for commercial or non-critical industrial or military applications, it is not considered suitable for real-time, mission-critical applications. The CSMA/CD signalling protocol gives rise to a significant degree of non-determinism which cannot be tolerated in such applications. Another consequence of CSMA/CD is that, in heavily loaded networks, the backing off and retry process associated with CSMA/CD gives rise to very poor network throughput efficiency, as low as 30%[111] compared with 95% for FDDI.
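The effect of these efficiency figures on usable throughput can be worked through in a few lines of Python. The sketch simply multiplies the raw data rate by the quoted efficiency, which is an assumed simplification; the function name is hypothetical and the 30% figure applies to a heavily loaded Ethernet only.

```python
def effective_mbit_s(raw_mbit_s, efficiency):
    """Usable throughput, taken as raw data rate times protocol efficiency."""
    return raw_mbit_s * efficiency


ethernet = effective_mbit_s(10, 0.30)  # heavily loaded 10 Mbit/s Ethernet
fddi = effective_mbit_s(100, 0.95)     # 100 Mbit/s FDDI
ratio = ethernet / fddi

print(f"Ethernet {ethernet:.1f} Mbit/s, FDDI {fddi:.1f} Mbit/s, ratio {ratio:.1%}")
```

Under these assumptions a heavily loaded Ethernet delivers roughly 3 Mbit/s against some 95 Mbit/s for FDDI, i.e. a few percent of FDDI's usable throughput, which is of the same order as the figure of some 5% quoted above.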

Bit error rates on Ethernet networks are also a factor, especially over standard Ethernet media, i.e. coaxial cable or UTP. To offset this, Ethernet could be run over fibre media, but then its main positive attribute, i.e. low cost, is diminished.

6.5 Token Bus

Token Bus protocols offer limited range, a limited number of connections, and complex, time-consuming startup and reconfiguration procedures. They can, however, offer deterministic data transfer.

6.6 IBM Token Ring

The major characteristics of IBM Token Ring are modest data rates, modest efficiency and significant transfer jitter, offset by low to moderate cost, flexible physical media options and low implementation risk.

6.7 FDDI

The timed, early release, token ring protocol as employed by FDDI offers largely deterministic data transfer, high efficiency, low jitter and priority message scheduling.

The FDDI LAN standard offers intrinsic redundancy, low error rates and electromagnetic compatibility while supporting high data throughput at affordable cost. FDDI in dual-attached configurations offers excellent scalability, with connectivity costs scaling almost linearly from two nodes to 500 nodes (the maximum for a single LAN).

The downside factors of FDDI are very few :

- There is a modest price premium for FDDI network interface cards. It is concluded, however, that the intrinsic dual-redundancy, low word error rate and high throughput more than compensate for this premium.

- In all optical connections there are optical losses. With FDDI there is a maximum permissible optical loss between active adjacent nodes (which regenerate the optical signal). This means that there is a maximum number of connectors that can be employed between active adjacent nodes. While the use of connectors does enhance maintainability, it does detract from reliability (connectors are notoriously unreliable components), as well as optical performance. Extensive use of connectors also contributes significantly to cost as high reliability connectors are expensive items.

6.8 CDDI

While CDDI is considered to be an excellent LAN standard for commercial or non-critical industrial or military applications, it is not considered suitable for real-time, mission-critical applications. The bit error rates on CDDI networks are considerably higher than for FDDI, especially over UTP cable.

CDDI is also limited in length between nodes. For MLT-3 the specified length is 100 m (90 m for main wiring and 10 m for the fly-lead).

While the use of concentrators provides for an elegant topology solution in certain applications, the following considerations are applicable :

- Concentrators are not inexpensive.

- Concentrators detract from scalability.

- Concentrators constitute single points of failure (unless dual-homing is employed).

- Dual-homing is a more costly solution to fault-tolerance than dual-attachment.

Considering all of the aspects of cost, scalability and electromagnetic compatibility, FDDI is superior to CDDI.

6.9 ATM

While ATM technology has many merits, its immediate applicability to real-time, mission-critical, distributed systems is questionable.

The first issue of concern is ATM's data transfer efficiency. ATM transfers data in 53-byte cells, of which only 48 bytes contain payload data. This implies a maximum efficiency of approximately 90% (48/53) before any higher layer protocol overhead. While this may not be important at high ATM speeds (i.e. > 600 Mbit/s), it may be a limitation at lower speeds (i.e. 25 to 155 Mbit/s).
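The efficiency figure follows directly from the cell format and can be checked with a short calculation. The sketch below uses only the 53-byte cell and 48-byte payload sizes quoted above; the padding of the final cell of a message is included, while adaptation-layer overhead is deliberately ignored (an assumption made for simplicity). The function names are illustrative.

```python
import math

CELL_BYTES = 53      # from the text: total ATM cell size
PAYLOAD_BYTES = 48   # from the text: payload bytes per cell


def cell_efficiency():
    """Upper bound on efficiency: payload bytes over total cell bytes."""
    return PAYLOAD_BYTES / CELL_BYTES


def message_efficiency(message_bytes):
    """Efficiency for one message whose last cell is padded to 48 bytes.

    Adaptation-layer overhead is ignored (an assumption of this sketch).
    """
    if message_bytes < 1:
        raise ValueError("message must contain at least one byte")
    cells = math.ceil(message_bytes / PAYLOAD_BYTES)
    return message_bytes / (cells * CELL_BYTES)
```

A 49-byte message, for example, occupies two cells (106 bytes on the medium), so short real-time messages can fall well below the 90% upper bound.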

The second major concern is ATM's lack of fault-tolerance. ATM in and of itself has no ability to recover from data loss and errors.

Despite the technical attributes of ATM, as well as the extent of installation that the technology will eventually find, FDDI is superior to ATM in respect of real-time, mission-critical, distributed systems. FDDI features intrinsic redundancy (and therefore fault-tolerance). ATM was designed for commercial high-performance networks and therefore does not incorporate redundancy as a built-in feature. Redundancy could be achieved by duplicating the ATM network interface cards in each network node, as well as the switching and cable plant. However, this would be extremely costly and complex, especially considering that ATM features a star-type topology.

ATM's star-type topology also implies that it is not a scalable technology. It requires at least one expensive switch even to connect two nodes.

Despite its drawbacks, ATM has both strong technical credentials and extensive commercial backing. It is almost certainly the network technology of the future for general-purpose, local and wide area networks.

6.10 Scalable Coherent Interface

Due to its high performance, flexible link options and support for a shared memory model, SCI is likely to supersede traditional parallel backplane buses (PBBs) in the medium term future.

Most common PBBs such as VME, Multibus II and Futurebus+ are either not optimised for multiprocessing (VME), suffer from significant backplane latencies (Multibus II), or are considerably more expensive (Futurebus+). SCI will be able to exploit these weaknesses and thereby provide the optimal inter-processor communications infrastructure of future real-time, mission-critical, distributed systems. SCI will be able to support true, transparent, distributed processing between processors located in the same housing or separated by thousands of metres.

6.11 Fibre Channel

Due to its high performance, flexible link options and reliable communications topology, Fibre Channel is likely to supersede traditional LANs in the medium term future.

7. Recommendations

7.1 FDDI

FDDI should be adopted as the LAN standard for all mission-critical applications, wherever EMC is a factor, or wherever LANs are logically or geographically extensive.

7.2 CDDI

CDDI, using a fibre backbone and concentrators, is recommended where high performance is required, but costs are a factor and fault-tolerance is not required from each and every LAN connection.

7.3 ATM

ATM is recommended where extremely high throughput is required and fault-tolerance is not required from each and every LAN connection.

In mission-critical systems where fault-tolerance is not required from every LAN node, the backbone network, i.e. that connecting ATM switches, can be replicated.

7.4 Ethernet

Ethernet is recommended in non-real-time systems requiring modest data transfer performance and where acquisition costs are a major factor.

Appendix C

Network Layer LAN Protocols

1. Scope
   1.1 Scope
   1.2 Introduction
   1.3 Appendix Layout

2. Network Layer Protocols - Characteristics
   2.1 Global Addressing
   2.2 Routing
   2.3 Fragmentation and Re-Assembly
   2.4 Special Services
   2.5 Connectivity
      2.5.1 Connection-Oriented Approach
      2.5.2 Connectionless Approach
   2.6 OSI Network Layer Protocol
      2.6.1 Connection-Oriented Network Protocol
      2.6.2 Connectionless Network Protocol (CLNP)
   2.7 Internet Protocol (IP)
      2.7.1 Functionality
      2.7.2 Sub-Protocols
      2.7.3 Problems
         2.7.3.1 Address Space
         2.7.3.2 Routing
      2.7.4 Classless Inter-Domain Routing
      2.7.5 Host Extension for IP Multicast
      2.7.6 XTP-Aware IP Router
   2.8 Novell NetWare Protocols
      2.8.1 NetWare IPX
   2.9 ATM
   2.10 HPN
   2.11 New Generation Protocols
      2.11.1 Next Generation IP (IPng)
         2.11.1.1 Simple Internet Protocol Plus (SIPP)
         2.11.1.2 TCP and UDP with Bigger Addresses (TUBA)
         2.11.1.3 CATNIP
      2.11.2 Internet Stream Protocol (ST-II)
      2.11.3 NIMROD

3. Conclusions
   3.1 Connectionless vs Connection-Oriented Approach
   3.2 CLNP vs IP
   3.3 Network Layer Access
   3.4 NetWare Protocols
   3.5 Internet Stream Protocol - ST-II

4. Recommendations
   4.1 Recommended Option
      4.1.1 Short Term
      4.1.2 Medium to Long Term

1. Scope

1.1 Scope

This appendix describes the detailed characteristics of Network Layer (i.e. Layer 3) LAN protocols, specifically in terms of their real-time performance and their suitability or otherwise for the implementation of real-time, mission-critical, distributed systems. It also provides a qualitative trade-off between possible contenders for the network layer protocol option of the Real-Time LAN Profile.

1.2 Introduction

The International Organization for Standardization (ISO) defined, and in 1984 published, a framework for communications standards which partitions the functions required for communication into seven layers. This was termed the Open Systems Interconnection (OSI) Basic Reference Model. Layer 3 of the OSI model is termed the Network Layer (NL). The primary functions of the NL protocol are to provide message routing and switching, as well as internetwork flow and congestion control.

The functions of the NL protocol are significant to the suitability of local area networks for real-time, mission-critical, distributed systems. They determine such capabilities as global addressing, internetworking (for larger networks), congestion control, intermediate error control, packet fragmentation and reassembly.

1.3 Appendix Layout

The appendix commences with an overview of the characteristics of available Network Layer Protocols and proceeds with descriptions of specific current and proposed implementations thereof. Important implications of these characteristics and implementations are then analysed within the context of real-time, mission-critical, distributed systems.

Specific conclusions and recommendations are then made, with the most significant of these being adopted in the main section. In particular, a specific option is recommended for inclusion in the Real-Time LAN Profile described in the main section.

2. Network Layer Protocols - Characteristics

The Network Layer is responsible for providing interconnection between network layer users (e.g. transport layer entities) such that users are shielded from the details of the number and characteristics of sub-networks separating them. The NL provides transparent interconnection between all network layer service users attached to the internetwork.

The network layer protocol functions can be summarised as follows :

- Global Addressing
- Routing
- Fragmentation and Re-Assembly
- Special Services

2.1 Global Addressing

Global addressing allows the delivery of data across an interconnected system of local area networks. Thus addressing information is required to identify both networks and end stations. The Network Layer performs such addressing in terms of a Global Addressing Scheme. This scheme therefore has two components, the network component and the station component.

Routing functions implemented within the network layer route packets first to the destination sub-network and then to the destination end station. Such a hierarchical addressing scheme allows efficient routing through networks consisting potentially of many (millions of) end stations.
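The two-component scheme can be sketched as follows. This is a minimal illustration under assumed names (GlobalAddress, next_hop and the router labels are hypothetical); its purpose is to show that a router need only hold one route per sub-network, not one per end station.

```python
from typing import NamedTuple


class GlobalAddress(NamedTuple):
    network: int   # identifies the sub-network
    station: int   # identifies the end station within that sub-network


def next_hop(local_network, routes, destination):
    """Route on the network component only; deliver locally on a match."""
    if destination.network == local_network:
        return ("deliver", destination.station)
    return ("forward", routes[destination.network])


# Hypothetical routing table of a router attached to sub-network 1:
# one entry per remote sub-network, regardless of how many stations it holds.
ROUTES = {2: "router-A", 3: "router-B"}
```

With this table, next_hop(1, ROUTES, GlobalAddress(3, 4071)) forwards via "router-B", while an address on sub-network 1 is delivered directly to the station.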

2.2 Routing

Routing is the logical directing of packets through the network from end station to end station via intermediate stations, i.e. the routers. Routing may be static, i.e. according to paths determined at setup time, or it may be adaptive, i.e. according to dynamic routing tables determined according to online network conditions such as node failures or delay characteristics.

Static routing is only appropriate for small, simple networks. Adaptive routing is more complex than static routing and can be classified as isolated, centralised or distributed. Isolated routing utilises information gleaned from packets as they traverse the router. Isolated routing is non-optimal due to the limited availability of useful routing information at the routing nodes. Centralised routing relies on global information residing in a central master routing node. As such, it constitutes a single point of failure within a system. Distributed routing involves communication between routers to perform and optimise routing throughout the complete network.
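The difference between static and adaptive routing can be illustrated by computing a next-hop table once at setup time and again after a node failure. The sketch below is a simplification under stated assumptions: the four-router topology is hypothetical, hop count is the only metric, and the table is computed centrally by breadth-first search rather than by an exchange between routers.

```python
from collections import deque


def next_hops(links, source, failed=frozenset()):
    """Return {destination: next hop} from source by breadth-first search.

    links maps each node to its neighbours; failed nodes are skipped.
    """
    table = {}
    visited = {source}
    queue = deque()
    for neighbour in links[source]:
        if neighbour not in failed and neighbour not in visited:
            visited.add(neighbour)
            table[neighbour] = neighbour
            queue.append(neighbour)
    while queue:
        node = queue.popleft()
        for neighbour in links[node]:
            if neighbour not in failed and neighbour not in visited:
                visited.add(neighbour)
                table[neighbour] = table[node]   # inherit the first hop
                queue.append(neighbour)
    return table


# Hypothetical topology with two paths from A to D: A-B-D and A-C-D.
LINKS = {"A": ["B", "C"], "B": ["A", "D"], "C": ["A", "D"], "D": ["B", "C"]}
static_table = next_hops(LINKS, "A")                  # computed once at setup
adaptive_table = next_hops(LINKS, "A", failed={"B"})  # recomputed after B fails
```

In this example the static table continues to direct traffic for D through the failed node B, whereas the recomputed table redirects it through C.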

Communication between routers is itself governed by protocols: the routing protocols, which are sub-protocols of the network layer protocols.

2.3 Fragmentation and Re-Assembly

In heterogeneous networks, sub-networks may provide data delivery services for packets of different sizes. When packets are exchanged between one sub-network and another, the network layer must ensure that packets do not exceed the maximum transmission unit (MTU) of the receiving sub-network. If a receiving sub-network has an MTU smaller than that of the transmitting sub-network, the network layer is responsible for fragmentation of the packet on delivery and then re-assembly on reception.
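Fragmentation and re-assembly can be sketched as a pair of functions. This is a minimal illustration, not the mechanism of any particular protocol: header overhead is ignored, so the MTU is treated as the payload capacity of a fragment, and each fragment carries only an offset and a "more fragments" flag (both assumptions of this sketch).

```python
def fragment(payload: bytes, mtu: int):
    """Split payload into (offset, more_fragments, data) tuples of at most mtu bytes."""
    if mtu <= 0:
        raise ValueError("mtu must be positive")
    fragments = []
    for offset in range(0, len(payload), mtu):
        data = payload[offset:offset + mtu]
        more = offset + mtu < len(payload)
        fragments.append((offset, more, data))
    return fragments


def reassemble(fragments):
    """Rebuild the payload from fragments received in any order."""
    ordered = sorted(fragments, key=lambda f: f[0])
    expected = 0
    parts = []
    for offset, more, data in ordered:
        if offset != expected:
            raise ValueError("missing fragment")
        parts.append(data)
        expected += len(data)
    if not ordered or ordered[-1][1]:
        raise ValueError("final fragment not received")
    return b"".join(parts)
```

The offsets allow fragments that arrive out of order to be placed correctly, and the flag on the final fragment tells the receiver when the packet is complete.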

2.4 Special Services

The network layer may offer optional special services. Examples of such special services are security mechanisms, route recording, timestamping, reverse charging, enforcement of simplex communication and expedited data.

2.5 Connectivity

A fundamental consideration in the determination of the functionality of the network layer is whether it should provide an end-to-end, reliable data transfer service or whether this should be the responsibility of the transport layer.

An end-to-end, reliable data transfer service can be considered as a connection-oriented approach whereas a non-reliable service can be considered as a connectionless approach. Academics and implementers in the network field are sharply divided as to the merits and appropriateness of each approach.

2.5.1 Connection-Oriented Approach

A connection-oriented approach involves a set of end-to-end transactions whereby a connection is established between nodes until the connection is no longer required. Data transfer occurs while the connection is valid and then a transaction occurs to disestablish the connection.

2.5.2 Connectionless Approach

A connectionless approach involves the transmission of independent messages on an individual basis. Each message carries the full destination address. Error control is not performed and this is left to the transport layer.

2.6 OSI Network Layer Protocol

Originally the OSI Basic Reference Model followed the connection-oriented approach. Later a connectionless option was also adopted.

Various commercial companies have developed protocol products (usually software) to conform to the OSI NL standard. These are available as "shrink wrap", off-the-shelf products which a system implementer binds together with other protocol software to form a software communication system. It is also possible, but normally more difficult, to obtain OEM (Original Equipment Manufacturer) type products which an implementer can tailor for incorporation into the system.

Often the NL is tightly coupled to the transport layer and a direct access interface to the NL is not available.

2.6.1 Connection-Oriented Network Protocol

The Connection-Oriented Network Protocol (CONP) is an OSI conformant protocol described by ISO 8348[16]. At this time, only a means for adapting the X.25 network protocol is provided. This has not found wide acceptance, especially in the USA.

2.6.2 Connectionless Network Protocol (CLNP)

The Connectionless Network Protocol (CLNP) is an OSI conformant protocol described by ISO 8473[17]. CLNP provides for passing data units from one End System to another End System (through Intermediate Systems where required). An End System is a user station which does not route data to other users. An Intermediate System is a station that routes data between the initial sending and final receiving of data. CLNP provides two primary services: relaying of data units and route determination. Relaying of data units involves the passing of a data unit from one End or Intermediate System to another. Route determination is where the paths for the relaying of data units are dynamically determined. In most typical implementations, there are many more End Systems than Intermediate Systems. The OSI routing architecture is intended to result in a few complex Intermediate Systems balanced by many simple End Systems.

Two types of protocol data units (PDUs) are defined in CLNP, the Data PDU and the Error PDU. The Data PDU carries data from the Transport Layer to the lower layers of the network, and the Error PDU is used to report to the source node any packets that had to be dropped. Several functions are defined in CLNP in order to build these PDUs and control the forwarding of them to other nodes. These functions are used by the other OSI protocols as building blocks for more sophisticated routing functions.

CLNP uses three routing protocols: the End System - Intermediate System Routing Protocol (ES-IS), the Intermediate System - Intermediate System Routing Protocol (IS-IS) and the Inter-Domain Routing Protocol (IDRP).

The addresses used by CLNP identify nodes instead of interfaces.

The US Government has selected the CLNP as a mandatory part of the Government Open Systems Interconnect Profile (GOSIP)[22].

2.7 Internet Protocol (IP)

In the Internet Profile, the Internet Layer is concerned with routing data between two hosts attached to different or multiple networks. An internet is an interconnected set of networks. The Internet Protocol (IP) is used at this layer.

2.7.1 Functionality

IP provides for the routing functions between hosts. It is a connectionless protocol and is responsible for the transmission of segmented data. Packets sent via IP are independent, i.e. they may travel on independent paths to reach their destination. Routers running IP are not required to maintain state information describing the streams of packets passing through them. Consequently, reserving bandwidth and guaranteeing end-to-end latencies are difficult.

A key feature of IP is its popularity among today's networking community. IP is the basis of the Internet, which is a global computer network. IP's popularity has resulted in many vendors choosing not to implement IP's counterpart, CLNP.

2.7.2 Sub-Protocols

There are a number of additional protocols that are used in conjunction with IP. The Address Resolution Protocol (ARP) is used by hosts to determine the mapping between an IP address and a MAC address. IP datagrams must be encapsulated in a MAC packet before being transmitted on a LAN.

The Internet Control Message Protocol (ICMP) is used for a number of purposes. A simple echo facility aids in debugging IP networks. Another ICMP function is to redirect hosts to use a more optimal route to reach a destination. Requests for reports on the address mask used on a network can also be made via ICMP.

The Routing Information Protocol (RIP) and the Open Shortest Path First (OSPF) protocol are two popular routing protocols used by IP. Although both protocols are used to compute and distribute routing information throughout an IP network, they use different algorithms to achieve this.
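The difference in algorithm can be illustrated as follows. RIP is a distance-vector protocol, whereas OSPF is a link-state protocol. The sketch below is a simplification under stated assumptions: the four-router topology and its link costs are hypothetical, and both computations are run centrally on one complete cost table, whereas real routers running RIP exchange distance vectors with their neighbours and real routers running OSPF flood link-state advertisements.

```python
import heapq

# Hypothetical link costs between four routers (symmetric links).
COSTS = {
    "A": {"B": 1, "C": 4},
    "B": {"A": 1, "C": 2, "D": 5},
    "C": {"A": 4, "B": 2, "D": 1},
    "D": {"B": 5, "C": 1},
}


def distance_vector(costs, source):
    """Bellman-Ford style: relax every link repeatedly until nothing changes."""
    dist = {node: float("inf") for node in costs}
    dist[source] = 0
    changed = True
    while changed:
        changed = False
        for node, neighbours in costs.items():
            for neighbour, cost in neighbours.items():
                if dist[node] + cost < dist[neighbour]:
                    dist[neighbour] = dist[node] + cost
                    changed = True
    return dist


def link_state(costs, source):
    """Dijkstra style: settle the closest unsettled router first."""
    dist = {source: 0}
    heap = [(0, source)]
    while heap:
        d, node = heapq.heappop(heap)
        if d > dist.get(node, float("inf")):
            continue   # stale heap entry
        for neighbour, cost in costs[node].items():
            candidate = d + cost
            if candidate < dist.get(neighbour, float("inf")):
                dist[neighbour] = candidate
                heapq.heappush(heap, (candidate, neighbour))
    return dist
```

Both functions arrive at the same least-cost distances from router A; they differ in how the information is gathered and how quickly the result converges after a change.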

2.7.3 Problems

Despite the wide acceptance of IP, it has some fundamental design problems which will render it ineffective in the medium term. The first result of these problems is that classical IP addresses will be exhausted in the short term. The second result is that IP routing capability will become saturated.

2.7.3.1 Address Space

The first problem is caused by IP's addressing scheme. IP uses a flat address space divided into three commonly used spaces called classes (Class A, B and C). Of the classes, Class B is the most popular since the number of hosts supported matches the requirements of many user sites. The Class B space allows up to 65 536 hosts on the network. The Class A space is too large (i.e. 16 million hosts) and the Class C space too small (i.e. 256 hosts). Thus if traditional IP class allocation is used, it is predicted that the Class B space will run out of addresses in the short term based on the current allocation rates.
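The class sizes quoted above follow from the fixed split between network and host bits in each class. The sketch below is an illustration only: it identifies the class of a classful IP address from its leading bits and returns the number of addresses per network. The figures are raw address counts; the number of usable host addresses in each case is two fewer, since the all-zeros and all-ones host fields are reserved.

```python
import ipaddress

# (class, leading bit pattern, number of leading bits, network-prefix length)
CLASSES = [("A", 0b0, 1, 8), ("B", 0b10, 2, 16), ("C", 0b110, 3, 24)]


def address_class(address):
    """Return (class letter, addresses per network) for a classful IPv4 address."""
    value = int(ipaddress.IPv4Address(address))
    for name, pattern, bits, prefix in CLASSES:
        if value >> (32 - bits) == pattern:
            return name, 2 ** (32 - prefix)
    return None, 0   # Class D/E: not used for ordinary host addressing
```

For example, address_class("172.16.0.1") identifies a Class B address with 65 536 addresses per network, which is the figure quoted in the text.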

2.7.3.2 Routing

As additional networks are added to the Internet, there is a corresponding increase in the number of routes to be maintained. Soon routers will not be able to reasonably route packets using current Internet schemes. If the Class B address space runs out, large sites will have to be allocated Class A addresses. This again increases the number of advertised routes and further exacerbates the route explosion problem.

2.7.4 Classless Inter-Domain Routing

To overcome IP's address depletion and route explosion problems, a new IP addressing scheme has been proposed. This scheme is known as Classless Inter-Domain Routing (CIDR).

CIDR eliminates traditional IP's class hierarchy by using address prefixes. An address prefix is a tuple consisting of a 32-bit mask and a 32-bit IP address. A series of contiguous ones in the mask signifies which bits in the IP address deal with routing. Address prefixes allow variable length (i.e. any power of two) address blocks to be assigned. Address blocks can be allocated which reflect a given network's size requirements instead of the fixed size as is done with classes. This allows the address space to be assigned more efficiently. Address prefixes can be used to build a hierarchical address space. For example, a service provider could acquire a large address block with a given prefix. It could then assign customers smaller blocks under that prefix. They could assign smaller blocks within their blocks and so on. The service provider would only have to advertise a route to its overall prefix instead of all the individual networks it serves. This can dramatically reduce the number of routes advertised on the Internet.
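The service provider example can be reproduced with the ipaddress module of the Python standard library. The address blocks below are hypothetical and taken from private address space; the sketch shows the contiguous-ones mask, the assignment of smaller customer blocks under the provider's prefix, and the collapse of adjacent customer routes into fewer advertised prefixes.

```python
import ipaddress

# Hypothetical provider block and customer assignments (private address space).
provider = ipaddress.ip_network("10.32.0.0/16")
customers = list(provider.subnets(new_prefix=24))[:4]   # first four /24 blocks

# The mask is a run of contiguous ones covering the routing bits.
mask_bits = format(int(provider.netmask), "032b")

# Every customer block falls under the provider prefix ...
all_covered = all(block.subnet_of(provider) for block in customers)

# ... so adjacent customer routes collapse into fewer advertised prefixes.
advertised = list(ipaddress.collapse_addresses(customers))
```

Here the four adjacent /24 customer blocks collapse into the single prefix 10.32.0.0/22, and all of them are in turn covered by the provider's /16, which is the only route that needs to be advertised externally.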

Many observers are of the opinion that CIDR can extend the life of IP until the year 2005 or beyond. CIDR proponents are also investigating the use of address recycling whereby sites can return addresses assigned to them. For example, a site with 50 nodes and a Class B licence could exchange it for a CIDR block with room for 64 nodes. This gives some room for expansion and releases over 65 000 addresses which were not being used before.
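The arithmetic of the address recycling example can be checked as follows. The sketch is illustrative only and, like the example in the text, ignores the addresses reserved within each block; the function name is hypothetical.

```python
CLASS_B_SIZE = 2 ** 16   # addresses in a Class B network


def smallest_block(nodes):
    """Return (prefix length, block size) of the smallest power-of-two block."""
    if nodes < 1:
        raise ValueError("at least one node is required")
    host_bits = (nodes - 1).bit_length()
    return 32 - host_bits, 2 ** host_bits


prefix, size = smallest_block(50)   # a site with 50 nodes
released = CLASS_B_SIZE - size      # addresses returned for re-allocation
```

For 50 nodes this yields a block of 64 addresses (a 26-bit prefix) and releases 65 472 addresses, consistent with the figure of "over 65 000" given above.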

2.7.5 Host Extension for IP Multicast

The Host Extension for IP Multicast specifies extensions to give an IP implementation the ability to send multicast messages to a "host group". A host group is a set of zero or more receivers identified by a single IP destination address. The messages are sent as datagrams, thus reliable delivery of messages is not guaranteed.

2.7.6 XTP-Aware IP Router

While maximising interoperability, the employment of IP can have latency implications in internetwork topologies because IP is not optimised for routing packets through a network.

Functional extensions to an IP Router can overcome this problem, however. XTP features a traffic descriptor which will allow an XTP-aware IP router[51] to prioritise and expedite XTP packets. An extended IP implementation recognizes XTP packets, accomplishes resource reservation and admission control, guarantees quality of service, etc.

It is contended that this is a major requirement for real-time, mission-critical systems which employ an internetwork topology and IP.

2.8 Novell NetWare Protocols

Novell Corporation is the developer of the Novell NetWare Network Operating System. NetWare fileservers are the hosts of a large proportion of all commercial computer nodes connected via local area networks. Until recently, this proportion was estimated at over 50% worldwide and as high as 95% in South Africa. While NetWare has lost some market share to other operating systems such as Microsoft Windows NT and Unix, it constitutes a significant factor in global network connectivity. As such, NetWare protocols require consideration in terms of functionality and interconnectivity (refer also to Paragraph 2.11.1.3).

When Novell began operations in 1982, several proprietary protocols for transferring data between workstations were used. As time went on, the decision was made to base Novell's network communications on a fast and efficient networking standard. Xerox's XNS protocol was determined to be one of the best available at the time so Novell's Internetwork Packet Exchange (IPX) protocol was developed to conform to the XNS standard. NetWare IPX is functionally equivalent to Xerox's Internet Datagram Protocol (IDP).

Three primary peer-to-peer protocols are supported in the NetWare LAN environment, NetWare IPX, SPX and NetBIOS. Additional protocols supported include the Transport Layer Interface (TLI), Named Pipes, LU6.2 and others. IPX is Novell's Network Layer Protocol.

2.8.1 NetWare IPX

NetWare IPX is a true datagram protocol. It makes a best-effort attempt to send packets by using a 12-byte addressing scheme. The 12-byte address is split into three addresses: the network address, which is used to address individual workgroups; the node address, which addresses network nodes within the workgroups; and the socket address, which can be used to multiplex between functions within a network node. When sending a NetWare IPX packet from one node to another, the sending node must know the receiving node's 12-byte address.
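The structure of the 12-byte address can be sketched with the struct module of the Python standard library. The division into a 4-byte network address, a 6-byte node address and a 2-byte socket address is the commonly documented IPX layout and is an assumption here, since the text gives only the 12-byte total; the function names and the example values are hypothetical.

```python
import struct

# Assumed layout: network (4 bytes), node (6 bytes), socket (2 bytes),
# packed in network byte order with no padding.
IPX_ADDRESS = struct.Struct("!I6sH")


def pack_ipx_address(network, node, socket):
    """Pack the three address components into the 12-byte form."""
    if len(node) != 6:
        raise ValueError("node address must be 6 bytes")
    return IPX_ADDRESS.pack(network, node, socket)


def unpack_ipx_address(raw):
    """Split a 12-byte address into (network, node, socket)."""
    return IPX_ADDRESS.unpack(raw)
```

The 6-byte node component has the same length as a LAN MAC address, and the socket component selects a function within the addressed node.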

2.9 ATM

Inserting payload data into the 48 byte data field of an ATM cell is accomplished by the ATM Adaptation Layer (AAL). This provides ATM with the flexibility to offer entirely different types of data transport services within the same format.

One possibility is for ATM to perform LAN emulation by supporting a classical network layer protocol. The ATM Forum is investigating using IP over ATM for this purpose. The IP over ATM working group of the Internet Engineering Task Force (IETF) is developing protocols to enable ATM networks to serve as expanded links for the Internet. In particular, a "Classical IP and ARP over ATM" specification (RFC 1577) was produced by the IETF IP over ATM working group. This will support the ability to transfer IP packets in an ATM infrastructure and the ability for an application knowing an IP address to find the corresponding ATM address.

2.10 HPN

The US Navy is planning a follow-on project to SAFENET, i.e. the High Performance Network (HPN), which will probably be based on a combination of ATM, SCI and Fibre Channel (600 Mbit/s to 3 Gbit/s). HPN will typically find application in WANs and the network backbone, while SAFENET/FDDI will find application in the local control LANs.

It is clear that both IP and CLNP are deficient in certain functional and performance areas. New generation, high performance network layer protocols will therefore be required to support the High Performance Network architecture.

2.11 New Generation Protocols

Current network layer protocols are deficient in certain functional areas as well as falling short in meeting the real-time requirements of next generation mission-critical, distributed systems.

The US Navy's High Performance Network Working Group (HPNWG) has identified three contenders to replace SAFENET's IP and CLNP for the US Navy's High Performance Network (HPN). These are IP Next Generation (IPng), ST-II and NIMROD[85].

2.11.1 Next Generation IP (IPng)

The Internet Engineering Task Force (IETF) has identified three contenders to replace IP. These are the Simple Internet Protocol Plus (SIPP), TCP and UDP with Bigger Addresses (TUBA) and Common Architecture Technology for Next-Generation Internet Protocol (CATNIP).

2.11.1.1 Simple Internet Protocol Plus (SIPP)

The Simple Internet Protocol Plus (SIPP) is a merger of three proposals: Simple Internet Protocol (SIP), "P" Internet Protocol (PIP) and IP Address Extensions (IPAE). While each of the three proposals was developed and completed independently in the IPng arena, each had weaknesses that were strongly addressed by the other proposals. The merger of the three proposals into SIPP has had a synergistic effect resulting in a technically and democratically stronger proposal.

SIP is the base network layer protocol for SIPP. It differs from the current IP in three main areas: it has expanded addressing capabilities, it simplifies the IP header format and it provides better support for adding optional capabilities.

To deal with the address deficiency problems of IP, SIP uses a 64-bit fixed length address field which the SIP designers feel is sufficient to accommodate Internet growth for the next 25 years. Also, 64 bits allow for efficient processing with the next generation of 64-bit microprocessors. SIP addresses do not have the strict class boundaries of IP. As with IP, bit masks can be used to define network hierarchy. Because of the large flexible address space, a number of address schemes are being considered. These include provider-based addressing, geographic-based addressing and hybrid schemes. IP addresses can be embedded into SIP addresses by placing IP addresses into the lower 32 bits of the SIP address and setting the high order bit in the SIP address. SIP provides support for multicast by defining a multicast address field with a 48-bit group ID and an 8-bit flags and scope field.
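The embedding of an IP address in a SIP address, as described above, amounts to simple bit manipulation. The sketch below follows the description in the text (IP address in the lower 32 bits, high order bit set); leaving the intermediate bits at zero is an assumption of this sketch, and the function names are hypothetical.

```python
import ipaddress

HIGH_ORDER_BIT = 1 << 63        # flag bit described in the text
LOW_32_MASK = (1 << 32) - 1     # lower 32 bits hold the IP address


def embed_ipv4(address):
    """Return a 64-bit SIP-style address carrying the given IP address."""
    return HIGH_ORDER_BIT | int(ipaddress.IPv4Address(address))


def extract_ipv4(sip_address):
    """Recover the embedded IP address, if the high order bit is set."""
    if not sip_address & HIGH_ORDER_BIT:
        raise ValueError("no embedded IP address")
    return str(ipaddress.IPv4Address(sip_address & LOW_32_MASK))
```

The mapping is reversible, so a node receiving such an address can recover the original IP address without reference to any translation table.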

PIP brings flexibility to the addressing and routing architecture of SIPP to support service provider selection, mobility, cluster addressing, auto-configuration and variable length addressing. This enhanced architecture is built on the 64-bit address field and optional routing header of SIP.

The addresses used by SIPP differ from IP addresses in that they identify nodes instead of interfaces. This is similar to the addressing method used by CLNP. Each SIPP node has a globally unique 64-bit network address. Other SIPP addresses may have a more limited scope. For example, a workstation in the process of rebooting may use a locally assigned address (e.g. its MAC address) until such time as it can determine its global address.

Route reversal concepts from PIP have been incorporated into SIPP. For example, when a node which is not the initiator of an association receives a packet, it can follow a set of rules to determine a transmission path back to the originator without an understanding of the network infrastructure between the two nodes.

IPAE brings a transition plan to SIPP which allows IP Version 4 hosts to migrate to SIPP without disrupting operation of the Internet. IPAE allows SIPP systems to interoperate with IP V4 hosts. SIPP-only hosts can communicate through an IP V4 infrastructure by encapsulating SIPP packets into IP V4 packets. Mechanisms are also provided to translate IP V4 packets into SIPP packets. This will be needed during the transition phase where end-to-end IP V4 connectivity is not available due to widespread SIPP deployment.

The status of SIPP is that a draft protocol specification has been written. A number of organisations have developed prototype implementations of SIP and PIP which have been demonstrated running on various machines on the Internet.

2.11.1.2 TCP and UDP with Bigger Addresses (TUBA)

TCP and UDP with Bigger Addresses (TUBA) is an IPng proposal which is aimed at solving the IP addressing and scaling problems by replacing IP with CLNP. Of the IPng candidates being considered, TUBA is the most controversial since it brings together technologies from the Internet and OSI domains.

Supporters of TUBA are of the opinion that using CLNP and its associated protocols solves the problems plaguing IP. Since address fields in CLNP, called NSAPs (Network Service Access Points), can be fairly large (e.g. GOSIP uses 20 bytes), the 32-bit address limitation of IP is overcome. Since NSAP address assignment is quite flexible, arbitrary levels of hierarchy can be assigned with appropriate partitioning of the address space. Since most routers can support CLNP, there may well be fewer conversion implications than with other proposals.

The concept of TUBA is simple: replace IP with CLNP in such a way that it is invisible to the applications. This is actually not that difficult to accomplish. Since CLNP was originally derived from IP, many fields in the protocols are similar or serve similar functions. Consequently, IP packets are not too difficult to map into CLNP packets.

A transition strategy from IP V4 to TUBA has been identified. Under this strategy, IP-only systems use IP to communicate with all other systems. Consequently, IP infrastructure must be available along the entire path from source to destination. Such connectivity is ubiquitous today, but if TUBA becomes dominant, IP-only hosts will be forced to upgrade to TUBA or forfeit global connectivity. TUBA-capable hosts use both IP and CLNP at the Network Layer. To communicate with IP-only hosts, TUBA-capable hosts use IP; to communicate with TUBA-capable hosts, they use CLNP.
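The protocol selection rule of this transition strategy can be stated compactly. The following is a sketch only; the names are hypothetical, and how a host learns whether its peer is TUBA-capable is outside the scope of the sketch.

```python
from enum import Enum


class NetworkProtocol(Enum):
    IP = "IP"
    CLNP = "CLNP"


def select_network_protocol(local_is_tuba: bool, peer_is_tuba: bool) -> NetworkProtocol:
    """Pick the Network Layer protocol for one source/destination pair."""
    # CLNP is used only when both ends are TUBA-capable;
    # any pairing that involves an IP-only host falls back to IP.
    if local_is_tuba and peer_is_tuba:
        return NetworkProtocol.CLNP
    return NetworkProtocol.IP
```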

The status of TUBA is that a draft protocol specification has been written. A number of organisations have developed prototype implementations which have been demonstrated running on various machines on the Internet. Of significance, Internet applications such as finger, TELNET and FTP have been demonstrated running over CLNP.

2.11.1.3 CATNIP

The Common Architecture Technology for Next-Generation Internet Protocol (CATNIP) is an IPng proposal which seeks to use packet header compression to unify three of the most widely used Network Layer protocols. CATNIP defines the compression in such a way that the format of compressed packets is identical for IP, CLNP and IPX. These three protocols were selected because of their wide deployment. It is relevant to note that the CATNIP authors observe that there is a larger installed base of IPX than IP and CLNP combined[85].

CATNIP has a less complex transition strategy than SIPP or TUBA. It is claimed that a CATNIP system can start running "out-of-the-box". For example, there are no additions to the Domain Name Service to look up CATNIP addresses as is required for both SIPP and TUBA. No complex address translation is done by CATNIP. IP and IPX addresses are mapped algorithmically by adding a simple address prefix. CATNIP does not create legacy systems. Any IP, CLNP or IPX system can continue to communicate with all systems that it can reach at present. Furthermore, with small administrative changes, such as assigning IPX domain addresses to CLNP hosts, unmodified hosts can gain greater connectivity.

CATNIP uses a simple packet format structured to make header compression possible. The first 32 bits contain the Network Layer Protocol Identifier (NLPID), some flags, the size of the header and the packet time-to-live. The second 32-bit field contains a Forward Cache Identifier (FCI) while the third contains the length of the datagram. This allows packets to be up to 4 Gbytes long. The fourth 32-bit field contains the Transport Protocol Identifier and packet checksum. All subsequent fields are optional. These would typically include the destination address, source address and options fields.
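The four mandatory 32-bit fields described above can be sketched as a fixed 16-byte structure. The division of the first and fourth words into sub-fields is an assumption made purely for illustration (8 bits each for NLPID, flags, header size and time-to-live; 16 bits each for the Transport Protocol Identifier and checksum); the text does not give these widths, and the function names are hypothetical.

```python
import struct

# Four 32-bit words in network byte order.
# Assumed sub-field widths: B = 8 bits, H = 16 bits, I = 32 bits.
CATNIP_FIXED_HEADER = struct.Struct("!BBBBIIHH")

# A 32-bit length field bounds the datagram at just under 4 Gbytes.
MAX_DATAGRAM_LENGTH = 2**32 - 1


def pack_fixed_header(nlpid, flags, header_size, ttl,
                      fci, datagram_length, transport_id, checksum) -> bytes:
    """Serialise the mandatory part of the header (optional fields omitted)."""
    return CATNIP_FIXED_HEADER.pack(nlpid, flags, header_size, ttl,
                                    fci, datagram_length, transport_id, checksum)


def unpack_fixed_header(raw: bytes) -> tuple:
    """Parse the mandatory part of the header from the start of a packet."""
    return CATNIP_FIXED_HEADER.unpack(raw[:CATNIP_FIXED_HEADER.size])
```

Any addresses and options would follow these 16 bytes as optional fields.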

The FCI is the key to header compression. Next hop stations provide FCIs to their downstream neighbours. When a station receives a packet, it is able to quickly locate the forwarding record. Such quick lookups are critical for real-time dataflows. Flow descriptors can be associated with the routing record.
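The role of the FCI as a direct handle to a forwarding record can be sketched as a simple table lookup. This is an illustration under stated assumptions rather than the CATNIP mechanism itself: the class and field names are hypothetical, and FCIs are simply allocated as consecutive integers.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class ForwardingRecord:
    next_hop: str                          # hypothetical next-hop identifier
    flow_descriptor: Optional[str] = None  # optional flow information


class ForwardingCache:
    """Maps an FCI carried in a packet header to its forwarding record."""

    def __init__(self) -> None:
        self._records: Dict[int, ForwardingRecord] = {}
        self._next_fci = 1

    def advertise(self, record: ForwardingRecord) -> int:
        """Store a record and return the FCI to be handed to a neighbouring station."""
        fci = self._next_fci
        self._next_fci += 1
        self._records[fci] = record
        return fci

    def lookup(self, fci: int) -> Optional[ForwardingRecord]:
        """Single dictionary access: no address parsing or route computation."""
        return self._records.get(fci)
```

The point of the sketch is that forwarding becomes one keyed access per packet, which is what makes the lookup time short and predictable.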

CATNIP uses NSAPs as the common address format. This is of great interest to the OSI community, especially the TUBA group. In fact, the latter have met with the CATNIP group to discuss common interests. It is speculated that the two groups may merge.

The CATNIP specification exists in draft form only.

2.11.2 Internet Stream Protocol (ST-II)

The Internet Stream Protocol, designated ST, is an experimental Internet family Network Layer protocol used to provide guaranteed bandwidth and controlled delay characteristics. Unlike with IP, ST routers are required to maintain state information describing the streams of packets passing through them. This allows routers to allocate bandwidth in an intelligent manner. Pre-allocation of resources allows packets to be forwarded with low delay, low overhead and low probability of loss due to congestion. ST has been used to conduct voice conferencing on the Internet. More recently, with the advent of ST-II, multimedia conferences have been conducted on the Internet where voice, video and pointing device positions have been integrated for display at multiple remote sites.

ST provides two approaches to establish connections, simplex and full duplex. In the simplex approach, the calling party requests a simplex connection to the called party which, after accepting the request, requests a simplex connection in the reverse direction. Full duplex connections are those in which, after the called party accepts the connection request, data can flow in both directions. Simplex connections can take maximum advantage of the available resources by using different routes for each direction of the connection.

The Internet Stream Protocol ST-II is a protocol design effort based on the ST protocol. The four main differences between ST and ST-II are as follows:

• ST-II is decoupled from the Access Controller, the centralised distributor of conference information.

• The stream construct of ST-II is a directed tree carrying traffic away from the source (the root) to the destinations (the leaves).

• A number of robustness and recovery mechanisms which were left undefined in ST are defined in ST-II.

• No distinction is made in ST-II between streams with two participants and those with more than two.

The robustness and recovery mechanisms make ST-II a candidate for HPN and therefore for future real-time, mission-critical, distributed systems.

2.11.3 NIMROD

The goal of the New Internet Routing and Addressing Architecture, designated NIMROD, is to design a flexible new routing and addressing architecture that is suitable for very large-scale internets. Consequently, the shortcomings of IP Version 4 will be addressed by the proposal. NIMROD will be based on an architecture in which network topology maps are distributed between nodes. Each node uses these maps to compute routes so that it can generate a source-specified route in outgoing packets. NIMROD will examine both inter-domain and intra-domain routing aspects. Variable length addresses will be used to provide multiple levels of abstraction.
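The idea of computing a source-specified route from a locally held topology map can be sketched as follows. A breadth-first search for a fewest-hop route is used here purely as a stand-in; the text does not state which route computation NIMROD uses, and the map representation and function name are hypothetical.

```python
from collections import deque


def compute_source_route(topology_map, source, destination):
    """Return the list of nodes on a fewest-hop route, or None if unreachable.

    topology_map: dict mapping each node to an iterable of its neighbours.
    """
    previous = {source: None}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == destination:
            # Walk the predecessor chain back to the source.
            route = []
            while node is not None:
                route.append(node)
                node = previous[node]
            return route[::-1]
        for neighbour in topology_map.get(node, ()):
            if neighbour not in previous:
                previous[neighbour] = node
                queue.append(neighbour)
    return None
```

The returned node list is what the originating node would place in an outgoing packet as its source-specified route.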

NIMROD is not considered as a direct contender in the IPng arena. Many view this work as an IPng follow-on effort.

3. Conclusions

3.1 Connectionless vs Connection-Oriented Approach

Connection-oriented schemes require an extra level of functionality at the network layer in order to manage connections and error control. This implies an extra overhead in terms of software protocol processing resources, especially time. It is concluded that for real-time systems this is inappropriate and that a connectionless approach is superior to a connection-oriented one at the network layer.

3.2 CLNP vs IP

It is concluded that CLNP and IP have equivalent functionality and performance and therefore that, theoretically, either is a suitable candidate for real-time, mission-critical, distributed systems.

IP does suffer from two fairly significant problems related to addressing and routing; however, these problems will not manifest themselves for some time. They are also only applicable in the wide area Internet environment and not in most real-time control systems.

CLNP is more modern than IP and was in fact developed from IP. It therefore overcomes some of the problems of the latter and is even being considered as an IPng contender, i.e. TUBA.

The US Government attempted to prescribe GOSIP[22] conformance as mandatory for all future government network acquisitions, including those in the US military. GOSIP is based on OSI. However, programme managers ubiquitously and consistently found means to have this prescription waived to such an extent that the GOSIP initiative has effectively been nullified. In most cases, GOSIP gave way to the Internet Profile, thus enhancing the application base of IP at the expense of CLNP.

Two approaches can be considered in the choice of an appropriate Network Layer protocol: either adopt a multi-protocol approach or discard one option in favour of the other. In the latter case, considerations other than technical ones apply. It is contended that, despite the elegance and technical appropriateness of the OSI network layer protocol, IP will prevail in the real world due to its extensive installed Internet (ARPAnet) base and consequent legitimacy amongst many network users, championed by the world's largest organisation, the US Department of Defense.

Real-world realities have resulted in commercial networking companies abandoning their OSI protocol products. This can only point to the ultimate demise of the OSI product profile.

3.3 Network Layer Access

With many available NL products, the NL protocol is bundled together with the transport layer (TP4 or TCP) and therefore an access interface to the NL is not available to the system implementer.

It is concluded that this presents a problem to the real-time system implementer, whose objective would be to integrate the real-time transport layer option (e.g. XTP) directly with the network layer.

3.4 NetWare Protocols

It has been observed that there is a larger installed base of IPX than IP and CLNP combined. For this reason, the authors of CATNIP have catered for IPX in their IPng proposal.

3.5 Internet Stream Protocol - ST-II

The robustness and recovery mechanisms of ST-II make it a candidate for HPN and therefore for future real-time, mission-critical, distributed systems.

4. Recommendations

4.1 Recommended Option

4.1.1 Short Term

The recommended short term option for the Network Layer protocol is the Internet Protocol (IP).

4.1.2 Medium to Long Term

For real-time, mission-critical, distributed systems, the recommended medium to long term option for the Network Layer protocol is the Next Generation Internet Protocol (IPng).

Organisations requiring specific features from IPng should attempt to influence the latter's definition through participation in the IETF.

Appendix D

Transport Layer LAN Protocols

Appendix D

1. Scope
1.1 Scope
1.2 Introduction
1.3 Appendix Layout

2. Transport Layer Protocols - Characteristics
2.1 Transport Layer Functionality
2.1.1 Addressing
2.1.2 Segmentation and Re-Assembly
2.1.3 Connection Establishment and Termination
2.1.4 Flow and Rate Control
2.1.5 Error Control
2.1.6 Special Data Transfer Services
2.2 Transport Layer Protocol Analysis
2.2.1 TCP
2.2.1.1 TCP Priority Scheme
2.2.1.2 TCP Deficiencies
2.2.2 User Datagram Protocol
2.2.3 OSI TP4
2.2.3.1 TP4 Priority Scheme
2.2.3.2 TP4 Deficiencies
2.2.3.3 TP4 Suitability
2.2.4 CLTP
2.2.5 Novell's Transport Protocols
2.2.5.1 Sequenced Packet Exchange Protocol
2.2.5.2 NetBIOS
2.2.6 Real-Time Protocols
2.2.6.1 Delta-T
2.2.6.2 VMTP
2.2.6.3 NETBLT
2.2.6.4 GAM-T-103
2.2.6.5 XTP
2.2.6.5.1 Flow Control
2.2.6.5.2 Rate and Burst Control
2.2.6.5.3 Error Control
2.2.6.5.4 Priority Message Scheduling
2.2.6.5.5 Reliable Multicast
2.2.6.5.6 XTP Features
2.2.6.5.7 XTP Standardisation
2.2.6.5.8 XTP Implementations
2.2.6.5.9 XTP Reference Models
2.2.6.6 ATM Transport
2.3 Summary of Transport Layer Protocols

3. Transport Layer Protocols - Performance
3.1 TP4 over Ethernet
3.1.1 Test Conditions
3.1.2 Test Results
3.1.3 Analysis of Results
3.2 TP4 over FDDI
3.2.1 Test Conditions
3.2.2 Test Results
3.2.3 Analysis of Results
3.3 XTP over FDDI
3.3.1 Test Conditions
3.3.2 Test Results
3.3.3 Analysis of Results

4. Conclusions
4.1 Applicability
4.2 Maximum Performance Option
4.3 Maximum Interoperability Options
4.4 Next Generation Standard Transport Protocols

5. Recommendations
5.1 Maximum Performance Option
5.2 Maximum Interoperability Options

List of Tables

Table I : Ethernet Throughput Performance Test Results for Multibus II NIC using CLA TP4
Table II : FDDI Throughput Performance Test Results for Multibus II NIC using CLA TP4
Table III : XTP over FDDI Throughput Performance Test Results for Multibus II NIC

1. Scope

1.1 Scope

This appendix describes the detailed characteristics of Transport Layer (i.e. Layer 4) LAN protocols, specifically in terms of their performance and their suitability or otherwise for the implementation of real-time, mission-critical distributed systems. Some performance measurement results for two commonly used transport layer protocols are presented and analyzed. The appendix also provides a qualitative trade-off between possible contenders as the transport layer protocol option for the Real-Time LAN Profile.

1.2 Introduction

The International Organization for Standardization (ISO) defined, and in 1984 published, a framework for communications standards which partitions the functions required for communication into seven layers. This was termed the Open Systems Interconnection (OSI) Basic Reference Model. Layer 4 of the OSI model is termed the Transport Layer (TL). The primary functions of the TL protocol are to optimise use of the network and enhance data communication reliability by providing end-to-end dataflow control.

The functions of the TL protocol are critical to the suitability of local area networks for real-time, mission-critical, distributed systems as they determine such capabilities as end-to-end error, rate and flow control as well as special services. The latter are of specific significance to mission-critical, real-time systems. Examples of such special services are priority message scheduling, synchronisation and timestamping.

1.3 Appendix Layout

The appendix commences with an overview of the characteristics of available Transport Layer Protocols and proceeds with descriptions of specific current and proposed implementations thereof. Important implications of these characteristics and implementations are then analysed within the context of real-time, mission-critical, distributed systems.

Some transport layer protocol performance measurement results and analysis are then presented.

Specific conclusions and recommendations are then made, with the most significant of these being adopted in the main section. In particular, specific options are recommended for inclusion in the Real-Time LAN Profile described in the main section.

2. Transport Layer Protocols - Characteristics

The transport layer provides an interface between the higher application-oriented layers and the underlying network-dependent layers. It provides the higher layers with a reliable message transfer service that is independent of the underlying network type. The transport layer masks the detailed operation of the underlying network from the higher layers and provides the latter with a defined set of message transfer facilities.

The transport layer provides for a reliable end-to-end messaging service where the underlying data delivery service is assumed to be able to re-order, corrupt, lose or significantly delay packets en route.

In general, a transport layer protocol specifies the exchange of information in the form of user data and control information. A packet is the basic transport information exchange unit. A packet may carry control information in its header and trailer fields and either data or control information within its middle part.

2.1 Transport Layer Functionality

Transport layer functionality includes:

• Addressing
• Segmentation and Re-Assembly
• Connection Establishment and Termination
• Flow and Rate Control
• Error Control
• Special Data Transfer Services

Each of these functions relies on control information in some form, either in header and trailer fields or in control packets.

2.1.1 Addressing

The transport layer multiplexes data transfer services between a number of transport service users. A number of transport services may also co-exist simultaneously. The transport service therefore needs transport user identification along with transport entity identification and end-station identification in order to successfully deliver user data.

The user identification uniquely identifies the transport layer service user among all others also using the same service provider. In the Internet environment, this identifier is typically called a port or socket while in the ISO environment it is termed a transport service access point (TSAP). The transport layer passes the end-station identifier to the network layer which then resolves its logical and physical address.

2.1.2 Segmentation and Re-Assembly

The transport layer provides general message transfer services by segmenting arbitrarily large Transport Service Data Units (TSDUs) into multiple smaller Transport Protocol Data Units (TPDUs) suitable for use by the network layer. At the remote transport peer, these TPDUs are re-assembled into the original TSDU. Segmentation and re-assembly are transparent to the transport layer user.
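Segmentation and re-assembly can be illustrated with a minimal sketch. The TPDU representation used here (a sequence number, an end-of-TSDU marker and a payload) and the function names are assumptions for illustration only and do not correspond to the encoding of any particular protocol.

```python
def segment(tsdu: bytes, max_payload: int):
    """Split a TSDU into TPDUs of the form (sequence_number, is_last, payload)."""
    if max_payload <= 0:
        raise ValueError("max_payload must be positive")
    # An empty TSDU still produces one (empty) TPDU.
    chunks = [tsdu[i:i + max_payload] for i in range(0, len(tsdu), max_payload)] or [b""]
    last = len(chunks) - 1
    return [(seq, seq == last, chunk) for seq, chunk in enumerate(chunks)]


def reassemble(tpdus) -> bytes:
    """Rebuild the original TSDU from TPDUs that may arrive in any order."""
    ordered = sorted(tpdus, key=lambda tpdu: tpdu[0])
    sequence_numbers = [tpdu[0] for tpdu in ordered]
    if (not ordered
            or sequence_numbers != list(range(len(ordered)))
            or not ordered[-1][1]):
        raise ValueError("missing, duplicated or incomplete TPDU set")
    return b"".join(payload for _, _, payload in ordered)
```

The transport user sees only the TSDU passed to `segment` and the TSDU returned by `reassemble`, which is the transparency referred to above.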

2.1.3 Connection Establishment and Termination

Connection management techniques must include the capability to establish and release connections. Typically, such techniques are based on either handshaking or implicit management techniques. Handshaking involves the exchange of state information between the endpoints of a connection. Implicit connection management takes the form of on-demand connection setup and timer-based connection management and release. On-demand connection setup refers to the establishment of connection state information at the remote endpoint upon the arrival of the first TPDU from the initiating endpoint. Timer-based techniques use timers instead of explicit handshaking to correctly manage the connection state at endpoints.
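Implicit connection management can be sketched as a table in which connection state is created on arrival of the first TPDU and released by a timer rather than by a handshake. The class name, the connection key and the idle timeout are hypothetical; the current time is passed in explicitly so that the behaviour is easy to follow.

```python
class ImplicitConnectionTable:
    """On-demand connection setup with timer-based release."""

    def __init__(self, idle_timeout_s: float) -> None:
        self.idle_timeout_s = idle_timeout_s
        self._last_seen = {}  # connection key -> arrival time of latest TPDU

    def on_tpdu(self, key, now: float) -> bool:
        """Record an arriving TPDU; return True if it created new connection state."""
        created = key not in self._last_seen
        self._last_seen[key] = now
        return created

    def expire(self, now: float):
        """Release connections idle for longer than the timeout; return their keys."""
        stale = [key for key, seen in self._last_seen.items()
                 if now - seen > self.idle_timeout_s]
        for key in stale:
            del self._last_seen[key]
        return stale
```

No packets are exchanged to open or close a connection in this scheme; correctness depends entirely on the choice of timeout.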

2.1.4 Flow and Rate Control

Mismatches between a receiver's ability to process and buffer incoming data and a peer transmitter's ability to deliver data to the network can severely degrade communication effectiveness. Lost data can also result in retransmission which can result in network congestion. Unrecovered data can be unacceptable in mission-critical, distributed systems while late data is unacceptable in real-time, distributed systems. Mechanisms to prevent lost and late data are therefore required in such systems. These are flow and rate control. Transport layer protocols implement end-to-end flow and rate control.

2.1.5 Error Control

Mission-critical, distributed systems require a reliable data delivery service. The transport service is required to deliver user data intact, unduplicated and properly sequenced at the destination transport peer. Mechanisms to detect and recover from lost, duplicated, mis-sequenced and corrupted data are therefore required in such systems. These are termed error control. Transport layer protocols implement end-to-end error control.

2.1.6 Special Data Transfer Services

In order to enhance reliability and responsiveness, real-time, mission-critical, distributed systems generally require enhanced data transfer services. Transport layer protocols may provide these special data transfer services. Such services include expedited or priority message services, exemption from dataflow control, resource reservation as well as special signalling such as time synchronisation and co-ordination, i.e. timestamping.

2.2 Transport Layer Protocol Analysis

2.2.1 TCP

During the mid-1970s, the US Department of Defense (DoD) commissioned the design of TCP/IP (Transmission Control Protocol/Internet Protocol)[25, 26]. This was before the OSI Reference Model was conceived. TCP provides point-to-point, guaranteed-delivery communication between networked nodes and was originally designed for packet-switching communications (such as X.25 WAN systems). TCP provides a reliable byte stream over a full-duplex virtual circuit connection. User data is not structured by the TCP transport service; remote users must interpret the arriving byte stream.

TCP is a connection-oriented sliding window protocol which uses byte-based sequence numbers, positive acknowledgements and timer-based retransmission to provide a reliable service. Each TCP connection provides full-duplex octet stream communication between the endpoints of the connection.

TCP/IP is the core of the Internet Profile which consists of a range of protocols, offering services that communicate between and provide control of incompatible computers and networks.

The two transport protocols of the Internet Profile are[83]:

• TCP - Transmission Control Protocol
• UDP - User Datagram Protocol

The Host-to-Host Layer uses TCP to ensure the reliability of data transfer between two hosts. TCP is a connection-oriented protocol providing reliable data transfer between two transport users. Data is passed from the transport user to TCP which then encapsulates the data into segments containing the user data and control information. Outgoing segments are numbered sequentially and are acknowledged by number by the destination TCP module.

The TCP standard defines the main levels of service as being Multiplexing (Multiple Users), Connection Management, Data Transport and Error Reporting. TCP allows the transport user to specify the quality of transmission service and offers a simple data transmission priority service.

2.2.1.1 TCP Priority Scheme

TCP provides a somewhat primitive priority message scheduling scheme. It does so by providing two priority mechanisms, the push flag and urgent pointer. Priority data is inserted into a packet with the push flag set. Such packets are then delivered to the user in an expedited, but still-in-sequence fashion. The transmission buffers for the connection are flushed immediately when the push flag is set. The urgent pointer indicates the place in the data stream where data requiring special attention begins. Urgent data may travel in packets carrying ordinary data. TCP does not specify what actions a receiver must take to expedite processing of urgent data.

2.2.1.2 TCP Deficiencies

While TCP has found extensive implementation in large and sophisticated networks, it was designed in the era of 56 kbit/s data links[125] and is intrinsically unable to support data rates much above a few Mbit/s[70]. Although extensive in its internetworking features, it has not been designed for deterministic and real-time data transfer. By only providing point-to-point communication it cannot support multicast which is required by many real-time, distributed systems.

TCP was not engineered for the type of high-speed networks that are beginning to find application (e.g. FDDI and ATM). TCP's sequence spaces and window sizes are too small to support gigabit performance.

TCP uses a 16-bit window size that allows up to 64 kbytes of data to be sent before acknowledgement is required. With high data transmission speeds, a transport user will be forced to wait for such an acknowledgement every 64 kbytes. At 1 Gbit/s, this would occur every 490 µs, which would seriously reduce throughput. Proposals have, however, been made to provide a scaling factor for the window size to allow it to extend beyond its current limit[85].
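The window limitation can be checked with simple arithmetic. The sketch below is illustrative only; the function names are hypothetical. The quoted figure of about 490 µs is reproduced when 1 Gbit/s is taken as 2^30 bit/s, whereas a decimal gigabit (10^9 bit/s) gives about 524 µs.

```python
def window_drain_time_s(window_bytes: int, link_bit_rate: float) -> float:
    """Time taken to transmit one full window at the given link rate."""
    return window_bytes * 8 / link_bit_rate


def max_throughput_bit_s(window_bytes: int, round_trip_time_s: float) -> float:
    """Upper bound on throughput when one window is sent per round trip."""
    return window_bytes * 8 / round_trip_time_s


WINDOW_BYTES = 2**16  # 64 kbytes, the limit of a 16-bit window field

binary_gigabit_us = window_drain_time_s(WINDOW_BYTES, 2**30) * 1e6   # about 488 µs
decimal_gigabit_us = window_drain_time_s(WINDOW_BYTES, 1e9) * 1e6    # about 524 µs
```

The second function shows why the stall matters: with one 64 kbyte window per round trip, a round-trip time of 1 ms (a hypothetical value) caps throughput at roughly 524 Mbit/s regardless of the link rate.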

TCP's 32-bit sequence space size is also a problem. At gigabit speeds, the sequence space is capable of wrapping every 17 s. A wrapped sequence space introduces an ambiguity about whether a received packet is for an active connection or is intended for a previously closed connection. Two methods have been proposed to prevent the sequence space wrapping problem. The first extends the sequence space beyond its current 32 bits and the second uses TCP timestamps to protect against old duplicate packets.
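The wrap time can likewise be checked numerically. In this sketch the function name is hypothetical. The 17 s figure corresponds to sending 2^31 bytes, i.e. half of the 32-bit sequence space, at 10^9 bit/s; treating half the space as the point of ambiguity is an interpretation adopted here because it reproduces the quoted figure.

```python
def sequence_wrap_time_s(sequence_bits: int, link_bit_rate: float) -> float:
    """Time taken to send 2**sequence_bits bytes at the given link rate."""
    return (2**sequence_bits) * 8 / link_bit_rate


half_space_s = sequence_wrap_time_s(31, 1e9)  # about 17.2 s
full_space_s = sequence_wrap_time_s(32, 1e9)  # about 34.4 s
```

By comparison, the same calculation at 10 Mbit/s gives a full-space wrap time of nearly an hour, which is why the problem only becomes apparent on high-speed networks.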

Because of TCP's intrinsic connection-oriented design, it cannot efficiently provide request/response type interactions without an excessive number (six) of connection management packet exchanges. While TCP can transmit user data in the first packet of connection establishment, this is only passed to the application when the connection is confirmed by the third packet.

When TCP is used for multiple simultaneous transactions, it exhibits an even more serious problem; i.e. it can occupy the host processor almost entirely with protocol execution. The reason for this is the so-called TIME-WAIT state of the TCP state machine: after a TCP connection is terminated, it enters a "zombie" state for up to four minutes before the individual instance of the state machine is finally destroyed. This is to ensure that duplicated connection establishment packets that arrive after a connection has been terminated do not get interpreted as connection establishment requests for a new connection. In contrast, XTP associations are terminated by using an explicit handshake augmented where necessary with a timer; state information is de-allocated upon completing the handshake, although the entry in the context lookup database is kept until no hazard due to re-appearing packets is possible. Consequently, XTP avoids the overhead of managing "zombie" contexts[125].
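The scale of the TIME-WAIT burden can be estimated with a back-of-envelope calculation: in steady state, the number of "zombie" contexts is the rate at which connections are closed multiplied by the time each spends in TIME-WAIT. The function name and the transaction rate used below are hypothetical; the 240 s default is the four-minute upper bound quoted above.

```python
def time_wait_contexts(connections_closed_per_s: float,
                       time_wait_s: float = 240.0) -> float:
    """Steady-state number of closed connections still held in TIME-WAIT."""
    return connections_closed_per_s * time_wait_s


# Hypothetical load: 100 short transactions per second, one connection each.
zombie_contexts = time_wait_contexts(100.0)  # 24000.0
```

Under these assumptions a host would carry some 24 000 inactive contexts, each of which must still be stored and searched.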

TCP transaction times also increase significantly under load while the maximum transaction rate is low.

TCP's error control only allows a go-back-n error recovery scheme. This is inefficient, especially if only a few bytes in an otherwise error-free transmission are corrupt.
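The inefficiency of go-back-n recovery can be illustrated by counting retransmissions. The sketch makes a simplifying assumption: a burst of packets has been sent in full before the loss is detected, so go-back-n repeats the lost packet and everything sent after it. The selective scheme is included only for comparison, and both function names are hypothetical.

```python
def go_back_n_retransmissions(packets_sent: int, lost_index: int) -> int:
    """Packets resent when packet `lost_index` (0-based) of a burst is lost."""
    if not 0 <= lost_index < packets_sent:
        raise ValueError("lost_index must identify a packet in the burst")
    # The lost packet and every packet sent after it are repeated.
    return packets_sent - lost_index


def selective_retransmissions(lost_indices) -> int:
    """Packets resent when only the packets actually lost are repeated."""
    return len(set(lost_indices))
```

For a burst of eight packets in which the third is corrupted, go-back-n resends six packets where a selective scheme would resend one.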

As described in Paragraph 2.2.1.1 above, TCP's priority scheme is too primitive for real-time applications.

2.2.2 User Datagram Protocol

The User Datagram Protocol (UDP) is an Internet transport layer protocol that does not support a reliable delivery service. It is a "best-effort" transport protocol used for applications that do not need protection against data loss, e.g. distribution of periodic sensor data. UDP is connectionless and is very similar to ISO CLTP.

Having no flow or error control, UDP is a "lightweight" protocol that is efficient and has high performance. Like TCP, UDP implements a fixed communication policy that cannot be set by the communication user to match a particular service model.
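The "lightweight" nature of UDP can be seen in a sketch of periodic sensor data distribution using the standard socket interface. The sample layout (16-bit sensor identifier, 32-bit sequence number, 64-bit floating-point value) and the function names are assumptions made for illustration only.

```python
import socket
import struct

# Assumed sample layout in network byte order: H = 16 bits, I = 32 bits, d = 64-bit float.
SAMPLE_FORMAT = struct.Struct("!HId")


def encode_sample(sensor_id: int, sequence: int, value: float) -> bytes:
    """Pack one sensor sample into a 14-byte datagram payload."""
    return SAMPLE_FORMAT.pack(sensor_id, sequence, value)


def decode_sample(datagram: bytes) -> tuple:
    """Unpack a payload produced by encode_sample."""
    return SAMPLE_FORMAT.unpack(datagram)


def send_sample(destination, sensor_id: int, sequence: int, value: float) -> None:
    """Send one sample as a single datagram to a (host, port) destination.

    There is no connection setup, acknowledgement or retransmission.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(encode_sample(sensor_id, sequence, value), destination)
```

A lost sample is simply superseded by the next periodic update; the sequence number allows a receiver to detect, but not recover, the loss.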

It is concluded that UDP is appropriate in Internet environments where flow and error control are not required. In general, UDP is not appropriate for real-time, mission-critical, distributed systems.

2.2.3 OSI TP4

TP4 is the ISO Class 4 Transport Protocol which was designed in 1982. Layer 4 of the OSI model, i.e. the Transport Layer[39], consists of five classes of increasing capability with respect to retransmission of lost data, flow control and reordering of packets. TP4 is specified in ISO 8073[14].

TP4 offers reliable end-to-end data transfer service over an unreliable network connection. TP4 provides the capability to establish a connection and transfer TPDUs on that connection. It includes mechanisms for segmentation, flow control and multiplexing of several transport connections over a single network connection. TP4 also provides the capability to detect and recover from errors which occur at the network layer and below. TP4 provides a message-type service in contrast to TCP's stream-oriented service.

2.2.3.1 TP4 Priority Scheme

TP4 provides a somewhat primitive priority message scheduling scheme. It does so by providing two sequence spaces, one for ordinary data and one for expedited data. After the transmission of expedited data, no other packet may be transmitted until the data packet is acknowledged. Urgent data may not travel in packets carrying ordinary data.

2.2.3.2 TP4 Deficiencies

TP4 also cannot efficiently provide request/response type interactions without an excessive number of connection management packet exchanges (at least 5 in this case). As TP4 assumes that aborts will be handled by a higher protocol layer, it does not offer a graceful close mechanism. While TP4 can transmit user data in the first packet of connection establishment, this is limited to just 32 bytes.

As described in Paragraph 2.2.3.1 above, TP4's priority scheme is too primitive for real-time applications.

2.2.3.3 TP4 Suitability

While not providing the highest levels of performance required for LANs and internetworks of the future, the ISO OSI protocols are amongst the most modern of which software implementations exist and that have fairly widespread application. Furthermore, the ISO OSI model does currently provide the most widely accepted communications model. For these reasons, ISO TP4 can also be considered as a supplementary protocol for the short term. It is predicted, however, that TP4 (as well as other OSI protocols) will disappear in the medium to long term (5 to 10 years) due to predation by the Internet Protocols and ATM.

2.2.4 CLTP

The Connectionless Transport Protocol (CLTP) is a connectionless option offered by the ISO protocol suite. CLTP offers a basic datagram service and besides an optional checksum, provides no other reliability mechanisms. Efforts are progressing to incorporate basic multicast features into CLTP.

2.2.5 Novell's Transport Protocols

2.2.5.1 Sequenced Packet Exchange Protocol

The Sequenced Packet Exchange protocol (SPX) is a connection-oriented, communications protocol that is built on NetWare IPX. When an application program makes a call to SPX to send a packet, SPX will do some housekeeping-type work on the packet, but will call NetWare IPX to actually send the packet. SPX guarantees packet delivery, whereas NetWare IPX only provides a best effort to deliver packets. This added feature of SPX has obvious advantages, but it also adds overhead to the data transfer cycle and is slower.

2.2.5.2 NetBIOS

The Network Basic Input/Output System (NetBIOS) functions in either a connectionless mode or a connection-oriented mode. An application designed for the NetBIOS interface can use either of these modes. For instance, if an application functions in a request/reply mode with a transfer size of only one packet, then the connectionless mode should be used to take advantage of connectionless response times. On the other hand, if most of the transfers are simplex or consist of large numbers of packets, the transfers should use the connection-oriented mode in order to ensure packet delivery and integrity of data. Novell's NetBIOS emulator is built on NetWare IPX in the same way that SPX is.

The NetBIOS emulator is called an emulator because it is implemented entirely in software, whereas the original NetBIOS introduced by IBM and Sytek was located in firmware.

Because NetBIOS was introduced by IBM, it was almost instantly accepted as an industry standard. Most networking vendors have implemented the specification developed by IBM that allows almost any application designed for the NetBIOS interface to operate in any environment.

A common problem with the NetBIOS specification, however, is that it only deals with the upper layer functions of the interface. It does not specify what communications protocol should be used underneath it. As a result, almost every networking vendor has written NetBIOS on top of their own proprietary communications protocol which cannot communicate with other vendors' protocols.

A commendable feature that NetBIOS has to offer the networking industry is its provision of easy address resolution among locally-connected workstations. All nodes on a network that use NetBIOS register a unique name. When a node desires to communicate with another node, all it needs to know is the node's unique NetBIOS name and NetBIOS will ensure that the packet arrives at the proper location.

2.2.6 Real-Time Protocols

In order to support real-time, mission-critical, distributed systems, networks are required to provide enhanced transport services. Traditional transport layer protocols are deficient in certain of these capabilities. Important deficiencies include the lack of effective priority message scheduling, reliable datagrams, efficient transaction processing and flexible dataflow control. Real-time, mission-critical, distributed systems also require that the user of the network services be able to tailor the services and their performance in order to optimise the performance of the system. To do this, the user must be able to define and invoke data transfer policy, with the transfer mechanisms being effected transparently by the transfer services.

Such requirements have therefore led to the design and development of a new generation of lightweight protocols such as Delta-t, VMTP, NETBLT, GAM-T-103 and XTP.

2.2.6.1 Delta-t

Delta-t[136] is a high performance experimental protocol developed in the late 1970s by the Lawrence Livermore National Laboratory in the US. It was designed to meet the needs of an integrated network and distributed operating system architecture, specifically to support the request/response transaction traffic profile of client/server architectures, the stream traffic profile of terminal-type architectures and the bulk data transfer of the mainframe-type environment.

Delta-t's primary feature was the use of timer-based mechanisms to achieve safe connection management and, through these mechanisms, to provide a hazard-free, connection-oriented protocol.

It was a design goal of Delta-t that the minimum number of packets exchanged for a reliable, hazard-free connection be two, one for data and one for acknowledgement. No other packets are required for connection opening and closing since the arrival of the first packet in a communication opens the connection which is subsequently released by a timeout. It is through timer-based mechanisms that connection hazards are avoided, since three-way handshakes and unique connection identifiers are not sufficient to prevent certain spurious conditions.
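The timer-based opening and release described above can be sketched as follows. This is a simplified illustration in Python and not the Delta-t specification: the class name, the single `hold_time` parameter and the explicit `now` argument are assumptions made for clarity, and the sketch presumes that `hold_time` is chosen longer than any old packet can survive in the network.

```python
class TimerBasedReceiver:
    """Toy receiver: the first packet opens a record, a timeout releases it."""

    def __init__(self, hold_time):
        self.hold_time = hold_time  # hypothetical record lifetime in seconds
        self.records = {}           # peer -> (next expected seq, expiry time)

    def expire(self, now):
        # Release every record whose timer has run out; no close packets needed.
        self.records = {p: r for p, r in self.records.items() if r[1] > now}

    def receive(self, peer, seq, now):
        """Return 'ack' for new data and 'duplicate' for data already seen."""
        self.expire(now)
        # With no record present, the arriving packet itself opens the connection.
        expected, _ = self.records.get(peer, (seq, None))
        if seq < expected:
            verdict = "duplicate"
        else:
            verdict = "ack"
            expected = seq + 1
        # Every arrival restarts the release timer for this peer.
        self.records[peer] = (expected, now + self.hold_time)
        return verdict
```

A single data packet and its acknowledgement therefore suffice for a complete exchange; the record simply lapses once the peer falls silent.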

Delta-t's primary contribution to real-time transport protocols is the use of these timer-based mechanisms.

2.2.6.2 VMTP

VMTP[50] (Versatile Message Transaction Protocol) is an experimental transport protocol developed at Stanford University in the late 1980s by David Cheriton. VMTP was developed as the communications component within the V Distributed System. VMTP was designed to meet the needs of distributed computation and on-demand paging, specifically to support the request/response transaction traffic profiles of such environments.

VMTP provides transport communication services via a message transaction model. A message transaction consists of a request message sent by a client process to one or more server processes, followed by zero or more responses sent back to the client by the server processes. VMTP follows a strongly connectionless approach with connection establishment being left to a higher layer.

In VMTP there are three variants of the basic message transaction which widen its applicability and efficiency. These variations may be combined to provide even more flexibility. A group message transaction is a transaction in which the client sends a multicast to a group of server entities; in return the client may receive multiple responses. A datagram transaction occurs when a client sends the request message with an indication that no response is expected. A forward message session is a transaction in which a request message may be forwarded to another server which responds directly to the client. This may be considered to be an optimisation of nested procedure calls.

VMTP's primary contribution to real-time transport protocols is the use of the transaction-type paradigm.

2.2.6.3 NETBLT

NETBLT[52] (Network Block Transfer) is an experimental transport protocol developed at MIT in the late 1980s by David Clark. NETBLT is a transport layer protocol specifically designed for efficient transfers of large amounts of data. The algorithms which make up this protocol are optimised to provide high throughput over long delay channels while retaining good performance in LAN environments. This is achieved by minimizing network congestion, delays associated with long haul links and packet loss.

NETBLT's primary feature is its two-parameter rate control algorithm, i.e. rate and burst.

NETBLT's primary contribution to real-time transport protocols is the use of the supplementary rate and burst dataflow control mechanisms.

2.2.6.4 GAM-T-103

GAM-T-103[75] is a transfer protocol developed by the French Ministry of Defence for military, real-time networks. As a transfer protocol, the protocol encompasses both the network and transport layers. This was in order to optimise real-time performance.

GAM-T-103 was specifically designed for real-time systems with emphasis being accorded to capability in the areas of low latency, point-to-multipoint (i.e. multicast) and multipoint-to-point (i.e. concentration) data transfer as well as synchronisation.

The ISO Enhanced Transport Service (ETS), described in Paragraph 2.2.7.1, is based on GAM-T-103.

GAM-T-103's primary contribution to real-time transport protocols was its definition as a transfer protocol as well as definition of the semantics of multicast and concentration data transfer service models.

2.2.6.5 XTP

XTP (Xpress Transport Protocol)[38, 141] has been developed through the late 1980s and early 1990s by various developers in the USA under co-ordination of the XTP Forum. XTP is a new protocol that has drawn extensively from the experimental transport protocols outlined above.

The XTP development effort has received considerable support from the US Navy, specifically with respect to real-time systems. XTP is specified as the real-time transport protocol of the SAFENET standards suite.

Initially XTP followed the lead of GAM-T-103 in following a transfer protocol approach. In fact, XTP was termed the Xpress Transfer Protocol until Version 3.6 in 1987. Due to pressure from the ISO and Internet communities and standards authorities, as well as increased performance of later implementations of network layer protocols such as IP and CLNP, XTP was redefined as a transport protocol at Version 4.0 (officially in March 1995) and is now termed the Xpress Transport Protocol.

The major attributes of XTP are error, flow and rate control, priority message scheduling, optimised inter-network addressing mechanisms and reliable multicast support.

The design of XTP has incorporated contributing features from the new generation of experimental transport protocols. The most important of these are :

Delta-t - timer-based mechanisms

VMTP - transactions

NETBLT - rate and burst control

GAM-T-103 - multicast

XTP offers a wide range of data transfer services; these include reliable datagrams, transactions, connectionless and connection-oriented services. It also allows a range of flexible dataflow control capabilities including flow, rate and burst control as well as error control.

2.2.6.5.1 Flow Control

Flow Control allows the receiver of information to inform the sender about the current state of its receiving buffers. Flow control applies to endpoints of an association and not to the participating intermediate nodes. In XTP, the receiver's flow control parameters are included in control packets sent from the receiver to the sender.

XTP provides the mechanisms and procedures for an end-to-end, credit-based, sliding window flow control algorithm. It also allows for a no-flow mode to allow for cases where flow control would disrupt the integrity of certain data streams. Typical situations that would require this mode are those involving multicast transfer where a slow receiver could attempt to apply flow control to a critically real-time data stream thereby disrupting transfer to other more capable receivers.
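A credit-based sliding window of this kind can be sketched in a few lines of Python. The sketch is illustrative only: the class, the `alloc` attribute (the sequence number up to which the receiver has granted buffer space) and the `flow_control` switch modelling the no-flow mode are assumed names, not definitions taken from the XTP specification.

```python
class CreditWindowSender:
    """Sender side of a credit-based sliding window (illustrative sketch)."""

    def __init__(self, initial_alloc, flow_control=True):
        self.next_seq = 0                 # sequence number of the next byte to send
        self.alloc = initial_alloc        # receiver's credit limit (exclusive)
        self.flow_control = flow_control  # False models the no-flow mode

    def sendable(self, nbytes):
        """Number of bytes that may be sent right now."""
        if not self.flow_control:
            return nbytes                 # receiver credit is ignored
        return max(0, min(nbytes, self.alloc - self.next_seq))

    def send(self, nbytes):
        """Send as much as the window allows; return the bytes actually sent."""
        sent = self.sendable(nbytes)
        self.next_seq += sent
        return sent

    def on_control(self, alloc):
        """Apply the credit carried in a control packet from the receiver."""
        self.alloc = max(self.alloc, alloc)  # the window never moves backwards
```

With an initial credit of 1 000 bytes, a sender that has transmitted 1 000 bytes stalls until a control packet raises `alloc`; in no-flow mode it never stalls, which is the behaviour wanted for the multicast case described above.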

2.2.6.5.2 Rate and Burst Control

Rate Control allows the restriction of the size and time spacing of data from a sender in order that the ability of a data receiver (or intermediate routers) to decipher and queue data is not overwhelmed. Rate control is effected by all participating nodes of an association and not only by endpoints. Since each node in the path receives packets and forwards them, each pair of nodes in the path forms a producer/consumer pair. Rate control regulates the rate at which data is produced at one node so as not to overrun its downstream consumer. That consumer node then becomes the producer for the next XTP node along the path until the final destination end-node is reached. Since it is possible that the generation of many packets within a short amount of time could overrun a node along a path, XTP provides rate and burst control mechanisms and procedures by which consumers may throttle producers.

The rate parameter specifies in bytes per second the maximum rate at which data can be consumed by a receiver. The burst parameter specifies the maximum number of bytes which can be consumed in one burst of packets, i.e. packets sent in rapid succession.
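One way to read these two parameters together is that a producer may emit at most `burst` bytes in any interval of `burst / rate` seconds. The following Python sketch computes send times under that reading; the function name and the interval interpretation are assumptions for illustration and are not taken from the XTP definition.

```python
def pace(packet_sizes, rate, burst):
    """Return a send time in seconds for each packet under rate/burst limits.

    rate  -- maximum consumption rate of the receiver, in bytes per second
    burst -- maximum number of bytes the receiver accepts in one burst
    """
    interval = burst / rate          # time needed to absorb one full burst
    times = []
    window_start = 0.0               # start time of the current burst
    used = 0                         # bytes already sent in the current burst
    for size in packet_sizes:
        if size > burst:
            raise ValueError("packet larger than the burst allowance")
        if used + size > burst:      # burst budget spent: wait one interval
            window_start += interval
            used = 0
        times.append(window_start)
        used += size
    return times
```

For five 1 000 byte packets with a rate of 1 000 bytes per second and a burst of 2 000 bytes, the packets leave in pairs at 0 s and 2 s, with the last packet at 4 s.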

2.2.6.5.3 Error Control

Error Control provides for the detection of errors and retransmissions of data. XTP uses two checksums over the XTP packet contents to verify the integrity of the data received over the network. The XTP checksum algorithms were chosen for execution speed and VLSI implementation compatibility. The XTP checksums are also placed at the end of an XTP packet allowing concurrent checksum calculation with packet transmission or reception.

XTP supports three types of error control: a fully reliable mode for applications such as file transfer, an unacknowledged service for applications where the receiving application effects any error control that may be required, and fast negative acknowledge (fastnack) for applications such as real-time control. In real-time control applications, which typically use LAN-type topologies which are inherently reliable, missing data normally implies that data has been corrupted and not merely delayed (e.g. through a router). On detecting delivery of out-of-sequence data, the fastnack service pre-empts any timeouts and immediately signals this condition to the transmitter. The transmitter can then immediately resend the missing data. Such a capability enhances responsiveness and dependability in real-time, mission-critical network applications.
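The fastnack idea can be illustrated with a short Python sketch of a receiver that reports a gap as soon as out-of-sequence data arrives. The function name, the event tuples and the choice to discard the out-of-sequence packet (so that the sender resends from the gap) are simplifying assumptions, not XTP procedures.

```python
def receive_with_fastnack(arrivals):
    """Return a list of ('deliver', seq) and ('nack', first, last) events.

    arrivals -- packet sequence numbers in the order they reach the receiver
    """
    expected = 0
    events = []
    for seq in arrivals:
        if seq == expected:
            events.append(("deliver", seq))
            expected += 1
        elif seq > expected:
            # Gap detected: report the missing range at once, without
            # waiting for a retransmission timer to expire.
            events.append(("nack", expected, seq - 1))
        # seq < expected is a duplicate and is ignored in this sketch
    return events
```

If packets arrive as 0, 1, 3 and the sender then resends 2 and 3, the receiver delivers 0 and 1, issues a negative acknowledgement for 2 on seeing 3, and then delivers 2 and 3.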

XTP allows for a no-error mode to allow for cases where error control would disrupt the integrity of certain data streams. Correctly received data is properly sequenced, but gaps are not re-transmitted. Typical situations that would require this mode are those involving data transfer containing redundant data, such as sampled video and audio. Such data streams may not inherently require error control, especially of the data content, while error control may disrupt more critical dataflows (such as real-time control data).

2.2.6.5.4 Priority Message Scheduling

XTP supports prioritization of packet processing at both the sender and receiver using pre-emptive priority scheduling. If a node is currently processing a low priority packet as a higher priority packet arrives for service, the node is pre-empted from processing the lower priority packet and begins processing the higher priority packet. Only after all higher priority packets have been completed or blocked will the node return to the lower priority packet.

In XTP, two pre-emptive schedulers exist, one for incoming packets and one for outgoing packets. For both the receiver and sender prioritization schemes, XTP supports 2^16 (65 536) different priorities. Each context is associated with a particular priority level. Multiple contexts can be assigned the same priority level simultaneously.
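Pre-emptive priority scheduling of this kind can be sketched with a priority queue. In the following Python illustration, time advances in ticks, each packet needs a number of ticks of processing, and a newly arrived higher priority packet displaces the one in progress. The tuple layout, the tick model and the convention that a larger number means higher priority are assumptions of the sketch.

```python
import heapq


def run_preemptive(packets):
    """Simulate one pre-emptive scheduler.

    packets -- list of (arrival_tick, priority, name, work_ticks)
    Returns the name of the packet processed during each busy tick.
    """
    pending = sorted(packets)            # ordered by arrival tick
    ready, trace = [], []
    tick, order = 0, 0
    while pending or ready:
        while pending and pending[0][0] <= tick:
            _, prio, name, work = pending.pop(0)
            # heapq is a min-heap, so the priority is negated;
            # 'order' keeps equal priorities in arrival order.
            heapq.heappush(ready, (-prio, order, name, work))
            order += 1
        if not ready:
            tick = pending[0][0]         # idle until the next arrival
            continue
        neg_prio, seq, name, work = heapq.heappop(ready)
        trace.append(name)
        if work > 1:                     # unfinished: may be pre-empted next tick
            heapq.heappush(ready, (neg_prio, seq, name, work - 1))
        tick += 1
    return trace
```

A low priority packet needing three ticks that is interrupted after one tick by a high priority packet needing two ticks yields the trace low, high, high, low, low. One such scheduler would be instantiated for incoming packets and one for outgoing packets.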

2.2.6.5.5 Reliable Multicast

XTP offers a multicast capability with optional degrees (from full to none) of reliability as well as sophisticated multicast group management.

2.2.6.5.6 XTP Features

Specific features of XTP are the following :

! XTP can provide user applications with multi-packet exchange sequences offering a transport-level virtual circuit capability and a transport-level datagram service.

! XTP can be considered as a lightweight protocol for a number of reasons. Firstly, it is a fairly simple, yet flexible algorithm. Secondly, packet headers are of fixed size and contain sufficient information to screen and steer the packet through the network. The core of the protocol is essentially contained within four fixed-size fields in the header. Additional mode bits and flags are kept to a minimum to simplify packet processing.

! XTP offers an orthogonal approach to policy and mechanism. This is an important consideration for a flexible, real-time protocol. What is meant by an orthogonal approach is that the protocol definition and implementation differentiate between policy regarding real-time LAN issues such as addressing, error, flow and rate control, and the mechanisms by which these are actually implemented and interfaced to the user application.

! XTP is specifically designed for parallel operation as opposed to serial operation. Address translation, context creation, flow control, error control, rate control and host system interfacing can all execute in parallel. This feature will stand XTP in good stead should XTP be implemented as a "silicon protocol" in order to support the gigabit transfer rates of the future.

2.2.6.5.7 XTP Standardisation

One of the reasons that XTP evolved into a standard transport protocol rather than remaining a transfer protocol was to gain alignment with the standardisation efforts of the various standards bodies. One example is alignment with ISO's High Speed Transport Protocol (HSTP).

Proposals have also been made within the ANSI X3S3.3 committee to incorporate XTP's features into the ISO protocol suite[85].

2.2.6.5.8 XTP Implementations

Commercial software implementations of XTP are available off-the-shelf. For example, Network Xpress Inc. has developed the Xpress Transport Protocol (Version 4.0) which is compatible with Intel iAPX 80x86 processors and the MS-DOS operating system, as well as Motorola 680x0 processors and the pSOS or pSOS+ operating systems. XTP has also been ported to the LynxOS, VxWorks and Windows NT operating systems.

Mentat, Inc. have implemented a Streams version, primarily for Unix environments.

Sandia National Laboratories (California, USA) have developed a public-domain, object-oriented implementation (written in C++).

2.2.6.5.9 XTP Reference Models

The XTP Forum has developed software reference models for both versions of XTP. For the Xpress Transfer Protocol the reference model is the Kernel Reference Model (KRM) written in C. For the Xpress Transport Protocol the reference model is the Sandia implementation written in C++.

2.2.6.6 ATM Transport

It has been proposed that XTP is suitable to operate over the ATM Adaptation Layer (AAL) as an ATM transport layer protocol. Research is in progress at the University of Karlsruhe in Germany as well as at Olivetti Research Limited[97] in the UK into the requirements of XTP to operate over ATM.

Specific issues that require addressing are the following :

! Parameterized Addressing

! Parameterized Traffic Specification

! Control Modes (to overcome functional overlaps between the two layers)

! Paradigm Independence.

2.2.7 Next Generation Transport Protocols

Next Generation transport protocols are under consideration by groups such as ISO, the Internet Engineering Task Force (IETF) and the US Navy High Performance Network Working Group (HPNWG). The former two organisations are involved in the co-ordination and development of applicable protocol standards, while the latter will be a user of the protocols.

2.2.7.1 ISO Efforts

The ISO SC6 committee has inaugurated a project to address enhanced transport layer facilities. This project is termed Enhanced Communications Functions and Facilities (ECFF).

The following specific enhanced services are being investigated: multicast, request/response service, fast response service, acknowledged connectionless service and sampling service. The following specific enhanced mechanisms are also being investigated: out-of-band signalling, graceful release and priority.

The Enhanced Transport Service (ETS) is a transport layer service definition intended to support real-time operation. ETS provides mechanisms which offer users high reliability, low latency, low access delay, priority mechanisms as well as deterministic behaviour. It is intended to be used on closed networks (i.e. LANs) where the group of users is known. ETS is based on GAM-T-103.

2.2.7.2 IETF Efforts

There are a number of IETF efforts ongoing to investigate the transport protocol requirements of the next generation of applications. In particular, a new service model, i.e. one which provides a more comprehensive set of services than the current best-effort model, is being investigated.

2.2.7.2.1 Real-Time Transport Protocol

The Real-Time Transport Protocol (RTP) provides for end-to-end unicast or multicast transport of real-time data such as audio, video or simulation data. RTP provides sequencing and timing capabilities for transferring data streams. It has been developed in conjunction with the Real-Time Transport Control Protocol (RTCP) by the IETF's Audio-Video Transport working group. RTCP provides minimal control and identification functionality, particularly in support of multicast transmission. These protocols do not address resource allocation (i.e. bandwidth allocation) and provide no guarantees related to quality of service. Such services may be provided by an underlying layer (e.g. ST-II) and in these cases RTP will convey these benefits to its users.

While RTP has been designed to run directly on top of IP it will, in most cases, be more appropriate to run it on top of UDP due to the limited demultiplexing capability of RTP. While RTP can be used for a variety of applications, its focus is to support audio-video conferencing. RTP packets may contain payloads of audio, video or other stream data.

The optional capabilities supported by RTCP are implemented via optional header fields within an RTP protocol data unit (PDU). RTCP is intended to support the transfer of control information in a loosely controlled session. Such capabilities may not be needed once suitable session control protocols are developed (e.g. MMUSIC, refer to Paragraph 2.2.7.2.2).

RTP provides mechanisms to support bridges (devices which convert one RTP stream into another of different capabilities while retaining timing integrity) and translators (devices which convert RTP streams without retaining timing integrity). Security features support private sessions with capabilities for multiple keys.

2.2.7.2.2 MMUSIC

The Multiparty Multimedia Session Control (MMUSIC) working group of the IETF is working to develop a tightly controlled multimedia conference control protocol.

A feature of MMUSIC is a session manager which is a control entity that exists in each participating network node. The distributed session management will control network resources such as bandwidth. One of the underlying services that MMUSIC will require is reliable multicast.

2.2.7.3 IETF Integrated Services Architecture

The IETF Integrated Services Architecture is defined to be the transport of audio, video, real-time control and classical data traffic within a single network infrastructure. A working group of the IETF is investigating such integrated services for the Internet environment. The three main focuses of the group are as follows :

! Clearly define the services to be provided.

! Define the application service, router scheduling and subnet interfaces.

! Develop router validation requirements which can ensure that the proper service is provided.

The Integrated Services Architecture being proposed for the Internet consists of four basic elements :

! Admission Control.

! Packet Scheduling.

! Packet Classification.

! Reservation Setup Protocol.

The most developed of the elements to date is the Resource Reservation Protocol (RSVP). Several design principles are considered fundamental to the elements of the proposed architecture, i.e. :

! They must be designed for multicast as well as unicast.

! They must support a heterogeneous environment.

! They must be scalable.

2.2.7.3.1 Admission Control

Admission Control is the decision algorithm in a host or router which grants or denies service requests. Decisions are based on the resource allocations previously granted: if resources are available, the service request can be granted, otherwise the service request will be denied. The implementation of this capability will bring a radical change to the architecture of the Internet protocols since state information will now have to be maintained in routers to record previous allocations. Several admission control algorithms have been proposed and experimentation with them is ongoing.

A key issue in admission control is authentication. Before resources are allocated, it is desirable to ensure that both the resource requester and the amount of resources being requested are valid. Without such facilities, a malicious user could try to monopolise all resources.
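The decision rule and the router state it implies can be sketched as follows. This Python fragment is a deliberately naive illustration (a simple sum of granted allocations against a fixed capacity); the class, its method names and the boolean `authenticated` flag standing in for a real authentication step are assumptions and do not represent any of the proposed IETF algorithms.

```python
class AdmissionController:
    """Grants a request only if enough unreserved capacity remains."""

    def __init__(self, capacity):
        self.capacity = capacity   # total resource, e.g. link bandwidth in bit/s
        self.granted = {}          # flow id -> amount reserved (per-flow state)

    def request(self, flow_id, amount, authenticated=True):
        """Return True and record the allocation if the request is admitted."""
        if not authenticated or amount <= 0:
            return False           # unverified or meaningless requests are refused
        # A repeated request from the same flow replaces its old allocation.
        in_use = sum(self.granted.values()) - self.granted.get(flow_id, 0)
        if in_use + amount > self.capacity:
            return False
        self.granted[flow_id] = amount
        return True

    def release(self, flow_id):
        """Free the allocation held by a flow, if any."""
        self.granted.pop(flow_id, None)
```

On a 10 Mbit/s link, a 6 Mbit/s flow is admitted, a subsequent 5 Mbit/s request is denied, and a 4 Mbit/s request is admitted. The `granted` dictionary is precisely the per-flow state that routers would now have to maintain.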

2.2.7.3.2 Packet Scheduling

Packet scheduling manages the forwarding of packet streams in a manner that ensures that the resource requests that have been granted by the admission control algorithm are met. This component must be implemented at the point where packets are queued. Consequently, the packet scheduler will become an integral component in routers which provide integrated service capabilities. The packet scheduler must have intimate knowledge of the characteristics of the underlying Data Link Layer.

An estimator is used to assist the packet scheduler and admission controller. This algorithm measures the properties of the outgoing traffic streams and develops statistics that are used for future packet scheduling and admission control.

2.2.7.3.3 Packet Classification

Packet classification maps incoming packets into classes. All packets of the same class are treated equally by the packet scheduler. A class may contain a single flow or an aggregate of many dataflows. To classify packets, a class descriptor (or filterspec) must be set up. The class descriptor must express packet selection criteria adequately, drive the classification engine and support filterspec algebra, i.e. the ability to combine filterspecs using logical operators.
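The notion of a filterspec algebra can be made concrete with a small Python sketch in which class descriptors are predicates that combine with logical operators. The `FilterSpec` class, the `field` helper, the dictionary representation of a packet and the example addresses and ports are all hypothetical; a practical classifier would use a far more efficient matching structure.

```python
class FilterSpec:
    """Hypothetical class descriptor: wraps a predicate over a packet dict."""

    def __init__(self, predicate):
        self.predicate = predicate

    def __call__(self, packet):
        return bool(self.predicate(packet))

    def __and__(self, other):      # logical AND of two filterspecs
        return FilterSpec(lambda p: self(p) and other(p))

    def __or__(self, other):       # logical OR of two filterspecs
        return FilterSpec(lambda p: self(p) or other(p))

    def __invert__(self):          # logical NOT of a filterspec
        return FilterSpec(lambda p: not self(p))


def field(name, value):
    """Filterspec matching packets whose header field equals the value."""
    return FilterSpec(lambda p: p.get(name) == value)


def classify(packet, classes, default="best-effort"):
    """Return the name of the first class whose filterspec matches."""
    for name, spec in classes:
        if spec(packet):
            return name
    return default


# Example class table built with the algebra (illustrative values only).
classes = [
    ("control", field("proto", "udp") & field("dport", 5000)),
    ("video", field("dst", "224.1.1.1") & ~field("dport", 5000)),
]
```

Each packet is mapped to exactly one class, and the packet scheduler then treats all packets of that class alike.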

A prototype classification engine has been developed by the Massachusetts Institute of Technology. The implementation uses a discontiguous Patricia tree classifier algorithm. The current classifier can classify a packet in about 10 µs. Significantly better performance is expected as the algorithm is tuned.

2.2.7.4 Resource Reservation Protocol

The Resource Reservation Protocol (RSVP) is a resource reservation setup protocol that provides for receiver-initiated setup of resource reservations for multicast or unicast dataflows. In an RSVP interaction, a host requests a specific quality of service (QoS) for a particular data stream. RSVP delivers this request to the various hosts and routers along the paths of that data stream. Resources are reserved in these nodes with router and host states being maintained. In RSVP the receiver is responsible for the initiation and maintenance of the resource reservation. In addition, state information is maintained in the routers so that dynamic membership changes are handled gracefully and routing changes are adapted to automatically. RSVP assumes the presence of multicast functionality.

RSVP also carries a flow-spec which is used to parameterise the packet scheduling mechanism for a particular dataflow. A flow-spec proposed by the IETF contains the following parameters: maximum transmission size, token bucket rate, maximum transmission rate, minimum delay notice, maximum delay variation, loss sensitivity, burst loss sensitivity, loss interval and quality of guarantee.
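The token bucket rate parameter refers to the familiar token bucket traffic model, which can be sketched as a conformance check in Python. The function below is a generic token bucket; the `depth` parameter (the bucket size) and the packet tuple layout are assumptions of the sketch, and the mapping of the flow-spec fields onto these two arguments is not defined here.

```python
def conforming(packets, rate, depth):
    """Mark each packet as conforming or not to a token bucket.

    packets -- list of (arrival_time_s, size_bytes) in time order
    rate    -- token fill rate in bytes per second
    depth   -- bucket capacity in bytes (assumed parameter)
    """
    tokens = depth                 # the bucket starts full
    last = 0.0
    result = []
    for arrival, size in packets:
        # Refill for the elapsed time, never beyond the bucket depth.
        tokens = min(depth, tokens + (arrival - last) * rate)
        last = arrival
        ok = size <= tokens
        if ok:
            tokens -= size         # conforming packets consume tokens
        result.append(ok)
    return result
```

With a rate of 1 000 bytes per second and a depth of 2 000 bytes, three back-to-back 1 000 byte packets are judged conforming, conforming and non-conforming, and a fourth packet one second later conforms again.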

A prototype implementation of RSVP is being developed by the Information Sciences Institute at the University of Southern California. Also, a draft functional specification has been released on the Internet.

2.3 Summary of Transport Layer Protocols

Traditional, general purpose transport protocols such as TCP, UDP and TP4 have deficiencies making them unsuitable for real-time, mission-critical, distributed applications. A number of experimental transport protocols such as Delta-t, VMTP, NETBLT and GAM-T-103 were developed to address these deficiencies; however, none of these was entirely appropriate.

The US Navy, recognising its future needs as well as the deficiencies of existing transport protocols, sponsored the development of a new real-time protocol, the Xpress Transport Protocol, which drew on the functionality of both the existing and the experimental protocols. XTP thus provides the transport protocol of the real-time option of the US Navy SAFENET LAN profile. Despite XTP's extensive functionality and flexibility, it nevertheless exhibits some deficiencies in supporting the next generation of real-time, mission-critical, distributed systems. These include the lack of optimal support for multimedia applications, latency control, etc.

Recognising the requirements of next generation distributed systems, organisations such as the US Navy's High Performance Network Working Group, the Internet Engineering Task Force and the ISO group responsible for the Enhanced Transport Service have initiated requirements studies and preliminary definition efforts to design a new generation of real-time transport protocols. These are likely to supersede, or operate in parallel with, XTP.

3. Transport Layer Protocols - Performance

In order to gain insight into the performance of available transport protocols, some latency and throughput measurement test results are presented and analysed.

3.1 TP4 over Ethernet

The following throughput performance measurements, made by Comsoft GmbH[55], are representative of TP4 over Ethernet.

3.1.1 Test Conditions

Equipment :

! Multibus II-based system, but Multibus II parallel backplane not active.

! Concurrent Technologies Ethernet Network Interface Card (CL386/296), with onboard Intel 80386DX 33 MHz CPU and 8 Mbytes onboard RAM.

! Concurrent Technologies Ethernet Network Interface Card (CL486/596), with onboard Intel 80486DX 50 MHz CPU and 8 Mbytes onboard RAM.

3.1.2 Test Results

Software  Protocol  Specific Conditions                                                              Throughput (Mbit/s)
                                                                                                     CL386/296  CL486/596
CLA-OSI   LLC1      Unreliable, Connectionless, 1 500 Bytes/Packet                                   8,2        9,7
CLA-OSI   ISO TP4   Unreliable, Connectionless, ES-IS, 1 412 Bytes/Packet                            5,3        9,7
CLA-OSI   ISO TP4   Reliable, Connection-Oriented, ES-IS, 32 768 Bytes/Buffer                        5,8        9,4
CLA-OSI   ISO TP4   Unreliable, Connectionless, Inactive NL, 1 412 Bytes/Packet                      6,1        9,7
CLA-OSI   ISO TP4   Reliable, Connection-Oriented, 30 Connections, Inactive NL, 32 768 Bytes/Buffer  7,0        9,4

Table I : Ethernet Throughput Performance Test Results for Multibus II NIC using CLA TP4

3.1.3 Analysis of Results

Analysis of the results provides some indication of the performance of the ISO TP4 transport protocol.

Generally, the performance figures of the 80486-based CL486/596 Ethernet network interface card (NIC) are higher than those of the 80386-based CL386/296 Ethernet NIC and, in fact, approach the 10 Mbit/s limit of Ethernet. This indicates that the protocol implementation is processor-intensive, with the 80486-based card supplying the full processing capability required and the 80386-based card only a proportion thereof. Analysis of the latter's performance therefore gives an indication of the contribution of each increment in protocol complexity.


As the measurements are made over a point-to-point connection, the effects of the CSMA/CD physical layer protocol do not apply. Therefore, where the CPU can provide the required protocol processing capability, essentially full Ethernet bandwidth can be achieved.

On the CL386/296, 8,2 Mbit/s is achieved with only the Data Link Layer. With the Transport Layer (TP4) in connectionless mode, this reduces to 5,3 Mbit/s; thus TP4 contributes a significant overhead. In connection-oriented mode, the throughput increases to 5,8 Mbit/s, indicating a modest increase in efficiency over connectionless mode. With the Network Layer rendered inactive, the throughput improves marginally to 6,1 Mbit/s in connectionless mode, indicating that this layer (in this case CLNP) contributes a small overhead. In connection-oriented mode there is a significant improvement to 7,0 Mbit/s, again indicating the improved efficiency of this mode.

In the case of the CL486/596, only slight performance differences can be realised due to the fact that the bottleneck is in the intrinsic capability of Ethernet itself and not in the performance of each protocol layer.
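The per-layer overhead discussed above can be made explicit by expressing each CL386/296 figure as a percentage change relative to the LLC1-only measurement. The Python sketch below does this with the values from Table I (decimal commas written as decimal points). The dictionary keys and the helper name relative_change are illustrative only and do not come from the Comsoft test report.

```python
# Throughput figures (Mbit/s) for the CL386/296 NIC, copied from Table I.
CL386 = {
    "LLC1 only": 8.2,
    "TP4 connectionless, ES-IS": 5.3,
    "TP4 connection-oriented, ES-IS": 5.8,
    "TP4 connectionless, inactive NL": 6.1,
    "TP4 connection-oriented, inactive NL": 7.0,
}


def relative_change(before, after):
    """Percentage change in throughput going from `before` to `after`."""
    return 100.0 * (after - before) / before


baseline = CL386["LLC1 only"]
changes = {
    name: round(relative_change(baseline, value), 1)
    for name, value in CL386.items()
}

for name, pct in changes.items():
    print(f"{name:38s} {CL386[name]:4.1f} Mbit/s  {pct:+6.1f} % vs LLC1 only")
```

On these figures, TP4 in connectionless mode over ES-IS runs about 35 % below the LLC1-only measurement, and the connection-oriented case with the Network Layer inactive about 15 % below it. The connectionless and connection-oriented rows use different transfer unit sizes, so the comparison is indicative only.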

3.2 TP4 over FDDI

3.2.1 Test Conditions

Equipment :

- Multibus II-based system, but Multibus II parallel backplane not active.

- Concurrent Technologies FDDI Network Interface Card (CL486/DAS), with onboard Intel 80486DX-2 66 MHz CPU and 8 Mbytes onboard RAM.

- Concurrent Technologies FDDI Network Interface Card (CL486/DA4), with onboard Intel 80486DX-4 100 MHz CPU and 8 Mbytes onboard RAM.


3.2.2 Test Results

Software  Protocol  Throughput (Mbit/s)    Specific Conditions
                    CL486/DAS  CL486/DA4
CLA-OSI   MAC       40,5       47,2        Unreliable, Connectionless, ES-IS, 4 390 Bytes/Packet
CLA-OSI   LLC1      34,2       37,8        Reliable, Connection-Oriented, ES-IS, 130 600 Bytes/Buffer
CLA-OSI   ISO TP4   27,0       29,9        Unreliable, Connectionless, ES-IS, 4 390 Bytes/Packet
CLA-OSI   ISO TP4   34,3       38,5        Reliable, Connection-Oriented, ES-IS, 130 600 Bytes/Buffer
CLA-OSI   ISO TP4   28,2       32,0        Unreliable, Connectionless, Inactive NL, 4 390 Bytes/Packet
CLA-OSI   ISO TP4   37,3       44,9        Reliable, Connection-Oriented, Inactive NL, 130 600 Bytes/Buffer

Table II : FDDI Throughput Performance Test Results for Multibus II NIC using CLA TP4

3.2.3 Analysis of Results

Analysis of the FDDI results, as well as comparison with those of the Ethernet case, provides further indication of the performance of the ISO TP4 transport protocol.

Generally, the performance figures of the two 80486-based FDDI network interface cards, the CL486/DAS and the CL486/DA4, are quite similar and well below FDDI's maximum of 100 Mbit/s. In all cases, the throughput results with the 100 MHz CPU are somewhat better than with the 66 MHz CPU. This again indicates that the protocol implementation is processor-intensive, with the 80486-based machines unable to supply full throughput. The bottlenecks are therefore not related to the intrinsic capability of FDDI, but to the performance of the transfer protocols themselves and to the CPU's ability to perform protocol processing.

With only the Media Access Control (MAC) layer, 47,2 Mbit/s is achieved. With only the MAC and Logical Link Control (LLC) layers, this reduces to 37,8 Mbit/s, indicating a significant overhead of the LLC layer.

With the Transport Layer (TP4) in connectionless mode, this reduces to 29,9 Mbit/s; thus TP4 contributes a significant overhead. In connection-oriented mode, the throughput increases to 38,5 Mbit/s, indicating a modest increase in efficiency and therefore throughput over connectionless mode. With the Network Layer rendered inactive, the throughput again improves marginally from 29,9 Mbit/s to 32,0 Mbit/s in connectionless mode, indicating that this layer does contribute a small overhead. In connection-oriented mode there is a significant improvement to 44,9 Mbit/s, again indicating the improved efficiency of this mode.
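The claim that the FDDI results are limited by processing rather than by the medium can be checked with simple arithmetic on Table II. The sketch below computes, for each configuration, the throughput gain from the 66 MHz card to the 100 MHz card and the fraction of the 100 Mbit/s FDDI line rate reached by the faster card. The variable and function names are illustrative assumptions; the figures are those of Table II with decimal commas written as decimal points.

```python
# Throughput figures (Mbit/s) copied from Table II:
# (CL486/DAS at 66 MHz, CL486/DA4 at 100 MHz).
TABLE_II = {
    "MAC": (40.5, 47.2),
    "LLC1": (34.2, 37.8),
    "TP4 connectionless, ES-IS": (27.0, 29.9),
    "TP4 connection-oriented, ES-IS": (34.3, 38.5),
    "TP4 connectionless, inactive NL": (28.2, 32.0),
    "TP4 connection-oriented, inactive NL": (37.3, 44.9),
}

FDDI_LINE_RATE = 100.0      # Mbit/s
CLOCK_RATIO = 100.0 / 66.0  # nominal CPU clock ratio of the two cards


def summarise(table):
    """Return the throughput gain and line-rate utilisation for each row."""
    rows = {}
    for name, (das, da4) in table.items():
        rows[name] = {
            "gain": da4 / das,
            "utilisation_da4": da4 / FDDI_LINE_RATE,
        }
    return rows


summary = summarise(TABLE_II)
for name, row in summary.items():
    print(f"{name:38s} gain x{row['gain']:.2f}  "
          f"utilisation {row['utilisation_da4']:.0%}")
```

For the figures in Table II, every configuration gains throughput with the faster CPU, but by less than the ratio of the clock rates, and none reaches half of the FDDI line rate.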

3.3 XTP over FDDI

Table III provides XTP over FDDI throughput test results.

3.3.1 Test Conditions

Equipment : As indicated

Conditions : end-to-end (user memory to user memory)

64 Kbyte messages

3.3.2 Test Results

Result     Computing Platform
           Network     CPU Type       Bus Type  CPU Speed  Operating
           Technology                                      System
55 Mbit/s  FDDI        Intel 80486DX  EISA      50 MHz     MS-DOS
92 Mbit/s  FDDI        IBM RISC       MCA                  AIX

Table III : XTP over FDDI Throughput Performance Test Results for Multibus II NIC


3.3.3 Analysis of Results

Throughput tests of XTP over FDDI on a PC-type machine show a throughput of 55 Mbit/s, which is commendable considering the modest performance of the PC EISA parallel backplane bus (PBB) and MS-DOS operating system. The superior performance of the IBM RISC machine with its MCA PBB and AIX operating system allows 92 Mbit/s, showing that the data transfer rate is directly related to the protocol processing performance of the host.

In both cases, the throughput is much higher than in the case of the Multibus II-based NIC as indicated in Table II. In PC/EISA, MCA and Multibus II, the parallel backplane bus is capable of 264 Mbit/s (33 Mbytes/s x 8 bits/byte) raw bandwidth. Two conclusions can therefore be drawn: XTP is more efficient than TP4, and/or the Multibus II-based NIC suffers from an inherent design bottleneck.


4. Conclusions

4.1 Applicability

It is concluded that none of the current commercial protocol standards is appropriate for real-time, mission-critical, distributed systems. Supporting throughput of tens to hundreds of Mbit/s and sub-millisecond latency will require protocols which have only recently been conceptualised and implemented.

While TCP/IP has found extensive implementation in large and sophisticated networks, it was designed in the era of 56 kbit/s data links[114] and is intrinsically unable to support data rates much above a few Mbit/s [70]. Similarly, ISO TP4, although extensive in its internetworking features, has not been designed for deterministic and real-time data transfer.

Because of TCP's and TP4's intrinsic connection-oriented designs, they cannot efficiently provide request/response type interactions without an excessive number of connection management packet exchanges. TCP and TP4 also lack sophisticated priority message scheduling schemes which renders them unsuitable for real-time applications.

Only emerging protocol standards such as GAM-T-103 and XTP are capable of meeting the throughput and other requirements of real-time, mission-critical, distributed systems. Of these, XTP has emerged as the leading contender for standardization for real-time LANs, viz. SAFENET.

While GAM-T-103 is optimised for real-time applications, it essentially constitutes a proprietary approach. It is also a transfer protocol that does not conform to the OSI Reference Model, whereas XTP was pressured by the standards community to revert from a transfer protocol to a true transport protocol. For these reasons, GAM-T-103 is unlikely to find widespread application outside the French Ministry of Defence. Some of its features, however, are being considered for the next generation ISO Enhanced Transport Service.

4.2 Maximum Performance Option

Due to its real-time capability and extended flexibility, the maximum performance option is concluded to be XTP.

4.3 Maximum Interoperability Options

Primarily due to its widespread Internet application, the maximum interoperability option is concluded to be TCP.


4.4 Next Generation Standard Transport Protocols

It is relevant to note that many of the features being proposed for the next generation of standard commercial transport protocols are already provided by XTP. These include multicast, priority message scheduling, reservation mode and out-of-band data.

Areas which require attention are latency control, jitter control, guaranteed quality of service (QoS) and security features. While progress is being made on the definition and design of next generation transport protocols, these efforts are still some way from completion and general acceptance.


5. Recommendations

5.1 Maximum Performance Option

Where maximum real-time performance is required from the transport protocol, the recommended option is XTP, especially in real-time control systems.

5.2 Maximum Interoperability Options

Where maximum interoperability is required from the transport protocol, TCP is recommended.

Where the system incorporates real-time features as well as maximum interoperability, the recommended approach is a multiprotocol combination of TCP and XTP.

Appendix E

Application Interface Services


Appendix E

1. Scope
   1.1 Scope
   1.2 Introduction
   1.3 Appendix Layout

2. APIS Protocol - Characteristics
   2.1 APIS Requirements, Goals and Constraints
   2.2 APIS Overview
   2.3 System Dataflow Management and Interface Control
   2.4 Integration and Verification
   2.5 APIS Services
       2.5.1 APIS_INIT()
             2.5.1.1 Message Description
             2.5.1.2 Parameters
       2.5.2 APIS_OPEN()
             2.5.2.1 Message Description
             2.5.2.2 Parameters
       2.5.3 APIS_CLOSE()
             2.5.3.1 Message Description
             2.5.3.2 Parameters
       2.5.4 APIS_PRODUCE()
             2.5.4.1 Message Description
             2.5.4.2 Parameters
       2.5.5 APIS_DEMAND()
             2.5.5.1 Message Description
             2.5.5.2 Parameters
       2.5.6 APIS_REMOVE_PRODUCE()
             2.5.6.1 Message Description
             2.5.6.2 Parameters
       2.5.7 APIS_REMOVE_DEMAND()
             2.5.7.1 Message Description
             2.5.7.2 Parameters
       2.5.8 APIS_SEND_MSG()
             2.5.8.1 Message Description
             2.5.8.2 Parameters
       2.5.9 APIS_RECEIVE_MSG()
             2.5.9.1 Message Description
             2.5.9.2 Parameters
   2.6 APIS Error Messages
   2.7 APIS Diagrams
       2.7.1 APIS Environment
       2.7.2 APIS Transition Diagram

3. Conclusions
   3.1 Evolutionary Development
   3.2 Implementation Coherency
   3.3 Object-Oriented Implementation
   3.4 Implementation Reliability
   3.5 Wildcard Listening

4. Recommendations
   4.1 Object-Oriented Implementation
   4.2 Multitasking Operating System Support
   4.3 Full-Function IP
   4.4 Formal Validation

List of Figures

Figure 1 : Typical APIS Environment

Figure 2 : APIS Transition Diagram


1. Scope

1.1 Scope

This appendix describes the detailed characteristics of a proprietary LAN protocol, Application Interface Services (APIS). APIS was developed during the course of the research reported in this thesis to meet the functional and performance requirements of a specific real-time, mission-critical, distributed system. APIS conceptually spans Layers 5 to 7 of the OSI model.

1.2 Introduction

The International Standards Organisation (ISO) defined, and in 1984 published, a framework for communications standards which partitions the functions required for communication into seven layers. This was termed the Open Systems Interconnection (OSI) Basic Reference Model. The lower four layers are collectively termed the transfer layers and provide the basic infrastructure for network communication. Above the transfer layers are the Session, Presentation and Application Layers.

The Session Layer provides the means necessary for two application layer protocol entities to organize and synchronise their dialogue and to manage data exchange. It is responsible for establishing and disestablishing a communication channel between two peer application layer entities for the duration of the entire network transaction. Once such a transaction is in progress, a number of optional services are offered, including interaction management, logical synchronisation and exception reporting.

The Presentation Layer is responsible for the syntax of the data during transfer between two peer application layer protocol entities. To achieve true open systems interconnectivity, a number of common abstract data syntax formats have been defined for use by application layer entities together with associated transfer (or concrete) syntaxes. The presentation layer therefore negotiates and selects the appropriate transfer syntax to be used during a transaction so that the syntax of the messages being exchanged between two application entities is maintained. If the external format is different from the internal format, the presentation layer performs the necessary conversions.

The Application Layer provides the user interface with a range of network-wide distributed information services. These include file transfer, access and management as well as document and general message exchange services such as electronic mail. Access to the application services is normally implemented through a defined set of primitives, each with associated parameters, which are supported by the local operating system.

The Session, Presentation and Application Layers are known to require extensive processing support and therefore exhibit modest performance in terms of latency and throughput. Real-time computer systems have specific functional objectives (missions) and are also normally constructed from hardware specific to the needs of the system. Such systems call for deterministic data transfer and the minimisation of latency, rather than the maximisation of interoperability and provision of services such as electronic mail and document access. In many cases, a lightweight suite of protocol services is better suited to the needs of the real-time system than the standard commercial protocols defined by OSI. Such a lightweight suite is termed an Application Programming Interface (API).

Access to user services at the Application Layer can be greatly facilitated through the use of standardised APIs. APIs provide well defined accessibility for application programs to obtain services or information from the underlying service provider while hiding the complexity of that service provider from the application programmer.

An API to meet specific performance requirements and implementation constraints was developed by a team of developers under leadership of the author. This API is termed the Application Interface Services (APIS) layer.

1.3 Appendix Layout

The appendix commences with an overview of the characteristics of an Application Interface Services protocol designed to meet the specific requirements of a real-time, distributed system. It then proceeds with a description of the generic implementation thereof in terms of interface characteristics, services call descriptions, call parameters, transition diagram and error messages.

Important implications of these characteristics and implementations are then analysed within the context of real-time, mission-critical, distributed systems. Conclusions and recommendations in the context of APIS are then made.


2. APIS Protocol - Characteristics

2.1 APIS Requirements, Goals and Constraints

The following were the requirements, goals and constraints of APIS :

- Interface to a Multibus II-based FDDI Network Interface Card using the Multibus II Message Passing Transport Protocol (TP).

- Interface to the iRMX real-time multitasking operating system in the case of the Multibus II implementation.

- Interface to a real-time Unix operating system.

- Provide a worst case end-to-end (application-to-application) latency over the Multibus II parallel backplane bus as well as FDDI of less than 5 ms using commercial off-the-shelf technology.

- Provide a worst case end-to-end (application-to-application) throughput over the Multibus II parallel backplane bus as well as FDDI of greater than 15 Mbit/s using commercial off-the-shelf technology.

- Provide maximum transparency between the application user and the network.

- Provide a virtual backplane to the application user.

2.2 APIS Overview

Application Interface Services (APIS) is a network communications protocol designed for the exchange of information between functionally independent applications incorporated into a distributed, real-time system.

APIS conceptually encompasses Layers 5 to 7 of the ISO OSI Reference Model. It therefore interfaces below to Layer 4, the Transport Layer, and above to the APIS Service User (ASU), which will normally be a collaborative network application running on an independent host CPU.

The ASU is a producer and/or consumer of data of different types. Data types are pre-defined by an ASU administration authority, i.e. the System Data Manager (e.g. the system integration authority) as part of the network system design and each data type is ascribed a unique identification code or Message Identifier. The APIS protocol establishes the necessary communication channels between ASUs by registering and matching their producers and consumers. LAN dataflow will therefore be determined by the data type of ASU messages and not by predefined ASU addresses.

This data-driven approach to dataflow management provides a higher level of flexibility than the traditional addressed-point-to-addressed-point facilities provided by general purpose LAN protocols. The objective of this approach is to simplify ASU communication and configuration logic, thereby decoupling system design from network design.
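The matching of producers and consumers by Message Identifier, rather than by node address, can be illustrated with a toy registry. The following Python sketch is not the APIS implementation; the class name DataflowRegistry, its methods, and the ASU names and Message Identifier value used in the example are all hypothetical.

```python
from collections import defaultdict


class DataflowRegistry:
    """Toy model of data-driven routing: ASUs are matched by Message Identifier."""

    def __init__(self):
        self._producers = defaultdict(set)  # Message ID -> names of producing ASUs
        self._consumers = defaultdict(set)  # Message ID -> names of consuming ASUs

    def register_producer(self, asu, message_id):
        self._producers[message_id].add(asu)

    def register_consumer(self, asu, message_id):
        self._consumers[message_id].add(asu)

    def destinations(self, asu, message_id):
        """Return the consumers that should receive `message_id` sent by `asu`."""
        if asu not in self._producers.get(message_id, ()):
            raise LookupError(f"{asu} has not registered as producer of {message_id}")
        return sorted(self._consumers.get(message_id, ()))


registry = DataflowRegistry()
registry.register_producer("radar", 0x10)
registry.register_consumer("display", 0x10)
registry.register_consumer("logger", 0x10)

# The sender names only the data type; it never names a destination address.
print(registry.destinations("radar", 0x10))
```

Adding a further consumer of the same data type requires no change to the producing application, which is the decoupling the data-driven approach aims for.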

2.3 System Dataflow Management and Interface Control

APIS has been designed to support system management and interface control. It does this by allowing dataflow to be treated as a system and sub-system application issue requiring no knowledge of the details of network interface drivers and transfer protocols. Changes in dataflow definitions only have implications for directly related applications and do not impact the communication infrastructure in any way.

2.4 Integration and Verification

In order to support integration of APIS as well as verification of data interfaces, two supplementary products were developed. The first is the APIS Test Shell, a simple-to-use, graphically-based, man-machine interface to APIS. This allows an interface developer to statically set up messages and message contents and observe all relevant interactions between the co-operating LAN nodes.

The second, the Integration and Verification Test Tool (IVIT), is a more sophisticated APIS protocol analyzer capable of detailed protocol analysis, message capture, storage and filtering. IVIT is capable of static scenario simulation, but also provides a programming interface to higher level simulation applications, allowing a fully dynamic scenario or mission simulator to be constructed.

2.5 APIS Services

APIS services are those supplied to the APIS Service User. These services are implemented by a set of simple software calls. The application user can effect all data transfer requirements through this set of calls. Each call has an associated set of parameters passed between the ASU and APIS.

The ASU, residing on the host CPU and separated from the NIC by the parallel backplane bus (PBB), in this case Multibus II (MBII), invokes these calls by means of Multibus Transport Protocol messages. Thus these service calls over the MBII PBB define the interface between the ASUs and APIS. Service calls are accompanied by a comprehensive set of Error Messages which allow the host and APIS, implemented on the NIC, to communicate on anomalous conditions thereby allowing corrective action and enhancing communication reliability.

A description of each APIS service call with associated parameters follows.

2.5.1 APIS_INIT()

2.5.1.1 Message Description

This message is used to inform APIS of the start-up of an application host processor. It must be called once by every host, before any applications on this host issue any other APIS request. It allows for the removal of all ASU information linked to the application host processor that issued the command, as well as for freeing associated unused memory buffers. It is a Multibus II TP transaction message of type unsolicited request with unsolicited reply. The unsolicited request will come from the application host and the unsolicited reply from the NIC.

2.5.1.2 Parameters

Command ID ASU Host ID Status Port ID

2.5.2 APIS_OPEN()

2.5.2.1 Message Description

This message is used to open APIS for an application. It is a Multibus II TP transaction message of type unsolicited request with unsolicited reply. The unsolicited request will come from the application host and the unsolicited reply from the NIC.

2.5.2.2 Parameters

Command ID Application ID ASU Host ID Status Port ID ASU Text String ASAP

2.5.3 APIS_CLOSE()

2.5.3.1 Message Description

This message is used to close an Application Service Access Point (ASAP). All messages sent via this ASAP, before this command is issued and successfully acknowledged, will be transferred to consumers. All messages previously registered under this ASAP will be removed. It is a Multibus II TP transaction message of type unsolicited request with unsolicited reply. The unsolicited request will come from the application host and the unsolicited reply from the NIC. If an ASAP is closed, all the messages transmitted and received will be removed from the APIS memory.


2.5.3.2 Parameters

Command ID ASAP ASU Host ID Status Port ID

2.5.4 APIS_PRODUCE()

2.5.4.1 Message Description

This message is used to add a message for transmission to the message list. It is a Multibus II TP transaction message of type unsolicited request with unsolicited reply. The unsolicited request will come from the application host and the unsolicited reply from the NIC.

2.5.4.2 Parameters

Command ID Priority ASAP ASU Host ID Status Port ID Message ID Data Length Committed Repetition Interval

2.5.5 APIS_DEMAND()

2.5.5.1 Message Description

This message is used to add a message for reception to the message list. It is a Multibus II TP transaction message of type unsolicited request with unsolicited reply. The unsolicited request will come from the application host and the unsolicited reply from the NIC.

2.5.5.2 Parameters

Command ID ASAP ASU Host ID Status Port ID Message ID Consumer Port ID Repetition Interval


2.5.6 APIS_REMOVE_PRODUCE()

2.5.6.1 Message Description

This message is used to delete a message for transmission from the message list. All messages matching the specified Message ID, sent via this ASAP before this command is issued and successfully acknowledged, will be transferred to consumers. It is a Multibus II TP transaction message of type unsolicited request with unsolicited reply. The unsolicited request will come from the application host and the unsolicited reply from the NIC.

2.5.6.2 Parameters

Command ID ASAP ASU Host ID Status Port ID Message ID

2.5.7 APIS_REMOVE_DEMAND()

2.5.7.1 Message Description

This message is used to delete a message for reception from the message list. No messages, as specified by the Message ID parameter, received after this command is issued and successfully acknowledged, will be transferred to the ASU. It is a Multibus II TP transaction message of type unsolicited request with unsolicited reply. The unsolicited request will come from the application host and the unsolicited reply from the NIC.

2.5.7.2 Parameters

Command ID ASAP ASU Host ID Status Port ID Message ID

2.5.8 APIS_SEND_MSG()

2.5.8.1 Message Description

This message is used to send a stream data message from an application. It is a Multibus II TP transaction message of type solicited request with unsolicited reply. The solicited request will come from the application host and the unsolicited reply from the NIC.


2.5.8.2 Parameters

Command ID ASU Host ID Status Port ID Message ID ASAP Message Length Message Content

2.5.9 APIS_RECEIVE_MSG()

2.5.9.1 Message Description

This message is used to send a stream data message received by the NIC to the application data stream socket. It is a Multibus II TP transaction message of type solicited message. The solicited message will come from the NIC. No reply is required for this data transfer event.

This message is passed through the stream data socket (Consumer Socket) as specified in the APIS_DEMAND() command.

2.5.9.2 Parameters

Command ID Message ID ASAP Message Length Message Content
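The ordering constraints stated in the service descriptions above (APIS_INIT() before any other request, an open ASAP before registration, registration before sending, and removal of registrations on close) can be summarised in a small state model. The Python sketch below is an illustration under stated assumptions, not the APIS code: the class and method names are hypothetical, the caller supplies the ASAP value, priority is ignored, and the pairing of each failure with an error name from Section 2.6 is an interpretation of the published descriptions.

```python
class ApisError(Exception):
    """Carries the name of an APIS error code listed in Section 2.6."""


class ApisModel:
    def __init__(self):
        self.initialised = False
        self.asaps = {}  # ASAP -> {"produce": set of IDs, "demand": set of IDs}

    def init(self):
        # APIS_INIT(): discard all information held for this host.
        self.initialised = True
        self.asaps = {}

    def _require_init(self):
        if not self.initialised:
            # Assumption: the source names no error code for this case.
            raise RuntimeError("APIS_INIT() must be issued first")

    def _lists(self, asap):
        self._require_init()
        if asap not in self.asaps:
            raise ApisError("E_ASAP_NOT_OPEN")
        return self.asaps[asap]

    def open(self, asap):
        self._require_init()
        self.asaps[asap] = {"produce": set(), "demand": set()}

    def close(self, asap):
        # APIS_CLOSE(): all registrations under this ASAP are removed.
        self._lists(asap)
        del self.asaps[asap]

    def produce(self, asap, message_id):
        lists = self._lists(asap)
        if message_id in lists["produce"]:
            raise ApisError("E_PROD_EXIST_FOR_ASAP")
        lists["produce"].add(message_id)

    def demand(self, asap, message_id):
        lists = self._lists(asap)
        if message_id in lists["demand"]:
            raise ApisError("E_DEMAND_EXIST_FOR_ASAP")
        lists["demand"].add(message_id)

    def remove_produce(self, asap, message_id):
        lists = self._lists(asap)
        if message_id not in lists["produce"]:
            raise ApisError("E_MSG_NOT_REGISTERED")
        lists["produce"].remove(message_id)

    def send_msg(self, asap, message_id):
        if message_id not in self._lists(asap)["produce"]:
            raise ApisError("E_MSG_NOT_REGISTERED")
        return True


model = ApisModel()
model.init()
model.open(asap=1)
model.produce(asap=1, message_id=7)
print(model.send_msg(asap=1, message_id=7))
```

A model of this kind is mainly useful as a test oracle when exercising an implementation, since every rejected call corresponds to one explicit precondition.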


2.6 APIS Error Messages

The following list provides all possible APIS error codes and their meanings which are returned by the above APIS service calls :

Return Value             Decimal Value  Status Description

NO_ERROR                    0           no error occurred

E_ASAP_NOT_OPEN          -101           specified ASAP not registered on this NIC
E_APID_IN_USE            -102           specified Application ID has already been used on this NIC

E_MSG_ID_INVALID         -201           specified Msg_Id is invalid
E_STRING_INVALID         -202           specified ASU TEXT STRING is invalid
E_LENGTH_INVALID         -203           specified Length is invalid
E_PRIORITY_INVALID       -204           specified Priority is invalid
E_INTERVAL_INVALID       -205           specified Repetition Interval is invalid
E_CONS_SKT_INVALID       -206           specified MBII socket is invalid

E_PROD_EXIST_FOR_ASAP    -301           specified Msg_Id is already registered for this Priority and ASAP
E_PRODUCER_LIMIT         -302           maximum number of producers has registered this Msg_Id
E_PROD_BW_CHANGE         -303           specified Length and Repetition Interval values do not correspond
                                        with previously registered values. (Only valid if specified Msg_Id
                                        is already registered under a different ASAP).

E_DEMAND_EXIST_FOR_ASAP  -401           specified Msg_Id is already registered for this ASAP

E_MSG_NOT_REGISTERED     -501           specified Msg_Id is not registered for this ASAP

E_ASAP_INVALID           -601           specified Msg_Id not registered for this ASAP
E_NO_CONSUMER            -602           no consumers are registered for this Msg_Id
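For a host application written in a higher-level language, the error codes above map naturally onto an enumeration. The Python sketch below reproduces the listed names and decimal values; the class name ApisStatus and the helpers decode and series are hypothetical, and reading the hundreds digit as a grouping is only an observation about the numbering, not something the source states.

```python
from enum import IntEnum


class ApisStatus(IntEnum):
    """Status values returned by the APIS service calls (Section 2.6)."""
    NO_ERROR = 0
    E_ASAP_NOT_OPEN = -101
    E_APID_IN_USE = -102
    E_MSG_ID_INVALID = -201
    E_STRING_INVALID = -202
    E_LENGTH_INVALID = -203
    E_PRIORITY_INVALID = -204
    E_INTERVAL_INVALID = -205
    E_CONS_SKT_INVALID = -206
    E_PROD_EXIST_FOR_ASAP = -301
    E_PRODUCER_LIMIT = -302
    E_PROD_BW_CHANGE = -303
    E_DEMAND_EXIST_FOR_ASAP = -401
    E_MSG_NOT_REGISTERED = -501
    E_ASAP_INVALID = -601
    E_NO_CONSUMER = -602


def decode(value):
    """Return the symbolic name for a status value, or a fallback if unknown."""
    try:
        return ApisStatus(value).name
    except ValueError:
        return f"UNKNOWN({value})"


def series(value):
    """Return the hundreds digit of a status value (0 for NO_ERROR)."""
    return abs(value) // 100


print(decode(-303), series(-303))
```

Keeping the fallback branch in decode means that a status value added in a later APIS version is reported rather than raising an exception in the host application.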


2.7 APIS Diagrams

2.7.1 APIS Environment

Figure 1 below depicts a typical APIS environment.

Figure 1 : Typical APIS Environment

2.7.2 APIS Transition Diagram

Figure 2 below depicts the relationship of the various APIS service calls :

Figure 2 : APIS Transition Diagram


3. Conclusions

3.1 Evolutionary Development

To date, two versions of APIS have been developed, i.e. V1.0 and V2.0. The former was an experimental prototype used to validate the design concept. The development methodology used was rapid prototyping. APIS V1.0 was coded in the C high-level language to operate above the TP4 transport protocol (Retix TP4 in the PC case and CLA TP4 in the Multibus II case). Apart from concept validation, the exercise allowed a formal interface specification to be developed and provided the basis for the formal definition of APIS V2.0.

APIS V2.0 has also undergone full-scale development, using more formal software development methodologies. Initially, V2.0 underwent formal design using a Computer-Aided Software Engineering (CASE) tool (in this case Rational Rose) and a formal object-oriented methodology (in this case the Booch notation). It had been planned that APIS V2.0 would be coded in the C++ high-level language; however, the iRMX operating system's support of C++ was short-lived, thus requiring a return to standard C. This effectively invalidated the formal object-oriented design. However, the implementation was made to correspond with the formal design as far as possible. In essence then, V2.0 was derived using an evolutionary prototyping approach.

APIS V2.0 operates above the XTP V4.0 transport protocol (specifically NXI XTP V4.01), using the latter's multicast and priority capabilities. This version of XTP incorporates an encapsulated IP service which does not provide a broadcast facility. APIS therefore accesses XTP for unicast and multicast, but LLC for broadcast. This is considered somewhat inelegant and may detract from its open systems design.

3.2 Implementation Coherency

The APIS protocol has been successfully implemented on the MS-DOS operating system and PC platform as well as iRMK real-time kernel and Multibus II platform. The interaction between APIS and the two operating systems was kept the same as far as possible and consequently there is a common APIS kernel which both implementations use. It is important to incorporate the APIS protocol stack into the operating system as seamlessly as possible. DOS device drivers are unwieldy and limit achieving greater co-operation between APIS and DOS. Currently APIS running with DOS is not interrupt-driven and it therefore needs to poll the transport layer device driver. The onus is therefore on the ASU to service APIS regularly to ensure that the communications system achieves real-time performance. A real-time PC operating system would circumvent this limitation.

3.3 Object-Oriented Implementation

While the design of the protocol has followed an object-oriented approach, the implementation has been coded in ANSI C. A rewrite using C++ will be advantageous for extensibility and reusability, especially if it is to be ported to an object-oriented operating system.

3.4 Implementation Reliability

Consumer_Sockets are memory pointers that are passed over the network by the PC-based APIS implementation. This is not a reliable method due to the fact that network errors can cause the loss of these pointers. An intermediate indexed table would eliminate the consequence of dereferencing an invalid pointer, but would also add to protocol processing.
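
The intermediate indexed table mentioned above can be sketched as follows. This is a minimal illustration only, written in Python for brevity; the names ConsumerTable, register, release and resolve are hypothetical and are not part of APIS. Instead of a raw memory pointer, a small index plus a generation tag is carried over the network, so that a corrupted or stale value is rejected rather than dereferenced.

```python
class ConsumerTable:
    """Maps small integer handles to local consumer objects (hypothetical sketch)."""

    def __init__(self, size=16):
        self._slots = [None] * size      # consumer object held in each slot
        self._generation = [0] * size    # incremented each time a slot is released

    def register(self, consumer):
        """Store a consumer and return an (index, generation) handle."""
        for index, slot in enumerate(self._slots):
            if slot is None:
                self._slots[index] = consumer
                return (index, self._generation[index])
        raise RuntimeError("consumer table full")

    def release(self, handle):
        """Free a slot; handles issued earlier for it become invalid."""
        index, _ = handle
        self._slots[index] = None
        self._generation[index] += 1

    def resolve(self, handle):
        """Return the consumer for a handle, or None if the handle is invalid."""
        index, generation = handle
        if not 0 <= index < len(self._slots):
            return None                  # index corrupted in transit
        if self._generation[index] != generation:
            return None                  # handle refers to a released slot
        return self._slots[index]
```

The extra lookup and comparison on every received message is the additional protocol processing referred to above.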

3.5 Wildcard Listening

The fundamental concept of APIS relies on broadcast, multicast and wildcarding. The underlying transfer layer protocols need to provide wildcard listening functionality to APIS.

4. Recommendations

4.1 Object-Oriented Implementation

As long as the real-time operating system supports C++, the APIS kernel should be rewritten (to V3.0) using the object-oriented design approach and the C++ language.

4.2 Multitasking Operating System Support

PC-based APIS should be implemented under a true multitasking operating system such as LynxOS or VxWorks real-time Unix in order to achieve true real-time performance. Suitable drivers for the PC network interface hardware will be required.

4.3 Full-Function IP

A full-function IP layer, providing the broadcast facility, is required by APIS and should be acquired and ported to operate above LLC1 and below XTP.

4.4 Formal Validation

The feasibility of formally validating APIS for correctness and completeness using an accepted formal protocol validation methodology should be investigated. This would be a complex task requiring considerable effort.

Appendix F

Network Time Services

Appendix F

1. Scope
1.1 Scope
1.2 Introduction
1.3 Appendix Layout

2. FDDI Ring Latency Time Analysis
2.1 Medium Propagation Time
2.2 PHY Latency
2.3 FDDI Ring Latency Time (RLT)
2.4 RTCS Cycle Times
2.5 TTRT for the RTCS

3. Network Time Protocol
3.1 Nomenclature
3.2 Network Time Protocol Implementation
3.3 NTP and the SAFENET Standard
3.4 NTP Hardware Requirements
3.5 NTP using FDDI as a Medium
3.5.1 Asymmetry of the FDDI Ring
3.5.2 Fast Initialisation by Synchronisation Seed
3.6 Implementation of NTP
3.6.1 Implementing NTP on the FDDI NIC
3.6.2 Implementing the NTP on the Host CPU
3.7 NTP Timestamp Format
3.8 Exchanging Timestamps
3.9 Maintaining an Accurate Local Clock
3.9.1 Input Frequency as a Power of Two
3.9.2 Input Frequency not a Power of Two
3.10 Adjusting the NTP Clock
3.11 Accessing the NTP Clock
3.12 NTS Software Architecture
3.13 Use of Existing Code

4. NTP over FDDI Error Analysis
4.1 Network Intact
4.2 Network Interrupted in Forward Path
4.3 Network Interrupted in Return Path
4.4 Correctness Interval

5. NTP Performance Measurements
5.1 Test Scenarios
5.2 Test Results
5.3 Test Conclusions

6. Conclusions and Recommendations
6.1 LAN Timing
6.2 NTP Performance
6.3 Synchronisation
6.4 Synchronous Bandwidth Allocation
6.5 Topology Implications

List of Figures

Figure 1 : Exchange of Timestamps
Figure 2 : NTS Architecture
Figure 3 : FDDI with Ring Intact
Figure 4 : FDDI with Ring Interrupted (A)

List of Tables

Table I : Minimum TTRTs for Various Fractions of Synchronous FDDI Traffic
Table II : Synchronisation Accuracies for FDDI Ring in Thru' and Wrapped States
Table III : NTP Measurement Environment

1. Scope

1.1 Scope

This appendix addresses the time-oriented issue of FDDI ring latency. FDDI token rotation times are analyzed in terms of intrinsic FDDI performance capabilities and typical FDDI LAN sizes, i.e. range and number of nodes. Analysis determines that bounded minimum latencies are possible using FDDI with various proportions of synchronous and asynchronous traffic.

The appendix also addresses the Network Time Services relevant to the implementation of real-time distributed systems. In particular, it introduces a Network Time Protocol and addresses the feasibility of the latter for use as the primary synchronisation mechanism when the local time on various networked devices needs to be synchronised to sub-millisecond accuracies. A mechanism for distributing accurate calendar time is also proposed.

1.2 Introduction

FDDI employs a dual counter-rotating ring topology and an early release, timed token, MAC-layer protocol. As such, it is capable of high performance in terms of throughput. However, finite, and in certain cases significant, latency and jitter are inherent in the technology and topology. An FDDI Ring Latency Time analysis is therefore necessary to determine the capability of the underlying FDDI token ring transfer service in terms of latency and jitter.

Network Time Services (NTS) are those services which provide nodes with the means of synchronizing between themselves (i.e. relative synchronisation) and with calendar time (i.e. absolute synchronisation) in order to perform timestamping of data packets transferred between nodes. Such timestamping is required in order to provide timeliness of the collaborative distributed processes within the system. This is required due to the fact that the data transfer services are incapable of guaranteeing latencies low enough for the requirements of certain distributed algorithms.

The Network Time Protocol (NTP) is an extended profile protocol which implements timing mechanisms between all participating sub-systems over the network and provides basic functionality such as synchronisation and timestamping to NTS. NTS in turn provides user services to the application.

1.3 Appendix Layout

The appendix commences with an FDDI Ring Latency Time analysis undertaken in the context of a typical real-time control system LAN using FDDI technology in a dual-attached ring topology.

An overview of the characteristics of a Network Time Protocol designed to provide special timing services to the real-time distributed system is then provided. It then proceeds with a description of a specific implementation thereof, as well as an error analysis which determines the timing capability of NTP over FDDI. Important implications of these characteristics and implementation are then analysed within the context of real-time, mission-critical, distributed systems.

Specific conclusions and recommendations are then made, with the most significant of these being adopted in the main section.

2. FDDI Ring Latency Time Analysis

An FDDI LAN in its standard configuration consists of a dual-redundant, counter-rotating ring of multimode fibre optic cable. Each node is separated by a cable segment and contains an active network interface card (NIC). The NIC converts optical signals to electrical signals, amplifies and filters these and then converts them back to optical signals. These processes require a finite amount of time. Under normal conditions, i.e. no fault (thru' state), the optical path equals the ring circumference. Under conditions of one LAN fault (wrapped state), the optical path equals twice the ring circumference.

Dual-attached NICs have two physical interfaces (PHYs); normally only one being active. In the wrapped state both PHYs become active in order to complete the ring using the redundant optical fibre.

The latency of an FDDI ring is dependent on two factors, medium propagation delay (MPD) and PHY latency. MPD is the time it takes for the light signal to travel a unit distance over the optical medium.

Relevant figures are determined below for a typical real-time control system (designated RTCS) and worst case (designated WC).

2.1 Medium Propagation Time

Medium Propagation Time (MPT) is the product of MPD and the path length i.e. the FDDI ring circumference (RC) :

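$$\mathrm{MPT} = \mathrm{MPD} \times \mathrm{RC}$$

$$\mathrm{MPD} = 5{,}1\ \mu\mathrm{s/km} \quad \text{(FDDI standard)}$$

RTCS

$$\mathrm{RC}_{\mathrm{RTCS}} \approx 2{,}5\ \mathrm{km} \quad \text{(typical - thru' state)}$$

$$\mathrm{RC}_{\mathrm{RTCS}}' \approx 5{,}0\ \mathrm{km} \quad \text{(typical - wrapped state)}$$

Worst Case

$$\mathrm{RC}_{\mathrm{WC}} = 100\ \mathrm{km} \quad \text{(FDDI standard - thru' state)}$$

$$\mathrm{RC}_{\mathrm{WC}}' = 200\ \mathrm{km} \quad \text{(FDDI standard - wrapped state)}$$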

2.2 PHY Latency

Total PHY Latency (TPL) is the sum of the individual physical interface (PHY) latencies (PL), which is equivalent to the product of the number of active PHYs and the PHY latency. In the thru' state each dual-attachment station (DAS) has one active PHY. In the wrapped state, which is applicable when one station has failed, each DAS has two active PHYs.

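$$\mathrm{TPL} = \mathrm{PL} \times D \quad \text{(thru' state)}$$

$$\mathrm{TPL}' = \mathrm{PL} \times N \quad \text{(wrapped state)}$$

$$\mathrm{PL} = 0{,}6\ \mu\mathrm{s\ per\ PHY} \quad \text{(FDDI standard)}$$

RTCS

$$\text{No. of DASs } (D) \approx 50 \quad \text{(typical - thru' state)}$$

$$\text{No. of PHYs } (N) \approx 100 \quad \text{(typical - wrapped state)}$$

Worst Case

$$\text{No. of DASs } (D) = 500 \quad \text{(FDDI standard - thru' state)}$$

$$\text{No. of PHYs } (N) = 1\,000 \quad \text{(FDDI standard - wrapped state)}$$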

2.3 FDDI Ring Latency Time (RLT)

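Thru' Ring Configuration

$$\mathrm{RLT} = \mathrm{MPT} + \mathrm{TPL} = \mathrm{MPD} \times \mathrm{RC} + \mathrm{PL} \times D$$

$$\mathrm{RLT}_{\mathrm{RTCS}} = 5{,}1 \times 2{,}5 + 0{,}6 \times 50 \approx 43\ \mu\mathrm{s}$$

$$\mathrm{RLT}_{\mathrm{WC}} = 5{,}1 \times 100 + 0{,}6 \times 500 = 810\ \mu\mathrm{s}$$

Wrapped Ring Configuration

In the wrapped state both the optical path and the number of active PHYs double, i.e. RC' = 2 RC and N = 2 D :

$$\mathrm{RLT}' = \mathrm{MPD} \times \mathrm{RC}' + \mathrm{PL} \times N = \mathrm{MPD} \times (2 \times \mathrm{RC}) + \mathrm{PL} \times (2 \times D)$$

$$\mathrm{RLT}_{\mathrm{RTCS}}' = 5{,}1 \times (2 \times 2{,}5) + 0{,}6 \times (2 \times 50) \approx 86\ \mu\mathrm{s}$$

$$\mathrm{RLT}_{\mathrm{WC}}' = 5{,}1 \times (2 \times 100) + 0{,}6 \times (2 \times 500) = 1\,620\ \mu\mathrm{s} = 1{,}620\ \mathrm{ms}$$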
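
The ring latency arithmetic of this section can be checked with a short script. The sketch below is illustrative only: the function name ring_latency_us is hypothetical, and it assumes, as in the figures above, that wrapping the ring doubles both the optical path and the number of active PHYs.

```python
MPD_US_PER_KM = 5.1   # medium propagation delay in microseconds per km (FDDI standard)
PL_US = 0.6           # latency per PHY in microseconds (FDDI standard)


def ring_latency_us(circumference_km, stations, wrapped=False):
    """Return the ring latency time in microseconds for the thru' or wrapped state."""
    # Assumption: wrapping doubles the path length and activates both PHYs per station.
    factor = 2 if wrapped else 1
    propagation = MPD_US_PER_KM * factor * circumference_km
    phy = PL_US * factor * stations
    return propagation + phy


for label, rc_km, das in (("RTCS", 2.5, 50), ("WC", 100, 500)):
    thru = ring_latency_us(rc_km, das)
    wrap = ring_latency_us(rc_km, das, wrapped=True)
    print(f"{label}: thru' {thru:.2f} us, wrapped {wrap:.2f} us")
```

The unrounded RTCS values are 42,75 µs and 85,5 µs, which the text rounds to 43 µs and 86 µs.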

2.4 RTCS Cycle Times

Cycle times for the RTCS are derived from the mission- and time-critical dataflows. The Target Token Rotation Time (TTRT) is a fundamental characteristic of a specific FDDI implementation and an analytical effort is required to determine a suitable value for the RTCS TTRT in respect of the system-level performance characteristics.

The Token Rotation Time (TRT) of an FDDI ring is the time taken for the timed-token to circumnavigate the ring and thus represents the maximum time taken for a station to gain access to the LAN's message transfer services.

It can be proven mathematically that the timed-token protocol of FDDI has two important properties :

$$\mathrm{TRT}_{\mathrm{average}} \le \mathrm{TTRT} \quad \text{(Property 1)}$$

$$\mathrm{TRT}_{\mathrm{maximum}} \le 2 \times \mathrm{TTRT} \quad \text{(Property 2)}$$

The proof of these properties is somewhat complex and reference should be made to the paper entitled Cycle Time Properties of the FDDI Token Ring Protocol by Sevcik and Johnson[116].

2.5 TTRT for the RTCS

The choice of TTRT for a system is not only dependent on the ring latency time, but also on the relative bandwidth allocated to synchronous and asynchronous traffic. Sevcik and Johnson also derive two formulae to determine Minimum TTRTs to Permit Various Fractions of Total Capacity to be Allocated to Synchronous Traffic.

$$\mathrm{TTRT} \ge \frac{N \times Z + P}{1 - S} \quad \text{(Formula 1)}$$

$$N \times Z + P \approx \mathrm{MPD} \times (\mathrm{RC} + N) \quad \text{(Formula 2)}$$

where :
N = Number of PHYs
S = Proportion of Synchronous Traffic
Z = Upper bound of the sum of all token delays within node
P = Total Propagation Delay around the ring
RC = Ring Circumference (km)
MPD = Medium Propagation Delay (= 5,1 µs/km)

Table I provides minimum TTRTs for various fractions of synchronous FDDI traffic for numbers of nodes varying from 10 to 1 000 and ring circumference from 100 m to 1 000 km.
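
Individual entries of Table I can be reproduced from the two formulae. The following sketch is illustrative; the function names are hypothetical, and the coefficient of 0,005 ms is the rounded value quoted with the table's formulae rather than the exact MPD of 5,1 µs/km.

```python
def overhead_ms(stations, circumference_km, coeff_ms=0.005):
    """Formula 2: N x Z + P, approximated as coeff x (RC + N), in milliseconds."""
    return coeff_ms * (circumference_km + stations)


def min_ttrt_ms(stations, circumference_km, sync_fraction):
    """Formula 1: smallest TTRT that permits the synchronous share S."""
    if not 0 <= sync_fraction < 1:
        raise ValueError("S must satisfy 0 <= S < 1")
    return overhead_ms(stations, circumference_km) / (1 - sync_fraction)


def max_access_delay_ms(ttrt_ms):
    """Property 2: the token rotation time never exceeds twice the TTRT."""
    return 2 * ttrt_ms


# Typical RTCS ring: 50 stations, 2,5 km circumference, 50% synchronous traffic.
ttrt = min_ttrt_ms(50, 2.5, 0.5)
print(f"minimum TTRT {ttrt:.3f} ms, worst-case rotation {max_access_delay_ms(ttrt):.3f} ms")
```

For the typical RTCS ring this gives a minimum TTRT of 0,525 ms, in agreement with the 0,53 ms entry of Table I.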

Minimum TTRT (ms) for each Synchronous Bandwidth Allocation (S)

| No. of Stations (N) | Ring Circumference (RC) (km) | N x Z + P (ms) | 10% | 20% | 30% | 40% | 50% | 60% | 70% | 80% | 90% | 95% | 99,5% | 99,9% |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 10 | 0,1 | 0,05 | 0,06 | 0,06 | 0,07 | 0,08 | 0,10 | 0,13 | 0,17 | 0,25 | 0,51 | 1,01 | 10 | 50 |
| 10 | 1,0 | 0,06 | 0,06 | 0,07 | 0,08 | 0,09 | 0,11 | 0,14 | 0,18 | 0,28 | 0,55 | 1,10 | 11 | 55 |
| 10 | 2,5 | 0,06 | 0,07 | 0,08 | 0,09 | 0,10 | 0,13 | 0,16 | 0,21 | 0,31 | 0,63 | 1,25 | 12 | 62 |
| 10 | 5 | 0,08 | 0,08 | 0,09 | 0,11 | 0,13 | 0,15 | 0,19 | 0,25 | 0,38 | 0,75 | 1,50 | 15 | 75 |
| 10 | 10 | 0,10 | 0,11 | 0,13 | 0,14 | 0,17 | 0,20 | 0,25 | 0,33 | 0,50 | 1,00 | 2,00 | 20 | 100 |
| 10 | 100 | 0,55 | 0,61 | 0,69 | 0,79 | 0,92 | 1,10 | 1,38 | 1,83 | 2,75 | 5,5 | 11 | 110 | 550 |
| 10 | 200 | 1,05 | 1,17 | 1,31 | 1,50 | 1,75 | 2,10 | 2,63 | 3,50 | 5,3 | 11 | 21 | 210 | 1 050 |
| 19 | 1,0 | 0,10 | 0,11 | 0,13 | 0,14 | 0,17 | 0,20 | 0,25 | 0,33 | 0,50 | 1,0 | 2,0 | 20 | 100 |
| 19 | 2,5 | 0,11 | 0,12 | 0,13 | 0,15 | 0,18 | 0,22 | 0,27 | 0,36 | 0,54 | 1,1 | 2,1 | 21 | 107 |
| 19 | 5 | 0,12 | 0,13 | 0,15 | 0,17 | 0,20 | 0,24 | 0,30 | 0,40 | 0,60 | 1,2 | 2,4 | 24 | 120 |
| 19 | 10 | 0,15 | 0,16 | 0,18 | 0,21 | 0,24 | 0,29 | 0,36 | 0,48 | 0,73 | 1,5 | 2,9 | 29 | 145 |
| 19 | 100 | 0,60 | 0,66 | 0,74 | 0,85 | 0,99 | 1,19 | 1,49 | 1,98 | 2,98 | 6,0 | 12 | 119 | 595 |
| 19 | 200 | 1,10 | 1,22 | 1,37 | 1,56 | 1,83 | 2,19 | 2,74 | 3,65 | 5,5 | 11 | 22 | 219 | 1 095 |
| 40 | 1,0 | 0,21 | 0,23 | 0,26 | 0,29 | 0,34 | 0,41 | 0,51 | 0,68 | 1,03 | 2,05 | 4,10 | 41 | 205 |
| 40 | 2,5 | 0,21 | 0,24 | 0,27 | 0,30 | 0,35 | 0,43 | 0,53 | 0,71 | 1,06 | 2,13 | 4,25 | 42 | 212 |
| 40 | 5 | 0,23 | 0,25 | 0,28 | 0,32 | 0,38 | 0,45 | 0,56 | 0,75 | 1,13 | 2,25 | 4,50 | 45 | 225 |
| 40 | 10 | 0,25 | 0,28 | 0,31 | 0,36 | 0,42 | 0,50 | 0,63 | 0,83 | 1,25 | 2,50 | 5,0 | 50 | 250 |
| 40 | 100 | 0,70 | 0,78 | 0,88 | 1,00 | 1,17 | 1,40 | 1,75 | 2,33 | 3,5 | 7,0 | 14 | 140 | 700 |
| 40 | 200 | 1,20 | 1,33 | 1,50 | 1,71 | 2,00 | 2,40 | 3,00 | 4,00 | 6,0 | 12 | 24 | 240 | 1 200 |
| 50 | 1,0 | 0,26 | 0,28 | 0,32 | 0,36 | 0,43 | 0,51 | 0,64 | 0,85 | 1,28 | 2,6 | 5,1 | 51 | 255 |
| 50 | 2,5 | 0,26 | 0,29 | 0,33 | 0,38 | 0,44 | 0,53 | 0,66 | 0,88 | 1,31 | 2,6 | 5,2 | 52 | 263 |
| 50 | 5 | 0,28 | 0,31 | 0,34 | 0,39 | 0,46 | 0,55 | 0,69 | 0,92 | 1,38 | 2,8 | 5,5 | 55 | 275 |
| 50 | 10 | 0,30 | 0,33 | 0,38 | 0,43 | 0,50 | 0,60 | 0,75 | 1,00 | 1,50 | 3,0 | 6,0 | 60 | 300 |
| 50 | 100 | 0,75 | 0,83 | 0,94 | 1,07 | 1,25 | 1,50 | 1,88 | 2,50 | 3,75 | 7,5 | 15 | 150 | 750 |
| 50 | 200 | 1,25 | 1,39 | 1,56 | 1,79 | 2,08 | 2,50 | 3,13 | 4,17 | 6,3 | 13 | 25 | 250 | 1 250 |
| 100 | 1,0 | 0,51 | 0,56 | 0,63 | 0,72 | 0,84 | 1,01 | 1,26 | 1,68 | 2,53 | 5,1 | 10 | 101 | 505 |
| 100 | 2,5 | 0,51 | 0,57 | 0,64 | 0,73 | 0,85 | 1,03 | 1,28 | 1,71 | 2,56 | 5,1 | 10 | 102 | 513 |
| 100 | 5,0 | 0,53 | 0,58 | 0,66 | 0,75 | 0,88 | 1,05 | 1,31 | 1,75 | 2,63 | 5,3 | 10 | 105 | 525 |
| 100 | 10 | 0,55 | 0,61 | 0,69 | 0,79 | 0,92 | 1,10 | 1,38 | 1,83 | 2,8 | 5,5 | 11 | 110 | 550 |
| 100 | 100 | 1,00 | 1,11 | 1,25 | 1,43 | 1,67 | 2,00 | 2,50 | 3,33 | 5,0 | 10,0 | 20 | 200 | 1 000 |
| 100 | 200 | 1,50 | 1,67 | 1,88 | 2,14 | 2,50 | 3,00 | 3,75 | 5,00 | 7,5 | 15,0 | 30 | 300 | 1 500 |
| 490 | 1,0 | 2,46 | 2,73 | 3,07 | 3,51 | 4,09 | 4,91 | 6,1 | 8,2 | 12 | 25 | 49 | 491 | 2 455 |
| 490 | 2,5 | 2,46 | 2,74 | 3,08 | 3,52 | 4,10 | 4,93 | 6,2 | 8,2 | 12 | 25 | 49 | 492 | 2 462 |
| 490 | 5 | 2,48 | 2,75 | 3,09 | 3,54 | 4,13 | 4,95 | 6,2 | 8,3 | 12 | 25 | 49 | 495 | 2 475 |
| 490 | 10 | 2,50 | 2,8 | 3,1 | 3,6 | 4,2 | 5,0 | 6,3 | 8,3 | 13 | 25 | 50 | 500 | 2 500 |
| 490 | 100 | 2,95 | 3,3 | 3,7 | 4,2 | 4,9 | 5,9 | 7,4 | 9,8 | 15 | 30 | 59 | 590 | 2 950 |
| 490 | 200 | 3,45 | 3,8 | 4,3 | 4,9 | 5,8 | 6,9 | 8,6 | 11,5 | 17 | 35 | 69 | 690 | 3 450 |
| 900 | 1,0 | 4,51 | 5,0 | 5,6 | 6,4 | 7,5 | 9,0 | 11 | 15 | 23 | 45 | 90 | 901 | 4 505 |
| 900 | 2,5 | 4,51 | 5,0 | 5,6 | 6,4 | 7,5 | 9,0 | 11 | 15 | 23 | 45 | 90 | 902 | 4 513 |
| 900 | 10 | 4,55 | 5,1 | 5,7 | 6,5 | 7,6 | 9,1 | 11 | 15 | 23 | 46 | 91 | 910 | 4 550 |
| 900 | 100 | 5,00 | 5,6 | 6,3 | 7,1 | 8,3 | 10 | 13 | 17 | 25 | 50 | 100 | 1 000 | 5 000 |
| 900 | 200 | 5,50 | 6,1 | 6,9 | 7,9 | 9,2 | 11 | 14 | 18 | 28 | 55 | 110 | 1 100 | 5 500 |
| 1 000 | 1,0 | 5,01 | 5,6 | 6,3 | 7,2 | 8,3 | 10 | 13 | 17 | 25 | 50 | 100 | 1 001 | 5 005 |
| 1 000 | 2,5 | 5,01 | 5,6 | 6,3 | 7,2 | 8,4 | 10 | 13 | 17 | 25 | 50 | 100 | 1 002 | 5 012 |
| 1 000 | 5 | 5,03 | 5,6 | 6,3 | 7,2 | 8,4 | 10 | 13 | 17 | 25 | 50 | 100 | 1 005 | 5 025 |
| 1 000 | 10 | 5,05 | 5,6 | 6,3 | 7,2 | 8,4 | 10 | 13 | 17 | 25 | 51 | 101 | 1 010 | 5 050 |
| 1 000 | 100 | 5,50 | 6,1 | 6,9 | 7,9 | 9,2 | 11 | 14 | 18 | 28 | 55 | 110 | 1 100 | 5 500 |
| 1 000 | 200 | 6,00 | 6,7 | 7,5 | 8,6 | 10 | 12 | 15 | 20 | 30 | 60 | 120 | 1 200 | 6 000 |

Table I : Minimum TTRTs for Various Fractions of Synchronous FDDI Traffic

Notes :
1. Shading indicates areas of interest for typical real-time systems.
2. Table derived from that of Sevcik and Johnson.

Formulae

$$\mathrm{TTRT} \ge \frac{N \times Z + P}{1 - S}\ \mathrm{ms} \quad (1)$$

$$N \times Z + P = 0{,}005 \times (\mathrm{RC} + N)\ \mathrm{ms} \quad (2)$$

3. Network Time Protocol

3.1 Nomenclature

A standard nomenclature is used in respect of the Network Time Protocol[105]. The stability of a clock is how well it can maintain a constant frequency, the accuracy is how well its frequency and time compare with national standards and the precision is the resolution of the clock, i.e. the smallest time increment that the clock can measure usefully. The offset of two clocks is the time difference between them, while the skew is the frequency difference (first derivative of offset with time) and the drift is the variation in skew with time (second derivative of offset with time).

3.2 Network Time Protocol Implementation

NTP was originally designed to function in the Internet environment and to run under the Unix operating system. It makes provision for using an external radio clock, but does not rely on such a clock for accurate timekeeping. The protocol does not rely on the accuracy of the clock of a single peer, but rather attempts to find the most accurate time source available to it.

The NTP relies on the exchange of timestamps with one or more peers and on calculating the time offsets between the peer clocks and the local clock. Several algorithms are used to eliminate false information, so that an accurate estimation of the local clock error can be made. This is then used to adjust the local clock.

The approach used by NTP to achieve reliable time synchronisation between machines on a network differs slightly from other such protocols. In particular, NTP does not synchronise clocks to each other. Each peer on the network attempts to synchronise to calendar time using the best available combination of timeserver and network path to that source. As such, a group of NTP synchronised clocks will be close to each other in time as a consequence of them all being close to calendar time.

Time is distributed through a hierarchy of NTP servers, with each server adopting a "stratum" which indicates how far away from an external source of calendar time it is operating. Stratum-1 servers have access to some external time source, usually a radio clock synchronized to time signal broadcasts from radio stations which explicitly provide a standard time service. A stratum-2 server is one which is currently obtaining time from a stratum-1 server, while a stratum-3 server gets its time from a stratum-2 server and so on.

Each client in the synchronization subnet (which may also be a server for other, higher stratum clients) chooses exactly one of the available servers with which to synchronize, usually from among the lowest stratum servers to which it has access. NTP is most effective when several sources of lower stratum time are available, since an agreement algorithm can then be applied to select the best synchronisation source.

3.3 NTP and the SAFENET Standard

The SAFENET Time Service (STS)[29] is closely based on NTP. There are some small differences, one being that NTP timestamps are relative to 1st January, 1900, whereas the STS timestamps are relative to 1st January, 1970. NTP makes provision for insertion of leap seconds and tracking of leap years, but to prevent discontinuities, STS does not incorporate this capability.

NTP was designed to work on the Internet and many of the parameters, such as the minimum and maximum polling intervals, were chosen for best results in that environment. The optimal values for these parameters are provided by Mills[105]. If a different network is used, optimal values for these parameters must be determined experimentally. For this reason, the STS must be able to modify these parameters during execution in order to facilitate optimisation.

3.4 NTP Hardware Requirements

NTP time information consists of timestamps. A timestamp is a 64-bit (8 byte) number, of which the first 32 bits represent seconds and the last 32 bits represent binary fractions of a second. In this way, the smallest time increment that can be represented is approximately 232 picoseconds ($2^{-32}$ seconds) and the longest time that can be represented is about 136 years ($2^{32}$ seconds).

Implementing a clock on a computer normally consists of generating a periodic interrupt. When the interrupt is received, a counter variable is incremented. In this way, the counter variable represents the time, with the precision of the clock being limited by the interrupt frequency. For example, if the interrupt period is 20 milliseconds and the counter value is 3 000, this would represent one minute. To increase the precision, the interrupt frequency also has to increase, leading to higher processor overheads.

In the NTP algorithm, a timestamp is obtained by a twofold process. When an interrupt occurs, the interrupt period, in binary fractions of a second, is added to the local time. This is exactly the same as the previous example, except that the conversion is done when updating the clock and not when reading it. In the above example, the same result could be obtained by adding 0,02 to a counter at every interrupt and then reading the counter after 3 000 interrupts. The counter value would then be 60, or one minute. This does not increase the precision of the clock; it is still only accurate to the nearest 20 ms.

When a timestamp is required, the current value of the timer is read. This would be a value between zero and the maximum counter value, representing the time since the last interrupt was generated. This value can be scaled and added to the local clock value, resulting in a timestamp with a precision equal to the least significant bit of the timer. For example, using a 16-bit timer in the above example would give a precision of 305 nanoseconds, which is much more than most processors can effectively utilise.

In practice, the timer is programmed to operate in the same units as the NTP timestamp, namely binary fractions of a second. This leads to lower overheads, since the timer value can be added directly to the timestamp without requiring any conversion.
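
The twofold process described in this section can be illustrated with a short sketch. It assumes the 20 ms interrupt period and 16-bit timer of the example above, and additionally assumes (hypothetically) that the timer counts upwards from zero over one interrupt period; the function name read_clock is not taken from any NTP implementation.

```python
INTERRUPT_PERIOD_S = 0.020        # 20 ms interrupt period, as in the example above
TIMER_BITS = 16                   # assumed width of the hardware timer
TIMER_RANGE = 1 << TIMER_BITS     # timer spans 0 .. 65 535 within one period


def read_clock(interrupt_count, timer_value):
    """Coarse time from the interrupt count, refined by the hardware timer value."""
    coarse = interrupt_count * INTERRUPT_PERIOD_S
    fine = (timer_value / TIMER_RANGE) * INTERRUPT_PERIOD_S
    return coarse + fine


# Resolution contributed by the least significant bit of the timer, in nanoseconds.
resolution_ns = INTERRUPT_PERIOD_S / TIMER_RANGE * 1e9

print(f"3 000 interrupts: {read_clock(3000, 0):.3f} s")
print(f"timer resolution: {resolution_ns:.0f} ns")
```

Without the timer contribution the clock advances in 20 ms steps only; with it, the resolution improves to roughly 305 ns at the cost of one timer read per timestamp.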

3.5 NTP using FDDI as a Medium

3.5.1 Asymmetry of the FDDI Ring

NTP relies on the exchange of timestamps to synchronise to a peer. One of the basic assumptions is that message times to and from the peer are statistically equal. If the message transmit and receive times are not equal, the NTP algorithm introduces an error of half the difference between the two times.

When using the token ring protocol, messages move in a constant direction around the ring. Normally the path around the ring would consist of a short path and a long path, resulting in unequal transmit and receive times. The delay for the stations in an FDDI network is 0,6 µs/station and the cable delay is 5,1 µs/km (refer to Section 2 of this appendix). Therefore, for a network of 500 stations, with 200 km of optical fibre and with two stations next to one another communicating, the roundtrip delay, which is also the worst case difference between transmit and receive times, would be about 1,62 ms.

If a more typical control network of 50 stations and 2,5 km of optical fibre is considered, the worst case difference in transmit and receive times becomes 43 µs. This would introduce an error of some 22 µs in the local time.

3.5.2 Fast Initialisation by Synchronisation Seed

Due to the nature of the Internet, NTP takes some time to settle to its final precision. It can typically take 24 hours to settle to 1 ms. This would not be sufficient for most real-time networks. However, in closed networks employing technologies such as FDDI, techniques are possible to circumvent this.

Using a high-speed network such as FDDI has a distinct advantage when a clock needs to be initialised, as would happen at system startup. Since the seed timestamp "ages" by a maximum of 1,62 ms while being transmitted over the network, the local node clocks can be set by a seed value. By allocating this seed time to the network protocol and the clock setting algorithm, this would guarantee that a local node clock will settle to within 1 to 2 milliseconds of the correct system time within one token rotation after startup.

3.6 Implementation of NTP

As stated previously, the Network Time Protocol was devised for the Internet environment.

In this analysis, it is assumed that the system which needs to be synchronised is a Multibus II system with an intelligent FDDI Network Interface Card (NIC). In this case, the synchronised time can be maintained in one of two places, namely :

• On the FDDI NIC

• On the local CPU host

3.6.1 Implementing NTP on the FDDI NIC

Since the FDDI NIC contains a dedicated processor as well as a timer, an accurate clock can easily be maintained. NTP timestamp requests to and from the network can be serviced immediately, resulting in a high inherent accuracy. To enable the host processor to obtain the current time, an extra function can be added to the networking software. This function will return the current time in NTP timestamp format.

Unfortunately, any requests for the current time have to pass through the Multibus II backplane bus, as well as through the two protocol stacks on either side. This introduces jitter, which directly affects the accuracy of the timestamp returned. Assuming that the host processor has the highest priority and the FDDI NIC the second highest priority on the bus, the jitter would be extremely low. Any constant delays (as opposed to jitter) could be compensated for when the time is read. For example, if it is known that it takes 100 µs (delay) ± 30 µs (jitter) to obtain a timestamp, adding 100 µs to the timestamp would make it accurate to 30 µs, instead of 130 µs.

3.6.2 Implementing the NTP on the Host CPU

If a spare timer is available on the host CPU, the latter could maintain the accurate system time. The FDDI NIC would provide a function to send and receive NTP timestamps with minimum latency. The NTP algorithm will compensate for any latency introduced by the Multibus, since the NTP algorithm will consider the backplane bus as part of the overall network. Jitter caused by the backplane bus would have half the effect of the previous example, due to the nature of the NTP algorithm and symmetry of the parallel backplane bus communication channel.

Since maintaining the local clock is hardware specific, each sub-system in the system as a whole would have to implement the NTP algorithm independently.

3.7 NTP Timestamp Format

The NTP uses eight bytes (64 bits) to represent a specific instant in time. Four bytes are used to represent the number of seconds, relative to January 1, 1900. The other four bytes represent binary fractions of a second, i.e. ½ second, ¼ second, etc., with the least significant bit representing a value of approximately 232 picoseconds ($2^{-32}$ seconds).
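
The timestamp layout can be illustrated as a 32.32 fixed-point number. The sketch below is a minimal illustration for non-negative times only; the function names to_ntp and from_ntp are hypothetical and are not taken from any NTP implementation.

```python
NTP_UNITS_PER_SECOND = 1 << 32   # the 32-bit fraction field divides one second into 2**32 units


def to_ntp(seconds):
    """Encode non-negative seconds since the epoch as a 64-bit timestamp."""
    whole = int(seconds)
    fraction = int((seconds - whole) * NTP_UNITS_PER_SECOND)
    return ((whole & 0xFFFFFFFF) << 32) | fraction


def from_ntp(timestamp):
    """Decode a 64-bit timestamp into floating-point seconds."""
    return (timestamp >> 32) + (timestamp & 0xFFFFFFFF) / NTP_UNITS_PER_SECOND


lsb_ps = 1e12 / NTP_UNITS_PER_SECOND                      # value of the least significant bit
span_years = NTP_UNITS_PER_SECOND / (365.25 * 86400)      # range of the 32-bit seconds field

print(f"{to_ntp(1.5):#018x}")
print(f"LSB = {lsb_ps:.1f} ps, span = {span_years:.1f} years")
```

One and a half seconds is therefore encoded with 1 in the upper four bytes and only the most significant fraction bit (½ second) set in the lower four bytes.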

3.8 Exchanging Timestamps

The NTP algorithm is based on the exchange of timestamps. A network node (Node A) desiring to synchronise to a peer (Node B) sends a synchronisation request to that peer. This request contains the current time at Node A (T1). When the synchronisation request is received at Node B, the local time (of the clock at Node B) is added to the datagram (T2). When Node B has some spare processing time, a third timestamp (T3) is added to the datagram, which is then sent back to Node A, where a fourth timestamp (T4) is added. This process is illustrated by Figure 1.

All timestamps are obtained as close to the actual time of transmission or reception as possible. In other words, all protocol stack-related operations are performed before adding the timestamp and transmitting the datagram and on reception the timestamp is added to the datagram before protocol processing. Any jitter in the protocol stack would affect the accuracy of the algorithm if the timestamps are added on the “wrong” side of the stack.
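
The four timestamps are combined using the standard NTP offset and round-trip delay expressions. These expressions are not stated in this appendix and are quoted here from the general NTP literature. The sketch below applies them to an assumed exchange on the typical 43 µs ring, with two adjacent stations, a forward path of 1 µs, a return path of 42 µs and Node B running 100 µs ahead of Node A; all of these values are illustrative.

```python
def offset_and_delay(t1, t2, t3, t4):
    """Standard NTP estimates of peer clock offset and round-trip delay."""
    offset = ((t2 - t1) + (t3 - t4)) / 2
    delay = (t4 - t1) - (t3 - t2)
    return offset, delay


# Assumed scenario, all times in microseconds.
TRUE_OFFSET = 100      # Node B clock minus Node A clock
FORWARD, RETURN = 1, 42
PROCESSING = 50        # time Node B holds the request before replying

t1 = 0                                    # request leaves A (A's clock)
t2 = t1 + FORWARD + TRUE_OFFSET           # request arrives at B (B's clock)
t3 = t2 + PROCESSING                      # reply leaves B (B's clock)
t4 = t1 + FORWARD + PROCESSING + RETURN   # reply arrives at A (A's clock)

offset, delay = offset_and_delay(t1, t2, t3, t4)
print(f"estimated offset {offset} us, delay {delay} us, error {offset - TRUE_OFFSET} us")
```

The estimated delay equals the full ring latency, while the offset estimate is in error by half the difference between the forward and return path times, as stated in Section 3.5.1.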

Figure 1 : Exchange of Timestamps

3.9 Maintaining an Accurate Local Clock

The NTP periodically synchronises the local clock to that of an accurate peer. A good local clock is required to maintain accurate time between these periodic synchronisations.

In BSD Unix, the clock is implemented as a software counter, with 1 µs resolution. This counter is periodically updated by a hardware interrupt occurring approximately every 1 to 10 ms. Each interrupt causes an increment tick to be added to the kernel time variable. When the interrupt does not evenly divide a second into microseconds, an additional increment fixtick is added once per second to make up the difference. For example, the Ultrix kernel (Digital Equipment DECstation) uses a hardware interrupt of 256 Hz (3 906,25 µs). In this case the timescale consists of 255 advances of 3 906 µs and one advance of 3 970 µs to make up the accumulated 64 µs error. The jitter introduced in this way is unimportant, since the clock is only accurate to the interrupt period (3,9 ms in this case).
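
The tick and fixtick values follow directly from the interrupt frequency. The sketch below is illustrative; the function name tick_and_fixtick is hypothetical, although tick and fixtick themselves are the quantities named above.

```python
MICROSECONDS_PER_SECOND = 1_000_000


def tick_and_fixtick(interrupt_hz):
    """Split one second into whole-microsecond ticks plus a once-per-second correction."""
    tick = MICROSECONDS_PER_SECOND // interrupt_hz             # added at every interrupt
    fixtick = MICROSECONDS_PER_SECOND - tick * interrupt_hz    # added once per second
    return tick, fixtick


tick, fixtick = tick_and_fixtick(256)
# 255 ordinary advances plus one advance lengthened by fixtick span exactly one second.
assert 255 * tick + (tick + fixtick) == MICROSECONDS_PER_SECOND
print(tick, fixtick, tick + fixtick)
```

For the 256 Hz Ultrix example this yields a tick of 3 906 µs and a fixtick of 64 µs, so the lengthened advance is 3 970 µs.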

There are two ways of increasing the accuracy of a clock maintained in this way. The first is to increase the frequency of the interrupt, which in turn increases the load on the processor. The other way is to add the value of the hardware counter, used to generate the interrupt, to the current software counter and to supply the sum in response to a time request.

Hardware interrupts are obtained by dividing a fast clock (typically 1 to 10 MHz) using a hardware timer. In the case of the timer available on the FDDI NIC (an Intel 8254 IC), the desired division ratio is loaded into the chip. This value is decremented at the clock frequency and an interrupt generated when the count reaches 1. The current value of the counter can be obtained at any time by reading from the chip.

Network Time Services replaces the standard Unix clock with a full 64-bit (NTP timestamp format) clock. On receipt of an ntp_gettime() request, the hardware timer is read and the value obtained converted to NTP timestamp format and added to the software counter value to obtain the correct timestamp value. The choice of the hardware timer input clock frequency determines the algorithm to be followed when converting the timer value to timestamp format. Two cases arise :

3.9.1 Input Frequency as a Power of Two

In the special case where the input frequency is a power of two (e.g. 1,048576 MHz or $2^{20}$ Hz), the counter is programmed to count down from the appropriate power of two to one (4 096 for a 256 Hz interrupt). The value that is added to the software counter is 65 536 - (value read from hardware counter), appropriately shifted to the right (20 places for this example).

3.9.2 Input Frequency not a Power of Two

If the input frequency is not a power of two (e.g. 1,25 MHz or period of 800 ns), the value read from the timer needs to be scaled and converted to NTP timestamp format. Assume that the interrupts are occurring at 10 ms intervals; the timer is programmed with a value of 12 500. The correct value to add to the software counter is (12 500 - (value read from hardware counter)) * 3 435, added directly to the least significant 4 bytes of the timestamp. The “magic number” 3 435 is the number of 232 picosecond ($2^{-32}$ second) intervals in 800 ns. The actual value is closer to 3 436, but this would cause the clock to run backwards in certain cases, since 12 500 x 3 436 exceeds the number of such intervals in 10 ms.

Since this method is more maths-intensive than the first, it is recommended that the crystal oscillator module on the NIC be fitted with an 8,388608 MHz unit. The input frequency to the counter is then 1,048576 MHz.
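
The two conversion cases of Sections 3.9.1 and 3.9.2 can be compared in a short sketch. It is illustrative only and rests on two assumptions that go beyond the text: the elapsed count is taken as the programmed value minus the value read (by analogy with Section 3.9.2), and the shift in the power-of-two case is expressed as a left shift of 12 bits within the 32-bit fraction field, which places each count at the $2^{-20}$ second position. All function names are hypothetical.

```python
FRACTION_BITS = 32   # width of the NTP fraction field


def fraction_power_of_two(programmed, read_value, input_hz_log2=20):
    """Elapsed fraction units when the timer input frequency is 2**input_hz_log2 Hz."""
    elapsed = programmed - read_value
    # Each timer count is 2**-input_hz_log2 s, i.e. 2**(32 - input_hz_log2) fraction units.
    return elapsed << (FRACTION_BITS - input_hz_log2)


def fraction_scaled(programmed, read_value, units_per_count=3435):
    """Elapsed fraction units when each timer count needs an explicit scale factor."""
    return (programmed - read_value) * units_per_count


# Fraction units added to the software clock at each 10 ms interrupt.
UNITS_PER_10_MS = (1 << FRACTION_BITS) // 100

# A full 256 Hz period at 2**20 Hz input covers exactly 1/256 of a second.
assert fraction_power_of_two(4096, 0) == (1 << FRACTION_BITS) // 256

# With 3 435 the interpolated value never overtakes the next software increment ...
assert fraction_scaled(12500, 0, 3435) <= UNITS_PER_10_MS
# ... whereas with 3 436 it can, so the clock would step backwards at the next interrupt.
assert fraction_scaled(12500, 0, 3436) > UNITS_PER_10_MS
```

The power-of-two case needs only a subtraction and a shift, whereas the general case needs a multiplication for every timestamp, which is the overhead that motivates the oscillator recommendation above.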

3.10 Adjusting the NTP Clock

Since one of the requirements of a good clock is that it should be monotonic, the clock cannot simply be set to the correct time when required. Instead, adjustment should be done gradually by forcing the clock to run slower or faster than normal. This ensures that the clock reaches the correct time without skipping or running backwards.

Adjusting the clock is done by changing the value of tick. This is the standard BSD Unix method, causing the clock to run slow or fast until the correct time is achieved, after which the default value of tick is restored.
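The gradual adjustment described above can be illustrated with a small simulation. The sketch below is not the BSD kernel code; slew_step, tick_us and max_adjust_us are hypothetical names, and the values of 10 000 µs per tick and 50 µs of correction per tick are assumptions chosen for illustration only.

```python
# Sketch only: hypothetical names and values, illustrating clock slewing.

def slew_step(clock_us, offset_us, tick_us=10_000, max_adjust_us=50):
    """Advance the clock by one interrupt while absorbing part of the offset.

    offset_us is the correction still to be applied (negative means the
    clock is ahead and must run slow). Returns (new_clock, remaining_offset).
    """
    if max_adjust_us >= tick_us:
        raise ValueError("adjustment must be smaller than tick to stay monotonic")
    # Clamp the per-tick correction so that the clock always moves forward.
    adjust = max(-max_adjust_us, min(max_adjust_us, offset_us))
    return clock_us + tick_us + adjust, offset_us - adjust


# Usage: a clock that is 120 us fast is slowed down over three interrupts.
clock, offset = 0, -120
while offset != 0:
    clock, offset = slew_step(clock, offset)
```

Because the correction per interrupt is always smaller than the tick itself, each reading is later than the previous one, which preserves the monotonic property required of the clock.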


3.11 Accessing the NTP Clock

The host sub-system should have direct access to the NTP clock. This is accomplished by providing an NTP clock access point in the Multibus II device driver code running on the NIC. In this way, the most accurate time can be supplied in reply to an nts_get_time request.

Figure 2 : NTS Architecture

3.12 NTS Software Architecture

The NTS software consists of three distinct parts, as shown in Figure 2. The main NTP code maintains the 64-bit software clock in NTP timestamp format. A periodic hardware interrupt is used to increment the counter. Periodic synchronization requests are sent to other nodes on the network. These requests are timestamped by the FDDI device driver as close to actual time of transmission or reception as possible. The FDDI device driver obtains the correct time for timestamping from the main NTP code via the ntp_gettime() call. The NTP code uses the information obtained by the synchronization requests to update the software clock. The third part of the NTS software is the timeserver code. This code supplies the current time, obtained via an ntp_gettime() call, to the Multibus II interface in reply to an nts_get_time request.

3.13 Use of Existing Code

The designer of the Network Time Protocol, David Mills, has released a robust and well documented implementation of NTP into the public domain[105]. This code (xntpd) is used as the basis of the main NTP software architecture depicted in Figure 2.


4. NTP over FDDI Error Analysis

An analysis of errors arising in the generation and processing of NTP timestamps is done by Mills. It is shown that the correctness interval I for any one clock is :

$$I = [\theta - \delta/2 - \epsilon,\ \theta + \delta/2 + \epsilon]$$

where θ, δ and ε are the clock offset, round-trip delay and dispersion for that particular clock.

I needs to be investigated for all possible conditions. In particular, the FDDI network can be in one of three states, namely intact, interrupted (and thus wrapped) in the path from this clock to the relevant peer (forward path), or interrupted in the return path from the peer to this clock. For these scenarios, the clock offset θ and round-trip delay δ can be calculated.

To obtain the confidence interval for the ensemble of clocks on the network, θ needs to be computed individually for each clock and for each of the three network states. For each of the network states, the minimum and maximum clock offset has to be used to compute the confidence interval for the ensemble of clocks.

In this analysis, the following assumptions are made : the network contains N stations, with 1/m km of cable between nodes. Trcv and Tsnd are the times taken from receiving the NTP message to timestamping it and from timestamping an outgoing message to transmitting it. Signals travel through the fibre optic cable with a media propagation delay of 5,1 µs per kilometre (65% velocity factor) and the delay per node (i.e. PHY Latency) for retransmitting information is 0,6 µs.

Figure 3 : FDDI with Ring Intact

4.1 Network Intact

With the network intact, as shown in Figure 3, the following applies for network node n :

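$$T_{n0} = T_{\mathrm{snd},n} + 5{,}1\ \mu\mathrm{s} \cdot (1/m) \cdot (N-n) + 0{,}6\ \mu\mathrm{s} \cdot (N-n-1) + T_{\mathrm{rcv},0}$$

$$T_{0n} = T_{\mathrm{snd},0} + 5{,}1\ \mu\mathrm{s} \cdot (1/m) \cdot n + 0{,}6\ \mu\mathrm{s} \cdot (n-1) + T_{\mathrm{rcv},n}$$

$$\begin{aligned}
\theta &= \frac{(T_2-T_1) + (T_3-T_4)}{2} = \frac{(T_2-T_1) - (T_4-T_3)}{2} = \frac{T_{n0} - T_{0n}}{2} \\
&= \frac{T_{\mathrm{snd},n} - T_{\mathrm{snd},0} + T_{\mathrm{rcv},0} - T_{\mathrm{rcv},n} + (5{,}1\ \mu\mathrm{s} \cdot (1/m) + 0{,}6\ \mu\mathrm{s}) \cdot (N-2n)}{2}
\end{aligned}$$

$$\begin{aligned}
\delta &= (T_4-T_1) - (T_3-T_2) = (T_2-T_1) + (T_4-T_3) = T_{n0} + T_{0n} \\
&= T_{\mathrm{snd},n} + T_{\mathrm{snd},0} + T_{\mathrm{rcv},0} + T_{\mathrm{rcv},n} + 5{,}1\ \mu\mathrm{s} \cdot (1/m) \cdot N + 0{,}6\ \mu\mathrm{s} \cdot (N-2)
\end{aligned}$$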
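The offset and round-trip delay expressions above, together with the correctness interval of Section 4, can be evaluated directly from the four timestamps of one exchange. The following is a minimal sketch under stated assumptions: the function names are hypothetical, all times are in microseconds, and T1 to T4 are taken in the usual NTP order (request sent, request received, reply sent, reply received).

```python
# Sketch only: hypothetical helper names; times in microseconds.

def offset_and_delay(t1, t2, t3, t4):
    """Return (theta, delta) for one NTP timestamp exchange."""
    theta = ((t2 - t1) + (t3 - t4)) / 2      # clock offset
    delta = (t4 - t1) - (t3 - t2)            # round-trip delay
    return theta, delta


def correctness_interval(theta, delta, epsilon=0.0):
    """Return the interval [theta - delta/2 - epsilon, theta + delta/2 + epsilon]."""
    half_width = delta / 2 + epsilon
    return theta - half_width, theta + half_width


# Usage with made-up timestamps (not measured values).
theta, delta = offset_and_delay(0, 110, 120, 150)
interval = correctness_interval(theta, delta)
```

The second form of the delay, (T2-T1) + (T4-T3), gives the same value and shows why asymmetric forward and return paths appear as an offset error of half the path difference.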

Figure 4 : FDDI with Ring Interrupted (A)

4.2 Network Interrupted in Forward Path

When the network is interrupted in the path from node n to node 0 as shown in Figure 4 (n N), the following applies :

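$$T_{n0} = T_{\mathrm{snd},n} + 5{,}1\ \mu\mathrm{s} \cdot (1/m) \cdot (2N-n-2) + 0{,}6\ \mu\mathrm{s} \cdot (2N-n-3) + T_{\mathrm{rcv},0}$$

$$T_{0n} = T_{\mathrm{snd},0} + 5{,}1\ \mu\mathrm{s} \cdot (1/m) \cdot n + 0{,}6\ \mu\mathrm{s} \cdot (n-1) + T_{\mathrm{rcv},n}$$

$$\begin{aligned}
\theta &= \frac{T_{n0} - T_{0n}}{2} \\
&= \frac{T_{\mathrm{snd},n} - T_{\mathrm{snd},0} + T_{\mathrm{rcv},0} - T_{\mathrm{rcv},n} + (5{,}1\ \mu\mathrm{s} \cdot (1/m) + 0{,}6\ \mu\mathrm{s}) \cdot (2N-2n-2)}{2}
\end{aligned}$$

$$\begin{aligned}
\delta &= T_{n0} + T_{0n} \\
&= T_{\mathrm{snd},n} + T_{\mathrm{snd},0} + T_{\mathrm{rcv},0} + T_{\mathrm{rcv},n} + 5{,}1\ \mu\mathrm{s} \cdot (1/m) \cdot (2N-2) + 0{,}6\ \mu\mathrm{s} \cdot (2N-4)
\end{aligned}$$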

4.3 Network Interrupted in Return Path

When the network is interrupted in the path from node 0 to node n as shown in Figure 5 (n 0), the following applies :

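$$T_{n0} = T_{\mathrm{snd},n} + 5{,}1\ \mu\mathrm{s} \cdot (1/m) \cdot (N-n) + 0{,}6\ \mu\mathrm{s} \cdot (N-n-1) + T_{\mathrm{rcv},0}$$

$$T_{0n} = T_{\mathrm{snd},0} + 5{,}1\ \mu\mathrm{s} \cdot (1/m) \cdot (N+n-2) + 0{,}6\ \mu\mathrm{s} \cdot (N+n-3) + T_{\mathrm{rcv},n}$$

$$\begin{aligned}
\theta &= \frac{T_{n0} - T_{0n}}{2} \\
&= \frac{T_{\mathrm{snd},n} - T_{\mathrm{snd},0} + T_{\mathrm{rcv},0} - T_{\mathrm{rcv},n} + (5{,}1\ \mu\mathrm{s} \cdot (1/m) + 0{,}6\ \mu\mathrm{s}) \cdot (2-2n)}{2}
\end{aligned}$$

$$\begin{aligned}
\delta &= T_{n0} + T_{0n} \\
&= T_{\mathrm{snd},n} + T_{\mathrm{snd},0} + T_{\mathrm{rcv},0} + T_{\mathrm{rcv},n} + 5{,}1\ \mu\mathrm{s} \cdot (1/m) \cdot (2N-2) + 0{,}6\ \mu\mathrm{s} \cdot (2N-4)
\end{aligned}$$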

4.4 Correctness Interval

The correctness interval for a specific clock can be determined using the above equations, as well as the dispersion. The dispersion is given by Mills as :

$$\epsilon = \rho + \phi_A (T_4 - T_1) + \phi_B (T_3 - T_2)$$

where ρ is the reading error, i.e. the accuracy of the clock implementation, while φ_A and φ_B are the maximum frequency errors of the two clocks A and B.

The dispersion for a good clock implementation will be well below 1 µs, so that the influence of ε on the correctness interval can be ignored.

Two important terms in the above equations are Tsnd and Trcv. Trcv is the processor and operating system-dependent delay from receiving the NTP packet to actually timestamping it and consists of the interrupt response time and the time required to obtain a timestamp. Using LynxOS[96] on a 60 MHz Pentium computer, the interrupt response time can be from 6 µs to 35 µs, with an average of 8 µs.

Tsnd is the delay between timestamping an outgoing NTP packet and the actual transmission of that packet, which can only happen after the token has been received. This means that Tsnd has a lower bound of 0 and an upper bound of the actual, instantaneous Token Rotation Time (TRT), which can vary from 43 µs for a completely unloaded network to several milliseconds when all stations are transmitting. The Probability Density Function (PDF) for Tsnd is uniform between 0 and TRT, but the TRT is dependent on the PDF for the network loading. For this analysis it is assumed that the network loading has a Chi-Square PDF, so that the average TRT would be in the vicinity of 200 µs (with 10% network loading).

NTP relies on advanced analysis to extract the best timestamp exchanges, so that it would be fair to use the lower limits of Tsnd and Trcv as a true indication of the influence of these variables on the accuracy of the clock. Values of Tsnd = 15 µs and Trcv = 30 µs are used for the rest of the analysis.

A reasonable assumption for the physical size of a typical RTCS LAN would be that the LAN would consist of fewer than 50 stations with a circumference of less than 2 500 m, with an average of less than 50 m of fibre optic cable between stations. These values (N = 50, m = 20) can be substituted into the above equations to obtain worst case values for θ and δ. These can then be used to compute the worst case correctness interval for the ensemble of clocks.


Case                          θmin            θmax          δ        I                    Total

Network Intact                -21 µs (n=49)   21 µs (n=1)   132 µs   [-86 µs, 86 µs]      175 µs
Interrupted in Forward Path   0 (n=49)        41 µs (n=1)   173 µs   [-86 µs, 127 µs]     220 µs
Interrupted in Return Path    -41 µs (n=49)   0 (n=1)       173 µs   [-127 µs, 86 µs]     220 µs

Table II : Synchronisation Accuracies for FDDI Ring in Thru' and Wrapped States

Table II shows that all the clocks on the network can be expected to be within 175 µs of one another when the network is intact. This degrades to 220 µs when the ring wraps. Note that all of these values were obtained by a theoretical analysis of the proposed implementation.
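The θ, δ and I columns of Table II can be reproduced from the expressions of Sections 4.1 to 4.3. The sketch below is an illustration only: the function and parameter names are hypothetical, all values are in microseconds, Tsnd and Trcv are assumed equal at both ends (so that they cancel in θ), the dispersion ε is ignored as argued above, and the Total column is not recomputed.

```python
# Sketch only: hypothetical names; reproduces the theta, delta and I columns.

PROP_US_PER_KM = 5.1   # media propagation delay assumed in the text
PHY_US = 0.6           # PHY latency per node assumed in the text


def theta_delta(case, n, N=50, m=20, tsnd=15.0, trcv=30.0):
    """Return (theta, delta) for node n in the given ring state."""
    cable = PROP_US_PER_KM / m            # delay of one inter-node cable run
    if case == "intact":
        diff, cable_runs, phys = N - 2 * n, N, N - 2
    elif case == "forward":               # ring interrupted in forward path
        diff, cable_runs, phys = 2 * N - 2 * n - 2, 2 * N - 2, 2 * N - 4
    elif case == "return":                # ring interrupted in return path
        diff, cable_runs, phys = 2 - 2 * n, 2 * N - 2, 2 * N - 4
    else:
        raise ValueError("unknown ring state")
    theta = (cable + PHY_US) * diff / 2
    delta = 2 * tsnd + 2 * trcv + cable * cable_runs + PHY_US * phys
    return theta, delta


def ensemble_interval(case, N=50, **kw):
    """Return (lower, upper) of I over nodes n = 1 .. N-1."""
    thetas = [theta_delta(case, n, N, **kw)[0] for n in range(1, N)]
    delta = theta_delta(case, 1, N, **kw)[1]   # delta does not depend on n
    return min(thetas) - delta / 2, max(thetas) + delta / 2
```

Rounded to the nearest microsecond, the results agree with the tabulated offsets, delays and intervals for N = 50 and m = 20.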


5. NTP Performance Measurements

5.1 Test Scenarios

The performance of NTS was tested by running NTP between machines on an isolated network. The machines were part of a standard Ethernet LAN, with one machine set up as a primary master taking time from its local clock. Measurements over FDDI could not be properly performed because FDDI drivers were available only for SCO Unix, whose adjust time call [adjtime()] exhibited a bug. It appears that this call does not function correctly under either SCO Unix or LynxOS (both operating systems use generic Unix code). As a result, while NTP works accurately enough in the WAN-type environment of the Internet, where synchronisation precisions of tens to hundreds of milliseconds are expected, it cannot provide sub-millisecond precision in a small FDDI LAN on these operating systems.

The only Unix-type operating system (which is required to host NTP) that did not exhibit this bug was Linux, for which only Ethernet drivers were available.

The following table shows the machines and their respective operating systems together with the processors used.

Computers used in Testing

Machine Name          Operating System        Processor

ktc.ccii.co.za        Linux 1.2.13 & 1.2.3    Pentium 100
gateway.ccii.co.za    Linux 1.2.1             486DX2-66
laba.ccii.co.za       Linux 1.2.13 & 1.2.3    486DX2-50
iccs1.ccii.co.za      Linux 1.2.13            486DX2-66

Table III : NTP Measurement Environment

Initially NTP was installed on the gateway and ktc (with Linux 1.2.3). A test program was written to record the difference in time, and this difference was also measured with an oscilloscope in order to correlate the oscilloscope readings with those of the xntpdc program. Once it was established that the xntpdc program could be reliably used, the oscilloscope was dispensed with. After initial trials on these machines proved successful, the tests were expanded to include the other machines listed.

5.2 Test Results

Both the gateway and ktc compiled and loaded the NTP code without problems and after initial configuration and a setup time of approximately 24 hours, both were in synchronisation to within 250 µs. This initial setup period allows NTP to calculate the drift compensation for the local clock; once this is done, the type of platform on which the program is running should have no influence over the offsets obtained. Precision and offsets between machines of less than 100 µs were regularly obtained and became normal.


The other machines were then included. Minor problems were experienced, and these were rectified by changing some of the variables in the code. Polling intervals were decreased to increase the stability of the time offsets and similar results were obtained. After the initial setup time of approximately 24 hours, the drift files were accurate and upon reboot of the systems the machines came up synchronised to within 250 µs and normally reached sub-100 µs offsets within a minute. Jitter would then cause the time to move by approximately 40 µs, but this can still be considered synchronised.

Tests on SCO Unix and LynxOS both proved unsuccessful; on both operating systems the clock offsets would vary erratically and would only be synchronised to within a second. After further investigation it was established that the adjust time call [adjtime()] on these systems was not performing according to what the NTP code expected. Thus the adjust time procedure was not working correctly and the clock was being stepped to remain within a second.

The problem with adjust time was acknowledged by Lynx Real-Time Systems; they were, however, unwilling to treat its correction as a priority. SCO failed to respond to email messages, but Internet user group correspondents reported similar findings. A possible solution to this dilemma is to investigate the VxWorks real-time operating system from Wind River Systems in order to determine whether it implements adjust time correctly. If not, the Linux system call could be patched into any of SCO Unix, LynxOS or VxWorks; however, this would not be a simple undertaking as the call is deeply embedded within the operating system.

5.3 Test Conclusions

It is difficult to extrapolate the successful results on Linux to other operating systems. Although NTP is available for a variety of operating systems, there are no established ports for real-time operating systems. The problems experienced by the tests on the SCO Unix and LynxOS systems show that the compatibility between different versions of Unix cannot be taken for granted. It would be expected, however, that a real-time operating system with the correct functionality would show results similar to, if not better than, the Linux results.


6. Conclusions and Recommendations

6.1 LAN Timing

For a typical real-time FDDI control LAN consisting of 50 nodes and a 2 500 m fibre optic ring, 100 physical interfaces (PHYs) and 5 000 m of optic fibre become active in the case of a single failure (i.e. the wrapped state). Typically such a system would support some synchronous and some asynchronous traffic. If an allocation of between 40% and 70% of the bandwidth is made to synchronous traffic, it can be determined from Table I using Formulae 1 and 2, that the possible choice of minimum target token rotation times (TTRTs) lies in the range 0,46 ms to 0,92 ms. This implies worst case token rotation times (TRTs) of twice that, i.e. in the region of 1 ms to 2 ms.
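The ring figures quoted above can be checked with simple arithmetic. The following sketch is illustrative only: the function names are hypothetical, the propagation delay (5,1 µs per kilometre) and PHY latency (0,6 µs) are the values assumed in the error analysis of Section 4, and the factor of two between TTRT and worst case TRT is the one stated in the text. Table I and Formulae 1 and 2 are not reproduced here.

```python
# Sketch only: hypothetical names; delay figures taken from Section 4.

def ring_latency_us(phys, fibre_km, prop_us_per_km=5.1, phy_us=0.6):
    """Latency of one token rotation around an idle ring, in microseconds."""
    return fibre_km * prop_us_per_km + phys * phy_us


def worst_case_trt_ms(ttrt_ms):
    """Worst case token rotation time, taken as twice the TTRT."""
    return 2 * ttrt_ms


intact_us = ring_latency_us(phys=50, fibre_km=2.5)     # about 43 us
wrapped_us = ring_latency_us(phys=100, fibre_km=5.0)   # about 86 us
trt_range_ms = (worst_case_trt_ms(0.46), worst_case_trt_ms(0.92))
```

The intact-ring figure of roughly 43 µs is consistent with the unloaded token rotation time used in Section 4.4, and the wrapped ring doubles it. Both are small compared with the worst case TRTs of approximately 1 ms to 2 ms, which are governed by the TTRT and not by the ring latency.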

Worst case TRTs of 1 to 2 ms imply LAN latencies, without protocol or other overhead, in the same order. Such latencies are not sufficient for many classes of real-time systems. This implies that other methods are required to recover timing information between the distributed processes.

6.2 NTP Performance

When NTP is implemented on a wide area packet-switched network such as the Internet, the accuracy of the clock is affected mostly by errors resulting from the random network latencies. Even in this environment, where message latencies are in the order of seconds, a clock precision of a millisecond a day can be achieved[105] by clock synchronisation using the NTP. The SAFENET standard recommends that SAFENET Time Services (based on NTP) provide a synchronisation accuracy of 500 µs.

Different factors become important when implementing a clock synchronisation algorithm on a fast network such as FDDI. The network latency errors no longer dominate, so that any latency or jitter introduced by the clock reading and related algorithms, as well as the protocol stacks, become significant. Also, an error is introduced by the network transmit and receive paths not being the same length, as stated previously. Considering these factors and by performing an NTP over FDDI error analysis, it can be expected that an algorithm based on NTP should maintain time to a worst case accuracy of 220 µs for a 50-node LAN of 2,5 km circumference.

6.3 Synchronisation

A Network Time Protocol is one such method: it synchronises network nodes to accuracies an order of magnitude better than typical latencies, i.e. in the order of 220 µs. Many distributed algorithms are capable of converging with these levels of indirect synchronisation.

Without a Network Time Protocol it is possible to effect relative synchronisation between nodes on a small FDDI ring using direct methods, but not accurate calendar time synchronization. On large rings or with slower LANs, accurate synchronisation without NTP is not possible.


Systems requiring synchronisation accuracies of better than the 100 to 200 µs offered by NTP require special synchronisation methods, typically dedicated synchronisation networks or hardware support capable of extracting synchronisation from bit signalling (e.g. FDDI II).

6.4 Synchronous Bandwidth Allocation

TTRTs and TRTs can only be determined once a synchronous bandwidth allocation is undertaken. This requires an analysis of all data traffic expected within the real-time distributed system and categorisation of this data as either state or event-type data. State data is then allocated to synchronous bandwidth and event data to asynchronous bandwidth. Repetition cycles, deadlines, precedences and priorities have to be determined by functional analysis of the data and their related algorithms. The synchronous bandwidth allocation can then proceed in an iterative manner in order to accommodate and optimise all system data transfers, as well as allow an appropriate margin for growth. The SAFENET Network Development Guidance[103] offers an appropriate procedure and guidelines for such a process.

6.5 Topology Implications

Ring topologies are not ideally suited to tight synchronisation due to the latencies intrinsic in each node. Bus topologies on the other hand are better suited as nodes effectively receive the synchronisation pulses simultaneously. However, bus topologies without bridges are limited in range, especially for fibre optic media (typically less than a few hundred metres). While wire media are capable of extended range (typically up to several thousand metres), they suffer from electromagnetic interference.

Appendix G

LAN Profiles


Appendix G

1. Scope

1.1 Scope
1.2 Introduction
1.3 Appendix Layout

2. LAN Profiles

2.1 ISO OSI Basic Reference Model
2.2 OSI Profile
2.3 Internet Profile
2.4 GOSIP
2.5 MAP and TOP
2.6 Novell Corporation LAN Profile
2.8 Synchronous Digital Hierarchy
2.9 SAFENET
2.11 Scalable Coherent Interface
2.12 Fibre Channel
2.13 ATM
2.14 HPN

3. Conclusions and Recommendations

3.1 ISO OSI Basic Reference Model
3.2 OSI vs Internet
3.3 OSI Profile
3.4 Internet Profile
3.5 MAP/TOP
3.6 GOSIP
3.7 SAFENET vs Internet
3.8 ATM and OSI
3.9 Novell Corporation LAN Profile
3.10 SAFENET
3.11 Real-Time LAN Profile
3.12 Technologies of the Future


List of Tables

Table I : ISO OSI Basic Reference Model

Table II : Internet Profile

Table III : US Government OSI Profile (GOSIP)

Table IV : Novell Profile

Table V : SDH/SONET Signal Hierarchy

Table VI : SAFENET Standards Suites

Table VII : Real-Time LAN Profile

Table VIII : Fibre Channel Architecture Model

Table IX : ATM Architecture Model

Table X : HPN Framework

Table XI : HPN Domain


1. Scope

1.1 Scope

This appendix describes the general characteristics and structure of LAN Profiles, specifically in terms of their suitability or otherwise for the implementation of real-time, mission-critical, distributed systems.

1.2 Introduction

Considering that the purpose of a local area network is to facilitate communication between computers and that these computers may consist of hardware and software components from different vendors, local area networks are driven to standardisation.

With this objective, the International Standards Organisation (ISO) defined and in 1984 published a framework for communications standards which partitions the functions required for communication into seven layers. This was termed the Open Systems Interconnect (OSI) Basic Reference Model. The ISO has since published a set of standards conforming to this framework.

However, these standards are presented very broadly and are subject to various interpretations. The result of such interpretation is that either a further set of documented restrictions must be imposed upon the LAN, or there is a danger that different equipment, ostensibly developed according to the same standards, will not be able to communicate.

A number of organisations have addressed this issue and defined a number of groupings or sets of standards, along with additional implementation agreements required for interoperability. These are known as LAN Profiles.

The following LAN profiles have found extensive implementation and are considered significant for evaluation in respect of real-time, mission-critical distributed systems: OSI, GOSIP, MAP/TOP, Internet and SAFENET. Due to its predominance in the commercial LAN market, Novell Corporation's LAN profile is also addressed.

Emerging technologies have indicated that the ISO OSI model is restrictive and that new models are more appropriate. These technologies include ATM, Scalable Coherent Interface (SCI) and Fibre Channel. Models for these new standards are also presented.

1.3 Appendix Layout

The appendix provides an overview of a number of LAN Profiles considered applicable to real-time distributed systems.

Specific conclusions and recommendations are then made, with the most significant of these being adopted in the main section.


2. LAN Profiles

2.1 ISO OSI Basic Reference Model

The International Standards Organisation's Open System Interconnect Basic Reference Model is a seven-layer architecture for data communication protocol suites. While insufficient for the complete specification of a local area network, the OSI Basic Reference Model has proved valuable in its role as a conceptual and functional framework for co-ordinating the development of protocol standards[39].

The OSI Basic Reference Model encapsulates a set of communications functions within each layer. Each layer provides a set of services to the next higher layer which requests a set of services from the next lower layer. Layer interaction takes place on well defined boundaries by a small number of service primitives. These primitives abstract the details of the more basic tasks being performed in the service-providing layer.

The layered structure of the OSI Basic Reference Model is provided in Table I below.

No.   Layer          Functional Description

7     Application    Interfaces to user programs by translating user application syntax into abstract syntax, and provides specific and common services for applications.

6     Presentation   Negotiates appropriate transfer syntax (format) used within the network and translates abstract syntax into transfer syntax.

5     Session        Sets up and controls a logical communication path (session connection) including logical synchronisation.

4     Transport      Enhances reliability of the network by providing end-to-end dataflow control and optimises use of the network by providing special services such as timestamping and priority message scheduling.

3     Network        Provides internetwork message routing, global addressing, congestion control, intermediate error control, as well as packet fragmentation and reassembly.

2     Data Link      Controls access to the physical medium, formats/disassembles packets, implements link-level flow and error control as well as low-level network addressing.

1     Physical       Encodes and physically transfers packets bit-wise onto the physical medium.

Table I : ISO OSI Basic Reference Model


2.2 OSI Profile

The OSI Profile provides services from Layer 3 to Layer 7. At Layer 3, the OSI Profile offers the OSI Network Protocol. Two options are available, i.e. the Connection-Oriented Network Protocol (CONP) and the Connectionless Network Protocol (CLNP).

At Layer 4, the OSI Profile offers the OSI Transport Protocol. Five classes of increasing capability with respect to retransmission of lost data, flow control and reordering of packets are offered. The Class 4 Transport Protocol (TP4) is the most reliable option and is frame orientated.

At Layer 5, the OSI Profile provides a connection-oriented session layer protocol. This is specified in ISO 8327[15]. This protocol provides basic session services including connection management, logical synchronisation, dialogue control and exception reporting.

At Layer 6, the OSI Profile provides a presentation layer protocol. This is specified in ISO 8823[19]. This protocol provides basic presentation services including syntax negotiation and transformation, as well as service request mapping.

At Layer 7, the OSI Profile offers a variety of services, e.g. :

• ISO FTAM (File Transfer, Access and Management)
• ISO MHS (Message Handling Service)
• ISO MMS (Manufacturing Message Service)
• ISO JTM (Job Transfer and Manipulation)
• ISO VT (Virtual Terminal)
• ISO DS (Directory Services, e.g. X.500).

2.3 Internet Profile

During the mid-1970s, the USA Department of Defence (DOD) commissioned the design of TCP/IP (Transmission Control Protocol/Internet Protocol)[25, 26]. This was before the OSI Reference Model was conceived. TCP/IP provides point-to-point, guaranteed-delivery communication between networked nodes and was originally designed for packet-switching communications. Since then, a number of other protocols, both within the transfer layers as well as the higher layers have been developed and are now in widespread use. Together these protocols are known as the Internet Profile.

The Internet Profile consists of a suite of protocols offering services for communication between, and control of, otherwise incompatible computers and networks.

The core military-standard protocols of the Internet Profile are[83] :

• TCP - Transmission Control Protocol
• UDP - User Datagram Protocol
• IP - Internet Protocol
• FTP - File Transfer Protocol
• SMTP - Simple Mail Transfer Protocol
• TELNET - Protocol for Remote Login

The Host-to-Host Layer uses TCP to ensure the reliability of data transfer between two hosts.

TCP is a connection-orientated protocol providing reliable data transfer between two transport users (e.g. FTP and SMTP). Data is passed from the transport user to TCP which then encapsulates the data into segments containing the user data and control information. Outgoing segments are numbered sequentially and are acknowledged by number by the destination TCP module.

The TCP standard defines the main levels of service as being Multiplexing (Multiple Users), Connection Management, Data Transport and Error Reporting. TCP allows the transport user to specify the quality of transmission service and data transmission priority.

The Internet Profile is depicted in Table II.

Layer No.   Internet Profile Layers   Internet Profile

4           Process                   FTP, SMTP, Telnet

3           Host to Host              TCP, UDP

2           Internet                  IP

1           Network Access            Not Specified

Table II : Internet Profile

2.4 GOSIP

GOSIP (Government Open System Interconnect Profile) is the US government's preferred LAN profile. It is based on standard products conforming to the OSI 7-layer Basic Reference Model and provides for a complete network solution.


The GOSIP Profile is depicted in Table III.

Layer No.   ISO OSI Layers   GOSIP

7           Application      FTAM

6           Presentation     X.400 MHS

5           Session

4           Transport        ISO TP4

3           Network          CLNP

2           Data Link        LLC1; X.25 PLP
                             CSMA/CD; Token Bus; Token Ring; HDLC LAPB

1           Physical         ISO 8802/3 (IEEE 802.3); ISO 8802/4 (IEEE 802.4); ISO 8802/5 (IEEE 802.5); V.35; RS232C

Table III : US Government OSI Profile (GOSIP)

2.5 MAP and TOP

General Motors and Boeing Corporation respectively developed the Manufacturing Automation Protocol (MAP) and Technical and Office Protocol (TOP)[111]. MAP utilises multinational interoperability standards frozen as of 1987 for the use of the manufacturing community. The TOP specification, first published in 1985, is intended to address office functions in manufacturing companies and is identical to MAP in most respects, except for the type of cable and the network structures.

The MAP/TOP effort attempted to formalise an approach to networking philosophies, interoperability and specifications. MAP and TOP both conform to the ISO OSI model, differing only at the lowest and highest levels. At the lowest level, i.e. the physical layer, a token bus topology operating at 10 Mbit/s over broadband coaxial media, or 5 Mbit/s over carrierband coaxial media, is specified. A token passing access scheme was chosen to provide some degree of determinism to the network infrastructure in order to support real-time applications. Up to seven frequency multiplexed channels can be operated over the broadband media. At the highest levels, i.e. the user level, MAP defines certain functionality whereas the upper boundary of the OSI definition is the application layer.

The MAP/TOP effort was one of the first to attempt to specify a complete workable network profile rather than a conceptual model. As such, it received substantial support from network vendors and manufacturing users throughout the US and Europe. In 1987 MAP V3.0[23] was published by the standards authority, the MAP User's Group (MUG). A number of conforming products, including software and hardware, were developed and a number of plants were automated using MAP.

Despite some earlier successes, MAP and TOP have not survived the onslaught of Ethernet and IBM Token Ring at the physical layer or the Internet and Novell protocols at the higher layers. This is mainly due to the high cost of MAP implementation, especially the broadband media. Full MAP implementation was also not capable of real-time performance due to the cumbersome full-stack OSI protocol implementation.

In order to provide for lower cost and real-time capability, the MUG defined the Mini-MAP architecture which is completely non-MAP compatible. This effort has resulted in the fieldbus concept which provides for real-time interconnection of devices at the lowest level of the automated system. A number of proprietary fieldbuses have emerged from the industrial automation community. These include Intel's Bitbus, Bosch's CANbus and Firewire as well as the French INRIA bus. Public-domain buses such as MIL-STD-1553 and VNet have also been proposed for standardisation. To the present day, bitter commercial rivalry has precluded the widespread adoption of a standard fieldbus. However, international efforts are in progress to achieve this goal.

2.6 Novell Corporation LAN Profile

Novell Corporation is the developer of the Novell NetWare Network Operating System. NetWare fileservers are the hosts of a large proportion of all commercial computer nodes connected via local area networks. Until recently, this proportion was estimated to be over 50% worldwide and as high as 95% in South Africa. While NetWare has lost some market share to other operating systems such as Microsoft Windows NT and Unix, it constitutes a significant factor in global network connectivity. As such, NetWare protocols require consideration in terms of functionality and interconnectivity.

When Novell began operations in 1982, several proprietary protocols for transferring data between workstations were used. As time went on, the decision was made to base Novell's network communications on a fast and efficient networking standard. Xerox's XNS protocol was determined to be one of the best available at the time, so Novell's Internetwork Packet Exchange (IPX) protocol was developed to conform to the XNS standard. NetWare IPX is functionally equivalent to Xerox's Internet Datagram Protocol (IDP).

Three primary peer-to-peer protocols are supported in the NetWare LAN environment: NetWare IPX, SPX and NetBIOS. Additional protocols supported include the Transport Layer Interface (TLI), Named Pipes, LU6.2 and others. IPX is Novell's Network Layer Protocol.

Novell provides protocols in a layered approach that can be approximately mapped to the OSI Basic Reference Model.

At the lower layers, Novell provides the Open Datalink Interface (ODI). ODI provides a standard interface and drivers to most LAN technologies such as Arcnet, Ethernet, IBM Token Ring and WAN communications. Co-operating vendors supply drivers for more exotic technologies such as FDDI and Fast Ethernet.

At Layer 3 Novell provides the IPX (Internetwork Packet Exchange) protocol.

At Layer 4 Novell provides the SPX (Sequenced Packet Exchange) and NetBIOS (Network Basic Input/Output System) protocols.

The higher layer functionality, such as file management and simple messaging, is implemented within the NetWare Operating System.

The Novell Profile is depicted in Table IV.

Layer No.   ISO OSI Layer   Novell Profile
7           Application     NetWare Operating System Services
6           Presentation    (covered by NetWare Operating System Services)
5           Session         (covered by NetWare Operating System Services)
4           Transport       SPX, NetBIOS
3           Network         IPX
2           Data Link       Open Datalink Interface (ODI) over CSMA/CD, Token Bus or Token Ring
1           Physical        ISO 8802/3 (IEEE 802.3), ISO 8802/4 (IEEE 802.4) or ISO 8802/5 (IEEE 802.5)

Table IV : Novell Profile

2.7 Plesiochronous Digital Hierarchy

The Plesiochronous Digital Hierarchy (PDH) is essentially the existing digital telephony transmission system. CCITT (now ITU) defined PDH in the 1960s and 1970s. There are five levels in PDH, although the bit rates associated with those levels vary with geographic region. In the USA, the levels of the hierarchy are DS0 (64 kbit/s), DS1 (1,544 Mbit/s), DS2 (6,312 Mbit/s), DS3 (44,736 Mbit/s) and DS4 (274,176 Mbit/s). In South Africa, the primarily used levels of the hierarchy are E1 (2,048 Mbit/s), E3 (34,368 Mbit/s) and E4 (139,264 Mbit/s).
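The relationship between the 64 kbit/s channel and the primary rates can be illustrated with a short calculation. The following Python sketch is illustrative only; the function names are hypothetical. It relies on the standard frame structures: a DS1 frame carries 24 channels plus one framing bit every 125 microseconds, while an E1 frame carries 32 time slots.

```python
# Sketch: primary-rate PDH line rates built up from 64 kbit/s channels.
# Function names are illustrative and not taken from any standard.

DS0_KBITS = 64  # one PCM channel: 8000 samples/s x 8 bits per sample


def ds1_rate_kbits() -> int:
    """North American DS1: 24 channels plus 1 framing bit per 125 us frame."""
    framing_kbits = 8  # 1 bit x 8000 frames/s
    return 24 * DS0_KBITS + framing_kbits


def e1_rate_kbits() -> int:
    """European E1: 32 time slots of 64 kbit/s each."""
    return 32 * DS0_KBITS


def ds2_overhead_kbits() -> int:
    """Bits added when four DS1 tributaries are multiplexed into a DS2."""
    ds2_kbits = 6312
    return ds2_kbits - 4 * ds1_rate_kbits()


print(ds1_rate_kbits(), e1_rate_kbits(), ds2_overhead_kbits())
```

The non-zero DS2 overhead reflects the plesiochronous nature of the hierarchy: tributaries run on nearly, but not exactly, equal clocks, so justification and framing bits are added at each multiplexing stage.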

2.8 Synchronous Digital Hierarchy

The Synchronous Digital Hierarchy (SDH) is a flexible synchronous time division multiplexing transmission system defined by CCITT to carry data streams at rates higher than those defined for PDH. SDH was derived from the Synchronous Optical Network (SONET).

SONET was developed by Bellcore in the 1980s as an optical fibre based transmission standard to serve telecommunications networks in the USA. SONET defines a set of framing formats, transmission speeds and multiplexing standards. The first level of the SONET framing hierarchy is STS-1, which is designed for a rate of 51,84 Mbit/s. When the STS-1 structure is carried at 51,84 Mbit/s over fibre optic media, the resulting service is called OC-1. Higher rate signals (STS-n) are achieved by multiplying the basic rate by n.

The SDH/SONET signal hierarchy is described in Table V :

Data Rate (Mbit/s)   SDH      SONET Optical   SONET Copper
51,84                         OC-1            STS-1
155,52               STM-1    OC-3            STS-3
622,08               STM-4    OC-12           STS-12
1 244,16                      OC-24           STS-24
2 488,32             STM-16   OC-48           STS-48

Table V : SDH/SONET Signal Hierarchy
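The rates in the hierarchy follow directly from the rule that an STS-n signal runs at n times the STS-1 rate. The following Python sketch reproduces them; the function names are hypothetical, and the mapping of STM-m onto STS-3m is the usual SDH/SONET correspondence.

```python
# Sketch: SDH/SONET line rates derived from the STS-1 base rate.

STS1_MBITS = 51.84  # STS-1 / OC-1 line rate in Mbit/s


def sts_rate_mbits(n: int) -> float:
    """Line rate of STS-n (or OC-n) in Mbit/s."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    return round(n * STS1_MBITS, 2)


def stm_rate_mbits(m: int) -> float:
    """Line rate of STM-m in Mbit/s; STM-m corresponds to STS-3m."""
    return sts_rate_mbits(3 * m)


for n in (1, 3, 12, 24, 48):
    print(f"STS-{n}: {sts_rate_mbits(n):.2f} Mbit/s")
```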

2.9 SAFENET

The Survivable Adaptable Fibre Optic Embedded Network (SAFENET) model has been defined for the US Navy[29].

Table VI shows the SAFENET profile and its relationship to the ISO OSI 7-layer Basic Reference Model.

The SAFENET profile is derived from the ISO OSI communications model[39].

Layer No.   ISO OSI Layers                         OSI Family                          Internet Family                       SAFENET Profiles
8           Application Process *                  Ada Task                            Ada Task                              (SAFENET User)
5 to 7      Application, Presentation, Session     Optional OSI Extended Profiles      Optional Internet Extended Profiles   SAFENET Extended Profiles
4           Transport                              ISO CO Transport Protocol,          UDP, TCP, XTP                         SAFENET Transport Profiles
                                                   OSI CL Transport Protocol, XTP
3           Network                                OSI CLNP                            IP                                    SAFENET Transport Profiles
2           Data Link                              SNAP, IEEE Logical Link Control                                           SAFENET LAN Profiles
1           Physical                               FDDI                                                                      SAFENET LAN Profiles
0           Cable Plant *                          Common Cable Plant

Note : * denotes outside the scope of the 7-layer model

Table VI : SAFENET Standards Suites

The SAFENET communication model is defined by a US Military Standard and supported by a Military Handbook.

MIL-STD-2204A - Survivable Adaptable Fiber Optic Embedded Network (1994-09-30).

MIL-HDBK-818A - Survivable Adaptable Fiber Optic Embedded Network - Network Development Guidance (1994-09-30).

The new issue of the standard deviates significantly from the previous one. The main departure addressed by the new revision is that two SAFENET families are allowed, i.e. the Internet Family and the OSI Family. The Internet Family is based on the TCP/IP protocols while the OSI Family is based on the ISO OSI Connectionless Network Protocol (CLNP). Both families have a common LAN profile based on Logical Link Control (Layer 2) and FDDI (Layer 1).

XTP at Layer 4 features as the real-time transport protocol option in both families.

A summary of the most important features of the new SAFENET revision is :

- Two streams (families), i.e. either Internet or OSI, are allowed.

- XTP is defined at Layer 4 only and not at Layers 3 and 4.

- Layers above the Transport Layer (Layer 5 and up) are termed SAFENET Extended Profiles and are optional.

2.10 Real-Time LAN Profile

To support real-time, mission-critical systems, the LAN profile is required to include protocols capable of real-time performance at each layer. It is therefore contended that a derivative of the SAFENET model is appropriate for generic real-time systems. This derivative is termed the Real-Time LAN Profile.

Table VII shows the Real-Time LAN Profile with the various options at each appropriate level.

Layer No.   ISO OSI Layers                 Real-Time LAN Profile
9           Application Process *          Software Application Tasks; Network Management Services; Timestamping Services
8           Operating System Extension *   POSIX Real-Time Operating System; Real-Time Operating System
7           Application                    Built-in Test Services; Network Time Services; File Transfer Services; Application Interface Services (APIS)
6           Presentation                   Null; Network Time Protocol
5           Session                        Null; Null
4           Transport                      UDP; XTP
3           Network                        IP
2           Data Link                      SNAP
                                           IEEE LLC Type I Protocol
                                           ANSI FDDI SMT Protocol; ANSI FDDI MAC Protocol
1           Physical                       ANSI FDDI PHY Protocol
0           Cable Layer *                  Multimode Fibre Cable Plant

Note : Layers marked with an asterisk (*) fall outside the ISO OSI 7-layer Model.

Table VII : Real-Time LAN Profile

2.11 Scalable Coherent Interface

The Scalable Coherent Interface (SCI) standard defines two levels of interfaces. The physical layer specifies electrical, mechanical and thermal characteristics of connectors and network interface cards. The logical level describes the address space, data transfer protocols, cache coherence mechanisms, synchronisation primitives, control and status registers as well as initialization and error recovery facilities.

Three physical layer interfaces are defined for SCI: an electrical parallel interface designed for short distances of less than 10 m with a data rate of 8 Gbit/s, an electrical serial interface used for distances in the order of tens of metres at a data rate of 1 Gbit/s and a serial optical interface used for up to ten kilometres at a data rate of 1 Gbit/s.

2.12 Fibre Channel

Fibre Channel has a layered architecture in which functions are organised into a succession of five layers, FC-0 through FC-4. Of the five layers, the lowest three are contained in the Fibre Channel Physical and Signalling Interface (FC-PH) document. Moving upward from the lower to the higher layers is equivalent to moving from the physical to the logical domain. At the lowest level of the hierarchy are those functions that concern themselves with providing a physical connection and the transmission of raw bits. At the highest level are portions of application processes that are the originators and destinations of communication requests. Intermediate layers comprise other functions including detecting and correcting errors, achieving efficient utilisation of resources, performing routing, preventing congestion and regulating the flow of data.

The Fibre Channel Architecture Model is described in Table VIII :

FC-4   HiPPI, IPI, SCSI, IP, 802.X, Others
FC-3   Common Service
FC-2   Signalling Protocol
FC-1   Transmission Protocol
FC-0   Physical

Table VIII : Fibre Channel Architecture Model

2.13 ATM

The designers of ATM have identified that conformance to the OSI 7-layer model has serious implications for performance, especially in terms of throughput and latency which are required by real-time systems as well as to support multimedia applications. They have therefore dispensed with adherence to this model, but nevertheless recognised the requirement for a coherent and accepted model in order to achieve interconnectivity and interoperability.

The ATM Architecture Model is described in Table IX :

Higher Layers
ATM Adaptation Layer (AAL)   Convergence Sub-Layer (CS)
                             Segmentation and Reassembly Sub-Layer (SAR)
ATM Layer (ATM)              Virtual Channel Sub-Layer (VC)
                             Virtual Path Sub-Layer (VP)
Physical Layer (PL)          Transmission Convergence Sub-Layer (TC)
                             Physical Medium Sub-Layer (PM)

Table IX : ATM Architecture Model
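The role of the Segmentation and Reassembly Sub-Layer can be illustrated with a deliberately simplified Python sketch. It assumes only the fixed ATM cell format of a 5-octet header and a 48-octet payload; it ignores the headers, trailers and checksums that the individual AAL types add, and all function names are hypothetical.

```python
# Sketch: simplified segmentation and reassembly over fixed-size ATM cells.
# Assumption: AAL-specific headers, trailers and CRCs are omitted.
from typing import List

CELL_PAYLOAD_OCTETS = 48
CELL_HEADER_OCTETS = 5


def segment(pdu: bytes) -> List[bytes]:
    """Split a PDU into 48-octet cell payloads, zero-padding the last one."""
    cells_needed = -(-len(pdu) // CELL_PAYLOAD_OCTETS)  # ceiling division
    padded_len = cells_needed * CELL_PAYLOAD_OCTETS
    padded = pdu.ljust(padded_len, b"\x00")
    return [padded[i:i + CELL_PAYLOAD_OCTETS]
            for i in range(0, padded_len, CELL_PAYLOAD_OCTETS)]


def reassemble(payloads: List[bytes], original_length: int) -> bytes:
    """Concatenate cell payloads and strip the padding."""
    return b"".join(payloads)[:original_length]


def wire_octets(pdu_length: int) -> int:
    """Total octets on the wire, including the 5-octet header of each cell."""
    cells_needed = -(-pdu_length // CELL_PAYLOAD_OCTETS)
    return cells_needed * (CELL_PAYLOAD_OCTETS + CELL_HEADER_OCTETS)


pdu = bytes(range(100))
cells = segment(pdu)
print(len(cells), wire_octets(len(pdu)))
```

In this example a 100-octet PDU occupies three cells, so header and padding overhead is a significant fraction of the transmitted octets for short messages.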

2.14 HPN

The High Performance Network (HPN) is being formulated by the HPN Working Group (HPNWG) of the US Navy.

The HPNWG have also recognised the advantages of diverging from the OSI 7-layer model and formulated the HPN Framework described in Table X and the HPN Domain described in Table XI :

Application/User Facilities
Communication Facilities
Transfer Facilities

Table X : HPN Framework

Application Software Entities (Combat System Services/Functions) | User Entities (People and Combat Systems Equipments)
Application Program Interfaces
Management | External Environment Interfaces | Security
Audio/Voice Services | Distributed File Service | Video/Image Services | Other Distributed Services
Time Service | Naming Service | Message Service | Presentation Service | Sensor Data
Operating System Services
Communication Facilities
Computing Platform
Transfer Facilities

Table XI : HPN Domain

3. Conclusions and Recommendations

3.1 ISO OSI Basic Reference Model

The OSI Basic Reference Model has proved valuable in its role as a conceptual and functional framework for co-ordinating the development of protocol standards. However, in itself, it is merely a paradigm for describing a LAN system and is not a practicable LAN Profile.

3.2 OSI vs Internet

The OSI concept was developed by the International Organization for Standardization (ISO), which attempted to define an elegant approach to interoperability in opposition to the US Department of Defense Internet Profile (i.e. TCP/IP). The US DOD contributed a great deal of money to the development of the Internet protocols as well as the ARPAnet, now more commonly known as the Internet.

The US DOD has enormous resources and resolve when sponsoring a standardisation effort. It also continues to sponsor the Internet, which now has tens of millions of subscribers with more joining at the rate of over a million per month. The Internet is also the forerunner of the much awaited Information Superhighway. At present the Internet possesses the subscribers and the information hosts, but not the bandwidth to support the required data rates. The Internet profiles have enormous support from the most powerful organisation on Earth (i.e. the US DOD) as well as legitimacy from a vast number of network users and implementers.

The Internet was just coming to implementation maturity when the OSI protocols were in their paper standardisation phase. It thus has an extensive and probably unassailable head start.

It is ventured that the OSI Profile and OSI-conformant products will experience a steady decline until they are completely replaced by newer technologies (such as IPng and ATM) within the next ten years.

3.3 OSI Profile

The OSI Profile provides services from Layer 3 to Layer 7. These services are available as standard OSI-compliant products. In the context of real-time, mission-critical, distributed systems, the OSI network layer and transport layer protocol products are potentially appropriate. At the network layer, the connectionless option is better suited to real-time systems due to lower communication overhead.

3.4 Internet Profile

The Internet Profile is probably the most widely deployed network profile at present.

3.5 MAP/TOP

MAP/TOP is somewhat dated and will disappear in the short to medium term.

3.6 GOSIP

There are many networking experts who recognise the elegance and appropriateness of OSI. In 1988 the US Government attempted to specify obligatory conformance to OSI by all network vendors supplying products and systems to all US Government organisations. They did this by issuing a Federal Information Processing Standards Publication called the Government Open Systems Interconnection Profile (GOSIP)[22], which was a contractually applicable document in all US Government requests for tender and resulting contracts for network systems.

However, programme managers consistently found means to have this prescription waived, to such an extent that the GOSIP initiative has effectively been nullified. In most cases, GOSIP gave way to the Internet Profile, thus enhancing the application base of TCP/IP at the expense of TP4/CLNP.

3.7 SAFENET vs Internet

When SAFENET was first approved (MIL-STD-2204), the maximum interoperability option specified the use of OSI protocols (SAFENET I), while the maximum performance option specified the use of the Xpress Transfer Protocol (SAFENET II). The latter was to perform the functionality of the network and transport layers in order to provide maximum performance.

Two fundamental changes have occurred since then (1992) and are approved for the latest issue (late 1994) of the SAFENET standard (MIL-STD-2204A). These are the inclusion of the Internet stream as an alternative to the OSI stream and the redefinition of XTP as a transport protocol to operate above IP or CLNP at the network layer.

It is concluded that this again illustrates the enormous power of the Internet lobby (as the standardisation process is a democratic one). It also serves to amplify the previously ventured contention that this will be another factor that will allow the Internet to prevail over OSI in the medium term.

3.8 ATM and OSI

ATM has been designed for maximum performance and flexibility. As such, the designers of the ATM standards have felt constrained by the rigidity of the OSI Basic Reference Model and have therefore dispensed with strict adherence thereto. Therefore technical requirements have been favoured over the elegance of the paradigm.

It is concluded that as ATM finds increasing application throughout the networking domain, the importance of the OSI model will diminish, probably to a position of academic rather than practical significance. The issue of open systems will gain in importance, but technical and commercial practicalities will prevail.

3.9 Novell Corporation LAN Profile

Worldwide, NetWare IPX and SPX probably support the most LAN node interconnections; however TCP/IP is gaining ground at their expense.

3.10 SAFENET

It is proposed that SAFENET, or derivatives thereof, is the most appropriate LAN Profile for most real-time, mission-critical, distributed systems.

It is imperative to tailor the SAFENET standard in order to achieve maximum flexibility and affordability. The US DOD now applies SAFENET, as well as most other standards, as guidelines rather than as mandatory requirements.

3.11 Real-Time LAN Profile

It is contended that the Real-Time LAN Profile is appropriate in that it is a flexible, practical, achievable implementation of the ISO OSI Model that is capable of cost-effective, real-time performance as well as maximum interconnectivity. It provides for all of the layers required for a complete system and is achievable because implementations (hardware and software) for all of the layers exist or are implementable without resorting to proprietary solutions.

Another reason for its applicability is that, while SAFENET is a US Navy standard, it can be followed as a guideline without necessarily requiring full adherence. This is significant in the South African context (or even in many commercial or industrial contexts) as full adherence would have extensive and possibly prohibitive cost implications.

3.12 Technologies of the Future

In the future, ATM, Fibre Channel and Scalable Coherent Interface will mature to the extent that they will be appropriate for the construction of real-time, mission-critical, distributed systems. An appropriate Network Profile, based on their currently defined models, will have to be expanded and adopted.

Appendix H

Error Analysis and Modelling

1. Scope
   1.1 Scope
   1.2 Introduction
   1.3 Appendix Layout

2. Error Analysis and Modelling Process
   2.1 Iterative Modelling Process
   2.2 Operational and Environmental Conditions
   2.3 1st Order Error Analysis

3. System Models
   3.1 Functional Flow Block Diagrams
   3.2 Control System Model

4. Requirements Derivation
   4.1 Derivation Process
   4.2 Probability Theory
   4.3 Probability of Survival
   4.4 Probability of Hardkill

5. Real-Life Example
   5.1 Engagement Conditions
       5.1.1 Target Characteristics
       5.1.2 Tracking Characteristics
       5.1.3 Weapon Characteristics
   5.2 Basic Calculations
   5.3 Determination of Miss Distance from Single Round Hit Probability
   5.4 Determination of Acceptable Timing Error from Angular Error

6. Conclusions and Recommendations
   6.1 Systems Approach
   6.2 LAN Implications
   6.3 Synchronisation and Timestamping
   6.4 Network Time Protocol
   6.5 Internal NTP
   6.6 Direct Approach to System Timing
   6.7 Derivation of System TTRT and Synchronous Bandwidth Allocation
   6.8 Derivation of Precise Timing Requirements

List of Figures

Figure 1 : Control System Model
Figure 2 : Crossing Target Scenario
Figure 3 : Scenario Time/Position Relationships

List of Diagrams

Diagram 1 : Top-Level Functional Flow Block Diagram for Defend Against Air Threats
Diagram 2 : Decomposed Functional Flow Block Diagram for Defend Against Air Threats

List of Tables

Table 1 : Derivation Process
Table 2 : Ph1 vs Phn for Maximum No. of Rounds

1. Scope

1.1 Scope

This appendix presents an approach to the determination of timing accuracy requirements in typical real-time, distributed systems. It does so by first examining overall system effectiveness requirements, then analysing and modelling system errors in terms of these requirements and finally determining acceptable levels of system error due to timing inaccuracies of the interconnecting network.

1.2 Introduction

Real-time, distributed systems provide a solution to many complex user system requirements. However, the system implementations are normally equally complex, with many sub-systems contributing to the overall system functionality, with the added complexity of time delays in the interconnection fabric; i.e. those related to the distributed nature of the real-time system.

In deriving an appropriate system architecture, as well as identifying appropriate technologies and technical solutions for the implementation of the system, it is important to derive the timing requirements of the system or to determine the effects of timing inaccuracies in order to circumvent these by other methods.

Such determination of the timing requirements is most rigorously undertaken by analysing the allocated system requirements and then deriving performance requirements in a top-down manner until the precise timing requirements for the interconnecting network can be separated from other non-timing performance requirements.

However, the environment in which such real-time distributed systems are required to operate successfully is often extremely complex to model. Examples of such environments are the atmosphere and the ocean. Where there is interaction between, for example, the atmosphere, the ocean and highly dynamic man-made systems, the modelling of system performance gains an added dimension of complexity. Such systems can normally only be modelled in terms of a number of stochastic processes.

For any system, the overall user requirements can normally be expressed in terms of system effectiveness. For non-deterministic systems, system effectiveness can be expressed in terms of probability of success (Ps) and cost effectiveness.

For stochastic systems, the element of non-determinism arises from random system phenomena resulting in measurement errors which cannot be reduced to zero. In most practical systems some errors can be reduced to zero or to insignificance by the allocation of system resources; however this normally involves cost, leading to the issue of cost effectiveness. In such systems it is normal to perform trade-offs in order to isolate and constrain errors in equitable proportion to the various sources of error.

System errors can be considered in three categories, i.e. geometric errors (εg), quantisation errors (εq) and timing errors (εt). In order to investigate the relationship between these errors, an error analysis is undertaken for a specific example of a mission-critical, real-time, distributed system.

There are few more mission-critical systems with hard real-time performance requirements than the air defence system of a modern naval surface combatant. In parallel with the combat vessel's primary mission, e.g. anti-surface (offensive strike) or anti-submarine warfare, the combat system must detect, engage and destroy a variety of high-performance, sophisticated, air-launched weapons, often in saturation attack. Detection, classification and engage times are in the order of seconds[44] with the result of failure normally being loss of life and/or platform, often compounded by tactical or even strategic implications.

The very nature of a mission-capable combat system lends itself to a distributed architecture; i.e. in order to maximise its capability in dealing with multiple threats in multiple warfare areas, as well as maximising survivability.

This appendix provides an overview of the typical operating environment of a modern naval combat system and a methodology for determining network timing requirements from the allocated requirements.

As a completely rigorous approach is beyond the scope of this thesis, the methodology is mainly illustrative.

1.3 Appendix Layout

This appendix derives system performance requirements from the system effectiveness requirements of a typical real-time, distributed system. Using the theory of probability of hit as a starting point, order of magnitude angular errors are determined with acceptable timing errors being derived from the angular errors.

Analysis of the results supports the conclusions for the requirement for special timing techniques to be provided within the real-time, mission-critical, distributed system.

2. Error Analysis and Modelling Process

It is clear that a complete error analysis of such a real-time system is likely to be extremely complex. It is unlikely that the process could be performed in one iteration or even by one team of system engineers. Also the engineering disciplines involved are complex and disparate, requiring that system engineers from different disciplines perform analysis in their specific areas of expertise and then collaborate in producing an integrated result.

Also the operational conditions, including threat scenarios and types, environmental considerations and platform dynamics are numerous. It would be almost impossible in a simple model to account for all of these variables.

It is proposed, therefore, that an iterative process should be followed in the determination of a system error model that supports meaningful and useful error analysis, especially in providing feedback into the system design process, as well as providing confidence to the user that the system is capable of performing its operational mission and that he has acquired a system with the capabilities for which he has paid.

2.1 Iterative Modelling Process

- The error analysis should be an iterative process where each iteration provides a further level of detail, sophistication and accuracy.

- Each successive iteration should be founded upon a preceding error model which has been understood and accepted by all stakeholders of the problem.

- The initial error model should be the seed of the iterative error model and should be based on an appropriate standard or reference set of operational and environmental conditions.

2.2 Operational and Environmental Conditions

Threat Characteristics
- Type (e.g. Missile/Bomb/Aircraft)
- Dynamic Performance (e.g. Maximum Velocity, Maximum Acceleration)
- Detectability (e.g. Radar Cross-Section, Electro-Optic Cross-Section)
- Attack Profile (e.g. Launch Altitude, Range, Velocity; Three Dimensional Spatial Approach)

Environmental Conditions
- Sea State
- Wind Velocity
- Barometric Pressure
- Air Temperature

Platform Motion Characteristics
- Direction (relative to earth)
- Speed
- Roll
- Pitch
- Yaw
- Heave
- Flex

The operational user will normally have multiple and diverse means to neutralise the threat, i.e. hardkill with weapons (e.g. guns, missiles) and softkill with electronic warfare countermeasures (jamming, chaff, decoys, etc.). Defensive tactics will normally follow a layered approach, i.e. long range defence with missiles, intermediate range defence with large calibre guns with intelligent munitions, short range defence with rapid fire, unintelligent munitions and finally electronic countermeasures. These different defensive sub-systems will normally be replicated so that they can engage multiple targets or be used co-operatively against single targets. Replication also increases system dependability.

2.3 1st Order Error Analysis

For the purposes of 1st order error analysis, each defensive weapon system should be considered individually.

The overall engagement is a complex, dynamic situation. For a 1st order error analysis, steady-state conditions need to be assumed.

Summary of conditions for a 1st Order Error Analysis :

- Standard Threat
- Standard Scenario
- Standard Environmental Conditions
- Standard Platform Motion Conditions
- Steady-State Conditions

3. System Models

The following system models can be defined for the function Defend Against Air Threats :

3.1 Functional Flow Block Diagrams

[Detect] -> [Control] -> [Engage]

Diagram 1 : Top-Level Functional Flow Block Diagram for Defend Against Air Threats

The top-level FFBD can be decomposed as depicted in Diagram 2 below :

Detect                      Control               Engage
Radar Detection             Identify              Ballistic Computation
Electro-Optical Detection   Classify              Weapon Alignment
Electromagnetic Detection   Designate             Weapon Control
                            Track                 Munition Launch
                            Navigate              Munition Flight
                            Compensate            Detect Locally (By Munition) (Onboard)
                            - Meteorological
                            - Platform Motion

Diagram 2 : Decomposed Functional Flow Block Diagram for Defend Against Air Threats

3.2 Control System Model

Figure 1 depicts the control system model for the air defence segment consisting of tracker, motion and environmental compensators and ballistics unit.

Figure 1 : Control System Model

4. Requirements Derivation

The user requirement of an air defence system is survival, that is, to have a high probability of survival (Ps). Ps is a function of the probability of softkill (Psk) and hardkill (Phk), the latter being a function of the engagement hit probability (Phn). Phn is a function of the number of rounds fired (n) and the single round hit probability (Ph1). The number of rounds that can be fired depends on the weapon's rate of fire and the engagement time. The latter depends on the threat detection time and the weapon range. Ph1 depends on the geometric, quantisation and timing errors of the system and effectively defines the performance and quality of the control system.
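The dependence of the number of rounds on rate of fire and engagement time can be sketched as follows. This Python fragment is a simplified illustration and not a model taken from this appendix: it assumes a directly inbound threat at constant closing speed, and the function names, parameters and numerical values are all hypothetical.

```python
# Sketch: rounds available in an engagement, under simplifying assumptions
# (constant closing speed, directly inbound threat, continuous fire).
import math


def engagement_time_s(detection_range_m: float, weapon_range_m: float,
                      min_range_m: float, closing_speed_ms: float,
                      reaction_time_s: float = 0.0) -> float:
    """Time window (s) during which the weapon can fire at the threat."""
    if closing_speed_ms <= 0:
        raise ValueError("closing speed must be positive")
    # Range of the threat once the system has reacted to the detection.
    ready_range_m = detection_range_m - closing_speed_ms * reaction_time_s
    # Firing can only start once the threat is inside the weapon range.
    open_fire_range_m = min(ready_range_m, weapon_range_m)
    return max(0.0, (open_fire_range_m - min_range_m) / closing_speed_ms)


def rounds_available(rate_of_fire_rpm: float, engagement_time: float) -> int:
    """Whole number of rounds n that can be fired in the engagement time."""
    return math.floor(rate_of_fire_rpm / 60.0 * engagement_time)


# Hypothetical values: detection at 10 km, weapon range 4 km, minimum
# range 500 m, threat closing at 300 m/s, 5 s reaction time, 600 rounds/min.
t = engagement_time_s(10_000.0, 4_000.0, 500.0, 300.0, reaction_time_s=5.0)
n = rounds_available(600.0, t)
print(f"engagement time = {t:.2f} s, rounds available n = {n}")
```

With these illustrative values the engagement lasts under twelve seconds, which is consistent with the statement that engagement times are in the order of seconds.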

In modern digital systems, quantisation errors are likely to be negligible (due to the availability of inexpensive wide format processors, inexpensive memory and large bandwidth communication networks); hence for the 1st order model, quantisation errors can be ignored. This leaves geometric errors and timing errors. Geometric errors arise from positioning and angle measurement errors of the detection, tracking and motion compensation sub-systems, as well as dispersion of the projectiles due to atmospheric phenomena and manufacturing tolerances. Geometric errors are impossible or inordinately expensive to reduce to zero and can be considered as intrinsic within the system. All geometric errors can be considered as relative miss distances and expressed as angular errors in milliradians (mrad).

By means of various system transformations, timing errors will eventually manifest themselves in terms of miss distance. It is therefore proposed that timing errors be reduced so that they do not contribute substantially to overall miss distance.
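How a timing error translates into miss distance can be illustrated with a first-order Python sketch. It is not the derivation used in this appendix: it assumes a target crossing the line of sight at constant speed, the small-angle approximation, and that geometric and timing errors are independent so that they combine as a root sum of squares. All names and values are hypothetical.

```python
# Sketch: first-order effect of a timing error on miss distance.
# Assumptions: constant crossing speed, small angles, independent errors.
import math


def miss_distance_m(crossing_speed_ms: float, timing_error_s: float) -> float:
    """Distance the target moves during the timing error."""
    return crossing_speed_ms * timing_error_s


def angular_error_mrad(miss_m: float, range_m: float) -> float:
    """Miss distance expressed as an angle at the given range (small angle)."""
    return 1000.0 * miss_m / range_m


def combined_error_mrad(geometric_mrad: float, timing_mrad: float) -> float:
    """Root-sum-square combination of independent error contributions."""
    return math.hypot(geometric_mrad, timing_mrad)


# Hypothetical values: 300 m/s crossing speed, 1 ms timing error,
# 2000 m range, 1 mrad intrinsic geometric error.
miss = miss_distance_m(300.0, 0.001)
timing_mrad = angular_error_mrad(miss, 2000.0)
total_mrad = combined_error_mrad(1.0, timing_mrad)
print(f"timing contribution = {timing_mrad:.3f} mrad, "
      f"total = {total_mrad:.3f} mrad")
```

Under these assumptions a timing contribution well below the geometric error adds little to the combined error, which is the sense in which timing errors should not contribute substantially to overall miss distance.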

4.1 Derivation Process

The following process is proposed to derive timing requirements from the system requirements.

1. Specify Probability of Survival
2. Determine Engagement Probability of Hit
3. Derive Single Round Probability of Hit
4. Derive Statistical Round Miss Distance
5. Specify Contribution of Timing Error to Miss Distance
6. Derive Timing Performance Requirements

Table 1 : Derivation Process

4.2 Probability Theory

Due to the numerous random processes involved, the probability of hit of a ballistic munition can only be described mathematically by the use of probability theory and calculated by means of probability calculus or simulation modelling.

Such probabilities are dependent on combinations of events and as such are governed by the Laws of Combination :

The Addition Theorem

P(A+B) = P(A)+P(B)-P(AB)...... (Eq. 1)

where :

P(A) = Probability of Event A occurring

P(B) = Probability of Event B occurring

The Multiplication Theorem

P(AB) = P(A)·P(B/A)...... (Eq. 2)

If the occurrence of B is independent of the occurrence of A, the two events are said to be stochastically independent of one another and then :

P(B/A) = P(B)

and then :

P(AB) = P(A)·P(B)...... (Eq. 3)

Also, in respect of complementary events :

P(Ā) = 1-P(A)...... (Eq. 4)

4.3 Probability of Survival

Ps = Probability of Survival

=ƒs(Phk, Psk, Ptm) ...... (Eq. 5)

where :

Phk = Probability of Hardkill

Psk = Probability of Softkill

Ptm = Probability of Threat Weapon Miss or Failure

4.4 Probability of Hardkill

Considering only hardkill :

Phk = ƒhk(ML, Phn) ...... (Eq. 6)

where :

ML = Munition Lethality

Phn = Cumulative probability of hardkill for n rounds

Note that the munition does not have to hit the target for hardkill to occur, e.g. proximity-fused ballistic munitions or missiles detonate within a certain distance of the target and use shrapnel or blast effects rather than kinetic energy. This leads to an effective target cross-section. Missiles have terminal homing capability while ballistic munitions do not.

Considering ballistic munitions :

Normally these are launched in bursts with the probability of kill being the cumulation of the individual hit probability. Consider a hit being the effect of either a direct hit or proximity action.

The laws of combination apply to the probability of multiple rounds hitting the target. It is assumed that when a round is fired at a target there are only two possible outcomes, i.e. hit (with probability w) or miss (with probability 1-w), and that these are two mutually exclusive events. We also assume that the probability of each further shot scoring a hit is w, irrespective of whether the preceding one scored a hit or not (i.e. independent firing).

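Now i specific shots can be selected from n in $\binom{n}{i}$ different combinations. According to the Addition Theorem, the probability that precisely i shots hit the target out of n rounds fired is :

$f(i) = \binom{n}{i} w^{i} (1-w)^{n-i}$ ...... (Eq. 7)

Since only discrete numbers of hits can occur, i is a discrete random variable and a binomial distribution is applicable. With F denoting the cumulative binomial distribution, the probability Wi of at least i hits from n rounds is :

$W_i = 1 - F(i-1) = 1 - \sum_{k=0}^{i-1} \binom{n}{k} w^{k} (1-w)^{n-k}$ ...... (Eq. 8)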

Considering only one hit reduces Eq. 8 to :

$W_1 = 1 - (1-w)^{n}$ ...... (Eq. 9)

or :

$P_{hn} = 1 - (1 - P_{h1})^{n}$ ...... (Eq. 10)

where :

Ph1 = probability of hit of individual munition, i.e. single round hit probability

Phn = probability of at least one hit from n launched munitions

n = Number of Launched Munitions

Eq. 10 implies that as long as Ph1 > 0, Phn → 1 for n → ∞.

then :

$P_{h1} = 1 - e^{\ln(1-P_{hn})/n}$ ...... (Eq. 11)
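As a minimal sketch, Eq. 10 and Eq. 11 can be evaluated numerically as shown here. The function names and the illustrative burst size of 100 rounds are assumptions introduced for this example and are not part of the analysis.

```python
import math


def cumulative_hit_probability(p_h1: float, n: int) -> float:
    """Eq. 10: probability of at least one hit from n independent rounds."""
    return 1.0 - (1.0 - p_h1) ** n


def single_round_hit_probability(p_hn: float, n: int) -> float:
    """Eq. 11: single round hit probability needed to reach p_hn with n rounds."""
    return 1.0 - math.exp(math.log(1.0 - p_hn) / n)


if __name__ == "__main__":
    n = 100  # illustrative burst size (assumption, not taken from the text)
    p_h1 = single_round_hit_probability(0.95, n)
    print(f"Ph1 required for Phn = 95% with n = {n}: {p_h1:.4f}")
    print(f"Check via Eq. 10: Phn = {cumulative_hit_probability(p_h1, n):.4f}")
```

Eq. 11 is the algebraic inverse of Eq. 10, so feeding the result of one function into the other should return the starting value to within floating-point precision.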

5. Real-Life Example

To determine the performance requirements of the network, a worst-case scenario, considered realistic over the complete lifespan of the system, needs to be analyzed.

5.1 Engagement Conditions

Consider a crossing target such as in an area defence scenario, i.e. own ship providing air defence to a consort as depicted in Figure 2. Such a target effectively maintains a constant range from own ship, unlike a radially closing target. This implies an essentially constant single round hit probability Ph1, whereas with a radially closing target, Ph1 increases with decreasing range. This considerably simplifies the derivation of Ph1 from Phn. Assume also that the round to round hit probabilities are uncorrelated.

Figure 2 : Crossing Target Scenario

5.1.1 Target Characteristics

Type : Anti-Ship Missile (ASM)
Velocity : Mach 5 (1 500 ms⁻¹)
Maximum Acceleration : 10 g (100 ms⁻²)
Cross-Section : 0,2 m²
Length : 3 m
Safe Detonation Range : 500 m

5.1.2 Tracking Characteristics

Maximum Range : 8 000 m
Minimum Range : 100 m

5.1.3 Weapon Characteristics

Type : Close-in Weapon System (CIWS), Four-Barrelled Automatic Revolver Cannon
Calibre : 27 mm
Maximum Range (R) : 1 500 m
Muzzle Velocity (v) : 1 100 ms⁻¹
Rate of Fire : 4 x 1 880 rounds per minute
Available Rounds : 1 440
Terminal Action : Kinetic

5.2 Basic Calculations

Time of flight (ToF) of first round from instant of fire to maximum weapon range :

ToF1 =R/v

= 1 500/1 100

= 1,36 s

Figure 3 : Scenario Time/Position Relationships

Distance of the ASM beyond maximum weapon range at the first instant of fire (target velocity x ToF1) :

R1 = 1 500 x 1,36

= 2 040 m (i.e. start firing at 3 540 m from consort)

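Time of flight of the ASM from its range at the first instant of fire to the safe detonation range (the 1 000 m term being the distance from the maximum weapon range of 1 500 m to the safe detonation range of 500 m) :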

ToF2 = (2 040 + 1 000)/1 500

= 2,03 s

No. of Rounds in Maximum Burst

= ToF2 x Rate of Fire

= 2,03 x (4 x 1 880/60)

= 254

For n = 254 and Phn in the range of 90,0% to 99,9%, Eq. 11 determines Ph1 as provided by Table 2 :

No. of Rounds    Phn      Ph1
254              90,0%    0,9%
254              95,0%    1,2%
254              99,0%    1,8%
254              99,9%    2,7%

Table 2 : Ph1 vs Phn for Maximum No. of Rounds
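The figures of Section 5.2 can be cross-checked with a short script. This is a sketch only: the constant and function names are hypothetical, unrounded intermediate values are used (so ToF1 and R1 differ slightly from the rounded figures in the text), and the burst size is truncated to whole rounds.

```python
import math

# Scenario values taken from Sections 5.1.1 and 5.1.3
WEAPON_RANGE_M = 1500.0            # maximum weapon range R
MUZZLE_VELOCITY_M_S = 1100.0       # muzzle velocity v
TARGET_VELOCITY_M_S = 1500.0       # ASM velocity
SAFE_DETONATION_RANGE_M = 500.0    # ASM safe detonation range
RATE_OF_FIRE_RPS = 4 * 1880 / 60   # four barrels, converted to rounds per second


def max_burst_rounds() -> int:
    """Number of rounds that can be fired before the ASM reaches safe detonation range."""
    tof1 = WEAPON_RANGE_M / MUZZLE_VELOCITY_M_S   # first round to maximum range
    r1 = TARGET_VELOCITY_M_S * tof1               # ASM travel during ToF1
    tof2 = (r1 + WEAPON_RANGE_M - SAFE_DETONATION_RANGE_M) / TARGET_VELOCITY_M_S
    return int(tof2 * RATE_OF_FIRE_RPS)


def single_round_hit_probability(p_hn: float, n: int) -> float:
    """Eq. 11 of this appendix."""
    return 1.0 - math.exp(math.log(1.0 - p_hn) / n)


def table_2(n: int) -> list:
    """(Phn, Ph1) pairs for the engagement hit probabilities used in Table 2."""
    return [(p_hn, single_round_hit_probability(p_hn, n))
            for p_hn in (0.90, 0.95, 0.99, 0.999)]


if __name__ == "__main__":
    n = max_burst_rounds()
    for p_hn, p_h1 in table_2(n):
        print(f"n = {n}  Phn = {100 * p_hn:.1f}%  Ph1 = {100 * p_h1:.1f}%")
```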

5.3 Determination of Miss Distance from Single Round Hit Probability

Single round hit probability is dependent on system errors in the real-time, distributed fire control system. These errors are a complex cumulation of geometric, quantisation and timing errors.

Functionally, the combat system transforms various sensor signals and operator commands into weapon control orders which align a ballistic weapon in order to deliver ballistic projectiles. In the case of kinetic terminal effect projectiles, these must physically hit the target in order to neutralise it, while in the case of proximity effects projectiles, these must approach the target to within a certain effective distance in order for the proximity action to take effect.

The control system transformations contain significant non-linearities that make a thorough analytical evaluation of hit probabilities extremely complex or impossible. Two approaches can be taken :

• Approximate analytical modelling and numerical evaluation of hit probabilities based on linearised signal transfer relationships, order-of-magnitude correlations and simple realistic probability distributions.

• Time-domain, dynamic scenario modelling by means of thorough simulation modelling.

Eitelberg[71] models a typical combat system using both methods. He shows that a cumulation of geometric and ambient system errors results in typical stationary miss distances in the order of 2,5 mrad (2σ). His results concur with those of van Zyl[133] who also uses simulation modelling to determine the effect of geometric and timing errors on engagement probability of hit.

i.e. for :

0,9% < Ph1 < 2,7%

a system static angular error σsr is assumed :

σsr = 2,5 mrad (2 σ)

Eitelberg also shows that a dynamic error σdr in the same order of magnitude as σsr is required to effect dispersion of the rounds in the burst around the target and thereby increase Phn. Therefore some dynamic errors, such as those introduced by timing jitter, do not adversely affect Phn, as long as they are a fraction of σsr.

Hence, for the purposes of this study, 2,5 mrad is assumed to represent the acceptable total angular error that still allows the system to meet its overall performance requirements.

5.4 Determination of Acceptable Timing Error from Angular Error

Consider the Control System Model as depicted in Figure 1. Target range and velocity are required to be fed back after ballistics computation into the target data filter in order to perform target position prediction. These two processes are therefore collaborative.

Now consider the case where the ballistics unit and tracker (which performs target position prediction) are distributed, i.e. as would be the case where the sub-systems are connected via a local area network (LAN). In this case, there would be a finite time delay (latency) and timing uncertainty (jitter) in transmitting the target position from the tracker to the ballistics unit. The ballistics unit requires target range, azimuth and elevation for the ballistic computation and target velocity for lead angle prediction.

For Target Velocity of :

v = 1 500 ms⁻¹

x = vt...... (Eq. 12)

where :

x = distance, v = target velocity, t = time

and :

εp =vεt ...... (Eq. 13)

where:

εp= position error (or miss distance), εt = time error

Therefore Position Error per Millisecond Latency is :

$\varepsilon_p = 1\,500 \times 1 \times 10^{-3}$

= 1,5 m per millisecond

Considering target position :

d=Rθ ...... (Eq. 14)

where :

d= miss distance, R = target range, θ = angular error

hence for:

R = 1 500 m, θ = 2,5 mrad

d = 3,75 m

Therefore, at an average engagement range of 1 500 m, the angular error of 2,5 mrad results in a miss distance of 3,75 m (note that this is in the same order as the size of the target).

It is contended that the timing error component of the miss distance should be no greater than 20% of the total error, i.e. :

εp = 20% of εtotal ...... (Eq. 15)

= 20% of 3,75

= 0,75 m

or, in terms of time (from Eq. 13, at 1,5 m of position error per millisecond) :

εt = 0,75/1,5

= 0,5 millisecond

Hence the acceptable timing error for the distributed combat system is 0,5 ms (500 µs) which implies that the transfer latency and jitter have to be less than 500 µs.
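The chain from angular error to timing budget (Eq. 13 to Eq. 15) can be sketched as a single function. The function and parameter names are hypothetical; the values passed in the example are those of this section.

```python
def acceptable_timing_error_s(target_velocity_m_s: float,
                              engagement_range_m: float,
                              angular_error_mrad: float,
                              timing_share: float) -> float:
    """Return the timing error (seconds) allowed by a given share of the miss distance."""
    # Eq. 14: total miss distance d = R * theta (theta converted from mrad to rad)
    miss_distance_m = engagement_range_m * angular_error_mrad * 1e-3
    # Eq. 15: portion of the miss distance allocated to timing errors
    timing_miss_m = timing_share * miss_distance_m
    # Eq. 13 rearranged: time error = position error / target velocity
    return timing_miss_m / target_velocity_m_s


if __name__ == "__main__":
    budget_s = acceptable_timing_error_s(1500.0, 1500.0, 2.5, 0.20)
    print(f"Acceptable timing error: {budget_s * 1e6:.0f} microseconds")
```

Expressing the budget as a function makes it straightforward to repeat the calculation for other target velocities, engagement ranges or timing allocations.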

6. Conclusions and Recommendations

6.1 Systems Approach

The example and methodology provided in this appendix illustrate a systematic approach to the derivation of network timing requirements from the system requirements.

It is clear that accurate determination of errors in such a space/time/position system as described in the example is extremely complex, whether by analytical methods or by simulation. It is probable that only simulation can provide such accurate results; however, a 1st order analytical model can provide typical timing requirements in terms of worst case scenarios.

The analysis performed shows that there is a finite probability of success (more specifically probability of hit) and that timing errors due to the proposed distributed architecture can be specified at a level where they are insignificant compared with other system errors (specifically geometric and ambient errors).

6.2 LAN Implications

When considering a typical real-time control system employing a 50-node, 2 500 m FDDI LAN, a Target Token Rotation Time (TTRT) of 1 ms is in the order of the best achievable (see Appendix F). A TTRT of 1 ms implies a worst case Token Rotation Time of 2 ms (also see Appendix F). Hence 2 ms would be the worst case latency and jitter in the transfer of target position from the tracker to the ballistics unit.

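The analysis has shown that a 2 ms latency would be unacceptable for the system, since at 1,5 m per millisecond it corresponds to a position error of 3 m against the 0,75 m allowed for timing errors. It is therefore contended that other methods are required to recover the timing information.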

6.3 Synchronisation and Timestamping

The most appropriate method of circumventing the latency and jitter problem is by synchronising the tracker and ballistics application tasks and then timestamping the time-critical data transfer (such as target position, velocity and acceleration).
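To illustrate why timestamping removes the dependence on transfer latency, the following sketch shows a consumer extrapolating a timestamped target sample to its own (synchronised) current time. The one-dimensional constant-acceleration model, the class `TargetSample` and the function `extrapolate_position` are assumptions made for this illustration and are not the filter design of the system described.

```python
from dataclasses import dataclass


@dataclass
class TargetSample:
    timestamp_s: float             # synchronised clock time at which the tracker sampled the target
    position_m: float              # position along the flight path (one dimension for simplicity)
    velocity_m_s: float            # velocity along the flight path
    acceleration_m_s2: float = 0.0


def extrapolate_position(sample: TargetSample, now_s: float) -> float:
    """Predict the target position at now_s from a timestamped sample."""
    dt = now_s - sample.timestamp_s  # age of the data, including any LAN and PBB latency
    return (sample.position_m
            + sample.velocity_m_s * dt
            + 0.5 * sample.acceleration_m_s2 * dt * dt)


if __name__ == "__main__":
    # Sample taken at t = 10,000 s and consumed 2 ms later (worst case FDDI token rotation)
    sample = TargetSample(timestamp_s=10.0, position_m=2000.0, velocity_m_s=-1500.0)
    print(f"Corrected position: {extrapolate_position(sample, 10.002):.2f} m")
```

Under these assumptions the residual error depends on the accuracy of clock synchronisation and of the motion model rather than on the transfer latency itself.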

6.4 Network Time Protocol

Sub-system Network Interface Cards (NICs) are readily synchronised by means of a Network Time Protocol (NTP) (refer to Appendix F). It has been proposed in Appendix F that a NIC-NIC synchronisation accuracy of 220 µs is achievable for a typical FDDI LAN of 50 nodes and 2 500 m circumference. CPU-NIC transfer of data is still required, with further latency and jitter being contributed by the interconnecting medium, typically a parallel backplane bus (PBB). Such latencies are in the order of 50 to 150 µs for unsolicited messages and 225 to 500 µs for solicited messages (refer to Paragraph 8.4.5 of the main section). Therefore, using NTP and unsolicited messages, a CPU-CPU synchronisation accuracy of 500 µs should be achievable.

6.5 Internal NTP

An extension to the LAN-based NTP could provide synchronisation between NIC and CPU (i.e. internal to the sub-system) and therefore between distributed CPUs. Applications running on these CPUs can use this synchronisation as well as timestamping of data messages with synchronised clock time to effectively filter out the effect of PBB latency and jitter.

6.6 Direct Approach to System Timing

A direct approach to reduction of these timing errors, e.g. by raw performance of the data communication system, is neither feasible in terms of available technology nor cost effective. Therefore an indirect method is proposed, i.e. timestamping of time-critical data by means of a Network Time Protocol.

6.7 Derivation of System TTRT and Synchronous Bandwidth Allocation

An iterative approach should be taken to determine an optimal TTRT, NTP synchronisation accuracy and synchronous bandwidth allocation for the particular system.

6.8 Derivation of Precise Timing Requirements

In order to derive precise timing requirements, more rigorous analysis and detailed simulation should be undertaken.

Appendix I

Dataflow Interface Management

Appendix I

1. Scope
1.1 Scope
1.2 Introduction
1.3 Appendix Layout

2. Data Interface Definition

3. Data Interface Management

4. Message Attributes
4.1 Source Sub-System
4.2 Destination Sub-system
4.3 Maximum Message Length
4.4 Payload Data Type
4.5 Repetition Rate
4.6 Transfer Type
4.7 Transfer Mode
4.8 Maximum Transfer Latency
4.9 Maximum Transfer Jitter
4.10 Precedence
4.11 Service Type
4.12 Sample Rate/Consume Rate
4.13 Message Type
4.14 Message Sub-Type
4.15 Message Format
4.15.1 Message ID
4.15.2 Parameter Name
4.15.3 Parameter Resolution
4.15.4 Parameter Accuracy
4.15.5 Parameter Value Limits
4.15.6 Parameter Typical Value
4.15.7 Parameter Bit Representation
4.15.8 Parameter Description
4.15.9 Units of Measure
4.15.10 Byte Representation

List of Tables

Table I : Message Attributes

1. Scope

1.1 Scope

This appendix provides some insight into appropriate strategies and mechanisms for interface management of the dataflow of typical real-time, mission-critical distributed systems.

1.2 Introduction

The ISO OSI 7-layer model includes the Presentation Layer at Layer 6. The Presentation Layer is responsible for the syntax of the data during transfer between two peer application layer protocol entities. To achieve true open systems interconnectivity, a number of common abstract data syntax formats have been defined for use by application layer entities together with associated transfer (or concrete) syntaxes.

It has been recognised, however, that implementation of the ISO Presentation Layer protocol can detract from real-time performance. Other methods are therefore required to ensure consistency of data interpretation within real-time, mission-critical, distributed systems.

Other dataflow attributes also need to be defined and set for each data entity, in particular precedence, priority and transfer mode. Apart from facilitating consistent data interpretation, these allow the definition of system-level error control and priority management schemes.

1.3 Appendix Layout

This appendix introduces the requirements for data interface management, provides a table of typical message attributes and then provides a detailed, textual description of each identified message attribute.

2. Data Interface Definition

Real-time, mission-critical, distributed systems should follow open systems principles as far as possible. However, the most important requirements for these systems are performance and dependability and not interoperability. A priori data interface definition and rigorously managed data interface control are therefore critical in achieving these requirements.

Data interface design has to include identification of every critical data entity and its complete definition in terms of parameters, tolerances, bounds, priority, precedence as well as bit and byte ordering.

An effective method of capturing data interface definitions is by means of a computerised Information Flow Database (IFD).

3. Data Interface Management

Data interface management has to include clear allocation of responsibility for definition and specification, acceptance, qualification and configuration management.

The Information Flow Database (IFD) should be managed by the System Integration Authority (SIA) and maintained under their configuration control. The IFD should contain the messages (or information) that will be transferred on the System LAN with all their attributes as defined by the sub-system designers.

The attributes of a message also define its transfer characteristics on the LAN. These characteristics will be used to test the performance of the LAN in meeting the timeliness requirements of the integrated system. Each sub-system designer should therefore specify the attributes of all the messages that are produced and consumed by his sub-system. These attributes will be either according to the constraints placed on him by his sub-system or to what is required from his sub-system to produce the required inputs to other sub-systems.

The SIA will capture all the messages supplied by the sub-system designers into the IFD. The SIA will also determine in conjunction with the LAN developer additional attributes for the messages and these will also be entered into the IFD. The IFD should then be distributed to all the sub-system designers for their information and usage during sub-system integration into the system. Any changes to the data in the IFD will be passed through the SIA who will update the IFD and redistribute the updated IFD data. The SIA also uses the complete IFD to integrate, qualify and optimise the complete system.

4. Message Attributes

A message (or information) can be defined with the following attributes and definition responsibilities :

Message Attribute                     Responsibility
Source Sub-System
 - Sub-System Name                    System Designer
 - Application Name                   Sub-System Designer
Destination Sub-System
 - Sub-System Name                    System Designer
 - Application Name                   Sub-System Designer
Maximum Message Length                Sub-System Designer
Payload Data Type                     Sub-System Designer
Repetition Rate                       Sub-System Designer
Transfer Type                         Sub-System Designer
Transfer Mode                         Sub-System Designer
Service Type                          Sub-System Designer
Maximum Transfer Latency              Sub-System Designer
Maximum Transfer Jitter               Sub-System Designer
Priority                              Sub-System Designer
Precedence                            Sub-System Designer
Sample Rate/Consume Rate              Sub-System Designer
Message Timing Relationships          Sub-System Designer
Message Type                          System Dataflow Manager
Message Sub-Type                      System Dataflow Manager
Message Format
 - Message ID                         System Dataflow Manager
 - Parameter Name                     Sub-System Designer
 - Parameter Resolution               Sub-System Designer
 - Parameter Accuracy                 Sub-System Designer
 - Parameter Value Limits             Sub-System Designer
 - Parameter Typical Value            Sub-System Designer
 - Parameter Bit Representation       Sub-System Designer
 - Parameter Description              Sub-System Designer
 - Units of Measure                   Sub-System Designer
 - Byte Representation                Sub-System Designer

Table I : Message Attributes

4.1 Source Sub-System

The Source Sub-System attribute of a message defines the producer of the information. The Source Sub-System attribute has two fields, namely :

• Sub-System Name

• Application Name

The Sub-System Name is the name for the total sub-system connected to the LAN.

The Application Name is the name of the application or task on the host CPU that generates the information (e.g. Target Tracker Task).

4.2 Destination Sub-system

The Destination Sub-System attribute of a message defines the consumer of the information. The Destination Sub-System attribute has two fields, namely :

• Sub-System Name

• Application Name

These fields are defined in the same way as for the Source Sub-System attribute.

4.3 Maximum Message Length

The Maximum Message Length attribute of a message specifies the maximum message length (including the Message Identifier) that will be transferred on the LAN. This value is expressed in bytes.

4.4 Payload Data Type

The Payload Data Type attribute specifies the type of data payload of a message. Typically these are control data, uncompressed digital video, compressed digital video, compression type, audio or image.

4.5 Repetition Rate

The Repetition Rate attribute of the message specifies the update rate of a message. The Repetition Rate, Maximum Message Length, Message Type, Transfer Mode and Precedence attributes are used to determine the Error Management Policy as well as calculate the loading of the LAN.
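As a rough sketch of the loading calculation mentioned above, the offered load can be estimated from the Repetition Rate and Maximum Message Length of each message. The function name and the example message set are hypothetical, and the estimate deliberately ignores frame, token and protocol overheads; the only external figure used is the nominal FDDI data rate of 100 Mbit/s.

```python
FDDI_BIT_RATE = 100_000_000  # nominal FDDI data rate in bits per second


def lan_load_fraction(messages) -> float:
    """Estimate the fraction of FDDI capacity used by a set of periodic messages.

    `messages` is an iterable of (repetition_rate_hz, max_message_length_bytes)
    pairs. Framing, token and protocol overheads are NOT included (assumption).
    """
    bits_per_second = sum(rate_hz * length_bytes * 8
                          for rate_hz, length_bytes in messages)
    return bits_per_second / FDDI_BIT_RATE


if __name__ == "__main__":
    # Hypothetical message set: (repetition rate in Hz, maximum length in bytes)
    example = [(100, 64), (50, 256), (10, 1024)]
    print(f"Estimated payload load: {100 * lan_load_fraction(example):.3f}% of capacity")
```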

4.6 Transfer Type

The Transfer Type attribute specifies the data transfer type of message. The following transfer types are available :

• Unicast (i.e. only to a single destination).

• Multicast (i.e. to a group of destinations).

• Broadcast (i.e. to all the sub-systems on the LAN).

• Monitoring (i.e. the message is not acted upon by the destination sub-system, but it monitors the message).

4.7 Transfer Mode

The Transfer Mode attribute of a message specifies whether the message transfer is periodic or aperiodic or requires special attention in terms of Quality of Service (QoS). Periodic messages are state-type messages and are allocated to FDDI synchronous mode. Aperiodic messages are event-type messages and are allocated to FDDI asynchronous mode.

4.8 Maximum Transfer Latency

The Maximum Transfer Latency attribute of a message specifies the maximum allowable latency from when the information was sampled and readied for transmission by the source application to the time that the destination application receives the information in its buffer. The latency is specified in milliseconds and will be the latency that the relevant system function can tolerate.

4.9 Maximum Transfer Jitter

The Maximum Transfer Jitter attribute specifies the maximum allowable tolerance from the specified repetition rate of the transfer.

4.10 Precedence

The Precedence attribute defines ranking of the message in terms of functional importance. Precedence may be either critical, major or minor.

4.11 Service Type

The Service Type defines whether the source sub-system requires an acknowledgement from the destination sub-system. A message may have the following Service Type attributes :

• Unreliable Datagram : Datagram-type message with no acknowledgement from the destination NIC being required.

• Reliable Datagram : Datagram-type message with acknowledgement from the destination NIC being required.

• Unreliable Transaction : Transaction-type message with no acknowledgement from the destination NIC being required.

• Reliable Transaction : Transaction-type message with acknowledgement from the destination NIC being required.

• Connection-Oriented : Transport-level connection-type message with acknowledgement from the destination NIC being required.

• Connectionless with Host Acknowledge : Transport-level connectionless message with the destination application being required to generate a message to acknowledge the reception of the message by the host.

• Connection-Oriented with Host Acknowledge : Transport-level connection-oriented message with the destination application being required to generate a message to acknowledge the reception of the message by the host.

• Isochronous : Messages of fixed sample size and repetition rate, normally derived from sensor or sampled continuous data (such as video and audio).

• Critical Virtual Circuit : Connection-oriented messages allocated to FDDI synchronous mode with high levels of priority and precedence.

4.12 Sample Rate/Consume Rate

The Sample Rate/Consume Rate attribute specifies the rate at which a source application can generate information or the rate at which the destination application requires information to perform its tasks.

4.13 Message Type

The Message Type attribute specifies the functional type of message, e.g. target data, environmental data, navigation data.

4.14 Message Sub-Type

The Message Sub-Type attribute specifies the functional sub-type of message, e.g. air target, atmospheric data, position data.

4.15 Message Format

4.15.1 Message ID

The Message ID attribute uniquely identifies a message on the LAN. The Message ID consists of the following fields :

• Source Sub-System.

• Message Type.

• Message Sub-Type.

• Message Serial Number.

4.15.2 Parameter Name

A message on the LAN consists of a concatenation of parameters. The Parameter Name attribute identifies each parameter.

4.15.3 Parameter Resolution

The Parameter Resolution attribute specifies the resolution of the data in each parameter.

4.15.4 Parameter Accuracy

The Parameter Accuracy attribute specifies the accuracy of the data in the parameter.

4.15.5 Parameter Value Limits

The Parameter Value Limits attribute specifies the upper and lower limits of the parameter.

4.15.6 Parameter Typical Value

The Parameter Typical Value attribute specifies a typical value for a parameter. This can be used for simple data interface tests and simulations.

4.15.7 Parameter Bit Representation

The Parameter Bit Representation specifies the encoding of the parameter in bits.

4.15.8 Parameter Description

The Parameter Description gives a short description of the parameter and each field in the parameter.

4.15.9 Units of Measure

The Units of Measure provide the engineering units of each parameter if applicable.

4.15.10 Byte Representation

The Byte Representation attribute specifies how the parameters are packed in bytes.
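By way of illustration, a subset of the message attributes described in this section could be captured as a typed record in an Information Flow Database. This is a sketch under stated assumptions: the class names, field names, units and the example entry are hypothetical and do not represent an IFD schema defined in this thesis; only the enumerated values and the periodic/aperiodic allocation to FDDI modes are taken from Paragraphs 4.6, 4.7 and 4.10.

```python
from dataclasses import dataclass
from enum import Enum


class TransferType(Enum):          # Paragraph 4.6
    UNICAST = "unicast"
    MULTICAST = "multicast"
    BROADCAST = "broadcast"
    MONITORING = "monitoring"


class Precedence(Enum):            # Paragraph 4.10
    CRITICAL = "critical"
    MAJOR = "major"
    MINOR = "minor"


@dataclass(frozen=True)
class MessageDefinition:
    """One hypothetical IFD entry holding a subset of the Table I attributes."""
    source_sub_system: str
    source_application: str
    destination_sub_system: str
    destination_application: str
    max_message_length_bytes: int
    repetition_rate_hz: float
    transfer_type: TransferType
    periodic: bool                 # Transfer Mode: periodic (state) or aperiodic (event)
    max_transfer_latency_ms: float
    max_transfer_jitter_ms: float
    precedence: Precedence

    def fddi_mode(self) -> str:
        """Paragraph 4.7: periodic messages use FDDI synchronous mode, aperiodic asynchronous."""
        return "synchronous" if self.periodic else "asynchronous"


if __name__ == "__main__":
    # Example entry with illustrative (not specified) names and values
    target_position = MessageDefinition(
        "Tracker", "Target Tracker Task", "Ballistics", "Ballistics Task",
        64, 100.0, TransferType.UNICAST, True, 0.5, 0.5, Precedence.CRITICAL)
    print(target_position.fddi_mode())
```

Making the record immutable (`frozen=True`) reflects the intent that changes to the IFD pass through the System Integration Authority rather than being made ad hoc.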

Appendix J

Recommended Products and Standards

Appendix J

1. Scope

2. Recommended Standards
2.1 International and National Standards
2.2 Recognised Body Standards
2.3 De-Facto Standards
2.4 Standard Building Blocks

3. Recommended Products
3.1 Real-Time Operating Systems
3.2 Software Language Compilers
3.3 FDDI Network Interface Cards
3.4 XTP V4.0 Protocol Software

List of Tables

Table I : Recommended Standards
Table II : Recommended Options
Table III : Recommended Real-Time Operating Systems
Table IV : Recommended High-level Software Language Compilers
Table V : Recommended FDDI Network Interface Cards
Table VI : Recommended XTP V4.0 Protocol Software

1. Scope

This appendix provides lists of standards authorities, as well as tables of recommended standards and products that are considered appropriate for the design and construction of real-time, mission-critical, distributed systems.

2. Recommended Standards

The solution should be strictly standards-based. Only where no existing, useable standards exist should proprietary standards or products be developed.

It is proposed that real-time, mission-critical, distributed systems should be constructed using products based on the standards indicated in Table I.

2.1 International and National Standards

Where international or national standards exist, these should be employed. International standards include ISO and ITU (CCITT) standards. National standards include US ANSI, US DOD, UK MOD, RSA SANDF and RSA SABS standards. This should be obligatory for military and government systems and is strongly recommended for all open systems.

2.2 Recognised Body Standards

Where international and national standards do not exist, the standards of recognised non-profit organisations should be adopted. These include IEEE, Internet Engineering Task Force (IETF), XTP Forum and ATM Forum.

2.3 De-Facto Standards

Where neither international, national nor recognised body standards exist, de-facto standards of international corporations should be adopted. These include those developed by IBM (International Business Machines), HP (Hewlett Packard), Bell Labs (AT&T), DEC (Digital Equipment Corporation), Intel Corporation and Xerox Corporation.

2.4 Standard Building Blocks

It is proposed that the standard building blocks indicated in Table II, and based on commercial technology, are appropriate for the development of real-time, mission-critical, distributed systems.

Layer No. | Layer Description | Standard | Standard Description | Standards Reference | Standards Organisation | Standard Type
 | LAN Profile | SAFENET | Survivable Adaptable Fibre Optic Embedded Network | MIL-STD-2204A | US DOD | National
0 | Cable | General Specification for Fibre Optic Cable | 50 µm/125 µm or 62,5 µm/125 µm Multimode Fibre Optic Media | MIL-C-85045 | US DOD | National
1a | Physical | FDDI | Physical Media Dependent Protocol | ISO 9314-3 | ISO | International
 | | | | ANSI X3.166 | ANSI | National
1b | Physical | FDDI | Physical Layer Protocol | ISO 9314-1 | ISO | International
 | | | | ANSI X3.148 | ANSI | National
2a | MAC | FDDI | Dual-Counter Rotating Ring with Timed Token | ISO 9314-2 | ISO | International
 | | | | ANSI X3.139 | ANSI | National
2b | LLC | IEEE 802.2 | Logical Link Control Type 1 | ISO 8802.2 | ISO | International
 | | | Logical Link Control Type 4 | IEEE 802.2 | IEEE | National
3 | Network | MIL-STD-1777 | Internet Protocol | RFC 791 | IETF | International
 | | | | MIL-STD-1777 | US DOD | National
4 | Transport | XTP V4.0 | Xpress Transport Protocol | XTP V4.0 | XTP Forum (SAFENET Sponsored) | International
 | | TCP | Transmission Control Protocol | RFC 793 | IETF | International
 | | | | MIL-STD-1778 | US DOD | National
7 | Application | NTP | Network Time Protocol | RFC 1305 | IETF (SAFENET Sponsored) | International
8 | Operating System | POSIX Real-Time Extensions | Portable Operating System Extensions | IEEE P1003.1, IEEE P1003.4, IEEE P1003.4a | IEEE | National

Table I : Recommended Standards
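Because Table I is organised by layer, the recommended profile can be held in a small lookup structure, for example when checking that each layer of a design has a nominated standard. The following Python sketch is illustrative only: the names `RECOMMENDED_PROFILE` and `standards_for` are hypothetical, and the entries are transcribed from the Layer No., Layer Description and Standard columns of Table I.

```python
# Layer number -> (layer description, recommended standards), from Table I.
# The dictionary name and layout are assumptions made for illustration.
RECOMMENDED_PROFILE = {
    "0": ("Cable", ["General Specification for Fibre Optic Cable"]),
    "1a": ("Physical", ["FDDI"]),
    "1b": ("Physical", ["FDDI"]),
    "2a": ("MAC", ["FDDI"]),
    "2b": ("LLC", ["IEEE 802.2"]),
    "3": ("Network", ["MIL-STD-1777"]),
    "4": ("Transport", ["XTP V4.0", "TCP"]),
    "7": ("Application", ["NTP"]),
    "8": ("Operating System", ["POSIX"]),
}


def standards_for(layer: str) -> list:
    """Return the standards listed in Table I for a layer number.

    An empty list means that Table I has no entry for that layer.
    """
    _description, standards = RECOMMENDED_PROFILE.get(layer, ("", []))
    return list(standards)


if __name__ == "__main__":
    for layer in ("2b", "4", "5"):
        print(layer, standards_for(layer))
```

Layers for which Table I lists no entry, such as layer 5 in the usage example, return an empty list.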

| Layer No. | Product/Standard Guideline | Description | Organisation | Standard Type |
|---|---|---|---|---|
| 7 | MIL-HDBK-818A | SAFENET Application Programming Interface | US DOD | National |
| 8 | iRMXIII | Multitasking Real-Time Operating System | Intel | De-facto |
| | LynxOS | Multitasking, Multithreaded, Real-Time Unix-compatible Operating System | Lynx Real-Time Systems | De-facto |
| | VxWorks | Multitasking, Multithreaded, Real-Time Unix-compatible Operating System | Wind River Systems | De-facto |
| 9 | HP OpenView | Network Management | Hewlett-Packard | De-facto |
| 9 | C++ | Object-Oriented High Level Language | Watcom, Borland, Microsoft, Cygnus | De-facto |
| | MIL-STD-1815A/B (Ada, Ada94) | Structured High Level Language | US DOD | National |
| 9 | SQL 3 | Database Structured Query Language | | |
| Parallel Backplane Bus (PBB) | IEEE896.3 Futurebus+ | High Performance 64-bit/128-bit Open PBB | IEEE | National |
| | Peripheral Component Interconnect (PCI) | High Performance 32-bit/64-bit Parallel Backplane Bus for Personal Computers | Intel | De-facto |
| | IEEE1386.1 PCI Mezzanine Card (PMC) (based on PCI) | High Performance 32-bit/64-bit Open PBB for SBus, VME, Multibus II, Futurebus+ | IEEE | National |

Table II : Recommended Options

3. Recommended Products

3.1 Real-Time Operating Systems

The following real-time operating systems are recommended for the environments indicated :

| Real-Time Operating System | Current Version | Manufacturer | Support Environments |
|---|---|---|---|
| iRMXIII | 2.2 | Intel | Intel Computing Platforms |
| VxWorks | 5.3 | Wind River Systems | VME Computing Platforms; Multibus II Computing Platforms |
| LynxOS | 2.4 | Lynx Real-Time Systems | PC Platforms; VME Computing Platforms |

Table III : Recommended Real-Time Operating Systems

3.2 Software Language Compilers

The following high-level software language compilers are recommended for the environments indicated :

| Software Language | Compiler | Environment |
|---|---|---|
| C | ANSI C | 16-bit environment; Standard environment |
| | Watcom C | 32-bit environment with iRMXIII Real-Time OS |
| C++ | Watcom C/C++ | VxWorks Real-Time Unix OS |
| C++ | Watcom C/C++ | LynxOS Real-Time Unix OS |
| Ada94 | Alsys | |

Table IV : Recommended High-level Software Language Compilers

3.3 FDDI Network Interface Cards

The following FDDI network interface cards are recommended for the platforms indicated :

| Platform | FDDI Chipset | Options | Designation | Manufacturer | Country |
|---|---|---|---|---|---|
| PC ISA | AMD Supernet 2 | MM Fibre SAS | SK-NET FDDI-FI SAS | SysKonnect | Germany |
| | | Copper SAS | SK-NET FDDI-UI SAS | | |
| | Motorola IFDDI | MM Fibre SAS | 1265 Series | Rockwell Network Systems | USA |
| | | Copper SAS | | | |
| PC EISA | AMD Supernet 2 | MM Fibre SAS | SK-NET FDDI-FE SAS | SysKonnect | Germany |
| | | Copper SAS | SK-NET FDDI-UE SAS | | |
| | | MM Fibre DAS | SK-NET FDDI-FE DAS | | |
| | | Copper DAS | SK-NET FDDI-UE DAS | | |
| | Motorola IFDDI | MM Fibre SAS | 4811-SAS-MM | Interphase Corporation | USA |
| | | SM Fibre SAS | 4811-SAS-SM | | |
| | | Copper SAS | 4811-SAS-UTP | | |
| | | MM Fibre DAS | 4811-DAS-MM | | |
| | | Copper DAS | 4811-DAS-UTP | | |
| | Motorola IFDDI | MM Fibre SAS | 1275 Series | Rockwell Network Systems | USA |
| | | Copper SAS | | | |
| | | MM Fibre DAS | | | |
| | | Copper DAS | | | |
| PC MCA | AMD Supernet 2 | MM Fibre SAS | SK-NET FDDI-FM SAS | SysKonnect | Germany |
| | | Copper SAS | SK-NET FDDI-UM SAS | | |
| | | MM Fibre DAS | SK-NET FDDI-FM DAS | | |
| | | Copper DAS | SK-NET FDDI-UM DAS | | |
| PC PCI | AMD Supernet 3 | MM Fibre SAS | SK-NET FDDI-FP SAS | SysKonnect | Germany |
| | | Copper SAS | SK-NET FDDI-UP SAS | | |
| | | MM Fibre DAS | SK-NET FDDI-FP DAS | | |
| | | Copper DAS | SK-NET FDDI-UP DAS | | |
| | Motorola IFDDI | MM Fibre SAS | 2200-FSS | Rockwell Network Systems | USA |
| | | Copper SAS | 2200-CS | | |
| | | MM Fibre DAS | 2200-FSS | | |
| | | Copper DAS | 2200-CD | | |
| | Motorola IFDDI | MM Fibre SAS | 5511-SAS-MM | Interphase Corporation | USA |
| | | Copper SAS | 5511-SAS-UTP | | |
| | | SM Fibre SAS | 5511-SAS-SM | | |
| | | MM Fibre DAS | 5511-DAS-MM | | |
| | | SM Fibre SAS | 5511-DAS-SM | | |
| PMC | AMD Supernet 3 | MM Fibre SAS | FDDI-PMC-MMF-SAS | C²I² Systems | RSA |
| | | Copper SAS | FDDI-PMC-UTP-SAS | | |
| | | MM Fibre DAS | FDDI-PMC-MMF-DAS | | |
| | | Copper DAS | FDDI-PMC-UTP-DAS | | |
| | Motorola IFDDI | MM Fibre SAS | 4511-SAS-MM | Interphase Corporation | USA |
| | | SM Fibre SAS | 4511-SAS-SM | | |
| | | Copper SAS | 4511-DAS-UTP | | |
| | Motorola IFDDI | MM Fibre SAS | DEFPZ-AA | Digital Equipment Corporation | USA |
| | | Copper SAS | DEFPZ-UA | | |
| SBus | Motorola IFDDI | MM Fibre SAS | 4611-SAS-MM-SC | Interphase Corporation | USA |
| | | Copper SAS | 4611-SAS-UTP-SC | | |
| | | MM Fibre DAS | 4611-DAS-MM-SC | | |
| | | Copper DAS | 4611-DAS-UTP-SC | | |
| GIO | Motorola IFDDI | MM Fibre SAS | 4911-SAS-MM-SC | Interphase Corporation | USA |
| | | MM Fibre SAS | 4911-SAS-UTP-SC | | |
| | | Copper SAS | 4911-SAS-UTP-SC | | |
| VME | AMD Supernet 2 | MM Fibre SAS | PME FDDI-1 MM SAS | Radstone Technology Corporation | USA |
| | | UTP SAS | PME FDDI-1 UTP SAS | | |
| | | STP SAS | PME FDDI-1 STP SAS | | |
| | | MM Fibre DAS | PME FDDI-1 MM DAS | | |
| | | UTP DAS | PME FDDI-1 UTP DAS | | |
| | | STP DAS | PME FDDI-1 STP DAS | | |
| | Motorola IFDDI | MM Fibre SAS | 5211-SAS-ST-1M | Interphase Corporation | USA |
| | | SM Fibre SAS | 5211-SAS-SM | | |
| | | Copper SAS | 5211-SAS-UTP | | |
| | | MM Fibre DAS | 5211-DAS-ST-1M | | |
| | | SM Fibre DAS | 5211-DAS-SM | | |
| | | Copper DAS | 5211-DAS-UTP | | |
| Multibus II | AMD Supernet 2 | MM Fibre SAS | CL486/DA4-SAS | Concurrent Technologies | UK |
| | | MM Fibre DAS | CL486/DA4-DAS | | |

MM : Multimode; SM : Singlemode

Table V : Recommended FDDI Network Interface Cards

3.4 XTP V4.0 Protocol Software

The following XTP V4.0 protocol software is recommended for the environments indicated :

| Environment | Language | Manufacturer | Country |
|---|---|---|---|
| Generic V4.0 Kernel Reference Model | C++ | Sandia National Labs | USA |
| Unix Streams | ANSI C | Mentat, Inc. | USA |
| Intel 80x86 | ANSI C | Network Xpress, Inc. | USA |

Table VI : Recommended XTP V4.0 Protocol Software

Index

10BaseT ...... 151
17B/20B ...... 99
4B/5B ...... 99
8B/10B ...... 99
Abort ...... 116
Absolute time ...... 131
Abstract properties ...... 39
Abstraction ...... 5, 124, 128, 136, 198
Access Control ...... 36, 141
Accessibility ...... 124
Accountability ...... 141
Accounting ...... 135
Accuracy ...... 7, 132
Acknowledge ...... 25, 118, 121
Acknowledgement ...... 86, 87, 114, 117, 122, 182
Acoustic ...... 12
Ada ...... 125
Address Resolution Protocol ...... 106, 110, 171
Address-based ...... 124
Address-driven ...... 24
Address-independent ...... 86, 189
Addressing ...... 8, 100, 108, 188
Addressing Scheme ...... 118, 126
Adjust time ...... 184, 191
Admission control ...... 111
Advanced Micro Devices ...... 169
Aegis ...... 15, 59
Aerodynamic Control Systems ...... 48
Aerospace ...... 8, 44
Affordability ...... 27, 78, 146, 147, 188, 189, 193, 194, 196, 199
Aircraft avionics ...... 2
AIX ...... 179
Alarm ...... 120, 141
Alert ...... 120, 141
Algorithmic simplicity ...... 21
Alignment ...... 21, 117
Allocated requirements ...... 78, 93, 146, 196, 199
Amplitude modulation ...... 99
Analogue sensors ...... 12
Animation ...... 74
ANSI ...... 32, 101, 200
Anti-submarine warfare ...... 156
APIS ...... 7, 8, 96, 125, 140, 166, 169, 170, 189, 198
APIS Architecture ...... 129
APIS Dataflow Control ...... 129
APIS Development ...... 171, 187
APIS Implications ...... 128
APIS Overview ...... 125
APIS Performance ...... 172

APIS Principles of Operation ...... 126
APIS Service User ...... 125, 127
APIS Services ...... 127
APIS Test Shell ...... 172
Application host ...... 127
Application interface layer ...... 193
Application interface protocol ...... 86
Application Interface Services ...... 7, 8, 96, 123-125, 166, 197, 200
Application layer ...... 7, 24, 25, 88, 123, 124, 149
Application layer protocol ...... 189
Application programming interface ...... 124
Application Service Access Point ...... 127
Application Services User ...... 172
Application-specific services ...... 123
Architecture Concept ...... 193
Architecture Concept Demonstration Model ...... 166
Architecture concepts ...... 44
Architecture Demonstration Model ...... 9
Architecture Derivation ...... 93
Architecture models ...... 9, 93
ARP ...... 106, 171
ARPAnet ...... 111
ASAP ...... 127
ASU address ...... 126
ASU administration ...... 125
Asynchronous ...... 82, 101, 118, 119, 134
Asynchronous Mode ...... 103
Asynchronous Transfer Mode ...... 27, 30, 195
ATM ...... 27-30, 40, 74, 101, 105, 114, 116, 150, 151, 156, 158, 195, 198, 201
ATM Adaptation Layer ...... 122
ATM Forum ...... 122
Atomic commitment ...... 85
Atomic group multicast ...... 86
Audio ...... 37, 45, 82, 152, 156, 188, 198
Augmentation ...... 188
Augmentation schemes ...... 158
Authentication ...... 36, 88, 127
Authentication code ...... 141
Authorization ...... 88
Automated control ...... 15
Automatic control ...... 45, 48, 51
Automatic Repeat Request ...... 118
Automation ...... 146, 201
Automotive control ...... 22
Auxiliary time services ...... 8
Availability ...... 7, 9, 28, 57, 59, 85, 141, 147, 151
Avionic control systems ...... 12
Backbone ...... 66, 195
Backplane ...... 27, 31, 137, 162, 198

Backup ...... 140
Bandwidth ...... 4, 12, 21, 22, 25, 57, 80, 98, 99, 102, 123, 157
Bandwidth allocation ...... 146
Bandwidth on demand ...... 149
Bandwidth utilisation ...... 146
Basic Reference Model ...... 93, 109, 197
Battle damage ...... 37, 143
Benchmarking ...... 190
BER ...... 101
Best-effort ...... 22, 81, 82
BIT ...... 138
Bit error rate ...... 30, 101, 104
Bit signalling ...... 104
Bonding ...... 158
Booch notation ...... 172
Bottleneck ...... 117, 193
Bounded inaccessibility ...... 39
Bounded omission degree ...... 39
Bounded transmission delay ...... 39
Bridge ...... 80
Bridges ...... 63, 66, 106, 137, 146-148
Broadcast ...... 24, 25, 39, 79, 85, 86, 126, 169, 171
Buffer ...... 4, 83, 90, 122, 127, 187
Buffering ...... 129
Building blocks ...... 110, 166, 199
Built-in Test ...... 45, 56, 104, 152
Built-in Test Services ...... 96, 124
Burst Control ...... 90, 117, 118, 188
Bus-based ...... 51
Bus-type topology ...... 86
C ...... 172
C++ ...... 125, 170
Cable ...... 199
Cable Layer ...... 98
Cable plant ...... 96, 98, 158, 198, 199
Cable routing ...... 158
Cabling ...... 13
Calendar Time ...... 58, 86, 131, 169, 185
CAM ...... 149
CASE ...... 172
CDDI ...... 201
Cell ...... 41
Centralised architecture ...... 193
Channel setup time ...... 150
Checksum algorithm ...... 186
Circuit setup time ...... 149, 150
Circuit switching ...... 103, 149
Circuit-switched services ...... 189
CLA ...... 169

Class of Service ...... 82, 83, 154, 157, 158
Client process ...... 84
Client-server ...... 108
Client-Server ...... 58, 70
Client-server architecture ...... 84
CLNP ...... 124, 166, 169, 186, 189
Clock ...... 134
Clock frequency ...... 132
Clock recovery ...... 99
Clock setting algorithm ...... 134
Clock time ...... 132
Closed system ...... 163, 197
Closed-loop control ...... 87, 167, 188, 199
Co-ordinated Universal Time ...... 131
Coherency ...... 143, 146
Collaborative processing ...... 37
Collision ...... 148
Collision avoidance ...... 20
Command and control ...... 2, 8, 13, 44, 193
Commercial off-the-shelf ...... 79, 170, 186, 190, 199
Communication
  inter-task ...... 162
Communication fabric ...... 143
Communication logic ...... 126
Communications model ...... 116
Communications protocols ...... 4
Competitive advantage ...... 79
Complexity ...... 44, 124, 138
Compression ...... 29, 74, 82, 153, 157, 198
  Lossless ...... 153
  Lossy ...... 153
Compression Ratio ...... 154, 156
Compression Standards ...... 155
Computation servers ...... 70
Computer resources ...... 163
Comsoft LAN Architecture ...... 169
Concentrator ...... 31, 151
Concentrators ...... 146
Concept demonstrator ...... 200
Concurrency ...... 71, 163
Concurrent Technologies ...... 166, 169
Confidentiality ...... 140
Configuration logic ...... 126
Configuration Management ...... 104
Congestion control ...... 107, 108
Connection ...... 83, 135
Connection management ...... 104, 114, 116, 117
Connection-oriented ...... 28, 107, 108, 114
Connection-Oriented Network Protocol ...... 109

Connection-oriented service ...... 105, 150
Connectionless ...... 107, 108, 110
Connectionless Network Protocol ...... 108, 109
Connectionless service ...... 28
Connectivity ...... 151
Connectivity Issues ...... 158
Connector ...... 199
Connectors ...... 98, 151
Consume Data ...... 128
Consumer ...... 125, 140
Consumer group ...... 127, 171
Contents-addressable memory ...... 149
Context lookup database ...... 115
Context switch ...... 20, 30
Continuous Media ...... 9, 82, 83, 89, 90, 106, 108, 113, 119, 151, 198
Continuous media services ...... 27, 29, 157, 189
Cost ...... 2, 3, 13, 14, 18, 36, 57, 66, 72, 80, 95, 98, 137, 145, 147, 190, 195
Cost-effective ...... 195, 199
Cost-effectiveness ...... 98, 100, 145, 193
Cost/benefit analysis ...... 199
COTS ...... 79, 170, 190
Counter-rotating ring ...... 158
Coupling and Bypass Devices ...... 147
CPU ...... 125, 162
CPU usage ...... 115
CPU/NIC Synchronisation ...... 134
Credit parameter ...... 122
Credit scheme ...... 122
Credit-based ...... 118
Critical control loop ...... 45, 48, 87, 188
Critical virtual circuit ...... 6, 84, 87, 102, 108, 199
Critical Virtual Circuits ...... 150
Cross-talk ...... 158
CSMA ...... 85
CSMA/CD ...... 20
Current loop ...... 12
CVC ...... 150
DAS ...... 146
Data driven approach ...... 126
Data fusion ...... 27, 153, 157
Data interface management ...... 9, 36
Data Link Layer ...... 36, 100
Data Link Layer Protocol ...... 100
Data Multiplex System ...... 188
Data storage ...... 141
Data Transfer Latency ...... 167
Data Transfer Modes ...... 101
Data transfer policy ...... 109, 113, 125
Data transfer services ...... 122

Data-based ...... 124
Data-driven ...... 24, 25
Data-driven approach ...... 7, 126, 128
Data-driven protocol ...... 172
Database ...... 72, 79, 80, 85, 86, 101, 153
Database management ...... 25
Database servers ...... 70
Dataflow ...... 82, 126, 143
Dataflow Control ...... 8, 26, 88, 112, 113, 118, 123, 127, 128, 131, 150, 158, 188, 194, 198
Dataflow integration ...... 4
Dataflow management ...... 170, 193
Dataflow management agent ...... 126
Datagram ...... 84, 105, 108, 110, 117
Datalink ...... 45, 56, 57, 79
Datalink Layer ...... 148
DCT ...... 155
Deadline ...... 35, 81, 88, 119, 123, 198
Debugging ...... 110
Deficiencies ...... 186
Deficiency ...... 187
Definitions ...... 8
Degradation ...... 156
Degradation manager ...... 23
Degraded mode ...... 37
Delay ...... 133
Delay error ...... 133
Department of Defense ...... 110, 111, 113
Dependability ...... 2, 3, 7, 23, 29, 36, 39, 48, 51, 78-80, 85, 104, 105, 135, 136, 147, 151, 158, 163, 193, 199
Deregister ...... 127
Deregister Consumption ...... 128
Deregister Production ...... 128
Derived Requirements ...... 24, 56, 79, 80, 93, 196, 199
Destination address ...... 108
Determinism ...... 8, 19, 22, 81, 100, 106, 167, 188, 189, 194
Deterministic ...... 4, 19, 35, 79, 81, 114, 151, 162, 163, 187, 196
Development tools ...... 186
Device driver ...... 125, 170, 184, 187
Devices driver ...... 201
Diagnostics ...... 104, 135, 138, 148
Digital control loop ...... 150
Digital imagery ...... 32
Digital logic ...... 150
Digital servo loop ...... 48
Digital telephony ...... 25, 86, 199
Digital video ...... 23, 29, 37, 74, 86, 154, 188, 191, 198
Discrete cosine transform ...... 154
Discrete data ...... 45
Discrimination ...... 154, 157

Distributed ...... 37
Distributed clocks ...... 131
Distributed control ...... 13
Distributed control algorithm ...... 131
Distributed control system ...... 8, 11
Distributed operating systems ...... 85
Distributed systems ...... 8
Drift ...... 132
Dual counter-rotating ring ...... 195
Dual-attachment ...... 146
Dual-redundancy ...... 29
Duplicate packet ...... 114
Dynamic configuration control ...... 135
Dynamic priority scheme ...... 119
Dynamic SBA Scheme ...... 102
Efficiency ...... 83, 99
EISA ...... 181
Electro-mechanical ...... 149
Electro-hydraulic servo ...... 48
Electromagnetic compatibility ...... 12, 13, 80, 98, 105, 158, 188, 194
Electromagnetic ether ...... 151
Electromagnetic interference ...... 12, 37, 99
Electromagnetic radiation ...... 37, 99
Electromagnetic susceptibility ...... 158
Electromagnetically compatibility ...... 78
Electronic data interchange ...... 36
Electronic funds transfer ...... 141
EMC ...... 19
Emergent properties ...... 38
Encapsulated IP service ...... 171
Encapsulation ...... 105, 110
Enclaving ...... 63, 66, 144
Encoding ...... 99
Encryption ...... 131, 140, 141
Encryption keys ...... 141
End System ...... 109
End-to-end delay ...... 83
Enhanced transport services ...... 112
Ensemble average ...... 132
Entity Co-ordination Management ...... 104
Environmental hazard ...... 140
Ergonomic ...... 78
Error conditions ...... 124
Error control ...... 7, 25-27, 41, 58, 84, 87, 88, 105, 107, 108, 112, 117, 118, 155, 157, 158, 188, 195
Error control policy ...... 82, 89, 120
Error correction ...... 31, 99
Error detection ...... 31, 39, 99, 100
Error Management ...... 120
Error policy ...... 26

Error rate ...... 31, 83
Error rates ...... 188
Error recovery ...... 105
Error statistics ...... 104
ES-IS ...... 110
Ethernet ...... 20, 30, 40, 100, 146, 148, 151, 179, 184, 187
European Air Traffic Control ...... 188
Event ...... 101, 120
Event and alarm reporting ...... 135
Event data ...... 103
Event showers ...... 22
Event-triggered ...... 22, 27, 81, 82, 101, 198
Event-type ...... 87, 118
Event-type data ...... 108
Evolution ...... 8, 11, 13
Evolutionary prototyping ...... 7, 9, 171, 187, 190
Expandability ...... 6, 7, 78
Expedited data ...... 116
Experimental testbed ...... 7, 9, 166, 167, 172, 200
  Description ...... 167
  Topology ...... 167
Expert systems ...... 79
Extended Message Buffering ...... 129
Extended profile protocol ...... 9, 124, 125, 131
Extended profile services ...... 8, 9, 123
Extensible network management agent ...... 136
External time reference ...... 131
Failure modes, effects and criticality analyses ...... 152
Fast Ethernet ...... 105, 150, 201
Fast Initialisation ...... 133
Fast Negative Acknowledge ...... 118
Fastnack ...... 118, 121
Fault tree analysis ...... 152
Fault-tolerance ...... 5, 16, 27-29, 32, 37, 40, 70, 74, 78, 80, 87, 101, 103, 135, 146, 151, 152, 190, 193-195
FC ...... 31
FDDI ...... 7, 21, 27-29, 31, 40, 46, 61, 66, 86, 96, 99, 101, 104, 111, 114, 122, 132, 133, 137, 146, 152, 156, 158, 159, 166, 167, 179, 184, 188, 189, 194, 195, 198, 201
FDDI chipset ...... 103, 169
FDDI II ...... 103, 158
FDDI LAN ...... 187
FDDI NIC ...... 103
FDDI protocol analyzer ...... 169
FDDI Ring Latency ...... 105
FDDI Synchronous Forum ...... 102
Federalism ...... 193
Federated topology ...... 58, 66
Feedback ...... 122
Feedthrough connector ...... 159

Fibre Amplifier ...... 147, 159
Fibre Channel ...... 30-32, 74, 99
Fibre Distributed Data Interface ...... 29, 194
Fibre optic ...... 19, 80, 140, 147, 194
Fibre optic cable ...... 158
Fibre optic media ...... 31, 98, 105
Fibre Optic Patchchords ...... 159
Fibre optic receivers ...... 140
Fibre-optic media ...... 86
Fibre-type interconnect ...... 147
Fieldbus ...... 17
File transfer ...... 30, 167
File Transfer Protocol ...... 112
File Transfer Services ...... 96
Fileserver ...... 70, 152
Filtering ...... 129
Flash EPROM ...... 152
Flash message ...... 87
Flexibility ...... 2, 5, 7, 9, 19, 22, 24, 27, 57, 59, 109, 124, 126, 135, 143, 146, 186, 189, 199
  lack of ...... 193
Flight control systems ...... 2
Flow Control ...... 25, 26, 41, 83, 87, 89, 105, 107, 112, 115-118, 122, 129, 157, 188
Fly-by-light ...... 99
Fly-by-wire control ...... 22
FMECAs ...... 152
Force multiplication ...... 196
Formal methodology ...... 171, 190, 191, 196
Formal Notation ...... 191
Framegrabber ...... 156
Frequency ...... 132
Frequency modulation ...... 99
Functional Integration ...... 143, 189
Functional performance ...... 57, 199
Gateway ...... 63, 80, 146, 149, 151
Gateways ...... 189
Generalised Rate Monotonic Scheduling ...... 82, 163
Global addressing ...... 107, 108, 112
Global time ...... 169
Global Time Service ...... 131
Go-back-N ...... 25, 115, 118
GOSIP ...... 38
GPS ...... 169, 185
Graceful close ...... 116
Graceful degradation ...... 23, 82, 155, 198
Granularity ...... 81
Graphics ...... 138, 152
Greenwich Mean Time ...... 131
Grounding ...... 158
Group addressing ...... 126

Guaranteed response ...... 48, 81
Guaranteed response service ...... 118
Guaranteed-delivery ...... 113
H.261 ...... 154, 155
Handshake ...... 115
Hard deadlines ...... 35
Hardware platform ...... 163
Hazard ...... 115
HDLC ...... 146
Heterogeneous environment ...... 137
Hewlett-Packard ...... 137
High Performance Network ...... 195
High Performance Network Working Group ...... 26, 30
High-level language ...... 163
High-speed backplane ...... 31
High-Speed Data Bus ...... 201
High-speed LAN ...... 199
High-speed network ...... 195
High-speed technologies ...... 74
Homogenous LANs ...... 148
Horizontal approach ...... 143, 144
Horizontal architecture ...... 61
Horizontal integration ...... 68, 146
Hot swapping ...... 147
HP OpenView ...... 137, 190
HP-UX ...... 137
HPN Model ...... 38
HPNWG ...... 79, 83, 107
Hub ...... 68, 137
Hydraulic Systems ...... 12
I/O architecture ...... 119
IBM ...... 169, 191, 200
IBM Token Ring ...... 21, 100, 119, 146
Identifier ...... 126
IDRP ...... 110
IEEE ...... 31, 105, 135, 163, 200
IEEE 802.2 ...... 105
Image ...... 152, 153, 191, 198
Image processing ...... 32, 101
Implementation agreements ...... 39
Implementer's Agreement ...... 102
In-System Write ...... 152
Industrial process control ...... 8, 44, 45
Industrialisation ...... 201
Industry standards ...... 163
Information management ...... 2
Information management infrastructure ...... 80, 196
Information revolution ...... 2
Information Technology ...... 197

Initialise ...... 127
Integrateability ...... 6, 7, 146, 166
Integrated circuit ...... 152
Integrated Naval Combat System ...... 166
Integration ...... 24
Integrity ...... 71, 83, 88, 107, 140, 195
Integrity checking ...... 135
Integrity control ...... 88
Intel ...... 166, 169
Intelligent NICs ...... 162
Interconnect device ...... 149
Interconnection ...... 3, 4
Interconnection systems ...... 98
Interconnectivity ...... 9, 46, 56, 80, 95, 112, 113, 115, 123, 146, 147, 163, 189
Interface definition ...... 146
Interface Specification
  Host/APIS ...... 167
  Host/BITS ...... 167
  Host/FTS ...... 167
  Host/NTS ...... 167
Interface specifications ...... 37, 39
Interface Test and Verification Tool ...... 170
Interfacing ...... 190
Interlock ...... 36, 88
Intermediate Systems ...... 109
International standards ...... 199
International Standards Organisation ...... 93
Internet ...... 38, 106, 132, 166, 184
Internet Control Message Protocol ...... 110
Internet family ...... 110
Internet Profile ...... 113, 135, 136
Internet Protocol ...... 108, 110, 112, 113, 191, 197
Internet Protocols ...... 116
Internetwork ...... 4, 58, 79, 80, 111, 147, 149
Internetwork topologies ...... 6, 197, 198
Internetwork topology ...... 41, 107, 111, 118, 189, 200
Internetworking ...... 64, 105, 107, 112, 144, 197
Interoperability ...... 2, 39, 111, 123, 189, 197
Interrupt ...... 25, 86
Interrupt handling ...... 162
Interrupt latency ...... 20
IP ...... 96, 106, 112, 166, 169, 171, 189, 200
IP address ...... 110
IP Implementation ...... 171
IP Router ...... 111
IPng ...... 110
IPX ...... 169, 189
IS-IS ...... 110
ISDN ...... 151

Isis ...... 86
ISO ...... 200
ISO 8073 ...... 115
ISO model ...... 7
ISO OSI ...... 95, 196
ISO OSI Basic Reference Model ...... 38
Isochronous ...... 82, 103
ITU ...... 155, 200
IVIT ...... 170
Jitter ...... 21, 58, 83, 87, 113, 131, 133, 134, 162
Jitter Control ...... 57
JPEG ...... 154, 156
Kernel Reference Model ...... 186
Key control ...... 141
Knowledge-based systems ...... 27
KRM ...... 186
LAN ...... 7, 16, 17, 19, 27, 29, 31, 57, 79, 101, 146, 147, 151, 159, 198
LAN Connectivity ...... 146
LAN Profile ...... 8, 39, 88, 109, 116, 131, 199
LAN Profiles ...... 95
LAN topology ...... 189
LANWatch ...... 169
Latency ...... 7-9, 21, 22, 28, 30, 41, 58, 63, 66, 79, 84-87, 100, 101, 105, 110, 111, 113, 124, 131, 134, 144, 147, 148, 158, 162, 186, 189, 194, 195
  FDDI Load ...... 183
  Logical Link Control ...... 178
  Solicited Transfer ...... 177
  Unsolicited Transfer ...... 176
  XTP ...... 179
Latency Control ...... 57, 88, 119, 122, 196
Layer entity ...... 135
LCC Type 1 ...... 106
LCC Type 4 ...... 106
Lifecycle ...... 3, 36, 125, 143, 145, 196, 199
Lifecycle support ...... 199
Lightweight support services ...... 198
Line state identification ...... 104
Link confidence test ...... 104
Linux ...... 184
Live insertion ...... 147
LLC ...... 100
LLC layer ...... 171
LLC Type 1 ...... 105
LLC Type 4 ...... 105, 106, 126
LLC4 ...... 26
Local area network ...... 2, 4, 16, 17, 24, 25, 32, 38, 44, 79, 167
Local clock ...... 132, 185
Logical connectivity ...... 106
Logical Link Control ...... 25, 26, 96, 100, 105

Long-lived connection ...... 107
Long-lived message ...... 108
Lossless Compression ...... 154
Lossy Compression ...... 154
LynxOS ...... 4, 184
MAC ...... 5, 96, 100, 104, 197
MAC address ...... 110
MAC Layer ...... 103
MAC priority ...... 119
MAC-layer ...... 119, 126
MAC-layer priority ...... 119
Mach Operating System ...... 85
Magnetic media ...... 140
Mainframe computer ...... 70
Maintainability ...... 57, 78, 104, 135, 136, 147, 151, 193
Maintenance ...... 40, 44, 135, 147, 152, 194
MAN ...... 29, 101, 151
MAN topology ...... 195
Man-in-the loop control ...... 45
Man-machine interface ...... 136, 138, 167
Managed object ...... 136
Managed Objects ...... 135
Management Agent ...... 136
Management Information Base ...... 136, 137
Management objects ...... 104
Managing application ...... 136
Mass ...... 51, 78, 98
Mass storage ...... 163
Massively parallel systems ...... 31
Maximum Transmission Unit ...... 117, 118
Measurement
  Latency ...... 172
  throughput ...... 172
Measurement Setup ...... 172
Mechanism ...... 25, 40, 83, 88, 112, 113, 117, 118, 188
Media Access Control ...... 100, 104
Media propagation delay ...... 31, 86, 133
Memory ...... 162, 163
Memory management ...... 162
Mentat ...... 170
Message ...... 41
Message classification ...... 126
Message Filtering ...... 129
Message Identification ...... 126
Message Identifier ...... 125
Message passing ...... 123, 163
Message passing ...... 124
Message Priority Scheme ...... 120

Message scheduling ...... 198
Message scheduling policy ...... 82
Message transfer service ...... 112
Message Type ...... 24
Message Type Number ...... 24
Message-type service ...... 116
MIB ...... 136, 137
MIB 2 ...... 138
Microchannel Architecture ...... 179
Microsoft ...... 169
MIL-STD-1553B ...... 101, 146
MIL-STD-1777 ...... 110
MIL-STD-1778 ...... 113
Mining Control ...... 46
Mirrored server ...... 71
Mission-Critical ...... 11, 36
Modelling ...... 166, 186
Modem ...... 79
Modems ...... 151
Modulation ...... 99
Monitoring ...... 138, 153
Motion JPEG ...... 155
MPEG ...... 23, 154, 155
MS-Windows ...... 137
MS-DOS ...... 137, 169, 181
MS-Windows ...... 137
Multi-access ...... 19, 100
Multi-access media ...... 140
Multibus II ...... 134, 166, 169, 170
Multicast ...... 8, 25, 26, 71, 79, 83, 86, 87, 106, 112-114, 117, 118, 126, 127, 140, 153, 167, 169, 182, 185, 186, 188, 189, 199, 201
Multicast Address Scheme ...... 118
Multicast group management ...... 25, 26, 71, 117, 118, 126, 127, 140, 167, 169
Multimedia ...... 27, 29, 30, 44, 46, 79, 83, 102, 152, 156, 157, 189, 195
Multimedia Information System ...... 74
Multimedia information systems ...... 8
Multimode ...... 31, 194
Multimode fibre ...... 98, 158
Multiple access ...... 126
Multiplexing ...... 116, 198
Multiprocessing ...... 3, 20, 25
Multiprocessor ...... 7, 31
Multiprotocol ...... 46, 111, 112, 124, 167, 169, 189
Multiprotocol approach ...... 123
Multitasking ...... 35, 82
Myrinet ...... 30
National Semiconductor ...... 169
Naval combat system ...... 13, 30, 56, 200
Navigation system ...... 131

NetWare ...... 169, 189
Network design ...... 126
Network interface ...... 198
Network interface card ...... 102, 125, 134, 159, 162
Network layer ...... 6, 21, 35, 105, 108, 148, 186, 197, 200
Network layer protocol ...... 107, 197
Network management ...... 8, 125, 135, 138, 148, 190
  philosophy ...... 166
Network management agent ...... 137
Network management platform ...... 137
Network Management Principles ...... 135
Network Management Products ...... 137
Network management protocol ...... 136
Network Management Services ...... 9, 104, 123, 125, 135, 140, 152, 167, 190, 200
Network Management Standards ...... 135
Network management station ...... 103, 138
Network manager ...... 86
Network monitoring ...... 138
Network Operating System ...... 141
Network order ...... 39
Network Profile ...... 189
Network Security Services ...... 140
Network testing ...... 138
Network Time Protocol ...... 7, 31, 96, 123, 132, 166, 189, 200
Network Time Services ...... 7, 8, 96, 123, 131, 196
Network timing services ...... 119
Network topology ...... 100
Network Xpress ...... 170, 179, 180
Next generation ...... 30-32, 53, 58, 63, 70, 152, 194, 195, 201
Next-generation ...... 122
NIC
  intelligent ...... 162
  non-intelligent ...... 162
NMEA 0813 ...... 169
NMS ...... 138, 167
Noerror ...... 121
Non-restricted token ...... 103
Novell ...... 169
NTP ...... 134, 166
NTP algorithm ...... 133
NTP code ...... 184
NTP Development ...... 184, 187
NTS ...... 7, 8, 131
NTS Capabilities ...... 132
NXI XTP ...... 170, 186, 191
Object-oriented methodology ...... 172
OBS ...... 29, 159
Obsolescence ...... 6, 162, 197
Obsolescence management ...... 6, 198, 199

Off-host architecture ...... 125, 131
Offset ...... 132, 133
Onboard processing ...... 131
Open Shortest Path First ...... 111
Open systems ...... 2, 6, 20, 38, 163, 189, 199-201
Operability ...... 78
Operating system ...... 4, 132, 162
Operating systems ...... 163
Operational disconnect ...... 147
Operational requirements ...... 14, 27
Optical buses ...... 31
Optical Bypass Switch ...... 29, 104, 152, 159
Optical bypass switches ...... 147
Optical fibre ...... 19, 101, 133
Optical media ...... 140
Optical power ...... 99
Optical read/write disk ...... 140
Optical transmission power ...... 140
Optically-coupled interconnect ...... 147
Order ...... 80, 88
Ordinary data ...... 116
Orthogonal Approach ...... 117, 188
OS/2 ...... 169
Oscilloscope ...... 172
OSI model ...... 104, 115, 186
OSI Profile ...... 123, 135, 196
OSI protocol ...... 112, 116, 123
OSI Reference Model ...... 125
OSI routing architecture ...... 109
Out-of-band ...... 113, 117, 134
Overload ...... 22, 23, 35, 163, 198
Packet ...... 40
Packet fragmentation ...... 107
Packet radio ...... 79, 151
Packet reassembly ...... 107
Packet switching ...... 149
Packet transmission order ...... 119
Packet-switched network ...... 132, 155
Packet-switched services ...... 189
PAL ...... 156
Paradigm ...... 8, 38, 197, 199, 201
Parallel backplane bus ...... 7, 96, 134, 162, 164, 172, 198
  latency ...... 175
Parallel processing ...... 85, 86
Parameter Management Frames ...... 137
Parameter positioning ...... 21
Parametric Addressing ...... 118
Password ...... 141
Path establishment ...... 83


Payload data...... 195 PC/EISA...... 166 Peer clock...... 132 Performance...... 2, 5-7, 29, 36, 59, 123, 166, 193, 197 Performance characterisation...... 9 Performance guarantee...... 119 Performance measurement...... 9, 166, 169, 200 Performance optimisation ...... 104 Performance tests...... 166 Phase modulation ...... 99 PHY...... 104 PHY Latency...... 86 Physical circuit...... 74, 150 Physical circuits...... 149 Physical Connection Management...... 104 Physical Layer...... 36, 99, 193 Physical Layer Protocol...... 99 Physical range ...... 98 Physical signalling ...... 36 Picture quality ...... 154 Plant automation ...... 11 Platform management system ...... 188 PMD...... 104 PMF...... 137 Pneumatic systems...... 12 Policy...... 25, 26, 83, 88, 112, 113, 117, 188 Portability ...... 163, 190, 191 Portable Operating System Interface Extension ...... 163 POSIX ...... 96, 163 Compliance Test...... 163 Real-Time Extensions...... 163 Threads Interface ...... 163 Pre-emptive ...... 188 Precedence ...... 7, 8, 23, 27, 41, 81, 82, 87, 89, 119-121, 123, 150, 156 Precision...... 131-133 Predictability ...... 20 Presentation layer...... 7, 123, 124 Priority...... 7, 8, 23, 41, 81, 87, 89, 101, 103, 104, 108, 113, 114, 118-121, 150, 157 Priority band ...... 120 Priority event handling...... 23 Priority inversion...... 119 Priority Management...... 119, 120 Priority Manager ...... 119 Priority mapping ...... 119 Priority Message Scheduling...... 112, 114, 117, 123 Priority message service ...... 23 Priority scheduling...... 188 Process control...... 2, 13, 17, 22, 36, 138, 143, 193 Process control plants ...... 12


Processing power...... 154 Processor time ...... 163 Produce Data ...... 127 Producer...... 125 Promiscuous mode ...... 140, 170 Proprietary...... 79, 95, 163, 196, 197 Protocol ...... 35, 149 Protocol analyzer...... 172 Protocol conversion ...... 147-149 Protocol data unit...... 40 Protocol data units...... 105, 110 Protocol Engine...... 117, 195 Protocol engineering ...... 166, 170, 197 Protocol feature ...... 118 Protocol mechanism...... 113 Protocol Optimisation ...... 191 Protocol Overheads ...... 175 Protocol processing ...... 21, 131 Protocol stack...... 7, 172 Protocol stacks...... 133 Prototyping...... 9, 166, 171, 186, 200 Public network...... 195 Public switched network ...... 151 Public switched telephone network ...... 149 Push flag...... 114 Q.93B ...... 122 QoS ...... 83 Quad-redundancy ...... 37, 48 Quad-redundant topology ...... 194 Quality of service ...... 57, 83, 101, 104, 106, 111, 122, 158, 198 Quantisation error ...... 9 Radar ...... 17, 57, 78, 153, 158, 169, 185 Radiation ...... 158 Radio clock ...... 132 Rank...... 41 Rapid prototyping ...... 9, 187, 190 Rapid prototyping ...... 185 Rate Control...... 41, 90, 112, 117, 118, 129, 188 Rational Rose...... 172 Re-configuration control ...... 138 Re-ordering ...... 37 Re-useable objects...... 185 Re-useable software...... 191, 200 Reaction time...... 44, 46, 153 Real-time protocol ...... 201 Real-Time ...... 35 Real-Time LAN Profile...... 7, 8, 93, 95, 96, 101, 108, 112, 123, 136, 150, 166, 196, 199 Real-Time LANs...... 39 Real-time monitoring...... 135


Real-time network...... 163 Real-time operating system...... 20, 21, 46, 85, 125, 162, 163, 169, 191, 198 Real-time protocol...... 4, 117, 122, 158, 166, 199 Real-Time Protocol Stack ...... 96 Receive Message ...... 128 Receive path ...... 133 Recommended Standards and Products ...... 9 Reconfigurability ...... 16, 19, 59, 66, 146, 151, 152, 193 Reconfiguration...... 63, 135, 143, 190, 193 Reconfiguration management ...... 138 Redundancy ...... 40, 80, 89, 152, 154, 188 Redundant ...... 155 Reference Model ...... 93 Register ...... 127 Regular expression...... 126 Relative time ...... 131 Reliability ...... 19, 28, 78, 83, 87, 98, 105, 146, 150, 151 Reliable Datagram...... 84, 108, 112 Remote...... 74 Remote priority processing ...... 83 Remote procedure call...... 84 Reordering...... 115 Repeaters...... 147 Repetition cycle ...... 89, 101 Repetition rate ...... 131, 169 Replicated topology ...... 194 Replication...... 5, 11, 28, 37, 40, 48, 59, 71, 79, 103, 201 Requalification...... 152 Request queue ...... 119 Request/reply...... 107 Request/response ...... 84 Request/response type interaction ...... 114 Reservation Mode ...... 118 Residual error...... 87 Residual error rate...... 83 Resource allocation ...... 83 Resource reservation ...... 111 Resource Reservation Protocol ...... 122 Response time ...... 48, 56 Restricted Mode...... 103 Restricted token...... 103 Retransmission...... 117, 118 Retry...... 120 Retry Count ...... 121 Reverse engineering...... 190 RF communication systems...... 158 RF communications...... 151 RF communications frequencies...... 141 RGB...... 156


Ring circumference ...... 86 Ring Latency Time ...... 105 Ring management ...... 104 Ring topology ...... 31, 133, 146 Ring-type topology ...... 86 RISC...... 181, 190 RISC processor ...... 117 RMK ...... 169 RMX ...... 4, 169 Robotics...... 2, 3, 46 Robustness ...... 166, 193 Rosetta Stone...... 113 Roundtrip ...... 133 Route determination...... 109 Router...... 41, 63, 66, 68, 80, 110, 137, 144-148 Routing...... 41, 108, 111, 112, 171 Routing Information Protocol ...... 111 Royal Navy ...... 14, 188 RS-232...... 146, 169 RS-422...... 146 RSVP...... 122 Ruggedised ...... 199 SAFENET ...... 38, 95, 106, 109, 110, 116, 131, 135, 136, 188, 190, 196, 199, 200 SAFENET guidebook ...... 95 SAFENET Network Development Guidance ...... 123, 135 SAFENET Time Services ...... 131 Safety Critical Circuits...... 88 Safety Virtual Circuit...... 88 Safety-critical...... 11, 36, 37, 48, 58, 87 Sandia National Laboratories...... 170 SAS ...... 146 Satellite ...... 50, 151 Satellite communication ...... 79 SBA...... 102, 138 SBA master ...... 103 SCADA ...... 17 Scalability ...... 20, 80, 193-195 Scalable Coherent Interface...... 31, 99 Scaling factor...... 114 Schedulability ...... 21, 163 Scheduling...... 20, 21, 23, 30, 35, 162 SCI...... 31, 74 SCO Unix...... 169, 184 SDLC...... 146 Secure...... 4 Security ...... 9, 74, 81, 83, 112, 127, 135, 140, 141, 158 Security policy...... 81 Security protocol ...... 36 Seed value ...... 134


Segmentation...... 116, 147 Self-healing ...... 4, 28, 29, 37, 40, 105, 146, 152, 194 Semiconductor logic ...... 149 Send Message ...... 128 Sensor data...... 31, 86 Sequence number...... 122 Sequence space ...... 114, 116 Server farm...... 70 Server Integrity ...... 71 Server mirroring...... 71, 102, 140 Server pool ...... 70 Server process ...... 84 Service call...... 198 Service definition ...... 146 Service discrimination...... 113, 123 Service model...... 83, 106, 107, 123 Service option ...... 127 Service provider...... 124 Session layer ...... 7, 123, 124 Shared state ...... 3, 38 silicon protocol ...... 117 Simple Network Management Protocol...... 124, 136 Simulation ...... 25, 74, 86 Simulator...... 167, 170 Simulator Development...... 185, 187 Single point of failure ...... 11, 28, 40, 68, 152, 194 Single-attached...... 151 Single-attachment ...... 146 Singlemode fibre ...... 32, 98 Size...... 2, 51, 78, 98, 100 Skew...... 132 Slack ARQ...... 158 Sliding window ...... 25, 118, 122 Sliding window scheme ...... 122 SMT...... 96, 104, 137 SNAP...... 106 SNMP...... 136, 137 SNMP MIB ...... 137 Software compilers ...... 186, 191 Software engineering...... 21 Software Language Compilers ...... 186, 191 Software languages ...... 4 Software re-use ...... 187 Solution Derivation ...... 194 Sonar ...... 17, 156, 158 SONET...... 27, 28, 156 South African Navy...... 200 Space Launch Vehicle ...... 50 Space radiation ...... 51


Space shuttle ...... 2 Spare Capacity...... 196 Spatial errors ...... 9 Special processing...... 83 Special services ...... 108, 112, 113 Splice...... 98 SPX ...... 169, 189 ST-II...... 122 Stability ...... 132, 163 Standard...... 24, 31 Standard components ...... 7 Standard protocols...... 39 Standardisation...... 195, 200 Standardisation...... 93 Standards...... 2, 5, 6, 196 Star-connected network ...... 151 Star-type topology ...... 28, 31, 86 State...... 101, 120 State data ...... 101 State information...... 110, 115 State-type data...... 83, 108 State-type message...... 84 Static priority scheme ...... 119 Static SBA Scheme ...... 102 Station Management ...... 104 Station Management Layer ...... 137 Statistical performance measurement ...... 135 Statistics...... 104 Status conditions ...... 124 Status event ...... 126 Status table...... 126 Strategy ...... 2, 36 Stream-oriented service...... 116 Streams...... 170 Structured Query Language ...... 71, 85 Stuck beacon detection ...... 104 Sub-Network Access Protocol...... 106 Sub-type...... 126 Supernet II...... 169 Supervisory, Control and Data Acquisition ...... 17 Supportability ...... 98 Supportable ...... 78 Surveillance radar ...... 153 Survivability ...... 3, 37, 57, 63, 66, 79, 80, 85, 143, 144, 146, 158, 193, 194, 197 Survivable Adaptable Fibre Optic Embedded Network ...... 95 Switch ...... 27, 28, 30-32, 40, 63, 74, 80, 105, 144, 146, 149, 151 Switching...... 107 Synchro/digital...... 146 Synchronisation ...... 7, 8, 30, 31, 46, 58, 86, 99, 123, 131-133, 162, 167, 169, 184, 187, 189, 190, 193


accuracy ...... 195 Synchronisation algorithm...... 133 Synchronisation protocol...... 134 Synchronisation Seed...... 133 Synchronous ...... 82, 101, 118, 121, 169, 196, 201 Synchronous bandwidth allocation ...... 102, 123 Synchronous Bandwidth Allocator ...... 102, 138, 191, 201 Synchronous Mode ...... 101, 150 SysKonnect ...... 166, 169, 183 System...... 37 System Applications ...... 44 System Architecture...... 5, 8, 9, 13, 17, 79, 93 System behaviour...... 167 System BIT ...... 152 System Cable Plant ...... 158 System Data Manager ...... 125 System dataflow...... 124, 125, 144, 147, 170 System Dependability ...... 151 System design ...... 126, 147, 166 System Design Document ...... 167 System effectiveness ...... 78, 151, 197 System engineering ...... 28, 199 System engineering management ...... 193 System engineering methodology ...... 5 System engineering process...... 146, 152 System failure ...... 151 System Fibre Cable Plant ...... 158 System health monitoring ...... 135 System integration...... 79, 163 System integrator...... 125 System integrators...... 44 System Management ...... 45 System performance...... 12, 143 System requirement...... 166 System requirements ...... 8, 44, 78, 93, 193 System segmentation...... 146 System solution ...... 193, 199, 200 System Specification...... 167 System timing ...... 167 System upgrade ...... 193 Systematic approach ...... 146 Table administration ...... 127 Tagged data ...... 134 Target Token Rotation Time...... 122 Task response ...... 20 Task response times ...... 162 Task switching...... 35 Tasking...... 163 Tasking model...... 137


TCP ...... 23, 83, 112, 113, 123, 166 TCP Deficiencies...... 114 TCP error control...... 115 TCP Priority...... 114 TCP state machine...... 115 TCP Suitability ...... 115 TCP timestamp...... 114 TCP transaction time...... 115 TCP/IP...... 124, 169, 186, 189 TCU...... 159 Teleradiology...... 36 Test and evaluation ...... 81 Test Equipment ...... 174 Test Scenarios ...... 174 Throughput ...... 8, 13, 27, 32, 74, 79, 83, 98, 111, 113, 140, 147, 153, 158, 186, 188-190, 194 XTP...... 180 Tightness ...... 39, 86 Time delay...... 37 Time offset...... 132 Time token protocol...... 134 Time-Triggered Protocol ...... 22 Time-triggered...... 22, 101 Time-triggered protocol...... 48 Time-wait state ...... 115 Timed-token protocol ...... 21, 101 Timeliness ...... 4, 7, 30, 39, 80, 83, 88, 106, 146, 147 Timer...... 115, 162 Timeslot...... 149 Timestamp...... 131-134 Timestamp age...... 134 Timestamping ...... 7, 88, 112, 113, 132, 134, 189, 196 Timing...... 20, 21, 25, 35, 41, 100, 147, 163, 170, 190, 193 Timing functions ...... 131 Timing mechanisms...... 132 Timing primitives ...... 131 Timing services ...... 132 Token access ...... 85 Token bus...... 21, 119 Token passing ...... 20 Token ring ...... 21 Token ring protocol...... 133 Token rotation ...... 134 Topology ...... 9, 27, 28, 98, 100, 143, 146, 158, 163 TP4...... 23, 83, 115, 124, 166, 169, 186, 189 TP4 Deficiencies ...... 116 TP4 Priority Scheme ...... 116 TP4 Suitability ...... 116 Traceability ...... 146 Traffic descriptor...... 111, 117


Traffic pattern ...... 146 Traffic profile...... 82, 150 Training...... 57 Transaction...... 71, 84, 88, 108, 112 Transaction protocol ...... 85 Transfer layers...... 124 Transfer protocol...... 170, 200 Transfer protocols ...... 157 Transformer-coupled interconnect ...... 147 Transmission Control Protocol ...... 113 Transmission mode ...... 120, 121 Transmit path...... 133 Transparency...... 2, 6, 7, 80, 107, 112, 198 Transparent ...... 124, 172 Transport layer...... 6, 21, 26, 35, 40, 105, 106, 108, 125, 186, 197, 200 Transport layer priority ...... 119 Transport Layer Protocol...... 112, 166, 197 Transport Layer protocols...... 112 Transport Layer Requirements ...... 113 Transport protocol...... 4, 22-24, 115, 170, 200 Transport Protocol Data Units...... 116 Tree topology ...... 146 Trunk Coupling Units ...... 159 TTRT...... 123 TUBA...... 110 Type...... 126 UDP...... 96 Ultra-high performance network...... 105 Unacknowledged connectionless service ...... 105 Unicast...... 25 Universal Portable Protocol Stack...... 169 University of Virginia ...... 111 Unix...... 132, 137, 170, 187, 191, 201 Unreliable network ...... 116 Upgradeability ...... 2, 19, 22, 24, 44, 57, 59, 78, 143, 152, 163, 193, 196, 197, 199 Urgent pointer ...... 114 US Navy...... 31, 59, 63, 64, 95, 188 User requirements ...... 6, 31, 78, 199 User-level priority ...... 119 UTC...... 132 Validity period...... 88 Vertical approach...... 143 Vertical integration ...... 66, 146 Vetronics ...... 53 Video...... 45, 82, 152, 153 compression ...... 156 conferencing...... 156 control...... 156 encryption...... 156


framing...... 156 Networking...... 156 transmission ...... 156 Video compression...... 154 Video conferencing ...... 25, 37 Video Quality ...... 156 Video Server ...... 74 Virtual backplane...... 24, 124, 198 Virtual channel...... 28, 150 Virtual circuit...... 28, 29, 74, 150, 156 Virtual Circuits ...... 150 Visualisation ...... 138 VLSI...... 12 Voice ...... 82, 157 Vulnerable...... 140 VxWorks ...... 4, 191 WAN ...... 27, 79, 146, 151, 198 WAN topology ...... 195 Wide Area Connectivity ...... 151 Wide Area Network...... 79, 111, 149 Wildcard demand...... 127 Wildcard produce ...... 127 Wildcarding...... 126 Wind River Systems ...... 191 Windows NT ...... 137, 169 Windows size ...... 114 Wire-type interconnect ...... 147 Wire-type media...... 140 WORM...... 140 Wrap...... 152 X.25...... 109 Xpress Transfer Protocol...... 117 Xpress Transport Protocol...... 116, 117, 166, 197, 200 XTP ...... 23, 46, 63, 83, 96, 104, 106, 109, 110, 115-117, 123, 124, 126, 131, 134, 158, 166, 169, 170, 179, 186, 188, 189, 198, 200 XTP Capabilities ...... 117 XTP Deficiencies...... 118 XTP Forum ...... 170 XTP History...... 117 XTP Implementer Options...... 118 XTP Kernel Reference Model...... 170 XTP Suitability ...... 118 XTP V4.0...... 138 XTP-aware IP router ...... 63 XTP-aware IP routing ...... 111, 150, 189 Zombie context ...... 115 Zombie state...... 115
