Ensuring Reliable Networks

Theory, Concepts and Applications
ETR 2015 – Rennes, August 27th

Jean-Baptiste Chaudron

www.tttech.com Copyright © TTTech Computertechnik AG. All rights reserved. Page 1

AGENDA
• Introduction
• TTEthernet Basics
• Critical Traffic over TTEthernet
• Clock Synchronization Principles
• Fault Tolerance
• TTEthernet Products Overview


Introduction

Real-Time Computer System

A real-time computer system is a computer system in which the correctness of the system behavior depends not only on the logical results of the computations, but also on the physical time at which these results are produced [Kop97].

The point in time by which a certain action must be finished is called a deadline.
• Soft deadline: the result still has utility after the deadline.
• Hard deadline: missing the deadline can result in a catastrophic event.

Computer systems classification
• Guaranteed timeliness: RT systems
• Best effort: no timing guarantees, not RT systems

Distributed Real-Time System

Reasons for distribution:
• Scalability – single computer systems have limited computing resources
• Complexity – handling through smaller, simpler intelligent units
• Safe wiring – from single computer to different sensors/actuators
• Fault-tolerance – avoid single point of failure

Control loops:
• Periodic operation: sensor – communicate – calculate – actuator
• Low end-to-end communication latency enables implementation of tighter control
• Real-time communication

[Figure: node1 (sensor) and node2 (actuator) connected by a real-time bus]

Latency vs. Deadline

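[Figure: timeline (flow of time). A relevant input/measurement occurs at Node A; Node A processes the input; the result is communicated to node B; node B acts upon the result from A. The latency of the system response lies between a minimum and a maximum (the difference is the jitter) and must not exceed the deadline for the system response.]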

End to End Latency (1)

The time interval between the initiation of a transmission by the sending host computer and its reception by the host computer at the receiver depends on many factors:
• Media access control (MAC)
• Transmission speed, cable lengths
• Network load

[Figure: two nodes, each consisting of a host computer and a communication controller, with the end-to-end latency shown along a time axis]

End to End Latency (2)

Communication can be delayed by:
• Concurrency of transmissions and the media access strategy, e.g., CSMA

[Figure (CSMA access): delay over time, starting at tmin and growing by one tmsg per case: free channel; one message in transit (or pending, but with higher priority than own message); two messages in transit / pending; three messages in transit; …]

Communication can also be delayed by:
• Error handling strategy, e.g. PAR (Positive Acknowledge or Retransmit)
• Bus access delays due to EMI (External Memory Interface) - wait for bus idle

[Figure (PAR): delay over time, starting at tmin and growing by one tmsg per case: no error; retransmit once; retransmit twice; retransmit three times]

Peak Load Handling Challenge (1)

Peak load handling
• Peak load situation: all nodes on the shared bus require communication services at the same time and send the maximum amount/length of data in highest-priority messages

• Problem: find out in which scenario this happens, and what the actual load and the worst-case message delays are at this time

• In event-driven systems this can be very complex

• Even more complicated if faults that lead to retransmission of messages must be accounted for

• Experiments or approximate scheduling can only offer “probabilities”:
  • Example: latency for message X is less than 500 µs in 99.96% of cases, but there is no guaranteed worst-case latency

Peak Load Handling Challenge (2)

Thrashing:
• An abrupt decrease in throughput that occurs as the system load increases.

Causes of thrashing:
• Retry mechanism in PAR protocols (error handling and time-outs)
• Combined with the waits from the CSMA access

[Figure: throughput versus requested load. The ideal system rises to 100 % and stays there; the real system rises up to the thrashing point and then drops.]

Deterministic Networks

Features of deterministic networks:

• Known (maximum) end-to-end latency
• Bounded and small jitter
• Message ordering guarantee
• Error detection
• Masquerade protection

TT-System vs. ET-System

Transportation example:
• Cars and taxis are event-triggered:
  • they go whenever they are needed
• Trains are time-triggered:
  • they go according to a fixed schedule
• Advantage of the event-triggered approach: very flexible
• Advantage of the time-triggered approach: very predictable

When would you prefer a time-triggered solution?

Why Clock Synchronization? (1)

In RT systems all ‘layers’ of functionality in the system must meet the ‘quality of service’ requirements defined by the application:
• the application layer must operate timely and predictably, reading the sensors in time, computing correct values, updating actuators reliably etc.
• the communication layer must meet the specified functionality of transmitting information between the nodes in the system, and must also do this predictably and timely

Timely operation:
• Coordination of the computer nodes in the time domain
• Clock synchronization

Why Clock Synchronization? (2)

A local clock is a counter triggered by an oscillator. Oscillators have a nominal rate (e.g. 10 MHz) and a certain drift rate.
• Standard drift rates of oscillators on the market: 10^-3 s/s to 10^-5 s/s
• Oscillators with a small drift rate (~10^-6 s/s) are expensive

• What does a drift rate of 10^-3 s/s mean?
  • 1 microsecond deviation every 1 millisecond,
  • 1 second deviation in 1000 seconds,
  • about 10 minutes deviation per week.

Oscillator drift rate can be affected by other factors like: • temperature, humidity, …

Clock synchronization: keeps clocks of distributed computers close to each other
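The drift-rate figures above can be checked with a few lines of arithmetic. This is a minimal sketch; the function name `deviation` and the constant are illustrative and not part of the slides.

```python
SECONDS_PER_WEEK = 7 * 24 * 3600


def deviation(drift_rate: float, elapsed_s: float) -> float:
    """Deviation (in seconds) a free-running clock accumulates over elapsed_s."""
    return drift_rate * elapsed_s


drift = 1e-3  # s/s, the example drift rate from the slide

per_ms = deviation(drift, 1e-3)        # deviation after 1 millisecond
per_1000s = deviation(drift, 1000.0)   # deviation after 1000 seconds
per_week_min = deviation(drift, SECONDS_PER_WEEK) / 60  # minutes per week

print(f"{per_ms * 1e6:.1f} us per ms, {per_1000s:.1f} s per 1000 s, "
      f"{per_week_min:.2f} min per week")
```

The weekly figure comes out slightly above 10 minutes, which is why the slide value is best read as an approximation.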

Global Notion of Time

1. GLOBAL notion of time, built on top of local time

Local clocks - free running

Local view of global time

2. Activities triggered on basis of global time

Precision Interval (1)

The precision, or precision interval (denoted Π), is the upper bound on the difference between the slowest and the fastest non-faulty clock in the system.

A “fast” clock and a “slow” clock will never differ by more than one precision interval.

[Figure: three clocks (Clock 1, Clock 2, Clock 3), each ticking 10:45, 11:00, 11:15 at slightly different instants; the differences stay within the precision interval Π]

Precision Interval (2)

The precision interval in a distributed system depends on
• the hardware properties of each clock (clock drift, e.g. 100 ppm)
• the resynchronization interval (e.g. 5 ms)
• the resynchronization method used (how efficiently it works)

smaller precision interval → smaller timeouts → more efficient system

[Figure: clock offset over time. Drift accumulates during each resynchronization interval and is corrected at resynchronization; the precision interval bounds the offset, with the average clock precision lying inside it.]
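To see how drift and the resynchronization interval interact, a rough estimate helps. The sketch below assumes a simplified textbook model that is not taken from the slides: two clocks drift in opposite directions at the same rate for one full resynchronization interval, and any residual error left by the resynchronization method itself is ignored.

```python
def drift_offset(drift_rate: float, resync_interval_s: float) -> float:
    """Worst-case offset (seconds) two clocks build up between resynchronizations.

    Assumes both clocks drift at drift_rate, in opposite directions.
    """
    return 2 * drift_rate * resync_interval_s


# Example values from the slide: 100 ppm drift, 5 ms resynchronization interval.
offset_s = drift_offset(100e-6, 5e-3)
print(f"drift offset = {offset_s * 1e6:.2f} us")
```

Under these assumptions the example values yield an offset of about 1 µs; halving the resynchronization interval halves this contribution.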

Whole System Synchronization (1)

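Two different approaches: centralized vs. distributed

[Figure: cartoon of nodes agreeing on the time. A joining node asks “I want to join, what's the time?”; one node announces “It's 15:00!”; the others (showing 15:00, 14:59 and 15:00) answer “Yes, 15:00”, “OK, 15:00.” and “I see, 15:00.”]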

Whole System Synchronization (2)

Synchronization to an external time reference is possible
• all nodes can apply a bounded correction term to slightly speed up or slow down the local clock
• the precision window Π will never be left – time will never go backwards
• this mechanism can be used to broadcast a correction value relative to some external time reference (e.g. GPS time)
• application of this term to the local node is performed by the host CPU

[Figure: an end system with a GPS receiver provides the external time (15:00) to the other nodes]

Time Triggered System – Summary (1)

Any time-triggered system must have two key properties:
1. a global notion of time
  • in case of a distributed system: a GLOBAL notion of time, available to each node in the system
2. a global schedule (when to do what)
  • in case of a distributed system: a GLOBAL schedule or CONSISTENT parts of a GLOBAL schedule available to each node in the system

Time Triggered System – Summary (2)

• All protocol operations are initiated at a priori known points in time (‘action time’); transmission and reception are thus performed during known ‘slots’.
• There is no external (application) control over the protocol progression.
• Generation of a global time base and common knowledge about the action times reside inside the protocol controllers and cannot be modified by the application CPU.

Slot      t1        t2        t3
Node 1    send      receive   receive
Node 2    receive   send      receive
Node 3    receive   receive   send
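The slot table can be expressed as a small static schedule that every node evaluates against the global time. This is a minimal sketch; the names `SCHEDULE`, `NODES` and `action` are illustrative, and a real protocol controller would hold such a table in its configuration rather than in application code.

```python
# Static schedule: which node owns (sends in) which slot.
SCHEDULE = {"t1": "Node 1", "t2": "Node 2", "t3": "Node 3"}
NODES = ["Node 1", "Node 2", "Node 3"]


def action(node: str, slot: str) -> str:
    """A node sends in its own slot and receives in all other slots."""
    return "send" if SCHEDULE[slot] == node else "receive"


# Rebuild the per-node view of the schedule shown in the table.
table = {node: [action(node, slot) for slot in SCHEDULE] for node in NODES}

for node, actions in table.items():
    print(node, actions)
```

Because each slot maps to exactly one sender, no two nodes can transmit in the same slot as long as they share the same notion of time.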


TTEthernet Basics

Ethernet History

• Ethernet is the most popular LAN technology in the world
• The technology is a well-established open-world standard and very scalable
• Early versions were bus-based and ran at 10 Mbit/s; today 100 Mbit/s and 1 Gbit/s are common
• Ethernet is specified at OSI layers one and two
• Invented at the Xerox Palo Alto Research Center (PARC) in 1975
• Originally designed as a 2.94 Mbps system to connect 100 computers on a 1 km cable
• Later, Xerox, Intel and DEC drew up a standard supporting 10 Mbps
• Basis for the IEEE’s 802.3 specification
• Ethernet uses the CSMA/CD media access control

Ethernet - CSMA/CD (1)

• CSMA/CD (carrier sense multiple access with collision detection)
• Data is transmitted in the form of packets.
• Listen to (sense) the channel before packet transmission.
  • If the channel is sensed idle → the packet is transmitted
  • Else → defer the transmission until the channel becomes idle

[Figure: two stations on a bus. One says “Channel is idle, I will start transmission”; the other says “I want to send, but there is an ongoing transmission on the bus, I will wait!”]

Ethernet - CSMA/CD (2)

• If a collision is detected:
  → stop sending frame data
  → send a 32-bit "jam" sequence
  → wait for a random time interval
  → start retransmission

[Figure: two colliding stations pick different random waits: “I will wait for 12 milliseconds before listening to the channel” and “I will wait for 43 milliseconds before listening to the channel”]
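The "wait for a random time interval" step is what makes latency on shared Ethernet unbounded. The sketch below shows the truncated binary exponential backoff commonly described for classic IEEE 802.3 half-duplex operation; the slide itself does not spell out the algorithm, and the function names are illustrative.

```python
import random

SLOT_TIME_BITS = 512   # one slot time, in bit times (10/100 Mbit/s Ethernet)
ATTEMPT_LIMIT = 16     # the frame is discarded after this many attempts
BACKOFF_LIMIT = 10     # the exponent stops growing after 10 collisions


def backoff_slots(collision_count: int, rng: random.Random) -> int:
    """Number of slot times to wait after the n-th collision of a frame."""
    if collision_count >= ATTEMPT_LIMIT:
        raise RuntimeError("too many collisions, frame is discarded")
    k = min(collision_count, BACKOFF_LIMIT)
    return rng.randrange(2 ** k)  # uniform in [0, 2^k - 1]


def backoff_seconds(collision_count: int, bit_rate: float,
                    rng: random.Random) -> float:
    """Backoff delay in seconds for the given link bit rate."""
    return backoff_slots(collision_count, rng) * SLOT_TIME_BITS / bit_rate


rng = random.Random(42)
for n in (1, 2, 3, 10):
    print(n, backoff_seconds(n, 10e6, rng))
```

Since the wait is random and the number of retries varies, only a probabilistic statement about the delay of a given frame is possible.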

Possible Ethernet Topologies

[Figure: three topologies: star / switched, bus, and ring]

Switched Star Topology

• It’s modular
• Independent wires for each end node
• Independent traffic in each wire
• A second layer of switches can be added to build a hierarchical network that extends the same two benefits above

Ethernet Half Duplex vs. Full Duplex

Half duplex                                  Full duplex
• Uni-directional (one direction at a time)  • Bi-directional
• Bus based Ethernet                         • Switched Ethernet
• 10 Mbit/s                                  • 100 Mbit/s and above

RT Limitations of Ethernet

• Bus based Ethernet
  • No timing guarantees
  • Latency is not bounded because of CSMA/CD
  • Traffic bursts cause congestion and packet delays

• Switched Ethernet
  • Point-to-point transmission between a switch and an end system
  • Separate conductors for transmission and reception paths – no message collisions
  • Traffic bursts cause congestion in the switch, packet delays, buffer overflows and packet loss
  • Message ordering is not maintained

• Fault handling protocols, like PAR in TCP, introduce
  • additional end-to-end communication latency between sender and receiver
  • additional traffic for ACKs, which increases network congestion

Why TTEthernet? (1)

• Ethernet hardware is low cost.
• Ethernet is a well-established open-world standard and very scalable.
• The OSI reference model gives a well-structured classification of concepts that can be built on top of Ethernet.
• Existing tools can be leveraged as cost-efficient diagnosis tools.
• As all messages in TTEthernet are standard Ethernet compliant, existing tools can be leveraged for time-triggered messages as well.
• Standard web servers can be leveraged for maintenance and configuration.
• Engineers learn about Ethernet at school.

Ethernet compatibility enables the usage of technology that is already established, tested, and verified.

Why TTEthernet? (2)

IEEE 802.3 addresses the lowest layers of the OSI reference model, some higher layers are represented by other IEEE 802 parts. TTEthernet performs services transparently within the , using all IEEE 802.3 services without modification and not modifying IEEE 802.2 services.

OSI layer model

OSI layer          Corresponding parts
7 Application      Application architecture, NM,
6 Presentation     layers above (TCP, UDP, IP)
5 Session
4 Transport
3 Network
2 Data Link        (IEEE 802.3 LLC)
                   Media Access Control (IEEE 802.3 MAC)
1 Physical         Physical Layer (IEEE 802.3 PHY): 10BaseT, 100BaseTx, 1000BaseCX, …

TTEthernet Properties

• Short communication loop times (<100 µs) – high-speed controls
• Minimized loop jitter (<1 µs) – deterministic control functions
• Integration of legacy (non-real-time) traffic in the RT network
• Utilize the full potential of switched Ethernet – high bandwidth, dedicated links with full duplex communication
• Fault tolerant communication system possible
• High-performance variant offering full link speed per device
• Low-cost variant using regular Ethernet components
• Interoperability between high-performance and low-cost components
• Receive-compatible with standard Ethernet components (e.g. for traffic monitoring, gateways, diagnosis)
• Transmit-compatible with standard Ethernet components (e.g. for simulation, low-cost non-RT applications)


Critical Traffic over TTEthernet

RT Communications Overview (1)

[Figure: taxonomy of real-time communication. Top level: wired vs. wireless. Wired categories: multi-master, switched, unidirectional (e.g. ARINC 429), single-master (e.g. LIN). Further labels: switched, asynchronous, unswitched. Technologies shown: ARINC 664 (AFDX®), CAN (ISO 11898), Ethernet, TTP, FlexRay, ARINC 659.]

RT Communications Overview (2)

[Figure: taxonomy of real-time communication. Top level: wired vs. wireless. Wired categories: multi-master, switched, unidirectional (e.g. ARINC 429), single-master (e.g. LIN). Switched Ethernet is split into asynchronous and synchronous variants; a further label reads unswitched. Technologies shown: ARINC 664 (AFDX®), CAN (ISO 11898), Ethernet, TTP, FlexRay, ARINC 659.]

ARINC 664 traffic

• ARINC 664 is a standard for an aircraft data network based on Ethernet (IEEE 802.3).
• AFDX® (Avionics Full-Duplex Switched Ethernet) is a deterministic data network for safety-critical applications, based on ARINC 664 and defined in ARINC 664 part 7. Note: AFDX® is a registered trademark of Airbus
• TTEthernet developments are influenced by ARINC 664 p7 because one key use of TTEthernet is to extend such networks with time-triggered communication services.

• In a TTEthernet network, three classes of traffic can co-exist; A664 traffic is named “rate-constrained” traffic:
  • TT: time-triggered traffic
  • RC: rate-constrained traffic
  • BE: regular (“legacy”) traffic

AFDX® Planes & Helicopters

• Airbus A380
• Boeing 787
• Airbus A400M
• Airbus A350
• Sukhoi Superjet 100
• AgustaWestland AW101
• Irkut MS-21
• Bombardier CSeries
• Comac ARJ21
• AgustaWestland AW149

Virtual Link Concept (1)

• ARINC 664 p7 traffic is based on Virtual Links (VLs).
• A Virtual Link is a path between sender and receiver(s) (1:n relation) with a unique VL identifier (VL ID). Switches are configured to know about the VL definitions (up to 4096 VL IDs per switch).
• End-Systems exchange frames through Virtual Links (VLs).
• A Virtual Link defines a unidirectional path from one End-System to one or more destination End-Systems.

VL ID   Sender   Receiver(s)
1       a        b,d
2       a        c
3       b        c,e,f
4       c        a,b,d
5       c        a,f
…       …        …
Example Virtual Link definitions

[Figure: end systems ES1 to ES4 connected through a network, with VL 1 and VL 2 drawn as paths]

Virtual Link Concept (2)

Switches are configured to know about the VL definitions…

VL ID   Sender   Receiver(s)
1       a        b,d
2       a        c
3       b        c,e,f
4       c        a,b,d
5       c        a,f
…       …        …

[Figure: switched network with end systems a to g; end system b sends a frame of VL 3]

Virtual Link Concept (3)

...and route frames according to the VL definition.

VL ID   Sender   Receiver(s)
1       a        b,d
2       a        c
3       b        c,e,f
4       c        a,b,d
5       c        a,f
…       …        …

[Figure: switched network with end systems a to g; the frame of VL 3 sent by b is copied by the switches towards its receivers]

Virtual Link Concept (4)

• Error propagation boundaries
  • Static configuration of VLs per port
  • Restricted access for configured VLs
• Traffic filtering – policing
  • At switch and ES

[Figure: three hosts, each attached to an end system (ES), connected through a switch. Frames are blocked where the VL is not configured on the switch port or not configured on the ES port.]

Virtual Link Address Scheme

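• Virtual Links are encoded in the destination MAC address.
• A critical traffic (CT) marker differentiates critical traffic from standard Ethernet traffic.
• All VL IDs have the same CT marker.
• The VL ID is encoded in the lower two bytes of the destination MAC address.

[Figure: destination MAC address split into a CT marker field and a VL ID field; the example bytes 89 1D are shown]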
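The address scheme can be illustrated with a small encoder and decoder. This is only a sketch: it assumes the CT marker fills the four upper bytes of the 6-byte destination MAC address, and the marker value `0x03040506` is a made-up placeholder, since the slide does not state one.

```python
CT_MARKER = 0x03040506  # hypothetical marker value, for illustration only


def encode_dest_mac(vl_id: int, ct_marker: int = CT_MARKER) -> bytes:
    """Build a destination MAC: 4-byte CT marker followed by 2-byte VL ID."""
    if not 0 <= vl_id <= 0xFFFF:
        raise ValueError("VL ID must fit into two bytes")
    return ct_marker.to_bytes(4, "big") + vl_id.to_bytes(2, "big")


def decode_dest_mac(mac: bytes, ct_marker: int = CT_MARKER):
    """Return the VL ID for critical traffic, or None for other traffic."""
    if len(mac) != 6:
        raise ValueError("a MAC address has 6 bytes")
    if int.from_bytes(mac[:4], "big") != ct_marker:
        return None  # not marked as critical traffic
    return int.from_bytes(mac[4:], "big")


mac = encode_dest_mac(3)
print(mac.hex(":"), decode_dest_mac(mac))
```

A switch or end system following this idea can classify a frame as critical or standard traffic by comparing only the marker bytes, and then look up the VL by its ID.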

Rate Constrained Traffic Definition

VLs have reserved bandwidth, expressed in terms of:
• Maximum Frame Size (MFS)
• Minimum time between two frames - Bandwidth Allocation Gap (BAG)

• For each VL a bandwidth limit per time is defined.
• This limit will be enforced by “BAG policing” in the switch(es): VL traffic exceeding the limit will be silently suppressed
• RC by itself does not guarantee timing or jitter, or preserve order of transmissions. It only ensures bandwidth limitation.

VL ID   Sender   Receiver(s)   Limit/BAG
1       a        b,d           25 Kb/20 ms
2       a        c             12 Kb/50 ms
3       b        c,e,f         17 Kb/10 ms
4       c        a,b,d         20 Kb/80 ms
…       …        …             …
Example Virtual Link definitions with bandwidth limits
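The Limit/BAG column translates directly into a reserved rate per VL. The sketch below assumes that "Kb" means 1000 bits, which the slide does not state; if kilobytes were meant, every result would scale by a factor of eight.

```python
# VL ID -> (limit in bits, BAG in milliseconds), taken from the example table.
# Assumption: 1 Kb = 1000 bits.
VLS = {
    1: (25_000, 20),
    2: (12_000, 50),
    3: (17_000, 10),
    4: (20_000, 80),
}


def reserved_bps(limit_bits: int, bag_ms: int) -> float:
    """Reserved bandwidth in bit/s: at most limit_bits every bag_ms."""
    return limit_bits * 1000 / bag_ms


rates = {vl: reserved_bps(*params) for vl, params in VLS.items()}
total_bps = sum(rates.values())

for vl, bps in rates.items():
    print(f"VL {vl}: {bps / 1e6:.2f} Mbit/s")
print(f"total: {total_bps / 1e6:.2f} Mbit/s")
```

Summing the per-VL rates on each link is the first check a network designer would make against the link speed.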

Traffic Shaping in the End System

• Traffic shaping ensures that the bandwidth assigned to a VL is not exceeded
• The host sends frames of the VL at arbitrary times
• The ES performs traffic shaping
• Switches enforce traffic shaping

[Figure: the host hands frames 1 to 4 to the end system at irregular instants; on the network the ES releases frames 1 to 4 spaced at least one BAG apart]
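The shaping behaviour in the figure can be sketched as follows: a frame leaves the end system no earlier than the host requested it and no earlier than one BAG after the previous frame of the same VL. The function name `shape` is illustrative, and request times are assumed to be sorted.

```python
def shape(request_times_ms, bag_ms):
    """Return the release time of each frame after BAG shaping.

    request_times_ms must be in ascending order.
    """
    released = []
    for t in request_times_ms:
        earliest = released[-1] + bag_ms if released else t
        released.append(max(t, earliest))
    return released


# The host submits a burst of three frames and a later fourth one.
print(shape([0, 1, 2, 50], bag_ms=20))
```

Frames that arrive in a burst are spread out, while a frame submitted after a long pause is sent immediately.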

Traffic Policing in the Switch

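• Traffic policing ensures that the bandwidth limit of a VL is respected
• Protection of the network from ES faults (babbling idiot)

[Figure: a switch with filtering & routing between an incoming and an outgoing port. Frames 1 to 4 arrive on the incoming port; frames that violate the BAG are marked “drop”, and the frames leaving on the outgoing port are spaced by at least one BAG.]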
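A deliberately simplified policing rule is sketched below: a frame is dropped when it arrives less than one BAG after the last accepted frame of the same VL. The function name `police` is illustrative, and real switches may additionally tolerate a configured amount of jitter, which this sketch ignores.

```python
def police(arrival_times_ms, bag_ms):
    """Split arrivals into (accepted, dropped) according to the BAG."""
    accepted, dropped = [], []
    for t in arrival_times_ms:
        if accepted and t - accepted[-1] < bag_ms:
            dropped.append(t)   # too early: silently discarded
        else:
            accepted.append(t)
    return accepted, dropped


accepted, dropped = police([0, 5, 20, 25, 41], bag_ms=20)
print("accepted:", accepted, "dropped:", dropped)
```

A faulty end system that sends continuously (a babbling idiot) would therefore get at most one frame per BAG through the switch.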

Virtual Link Scheduling (1)

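The A664 standard bounds the maximum allowed jitter
• at the output of an ES
• L_max is the maximum frame length of a VL

\[
\text{max\_jitter} \le 40\,\mu\text{s} + \frac{\sum_{j \in \{\text{set of VLs}\}} \bigl( (20 + L_{\max,j}) \cdot 8 \bigr)}{\text{net\_bandwidth}}
\]

\[
\text{max\_jitter} \le 500\,\mu\text{s}
\]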

Virtual Link Scheduling (2)

Virtual Links are defined in terms of:
• Bandwidth Allocation Gap (BAG)
• Maximum allowed frame size
• Maximum jitter

[Figure: frames 1 to 4 of one VL over four consecutive BAGs. Frame 1 is sent with jitter = 0, frames 2 and 3 with jitter < max, frame 4 with jitter = max.]

Virtual Link Scheduling (3)

Rate-constrained traffic is event-triggered traffic. This means that the bandwidth actually used by the VL IDs is not constant over time. So what happens in a so-called “peak load scenario”? Different network paths send to a single link (port of a switch). The switch has to store the messages destined for this link and works through them according to the priorities of the messages.

Rate Constrained Traffic (Summary)

What is Rate-Constrained traffic? • Predefined maximum amount of bytes (or frequency of frames) per VL

• VL traffic exceeding the limit will be silently dropped (by switch)

• The overall network can be analyzed for jitter:
  • How much will the messages of a VL be delayed in the typical case and in the worst case?
  • What happens if all of these VLs generate their traffic at about the same time?
  • Will that overflow the switch buffers?

Time Triggered Ethernet

How does TTEthernet build upon the concept of VLs?
• Each TT frame is transmitted by the end system at a certain time
  • the end system can also transmit non-TT frames; they have lower priority than the TT frames
• The switch expects the frame from the transmitter within a certain time interval (window)
  • this provides an implicit bus guardian functionality: TTE traffic received outside of the expected time interval is discarded, just as rate-constrained traffic would be if it exceeds the BAG
• The switch forwards the frame to the receivers (end systems or other switches) at certain times
  • these times can be different for each port!
• The receivers receive the frame with well-defined latency and minimal jitter
  • …even if they are not TTE nodes

TTEthernet Scheduling Principle (1)

• Senders have a defined transmit schedule
• Switches have an “acceptance schedule” for incoming data

VL ID   Sender                Receiver(s)
1       a @ [07:25-07:35]     b @ 7:45, d @ 8:20
2       a @ [08:55-09:05]     c @ 10:30
3       b @ [09:55-10:05]     c @ 10:15, e @ 10:15, f @ 10:30
4       c                     a,b,d
5       c                     a,f
…       …                     …

[Figure: switched network with end systems a to g. End system b sends VL 3 according to its transmit schedule (VL ID @ time: 3 @ 10:00, 8 @ 11:15).]
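The acceptance schedule can be pictured as a lookup of the expected window followed by a range check. The sketch below reuses the clock-style times of the example table (which stand in for points in the global time base); the names `ACCEPTANCE` and `accept` are illustrative.

```python
def minutes(hhmm: str) -> int:
    """Convert 'HH:MM' into minutes since midnight."""
    hours, mins = hhmm.split(":")
    return int(hours) * 60 + int(mins)


# VL ID -> acceptance window at the switch, from the example table.
ACCEPTANCE = {
    1: ("07:25", "07:35"),
    2: ("08:55", "09:05"),
    3: ("09:55", "10:05"),
}


def accept(vl_id: int, arrival: str) -> bool:
    """Accept a TT frame only if its VL is known and it arrives in its window."""
    window = ACCEPTANCE.get(vl_id)
    if window is None:
        return False  # VL has no acceptance window configured here
    start, end = (minutes(t) for t in window)
    return start <= minutes(arrival) <= end


print(accept(3, "10:00"), accept(3, "10:06"), accept(9, "10:00"))
```

A sender whose clock has failed, or which transmits at the wrong time, is thus contained at the first switch.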

TTEthernet Scheduling Principle (2)

• Switches have a “forwarding schedule” per port
• Receivers receive the frame with defined latency and minimal jitter

VL ID   Sender                Receiver(s)
1       a @ [07:25-07:35]     b @ 7:45, d @ 8:20
2       a @ [08:55-09:05]     c @ 10:30
3       b @ [09:55-10:05]     c @ 10:15, e @ 10:15, f @ 10:30
4       c                     a,b,d
5       c                     a,f
…       …                     …

[Figure: switched network with end systems a to g; the frame of VL 3 sent by b is forwarded by the switches to its receivers at the scheduled times]

TTEthernet Traffic (1)

Within a cluster, TTEthernet provides configurable synchronization services to ensure that all time-triggered communication is transmitted according to schedule, with known end-to-end latency and minimal jitter, regardless of other traffic going through the same switches and links.

Each link in each direction has a schedule of its own. But all schedules are synchronized to one global clock per sync domain.

TTEthernet Traffic (2)

• Time-Triggered (TT) - scheduled and configured
  • the sending instant is triggered by the time
  • constant transmission delay and small, bounded jitter
  • the switch expects the frame from the transmitter within a certain time interval (window)
  • this provides an implicit bus guardian functionality: TTE traffic received outside of the expected time interval is discarded
  • the switch forwards the frame to the receivers (end systems or other switches) at certain times - these times can be different for each port!
  • receivers receive the frame with well-defined latency and minimal jitter.

• Off-line scheduling by means of SW tools (TTE-Tools)

Message Conflict (1)

• No message collision can occur, because of point-to-point connections and full duplex communication.
• Conflicts among Ethernet messages are resolved by the switch. Two types of conflict exist:
  • Simultaneous messages: messages compete for the same resource at the same point in time.
    • E.g., two messages (m1, m2) received on switch port 1 and port 2 need to be routed through switch port 3 at the same point in time
  • Port/link is occupied: a message transmission is ongoing and another message needs to be transmitted through that port/link.
    • E.g., message m1 received on port 1 is being routed through port 3. Message m2 received on port 2 needs to be routed through port 3 while m1 is in transmission
    • In 100 Mbit/s Ethernet, the transmission of a frame with maximum length takes 123 µs, meaning that the port and the link are occupied for 123 µs during transmission
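The 123 µs figure can be reproduced from standard Ethernet framing numbers. The slide does not give the breakdown; the sketch below assumes a 1518-byte maximum frame plus 8 bytes of preamble/start delimiter and a 12-byte inter-frame gap.

```python
MAX_FRAME_BYTES = 1518   # maximum untagged Ethernet frame
PREAMBLE_SFD_BYTES = 8   # preamble plus start-of-frame delimiter
IFG_BYTES = 12           # inter-frame gap, expressed in byte times


def wire_time_us(frame_bytes: int, rate_bps: float) -> float:
    """Time a frame occupies the link, including preamble and gap, in us."""
    total_bits = (frame_bytes + PREAMBLE_SFD_BYTES + IFG_BYTES) * 8
    return total_bits / rate_bps * 1e6


t_100m = wire_time_us(MAX_FRAME_BYTES, 100e6)
t_1g = wire_time_us(MAX_FRAME_BYTES, 1e9)
print(f"100 Mbit/s: {t_100m:.2f} us, 1 Gbit/s: {t_1g:.2f} us")
```

Under these assumptions the result is about 123 µs at 100 Mbit/s and about 12.3 µs at 1 Gbit/s, which is the worst-case time a newly arrived frame may have to wait behind a frame already in transmission.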

Message Conflict (2)

Mechanisms for conflict resolution:
• Priorities – configurable
  • TT traffic has 1 priority level, RC traffic has 4 priority levels
  • TT over all RC priorities, or
  • TT over a specific RC priority, e.g., TT has the priority of RC with priority 6.

• Shuffling mechanism – configurable
  • increase the receive window at the receiver side by one message transmission delay per link (12.3 µs at 1 Gbit/s).

• Resource media reservation – configurable
  • Prior to the transmission of TT frames, no transmission is allowed.
  • No transmission of rate-constrained or best-effort Ethernet traffic is initiated that would last across the scheduled start of transmission of a time-triggered message.
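The media reservation rule amounts to a simple check before a lower-priority frame is started. This is a minimal sketch with illustrative names and times in microseconds; the way an actual switch implements the reservation is not described in the slides.

```python
def may_start(now_us: float, frame_duration_us: float,
              next_tt_start_us: float) -> bool:
    """A non-TT frame may start only if it finishes before the next TT slot."""
    return now_us + frame_duration_us <= next_tt_start_us


# A maximum-length frame at 100 Mbit/s occupies the link for about 123 us.
FRAME_US = 123.0
print(may_start(0.0, FRAME_US, 200.0))    # finishes at 123 us, before 200 us
print(may_start(100.0, FRAME_US, 200.0))  # would finish at 223 us, too late
```

The price of this guarantee is bandwidth: the link may stay idle for up to one frame duration before each TT frame, whereas shuffling keeps the link busy and accepts a bounded delay of the TT frame instead.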

Message Conflict (3)

Frames that are currently being routed will be finished; the following frames will be handled according to their priority.

[Figure: Shuffling]

Message Conflict (4)

According to the schedule of the time-triggered messages, a window in front of each TT frame is reserved so that no lower-priority frame is able to delay the TT frame.

[Figure: Media Reservation]

Message Conflict (5)

• Simultaneous TT-TT message conflict
  • Correctly configured schedules ensure that such message conflicts do not exist.

• Simultaneous RC-RC message conflict
  • The priority mechanism resolves the conflict.
  • Among messages of the same priority: internal priority level based on source port number.

• Simultaneous TT-RC message conflict
  • The priority mechanism resolves the conflict.

• Simultaneous TT-BE or RC-BE message conflict
  • TT or RC takes precedence over best effort

Message Conflict (6)

• Port/link is occupied, TT-TT
  • Correctly configured schedules ensure that such message conflicts do not exist.
  • Use the shuffling mechanism in case of complex network topologies

• Port/link is occupied, BE-TT or RC-TT
  • Shuffling mechanism
  • Media reservation

• Port/link is occupied, BE-RC
  • The RC message is delayed by the transmission length of a BE frame.
  • This is accepted: jitter is introduced to the RC frame
  • Jitter is part of the RC configuration.

Mixed Network Illustration (1)

1. Starting point: pure AFDX® network

[Figure: network in which all switches and end systems are AFDX® devices]

Critical Traffic over TTEthernet

Mixed Network Illustration (2) Ensuring Reliable Networks

1. Starting point: Pure AFDX® network

2. Change switches to TTEthernet configured as pure AFDX®

[Figure: the same network with the switches now labeled TTE; the end systems remain AFDX]

Note: AFDX® is a registered trademark of Airbus

Critical Traffic over TTEthernet

Mixed Network Illustration (3) Ensuring Reliable Networks

1. Starting point: Pure AFDX® network

2. Change switches to TTEthernet configured as pure AFDX®

3. Add function using time-triggered services (Sync, TT messages…)

[Figure: the network with TTE switches, AFDX end systems and added TTE end systems]

The TTEthernet switch is seen as an AFDX® switch by the AFDX® network!

→ No modification of the previous AFDX® architecture organization (BAGs, VLs…) as long as there is enough bandwidth

Note: AFDX® is a registered trademark of Airbus

Critical Traffic over TTEthernet

Mixed Network Illustration (4) Ensuring Reliable Networks

1. Starting point: Pure AFDX® network

2. Change switches to TTEthernet configured as pure AFDX®

3. Add function using time-triggered services (Sync, TT messages…)

4. Do further changes (e.g., add other AFDX® network, BE Ethernet E/S)

[Figure: the network with TTE switches, AFDX and TTE end systems, an additional AFDX network and Ethernet (ETH) end systems]

Note: AFDX® is a registered trademark of Airbus

Critical Traffic over TTEthernet

Summary Ensuring Reliable Networks

• TTEthernet supports the Time-Triggered and Rate-Constrained (A664) traffic classes (as well as non-critical traffic).
• Both traffic classes (TT and RC) use the VL concept.
• TTEthernet provides deterministic end-to-end timing (latency) and minimal jitter (jitter << latency) for TT traffic, including multicast capability.
• TTEthernet provides bandwidth guarantees for RC traffic, including multicast capability.
• Rate-constrained and regular (“best-effort”) Ethernet traffic are possible and can use any remaining bandwidth without disturbing the TT traffic flows.


Clock Synchronization Principles

TTEthernet Timing Ensuring Reliable Networks

Time-triggered communication and timing checks in a network with forwarding: multiple schedules and delays apply

[Figure: message M travels from a sender through two switches to a receiver; each node states its own schedule]
• Sender: “I’ll transmit M at 10:45.”
• First switch: “I’ll accept M only between 10:40 and 10:50. I’ll forward M at 11:00.”
• Second switch: “I’ll accept M only between 10:55 and 11:05. I’ll forward M at 11:10.”
• Receiver: “I’ll expect M between 11:05 and 11:15. Let’s see if I can receive M…”

Clock Synchronization Principles

Communication Constraints Ensuring Reliable Networks

• Each TTEthernet component has a local clock. • The clock synchronization service periodically re-synchronizes these clocks so that time-triggered frame transmission is possible. • Reception of time-triggered frames and regular Ethernet communication of these components do not require clock synchronization.

                                      Clock Sync inactive   Clock Sync active
Can send time-triggered frames        no                    yes
Can receive time-triggered frames     yes                   yes
Can send/receive regular frames       yes                   yes

Clock Synchronization Principles

Synchronization Services Ensuring Reliable Networks

Clock Synchronization Service is executed during normal operation mode to keep the local clocks synchronized to each other.

Startup/Restart Service is executed to reach an initial synchronization of the local clocks in the system.

Integration/Reintegration Service is used for components to join an already synchronized system.

Clique Detection Services are used to detect loss of synchronization and establishment of disjoint sets of synchronized components.

[Figure: computer time plotted against real time for a fast clock, a perfect clock and a slow clock. After the Startup/Restart Service, the Clock Synchronization Service corrects the clocks by message exchange at each resynchronization interval (R.int)]

Clock Synchronization Principles

Periodicity by Cycles Ensuring Reliable Networks

• Time-triggered systems are always periodic.
• The periods for TTEthernet are the “Integration Cycle” and the “Communication Cycle”.

[Figure: timeline showing one communication cycle spanning four integration cycles]

• The Integration Cycle is the period for clock synchronization. Typical duration: 1 millisecond.
• One Communication Cycle is an integer multiple (often a power of two) of an Integration Cycle.

Clock Synchronization Principles

Protocol Control Frames (1) Ensuring Reliable Networks

• TTEthernet communication on the PHY/MAC layer is indistinguishable from regular IEEE 802.3 Ethernet traffic. All frames/transmissions are fully Ethernet compliant.
• Some of them, however, are very short (64-byte) Ethernet frames with a special “Ethertype” field (value 0x891D). These are called Protocol Control Frames (PCFs) and are used by TTEthernet to perform the synchronization among the TTEthernet components.

[Figure: Ethernet frame with the Ethertype field set to 89 1D]

Clock Synchronization Principles

Protocol Control Frames (2) Ensuring Reliable Networks

The PCF payload contains several fields required for protocol operation. Explanations for each of them are given in the context of the related functionality.

payload byte offset   field
0                     Integration Cycle
4                     Membership
8                     - reserved -
12                    Sync Priority | Sync Domain | Type | - reserved -   (“Type” is a 4 bit field)
16                    - reserved -
20                    Transparent Clock
24                    Transparent Clock (continued)


Clock Synchronization Principles

Protocol Control Frames (3) Ensuring Reliable Networks

TTEthernet time-triggered Ethernet frames (TT).
• TT frames can be sent by TTEthernet components and received by any Ethernet component.
• A TT frame is sent and routed at defined points in time, giving it highly deterministic timing properties and minimal jitter.
• The path and timing of the frame are determined by the schedules configured in the sender and in the switches along the channel.

TTEthernet Protocol Control Frames (PCF).
• PCFs are a special kind of TTEthernet frame exchanged between TTEthernet components in order to establish and maintain synchronization.
• They can be received by regular Ethernet components too, but are meaningless for them (except for diagnostic tools).

Clock Synchronization Principles

Synchronization Domain Ensuring Reliable Networks

• The TTEthernet protocol builds up a common notion of time among the TTEthernet components (end systems and switches) in the cluster. • The group of components with a common notion of time is called a Synchronization Domain.

Regular Ethernet components
• cannot be part of a synchronization domain,
• can communicate with components in the synchronization domain,
• but cannot synchronize with them using TTEthernet synchronization.

Clock Synchronization Principles

Two classes of Masters Ensuring Reliable Networks

• In each Integration Cycle, the TTEthernet components execute the TTEthernet synchronization protocol.
• This execution is triggered by two groups of components, called Synchronization Masters (Sync) and Compression Masters (Comp).

• Each group has to contain at least one member; for fault tolerant synchronization, multiple members are needed.

Note: Typically, Synchronization Master function is located in end systems, and Compression Master function is located in switches.


Clock Synchronization Principles

Synchronization Topology Ensuring Reliable Networks

• Some TTE end systems are synchronization masters, others are just synchronization clients.
• Some TTE switches are compression masters (and synchronization clients), others are just synchronization clients.

[Figure: two end systems marked Sync connected to a switch marked Comp]

Note: this is an architectural choice, but current implementations require this setup. Switches can be configured as compression masters, but not as synchronization masters.

Clock Synchronization Principles

Cluster Setup Example 1 Ensuring Reliable Networks

• Two synchronization masters, one compression master
• A simple configuration with little fault tolerance

[Figure: two end systems marked Sync connected to a switch marked Comp]

Clock Synchronization Principles

Cluster Setup Example 2 Ensuring Reliable Networks

• Four Synchronization Masters, one Compression Master

[Figure: four end systems marked Sync connected to a switch marked Comp]

Clock Synchronization Principles

Multiple Compression Masters Ensuring Reliable Networks

• If multiple compression masters exist, they will be configured with different synchronization priorities (sync_priority).

• All active Compression Masters will perform the startup sequence, and the one with the highest sync_priority will prevail.

• If it fails, then the next-lower active Compression Master will automatically follow.


Clock Synchronization Principles

Two-Step Synchronization Ensuring Reliable Networks

Protocol Control Frames called “Integration Frames” are used to perform all synchronization functions. They are transmitted as follows:

The Synchronization Masters send Integration Frames at the beginning of each Integration Cycle. The timing of these frames is used for the “voting”.

The Compression Masters send Integration Frames to everybody, timing them in a special way so that everybody can correct their clocks.

[Figure: two panels with Sync and Comp nodes; in the first, Integration Frames (IN) travel from the Sync nodes to the Comp nodes, in the second from the Comp nodes back to the Sync nodes]

Clock Synchronization Principles

Synchronization Step 1-A Ensuring Reliable Networks

• Synchronization masters transmit PCFs to the compression masters

[Figure: PCFs travel from the two Sync nodes to the Comp node]

Clock Synchronization Principles

Synchronization Step 1-B Ensuring Reliable Networks

• Compression masters perform the compression function based on the received PCFs (only applied to PCFs)

• Function overview: 1. collect incoming PCFs, 2. compute an average value of their reception times, and 3. prepare to send out PCFs at a time depending on the result

Compression master: “Let’s see… I got two PCFs, both are valid, so I will take the average of their timing to determine when I transmit back.”
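
The three steps can be sketched in a few lines. This follows the slide literally and uses a plain arithmetic mean; the fixed `compression_delay_us` offset and the function name are illustrative assumptions, and a production implementation may use a more fault-tolerant averaging scheme.

```python
from statistics import mean


def compression_dispatch_time(pcf_receive_times_us, compression_delay_us):
    """Return the time at which the compression master sends its PCF.

    pcf_receive_times_us: reception times of the valid PCFs collected
    in this integration cycle (step 1).
    """
    if not pcf_receive_times_us:
        raise ValueError("no valid PCFs received in this integration cycle")
    average = mean(pcf_receive_times_us)          # step 2
    return average + compression_delay_us         # step 3


# Two valid PCFs, as in the speech bubble above.
print(compression_dispatch_time([100.0, 104.0], compression_delay_us=50.0))
```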


Clock Synchronization Principles

Synchronization Step 2-A Ensuring Reliable Networks

• Compression masters transmit PCFs to all synchronization masters and clients.
• Note: compression masters are also synchronization clients.

[Figure: PCFs travel from the Comp node to the two Sync nodes]

Clock Synchronization Principles

Synchronization Step 2-B Ensuring Reliable Networks

• Synchronization masters and clients (i.e. everybody) collect PCFs from all compression masters and perform the clock correction function.
• Clock correction function: compute the average of the PCFs received and adapt the local clock according to the result.

“OK… I got one PCF, so I simply use that frame’s arrival time to determine how much my local clock needs to be adapted forward or backward.”
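
A minimal sketch of the clock correction, assuming each receiver knows from its schedule when the PCF should arrive on its local clock. The sign convention and the names are assumptions made for illustration.

```python
def clock_correction_us(expected_arrival_us, pcf_arrival_times_us):
    """Return the value to add to the local clock.

    Arrival times are measured on the local clock. If the PCFs arrive
    later than expected, the local clock runs ahead and the correction
    is negative (the clock is set back).
    """
    average = sum(pcf_arrival_times_us) / len(pcf_arrival_times_us)
    return expected_arrival_us - average


# One PCF received, 3 us later than the local schedule expected it.
print(clock_correction_us(1000.0, [1003.0]))
```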


Clock Synchronization Principles

The “Distributed Start-up” Challenge Ensuring Reliable Networks

• To bring a set of nodes, ready for communication but not yet synchronized, to synchronous operation within a guaranteed time limit even in the presence of faulty nodes or links.

• TTEthernet defines a set of related state machines in the synchronization masters (set or subset of end systems) and compression masters (set or subset of switches) to perform this. We will now look at the principles of cold-start and initial synchronization.

• Mathematical modeling and model checking have been applied on these mechanisms to ensure their correctness even in complex and malicious fault scenarios (We will not look at these formal proofs in this training).


Clock Synchronization Principles

Start-up Procedure 1-2 Ensuring Reliable Networks

1. At least one Synchronization Master sends a Coldstart PCF (CS) to the Compression Masters: “I want to get synchronized!”

2. The Compression Masters echo back the Coldstart PCF to all Synchronization Masters: “OK, I am available!”

Clock Synchronization Principles

Start-up Procedure 3-4 Ensuring Reliable Networks

3. All Synchronization Masters send Coldstart-Acknowledge PCFs (CA) to all Compression Masters: “Then let’s start all together!”

4. The Compression Masters echo back the Coldstart-Acknowledge PCFs to the Synchronization Masters: “OK, here is the GO signal!”

Clock Synchronization Principles Synchronization Master (1) Ensuring Reliable Networks

Integration at Synchronization Masters (End systems):
• “I listen for a while to receive a PCF. If I receive an integration PCF, the cluster is already synchronized and I use the PCF to join the cluster.

• If I receive a coldstart PCF, someone has just initiated the start-up (and this frame was echoed to me by the Compression Master). I will transmit a coldstart-acknowledge PCF, which will be echoed back to me. This allows me to integrate.

• If nobody is sending any PCFs, I will (after listening for a while) perform a coldstart: I will transmit a coldstart PCF, which will be echoed to everybody. Then everybody will transmit coldstart-acknowledge PCFs, which will be echoed back to everybody, which allows everybody to integrate.”

Synchronization Clients do not perform coldstart. They wait until they receive Integration PCFs.


Clock Synchronization Principles Synchronization Master (2) Ensuring Reliable Networks

Integration at Synchronization Master (simplified):

need to integrate

Do I see an Integration PCF?
• yes: integrated
• no: Do I see a Cold-Start PCF?
  • yes: Send a Cold-Start-Acknowledge PCF
  • no: Send a Cold-Start PCF (an echo of my PCF will go to other nodes, and they will send a Cold-Start-Acknowledge)

Receive back a Cold-Start-Acknowledge PCF, then: integrated
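
The simplified flowchart can be written as a small decision function. This is only a sketch of the diagram above, not of the full protocol state machine; the enum and function names are hypothetical, and timeouts are reduced to "nothing seen" (`None`).

```python
from enum import Enum, auto
from typing import Optional


class PcfKind(Enum):
    INTEGRATION = auto()
    COLDSTART = auto()
    COLDSTART_ACK = auto()


class Action(Enum):
    INTEGRATE = auto()
    SEND_COLDSTART_ACK = auto()
    SEND_COLDSTART = auto()


def next_action(seen: Optional[PcfKind]) -> Action:
    """Decide what a Synchronization Master does after listening."""
    if seen is PcfKind.INTEGRATION:
        return Action.INTEGRATE            # cluster already synchronized
    if seen is PcfKind.COLDSTART:
        return Action.SEND_COLDSTART_ACK   # someone else started the start-up
    if seen is PcfKind.COLDSTART_ACK:
        return Action.INTEGRATE            # echoed acknowledge received back
    return Action.SEND_COLDSTART           # nobody is sending: start myself


for observed in (None, PcfKind.COLDSTART, PcfKind.COLDSTART_ACK, PcfKind.INTEGRATION):
    print(observed, "->", next_action(observed).name)
```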


Clock Synchronization Principles Sync. Compression Master (1) Ensuring Reliable Networks

Integration at Compression Master (TTE Switch):

“I listen to receive a PCF. If I receive an Integration PCF, the cluster is already synchronized and I use the PCF to join the cluster. If I receive a Coldstart or Coldstart-Acknowledge PCF, someone is just initiating the start-up. I will echo back this frame to all Synchronization Masters, letting them know I am here.

Doing this will allow the Synchronization Masters to get into the INTEGRATED state, where they start regular integration cycles and transmit Integration PCFs.”

Being also a Synchronization Client, the Compression Master will then integrate on the Integration PCFs sent by the Synchronization Masters.


Clock Synchronization Principles Sync. Compression Master (2) Ensuring Reliable Networks

Integration at Synchronization Compression Master (simplified):

need to integrate

Do I see an Integration PCF?
• yes: integrated
• no: Do I see a CS / CS ACK PCF?
  • yes: Echo back the frame to all Sync. Masters
  • no: keep listening

Clock Synchronization Principles Start-up – Details (1) Ensuring Reliable Networks

[Figure: timeline for Synchronization Masters A, B, C and D, showing in order:]
• A transmits a coldstart PCF (CS)
• Echo back from Compression Master
• After the CSO, everyone (except A) transmits a coldstart-acknowledge (CA)
• Echo back from Compression Master
• Cluster starts first integration round (IN)

CS … ColdStart PCF
CA … Coldstart Acknowledge PCF
IN … Integration PCF
CSO … ColdStart Offset timeout
CAO … Coldstart Acknowledge Offset timeout

Clock Synchronization Principles

Transmission Delay Correction Ensuring Reliable Networks And Total Order

• The previous discussions of fault-tolerant start-up and fault- tolerant clock synchronization made some assumptions:

  • no dynamic delays incurred by intermediate switches, network stacks, etc.
  • constant and known transmission delays over all paths through the network
  • consistent frame order seen at all receivers (total frame order)

• These assumptions are not valid a priori and need justification

We justify them by introducing a delay correction mechanism one level below the start-up and clock synchronization services


Clock Synchronization Principles

Transmission Delay Definition Ensuring Reliable Networks

A transmission sent into a network will be delayed by various physical effects (e.g. propagation of electrons/photons in the wire, electric signal conversions in ) and processing times (e.g. forwarding or queueing in a switch).

The overall transmission delay is composed of
• static elements, i.e. (nearly) constant delays independent of the load in the network, such as wire propagation delays along the frame path or the processing time for a frame in a receiver. For these, a (nearly) exact value can be defined.
• dynamic elements, typically incurred by the prioritization scheme in outgoing queues in switches along the frame path. For these, an upper bound can be derived.

In total, an upper bound for the transmission delay of any frame in the whole network can be derived. It is called max_transmission_delay.

[Figure: timeline from time of transmission to time of reception (worst case). Example times per stage: transmitter delay 5 µs, wire delay 1 µs, switch delay 2 µs, queue-out delay (dynamic) 1-7 µs, wire delay 1 µs, receiver delay 5 µs]
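
As a worked example, the bound can be computed by summing the per-stage delays. The stage names and values below are read from the example times in the figure (the assignment of values to stages is an interpretation of the figure); the data structure is an illustrative assumption.

```python
# (stage, min_us, max_us): static stages have min == max,
# the dynamic queue-out delay has a range.
STAGES = [
    ("transmitter delay", 5, 5),
    ("wire delay", 1, 1),
    ("switch delay", 2, 2),
    ("queue-out delay (dynamic)", 1, 7),
    ("wire delay", 1, 1),
    ("receiver delay", 5, 5),
]

best_case_us = sum(lo for _, lo, _ in STAGES)
max_transmission_delay_us = sum(hi for _, _, hi in STAGES)

print(f"best case: {best_case_us} us")
print(f"max_transmission_delay: {max_transmission_delay_us} us")
```

Only the worst-case sum matters for the synchronization services; the spread between best and worst case comes entirely from the dynamic stage.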


Clock Synchronization Principles

Acceptance Windows (1) Ensuring Reliable Networks

An Acceptance Window is a time interval of pre-defined duration around an expected “receive point”.

In time-triggered systems, a schedule defines the “receive points” (i.e. the moments when transmissions are expected). Due to clock drifts and jitter, these moments are not “exactly known” and therefore require the definition of acceptance windows.

If a transmission occurs within a defined acceptance window, the transmission is “in-schedule”, otherwise “out-of-schedule”. Time-triggered communication systems consider out-of-schedule transmissions as invalid.

[Figure: time axis with an acceptance window centered on the receive point]

Clock Synchronization Principles

Acceptance Windows (2) Ensuring Reliable Networks

When a Synchronization Master (SM) transmits a PCF to a Compression Master (CM) and expects to receive it back in “correct timing”, the acceptance window is calculated like this:

[ t_dispatch + 2 * max_transmission_delay + compression_master_delay,
  t_dispatch + 2 * max_transmission_delay + compression_master_delay + acceptance_window_size ]

[Figure: timeline. SM transmits the frame; the frame travels to the CM (max_transmission_delay); the frame gets echoed (compression_master_delay); the frame travels back to the SM (max_transmission_delay); the frame must arrive within the following acceptance_window_size]
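
The interval above translates directly into code. The parameter names follow the slide; the function names and the numeric values in the example are hypothetical.

```python
def acceptance_window(t_dispatch, max_transmission_delay,
                      compression_master_delay, acceptance_window_size):
    """Return (start, end) of the window in which the echoed PCF must arrive."""
    start = t_dispatch + 2 * max_transmission_delay + compression_master_delay
    return start, start + acceptance_window_size


def in_schedule(t_receive, window):
    """True if the reception time lies inside the acceptance window."""
    start, end = window
    return start <= t_receive <= end


# Hypothetical values in microseconds.
window = acceptance_window(t_dispatch=0, max_transmission_delay=21,
                           compression_master_delay=10, acceptance_window_size=4)
print(window)
print(in_schedule(54, window), in_schedule(57, window))
```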

Clock Synchronization Principles

Transparent Clock Ensuring Reliable Networks

TTEthernet components accumulate the transmission delays of each PCF. The accumulated value is called transparent_clock.

PCF Generator (sender of the frame):
• transparent_clock = my static_send_delay + my dynamic_send_delay

Each PCF forwarder (Switch):
• transparent_clock += wire_delay to me + my static_relay_delay + my dynamic_relay_delay

PCF Consumer (receiver):
• transparent_clock += wire_delay to me + my static_receive_delay + my dynamic_receive_delay

Note: The transparent_clock value in the PCF is modified by each TTE component along the frame’s route. The receiver can read the PCF’s total transmission delay in the PCF.
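
A short sketch of this accumulation along one route. The tuple layout and the delay values are assumptions chosen for illustration; only the three update rules come from the slide.

```python
def accumulate_transparent_clock(sender, forwarders, consumer):
    """Accumulate the transparent_clock of one PCF along its route.

    sender:     (static_send_delay, dynamic_send_delay)
    forwarders: list of (wire_delay, static_relay_delay, dynamic_relay_delay)
    consumer:   (wire_delay, static_receive_delay, dynamic_receive_delay)
    """
    transparent_clock = sum(sender)        # PCF generator
    for hop in forwarders:                 # each switch on the path
        transparent_clock += sum(hop)
    transparent_clock += sum(consumer)     # PCF consumer
    return transparent_clock


# One switch on the path; all values in microseconds (hypothetical).
total = accumulate_transparent_clock(
    sender=(5, 0),
    forwarders=[(1, 2, 4)],
    consumer=(1, 5, 0),
)
print(total)
```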


Clock Synchronization Principles

Permanence Ensuring Reliable Networks

Frames in a switched network can have different transmission delays. The receive order can therefore differ from the transmit order.

Example:
• frame F1 is transmitted by node A at 10:00
• frame F2 is transmitted by node B at 10:05
• frame F1 has a transmission delay A → C of 0:20
• frame F2 has a transmission delay B → C of 0:05
• receiver C sees: frame F2 arrives at 10:10, then F1 arrives at 10:20

In a TTEthernet network, frame F2 is said to become “permanent” when it is certain that no frame F1, which was transmitted at an earlier point in time than F2, will be received anymore. TTEthernet needs to know when certain frames become permanent to run synchronization algorithms.

[Figure: A sends F1 at 10:00 and B sends F2 at 10:05 to C (Comp); F2 arrives at 10:10 and F1 at 10:20]

Clock Synchronization Principles

Permanence of PCFs Ensuring Reliable Networks

Using the transparent_clock value, a receiver can determine the “earliest safe” point in time when a PCF becomes permanent:
• permanence_delay = max_transmission_delay – transparent_clock
• permanence_point_in_time = receive_point_in_time + permanence_delay

Example:
• max_transmission_delay in this network is 0:30
• frame F1 is transmitted by node A at 10:00
• frame F2 is transmitted by node B at 10:05
• frame F1 has a transmission delay A → C of 0:20. This is visible in F1’s transparent_clock
• frame F2 has a transmission delay B → C of 0:05. This is visible in F2’s transparent_clock
• receiver C sees: F2 arrives at 10:10, becomes permanent at 10:10 + (0:30 - 0:05) = 10:35
• receiver C sees: F1 arrives at 10:20, F1 becomes permanent at 10:20 + (0:30 - 0:20) = 10:30
→ F1 becomes permanent before F2

[Figure: A sends F1 at 10:00 and B sends F2 at 10:05 to C (Comp); F2 arrives at 10:10 and F1 at 10:20]
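
The example can be reproduced with the two formulas above. Times are expressed in minutes since midnight; the helper names are hypothetical.

```python
def permanence_point(receive_time, transparent_clock, max_transmission_delay):
    """permanence_point_in_time = receive_point_in_time + permanence_delay."""
    permanence_delay = max_transmission_delay - transparent_clock
    return receive_time + permanence_delay


def hhmm(minutes):
    """Format minutes since midnight as H:MM."""
    return f"{minutes // 60}:{minutes % 60:02d}"


MAX_TRANSMISSION_DELAY = 30                     # 0:30
frames = {
    # name: (receive time at C, transparent_clock)
    "F2": (10 * 60 + 10, 5),
    "F1": (10 * 60 + 20, 20),
}

permanent = {
    name: permanence_point(rx, tc, MAX_TRANSMISSION_DELAY)
    for name, (rx, tc) in frames.items()
}
for name in sorted(permanent, key=permanent.get):
    print(name, "permanent at", hhmm(permanent[name]))
```

Sorting by permanence point restores the transmit order (F1 before F2) even though F2 was received first.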

Clock Synchronization Principles

Reestablishing Total Frame Order (1) Ensuring Reliable Networks

When discussing timing diagrams, we pretend that PCFs have no relevant transmission delays, and that they are always received simultaneously at all receivers. But in reality, each PCF accumulates the transparent_clock and is only used when the permanence function at the receiver allows.

[Figure: dataflow example. End System 1 (ES1) and End System 2 (ES2) send PCFs through Switch 1 (SW1) and Switch 2 (SW2) to Switch 3 (SW3, Comp). The frames reach Switch 3 in inverted order; the send order (offset Δt) is restored by the permanence function, which defines the earliest use at SW3]

Clock Synchronization Principles

Reestablishing Total Frame Order (2) Ensuring Reliable Networks

[Figure: timing diagram (time axis 0 to 150) for two PCFs, 302 and 306, in a network with Synchronization Masters SM 1 to SM 6, Synchronization Clients SC 1 and SC 2, and Compression Master CM 1. Both PCFs are dispatched at 0 and sent at 5, by ES 102 and ES 106 respectively. PCF 302 passes Switch 201 (receive 5, send 45) and Switch 202 (receive 45, send 70) and is received by Switch 203 at 80; PCF 306 is received by Switch 203 at 10. With max_transmission_delay = 120, the permanence delays at Switch 203 are 120 – 10 = 110 and 120 – 80 = 40]

Clock Synchronization Principles

Membership Vector (1) Ensuring Reliable Networks

Each Synchronization Master is associated 1:1 with a bit in a 32 bit vector called “membership”. Only synchronization masters (not compression masters) are represented in the membership.

The definition is static and valid network-wide.

Bit positions: 31 30 … 7 6 5 4 3 2 1 0

Membership definition:
• A → Bit 7
• B → Bit 6
• C → Bit 4

[Figure: Synchronization Masters A, B and C]
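
A small sketch of such a static bit assignment, using the A/B/C mapping from the example. The dictionary and function names are illustrative.

```python
# Static, network-wide assignment of Synchronization Masters to bits.
MEMBERSHIP_BIT = {"A": 7, "B": 6, "C": 4}


def membership_of(masters):
    """Build the 32-bit membership vector for a set of Synchronization Masters."""
    vector = 0
    for name in masters:
        vector |= 1 << MEMBERSHIP_BIT[name]
    return vector


def members_in(vector):
    """Return the names of the masters whose bit is set in the vector."""
    return sorted(n for n, bit in MEMBERSHIP_BIT.items() if vector & (1 << bit))


v = membership_of(["A", "C"])
print(f"{v:032b}")
print(members_in(v))
```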


Clock Synchronization Principles

Membership Vector (2) Ensuring Reliable Networks

• In each Integration Cycle, a Compression Master sets a bit in the membership vector if it receives an Integration PCF from a Synchronization Master. • The membership vector in the Compression Master therefore indicates which Synchronization Masters have successfully contributed to the Integration Round.

[Figure: six Synchronization Masters (Sync) connected to one Compression Master (Comp); the membership vector collected by the Compression Master is 010110]

Clock Synchronization Principles

Membership Vector (3) Ensuring Reliable Networks

When the Compression Master echoes back a PCF, it includes the membership in the PCF. Synchronization Masters and Clients can see how “reliable” the PCF is: more membership bits set indicate that a higher number of Synchronization Masters contributed their “votes”.

[Figure: the Compression Master (Comp) sends PCFs carrying the membership vector 010110 to all six Synchronization Masters (Sync)]

Clock Synchronization Principles

Membership - Clock Quality Ensuring Reliable Networks

• In networks with multiple Compression Masters and channels, the Synchronization Clients can select the best (= most reliable) source of synchronization by selecting, among all PCFs received, the one with the maximum membership (= highest number of bits set).

• This provides immediate fault tolerance in case of failures of complete channels, Synchronization Masters, Compression Masters, or network links between any of these.
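
The selection rule amounts to a population count over the membership field. In this sketch each received PCF is a plain dictionary with hypothetical keys; how ties are broken is not stated on the slide, so the first candidate is kept here.

```python
def membership_count(membership):
    """Number of bits set in the membership vector."""
    return bin(membership).count("1")


def best_pcf(pcfs):
    """Pick the PCF backed by the most Synchronization Masters.

    On a tie, max() keeps the first candidate in the list.
    """
    return max(pcfs, key=lambda pcf: membership_count(pcf["membership"]))


received = [
    {"channel": "A", "membership": 0b010010},   # two masters contributed
    {"channel": "B", "membership": 0b010110},   # three masters contributed
]
print(best_pcf(received)["channel"])
```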



Fault Tolerance

Fault Tolerance Basics (1) Ensuring Reliable Networks

Fault → Error → Failure
• Failure: A system fails to deliver its intended service
• Error: Unintended system state
• Fault: Cause of the error

Fault classification
• Transient faults (EMI, SEU - single event upsets, …)
• Intermittent faults
• Permanent faults

Fault Tolerance

Fault Tolerance Basics (2) Ensuring Reliable Networks

Failure Type:

• Crash Failure: component stops silently (fail-silent), it does not produce any result at all.

• Consistent Failure: If there are multiple receivers, all receivers see the same erroneous result.

• Masquerading Failure: A component communicates with an incorrect ID.

• Byzantine (inconsistent, malicious, asymmetric) Failure: the different receivers see differing results.


Fault Tolerance

Fault Tolerance Basics (3) Ensuring Reliable Networks

FT System Classification

Fail safe systems
• The system can be set to a safe state at any time
• Example: stop all trains, set all train lights to red

Fail operational systems
• The system must continue operation even if part of the distributed computer system fails
• Example: an airplane cannot be set to a safe state at any time

Fault Tolerance

Fault Tolerance in TTEthernet Ensuring Reliable Networks

• Redundant channels – up to 3 end system channels
• COM/MON Switches
• Anti Masquerading
• Guardian (babbling idiot prevention)
• Fault tolerant startup mechanism
  • Formal verification
• Fault tolerant clock synchronization mechanism
  • Formal verification

Fault Tolerance

Redundancy Management (1) Ensuring Reliable Networks

One-, two- or three-channel configuration

Fault Tolerance

Redundancy Management (2) Ensuring Reliable Networks

• The ports, links and switches are duplicated for redundancy.
• Frames are transmitted concurrently over both networks.
• On the receiving end system, “First Valid Frame wins”.

[Figure: a transmitting AFDX end system (per VL) sends each frame over Network A and Network B to a receiving end system (per VL)]

[Figure: Tasks 1 to 3 in the application layer use messages M1’, M2 and M3; the communication layer carries M1, M2, M3 on channel 0 and M1’, M2’, M3’ on channel 1]

Fault Tolerance

Redundancy Management (3) Ensuring Reliable Networks

If voting over redundant frames is required:
• The end system can be configured so that it delivers all redundant messages to the host CPU.
• The host CPU can implement a SW layer (similar to FTCOM in the TTP/C protocol) that performs different voting mechanisms
• and presents the result transparently to the host application.

[Figure: the communication layer delivers M2, M3 on channel 0 and M2’, M3’ on channel 1; the FTCOM layer combines M2 and M2’ into M2v for Task 2, while Task 3 receives M3]

Fault Tolerance

Redundancy Management (4) Ensuring Reliable Networks

Integrity Checking

• Integrity checking is done per VL and per Network
• Checking is based on Sequence Number & MFCL (Maximum Consecutive Frames Lost)
• All Invalid Frames are discarded
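
One way such a check could look is sketched below for a single VL on a single network. It is a simplification under stated assumptions: sequence numbers wrap modulo 256, a frame is valid if it advances the sequence by at least one and skips at most the configured number of lost frames, and reset handling is ignored. The class name and parameters are hypothetical.

```python
class IntegrityChecker:
    """Sequence-number check for one VL on one network (simplified)."""

    def __init__(self, max_consecutive_lost, modulo=256):
        self.max_gap = max_consecutive_lost + 1
        self.modulo = modulo
        self.last = None

    def accept(self, seq):
        """Return True if the frame is valid, False if it must be discarded."""
        if self.last is None:              # first frame seen on this VL
            self.last = seq
            return True
        gap = (seq - self.last) % self.modulo
        if 1 <= gap <= self.max_gap:       # in order, tolerable loss
            self.last = seq
            return True
        return False                       # duplicate, stale or too many lost


checker = IntegrityChecker(max_consecutive_lost=1)
print([checker.accept(s) for s in (10, 11, 11, 13, 16)])
```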


Fault Tolerance

COM/MON Approach Ensuring Reliable Networks

• High integrity design: self-checking pair
• Two processors execute the same function in parallel.
• A comparator checks the output of both processors.
• If one processor fails (maliciously) and generates wrong data, the second processor shuts down.

The self-checking pair ensures fail-silence: fail-silence is enforced in case of disagreement.
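
The principle can be illustrated in software terms. This is a conceptual sketch only: real COM/MON designs compare in hardware, and the function names are hypothetical. `None` models the silenced output.

```python
def self_checking_pair(com, mon, value):
    """Run both lanes on the same input and compare their outputs.

    Returns the result if the lanes agree, otherwise None (fail-silence).
    """
    com_result = com(value)
    mon_result = mon(value)
    if com_result == mon_result:
        return com_result
    return None


def healthy(x):
    return 2 * x


def faulty(x):
    return 2 * x + 1   # models a lane that produces wrong data


print(self_checking_pair(healthy, healthy, 2))
print(self_checking_pair(healthy, faulty, 2))
```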


Fault Tolerance

Diagnosis Management (1) Ensuring Reliable Networks

• Switch and ES can send periodic diagnostic information with the status information:
  • Synchronization
  • #TT received, #TT dropped messages
  • BAG errors, …

• Period of diagnostic messages can be configured.

• In addition, the host CPU can read the status fields of its connected end systems.


Fault Tolerance

Diagnosis Management (2) Ensuring Reliable Networks

• The TTEthernet devices can be queried to get their current state and additional diagnosis information. • The TTEthernet devices provide diagnostic information as well as internal self-checking mechanisms (IP Built-In Test results).

[Figure: end systems (ES) exchange TT frames; status queries are sent to the devices, which reply with status information.]


Fault Tolerance

Diagnosis Management (3) Ensuring Reliable Networks

• The TTEthernet devices can also send periodic status and diagnosis information. • This data can be scheduled with the other TT-messages or event-triggered traffic can be used (RC or BE).

[Figure: end systems (ES) sending status information over the network.]


Fault Tolerance

TTEthernet Features Overview Ensuring Reliable Networks

• Anti Masquerading • Based on configuration • PORT – VLID mapping • Can be configured for BE traffic as well (on basis of MAC addresses)

• Guardian (babbling idiot prevention) • Based on TT and RC configuration • True only for TT and RC traffic • BE traffic can also be “controlled” to prevent BE flooding • Bandwidth allocation per MAC address and per port
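The two features above can be sketched as an ingress filter: a port to VL ID table for anti-masquerading and a byte budget per port for best-effort traffic. This is a simplified illustration with hypothetical names (`IngressFilter`, `allowed_vls`, `be_budget_bytes`); it budgets per port only, not per MAC address, and models the budget as bytes per window, which is an assumption.

```python
from dataclasses import dataclass, field


@dataclass
class IngressFilter:
    allowed_vls: dict       # port -> set of VL IDs configured for that port
    be_budget_bytes: dict   # port -> best-effort bytes allowed per window
    be_used: dict = field(default_factory=dict)  # port -> bytes used so far

    def accept_critical(self, port: int, vl_id: int) -> bool:
        """Anti-masquerading: a VL is accepted only on its configured port."""
        return vl_id in self.allowed_vls.get(port, set())

    def accept_best_effort(self, port: int, size: int) -> bool:
        """Flood control: drop BE frames once the port budget is used up."""
        used = self.be_used.get(port, 0)
        if used + size > self.be_budget_bytes.get(port, 0):
            return False
        self.be_used[port] = used + size
        return True

    def start_window(self) -> None:
        """Reset the per-port counters at the start of each window."""
        self.be_used.clear()


flt = IngressFilter(allowed_vls={1: {100, 101}, 2: {200}},
                    be_budget_bytes={1: 1500})
spoofed = flt.accept_critical(port=2, vl_id=100)
```

In this example `spoofed` is `False`, because VL 100 is only configured on port 1.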




TTEthernet Product Overview

TTEthernet Hardware Ensuring Reliable Networks

Development Equipment • Switch: TTESwitch Lab 24 Ports • E/S: TTEPMC Card, TTEPCI Card, TTEXMC Card, TTEPCIe Card
Development Systems • TTEDevelopment System A664 • TTEDevelopment System v2.0 VxWorks 653
Chip IP • TTEEnd Systems chip IP • Certification Package (RTCA DO 254)
Airborne Production Hardware • DO-254, DO-178B, DO-160F (DAL A) Certifiable products • TTEEnd System Pro


TTEthernet Product Overview

TTEthernet 24 ports switch (Hardware) Ensuring Reliable Networks
Switching Capability • 18 x 10/100 Mbit/s Ethernet ports • 6 x 10/100/1000 Mbit/s Ethernet ports
FPGA-based Switch processor supporting 3 configurable traffic classes • Best-effort traffic (IEEE 802.3) • Rate-constrained traffic (full AFDX) • Time-triggered traffic (SAE AS6802 standard)
Management • Built-in management module (MIP) for data loading and diagnostic
Management function • 4096 VLs with BAGs from 0.5 to 1600 ms • ICMP, IP/UDP, SNMP • Traffic shaping • ARINC 615A/TFTP data loading • Health monitoring • Built-in tests • Multiple configurations, PIN programming
Form factors / environmental • 19” Rack

TTEthernet Product Overview

TTEthernet Software Ensuring Reliable Networks

Configuration & Verification Tooling • TTEPlan Network Planning • TTEBuild Network Configuration • TTELoad / ARINC 615 Data Loader • TTEVerify (certification DO 178C)

Embedded Software • Protocol Layer • TTEDriver (Linux, Windows) • API Library • Sync Library • TTECOM Layer ARINC 653

TTECOM Layer ARINC 653: TTEthernet middleware layer for time and space partitioned IMA OS (ARINC 653). Enables TTEthernet data exchange through message channels (queuing and sampling ports).

[Figure: Partitions 1 to 3 (Tasks 1A, 1B, 2A, 2B, 3A), each with its own partition OS, communicate through message channels (sampling and queuing ports). Below them, the IMA OS Message Channel API sits on the TTECOM Layer ARINC 653, the TTEAPI Library, the TTE Core OS services, the TTEPCI Driver and the TTE hardware carrying TT, RC and BE traffic.]

TTEthernet Product Overview

TTEthernet Tools (1) Ensuring Reliable Networks

A Flexible Set of Development and Verification Tools for TTEthernet Networks and Systems • Modeling of real-time communication requirements • Modeling of network and topology • Support for manual and automated design steps • Based on open XML databases for flexible exchange with 3rd party tools • Specialized editors for each design step

• TTEPlan: Network Scheduling (for TT) and Analysis Tool (for RC)
• TTEView: Packet Analyzer
• TTEVisualize: NC Database Explorer
• TTEBuild Network Configuration
• TTESLC Wizard: GUI for TTE-Plan
• TTEBuild Device Configuration
• TTEVerify: RTCA/DO-178B Qualifiable Verification Tool
• TTELoad: Data Loading Solution, ARINC 615A Loader


TTEthernet Product Overview

TTEthernet Tools (2) Ensuring Reliable Networks

• Network Description (XML, Ecore EMF, scripting capability): High-level communication reqs. Senders, receivers, virtual links, sync domains, fault-tolerance requirements, etc.
• TTEPlan: Network Configuration (Schedule) Generation. Schedulability analysis for RC traffic.
• Network Configuration (XML, Ecore EMF, scripting capability): This stores the “schedule” (TT, RC, ET configs). Who sends what at what time (TT) at what rate (RC) on what route?
• TTEBuild Network Configuration: Device Config. Generation + GUI Editor
• Device Configuration (XML, Ecore EMF, scripting capability): This is a truthful, human readable XML representation of the binary tables in the switches and end systems.
• TTEBuild Device Configuration: Image Generation
• Image: This is the binary image for a switch or end system, ready for download. Images for multiple devices in the system may be collected in a download database.

TTEthernet Product Overview

TTEthernet Tools (3) Ensuring Reliable Networks

TTEVerify verifies correct implementation and traceability between high-level and low-level requirements.

[Figure: tool flow with TTEPlan, TTEBuild and TTELoad operating on System Spec (.xml), Net Config (.xml) and Config Image (.xml/.bin); TTEVerify is applied at three points, each producing a report.]
